A SYSTEM AND A METHOD FOR DOMAIN-BASED WORD EMBEDDING MODEL FOR TELUGU LANGUAGE

Publication date: 04/11/2022
Source: WIPO "apiculture"
ABSTRACT
The present disclosure discloses a system (100) and a method (200) for a domain-based word embedding model for the Telugu language. The system (100) comprises an input module (104) to receive a Telugu corpus; a preprocessing module (106) to extract the most frequently used words in the Telugu corpus by eliminating language-specific characteristics of the Telugu language using machine learning models; an inflection processing module (108) to prune inflectional suffixes from the extracted words; a frequency analyzer module (110) to analyze the distribution of the pruned words in the received Telugu corpus using the machine learning models; and a word embedding module (112) to process the analyzed distribution of the pruned words by means of a performance parameter for building a domain-based word embedding model for the Telugu language.
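
The flow from corpus input through suffix pruning to frequency analysis can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the abstract does not specify the machine learning models, the suffix inventory, or the performance parameter, so the sketch substitutes a plain rule-based approach. The names `tokenize`, `prune_suffix`, `pruned_frequencies`, the `SUFFIXES` list, and the `min_stem_len` threshold are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Assumption: a token is a run of code points from the Unicode Telugu block
# (U+0C00 to U+0C7F). Digits, Latin text, and punctuation are dropped.
TELUGU_TOKEN = re.compile(r"[\u0C00-\u0C7F]+")

# Assumption: a small illustrative set of common Telugu case and plural
# markers. This is NOT the suffix inventory of the disclosure.
SUFFIXES = ("లో", "తో", "కి", "కు", "ని", "ను", "లు")


def tokenize(text):
    """Stand-in for the input and preprocessing modules: keep Telugu tokens."""
    return TELUGU_TOKEN.findall(text)


def prune_suffix(word, suffixes=SUFFIXES, min_stem_len=2):
    """Naively strip one trailing suffix, trying longer suffixes first.

    A word is left unchanged if stripping would leave fewer than
    min_stem_len code points.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def pruned_frequencies(corpus):
    """Stand-in for the frequency analyzer: count pruned word forms."""
    return Counter(prune_suffix(token) for token in tokenize(corpus))


if __name__ == "__main__":
    sample = "బడి బడిలో బడికి"
    for stem, count in pruned_frequencies(sample).most_common():
        print(stem, count)
```

The three inflected forms in the sample collapse to a single stem, which is the effect the inflection processing module (108) is described as providing before the frequency analyzer module (110) runs. Plain suffix stripping of this kind does not handle stem alternations, so a real system would need a more careful morphological treatment. The resulting pruned token stream could then be handed to any embedding trainer; the abstract does not name one.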