CP7018 LANGUAGE TECHNOLOGIES SYLLABUS
M.E. COMPUTER SCIENCE AND ENGINEERING
SEMESTER II
OBJECTIVES:
To understand the mathematical foundations needed for language processing
To understand the representation and processing of Morphology and Part-of-Speech Taggers
To understand different aspects of natural language syntax and the various methods used for processing syntax
To understand different methods of disambiguating word senses
To know about various applications of natural language processing
To learn the indexing and searching processes of a typical information retrieval system and to study NLP-based retrieval systems
To gain knowledge about typical text categorization and clustering techniques
UNIT I INTRODUCTION
Natural Language Processing – Mathematical Foundations – Elementary Probability Theory – Essential Information Theory – Linguistics Essentials – Parts of Speech and Morphology – Phrase Structure – Semantics – Corpus Based Work.
UNIT II WORDS
Collocations – Statistical Inference – n-gram Models – Word Sense Disambiguation – Lexical Acquisition.
UNIT III GRAMMAR
Markov Models – Part-of-Speech Tagging – Probabilistic Context Free Grammars – Parsing.
UNIT IV INFORMATION RETRIEVAL
Information Retrieval Architecture – Indexing – Storage – Compression Techniques – Retrieval Approaches – Evaluation – Search Engines – Commercial Search Engine Features – Comparison – Performance Measures – Document Processing – NLP-based Information Retrieval – Information Extraction.
UNIT V TEXT MINING
Categorization – Extraction Based Categorization – Clustering – Hierarchical Clustering – Document Classification and Routing – Finding and Organizing Answers from Text Search – Text Categorization and Efficient Summarization using Lexical Chains – Machine Translation – Transfer Metaphor – Interlingual and Statistical Approaches.
OUTCOMES:
Upon completion of the course, the students will be able to:
Identify the different linguistic components of given sentences
Design a morphological analyser for a language of your choice using finite state automata concepts
Implement a parser by providing suitable grammar and words
Discuss algorithms for word sense disambiguation
Build a tagger to semantically tag words using WordNet
Design an application that uses different aspects of language processing.
REFERENCES:
1. Christopher D. Manning and Hinrich Schütze, “Foundations of Statistical Natural Language Processing”, MIT Press, 1999.
2. Daniel Jurafsky and James H. Martin, “Speech and Language Processing”, Pearson, 2008.
3. Ron Cole, J. Mariani, et al., “Survey of the State of the Art in Human Language Technology”, Cambridge University Press, 1997.
4. Michael W. Berry, “Survey of Text Mining: Clustering, Classification and Retrieval”, Springer Verlag, 2003.