CS6007 INFORMATION RETRIEVAL SYLLABUS FOR 7TH SEM CSE REGULATION 2013 - Anna University Internal marks 2018


ANNA UNIVERSITY CSE SYLLABUS
CS6007 INFORMATION RETRIEVAL SYLLABUS
7TH SEM CSE SYLLABUS
REGULATION 2013
OBJECTIVES:
The Student should be made to:
-> Learn the information retrieval models.
-> Be familiar with web search engines.
-> Be exposed to Link Analysis.
-> Understand Hadoop and MapReduce.
-> Learn document text mining techniques.
 
UNIT I INTRODUCTION
Introduction - History of IR - Components of IR - Issues - Open source search engine frameworks - The impact of the web on IR - The role of artificial intelligence (AI) in IR - IR versus web search - Components of a search engine - Characterizing the web.
 
UNIT II INFORMATION RETRIEVAL
Boolean and vector-space retrieval models - Term weighting - TF-IDF weighting - Cosine similarity - Preprocessing - Inverted indices - Efficient processing with sparse vectors - Language model based IR - Probabilistic IR - Latent Semantic Indexing - Relevance feedback and query expansion.
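
Illustration for Unit II: a minimal sketch of how an inverted index, TF-IDF weighting, and cosine similarity over sparse vectors fit together. It is not part of the official syllabus. The function names, the whitespace tokenizer, the tf * log(N / df) weighting variant, and the three sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and split on whitespace; real preprocessing would also
    # handle punctuation, stop words, and stemming.
    return text.lower().split()

def build_inverted_index(docs):
    # Map each term to a postings dict {doc_id: term frequency}.
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def tfidf_vectors(index, n_docs):
    # One sparse {term: weight} dict per document,
    # with weight = tf * log(N / df) (one common variant).
    vectors = [dict() for _ in range(n_docs)]
    for term, postings in index.items():
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            vectors[doc_id][term] = tf * idf
    return vectors

def cosine(u, v):
    # Iterate over the shorter vector so absent terms cost nothing.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical sample collection.
docs = [
    "web search engines crawl the web",
    "search engines rank web pages",
    "clustering groups similar documents",
]
index = build_inverted_index(docs)
vectors = tfidf_vectors(index, len(docs))
```

With this sample collection, the first two documents share terms and so score a positive cosine similarity, while the first and third share none and score zero.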
 
UNIT III WEB SEARCH ENGINE – INTRODUCTION AND CRAWLING
Web search overview, web structure, the user, paid placement, search engine optimization/spam - Web size measurement - Web search architectures - Crawling - Meta-crawlers - Focused crawling - Web indexes - Near-duplicate detection - Index compression - XML retrieval.
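
Illustration for Unit III: one common way to approach near-duplicate detection is to compare sets of word shingles using Jaccard similarity. The sketch below is not part of the official syllabus. It assumes word-level shingles of length 3; the function names and sample sentences are hypothetical. Production systems usually approximate this with fingerprinting or MinHash rather than comparing full shingle sets.

```python
def shingles(text, k=3):
    # Set of all runs of k consecutive words in the text.
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    # Size of the intersection divided by size of the union.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical pair of pages that differ by a single word.
page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox jumped over the lazy dog"
similarity = jaccard(shingles(page_a), shingles(page_b))
```

A crawler could treat two pages as near-duplicates when this score exceeds a chosen threshold; the threshold value is a tuning decision and is not fixed by the technique.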

UNIT IV WEB SEARCH – LINK ANALYSIS AND SPECIALIZED SEARCH
Link Analysis - Hubs and authorities - PageRank and HITS algorithms - Searching and ranking - Relevance scoring and ranking for the Web - Similarity - Hadoop & MapReduce - Evaluation - Personalized search - Collaborative filtering and content-based recommendation of documents and products - Handling the “invisible” Web - Snippet generation, summarization, question answering, cross-lingual retrieval.
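
Illustration for Unit IV: a minimal power-iteration sketch of PageRank on a small link graph. It is not part of the official syllabus. The damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outlinks, and the four-page graph are assumptions for illustration. The sketch also assumes every link target appears as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    # links maps each page to the list of pages it links to.
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page passes an equal share of its rank along every outlink.
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank

# Hypothetical four-page web: C receives links from A, B, and D.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

In this graph, page C collects the most inbound rank and page D, which nothing links to, keeps only the baseline share.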
 
UNIT V DOCUMENT TEXT MINING
Information filtering; organization and relevance feedback - Text mining - Text classification and clustering - Categorization algorithms: naive Bayes, decision trees, and nearest neighbor - Clustering algorithms: agglomerative clustering, k-means, expectation maximization (EM).
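
Illustration for Unit V: a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. It is not part of the official syllabus. The function names, the whitespace tokenizer, the choice to skip terms never seen in training, and the tiny spam/ham training set are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs):
    # Count documents per class and term occurrences per class.
    class_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for term in text.lower().split():
            term_counts[label][term] += 1
            vocab.add(term)
    return class_counts, term_counts, vocab

def classify(text, model):
    class_counts, term_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(count / n_docs)
        total = sum(term_counts[label].values())
        for term in text.lower().split():
            if term not in vocab:
                continue  # skip terms never seen in training
            score += math.log((term_counts[label][term] + 1)
                              / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data.
training = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes", "ham"),
]
model = train_naive_bayes(training)
```

Calling classify("buy cheap pills", model) on this toy model selects the spam class, and classify("monday project meeting", model) selects ham.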
 
TOTAL: 45 PERIODS
 
OUTCOMES:

Upon completion of the course, students will be able to
-> Apply information retrieval models.
-> Design a web search engine.
-> Use Link Analysis.
-> Use Hadoop and MapReduce.
-> Apply document text mining techniques.
 
TEXT BOOKS:
1. C. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
2. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, 2nd Edition, ACM Press Books, 2011.
3. Bruce Croft, Donald Metzler and Trevor Strohman, Search Engines: Information Retrieval in Practice, 1st Edition, Addison Wesley, 2009.
4. Mark Levene, An Introduction to Search Engines and Web Navigation, 2nd Edition, Wiley, 2010.
 
REFERENCES:
1. Stefan Buettcher, Charles L. A. Clarke, Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, The MIT Press, 2010.
2. Ophir Frieder, Information Retrieval: Algorithms and Heuristics, The Information Retrieval Series, 2nd Edition, Springer, 2004.
3. Manu Konchady, Building Search Applications: Lucene, LingPipe, and Gate, First Edition, Mustru Publishing, 2008.
