Information Retrieval and Extraction
Fall 2007
Fridays, 9:10 AM - 12:00 PM
Instructor:
Dr. Berlin Chen (陳柏琳)

 

        Homework Page

Tentative Topic List:

09/21 Course Overview & Introduction
09/28 Retrieval Models (I) - Classic Retrieval Models (Boolean, Vector Space and Probabilistic Models)
10/05 Retrieval Performance Evaluation - Measures; HW-1 (Due 10/31)
10/12 Retrieval Performance Evaluation - Collections
10/19 Retrieval Models (II) - Improved Approaches (Fuzzy Set, Extended Boolean, Generalized Vector Space Models)
10/26 Query Operations (Query Expansion and Term Re-weighting); HW-2 (Due 11/30)
11/02 Retrieval Models (III) - Latent Semantic Analysis (LSA)
11/09 Retrieval Models (IV) - Language Models
11/16 Break (Taipei Programming Contest)
11/23 Retrieval Models (IV) - Language Models
11/30 School Sports Day
12/07 Clustering for Information Retrieval
12/14 Clustering for Information Retrieval
12/21 Indexing and Searching
12/28 Spoken Document Recognition, Retrieval and Summarization
01/04 Web Search Basics; Project (Due 01/25)
01/11 Final

Textbooks:

1. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley Longman, 1999.
2. D. A. Grossman and O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
3. C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.

References:
 
Books:

1. W.B. Croft and J. Lafferty (eds), Language Models for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002.
2. W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms, Prentice-Hall, 1992.
3. A. Del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999.
4. I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishing, 1999.
5. C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
6. D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.

Papers:

1. D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, 3:993-1022, January 2003.
2. V. Lavrenko and W. B. Croft, "Relevance-Based Language Models," ACM SIGIR 2001.
3. C. H. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala, "Latent semantic indexing: A probabilistic analysis." This paper analyzes an information retrieval technique related to principal component analysis.
4. X. Liu and W. B. Croft, "Statistical Language Modeling for Information Retrieval," Annual Review of Information Science and Technology, vol. 39, 2005.
5. L. Huang, "A Survey on Web Information Retrieval Technologies," 2000.

 

TA: 朱芳輝

– E-mail: g94470144@mail.csie.ntnu.edu.tw

– Tel: 29322411 ext. 208 (CSIE Department, 208)