Information Retrieval and Extraction
Fall 2004
Thursdays, 9:10 AM - 12:00 PM
Instructor:
Berlin Chen

Homework Webpage

Topic List and Schedule

9/23
 
  Course Overview & Introduction
 
 
 9/30

 
  Retrieval Models (I) - Classic Retrieval Models (Boolean, Vector Space and Probabilistic Models)
 
 
10/7
 
Break (ICSLP2004, Jeju island)
 
10/14
 
Retrieval Performance Evaluation (I) - Measures
 
HW-01: (Due 10/28)
IR Performance Evaluation
10/21
 
Retrieval Performance Evaluation (II) - Reference Collections
 
10/28
 
Retrieval Models (II) - Improved Approaches (Fuzzy Set, Extended Boolean, Generalized Vector Space Models)
 
HW-02: (Due 11/11)
Classical Retrieval Models
1. Vector Space Model
2. Probabilistic Model
11/4
 
Query Operations (Query Expansion and Term Re-weighting)
 
11/11
 
Retrieval Models (III) - Statistical Modeling Approaches (HMM/N-Gram: Language Model Approach)
 
11/18
 
Midterm 
 
HW-03: (Due 11/20)
Term Weighting & Query Expansion
11/25
 
Retrieval Models (III) - Statistical Modeling Approaches (TMM: Topical Mixture Model)
 
HW-04: (Due 12/9)
HMM/N-gram-based Retrieval Model
12/2
 
Retrieval Models (III) - Statistical Modeling Approaches (LSA, PLSA)
 
12/9
 
Text Clustering & LSA Homework Description

 
12/16
 
Retrieval Models (IV) - Structural Retrieval Models and Browsing Models
 
HW-05: (Due 12/23)
LSA Retrieval Model
12/23
 
Paper Survey - SIGIR 2004 (I)
 張黎文  Configurable indexing and ranking for XML information retrieval
 劉士弘  Language-Specific Models in Multilingual Topic Tracking
 陳怡婷  Discriminative Models for Information Retrieval
 
12/30


 
Paper Survey - SIGIR 2004 (II)
 陳燦輝  Polynomial filtering in latent semantic indexing for information retrieval
 張佑傑  Search Strategies in Content-based Image Retrieval
 邱炫盛  Parsimonious Language Models for Information Retrieval
 
1/6
 

 
Query Languages, Text Languages and Text Statistics
 
HW-06: (Due 1/30) Optional: Extra Bonus +5
Web Search Engine using Inverted Files
1/13
 
Text Preprocessing, Indexing and Searching
 
1/27 Final
Text Categorization, Text Summarization, Information Extraction
(To be discussed in the Natural Language Processing course offered in the next semester)

Textbook: 

1. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley Longman, 1999.
 
2. W. B. Croft and J. Lafferty (Editors), Language Modeling for Information Retrieval, Kluwer Academic Publishers, July 2003.
 

References:
 
Books:

1. W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms, Prentice-Hall, 1992.
2. A. Del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999.
3. I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishing, 1999.
4. C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
5. D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.

Papers:

Grading:
     1. Final: 20%
     2. Presentations: 20%
     3. Homework: 20%
     4. Project: 25%
     5. Attendance/Other: 15%