Information Retrieval
Spring 2013
2:10-5:00 PM, Mondays
Instructor:
Prof. Berlin Chen (陳柏琳)

Tentative List of Topics:

02/18   Course Overview & Introduction cf. Modern Information Retrieval, Ch. 1
Paper: The History of Information Retrieval Research
02/25   Classical Models cf. Modern Information Retrieval, Ch.3
03/04   Classical Models Experimental Corpus
03/11   Evaluation Metrics cf. Modern Information Retrieval, Ch.4
HW#1: Evaluations for IR (Due 3/25)
03/18   Benchmark Collections
Extensions of Classic (Set, Algebra & Probabilistic) Models
cf. Modern Information Retrieval, Ch.3
03/25   Relevance Feedback and Query Expansion HW#2: Classic Retrieval Models (Due 4/8)
04/01   Relevance Feedback and Query Expansion HW#3: Query Expansion and Term Weighting (Due 4/22)
04/08   Latent Semantic Analysis  
04/15   Language Modeling for Information Retrieval  
04/22   Language Modeling for Information Retrieval HW#4: LM for IR (Due 5/13)
04/29   Clustering: Metrics and Techniques HW#5: LSA for IR (Due 5/27)
05/06   Clustering: Metrics and Techniques  
05/13   Programming Exercise (HW#4 and HW#5)  
05/20   Indexing and Searching HW#6: Clustering (Due 6/10)
[e.g., leveraging PLSA or document clustering techniques
to enhance the performance of LM for IR]
05/27   Midterm  
06/03   Paper Presentations:
鄭舜宸: Personalized Diversification of Search Results (SIGIR 2012)
陳黃威: Positional Language Models for Information Retrieval (SIGIR 2009)
葉懿萱: Domain Dependent Query Reformulation for Web Search (CIKM 2012)
謝欣汝: Social Annotation in Query Expansion: a Machine Learning Approach (SIGIR 2011)
張崴: One Seed to Find Them All: Mining Opinion Features via Association (CIKM 2012)
 
06/10   Bamfa Ceesay: Discriminative Models for Information Retrieval (SIGIR 2004)
黃楨喻: Adaptive Query Suggestion for Difficult Queries (SIGIR 2012)
郭家祥: Travel Route Recommendation Using Geotags in Photo Sharing Sites (CIKM 2010)
謝景然: Improving Retrieval of Short Texts Through Document Expansion (SIGIR 2012)
 
06/17   User Interfaces for Search
Web Search Basics
 

Textbooks: 

R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition), ACM Press, 2011

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008
W. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison Wesley, 2009

References:
 
Books:

C. C. Aggarwal, C. X. Zhai (eds.), Mining Text Data, Springer, 2012.
W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms,  Prentice-Hall, 1992.
C.X. Zhai, Statistical Language Models for Information Retrieval (Synthesis Lectures Series on Human Language Technologies), Morgan & Claypool Publishers, 2008.

T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.) , Handbook of Latent Semantic Analysis, Lawrence Erlbaum, 2007
D. A. Grossman, O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishers, 1999.
C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.
W.B. Croft and J. Lafferty (eds.), Language Models for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002.
Stephen Robertson and Hugo Zaragoza, The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval 3 no. 4, 333-389 (2009).


Papers:

M. Sanderson and W. B. Croft, "The history of information retrieval research," Proceedings of the IEEE, Vol. 100, pp. 1444-1451, May 2012.
O. Kolomiyets, M.-F. Moens, "A survey on question answering technology from an information retrieval perspective," Information Sciences 181 (2011) 5412–5434
Johan Schalkwyk et al., "Google Search by Voice: A case study," 2010.
D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation,"  Journal of Machine Learning Research, 3:993-1022, January 2003.
V. Lavrenko and W.B. Croft, "Relevance-Based Language Models," ACM SIGIR 2001.
C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala, "Latent semantic indexing: A probabilistic analysis" (analyzes an information retrieval technique related to principal components analysis).
X. Liu and W.B. Croft, "Statistical Language Modeling For Information Retrieval," Annual Review of Information Science and Technology, vol. 39, 2005.
Lan Huang. A Survey On Web Information Retrieval Technologies. 2000.
Karen Spärck Jones, "Some Points in a Time," Computational Linguistics, Vol. 31, No. 1, 2005.
D. Hiemstra, "Information Retrieval Models," In: A. Goker, J. Davies, and M. Graham (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009.
M. Steyvers, T. Griffiths,  "Probabilistic Topic Models," In T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.). Handbook of Latent Semantic Analysis, Mahwah NJ: Lawrence Erlbaum, 2007.
X. Yi, J. Allan,  "A Comparative Study of Utilizing Topic Models for Information Retrieval," in the Proceedings of ECIR'09.
R. Nallapati, "Discriminative Models for Information Retrieval," in the Proceedings of SIGIR 2004.
T. Joachims and F. Radlinski, "Search Engines that Learn from Implicit Feedback," IEEE Computer 40(8), pp. 34-40, 2007.
B. Chen, H.M. Wang, L.S. Lee, “A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents,” ACM Transactions on Asian Language Information Processing, Vol. 3, No. 2, pp. 128-145, June 2004.

 

Information Retrieval Resources

            SIGIR-Information Retrieval Resources