Information Retrieval
Spring 2012
2:10-5:00 PM, Mondays
Instructor:
Prof. Berlin Chen (陳柏琳)

Tentative List of Topics:

02/20   Course Overview & Introduction cf. Modern Information Retrieval, Ch. 1
02/27   Break  
03/05   Classical Models cf. Modern Information Retrieval, Ch.3
03/12   Evaluation Metrics cf. Modern Information Retrieval, Ch.4
03/19   Benchmark Collections cf. Modern Information Retrieval, Ch.4
03/26   Exercise (Evaluation Metrics)  
04/02   Extensions of Classic (Set, Algebra & Probabilistic) Models cf. Modern Information Retrieval, Ch.3 HW-2: Retrieval Models (Due 4/16)
04/09   Relevance Feedback and Query Expansion HW-3: Relevance Feedback (Due 4/30)
04/16   Relevance Feedback and Query Expansion  
04/23   Latent Semantic Analysis HW-4: Latent Semantic Analysis (Due 5/18)
04/30   Midterm  
05/07   Language Models for Information Retrieval  
05/14   Language Modeling for Information Retrieval  
05/21   User Interfaces for Search  
05/28   Indexing and Searching HW-5: Language Modeling for IR (Due 6/11)
06/04   Paper Presentations:
劉爾剛: Finding Relevant Information of Certain Types from Enterprise Data (CIKM 2011)
邱文寬: An Agenda for Green Information Retrieval (IP&M 2012)
陳昱年: Learning Word Vectors for Sentiment Analysis (ACL 2011)
邱俊嘉: Interactive Sense Feedback for Difficult Queries (CIKM 2011)
蔡秉翰: Phrase-Based Translation Model for Question Retrieval in Community Question Answer Archives (ACL 2011)
邱奕智: Collaborative Filtering in Social Tagging Systems Based on Joint Item-Tag Recommendations (CIKM 2010)
汪逸婷: Quantifying Test Collection Quality Based on the Consistency of Relevance Judgements (SIGIR 2011)
 
06/11   Paper Presentations:
陳俊諭: User Behavior in Zero-Recall eCommerce Queries (SIGIR 2011)
李承翰: “Then Click OK!” Extracting References to Interface Elements in Online Documentation (CHI 2012)
洪孝宗: A Class of Submodular Functions for Document Summarization (ACL 2011)
蔡育霖: Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments (ACL 2011)
郝柏翰: CRTER: Using Cross Terms to Enhance Probabilistic Information Retrieval (SIGIR 2011)
蔡麟傑: A Boosting Approach to Improving Pseudo-Relevance Feedback (SIGIR 2011)
 
    Web Search Basics  

Textbooks: 

R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition), ACM Press, 2011

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008
W. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison Wesley, 2009

References:
 
Books:

C. C. Aggarwal, C. X. Zhai (eds.), Mining Text Data, Springer, 2012.
W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms,  Prentice-Hall, 1992.
C. X. Zhai, Statistical Language Models for Information Retrieval (Synthesis Lectures Series on Human Language Technologies), Morgan & Claypool Publishers, 2008.

T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.) , Handbook of Latent Semantic Analysis, Lawrence Erlbaum, 2007
D. A. Grossman, O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
 I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishing, 1999.
C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.
W.B. Croft and J. Lafferty (eds.), Language Models for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002.
Stephen Robertson and Hugo Zaragoza, The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval 3 no. 4, 333-389 (2009).


Papers:

 M. Sanderson and W. B. Croft, "The history of information retrieval research," Proceedings of the IEEE, Vol. 100, pp. 1444 - 1451, May 2012.
O. Kolomiyets, M.-F. Moens, "A survey on question answering technology from an information retrieval perspective," Information Sciences 181 (2011) 5412–5434
Johan Schalkwyk et al., "Google Search by Voice: A case study," 2010.
D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation,"  Journal of Machine Learning Research, 3:993-1022, January 2003.
V. Lavrenko and W.B. Croft, "Relevance-Based Language Models"  ACM SIGIR 2001.
C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala, "Latent semantic indexing: A probabilistic analysis." Analyzes an information retrieval technique related to principal components analysis.
Liu, X. and Croft, W.B., "Statistical Language Modeling For Information Retrieval,"  the Annual Review of Information Science and Technology, vol. 39, 2005
Lan Huang. A Survey On Web Information Retrieval Technologies. 2000.
Karen Spärck Jones, "Some Points in a Time," Computational Linguistics, Vol. 31, No. 1, 2005.
D. Hiemstra, "Information Retrieval Model," In: A. Goker, J. Davies, and M. Graham (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009
M. Steyvers, T. Griffiths,  "Probabilistic Topic Models," In T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.). Handbook of Latent Semantic Analysis, Mahwah NJ: Lawrence Erlbaum, 2007.
X. Yi, J. Allan,  "A Comparative Study of Utilizing Topic Models for Information Retrieval," in the Proceedings of ECIR'09.
Nallapati, Discriminative Models for Information Retrieval, in the Proceedings of SIGIR 2004
T. Joachims and F. Radlinski, "Search Engines that Learn from Implicit Feedback," IEEE Computer, 40(8), pp. 34-40, 2007.
B. Chen, H.M. Wang, L.S. Lee, “A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents,” ACM Transactions on Asian Language Information Processing, Vol. 3, No. 2, pp. 128-145, June 2004.

 

Information Retrieval Resources

            SIGIR-Information Retrieval Resources