Information Retrieval
Spring 2016
9:10 AM ~ 12:10 PM, Fridays
Instructor:
Prof. Berlin Chen (陳柏琳)

Tentative List of Topics:

02/26   Course Overview & Introduction
Book Chapter: Modern Information Retrieval, Ch. 1
Paper: The History of Information Retrieval Research
03/04   Classic Models cf. Modern Information Retrieval, Ch. 3
03/11   Retrieval Evaluation cf. Modern Information Retrieval, Ch. 4
Homework #1: Evaluation Metrics for IR
Homework #2: Retrieval Models
03/25   Benchmark Collections  
04/08   Extensions of Classical (Set, Algebra & Probabilistic) Models  
04/15   Relevance Feedback and Query Expansion
Homework #3: Query Expansion and Relevance Feedback
04/22   Latent Semantic Analysis
Homework #4: LSA for IR
04/29   Language Modeling for Information Retrieval  
05/06   Clustering: Metrics and Techniques  
05/13   Clustering: Metrics and Techniques  
05/20   Indexing and Searching
Homework #5: Efficiently Indexing for IR
    Paper Presentations
05/27

黃家儀 (CIKM 2015) Protecting Your Children from Inappropriate Content in Mobile Apps: An Automatic Maturity Rating Framework
蔡淳伊 (SIGIR 2015) Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks
吳佳樺 (ICML 2006) The Relationship Between Precision-Recall and ROC Curves
臧之瑄 (CIKM 2015) Short Text Similarity with Word Embeddings
胡全燊 (IJCAI 2013) Persistent Homology: An Introduction and a New Text Representation for Natural Language Processing
06/03
陳映文 (SIGIR 2015) Image-based Recommendations on Styles and Substitutes
蔡謹安 (KDD 2015) Gender and Interest Targeting for Sponsored Post Advertising at Tumblr
林奕儒 (arxiv 2015) Combining temporal and content aware features for microblog retrieval
許宸瑋 (CIKM 2015) Rank by Time or by Relevance? Revisiting Email Search
李怡慧 (WSDM 2016) Towards Modelling Language Innovation Acceptance in Online Social Networks
楊登堯 (ICSE 2012) Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports
歐陽亦凡 (ICAICTA 2015) Combining temporal and content aware features for microblog retrieval
張庭韶 (BigComp 2016) Bagging-Based Active Learning Model for Named Entity Recognition with Distant Supervision
邱琬琇 (SIIE 2015) Neural Networks for Proper Name Retrieval in the Framework of Automatic Speech Recognition.
袁儀齡 (WSDM 2016) Semantic Documents Relatedness using Concept Graph Representation
簡少凡 (CIKM 2015) Organic or Organized? Exploring URL Sharing Behavior
石敬弘 (ICML 2015) Learning Word Representations with Hierarchical Sparse Coding
顏必成 (IEEE HIS) Using NMF-based Text Summarization to Improve Supervised and Unsupervised Classification
    User Interface for Search  
    Web Search Basics  
    Brief Overview of Automatic Summarization  
    Brief Overview of Text Readability Assessment  

Textbooks: 

R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition), ACM Press, 2011

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008
W. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison Wesley, 2009

References:

C. C. Aggarwal, C. X. Zhai (eds.), Mining Text Data, Springer, 2012.
W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms,  Prentice-Hall, 1992.
C. X. Zhai, Statistical Language Models for Information Retrieval (Synthesis Lectures Series on Human Language Technologies), Morgan & Claypool Publishers, 2008.

T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.), Handbook of Latent Semantic Analysis, Lawrence Erlbaum, 2007.
D. A. Grossman, O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishing, 1999.
C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.
W.B. Croft and J. Lafferty (eds.), Language Models for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002.
Stephen Robertson and Hugo Zaragoza, The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval 3 no. 4, 333-389 (2009).
D. Carmel and E. Yom-Tov, "Estimating the Query Difficulty for Information Retrieval," Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, 2010.
Juan-Manuel Torres-Moreno, "Automatic Text Summarization," Wiley-ISTE, 2014.

Papers:

M. Sanderson and W. B. Croft, "The history of information retrieval research," Proceedings of the IEEE, Vol. 100, pp. 1444 - 1451, May 2012.
O. Kolomiyets, M.-F. Moens, "A survey on question answering technology from an information retrieval perspective," Information Sciences 181 (2011) 5412–5434
Johan Schalkwyk et al., "Google Search by Voice: A case study," 2010.
D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation,"  Journal of Machine Learning Research, 3:993-1022, January 2003.
V. Lavrenko and W. B. Croft, "Relevance-Based Language Models," ACM SIGIR 2001.
C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala, "Latent semantic indexing: A probabilistic analysis." Analyzes an information retrieval technique related to principal components analysis.
Liu, X. and Croft, W.B., "Statistical Language Modeling For Information Retrieval,"  the Annual Review of Information Science and Technology, vol. 39, 2005
Lan Huang. A Survey On Web Information Retrieval Technologies. 2000.
Karen Spärck Jones, "Some Points in a Time," Computational Linguistics, Vol. 31, No. 1, 2005.
D. Hiemstra, "Information Retrieval Model," In: A. Goker, J. Davies, and M. Graham (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009
M. Steyvers, T. Griffiths,  "Probabilistic Topic Models," In T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.). Handbook of Latent Semantic Analysis, Mahwah NJ: Lawrence Erlbaum, 2007.
X. Yi, J. Allan,  "A Comparative Study of Utilizing Topic Models for Information Retrieval," in the Proceedings of ECIR'09.
Nallapati, Discriminative Models for Information Retrieval, in the Proceedings of SIGIR 2004
T. Joachims and F. Radlinski, "Search Engines that Learn from Implicit Feedback," IEEE Computer 40(8), pp. 34-40, 2007.
B. Chen, H.M. Wang, L.S. Lee, “A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents,” ACM Transactions on Asian Language Information Processing, Vol. 3, No. 2, pp. 128-145, June 2004.

 

Information Retrieval Resources

SIGIR - Information Retrieval Resources