CSCI B656 Web Mining (3 CR)


Short description: Machine learning techniques to mine the Web and other unstructured/semistructured, hypertextual, distributed information repositories. Crawling, indexing, ranking and filtering algorithms using text and link analysis. Applications to search, classification, tracking, monitoring, and Web intelligence. Group project on one of the topics covered in class.
Prerequisites: This course is open to CS, Informatics, SLIS, CogSci, and other graduate students with an interest in information systems, artificial intelligence, and the Web. Although prior exposure to machine learning algorithms, information retrieval, and/or Web programming is helpful, there are no advanced AI or DB prerequisites.
Textbook: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data by Bing Liu, Springer, 2007 (with a chapter on crawling by yours truly, slides here). Another excellent reference is Mining the Web by Soumen Chakrabarti, Morgan Kaufmann, 2002, which we used in past offerings of this course. Note that the second edition of this book is in the making.
Lecture: TR 11:15A-12:30P in room I 107 (map)
Instructor: Fil Menczer (Office hours by appointment in I2:300; please schedule in class)
AI: Jacob Ratkiewicz (Office hours by appointment in I2:310)
Contact: Use the group discussions and pages for all class-related questions and communications, unless privacy is necessary.