CSCI B656 Web Mining (3 CR)

Tentative Spring 2009 Schedule

Note: paper links allow full text download within IU; from outside IU you may find other sources via Google Scholar or CiteSeer.

| Week | Date | Topic | Readings | Person in charge; deadlines; notes |
| 1 | 1/13 | Intro to course | | Add/Drop management |
| | 1/15 | Intro to Web mining | Chapter 1 | Fil; Scheduling & assignment of papers |
| 2 | 1/20 | Supervised learning (text classifiers) | Chapter 3 | Fil |
| | 1/22 | | NB: L98, McCN98 | Xiaoyi |
| 3 | 1/27 | | SVM: H+98, J98, B98 | Jiayi |
| | 1/29 | | Text classification: YP97, YL99 | Pulan |
| 4 | 2/3 | Unsupervised learning (clustering & communities) | Chapter 4 | Fil; Project proposal (1 page) & online page(s) for project due |
| | 2/5 | | Doc clusters: HP96, ZE98 | Huina |
| 5 | 2/10 | | Communities: FLG02, FC08 | Dimitar |
| | 2/12 | | Duplicates: BBDH00, CSG-M00 | Chintan T. |
| 6 | 2/17 | Information retrieval & Web search | Chapter 6 | Fil |
| | 2/19 | | LSI: DDFLH90, DLL96 | Amit |
| 7 | 2/24 | | PageRank: BP98, FBFMxx | Prashant |
| | 2/26 | | Bias: CR04, FMFV06 | Chris |
| 8 | 3/3 | Link analysis (Web graph) | Chapter 7 | Fil |
| | 3/5 | | HITS 1: K99, CDKRRTGK99 | Diep |
| 9 | 3/10 | | HITS 2: BH98, DH99 | Poornima |
| | 3/12 | | Web structure: AJB99, BKMRRSTW00, SMBFV07 | Rahul |
| | 3/17 | Spring break | | |
| | 3/19 | | | |
| 10 | 3/24 | Web crawling | Chapter 8 | Fil; Project progress report due (1 page max) |
| | 3/26 | | Preferential crawlers: CG-MP98, NW01 | Feng |
| 11 | 3/31 | | Focused crawlers: CvdBD99, DCLGG00, CPS02 | Chintan S. |
| | 4/2 | | Topical crawlers: MB00, MPS04 | Felix |
| 12 | 4/7 | Additional topics | Coverage & growth: LG98, LG99, NCO04 | DongInn |
| | 4/9 | | Locality: M02, M04a, MMRV05 | Jeff |
| 13 | 4/14 | | Models: BA99, KKRRT99, PFLGG02, M04b, FFM06 | Thilina + Jasleen |
| | 4/16 | | Web traffic: MMFFV08, LGLZMHL08, GMRFM09 | Mehool |
| 14 | 4/21 | | Social tagging: M05, GH06, MRM08, HKG-M08, MCMBHS09 | Deepak + Bin |
| | 4/23 | | P2P: AWMM06, MWA08 | Shijin |
| 15 | 4/28 | Project presentations | | |
| | 4/30 | | | Project due: Final report (2-page WWW poster format) + online |
| 16 | 5/5, 10:15 a.m.-12:15 p.m. | Project demonstrations (by appointment) | | |