CSCI B659 Topics in Artificial Intelligence (3 CR)
Web Mining

Class Projects


Spring 2007

  1. Mining Pharmacogenomics Information using Topical Web Crawlers
  2. Pagerank & Sample Size
  3. Can we beat Cinematch (Netflix Recommendation System)?
  4. SMILES Index for Retrieving Chemical Information
  5. Social network community emergence around new digital media
  6. Trend Prediction
  7. Evaluating Hypertext Documents for Authenticity

Spring 2006

  1. Usage Statistics of Robots Exclusion Standard (paper in Proc. IADIS WWW/Internet 2006)
  2. Mining for Blog communities
  3. Directed News Analysis
  4. KidsCrawler
  5. Ontology Generation from Specialized Corpora
  6. Using Online Social Networks in Topical Sentiment Analysis
  7. Using Page History to Rank Search Results
  8. Web Mining Developmental Trends in Social Networks
  9. Web Topology of the Indiana University Domain (paper in Proc. IV07)
  10. Web user profiling and its applicability to system security

Spring 2005

  1. Multilingual news search
  2. Effects of guided summarization on QA using the Web
  3. Mining people connections
  4. Personalized search by history context
  5. Phishing Attacks Using Social Networks (see coverage in IDS and Slashdot; paper to appear in CACM)
  6. Structural evolution of Web content
  7. Clustering of political opinion sites using unsupervised techniques

Spring 2004 (sample)

  1. Experiments with PageRank Computation
  2. Clustering Weblogs using LSA and Link-based Methods
  3. Sherlock News Search Engine
  4. Focused Crawlers vs Accelerated Focused Crawlers
  5. Domain-Based PageRank Personalization (paper presented at WebKDD 2004!)

Ideas for Future Projects

Project Proposal Format

The project proposal is free-format but must not exceed one page. Use your best judgment about margins and font size (cramming too much onto the page is a bad idea!). The proposal should be concise, concrete, and focused. It should answer a few basic questions:
  1. Why? (Motivate your idea; is it interesting, important, relevant?)
  2. What? (Exactly what do you propose?)
  3. How? (State your hypothesis and evaluation procedure)
  4. When? (You need a realistic timetable and deliverables; is it doable?)
