Topics in Text Categorization

 

Example of an old test here.

 

 

Lecture 1. General introduction; comparison of learning algorithms

Slides: Lecture1

Readings:

 

Machine learning in automated text categorization
F Sebastiani

 

Inductive Learning Algorithms and Representations for Text Categorization
S Dumais, J Platt, M Sahami, D Heckerman

 

A re-examination of text categorization methods
Y Yang, X Liu

 

Text Categorization with Support Vector Machines
T Joachims

 

 

 

Lecture 2. Naïve Bayes; feature selection

Slides: Lecture2

Readings:

 

A comparison of event models for Naïve Bayes text classification
A McCallum, K Nigam

 

A comparative study on feature selection in text categorization
Y Yang, JO Pedersen

 

An extensive empirical study of feature selection metrics for text classification
G Forman

 

 

 

Lecture 3. Authorship Attribution

 

Slides: Lecture3

 

Readings:

 

Computational Methods in Authorship Attribution

M. Koppel, J. Schler, S. Argamon

 

 

 

Lecture 4. Author profiling

Slides: Lecture4

Readings:

 

Determining an Author's Native Language by Mining a Text for Errors
M. Koppel, J. Schler, K. Zigdon

 

Automatically Profiling the Author of an Anonymous Text

S. Argamon, M. Koppel, J. Pennebaker and J. Schler

 

 

Lecture 5. Authorship verification

 

Slides: Lecture5

 

Readings:

 

Measuring Differentiability: Unmasking Pseudonymous Authors

M. Koppel, J. Schler, E. Bonchek-Dokow

 

Authorship Attribution in the Wild

M. Koppel, J. Schler, S. Argamon

 

 

 

Lecture 6. The Ultimate Authorship Problem: Verification for Short Docs (skipped).

 

Slides: Lecture6

 

Readings:

 

Authorship Attribution with Thousands of Candidate Authors

M. Koppel, J. Schler, S. Argamon and E. Messeri

 

 

 

Lecture 7. Decomposition of a Document into Authorial Components

 

Slides: Lecture7

 

Readings:

Introduction to Information Retrieval (Chapter 16)

C Manning, P Raghavan, H Schütze

 

On Spectral Clustering: Analysis and an Algorithm

A Ng, M Jordan, Y Weiss

 

Unsupervised Decomposition of a Document Into Authorial Components

M. Koppel, N. Akiva, I. Dershowitz and N. Dershowitz

 

 

 

Lecture 8. Bottom-up sentiment analysis

 

Slides: Lecture8

 

Readings:

 

Predicting the semantic orientation of adjectives
V Hatzivassiloglou, KR McKeown

 

Thumbs Up or Thumbs Down?
P Turney

 

Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis
T Wilson, J Wiebe, P Hoffmann



Lecture 9. Top-down sentiment analysis

Slides: Lecture9

 

Readings:


Opinion Mining and Sentiment Analysis

B Pang, L Lee


Mining the peanut gallery: opinion extraction and semantic classification of product reviews
K Dave, S Lawrence, DM Pennock


The Importance of Neutral Examples for Learning Sentiment
M Koppel, J Schler

 

 

 

Lecture 10. Spam filtering

Slides: Lecture10

 

 

Lecture 11. Latent semantic analysis

 

Slides: Lecture11

 

Readings:

 

Introduction to Information Retrieval (Chapter 18)

C Manning, P Raghavan, H Schütze

 

Indexing by latent semantic analysis
S Deerwester, ST Dumais, GW Furnas, TK Landauer

 

 

Lecture 12. Latent Dirichlet Allocation

Slides: Lecture12

Readings:

Latent Dirichlet Allocation

D. Blei, A. Ng and M. Jordan