Topics in Text Categorization


Example of an old test here.



Lecture 1. General introduction; comparison of learning algorithms

Slides: Lecture1



Machine learning in automated text categorization
F Sebastiani


Inductive Learning Algorithms and Representations for Text Categorization
S Dumais, J Platt, M Sahami, D Heckerman


A re-examination of text categorization methods
Y Yang, X Liu


Text Categorization with Support Vector Machines
T Joachims




Lecture 2. Naïve Bayes; feature selection

Slides: Lecture2



A comparison of event models for Naïve Bayes text classification
A McCallum, K Nigam


A comparative study on feature selection in text categorization
Y Yang, JO Pedersen


An extensive empirical study of feature selection metrics for text classification
G Forman




Lecture 3. Authorship attribution


Slides: Lecture3




Computational Methods in Authorship Attribution

M. Koppel, J. Schler, S. Argamon




Lecture 4. Author profiling

Slides: Lecture4



Determining an Author's Native Language by Mining a Text for Errors
M. Koppel, J. Schler, K. Zigdon


Automatically Profiling the Author of an Anonymous Text

S. Argamon, M. Koppel, J. Pennebaker and J. Schler



Lecture 5. Authorship verification


Slides: Lecture5




Measuring Differentiability: Unmasking Pseudonymous Authors

M. Koppel, J. Schler, E. Bonchek-Dokow


Authorship Attribution in the Wild

M. Koppel, J. Schler, S. Argamon




Lecture 6. The Ultimate Authorship Problem: Verification for Short Docs (skipped).


Slides: Lecture6




Authorship Attribution with Thousands of Candidate Authors

M. Koppel, J. Schler, S. Argamon and E. Messeri




Lecture 7. Decomposition of a Document to Authorial Components


Slides: Lecture7



Introduction to Information Retrieval (Chapter 16)

C Manning, P Raghavan, H Schütze


On Spectral Clustering: Analysis and an Algorithm

A Ng, M Jordan, Y Weiss


Unsupervised Decomposition of a Document Into Authorial Components

M. Koppel, N. Akiva, I. Dershowitz and N. Dershowitz




Lecture 8. Bottom-up sentiment analysis


Slides: Lecture8




Predicting the semantic orientation of adjectives
V Hatzivassiloglou, KR McKeown


Thumbs Up or Thumbs Down?
P Turney


Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis
T Wilson, J Wiebe, P Hoffmann




Lecture 9. Top-down sentiment analysis

Slides: Lecture9



Opinion Mining and Sentiment Analysis

B Pang, L Lee

Mining the peanut gallery: opinion extraction and semantic classification of product reviews
K Dave, S Lawrence, DM Pennock

The Importance of Neutral Examples for Learning Sentiment
M Koppel, J Schler




Lecture 10. Spam filtering

Slides: Lecture10



Lecture 11. Latent semantic analysis


Slides: Lecture11




Introduction to Information Retrieval (Chapter 18)

C Manning, P Raghavan, H Schütze


Indexing by latent semantic analysis
S Deerwester, ST Dumais, GW Furnas, TK Landauer



Lecture 12. Latent Dirichlet Allocation

Slides: Lecture12


Latent Dirichlet Allocation

D. Blei, A. Ng and M. Jordan