Interactive Semantic Featuring for Text Classification

Patrice Simard; Max Chickering; Jina Suh

Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016) | June 2016

In text classification, dictionaries can be used to define human-comprehensible features. We propose an improvement to dictionary features called smoothed dictionary features. These features recognize document contexts instead of n-grams. We describe a principled methodology to solicit dictionary features from a teacher, and present results showing that models built using these human-comprehensible features are competitive with models trained with bag-of-words features.
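As a rough illustration of the baseline idea only, a plain (unsmoothed) dictionary feature can be sketched as a count of document tokens that appear in a teacher-supplied word list. The function name `dictionary_feature`, the tokenization rule, the restriction to single words, and the example dictionary are all assumptions made for this sketch. The abstract does not specify how smoothed dictionary features are computed, so that variant is not shown.

```python
import re


def dictionary_feature(document, dictionary):
    """Count tokens of `document` that appear in `dictionary` (case-insensitive).

    Assumption: the dictionary holds single lowercase words; multi-word
    n-grams are not handled in this sketch.
    """
    tokens = re.findall(r"[a-z0-9']+", document.lower())
    return sum(1 for tok in tokens if tok in dictionary)


# Hypothetical dictionary a teacher might supply for a "sports" concept.
sports_terms = {"goal", "match", "league", "striker"}

doc = "The striker scored a late goal to win the match."
count = dictionary_feature(doc, sports_terms)
print(count)
```

Because the feature is tied to a named word list, a person can inspect which dictionary produced a given value, which is the sense in which such features are human-comprehensible.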