For many supervised learning problems, we possess prior knowledge about which features yield similar information about the target variable. When predicting the topic of a document, we might know that two words are synonyms; when performing image recognition, we know which pixels are adjacent. Such synonymous or neighboring features are near-duplicates and should therefore be expected to have similar weights in a good model. Here we present a framework for regularized learning in settings where one has prior knowledge about which features are expected to have similar and dissimilar weights. This prior knowledge is encoded as a graph whose vertices represent features and whose edges represent similarities and dissimilarities between them. During learning, each feature's weight is penalized by the amount it differs from the average weight of its neighbors. For text classification, regularization using graphs of word co-occurrences outperforms manifold learning and compares favorably to other recently proposed semi-supervised learning methods. For sentiment analysis, feature graphs constructed from declarative human knowledge, as well as from auxiliary task learning, significantly improve prediction accuracy.
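The neighbor-averaging penalty described in the abstract can be illustrated with a small sketch. The abstract does not give the exact functional form, so the following is an assumption: each feature contributes the squared difference between its weight and the edge-weighted mean of its neighbors' weights, and features with no neighbors contribute nothing. The function name `neighbor_average_penalty` and the dense adjacency-matrix input are hypothetical choices made for illustration, and only non-negative similarity edges are handled here; dissimilarity edges are left out.

```python
import numpy as np


def neighbor_average_penalty(w, adjacency):
    """Sum of squared gaps between each weight and its neighbors' mean weight.

    Assumed form (not taken from the paper): sum_j (w_j - sum_k P_jk w_k)^2,
    where P is the row-normalized, non-negative adjacency matrix.
    """
    w = np.asarray(w, dtype=float)
    A = np.asarray(adjacency, dtype=float)

    degree = A.sum(axis=1)
    has_neighbors = degree > 0

    # Row-normalize so that P @ w is the weighted mean of each feature's neighbors.
    P = np.zeros_like(A)
    P[has_neighbors] = A[has_neighbors] / degree[has_neighbors, None]

    residual = w - P @ w
    # Isolated features have no neighbor average, so they are not penalized.
    residual[~has_neighbors] = 0.0
    return float(residual @ residual)


# Toy graph: features 0 and 1 are linked (e.g. synonyms); feature 2 is isolated.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])

print(neighbor_average_penalty([1.0, 1.0, 5.0], A))  # linked weights agree
print(neighbor_average_penalty([1.0, 3.0, 5.0], A))  # linked weights disagree
```

In a full model this term would be scaled by a regularization strength and added to the training loss, so that linked features are pulled toward similar weights while unlinked features are left free.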
Author Information
Ted Sandler (University of Pennsylvania)
John Blitzer (Google)
Partha P Talukdar (University of Pennsylvania)
Lyle Ungar (University of Pennsylvania)
More from the Same Authors
- 2013 Poster: New Subsampling Algorithms for Fast Least Squares Regression
  Paramveer Dhillon · Yichao Lu · Dean P Foster · Lyle Ungar
- 2013 Poster: Faster Ridge Regression via the Subsampled Randomized Hadamard Transform
  Yichao Lu · Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2011 Poster: Multi-View Learning of Word Embeddings via CCA
  Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2007 Poster: Learning Bounds for Domain Adaptation
  John Blitzer · Yacov Crammer · Alex Kulesza · Fernando Pereira · Jennifer Wortman Vaughan
- 2006 Workshop: Novel Applications of Dimensionality Reduction
  John Blitzer · Rajarshi Das · Irina Rish · Kilian Q Weinberger
- 2006 Poster: Analysis of Representations for Domain Adaptation
  John Blitzer · Shai Ben-David · Yacov Crammer · Fernando Pereira