Timezone: »
For classifying time series, a nearestneighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearestneighborlike classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be approximated by a nearestneighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearestneighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearestneighbor classification while observing less of the time series. We then use weighted majority to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.
Author Information
George H Chen (Carnegie Mellon University)
Stanislav Nikolov (Twitter)
Devavrat Shah (Massachusetts Institute of Technology)
Devavrat Shah is a professor of Electrical Engineering & Computer Science and Director of Statistics and Data Science at MIT. He received PhD in Computer Science from Stanford. He received Erlang Prize from Applied Probability Society of INFORMS in 2010 and NeuIPS best paper award in 2008.
More from the Same Authors

2021 Spotlight: Regulating algorithmic filtering on social media »
Sarah Cen · Devavrat Shah 
2021 : Regret, stability, and fairness in matching markets with bandit learners »
Sarah Cen · Devavrat Shah 
2021 : Regret, stability, and fairness in matching markets with bandit learners »
Sarah Cen · Devavrat Shah 
2022 : A Causal Inference Framework for Network Interference with Panel Data »
Sarah Cen · Anish Agarwal · Christina Yu · Devavrat Shah 
2022 : On counterfactual inference with unobserved confounding »
Abhin Shah · Raaz Dwivedi · Devavrat Shah · Gregory Wornell 
2021 Poster: A Computationally Efficient Method for Learning Exponential Family Distributions »
Abhin Shah · Devavrat Shah · Gregory Wornell 
2021 Poster: Regulating algorithmic filtering on social media »
Sarah Cen · Devavrat Shah 
2021 Poster: Change Point Detection via Multivariate Singular Spectrum Analysis »
Arwa Alanqary · Abdullah Alomar · Devavrat Shah 
2021 Poster: PerSim: DataEfficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators »
Anish Agarwal · Abdullah Alomar · Varkey Alumootil · Devavrat Shah · Dennis Shen · Zhi Xu · Cindy Yang 
2020 Poster: Estimation of Skill Distribution from a Tournament »
Ali Jadbabaie · Anuran Makur · Devavrat Shah 
2020 Spotlight: Estimation of Skill Distribution from a Tournament »
Ali Jadbabaie · Anuran Makur · Devavrat Shah 
2020 Poster: Sample Efficient Reinforcement Learning via LowRank Matrix Estimation »
Devavrat Shah · Dogyoon Song · Zhi Xu · Yuzhe Yang 
2020 Demonstration: tspDB: Time Series Predict DB »
Anish Agarwal · Abdullah Alomar · Devavrat Shah 
2019 Poster: On Robustness of Principal Component Regression »
Anish Agarwal · Devavrat Shah · Dennis Shen · Dogyoon Song 
2019 Oral: On Robustness of Principal Component Regression »
Anish Agarwal · Devavrat Shah · Dennis Shen · Dogyoon Song 
2019 Tutorial: Synthetic Control »
Alberto Abadie · Vishal Misra · Devavrat Shah 
2018 Poster: Qlearning with Nearest Neighbors »
Devavrat Shah · Qiaomin Xie 
2017 Workshop: Nearest Neighbors for Modern Applications with Massive Data: An Ageold Solution with New Challenges »
George H Chen · Devavrat Shah · Christina Lee 
2017 Poster: Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation »
Christian Borgs · Jennifer Chayes · Christina Lee · Devavrat Shah 
2016 Poster: Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering »
Dogyoon Song · Christina Lee · Yihua Li · Devavrat Shah 
2014 Workshop: Analysis of Rank Data: Confluence of Social Choice, Operations Research, and Machine Learning »
Shivani Agarwal · Hossein Azari Soufiani · Guy Bresler · Sewoong Oh · David Parkes · Arun Rajkumar · Devavrat Shah 
2014 Poster: Hardness of parameter estimation in graphical models »
Guy Bresler · David Gamarnik · Devavrat Shah 
2014 Poster: A Latent Source Model for Online Collaborative Filtering »
Guy Bresler · George H Chen · Devavrat Shah 
2014 Spotlight: A Latent Source Model for Online Collaborative Filtering »
Guy Bresler · George H Chen · Devavrat Shah 
2014 Poster: Learning Mixed Multinomial Logit Model from Ordinal Data »
Sewoong Oh · Devavrat Shah 
2014 Poster: Structure learning of antiferromagnetic Ising models »
Guy Bresler · David Gamarnik · Devavrat Shah 
2013 Workshop: Crowdsourcing: Theory, Algorithms and Applications »
Jennifer Wortman Vaughan · Greg Stoddard · ChienJu Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill 
2013 Poster: Computing the Stationary Distribution Locally »
Christina Lee · Asuman Ozdaglar · Devavrat Shah 
2012 Poster: Iterative ranking from pairwise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah 
2012 Spotlight: Iterative ranking from pairwise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah 
2011 Poster: Iterative Learning for Reliable Crowdsourcing Systems »
David R Karger · Sewoong Oh · Devavrat Shah 
2011 Oral: Iterative Learning for Reliable Crowdsourcing Systems »
David R Karger · Sewoong Oh · Devavrat Shah 
2009 Poster: A DataDriven Approach to Modeling Choice »
Vivek Farias · Srikanth Jagabathula · Devavrat Shah 
2009 Spotlight: A DataDriven Approach to Modeling Choice »
Vivek Farias · Srikanth Jagabathula · Devavrat Shah 
2009 Poster: Local Rules for Global MAP: When Do They Work ? »
Kyomin Jung · Pushmeet Kohli · Devavrat Shah 
2008 Poster: Inferring rankings under constrained sensing »
Srikanth Jagabathula · Devavrat Shah 
2008 Oral: Inferring rankings under constrained sensing »
Srikanth Jagabathula · Devavrat Shah 
2007 Spotlight: Message Passing for Maxweight Independent Set »
Sujay Sanghavi · Devavrat Shah · Alan S Willsky 
2007 Poster: Message Passing for Maxweight Independent Set »
Sujay Sanghavi · Devavrat Shah · Alan S Willsky 
2007 Poster: Local Algorithms for Approximate Inference in MinorExcluded Graphs »
Kyomin Jung · Devavrat Shah