Exponential functions are core mathematical constructs that are key to many important applications, including speech recognition, pattern search, logistic regression in statistics, machine translation, and natural language processing. Exponential functions appear in exponential families, log-linear models, conditional random fields (CRFs), entropy functions, neural networks involving sigmoid and softmax functions, and Kalman-filter or MMIE training of hidden Markov models. Many techniques have been developed in pattern recognition to construct formulations from exponential expressions and to optimize such functions, including growth transforms, EM, EBW, Rprop, bounds for log-linear models, large-margin formulations, and regularization. Optimization of log-linear models also provides important algorithmic tools for machine learning applications (including deep learning), leading to new research in topics such as stochastic gradient methods, sparse/regularized optimization methods, enhanced first-order methods, coordinate descent, and approximate second-order methods. Specific recent advances relevant to log-linear modeling include the following.
• Effective optimization approaches, including stochastic gradient and Hessian-free methods.
• Efficient algorithms for regularized optimization problems.
• Bounds for log-linear models and recent convergence results.
• Recognition of modeling equivalences across different areas, such as the equivalences between Gaussian models and log-linear models, between HMMs and HCRFs, and between transfer entropy and Granger causality for Gaussian parameters.
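To make the central object concrete, the following is a minimal sketch (not from the workshop materials; the feature matrix and weights are hypothetical) of how a log-linear model assigns a conditional probability p(y | x) by exponentiating a linear score and normalizing over classes — the softmax form mentioned above:

```python
import numpy as np

def log_linear_prob(features, weights):
    """Conditional probability p(y | x) under a log-linear (softmax) model.

    features: (num_classes, num_features) matrix of feature values f(x, y)
    weights:  (num_features,) parameter vector theta
    Returns a probability vector over classes.
    """
    scores = features @ weights      # theta . f(x, y) for each class y
    scores = scores - scores.max()   # shift for numerical stability
    expo = np.exp(scores)
    return expo / expo.sum()         # normalize by the partition function

# Hypothetical example: 3 classes, 2 features
f = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.array([0.5, -0.2])
p = log_linear_prob(f, theta)        # a valid probability distribution
```

Subtracting the maximum score before exponentiating leaves the probabilities unchanged but avoids overflow, a standard trick when working with exponential expressions.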
Though exponential functions and log-linear models are well established, research activity remains intense, due to the central importance of the area in front-line applications and the rapidly expanding size of the data sets to be processed. Fundamental work is needed to transfer algorithmic ideas across different contexts and explore synergies between them, to assimilate the influx of ideas from optimization, to assemble better combinations of algorithmic elements for tackling such key tasks as deep learning, and to explore such key issues as parameter tuning.
The workshop will bring together researchers from the many fields that formulate, use, analyze, and optimize log-linear models, with a view to exposing and studying the issues discussed above.
Topics of possible interest for talks at the workshop include, but are not limited to, the following.
1. Log-linear models.
2. Using equivalences to transfer optimization and modeling methods across different applications and different classes of models.
3. Comparison of optimization / accuracy performance of equivalent model pairs.
4. Convex formulations.
5. Bounds and their applications.
6. Stochastic gradient, first-order, and approximate-second-order methods.
7. Efficient non-Gaussian filtering approaches that exploit the equivalence of Gaussian generative and log-linear models and projection onto the exponential manifold of densities.
8. Graphical and network inference models.
9. Missing data and hidden variables in log-linear modeling.
10. Semi-supervised estimation in log-linear modeling.
11. Sparsity in log-linear models.
12. Block and novel regularization methods for log-linear models.
13. Parallel, distributed and large-scale methods for log-linear models.
14. Information geometry of Gaussian densities and exponential families.
15. Hybrid algorithms that combine different optimization strategies.
16. Connections between log-linear models and deep belief networks.
17. Connections with kernel methods.
18. Applications to speech / natural-language processing and other areas.
19. Empirical contributions that compare and contrast different approaches.
20. Theoretical contributions that relate to any of the above topics.
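Several of the topics above, notably stochastic gradient methods (topic 6) and sparsity/regularization (topics 11–12), come together in a short sketch. The following illustrative example (not from the workshop materials; the synthetic data, step size, and regularization weight are all hypothetical choices) trains an L1-regularized logistic regression — a simple log-linear model — by proximal stochastic gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data: 200 samples, 10 features, sparse "true" weights
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.5, 1.0]
y = (sigmoid(X @ w_true) > rng.uniform(size=200)).astype(float)

def prox_l1(w, t):
    """Soft-thresholding: the proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

w = np.zeros(10)
step, lam = 0.1, 0.01
for epoch in range(50):
    for i in rng.permutation(200):
        grad = (sigmoid(X[i] @ w) - y[i]) * X[i]    # stochastic gradient of the log-loss
        w = prox_l1(w - step * grad, step * lam)    # gradient step, then L1 proximal step
```

The proximal step keeps the L1 term out of the (nonsmooth) gradient computation and drives small coordinates exactly to zero, which is the mechanism behind sparsity in regularized log-linear models.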
Author Information
Dimitri Kanevsky (IBM, T.J. Watson Research Center)
Tony Jebara (Netflix)
Li Deng (Microsoft Research, Redmond)
Stephen Wright (UW-Madison)
Steve Wright is a Professor of Computer Sciences at the University of Wisconsin-Madison. His research interests lie in computational optimization and its applications to science and engineering. Prior to joining UW-Madison in 2001, Wright was a Senior Computer Scientist (1997-2001) and Computer Scientist (1990-1997) at Argonne National Laboratory, and Professor of Computer Science at the University of Chicago (2000-2001). He is the past Chair of the Mathematical Optimization Society (formerly the Mathematical Programming Society), the leading professional society in optimization, and a member of the Board of the Society for Industrial and Applied Mathematics (SIAM). Wright is the author or co-author of four widely used books in numerical optimization, including "Primal-Dual Interior-Point Methods" (SIAM, 1997) and "Numerical Optimization" (with J. Nocedal, Second Edition, Springer, 2006). He has also authored over 85 refereed journal papers on optimization theory, algorithms, software, and applications. He is coauthor of widely used interior-point software for linear and quadratic optimization. His recent research includes algorithms, applications, and theory for sparse optimization (including applications in compressed sensing and machine learning).
Georg Heigold (Google)
Avishy Carmi (NTU)
More from the Same Authors
2022 : BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
Mao Ye · Bo Liu · Stephen Wright · Peter Stone · Qiang Liu
2023 Poster: Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing
Shuyao Li · Yu Cheng · Ilias Diakonikolas · Jelena Diakonikolas · Rong Ge · Stephen Wright
2022 Poster: BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
Bo Liu · Mao Ye · Stephen Wright · Peter Stone · Qiang Liu
2022 Poster: Coordinate Linear Variance Reduction for Generalized Linear Programming
Chaobing Song · Cheuk Yin Lin · Stephen Wright · Jelena Diakonikolas
2020 Poster: Object-Centric Learning with Slot Attention
Francesco Locatello · Dirk Weissenborn · Thomas Unterthiner · Aravindh Mahendran · Georg Heigold · Jakob Uszkoreit · Alexey Dosovitskiy · Thomas Kipf
2020 Spotlight: Object-Centric Learning with Slot Attention
Francesco Locatello · Dirk Weissenborn · Thomas Unterthiner · Aravindh Mahendran · Georg Heigold · Jakob Uszkoreit · Alexey Dosovitskiy · Thomas Kipf
2020 Oral: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Benjamin Recht · Christopher Ré · Stephen Wright · Feng Niu
2019 : Second-order methods for nonconvex optimization with complexity guarantees
Stephen Wright
2019 Poster: A New Distribution on the Simplex with Auto-Encoding Applications
Andrew Stirn · Tony Jebara · David Knowles
2018 Poster: ATOMO: Communication-efficient Learning via Atomic Sparsification
Hongyi Wang · Scott Sievert · Shengchao Liu · Zachary Charles · Dimitris Papailiopoulos · Stephen Wright
2017 Poster: k-Support and Ordered Weighted Sparsity for Overlapping Groups: Hardness and Algorithms
Cong Han Lim · Stephen Wright
2015 Workshop: Learning and privacy with incomplete data and weak supervision
Giorgio Patrini · Tony Jebara · Richard Nock · Dimitrios Kotzias · Felix Xinnan Yu
2015 : Cross-Modality Distant Supervised Learning for Speech, Text, and Image Classification
Li Deng
2015 : Machine Learning For Conversational Systems
Larry Heck · Li Deng · Olivier Pietquin · Tomas Mikolov
2015 Poster: End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture
Jianshu Chen · Ji He · Yelong Shen · Lin Xiao · Xiaodong He · Jianfeng Gao · Xinying Song · Li Deng
2014 Poster: Clamping Variables and Approximate Inference
Adrian Weller · Tony Jebara
2014 Poster: Making Pairwise Binary Graphical Models Attractive
Nicholas Ruozzi · Tony Jebara
2014 Poster: Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems
Cong Han Lim · Stephen Wright
2014 Spotlight: Making Pairwise Binary Graphical Models Attractive
Nicholas Ruozzi · Tony Jebara
2014 Oral: Clamping Variables and Approximate Inference
Adrian Weller · Tony Jebara
2013 Workshop: Output Representation Learning
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser
2013 Poster: A multi-agent control framework for co-adaptation in brain-computer interfaces
Josh S Merel · Roy Fox · Tony Jebara · Liam Paninski
2013 Poster: Adaptive Anonymity via $b$-Matching
Krzysztof M Choromanski · Tony Jebara · Kui Tang
2013 Spotlight: Adaptive Anonymity via $b$-Matching
Krzysztof M Choromanski · Tony Jebara · Kui Tang
2013 Poster: An Approximate, Efficient LP Solver for LP Rounding
Srikrishna Sridhar · Stephen Wright · Christopher Re · Ji Liu · Victor Bittorf · Ce Zhang
2012 Poster: Learning with Recursive Perceptual Representations
Oriol Vinyals · Yangqing Jia · Li Deng · Trevor Darrell
2012 Poster: Majorization for CRFs and Latent Likelihoods
Tony Jebara · Anna Choromanska
2012 Spotlight: Majorization for CRFs and Latent Likelihoods
Tony Jebara · Anna Choromanska
2011 Workshop: Optimization for Machine Learning
Suvrit Sra · Stephen Wright · Sebastian Nowozin
2011 Poster: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Benjamin Recht · Christopher Re · Stephen Wright · Feng Niu
2011 Poster: Variance Penalizing AdaBoost
Pannagadatta K Shivaswamy · Tony Jebara
2011 Poster: Learning a Distance Metric from a Network
Blake Shaw · Bert Huang · Tony Jebara
2010 Workshop: Optimization for Machine Learning
Suvrit Sra · Sebastian Nowozin · Stephen Wright
2010 Tutorial: Optimization Algorithms in Machine Learning
Stephen Wright
2009 Workshop: Deep Learning for Speech Recognition and Related Applications
Li Deng · Dong Yu · Geoffrey E Hinton
2009 Workshop: Optimization for Machine Learning
Sebastian Nowozin · Suvrit Sra · S.V.N Vishwanthan · Stephen Wright
2008 Workshop: Speech and Language: Learning-based Methods and Systems
Xiaodong He · Li Deng
2008 Workshop: Analyzing Graphs: Theory and Applications
Edo M Airoldi · David Blei · Jake M Hofman · Tony Jebara · Eric Xing
2008 Poster: Relative Margin Machines
Pannagadatta K Shivaswamy · Tony Jebara
2008 Session: Oral session 8: Physics and High Order Statistics
Tony Jebara
2007 Poster: Density Estimation under Independent Similarly Distributed Sampling Assumptions
Tony Jebara · Yingbo Song · Kapil Thadani
2007 Spotlight: Density Estimation under Independent Similarly Distributed Sampling Assumptions
Tony Jebara · Yingbo Song · Kapil Thadani
2007 Spotlight: Learning Monotonic Transformations for Classification
Andrew G Howard · Tony Jebara
2007 Poster: Learning Monotonic Transformations for Classification
Andrew G Howard · Tony Jebara
2006 Poster: An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments
Michael Mandel · Daniel P Ellis · Tony Jebara
2006 Poster: Gaussian and Wishart Hyperkernels
Risi Kondor · Tony Jebara