### Spotlight Videos

- Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much
- Deep ADMM-Net for Compressive Sensing MRI
- A scaled Bregman theorem with applications
- On Regularizing Rademacher Observation Losses
- Fast and Provably Good Seedings for k-Means
- Unsupervised Learning for Physical Interaction through Video Prediction
- Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
- Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
- Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition
- Natural-Parameter Networks: A Class of Probabilistic Neural Networks
- SURGE: Surface Regularized Geometry Estimation from a Single Image
- Interpretable Distribution Features with Maximum Testing Power
- Sorting out typicality with the inverse moment matrix SOS polynomial
- CNNpack: Packing Convolutional Neural Networks in the Frequency Domain
- Cooperative Graphical Models
- f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
- Bayesian Optimization for Probabilistic Programs
- Hierarchical Question-Image Co-Attention for Visual Question Answering
- Fairness in Learning: Classic and Contextual Bandits
- DISCO Nets: DISsimilarity COefficients Networks
- Multimodal Residual Learning for Visual QA
- Learning and Forecasting Opinion Dynamics in Social Networks
- Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks
- Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
- Exponential Family Embeddings
- Variational Information Maximization for Feature Selection
- Learning User Perceived Clusters with Feature-Level Supervision
- Residual Networks Behave Like Ensembles of Relatively Shallow Networks
- Adversarial Multiclass Classification: A Risk Minimization Perspective
- Deep Learning without Poor Local Minima
- Optimizing affinity-based binary hashing using auxiliary coordinates
- Double Thompson Sampling for Dueling Bandits
- Computational and Statistical Tradeoffs in Learning to Rank
- Online Convex Optimization with Unconstrained Domains and Losses
- An ensemble diversity approach to supervised binary hashing
- On Explore-Then-Commit strategies
- Sublinear Time Orthogonal Tensor Decomposition
- Dual Learning for Machine Translation
- Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
- Efficient Second Order Online Learning by Sketching
- Distributed Flexible Nonlinear Tensor Factorization
- Even Faster SVD Decomposition Yet Without Agonizing Pain
- A Multi-Batch L-BFGS Method for Machine Learning
- Semiparametric Differential Graph Models
- VIME: Variational Information Maximizing Exploration
- Solving Marginal MAP Problems with NP Oracles and Parity Constraints
- Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
- Dense Associative Memory for Pattern Recognition
- Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$
- What Makes Objects Similar: A Unified Multi-Metric Learning Approach
- Adaptive Maximization of Pointwise Submodular Functions With Budget Constraint
- A Communication-Efficient Parallel Algorithm for Decision Tree
- Convex Two-Layer Modeling with Latent Structure
- Adaptive Concentration Inequalities for Sequential Decision Problems
- Catching heuristics are optimal control policies
- Synthesis of MCMC and Belief Propagation
- Unifying Count-Based Exploration and Intrinsic Motivation
- Large Margin Discriminant Dimensionality Reduction in Prediction Space
- Stochastic Structured Prediction under Bandit Feedback
- SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques
- Adaptive Skills Adaptive Partitions (ASAP)
- Multiple-Play Bandits in the Position-Based Model
- Optimal Black-Box Reductions Between Optimization Objectives
- Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters
- Boosting with Abstention
- Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
- A Credit Assignment Compiler for Joint Prediction
- Consistent Kernel Mean Estimation for Functions of Random Variables
- Variational Inference in Mixed Probabilistic Submodular Models
- Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm
- A Bandit Framework for Strategic Regression
- Low-Rank Regression with Tensor Responses
- PAC-Bayesian Theory Meets Bayesian Inference
- Data Poisoning Attacks on Factorization-Based Collaborative Filtering
- Hierarchical Object Representation for Open-Ended Object Category Learning and Recognition
- Diffusion-Convolutional Neural Networks
- A Probabilistic Programming Approach To Probabilistic Data Analysis
- Learning Structured Sparsity in Deep Neural Networks
- Sample Complexity of Automated Mechanism Design
- Short-Dot: Computing Large Linear Transforms Distributedly Using Coded Short Dot Products
- Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
- Active Learning from Imperfect Labelers
- Learning to Communicate with Deep Multi-Agent Reinforcement Learning
- Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering
- On the Recursive Teaching Dimension of VC Classes
- Learning Multiagent Communication with Backpropagation
- Finding significant combinations of features in the presence of categorical covariates
- Graphons, mergeons, and so on!
- Pruning Random Forests for Prediction on a Budget
- Contextual semibandits via supervised learning oracles
- Deep Learning for Predicting Human Strategic Behavior
- Eliciting Categorical Data for Optimal Aggregation
- Breaking the Bandwidth Barrier: Geometrical Adaptive Entropy Estimation
- Improved Dropout for Shallow and Deep Learning
- Cyclades: Conflict-free Asynchronous Machine Learning
- Single Pass PCA of Matrix Products
- Stochastic Variational Deep Kernel Learning
- Causal meets Submodular: Subset Selection with Directed Information
- Deep Neural Networks with Inexact Matching for Person Re-Identification
- Finite Sample Prediction and Recovery Bounds for Ordinal Embedding
- Using Social Dynamics to Make Individual Predictions: Variational Inference with a Stochastic Kinetic Model
- Object based Scene Representations using Fisher Scores of Local Subspace Projections
- Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
- Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling
- Combinatorial semi-bandit with known covariance
- Adaptive Averaging in Accelerated Descent Dynamics
- Variational Bayes on Monte Carlo Steroids
- Combining Fully Convolutional and Recurrent Neural Networks for 3D Biomedical Image Segmentation
- A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order
- Estimating the Size of a Large Network and its Communities from a Random Sample
- On Robustness of Kernel Clustering
- New Liftable Classes for First-Order Probabilistic Inference
- The Parallel Knowledge Gradient Method for Batch Bayesian Optimization
- Learning shape correspondence with anisotropic convolutional neural networks
- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
- Interpretable Nonlinear Dynamic Modeling of Neural Trajectories
- Search Improves Label for Active Learning
- Leveraging Sparsity for Efficient Submodular Data Summarization
- Linear Contextual Bandits with Knapsacks
- Reconstructing Parameters of Spreading Models from Partial Observations
- RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
- Exact Recovery of Hard Thresholding Pursuit
- Data Programming: Creating Large Training Sets, Quickly
- Dynamic matrix recovery from incomplete observations under an exact low-rank constraint
- Fast Distributed Submodular Cover: Public-Private Data Summarization
- Communication-Optimal Distributed Clustering
- Probing the Compositionality of Intuitive Functions
- Composing graphical models with neural networks for structured representations and fast inference
- Learning Sparse Gaussian Graphical Models with Overlapping Blocks
- Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
- Infinite Hidden Semi-Markov Modulated Interaction Point Process
- Cooperative Inverse Reinforcement Learning
- Spatio-Temporal Hilbert Maps for Continuous Occupancy Representation in Dynamic Environments
- Select-and-Sample for Spike-and-Slab Sparse Coding
- Greedy Feature Construction
- Kernel Observers: Systems-Theoretic Modeling and Inference of Spatiotemporally Evolving Processes
- Quantum Perceptron Models
- Deep Exploration via Bootstrapped DQN
- Convolutional Neural Fabrics
- A Sparse Interactive Model for Matrix Completion with Side Information
- Coresets for Scalable Bayesian Logistic Regression
- Binarized Neural Networks
- Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences
- Learnable Visual Markers
- Learning Deep Embeddings with Histogram Loss
- Spectral Learning of Dynamic Systems from Nonequilibrium Data
- A Minimax Approach to Supervised Learning
- Edge-exchangeable graphs and sparsity
- A Locally Adaptive Normal Distribution
- Completely random measures for modelling block-structured sparse networks
- Neurons Equipped with Intrinsic Plasticity Learn Stimulus Intensity Statistics
- Learning values across many orders of magnitude
- Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
- Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
- Learning Additive Exponential Family Graphical Models via $\ell_{2,1}$-norm Regularized M-Estimation
- A Consistent Regularization Approach for Structured Prediction
- An urn model for majority voting in classification ensembles
- Tagger: Deep Unsupervised Perceptual Grouping
- Interaction Networks for Learning about Objects, Relations and Physics
- Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
- Professor Forcing: A New Algorithm for Training Recurrent Networks
- Learning brain regions via large-scale online structured sparse dictionary learning
- Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods
- Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information
- Learning Parametric Sparse Models for Image Super-Resolution
- Disease Trajectory Maps
- Learning in Games: Robustness of Fast Convergence
- Algorithms and matching lower bounds for approximately-convex optimization
- Neural Universal Discrete Denoiser
- Achieving budget-optimality with adaptive schemes in crowdsourcing
- Supervised Word Mover's Distance
- Full-Capacity Unitary Recurrent Neural Networks
- k*-Nearest Neighbors: From Global to Local
- A Bayesian method for reducing bias in neural representational similarity analysis
- Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
- Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain
- Optimal Binary Classifier Aggregation for General Losses
- A primal-dual method for conic constrained distributed optimization problems