# Downloads

Number of events: 2416

- $\alpha$-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression
- $(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations
- $\texttt{LeadCache}$: Regret-Optimal Caching in Networks
- 2nd Workshop on Self-Supervised Learning: Theory and Practice
- 3DP3: 3D Scene Perception via Probabilistic Programming
- 3D Pose Transfer with Correspondence Learning and Mesh Refinement
- 3D Siamese Voxel-to-BEV Tracker for Sparse Point Clouds
- 4th Robot Learning Workshop: Self-Supervised and Lifelong Learning
- 5th Workshop on Meta-Learning
- A$^2$-Net: Learning Attribute-Aware Hash Codes for Large-Scale Fine-Grained Image Retrieval
- A 3D Generative Model for Structure-Based Drug Design
- A Bayesian-Symbolic Approach to Reasoning and Learning in Intuitive Physics
- ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning
- A Biased Graph Neural Network Sampler with Near-Optimal Regret
- A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs
- A/B/n Testing with Control in the Presence of Subpopulations
- Absolute Neighbour Difference based Correlation Test for Detecting Heteroscedastic Relationships
- A/B Testing for Recommender Systems in a Two-sided Marketplace
- A Causal Lens for Controllable Text Generation
- Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
- Accelerating Quadratic Optimization with Reinforcement Learning
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives
- Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning
- Accumulative Poisoning Attacks on Real-time Data
- Accurately Solving Rod Dynamics with Graph Learning
- Accurate Point Cloud Registration with Robust Optimal Transport
- AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks
- A Central Limit Theorem for Differentially Private Query Answering
- AC-GC: Lossy Activation Compression with Guaranteed Convergence
- Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning
- Achieving Rotational Invariance with Bessel-Convolutional Neural Networks
- A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms
- A Compositional Atlas of Tractable Circuit Operations for Probabilistic Inference
- A Comprehensively Tight Analysis of Gradient Descent for PCA
- A Computationally Efficient Method for Learning Exponential Family Distributions
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
- A Constant Approximation Algorithm for Sequential Random-Order No-Substitution k-Median Clustering
- A Continuous Mapping For Augmentation Design
- A Contrastive Learning Approach for Training Variational Autoencoder Priors
- A Convergence Analysis of Gradient Descent on Graph Neural Networks
- A Conversation on Human and Machine Intelligence
- A Critical Look at the Consistency of Causal Estimation with Deep Latent Variable Models
- Across-animal odor decoding by probabilistic manifold alignment
- Action-guided 3D Human Motion Prediction
- Activation Sharing with Asymmetric Paths Solves Weight Transport Problem without Bidirectional Connection
- Active 3D Shape Reconstruction from Vision and Touch
- Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations
- Active clustering for labeling training data
- Active Learning of Convex Halfspaces on Graphs
- Actively Identifying Causal Effects with Latent Variables Given Only Response Variable Observable
- Active Offline Policy Selection
- Adaptable Agent Populations via a Generative Model of Policies
- Adapting to function difficulty and growth conditions in private optimization
- Adaptive Conformal Inference Under Distribution Shift
- Adaptive Data Augmentation on Temporal Graphs
- Adaptive Denoising via GainTuning
- Adaptive Diffusion in Graph Neural Networks
- Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback
- Adaptive First-Order Methods Revisited: Convex Minimization without Lipschitz Requirements
- Adaptive Machine Unlearning
- Adaptive Online Packing-guided Search for POMDPs
- Adaptive Proximal Gradient Methods for Structured Neural Networks
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift
- Adaptive Sampling for Minimax Fair Classification
- Adaptive wavelet distillation from neural networks through interpretations
- Adder Attention for Vision Transformer
- Addressing Algorithmic Disparity and Performance Inconsistency in Federated Learning
- Adjusting for Autocorrelated Errors in Neural Networks for Time Series
- A Domain-Shrinking based Bayesian Optimization Algorithm with Order-Optimal Regret Performance
- Advances in Programming Languages and Neurosymbolic Systems (AIPLANS)
- Adversarial Attack Generation Empowered by Min-Max Optimization
- Adversarial Attacks on Black Box Video Classifiers: Leveraging the Power of Geometric Transformations
- Adversarial Attacks on Graph Classifiers via Bayesian Optimisation
- Adversarial Examples for k-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams
- Adversarial Examples in Multi-Layer Random ReLU Networks
- Adversarial Examples Make Strong Poisons
- Adversarial Feature Desensitization
- Adversarial Graph Augmentation to Improve Graph Contrastive Learning
- Adversarial Intrinsic Motivation for Reinforcement Learning
- Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions
- Adversarially Robust Change Point Detection
- Adversarially robust learning for security-constrained optimal power flow
- Adversarial Neuron Pruning Purifies Backdoored Deep Models
- Adversarial Regression with Doubly Non-negative Weighting Matrices
- Adversarial Reweighting for Partial Domain Adaptation
- Adversarial Robustness of Streaming Algorithms through Importance Sampling
- Adversarial Robustness with Non-uniform Perturbations
- Adversarial Robustness without Adversarial Training: A Teacher-Guided Curriculum Learning Approach
- Adversarial Robustness with Semi-Infinite Constrained Learning
- Adversarial Teacher-Student Representation Learning for Domain Generalization
- Adversarial Training Helps Transfer Learning via Better Representations
- A Faster Decentralized Algorithm for Nonconvex Minimax Problems
- A Faster Maximum Cardinality Matching Algorithm with Applications in Machine Learning
- AFEC: Active Forgetting of Negative Transfer in Continual Learning
- A first-order primal-dual method with adaptivity to local smoothness
- A flow-based latent state generative model of neural population responses to natural images
- A Framework to Learn with Interpretation
- A Gang of Adversarial Bandits
- A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning
- A generative nonparametric Bayesian model for whole genomes
- Agent Modelling under Partial Observability for Deep Reinforcement Learning
- A Geometric Analysis of Neural Collapse with Unconstrained Features
- A Geometric Perspective towards Neural Calibration via Sensitivity Decomposition
- A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast
- Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations
- A Gradient Method for Multilevel Optimization
- A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems
- A Highly-Efficient Group Elastic Net Algorithm with an Application to Function-On-Scalar Regression
- AI for Credible Elections: A Call to Action
- AI for Science: Mind the Gaps
- A Journey Through the Opportunity of Low Resourced Natural Language Processing — An African Lens
- A Kernel-based Test of Independence for Cluster-correlated Data
- A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning
- Algorithmic Fairness through the lens of Causality and Robustness
- Algorithmic Instabilities of Accelerated Gradient Descent
- Algorithmic stability and generalization of an unsupervised feature selection algorithm
- Alias-Free Generative Adversarial Networks
- Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
- Aligned Structured Sparsity Learning for Efficient Image Super-Resolution
- Aligning Pretraining for Detection via Object-Level Contrastive Learning
- Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery
- Alignment Attention by Matching Key and Query Distributions
- A Little Robustness Goes a Long Way: Leveraging Robust Features for Targeted Transfer Attacks
- All Tokens Matter: Token Labeling for Training Better Vision Transformers
- (Almost) Free Incentivized Exploration from Decentralized Learning Agents
- A Mathematical Framework for Quantifying Transferability in Multi-source Transfer Learning
- A Max-Min Entropy Framework for Reinforcement Learning
- A mechanistic multi-area recurrent network model of decision-making
- A Minimalist Approach to Offline Reinforcement Learning
- Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate
- Amortized Variational Inference for Simple Hierarchical Models
- A Multi-Implicit Neural Representation for Fonts
- Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
- Analysis of one-hidden-layer neural networks via the resolvent method
- Analysis of Sensing Spectral for Signal Recovery under a Generalized Linear Model
- Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems
- Analytic Insights into Structure and Rank of Neural Network Hessian Maps
- Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
- Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation
- Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels
- An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
- An analysis of Ermakov-Zolotukhin quadrature using kernels
- An Axiomatic Theory of Provably-Fair Welfare-Centric Machine Learning
- A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models
- A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum
- An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints
- An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning
- An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers
- An Empirical Study of Adder Neural Networks for Object Detection
- A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose
- An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning
- An Exact Characterization of the Generalization Error for the Gibbs Algorithm
- An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks
- An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap
- An Image is Worth More Than a Thousand Words: Towards Disentanglement in The Wild
- An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders
- An Improved Analysis of Gradient Tracking for Decentralized Machine Learning
- An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence
- An Information-theoretic Approach to Distribution Shifts
- A No-go Theorem for Robust Acceleration in the Hyperbolic Plane
- A Non-commutative Extension of Lee-Seung's Algorithm for Positive Semidefinite Factorizations
- An Online Method for A Class of Distributionally Robust Optimization with Non-convex Objectives
- An online passive-aggressive algorithm for difference-of-squares classification
- An Online Riemannian PCA for Stochastic Canonical Correlation Analysis
- A nonparametric method for gradual change problems with statistical guarantees
- A Normative and Biologically Plausible Algorithm for Independent Component Analysis
- A Note on Sparse Generalized Eigenvalue Problem
- A novel notion of barycenter for probability distributions based on optimal weak mass transport
- Answering Complex Causal Queries With the Maximum Causal Set Effect
- Anti-Backdoor Learning: Training Clean Models on Poisoned Data
- Antipodes of Label Differential Privacy: PATE and ALIBI
- An Uncertainty Principle is a Price of Privacy-Preserving Microdata
- A PAC-Bayes Analysis of Adversarial Robustness
- Approximate Decomposable Submodular Function Minimization for Cardinality-Based Components
- Approximate optimization of convex functions with outlier noise
- Approximating the Permanent with Deep Rejection Sampling
- A Probabilistic State Space Model for Joint Inference from Differential Equations and Data
- A Prototype-Oriented Framework for Unsupervised Domain Adaptation
- A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning
- A Provably Efficient Sample Collection Strategy for Reinforcement Learning
- Arbitrary Conditional Distributions with Energy
- A Regression Approach to Learning-Augmented Online Algorithms
- Are My Deep Learning Systems Fair? An Empirical Study of Fixed-Seed Training
- Are Transformers more robust than CNNs?
- argmax centroid
- Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions
- Artistic Style Transfer with Internal-external Learning and Contrastive Learning
- A sampling-based circuit for optimal decision making
- A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
- A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks
- A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis
- A single gradient step finds adversarial examples on random two-layers neural networks
- ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning
- Assessing Fairness in the Presence of Missing Data
- Associating Objects with Transformers for Video Object Segmentation
- Associative Memories via Predictive Coding
- A Stochastic Newton Algorithm for Distributed Convex Optimization
- A Surrogate Objective Framework for Prediction+Programming with Soft Constraints
- Asymptotically Best Causal Effect Identification with Multi-Armed Bandits
- Asymptotically Exact Error Characterization of Offline Policy Evaluation with Misspecified Linear Models
- Asymptotics of representation learning in finite Bayesian neural networks
- Asymptotics of the Bootstrap via Stability with Applications to Inference with Model Selection
- Asynchronous Decentralized Online Learning
- Asynchronous Decentralized SGD with Quantized and Local Updates
- Asynchronous Stochastic Optimization Robust to Arbitrary Delays
- A Theoretical Analysis of Fine-tuning with Linear Teachers
- A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning
- A Theory of the Distortion-Perception Tradeoff in Wasserstein Space
- ATISS: Autoregressive Transformers for Indoor Scene Synthesis
- A Topological Perspective on Causal Inference
- A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration
- Attention Approximates Sparse Distributed Memory
- Attention Bottlenecks for Multimodal Fusion
- Attention over Learned Object Embeddings Enables Complex Visual Reasoning
- Auditing Black-Box Prediction Models for Data Minimization Compliance
- AugMax: Adversarial Composition of Random Augmentations for Robust Training
- Augmented Shortcuts for Vision Transformers
- A Unified Approach to Fair Online Learning via Blackwell Approachability
- A unified framework for bandit multiple testing
- A Unified View of cGANs with and without Classifiers
- A Universal Law of Robustness via Isoperimetry
- A universal probabilistic spike count model reveals ongoing modulation of neural variability
- Autobahn: Automorphism-based Graph Neural Nets
- AutoBalance: Optimized Loss Functions for Imbalanced Data
- Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation
- Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
- AutoGEL: An Automated Graph Neural Network with Explicit Link Information
- Automated Discovery of Adaptive Attacks on Adversarial Defenses
- Automated Dynamic Mechanism Design
- Automatic and Harmless Regularization with Constrained and Lexicographic Optimization: A Dynamic Barrier Approach
- Automatic Data Augmentation for Generalization in Reinforcement Learning
- Automatic Symmetry Discovery with Lie Algebra Convolutional Network
- Automatic Unsupervised Outlier Model Selection
- Automorphic Equivalence-aware Graph Neural Network
- Autonomous Reinforcement Learning via Subgoal Curricula
- A variational approximate posterior for the deep Wishart process
- A Variational Perspective on Diffusion-Based Generative Models and Score Matching
- Average-Reward Learning and Planning with Options
- Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent
- A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness
- Baby Intuitions Benchmark (BIB): Discerning the goals, preferences, and actions of others
- Backdoor Attack with Imperceptible Input and Latent Modification
- Backward-Compatible Prediction Updates: A Probabilistic Approach
- Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
- Bandit Learning with Delayed Impact of Actions
- Bandit Phase Retrieval
- Bandit Quickest Changepoint Detection
- Bandits with Knapsacks beyond the Worst Case
- Bandits with many optimal arms
- BARTScore: Evaluating Generated Text as Text Generation
- BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain
- Batch Active Learning at Scale
- Batched Thompson Sampling
- Batch Multi-Fidelity Bayesian Optimization with Deep Auto-Regressive Networks
- Batch Normalization Orthogonalizes Representations in Deep Random Networks
- BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
- Bayesian Adaptation for Covariate Shift
- Bayesian Bellman Operators
- Bayesian decision-making under misspecified priors with applications to meta-learning
- Bayesian Deep Learning
- Bayesian Optimization of Function Networks
- Bayesian Optimization with High-Dimensional Outputs
- BayesIMP: Uncertainty Quantification for Causal Data Fusion
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery
- BCORLE($\lambda$): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market
- Be Confident! Towards Trustworthy Graph Neural Networks via Confidence Calibration
- Behavior From the Void: Unsupervised Active Pre-Training
- Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
- Bellman-consistent Pessimism for Offline Reinforcement Learning
- Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
- Beltrami Flow and Neural Diffusion on Graphs
- Benign Overfitting
- Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
- BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation
- Best-case lower bounds in online learning
- Best of Both Worlds: Practical and Theoretically Optimal Submodular Maximization in Parallel
- Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Robustness Verification
- Better Algorithms for Individually Fair $k$-Clustering
- Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training
- Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game
- Beyond Bandit Feedback in Online Multiclass Classification
- Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning
- Beyond Fairness in Machine Learning
- Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification
- Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation
- Beyond the Signs: Nonparametric Tensor Completion via Sign Series
- Beyond Tikhonov: faster learning with self-concordant losses, via iterative regularization
- Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning
- Bias and variance of the Bayesian-mean decoder
- Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models
- Biological learning in key-value memory networks
- Black Box Probabilistic Numerics
- BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation
- Blending Anti-Aliasing into Vision Transformer
- BNS: Building Network Structures Dynamically for Continual Learning
- Boosted CVaR Classification
- Boosting with Multiple Sources
- Boost Neural Networks by Checkpoints
- Bootstrapping the Error of Oja's Algorithm
- Bootstrap Your Object Detector via Mixed Training
- BooVAE: Boosting Approach for Continual Learning of VAE
- BooVI: Provably Efficient Bootstrapped Value Iteration
- Bounds all around: training energy-based models with bidirectional bounds
- Breaking the centralized barrier for cross-device federated learning
- Breaking the Dilemma of Medical Image-to-image Translation
- Breaking the Linear Iteration Cost Barrier for Some Well-known Conditional Gradient Methods Using MaxIP Data-structures
- Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs
- Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
- Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning
- Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators
- Bridging Non Co-occurrence with Unlabeled In-the-wild Data for Incremental Object Detection
- Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
- Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning
- Bridging the Gap: from Machine Learning Research to Clinical Practice
- Bridging the Imitation Gap by Adaptive Insubordination
- Bubblewrap: Online tiling and real-time flow prediction on neural manifolds
- BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining
- ByPE-VAE: Bayesian Pseudocoresets Exemplar VAE
- CAFE: Catastrophic Data Leakage in Vertical Federated Learning
- Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration
- Calibration and Consistency of Adversarial Surrogate Losses
- CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks
- Can contrastive learning avoid shortcut solutions?
- Can fMRI reveal the representation of syntactic structure in the brain?
- Can Information Flows Suggest Targets for Interventions in Neural Circuits?
- CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
- Can Less be More? When Increasing-to-Balancing Label Noise Rates Considered Beneficial
- Can multi-label classification networks know what they don’t know?
- Canonical Capsules: Self-Supervised Capsules in Canonical Pose
- Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
- Can we have it all? On the Trade-off between Spatial and Adversarial Robustness of Neural Networks
- Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks
- Capacity and Bias of Learned Geometric Embeddings for Directed Graphs
- CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
- Capturing implicit hierarchical structure in 3D biomedical images with self-supervised hyperbolic representations
- Cardinality constrained submodular maximization for random streams
- Cardinality-Regularized Hawkes-Granger Model
- CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator
- Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication
- Catch-A-Waveform: Learning to Generate Audio from a Single Short Example
- CATs: Cost Aggregation Transformers for Visual Correspondence
- Causal Abstractions of Neural Networks
- Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data
- Causal Bandits with Unknown Graph Structure
- Causal Effect Inference for Structured Treatments
- Causal Identification with Matrix Equations
- Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice
- Causal Inference for Event Pairs in Multivariate Point Processes
- Causal Inference & Machine Learning: Why now?
- Causal Influence Detection for Improving Efficiency in Reinforcement Learning
- Causal Navigation by Continuous-time Neural Networks
- CBP: backpropagation with constraint on weight precision using a pseudo-Lagrange multiplier method
- CCVS: Context-aware Controllable Video Synthesis
- Celebrating Diversity in Shared Multi-Agent Reinforcement Learning
- Center Smoothing: Certified Robustness for Networks with Structured Outputs
- CentripetalText: An Efficient Text Instance Representation for Scene Text Detection
- Certifying Robustness to Programmable Data Bias in Decision Trees
- Challenges and Opportunities in High Dimensional Variational Inference
- Change Point Detection via Multivariate Singular Spectrum Analysis
- Channel Permutations for N:M Sparsity
- Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning
- Characterizing possible failure modes in physics-informed neural networks
- Characterizing the risk of fairwashing
- Charting and Navigating the Space of Solutions for Recurrent Neural Networks
- Chasing Sparsity in Vision Transformers: An End-to-End Exploration
- Chebyshev-Cantelli PAC-Bayes-Bennett Inequality for the Weighted Majority Vote
- CHIP: CHannel Independence-based Pruning for Compact Neural Networks
- Choose a Transformer: Fourier or Galerkin
- Circa: Stochastic ReLUs for Private Deep Learning
- Class-agnostic Reconstruction of Dynamic Objects from Videos
- Class-Disentanglement and Applications in Adversarial Detection and Defense
- Class-Incremental Learning via Dual Augmentation
- CLDA: Contrastive Learning for Semi-Supervised Domain Adaptation
- CLIP-It! Language-Guided Video Summarization
- Clockwork Variational Autoencoders
- Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems
- Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation
- Clustering Effect of Adversarial Robust Models
- Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning
- Coarse-to-fine Animal Pose and Shape Estimation
- CoAtNet: Marrying Convolution and Attention for All Data Sizes
- Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
- Co-evolution Transformer for Protein Contact Prediction
- CoFiNet: Reliable Coarse-to-fine Correspondences for Robust PointCloud Registration
- CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions
- CogView: Mastering Text-to-Image Generation via Transformers
- COHESIV: Contrastive Object and Hand Embedding Segmentation In Video
- Collaborating with Humans without Human Data
- Collaborative Causal Discovery with Atomic Interventions
- Collaborative Learning in the Jungle (Decentralized, Byzantine, Heterogeneous, Asynchronous and Nonconvex Learning)
- Collaborative Uncertainty in Multi-Agent Trajectory Forecasting
- Collapsed Variational Bounds for Bayesian Neural Networks
- Combating Noise: Semi-supervised Learning by Region Uncertainty Quantification
- Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach
- Combinatorial Pure Exploration with Bottleneck Reward Function
- Combiner: Full Attention Transformer with Sparse Computation Cost
- Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration
- Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces
- Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers
- COMBO: Conservative Offline Model-Based Policy Optimization
- Communication-efficient SGD: From Local SGD to One-Shot Averaging
- Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
- Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization
- Compositional Modeling of Nonlinear Dynamical Systems with ODE-based Random Features
- Compositional Reinforcement Learning from Logical Specifications
- Compositional Transformers for Scene Generation
- Comprehensive Knowledge Distillation with Causal Intervention
- Compressed Video Contrastive Learning
- Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
- Compressive Visual Representations
- Computer-Aided Design as Language
- Concentration inequalities under sub-Gaussian and sub-exponential conditions
- Conditional Generation Using Polynomial Expansions
- Conditionally Parameterized, Discretization-Aware Neural Networks for Mesh-Based Modeling of Physical Systems
- Conditioning Sparse Variational Gaussian Processes for Online Decision-making
- ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs
- Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality
- Confident Anchor-Induced Multi-Source Free Domain Adaptation
- Conflict-Averse Gradient Descent for Multi-task learning
- Conformal Bayesian Computation
- Conformal Prediction using Conditional Histograms
- Conformal Time-series Forecasting
- Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving
- Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
- Conservative Offline Distributional Reinforcement Learning
- Consistency Regularization for Variational Auto-Encoders
- Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers
- Consistent Non-Parametric Methods for Maximizing Robustness
- Constrained Optimization to Train Neural Networks on Critical and Under-Represented Classes
- Constrained Robust Submodular Partitioning
- Constrained Two-step Look-Ahead Bayesian Optimization
- Container: Context Aggregation Networks
- Contextual Recommendations and Low-Regret Cutting-Plane Algorithms
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking
- Continual Auxiliary Task Learning
- Continual Learning via Local Module Composition
- Continual World: A Robotic Benchmark For Continual Reinforcement Learning
- Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms
- Continuous Doubly Constrained Batch Reinforcement Learning
- Continuous Latent Process Flows
- Continuous Mean-Covariance Bandits
- Continuous-time edge modelling using non-parametric point processes
- Continuous vs. Discrete Optimization of Deep Neural Networks
- Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing
- Contrastive Active Inference
- Contrastive Graph Poisson Networks: Semi-Supervised Learning with Extremely Limited Labels
- Contrastive Laplacian Eigenmaps
- Contrastive Learning for Neural Topic Model
- Contrastive Learning of Global and Local Video Representations
- Contrastively Disentangled Sequential Variational Autoencoder
- Contrastive Reinforcement Learning of Symbolic Reasoning Domains
- Controllable and Compositional Generation with Latent-Space Energy-Based Models
- Controlled Text Generation as Continuous Optimization with Multiple Constraints
- Controlling Neural Networks with Rule Representations
- Control Variates for Slate Off-Policy Evaluation
- Convergence and Alignment of Gradient Descent with Random Backpropagation Weights
- Convergence of adaptive algorithms for constrained weakly convex optimization
- Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance
- Convex-Concave Min-Max Stackelberg Games
- Convex Polytope Trees
- Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training
- Cooperative AI
- Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback
- Coordinated Proximal Policy Optimization
- CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum
- Coresets for Classification – Simplified and Strengthened
- Coresets for Clustering with Missing Values
- Coresets for Decision Trees of Signals
- Coresets for Time Series Clustering
- Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering Communities
- Corruption Robust Active Learning
- CorticalFlow: A Diffeomorphic Mesh Transformer Network for Cortical Surface Reconstruction
- Cortico-cerebellar networks as decoupling neural interfaces
- Counterbalancing Learning and Strategic Incentives in Allocation Markets
- Counterexample Guided RL Policy Refinement Using Bayesian Optimization
- Counterfactual Explanations Can Be Manipulated
- Counterfactual Explanations in Sequential Decision Making Under Uncertainty
- Counterfactual Invariance to Spurious Correlations in Text Classification
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks
- Coupled Gradient Estimators for Discrete Latent Variables
- Coupled Segmentation and Edge Learning via Dynamic Graph Propagation
- Covariance-Aware Private Mean Estimation Without Private Covariance Estimation
- Credal Self-Supervised Learning
- Credit Assignment in Neural Networks through Deep Feedback Control
- Credit Assignment Through Broadcasting a Global Error Vector
- CROCS: Clustering and Retrieval of Cardiac Signals Based on Patient Disease Class, Sex, and Age
- Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
- Cross-view Geo-localization with Layer-to-Layer Transformer
- CrypTen: Secure Multi-Party Computation Meets Machine Learning
- CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation
- CtrlGen: Controllable Generative Modeling in Language and Vision
- Curriculum Design for Teaching via Demonstrations: Theory and Applications
- Curriculum Disentangled Recommendation with Noisy Multi-feedback
- Curriculum Learning for Vision-and-Language Navigation
- Curriculum Offline Imitating Learning
- Cycle Self-Training for Domain Adaptation
- D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation
- Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization
- Dangers of Bayesian Model Averaging under Covariate Shift
- Data Augmentation Can Improve Robustness
- Databases and AI (DBAI)
- Data Centric AI
- Data driven semi-supervised learning
- Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective
- Data-Efficient Instance Generation from Instance Discrimination
- Dataset Distillation with Infinitely Wide Convolutional Networks
- Data Sharing and Compression for Cooperative Networked Control
- Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification
- Debiased Visual Question Answering from Feature and Sample Perspectives
- DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks
- Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
- Decentralized Learning in Online Queuing Systems
- Decentralized Q-learning in Zero-sum Markov Games
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Deconditional Downscaling with Gaussian Processes
- Deconvolutional Networks on Graph Data
- Decoupling the Depth and Scope of Graph Neural Networks
- Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP
- Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks
- Deep Conditional Gaussian Mixture Model for Constrained Clustering
- Deep Contextual Video Compression
- Deep Explicit Duration Switching Models for Time Series
- Deep Extended Hazard Models for Survival Analysis
- Deep Extrapolation for Attribute-Enhanced Generation
- DeepGEM: Generalized Expectation-Maximization for Blind Inversion
- Deep Generative Models and Downstream Applications
- Deep inference of latent dynamics with spatio-temporal super-resolution using selective backpropagation through time
- Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings
- Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space
- Deep Learning on a Data Diet: Finding Important Examples Early in Training
- Deep Learning Through the Lens of Example Difficulty
- Deep Learning with Label Differential Privacy
- Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis
- Deep Markov Factor Analysis: Towards Concurrent Temporal and Spatial Analysis of fMRI Data
- Deep Molecular Representation Learning via Fusing Physical and Chemical Information
- Deep Networks Provably Classify Data on Curves
- Deep Neural Networks as Point Estimates for Deep Gaussian Processes
- Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation
- DeepReduce: A Sparse-tensor Communication Framework for Federated Deep Learning
- Deep Reinforcement Learning
- Deep Reinforcement Learning at the Edge of the Statistical Precipice
- Deep Residual Learning in Spiking Neural Networks
- Deep Self-Dissimilarities as Powerful Visual Fingerprints
- DeepSITH: Efficient Learning via Decomposition of What and When Across Time Scales
- Deep Synoptic Monte-Carlo Planning in Reconnaissance Blind Chess
- Deformable Butterfly: A Highly Structured and Sparse Linear Transform
- Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning
- Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
- Demonstrations 1
- Demonstrations 2
- Demonstrations 3
- Demonstrations 4
- Demystifying and Generalizing BinaryConnect
- Denoising Normalizing Flow
- Dense Keypoints via Multiview Supervision
- Densely connected normalizing flows
- Dense Unsupervised Learning for Video Segmentation
- Deployable Decision Making in Embodied Systems (DDM)
- De-randomizing MCMC dynamics with the diffusion Stein operator
- Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
- Designing Counterfactual Generators using Deep Model Inversion
- Design of Experiments for Stochastic Contextual Linear Bandits
- Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning
- Detecting Anomalous Event Sequences with Temporal Point Processes
- Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles
- Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess
- Detecting Moments and Highlights in Videos via Natural Language Queries
- Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD
- DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer
- DiBS: Differentiable Bayesian Structure Learning
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
- Differentiable Equilibrium Computation with Decision Diagrams for Stackelberg Models of Combinatorial Congestion Games
- Differentiable Learning Under Triage
- Differentiable Multiple Shooting Layers
- Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs
- Differentiable Programming Workshop
- Differentiable Quality Diversity
- Differentiable rendering with perturbed optimizers
- Differentiable Simulation of Soft Multi-body Systems
- Differentiable Spike: Rethinking Gradient-Descent for Training Spiking Neural Networks
- Differentiable Spline Approximations
- Differentiable Synthesis of Program Architectures
- Differentiable Unsupervised Feature Selection based on a Gated Laplacian
- Differentially Private Empirical Risk Minimization under the Fairness Lens
- Differentially Private Federated Bayesian Optimization with Distributed Exploration
- Differentially Private Learning with Adaptive Clipping
- Differentially Private Model Personalization
- Differentially Private Multi-Armed Bandits in the Shuffle Model
- Differentially Private n-gram Extraction
- Differentially Private Sampling from Distributions
- Differentially Private Stochastic Optimization: New Results in Convex and Non-Convex Settings
- Differential Privacy Dynamics of Langevin Diffusion and Noisy Gradient Descent
- Differential Privacy Over Riemannian Manifolds
- Diffusion Models Beat GANs on Image Synthesis
- Diffusion Normalizing Flow
- Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling
- Dimensionality Reduction for Wasserstein Barycenter
- Dimension-free empirical entropy estimation
- Directed Graph Contrastive Learning
- Directed Probabilistic Watershed
- Directed Spectrum Measures Improve Latent Network Models Of Neural Populations
- Directional Message Passing on Molecular Graphs via Synthetic Coordinates
- Direct Multi-view Multi-person 3D Pose Estimation
- Dirichlet Energy Constrained Learning for Deep Graph Neural Networks
- Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation
- Discovering and Achieving Goals via World Models
- Discovering Dynamic Salient Regions for Spatio-Temporal Graph Neural Networks
- Discovery of Options via Meta-Learned Subgoals
- Discrete-Valued Neural Communication
- Disentangled Contrastive Learning on Graphs
- Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA
- Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect
- Disrupting Deep Uncertainty Estimation Without Harming Accuracy
- Dissecting the Diffusion Process in Linear Graph Convolutional Networks
- Distilling Image Classifiers in Object Detectors
- Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media
- Distilling Object Detectors with Feature Richness
- Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck
- Distributed Deep Learning In Open Collaborations
- Distributed Estimation with Multiple Samples per User: Sharp Rates and Phase Transition
- Distributed Machine Learning with Sparse Heterogeneous Data
- Distributed Principal Component Analysis with Limited Communication
- Distributed Saddle-Point Problems Under Data Similarity
- Distributed Zero-Order Optimization under Adversarial Noise
- Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models
- Distributionally Robust Imitation Learning
- Distributional Reinforcement Learning for Multi-Dimensional Reward Functions
- Distribution-free inference for regression: discrete, continuous, and in between
- Distribution shifts: connecting methods and applications (DistShift)
- Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals
- Diverse Message Passing for Attribute with Heterophily
- Diversity Enhanced Active Learning with Strictly Proper Scoring Rules
- Diversity Matters When Learning From Ensembles
- DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel
- DOBF: A Deobfuscation Pre-Training Objective for Programming Languages
- DOCTOR: A Simple Method for Detecting Misclassification Errors
- Do Different Tracking Tasks Require Different Appearance Models?
- Does enforcing fairness mitigate biases caused by subpopulation shift?
- Does Knowledge Distillation Really Work?
- Does Preprocessing Help Training Over-parameterized Neural Networks?
- Do Input Gradients Highlight Discriminative Features?
- Domain Adaptation with Invariant Representation Learning: What Transformations to Learn?
- Domain Invariant Representation Learning with Domain Density Transformations
- DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks
- Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark
- Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence
- Do Transformers Really Perform Badly for Graph Representation?
- Double/Debiased Machine Learning for Dynamic Treatment Effects
- Double Machine Learning Density Estimation for Local Treatment Effects with Instruments
- Doubly Robust Thompson Sampling with Linear Payoffs
- Do Vision Transformers See Like Convolutional Neural Networks?
- Do We Know How to Estimate the Mean?
- Do Wider Neural Networks Really Help Adversarial Robustness?
- DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples
- Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks
- DRIVE: One-bit Distributed Mean Estimation
- Dr Jekyll & Mr Hyde: the strange case of off-policy policy updates
- DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
- DRONE: Data-aware Low-rank Compression for Large NLP Models
- Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers
- DropGNN: Random Dropouts Increase the Expressiveness of Graph Neural Networks
- Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity
- DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
- Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions
- DualNet: Continual Learning, Fast and Slow
- Dual Parameterization of Sparse Variational Gaussian Processes
- Dual Progressive Prototype Network for Generalized Zero-Shot Learning
- Dual-stream Network for Visual Recognition
- Dueling Bandits with Adversarial Sleeping
- Dueling Bandits with Team Comparisons
- Duplex Sequence-to-Sequence Learning for Reversible Machine Translation
- Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
- Dynamical Wasserstein Barycenters for Time-series Modeling
- Dynamic Analysis of Higher-Order Coordination in Neuronal Assemblies via De-Sparsified Orthogonal Matching Pursuit
- Dynamic Bottleneck for Robust Self-Supervised Exploration
- Dynamic Causal Bayesian Optimization
- Dynamic COVID risk assessment accounting for community virus exposure from a spatial-temporal transmission model
- Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
- Dynamic Grained Encoder for Vision Transformers
- Dynamic Inference with Neural Interpreters
- Dynamic influence maximization
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation
- Dynamic Normalization and Relay for Video Action Recognition
- Dynamic population-based meta-learning for multi-agent communication with natural language
- Dynamic Resolution Network
- Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares
- Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models
- Dynamics-regulated kinematic policy for egocentric pose estimation
- Dynamic Trace Estimation
- Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
- DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
- Early Convolutions Help Transformers See Better
- Early-stopped neural networks are consistent
- Ecological Theory of Reinforcement Learning: How Does Task Design Influence Agent Learning?
- EDGE: Explaining Deep Reinforcement Learning Policies
- Edge Representation Learning with Hypergraphs
- EditGAN: High-Precision Semantic Image Editing
- Editing a classifier by rewriting its prediction rules
- EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
- Effective Meta-Regularization by Kernelized Proximal Regularization
- Efficient Active Learning for Gaussian Process Classification by Error Reduction
- Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations
- Efficient and Accurate Gradients for Neural SDEs
- Efficient and Local Parallel Random Walks
- Efficient Bayesian network structure learning via local Markov boundary search
- Efficient Combination of Rematerialization and Offloading for Training DNNs
- Efficient constrained sampling via the mirror-Langevin algorithm
- Efficient Equivariant Network
- Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination
- Efficient Generalization with Distributionally Robust Learning
- Efficient hierarchical Bayesian inference for spatio-temporal regression models in neuroimaging
- Efficient Learning of Discrete-Continuous Computation Graphs
- Efficiently Identifying Task Groupings for Multi-Task Learning
- Efficiently Learning One Hidden Layer ReLU Networks From Queries
- Efficient methods for Gaussian Markov random fields under sparse linear constraints
- Efficient Mirror Descent Ascent Methods for Nonsmooth Minimax Problems
- Efficient Natural Language and Speech Processing (Models, Training, and Inference)
- Efficient Neural Network Training via Forward and Backward Propagation Sparsification
- Efficient Online Estimation of Causal Effects by Deciding What to Observe
- Efficient Statistical Assessment of Neural Network Corruption Robustness
- Efficient Training of Retrieval Models using Negative Cache
- Efficient Training of Visual Transformers with Small Datasets
- Efficient Truncated Linear Regression with Unknown Noise Variance
- EIGNN: Efficient Infinite-Depth Graph Neural Networks
- ELLA: Exploration through Learned Language Abstraction
- Embedding Principle of Loss Landscape of Deep Neural Networks
- Emergent Communication of Generalizations
- Emergent Communication under Varying Sizes and Connectivities
- Emergent Discrete Communication in Semantic Spaces
- Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization
- Encoding Robustness to Image Style via Adversarial Feature Perturbations
- Encoding Spatial Distribution of Convolutional Features for Texture Representation
- End-to-end Multi-modal Video Temporal Grounding
- End-to-end reconstruction meets data-driven regularization for inverse problems
- End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
- End-to-End Weak Supervision
- E(n) Equivariant Normalizing Flows
- Ensembling Graph Predictions for AMR Parsing
- Entropic Desired Dynamics for Intrinsic Control
- Entropy-based adaptive Hamiltonian Monte Carlo
- Environment Generation for Zero-Shot Compositional Reinforcement Learning
- Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration
- Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines
- Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium
- Equivariant Manifold Flows
- Error Compensated Distributed SGD Can Be Accelerated
- ErrorCompensatedX: error compensation for variance reduced algorithms
- Escape saddle points by a simple gradient-descent based algorithm
- Escaping Saddle Points with Compressed SGD
- Estimating High Order Gradients of the Data Distribution by Denoising
- Estimating Multi-cause Treatment Effects via Single-cause Perturbation
- Estimating the Long-Term Effects of Novel Treatments
- Estimating the Unique Information of Continuous Variables
- Evaluating Efficient Performance Estimators of Neural Architectures
- Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
- Evaluating model performance under worst-case subpopulations
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality
- Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
- Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
- Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
- EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization
- Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots
- Exact marginal prior distributions of finite Bayesian neural networks
- Exact Privacy Guarantees for Markov Chain Implementations of the Exponential Mechanism with Artificial Atoms
- Excess Capacity and Backdoor Poisoning
- eXplainable AI approaches for debugging and diagnosis
- Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning
- Explaining heterogeneity in medial entorhinal cortex with task-driven neural networks
- Explaining Hyperparameter Optimization via Partial Dependence Plots
- Explaining Latent Representations with a Corpus of Examples
- Explanation-based Data Augmentation for Image Classification
- Explicable Reward Design for Reinforcement Learning Agents
- Explicit loss asymptotics in the gradient descent training of neural networks
- Exploiting a Zoo of Checkpoints for Unseen Tasks
- Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions
- Exploiting Data Sparsity in Secure Cross-Platform Social Recommendation
- Exploiting Domain-Specific Features to Enhance Domain Generalization
- Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive Sample Size Approach
- Exploiting Opponents Under Utility Constraints in Sequential Games
- Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation
- Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality
- Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks
- Exploring Cross-Video and Cross-Modality Signals for Weakly-Supervised Audio-Visual Video Parsing
- Exploring Forensic Dental Identification with Deep Learning
- Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling
- Exploring the Limits of Out-of-Distribution Detection
- Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
- Exponential Graph is Provably Efficient for Decentralized Deep Training
- Exponential Separation between Two Learning Models and Adversarial Robustness
- Extending Lagrangian and Hamiltonian Neural Networks with Differentiable Contact Models
- Extracting Deformation-Aware Local Features by Learning to Deform
- FACMAC: Factored Multi-Agent Centralised Policy Gradients
- Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs
- Fair Algorithms for Multi-Agent Multi-Armed Bandits
- Fair Classification with Adversarial Perturbations
- Fair Clustering Under a Bounded Cost
- Fair Exploration via Axiomatic Bargaining
- Fairness in Ranking under Uncertainty
- Fairness via Representation Neutralization
- Fair Scheduling for Time-dependent Resources
- Fair Sequential Selection Using Supervised Learning Models
- Fair Sortition Made Transparent
- Fair Sparse Regression with Clustering: An Invex Relaxation for a Combinatorial Problem
- Fast Abductive Learning by Similarity-based Consistency Optimization
- Fast Algorithms for $L_\infty$-constrained S-rectangular Robust MDPs
- Fast and accurate randomized algorithms for low-rank tensor decompositions
- Fast and Memory Efficient Differentially Private-SGD via JL Projections
- Fast Approximate Dynamic Programming for Infinite-Horizon Markov Decision Processes
- Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections
- Fast Axiomatic Attribution for Neural Networks
- Fast Bayesian Inference for Gaussian Cox Processes via Path Integral Formulation
- Fast Certified Robust Training with Short Warmup
- FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
- Fast Doubly-Adaptive MCMC to Estimate the Gibbs Partition Function with Weak Mixing Time Bounds
- Faster Algorithms and Constant Lower Bounds for the Worst-Case Expected Error
- Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data
- Faster Matchings via Learned Duals
- Faster Neural Network Training with Approximate Tensor Operations
- Faster Non-asymptotic Convergence for Double Q-learning
- Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods
- Fast Extra Gradient Methods for Smooth Structured Nonconvex-Nonconcave Minimax Problems
- Fast Federated Learning in the Presence of Arbitrary Device Unavailability
- Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints
- Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification
- Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization
- Fast Projection onto the Capped Simplex with Applications to Sparse Regression in Bioinformatics
- Fast Pure Exploration via Frank-Wolfe
- Fast rates for prediction with limited expert advice
- Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights
- Fast Training Method for Stochastic Compositional Optimization Problems
- Fast Training of Neural Lumigraph Representations using Meta Learning
- Fast Tucker Rank Reduction for Non-Negative Tensors Using Mean-Field Approximation
- Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee
- FedDR – Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization
- Federated-EM with heterogeneity mitigation and variance reduction
- Federated Graph Classification over Non-IID Graphs
- Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing
- Federated Linear Contextual Bandits
- Federated Multi-Task Learning under a Mixture of Distributions
- Federated Reconstruction: Partially Local Federated Learning
- Federated Split Task-Agnostic Vision Transformer for COVID-19 CXR Diagnosis
- Few-Round Learning for Federated Learning
- Few-Shot Data-Driven Algorithms for Low Rank Approximation
- Few-Shot Object Detection via Association and DIscrimination
- Few-Shot Segmentation via Cycle-Consistent Transformer
- Finding Bipartite Components in Hypergraphs
- Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
- Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks
- Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance
- Fine-grained Generalization Analysis of Inductive Matrix Completion
- Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information
- Fine-Grained Zero-Shot Learning with DNA as Side Information
- FINE Samples for Learning with Noisy Labels
- Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning
- Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
- Fitting summary statistics of neural data with a differentiable spiking network simulator
- Fixes That Fail: Self-Defeating Improvements in Machine-Learning Systems
- FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout
- Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
- Flexible Option Learning
- FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling
- FLEX: Unifying Evaluation for Few-Shot NLP
- Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
- FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective
- FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
- Focal Attention for Long-Range Interactions in Vision Transformers
- For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets
- Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations
- Formalizing the Generalization-Forgetting Trade-off in Continual Learning
- Forster Decomposition and Learning Halfspaces with Noise
- Foundations of Symbolic Languages for Model Interpretability
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
- Framing RNN as a kernel method: A neural ODE approach
- From Canonical Correlation Analysis to Self-supervised Graph Neural Networks
- From global to local MDI variable importances for random forests and when they are Shapley values
- From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits
- Functionally Regionalized Knowledge Transfer for Low-resource Drug Discovery
- Functional Neural Networks for Parametric Image Restoration Problems
- Functional Regularization for Reinforcement Learning via Learned Fourier Features
- Functional Variational Inference based on Stochastic Process Generators
- Fuzzy Clustering with Similarity Queries
- Garment4D: Garment Reconstruction from Point Cloud Sequences
- Gauge Equivariant Transformer
- Gaussian Kernel Mixture Network for Single Image Defocus Deblurring
- GemNet: Universal Directional Graph Neural Networks for Molecules
- Gender, Allyship & Public Interest Technology
- Generalizable Imitation Learning from Observation via Inferring Goal Proximity
- Generalizable Multi-linear Attention Network
- Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic
- Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis
- Generalization Bounds for Meta-Learning via PAC-Bayes and Uniform Stability
- Generalization Bounds for (Wasserstein) Robust Optimization
- Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime
- Generalization Guarantee of SGD for Pairwise Learning
- Generalization of Model-Agnostic Meta-Learning Algorithms: Recurring and Unseen Tasks
- Generalized and Discriminative Few-Shot Object Detection via SVD-Dictionary Enhancement
- Generalized DataWeighting via Class-Level Gradient Manipulation
- Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks
- Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels
- Generalized Linear Bandits with Local Differential Privacy
- Generalized Proximal Policy Optimization with Sample Reuse
- Generalized Shape Metrics on Neural Representations
- General Low-rank Matrix Optimization: Geometric Analysis and Sharper Bounds
- General Nonlinearities in SO(2)-Equivariant CNNs
- Generating High-Quality Explanations for Navigation in Partially-Revealed Environments
- Generative Occupancy Fields for 3D Surface-Aware Image Synthesis
- Generative vs. Discriminative: Rethinking The Meta-Continual Learning
- Generic Neural Architecture Search via Regression
- GENESIS-V2: Inferring Unordered Object Representations without Iterative Refinement
- Geometry Processing with Neural Fields
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles
- Glance-and-Gaze Vision Transformer
- Global-aware Beam Search for Neural Abstractive Summarization
- Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
- Global Convergence of Online Optimization for Nonlinear Model Predictive Control
- Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games
- Global Filter Networks for Image Classification
- Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation
- Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
- Gone Fishing: Neural Active Learning with Fisher Embeddings
- Good Classification Measures and How to Find Them
- G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators
- Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation
- Gradient-based Editing of Memory Examples for Online Task-free Continual Learning
- Gradient-based Hyperparameter Optimization Over Long Horizons
- Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
- Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning
- Gradient-Free Adversarial Training Against Image Corruption for Learning-based Steering
- Gradient Inversion with Generative Image Prior
- Gradient Starvation: A Learning Proclivity in Neural Networks
- GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
- Gradual Domain Adaptation without Indexed Intermediate Domains
- Grammar-Based Grounded Lexicon Learning
- Graph Adversarial Self-Supervised Learning
- Graph Differentiable Architecture Search with Structure Learning
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph
- Graphical Models in Heavy-Tailed Markets
- Graph Neural Networks with Adaptive Residual
- Graph Neural Networks with Local Graph Parameters
- Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification
- Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence
- Greedy Approximation Algorithms for Active Sequential Hypothesis Testing
- GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction
- Grounding inductive biases in natural images: invariance stems from variations in data
- Grounding Representation Similarity Through Statistical Testing
- Grounding Spatio-Temporal Language with Transformers
- Group Equivariant Subsampling
- Habitat 2.0: Training Home Assistants to Rearrange their Habitat
- Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling
- Handling Long-tailed Feature Distribution in AdderNets
- Hard-Attention for Scalable Image Classification
- Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning
- Hash Layers For Large Sparse Models
- Heavy Ball Momentum for Conditional Gradient
- Heavy Ball Neural Ordinary Differential Equations
- Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
- Hessian Eigenspectra of More Realistic Nonlinear Models
- Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization
- Heuristic-Guided Reinforcement Learning
- Hierarchical Clustering: $O(1)$-Approximation for Well-Clustered Graphs
- Hierarchical Reinforcement Learning with Timed Subgoals
- Hierarchical Skills for Efficient Exploration
- Higher Order Kernel Mean Embeddings to Capture Filtrations of Stochastic Processes
- High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
- High Probability Complexity Bounds for Line Search Based on Stochastic Oracles
- Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL
- History Aware Multimodal Transformer for Vision-and-Language Navigation
- Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
- H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion
- HNPE: Leveraging Global Parameters for Neural Posterior Estimation
- How can classical multidimensional scaling go wrong?
- How Data Augmentation affects Optimization for Linear Regression
- How does a Neural Network's Architecture Impact its Robustness to Noisy Labels?
- How Does it Sound?
- How Duolingo Uses AI to Assess, Engage and Teach Better
- How Fine-Tuning Allows for Effective Meta-Learning
- How Modular should Neural Module Networks Be for Systematic Generalization?
- How Powerful are Performance Predictors in Neural Architecture Search?
- How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?
- How Tight Can PAC-Bayes be in the Small Data Regime?
- How to transfer algorithmic reasoning knowledge to learn new algorithms?
- How Well do Feature Visualizations Support Causal Understanding of CNN Activations?
- HRFormer: High-Resolution Vision Transformer for Dense Prediction
- HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
- Human-Adversarial Visual Question Answering
- Human Centered AI
- Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits
- Hyperbolic Busemann Learning with Ideal Prototypes
- Hyperbolic Procrustes Analysis Using Riemannian Geometry
- Hypergraph Propagation and Community Selection for Objects Retrieval
- Hyperparameter Optimization Is Deceiving Us, and How to Stop It
- Hyperparameter Tuning is All You Need for LISTA
- HyperSPNs: Compact and Expressive Probabilistic Circuits
- IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers
- Identifiability in inverse reinforcement learning
- Identifiable Generative models for Missing Not at Random Data Imputation
- Identification and Estimation of Joint Probabilities of Potential Outcomes in Observational Studies with Covariate Information
- Identification of Partially Observed Linear Causal Models: Graphical Conditions for the Non-Gaussian and Heterogeneous Cases
- Identification of the Generalized Condorcet Winner in Multi-dueling Bandits
- Identifying and Benchmarking Natural Out-of-Context Prediction Problems
- Identity testing for Mallows model
- iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder
- ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis
- Image Generation using Continuous Filter Atoms
- ImageNet: Past, Present, and Future
- Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
- Imitation with Neural Density Models
- Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
- Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods
- Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
- Implicit Generative Copulas
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
- Implicit Regularization in Matrix Sensing via Mirror Descent
- Implicit Semantic Response Alignment for Partial Domain Adaptation
- Implicit Sparse Regularization: The Impact of Depth and Early Stopping
- Implicit SVD for Graph Representation Learning
- Implicit Task-Driven Probability Discrepancy Measure for Unsupervised Domain Adaptation
- Implicit Transformer Network for Screen Content Image Continuous Super-Resolution
- Impression learning: Online representation learning with synaptic plasticity
- Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction
- Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces
- Improved Guarantees for Offline Stochastic Matching via new Ordered Contention Resolution Schemes
- Improved Learning Rates of a Functional Lasso-type SVM with Sparse Multi-Kernel Representation
- Improved Regret Bounds for Tracking Experts with Memory
- Improved Regularization and Robustness for Fine-tuning in Neural Networks
- Improved Transformer for High-Resolution GANs
- Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
- Improving Anytime Prediction with Parallel Cascaded Networks and a Temporal-Difference Loss
- Improving black-box optimization in VAE latent space using decoder uncertainty
- Improving Calibration through the Relationship with Adversarial Robustness
- Improving Coherence and Consistency in Neural Sequence Models with Dual-System, Neuro-Symbolic Reasoning
- Improving Compositionality of Neural Networks by Decoding Representations to Inputs
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings
- Improving Conditional Coverage via Orthogonal Quantile Regression
- Improving Contrastive Learning on Imbalanced Data via Open-World Sampling
- Improving Deep Learning Interpretability by Saliency Guided Training
- Improving Generalization in Meta-RL with Imaginary Tasks from Latent Dynamics Mixture
- Improving Robustness using Generated Data
- Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration
- Improving Transferability of Representations via Augmentation-Aware Self-Supervision
- Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers
- Increasing Liquid State Machine Performance with Edge-of-Chaos Dynamics Organized by Astrocyte-modulated Plasticity
- Independent mechanism analysis, a new concept?
- Independent Prototype Propagation for Zero-Shot Compositionality
- Indexed Minimum Empirical Divergence for Unimodal Bandits
- INDIGO: GNN-Based Inductive Knowledge Graph Completion Using Pair-Wise Encoding
- Individual Privacy Accounting via a Rényi Filter
- Infinite Time Horizon Safety of Bayesian Neural Networks
- Influence Patterns for Explaining Information Flow in BERT
- InfoGCL: Information-Aware Graph Contrastive Learning
- Information-constrained optimization: can adaptive processing of gradients help?
- Information Directed Reward Learning for Reinforcement Learning
- Information Directed Sampling for Sparse Linear Bandits
- Information is Power: Intrinsic Control via Information Capture
- Information-theoretic generalization bounds for black-box learning algorithms
- Instance-Conditional Knowledge Distillation for Object Detection
- Instance-Conditioned GAN
- Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with Error Certificates
- Instance-dependent Label-noise Learning under a Structural Causal Model
- Instance-Dependent Partial Label Learning
- Instance-optimal Mean Estimation Under Differential Privacy
- Integrated Latent Heterogeneity and Invariance Learning in Kernel Space
- Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression
- Integrating Tree Path in Transformer for Code Representation
- Interactive Label Cleaning with Example-based Explanations
- Interesting Object, Curious Agent: Learning Task-Agnostic Exploration
- Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning
- Interpolation can hurt robust generalization even when there is no noise
- Interpretable agent communication from scratch (with a generic visual processor emerging on the side)
- Interpreting Representation Quality of DNNs for 3D Point Cloud Processing
- Interventional Sum-Product Networks: Causal Inference with Tractable Probabilistic Models
- Intriguing Properties of Contrastive Losses
- Intriguing Properties of Vision Transformers
- Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
- Introspective Distillation for Robust Question Answering
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
- Invariant Causal Imitation Learning for Generalizable Policies
- Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System
- Inverse Problems Leveraging Pre-trained Contrastive Representations
- Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees
- Inverse-Weighted Survival Games
- Invertible DenseNets with Concatenated LipSwish
- Invertible Tabular GANs: Killing Two Birds with One Stone for Tabular Data Synthesis
- IQ-Learn: Inverse soft-Q Learning for Imitation
- IRM—when it works and when it doesn't: A test case of natural language inference
- Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence
- Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies
- Ising Model Selection Using $\ell_{1}$-Regularized Linear Regression: A Statistical Mechanics Analysis
- I (Still) Can't Believe It's Not Better: A workshop for "beautiful" ideas that "should" have worked
- Iterative Amortized Policy Optimization
- Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias
- Iterative Connecting Probability Estimation for Networks
- Iteratively Reweighted Least Squares for Basis Pursuit with Global Linear Convergence Rate
- Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods
- Iterative Teacher-Aware Learning
- Iterative Teaching by Label Synthesis
- It Has Potential: Gradient-Driven Denoisers for Convergent Solutions to Inverse Problems
- Joint inference and input optimization in equilibrium networks
- Joint Inference for Neural Network Depth and Dropout Regularization
- Joint Modeling of Visual Objects and Relations for Scene Graph Generation
- Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection
- KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support
- Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
- Kernel Functional Optimisation
- Kernel Identification Through Transformers
- K-level Reasoning for Zero-Shot Coordination in Hanabi
- K-Net: Towards Unified Image Segmentation
- Knowledge-Adaptation Priors
- Knowledge-inspired 3D Scene Graph Prediction in Point Cloud
- KS-GNN: Keywords Search over Incomplete Graphs via Graphs Neural Network
- L2ight: Enabling On-Chip Learning for Optical Neural Networks via Efficient in-situ Subspace Optimization
- Label consistency in overfitted generalized $k$-means
- Label Disentanglement in Partition-based Extreme Multilabel Classification
- Label-Imbalanced and Group-Sensitive Classification under Overparameterization
- Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning
- Label Noise SGD Provably Prefers Flat Global Minimizers
- LADA: Look-Ahead Data Acquisition via Augmentation for Deep Active Learning
- Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning
- Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision
- Landscape analysis of an improved power method for tensor decomposition
- Language models enable zero-shot prediction of the effects of mutations on protein function
- Laplace Redux - Effortless Bayesian Deep Learning
- Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods
- Large-Scale Learning with Fourier Features and Tensor Decompositions
- Large-Scale Unsupervised Object Discovery
- Large-Scale Wasserstein Gradient Flows
- Last-iterate Convergence in Extensive-Form Games
- Last iterate convergence of SGD for Least-Squares in the Interpolation regime
- Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages
- Latent Matters: Learning Deep State-Space Models
- Lattice partition recovery with dyadic CART
- LEADS: Learning Dynamical Systems that Generalize Across Environments
- Learnability of Linear Thresholds from Label Proportions
- Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding
- Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection
- Learning 3D Dense Correspondence via Canonical Point Autoencoder
- Learning and Decision-Making with Strategic Feedback (StratML)
- Learning and Generalization in RNNs
- Learning a Single Neuron with Bias Using Gradient Descent
- Learning-Augmented Dynamic Power Management with Multiple States via New Ski Rental Bounds
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations
- Learning Causal Semantic Representation for Out-of-Distribution Prediction
- Learning Collaborative Policies to Solve NP-hard Routing Problems
- Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)
- Learning Conjoint Attentions for Graph Neural Nets
- Learning curves of generic features maps for realistic datasets with a teacher-student model
- Learning Debiased and Disentangled Representations for Semantic Segmentation
- Learning Debiased Representation via Disentangled Feature Augmentation
- Learning Disentangled Behavior Embeddings
- Learning Distilled Collaboration Graph for Multi-Agent Perception
- Learning Diverse Policies in MOBA Games via Macro-Goals
- Learning Domain Invariant Representations in Goal-conditioned Block MDPs
- Learning Dynamic Graph Representation of Brain Connectome with Spatio-Temporal Attention
- Learning Equilibria in Matching Markets from Bandit Feedback
- Learning Equivariant Energy Based Models with Equivariant Stein Variational Gradient Descent
- Learning Fast-Inference Bayesian Networks
- Learning Frequency Domain Approximation for Binary Neural Networks
- Learning from Inside: Self-driven Siamese Sampling and Reasoning for Video Question Answering
- Learning Gaussian Mixtures with Generalized Linear Models: Precise Asymptotics in High-dimensions
- Learning Generalized Gumbel-max Causal Mechanisms
- Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction
- Learning Graph Cellular Automata
- Learning Graph Models for Retrosynthesis Prediction
- Learning Hard Optimization Problems: A Data Generation Perspective
- Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence
- Learning in Multi-Stage Decentralized Matching Markets
- Learning in Non-Cooperative Configurable Markov Decision Processes
- Learning in Presence of Strategic Behavior
- Learning interaction rules from multi-animal trajectories via augmented behavioral models
- Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach
- Learning in two-player zero-sum partially observable Markov games with perfect recall
- Learning Knowledge Graph-based World Models of Textual Environments
- Learning Large Neighborhood Search Policy for Integer Programming
- Learning latent causal graphs via mixture oracles
- Learning Markov State Abstractions for Deep Reinforcement Learning
- Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning
- Learning Meaningful Representations of Life (LMRL)
- Learning Models for Actionable Recourse
- Learning Nonparametric Volterra Kernels with Gaussian Processes
- Learning One Representation to Optimize All Rewards
- Learning on Random Balls is Sufficient for Estimating (Some) Graph Parameters
- Learning Optimal Predictive Checklists
- Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs
- Learning Riemannian metric for disease progression modeling
- Learning Robust Hierarchical Patterns of Human Brain across Many fMRI Studies
- Learning rule influences recurrent network representations but not attractor structure in decision-making tasks
- Learning Semantic Representations to Verify Hardware Designs
- Learning Signal-Agnostic Manifolds of Neural Fields
- Learning Space Partitions for Path Planning
- Learning Stable Deep Dynamics Models for Partially Observed or Delayed Dynamical Systems
- Learning State Representations from Random Deep Action-conditional Predictions
- Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound
- Learning Student-Friendly Teacher Networks for Knowledge Distillation
- Learning the optimal Tikhonov regularizer for inverse problems
- Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks
- Learning to Adapt via Latent Domains for Adaptive Semantic Segmentation
- Learning to Assimilate in Chaotic Dynamical Systems
- Learning to Combine Per-Example Solutions for Neural Program Synthesis
- Learning to Compose Visual Relations
- Learning to dehaze with polarization
- Learning to delegate for large-scale vehicle routing
- Learning to Draw: Emergent Communication through Sketching
- Learning to Elect
- Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics
- Learning to Generate Realistic Noisy Images via Pixel-level Noise-aware Adversarial Training
- Learning to Generate Visual Questions with Noisy Supervision
- Learning to Ground Multi-Agent Communication with Autoencoders
- Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
- Learning to Learn Dense Gaussian Processes for Few-Shot Learning
- Learning to Learn Graph Topologies
- Learning-to-learn non-convex piecewise-Lipschitz functions
- Learning to Predict Trustworthiness with Steep Slope Loss
- Learning to Schedule Heuristics in Branch and Bound
- Learning to See by Looking at Noise
- Learning to Select Exogenous Events for Marked Temporal Point Process
- Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization
- Learning to Synthesize Programs as Interpretable and Generalizable Policies
- Learning to Time-Decode in Spiking Neural Networks Through the Information Bottleneck
- Learning Transferable Adversarial Perturbations
- Learning Transferable Features for Point Cloud Detection via 3D Contrastive Co-training
- Learning Treatment Effects in Panels with General Intervention Patterns
- Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning
- Learning where to learn: Gradient sparsity in meta and continual learning
- Learning with Algorithmic Supervision via Continuous Relaxations
- Learning with Holographic Reduced Representations
- Learning with Labeling Induced Abstentions
- Learning with Noisy Correspondence for Cross-modal Matching
- Learning with User-Level Privacy
- Least Square Calibration for Peer Reviews
- Leveraging Distribution Alignment via Stein Path for Cross-Domain Cold-Start Recommendation
- Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces
- Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds
- Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation
- Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning
- Lifelong Domain Adaptation via Consolidated Internal Distribution
- Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering
- Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training
- Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters
- Linear Convergence in Federated Learning: Tackling Client Heterogeneity and Sparse Gradients
- Linear Convergence of Gradient Methods for Estimating Structured Transition Matrices in High-dimensional Vector Autoregressive Models
- Linear-Time Probabilistic Solution of Boundary Value Problems
- Lip to Speech Synthesis with Visual Context Attentional GAN
- List-Decodable Mean Estimation in Nearly-PCA Time
- Littlestone Classes are Privately Online Learnable
- LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes
- Local Differential Privacy for Regret Minimization in Reinforcement Learning
- Local Disentanglement in Variational Auto-Encoders Using Jacobian $L_1$ Regularization
- Local Explanation of Dialogue Response Generation
- Local Hyper-Flow Diffusion
- Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
- Locality Sensitive Teaching
- Localization, Convexity, and Star Aggregation
- Localization with Sampling-Argmax
- Locally differentially private estimation of functionals of discrete distributions
- Locally Most Powerful Bayesian Test for Out-of-Distribution Detection using Deep Generative Models
- Locally private online change point detection
- Locally Valid and Discriminative Prediction Intervals for Deep Learning Models
- Local plasticity rules can learn deep representations using self-supervised contrastive predictions
- Local policy search with Bayesian optimization
- Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels
- Logarithmic Regret from Sublinear Hints
- Logarithmic Regret in Feature-based Dynamic Pricing
- Long Short-Term Transformer for Online Action Detection
- Long-Short Transformer: Efficient Transformers for Language and Vision
- Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis
- Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos
- Looking Beyond Single Images for Contrastive Semantic Segmentation Learning
- Loss function based second-order Jensen inequality and its application to particle variational inference
- Lossy Compression for Lossless Prediction
- Low-dimensional Structure in the Space of Language Representations is Reflected in Brain Responses
- Lower and Upper Bounds on the Pseudo-Dimension of Tensor Network Models
- Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks
- Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions
- Low-Fidelity Video Encoder Optimization for Temporal Action Localization
- Low-Rank Constraints for Fast Inference in Structured Models
- Low-Rank Extragradient Method for Nonsmooth and Low-Rank Matrix Optimization Problems
- Low-Rank Subspaces in GANs
- LSH-SMILE: Locality Sensitive Hashing Accelerated Simulation and Learning
- Luna: Linear Unified Nested Attention
- Machine Learning and Statistics for Climate Science
- Machine Learning and the Physical Sciences
- Machine Learning for Autonomous Driving
- Machine Learning for Creativity and Design
- Machine Learning for the Developing World (ML4D): Global Challenges
- Machine Learning for Variance Reduction in Online Experiments
- Machine learning from ground truth: New medical imaging datasets for unsolved medical problems
- Machine Learning in Public Health
- Machine Learning in Structural Biology
- Machine Learning Meets Econometrics (MLECON)
- Machine learning structure preserving brackets for forecasting irreversible processes
- Machine Learning With Quantum Computers
- Machine versus Human Attention in Deep Reinforcement Learning Tasks
- MADE: Exploration via Maximizing Deviation from Explored Regions
- MagNet: A Neural Network for Directed Graphs
- Make Sure You're Unsure: A Framework for Verifying Probabilistic Specifications
- Making a (Counterfactual) Difference One Rationale at a Time
- Making the most of your day: online learning for optimal allocation of time
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds
- Manipulating SGD with Data Ordering Attacks
- MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents
- Marginalised Gaussian Processes with Nested Sampling
- Margin-Independent Online Multiclass Learning via Convex Geometry
- MarioNette: Self-Supervised Sprite Learning
- Mastering Atari Games with Limited Data
- Matching a Desired Causal State via Shift Interventions
- Math AI for Education (MATHAI4ED): Bridging the Gap Between Research and Smart Education
- Matrix encoding networks for neural combinatorial optimization
- Matrix factorisation and the interpretation of geodesic distance
- MAU: A Motion-Aware Unit for Video Prediction and Beyond
- MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
- Maximum Likelihood Training of Score-Based Diffusion Models
- MCMC Variational Inference via Uncorrected Hamiltonian Annealing
- Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination
- Meaning in Context: Pragmatic Communication in Humans and Machines
- Measuring Generalization with Optimal Transport
- Medical Dead-ends and Learning to Identify High-Risk States and Treatments
- Medical Imaging meets NeurIPS
- Memory-Efficient Approximation Algorithms for Max-k-Cut and Correlation Clustering
- Memory Efficient Meta-Learning with Large Images
- Memory-efficient Patch-based Inference for Tiny Deep Learning
- MERLOT: Multimodal Neural Script Knowledge Models
- Message Passing In Machine Learning
- MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge
- Meta-Adaptive Nonlinear Control: Theory and Algorithms
- MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images
- Metacognition in the Age of AI: Challenges and Opportunities
- Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models
- Meta Internal Learning
- Meta Learning Backpropagation And Improving It
- Meta-Learning for Relative Density-Ratio Estimation
- Meta-Learning Reliable Priors in the Function Space
- Meta-Learning Sparse Implicit Neural Representations
- Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks
- Meta-learning to Improve Pre-training
- Meta-learning with an Adaptive Task Scheduler
- Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data
- Metropolis-Hastings Data Augmentation for Graph Neural Networks
- M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
- MICo: Improved representations via sampling-based state similarity for Markov decision processes
- Mind the Gap: Assessing Temporal Generalization in Neural Language Models
- Minibatch and Momentum Model-based Methods for Stochastic Weakly Convex Optimization
- Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding
- Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers
- Minimax Regret for Stochastic Shortest Path
- Minimizing Polarization and Disagreement in Social Networks via Link Recommendation
- Mining the Benefits of Two-stage and One-stage HOI Detection
- MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms
- Mirror Langevin Monte Carlo: the Case Under Isoperimetry
- Misspecified Gaussian Process Bandit Optimization
- Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage
- Mitigating Forgetting in Online Continual Learning with Neuron Calibration
- Mixability made efficient: Fast online multiclass logistic regression
- MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps
- Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity
- MixSeq: Connecting Macroscopic Time Series Forecasting with Microscopic Time Series Data
- Mixture Proportion Estimation and PU Learning: A Modern Approach
- Mixture weights optimisation for Alpha-Divergence Variational Inference
- ML for Physics and Physics for ML
- ML For Systems
- MLP-Mixer: An all-MLP Architecture for Vision
- MobILE: Model-Based Imitation Learning From Observation Alone
- MobTCast: Leveraging Auxiliary Trajectory Forecasting for Human Mobility Prediction
- Modality-Agnostic Topology Aware Localization
- Model Adaptation: Historical Contrastive Learning for Unsupervised Domain Adaptation without Source Data
- Model-Based Domain Generalization
- Model-Based Episodic Memory Induces Dynamic Hybrid Controls
- Model-Based Reinforcement Learning via Imagination with Derived Memory
- Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones
- Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model
- Model Selection for Bayesian Autoencoders
- Modified Frank Wolfe in Probability Space
- Modular Gaussian Processes for Transfer Learning
- MOMA: Multi-Object Multi-Actor Activity Parsing
- Momentum Centering and Asynchronous Update for Adaptive Gradient Methods
- Monte Carlo Tree Search With Iteratively Refining State Abstractions
- Moiré Attack (MA): A New Potential Risk of Screen Photos
- Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data
- Moser Flow: Divergence-based Generative Modeling on Manifolds
- Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
- Motif-based Graph Self-Supervised Learning for Molecular Property Prediction
- MST: Masked Self-Supervised Transformer for Visual Representation
- Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks
- Multi-Agent Reinforcement Learning in Stochastic Networked Systems
- Multi-armed Bandit Requiring Monotone Arm Sequences
- Multi-Armed Bandits with Bounded Arm-Memory: Near-Optimal Guarantees for Best-Arm Identification and Regret Minimization
- Multiclass Boosting and the Cost of Weak Learning
- Multiclass versus Binary Differentially Private PAC Learning
- Multi-Facet Clustering Variational Autoencoders
- Multi-Label Learning with Pairwise Relevance Ordering
- Multilingual Pre-training with Universal Dependency Learning
- Multimodal and Multilingual Embeddings for Large-Scale Speech Mining
- Multi-modal Dependency Tree for Video Captioning
- Multimodal Few-Shot Learning with Frozen Language Models
- Multimodal Virtual Point 3D Detection
- Multi-Objective Meta Learning
- Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs
- Multi-Person 3D Motion Prediction with Multi-Range Transformers
- Multiple Descent: Design Your Own Generalization Curve
- Multi-Scale Representation Learning on Proteins
- Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs
- Multi-task Learning of Order-Consistent Causal Graphs
- Multi-view Contrastive Graph Clustering
- Multi-View Representation Learning via Total Correlation Objective
- Multiwavelet-based Operator Learning for Differential Equations
- NAS-Bench-x11 and the Power of Learning Curves
- Natural continual learning: success is a journey, not (just) a destination
- Navigating to the Best Policy in Markov Decision Processes
- Nearly Horizon-Free Offline Reinforcement Learning
- Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
- Nearly-Tight and Oblivious Algorithms for Explainable Clustering
- Near-Optimal Lower Bounds For Convex Optimization For All Orders of Smoothness
- Near-Optimal Multi-Perturbation Experimental Design for Causal Structure Learning
- Near-Optimal No-Regret Learning in General Games
- Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
- Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
- Near Optimal Policy Optimization via REPS
- Necessary and sufficient graphical conditions for optimal adjustment sets in causal graphical models with hidden variables
- Neighborhood Reconstructing Autoencoders
- Neo-GNNs: Neighborhood Overlap-aware Graph Neural Networks for Link Prediction
- NEO: Non Equilibrium Sampling on the Orbits of a Deterministic Transform
- NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild
- NeRV: Neural Representations for Videos
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments
- Nested Graph Neural Networks
- Nested Variational Inference
- Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization
- Neural Active Learning with Performance Guarantees
- Neural Additive Models: Interpretable Machine Learning with Neural Nets
- Neural Algorithmic Reasoners are Implicit Planners
- Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
- Neural Architecture Dilation for Adversarial Robustness
- Neural Auto-Curricula in Two-Player Zero-Sum Games
- Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction
- Neural Bootstrapper
- Neural Circuit Synthesis from Specification Patterns
- Neural Distance Embeddings for Biological Sequences
- Neural Dubber: Dubbing for Videos According to Scripts
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift
- Neural Flows: Efficient Alternative to Neural ODEs
- Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering
- Neural Hybrid Automata: Learning Dynamics With Multiple Modes and Stochastic Transitions
- Neural optimal feedback control with local learning rules
- Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition
- Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception
- Neural Production Systems
- Neural Program Generation Modulo Static Analysis
- Neural Pseudo-Label Optimism for the Bank Loan Problem
- Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex
- Neural Relightable Participating Media Rendering
- Neural Routing by Memory
- Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation
- Neural Scene Flow Prior
- Neural Symplectic Form: Learning Hamiltonian Equations on General Coordinate Systems
- Neural Tangent Kernel Maximum Mean Discrepancy
- Neural Trees for Learning on Graphs
- Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose
- NeuroLKH: Combining Deep Learning Model with Lin-Kernighan-Helsgaun Heuristic for Solving the Traveling Salesman Problem
- NeuroMLR: Robust & Reliable Route Recommendation on Road Networks
- NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL
- NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
- Never Go Full Batch (in Stochastic Convex Optimization)
- New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership
- Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update
- NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs
- Node Dependent Local Smoothing for Scalable Graph Learning
- Noether Networks: meta-learning useful conserved quantities
- Noether’s Learning Dynamics: Role of Symmetry Breaking in Neural Networks
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data
- Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Clean Images
- Noisy Adaptation Generates Lévy Flights in Attractor Neural Networks
- Noisy Recurrent Neural Networks
- Non-approximate Inference for Collective Graphical Models on Path Graphs via Discrete Difference of Convex Algorithm
- Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
- Non-asymptotic convergence bounds for Wasserstein approximation using point clouds
- Non-asymptotic Error Bounds for Bidirectional GANs
- Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis
- Non-Gaussian Gaussian Processes for Few-Shot Regression
- Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation
- Nonparametric estimation of continuous DPPs with kernel methods
- Nonsmooth Implicit Differentiation for Machine-Learning and Optimization
- Nonuniform Negative Sampling and Log Odds Correction with Rare Events Data
- No-Press Diplomacy from Scratch
- No-regret Online Learning over Riemannian Manifolds
- No Regrets for Learning the Prior in Bandits
- NORESQA: A Framework for Speech Quality Assessment using Non-Matching References
- No RL, No Simulation: Learning to Navigate without Navigating
- Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
- Not All Low-Pass Filters are Robust in Graph Convolutional Networks
- NovelD: A Simple yet Effective Exploration Criterion
- Novel Upper Bounds for the Constrained Most Probable Explanation Task
- Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation
- NTopo: Mesh-free Topology Optimization using Implicit Neural Representations
- Numerical Composition of Differential Privacy
- Numerical influence of ReLU’(0) on backpropagation
- NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM
- Object-aware Contrastive Learning for Debiased Scene Representation
- Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning
- Object-Centric Representation Learning with Generative Spatial-Temporal Factorization
- Object DGCNN: 3D Object Detection using Dynamic Graphs
- Observation-Free Attacks on Stochastic Bandits
- OctField: Hierarchical Implicit Functions for 3D Modeling
- Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration
- Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies
- Offline Model-based Adaptable Policy Learning
- Offline Reinforcement Learning
- Offline Reinforcement Learning as One Big Sequence Modeling Problem
- Offline Reinforcement Learning with Reverse Model-based Imagination
- Offline RL Without Off-Policy Evaluation
- Off-Policy Risk Assessment in Contextual Bandits
- On Blame Attribution for Accountable Multi-Agent Sequential Decision Making
- On Calibration and Out-of-Domain Generalization
- On Component Interactions in Two-Stage Recommender Systems
- On Contrastive Representations of Stochastic Processes
- One Explanation is Not Enough: Structured Attention Graphs for Image Classification
- On Effective Scheduling of Model-based Reinforcement Learning
- One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective
- One More Step Towards Reality: Cooperative Bandits with Imperfect Communication
- On Empirical Risk Minimization with Dependent and Heavy-Tailed Data
- On Episodes, Prototypical Networks, and Few-Shot Learning
- One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
- On Inductive Biases for Heterogeneous Treatment Effect Estimation
- On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness
- On Joint Learning for Solving Placement and Routing in Chip Design
- On Large-Cohort Training for Federated Learning
- On Learning Domain-Invariant Representations for Transfer Learning with Multiple Sources
- On learning sparse vectors from mixture of responses
- Online Active Learning with Surrogate Loss Functions
- Online Adaptation to Label Distribution Shift
- Online and Offline Reinforcement Learning by Planning with a Learned Model
- On Linear Stability of SGD and Input-Smoothness of Neural Networks
- Online Control of Unknown Time-Varying Dynamical Systems
- Online Convex Optimization with Continuous Switching Constraint
- Online Facility Location with Multiple Advice
- Online false discovery rate control for anomaly detection in time series
- Online Knapsack with Frequency Predictions
- Online Learning and Control of Complex Dynamical Systems from Sensory Input
- Online learning in MDPs with linear function approximation and bandit feedback
- Online Learning in Periodic Zero-Sum Games
- Online Learning Of Neural Computations From Sparse Temporal Feedback
- Online Market Equilibrium with Application to Fair Division
- Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm
- Online Meta-Learning via Learning with Layer-Distributed Memory
- Online Multi-Armed Bandits with Adaptive Inference
- Online Robust Reinforcement Learning with Model Uncertainty
- Online Selective Classification with Limited Feedback
- Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits
- Online Variational Filtering and Parameter Learning
- On Locality of Local Explanation Models
- Only Train Once: A One-Shot Neural Network Training And Pruning Framework
- On Margin-Based Cluster Recovery with Oracle Queries
- On Memorization in Probabilistic Deep Generative Models
- On Model Calibration for Long-Tailed Object Detection and Instance Segmentation
- On Optimal Interpolation in Linear Regression
- On Optimal Robustness to Adversarial Corruption in Online Decision Problems
- On Path Integration of Grid Cells: Group Representation and Isotropic Scaling
- On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
- On Plasticity, Invariance, and Mutually Frozen Weights in Sequential Task Learning
- On Provable Benefits of Depth in Training Graph Convolutional Networks
- On Riemannian Optimization over Positive Definite Matrices with the Bures-Wasserstein Geometry
- On Robust Optimal Transport: Computational Complexity and Barycenter Computation
- On sensitivity of meta-learning to support data
- On Success and Simplicity: A Second Look at Transferable Targeted Attacks
- On the Algorithmic Stability of Adversarial Training
- On the Bias-Variance-Cost Tradeoff of Stochastic Optimization
- On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
- On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms
- On the Convergence of Step Decay Step-Size for Stochastic Optimization
- On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning
- On the Cryptographic Hardness of Learning Single Periodic Neurons
- On the Equivalence between Neural Network and Support Vector Machine
- On the Estimation Bias in Double Q-Learning
- On the Existence of The Adversarial Bayes Classifier
- On the Expected Complexity of Maxout Networks
- On the Expressivity of Markov Reward
- On the Frequency Bias of Generative Models
- On the Generative Utility of Cyclic Conditionals
- On the Importance of Gradients for Detecting Distributional Shifts in the Wild
- On the interplay between data structure and loss function in classification problems
- On the Out-of-distribution Generalization of Probabilistic Image Modelling
- On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
- On the Power of Differentiable Learning versus PAC and SQ Learning
- On the Power of Edge Independent Graph Models
- On the Provable Generalization of Recurrent Neural Networks
- On the Rate of Convergence of Regularized Learning in Games: From Bandits and Uncertainty to Optimism and Beyond
- On the Representation of Solutions to Elliptic PDEs in Barron Spaces
- On the Representation Power of Set Pooling Networks
- On the Role of Optimization in Double Descent: A Least Squares Study
- On the Sample Complexity of Learning under Geometric Stability
- On the Sample Complexity of Privately Learning Axis-Aligned Rectangles
- On the Second-order Convergence Properties of Random Search Methods
- On the Stochastic Stability of Deep Markov Models
- On The Structure of Parametric Tournaments with Application to Ranking from Pairwise Comparisons
- On the Suboptimality of Thompson Sampling in High Dimensions
- On the Theory of Reinforcement Learning with Once-per-Episode Feedback
- On the Universality of Graph Neural Networks on Large Random Graphs
- On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
- On the Value of Infinite Gradients in Variational Autoencoder Models
- On the Value of Interaction and Function Approximation in Imitation Learning
- On the Variance of the Fisher Information for Deep Learning
- On Training Implicit Models
- On UMAP's True Loss Function
- OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization
- Open Rule Induction
- Open-set Label Noise Can Improve Robustness Against Inherent Label Noise
- OPT 2021: Optimization for Machine Learning
- Optimal Algorithms for Stochastic Contextual Preference Bandits
- Optimal Best-Arm Identification Methods for Tail-Risk Measures
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization
- Optimality and Stability in Federated Learning: A Game-theoretic Approach
- Optimality of variational inference for stochastic block model with missing links
- Optimal Order Simple Regret for Gaussian Process Bandits
- Optimal Policies Tend To Seek Power
- Optimal prediction of Markov chains with and without spectral gap
- Optimal Rates for Nonparametric Density Estimation under Communication Constraints
- Optimal Rates for Random Order Online Optimization
- Optimal Sketching for Trace Estimation
- Optimal Transport and Machine Learning
- Optimal Transport: Past, Present, and Future
- Optimal Underdamped Langevin MCMC Method
- Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
- Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning
- Optimizing Conditional Value-At-Risk of Black-Box Functions
- Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD
- Optimizing Reusable Knowledge for Continual Learning via Metalearning
- Oracle Complexity in Nonsmooth Nonconvex Optimization
- Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
- OSOA: One-Shot Online Adaptation of Deep Generative Models for Lossless Compression
- Outcome-Driven Reinforcement Learning via Variational Inference
- Out-of-distribution generalization and adaptation in natural and artificial intelligence
- Out-of-Distribution Generalization in Kernel Regression
- Overcoming Catastrophic Forgetting in Incremental Few-Shot Learning by Finding Flat Minima
- Overcoming the Convex Barrier for Simplex Inputs
- Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning
- Overinterpretation reveals image classification model pathologies
- Overlapping Spaces for Compact Graph Representations
- Overparameterization Improves Robustness to Covariate Shift in High Dimensions
- Panoptic 3D Scene Reconstruction From a Single RGB Image
- Parallel and Efficient Hierarchical k-Median Clustering
- Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement
- Parallelizing Thompson Sampling
- Parameter-free HE-friendly Logistic Regression
- Parameter Inference with Bifurcation Diagrams
- Parameterized Knowledge Transfer for Personalized Federated Learning
- Parameter Prediction for Unseen Deep Architectures
- Parametric Complexity Bounds for Approximating PDEs with Neural Networks
- Parametrized Quantum Policies for Reinforcement Learning
- Pareto Domain Adaptation
- Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems
- ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions
- PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
- PartialFed: Cross-Domain Personalized Federated Learning via Partial Initialization
- Partial success in closing the gap between human and machine vision
- Particle Cloud Generation with Message Passing Generative Adversarial Networks
- Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis
- Partition and Code: learning how to compress graphs
- Partition-Based Formulations for Mixed-Integer Optimization of Trained ReLU Neural Networks
- Passive attention in artificial neural networks predicts human visual selectivity
- PatchGame: Learning to Signal Mid-level Patches in Referential Games
- Pay Attention to MLPs
- Pay Attention to What You Need: Do Structural Priors Still Matter in the Age of Billion Parameter Models?
- Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling
- PCA Initialization for Approximate Message Passing in Rotationally Invariant Models
- PDE-GCN: Novel Architectures for Graph Neural Networks Motivated by Partial Differential Equations
- Perceptual Score: What Data Modalities Does Your Model Perceive?
- Periodic Activation Functions Induce Stationarity
- Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning
- Permuton-induced Chinese Restaurant Process
- Per-Pixel Classification is Not All You Need for Semantic Segmentation
- PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators
- Personalized Federated Learning With Gaussian Processes
- Perturb-and-max-product: Sampling and learning in discrete energy-based models
- Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems
- Perturbation Theory for the Information Bottleneck
- Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL
- PettingZoo: Gym for Multi-Agent Reinforcement Learning
- Photonic Differential Privacy with Direct Feedback Alignment
- Physical Reasoning and Inductive Biases for the Real World
- Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling
- Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling
- Pipeline Combinators for Gradual AutoML
- Piper: Multidimensional Planner for DNN Parallelization
- PiRank: Scalable Learning To Rank via Differentiable Sorting
- Planning from Pixels in Environments with Combinatorially Hard Search Spaces
- Play to Grade: Testing Coding Games as Classifying Markov Decision Process
- PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning
- PLUGIn: A simple algorithm for inverting generative models with recovery guarantees
- PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair
- Pointwise Bounds for Distribution Estimation under Communication Constraints
- PolarStream: Streaming Object Detection and Segmentation with Polar Pillars
- Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
- Policy Learning Using Weak Supervision
- Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
- Political Economy of Reinforcement Learning Systems (PERLS)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples
- Pooling by Sliced-Wasserstein Embedding
- PortaSpeech: Portable and High-Quality Generative Text-to-Speech
- Post-Contextual-Bandit Inference
- Posterior Collapse and Latent Variable Non-identifiability
- Posterior Meta-Replay for Continual Learning
- Post-processing for Individual Fairness
- Post-Training Quantization for Vision Transformer
- Post-Training Sparsity-Aware Quantization
- Powerpropagation: A sparsity inducing weight reparameterisation
- Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient
- Practical Near Neighbor Search via Group Testing
- Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers
- Pragmatic Image Compression for Human-in-the-Loop Decision-Making
- Precise characterization of the prior predictive distribution of deep ReLU networks
- Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization
- Predicting Deep Neural Network Generalization with Perturbation Response Curves
- Predicting Event Memorability from Contextual Visual Semantics
- Predicting Molecular Conformation via Dynamic Graph Score Matching
- Predicting What You Already Know Helps: Provable Self-Supervised Learning
- Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics
- PreferenceNet: Encoding Human Preferences in Auction Design with Deep Learning
- Preserved central model for faster bidirectional compression in distributed settings
- Pretraining Representations for Data-Efficient Reinforcement Learning
- Prior-independent Dynamic Auctions for a Value-maximizing Buyer
- Privacy in Machine Learning (PriML) 2021
- Private and Non-private Uniformity Testing for Ranking Data
- Private learning implies quantum stability
- Privately Learning Mixtures of Axis-Aligned Gaussians
- Privately Learning Subspaces
- Privately Publishable Per-instance Privacy
- Private Non-smooth ERM and SCO in Subquadratic Steps
- Probabilistic Attention for Interactive Segmentation
- Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs
- Probabilistic Forecasting: A Level-Set Approach
- Probabilistic Margins for Instance Reweighting in Adversarial Training
- Probabilistic Tensor Decomposition of Neural Population Spiking Activity
- Probabilistic Transformer For Time Series Analysis
- Probability Paths and the Structure of Predictions over Time
- Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training
- Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
- Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent
- Program Synthesis Guided Reinforcement Learning for Partially Observed Environments
- Progressive Coordinate Transforms for Monocular 3D Object Detection
- Progressive Feature Interaction Search for Deep Sparse Network
- Projected GANs Converge Faster
- Property-Aware Relation Networks for Few-Shot Molecular Property Prediction
- Proper Value Equivalence
- Proportional Participatory Budgeting with Additive Utilities
- ProTo: Program-Guided Transformer for Program-Guided Tasks
- Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
- Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
- Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss
- Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
- Provable Representation Learning for Imitation with Contrastive Fourier Features
- Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
- Provably efficient multi-task reinforcement learning with model transfer
- Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints
- Provably efficient, succinct, and precise explanations
- Provably Faster Algorithms for Bilevel Optimization
- Provably Strict Generalisation Benefit for Invariance in Kernel Methods
- Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
- Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence
- Pruning Randomly Initialized Neural Networks with Iterative Randomization
- PSD Representations for Effective Probability Models
- Pseudo-Spherical Contrastive Divergence
- PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning
- Pure Exploration in Kernel and Neural Bandits
- Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
- Quantifying and Improving Transferability in Domain Generalization
- Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
- QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning
- Random Noise Defense Against Query-Based Black-Box Attacks
- Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems
- Ranking Policy Decisions
- Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery
- Rate-Optimal Subspace Estimation on Random Graphs
- Rates of Estimation of Optimal Transport Maps using Plug-in Estimators via Barycentric Projections
- Raw Nav-merge Seismic Data to Subsurface Properties with MLP based Multi-Modal Information Unscrambler
- R-Drop: Regularized Dropout for Neural Networks
- ReAct: Out-of-distribution Detection With Rectified Activations
- Realistic evaluation of transductive few-shot learning
- Real-Time Optimization for Fast and Complex Control Systems
- Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training
- Rebounding Bandits for Modeling Satiation Effects
- Recognizing Vector Graphics without Rasterization
- Reconstruction for Powerful Graph Representations
- Recovering Latent Causal Factor for Generalization to Distributional Shifts
- Recovery Analysis for Plug-and-Play Priors using the Restricted Eigenvalue Condition
- Rectangular Flows for Manifold Learning
- Rectifying the Shortcut Learning of Background for Few-Shot Learning
- Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation
- Recurrent Bayesian Classifier Chains for Exact Multi-Label Classification
- Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits
- Recursive Bayesian Networks: Generalising and Unifying Probabilistic Context-Free Grammars and Dynamic Bayesian Networks
- Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias
- Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems
- RED: Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks
- Reducing Collision Checking for Sampling-Based Motion Planning Using Graph Neural Networks
- Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation
- Reducing the Covariate Shift by Mirror Samples in Cross Domain Alignment
- Referring Transformer: A One-step Approach to Multi-task Visual Grounding
- Refined Learning Bounds for Kernel and Approximate $k$-Means
- Refining Language Models with Compositional Explanations
- Reformulating Zero-shot Action Recognition for Multi-label Actions
- Regime Switching Bandits
- Regret Bounds for Gaussian-Process Optimization in Large Domains
- Regret Minimization Experience Replay in Off-Policy Reinforcement Learning
- Regularization in ResNet with Stochastic Depth
- Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond
- Regularized Softmax Deep Multi-Agent Q-Learning
- Regulating algorithmic filtering on social media
- Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization
- Reinforcement Learning based Disease Progression Model for Alzheimer’s Disease
- Reinforcement Learning Enhanced Explainer for Graph Neural Networks
- Reinforcement learning for optimization of variational quantum circuit architectures
- Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
- Reinforcement Learning in Newcomblike Environments
- Reinforcement Learning in Reward-Mixing MDPs
- Reinforcement Learning with Latent Flow
- Reinforcement Learning with State Observation Costs in Action-Contingent Noiselessly Observable Markov Decision Processes
- Relational Self-Attention: What's Missing in Attention for Video Understanding
- Relative Flatness and Generalization
- Relative stability toward diffeomorphisms indicates performance in deep nets
- Relative Uncertainty Learning for Facial Expression Recognition
- Relaxed Marginal Consistency for Differentially Private Query Answering
- Relaxing Local Robustness
- RelaySum for Decentralized Deep Learning on Heterogeneous Data
- Reliable and Trustworthy Machine Learning for Health Using Dataset Shift Detection
- Reliable Causal Discovery with Improved Exact Search and Weaker Assumptions
- Reliable Decisions with Threshold Calibration
- Reliable Estimation of KL Divergence using a Discriminator in Reproducing Kernel Hilbert Space
- Reliable Post hoc Explanations: Modeling Uncertainty in Explainability
- ReLU Regression with Massart Noise
- Remember What You Want to Forget: Algorithms for Machine Unlearning
- REMIPS: Physically Consistent 3D Reconstruction of Multiple Interacting People under Weak Supervision
- Removing Inter-Experimental Variability from Functional Data in Systems Neuroscience
- Renyi Differential Privacy of The Subsampled Shuffle Model In Distributed Learning
- Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification
- Replay-Guided Adversarial Environment Design
- Representation Costs of Linear Neural Networks: Analysis and Design
- Representation Learning Beyond Linear Prediction Functions
- Representation Learning for Event-based Visuomotor Policies
- Representation Learning on Spatial Networks
- Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models
- Representing Hyperbolic Space Accurately using Multi-Component Floats
- Representing Long-Range Context for Graph Neural Networks with Global Attention
- Repulsive Deep Ensembles are Bayesian
- Re-ranking for image retrieval and transductive few-shot classification
- Residual2Vec: Debiasing graph embedding with random graphs
- Residual Pathway Priors for Soft Equivariance Constraints
- Residual Relaxation for Multi-view Representation Learning
- ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees
- ReSSL: Relational Self-Supervised Learning with Weak Augmentation
- ResT: An Efficient Transformer for Visual Recognition
- Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization
- Rethinking Calibration of Deep Neural Networks: Do Not Be Afraid of Overconfidence
- Rethinking conditional GAN training: An approach using geometrically structured latent manifolds
- Rethinking gradient sparsification as total error minimization
- Rethinking Graph Transformers with Spectral Attention
- Rethinking Neural Operations for Diverse Tasks
- Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation
- Rethinking the Pruning Criteria for Convolutional Neural Network
- Rethinking the Variational Interpretation of Accelerated Optimization Methods
- Retiring Adult: New Datasets for Fair Machine Learning
- RETRIEVE: Coreset Selection for Efficient and Robust Semi-Supervised Learning
- Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes
- Revealing and Protecting Labels in Distributed Training
- Revenue maximization via machine learning with noisy data
- Reverse-Complement Equivariant Networks for DNA Sequences
- Reverse engineering learned optimizers reveals known and novel mechanisms
- Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems
- Revisiting 3D Object Detection From an Egocentric Perspective
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations
- Revisiting Deep Learning Models for Tabular Data
- Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme
- Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness
- Revisiting Model Stitching to Compare Neural Representations
- Revisiting ResNets: Improved Training and Scaling Strategies
- Revisiting Smoothed Online Learning
- Revisiting the Calibration of Modern Neural Networks
- Revisit Multimodal Meta-Learning through the Lens of Multi-Task Learning
- Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning
- Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
- Reward is enough for convex MDPs
- RIM: Reliable Influence-based Active Learning on Graphs
- Risk-Averse Bayes-Adaptive Reinforcement Learning
- Risk-averse Heteroscedastic Bayesian Optimization
- Risk-Aware Transfer in Reinforcement Learning using Successor Features
- Risk Bounds and Calibration for a Smart Predict-then-Optimize Method
- Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning
- Risk Monotonicity in Statistical Learning
- RL for Latent MDPs: Regret Guarantees and a Lower Bound
- RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem
- RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
- RMM: Reinforced Memory Management for Class-Incremental Learning
- Robust Allocations with Diversity Constraints
- Robust and Decomposable Average Precision for Image Retrieval
- Robust and differentially private mean estimation
- Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems
- Robust Auction Design in the Auto-bidding World
- Robust Compressed Sensing MRI with Deep Generative Priors
- Robust Contrastive Learning Using Negative Samples with Diminished Semantics
- Robust Counterfactual Explanations on Graph Neural Networks
- Robust Deep Reinforcement Learning through Adversarial Loss
- Robust Generalization despite Distribution Shift via Minimum Discriminating Information
- Robustifying Algorithms of Learning Latent Trees with Vector Variables
- Robust Implicit Networks via Non-Euclidean Contractions
- Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch
- Robust Learning of Optimal Auctions
- Robustness between the worst and average case
- Robustness of Graph Neural Networks at Scale
- Robustness via Uncertainty-aware Cycle Consistency
- Robust Online Correlation Clustering
- Robust Optimization for Multilingual Translation with Imbalanced Data
- Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference
- Robust Predictable Control
- Robust Regression Revisited: Acceleration and Improved Estimation Rates
- Robust Visual Reasoning via Language Guided Neural Module Networks
- ROI Maximization in Stochastic Online Decision-Making
- RoMA: Robust Model Adaptation for Offline Model-based Optimization
- Roto-translated Local Coordinate Frames For Interacting Dynamical Systems
- Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding
- Row-clustering of a Point Process-valued Matrix
- S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks
- SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL
- Safe and Robust Control of Uncertain Systems
- Safe Policy Optimization with Local Generalized Linear Function Approximations
- Safe Pontryagin Differentiable Programming
- Safe Reinforcement Learning by Imagining the Near Future
- Safe Reinforcement Learning with Natural Language Constraints
- Sageflow: Robust Federated Learning against Both Stragglers and Adversaries
- SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
- Sample Complexity Bounds for Active Ranking from Multi-wise Comparisons
- Sample Complexity of Tree Search Configuration: Cutting Planes and Beyond
- Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
- Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model
- Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
- Sample Selection for Fair and Robust Training
- Sampling with Trustworthy Constraints: A Variational Gradient Framework
- Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
- SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization
- SBO-RNN: Reformulating Recurrent Neural Networks via Stochastic Bilevel Optimization
- Scalable and Stable Surrogates for Flexible Classifiers with Fairness Constraints
- Scalable Bayesian GPFA with automatic relevance determination and discrete noise models
- Scalable Diverse Model Selection for Accessible Transfer Learning
- Scalable Inference in SDEs by Direct Matching of the Fokker–Planck–Kolmogorov Equation
- Scalable Inference of Sparsely-changing Gaussian Markov Random Fields
- Scalable Intervention Target Estimation in Linear Models
- Scalable Neural Data Server: A Data Recommender for Transfer Learning
- Scalable Online Planning via Reinforcement Learning Fine-Tuning
- Scalable Quasi-Bayesian Inference for Instrumental Variable Regression
- Scalable Rule-Based Representation Learning for Interpretable Classification
- Scalable Thompson Sampling using Sparse Gaussian Process Models
- Scalars are universal: Equivariant machine learning, structured like classical physics
- ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers
- Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets
- Scaling Gaussian Processes with Derivative Information Using Variational Inference
- Scaling Neural Tangent Kernels via Sketching and Random Features
- Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification
- Scaling Up Exact Neural Network Compression by ReLU Stability
- Scaling Vision with Sparse Mixture of Experts
- Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning
- Scatterbrain: Unifying Sparse and Low-rank Attention
- Scheduling jobs with stochastic holding costs
- Score-based Generative Modeling in Latent Space
- Score-based Generative Neural Networks for Large-Scale Optimal Transport
- SE(3)-equivariant prediction of molecular wavefunctions and electronic densities
- SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency
- Searching for Efficient Transformers for Language Modeling
- Searching Parameterized AP Loss for Object Detection
- Searching the Search Space of Vision Transformer
- Second-Order Neural ODE Optimizer
- Second Workshop on Quantum Tensor Networks in Machine Learning
- See More for Scene: Pairwise Consistency Learning for Scene Classification
- SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
- Selective Sampling for Online Best-arm Identification
- Self-Adaptable Point Processes with Nonparametric Time Decays
- Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning
- Self-Consistent Models and Values
- Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks
- Self-Instantiated Recurrent Units with Dynamic Soft Recursion
- Self-Interpretable Model with Transformation Equivariant Interpretation
- Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
- Self-Supervised Bug Detection and Repair
- Self-Supervised GANs with Label Augmentation
- Self-Supervised Learning Disentangled Group Representation as Feature
- Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural Networks
- Self-Supervised Learning: Self-Prediction and Contrastive Learning
- Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style
- Self-Supervised Learning with Kernel Dependence Maximization
- Self-Supervised Multi-Object Tracking with Cross-input Consistency
- Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction
- Semialgebraic Representation of Monotone Deep Equilibrium Models and Applications to Certification
- Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning
- Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics
- Sequence-to-Sequence Learning with Latent Neural Grammars
- Sequential Algorithms for Testing Closeness of Distributions
- Sequential Causal Imitation Learning with Unobserved Confounders
- Set Prediction in the Latent Space
- Settling the Variance of Multi-Agent Policy Gradients
- SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
- Shape As Points: A Differentiable Poisson Solver
- Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects
- Shape Registration in the Time of Transformers
- Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices
- Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders
- Shaping embodied agent behavior with activity-context priors from egocentric video
- Shapley Residuals: Quantifying the limits of the Shapley value for explanations
- Shared Independent Component Analysis for Multi-Subject Neuroimaging
- Shared Visual Representations in Human and Machine Intelligence
- Sharp Impossibility Results for Hyper-graph Testing
- Shifted Chunk Transformer for Spatio-Temporal Representational Learning
- Shift Invariance Can Reduce Adversarial Robustness
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training data
- Sifting through the noise: Universal first-order methods for stochastic variational inequalities
- SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark
- Sim and Real: Better Together
- SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement
- Similarity and Matching of Neural Network Representations
- SIMILAR: Submodular Information Measures Based Active Learning In Realistic Scenarios
- SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition
- Simple steps are all you need: Frank-Wolfe and generalized self-concordant functions
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning
- Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection
- SketchGen: Generating Constrained CAD Sketches
- Skipping the Frame-Level: Event-Based Piano Transcription With Neural Semi-CRFs
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method
- SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks
- Sliced Mutual Information: A Scalable Measure of Statistical Dependence
- Slice Sampling Reparameterization Gradients
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression
- Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation
- Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
- Smooth Bilevel Programming for Sparse Regularization
- SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness
- Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization
- Smooth Normalizing Flows
- SNIPS: Solving Noisy Inverse Problems Stochastically
- Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing
- SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
- Soft Calibration Objectives for Neural Networks
- SOFT: Softmax-free Transformer with Linear Complexity
- SOLQ: Segmenting Objects by Learning Queries
- Solving Graph-based Public Goods Games with Tree Search and Imitation Learning
- Solving Min-Max Optimization with Hidden Structure via Gradient Descent Ascent
- Solving Soft Clustering Ensemble via $k$-Sparse Discrete Wasserstein Barycenter
- SOPE: Spectrum of Off-Policy Estimators
- Space-time Mixing Attention for Video Transformer
- SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search
- Sparse Deep Learning: A New Framework Immune to Local Traps and Miscalibration
- Sparse Flows: Pruning Continuous-depth Models
- Sparse is Enough in Scaling Transformers
- Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains
- Sparse Quadratic Optimisation over the Stiefel Manifold with Application to Permutation Synchronisation
- Sparse Spiking Gradient Descent
- Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space
- Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
- Sparse Uncertainty Representation in Deep Learning with Inducing Weights
- Spatial Ensemble: a Novel Model Smoothing Mechanism for Student-Teacher Framework
- Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis
- Spatiotemporal Joint Filter Decomposition in 3D Convolutional Neural Networks
- Spatio-Temporal Variational Gaussian Processes
- Spectral embedding for dynamic networks with stability guarantees
- Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution
- Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
- Speech-T: Transducer for Text to Speech and Beyond
- Speedy Performance Estimation for Neural Architecture Search
- Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay
- Spot the Difference: Detection of Topological Changes via Geometric Alignment
- SQALER: Scaling Question Answering by Decoupling Multi-Hop and Logical Reasoning
- Square Root Principal Component Pursuit: Tuning-Free Noisy Robust Matrix Recovery
- SSAL: Synergizing between Self-Training and Adversarial Learning for Domain Adaptive Object Detection
- SSMF: Shifting Seasonal Matrix Factorization
- SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning
- Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$
- Stability and Generalization of Bilevel Programming in Hyperparameter Optimization
- Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel
- Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
- Stabilizing Dynamical Systems via Policy Gradient Methods
- Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
- Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks
- Stateful ODE-Nets using Basis Function Expansions
- Stateful Strategic Regression
- Statistical Inference with M-Estimators on Adaptively Collected Data
- Statistically and Computationally Efficient Linear Meta-representation Learning
- Statistical Query Lower Bounds for List-Decodable Linear Regression
- Statistical Regeneration Guarantees of the Wasserstein Autoencoder with Latent Space Consistency
- Statistical Undecidability in Linear, Non-Gaussian Causal Models in the Presence of Latent Confounders
- STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning
- STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data
- Stochastic $L^\natural$-convex Function Minimization
- Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
- Stochastic bandits with groups of similar arms
- Stochastic Bias-Reduced Gradient Methods
- Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
- Stochastic Multi-Armed Bandits with Control Variates
- Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge
- Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence
- Stochastic optimization under time drift: iterate averaging, step-decay schedules, and high probability guarantees
- Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
- Stochastic Solutions for Linear Inverse Problems using the Prior Implicit in a Denoiser
- Storchastic: A Framework for General Stochastic Automatic Differentiation
- STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization
- Strategic Behavior is Bliss: Iterative Voting Improves Social Welfare
- Streaming Belief Propagation for Community Detection
- Streaming Linear System Identification with Reverse Experience Replay
- Stronger NAS with Weaker Predictors
- Structural Credit Assignment in Neural Networks using Reinforcement Learning
- Structure-Aware Random Fourier Kernel for Graphs
- Structured Denoising Diffusion Models in Discrete State-Spaces
- Structured Dropout Variational Inference for Bayesian Neural Networks
- Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction
- Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families
- Stylized Dialogue Generation with Multi-Pass Dual Learning
- Subgame solving without common knowledge
- Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning
- Subgoal Search For Complex Reasoning Tasks
- Subgraph Federated Learning with Missing Neighbor Generation
- Subgroup Generalization and Fairness of Graph Neural Networks
- Sub-Linear Memory: How to Make Performers SLiM
- Submodular + Concave
- Subquadratic Overparameterization for Shallow Neural Networks
- SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning
- Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
- SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients
- Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer
- Supervising the Transfer of Reasoning Patterns in VQA
- Support Recovery of Sparse Signals from a Mixture of Linear Measurements
- Support vector machines and linear regression coincide with very high-dimensional features
- Surrogate Regret Bounds for Polyhedral Losses
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data
- SWAD: Domain Generalization by Seeking Flat Minima
- Symbolic Regression via Deep Reinforcement Learning Enhanced Genetic Programming Seeding
- SyMetric: Measuring the Quality of Learnt Hamiltonian Dynamics Inferred from Vision
- Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
- SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes
- Synthetic Design: An Optimization Approach to Experimental Design with Synthetic Controls
- Systematic Generalization with Edge Transformers
- TAAC: Temporally Abstract Actor-Critic for Continuous Control
- Tackling Climate Change with Machine Learning
- Tactical Optimism and Pessimism for Deep Reinforcement Learning
- TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning
- Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
- Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning
- Targeted Neural Dynamical Modeling
- Task-Adaptive Neural Network Search with Meta-Contrastive Learning
- Task-Agnostic Undesirable Feature Deactivation Using Out-of-Distribution Data
- Taxonomizing local versus global structure in neural network loss landscapes
- Teachable Reinforcement Learning via Advice Distillation
- Teaching an Active Learner with Contrastive Examples
- Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm
- Techniques for Symbol Grounding with SATNet
- Temporal-attentive Covariance Pooling Networks for Video Recognition
- Temporally Abstract Partial Models
- Tensor decompositions of higher-order correlations by nonlinear Hebbian plasticity
- Tensor Normal Training for Deep Learning Models
- Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs
- Testing Probabilistic Circuits
- TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks
- Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization
- Test-time Collective Prediction
- Test-Time Personalization with a Transformer for Human Pose Estimation
- The Adaptive Doubly Robust Estimator and a Paradox Concerning Logging Policy
- The Art of Gaussian Processes: Classical and Contemporary
- The balancing principle for parameter choice in distance-regularized domain adaptation
- The Banality of Scale: A Theory on the Limits of Modeling Bias and Fairness Frameworks for Social Justice (and other lessons from the Pandemic)
- The Benefits of Implicit Regularization from SGD in Least Squares Problems
- The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
- The Causal-Neural Connection: Expressiveness, Learnability, and Inference
- The Collective Intelligence of Army Ants, and the Robots They Inspire
- The Complexity of Bayesian Network Learning: Revisiting the Superstructure
- The Complexity of Sparse Tensor PCA
- The decomposition of the higher-order homology embedding constructed from the $k$-Laplacian
- The Difficulty of Passive Learning in Deep Reinforcement Learning
- The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
- The Effect of the Intrinsic Dimension on the Generalization of Quadratic Classifiers
- The Elastic Lottery Ticket Hypothesis
- The Emergence of Objectness: Learning Zero-shot Segmentation from Videos
- The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization
- The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning
- The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization
- The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle
- The Image Local Autoregressive Transformer
- The Implicit Bias of Minima Stability: A View from Function Space
- The Inductive Bias of Quantum Kernels
- The Lazy Online Subgradient Algorithm is Universal on Strongly Convex Domains
- The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
- The Limits of Optimal Pricing in the Dark
- The Many Faces of Adversarial Risk
- The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
- The Pareto Frontier of model selection for general Contextual Bandits
- The pre-registration workshop: an alternative publication model for machine learning research
- There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
- The Role of Global Labels in Few-Shot Classification and How to Infer Them
- The Semi-Random Satisfaction of Voting Axioms
- The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning
- The Skellam Mechanism for Differentially Private Federated Learning
- The staircase property: How hierarchical structure can guide deep learning
- The Symbiosis of Deep Learning and Differential Equations
- The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation
- The Utility of Explainable AI in Ad Hoc Human-Machine Teaming
- The Value of Information When Deciding What to Learn
- Think Big, Teach Small: Do Language Models Distil Occam’s Razor?
- Third Workshop on AI for Humanitarian Assistance and Disaster Response
- Three-dimensional spike localization and improved motion correction for Neuropixels recordings
- Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates
- Tighter Expected Generalization Error Bounds via Wasserstein Distance
- Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
- Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods
- Time-independent Generalization Bounds for SGLD in Non-convex Settings
- Time-series Generation by Contrastive Imitation
- T-LoHo: A Bayesian Regularization Model for Structured Sparsity and Smoothness on Graphs
- TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework
- ToAlign: Task-Oriented Alignment for Unsupervised Domain Adaptation
- To Beam Or Not To Beam: That is a Question of Cooperation for Language GANs
- TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation
- TokenLearner: Adaptive Space-Time Tokenization for Videos
- Topic Modeling Revisited: A Document Graph-based Neural Network Perspective
- TopicNet: Semantic Graph-Guided Topic Discovery
- Topographic VAEs learn Equivariant Capsules
- Topological Attention for Time Series Forecasting
- Topological Detection of Trojaned Neural Networks
- Topological Relational Learning on Graphs
- Topology-Imbalance Learning for Semi-Supervised Node Classification
- TöRF: Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis
- To The Point: Correspondence-driven monocular 3D category reconstruction
- Towards a Theoretical Framework of Out-of-Distribution Generalization
- Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness
- Towards a Unified Information-Theoretic Framework for Generalization
- Towards Best-of-All-Worlds Online Learning with Feedback Graphs
- Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples
- Towards Biologically Plausible Convolutional Networks
- Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective
- Towards Context-Agnostic Learning Using Synthetic Data
- Towards Deeper Deep Reinforcement Learning with Spectral Normalization
- Towards Efficient and Effective Adversarial Training
- Towards Enabling Meta-Learning from Target Models
- Towards Gradient-based Bilevel Optimization with Non-convex Followers and Beyond
- Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
- Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
- Towards Lower Bounds on the Depth of ReLU Neural Networks
- Towards mental time travel: a hierarchical memory for reinforcement learning agents
- Towards Multi-Grained Explainability for Graph Neural Networks
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach
- Towards optimally abstaining from prediction with OOD test examples
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation
- Towards Robust and Reliable Algorithmic Recourse
- Towards Robust Bisimulation Metric Learning
- Towards robust vision by multi-task learning on monkey visual cortex
- Towards Sample-efficient Overparameterized Meta-learning
- Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors
- Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN
- Towards Sharper Generalization Bounds for Structured Prediction
- Towards Stable and Robust AdderNets
- Towards Tight Communication Lower Bounds for Distributed Optimisation
- Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization
- Towards understanding retrosynthesis by energy-based models
- Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond
- Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games
- Tracking People with 3D Representations
- Tracking Without Re-recognition in Humans and Machines
- Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows
- Tractable Regularization of Probabilistic Circuits
- Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State
- Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time
- Training Neural Networks is ER-complete
- Training Neural Networks with Fixed Sparse Masks
- Training Over-parameterized Models with Non-decomposable Objectives
- Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization
- TransformerFusion: Monocular RGB Scene Reconstruction using Transformers
- Transformer in Transformer
- Transformers Generalize DeepSets and Can be Extended to Graphs & Hypergraphs
- TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up
- TransMatcher: Deep Image Matching Through Transformers for Generalizable Person Re-identification
- TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification
- Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation
- Tree in Tree: from Decision Trees to Decision Graphs
- TriBERT: Human-centric Audio-visual Representation Learning
- TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness
- True Few-Shot Learning with Language Models
- Truncated Marginal Neural Ratio Estimation
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions
- TTT++: When Does Self-Supervised Test-Time Training Fail or Thrive?
- Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
- Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL
- Turing Completeness of Bounded-Precision Recurrent Neural Networks
- Twice regularized MDPs and the equivalence between robustness and regularization
- Twins: Revisiting the Design of Spatial Attention in Vision Transformers
- Two-sided fairness in rankings via Lorenz dominance
- Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution
- Two steps to risk sensitivity
- UCB-based Algorithms for Multinomial Logistic Regression Bandits
- UFC-BERT: Unifying Multi-Modal Controls for Conditional Image Synthesis
- Ultrahyperbolic Neural Networks
- Unadversarial Examples: Designing Objects for Robust Vision
- Unbalanced Optimal Transport through Non-negative Penalized Linear Regression
- Unbiased Classification through Bias-Contrastive and Bias-Balanced Learning
- Uncertain Decisions Facilitate Better Preference Learning
- Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
- Uncertainty Calibration for Ensemble-Based Debiasing Methods
- Uncertainty-Driven Loss for Single Image Super-Resolution
- Uncertainty Quantification and Deep Ensembles
- Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems
- Understanding and Improving Early Stopping for Learning with Noisy Labels
- Understanding Bandits with Graph Feedback
- Understanding Deflation Process in Over-parametrized Tensor Decomposition
- Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization
- Understanding How Encoder-Decoder Architectures Attend
- Understanding Instance-based Interpretability of Variational Auto-Encoders
- Understanding Interlocking Dynamics of Cooperative Rationalization
- Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
- Understanding Partial Multi-Label Learning via Mutual Information
- Understanding the Effect of Stochasticity in Policy Optimization
- Understanding the Generalization Benefit of Model Invariance from a Data Perspective
- Understanding the Limits of Unsupervised Domain Adaptation via Data Poisoning
- Understanding the Under-Coverage Bias in Uncertainty Estimation
- Unfolding Taylor's Approximations for Image Restoration
- UniDoc: Unified Pretraining Framework for Document Understanding
- Uniform Concentration Bounds toward a Unified Framework for Robust Clustering
- Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting
- Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation
- Uniform Sampling over Episode Difficulty
- Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
- Unifying lower bounds on prediction dimension of convex surrogates
- Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization
- Unintended Selection: Persistent Qualification Rate Disparities and Interventions
- Unique sparse decomposition of low rank matrices
- Universal Approximation Using Well-Conditioned Normalizing Flows
- Universal Graph Convolutional Networks
- Universal Off-Policy Evaluation
- Universal Rate-Distortion-Perception Representations for Lossy Compression
- Universal Semi-Supervised Learning
- Unlabeled Principal Component Analysis
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning
- Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning
- Unsupervised Foreground Extraction via Deep Region Competition
- Unsupervised Learning of Compositional Energy Concepts
- Unsupervised Motion Representation Learning with Capsule Autoencoders
- Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport
- Unsupervised Object-Based Transition Models For 3D Partially Observable Environments
- Unsupervised Object-Level Representation Learning from Scene Images
- Unsupervised Part Discovery from Contrastive Reconstruction
- Unsupervised Representation Transfer for Small Networks: I Believe I Can Distill On-the-Fly
- Unsupervised Speech Recognition
- USCO-Solver: Solving Undetermined Stochastic Combinatorial Optimization Problems
- User-Level Differentially Private Learning via Correlated Sampling
- Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks
- Validating the Lottery Ticket Hypothesis with Inertial Manifold Theory
- Validation Free and Replication Robust Volume-based Data Valuation
- Variance-Aware Off-Policy Evaluation with Linear Function Approximation
- Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems
- Variational Bayesian Optimistic Sampling
- Variational Bayesian Reinforcement Learning with Regret Bounds
- Variational Continual Bayesian Meta-Learning
- Variational Diffusion Models
- Variational Inference for Continuous-Time Switching Dynamical Systems
- Variational Model Inversion Attacks
- Variational Multi-Task Learning with Gumbel-Softmax Priors
- VAST: Value Function Factorization with Variable Agent Sub-Teams
- VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
- Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices
- Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels
- Video Instance Segmentation using Inter-Frame Communication Transformers
- VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
- VigDet: Knowledge Informed Neural Temporal Point Process for Coordination Detection on Social Media
- ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction
- Visual Adversarial Imitation Learning using Variational Models
- Visualizing the Emergence of Intermediate Visual Patterns in DNNs
- Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases
- ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
- VoiceMixer: Adversarial Voice Style Mixup
- Volume Rendering of Neural Implicit Surfaces
- Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image
- VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization
- Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
- Weak-shot Fine-grained Classification via Similarity Transfer
- Weighted model estimation for offline model-based reinforcement learning
- Weisfeiler and Lehman Go Cellular: CW Networks
- Well-tuned Simple Nets Excel on Tabular Datasets
- What can linearized neural networks actually say about generalization?
- What Makes Multi-Modal Learning Better than Single (Provably)
- What Matters for Adversarial Imitation Learning?
- What’s a good imputation to predict with missing values?
- What training reveals about neural network complexity
- When Are Solutions Connected in Deep Networks?
- When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning?
- When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
- When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking
- When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting
- When Is Generalizable Reinforcement Learning Tractable?
- When Is Unsupervised Disentanglement Possible?
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
- Who Leads and Who Follows in Strategic Classification?
- Why Do Better Loss Functions Lead to Less Transferable Features?
- Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
- Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Sparse Neural Networks
- Why Spectral Normalization Stabilizes GANs: Analysis and Improvements
- Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
- Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark
- Wisdom of the Crowd Voting: Truthful Aggregation of Voter Information and Preferences
- Word2Fun: Modelling Words as Functions for Diachronic Word Representation
- Workshop on Deep Learning and Inverse Problems
- Workshop on Human and Machine Decisions
- XCiT: Cross-Covariance Image Transformers
- XDO: A Double Oracle Algorithm for Extensive-Form Games
- You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership
- You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism
- You Never Cluster Alone
- You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
- Your head is there to move you around: Goal-driven models of the primate dorsal pathway
- Your Model is Wrong: Robustness and misspecification in probabilistic modeling
- Zero Time Waste: Recycling Predictions in Early Exit Neural Networks