Downloads 2025
Number of events: 5971
- $\boldsymbol{\lambda}$-Orthogonality Regularization for Compatible Representation Learning
- $\Delta \mathrm{Energy}$: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization
- $\epsilon$-Seg: Sparsely Supervised Semantic Segmentation of Microscopy Data
- $i$MIND: Insightful Multi-subject Invariant Neural Decoding
- $\mathcal{X}^2$-DFD: A framework for e$\mathcal{X}$plainable and e$\mathcal{X}$tendable Deepfake Detection
- $\mu$PC: Scaling Predictive Coding to 100+ Layer Networks
- $O(\sqrt{T})$ Static Regret and Instance Dependent Constraint Violation for Constrained Online Convex Optimization
- $\Psi$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
- $Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
- $\text{G}^2\text{M}$: A Generalized Gaussian Mirror Method to Boost Feature Selection Power
- $\textit{HiMaCon:}$ Discovering Hierarchical Manipulation Concepts from Unlabeled Multi-Modal Data
- $\textit{Hyper-GoalNet}$: Goal-Conditioned Manipulation Policy Learning with HyperNetworks
- $\text{S}^2$Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation
- $\texttt{AVROBUSTBENCH}$: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
- $\texttt{BetaConform}$: Efficient MAP Estimation of LLM Ensemble Judgment Performance with Prior Transfer
- $\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning
- $\texttt{STRCMP}$: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization
- 1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
- 1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
- 2nd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
- 3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs
- 3D Equivariant Visuomotor Policy Learning via Spherical Projection
- 3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction
- 3D Gaussian Splatting based Scene-independent Relocalization with Unidirectional and Bidirectional Feature Fusion
- 3D-GSRD: 3D Molecular Graph Auto-Encoder with Selective Re-mask Decoding
- 3D Human Pose Estimation with Muscles
- 3DID: Direct 3D Inverse Design for Aerodynamics with Physics-Aware Optimization
- 3D Interaction Geometric Pre-training for Molecular Relational Learning
- 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
- 3DOT: Texture Transfer for 3DGS Objects from a Single Reference Image
- 3DPE-Gaze: Unlocking the Potential of 3D Facial Priors for Generalized Gaze Estimation
- 3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes
- 3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
- 3D Visual Illusion Depth Estimation
- 3EED: Ground Everything Everywhere in 3D
- 4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos
- 4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming
- 4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
- 4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time
- 4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration
- 4KAgent: Agentic Any Image to 4K Super-Resolution
- 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
- 7th International Workshop on Large Scale Holistic Video Understanding: Toward Video Foundation Models
- A$^3$E: Towards Compositional Model Editing
- A2Seek: Towards Reasoning-Centric Benchmark for Aerial Anomaly Understanding
- AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation
- A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data
- A Bayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
- A Beyond-Worst-Case Analysis of Greedy k-means++
- A Black-Box Debiasing Framework for Conditional Sampling
- Absence Bench: Language Models Can’t See What’s Missing
- Absolute Zero: Reinforced Self-play Reasoning with Zero Data
- Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
- Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency
- AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
- Abstract Counterfactuals for Language Model Agents
- Abstract Rendering: Certified Rendering Under 3D Semantic Uncertainty
- A Cautionary Tale on Integrating Studies with Disparate Outcome Measures for Causal Inference
- Accelerated Distance-adaptive Methods for Hölder Smooth and Convex Optimization
- Accelerated Evolving Set Processes for Local PageRank Computation
- Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
- Accelerated Vertical Federated Adversarial Learning through Decoupling Layer-Wise Dependencies
- Accelerating 3D Molecule Generative Models with Trajectory Diagnosis
- Accelerating Block Coordinate Descent for LLM Finetuning via Landscape Expansion
- Accelerating data-driven algorithm selection for combinatorial partitioning problems
- Accelerating Diffusion LLMs via Adaptive Parallel Decoding
- Accelerating Feature Conformal Prediction via Taylor Approximation
- Accelerating Model-Free Optimization via Averaging of Cost Samples
- Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
- Accelerating Optimization via Differentiable Stopping Time
- Accelerating Parallel Diffusion Model Serving with Residual Compression
- Accelerating RL for LLM Reasoning with Optimal Advantage Regression
- Accelerating Visual-Policy Learning through Parallel Differentiable Simulation
- Acceleration via silver step-size on Riemannian manifolds with applications to Wasserstein space
- Accident Anticipation via Temporal Occurrence Prediction
- ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training
- AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
- Accurate and Efficient Low-Rank Model Merging in Core Space
- Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference
- Accurately Predicting Protein Mutational Effects via a Hierarchical Many-Body Attention Network
- AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
- AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
- AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play
- Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation
- Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data
- A Circular Argument: Does RoPE need to be Equivariant for Vision?
- A Clean Slate for Offline Reinforcement Learning
- AC-LoRA: (Almost) Training-Free Access Control Aware Multi-Modal LLMs
- A Closer Look at Graph Transformers: Cross-Aggregation and Beyond
- A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective
- A Closer Look at NTK Alignment: Linking Phase Transitions in Deep Image Regression
- A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities
- A Closer Look to Positive-Unlabeled Learning from Fine-grained Perspectives: An Empirical Study
- A CLT for Polynomial GNNs on Community-Based Graphs
- A compressive-expressive communication framework for compositional representations
- A Computationally Viable Numerical Gradient-based Technique for Optimal Covering Problems
- A Controllable Examination for Long-Context Language Models
- A Counterfactual Semantics for Hybrid Dynamical Systems
- A Cramér–von Mises Approach to Incentivizing Truthful Data Sharing
- ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking
- Actial: Activate Spatial Reasoning Ability of Multimodal Large Language Models
- Activated LoRA: Fine-tuned LLMs for Intrinsics
- Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models
- Activation-Guided Consensus Merging for Large Language Models
- Activation-Informed Merging of Large Language Models
- Active Measurement: Efficient Estimation at Scale
- Active Seriation: Efficient Ordering Recovery with Statistical Guarantees
- Active Target Discovery under Uninformative Priors: The Power of Permanent and Transient Memory
- Active Test-time Vision-Language Navigation
- ActiveVOO: Value of Observation Guided Active Knowledge Acquisition for Open-World Embodied Lifted Regression Planning
- Activity Pruning for Efficient Spiking Neural Networks
- Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
- Actor-Free Continuous Control via Structurally Maximizable Q-Functions
- Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies
- AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking
- AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees
- Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
- AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining
- Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
- AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning
- Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
- AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness
- AdaptGrad: Adaptive Sampling to Reduce Noise
- Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback
- Adaptive 3D Reconstruction via Diffusion Priors and Forward Curvature-Matching Likelihood Updates
- Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization
- Adaptive and Multi-scale Affinity Alignment for Hierarchical Contrastive Learning
- Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization
- Adaptive Cannistraci-Hebb Network Automata Modelling of Complex Networks for Path-based Link Prediction
- Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
- Adaptive Context Length Optimization with Low-Frequency Truncation for Multi-Agent Reinforcement Learning
- Adaptive Data Analysis for Growing Data
- Adaptive Data-Borrowing for Improving Treatment Effect Estimation using External Controls
- Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler
- Adaptive Discretization for Consistency Models
- Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search
- Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models
- Adaptive Fission: Post-training Encoding for Low-latency Spike Neural Networks
- Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing
- Adaptive Gradient Masking for Balancing ID and MLLM-based Representations in Recommendation
- Adaptive Inference-Time Scaling via Cyclic Diffusion Search
- Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
- Adaptive Latent-Space Constraints in Personalized Federated Learning
- Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning
- Adaptively Coordinating with Novel Partners via Learned Latent Strategies
- Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning
- Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees
- Adaptive Preference Arithmetic: A Personalized Agent with Adaptive Preference Arithmetic for Dynamic Preference Modeling
- Adaptive Quantization in Generative Flow Networks for Probabilistic Sequential Prediction
- Adaptive Re-calibration Learning for Balanced Multimodal Intention Recognition
- Adaptive Riemannian ADMM for Nonsmooth Optimization: Optimal Complexity without Smoothing
- Adaptive Sigmoid Clipping for Balancing the Direction–Magnitude Mismatch Trade-off in Differentially Private Learning
- Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling
- Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks
- Adaptive Time Encoding for Irregular Multivariate Time-Series Classification
- Adaptive Variance Inflation in Thompson Sampling: Efficiency, Safety, Robustness, and Beyond
- Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
- AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking
- AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
- AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners
- A data and task-constrained mechanistic model of the mouse outer retina shows robustness to contrast variations
- A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors
- A Dataset for Distilling Knowledge Priors from Literature for Therapeutic Design
- AdaTS: Learning Adaptive Time Series Representations via Dynamic Soft Contrasts
- AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
- Additive Models Explained: A Computational Complexity Approach
- Addressing Mark Imbalance in Integration-free Marked Temporal Point Processes
- ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
- A Difference-of-Convex Functions Approach to Energy-Based Iterative Reasoning
- A Differential and Pointwise Control Approach to Reinforcement Learning
- A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
- Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency
- Adjoint Schrödinger Bridge Sampler
- Adjusted Count Quantification Learning on Graphs
- Adjusting Initial Noise to Mitigate Memorization in Text-to-Image Diffusion Models
- ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources
- AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
- ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining
- A Driving-Style-Adaptive Framework for Vehicle Trajectory Prediction
- A duality framework for analyzing random feature and two-layer neural networks
- Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
- Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning
- Advancing Expert Specialization for Better MoE
- Advancing Interpretability of CLIP Representations with Concept Surrogate Model
- Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective
- Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration
- AdvEDM: Fine-grained Adversarial Attack against VLM-based Embodied Agents
- Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
- Adversarial Diffusion for Robust Reinforcement Learning
- Adversarial generalization of unfolding (model-based) networks
- Adversarial Graph Fusion for Incomplete Multi-view Semi-supervised Learning with Tensorial Imputation
- Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
- Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text
- Adversarial Robustness of Nonparametric Regression
- Adversary Aware Optimization for Robust Defense
- AdvPrefix: An Objective for Nuanced LLM Jailbreaks
- Adv-SSL: Adversarial Self-Supervised Representation Learning with Theoretical Guarantees
- A Dynamic Learning Strategy for Dempster-Shafer Theory with Applications in Classification and Enhancement
- AegisGuard: RL-Guided Adapter Tuning for TEE-Based Efficient & Secure On-Device Inference
- Aeolus: A Multi-structural Flight Delay Dataset
- A Fair Federated Learning Method for Handling Client Participation Probability Inconsistencies in Heterogeneous Environments
- A faster training algorithm for regression trees with linear leaves, and an analysis of its complexity
- A Few Moments Please: Scalable Graphon Learning via Moment Matching
- Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
- AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models
- A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
- A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1
- Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
- AF-UMC: An Alignment-Free Fusion Framework for Unaligned Multi-View Clustering
- AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios
- A Generalist Intracortical Motor Decoder
- A Generalized Binary Tree Mechanism for Private Approximation of All-Pair Shortest Distances
- A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
- A Generalized Label Shift Perspective for Cross-Domain Gaze Estimation
- A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging
- AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents
- AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement
- AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
- Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
- Agentic RL Scaling Law: Spontaneous Code Execution for Mathematical Problem Solving
- AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios
- AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems
- AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
- Agents Robust to Distribution Shifts Learn Causal World Models Even Under Mediation
- AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks
- A Geometrical Analysis of Kernel Ridge Regression and its Applications
- A Geometric Analysis of PCA
- A geometric framework for momentum-based optimizers for low-rank training
- A Geometry-Aware Metric for Mode Collapse in Time Series Generative Models
- Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations
- AGI-Elo: How Far Are We From Mastering A Task?
- AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
- Agnostic Active Learning Is Always Better Than Passive Learning
- Agnostic Continuous-Time Online Learning
- Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness
- A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
- A Gradient Guided Diffusion Framework for Chance Constrained Programming
- AHa-Bench: Benchmarking Audio Hallucinations in Large Audio-Language Models
- Aha! - Predicting What Matters Next: Online Highlight Detection Without Looking Ahead
- A Hierarchy of Graphical Models for Counterfactual Inferences
- A High-Dimensional Statistical Method for Optimizing Transfer Quantities in Multi-Source Transfer Learning
- AI4Mat-NeurIPS-2025: NeurIPS 2025 Workshop on AI for Accelerated Materials Design
- AI and ML for Next-Generation Wireless Communications and Networking (AI4NextG @ NeurIPS’25)
- AI Debate Aids Assessment of Controversial Claims
- AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation
- AIEA Lab (BTF)
- AI for non-human animal communication
- AI for Science (BTF)
- AI for Science: The Reach and Limits of AI for Scientific Discovery
- AI-Generated Video Detection via Perceptual Straightening
- A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning
- AION-1: Omnimodal Foundation Model for Astronomical Sciences
- AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
- AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
- AI-Researcher: Autonomous Scientific Innovation
- AI Robotics @ UC Berkeley (BTF)
- A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
- AI Testing Should Account for Sophisticated Strategic Behaviour
- AI That Keeps Up: Workshop on Continual and Compatible Foundation Model Updates (CCFM)
- AI Virtual Cells and Instruments: A New Era in Drug Discovery and Development
- A Latent Multilayer Graphical Model For Complex, Interdependent Systems
- Alchemist: Turning Public Text-to-Image Data into Generative Gold
- A learnability analysis on neuro-symbolic learning
- A Learning-Augmented Approach to Online Allocation Problems
- A Learning-Augmented Dynamic Programming Approach for Orienteering Problem with Time Windows
- ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
- Algorithm- and Data-Dependent Generalization Bounds for Diffusion Models
- Algorithmic Collective Action
- Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-Index Models
- AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?
- Alias-Free ViT: Fractional Shift Invariance via Linear Attention
- Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences
- AlignedGen: Aligning Style Across Generated Images
- Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
- Aligning Compound AI Systems via System-level DPO
- Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs
- Aligning Text-to-Image Diffusion Models to Human Preference by Classification
- Aligning Text to Image in Diffusion Models is Easier Than You Think
- Aligning Transformers with Continuous Feedback via Energy Rank Alignment
- Aligning What Matters: Masked Latent Adaptation for Text-to-Audio-Video Generation
- Alignment of Large Language Models with Constrained Learning
- AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding
- Align Your Flow: Scaling Continuous-Time Flow Map Distillation
- ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition
- AliO: Output Alignment Matters in Long-Term Time Series Forecasting
- A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
- Alleviating Hallucinations in Large Language Models through Multi-Model Contrastive Decoding and Dynamic Hallucination Detection
- Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression
- All that structure matches does not glitter
- All You Need is One: Capsule Prompt Tuning with a Single Vector
- ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio–Language Models
- AlphaBeta is not as good as you think: a simple random games model for a better analysis of deterministic game-solving algorithms
- AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
- AlphaFold Database Debiasing for Robust Inverse Folding
- AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws
- ALTER: All-in-One Layer Pruning and Temporal Expert Routing for Efficient Diffusion Generation
- Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
- AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections
- ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
- A machine learning approach that beats Rubik's cubes
- AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction
- Ambient Diffusion Omni: Training Good Models with Bad Data
- Ambient Proteins - Training Diffusion Models on Noisy Structures
- A-Mem: Agentic Memory for LLM Agents
- A Minimalist Example of Edge-of-Stability and Progressive Sharpening
- A Minimalistic Unified Framework for Incremental Learning across Image Restoration Tasks
- Among Us: A Sandbox for Measuring and Detecting Agentic Deception
- AmorLIP: Efficient Language-Image Pretraining via Amortization
- Amortized Active Generation of Pareto Sets
- Amortized Sampling with Transferable Normalizing Flows
- Amortized Variational Transdimensional Inference
- Amplifying Prominent Representations in Multimodal Learning via Variational Dirichlet Process
- A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection
- A Multimodal BiMamba Network with Test-Time Adaptation for Emotion Recognition Based on Physiological Signals
- A multiscale analysis of mean-field transformers in the moderate interaction regime
- A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings
- AnaCP: Toward Upper-Bound Continual Learning via Analytic Contrastive Projection
- An Adaptive Algorithm for Bilevel Optimization on Riemannian Manifolds
- An Adaptive Quantum Circuit of Dempster's Rule of Combination for Uncertain Pattern Classification
- Analog Foundation Models
- Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions
- Analogy-based Multi-Turn Jailbreak against Large Language Models
- Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
- Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
- Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
- Analyzing the Power of Chain of Thought through Memorization Capabilities
- An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation
- An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations
- An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
- Anatomically inspired digital twins capture hierarchical object representations in visual cortex
- Anchor-based Maximum Discrepancy for Relative Similarity Testing
- Anchored Diffusion Language Model
- A Near-Optimal Algorithm for Decentralized Convex-Concave Finite-Sum Minimax Optimization
- A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond
- An Effective Levelling Paradigm for Unlabeled Scenarios
- An Efficient Local Search Approach for Polarized Community Discovery in Signed Networks
- An Efficient Orlicz-Sobolev Approach for Transporting Unbalanced Measures on a Graph
- An Ellipsoid Algorithm for Online Convex Optimization
- AneuG-Flow: A Large-Scale Synthetic Dataset of Diverse Intracranial Aneurysm Geometries and Hemodynamics
- An Evidence-Based Post-Hoc Adjustment Framework for Anomaly Detection Under Data Contamination
- AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant Adversarial Patches
- Angles Don’t Lie: Unlocking Training‑Efficient RL Through the Model’s Own Signals
- Angular Constraint Embedding via SpherePair Loss for Constrained Clustering
- Angular Steering: Behavior Control via Rotation in Activation Space
- AnimateQR: Bridging Aesthetics and Functionality in Dynamic QR Code Generation
- An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction
- An Information-theoretical Framework for Understanding Out-of-distribution Detection with Pretrained Vision-Language Models
- An Investigation of Memorization Risk in Healthcare Foundation Models
- An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
- AnomalyCoT: A Multi-Scenario Chain-of-Thought Dataset for Multimodal Large Language Models
- Anomaly Detection by an Ensemble of Random Pairs of Hyperspheres
- An Optimized Franz-Parisi Criterion and its Equivalence with SQ Lower Bounds
- A Novel General Framework for Sharp Lower Bounds in Succinct Stochastic Bandits
- Anti-Aliased 2D Gaussian Splatting
- Antidistillation Sampling
- Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector
- Any-stepsize Gradient Descent for Separable Data under Fenchel–Young Losses
- Anytime-valid, Bayes-assisted, Prediction-Powered Inference
- AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
- A Partition Cover Approach to Tokenization
- A Physics-preserved Transfer Learning Method for Differential Equations
- APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
- A Plug-and-Play Query Synthesis Active Learning Framework for Neural PDE Solvers
- APML: Adaptive Probabilistic Matching Loss for Robust 3D Point Cloud Reconstruction
- APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning
- Approximate Domain Unlearning for Vision-Language Models
- Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers
- Approximately Aligned Decoding
- Approximating Shapley Explanations in Reinforcement Learning
- Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions
- Approximation theory for 1-Lipschitz ResNets
- A Practical Guide for Incorporating Symmetry in Diffusion Policy
- A Pre-training Framework for Relational Data with Information-theoretic Principles
- A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding
- A Principled Path to Fitted Distributional Evaluation
- A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning
- A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input
- A Provable Approach for End-to-End Safe Reinforcement Learning
- ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting
- Architectural and Inferential Inductive Biases for Exchangeable Sequence Modeling
- ArchPower: Dataset for Architecture-Level Power Modeling of Modern CPU Design
- AREAL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
- ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation
- Are Greedy Task Orderings Better Than Random in Continual Linear Regression?
- A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees
- A Reinforcement Learning-based Bidding Strategy for Data Consumers in Auction-based Federated Learning
- Are Language Models Efficient Reasoners? A Perspective from Logic Programming
- Are Large Language Models Sensitive to the Motives Behind Communication?
- Are Large Reasoning Models Good Translation Evaluators? Analysis and Performance Boost
- A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation
- Are Pixel-Wise Metrics Reliable for Computerized Tomography Reconstruction?
- Are We Having the Wrong Nightmares About AI?
- ARGenSeg: Image Segmentation with Autoregressive Image Generation Model
- ARIA: Training Language Agents with Intention-driven Reward Aggregation
- ARM: Adaptive Reasoning Model
- Arm (BTF)
- ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction
- AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
- Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
- Artificial Intelligence for Music: Where Creativity Meets Computation
- Aruna Nannapaneni (BTF)
- A Scalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks
- Ascent Fails to Forget
- ASDSV: Multimodal Generation Made Efficient with Approximate Speculative Diffusion and Speculative Verification
- A Semantic Parsing Framework for End-to-End Time Normalization
- A Set of Generalized Components to Achieve Effective Poison-only Clean-label Backdoor Attacks with Collaborative Sample Selection and Triggers
- ASGO: Adaptive Structured Gradient Optimization
- A Signed Graph Approach to Understanding and Mitigating Oversmoothing
- A Single-Loop First-Order Algorithm for Linearly Constrained Bilevel Optimization
- A Single-Loop Gradient Algorithm for Pessimistic Bilevel Optimization via Smooth Approximation
- A Single-Swap Local Search Algorithm for k-Means of Lines
- Ask a Strong LLM Judge when Your Reward Model is Uncertain
- A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
- A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
- A solvable model of learning generative diffusion: theory and insights
- Assessing the quality of denoising diffusion models in Wasserstein distance: noisy score and optimal bounds
- Assignments for Congestion-Averse Agents: Seeking Competitive and Envy-Free Solutions
- Association-Focused Path Aggregation for Graph Fraud Detection
- A Stable Whitening Optimizer for Efficient Neural Network Training
- A Standardized Benchmark for Multilabel Antimicrobial Peptide Classification
- A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules
- A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
- AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy
- A Sustainable AI Economy Needs Data Deals That Work for Generators
- Asymmetric Dual-Lens Video Deblurring
- Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning
- Asymmetric Duos: Sidekicks Improve Uncertainty
- Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards
- Asymptotically exact variational flows via involutive MCMC kernels
- Asymptotically Stable Quaternion-valued Hopfield-structured Neural Network with Periodic Projection-based Supervised Learning Rules
- Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks
- Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering
- Asymptotic theory of SGD with a general learning-rate
- A Tale of Two Symmetries: Exploring the Loss Landscape of Equivariant Models
- A Technical Report on “Erasing the Invisible”: The 2024 NeurIPS Competition on Stress Testing Image Watermarks
- A Temporal Difference Method for Stochastic Continuous Dynamics
- A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation
- A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
- A Theory for Worst-Case vs. Average-Case Guarantees for LLMs
- A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings
- ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
- AtlasGS: Atlanta-world Guided Surface Reconstruction with Implicit Structured Gaussians
- AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
- A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
- Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra
- Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
- Atom of Thoughts for Markov LLM Test-Time Scaling
- A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity
- Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool
- Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
- Attention (as Discrete-Time Markov) Chains
- Attention-based clustering
- Attention Mechanism, Max-Affine Partition, and Universal Approximation
- Attention on the Sphere
- AttentionPredictor: Temporal Patterns Matter for KV Cache Compression
- Attention Sinks: A 'Catch, Tag, Release' Mechanism for Embeddings
- Attention with Trained Embeddings Provably Selects Important Tokens
- Attention! Your Vision Language Model Could Be Maliciously Manipulated
- Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools
- Attribution-Driven Adaptive Token Pruning for Transformers
- Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models
- Audio Super-Resolution with Latent Bridge Models
- Audio-Sync Video Generation with Multi-Stream Temporal Control
- Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
- Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives
- AudSemThinker: Enhancing Audio-Language Models Through Reasoning over Semantics of Sound
- AugGen: Synthetic Augmentation using Diffusion Models Can Improve Recognition
- Augmenting Biological Fitness Prediction Benchmarks with Landscapes Features from GraphFLA
- A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond
- A Unified Approach to Submodular Maximization Under Noise
- A unified framework for establishing the universal approximation of transformer-type architectures
- A Unified Framework for Fair Graph Generation: Theoretical Guarantees and Empirical Advances
- A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values
- A Unified Framework for the Transportability of Population-Level Causal Measures
- A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random
- A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
- A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
- A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias
- A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning
- AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping
- Auto-Compressing Networks
- Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
- AutoData: A Multi-Agent System for Open Web Data Collection
- AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise
- AutoEdit: Automatic Hyperparameter Tuning for Image Editing
- Autoencoding Random Forests
- AutoHood3D: A Multi‑Modal Benchmark for Automotive Hood Design and Fluid–Structure Interaction
- AutoJudge: Judge Decoding Without Manual Annotation
- Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection
- Automated Detection of Visual Attribute Reliance with a Self-Reflective Agent
- Automated Model Discovery via Multi-modal & Multi-step Pipeline
- Automatic Auxiliary Task Selection and Adaptive Weighting Boost Molecular Property Prediction
- Automatic Synthetic Data and Fine-grained Adaptive Feature Alignment for Composed Person Retrieval
- Automatic Visual Instrumental Variable Learning for Confounding-Resistant Domain Generalization
- Automaton Constrained Q-Learning
- AutoOpt: A Dataset and a Unified Framework for Automating Optimization Problem Solving
- AutoPartGen: Autoregressive 3D Part Generation and Discovery
- AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
- Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
- Autoregressive Models Beyond Language
- Autoregressive Motion Generation with Gaussian Mixture-Guided Latent Sampling
- AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing
- Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
- AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling
- AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
- Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation
- Availability-aware Sensor Fusion via Unified Canonical Space
- AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
- AVerImaTeC: A Dataset for Automatic Verification of Image-Text Claims with Evidence from the Web
- Avoiding exp(R) scaling in RLHF through Preference-based Exploration
- Axial Neural Networks for Dimension-Free Foundation Models
- Backdoor Cleaning without External Guidance in MLLM Fine-tuning
- BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model
- BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
- Backdoor Mitigation via Invertible Pruning Masks
- Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment
- Backward Conformal Prediction
- BADiff: Bandwidth Adaptive Diffusion Model
- BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
- Bag of Tricks for Inference-time Computation of LLM Reasoning
- Balanced Active Inference
- Balanced Conic Rectified Flow
- Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
- Balancing Gradient and Hessian Queries in Non-Convex Optimization
- Balancing Multimodal Training Through Game-Theoretic Regularization
- Balancing Performance and Costs in Best Arm Identification
- Balancing Positive and Negative Classification Error Rates in Positive-Unlabeled Learning
- BAM-ICL: Causal Hijacking In-Context Learning with Budgeted Adversarial Manipulation
- Bandit and Delayed Feedback in Online Structured Prediction
- Bandit Guided Submodular Curriculum for Adaptive Subset Selection
- BaRISTA: Brain Scale Informed Spatiotemporal Representation of Human Intracranial Neural Activity
- Bayesian Concept Bottleneck Models with LLM Priors
- Bayesian Ego-graph inference for Networked Multi-Agent Reinforcement Learning
- Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble
- Bayes optimal learning of attention-indexed models
- BayeSQP: Bayesian Optimization through Sequential Quadratic Programming
- BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
- BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading
- BEDLAM2.0: Synthetic humans and cameras in motion
- Behavior Injection: Preparing Language Models for Reinforcement Learning
- Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks
- BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation
- BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
- Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents
- Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms
- Benchmarking Large Language Models with Integer Sequence Generation Tasks
- Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering
- Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges
- Benford’s Curse: Tracing Digit Bias to Numerical Hallucination in LLMs
- Benign Overfitting in Single-Head Attention
- Bernstein–von Mises for Adaptively Collected Data
- Best-of-N Jailbreaking
- Better Estimation of the Kullback–Leibler Divergence Between Language Models
- Better Language Model Inversion by Compactly Representing Next-Token Distributions
- Better NTK Conditioning: A Free Lunch from (ReLU) Nonlinear Activation in Wide Neural Networks
- Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging
- Better Training Data Attribution via Better Inverse Hessian-Vector Products
- BevSplat: Resolving Height Ambiguity via Feature-Based Gaussian Primitives for Weakly-Supervised Cross-View Localization
- Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints
- Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
- Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
- Beyond Average Value Function in Precision Medicine: Maximum Probability-Driven Reinforcement Learning for Survival Analysis
- Beyond Benign Overfitting in Nadaraya-Watson Interpolators
- Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
- Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
- Beyond Expectations: Quantile-Guided Alignment for Risk-Calibrated Language Models
- Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
- Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation
- Beyond Last-Click: An Optimal Mechanism for Ad Attribution
- Beyond Least Squares: Uniform Approximation and the Hidden Cost of Misspecification
- Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
- BeyondMix: Leveraging Structural Priors and Long-Range Dependencies for Domain-Invariant LiDAR Segmentation
- Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation
- Beyond Node-Centric Modeling: Sketching Signed Networks with Simplicial Complexes
- Beyond Oracle: Verifier-Supervision for Instruction Hierarchy in Reasoning and Instruction-Tuned LLMs
- Beyond Pairwise Connections: Extracting High-Order Functional Brain Network Structures under Global Constraints
- Beyond Prediction: Managing the Repercussions of Machine Learning Applications
- Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation
- Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
- Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers
- Beyond Scores: Proximal Diffusion Models
- Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
- Beyond the Average: Distributional Causal Inference under Imperfect Compliance
- Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning
- Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations
- Beyond Token Probes: Hallucination Detection via Activation Tensors with ACT-ViT
- Beyond Value Functions: Single-Loop Bilevel Optimization under Flatness Conditions
- Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data
- Bézier Splatting for Fast and Differentiable Vector Graphics Rendering
- Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation
- Bidirectional Motion Transformer for Safety-Critical Traffic Scenario Generation
- Bidirectional Representations Augmented Autoregressive Biological Sequence Generation: Application in De Novo Peptide Sequencing
- Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
- BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
- Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
- Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
- BikeBench: A Bicycle Design Benchmark for Generative Models with Objectives and Constraints
- Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization: Bridging Observational and Experimental Data
- Bi-Level Knowledge Transfer for Multi-Task Multi-Agent Reinforcement Learning
- Bilevel Network Learning via Hierarchically Structured Sparsity
- Bilevel Optimization for Adversarial Learning Problems: Sharpness, Generation, and Beyond
- Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training
- Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
- BioCG: Constrained Generative Modeling for Biochemical Interaction Prediction
- BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
- Bio-Inspired Image Restoration
- BioOSS: A Bio-Inspired Oscillatory State System with Spatio-Temporal Dynamics
- BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
- Biosecurity Safeguards for Generative AI
- BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks
- Bipolar Self-attention for Spiking Transformers
- Bisecle: Binding and Separation in Continual Learning for Video Language Understanding
- BitMark: Watermarking Bitwise Autoregressive Image Generative Models
- Bits Leaked per Query: Information-Theoretic Bounds for Adversarial Attacks on LLMs
- Bit-swapping Oriented Twin-memory Multi-view Clustering in Lifelong Incomplete Scenarios
- Bivariate Matrix-valued Linear Regression (BMLR): Finite-sample performance under Identifiability and Sparsity Assumptions
- Black-Box Membership Inference Attack for LVLMs via Prior Knowledge-Calibrated Memory Probing
- Blackbox Model Provenance via Palimpsestic Membership Inference
- Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models
- Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
- BLEUBERI: BLEU is a surprisingly effective reward for instruction following
- Blindfolded Experts Generalize Better: Insights from Robotic Manipulation and Videogames
- BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
- Block-Biased Mamba for Long-Range Sequence Processing
- Block Coordinate Descent for Neural Networks Provably Finds Global Minima
- BlockDecoder: Boosting ASR Decoders with Context and Merger Modules
- Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
- BlockScan: Detecting Anomalies in Blockchain Transactions
- Blockwise Flow Matching: Improving Flow Matching Models For Efficient High-Quality Generation
- BlurDM: A Blur Diffusion Model for Image Deblurring
- BlurGuard: A Simple Approach for Robustifying Image Protection Against AI-Powered Editing
- BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset
- BMW: Bidirectionally Memory bank reWriting for Unsupervised Person Re-Identification
- BNMusic: Blending Environmental Noises into Personalized Music
- BO4Mob: Bayesian Optimization Benchmarks for High-Dimensional Urban Mobility Problem
- Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration
- BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation
- BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models
- Boosting Adversarial Transferability with Spatial Adversarial Alignment
- Boosting Generative Image Modeling via Joint Image-Feature Synthesis
- Boosting Knowledge Utilization in Multimodal Large Language Models via Adaptive Logits Fusion and Attention Reallocation
- Boosting Resilience of Large Language Models through Causality-Driven Robust Optimization
- Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation
- Boosting the Uniqueness of Neural Networks Fingerprints with Informative Triggers
- Bootstrap Off-policy with World Model
- Bootstrapping Hierarchical Autoregressive Formal Reasoner with Chain-of-Proxy-Autoformalization
- Bootstrap Your Uncertainty: Adaptive Robust Classification Driven by Optimal-Transport
- Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
- Boundary-to-Region Supervision for Offline Safe Reinforcement Learning
- Boundary-Value PDEs Meet Higher-Order Differential Topology-aware GNNs
- Bounds on the computational complexity of neurons due to dendritic morphology
- BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
- BRACE: A Benchmark for Robust Audio Caption Quality Evaluation
- BrainEC-LLM: Brain Effective Connectivity Estimation by Multiscale Mixing LLM
- BrainFlow: A Holistic Pathway of Dynamic Neural System on Manifold
- Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens
- Brain-Informed Fine-Tuning for Improved Multilingual Understanding in Language Models
- Brain-Inspired fMRI-to-Text Decoding via Incremental and Wrap-Up Language Modeling
- Brain-Like Processing Pathways Form in Models With Heterogeneous Experts
- Brain-like Variational Inference
- BrainMoE: Cognition Joint Embedding via Mixture-of-Expert Towards Robust Brain Foundation Model
- Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
- BrainODE: Neural Shape Dynamics for Age- and Disease-aware Brain Trajectories
- BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
- Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models
- BraVE: Offline Reinforcement Learning for Discrete Combinatorial Action Spaces
- BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning
- Breaking AR’s Sampling Bottleneck: Provable Acceleration via Diffusion Language Models
- Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection
- Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining
- Breaking the Compression Ceiling: Data-Free Pipeline for Ultra-Efficient Delta Compression
- Breaking the Discretization Barrier of Continuous Physics Simulation Learning
- Breaking the Frozen Subspace: Importance Sampling for Low-Rank Optimization in LLM Pretraining
- Breaking the Gradient Barrier: Unveiling Large Language Models for Strategic Classification
- Breaking the Order Barrier: Off-Policy Evaluation for Confounded POMDPs
- Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
- Breakthrough Sensor-Limited Single View: Towards Implicit Temporal Dynamics for Time Series Domain Adaptation
- BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection
- BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models
- Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity
- Bridging Brains and Concepts: Interpretable Visual Decoding from fMRI with Semantic Bottlenecks
- Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts
- Bridging Crypto with ML-based Solvers: the SAT Formulation and Benchmarks
- Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
- Bridging Equivariant GNNs and Spherical CNNs for Structured Physical Domains
- Bridging Expressivity and Scalability with Adaptive Unitary SSMs
- Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
- Bridging Scales: Spectral Theory Reveals How Local Connectivity Rules Sculpt Global Neural Dynamics in Spatially Extended Networks
- Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation
- Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness
- Bridging the Gap Between Cross-Domain Theory and Practical Application: A Case Study on Molecular Dissolution
- Bridging the gap to real-world language-grounded visual concept learning
- Bridging Theory and Practice in Link Representation with Graph Neural Networks
- Bridging Time and Linguistics: LLMs as Time Series Analyzer through Symbolization and Segmentation
- Bringing SAM to new heights: leveraging elevation data for tree crown segmentation from drone imagery
- Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
- BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent
- Bubbleformer: Forecasting Boiling with Transformers
- Buffer layers for Test-Time Adaptation
- Building 3D Representations and Generating Motions From a Single Image via Video-Generation
- BundleFlow: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization
- BurstDeflicker: A Benchmark Dataset for Flicker Removal in Dynamic Scenes
- C$^2$Prompt: Class-aware Client Knowledge Interaction for Federated Continual Learning
- C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction
- C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning
- CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward
- CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes
- CADMorph: Geometry‑Driven Parametric CAD Editing via a Plan–Generate–Verify Loop
- CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction
- Calibrating Translation Decoding with Quality Estimation on LLMs
- CaliGCL: Calibrated Graph Contrastive Learning via Partitioned Similarity and Consistency Discrimination
- CALM: Culturally Self-Aware Language Models
- CALM-PDE: Continuous and Adaptive Convolutions for Latent Space Modeling of Time-dependent PDEs
- CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
- CamEdit: Continuous Camera Parameter Control for Photorealistic Image Editing
- Cameras as Relative Positional Encoding
- CAMILA: Context-Aware Masking for Image Editing with Language Alignment
- CaMiT: A Time-Aware Car Model Dataset for Classification and Generation
- CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
- CAMO: Convergence-Aware Multi-Fidelity Bayesian Optimization
- CamSAM2: Segment Anything Accurately in Camouflaged Videos
- Can Agent Fix Agent Issues?
- Cancer Survival Analysis via Zero-shot Tumor Microenvironment Segmentation on Low-resolution Whole Slide Pathology Images
- Can Class-Priors Help Single-Positive Multi-Label Learning?
- Can Dependencies Induced by LLM-Agent Workflows Be Trusted?
- Can Diffusion Models Disentangle? A Theoretical Perspective
- Can DPO Learn Diverse Human Values? A Theoretical Scaling Law
- Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?
- Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
- Can Large Language Models Master Complex Card Games?
- Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind
- Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
- Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation
- Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning
- Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
- Can Multi-Modal LLMs Provide Live Step-by-Step Task Guidance?
- Can NeRFs "See" without Cameras?
- Can We Infer Confidential Properties of Training Data from LLMs?
- CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
- Caption This, Reason That: VLMs Caught in the Middle
- Capturing Individual Human Preferences with Reward Features
- Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
- CarbonGlobe: A Global-Scale, Multi-Decade Dataset and Benchmark for Carbon Forecasting in Forest Ecosystems
- CARE: Decoding-Time Safety Alignment via Rollback and Introspection Intervention
- Care-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson’s Disease Gait Assessment
- CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs
- CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
- Cascaded Language Models for Cost-Effective Human–AI Decision-Making
- CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs
- CAT: Circular-Convolutional Attention for Sub-Quadratic Transformers
- CAT: Content-Adaptive Image Tokenization
- CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization
- Causal Climate Emulation with Bayesian Filtering
- Causal Differentiating Concepts: Interpreting LM Behavior via Causal Representation Learning
- Causal Discovery and Inference through Next-Token Prediction
- Causal Discovery over Clusters of Variables in Markovian Systems
- CausalDynamics: A large‐scale benchmark for structural discovery of dynamical causal models
- Causal Explanation-Guided Learning for Organ Allocation
- Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers
- Causality-Induced Positional Encoding for Transformer-Based Representation Learning of Non-Sequential Features
- Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems
- Causality Meets the Table: Debiasing LLMs for Faithful TableQA via Front-Door Intervention
- Causal LLM Routing: End-to-End Regret Minimization from Observational Data
- Causally Reliable Concept Bottleneck Models
- Causal Mixture Models: Characterization and Discovery
- CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
- Causal-R: A Causal-Reasoning Geometry Problem Solver for Optimized Solution Exploration
- Causal Spatio-Temporal Prediction: An Effective and Efficient Multi-Modal Approach
- Causal Sufficiency and Necessity Improves Chain-of-Thought Reasoning
- CausalVerse: Benchmarking Causal Representation Learning with Configurable High-Fidelity Simulations
- CausalVTG: Towards Robust Video Temporal Grounding via Causal Inference
- CauScien: Uncovering Causality in Science
- CCL: Causal-aware In-context Learning for Out-of-Distribution Generalization
- CCS: Controllable and Constrained Sampling with Diffusion Models via Initial Noise Perturbation
- CDFlow: Building Invertible Layers with Circulant and Diagonal Matrices
- CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
- CellVerse: Do Large Language Models Really Understand Cell Biology?
- Center for Global Health Equity, University of Michigan (BTF)
- Center for Low-Resource Languages and Cultures (CLRLC) (BTF)
- Center for Subsurface Energy and Sustainability at The Ohio State University (BTF)
- Centering Low-Resource Languages and Cultures in the Age of Large Language Models
- Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
- Certifying Concavity and Monotonicity in Games via Sum-of-Squares Hierarchies
- Certifying Deep Network Risks and Individual Predictions with PAC-Bayes Loss via Localized Priors
- Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions
- CF-VLM: CounterFactual Vision-Language Fine-tuning
- CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research
- CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis
- CG-SSL: Concept-Guided Self-Supervised Learning
- Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation
- Chain of Execution Supervision Promotes General Reasoning in Large Language Models
- Chain-of-Model Learning for Language Model
- Chain-of-Retrieval Augmented Generation
- Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
- ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
- Channel Matters: Estimating Channel Influence for Multivariate Time Series
- Channel Simulation and Distributed Compression with Ensemble Rejection Sampling
- Characterization and Learning of Causal Graphs from Hard Interventions
- Characterizing control between interacting subsystems with deep Jacobian estimation
- Characterizing the Expressivity of Fixed-Precision Transformer Language Models
- ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
- ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
- CHASM: Unveiling Covert Advertisements on Chinese Social Media
- ChatbotID: Identifying Chatbots with Granger Causality Test
- ChatVLA-2: Vision-Language-Action Model with Open-World Reasoning
- Checklists Are Better Than Reward Models For Aligning Language Models
- CheMixHub: Datasets and Benchmarks for Chemical Mixture Property Prediction
- ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions
- ChemPile: A 250 GB Diverse and Curated Dataset for Chemical Foundation Models
- ChemX: A Collection of Chemistry Datasets for Benchmarking Automated Information Extraction
- CHiQPM: Calibrated Hierarchical Interpretable Image Classification
- Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening
- Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
- CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models
- CHPO: Constrained Hybrid-action Policy Optimization for Reinforcement Learning
- ChromFound: Towards A Universal Foundation Model for Single-Cell Chromatin Accessibility Data
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
- CIDD: Collaborative Intelligence for Structure-Based Drug Design Empowered by LLMs
- CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation
- CISPA Helmholtz Center for Information Security (BTF)
- Civic Machines Lab, TUM Think Tank, Technical University of Munich (BTF)
- Class-aware Domain Knowledge Fusion and Fission for Continual Test-Time Adaptation
- Class conditional conformal prediction for multiple inputs by p-value aggregation
- Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
- Class-wise Balancing Data Replay for Federated Class-Incremental Learning
- CLAWS: Creativity detection for LLM-generated solutions using Attention Window of Sections
- Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
- CLEAR: Command Level Annotated Dataset for Ransomware Detection
- CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
- CLEVER: A Curated Benchmark for Formally Verified Code Generation
- CleverBirds: A Multiple-Choice Benchmark for Fine-grained Human Knowledge Tracing
- CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering
- CLIMB: Class-imbalanced Learning Benchmark on Tabular Data
- ClinBench: A Standardized Multi-Domain Framework for Evaluating Large Language Models in Clinical Information Extraction
- ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World
- Clip-and-Verify: Linear Constraint-Driven Domain Clipping for Accelerating Neural Network Verification
- CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting
- CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation
- C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models
- Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
- Cloud4D: Estimating Cloud Properties at a High Spatial and Temporal Resolution
- ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive
- Clustering via Hedonic Games: New Concepts and Algorithms
- CMoB: Modality Valuation via Causal Effect for Balanced Multimodal Learning
- C-NAV: Towards Self-Evolving Continual Object Navigation in Open World
- COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation
- Coarse-to-Fine 3D Part Assembly via Semantic Super-Parts and Symmetry-Aware Pose Estimation
- Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning
- CoCoA: A Minimum Bayes Risk Framework Bridging Confidence and Consistency for Uncertainty Quantification in LLMs
- COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation
- CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model
- CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects
- CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance
- CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
- CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs
- Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks
- CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
- Codifying Character Logic in Role-Playing
- CoFFT: Chain of Foresight-Focus Thought for Visual Language Models
- CogInterp: Interpreting Cognition in Deep Learning Models
- COGNAC: Cooperative Graph-based Networked Agent Challenges for Multi-Agent Reinforcement Learning
- Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning
- Cognitive Predictive Processing: A Human-inspired Framework for Adaptive Exploration in Open-World Reinforcement Learning
- CogPhys: Assessing Cognitive Load via Multimodal Remote and Contact-based Physiological Sensing
- CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
- CoIDO: Efficient Data Selection for Visual Instruction Tuning via Coupled Importance-Diversity Optimization
- COLA: Towards Efficient Multi-Objective Reinforcement Learning with Conflict Objective Regularization in Latent Space
- Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm
- Collaborative and Confidential Junction Trees for Hybrid Bayesian Networks
- Collaborative Reasoner: Self-Improving Social Agents with Synthetic Conversations
- Collapsing Taylor Mode Automatic Differentiation
- Collective Bargaining in the Information Economy Can Address AI-Driven Power Concentration
- Collective Counterfactual Explanations: Balancing Individual Goals and Collective Dynamics
- ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness
- Color Conditional Generation with Sliced Wasserstein Guidance
- Coloring Learning for Heterophilic Graph Representation
- CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates
- Combinatorial Ski Rental Problem: Robust and Learning-Augmented Algorithms
- Combining Cost Constrained Runtime Monitors for AI Safety
- COME: Adding Scene-Centric Forecasting Control to Occupancy World Model
- ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
- Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms
- Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism
- Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
- Compact Memory for Continual Logistic Regression
- Comparator-Adaptive $\Phi$-Regret: Improved Bounds, Simpler Algorithms, and Applications to Games
- Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization
- Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming
- Competitive Advantage Attacks to Decentralized Federated Learning
- Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning
- Complete Structure Guided Point Cloud Completion via Cluster- and Instance-Level Contrastive Learning
- Complexity Scaling Laws for Neural Models using Combinatorial Optimization
- Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
- ComPO: Preference Alignment via Comparison Oracles
- Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets
- Composing Linear Layers from Irreducibles
- Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data
- Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
- Compositional Monte Carlo Tree Diffusion for Extendable Planning
- Compositional Neural Network Verification via Assume-Guarantee Reasoning
- Compositional Reasoning with Transformers, RNNs, and Chain of Thought
- Composition and Alignment of Diffusion Models using Constrained Learning
- Comprehensive Assessment and Analysis for NSFW Content Erasure in Text-to-Image Diffusion models
- Compress & Cache: Vision token compression for efficient generation and retrieval
- Compressed and Smooth Latent Space for Text Diffusion Modeling
- Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
- Compress Large Language Models via Collaboration Between Learning and Matrix Approximation
- Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples
- Computable universal online learning
- Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms
- Computational Budget Should Be Considered in Data Selection
- Computational Efficiency under Covariate Shift in Kernel Ridge Regression
- Computational Hardness of Reinforcement Learning with Partial $q^{\pi}$-Realizability
- Computation and Memory-Efficient Model Compression with Gradient Reweighting
- Compute-Optimal Scaling for Value-Based Deep RL
- ComRank: Ranking Loss for Multi-Label Complementary Label Learning
- Concentration and excess risk bounds for imbalanced classification with synthetic oversampling
- Concept-Guided Interpretability via Neural Chunking
- Concept Incongruence: An Exploration of Time and Death in Role Playing
- ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts
- Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
- Conditional Diffusion Anomaly Modeling on Graphs
- Conditional Distribution Compression via the Kernel Conditional Mean Embedding
- Conditional Forecasts and Proper Scoring Rules for Reliable and Accurate Performative Predictions
- Conditional Gradient Methods with Standard LMO for Stochastic Simple Bilevel Optimization
- Conditional Panoramic Image Generation via Masked Autoregressive Modeling
- Conditional Representation Learning for Customized Tasks
- Conditioning Matters: Training Diffusion Policies is Faster Than You Think
- Confidence-Aware With Prototype Alignment for Partial Multi-label Learning
- Conflict-Aware Knowledge Editing in the Wild: Semantic-Augmented Graph Representation for Unstructured Text
- Conformal Arbitrage: Risk-Controlled Balancing of Competing Objectives in Language Models
- Conformal Inference under High-Dimensional Covariate Shifts via Likelihood-Ratio Regularization
- Conformal Information Pursuit for Interactively Guiding Large Language Models
- Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
- Conformal Mixed-Integer Constraint Learning with Feasibility Guarantees
- Conformal Online Learning of Deep Koopman Linear Embeddings
- Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
- Conformal Prediction Beyond the Seen: A Missing Mass Perspective for Uncertainty Quantification in Generative Models
- Conformal Prediction for Causal Effects of Continuous Treatments
- Conformal Prediction for Ensembles: Improving Efficiency via Score-Based Aggregation
- Conformal Prediction for Time-series Forecasting with Change Points
- Conformal Prediction in The Loop: A Feedback-Based Uncertainty Model for Trajectory Optimization
- Conformal Prediction under Lévy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations
- Conformal Risk Training: End-to-End Optimization of Conformal Risk Control
- Confounding Robust Deep Reinforcement Learning: A Causal Approach
- ConfTuner: Training Large Language Models to Express Their Confidence Verbally
- Confusion-Driven Self-Supervised Progressively Weighted Ensemble Learning for Non-Exemplar Class Incremental Learning
- Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning
- Connecting Neural Models Latent Geometries with Relative Geodesic Representations
- Connectome-Based Modelling Reveals Orientation Maps in the Drosophila Optic Lobe
- ConnectomeBench: Can LLMs proofread the connectome?
- Consensus-Robust Transfer Attacks via Parameter and Representation Perturbations
- Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning
- Consistency Conditions for Differentiable Surrogate Losses
- Consistency of Physics-Informed Neural Networks for Second-Order Elliptic Equations
- Consistency of the $k_n$-nearest neighbor rule under adaptive sampling
- Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
- Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models
- Consistent Story Generation: Unlocking the Potential of Zigzag Sampling
- Consistent Supervised-Unsupervised Alignment for Generalized Category Discovery
- Constant Bit-size Transformers Are Turing Complete
- ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks
- Constrained Best Arm Identification
- Constrained Diffusers for Safe Planning and Control
- Constrained Discrete Diffusion
- Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
- Constrained Feedback Learning for Non-Stationary Multi-Armed Bandits
- Constrained Linear Thompson Sampling
- Constrained Optimization for Machine Learning
- Constrained Optimization From a Control Perspective via Feedback Linearization
- Constrained Posterior Sampling: Time Series Generation with Hard Constraints
- Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective
- Constructing an Optimal Behavior Basis for the Option Keyboard
- Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation
- ContextAgent: Context-Aware Proactive LLM Agents with Open-world Sensory Perceptions
- Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs
- Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
- ConTextTab: A Semantics-Aware Tabular In-Context Learner
- Contextual Dynamic Pricing with Heterogeneous Buyers
- Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
- Contextual Online Pricing with (Biased) Offline Data
- Contextual Thompson Sampling via Generation of Missing Data
- Contextual Tokenization for Graph Inverted Indices
- Contimask: Explaining Irregular Time Series via Perturbations in Continuous Time
- Continual Gaussian Mixture Distribution Modeling for Class Incremental Semantic Segmentation
- Continual Knowledge Adaptation for Reinforcement Learning
- Continual Model Merging without Data: Dual Projections for Balancing Stability and Plasticity
- Continual Multimodal Contrastive Learning
- Continual Optimization with Symmetry Teleportation for Multi-Task Learning
- Continual Release Moment Estimation with Differential Privacy
- Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models
- Continuous Concepts Removal in Text-to-image Diffusion Models
- Continuous Diffusion Model for Language Modeling
- Continuous Domain Generalization
- Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control
- Continuous Simplicial Neural Networks
- Continuous Soft Actor-Critic: An Off-Policy Learning Method Robust to Time Discretization
- Continuous Subspace Optimization for Continual Learning
- Continuous Thought Machines
- Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space
- Contrastive Consolidation of Top-Down Modulations Achieves Sparsely Supervised Continual Learning
- Contrastive Learning with Data Misalignment: Feature Purity, Training Dynamics and Theoretical Generalization Guarantees
- Contrastive Representations for Temporal Reasoning
- Contrastive Self-Supervised Learning As Neural Manifold Packing
- Contribution of task-irrelevant stimuli to drift of neural representations
- ControlFusion: A Controllable Image Fusion Network with Language-Vision Degradation Prompts
- Controllable 3D Molecular Generation for Structure-Based Drug Design Through Bayesian Flow Networks and Gradient Integration
- Controllable Human-centric Keyframe Interpolation with Generative Prior
- Controlled Visual Hallucination via Thalamus-Driven Decoupling Network for Domain Adaptation of Black-Box Predictors
- Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
- Controlling The Spread of Epidemics on Networks with Differential Privacy
- Controlling Thinking Speed in Reasoning Models
- Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
- Convergence of the Gradient Flow for Shallow ReLU Networks on Weakly Interacting Data
- Convergence Rates for Gradient Descent on the Edge of Stability for Overparametrised Least Squares
- Convergence Rates of Constrained Expected Improvement
- Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning
- Convergent Functions, Divergent Forms
- Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy
- Convex Potential Mirror Langevin Algorithm for Efficient Sampling of Energy-Based Models
- ConViS-Bench: Estimating Video Similarity Through Semantic Concepts
- Convolution Goes Higher-Order: A Biologically Inspired Mechanism Empowers Image Classification
- COOPERA: Continual Open-Ended Human-Robot Assistance
- Cooperative Bargaining Games Without Utilities: Mediated Solutions from Direction Oracles
- Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers
- CoP: Agentic Red-teaming for Large Language Models using Composition of Principles
- Co-PatcheR: Collaborative Software Patching with Component-specific Small Reasoning Models
- Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework
- CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
- CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding
- CoreaSpeech: Korean Speech Corpus via JAMO-based Coreset Selection for Efficient and Robust Korean Speech Generation
- CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks
- CORE: Collaborative Optimization with Reinforcement Learning and Evolutionary Algorithm for Floorplanning
- CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
- Co-Regularization Enhances Knowledge Transfer in High Dimensions
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
- CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs
- Coreset for Robust Geometric Median: Eliminating Size Dependency on Outliers
- Coresets for Clustering Under Stochastic Noise
- Corporate Needs You to Find the Difference: Revisiting Submodular and Supermodular Ratio Optimization Problems
- Correcting misinterpretations of additive models
- Corrector Sampling in Language Models
- Correlated Low-Rank Adaptation for ConvNets
- Correlation Dimension of Autoregressive Large Language Models
- COS3D: Collaborative Open-Vocabulary 3D Segmentation
- CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning
- Cost-Aware Contrastive Routing for LLMs
- Cost-aware LLM-based Online Dataset Annotation
- Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage
- Cost-Sensitive Freeze-thaw Bayesian Optimization for Efficient Hyperparameter Tuning
- CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision
- CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step
- CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
- CoUn: Empowering Machine Unlearning via Contrastive Learning
- Counteractive RL: Rethinking Core Principles for Efficient and Scalable Deep Reinforcement Learning
- Counterfactual Evolution of Multimodal Datasets via Visual Programming
- Counterfactual Identifiability via Dynamic Optimal Transport
- Counterfactual Image Editing with Disentangled Causal Latent Space
- Counterfactual Implicit Feedback Modeling
- Counterfactual reasoning: an analysis of in-context emergence
- Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
- Coupled Data and Measurement Space Dynamics for Enhanced Diffusion Posterior Sampling
- Coupling Generative Modeling and an Autoencoder with the Causal Bridge
- Covariances for Free: Exploiting Mean Distributions for Training-free Federated Learning
- Covariate-moderated Empirical Bayes Matrix Factorization
- Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization
- CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder
- CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
- CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic
- CPO: Condition Preference Optimization for Controllable Image Generation
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
- CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming
- CPSea: Large-scale cyclic peptide-protein complex dataset for machine learning in cyclic peptide design
- CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
- CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation
- Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models
- Credal Prediction based on Relative Likelihood
- CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning
- Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
- CroPe: Cross-Modal Semantic Compensation Adaptation for All Adverse Scene Understanding
- CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling
- Cross City Traffic Flow Generation via Retrieval Augmented Diffusion Model
- Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
- Cross-fluctuation phase transitions reveal sampling dynamics in diffusion models
- Cross-modal Associations in Vision and Language Models: Revisiting the Bouba-Kiki Effect
- Cross-Modal Representational Knowledge Distillation for Enhanced Spike-informed LFP Modeling
- CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning
- CRRL: Learning Channel-invariant Neural Representations for High-performance Cross-day Decoding
- Crucible: Quantifying the Potential of Control Algorithms through LLM Agents
- CrypticBio: A Large Multimodal Dataset for Visually Confusing Species
- CryptoMoE: Privacy-Preserving and Scalable Mixture of Experts Inference via Balanced Expert Routing
- C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails
- CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding
- C-SEO Bench: Does Conversational SEO Work?
- CSGO: Content-Style Composition in Text-to-Image Generation
- CSI-Bench: A Large-Scale In-the-Wild Dataset for Multi-task WiFi Sensing
- CSPCL: Category Semantic Prior Contrastive Learning for Deformable DETR-Based Prohibited Item Detectors
- CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
- Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL
- CTSketch: Compositional Tensor Sketching for Scalable Neurosymbolic Learning
- Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation
- CURE: Co-Evolving Coders and Unit Testers via Reinforcement Learning
- CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models
- Curious Causality-Seeking Agents Learn Meta Causal World
- Curl Descent: Non-Gradient Learning Dynamics with Sign-Diverse Plasticity
- Curly Flow Matching for Learning Non-gradient Field Dynamics
- Curriculum Abductive Learning
- Curriculum Design for Trajectory-Constrained Agent: Compressing Chain-of-Thought Tokens in LLMs
- Curriculum Model Merging: Harmonizing Chemical LLMs for Enhanced Cross-Task Generalization
- Curvature Tuning: Provable Training-free Model Steering From a Single Parameter
- CURV: Coherent Uncertainty-Aware Reasoning in Vision-Language Models for X-Ray Report Generation
- CVGL: Causal Learning and Geometric Topology
- CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
- Cycle-Sync: Robust Global Camera Pose Estimation through Enhanced Cycle-Consistent Synchronization
- Cyclic Counterfactuals under Shift–Scale Interventions
- CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning
- CymbaDiff: Structured Spatial Diffusion for Sketch-based 3D Semantic Urban Scene Generation
- Cypher-RI: Reinforcement Learning for Integrating Schema Selection into Cypher Generation
- D$^2$GS: Dense Depth Regularization for LiDAR-free Urban Scene Reconstruction
- d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
- D2SA: Dual-Stage Distribution and Slice Adaptation for Efficient Test-Time Adaptation in MRI Reconstruction
- DAA: Amplifying Unknown Discrepancy for Test-Time Discovery
- DAAC: Discrepancy-Aware Adaptive Contrastive Learning for Medical Time series
- DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning
- DAMamba: Vision State Space Model with Dynamic Adaptive Scan
- DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
- DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization
- DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
- Data-Adaptive Exposure Thresholds under Network Interference
- Data-Dependent Regret Bounds for Constrained MABs
- Data-Driven Performance Guarantees for Classical and Learned Optimizers
- Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning
- Data-Free Model Extraction for Black-box Recommender Systems via Graph Convolutions
- Data Fusion for Partial Identification of Causal Effects
- Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
- Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
- Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
- Data on the Brain and Mind
- Data Privacy, Memorization, & Legal Implications in Generative AI: A Practical Guide
- DataRater: Meta-Learned Dataset Curation
- Data Selection Matters: Towards Robust Instruction Tuning of Large Multimodal Models
- Dataset Distillation for Pre-Trained Self-Supervised Vision Models
- Dataset Distillation of 3D Point Clouds via Distribution Matching
- Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality
- DataSIR: A Benchmark Dataset for Sensitive Information Recognition
- DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models
- DAVE: Diagnostic benchmark for Audio Visual Evaluation
- DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space
- DBLoss: Decomposition-based Loss Function for Time Series Forecasting
- DC4GS: Directional Consistency-Driven Adaptive Density Control for 3D Gaussian Splatting
- DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
- DCA: Graph-Guided Deep Embedding Clustering for Brain Atlases
- DCcluster-Opt: Benchmarking Dynamic Multi-Objective Optimization for Geo-Distributed Data Center Workloads
- DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing
- DEAL: Diffusion Evolution Adversarial Learning for Sim-to-Real Transfer
- Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
- DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models
- DeCaFlow: A deconfounding causal generative model
- Decentralized Dynamic Cooperation of Personalized Models for Federated Continual Learning
- DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
- Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery
- Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
- Decoding Causal Structure: End-to-End Mediation Pathways Inference
- Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation
- DecompNet: Enhancing Time Series Forecasting Models with Implicit Decomposition
- Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components
- Decomposing motor units through elimination for real-time intention driven assistive neurotechnology
- Decomposing stimulus-specific sensory neural information via diffusion models
- Decoupled Entropy Minimization
- Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
- DecoyDB: A Dataset for Graph Contrastive Learning in Protein-Ligand Binding Affinity Prediction
- Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport
- DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis
- Deep Compositional Phase Diffusion for Long Motion Sequence Generation
- Deep Continuous-Time State-Space Models for Marked Event Sequences
- DeepDiver: Adaptive Web-Search Intensity Scaling via Reinforcement Learning
- Deep Edge Filter: Return of the Human-Crafted Layer in Deep Learning
- Deeper with Riemannian Geometry: Overcoming Oversmoothing and Oversquashing for Graph Foundation Models
- Deep Gaussian from Motion: Exploring 3D Geometric Foundation Models for Gaussian Splatting
- DeepHalo: A Neural Choice Model with Controllable Context Effects
- DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer
- Deep Learning for Code in the Agentic Era
- Deep learning for continuous-time stochastic control with jumps
- Deep Learning IndabaX Uganda (BTF)
- Deep Learning with Plausible Deniability
- Deep Legendre Transform
- Deep Nonlinear Sufficient Dimension Reduction
- Deep RL Needs Deep Behavior Analysis: Exploring Implicit Planning by Model-Free Agents in Open-Ended Environments
- Deep Taxonomic Networks for Unsupervised Hierarchical Prototype Discovery
- Deep Tree Tensor Networks
- Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences
- Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding
- DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
- Defending Multimodal Backdoored Models by Repulsive Visual Prompt Tuning
- Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts
- Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs
- DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models
- DEGauss: Defending Against Malicious 3D Editing for Gaussian Splatting
- Degradation-Aware Dynamic Schrödinger Bridge for Unpaired Image Restoration
- Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
- Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs
- Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
- DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method
- DeltaFormer: Unlock the state space of Transformer
- DeltaPhi: Physical States Residual Learning for Neural Operators in Data-Limited PDE Solving
- DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
- Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy
- Delving into Large Language Models for Effective Time-Series Anomaly Detection
- Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
- Democratizing Clinical Risk Prediction with Cross-Cohort Cross-Modal Knowledge Transfer
- Demystifying depth: Principles of learning in deep neural networks
- Demystifying Language Model Forgetting with Low-rank Example Associations
- Demystifying Network Foundation Models
- Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
- Demystifying Spectral Feature Learning for Instrumental Variable Regression
- Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling
- Deno-IF: Unsupervised Noisy Visible and Infrared Image Fusion Method
- DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration
- Denoising Trajectory Biases for Zero-Shot AI-Generated Image Detection
- Dense Associative Memory with Epanechnikov Energy
- Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
- DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
- Dense Metric Depth Estimation via Event-based Differential Focus Volume Prompting
- Dense SAE Latents Are Features, Not Bugs
- Density Ratio-Free Doubly Robust Proxy Causal Learning
- DePass: Unified Feature Attributing by Simple Decomposed Forward Pass
- Dependency Matters: Enhancing LLM Reasoning with Explicit Knowledge Grounding
- Dependency Parsing is More Parameter-Efficient with Normalization
- Deployment Efficient Reward-Free Exploration with Linear Function Approximation
- Depth-Bounds for Neural Networks via the Braid Arrangement
- Depth-Supervised Fusion Network for Seamless-Free Image Stitching
- DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
- Depth-Width Tradeoffs for Transformers on Graph Tasks
- DERD-Net: Learning Depth from Event-based Ray Densities
- Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-based Decoding
- DermaCon-IN: A Multiconcept-Annotated Dermatological Image Dataset of Indian Skin Disorders for Clinical AI Research
- Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference
- DesignX: Human-Competitive Algorithm Designer for Black-Box Optimization
- Detecting Data Deviations in Electronic Health Records
- Detecting Generated Images by Fitting Natural Image Distributions
- Detecting High-Stakes Interactions with Activation Probes
- DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding
- Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
- DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning
- DevFD: Developmental Face Forgery Detection by Learning Shared and Orthogonal LoRA Subspaces
- DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
- DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
- DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models
- DGCBench: A Deep Graph Clustering Benchmark
- DGH: Dynamic Gaussian Hair
- DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos
- DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration
- Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking
- DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
- DiCoFlex: Model-Agnostic Diverse Counterfactuals with Flexible Control
- DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
- DictPFL: Efficient and Private Federated Learning on Encrypted Gradients
- DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning
- DiffBreak: Is Diffusion-Based Purification Robust?
- DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy
- Differentiable Constraint-Based Causal Discovery
- Differentiable Cyclic Causal Discovery Under Unmeasured Confounders
- Differentiable Decision Tree via "ReLU+Argmin" Reformulation
- Differentiable extensions with rounding guarantees for combinatorial optimization over permutations
- Differentiable Generalized Sliced Wasserstein Plans
- Differentiable Hierarchical Visual Tokenization
- Differentiable Learning of Combinatorial Algorithms: From Theory To Practice
- Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization
- Differentiable Structure Learning and Causal Discovery for General Binary Data
- Differentially Private Bilevel Optimization: Efficient Algorithms with Near-Optimal Rates
- Differentially Private Federated Low Rank Adaptation Beyond Fixed-Matrix
- Differentially Private Gomory-Hu Trees
- Differentially Private High-dimensional Variable Selection via Integer Programming
- Differentially Private Quantiles with Smaller Error
- Differentially Private Relational Learning with Entity-level Privacy Guarantees
- Differential Privacy for Euclidean Jordan Algebra with Applications to Private Symmetric Cone Programming
- Differential Privacy on Fully Dynamic Streams
- Differentiation Through Black-Box Quadratic Programming Solvers
- DiffEye: Diffusion-Based Continuous Eye-Tracking Data Generation Conditioned on Natural Images
- Diff-ICMH: Harmonizing Machine and Human Vision in Image Compression with Generative Prior
- DiffLiG: Diffusion-enhanced Liquid Graph with Attention Propagation for Grid-to-Station Precipitation Correction
- DIFFSSR: Stereo Image Super-resolution Using Differential Transformer
- Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
- Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models
- Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics
- Diffusion Beats Autoregressive in Data-Constrained Settings
- Diffusion Classifiers Understand Compositionality, but Conditions Apply
- Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
- Diffusion-Driven Progressive Target Manipulation for Source-Free Domain Adaptation
- Diffusion-Driven Two-Stage Active Learning for Low-Budget Semantic Segmentation
- Diffusion Feature Field for Text-based 3D Editing with Gaussian Splatting
- Diffusion Federated Dataset
- Diffusion Generative Modeling on Lie Group Representations
- Diffusion Guided Adversarial State Perturbations in Reinforcement Learning
- Diffusion-Guided Graph Data Augmentation
- Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
- Diffusion Models and the Manifold Hypothesis: Log-Domain Smoothing is Geometry Adaptive
- Diffusion Models Meet Contextual Bandits
- Diffusion on Demand: Selective Caching and Modulation for Efficient Generation
- Diffusion Transformers as Open-World Spatiotemporal Foundation Models
- Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification
- Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models
- Dimension-adapted Momentum Outscales SGD
- Dimensional Collapse in VQVAEs: Evidence and Remedies
- Dimensionality Mismatch Between Brains and Artificial Neural Networks
- Dimension-free Score Matching and Time Bootstrapping for Diffusion Models
- Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis
- DINGO: Constrained Inference for Diffusion LLMs
- DINO-Foresight: Looking into the Future with DINO
- DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
- Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
- Direct Alignment with Heterogeneous Preferences
- Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models
- Direct Fisher Score Estimation for Likelihood Maximization
- Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning
- DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
- DISC: Dynamic Decomposition Improves LLM Inference Scaling
- DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models
- DISCO: Disentangled Communication Steering for Large Language Models
- DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
- DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
- Discovering Compositional Hallucinations in LVLMs
- Discovering Data Structures: Nearest Neighbor Search and Beyond
- Discovering Important Experts for Mixture-of-Experts Models Pruning Through a Theoretical Perspective
- Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation
- Discovering Opinion Intervals from Conflicts in Signed Graphs
- Discovering Symbolic Partial Differential Equation by Abductive Learning
- Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees
- Discrete Neural Flow Samplers with Locally Equivariant Transformer
- Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling
- Discretization-free Multicalibration through Loss Minimization over Tree Ensembles
- Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
- Disentangled Cross-Modal Representation Learning with Enhanced Mutual Supervision
- Disentangled Representation Learning via Modular Compositional Bias
- Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
- Disentangling Hyperedges through the Lens of Category Theory
- Disentangling Latent Shifts of In-Context Learning with Weak Supervision
- Disentangling misreporting from genuine adaptation in strategic settings: a causal approach
- Disentangling Superpositions: Interpretable Brain Encoding Model with Sparse Concept Atoms
- DisMo: Disentangled Motion Representations for Open-World Motion Transfer
- DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging
- Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search
- Distance-informed Neural Processes
- Distances for Markov chains from sample streams
- Distil-E2D: Distilling Image-to-Depth Priors for Event-Based Monocular Depth Estimation
- Distillation Robustifies Unlearning
- Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
- Distilling LLM Agent into Small Models with Retrieval and Code Tools
- Distilling LLM Prior to Flow Model for Generalizable Agent’s Imagination in Object Goal Navigation
- Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?
- Distributed mediation analysis with communication efficiency
- Distributed Multi-Agent Bandits Over Erdős-Rényi Random Networks
- Distributional Adversarial Attacks and Training in Deep Hedging
- Distributional Autoencoders Know the Score
- Distribution-Aligned Decoding for Efficient LLM Task Adaptation
- Distributional LLM-as-a-Judge
- Distributionally Robust Feature Selection
- Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation
- Distributionally Robust Performative Optimization
- Distributional Training Data Attribution: What do Influence Functions Sample?
- Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks
- Distribution Learning Meets Graph Structure Sampling
- Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values
- Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum
- DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection
- Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability
- Diversifying Parallel Ergodic Search: A Signature Kernel Evolution Strategy
- Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data
- Diversity-Aware Policy Optimization for Large Language Model Reasoning
- Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
- Diversity-oriented Deep Multi-modal Clustering
- DKDR: Dynamic Knowledge Distillation for Reliability in Federated Learning
- dKV-Cache: The Cache for Diffusion Language Models
- DLoFT: Gradient-Decoupled Fine-Tuning for Generalizable Long Chain-of-Thought Reasoning
- DMol: A Highly Efficient and Chemical Motif-Preserving Molecule Generation Platform
- DMWM: Dual-Mind World Model with Long-Term Imagination
- DNA-DetectLLM: Unveiling AI-Generated Text via a DNA-Inspired Mutation-Repair Paradigm
- DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
- Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
- Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback
- Document Summarization with Conformal Importance Guarantees
- Do different prompting methods yield a common task representation in language models?
- DoDo-Code: an Efficient Levenshtein Distance Embedding-based Code for 4-ary IDS Channel
- Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers?
- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
- Does Representation Guarantee Welfare?
- Does Stochastic Gradient really succeed for bandits?
- Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
- Do Language Models Use Their Depth Efficiently?
- Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
- Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns
- Domain Adaptive Hashing Retrieval via VLM Assisted Pseudo-Labeling and Dual Space Adaptation
- Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection
- Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations
- Do Neural Networks Need Gradient Descent to Generalize? A Theoretical Study
- Don't be lazy: CompleteP enables compute-efficient deep transformers
- Don’t call it privacy-preserving or human-centric pose estimation if you don’t measure privacy
- Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models
- Don’t Give Up on Democratizing AI for the Wrong Reasons
- Don't Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retention
- Don’t Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
- DON’T NEED RETRAINING: A Mixture of DETR and Vision Foundation Models for Cross-Domain Few-Shot Object Detection
- Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
- Don’t Trade Off Safety: Diffusion Regularization for Constrained Offline RL
- Doodle to Detect: A Goofy but Powerful Approach to Skeleton-based Hand Gesture Recognition
- Do-PFN: In-Context Learning for Causal Effect Estimation
- DoseSurv: Predicting Personalized Survival Outcomes under Continuous-Valued Treatments
- DOTA: Distributional Test-time Adaptation of Vision-Language Models
- Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the Role of Model Complexity
- Doubly Robust Alignment for Large Language Models
- Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings
- DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution
- DOVTrack: Data-Efficient Open-Vocabulary Tracking
- Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data
- DP²O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution
- DPA: A one-stop metric to measure bias amplification in classification datasets
- DPAIL: Training Diffusion Policy for Adversarial Imitation Learning without Policy Optimization
- DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
- DQVis Dataset: Natural Language to Biomedical Visualization
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
- DreamLight: Towards Harmonious and Consistent Image Relighting
- DreamPRM: Domain-reweighted Process Reward Model for Multimodal Reasoning
- DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
- DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
- DrivAerStar: An Industrial-Grade CFD Dataset for Vehicle Aerodynamic Optimization
- DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving
- DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving
- DroneAudioset: An Audio Dataset for Drone-based Search and Rescue
- Dropout Regularization Versus l2-Penalization in the Linear Model
- Dr. RAW: Towards General High-Level Vision from RAW with Efficient Task Conditioning
- DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
- DSAS: A Universal Plug-and-Play Framework for Attention Optimization in Multi-Document Question Answering
- DSCS: Fast CPDAG-Based Verification of Collapsible Submodels in High-Dimensional Bayesian Networks
- DSRF: A Dynamic and Scalable Reasoning Framework for Solving RPMs
- Dual Alignment Framework for Few-shot Learning with Inter-Set and Intra-Set Shifts
- DualCnst: Enhancing Zero-Shot Out-of-Distribution Detection via Text-Image Consistency in Vision-Language Models
- Dual-Comb Ghost Imaging with Transformer-Based Reconstruction for Optical Fiber Endomicroscopy
- Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
- DualEqui: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules
- Dual-Flow: Transferable Multi-Target, Instance-Agnostic Attacks via $\textit{In-the-wild}$ Cascading Flow Optimization
- DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints
- DUAL: Learning Diverse Kernels for Aggregated Two-sample and Independence Testing
- DualMPNN: Harnessing Structural Alignments for High-Recovery Inverse Protein Folding
- DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers
- Dual-Path Temporal Decoder for End-to-End Multi-Object Tracking
- Dual Prototype-Enhanced Contrastive Framework for Class-Imbalanced Graph Domain Adaptation
- Dual-Res Tandem Mamba-3D: Bilateral Breast Lesion Detection and Classification on Non-contrast Chest CT
- Dual-Space Semantic Synergy Distillation for Continual Learning of Unlabeled Streams
- Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
- DUET: Dual-Perspective Pseudo Labeling and Uncertainty-aware Exploration & Exploitation Training for Source-Free Domain Adaptation
- DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion
- DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs
- DUO: No Compromise to Accuracy Degradation
- DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference
- D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images
- DyFlow: Dynamic Workflow Framework for Agentic Reasoning
- DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs
- DyMoDreamer: World Modeling with Dynamic Modulation
- DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs
- DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
- DynaGuide: Steering Diffusion Policies with Active Dynamic Guidance
- Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
- Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks
- Dynamic Algorithm for Explainable $k$-medians Clustering under $\ell_p$ Norm
- Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
- Dynamical modeling of nonlinear latent factors in multiscale neural activity with real-time inference
- Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding
- Dynamic and Chemical Constraints to Enhance the Molecular Masked Graph Autoencoders
- Dynamic Bundling with Large Language Models for Zero-Shot Inference on Text-Attributed Graphs
- Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph
- Dynamic Diameter in High-Dimensions against Adaptive Adversary and Beyond
- Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions
- Dynamic Focused Masking for Autoregressive Embodied Occupancy Prediction
- Dynamic Gaussian Splatting from Defocused and Motion-blurred Monocular Videos
- Dynamic Masking and Auxiliary Hash Learning for Enhanced Cross-Modal Retrieval
- DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
- Dynamic Regret Reduces to Kernelized Static Regret
- Dynamic Risk Assessments for Offensive Cybersecurity Agents
- Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization
- Dynamics at the Frontiers of Optimization, Sampling, and Games
- Dynamic Semantic-Aware Correlation Modeling for UAV Tracking
- Dynamic Shadow Unveils Invisible Semantics for Video Outpainting
- Dynamic Siamese Expansion Framework for Improving Robustness in Online Continual Learning
- Dynamics of Spontaneous Topic Changes in Next Token Prediction with Self-Attention
- Dynamic Test-Time Compute Scaling in Control Policy: Difficulty-Aware Stochastic Interpolant Policy
- DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
- Dynamic View Synthesis as an Inverse Problem
- DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding
- DynaNav: Dynamic Feature and Layer Selection for Efficient Visual Navigation
- DynaPhArM: Adaptive and Physics-Constrained Modeling for Target-Drug Complexes with Drug-Specific Adaptations
- DynaPipe: Dynamic Layer Redistribution for Efficient Serving of LLMs with Pipeline Parallelism
- DynaRend: Learning 3D Dynamics via Masked Future Rendering for Robotic Manipulation
- Dyn-O: Building Structured World Models with Object-Centric Representations
- E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis
- E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products
- EA3D: Online Open-World 3D Object Extraction from Streaming Videos
- Each Complexity Deserves a Pruning Policy
- EAG3R: Event-Augmented 3D Geometry Estimation for Dynamic and Extreme-Lighting Scenes
- Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
- EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
- EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification
- EAReranker: Efficient Embedding Adequacy Assessment for Retrieval Augmented Generation
- EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
- E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models
- EchoShot: Multi-Shot Portrait Video Generation
- ECO: Evolving Core Knowledge for Efficient Transfer
- EconGym: A Scalable AI Testbed with Diverse Economic Tasks
- EDBench: Large-Scale Electron Density Data for Molecular Modeling
- EddyFormer: Accelerated Neural Simulations of Three-Dimensional Turbulence at Scale
- EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
- Edit Flows: Variable Length Discrete Flow Matching with Sequence-Level Edit Operations
- EditInfinity: Image Editing with Binary-Quantized Generative Models
- Edit Less, Achieve More: Dynamic Sparse Neuron Masking for Lifelong Knowledge Editing in LLMs
- EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting
- Effective Neural Approximations for Geometric Optimization Problems
- Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives
- Effects of Dropout on Performance in Long-range Graph Learning Tasks
- EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code
- Efficient $k$-Sparse Band–Limited Interpolation with Improved Approximation Ratio
- Efficient Adaptive Experimentation with Noncompliance
- Efficient Adaptive Federated Optimization
- Efficient Algorithms for Robust and Partial Semi-Discrete Optimal Transport
- Efficient Allocation of Working Memory Resource for Utility Maximization in Humans and Recurrent Neural Networks
- Efficient and Generalizable Mixed-Precision Quantization via Topological Entropy
- Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
- Efficient Bayesian Experiment Design with Equivariant Networks
- Efficient Data Selection at Scale via Influence Distillation
- Efficient Fairness-Performance Pareto Front Computation
- Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients
- Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
- Efficient Kernelized Learning in Polyhedral Games beyond Full Information: From Colonel Blotto to Congestion Games
- Efficient Knowledge Transfer in Federated Recommendation for Joint Venture Ecosystem
- Efficient Large Language Model Inference with Neural Block Linearization
- Efficient Last-Iterate Convergence in Solving Extensive-Form Games
- Efficient Low Rank Attention for Long-Context Inference in Large Language Models
- Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity
- Efficiently Maintaining the Multilingual Capacity of MCLIP in Downstream Cross-Modal Retrieval Tasks
- Efficiently Scaling LLM Reasoning Programs with Certaindex
- Efficiently Verifiable Proofs of Data Attribution
- Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
- Efficient Multimodal Dataset Distillation via Generative Models
- Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
- EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
- Efficient PAC Learning for Realizable-Statistic Models via Convex Surrogates
- Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems
- Efficient Part-level 3D Object Generation via Dual Volume Packing
- Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees
- Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
- Efficient Pre-Training of LLMs via Topology-Aware Communication Alignment on More Than 9600 GPUs
- Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
- Efficient Quadratic Corrections for Frank-Wolfe Algorithms
- Efficient Randomized Experiments Using Foundation Models
- Efficient RAW Image Deblurring with Adaptive Frequency Modulation
- Efficient Rectified Flow for Image Fusion
- Efficient Representativeness-Aware Coreset Selection
- Efficient Safe Meta-Reinforcement Learning: Provable Near-Optimality and Anytime Safety
- Efficient semantic uncertainty quantification in language models via diversity-steered sampling
- Efficient Spectral Control of Partially Observed Linear Dynamical Systems
- Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
- Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
- Efficient Training of Minimal and Maximal Low-Rank Recurrent Neural Networks
- Efficient Transformers: State of the art in pruning, sparse attention, and transformer funneling
- Efficient Utility-Preserving Machine Unlearning with Implicit Gradient Surgery
- Efficient Verified Unlearning For Distillation
- EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
- Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models
- EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis
- EgoBlind: Towards Egocentric Visual Assistance for the Blind
- EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
- EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
- egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-world Tasks
- EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs
- EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
- EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
- ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism
- Elastic Robust Unlearning of Specific Knowledge in Large Language Models
- Elastic ViTs from Pretrained Models without Retraining
- ELDET: Early-Learning Distillation with Noisy Labels for Object Detection
- ELECTRA: A Cartesian Network for 3D Charge Density Prediction with Floating Orbitals
- Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
- Eliciting Reasoning in Language Models with Cognitive Tools
- ElliCE: Efficient and Provably Robust Algorithmic Recourse via the Rashomon Sets
- Elucidated Rolling Diffusion Models for Probabilistic Forecasting of Complex Dynamics
- Eluder dimension: localise it!
- Embedding Principle of Homogeneous Neural Network for Classification Problem
- Embeddings as Probabilistic Equivalence in Logic Programs
- Embodied Cognition Augmented End2End Autonomous Driving
- Embodied Crowd Counting
- Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
- Embodied World Models for Decision Making
- Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems
- Embracing Trustworthy Brain-Agent Collaboration as Paradigm Extension for Intelligent Assistive Technologies
- Emergence and Evolution of Interpretable Concepts in Diffusion Models
- Emergence and scaling laws in SGD learning of shallow neural networks
- Emergence of Linear Truth Encodings in Language Models
- Emergent Risk Awareness in Rational Agents under Resource Constraints
- Emergent Temporal Correspondences from Video Diffusion Transformers
- EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge
- Emerging Risks from Embodied AI Require Urgent Policy Action
- EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction
- E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization
- EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition
- Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
- Empowering Decision Trees via Shape Function Branching
- Empower Words: DualGround for Structured Phrase and Sentence-Level Temporal Grounding
- Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers, and Gradient Clipping
- Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
- Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
- EnCompass: Enhancing Agent Programming with Search Over Program Execution Paths
- EndoBench: A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis
- End-to-End Low-Light Enhancement for Object Detection with Learned Metadata from RAWs
- End-to-End Vision Tokenizer Tuning
- Energy and Power as First-Class ML Design Metrics
- Energy-based generator matching: A neural sampler for general state space
- Energy Landscape-Aware Vision Transformers: Layerwise Dynamics and Adaptive Task-Specific Training via Hopfield States
- Energy Loss Functions for Physical Systems
- Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling
- EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
- Enforcing convex constraints in Graph Neural Networks
- Enforcing Hard Linear Constraints in Deep Learning Models with Decision Rules
- EngiBench: A Framework for Data-Driven Engineering Design Research
- Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models
- Enhanced Expert Merging for Mixture-of-Experts in Graph Foundation Models
- Enhanced Self-Distillation Framework for Efficient Spiking Neural Network Training
- Enhancing 3D Reconstruction for Dynamic Scenes
- Enhancing Bioactivity Prediction via Spatial Emptiness Representation of Protein-ligand Complex and Union of Multiple Pockets
- Enhancing CLIP Robustness via Cross-Modality Alignment
- Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
- Enhancing Consistency of Flow-Based Image Editing through Kalman Control
- Enhancing Contrastive Learning with Variable Similarity
- Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection
- Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment
- Enhancing Graph Classification Robustness with Singular Pooling
- Enhancing GUI Agent with Uncertainty-Aware Self-Trained Evaluator
- Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark
- Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering
- Enhancing LLM Planning for Robotics Manipulation through Hierarchical Procedural Knowledge Graphs
- Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks
- Enhancing Multilingual LLM Pretraining with Model-Based Data Selection
- Enhancing Optimizer Stability: Momentum Adaptation of The NGN Step-size
- Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
- Enhancing Privacy in Multimodal Federated Learning with Information Theory
- Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
- Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples
- Enhancing Tactile-based Reinforcement Learning for Robotic Control
- Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders
- Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
- Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting
- Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling
- Enhancing Training Data Attribution with Representational Optimization
- Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding
- Enhancing Visual Prompting through Expanded Transformation Space and Overfitting Mitigation
- Enhancing Zero-Shot Black-Box Optimization via Pretrained Models with Efficient Population Modeling, Interaction, and Stable Gradient Approximation
- Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
- ENMA: Tokenwise Autoregression for Continuous Neural PDE Operators
- Entropic Time Schedulers for Generative Diffusion Models
- Entropy-Calibrated Label Distribution Learning
- Entropy Rectifying Guidance for Diffusion and Flow Models
- Environment Inference for Learning Generalizable Dynamical System
- Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
- EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation
- EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
- EPA: Boosting Event-based Video Frame Interpolation with Perceptually Aligned Learning
- EPFL-Smart-Kitchen: An Ego-Exo Multi-Modal Dataset for Challenging Action and Motion Understanding in Video-Language Models
- Epistemic Uncertainty Estimation in Regression Ensemble Models with Pairwise Epistemic Estimators
- Epistemic Uncertainty for Generated Image Detection
- Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
- Equi-mRNA: Protein Translation Equivariant Encoding for mRNA Language Models
- EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Network
- Equivariance by Contrast: Identifiable Equivariant Embeddings from Unlabeled Finite Group Actions
- Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models
- Equivariant Eikonal Neural Networks: Grid-Free, Scalable Travel-Time Prediction on Homogeneous Spaces
- EraseFlow: Learning Concept Erasure Policies via GFlowNet-Driven Alignment
- Erasing Conceptual Knowledge from Language Models
- Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum
- Error Forcing in Recurrent Neural Networks
- ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space
- ESCA: Contextualizing Embodied Agents via Scene-Graph Generation
- ESCA: Enabling Seamless Codec Avatar Execution through Algorithm and Hardware Co-Optimization for Virtual Reality
- Escaping Collapse: The Strength of Weak Data for Large Language Model Training
- Escaping saddle points without Lipschitz smoothness: the power of nonlinear preconditioning
- Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?
- ESCORT: Efficient Stein-variational and Sliced Consistency-Optimized Temporal Belief Representation for POMDPs
- Establishing Best Practices in Building Rigorous Agentic Benchmarks
- Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses
- Estimating cognitive biases with attention-aware inverse planning
- Estimating Hitting Times Locally at Scale
- Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning
- Estimating Model Performance Under Covariate Shift Without Labels
- Estimation and Inference in Distributional Reinforcement Learning
- Estimation of Stochastic Optimal Transport Maps
- Estimation of Treatment Effects in Extreme and Unobserved Data
- EUGens: Efficient, Unified and General Dense Layers
- Eulerian Neural Network Informed by Chemical Transport for Air Quality Forecasting
- EuroSpeech: A Multilingual Speech Corpus
- EVAAA: A Virtual Environment Platform for Essential Variables in Autonomous and Adaptive Agents
- EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
- Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death
- Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
- Evaluating LLM-contaminated Crowdsourcing Data Without Ground Truth
- Evaluating LLMs in Open-Source Games
- Evaluating multiple models using labeled and unlabeled data
- Evaluating Program Semantics Reasoning with Type Inference in System $F$
- Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations
- Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling
- Evaluating the Inductive Abilities of Large Language Models: Why Chain-of-Thought Reasoning Sometimes Hurts More Than Helps
- Eve3D: Elevating Vision Models for Enhanced 3D Surface Reconstruction via Gaussian Splatting
- Event-based HDR Structured Light
- Event-Driven Dynamic Scene Depth Completion
- Event-Guided Consistent Video Enhancement with Modality-Adaptive Diffusion Pipeline
- EventMG: Efficient Multilevel Mamba-Graph Learning for Spatiotemporal Event Representation
- EverybodyDance: Bipartite Graph–Based Identity Correspondence for Multi-Character Animation
- Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling
- EvoBrain: Dynamic Multi-Channel EEG Graph Modeling for Time-Evolving Brain Networks
- EVODiff: Entropy-aware Variance Optimized Diffusion Inference
- EvoLM: In Search of Lost Language Model Training Dynamics
- Evolutionary Multi-View Classification via Eliminating Individual Fitness Bias
- Evolutionary Prediction Games
- Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
- Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
- EvolvedGRPO: Unlocking Reasoning in LVLMs via Progressive Instruction Evolution
- Evolving and Regularizing Meta-Environment Learner for Fine-Grained Few-Shot Class-Incremental Learning
- EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions
- Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable
- ExAct: A Video-Language Benchmark for Expert Action Analysis
- Exact Expressive Power of Transformers with Padding
- Execution Guided Line-by-Line Code Generation
- ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models
- Explainable Reinforcement Learning from Human Feedback to Improve Alignment
- Explainably Safe Reinforcement Learning
- Explain AI Models: Methods and Opportunities in Explainable AI, Data-Centric AI, and Mechanistic Interpretability
- Explaining and Mitigating Crosslingual Tokenizer Inequities
- Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions
- Explaining the Law of Supply and Demand via Online Learning
- Explicitly Modeling Subcortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness
- Exploiting Dynamic Sparsity in Einsum
- Exploiting LLMs for Automatic Hypothesis Assessment via a Logit-Based Calibrated Prior
- Exploiting Task Relationships in Continual Learning via Transferability-Aware Task Embeddings
- Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
- Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
- Exploration from a Primal-Dual Lens: Value-Incentivized Actor-Critic Methods for Sample-Efficient Online RL
- Exploration via Feature Perturbation in Contextual Bandits
- Explore In-Context Message Passing Operator for Graph Neural Networks in A Mean Field Game
- Exploring and Exploiting Model Uncertainty in Bayesian Optimization
- Exploring and Leveraging Class Vectors for Classifier Editing
- Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
- Exploring Diffusion Transformer Designs via Grafting
- Exploring Landscapes for Better Minima along Valleys
- Exploring Neural Granger Causality with xLSTMs: Unveiling Temporal Dependencies in Complex Data
- Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
- Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction
- Exploring Structural Degradation in Dense Representations for Self-supervised Learning
- Exploring the Design Space of Diffusion Bridge Models
- Exploring the limits of strong membership inference attacks on large language models
- Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization
- Exploring the Noise Robustness of Online Conformal Prediction
- Exploring the Translation Mechanism of Large Language Models
- Exploring Tradeoffs through Mode Connectivity for Multi-Task Learning
- Exponential Convergence Guarantees for Iterative Markovian Fitting
- Exponential Dynamic Energy Network for High Capacity Sequence Memory
- ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
- Extracting task-relevant preserved dynamics from contrastive aligned neural recordings
- Extragradient Method for $(L_0, L_1)$-Lipschitz Root-finding Problems
- Extrapolation by Association: Length Generalization Transfer In Transformers
- Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation
- EyeBench: Predictive Modeling from Eye Movements in Reading
- Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
- FACE: A General Framework for Mapping Collaborative Filtering Embeddings into LLM Tokens
- FACE: Faithful Automatic Concept Extraction
- Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants
- FaCT: Faithful Concept Traces for Explaining Neural Network Decisions
- FACT: Mitigating Inconsistent Hallucinations in LLMs via Fact-Driven Alternating Code-Text Training
- Factor Decorrelation Enhanced Data Removal from Deep Predictive Models
- Factorio Learning Environment
- Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
- F-Adapter: Frequency-Adaptive Parameter-Efficient Fine-Tuning in Scientific Machine Learning
- Fading to Grow: Growing Preference Ratios via Preference Fading Discrete Diffusion for Recommendation
- FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation
- Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
- Failure Prediction at Runtime for Generative Robot Policies
- FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes
- Fair Continuous Resource Allocation with Equality of Impact
- Fair Cooperation in Mixed-Motive Games via Conflict-Aware Gradient Adjustment
- FairDD: Fair Dataset Distillation
- Fair Deepfake Detectors Can Generalize
- FairDICE: Fairness-Driven Offline Multi-Objective Reinforcement Learning
- FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models
- Fair Matroid Selection
- Fair Minimum Labeling: Efficient Temporal Network Activations for Reachability and Equity
- Fairness-aware Anomaly Detection via Fair Projection
- Fairness-aware Bayes Optimal Functional Classification
- Fairness-Regularized Online Optimization with Switching Costs
- Fairness under Competition
- FairNet: Dynamic Fairness Correction without Performance Loss via Contrastive Conditional LoRA
- Fair Representation Learning with Controllable High Confidence Guarantees via Adversarial Inference
- Fairshare Data Pricing via Data Valuation for Large Language Models
- FAIR Universe HiggsML Uncertainty Dataset and Competition
- Faithful Dynamic Imitation Learning from Human Intervention with Dynamic Regret Minimization
- Faithful Group Shapley Value
- FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design
- FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
- FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic
- FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
- FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning
- Fantastic Bugs and Where to Find Them in AI Benchmarks
- Fantastic Features and Where to Find Them: A Probing Method to combine Features from Multiple Foundation Models
- FAPEX: Fractional Amplitude-Phase Expressor for Robust Cross-Subject Seizure Prediction
- Far from the Shallow: Brain-Predictive Reasoning Embedding through Residual Disentanglement
- Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
- Fast attention mechanisms: a tale of parallelism
- Fast Computation and Optimization for Opinion-Based Quantities of Friedkin-Johnsen Model
- Fast constrained sampling in pre-trained diffusion models
- Fast Data Attribution for Text-to-Image Models
- FastDINOv2: Frequency Based Curriculum Learning Improves Robustness and Training Speed
- Faster Algorithms for Structured John Ellipsoid Computation
- Faster Fixed-Point Methods for Multichain MDPs
- Faster Generic Identification in Tree-Shaped Structural Causal Models
- Faster Video Diffusion with Trainable Sparse Attention
- Fast exact recovery of noisy matrix from few entries: the infinity norm approach
- FAST: Foreground-aware Diffusion with Accelerated Sampling Trajectory for Segmentation-oriented Anomaly Synthesis
- Fast Inference for Augmented Large Language Models
- Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning
- FastJAM: a Fast Joint Alignment Model for Images
- Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
- Fast Local Search Algorithms for Clustering with Adaptive Sampling and Bandit Strategies
- FastLongSpeech: Enhancing Large Speech-Language Models for Efficient Long-Speech Processing
- Fast Monte Carlo Tree Diffusion: 100× Speedup via Parallel and Sparse Planning
- Fast MRI for All: Bridging Access Gaps by Training without Raw Data
- Fast Non-Log-Concave Sampling under Nonconvex Equality and Inequality Constraints with Landing
- Fast Projection-Free Approach (without Optimization Oracle) for Optimization over Compact Convex Set
- Fast Rate Bounds for Multi-Task and Meta-Learning with Different Sample Sizes
- Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning
- Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
- Fast Training of Large Kernel Models with Delayed Projections
- FastVID: Dynamic Density Pruning for Fast Video Large Language Models
- Fast Zeroth-Order Convex Optimization with Quantum Gradient Methods
- FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding
- Feasibility-Aware Decision-Focused Learning for Predicting Parameters in the Constraints
- FEAT: Free energy Estimators with Adaptive Transport
- Feature-aware Modulation for Learning from Temporal Tabular Data
- Feature-Based Instance Neighbor Discovery: Advanced Stable Test-Time Adaptation in Dynamic World
- Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning
- Feature Unlearning: Theoretical Foundations and Practical Applications with Shuffling
- FedEL: Federated Elastic Learning for Heterogeneous Devices
- Federated Continual Learning via Orchestrating Multi-Scale Expertise
- Federated Dialogue-Semantic Diffusion for Emotion Recognition under Incomplete Modalities
- Federated Multi-armed Bandits with Efficient Bit-Level Communications
- FedFACT: A Provable Framework for Controllable Group-Fairness Calibration in Federated Learning
- FedFree: Breaking Knowledge-sharing Barriers through Layer-wise Alignment in Heterogeneous Federated Learning
- FedGPS: Statistical Rectification Against Data Heterogeneity in Federated Learning
- FedIGL: Federated Invariant Graph Learning for Non-IID Graphs
- FedLPA: Local Prior Alignment for Heterogeneous Federated Generalized Category Discovery
- FedMGP: Personalized Federated Learning with Multi-Group Text-Visual Prompts
- FedQS: Optimizing Gradient and Model Aggregation for Semi-Asynchronous Federated Learning
- FedRACE: A Hierarchical and Statistical Framework for Robust Federated Learning
- FedRAM: Federated Reweighting and Aggregation for Multi-Task Learning
- FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling
- FedRW: Efficient Privacy-Preserving Data Reweighting for Enhancing Federated Learning of Language Models
- FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA
- FedWMSAM: Fast and Flat Federated Learning via Weighted Momentum and Sharpness-Aware Minimization
- Feedback-Aware MCTS for Goal-Oriented Information Seeking
- FEEDBACK FRICTION: LLMs Struggle to Fully Incorporate External Feedback
- Feedback Guidance of Diffusion Models
- Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos
- Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
- FEEL: Quantifying Heterogeneity in Physiological Signals for Generalizable Emotion Recognition
- FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies
- Few-Shot Knowledge Distillation of LLMs With Counterfactual Explanations
- Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling
- FFN Fusion: Rethinking Sequential Computation in Large Language Models
- FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models
- FHGS: Feature-Homogenized Gaussian Splatting
- FIGRDock: Fast Interaction-Guided Regression for Flexible Docking
- Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining
- Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation
- Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
- Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms
- Finding Low-Rank Matrix Weights in DNNs via Riemannian Optimization: RAdaGrad and RAdamW
- Finding separatrices of dynamical flows with Deep Koopman Eigenfunctions
- Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization
- Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems
- Fine-grained List-wise Alignment for Generative Medication Recommendation
- Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
- FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges
- FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning
- Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
- Finite Sample Analyses for Continuous-time Linear Systems: System Identification and Online Control
- Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
- Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
- Finite-Time Analysis of Stochastic Nonconvex Nonsmooth Optimization on the Riemannian Manifolds
- Finite-Time Bounds for Average-Reward Fitted Q-Iteration
- FIPER: Factorized Features for Robust Image Super-Resolution and Compression
- Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
- Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360° Firefighting Video
- First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
- First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
- First Workshop on LLM Persona Modeling
- Fisher meets Feynman: score-based variational inference with a product of experts
- Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models
- Fixed-Point RNNs: Interpolating from Diagonal to Dense
- Fix False Transparency by Noise Guided Splatting
- Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance
- FLAME: Fast Long-context Adaptive Memory for Event-based Vision
- FlareX: A Physics-Informed Dataset for Lens Flare Removal via 2D Synthesis and 3D Rendering
- FlashBias: Fast Computation of Attention with Bias
- Flash Invariant Point Attention
- FlashMD: long-stride, universal prediction of molecular dynamics
- FlashMoE: Fast Distributed MoE in a Single Kernel
- FlashMo: Geometric Interpolants and Frequency-Aware Sparsity for Scalable Efficient Motion Generation
- Flat Channels to Infinity in Neural Loss Landscapes
- Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
- Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
- Flattening Hierarchies with Policy Bootstrapping
- FlexAC: Towards Flexible Control of Associative Reasoning in Multimodal Large Language Models
- FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
- Flexible inference for animal learning rules using neural networks
- Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
- Flexible MOF Generation with Torsion-Aware Flow Matching
- Flexible Realignment of Language Models
- Flex-Judge: Text-Only Reasoning Unleashes Zero-Shot Multimodal Evaluators
- FlexOLMo: Open Language Models for Flexible Data Use
- FlexSelect: Flexible Token Selection for Efficient Long Video Understanding
- FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
- FlexWorld: Progressively Expanding 3D Scenes for Flexible-View Exploration
- Flick: Empowering Federated Learning with Commonsense Knowledge
- FLiP: Towards Comprehensive and Reliable Evaluation of Federated Prompt Learning
- Flow based approach for Dynamic Temporal Causal models with non-Gaussian or Heteroscedastic Noises
- Flow-Based Policy for Online Reinforcement Learning
- FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
- FlowDAS: A Stochastic Interpolant-based Framework for Data Assimilation
- Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning
- Flow Equivariant Recurrent Neural Networks
- FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
- FlowFeat: Pixel-Dense Embedding of Motion Profiles
- Flow Field Reconstruction with Sensor Placement Policy Learning
- Flow-GRPO: Training Flow Matching Models via Online RL
- FLOWING: Implicit Neural Flows for Structure-Preserving Morphing
- Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
- Flow Matching Neural Processes
- FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting
- FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed Mixture-of-Experts Training
- FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
- FlowNet: Modeling Dynamic Spatio-Temporal Systems via Flow Propagation
- FlowPrune: Accelerating Attention Flow Calculation by Pruning Flow Network
- FlowRefiner: A Robust Traffic Classification Framework against Label Noise
- Flux4D: Flow-based Unsupervised 4D Reconstruction
- FLUX: Efficient Descriptor-Driven Clustered Federated Learning under Arbitrary Distribution Shifts
- FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts
- FlySearch: Exploring how vision-language models explore
- FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators
- FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
- FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering
- Focus-Then-Reuse: Fast Adaptation in Visual Perturbation Environments
- FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
- FoGE: Fock Space inspired encoding for graph prompting
- Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models
- Follow-the-Perturbed-Leader Nearly Achieves Best-of-Both-Worlds for the m-Set Semi-Bandit Problems
- For Better or for Worse, Transformers Seek Patterns for Memorization
- ForceFM: Enhancing Protein-Ligand Predictions through Force-Guided Flow Matching
- Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals
- ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation
- Forecasting in Offline Reinforcement Learning for Non-stationary Environments
- ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization
- Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
- ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection
- Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation
- FORLA: Federated Object-centric Representation Learning with Slot Attention
- Formal Models of Active Learning from Contrastive Examples
- Fortifying Time Series: DTW-Certified Robust Anomaly Detection
- Fostering the Ecosystem of AI for Social Impact Requires Expanding and Strengthening Evaluation Standards
- Foundation Cures Personalization: Improving Personalized Models’ Prompt Consistency via Hidden Foundation Knowledge
- Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition
- Foundation Models for the Brain and Body Workshop
- Foundations of Imitation Learning: From Language Modeling to Continuous Control
- Foundations of Reasoning in Language Models
- Foundations of Tensor/Low-Rank Computations for AI
- Foundations of Top-$k$ Decoding for Language Models
- Fourier Analysis Network
- Fourier Clouds: Fast Bias Correction for Imbalanced Semi-Supervised Learning
- Fourier Token Merging: Understanding and Capitalizing Frequency Domain for Efficient Image Generation
- FP4 All the Way: Fully Quantized Training of Large Language Models
- FP64 is All You Need: Rethinking Failure Modes in Physics-Informed Neural Networks
- FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion
- FracFace: Breaking The Visual Clues—Fractal-Based Privacy-Preserving Face Recognition
- Fractional Diffusion Bridge Models
- Fractional Langevin Dynamics for Combinatorial Optimization via Polynomial-Time Escape
- Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
- Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
- FrameShield: Adversarially Robust Video Anomaly Detection
- FRAM: Frobenius-Regularized Assignment Matching with Mixed-Precision Computing
- FraPPE: Fast and Efficient Preference-Based Pure Exploration
- FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network
- Fréchet Geodesic Boosting
- FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction
- FreeInv: Free Lunch for Improving DDIM Inversion
- Free-Lunch Color-Texture Disentanglement for Stylized Image Generation
- FreqExit: Enabling Early-Exit Inference for Visual Autoregressive Models via Frequency-Aware Guidance
- FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency
- FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
- Frequency-Aware Token Reduction for Efficient Vision Transformer
- FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
- FRN: Fractal-Based Recursive Spectral Reconstruction Network
- From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications
- From Benchmarks to Problems - A Perspective on Problem Finding in AI
- From Black-box to Causal-box: Towards Building More Interpretable Models
- From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
- From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
- From Contextual Combinatorial Semi-Bandits to Bandit List Classification: Improved Sample Complexity with Sparse Rewards
- From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks
- From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging
- From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization
- From Euler to AI: Unifying Formulas for Mathematical Constants
- From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots
- From Faults to Features: Pretraining to Learn Robust Representations against Sensor Failures
- From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
- From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
- From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
- From Human Attention to Diagnosis: Semantic Patch-Level Integration of Vision-Language Models in Medical Imaging
- From Indicators to Insights: Diversity-Optimized for Medical Series-Text Decoding via LLMs
- From Information to Generative Exponent: Learning Rate Induces Phase Transitions in SGD
- From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring
- From Kolmogorov to Cauchy: Shallow XNet Surpasses KANs
- From Likelihood to Fitness: Improving Variant Effect Prediction in Protein and Genome Language Models
- From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning
- From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
- From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes
- From Pixels to Views: Learning Angular-Aware and Physics-Consistent Representations for Light Field Microscopy
- From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos
- From Pose to Muscle: Multimodal Learning for Piano Hand Muscle Electromyography
- From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models
- From Programs to Poses: Factored Real-World Scene Generation via Learned Program Libraries
- From Replication to Redesign: Exploring Pairwise Comparisons for LLM-Based Peer Review
- From Self-Check to Consensus: Bayesian Strategic Decoding in Large Language Models
- From Sequence to Structure: Uncovering Substructure Reasoning in Transformers
- From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
- From Softmax to Score: Transformers Can Effectively Implement In-Context Denoising Steps
- From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes
- From stability of Langevin diffusion to convergence of proximal MCMC for non-log-concave sampling
- From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
- From Synapses to Dynamics: Obtaining Function from Structure in a Connectome Constrained Model of the Head Direction Circuit
- From Tuning to Guarantees: Statistically Valid Hyperparameter Selection
- Frontiers in Probabilistic Inference: Learning meets Sampling
- FSEO: Few-Shot Evolutionary Optimization via Meta-Learning for Expensive Multi-Objective Optimization
- FSI-Edit: Frequency and Stochasticity Injection for Flexible Diffusion-Based Image Editing
- FSNet: Feasibility-Seeking Neural Network for Constrained Optimization with Guarantees
- FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities
- Fully Autonomous Neuromorphic Navigation and Dynamic Obstacle Avoidance
- Fully Dynamic Algorithms for Chamfer Distance
- Fully Spiking Neural Networks for Unified Frame-Event Object Tracking
- FuncGenFoil: Airfoil Generation and Editing Model in Function Space
- Functional Complexity-adaptive Temporal Tensor Decomposition
- Functional data analysis for multivariate distributions through Wasserstein slicing
- Functional Matching of Logic Subgraphs: Beyond Structural Isomorphism
- Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules
- Functional Virtual Adversarial Training for Semi-Supervised Time Series Classification
- Fundamental Limitations in Pointwise Defences of LLM Finetuning APIs
- Fuse2Match: Training-Free Fusion of Flow, Diffusion, and Contrastive Models for Zero-Shot Semantic Matching
- Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
- Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution
- Future Link Prediction Without Memory or Aggregation
- FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
- FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution
- Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty
- Gains: Fine-grained Federated Domain Adaptation in Open Set
- GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning
- GAMMA: Gated Multi-hop Message Passing for Homophily-Agnostic Node Representation in GNNs
- GaRA-SAM: Robustifying Segment Anything Model with Gated-Rank Adaptation
- GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
- Gated Integration of Low-Rank Adaptation for Continual Learning of Large Language Models
- Gatekeeper: Improving Model Cascades Through Confidence Tuning
- Gate to the Vessel: Residual Experts Restore What SAM Overlooks
- GauSAM: Contour-Guided 2D Gaussian Fields for Multi-Scale Medical Image Segmentation with Segment Anything
- Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
- Gaussian-Augmented Physics Simulation and System Identification with Complex Colliders
- GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving
- Gaussian Herding across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS
- Gaussian Processes for Shuffled Regression
- Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits
- Gaussian Regression-Driven Tensorized Incomplete Multi-View Clustering with Dual Manifold Regularization
- Gaze Beyond the Frame: Forecasting Egocentric 3D Visual Span
- Gaze-VLM: Bridging Gaze and VLMs through Attention Regularization for Egocentric Understanding
- GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights
- GD$^2$: Robust Graph Learning under Label Noise via Dual-View Prediction Discrepancy
- GeGS-PCR: Fast and Robust Color 3D Point Cloud Registration with Two-Stage Geometric-3DGS Fusion
- GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws
- GenColor: Generative and Expressive Color Enhancement with Pixel-Perfect Texture Preservation
- GeneFlow: Translation of Single-cell Gene Expression to Histopathological Images via Rectified Flow
- GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data
- Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training
- Generalizable Hand-Object Modeling from Monocular RGB Images via 3D Gaussians
- Generalizable Insights for Graph Transformers in Theory and Practice
- Generalizable, real-time neural decoding with hybrid state-space models
- Generalizable Reasoning through Compositional Energy Minimization
- Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
- Generalization Bounds for Kolmogorov-Arnold Networks (KANs) and Enhanced KANs with Lower Lipschitz Complexity
- Generalization Bounds for Model-based Algorithm Configuration
- Generalization Bounds for Rank-sparse Neural Networks
- Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
- Generalization Guarantees for Learning Score-Based Branch-and-Cut Policies in Integer Programming
- Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
- Generalization vs Specialization under Concept Shift
- Generalized and Invariant Single-Neuron In-Vivo Activity Representation Learning
- Generalized Category Discovery under Domain Shift: A Frequency Domain Perspective
- Generalized Contrastive Learning for Universal Multimodal Retrieval
- Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness
- Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update
- Generalized Linear Mode Connectivity for Transformers
- Generalized Top-k Mallows Model for Ranked Choices
- Generalizing Experience for Language Agents with Hierarchical MetaFlows
- Generalizing Single-Frame Supervision to Event-Level Understanding for Video Anomaly Detection
- Generalizing Verifiable Instruction Following
- Generalizing while preserving monotonicity in comparison-based preference learning models
- General-Reasoner: Advancing LLM Reasoning Across All Domains
- Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling
- Generating and Checking DNN Verification Proofs
- Generating Computational Cognitive models using Large Language Models
- Generating Creative Chess Puzzles
- Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations
- Generating Informative Samples for Risk-Averse Fine-Tuning of Downstream Tasks
- Generating Multi-Table Time Series EHR from Latent Space with Minimal Preprocessing
- Generating Physically Sound Designs from Text and a Set of Physical Constraints
- Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
- Generative AI in Finance
- Generative Caching for Structurally Similar Prompts and Responses
- Generative Data Augmentation via Diffusion Distillation, Adversarial Alignment, and Importance Reweighting
- Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms
- Generative Distribution Embeddings
- Generative Graph Pattern Machine
- Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings
- Generative Model Inversion Through the Lens of the Manifold Hypothesis
- Generative Perception of Shape and Material from Differential Motion
- Generative Pre-trained Autoregressive Diffusion Transformer
- Generative RLHF-V: Learning Principles from Multi-modal Human Preference
- Generative Trajectory Stitching through Diffusion Composition
- Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions
- Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders
- Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency
- GenIR: Generative Visual Feedback for Mental Image Retrieval
- GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
- GenProCC: 1st Workshop on Generative and Protective AI for Content Creation
- GenSpace: Benchmarking Spatially-Aware Image Generation
- GeoAda: Efficiently Finetune Geometric Diffusion Models with Equivariant Adapters
- GeoCAD: Local Geometry-Controllable CAD Generation with Large Language Models
- GeoClip: Geometry-Aware Clipping for Differentially Private SGD
- GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion
- GeoLink: Empowering Remote Sensing Foundation Model with OpenStreetMap Data
- GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
- Geometric Algebra-Enhanced Bayesian Flow Network for RNA Inverse Design
- Geometric Algorithms for Neural Combinatorial Optimization with Constraints
- Geometric Imbalance in Semi-Supervised Node Classification
- Geometric Learning with Positively Decomposable Kernels
- Geometric Logit Decoupling for Energy-Based Graph Out-of-distribution Detection
- Geometric Mixture Models for Electrolyte Conductivity Prediction
- Geometry-Aware Collaborative Multi-Solutions Optimizer for Model Fine-Tuning with Parameter Efficiency
- Geometry-Aware Edge Pooling for Graph Neural Networks
- Geometry Aware Operator Transformer as an efficient and accurate neural surrogate for PDEs on arbitrary domains
- Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts
- Geometry of Decision Making in Language Models
- GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization
- GeoRemover: Removing Objects and Their Causal Visual Artifacts
- Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
- Geospatial Foundation Models: Overview, Application and Benchmarking
- GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction
- GeoVideo: Introducing Geometric Regularization into Video Generation Model
- GeRaF: Neural Geometry Reconstruction from Radio Frequency Signals
- GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
- GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility
- Glance2Gaze: Efficient Vision-Language Models from Glance Fusion to Gaze Compression
- GLID$^2$E: A Gradient-Free Lightweight Fine-tune Approach for Discrete Biological Sequence Design
- GLNCD: Graph-Level Novel Category Discovery
- Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm
- Globally Optimal Policy Gradient Algorithms for Reinforcement Learning with PID Control Policies
- Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks
- Global Minimizers of Sigmoid Contrastive Loss
- Global Prompt Refinement with Non-Interfering Attention Masking for One-Shot Federated Learning
- GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI
- Glocal Information Bottleneck for Time Series Imputation
- GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity
- GLVD: Guided Learned Vertex Descent
- G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems
- GMM-based VAE model with Normalising Flow for effective stochastic segmentation
- GMV: A Unified and Efficient Graph Multi-View Learning Framework
- G-Net: A Provably Easy Construction of High-Accuracy Random Binary Neural Networks
- GnnXemplar: Exemplars to Explanations - Natural Language Rules for Global GNN Interpretability
- GoalLadder: Incremental Goal Discovery with Vision-Language Models
- GOATex: Geometry & Occlusion-Aware Texturing
- Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics
- GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection
- GoRA: Gradient-driven Adaptive Low Rank Adaptation
- GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
- Go With the Flow: Fast Diffusion for Gaussian Mixture Models
- GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling
- GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers
- GPO: Learning from Critical Steps to Improve LLM Reasoning
- GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation
- GPU-Accelerated and Scalable Optimization (ScaleOpt)
- Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
- Gradient Descent as Loss Landscape Navigation: a Normative Framework for Deriving Learning Rules
- Gradient-Guided Epsilon Constraint Method for Online Continual Learning
- Gradient Multi-Normalization for Efficient LLM Training
- Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
- Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness
- Gradient-Weight Alignment as a Train-Time Proxy for Generalization in Classification Tasks
- GradMetaNet: An Equivariant Architecture for Learning on Gradients
- GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
- Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
- GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining
- Graph Alignment via Birkhoff Relaxation
- Graph-based Symbolic Regression with Invariance and Constraint Encoding
- GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining
- Graph Data Selection for Domain Adaptation: A Model-Free Approach
- Graph Diffusion that can Insert and Delete
- Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration
- GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation
- Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models
- GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data
- GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
- Graph Neural Network Based Action Ranking for Planning
- Graph Persistence goes Spectral
- Graphs Help Graphs: Multi-Agent Graph Socialized Learning
- Graph-Smoothed Bayesian Black-Box Shift Estimator and Its Information Geometry
- Graph-Theoretic Insights into Bayesian Personalized Ranking for Recommendation
- GraphTOP: Graph Topology-Oriented Prompting for Graph Neural Networks
- Graph Your Own Prompt
- Grasp2Grasp: Vision-Based Dexterous Grasp Translation via Schrödinger Bridges
- GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection
- GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning
- Greed is Good: A Unifying Perspective on Guided Generation
- Greedy Algorithms for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
- Greedy Sampling Is Provably Efficient For RLHF
- GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction
- GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
- Grids Often Outperform Implicit Neural Representation at Compressing Dense Signals
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
- GRIP: A Graph-Based Reasoning Instruction Producer
- GRIT: Teaching MLLMs to Think with Images
- Ground-Compose-Reinforce: Grounding Language in Agentic Behaviours using Limited Data
- Grounded Reinforcement Learning for Visual Reasoning
- Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
- Group-in-Group Policy Optimization for LLM Agent Training
- Group-Level Data Selection for Efficient Pretraining
- GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation
- GSAlign: Geometric and Semantic Alignment Network for Aerial-Ground Person Re-Identification
- GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
- GSPN-2: Efficient Parallel Sequence Modeling
- GSRF: Complex-Valued 3D Gaussian Splatting for Efficient Radio-Frequency Data Synthesis
- GST-UNet: A Neural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding
- GTPBD: A Fine-Grained Global Terraced Parcel and Boundary Dataset
- GTR-Loc: Geospatial Text Regularization Assisted Outdoor LiDAR Localization
- Guarantees for Alternating Least Squares in Overparameterized Tensor Decompositions
- GUARD: Constructing Realistic Two-Player Matrix and Security Games for Benchmarking Game-Theoretic Algorithms
- GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
- Guard Me If You Know Me: Protecting Specific Face-Identity from Deepfakes
- GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
- Guided Diffusion Sampling on Function Spaces with Applications to PDEs
- GUIDED: Granular Understanding via Identification, Detection, and Discrimination for Fine-Grained Open-Vocabulary Object Detection
- GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer
- Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
- Guiding LLM Decision-Making with Fairness Reward Models
- GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
- GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
- GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior
- GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation
- GVPO: Group Variance Policy Optimization for Large Language Model Post-Training
- Gymnasium: A Standard Interface for Reinforcement Learning Environments
- GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
- H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting
- Hadamard Test is Sufficient for Efficient Quantum Gradient Estimation with Lie Algebraic Symmetries
- Hadamax Encoding: Elevating Performance in Model-Free Atari
- HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
- HairFree: Compositional 2D Head Prior for Text-Driven 360° Bald Texture Synthesis
- Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
- HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs
- Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
- Hamiltonian Neural PDE Solvers through Functional Approximation
- Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization
- Handling Missing Responses under Cluster Dependence with Applications to Language Model Evaluation
- Hankel Singular Value Regularization for Highly Compressible State Space Models
- HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
- HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class
- Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
- Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
- Harnessing Feature Resonance under Arbitrary Target Alignment for Out-of-Distribution Node Detection
- Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
- Harnessing the Universal Geometry of Embeddings
- Harvard (BTF)
- Hawaii: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models
- HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks
- Hawk: Leveraging Spatial Context for Faster Autoregressive Text-to-Image Generation
- HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
- HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving
- Head Pursuit: Probing Attention Specialization in Multimodal Transformers
- Heavy-Ball Momentum Method in Continuous Time and Discretization Error Analysis
- HeavyWater and SimplexWater: Distortion-free LLM Watermarks for Low-Entropy Distributions
- HEIR: Learning Graph-Based Motion Hierarchies
- HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
- HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
- Hephaestus: Mixture Generative Modeling with Energy Guidance for Large-scale QoS Degradation
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
- HeroFilter: Adaptive Spectral Graph Filter for Varying Heterophilic Relations
- Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
- Heterogeneous Adversarial Play in Interactive Environments
- Heterogeneous Diffusion Structure Inference for Network Cascade
- Heterogeneous Graph Transformers for Simultaneous Mobile Multi-Robot Task Allocation and Scheduling under Temporal Constraints
- Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
- HetSyn: Versatile Timescale Integration in Spiking Neural Networks via Heterogeneous Synapses
- Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity
- HIDISC: A Hyperbolic Framework for Domain Generalization with Generalized Category Discovery
- Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
- Hierarchical Demonstration Order Optimization for Many-shot In-Context Learning
- Hierarchical Fine-grained Preference Optimization for Physically Plausible Video Generation
- Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
- Hierarchical Implicit Neural Emulators
- Hierarchical Information Aggregation for Incomplete Multimodal Alzheimer's Disease Diagnosis
- Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory
- Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems
- Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
- Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
- Hierarchical Semantic-Augmented Navigation: Optimal Transport and Graph-Driven Reasoning for Vision-Language Navigation
- Hierarchical Shortest-Path Graph Kernel Network
- HiFC: High-efficiency Flash-based KV Cache Swapping for Scaling LLM Inference
- HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
- High-Dimensional Calibration from Swap Regret
- High-dimensional neuronal activity from low-dimensional latent dynamics: a solvable model
- High Dynamic Range Imaging with Time-Encoding Spike Camera
- Higher-Order Learning with Graph Neural Networks via Hypergraph Encodings
- Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval
- High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction
- High-Order Flow Matching: Unified Framework and Sharp Statistical Rates
- High-order Interactions Modeling for Interpretable Multi-Agent Q-Learning
- High-Performance Arithmetic Circuit Optimization via Differentiable Architecture Search
- High Resolution UDF Meshing via Iterative Networks
- HiMoLE: Towards OOD-Robust LoRA via Hierarchical Mixture of Experts
- HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell data
- Hippocampal-like Sequential Editing for Continual Knowledge Updates in Large Language Models
- HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
- HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
- HMVLM: Human Motion-Vision-Language Model via MoE LoRA
- HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction
- HOComp: Interaction-Aware Human-Object Composition
- Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
- HOI-Dyn: Learning Interaction Dynamics for Human-Object Motion Diffusion
- HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis
- Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting
- Holistic Order Prediction in Natural Scenes
- HoliTom: Holistic Token Merging for Fast Video Large Language Models
- HollowFlow: Efficient Sample Likelihood Evaluation using Hollow Message Passing
- HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
- HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video
- Homogeneous Algorithms Can Reduce Competition in Personalized Pricing
- Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs
- HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
- HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models
- Horizon Reduction Makes RL Scalable
- HoT-VI: Reparameterizable Variational Inference for Capturing Instance-Level High-Order Correlations
- HouseLayout3D: A Benchmark and Training-free Baseline for 3D Layout Estimation in the Wild
- How Benchmark Prediction from Fewer Data Misses the Mark
- How Classifier Features Transfer to Downstream: An Asymptotic Analysis in a Two-Layer Model
- How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs
- How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning
- How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime?
- How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation
- How Does Topology Bias Distort Message Passing in Graph Recommender? A Dirichlet Energy Perspective
- How do Transformers Learn Implicit Reasoning?
- How Ensembles of Distilled Policies Improve Generalisation in Reinforcement Learning
- How Far Are We from Optimal Reasoning Efficiency?
- How Many Domains Suffice for Domain Generalization? A Tight Characterization via the Domain Shattering Dimension
- How many measurements are enough? Bayesian recovery in inverse problems with general distributions
- How Many Tokens Do 3D Point Cloud Transformer Architectures Really Need?
- How Memory in Optimization Algorithms Implicitly Modifies the Loss
- How Particle System Theory Enhances Hypergraph Message Passing
- How Patterns Dictate Learnability in Sequential Data
- How to Auto-optimize Prompts for Domain Tasks? Adaptive Prompting and Reasoning through Evolutionary Domain Knowledge Adaptation
- How to build a consistency model: Learning flow maps via self-distillation
- How to Build Agents to Generate Kernels for Faster LLMs (and Other Models!)
- How to Learn a Star: Binary Classification with Starshaped Polyhedral Sets
- How to Scale Second-Order Optimization
- How to Train Your LLM Web Agent: A Statistical Diagnosis
- How Well Can Differential Privacy Be Audited in One Run?
- HPSERec: A Hierarchical Partitioning and Stepwise Enhancement Framework for Long-tailed Sequential Recommendation
- HQA-VLAttack: Towards High Quality Adversarial Attack on Vision-Language Pre-Trained Models
- H-SPLID: HSIC-based Saliency Preserving Latent Information Decomposition
- HubGT: Fast Graph Transformer with Decoupled Hierarchy Labeling
- Human-AI Alignment: Foundations, Methods, Practice, and Challenges
- Human-assisted Robotic Policy Refinement via Action Preference Optimization
- HumanCrafter: Synergizing Generalizable Human Reconstruction and Semantic 3D Segmentation
- HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
- Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection
- Hybrid Autoencoders for Tabular Data: Leveraging Model-Based Augmentation in Low-Label Settings
- Hybrid-Balance GFlowNet for Solving Vehicle Routing Problems
- Hybrid Boundary Physics-Informed Neural Networks for Solving Navier-Stokes Equations with Complex Boundary
- Hybrid-Collaborative Augmentation and Contrastive Sample Adaptive-Differential Awareness for Robust Attributed Graph Clustering
- Hybrid Latent Reasoning via Reinforcement Learning
- Hybrid Latent Representations for PDE Emulation
- HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
- Hybrid Re-matching for Continual Learning with Parameter-Efficient Tuning
- HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
- Hyperbolic Dataset Distillation
- Hyperbolic Fine-Tuning for Large Language Models
- HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models
- Hypergraph-Enhanced Contrastive Learning for Multi-View Clustering with Hyper-Laplacian Regularization
- HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation
- HYPERION: Fine-Grained Hypersphere Alignment for Robust Federated Graph Learning
- HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
- HyperMixup: Hypergraph-Augmented with Higher-order Information Mixup
- Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities
- Hyperphantasia: A Benchmark for Evaluating the Mental Visualization Capabilities of Multimodal LLMs
- HyPINO: Multi-Physics Neural Operators via HyperPINNs and the Method of Manufactured Solutions
- HyPlaneHead: Rethinking Tri-plane-like Representations in Full-Head Image Synthesis
- HypoBootstrap: A Bootstrapping Framework for Inductive Reasoning
- HYPRL: Reinforcement Learning of Control Policies for Hyperproperties
- HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis
- I2-NeRF: Learning Neural Radiance Fields Under Physically-Grounded Media Interactions
- IA-GGAD: Zero-shot Generalist Graph Anomaly Detection via Invariant and Affinity Learning
- IBGS: Image-Based Gaussian Splatting
- ICLScan: Detecting Backdoors in Black-Box Large Language Models via Targeted In-context Illumination
- ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
- Identifiability of Deep Polynomial Neural Networks
- Identifying interactions across brain areas while accounting for individual-neuron dynamics with a Transformer-based variational autoencoder
- Identifying Macro Causal Effects in C-DMGs over DMGs
- Identifying multi-compartment Hodgkin-Huxley models with high-density extracellular voltage recordings
- IDOL: Meeting Diverse Distribution Shifts with Prior Physics for Tropical Cyclone Multi-Task Estimation
- IF-Guide: Influence Function-Guided Detoxification of LLMs
- iFinder: Structured Zero-Shot Vision-Based LLM Grounding for Dash-Cam Video Reasoning
- IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation
- IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation
- Image as a World: Generating Interactive World from Single Image via Panoramic Video Generation
- Image Editing As Programs with Diffusion Models
- ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression
- Imageomics: Discovering Biological Knowledge from Images Using AI
- ImageSentinel: Protecting Visual Datasets from Unauthorized Retrieval-Augmented Image Generation
- Image Stitching in Adverse Condition: A Bidirectional-Consistency Learning Framework and Benchmark
- Image Super-Resolution with Guarantees via Conformalized Generative Models
- Image Token Matters: Mitigating Hallucination in Discrete Tokenizer-based Large Vision-Language Models via Latent Editing
- Imagine360: Immersive 360 Video Generation from Perspective Anchor
- Imagine Beyond! Distributionally Robust Autoencoding for State Space Coverage in Online Reinforcement Learning
- Imagined Autocurricula
- Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies
- ImgEdit: A Unified Image Editing Dataset and Benchmark
- Imitation Beyond Expectation Using Pluralistic Stochastic Dominance
- Imitation Learning with Temporal Logic Constraints
- IMPACT: Irregular Multi-Patch Adversarial Composition Based on Two-Phase Optimization
- Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning
- Impact of Layer Norm on Memorization and Generalization in Transformers
- Impartial Selection with Predictions
- Implicit-ARAP: Efficient Handle-Guided Neural Field Deformation via Local Patch Meshing
- Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data
- Implicit Generative Property Enhancer
- Implicit Modeling for Transferability Estimation of Vision Foundation Models
- Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
- Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models
- Improved Algorithms for Fair Matroid Submodular Maximization
- Improved Algorithms for Overlapping and Robust Clustering of Edge-Colored Hypergraphs: An LP-Based Combinatorial Approach
- Improved Approximation Algorithms for Chromatic and Pseudometric-Weighted Correlation Clustering
- Improved Balanced Classification with Theoretically Grounded Loss Functions
- Improved Best-of-Both-Worlds Regret for Bandits with Delayed Feedback
- Improved Bounds for Swap Multicalibration and Swap Omniprediction
- Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits
- Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling
- Improved Regret and Contextual Linear Extension for Pandora's Box and Prophet Inequality
- Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization
- Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards
- Improved Representation Steering for Language Models
- Improved Robust Estimation for Erdős-Rényi Graphs: The Sparse Regime and Optimal Breakdown Point
- Improved Scaling Laws in Linear Regression via Data Reuse
- Improved Training Technique for Shortcut Models
- Improve Temporal Reasoning in Multimodal Large Language Models via Video Contrastive Decoding
- Improving Bilinear RNN with Closed-loop Control
- Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
- Improving Decision Trees through the Lens of Parameterized Local Search
- Improving Deep Learning for Accelerated MRI With Data Filtering
- Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Linear Extrapolation
- Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
- Improving Formal Reasoning of Transformer with State Stack
- Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning
- Improving Generative Behavior Cloning via Self-Guidance and Adaptive Chunking
- Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
- Improving Model-Based Reinforcement Learning by Converging to Flatter Minima
- Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
- Improving Monte Carlo Tree Search for Symbolic Regression
- Improving Perturbation-based Explanations by Understanding the Role of Uncertainty Calibration
- Improving planning and MBRL with temporally-extended actions
- Improving Progressive Generation with Decomposable Flow Matching
- Improving Regret Approximation for Unsupervised Dynamic Environment Generation
- Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning
- Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning
- Improving Target Sound Extraction via Disentangled Codec Representations with Privileged Knowledge Distillation
- Improving Task-Specific Multimodal Sentiment Analysis with General MLLMs via Prompting
- Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity
- Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference
- Improving the Straight-Through Estimator with Zeroth-Order Information
- Improving Time Series Forecasting via Instance-aware Post-hoc Revision
- Improving Video Generation with Human Feedback
- INC: An Indirect Neural Corrector for Auto-Regressive Hybrid PDE Solvers
- Incentive-Aware Dynamic Resource Allocation under Long-Term Cost Constraints
- Incentivizing Desirable Effort Profiles in Strategic Classification: The Role of Causality and Uncertainty
- Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning
- Incentivizing LLMs to Self-Verify Their Answers
- Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
- Incentivizing Time-Aware Fairness in Data Sharing
- Incentivizing Truthful Language Models via Peer Elicitation Games
- Incomplete Multi-view Clustering via Hierarchical Semantic Alignment and Cooperative Completion
- Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
- In-Context Compositional Learning via Sparse Coding Transformer
- In-Context Fully Decentralized Cooperative Multi-Agent Reinforcement Learning
- In-context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-separation
- In-Context Learning of Stochastic Differential Equations with Foundation Inference Models
- In-Context Learning Strategies Emerge Rationally
- Increasing the Utility of Synthetic Images through Chamfer Guidance
- Incremental Sequence Classification with Temporal Consistency
- IndEgo: A Dataset of Industrial Scenarios and Collaborative Work for Egocentric Assistants
- Individual Fairness In Strategic Classification
- Individually Fair Diversity Maximization
- Individual Regret in Cooperative Stochastic Multi-Armed Bandits
- Inductive Domain Transfer In Misspecified Simulation-Based Inference
- IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
- Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities
- IneqSearch: Hybrid Reasoning for Olympiad Inequality Proofs
- Inexact Column Generation for Bayesian Network Structure Learning via Difference-of-Submodular Optimization
- InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
- Inference of Whole Brain Electrophysiological Networks Through Multimodal Integration of Simultaneous Scalp and Intracranial EEG
- Inference-time Alignment in Continuous Space
- Inference-Time Hyper-Scaling with KV Cache Compression
- Inference-Time Personalized Alignment with a Few User Preference Queries
- Inference-Time Reward Hacking in Large Language Models
- Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
- Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
- Inference with correlated priors using sisters cells
- Inferring stochastic dynamics with growth from cross-sectional data
- InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models
- InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion
- InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
- Infinite Neural Operators: Gaussian processes on functions
- Infinite-Width Limit of a Single Attention Layer: Analysis via Tensor Programs
- InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
- Influence Functions for Edge Edits in Non-Convex Graph Neural Networks
- Influence Guided Context Selection for Effective Retrieval-Augmented Generation
- InFlux: A Benchmark for Self-Calibration of Dynamic Intrinsics of Video Cameras
- InfMasking: Unleashing Synergistic Information by Contrastive Multimodal Interactions
- InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts
- Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination
- Information-Driven Design of Imaging Systems
- Information Retrieval Induced Safety Degradation in AI Agents
- Information-Theoretic Discrete Diffusion
- Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables
- Information Theoretic Learning for Diffusion Models with Warm Start
- Information-Theoretic Reward Decomposition for Generalizable RLHF
- Informed Correctors for Discrete Diffusion Models
- Informed Initialization for Bayesian Optimization and Active Learning
- Infrequent Exploration in Linear Bandits
- Injecting Frame-Event Complementary Fusion into Diffusion for Optical Flow in Challenging Scenes
- Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
- Inpainting the Neural Picture: Inferring Unrecorded Brain Area Dynamics from Multi-Animal Datasets
- In Search of Adam’s Secret Sauce
- In Silico Mapping of Visual Categorical Selectivity Across the Whole Brain
- InstaInpaint: Instant 3D-Scene Inpainting with Masked Large Reconstruction Model
- InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention
- Instance-Dependent Regret Bounds for Nonstochastic Linear Partial Monitoring
- Instance-Level Composed Image Retrieval
- Instance-Optimality for Private KL Distribution Estimation
- Instant4D: 4D Gaussian Splatting in Minutes
- Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks
- INST-IT: Boosting Instance Understanding via Explicit Visual Prompt Instruction Tuning
- InstructFlow: Adaptive Symbolic Constraint-Guided Code Generation for Long-Horizon Planning
- InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection
- InstructRestore: Region-Customized Image Restoration with Human Instructions
- InstructSAM: A Training-free Framework for Instruction-Oriented Remote Sensing Object Recognition
- Integral Imprecise Probability Metrics
- Integrating Drug Substructures and Longitudinal Electronic Health Records for Personalized Drug Recommendation
- Integration Matters for Learning PDEs with Backwards SDEs
- Intend to Move: A Multimodal Dataset for Intention-Aware Human Motion Understanding
- Interaction-Centric Knowledge Infusion and Transfer for Open Vocabulary Scene Graph Generation
- Interactive and Hybrid Imitation Learning: Provably Beating Behavior Cloning
- Interactive Anomaly Detection for Articulated Objects via Motion Anticipation
- Interactive Cross-modal Learning for Text-3D Scene Retrieval
- Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval
- InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
- InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts
- Interpretable and Parameter Efficient Graph Neural Additive Models with Random Fourier Features
- Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data
- Interpretable Next-token Prediction via the Generalized Induction Head
- Interpreting Arithmetic Reasoning in Large Language Models using Game-Theoretic Interactions
- Interpreting Emergent Features in Deep Learning-based Side-channel Analysis
- Interpreting vision transformers via residual replacement model
- Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
- In the Eye of MLLM: Benchmarking Egocentric Video Intent Understanding with Gaze-Guided Prompting
- Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning
- Intrinsic Goals for Autonomous Agents: Model-Based Exploration in Virtual Zebrafish Predicts Ethological Behavior and Whole-Brain Dynamics
- IntrinsiX: High-Quality PBR Generation using Image Priors
- Introducing FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark
- Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
- Inverse Methods for Missing Data Imputation
- Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems
- Investigating and Mitigating Catastrophic Forgetting in Medical Knowledge Injection through Internal Knowledge Augmentation Learning
- Investigating Hallucinations of Time Series Foundation Models through Signal Subspace Analysis
- InvFusion: Bridging Supervised and Zero-shot Diffusion for Inverse Problems
- InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy
- IOSTOM: Offline Imitation Learning from Observations via State Transition Occupancy Matching
- IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
- IPFormer: Visual 3D Panoptic Scene Completion with Context-Adaptive Instance Proposals
- IPSI: Enhancing Structural Inference with Automatically Learned Structural Priors
- IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
- IR-OptSet: An Optimization-Sensitive Dataset for Advancing LLM-Based IR Optimizer
- Irrational Complex Rotations Empower Low-bit Optimizers
- IRRISIGHT: A Large-Scale Multimodal Dataset and Scalable Pipeline to Address Irrigation and Water Management in Agriculture
- Is Artificial Intelligence Generated Image Detection a Solved Problem?
- Is Grokking a Computational Glass Relaxation?
- Is Limited Participant Diversity Impeding EEG-based Machine Learning?
- Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models
- Isotropic Noise in Stochastic and Quantum Convex Optimization
- Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs
- Is the acquisition worth the cost? Surrogate losses for Consistent Two-stage Classifiers
- Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
- Is Your Diffusion Model Actually Denoising?
- ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
- Iterative Foundation Model Fine-Tuning on Multiple Rewards
- Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
- Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
- It’s Hard to Be Normal: The Impact of Noise on Structure-agnostic Estimation
- Jacobian-Based Interpretation of Nonlinear Neural Encoding Model
- JADE: Joint Alignment and Deep Embedding for Multi-Slice Spatial Transcriptomics
- JAFAR: Jack up Any Feature at Any Resolution
- JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models
- Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models
- Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence
- JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensemble Generation
- JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model
- Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
- JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
- Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
- JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
- Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
- Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry
- Joint Design of Protein Surface and Backbone Using a Diffusion Bridge Model
- Joint-Embedding vs Reconstruction: Provable Benefits of Latent Space Prediction for Self-Supervised Learning
- Joint Hierarchical Representation Learning of Samples and Features via Informed Tree-Wasserstein Distance
- Joint Modeling of fMRI and EEG Imaging Using Ordinary Differential Equation-Based Hypergraph Neural Networks
- Joint Relational Database Generation via Graph-Conditional Diffusion Models
- Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling
- Jury-and-Judge Chain-of-Thought for Uncovering Toxic Data in 3D Visual Grounding
- Just One Layer Norm Guarantees Stable Extrapolation
- KAIROS: Scalable Model-Agnostic Data Valuation
- KaRF: Weakly-Supervised Kolmogorov-Arnold Networks-based Radiance Fields for Local Color Editing
- KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment
- K-DeCore: Facilitating Knowledge Transfer in Continual Structured Knowledge Reasoning via Knowledge Decoupling
- KeeA*: Epistemic Exploratory A* Search via Knowledge Calibration
- Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
- Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning
- Keep Vision Group (BTF)
- Kek (BTF)
- Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning
- Kernel conditional tests from learning-theoretic bounds
- Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
- Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization
- Kernel Regression in Structured Non-IID Settings: Theory and Implications for Denoising Score Learning
- Kernel von Mises Formula of the Influence Function
- KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
- KGGen: Extracting Knowledge Graphs from Plain Text with Language Models
- Kinaema: a recurrent sequence model for memory and pose in motion
- KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference
- Kinetics: Rethinking Test-Time Scaling Law
- KLASS: KL-Guided Fast Inference in Masked Diffusion Models
- KL Penalty Control via Perturbation for Direct Preference Optimization
- KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity
- Knee-Deep in C-RASP: A Transformer Depth Hierarchy
- Knot So Simple: A Minimalistic Environment for Spatial Reasoning
- Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
- Knowledge Distillation Detection for Open-weights Models
- Knowledge Distillation of Uncertainty using Deep Latent Factor Model
- Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning
- Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
- Knowledge Starts with Practice: Knowledge-Aware Exercise Generative Recommendation with Adaptive Multi-Agent Cooperation
- KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge
- Know Thyself by Knowing Others: Learning Neuron Identity from Population Context
- Know What You Don't Know: Uncertainty Calibration of Process Reward Models
- KOALA++: Efficient Kalman-Based Optimization with Gradient-Covariance Products
- KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
- KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
- KScope: A Framework for Characterizing the Knowledge Status of Language Models
- KSP: Kolmogorov-Smirnov metric-based Post-Hoc Calibration for Survival Analysis
- KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning
- KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
- Kuramoto Orientation Diffusion Models
- KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
- KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows
- KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse
- KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
- L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
- L2DGCN: Learnable Enhancement and Label Selection Dynamic Graph Convolutional Networks for Mitigating Degree Bias
- L2RSI: Cross-view LiDAR-based Place Recognition for Large-scale Urban Scenes via Remote Sensing Imagery
- LabelAny3D: Label Any Object 3D in the Wild
- LABridge: Text–Image Latent Alignment Framework via Mean-Conditioned OU Process
- LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents
- LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities
- LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation
- LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
- Language-Bias-Resilient Visual Question Answering via Adaptive Multi-Margin Collaborative Debiasing
- Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
- Language Modeling by Language Models
- Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
- Language Models Can Predict Their Own Behavior
- Language Models can Self-Improve at State-Value Estimation for Better Search
- Language Models (Mostly) Know When to Stop Reading
- Language Ranker: A Lightweight Ranking framework for LLM Decoding
- LaRes: Evolutionary Reinforcement Learning with LLM-based Adaptive Reward Search
- Large Language Bayes
- Large Language Diffusion Models
- Large Language Models as End-to-end Combinatorial Optimization Solvers
- Large Language Models as Model Organisms for Human Associative Learning
- Large language models can learn and generalize steganographic chain-of-thought under process supervision
- Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need
- Large Language Models Miss the Multi-agent Mark
- Large Language Models Think Too Fast To Explore Effectively
- Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
- LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs
- LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
- Last Iterate Convergence in Monotone Mean Field Games
- Last-Iterate Convergence of Smooth Regret Matching$^+$ Variants in Learning Nash Equilibria
- Latency NMS Attacks: Is It Real Life or Is It Just Fantasy?
- Latent Chain-of-Thought for Visual Reasoning
- Latent Harmony: Synergistic Unified UHD Image Restoration via Latent Space Regularization and Controllable Refinement
- Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning
- Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
- Latent Principle Discovery for Language Model Self-Improvement
- Latent Refinement via Flow Matching for Training-free Linear Inverse Problem Solving
- Latent Retrieval Augmented Generation of Cross-Domain Protein Binders
- Latent Space Factorization in LoRA
- Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
- Lattice Boltzmann Model for Learning Real-World Pixel Dynamicity
- LaViDa: A Large Diffusion Model for Vision-Language Understanding
- LAW 2025: Bridging Language, Agent, and World Models for Reasoning and Planning
- LawShift: Benchmarking Legal Judgment Prediction Under Statute Shifts
- LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing
- Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation
- LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
- LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions
- LayerNavigator: Finding Promising Intervention Layers for Efficient Activation Steering in Large Language Models
- Layer-Wise Modality Decomposition for Interpretable Multimodal Sensor Fusion
- Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning
- LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion
- LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
- LC-Opt: Benchmarking Reinforcement Learning and Agentic AI for End-to-End Liquid Cooling Optimization in Data Centers
- Leader360V: A Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment
- LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching
- Learn2Mix: Training Neural Networks Using Adaptive Data Integration
- Learnable Burst-Encodable Time-of-Flight Imaging for High-Fidelity Long-Distance Depth Sensing
- Learnable Sampler Distillation for Discrete Diffusion Models
- Learn and Ensemble Bridge Adapters for Multi-domain Task Incremental Learning
- Learned Prefix Caching for Efficient LLM Inference
- Learning 3D Anisotropic Noise Distributions Improves Molecular Force Fields
- Learning 3D Persistent Embodied World Models
- Learning a Cross-Modal Schrödinger Bridge for Visual Domain Generalization
- Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data
- Learning and Planning Multi-Agent Tasks via an MoE-based World Model
- Learning (Approximately) Equivariant Networks via Constrained Optimization
- Learning-Augmented Algorithms for $k$-median via Online Learning
- Learning-Augmented Facility Location Mechanisms for the Envy Ratio Objective
- Learning-Augmented Online Bidding in Stochastic Settings
- Learning-Augmented Online Bipartite Fractional Matching
- Learning-Augmented Streaming Algorithms for Correlation Clustering
- Learning CAD Modeling Sequences via Projection and Part Awareness
- Learning Chern Numbers of Multiband Topological Insulators with Gauge Equivariant Neural Networks
- Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Imaging Inverse Problems
- Learning conformational ensembles of proteins based on backbone geometry
- Learning Counterfactual Outcomes Under Rank Preservation
- Learning Crossmodal Interaction Patterns via Attributed Bipartite Graphs for Single-Cell Omics
- Learning Dense Hand Contact Estimation from Imbalanced Data
- Learning Differential Pyramid Representation for Tone Mapping
- Learning Diffusion Models with Flexible Representation Guidance
- Learning Dynamics of RNNs in Closed-Loop Environments
- Learning Efficient Fuse-and-Refine for Feed-Forward 3D Gaussian Splatting
- Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
- Learning Expandable and Adaptable Representations for Continual Learning
- Learning from A Single Markovian Trajectory: Optimality and Variance Reduction
- Learning from Delayed Feedback in Games via Extra Prediction
- Learning from Demonstrations via Capability-Aware Goal Sampling
- Learning from Disjoint Views: A Contrastive Prototype Matching Network for Fully Incomplete Multi-View Clustering
- Learning from Interval Targets
- Learning from positive and unlabeled examples - Finite size sample bounds
- Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
- Learning from Time-Series for Health
- Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
- Learning Generalizable Shape Completion with SIM(3) Equivariance
- Learning Gradient Boosted Decision Trees with Algorithmic Recourse
- Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
- Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
- Learning Human-Object Interaction as Groups
- Learning in Compact Spaces with Approximately Normalized Transformer
- Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks
- Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis
- Learning Interactive World Model for Object-Centric Reinforcement Learning
- Learning Interestingness in Automated Mathematical Theory Formation
- Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization
- Learning Juntas under Markov Random Fields
- Learning Latent Variable Models via Jarzynski-adjusted Langevin Algorithm
- Learning Linear Attention in Polynomial Time
- Learning long range dependencies through time reversal symmetry breaking
- Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
- Learning Multi-Source and Robust Representations for Continual Learning
- Learning Neural Exposure Fields for View Synthesis
- Learning non-equilibrium diffusions with Schrödinger bridges: from exactly solvable to simulation-free
- Learning normalized image densities via dual score matching
- Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
- Learning Parameterized Skills from Demonstrations
- Learning “Partner-Aware” Collaborators in Multi-Party Collaboration
- Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift
- Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards
- Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
- Learning Provably Improves the Convergence of Gradient Descent
- Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
- Learning Reconfigurable Representations for Multimodal Federated Learning with Missing Data
- Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
- Learning Repetition-Invariant Representations for Polymer Informatics
- Learning Robust Spectral Dynamics for Temporal Domain Generalization
- Learning Robust Vision-Language Models from Natural Latent Spaces
- Learning Shared Representations from Unpaired Data
- Learning Simple Interpolants for Linear Integer Arithmetic
- Learning single index models via harmonic decomposition
- Learning Skill-Attributes for Transferable Assessment in Video
- Learning Source-Free Domain Adaptation for Visible-Infrared Person Re-Identification
- Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
- Learning Spatial-Aware Manipulation Ordering
- Learning Stochastic Multiscale Models
- Learning Task-Agnostic Representations through Multi-Teacher Distillation
- Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance
- Learning Theory for Kernel Bilevel Optimization
- Learning the Plasticity: Plasticity-Driven Learning Framework in Spiking Neural Networks
- Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
- Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks
- Learning to Better Search with Language Models via Guided Reinforced Self-Training
- Learning to Clean: Reinforcement Learning for Noisy Label Correction
- Learning to cluster neuronal function
- Learning to Condition: A Neural Heuristic for Scalable MPE Inference
- Learning to Control Free-Form Soft Swimmers
- Learning to Factorize Spatio-Temporal Foundation Models
- Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
- Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning
- Learning to Generalize: An Information Perspective on Neural Processes
- Learning to Generate Human-Human-Object Interactions from Textual Descriptions
- Learning to Insert for Constructive Neural Vehicle Routing Solver
- Learning to Instruct for Visual Instruction Tuning
- Learning to Integrate Diffusion ODEs by Averaging the Derivatives
- Learning to Learn with Contrastive Meta-Objective
- Learning to Plan Like the Human Brain via Visuospatial Perception and Semantic-Episodic Synergistic Decision-Making
- Learning to price with resource constraints: from full information to machine-learned prices
- Learning to Rank for In-Context Example Retrieval
- Learning to Reason under Off-Policy Guidance
- Learning to Route: Per-Sample Adaptive Routing for Multimodal Multitask Prediction
- Learning to Sense (L2S)
- Learning to Solve Complex Problems via Dataset Decomposition
- Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings
- Learning to Steer: Input-dependent Steering for Multimodal LLMs
- Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs
- Learning to Watermark: A Selective Watermarking Framework for Large Language Models via Multi-Objective Optimization
- Learning to Zoom with Anatomical Relations for Medical Structure Detection
- Learning Urban Climate Dynamics via Physics-Guided Urban Surface–Atmosphere Interactions
- Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
- Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
- Learning Without Augmenting: Unsupervised Time Series Representation Learning via Frame Projections
- Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions
- Learning with Statistical Equality Constraints
- Learning World Models for Interactive Video Generation
- Least squares variational inference
- Leaving No OOD Instance Behind: Instance-Level OOD Fine-Tuning for Anomaly Segmentation
- LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
- LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
- Length Generalization via Auxiliary Tasks
- Less but More: Linear Adaptive Graph Learning Empowering Spatiotemporal Forecasting
- Less Greedy Equivalence Search
- Less is More: an Attention-free Sequence Prediction Modeling for Offline Embodied Learning
- Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
- Less is More: Improving LLM Alignment via Preference Data Selection
- Less is More: Local Intrinsic Dimensions of Contextual Language Models
- Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning
- Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve
- Let a Neural Network be Your Invariant
- Let Brain Rhythm Shape Machine Intelligence for Connecting Dots on Graphs
- Let LRMs Break Free from Overthinking via Self-Braking Tuning
- Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones
- Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs
- Let the LLM Stick to Its Strengths: Learning to Route Economical LLM
- Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
- Leveraging Conditional Dependence for Efficient World Model Denoising
- Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation
- Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
- Leveraging robust optimization for LLM alignment under distribution shifts
- Leveraging semantic similarity for experimentation with AI-generated treatments
- LeVo: High-Quality Song Generation with Multi-Preference Alignment
- LexiCon: a Benchmark for Planning under Temporal Constraints in Natural Language
- LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
- Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
- LIFEBENCH: Evaluating Length Instruction Following in Large Language Models
- Lifelong Safety Alignment for Language Models
- Lifelong Test-Time Adaptation via Online Learning in Tracked Low-Dimensional Subspace
- LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders
- Light-Weight Diffusion Multiplier and Uncertainty Quantification for Fourier Neural Operators
- LILO: Learning to Reason at the Frontier of Learnability
- Limitations of Normalization in Attention
- Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis
- LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling
- Linear Attention for Efficient Bidirectional Sequence Modeling
- Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
- Linearization Explains Fine-Tuning in Large Language Models
- Linearly Constrained Diffusion Implicit Models
- Linear Mixture Distributionally Robust Markov Decision Processes
- Linear Transformers Implicitly Discover Unified Numerical Algorithms
- LinEAS: End-to-end Learning of Activation Steering with a Distributional Loss
- Linguini: A benchmark for language-agnostic linguistic reasoning
- LinPrim: Linear Primitives for Differentiable Volumetric Rendering
- LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
- Listening to the Brain: Multi-Band sEEG Auditory Reconstruction via Dynamic Spatio-Temporal Hypergraphs
- List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression
- Listwise Preference Diffusion Optimization for User Behavior Trajectories Prediction
- LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
- LithoSim: A Large, Holistic Lithography Simulation Benchmark for AI-Driven Semiconductor Manufacturing
- LittleBit: Ultra Low-Bit Quantization via Latent Factorization
- LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
- LiveStar: Live Streaming Assistant for Real-World Online Video Understanding
- LLM at Network Edge: A Layer-wise Efficient Federated Fine-tuning Approach
- LLM-DAMVC: A Large Language Model Assisted Dynamic Agent for Multi-View Clustering
- LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding
- LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
- LLM Generated Persona is a Promise with a Catch
- LLM Interpretability with Identifiable Temporal-Instantaneous Representation
- LLM Layers Immediately Correct Each Other
- LLM Meeting Decision Trees on Tabular Data
- LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation
- LLM-PySC2: Starcraft II learning environment for Large Language Models
- LLM Query Scheduling with Prefix Reuse and Latency Constraints
- LLM Safety Alignment is Divergence Estimation in Disguise
- LLMs Encode Harmfulness and Refusal Separately
- LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory
- LLM Unlearning via Neural Activation Redirection
- LMFusion: Adapting Pretrained Language Models for Multimodal Generation
- L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
- Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent
- Local-Global Associative Frames for Symmetry-Preserving Crystal Structure Modeling
- Local-Global Coupling Spiking Graph Transformer for Brain Disorders Diagnosis from Two Perspectives
- Localist Topographic Expert Routing: A Barrel Cortex-Inspired Modular Network for Sensorimotor Processing
- Locality in Image Diffusion Models Emerges from Data Statistics
- Localized Data Shapley: Accelerating Valuation for Nearest Neighbor Algorithms
- Localizing Knowledge in Diffusion Transformers
- Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables
- Locally Optimal Private Sampling: Beyond the Global Minimax
- LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
- Lock-LLM Workshop: Prevent Unauthorized Knowledge Use from Large Language Models - Deep Dive into Un-Distillable, Un-Finetunable, Un-Compressible, Un-Editable, and Un-Usable
- LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering
- Logical Expressiveness of Graph Neural Networks with Hierarchical Node Individualization
- Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding
- Logic.py: Bridging the Gap between LLMs and Constraint Solvers
- LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data
- LOMIA: Label-Only Membership Inference Attacks against Pre-trained Large Vision-Language Models
- LoMix: Learnable Weighted Multi-Scale Logits Mixing for Medical Image Segmentation
- Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
- LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions
- Long-Tailed Recognition via Information-Preservable Two-Stage Learning
- Long-tailed Recognition with Model Rebalancing
- Long-term Intracortical Neural activity and Kinematics (LINK): An intracortical neural dataset for chronic brain-machine interfaces, neuroscience, and machine learning
- LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
- LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges?
- Look-Ahead Reasoning on Learning Platforms
- Lookahead Routing for Large Language Models
- Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation
- Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection
- Looking Into the Water by Unsupervised Learning of the Surface Shape
- LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision
- LOPT: Learning Optimal Pigovian Tax in Sequential Social Dilemmas
- Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving
- LoRA-EnVar: Parameter-Efficient Hybrid Ensemble Variational Assimilation for Weather Forecasting
- LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
- LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
- LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers
- LoRA vs Full Fine-tuning: An Illusion of Equivalence
- LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders
- Lorentz Local Canonicalization: How to make any Network Lorentz-Equivariant
- LoRO: Real-Time on-Device Secure Inference for LLMs via TEE-Based Low Rank Obfuscation
- LoSplit: Loss-Guided Dynamic Split for Training-Time Defense Against Graph Backdoor Attacks
- Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
- Lost in Transmission: When and Why LLMs Fail to Reason Globally
- LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning
- Low-degree evidence for computational transition of recovery rate in stochastic block model
- Low Precision Streaming PCA
- Low Rank Gradients and Where to Find Them
- Low-Rank Graphon Learning for Networks
- Low-Rank Head Avatar Personalization with Registers
- LTD-Bench: Evaluating Large Language Models by Letting Them Draw
- LT-Soups: Bridging Head and Tail Classes via Subsampled Model Soups
- Lua-LLM: Learning Unstructured-Sparsity Allocation for Large Language Models
- Luminance-Aware Statistical Quantization: Unsupervised Hierarchical Learning for Illumination Enhancement
- LUNA: Efficient and Topology-Agnostic Foundation Model for EEG Signal Analysis
- LuxDiT: Lighting Estimation with Video Diffusion Transformer
- LVLM-Driven Attribute-Aware Modeling for Visible-Infrared Person Re-Identification
- Lyapunov-Stable Adaptive Control for Multimodal Concept Drift
- Machine Learning and the Physical Sciences
- Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research
- Machine Unlearning in 3D Generation: A Perspective-Coherent Acceleration Framework
- Machine Unlearning under Overparameterization
- Machine Unlearning via Task Simplex Arithmetic
- macOSWorld: A Multilingual Interactive Benchmark for GUI Agents
- MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures
- MAESTRO: Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series
- MagCache: Fast Video Generation with Magnitude-Aware Cache
- Magical: Medical Lay Language Generation via Semantic Invariance and Layperson-tailored Adaptation
- MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
- MaintainCoder: Maintainable Code Generation Under Dynamic Requirements
- Majority of the Bests: Improving Best-of-N via Bootstrapping
- Make Information Diffusion Explainable: LLM-based Causal Framework for Diffusion Prediction
- Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective
- MALinZero: Efficient Low-Dimensional Search for Mastering Complex Multi-Agent Planning
- Mamba Goes HoME: Hierarchical Soft Mixture-of-Experts for 3D Medical Image Segmentation
- Mamba Modulation: On the Length Generalization of Mamba Models
- Mamba Only Glances Once (MOGO): A Lightweight Framework for Efficient Video Action Detection
- MaNGO — Adaptable Graph Network Simulators via Meta-Learning
- MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
- Manipulating 3D Molecules in a Fixed-Dimensional E(3)-Equivariant Latent Space
- Manipulating Feature Visualizations with Gradient Slingshots
- Many LLMs Are More Utilitarian Than One
- Many Minds, One Goal: Time Series Forecasting via Sub-task Specialization and Inter-agent Cooperation
- MAP Estimation with Denoisers: Convergence Rates and Guarantees
- MAPLE: Multi-scale Attribute-enhanced Prompt Learning for Few-shot Whole Slide Image Classification
- Marginal-Nonuniform PAC Learnability
- Markov Persuasion Processes: Learning to Persuade From Scratch
- MARS: A Malignity-Aware Backdoor Defense in Federated Learning
- Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks
- MARS-VFL: A Unified Benchmark for Vertical Federated Learning with Realistic Evaluation
- Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions
- Martingale Posterior Neural Networks for Fast Sequential Decision Making
- Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
- Masked Diffusion Models as Energy Minimization
- Masked Gated Linear Unit
- Mask Image Watermarking
- Massive Sound Embedding Benchmark (MSEB)
- MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching
- MAT-Agent: Adaptive Multi-Agent Training Optimization
- Matching Markets Meet LLMs: Algorithmic Reasoning with Ranked Preferences
- Matchings Under Biased and Correlated Evaluations
- MATCH: Multi-faceted Adaptive Topo-Consistency for Semi-Supervised Histopathology Segmentation
- MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference
- MATH-AI: The 5th Workshop on Mathematical Reasoning and AI
- MathArena: Evaluating LLMs on Uncontaminated Math Competitions
- Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
- Max Entropy Moment Kalman Filter for Polynomial Systems with Arbitrary Noise
- Maximizing the Value of Predictions in Control: Accuracy Is Not Enough
- MaxSup: Overcoming Representation Collapse in Label Smoothing
- MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control
- MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification
- Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
- Mean Flows for One-step Generative Modeling
- Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning
- Measure-Theoretic Anti-Causal Representation Learning
- Measuring AI Ability to Complete Long Software Tasks
- Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
- Measuring and Guiding Monosemanticity
- Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training
- Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab
- Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models
- Measuring what Matters: Construct Validity in Large Language Model Benchmarks
- MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization
- Mechanism Design for LLM Fine-tuning with Multiple Reward Models
- Mechanism Design via the Interim Relaxation
- Mechanistic Interpretability of RNNs emulating Hidden Markov Models
- MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
- MedChain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence
- Median Selection with Noisy and Structural Information
- MedicalNarratives: Connecting Medical Vision and Language with Localized Narratives
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants
- MedSG-Bench: A Benchmark for Medical Image Sequences Grounding
- MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
- MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
- MEIcoder: Decoding Visual Stimuli from Neural Activity by Leveraging Most Exciting Inputs
- Mellow: a small audio language model for reasoning
- MemEIC: A Step Toward Continual and Compositional Knowledge Editing
- MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs
- Memorization in Graph Neural Networks
- Memory-Augmented Potential Field Theory: A Framework for Adaptive Control in Non-Convex Domains
- Memory by accident: a theory of learning as a byproduct of network stabilization
- Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
- Memory-Efficient Training with In-Place FFT Implementation
- Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
- Memory-Enhanced Neural Solvers for Routing Problems
- Memory Injection Attacks on LLM Agents via Query-Only Interaction
- Memory-Integrated Reconfigurable Adapters: A Unified Framework for Settings with Multiple Tasks
- Memory Mosaics at scale
- Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning
- MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
- MenaML (BTF)
- MergeBench: A Benchmark for Merging Domain-Specialized LLMs
- Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
- MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query
- Merlin L48 Spectrogram Dataset
- MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
- MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
- Mesh Interpolation Graph Network for Dynamic and Spatially Irregular Global Weather Forecasting
- Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
- MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees
- MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
- Meta CLIP 2: A Worldwide Scaling Recipe
- Meta-D2AG: Causal Graph Learning with Interventional Dynamic Data
- MetaDefense: Defending Fine-tuning based Jailbreak Attack Before and During Generation
- MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation
- MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting
- Meta Guidance: Incorporating Inductive Biases into Deep Time Series Imputers
- MetaKoopman: Bayesian Meta-Learning of Koopman Operators for Modeling Structured Dynamics under Distribution Shifts
- Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex
- Meta-learning how to Share Credit among Macro-Actions
- Meta-Learning Objectives for Preference Optimization
- MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
- MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning
- metaTextGrad: Automatically optimizing language model optimizers
- Meta-World+: An Improved, Standardized, RL Benchmark
- Metis: A Foundation Speech Generation Model with Masked Generative Pre-training
- Metric Automata Theory: A Unifying Theory of RNNs
- Metritocracy: Representative Metrics for Lite Benchmarks
- Metropolis Adjusted Microcanonical Hamiltonian Monte Carlo
- Metropolis-Hastings Sampling for 3D Gaussian Reconstruction
- MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework
- MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
- MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization
- MIBP-Cert: Certified Training against Data Perturbations with Mixed-Integer Bilinear Programs
- MiCADangelo: Fine-Grained Reconstruction of Constrained CAD Models from 3D Scans
- Michigan Technological University (BTF)
- MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
- MIDAS: Misalignment-based Data Augmentation Strategy for Imbalanced Multimodal Learning
- MigGPT: Harnessing Large Language Models for Automated Migration of Out-of-Tree Linux Kernel Patches Across Versions
- MIHC: Multi-View Interpretable Hypergraph Neural Networks with Information Bottleneck for Chip Congestion Prediction
- Military AI Needs Technically-Informed Regulation to Safeguard AI Research and its Applications
- MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
- Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
- MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Cultural Learning
- MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?
- MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
- MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction
- MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
- Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning
- Mind the Gap: Removing the Discretization Gap in Differentiable Logic Gate Networks
- Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning
- Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation
- Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules
- MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents
- MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
- miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward
- Minimal Semantic Sufficiency Meets Unsupervised Domain Generalization
- Minimax Adaptive Online Nonparametric Regression over Besov spaces
- Minimax-Optimal Univariate Function Selection in Sparse Additive Models: Rates, Adaptation, and the Estimation-Selection Gap
- MiniMax-Remover: Taming Bad Noise Helps Video Object Removal
- Minimizing False-Positive Attributions in Explanations of Non-Linear Models
- Minimum Width for Deep, Narrow MLP: A Diffeomorphism Approach
- Minorities in Medicine 501(c)(3) (BTF)
- Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions
- MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
- MiNT: Multi-Network Transfer Benchmark for Temporal Graph Learning
- MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents
- MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
- MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM
- MIRA: Medical Time Series Foundation Model for Real-World Health Data
- MIR-Bench: Can Your LLM Recognize Complicated Patterns via Many-Shot In-Context Reasoning?
- MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
- MisoDICE: Multi-Agent Imitation from Mixed-Quality Demonstrations
- Missing Data Imputation by Reducing Mutual Information with Rectified Flows
- Miss-ReID: Delivering Robust Multi-Modality Object Re-Identification Despite Missing Modalities
- MIT-CALC (BTF)
- Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning
- Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering
- Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization
- Mitigating Instability in High Residual Adaptive Sampling for PINNs via Langevin Dynamics
- Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
- Mitigating Occlusions in Virtual Try-On via A Simple-Yet-Effective Mask-Free Framework
- Mitigating Overthinking in Large Reasoning Models via Manifold Steering
- Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
- Mitigating Semantic Collapse in Partially Relevant Video Retrieval
- Mitigating Sexual Content Generation via Embedding Distortion in Text-conditioned Diffusion Models
- Mitigating Spurious Features in Contrastive Learning with Spectral Regularization
- Mitigating the Privacy–Utility Trade-off in Decentralized Federated Learning via f-Differential Privacy
- Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models
- MI-TRQR: Mutual Information-Based Temporal Redundancy Quantification and Reduction for Energy-Efficient Spiking Neural Networks
- MIX: A Multi-view Time-Frequency Interactive Explanation Framework for Time Series Classification
- MixAT: Combining Continuous and Discrete Adversarial Training for LLMs
- Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
- Mixed-Sample SGD: an End-to-end Analysis of Supervised Transfer Learning
- Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
- MixPrompt: Efficient Mixed Prompting for Multimodal Semantic Segmentation
- MixSignGraph: A Sign Sequence is Worth Mixed Graphs of Nodes
- Mixture-of-Experts Meets In-Context Reinforcement Learning
- Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training
- Mixture of Inputs: Text Generation Beyond Discrete Token Sampling
- Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
- Mixture of Scope Experts at Test: Generalizing Deeper Graph Neural Networks with Shallow Variants
- Mixtures of Subspaces for Bandwidth Efficient Context Parallel Training
- MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
- MJ-Video: Benchmarking and Rewarding Video Generation with Fine-Grained Video Preference
- ML4CFD Competition: Results and Retrospective Analysis
- ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs
- MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
- MLEP: Multi-granularity Local Entropy Patterns for Generalized AI-generated Image Detection
- MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement
- ML for Systems
- MLIP Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials via an Open, Accessible Benchmark Platform
- MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
- MLLM-ISU: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models based Intrusion Scene Understanding
- MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
- MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
- ML x OR: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making
- MLZero: A Multi-Agent System for End-to-end Machine Learning Automation
- MMaDA: Multimodal Large Diffusion Language Models
- MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem
- MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
- MMCSBench: A Fine-Grained Benchmark for Large Vision-Language Models in Camouflage Scenes
- MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
- MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
- MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
- MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning
- MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
- MMOT: The First Challenging Benchmark for Drone-based Multispectral Multi-Object Tracking
- MMPB: It’s Time for Multi-Modal Personalization
- MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
- MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
- mmWalk: Towards Multi-modal Multi-view Walking Assistance
- MoBA: Mixture of Block Attention for Long-Context LLMs
- MobileODE: An Extra Lightweight Network
- MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation
- MOBO-OSD: Batch Multi-Objective Bayesian Optimization via Orthogonal Search Directions
- MoCha: Towards Movie-Grade Talking Character Generation
- Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning
- Model-Based Policy Adaptation for Closed-Loop End-to-end Autonomous Driving
- Model–Behavior Alignment under Flexible Evaluation: When the Best-Fitting Model Isn’t the Right One
- Model Editing for Vision Transformers
- Model-Guided Dual-Role Alignment for High-Fidelity Open-Domain Video-to-Audio Generation
- Model-Informed Flows for Bayesian Inference
- Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge
- Modeling Dynamic Neural Activity by combining Naturalistic Video Stimuli and Stimulus-independent Latent Factors
- Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow
- Modeling Neural Activity with Conditionally Linear Dynamical Systems
- Modeling the Economic Impacts of AI Openness Regulation
- Model Inversion with Layer-Specific Modeling and Alignment for Data-Free Continual Learning
- Modelling the control of offline processing with reinforcement learning
- Model Merging in Pre-training of Large Language Models
- Model Merging: Theory, Practice and Applications
- Model Provenance Testing for Large Language Models
- Model Reconciliation via Cost-Optimal Explanations in Probabilistic Logic Programming
- Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
- MODEL SHAPLEY: Find Your Ideal Parameter Player via One Gradient Backpropagation
- MODEM: A Morton-Order Degradation Estimation Mechanism for Adverse Weather Image Recovery
- ModHiFi: Identifying High Fidelity predictive components for Model Modification
- ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models
- MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
- MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes
- MoEMeta: Mixture-of-Experts Meta Learning for Few-Shot Relational Learning
- MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE
- MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks
- MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling
- MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
- MokA: Multimodal Low-Rank Adaptation for MLLMs
- MoleBridge: Synthetic Space Projecting with Discrete Markov Bridges
- Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model
- MolVision: Molecular Property Prediction with Vision Language Models
- MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition
- Moment- and Power-Spectrum-Based Gaussianity Regularization for Text-to-Image Models
- MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval
- Momentum Multi-Marginal Schrödinger Bridge Matching
- Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
- MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention
- MoniTor: Exploiting Large Language Models with Instruction for Online Video Anomaly Detection
- Monitoring Risks in Test-Time Adaptation
- MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing
- Monoculture or Multiplicity: Which Is It?
- MonoLift: Learning 3D Manipulation Policies from Monocular RGB via Distillation
- Monotone and Separable Set Functions: Characterizations and Neural Models
- MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis
- MoonCast: High-Quality Zero-Shot Podcast Generation
- MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
- MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
- MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition
- MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
- More effort is needed to protect pedestrian privacy in the era of AI
- More of the Same: Persistent Representational Harms Under Increased Representation
- More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
- More Than Just Functional: LLM-as-a-Critique for Efficient Code Generation
- More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
- MoRIC: A Modular Region-based Implicit Codec for Image Compression
- MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
- 🎧MOSPA: Human Motion Generation Driven by Spatial Audio
- Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
- MotionBind: Multi-Modal Human Motion Alignment for Retrieval, Recognition, and Generation
- Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction
- MOTION: Multi-Sculpt Evolutionary Coarsening for Federated Continual Graph Learning
- MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
- Mozart: Modularized and Efficient MoE Training on 3.5D Wafer-Scale Chiplet Architectures
- MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
- MPMAvatar: Learning 3D Gaussian Avatars with Accurate and Robust Physics-Based Dynamics
- MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation
- MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
- MR. Video: MapReduce as an Effective Principle for Long Video Understanding
- MS-BART: Unified Modeling of Mass Spectra and Molecules for Structure Elucidation
- MS-Bench: Evaluating LMMs in Ancient Manuscript Study through a Dunhuang Case Study
- msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML
- MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild
- MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling
- MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology
- MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver
- MTRec: Learning to Align with User Preferences via Mental Reward Models
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
- Multi-Agent Collaboration via Evolving Orchestration
- Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
- Multi-Agent Imitation by Learning and Sampling from Factorized Soft Q-Function
- Multi-agent KTO: Enhancing Strategic Interactions of Large Language Model in Language Game
- Multi-Agent Learning under Uncertainty: Recurrence vs. Concentration
- Multi-agent Markov Entanglement
- Multi-Agent Reinforcement Learning with Communication-Constrained Priors
- Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification
- Multi-Class Support Vector Machine with Differential Privacy
- Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing
- Multidimensional Bayesian Utility Maximization: Tight Approximations to Welfare
- Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability
- Multi-Expert Distributionally Robust Optimization for Out-of-Distribution Generalization
- Multi-head Temporal Latent Attention
- Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
- MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans
- Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration
- Multilevel neural simulation-based inference
- Multimodal 3D Genome Pre-training
- Multimodal Algorithmic Reasoning Workshop
- Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms
- Multimodal Causal Reasoning for UAV Object Detection
- Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
- Multimodal Disease Progression Modeling via Spatiotemporal Disentanglement and Multiscale Alignment
- Multi-Modal Interactive Agent Layer for Few-Shot Universal Cross-Domain Retrieval and Beyond
- Multimodal LiDAR-Camera Novel View Synthesis with Unified Pose-free Neural Fields
- Multimodal Negative Learning
- Multimodal Tabular Reasoning with Privileged Structured Information
- Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
- MultiNet: Adaptive Multi-Viewed Subgraph Convolutional Networks for Graph Classification
- Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs
- Multi-Objective One-Shot Pruning for Large Language Models
- Multi-Objective Reinforcement Learning with Max-Min Criterion: A Game-Theoretic Approach
- Multi-order Orchestrated Curriculum Distillation for Model-Heterogeneous Federated Graph Learning
- Multiplayer Federated Learning: Reaching Equilibrium with Less Communication
- Multiplication-Free Parallelizable Spiking Neurons with Efficient Spatio-Temporal Dynamics
- Multipole Attention for Efficient Long Context Reasoning
- Multiresolution Analysis and Statistical Thresholding on Dynamic Networks
- MultiScale Contextual Bandits for Long Term Objectives
- Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
- Multiscale guidance of protein structure prediction with heterogeneous cryo-EM data
- Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
- Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
- Multitask Learning with Stochastic Interpolants
- Multi-Task Vehicle Routing Solver via Mixture of Specialized Experts under State-Decomposable MDP
- Multi-Token Prediction Needs Registers
- Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework
- Multivariate Latent Recalibration for Conditional Normalizing Flows
- Multivariate Time Series Anomaly Detection with Idempotent Reconstruction
- Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
- Multi-View Oriented GPLVM: Expressiveness and Efficiency
- MUniverse: A Simulation and Benchmarking Suite for Motor Unit Decomposition
- MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
- MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks
- MuSLR: Multimodal Symbolic Logical Reasoning
- MUSTAFAR: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference
- MutualVPR: A Mutual Learning Framework for Resolving Supervision Inconsistencies via Adaptive Clustering
- MUVR: A Multi-Modal Untrimmed Video Retrieval Benchmark with Multi-Level Visual Correspondence
- MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation
- MVSMamba: Multi-View Stereo with State Space Model
- MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
- MyoChallenge 2024: A New Benchmark for Physiological Dexterity and Agility in Bionic Humans
- Mysteries of the Deep: Role of Intermediate Representations in Out of Distribution Detection
- Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards
- NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data
- Native-Resolution Image Synthesis
- Native Segmentation Vision Transformers
- Natural Gradient VI: Guarantees for Non-Conjugate Models
- NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
- NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
- NavBench: Probing Multimodal Large Language Models for Embodied Navigation
- Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification
- NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints
- NAVIX: Scaling MiniGrid Environments with JAX
- Near-Exponential Savings for Population Mean Estimation with Active Learning
- Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
- Nearly-Linear Time and Massively Parallel Algorithms for $k$-anonymity
- Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor
- Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models
- Near-Optimal Quantum Algorithms for Computing (Coarse) Correlated Equilibria of General-Sum Games
- Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
- Near-Optimal Sample Complexity for Online Constrained MDPs
- NEED: Cross-Subject and Cross-Task Generalization for Video and Image Reconstruction from EEG Signals
- NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables
- Negative Feedback Really Matters: Signed Dual-Channel Graph Contrastive Learning Framework for Recommendation
- NegoCollab: A Common Representation Negotiation Approach for Heterogeneous Collaborative Perception
- Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
- Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
- Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
- NEP: Autoregressive Image Editing via Next Editing Token Prediction
- Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection
- NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods
- NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs
- Nested Learning: The Illusion of Deep Learning Architectures
- NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
- Network two-sample test for block models
- Neural Atlas Graphs for Dynamic Scene Decomposition and Editing
- Neural Attention Search
- Neural B-frame Video Compression with Bi-directional Reference Harmonization
- Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model
- Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
- Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data
- Neural Combinatorial Optimization for Time Dependent Traveling Salesman Problem
- Neural Correlates of Serial Dependence: Synaptic Short-term Plasticity Orchestrates Repulsion and Attraction
- Neural-Driven Image Editing
- Neural Emulator Superiority: When Machine Learning for PDEs Surpasses its Training Data
- Neural Entropy
- Neural Evolution Strategy for Black-box Pareto Set Learning
- Neural Fractional Attention Differential Equations
- Neural Green’s Functions
- Neural Hamiltonian Diffusions for Modeling Structured Geometric Dynamics
- Neural MJD: Neural Non-Stationary Merton Jump Diffusion for Time Series Prediction
- Neural Mutual Information Estimation with Vector Copulas
- Neural Networks for Learnable and Scalable Influence Estimation of Instruction Fine-Tuning Data
- Neural Networks Generalize on Low Complexity Data
- NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models
- Neural Rule Lists: Learning Discretizations, Rules, and Order in One Go
- Neural Stochastic Flows: Solver-Free Modelling and Inference for SDE Solutions
- NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification
- Neural Tangent Knowledge Distillation for Optical Convolutional Networks
- Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
- NeurIPS 2025 Workshop on Embodied and Safe-Assured Robotic Systems
- NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models
- NeurIPS 2025 Workshop on Space in Vision, Language, and Embodied AI
- NeurIPS2025 Workshop Research Development AI Mexico
- NeurIPS should lead scientific consensus on AI policy
- NeurIPT: Foundation Model for Neural Interfaces
- NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
- NeuroH-TGL: Neuro-Heterogeneity Guided Temporal Graph Learning Strategy for Brain Disease Diagnosis
- Neurons as Detectors of Coherent Sets in Sensory Dynamics
- NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
- NeuroRenderedFake: A Challenging Benchmark to Detect Fake Images Generated by Advanced Neural Rendering Methods
- Neuro-Spectral Architectures for Causal Physics-Informed Networks
- Neurosymbolic Diffusion Models
- NeuSymEA: Neuro-symbolic Entity Alignment via Variational Inference
- New Frontiers of Hyperparameter Optimization: Recent advances and open challenges in theory and practice
- New Parallel and Streaming Algorithms for Directed Densest Subgraph
- New Perspectives in Graph Machine Learning
- New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results
- Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
- NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
- NFL-BA: Near-Field Light Bundle Adjustment for SLAM in Dynamic Lighting
- NOBLE - Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models
- NoBOOM: Chemical Process Datasets for Industrial Anomaly Detection
- No Experts, No Problem: Avoidance Learning from Bad Demonstrations
- Noise Consistency Training: A Native Approach for One-step Generator in Learning Additional Controls
- Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
- Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
- Noise Matters: Optimizing Matching Noise for Diffusion Classifiers
- Noise-Robustness Through Noise: A Framework combining Asymmetric LoRA with Poisoning MoE
- NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation
- Noisy Multi-Label Learning through Co-Occurrence-Aware Diffusion
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
- No Loss, No Gain: Gated Refinement and Adaptive Compression for Prompt Optimization
- Non-Adaptive Adversarial Face Generation
- Non-Asymptotic Analysis Of Data Augmentation For Precision Matrix Estimation
- Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes
- Non-Clairvoyant Scheduling with Progress Bars
- Non-convex entropic mean-field optimization via Best Response flow
- Non-Convex Tensor Recovery from Tube-Wise Sensing
- Non-equilibrium Annealed Adjoint Sampler
- Non-Euclidean Foundation Models and Geometric Learning: Advancing AI Beyond Euclidean Frameworks
- Non-exchangeable Conformal Prediction with Optimal Transport: Tackling Distribution Shift with Unlabeled Data
- Nonlinear Laplacians: Tunable principal component analysis under directional prior information
- Nonlinearly Preconditioned Gradient Methods: Momentum and Stochastic Analysis
- Non-Line-of-Sight 3D Reconstruction with Radar
- Non-Markovian Discrete Diffusion with Causal Language Models
- Non-monotone Submodular Optimization: $p$-Matchoid Constraints and Fully Dynamic Setting
- Nonparametric Quantile Regression with ReLU-Activated Recurrent Neural Networks
- Non-rectangular Robust MDPs with Normed Uncertainty Sets
- Non-Singularity of the Gradient Descent Map for Neural Networks with Piecewise Analytic Activations
- Non-stationary Bandit Convex Optimization: A Comprehensive Study
- Non-stationary Equivariant Graph Neural Networks for Physical Dynamics Simulation
- Non-Stationary Lipschitz Bandits
- Non-Stationary Structural Causal Bandits
- Non-Uniform Multiclass Learning with Bandit Feedback
- No Object Is an Island: Enhancing 3D Semantic Segmentation Generalization with Diffusion Models
- NopeRoomGS: Indoor 3D Gaussian Splatting Optimization without Camera Pose Input
- NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses
- NORA: The First Workshop on Knowledge Graphs & Agentic Systems Interplay
- No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!
- No-Regret Online Autobidding Algorithms in First-price Auctions
- No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
- Normal-Abnormal Guided Generalist Anomaly Detection
- Normalization in Attention Dynamics
- Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
- Normalize Filters! Classical Wisdom for Deep Vision
- Normalizing Flows are Capable Models for Continuous Control
- NormFit: A Lightweight Solution for Few-Shot Federated Learning with Non-IID Data
- Not All Data are Good Labels: On the Self-supervised Labeling for Time Series Forecasting
- NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI
- Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning
- Novel Exploration via Orthogonality
- Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion
- NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems
- NS-Gym: A Comprehensive and Open-Source Simulation Framework for Non-Stationary Markov Decision Processes
- NSNQuant: A Double Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache
- NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
- NUTS: Eddy-Robust Reconstruction of Surface Ocean Nutrients via Two-Scale Modeling
- nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
- Nyström-Accelerated Primal LS-SVMs: Breaking the $O(an^3)$ Complexity Bottleneck for Scalable ODEs Learning
- OASIS: One-Shot Federated Graph Learning via Wasserstein Assisted Knowledge Integration
- ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
- Object-centric 3D Motion Field for Robot Learning from Human Videos
- Object-centric binding in Contrastive Language-Image Pretraining
- Object-Centric Concept-Bottlenecks
- Object-Centric Representation Learning for Enhanced 3D Semantic Scene Graph Prediction
- Object Concepts Emerge from Motion
- Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
- Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations
- Obliviator Reveals the Cost of Nonlinear Guardedness in Concept Erasure
- OceanBench: A Benchmark for Data-Driven Global Ocean Forecasting systems
- OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction
- OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
- OCTDiff: Bridged Diffusion Model for Portable OCT Super-Resolution and Enhancement
- OctoNet: A Large-Scale Multi-Modal Dataset for Human Activity Understanding Grounded in Motion-Captured 3D Pose Labels
- ODG: Occupancy Prediction Using Dual Gaussians
- Offline Actor-Critic for Average Reward MDPs
- Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
- Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies
- Offline imitation learning in $Q^\pi$-realizable MDPs without expert realizability
- Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
- Off-policy Reinforcement Learning with Model-based Exploration Augmentation
- Olabisi Onabanjo University (BTF)
- OligoGym: Curated Datasets and Benchmarks for Oligonucleotide Drug Discovery
- OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain
- OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization
- OMiSO: Adaptive optimization of state-dependent brain stimulation to shape neural population states
- OmniBench: Towards The Future of Universal Omni-Language Models
- OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales
- OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
- Omnidirectional 3D Scene Reconstruction from Single Image
- Omni-DNA: A Genomic Model Supporting Sequence Understanding, Long-context, and Textual Annotation
- OmniDraft: A cross-vocabulary, online adaptive drafter for on-device speculative decoding
- OmniFC: Rethinking Federated Clustering via Lossless and Secure Distance Reconstruction
- OmniGaze: Reward-inspired Generalizable Gaze Estimation in the Wild
- OmniGen-AR: AutoRegressive Any-to-Image Generation
- Omni-Mol: Multitask Molecular Model for Any-to-any Modalities
- Omnipresent Yet Overlooked: Heat Kernels in Combinatorial Bayesian Optimization
- Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
- OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions
- OmniSegmentor: A Flexible Multi-Modal Learning Framework for Semantic Segmentation
- OmniSVG: A Unified Scalable Vector Graphics Generation Model
- OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers
- OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
- OmniTry: Virtual Try-On Anything without Masks
- OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions
- OmniZoom: A Universal Plug-and-Play Paradigm for Cross-Device Smooth Zoom Interpolation
- On Agnostic PAC Learning in the Small Error Regime
- Once Upon an Input: Reasoning via Per-Instance Program Synthesis
- On Efficiency-Effectiveness Trade-off of Diffusion-based Recommenders
- One Filters All: A Generalist Filter For State Estimation
- One for All: Universal Topological Primitive Transfer for Graph Structure Learning
- One Head to Rule Them All: Amplifying LVLM Safety through a Single Critical Attention Head
- OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
- On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
- One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
- One Sample is Enough to Make Conformal Prediction Robust
- One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
- One-Step Diffusion-Based Image Compression with Semantic Distillation
- One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution
- One-Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models
- One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling
- One Stone with Two Birds: A Null-Text-Null Frequency-Aware Diffusion Models for Text-Guided Image Inpainting
- One Subgoal at a Time: Zero-Shot Generalization to Arbitrary Linear Temporal Logic Requirements in Multi-Task Reinforcement Learning
- One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
- One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding
- On Evaluating LLM Alignment by Evaluating LLMs as Judges
- On Evaluating Policies for Robust POMDPs
- On Extending Direct Preference Optimization to Accommodate Ties
- On Fairness of Unified Multimodal Large Language Model for Image Generation
- On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning
- On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation
- On Group Sufficiency Under Label Bias
- On Hierarchies of Fairness Notions in Cake Cutting: From Proportionality to Super Envy-Freeness
- On Inductive Biases That Enable Generalization in Diffusion Transformers
- On Learning Verifiers and Implications to Chain-of-Thought Reasoning
- On Linear Mode Connectivity of Mixture-of-Experts Architectures
- Online Bilateral Trade With Minimal Feedback: Don’t Waste Seller’s Time
- Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
- Online Experimental Design With Estimation-Regret Trade-off Under Network Interference
- Online Feedback Efficient Active Target Discovery in Partially Observable Environments
- Online Functional Tensor Decomposition via Continual Learning for Streaming Data Completion
- Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound
- Online Learning in the Repeated Mediated Newsvendor Problem
- Online Learning of Neural Networks
- Online Learning of Pure States is as Hard as Mixed States
- Online Locally Differentially Private Conformal Prediction via Binary Inquiries
- Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making
- Online Multi-Class Selection with Group Fairness Guarantee
- Online Optimization for Offline Safe Reinforcement Learning
- Online Portfolio Selection with ML Predictions
- Online Prediction with Limited Selectivity
- Online robust locally differentially private learning for nonparametric regression
- Online Segment Any 3D Thing as Instance Tracking
- OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects
- Online Statistical Inference in Decision Making with Matrix Context
- Online Strategic Classification With Noise and Partial Feedback
- Online Time Series Forecasting with Theoretical Guarantees
- Online Two-Stage Submodular Maximization
- On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration Model
- On Logic-based Self-Explainable Graph Neural Networks
- On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts
- On Optimal Steering to Achieve Exact Fairness
- On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding
- On Reasoning Strength Planning in Large Reasoning Models
- On scalable and efficient training of diffusion samplers
- On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
- On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study
- On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity
- On the Coexistence and Ensembling of Watermarks
- On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization
- On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
- On the Convergence of Single-Timescale Actor-Critic
- On the Convergence of Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent
- On the creation of narrow AI: hierarchy and nonlocality of neural network skills
- On the Edge of Memorization in Diffusion Models
- On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization
- On the Emergence of Linear Analogies in Word Embeddings
- On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
- On the Entropy Calibration of Language Models
- On the Existence and Complexity of Core-Stable Data Exchanges
- On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks
- On the Global Optimality of Policy Gradient Methods in General Utility Reinforcement Learning
- On the Hardness of Approximating Distributions with Tractable Probabilistic Models
- On the Hardness of Conditional Independence Testing In Practice
- On the Integration of Spatial-Temporal Knowledge: A Lightweight Approach to Atmospheric Time Series Forecasting
- On the Loss of Context Awareness in General Instruction Fine-tuning
- On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
- On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
- On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization
- On the Optimality of the Median-of-Means Estimator under Adversarial Contamination
- On the rankability of visual embeddings
- On the Relation between Rectified Flows and Optimal Transport
- On the Robustness of Transformers against Context Hijacking for Linear Classification
- On the Robustness of Verbal Confidence of LLMs in Adversarial Attacks
- On the Role of Hidden States of Modern Hopfield Network in Transformer
- On the SAC-BL Algorithm for Anomaly Detection
- On the Sample Complexity Bounds of Bilevel Reinforcement Learning
- On the Sample Complexity of Differentially Private Policy Optimization
- On the sample complexity of semi-supervised multi-objective learning
- On the Science of “Alien Intelligences”: Evaluating Cognitive Capabilities in Babies, Animals, and AI
- On the Stability and Generalization of Meta-Learning: the Impact of Inner-Levels
- On the Stability of Graph Convolutional Neural Networks: A Probabilistic Perspective
- On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling
- On the Universal Near Optimality of Hedge in Combinatorial Settings
- On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
- On the VC dimension of deep group convolutional neural networks
- On topological descriptors for graph products
- On Traceability in $\ell_p$ Stochastic Convex Optimization
- On Transferring Transferability: Towards a Theory for Size Generalization
- On Union-Closedness of Language Generation
- On Universality Classes of Equivariant Networks
- On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
- OOD-Barrier: Build a Middle-Barrier for Open-Set Single-Image Test Time Adaptation via Vision Language Models
- OOD Detection with Relative Angles
- OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
- OpenBox: Annotate Any Bounding Boxes in 3D
- Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents
- OpenCUA: Open Foundations for Computer-Use Agents
- OpenGU: A Comprehensive Benchmark for Graph Unlearning
- OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model
- OpenHype: Hyperbolic Embeddings for Hierarchical Open-Vocabulary Radiance Fields
- Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring
- OpenLex3D: A Tiered Benchmark for Open-Vocabulary 3D Scene Representations
- OpenMMEgo: Enhancing Egocentric Understanding for LMMs with Open Weights and Data
- OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-time Emotional Speech Synthesis
- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
- OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
- OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
- Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
- OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles
- Open-Vocabulary Part Segmentation via Progressive and Boundary-Aware Strategy
- Open-World Drone Active Tracking with Goal-Centered Rewards
- OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts
- OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning
- Opinion Maximization in Social Networks by Modifying Internal Opinions
- OPMapper: Enhancing Open-Vocabulary Semantic Segmentation with Multi-Guidance Information
- OPT 2025: Optimization for Machine Learning
- OPTFM: A Scalable Multi-View Graph Transformer for Hierarchical Pre-Training in Combinatorial Optimization
- Optical Coherence Tomography Harmonization with Anatomy-Guided Latent Metric Schrödinger Bridges
- Optimal Adjustment Sets for Nonparametric Estimation of Weighted Controlled Direct Effect
- Optimal and Provable Calibration in High-Dimensional Binary Classification: Angular Calibration and Platt Scaling
- Optimal Best Arm Identification under Differential Privacy
- Optimal community detection in dense bipartite graphs
- Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
- Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
- Optimal Estimation of the Best Mean in Multi-Armed Bandits
- Optimal Graph Clustering without Edge Density Signals
- Optimality and NP-Hardness of Transformers in Learning Markovian Dynamical Functions
- Optimal kernel regression bounds under energy-bounded noise
- Optimal Minimum Width for the Universal Approximation of Continuously Differentiable Functions by Deep Narrow MLPs
- Optimal Mistake Bounds for Transductive Online Learning
- Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff
- Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics
- Optimal Online Change Detection via Random Fourier Features
- Optimal Rates for Generalization of Gradient Descent for Deep ReLU Classification
- Optimal Rates in Continual Linear Regression via Increasing Regularization
- Optimal Regret Bounds via Low-Rank Structured Variation in Non-Stationary Reinforcement Learning
- Optimal Regret of Bandits under Differential Privacy
- Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL
- Optimal Spectral Transitions in High-Dimensional Multi-Index Models
- Optimism Without Regularization: Constant Regret in Zero-Sum Games
- Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality
- Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search
- Optimization Inspired Few-Shot Adaptation for Large Language Models
- Optimize Any Topology: A Foundation Model for Shape- and Resolution-Free Structural Topology Optimization
- Optimized Minimal 3D Gaussian Splatting
- Optimize the Unseen - Fast NeRF Cleanup with Free Space Prior
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization
- Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
- Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation
- Optimizing Retrieval for RAG via Reinforced Contrastive Learning
- Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning
- Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning
- OPTIONSCHAINFLEX ROBO (OPC) PRIVATE LIMITED (BTF)
- OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization
- OptiTree: Hierarchical Thoughts Generation with Tree Search for LLM Optimization Modeling
- Oracle-Efficient Combinatorial Semi-Bandits
- ORBIT - Open Recommendation Benchmark for Reproducible Research with Hidden Tests
- OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning
- Order-Level Attention Similarity Across Language Models: A Latent Commonality
- OrdShap: Feature Position Importance for Sequential Black-Box Models
- Orient Anything V2: Unifying Orientation and Rotation Understanding
- Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos
- Orientation Matters: Making 3D Generative Models Orientation-Aligned
- ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints
- ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
- Orochi: Versatile Biomedical Image Processor
- Orthogonal Contrastive Learning for Multi-Representation fMRI Analysis
- Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data
- OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata
- Oryx: a Scalable Sequence Model for Many-Agent Coordination in Offline MARL
- OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
- OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
- OSKAR: Omnimodal Self-supervised Knowledge Abstraction and Representation
- OSTAR: Optimized Statistical Text-classifier with Adversarial Resistance
- OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
- OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation
- Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
- Out-of-Distribution Generalized Graph Anomaly Detection with Homophily-aware Environment Mixup
- Overcoming Challenges of Long-Horizon Prediction in Driving World Models
- Overcoming Long Context Limitations of State Space Models via Context Dependent Sparse Attention
- Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning
- OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
- Over-squashing in Spatiotemporal Graph Neural Networks
- OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models
- OVS Meets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation
- OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
- OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis
- PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders
- PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
- PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
- PAID: Pairwise Angular-Invariant Decomposition for Continual Test-Time Adaptation
- PairEdit: Learning Semantic Variations for Exemplar-based Image Editing
- Pairwise Calibrated Rewards for Pluralistic Alignment
- Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model
- PALQO: Physics-informed model for Accelerating Large-scale Quantum Optimization
- Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
- Pancakes: Consistent Multi-Protocol Image Segmentation Across Biomedical Domains
- PandaPose: 3D Human Pose Lifting from a Single Image via Propagating 2D Pose Prior to 3D Anchor Space
- PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer
- PANGEA: Projection-Based Augmentation with Non-Relevant General Data for Enhanced Domain Adaptation in LLMs
- Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables
- Panoptic Captioning: An Equivalence Bridge for Image and Text
- PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination
- PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms
- PANTHER: Generative Pretraining Beyond Language for Sequential User Behavior Modeling
- PanTS: The Pancreatic Tumor Segmentation Dataset
- Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
- Parallelizing MCMC Across the Sequence Length
- PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries
- Parallel Scaling Law for Language Models
- Parameter Dynamics of Online Machine Learning and Test-time Adaptation
- Parameter Efficient Fine-tuning via Explained Variance Adaptation
- Parameter-free Algorithms for the Stochastically Extended Adversarial Model
- Parameter-Free Hypergraph Neural Network for Few-Shot Node Classification
- Parameterized Synthetic Text Generation with SimpleStories
- ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
- PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization
- Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies
- Pareto Optimal Risk-Agnostic Distributional Bandits with Heavy-Tail Rewards
- ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
- PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
- PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation
- Parsimonious Predictions for Strategyproof Scheduling
- Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection
- PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
- Partial Correlation Network Estimation by Semismooth Newton Methods
- Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions
- Partial Physics Informed Diffusion Model for Ocean Chlorophyll Concentration Reconstruction
- Partition-Then-Adapt: Combating Prediction Bias for Reliable Multi-Modal Test-Time Adaptation
- Partition to Evolve: Niching-enhanced Evolution with LLMs for Automated Algorithm Discovery
- Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)
- PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
- PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding
- Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems
- PASS: Path-selective State Space Model for Event-based Recognition
- PaTH Attention: Position Encoding via Accumulating Householder Transformations
- Path-Enhanced Contrastive Learning for Recommendation
- Path Gradients after Flow Matching
- Path-specific effects for pulse-oximetry guided decisions in critical care
- PathVQ: Reforming Computational Pathology Foundation Model for Whole Slide Image Analysis via Vector Quantization
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
- Pattern-Guided Adaptive Prior for Structure Learning
- Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers
- Pay Attention to Small Weights
- PaZO: Preconditioned Accelerated Zeroth-Order Optimization for Fine-Tuning LLMs
- PBR-SR: Mesh PBR Texture Super Resolution from 2D Image Priors
- PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning
- PC-Net: Weakly Supervised Compositional Moment Retrieval via Proposal-Centric Network
- PDEfuncta: Spectrally-Aware Neural Representation for PDE Solution Modeling
- PDPO: Parametric Density Path Optimization
- Per-Architecture Training-Free Metric Optimization for Neural Architecture Search
- Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
- Perception Encoder: The best visual embeddings are not at the output of the network
- PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
- Perception-R1: Pioneering Perception Policy with Reinforcement Learning
- Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity
- Performative Validity of Recourse Explanations
- Periodic Skill Discovery
- PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning
- Permissioned LLMs: Enforcing Access Control in Large Language Models
- PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
- Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning
- Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation
- Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning
- Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing
- Personalized Federated Conformal Prediction with Localization
- Personalized Image Editing in Text-to-Image Diffusion Models via Collaborative Direct Preference Optimization
- Personalized Safety in LLMs: A Benchmark and A Planning-Based Agent Approach
- Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections
- Personalized Visual Content Generation in Conversational Systems
- Perturb a Model, Not an Image: Towards Robust Privacy Protection via Anti-Personalized Diffusion Models
- Perturbation Bounds for Low-Rank Inverse Approximations under Noise
- PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis
- Pessimistic Data Integration for Policy Evaluation
- PF∆: A Benchmark Dataset for Power Flow under Load, Generation, and Topology Variations
- PHANTOM: A Benchmark for Hallucination Detection in Financial Long-Context QA
- Photography Perspective Composition: Towards Aesthetic Perspective Recommendation
- PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
- PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
- PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
- PhysDiff: A Physically-Guided Diffusion Model for Multivariate Time Series Anomaly Detection
- PhysDiff-VTON: Cross-Domain Physics Modeling and Trajectory Optimization for Virtual Try-On
- PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring
- PhySense: Sensor Placement Optimization for Accurate Physics Sensing
- PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors
- Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints
- Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection
- Physics-informed machine learning with domain decomposition and global dynamics for three-dimensional intersecting flows
- Physics-informed Neural Operator for Pansharpening
- Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers
- Physics-informed Value Learner for Offline Goal-Conditioned Reinforcement Learning
- Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
- PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation
- PhysVLM-AVR: Active Visual Reasoning for Multimodal Large Language Models in Physical Environments
- PhySwin: An Efficient and Physically-Informed Foundation Model for Multispectral Earth Observation
- PhysX-3D: Physical-Grounded 3D Asset Generation
- PID-controlled Langevin Dynamics for Faster Sampling on Generative Models
- PiKE: Adaptive Data Mixing for Large-Scale Multi-Task Learning Under Low Gradient Conflicts
- PINN Balls: Scaling Second-Order Methods for PINNs with Domain Decomposition and Adaptive Sampling
- PINNs with Learnable Quadrature
- Pinpointing Attention-Causal Communication in Language Models
- Pin the Tail on the Model: Blindfolded Repair of User-Flagged Failures in Text-to-Image Services
- PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference
- PIPE: Physics-Informed Position Encoding for Alignment of Satellite Images and Time Series in Typhoon Forecasting
- PIVNO: Particle Image Velocimetry Neural Operator
- Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers
- Pixel Reasoner: Incentivizing Pixel Space Reasoning via Curiosity-Driven Reinforcement Learning
- PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
- Place Cells as Multi-Scale Position Embeddings: Random Walk Transition Kernels for Path Planning
- PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-forward Planar Splatting
- PlanarGS: High-Fidelity Indoor 3D Gaussian Splatting Guided by Vision-Language Planar Priors
- Planning and Learning in Average Risk-aware MDPs
- Planning in the Era of Language Models
- Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
- Planning with Quantized Opponent Models
- PlanU: Large Language Model Reasoning through Planning under Uncertainty
- Plasticity as the Mirror of Empowerment
- P-Law: Predicting Quantitative Scaling Law with Entropy Guidance in Large Recommendation Models
- PlayerOne: Egocentric World Simulator
- PLD: A Choice-Theoretic List-Wise Knowledge Distillation
- PLEIADES: Building Temporal Kernels with Orthogonal Polynomials
- Plenodium: Underwater 3D Scene Reconstruction with Plenoptic Medium Representation
- PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models
- pLSTM: parallelizable Linear Source Transition Mark networks
- Plug-and-Play Context Feature Reuse for Efficient Masked Generation
- Plug-and-play Feature Causality Decomposition for Multimodal Representation Learning
- PMLF: A Physics-Guided Multiscale Loss Framework for Structurally Heterogeneous Time Series
- PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement
- PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
- POCO: Scalable Neural Forecasting through Population Conditioning
- PoE-World: Compositional World Modeling with Products of Programmatic Experts
- PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation
- Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
- Point4Bit: Post Training 4-bit Quantization for Point Cloud 3D Detection
- Point Cloud Synthesis Using Inner Product Transforms
- PointMAC: Meta-Learned Adaptation for Robust Test-Time Point Cloud Completion
- Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-training
- PointMapPolicy: Structured Point Cloud Processing for Multi-Modal Imitation Learning
- Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings
- Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
- PointTruss: K-Truss for Point Cloud Registration
- Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs
- PoLAR: Polar-Decomposed Low-Rank Adapter Representation
- PolarQuant: Leveraging Polar Transformation for Key Cache Quantization and Decoding Acceleration
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
- Policy Compatible Skill Incremental Learning via Lazy Learning Interface
- Policy Gradient Methods Converge Globally in Imperfect-Information Extensive-Form Games
- Policy learning “without” overlap: Pessimism and generalized empirical Bernstein’s inequality
- Policy Optimized Text-to-Image Pipeline Design
- PolyGuard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset
- PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors
- Polyline Path Masked Attention for Vision Transformer
- PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts
- PolyPose: Deformable 2D/3D Registration via Polyrigid Transformations
- PolypSense3D: A Multi-Source Benchmark Dataset for Depth-Aware Polyp Size Measurement in Endoscopy
- PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
- Pool Me Wisely: On the Effect of Pooling in Transformer-Based Models
- PoseCrafter: Extreme Pose Estimation with Hybrid Video Synthesis
- Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance
- Position: AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift
- Positional Encoding: Past, Present, and Future
- Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks
- Position: Benchmarking is Broken - Don't Let AI be Its Own Judge
- Position: Biology is the Challenge Physics-Informed ML Needs to Evolve
- Position: Bridge the Gaps between Machine Unlearning and AI Regulation
- Position: If Innovation in AI systematically Violates Fundamental Rights, Is It Innovation at All?
- Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track
- Position: Require Frontier AI Labs To Release Small "Analog" Models
- Position: Towards Bidirectional Human-AI Alignment
- Posterior Contraction for Sparse Neural Networks in Besov Spaces with Intrinsic Dimensionality
- Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics
- Post Hoc Regression Refinement via Pairwise Rankings
- Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
- PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching
- Practical and Effective Code Watermarking for Large Language Models
- Practical Bayes-Optimal Membership Inference Attacks
- Practical do-Shapley Explanations with Estimand-Agnostic Causal Inference
- Practical Kernel Selection for Kernel-based Conditional Independence Test
- Pragmatic Heterogeneous Collaborative Perception via Generative Communication Mechanism
- Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning
- PREAMBLE: Private and Efficient Aggregation via Block Sparse Vectors
- Precise Asymptotics and Refined Regret of Variance-Aware UCB
- Precise Diffusion Inversion: Towards Novel Samples and Few-Step Models
- Precise Information Control in Long-Form Text Generation
- Preconditioned Langevin Dynamics with Score-based Generative Models for Infinite-Dimensional Linear Bayesian Inverse Problems
- Predictability Enables Parallelization of Nonlinear State Space Models
- Predictable Scale (Part II) --- Farseer: A Refined Scaling Law in LLMs
- Predicting Empirical AI Research Outcomes with Language Models
- Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks
- Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme
- Predicting the Performance of Black-box Language Models with Follow-up Queries
- Prediction-Powered Causal Inferences
- Prediction-Powered Semi-Supervised Learning with Online Power Tuning
- Prediction with expert advice under additive noise
- Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability
- Predictive Preference Learning from Human Interventions
- Preference-Based Dynamic Ranking Structure Recognition
- Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
- Preference Distillation via Value based Reinforcement Learning
- Preference-driven Knowledge Distillation for Few-shot Node Classification
- Preference-Driven Multi-Objective Combinatorial Optimization with Conditional Computation
- Preference-Guided Diffusion for Multi-Objective Offline Optimization
- Preference Learning with Lie Detectors can Induce Honesty or Evasion
- Preference Learning with Response Time: Robust Losses and Guarantees
- Preference Optimization by Estimating the Ratio of the Data Distribution
- Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization
- PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
- PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
- PRESCRIBE: Predicting Single-Cell Responses with Bayesian Estimation
- Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
- Preserving Task-Relevant Information Under Linear Concept Removal
- PRESTO: Preimage-Informed Instruction Optimization for Prompting Black-Box LLMs
- Pre-trained Large Language Models Learn to Predict Hidden Markov Models In-context
- Pre-Trained Policy Discriminators are General Reward Models
- Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
- Preventing Shortcuts in Adapter Training via Providing the Shortcuts
- Price of Parsimony: Complexity of Fourier Sparsity Testing
- PRIMT: Preference-based Reinforcement Learning with Multimodal Feedback and Trajectory Synthesis from Foundation Models
- Principled Data Augmentation for Learning to Solve Quadratic Programming Problems
- Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward
- Principled Long-Tailed Generative Modeling via Diffusion Models
- Principled Model Routing for Unknown Mixtures of Source Domains
- PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
- Prior Forgetting and In-Context Overfitting
- Prior-Guided Diffusion Planning for Offline Reinforcement Learning
- Prior-Guided Flow Matching for Target-Aware Molecule Design with Learnable Atom Number
- Prioritizing Perception-Guided Self-Supervision: A New Paradigm for Causal Modeling in End-to-End Autonomous Driving
- Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
- Privacy amplification by random allocation
- Privacy Reasoning in Ambiguous Contexts
- Private Continual Counting of Unbounded Streams
- Private Evolution Converges
- Private Geometric Median in Nearly-Linear Time
- Private Hyperparameter Tuning with Ex-Post Guarantee
- Private Online Learning against an Adaptive Adversary: Realizable and Agnostic Settings
- Private Set Union with Multiple Contributions
- Private Statistical Estimation via Truncation
- Private Training Large-scale Models with Efficient DP-SGD
- Private Zeroth-Order Optimization with Public Data
- Pro3D-Editor: A Progressive Framework for Consistent and Precise 3D Editing
- Probabilistic Reasoning with LLMs for Privacy Risk Estimation
- Probabilistic Stability Guarantees for Feature Attributions
- Probabilistic Token Alignment for Large Language Model Fusion
- Probably Approximately Precision and Recall Learning
- Probing Equivariance and Symmetry Breaking in Convolutional Networks
- Probing Hidden Knowledge Holes in Unlearned LLMs
- Probing Neural Combinatorial Optimization Models
- Problem-Parameter-Free Decentralized Bilevel Optimization
- Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning
- Procurement Auctions with Predictions: Improved Frugality for Facility Location
- ProDAG: Projected Variational Inference for Directed Acyclic Graphs
- Product Distribution Learning with Imperfect Advice
- ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos
- PROFIT: A Specialized Optimizer for Deep Fine Tuning
- ProfiX: Improving Profile-Guided Optimization in Compilers with Graph Neural Networks
- Program Synthesis via Test-Time Transduction
- Progressive Data Dropout: An Embarrassingly Simple Approach to Train Faster
- Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities
- Progress Reward Model for Reinforcement Learning via Large Language Models
- Prohibiting Generative AI in any Form of Weapon Control
- Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
- Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs
- Projection-Manifold Regularized Latent Diffusion for Robust General Image Fusion
- Projective Equivariant Networks via Second-order Fundamental Differential Invariants
- Promptable 3-D Object Localization with Latent Diffusion Models
- Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs
- Prompt-Guided Alignment with Information Bottleneck Makes Image Compression Also a Restorer
- Prompt-guided Disentangled Representation for Action Recognition
- Prompting as Scientific Inquiry
- Prompt Tuning Decision Transformers with Structured and Scalable Bandits
- Prompt Tuning Transformers for Data Memorization
- ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
- ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods
- Prot2Text-V2: Protein Function Prediction with Multimodal Contrastive Alignment
- ProteinConformers: Benchmark Dataset for Simulating Protein Conformational Landscape Diversity and Plausibility
- Protein Design with Dynamic Protein Vocabulary
- Protein Inverse Folding From Structure Feedback
- ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search
- Protocols for Verifying Smooth Strategies in Bandits and Games
- ProtoPairNet: Interpretable Regression through Prototypical Pair Reasoning
- Provable Gradient Editing of Deep Neural Networks
- Provable Meta-Learning with Low-Rank Adaptations
- Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
- Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning
- Provable Scaling Laws for the Test-Time Compute of Large Language Models
- Provable Watermarking for Data Poisoning Attacks
- Provably Efficient Multi-Task Meta Bandit Learning via Shared Representations
- Provably Efficient Online RLHF with One-Pass Reward Modeling
- Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
- Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO
- Proxy-SPEX: Sample-Efficient Interpretability via Sparse Feature Interactions in LLMs
- Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
- PRSformer: Disease Prediction from Million-Scale Individual Genotypes
- Pruning-Robust Mamba with Asymmetric Multi-Scale Scanning Paths
- Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization
- PSBench: a large-scale benchmark for estimating the accuracy of protein complex structural models
- Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
- Pseudo-Riemannian Graph Transformer
- PseuZO: Pseudo-Zeroth-Order Algorithm for Training Deep Neural Networks
- PSI: A Benchmark for Human Interpretation and Response in Traffic Interactions
- PSMBench: A Benchmark and Dataset for Evaluating LLMs Extraction of Protocol State Machines from RFC Specifications
- PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
- PUATE: Efficient ATE Estimation from Treated (Positive) and Unlabeled Units
- PubSub-VFL: Towards Efficient Two-Party Split Learning in Heterogeneous Environments via Publisher/Subscriber Architecture
- PUO-Bench: A Panel Understanding and Operation Benchmark with A Privacy-Preserving Framework
- Puppeteer: Rig and Animate Your 3D Models
- Purest Quantum State Identification
- Purifying Approximate Differential Privacy with Randomized Post-processing
- Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
- Purity Law for Neural Routing Problem Solvers with Enhanced Generalizability
- PurpCode: Reasoning for Safer Code Generation
- Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning
- Puzzles: Unbounded Video-Depth Augmentation for Scalable End-to-End 3D Reconstruction
- PyraMotion: Attentional Pyramid-Structured Motion Integration for Co-Speech 3D Gesture Synthesis
- Q3R: Quadratic Reweighted Rank Regularizer for Effective Low-Rank Training
- QBasicVSR: Temporal Awareness Adaptation Quantization for Video Super-Resolution
- QCircuitBench: A Large-Scale Dataset for Benchmarking Quantum Algorithm Design
- QFFT, Question-Free Fine-Tuning for Adaptive Reasoning
- QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
- QiMeng-MuPa: Mutual-Supervised Learning for Sequential-to-Parallel Code Translation
- QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code
- QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation
- Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
- QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training
- Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
- QSCA: Quantization with Self-Compensating Auxiliary for Monocular Depth Estimation
- QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
- QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
- Quadratic Coreset Selection: Certifying and Reconciling Sequence and Token Mining for Efficient Instruction Tuning
- QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction
- Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
- QuanDA: Quantile-Based Discriminant Analysis for High-Dimensional Imbalanced Classification
- Quantifying and Alleviating Co-Adaptation in Sparse-View 3D Gaussian Splatting
- Quantifying Cross-Modality Memorization in Vision-Language Models
- Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization
- Quantifying Elicitation of Latent Capabilities in Language Models
- Quantifying Generalisation in Imitation Learning
- Quantifying Statistical Significance of Deep Nearest Neighbor Anomaly Detection via Selective Inference
- Quantifying Task-relevant Similarities in Representations Using Decision Variable Correlations
- Quantifying Uncertainty in Error Consistency: Towards Reliable Behavioral Comparison of Classifiers
- Quantifying Uncertainty in the Presence of Distribution Shifts
- Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
- Quantitative convergence of trained neural networks to Gaussian processes
- Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
- Quantization-Free Autoregressive Action Transformer
- Quantum Doubly Stochastic Transformers
- Quantum speedup of non-linear Monte Carlo problems
- Quantum Speedups for Minimax Optimization and Beyond
- Quantum Visual Fields with Neural Amplitude Encoding
- QuARI: Query Adaptive Retrieval Improvement
- Quartet: Native FP4 Training Can Be Optimal for Large Language Models
- Quasi-Self-Concordant Optimization with $\ell_{\infty}$ Lewis Weights
- Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph
- QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?
- QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks
- R$^2$ec: Towards Large Recommender Models with Reasoning
- R1-ShareVL: Incentivizing Reasoning Capabilities of Multimodal Large Language Models via Share-GRPO
- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
- RADAR: Benchmarking Language Models on Imperfect Tabular Data
- RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
- Radial Attention: $\mathcal O(n \log n)$ Sparse Attention for Long Video Generation
- RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis
- RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
- RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability
- RAG4GFM: Bridging Knowledge Gaps in Graph Foundation Models through Graph Retrieval Augmented Generation
- RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering
- RAGRouter: Learning to Route Queries to Multiple Retrieval-Augmented Language Models
- Rainbow Delay Compensation: A Multi-Agent Reinforcement Learning Framework for Mitigating Observation Delays
- RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis
- Rancho Santiago Community College District (RSCCD) (BTF)
- Random Forest Autoencoders for Guided Representation Learning
- Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
- Random Search Neural Networks for Efficient and Expressive Graph Learning
- Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
- RANK++LETR: Learn to Rank and Optimize Candidates for Line Segment Detection
- RankMatch: A Novel Approach to Semi-Supervised Label Distribution Learning Leveraging Rank Correlation between Labels
- RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation
- Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion
- Rao-Blackwellised Reparameterisation Gradients
- RAPID Hand: Robust, Affordable, Perception-Integrated, Dexterous Manipulation Platform for Embodied Intelligence
- RAPTR: Radar-based 3D Pose Estimation using Transformer
- Rare Text Semantics Were Always There in Your Diffusion Transformer
- RAST: Reasoning Activation in LLMs via Small-model Transfer
- RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling
- Rationalized All-Atom Protein Design with Unified Multi-Modal Bayesian Flow
- Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning
- Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
- RayFusion: Ray Fusion Enhanced Collaborative Visual Perception
- RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion
- RBench-V: A Primary Assessment for Visual Reasoning Models with Multimodal Outputs
- RCCDA: Adaptive Model Updates in the Presence of Concept Drift under a Constrained Resource Budget
- R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization
- RDB2G-Bench: A Comprehensive Benchmark for Automatic Graph Modeling of Relational Databases
- RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks
- Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
- Reading Recognition in the Wild
- ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
- REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
- Real-DRL: Teach and Learn in Reality
- RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics
- Real-Time Execution of Action Chunking Flow Policies
- Real-Time Hyper-Personalized Generative AI Should Be Regulated to Prevent the Rise of "Digital Heroin"
- Real-Time Scene-Adaptive Tone Mapping for High-Dynamic Range Object Detection
- Real-World Adverse Weather Image Restoration via Dual-Level Reinforcement Learning with High-Quality Cold Start
- Real-World Reinforcement Learning of Active Perception Behaviors
- REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
- ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
- Reasoning as an Adaptive Defense for Safety
- Reasoning Beyond Points: A Visual Introspective Approach for Few-Shot 3D Segmentation
- Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
- REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving
- Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
- Reasoning Is Not a Race: When Stopping Early Beats Going Deeper
- Reasoning is Periodicity? Improving Large Language Models Through Effective Periodicity Modeling
- Reasoning Models Better Express Their Confidence
- Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models
- Reasoning Models Sometimes Output Illegible Chains of Thought
- Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning
- Reasoning Planning for Language Models
- Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
- Rebalancing Contrastive Alignment with Bottlenecked Semantic Increments in Text-Video Retrieval
- Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning
- ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents
- Recent Advances in Time Series Foundation Models: Have We Reached the ‘BERT Moment’?
- Recent Developments in Geometric Machine Learning: Foundations, Models, and More
- Re-coding for Uncertainties: Edge-awareness Semantic Concordance for Resilient Event-RGB Segmentation
- Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
- Reconciling Geospatial Prediction and Retrieval via Sparse Representations
- ReCon-GS: Continuum-Preserved Gaussian Streaming for Fast and Compact Reconstruction of Dynamic Scenes
- ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection
- Reconstructing Heterogeneous Biomolecules via Hierarchical Gaussian Mixtures and Part Discovery
- Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos
- Reconstruction and Secrecy under Approximate Distance Queries
- Rectified CFG++ for Flow Based Models
- Rectified Point Flow: Generic Point Cloud Pose Estimation
- Rectifying Shortcut Behaviors in Preference-based Reward Learning
- Rectifying Soft-Label Entangled Bias in Long-Tailed Dataset Distillation
- Recurrent Attention-based Token Selection for Efficient Streaming Video-LLMs
- Recurrent Memory for Online Interdomain Gaussian Processes
- Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians
- Recursive Inference Scaling: A Winning Path to Scalable Inference in Language and Multimodal Systems
- Redefining Experts: Interpretable Decomposition of Language Models for Toxicity Mitigation
- ReDi: Rectified Discrete Flow
- ReDit: Reward Dithering for Improved LLM Policy Optimization
- REDOUBT: Duo Safety Validation for Autonomous Vehicle Motion Planning
- Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling
- Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
- Reduction-based Pseudo-label Generation for Instance-dependent Partial Label Learning
- Redundancy-Aware Test-Time Graph Out-of-Distribution Detection
- REFED: A Subject Real-time Dynamic Labeled EEG-fNIRS Synchronized Recorded Emotion Dataset
- Refinement Methods for Distributed Distribution Estimation under $\ell^p$-Losses
- Refining Norms: A Post-hoc Framework for OOD Detection in Graph Neural Networks
- RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models
- Reframing Gaussian Splatting Densification with Complexity-Density Consistency of Primitives
- Refusal Direction is Universal Across Safety-Aligned Languages
- REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
- Regional Explanations: Bridging Local and Global Variable Importance
- Register and [CLS] tokens induce a decoupling of local and global features in large ViTs
- Registration is a Powerful Rotation-Invariance Learner for 3D Anomaly Detection
- Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
- Regression Trees Know Calculus
- Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
- Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
- Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems
- Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning
- Regularized least squares learning with heavy-tailed noise is minimax optimal
- Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations
- ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model
- ReinAD: Towards Real-world Industrial Anomaly Detection with a Comprehensive Contrastive Dataset
- ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
- REINFORCE Converges to Optimal Policies with Any Learning Rate
- Reinforced Active Learning for Large-Scale Virtual Screening with Learnable Policy Model
- Reinforced Context Order Recovery for Adaptive Reasoning and Planning
- Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
- REINFORCEMENT LEARNING FOR INDIVIDUAL OPTIMAL POLICY FROM HETEROGENEOUS DATA
- Reinforcement learning for one-shot DAG scheduling with comparability identification and dense reward
- Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
- Reinforcement Learning Meets Masked Generative Models: Mask-GRPO for Text-to-Image Generation
- Reinforcement Learning Teachers of Test Time Scaling
- Reinforcement Learning with Action Chunking
- Reinforcement Learning with Backtracking Feedback
- Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
- Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
- Reinventing Multi-Agent Collaboration through Gaussian-Image Synergy in Diffusion Policies
- RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
- Relaxing partition admissibility in Cluster-DAGs: a causal calculus with arbitrary variable clustering
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
- Reliable Decision‑Making via Calibration‑Oriented Retrieval‑Augmented Generation
- Reliable Lifelong Multimodal Editing: Conflict-Aware Retrieval Meets Multi-Level Guidance
- Reliable ML from Unreliable Data
- Reliably detecting model failures in deployment without labels
- Relieving the Over-Aggregating Effect in Graph Transformers
- ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
- Remarkable Robustness of LLMs: Stages of Inference?
- Remasking Discrete Diffusion Models with Inference-Time Scaling
- ReMindRAG: Low-Cost LLM-Guided Knowledge Graph Traversal for Efficient RAG
- REMI: Reconstructing Episodic Memory During Internally Driven Path Planning
- Removing Concepts from Text-to-Image Models with Only Negative Samples
- Rendering-Aware Reinforcement Learning for Vector Graphics Generation
- REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders
- REOBench: Benchmarking Robustness of Earth Observation Foundation Models
- REOrdering Patches Improves Vision Models
- Reparameterized LLM Training via Orthogonal Equivalence Transformation
- REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
- RepGuard: Adaptive Feature Decoupling for Robust Backdoor Defense in Large Language Models
- RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models
- ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
- RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation
- Replicable Distribution Testing
- Replicable Online Learning
- Replicable Online pricing
- Repo2Run: Automated Building Executable Environment for Code Repository at Scale
- RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving
- RePO: Understanding Preference Learning Through ReLU-Based Optimization
- Representational Difference Explanations
- Representation Consistency for Accurate and Coherent LLM Answer Aggregation
- Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
- Representation-Level Counterfactual Calibration for Debiased Zero-Shot Recognition
- REP: Resource-Efficient Prompting for Rehearsal-Free Continual Learning
- Reproducing Kernel Banach Space Models for Neural Networks with Application to Rademacher Complexity Analysis
- Repurposing AlphaFold3-like Protein Folding Models for Antibody Sequence and Structure Co-design
- Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues
- RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
- Rescaled Influence Functions: Accurate Data Attribution in High Dimension
- ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
- ReservoirTTA: Prolonged Test-time Adaptation for Evolving and Recurring Domains
- Residual Stream Analysis of Overfitting And Structural Disruptions
- ReSim: Reliable World Simulation for Autonomous Driving
- Resolution of Simpson's paradox via the common cause principle
- Resounding Acoustic Fields with Reciprocity
- Resource-Constrained Federated Continual Learning: What Does Matter?
- RESPIN-S1.0: A read speech corpus of 10000+ hours in dialects of nine Indian Languages
- RespoDiff: Dual-Module Bottleneck Transformation for Responsible & Faithful T2I Generation
- ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning
- Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video
- Restoring Pruned Large Language Models via Lost Component Compensation
- Restricted Global-Aware Graph Filters Bridging GNNs and Transformer for Node Classification
- Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions
- Results of the Big ANN: NeurIPS’23 competition
- Rethinking Approximate Gaussian Inference in Classification
- Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates
- Rethinking Entropy in Test-Time Adaptation: The Missing Piece from Energy Duality
- Rethinking Evaluation of Infrared Small Target Detection
- Rethinking Fair Federated Learning from Parameter and Client View
- Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning
- Rethinking Gradient Step Denoiser: Towards Truly Pseudo-Contractive Operator
- Rethinking Hebbian Principle: Low-Dimensional Structural Projection for Unsupervised Learning
- Rethinking Joint Maximum Mean Discrepancy for Visual Domain Adaptation
- Rethinking Losses for Diffusion Bridge Samplers
- Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
- Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees
- Rethinking Nighttime Image Deraining via Learnable Color Space Transformation
- Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
- Rethinking Out-of-Distribution Detection and Generalization with Collective Behavior Dynamics
- Rethinking PCA Through Duality
- Rethinking Residual Distribution in Locate-then-Edit Model Editing
- Rethinking Scale-Aware Temporal Encoding for Event-based Object Detection
- Rethinking the Role of Verbatim Memorization in LLM Privacy
- Rethinking Tokenized Graph Transformers for Node Classification
- Rethinking Verification for LLM Code Generation: From Generation to Testing
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
- Retrieval is Not Enough: Enhancing RAG through Test-Time Critique and Optimization
- Retro-R1: LLM-based Agentic Retrosynthesis
- Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
- RETRO SYNFLOW: Discrete Flow-Matching for Accurate and Diverse Single-Step Retrosynthesis
- Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs
- Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
- Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape
- Return of ChebNet: Understanding and Improving an Overlooked GNN on Long Range Tasks
- REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects
- Revealing Multimodal Causality with Large Language Models
- Reverse-Annealed Sequential Monte Carlo for Efficient Bayesian Optimal Experiment Design
- Reverse Diffusion Sequential Monte Carlo Samplers
- Reverse Engineering Human Preferences with Reinforcement Learning
- Revising and Falsifying Sparse Autoencoder Feature Explanations
- Revisiting 1-peer exponential graph for enhancing decentralized learning efficiency
- Revisiting Agnostic Boosting
- Revisiting Bi-Linear State Transitions in Recurrent Neural Networks
- Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity
- Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
- Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
- Revisiting Frank-Wolfe for Structured Nonconvex Optimization
- Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
- Revisiting Glorot Initialization for Long-Range Linear Recurrences
- Revisiting Logit Distributions for Reliable Out-of-Distribution Detection
- Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability
- Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
- Revisiting Orbital Minimization Method for Neural Operator Decomposition
- Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
- Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
- Revisiting Semi-Supervised Learning in the Era of Foundation Models
- Revitalizing SVD for Global Covariance Pooling: Halley’s Method to Overcome Over-Flattening
- Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models
- Revolutionizing Graph Aggregation: From Suppression to Amplification via BoostGCN
- Revolutionizing Training-Free NAS: Towards Efficient Automatic Proxy Discovery via Large Language Models
- Reward-Aware Proto-Representations in Reinforcement Learning
- Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation
- Reward-oriented Causal Representation Learning
- Reward Reasoning Models
- Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions
- RF-Agent: Automated Reward Function Design via Language Agent Tree Search
- RFMPose: Generative Category-level Object Pose Estimation via Riemannian Flow Matching
- RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes
- RGB-to-Polarization Estimation: A New Task and Benchmark Study
- RGNMR: A Gauss-Newton method for robust matrix completion with theoretical guarantees
- RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility
- RiboFlow: Conditional De Novo RNA Co-Design via Synergistic Flow Matching
- Ridge Boosting is Both Robust and Efficient
- RidgeLoRA: Matrix Ridge Enhanced Low-Rank Adaptation of Large Language Models
- Riemannian Consistency Model
- Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry
- Riemannian Proximal Sampler for High-accuracy Sampling on Manifolds
- Rig3R: Rig-Aware Conditioning and Discovery for 3D Reconstruction
- RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data
- Right for the Right Reasons: Avoiding Reasoning Shortcuts via Prototypical Neurosymbolic AI
- Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
- RIGNO: A Graph-based Framework For Robust And Accurate Operator Learning For PDEs On Arbitrary Domains
- Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
- RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
- Rising from Ashes: Generalized Federated Learning via Dynamic Parameter Reset
- Risk-Averse Constrained Reinforcement Learning with Optimized Certainty Equivalents
- Risk-Averse Total-Reward Reinforcement Learning
- Risk-aware Direct Preference Optimization under Nested Risk Measure
- Risk Bounds For Distributional Regression
- Risk Management for Mitigating Benchmark Failure Modes: BenchRisk
- RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting
- R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
- RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
- RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
- RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
- RLVR-World: Training World Models with Reinforcement Learning
- RLZero: Direct Policy Inference from Language Without In-Domain Supervision
- RNNs perform task computations by dynamically warping neural representations
- RobIA: Robust Instance-aware Continual Test-time Adaptation for Deep Stereo
- Robo2VLM: Improving Visual Question Answering using Large-Scale Robot Manipulation Data
- RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation
- Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
- RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
- RoboScape: Physics-informed Embodied World Model
- Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
- RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills
- Robust and Computation-Aware Gaussian Processes
- Robust and Diverse Multi-Agent Learning via Rational Policy Gradient
- Robust and Scalable Autonomous Reinforcement Learning in Irreversible Environments
- Robust Contextual Pricing
- Robust Cross-modal Alignment Learning for Cross-Scene Spatial Reasoning and Grounding
- Robust Distortion-Free Watermark for Autoregressive Audio Generation Models
- Robust Distributed Estimation: Extending Gossip Algorithms to Ranking and Trimmed Means
- Robust Egocentric Referring Video Object Segmentation via Dual-Modal Causal Intervention
- Robust Ego-Exo Correspondence with Long-Term Memory
- Robust Equilibria in Continuous Games: From Strategic to Dynamic Robustness
- Robust Estimation Under Heterogeneous Corruption Rates
- Robust Explanations of Graph Neural Networks via Graph Curvatures
- Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
- Robust Graph Condensation via Classification Complexity Mitigation
- Robust Hallucination Detection in LLMs via Adaptive Token Selection
- Robust Hyperbolic Learning with Curvature-Aware Optimization
- Robustifying Learning-Augmented Caching Efficiently without Compromising 1-Consistency
- Robust Integrated Learning and Pauli Noise Mitigation for Parametrized Quantum Circuits
- Robust Label Proportions Learning
- Robust learning of halfspaces under log-concave marginals
- Robust LLM Alignment via Distributionally Robust Direct Preference Optimization
- Robustly Learning Monotone Single-Index Models
- RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
- Robust Minimax Boosting with Performance Guarantees
- Robustness in Both Domains: CLIP Needs a Robust Text Encoder
- Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting
- Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption
- Robust Regression of General ReLUs with Queries
- Robust Reinforcement Learning in Finance: Modeling Market Impact with Elliptic Uncertainty Sets
- Robust Sampling for Active Statistical Inference
- Robust Satisficing Gaussian Process Bandits Under Adversarial Attacks
- Robust SuperAlignment: Weak-to-Strong Robustness Generalization for Vision-Language Models
- Robust Transfer Learning with Unreliable Source Data
- RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models
- RoFt-Mol: Benchmarking Robust Fine-tuning with Molecular Graph Foundation Models
- ROGR: Relightable 3D Objects using Generative Relighting
- Role-aware Multi-agent Reinforcement Learning for Coordinated Emergency Traffic Control
- Role Bias in Diffusion Models: Diagnosing and Mitigating through Intermediate Decomposition
- Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
- RoMa: A Robust Model Watermarking Scheme for Protecting IP in Diffusion Models
- RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing
- RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains
- RoomEditor: High-Fidelity Furniture Synthesis with Parameter-Sharing U-Net
- Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping
- Root Cause Analysis of Outliers with Missing Structural Knowledge
- ROOT: Rethinking Offline Optimization as Distributional Translation via Probabilistic Bridge
- RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
- Rope to Nope and Back Again: A New Hybrid Attention Strategy
- ROSE: Remove Objects with Side Effects in Videos
- Rotary Masked Autoencoders are Versatile Learners
- Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
- Routing Mamba: Scaling State Space Models with Mixture-of-Experts Projection
- ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks
- RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization
- RrED: Black-box Unsupervised Domain Adaptation via Rectifying-reasoning Errors of Diffusion
- RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards
- RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
- RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events
- rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset
- RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
- RUAGO: Effective and Practical Retain-Free Unlearning via Adversarial Attack and OOD Generator
- RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality
- RvLLM: LLM Runtime Verification with Domain Knowledge
- S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection
- S$^2$NN: Sub-bit Spiking Neural Networks
- SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
- SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders
- Safe and Stable Control via Lyapunov-Guided Diffusion Models
- Safely Learning Controlled Stochastic Dynamics
- SAFE: Multitask Failure Detection for Vision-Language-Action Models
- SAFEPATH: Preventing Harmful Reasoning in Chain-of-Thought via Early Alignment
- SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism
- Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
- Safe + Safe = Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models
- Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking
- Safety Depth in Large Language Models: A Markov Chain Perspective
- Safety Pretraining: Toward the Next Generation of Safe AI
- SafeVid: Toward Safety Aligned Video Large Multimodal Models
- SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
- SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification
- SaFiRe: Saccade-Fixation Reiteration with Mamba for Referring Image Segmentation
- SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
- SAGE: A Unified Framework for Generalizable Object State Recognition with State-Action Graph Embedding
- SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts
- SAINT: Sequence-Aware Integration for Spatial Transcriptomics Multi-View Clustering
- Salient Concept-Aware Generative Data Augmentation
- SALMONN-omni: A Standalone Speech LLM without Codec Injection for Full-duplex Conversation
- SALoM: Structure Aware Temporal Graph Networks with Long-Short Memory Updater
- SALS: Sparse Attention in Latent Space for KV Cache Compression
- SAM2Flow: Interactive Optical Flow Estimation with Dual Memory for in vivo Microcirculation Analysis
- SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models
- Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
- Sample-Adaptivity Tradeoff in On-Demand Sampling
- Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
- Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function
- Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning
- Sample-Conditional Coverage in Split-Conformal Prediction
- Sampled Estimators For Softmax Must Be Biased
- Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions
- Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation
- Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning
- Sampling 3D Molecular Conformers with Diffusion Transformers
- Sampling by averaging: A multiscale approach to score estimation
- Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
- Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion
- SAMPO: Scale-wise Autoregression with Motion Prompt for Generative World Models
- SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
- SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
- SAO-Instruct: Free-form Audio Editing using Natural Language Instructions
- SAP: Exact Sorting in Splatting via Screen-Aligned Primitives
- SAS: Simulated Attention Score
- Satellites Reveal Mobility: A Commuting Origin-destination Flow Generator for Global Cities
- SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning
- SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
- Scaffolding Dexterous Manipulation with Vision-Language Models
- Scalable and adaptive prediction bands with kernel sum-of-squares
- Scalable and Cost-Efficient de Novo Template-Based Molecular Generation
- Scalable Best-of-N Selection for Large Language Models via Self-Certainty
- Scalable Cross-View Sample Alignment for Multi-View Clustering with View Structure Similarity
- Scalable Evaluation and Neural Models for Compositional Generalization
- Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching
- Scalable Exploration via Ensemble++
- Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
- Scalable Fingerprinting of Large Language Models
- Scalable In-context Ranking with Generative Models
- Scalable inference of functional neural connectivity at submillisecond timescales
- Scalable Neural Incentive Design with Parameterized Mean-Field Approximation
- Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation
- Scalable Policy-Based RL Algorithms for POMDPs
- Scalable Signature Kernel Computations via Local Neumann Series Expansions
- Scalable Valuation of Human Feedback through Provably Robust Model Alignment
- ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion
- Scale-invariant attention
- Scale Test-Time Compute on Modern Hardware
- Scaling and context steer LLMs along the same computational path as the human brain
- Scaling can lead to compositional generalization
- Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
- Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
- Scaling Data-Driven Probabilistic Robustness Analysis for Semantic Segmentation Neural Networks
- Scaling Diffusion Transformers Efficiently via $\mu$P
- Scaling Embedding Layers in Language Models
- Scaling Epidemic Inference on Contact Networks: Theory and Algorithms
- Scaling Image Geo-Localization to Continent Level
- Scaling Language-centric Omnimodal Representation Learning
- Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law
- Scaling Laws for Optimal Data Mixtures
- Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets
- Scaling Laws For Scalable Oversight
- Scaling Law with Learning Rate Annealing
- Scaling Offline RL via Efficient and Expressive Shortcut Models
- Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
- Scaling Physical Reasoning with the PHYSICS Dataset
- Scaling RL to Long Videos
- Scaling Speculative Decoding with Lookahead Reasoning
- Scaling Unlocks Broader Generation and Deeper Functional Understanding of Proteins
- Scaling Up Active Testing to Large Language Models
- Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
- Scaling Up Parameter Generation: A Recurrent Diffusion Approach
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
- SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning
- ScatterAD: Temporal-Topological Scattering Mechanism for Time Series Anomaly Detection
- SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency
- SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation
- SceneForge: Enhancing 3D-text alignment with Structured Scene Compositions
- SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting
- SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
- Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
- scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling
- Schrödinger Bridge Matching for Tree-Structured Costs and Entropic Wasserstein Barycentres
- SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks
- Science of Trustworthy Generative Foundation Models
- Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
- scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
- SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodel LLMs
- Score-Based Diffusion Modeling for Nonparametric Empirical Bayes in Heteroscedastic Gaussian Mixtures
- Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery
- SCoT: Unifying Consistency Models and Rectified Flows via Straight-Consistent Trajectories
- SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought
- scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
- S-Crescendo: A Nested Transformer Weaving Framework for Scalable Nonlinear System in S-Domain Representation
- scSplit: Bringing Severity Cognizance to Image Decomposition in Fluorescence Microscopy
- Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation
- SD-KDE: Score-Debiased Kernel Density Estimation
- SDPGO: Efficient Self-Distillation Training Meets Proximal Gradient Optimization
- SDTagNet: Leveraging Text-Annotated Navigation Maps for Online HD Map Construction
- SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models
- SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents
- SEAL: Semantic-Aware Hierarchical Learning for Generalized Category Discovery
- Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
- Searching Efficient Semantic Segmentation Architectures via Dynamic Path Selection
- Searching Latent Program Spaces
- SeasonBench-EA: A Multi-Source Benchmark for Seasonal Prediction and Numerical Model Post-Processing in East Asia
- SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
- SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
- SECODEPLT: A Unified Benchmark for Evaluating the Security Risks and Capabilities of Code GenAI
- Second-Order Convergence in Private Stochastic Non-Convex Optimization
- Second-order Optimization under Heavy-Tailed Noise: Hessian Clipping and Sample Complexity Limits
- Second Workshop on Aligning Reinforcement Learning Experimentalists and Theorists (ARLET)
- SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
- Secure and Confidential Certificates of Online Fairness
- Securing the Language of Life: Inheritable Watermarks from DNA Language Models to Proteins
- Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
- SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents
- Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models
- Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset
- Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models
- Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization
- Seeing the Arrow of Time in Large Multimodal Models
- Seeing the Wind from a Falling Leaf
- Seeing through Uncertainty: Robust Task-Oriented Optimization in Visual Navigation
- Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation
- Seeking and Updating with Live Visual Knowledge
- Seemingly Redundant Modules Enhance Robust Odor Learning in Fruit Flies
- SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning
- SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
- See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
- See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
- Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
- Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
- SEGA: Shaping Semantic Geometry for Robust Hashing under Noisy Supervision
- SegGraph: Leveraging Graphs of SAM Segments for Few-Shot 3D Part Segmentation
- SegMASt3R: Geometry Grounded Segment Matching
- Segment Anything Model Meets Semi-supervised Medical Image Segmentation: A Novel Perspective
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
- Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
- SE-GUI: Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
- Seg-VAR: Image Segmentation with Visual Autoregressive Modeling
- Sekai: A Video Dataset towards World Exploration
- Selective Learning for Deep Time Series Forecasting
- Selective Omniprediction and Fair Abstention
- Self-Adapting Language Models
- Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization
- Self-Assembling Graph Perceptrons
- Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing
- Self-Calibrating BCIs: Ranking and Recovery of Mental Targets Without Labels
- Self-Challenging Language Model Agents
- Self-diffusion for Solving Inverse Problems
- Self-Evolving Pseudo-Rehearsal for Catastrophic Forgetting with Task Similarity in LLMs
- Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
- Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
- Self-Guided Hierarchical Exploration for Generalist Foundation Model Web Agents
- Self-Improving Embodied Foundation Models
- Self Iterative Label Refinement via Robust Unlabeled Learning
- Self-Perturbed Anomaly-Aware Graph Dynamics for Multivariate Time-Series Anomaly Detection
- Self-Refining Language Model Anonymizers via Adversarial Distillation
- Self-supervised Blending Structural Context of Visual Molecules for Robust Drug Interaction Prediction
- Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning
- Self-Supervised Direct Preference Optimization for Text-to-Image Diffusion Models
- Self-Supervised Discovery of Neural Circuits in Spatially Patterned Neural Responses with Graph Neural Networks
- Self supervised learning for in vivo localization of microelectrode arrays using raw local field potential
- Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation
- Self-Supervised Learning of Graph Representations for Network Intrusion Detection
- Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
- Self-Supervised Selective-Guided Diffusion Model for Old-Photo Face Restoration
- Selftok-Zero: Reinforcement Learning for Visual Generation via Discrete and Autoregressive Visual Tokens
- Self-Training with Dynamic Weighting for Robust Gradual Domain Adaptation
- Self-Verification Provably Prevents Model Collapse in Recursive Synthetic Training
- Self-Verifying Reflection Helps Transformers with CoT Reasoning
- Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology
- Semantic-guided Diverse Decoding for Large Language Model
- Semantic-KG: Using Knowledge Graphs to Construct Benchmarks for Measuring Semantic Similarity
- Semantic Representation Attack against Aligned Large Language Models
- Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
- SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
- Semi-infinite Nonconvex Constrained Min-Max Optimization
- Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning
- Semi-supervised Graph Anomaly Detection via Robust Homophily Learning
- Semi-Supervised Regression with Heteroscedastic Pseudo-Labels
- Semi-supervised Vertex Hunting, with Applications in Network and Text Analysis
- SEMPO: Lightweight Foundation Models for Time Series Forecasting
- SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors
- Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
- SensorLM: Learning the Language of Wearable Sensors
- SentinelKilnDB: A Large-Scale Dataset and Benchmark for OBB Brick Kiln Detection in South Asia Using Satellite Imagery
- Separating the 'what' and 'how' of compositional computation to enable reuse and continual learning
- seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
- Sequence Modeling with Spectral Mean Flows
- Sequential Attention-based Sampling for Histopathological Analysis
- Sequentially Auditing Differential Privacy
- Sequential Monte Carlo for Policy Optimization in Continuous POMDPs
- Sequential Multi-Agent Dynamic Algorithm Configuration
- SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
- Set-LLM: A Permutation-Invariant LLM
- Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization
- Setting $\varepsilon$ is not the Issue in Differential Privacy
- sfsdf (BTF)
- SGAR: Structural Generative Augmentation for 3D Human Motion Retrieval
- SGCD: Stain-Guided CycleDiffusion for Unsupervised Domain Adaptation of Histopathology Image Classification
- SGN: Shifted Window-Based Hierarchical Variable Grouping for Multivariate Time Series Classification
- S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models
- Shallow Diffuse: Robust and Invisible Watermarking through Low-Dim Subspaces in Diffusion Models
- Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
- ShapeCraft: LLM Agents for Structured, Textured and Interactive 3D Modeling
- ShapeEmbed: a self-supervised learning framework for 2D contour quantification
- Shape-Informed Clustering of Multi-Dimensional Functional Data via Deep Functional Autoencoders
- Shape it Up! Restoring LLM Safety during Finetuning
- ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding
- ShapeX: Shapelet-Driven Post Hoc Explanations for Time Series Classification Models
- Shaping Sequence Attractor Schema in Recurrent Neural Networks
- Shapley-Based Data Valuation for Weighted $k$-Nearest Neighbors
- Shapley-Coop: Credit Assignment for Emergent Cooperation in Self-Interested LLM Agents
- SHAP Meets Tensor Networks: Provably Tractable Explanations with Parallelism
- SHAP values via sparse Fourier representation
- SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries
- Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
- Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
- Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs
- Sharp Gaussian approximations for Decentralized Federated Learning
- Sharp Matrix Empirical Bernstein Inequalities
- SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes
- Sheetpedia: A 300K-Spreadsheet Corpus for Spreadsheet Intelligence and LLM Fine-Tuning
- Sherlock: Self-Correcting Reasoning in Vision-Language Models
- SHF: Symmetrical Hierarchical Forest with Pretrained Vision Transformer Encoder for High-Resolution Medical Segmentation
- SHGR: A Generalized Maximal Correlation Coefficient
- Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning
- ShiQ: Bringing back Bellman to LLMs
- ShoeFit: A New Dataset and Dual-image-stream DiT Framework for Virtual Footwear Try-On
- Shortcut Features as Top Eigenfunctions of NTK: A Linear Neural Network Case and More
- Shortcuts and Identifiability in Concept-based Models from a Neuro-Symbolic Lens
- Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch
- ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
- Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
- ShortListing Model: A Streamlined Simplex Diffusion for Discrete Variable Generation
- ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
- Show-o2: Improved Native Unified Multimodal Models
- Siegel Neural Networks
- SIFusion: A Unified Fusion Framework for Multi-granularity Arctic Sea Ice Forecasting
- SIGMA: Refining Large Language Model Reasoning via Sibling-Guided Monte Carlo Augmentation
- Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
- SignFlow Bipartite Subgraph Network For Large-Scale Graph Link Sign Prediction
- Sign-In to the Lottery: Reparameterizing Sparse Training
- Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
- SilentStriker: Toward Stealthy Bit-Flip Attacks on Large Language Models
- Sim-LLM: Optimizing LLM Inference at the Edge through Inter-Task KV Reuse
- Simple and Effective Specialized Representations for Fair Classifiers
- Simple and Efficient Heterogeneous Temporal Graph Neural Network
- Simple and Optimal Sublinear Algorithms for Mean Estimation
- Simple Distillation for One-Step Diffusion Models
- SimpleStrat: Diversifying Language Model Generation with Stratification
- Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
- SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation
- Simulating Society Requires Simulating Thought
- Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models
- Simulation-Based Inference for Adaptive Experiments
- SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation
- Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression
- Simultaneous Statistical Inference for Off-Policy Evaluation in Reinforcement Learning
- Simultaneous Swap Regret Minimization via KL-Calibration
- SimWorld: An Open-ended Simulator for Agents in Physical and Social Worlds
- Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis
- Single-pass Adaptive Image Tokenization for Minimum Program Search
- Single-Step Operator Learning for Conditioned Time-Series Diffusion Models
- Single-Teacher View Augmentation: Boosting Knowledge Distillation via Angular Diversity
- SingRef6D: Monocular Novel Object Pose Estimation with a Single RGB Reference
- SING: SDE Inference via Natural Gradients
- Sinusoidal Initialization, Time for a New Start
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
- Situat3DChange: Situated 3D Change Understanding Dataset for Multimodal Large Language Model
- SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment
- Size-adaptive Hypothesis Testing for Fairness
- Sketch-Augmented Features Improve Learning Long-Range Dependencies in Graph Neural Networks
- Sketched Adaptive Distributed Deep Learning: A Sharp Convergence Analysis
- Sketched Gaussian Mechanism for Private Federated Learning
- SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches
- Skill-Driven Neurosymbolic State Abstractions
- Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling
- SkyLadder: Better and Faster Pretraining via Context Window Scheduling
- Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
- Slow Transition to Low-Dimensional Chaos in Heavy-Tailed Recurrent Neural Networks
- Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
- SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference
- Small Resamples, Sharp Guarantees: Convergence Rates for Resampled Studentized Quantile Estimators
- Small Singular Values Matter: A Random Matrix Analysis of Transformer Models
- SmartCache: Context-aware Semantic Cache for Efficient Multi-turn LLM Inference
- SMARTraj$^2$: A Stable Multi-City Adaptive Method for Multi-View Spatio-Temporal Trajectory Representation Learning
- Smart Surrogate Losses for Contextual Stochastic Linear Optimization with Robust Constraints
- SMMILE: An expert-driven benchmark for multimodal medical in-context learning
- SmokeViz: A Large-Scale Satellite Dataset for Wildfire Smoke Detection and Segmentation
- Smooth and Flexible Camera Movement Synthesis via Temporal Masked Generative Modeling
- Smoothed Agnostic Learning of Halfspaces over the Hypercube
- Smoothed Differentiation Efficiently Mitigates Shattered Gradients in Explanations
- Smooth Quadratic Prediction Markets
- Smooth Regularization for Efficient Video Recognition
- Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Associations
- S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
- SMRS: advocating a unified reporting standard for surrogate models in the artificial intelligence era
- SNAP: Low-Latency Test-Time Adaptation with Sparse Updates
- SnapMoGen: Human Motion Generation from Expressive Texts
- SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation
- Social World Model-Augmented Mechanism Design Policy Learning
- SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
- Soft-consensual Federated Learning for Data Heterogeneity via Multiple Paths
- Soft Task-Aware Routing of Experts for Equivariant Representation Learning
- Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
- SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry
- Solver-Free Decision-Focused Learning for Linear Optimization Problems
- Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
- SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search
- Solving and Learning Partial Differential Equations with Variational Q-Exponential Processes
- Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
- Solving Discrete (Semi) Unbalanced Optimal Transport with Equivalent Transformation Mechanism and KKT-Multiplier Regularization
- Solving Inequality Proofs with Large Language Models
- Solving Inverse Problems with FLAIR
- Solving Neural Min-Max Games: The Role of Architecture, Initialization & Dynamics
- Solving Partial Differential Equations via Radon Neural Operator
- Solving the Asymmetric Traveling Salesman Problem via Trace-Guided Cost Augmentation
- SOMBRL: Scalable and Optimistic Model-Based RL
- Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness
- SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
- SONAR: Long-Range Graph Propagation Through Information Waves
- SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement
- SonoGym: High Performance Simulation for Challenging Surgical Tasks with Robotic Ultrasound
- SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
- SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration
- SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
- Sound Logical Explanations for Mean Aggregation Graph Neural Networks
- Space Group Equivariant Crystal Diffusion
- SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models
- SpaceServe: Spatial Multiplexing of Complementary Encoders and Decoders for Multimodal LLMs
- SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks
- Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
- SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score
- Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention
- Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
- Sparse Diffusion Autoencoder for Test-time Adapting Prediction of Complex Systems
- SparseDiT: Token Sparsification for Efficient Diffusion Transformer
- Sparse Gaussian Processes: Structured Approximations and Power-EP Revisited
- Sparse Image Synthesis via Joint Latent and RoI Flow
- Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations
- Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
- SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
- Sparse Optimistic Information Directed Sampling
- Sparse Polyak: an adaptive step size rule for high-dimensional M-estimation
- Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
- Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
- SPARTAN: A Sparse Transformer World Model Attending to What Matters
- Spatial-Aware Decision-Making with Ring Attractors in Reinforcement Learning Systems
- SpatialLM: Training Large Language Models for Structured Indoor Modeling
- Spatially-aware Weights Tokenization for NeRF-Language Models
- Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
- SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning
- Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
- Spatiotemporal Consensus with Scene Prior for Unsupervised Domain Adaptive Person Search
- SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
- SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
- SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs
- SpecEM: Training-Free LLM Ensembling via Iterative Drafting, Verification, and Online Feedback
- SpecMAS: A Multi-Agent System for Self-Verifying System Generation via Formal Model Checking
- SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
- SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
- Spectral Analysis of Diffusion Models with Application to Schedule Design
- Spectral Analysis of Representational Similarity with Limited Neurons
- Spectral Compressive Imaging via Chromaticity-Intensity Decomposition
- Spectral Conditioning of Attention Improves Transformer Performance
- Spectral Convolutional Conditional Neural Process
- SpectraLDS: Provable Distillation for Linear Dynamical Systems
- Spectral Estimation with Free Decompression
- Spectral Graph Coarsening Using Inner Product Preservation and the Grassmann Manifold
- Spectral Graph Neural Networks are Incomplete on Graphs with a Simple Spectrum
- Spectral Learning for Infinite-Horizon Average-Reward POMDPs
- Spectral Perturbation Bounds for Low-Rank Approximation with Applications to Privacy
- Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding
- Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
- Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
- SpEx: A Spectral Approach to Explainable Clustering
- SPFL: Sequential updates with Parallel aggregation for Enhanced Federated Learning under Category and Domain Shifts
- SPICED: A Synaptic Homeostasis-Inspired Framework for Unsupervised Continual EEG Decoding
- SpiderSolver: A Geometry-Aware Transformer for Solving PDEs on Complex Geometries
- SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer
- Spike4DGS: Towards High-Speed Dynamic Scene Rendering with 4D Gaussian Splatting via a Spike Camera Array
- Spike-RetinexFormer: Rethinking Low-light Image Enhancement with Spiking Neural Networks
- Spike-timing-dependent Hebbian learning as noisy gradient descent
- Spiking Meets Attention: Efficient Remote Sensing Image Super-Resolution with Attention Spiking Neural Networks
- Spiking Neural Networks Need High-Frequency Information
- SpikingVTG: A Spiking Detection Transformer for Video Temporal Grounding
- Spik-NeRF: Spiking Neural Networks for Neural Radiance Fields
- SPINT: Spatial Permutation-Invariant Neural Transformer for Consistent Intracortical Motor Decoding
- Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
- SplashNet: Split-and-Share Encoders for Accurate and Efficient Typing with Surface Electromyography
- Split conformal classification with unsupervised calibration
- SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing
- Split Gibbs Discrete Diffusion Posterior Sampling
- SPMDM: Enhancing Masked Diffusion Models through Simplifying Sampling Path
- Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval
- SPOT: Scalable Policy Optimization with Trees for Markov Decision Processes
- Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
- SPOT-Trip: Dual-Preference Driven Out-of-Town Trip Recommendation
- SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models
- SPRO: Improving Image Generation via Self-Play
- Spurious-Aware Prototype Refinement for Reliable Out-of-Distribution Detection
- SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
- SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
- SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
- Squared families are useful conjugate priors
- SRA-CL: Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation
- SRHand: Super-Resolving Hand Images and 3D Shapes via View/Pose-aware Neural Image Representations and Explicit Meshes
- SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
- SRSR: Enhancing Semantic Accuracy in Real-World Image Super-Resolution with Spatially Re-Focused Text-Conditioning
- SSIMBaD: Sigma Scaling with SSIM-Guided Balanced Diffusion for AnimeFace Colorization
- SSRB: Direct Natural Language Querying to Massive Heterogeneous Semi-Structured Data
- SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
- SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs
- ST$^2$360D: Spatial-to-Temporal Consistency for Training-free 360 Monocular Depth Estimation
- Stability and Oracle Inequalities for Optimal Transport Maps between General Distributions
- Stability and Sharper Risk Bounds with Convergence Rate $\tilde{O}(1/n^2)$
- Stabilizing LTI Systems under Partial Observability: Sample Complexity and Fundamental Limits
- Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation
- Stable Coresets via Posterior Sampling: Aligning Induced and Full Loss Landscapes
- Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
- StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models
- Stable Matching with Ties: Approximation Ratios and Learning
- Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon
- Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation
- Stable Port-Hamiltonian Neural Networks
- Stab-SGD: Noise-Adaptivity in Smooth Optimization with Stability Ratios
- STACI: Spatio-Temporal Aleatoric Conformal Inference
- Stackelberg Learning with Outcome-based Payment
- Stackelberg Self-Annotation: A Robust Approach to Data-Efficient LLM Alignment
- Staggered Environment Resets Improve Massively Parallel On-Policy Reinforcement Learning
- STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
- STAR: A Benchmark for Astronomical Star Fields Super-Resolution
- STAR-Bets: Sequential TArget-Recalculating Bets for Tighter Confidence Intervals
- STARC-9: A Large-scale Dataset for Multi-Class Tissue Classification for CRC Histopathology
- STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
- STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
- STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data
- STAR: Spatial-Temporal Tracklet Matching for Multi-Object Tracking
- StarTrail: Concentric Ring Sequence Parallelism for Efficient Near-Infinite-Context Transformer Model Training
- State-Covering Trajectory Stitching for Diffusion Planners
- State Entropy Regularization for Robust Reinforcement Learning
- State Size Independent Statistical Error Bound for Discrete Diffusion Models
- StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
- State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
- Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces
- Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case
- Statistical Analysis of an Adversarial Bayesian Weak Supervision Method
- Statistical Analysis of the Sinkhorn Iterations for Two-Sample Schrödinger Bridge Estimation
- Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
- Statistical Inference for Decentralized Federated Learning
- Statistical Inference for Gradient Boosting Regression
- Statistical inference for Linear Stochastic Approximation with Markovian Noise
- Statistical Inference under Performativity
- Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health
- Statistical Parity with Exponential Weights
- Statistics Caching Test-Time Adaptation for Vision-Language Models
- STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model
- Stealthy Yet Effective: Distribution-Preserving Backdoor Attacks on Graph Classification
- SteerConf: Steering LLMs for Confidence Elicitation
- Steering Generative Models with Experimental Data for Protein Fitness Optimization
- Steering Information Utility in Key-Value Memory for Language Model Post-Training
- Steering When Necessary: Flexible Steering Large Language Models with Backtracking
- STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models
- StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models
- StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold
- STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
- Stepsize anything: A unified learning rate schedule for budgeted-iteration training
- Stitch and Tell: A Structured Data Augmentation Method for Spatial Understanding
- STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
- STNet: Spectral Transformation Network for Solving Operator Eigenvalue Problem
- Stochastically Dominant Peer Prediction
- Stochastic-Constrained Stochastic Optimization with Markovian Data
- Stochastic Forward-Forward Learning through Representational Dimensionality Compression
- Stochastic Gradients under Nuisances
- Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization
- Stochastic Optimization in Semi-Discrete Optimal Transport: Convergence Analysis and Minimax Rate
- Stochastic Principal-Agent Problems: Computing and Learning Optimal History-Dependent Policies
- Stochastic Process Learning via Operator Flow Matching
- Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
- Stochastic Shortest Path with Sparse Adversarial Costs
- Stop DDoS Attacking the Research Community with AI-Generated Survey Papers
- Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning
- Stop the Nonconsensual Use of Nude Images in Research
- Storyboard-guided Alignment for Fine-grained Video Action Recognition
- Straight-Line Diffusion Model for Efficient 3D Molecular Generation
- STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization
- Strassen Attention, Split VC Dimension and Compositionality in Transformers
- Strategic Classification with Non-Linear Classifiers
- Strategic Cost Selection in Participatory Budgeting
- Strategic Costs of Perceived Bias in Fair Selection
- Strategic Hypothesis Testing
- Strategyproof Reinforcement Learning from Human Feedback
- Stratify or Die: Rethinking Data Splits in Image Segmentation
- STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds
- StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
- StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant
- StreamForest: Efficient Online Video Understanding with Persistent Event Memory
- Streaming Attention Approximation via Discrepancy Theory
- Streaming Audio Generation from Discrete Tokens via Streaming Flow Matching
- Streaming Federated Learning with Markovian Data
- Streaming Stochastic Submodular Maximization with On-Demand User Requests
- STree: Speculative Tree Decoding for Hybrid State Space Models
- STRIDER: Navigation via Instruction-Aligned Structural Decision Space Optimization
- Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
- Struct-Bench: A Benchmark for Differentially Private Structured Text Generation
- Structural Causal Bandits under Markov Equivalence
- Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
- Structural Information-based Hierarchical Diffusion for Offline Reinforcement Learning
- Structure-Aware Cooperative Ensemble Evolutionary Optimization on Combinatorial Problems with Multimodal Large Language Models
- Structure-Aware Fusion with Progressive Injection for Multimodal Molecular Representation Learning
- Structure-Aware Spectral Sparsification via Uniform Edge Sampling
- Structured Initialization for Vision Transformers
- Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
- Structured Probabilistic Inference and Generative Modeling
- Structured Reinforcement Learning for Combinatorial Decision-Making
- Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
- Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation
- Structured Temporal Causality for Interpretable Multivariate Time Series Anomaly Detection
- Structure Matters: Dynamic Policy Gradient
- StruDiCO: Structured Denoising Diffusion with Gradient-free Inference-stage Boosting for Memory and Time Efficient Combinatorial Optimization
- STSBench: A Large-Scale Dataset for Modeling Neuronal Activity in the Dorsal Stream of Primate Visual Cortex
- STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving
- Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles
- StyleGuard: Preventing Text-to-Image-Model-based Style Mimicry Attacks by Style Perturbations
- Subgraph Federated Learning via Spectral Methods
- Subsampled Ensemble Can Improve Generalization Tail Exponentially
- Subspace Networks: Scaling Decentralized Training with Communication-Efficient Model Parallelism
- SubTrack++: Gradient Subspace Tracking for Scalable LLM Training
- Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
- Sum Estimation under Personalized Local Differential Privacy
- SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
- SuperCLIP: CLIP with Simple Classification Supervision
- SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
- Superposition Yields Robust Neural Scaling
- Support Vector Generation: Kernelizing Large Language Models for Efficient Zero-Shot NLP
- SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models
- Surface-Aware Feed-Forward Quadratic Gaussian for Frame Interpolation with Large Motion
- SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction
- Surprise3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes
- SutureBot: A Precision Framework & Benchmark For Autonomous End-to-End Suturing
- SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
- SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem
- SWE-bench Goes Live!
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
- SWE-smith: Scaling Data for Software Engineering Agents
- SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications
- Switchable Token-Specific Codebook Quantization For Face Image Compression
- SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
- SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
- SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning
- Symmetry and Geometry in Neural Representations
- Symmetry-Preserving Conformer Ensemble Networks for Molecular Representation Learning
- SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly
- SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning
- SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning
- SyncHuman: Synchronizing 2D and 3D Generative Models for Single-view Human Reconstruction
- SynCL: A Synergistic Training Strategy with Instance-Aware Contrastive Learning for End-to-End Multi-Camera 3D Tracking
- Synergistic Tensor and Pipeline Parallelism
- Synergy Between the Strong and the Weak: Spiking Neural Networks are Inherently Self-Distillers
- Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning
- SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
- Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries
- Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency
- Synthesizing Photorealistic and Dynamic Urban Environments for Multimodal Robot Navigation and Collaboration
- Synthetic-powered predictive inference
- Synthetic Series-Symbol Data Generation for Time Series Foundation Models
- SynTSBench: Rethinking Temporal Pattern Learning in Deep Learning Models for Time Series
- System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
- Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
- System-Embedded Diffusion Bridge Models
- System Prompt Optimization with Meta-Learning
- T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning
- T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
- T2SMark: Balancing Robustness and Diversity in Noise-as-Watermark for Diffusion Models
- T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
- TabArena: A Living Benchmark for Machine Learning on Tabular Data
- TabDPT: Scaling Tabular Foundation Models on Real Data
- Table2LaTeX-RL: High-Fidelity LaTeX Code Generation from Table Images via Reinforced Multimodal Language Models
- Table as a Modality for Large Language Models
- TabSTAR: A Tabular Foundation Model for Tabular Data with Text Fields
- Tabula: A Tabular Self-Supervised Foundation Model for Single-Cell Transcriptomics
- Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation
- Tackling Biased Evaluators in Dueling Bandits
- Tackling Climate Change with Machine Learning
- Tackling Continual Offline RL through Selective Weights Activation on Aligned Spaces
- Tackling Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation
- TADA: Improved Diffusion Sampling with Training-free Augmented DynAmics
- TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
- TAI3: Testing Agent Integrity in Interpreting User Intent
- Tail-Optimized Caching for LLM Inference
- TaiwanVQA: Benchmarking and Enhancing Cultural Understanding in Vision-Language Models
- Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
- TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
- Taming Adversarial Constraints in CMDPs
- Taming generative video models for zero-shot optical flow extraction
- Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining
- TAMI: Taming Heterogeneity in Temporal Interactions for Temporal Graph Link Prediction
- TANDEM: Bi-Level Data Mixture Optimization with Twin Networks
- TAPAS: Datasets for Learning the Learning with Errors Problem
- Tapered Off-Policy REINFORCE - Stable and efficient reinforcement learning for large language models
- TAPIP3D: Tracking Any Point in Persistent 3D Geometry
- TAPVid-360: Tracking Any Point in 360 from Narrow Field of View Video
- TARFVAE: Efficient One-Step Generative Time Series Forecasting via TARFLOW based VAE
- Targeted Maximum Likelihood Learning: An Optimization Perspective
- Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments
- Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain
- Task-Specific Data Selection for Instruction Tuning via Monosemantic Neuronal Activations
- Taught Well Learned Ill: Towards Distillation-conditional Backdoor Attack
- Taxonomy of reduction matrices for Graph Coarsening
- TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer
- TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine
- Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
- Teaching Language Models to Reason with Tools
- Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error
- TechFest Dayton Foundation (BTF)
- Technical Debt in In-Context Learning: Diminishing Efficiency in Long Context
- Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes
- Template-Guided 3D Molecular Pose Generation via Flow Matching and Differentiable Optimization
- Temporal Chain of Thought: Long-Video Understanding by Thinking in Frames
- Temporal-Difference Variational Continual Learning
- Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models
- Temporal Logic-Based Multi-Vehicle Backdoor Attacks against Offline RL Agents in End-to-end Autonomous Driving
- Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
- Temporal Smoothness-Aware Rate-Distortion Optimized 4D Gaussian Splatting
- TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
- TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
- Tensor Decomposition Networks for Accelerating Machine Learning Force Field Computations
- Tensor-Parallelism with Partially Synchronized Activations
- Tensor Product Attention Is All You Need
- TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search
- Test3R: Learning to Reconstruct 3D at Test Time
- Testing Stationarity and Change Point Detection in Reinforcement Learning
- Test-Time Adaptation by Causal Trimming
- Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
- Test-Time Adaptive Object Detection with Foundation Model
- Test Time Scaling for Neural Processes
- Test-Time Scaling of Diffusion Models via Noise Trajectory Search
- Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
- Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders
- Text-to-Code Generation for Modular Building Layouts in Building Information Modeling
- Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision
- Text to Sketch Generation with Multi-Styles
- TF-MAS: Training-free Mamba2 Architecture Search
- TGA: True-to-Geometry Avatar Dynamic Reconstruction
- THD-BAR: Topology Hierarchical Derived Brain Autoregressive Modeling for EEG Generic Representations
- The $\varphi$ Curve: The Shape of Generalization through the Lens of Norm-based Capacity Control
- The Adaptive Complexity of Minimizing Relative Fisher Information
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
- The Art of (Artificial) Reasoning
- The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
- The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
- The Best Instruction-Tuning Data are Those That Fit
- The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification Perspective
- The Boundaries of Fair AI in Medical Image Prognosis: A Causal Perspective
- The Burden of Interactive Alignment with Inconsistent Preferences
- The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning
- The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
- The Complexity of Correlated Equilibria in Generalized Games
- The Complexity of Finding Local Optima in Contrastive Learning
- The Complexity of Symmetric Equilibria in Min-Max Optimization and Team Zero-Sum Games
- The Computational Advantage of Depth in Learning High-Dimensional Hierarchical Targets
- The Computational Complexity of Counting Linear Regions in ReLU Neural Networks
- The Cost of Compression: Tight Quadratic Black-Box Attacks on Sketches for $\ell_2$ Norm Estimation
- The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
- The Curse of Depth in Large Language Models
- The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
- The Dual Nature of Plasticity Loss in Deep Continual Learning: Dissection and Mitigation
- The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
- The Emergence of Abstract Thought in Large Language Models Beyond Any Language
- The emergence of sparse attention: impact of data distribution and benefits of repetition
- The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models
- The First Workshop on Efficient Reasoning
- The Flood Complex: Large-Scale Persistent Homology on Millions of Points
- The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
- The Fragile Truth of Saliency: Improving LLM Input Attribution via Attention Bias Optimization
- The Future Unmarked: Watermark Removal in AI-Generated Images via Next-Frame Prediction
- The Gaussian Mixing Mechanism: Renyi Differential Privacy via Gaussian Sketches
- The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
- The Good, the Bad and the Ugly: Meta-Analysis of Watermarks, Transferable Attacks and Adversarial Defenses
- The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
- The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness
- The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
- The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
- The Impact of Coreset Selection on Spurious Correlations and Group Robustness
- The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels
- The Indra Representation Hypothesis
- The Leaderboard Illusion
- The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement
- The Logical Expressiveness of Temporal GNNs via Two-Dimensional Product Logics
- The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
- The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?
- The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
- The Narrow Gate: Localized Image-Text Communication in Native Multimodal Models
- The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
- The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks
- The Oak Architecture: A Vision of SuperIntelligence from Experience
- The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
- The Omni-Expert: A Computationally Efficient Approach to Achieve a Mixture of Experts in a Single Expert Model
- Theoretical Benefit and Limitation of Diffusion Language Model
- Theoretical Guarantees for the Retention of Strict Nash Equilibria by Coevolutionary Algorithms
- Theoretical Insights into In-context Learning with Unlabeled Data
- Theoretical Insights on Training Instability in Deep Learning
- Theoretical Investigation of Adafactor for Non-Convex Smooth Optimization
- Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach
- Theory-Driven Label-Specific Representation for Incomplete Multi-View Multi-Label Learning
- The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
- The Parameterized Complexity of Computing the VC-Dimension
- The Persistence of Neural Collapse Despite Low-Rank Bias
- The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination
- The Price of Opportunity Fairness in Matroid Allocation Problems
- The Price of Sparsity: Sufficient Conditions for Sparse Recovery using Sparse and Sparsified Measurements
- The Primacy of Magnitude in Low-Rank Adaptation
- The Promise of RL for Autoregressive Image Editing
- The quest for the GRAph Level autoEncoder (GRALE)
- The Quest for Universal Master Key Filters in DS-CNNs
- The Quotient Bayesian Learning Rule
- The Rashomon Set Has It All: Analyzing Trustworthiness of Trees under Multiplicity
- The Rich and the Simple: On the Implicit Bias of Adam and SGD
- The Right to Red-Team: Adversarial AI Literacy as a Civic Imperative in K-12 Education
- The Rise of Parameter Specialization for Knowledge Storage in Large Language Models
- ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
- The Science of Benchmarking: What’s Measured, What’s Missed, and What’s Next
- The Second Workshop on GenAI for Health: Potential, Trust, and Policy Compliance
- The Structural Complexity of Matrix-Vector Multiplication
- The Structure of Relation Decoding Linear Operators in Large Language Models
- The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
- The Temporal Graph of Bitcoin Transactions
- The third pillar of causal analysis? A measurement perspective on causal representations
- The Underappreciated Power of Vision Models for Graph Structural Understanding
- The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
- The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples
- The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense
- The World Is Bigger: A Computationally-Embedded Perspective on the Big World Hypothesis
- ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
- Think before Recommendation: Autonomous Reasoning-enhanced Recommender
- ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
- Thinker: Learning to Think Fast and Slow
- Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
- Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction
- Thinkless: LLM Learns When to Think
- Think Only When You Need with Large Hybrid-Reasoning Models
- Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
- Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
- Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
- ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing
- This Time is Different: An Observability Perspective on Time Series Foundation Models
- Thompson Sampling for Multi-Objective Linear Contextual Bandit
- Thompson Sampling in Function Spaces via Neural Operators
- Thought Communication in Multiagent Collaboration
- Thoughts Are All Over the Place: On the Underthinking of Long Reasoning Models
- Thousand Voices of Trauma: A Large-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
- Thresholds for sensitive optimality and Blackwell optimality in stochastic games
- Through the Lens: Benchmarking Deepfake Detectors Against Moiré-Induced Distortions
- Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
- Thumb on the Scale: Optimal Loss Weighting in Last Layer Retraining
- THUNDER: Tile-level Histopathology image UNDERstanding benchmark
- TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising
- Tight analyses of first-order methods with error feedback
- Tight Asymptotics of Extreme Order Statistics
- Tight Bounds for Answering Adaptively Chosen Concentrated Queries
- Tight Bounds for Maximum Weight Matroid Independent Set and Matching in the Zero Communication Model
- Tight Bounds on the Distortion of Randomized and Deterministic Distributed Voting
- Tightening Regret Lower and Upper Bounds in Restless Rising Bandits
- Tighter CMI-Based Generalization Bounds via Stochastic Projection and Quantization
- Tight Generalization Bounds for Large-Margin Halfspaces
- Tight High-Probability Bounds for Nonconvex Heavy-Tailed Scenario under Weaker Assumptions
- Tight Lower Bounds and Improved Convergence in Performative Prediction
- Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
- TimE: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
- TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting
- Time-Embedded Algorithm Unrolling for Computational MRI
- Time-Evolving Dynamical System for Learning Latent Representations of Mouse Visual Neural Activity
- Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
- Timely Clinical Diagnosis through Active Test Selection
- Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding
- Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
- TimePerceiver: An Encoder-Decoder Framework for Generalized Time-Series Forecasting
- Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
- Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
- Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach
- Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking
- Time-uniform and Asymptotic Confidence Sequence of Quantile under Local Differential Privacy
- TimeWak: Temporal Chained-Hashing Watermark for Time Series Data
- TimeXL: Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop
- TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning
- TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE
- Titans: Learning to Memorize at Test Time
- T-norm Selection for Object Detection in Autonomous Driving with Logical Constraints
- To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable RL
- ToF-IP: Time-of-Flight Enhanced Sparse Inertial Poser for Real-time Human Motion Capture
- Token Bottleneck: One Token to Remember Dynamics
- Token Embeddings Violate the Manifold Hypothesis
- Token-Level Self-Play with Importance-Aware Guidance for Large Language Models
- Token Perturbation Guidance for Diffusion Models
- TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs
- TokenSwap: A Lightweight Method to Disrupt Memorized Sequences in LLMs
- TokMan: Tokenize Manhattan Mask Optimization for Inverse Lithography
- TOMCAT: Test-time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
- Too Late to Recall: Explaining the Two-Hop Problem in Multimodal Knowledge Retrieval
- Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task
- ToolRL: Reward is All Tool Learning Needs
- TopER: Topological Embeddings in Graph Representation Learning
- Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation
- Topology-Aware Conformal Prediction for Stream Networks
- Topology-aware Graph Diffusion Model with Persistent Homology
- Topology-Aware Learning of Tubular Manifolds via SE(3)-Equivariant Network on Ball B-Spline Curve
- Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
- TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving
- Torch-Uncertainty: Deep Learning Uncertainty Quantification
- Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
- To Think or Not To Think: A Study of Thinking in Rule-Based Visual Reinforcement Fine-Tuning
- Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
- Toward Artificial Palpation: Representation Learning of Touch on Soft Bodies
- Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction
- Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation
- Toward Efficient Inference Attacks: Shadow Model Sharing via Mixture-of-Experts
- Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs
- Toward Human Deictic Gesture Target Estimation
- Toward Interpretable Evaluation Measures for Time Series Segmentation
- Toward Real-world Text Image Forgery Localization: Structured and Interpretable Data Synthesis
- Toward Relative Positional Encoding in Spiking Transformers
- Towards 3D Objectness Learning in an Open World
- Towards Accurate Time Series Forecasting via Implicit Decoding
- Towards a General Attention Framework on Gyrovector Spaces for Matrix Manifolds
- Towards A Generalist Code Embedding Model Based On Massive Data Synthesis
- Towards a Geometric Understanding of Tensor Learning via the t-Product
- Towards a Golden Classifier-Free Guidance Path via Foresight Fixed Point Iterations
- Towards a Pairwise Ranking Model with Orderliness and Monotonicity for Label Enhancement
- Towards A Translative Model of Sperm Whale Vocalization
- Towards Automated Petrography
- Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis
- Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
- Towards Building Model/Prompt-Transferable Attackers against Large Vision-Language Models
- Towards Comprehensive Scene Understanding: Integrating First and Third-Person Views for LVLMs
- Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
- Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery
- Towards Effective Federated Graph Foundation Model via Mitigating Knowledge Entanglement
- Towards Evaluating Proactive Risk Awareness of Multimodal Language Models
- Towards foundational LiDAR world models with efficient latent flow matching
- Towards Fully FP8 GEMM LLM Training at Scale
- Towards General Continuous Memory for Vision-Language Models
- Towards Generalizable 3D Human Pose Estimation via Ensembles on Flat Loss Landscapes
- Towards Generalizable Detector for Generated Image
- Towards Generalizable Multi-Policy Optimization with Self-Evolution for Job Scheduling
- Towards Generalizable Retina Vessel Segmentation with Deformable Graph Priors
- Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge
- Towards Graph Foundation Models: Training on Knowledge Graphs Enables Transferability to General Graphs
- Towards Identifiability of Hierarchical Temporal Causal Representation Learning
- Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era
- Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
- Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
- Towards Irreversible Attack: Fooling Scene Text Recognition via Multi-Population Coevolution Search
- Towards Large-Scale In-Context Reinforcement Learning by Meta-Training in Randomized Worlds
- Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration
- Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs
- Towards Multi-Table Learning: A Novel Paradigm for Complementarity Quantification and Integration
- Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
- Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study
- Towards precision protein-ligand affinity prediction benchmark: A Complete and Modification-Aware DAVIS Dataset
- Towards Predicting Any Human Trajectory In Context
- Towards Pre-trained Graph Condensation via Optimal Transport
- Towards Principled Unsupervised Multi-Agent Reinforcement Learning
- Towards Prospective Medical Image Reconstruction via Knowledge-Informed Dynamic Optimal Transport
- Towards Provable Emergence of In-Context Reinforcement Learning
- Towards Realistic Earth-Observation Constellation Scheduling: Benchmark and Methodology
- Towards Reliable and Holistic Visual In-Context Learning Prompt Selection
- Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning
- Towards Reliable Identification of Diffusion-based Image Manipulations
- Towards Reliable LLM-based Robots Planning via Combined Uncertainty Estimation
- Towards Resilient Safety-driven Unlearning for Diffusion Models against Downstream Fine-tuning
- Towards Robust Parameter-Efficient Fine-Tuning for Federated Learning
- Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
- Towards Robust Uncertainty Calibration for Composed Image Retrieval
- Towards Robust Zero-Shot Reinforcement Learning
- Towards Self-Refinement of Vision-Language Models with Triangular Consistency
- Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts
- Towards Straggler-Resilient Split Federated Learning: An Unbalanced Update Approach
- Towards Syn-to-Real IQA: A Novel Perspective on Reshaping Synthetic Data Distributions
- Towards the Resistance of Neural Network Fingerprinting to Fine-tuning
- Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
- Towards Understanding Camera Motions in Any Video
- Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
- Towards Understanding the Mechanisms of Classifier-Free Guidance
- Towards Understanding Transformers in Learning Random Walks
- Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling
- Towards Unified Multimodal Interleaved Generation via Group Relative Policy Optimization
- Towards Unsupervised Domain Bridging via Image Degradation in Semantic Segmentation
- Towards Unsupervised Open-Set Graph Domain Adaptation via Dual Reprogramming
- Towards Unsupervised Training of Matching-based Graph Edit Distance Solver via Preference-aware GAN
- Towards Visualization-of-Thought Jailbreak Attack against Large Visual Language Models
- ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
- TP-MDDN: Task-Preferenced Multi-Demand-Driven Navigation with Autonomous Decision-Making
- TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding
- TRACE: Contrastive learning for multi-trial time series data in neuroscience
- TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval
- Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning
- Tracing the Representation Geometry of Language Models from Pretraining to Post-training
- Tracing the Roots: Leveraging Temporal Dynamics in Diffusion Trajectories for Origin Attribution
- Track3R: Joint Point Map and Trajectory Prior for Spatiotemporal 3D Understanding
- Tracking and Understanding Object Transformations
- TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
- Track, Inpaint, Resplat: Subject-driven 3D and 4D Generation with Progressive Texture Infilling
- Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities
- TractoTransformer: Diffusion MRI Streamline Tractography using CNN and Transformer Networks
- Tradeoffs between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning
- TraffiDent: A Dataset for Understanding the Interplay Between Traffic Dynamics and Incidents
- Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
- Training a Scientific Reasoning Model for Chemistry
- Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
- Training-Free Constrained Generation With Stable Diffusion Models
- Training-free Detection of AI-generated images via Cropping Robustness
- Training-Free Efficient Video Generation via Dynamic Token Carving
- Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models
- Training-free Online Video Step Grounding
- Training-Free Safe Denoisers for Safe Use of Diffusion Models
- Training-Free Safe Text Embedding Guidance for Text-to-Image Diffusion Models
- Training-Free Test-Time Adaptation via Shape and Style Guidance for Vision-Language Models
- Training Language Models to Generate Quality Code with Program Analysis Feedback
- Training Language Models to Reason Efficiently
- Training Robust Graph Neural Networks by Modeling Noise Dependencies
- Training the Untrainable: Introducing Inductive Bias via Representational Alignment
- Train on Pins and Test on Obstacles for Rectilinear Steiner Minimum Tree
- Train to Defend: First Defense Against Cryptanalytic Neural Network Parameter Extraction Attacks
- Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning
- TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration
- Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
- Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
- Trajectory Graph Learning: Aligning with Long Trajectories in Reinforcement Learning Without Reward Design
- TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training Model
- Transcending Cost-Quality Tradeoff in Agent Serving via Session-Awareness
- Transductive Conformal Inference for Full Ranking
- Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties
- Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models
- TransferBench: Benchmarking Ensemble-based Black-box Transfer Attacks
- Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift
- Transfer Learning for Benign Overfitting in High-Dimensional Linear Regression
- Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model
- Transferring Causal Effects using Proxies
- Transferring Linear Features Across Language Models With Model Stitching
- TransferTraj: A Vehicle Trajectory Learning Model for Region and Task Transferability
- Transformer brain encoders explain human high-level visual responses
- Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning
- Transformer Key-Value Memories Are Nearly as Interpretable as Sparse Autoencoders
- Transformers are almost optimal metalearners for linear classification
- Transformers for Mixed-type Event Sequences
- Transformers Learn Faster with Semantic Focus
- Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
- Transforming Gaps into Gains: Bridging Model and Data Heterogeneity in Federated Learning via Knowledge Weak-Aware Zones
- Transforming Generic Coder LLMs to Effective Binary Code Embedding Models for Similarity Detection
- Transition Matching: Scalable and Flexible Generative Modeling
- TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
- Transstratal Adversarial Attack: Compromising Multi-Layered Defenses in Text-to-Image Models
- TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems
- TRAP: Targeted Redirecting of Agentic Preferences
- Traversal Verification for Speculative Tree Decoding
- Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
- Treatment Effect Estimation for Optimal Decision-Making
- Tree-Based Premise Selection for Lean4
- Tree Ensemble Explainability through the Hoeffding Functional Decomposition and TreeHFD Algorithm
- TreeFinder: A US-Scale Benchmark Dataset for Individual Tree Mortality Monitoring Using High-Resolution Aerial Imagery
- TreeGen: A Bayesian Generative Model for Hierarchies
- Tree-Guided Diffusion Planner
- Tree of Preferences for Diversified Recommendation
- Tree-Sliced Entropy Partial Transport
- TreeSplat: Mergeable Tree for Deformable Gaussian Splatting
- TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning
- T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning
- TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception
- TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning
- TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
- Tri-MARF: A Tri-Modal Multi-Agent Responsive Framework for Comprehensive 3D Object Annotation
- TRIM: Scalable 3D Gaussian Diffusion Inference with Temporal and Spatial Trimming
- Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
- Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
- TRoVe: Discovering Error-Inducing Static Feature Biases in Temporal Vision-Language Models
- True Impact of Cascade Length in Contextual Cascading Bandits
- True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
- Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs
- Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
- Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
- Trust Region Reward Optimization and Proximal Inverse Reward Optimization Algorithm
- TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses
- Truthful Aggregation of LLMs with an Application to Online Advertising
- Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
- TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks
- T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
- TS-MOF: Two-Stage Multi-Objective Fine-tuning for Long-Tailed Recognition
- TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
- TTRL: Test-Time Reinforcement Learning
- TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation
- Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project
- Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound
- Turning the Tables: Enabling Backward Transfer via Causal-Aware LoRA in Continual Learning
- TV-Rec: Time-Variant Convolutional Filter for Sequential Recommendation
- Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
- TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
- Two Causally Related Needles in a Video Haystack
- Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
- Two Heads are Better than One: Simulating Large Transformers with Small Ones
- Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion
- Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising
- Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
- UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning
- U-CAN: Unsupervised Point Cloud Denoising with Consistency-Aware Noise2Noise Matching
- UEPI: Universal Energy-Behavior-Preserving Integrators for Energy Conservative/Dissipative Differential Equations
- UFM: A Simple Path towards Unified Dense Correspondence with Flow
- UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
- UFO-RL: Uncertainty-Focused Optimization for Efficient Reinforcement Learning Data Selection
- UFT: Unifying Supervised and Reinforcement Fine-Tuning
- UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification
- UGM2N: An Unsupervised and Generalizable Mesh Movement Network via M-Uniform Loss
- UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights
- UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
- Ultra-high Resolution Watermarking Framework Resistant to Extreme Cropping and Scaling
- UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset
- UltraLED: Learning to See Everything in Ultra-High Dynamic Range Scenes
- Ultrametric Cluster Hierarchies: I Want ’em All!
- UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
- UMA: A Family of Universal Models for Atoms
- UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis
- UMoE: Unifying Attention and FFN with Shared Experts
- UMU-Bench: Closing the Modality Gap in Multimodal Unlearning Evaluation
- un$^2$CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP
- Unbalanced Optimal Total Variation Transport: A Theoretical Approach to Spatial Resource Allocation Problems
- Unbiased Prototype Consistency Learning for Multi-Modal and Multi-Task Object Re-Identification
- Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
- Uncertain Knowledge Graph Completion via Semi-Supervised Confidence Distribution Learning
- Uncertainty-Aware Multi-Objective Reinforcement Learning-Guided Diffusion Models for 3D De Novo Molecular Design
- Uncertainty-aware Preference Alignment for Diffusion Policies
- Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations
- Uncertainty-Calibrated Prediction of Randomly-Timed Biomarker Trajectories with Conformal Bands
- Uncertainty Estimation by Flexible Evidential Deep Learning
- Uncertainty Estimation on Graphs with Structure Informed Stochastic Partial Differential Equations
- Uncertainty-Guided Exploration for Efficient AlphaZero Training
- Uncertainty-Informed Meta Pseudo Labeling for Surrogate Modeling with Limited Labeled Data
- Uncertainty Quantification for Deep Regression using Contextualised Normalizing Flows
- Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference
- Uncertainty Quantification with the Empirical Neural Tangent Kernel
- Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Video Temporal Grounding
- Uncertainty-Sensitive Privileged Learning
- UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems
- Uncoupled and Convergent Learning in Monotone Games under Bandit Feedback
- Uncover Governing Law of Pathology Propagation Mechanism Through A Mean-Field Game
- Uncovering a Universal Abstract Algorithm for Modular Addition in Neural Networks
- Uncovering the Spectral Bias in Diagonal State Space Models
- Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
- Understanding Adam Requires Better Rotation Dependent Assumptions
- Understanding and Enhancing Mask-Based Pretraining towards Universal Representations
- Understanding and Enhancing Message Passing on Heterophilic Graphs via Compatibility Matrix
- Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits
- Understanding and Improving Fast Adversarial Training against $l_0$ Bounded Perturbations
- Understanding and Mitigating Numerical Sources of Nondeterminism in LLM Inference
- Understanding and Rectifying Safety Perception Distortion in VLMs
- Understanding Bias Terms in Neural Representations
- Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness
- Understanding Contrastive Learning via Gaussian Mixture Models
- Understanding Data Influence in Reinforcement Finetuning
- Understanding Differential Transformer Unchains Pretrained Self-Attentions
- Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis
- Understanding Generalization in Physics Informed Models through Affine Variety Dimensions
- Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
- Understanding outer learning rates in Local SGD
- Understanding Parametric and Contextual Knowledge Reconciliation within Large Language Models
- Understanding Prompt Tuning and In-Context Learning via Meta-Learning
- Understanding protein function with a multimodal retrieval-augmented foundation model
- Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
- Understanding Softmax Attention Layers: Exact Mean-Field Analysis on a Toy Problem
- Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability
- Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
- Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
- Understanding while Exploring: Semantics-driven Active Mapping
- Under the Shadow: Exploiting Opacity Variation for Fine-grained Shadow Detection
- Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization
- Unfolding the Black Box of Recurrent Neural Networks for Path Integration
- UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
- UniDomain: Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning
- UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models
- Unified 2D-3D Discrete Priors for Noise-Robust and Calibration-Free Multiview 3D Human Pose Estimation
- Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond
- Unified all-atom molecule generation with neural fields
- Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
- Unified Reinforcement and Imitation Learning for Vision-Language Models
- Unified Scaling Laws for Compressed Representations
- Unified Transferability Metrics for Time Series Foundation Models
- UniFoil: A Universal Dataset of Airfoils in Transitional and Turbulent Regimes for Subsonic and Transonic Flows
- Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
- Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization
- Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
- Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
- Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning
- Unifying Proportional Fairness in Centroid and Non-Centroid Clustering
- Unifying Reconstruction and Density Estimation via Invertible Contraction Mapping in One-Class Classification
- Unifying Re-Identification, Attribute Inference, and Data Reconstruction Risks in Differential Privacy
- Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization
- Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models
- UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation
- UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
- UniGTE: Unified Graph–Text Encoding for Zero-Shot Generalization across Graph Tasks and Domains
- UniHG: A Large-scale Universal Heterogeneous Graph Dataset and Benchmark for Representation Learning and Cross-Domain Transferring
- Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
- Uni-LoRA: One Vector is All You Need
- UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
- UniMotion: A Unified Motion Framework for Simulation, Prediction and Planning
- UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation
- Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition
- UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
- UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting
- UniReps: Unifying Representations in Neural Models
- Uni-RL: Unifying Online and Offline RL via Implicit Value Regularization
- UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection
- UniteFormer: Unifying Node and Edge Modalities in Transformers for Vehicle Routing Problems
- UniTok: a Unified Tokenizer for Visual Generation and Understanding
- UniTraj: Learning a Universal Trajectory Foundation Model from Billion-Scale Worldwide Traces
- UniTransfer: Video Concept Transfer via Progressive Spatio-Temporal Decomposition
- Universal Causal Inference in a Topos
- Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
- Universal Few-shot Spatial Control for Diffusion Models
- Universally Invariant Learning in Equivariant GNNs
- Universal Sequence Preconditioning
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
- Universal Visuo-Tactile Video Understanding for Embodied Interaction
- Universidad Carlos III de Madrid (BTF)
- University of Montreal (BTF)
- University of Texas Southwestern Medical Center (BTF)
- University of Toronto (BTF)
- University of Washington Department of Anesthesiology and Pain Medicine (BTF)
- UniViT: Unifying Image and Video Understanding in One Vision Encoder
- UniZyme: A Unified Protein Cleavage Site Predictor Enhanced with Enzyme Active-Site Knowledge
- Unlabeled Data Can Provably Enhance In-Context Learning of Transformers
- Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs
- Unlearned but Not Forgotten: Data Extraction after Exact Unlearning in LLM
- Unlearning-Aware Minimization
- Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
- Unleashing Foundation Vision Models: Adaptive Transfer for Diverse Data-Limited Scientific Domains
- Unleashing Hour-Scale Video Training for Long Video-Language Understanding
- Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
- Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator
- Unlocker: Disentangle the Deadlock of Learning between Label-noisy and Long-tailed Data
- Unlocking Dataset Distillation with Diffusion Models
- Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time
- Unlocking Multimodal Mathematical Reasoning via Process Reward Model
- Unlocking SLM Potential for Data Analysis Code Generation via Non-Parametric Knowledge Distillation
- Unmasking Puppeteers: Leveraging Biometric Leakage to Expose Impersonation in AI-Based Videoconferencing
- Unraveling Metameric Dilemma for Spectral Reconstruction: A High-Fidelity Approach via Semi-Supervised Learning
- Unsupervised Federated Graph Learning
- Unsupervised Learning for Optimal Transport plan prediction between unbalanced graphs
- Unsupervised Trajectory Optimization for 3D Registration in Serial Section Electron Microscopy using Neural ODEs
- Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards
- Unveiling Concept Attribution in Diffusion Models
- Unveiling Environmental Sensitivity of Individual Gains in Influence Maximization
- Unveiling Extraneous Sampling Bias with Data Missing-Not-At-Random
- Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise
- Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
- Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study
- Unveiling the Power of Multiple Gossip Steps: A Stability-Based Generalization Analysis in Decentralized Training
- Unveiling the Spatial-temporal Effective Receptive Fields of Spiking Neural Networks
- Unveiling the Uncertainty in Embodied and Operational Carbon of Large AI Models through a Probabilistic Carbon Accounting Model
- Unveiling Transformer Perception by Exploring Input Manifolds
- UrbanAI: Harnessing Artificial Intelligence for Smart Cities
- UrbanIng-V2X: A Large-Scale Multi-Vehicle, Multi-Infrastructure Dataset Across Multiple Intersections for Cooperative Perception
- URB - Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles
- URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model
- U-REPA: Aligning Diffusion U-Nets to ViTs
- URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training
- User-Instructed Disparity-aware Defocus Control
- UtilGen: Utility-Centric Generative Data Augmentation with Dual-Level Task Adaptation
- Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
- UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
- V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
- V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception
- VADB: A Large-Scale Video Aesthetic Database with Professional and Multi-Dimensional Annotations
- Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought
- VADTree: Explainable Training-Free Video Anomaly Detection via Hierarchical Granularity-Aware Tree
- VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents
- VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment
- Validating LLM-as-a-Judge Systems under Rating Indeterminacy
- Valid Inference with Imperfect Synthetic Data
- Valid Selection among Conformal Sets
- Value Diffusion Reinforcement Learning
- Value Gradient Guidance for Flow Matching Alignment
- Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
- Value-Guided KV Compression for LLMs via Approximated CUR Decomposition
- Value-Guided Search for Efficient Chain-of-Thought Reasoning
- Value Improved Actor Critic Algorithms
- VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
- Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2
- VaporTok: RL-Driven Adaptive Video Tokenizer with Prior & Task Awareness
- $\varepsilon$-Optimally Solving Two-Player Zero-Sum POSGs
- VarFlow: Proper Scoring-Rule Diffusion Distillation via Energy Matching
- Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
- Variance-Reduced Long-Term Rehearsal Learning with Quadratic Programming Reformulation
- Variational Inference with Mixtures of Isotropic Gaussians
- Variational Learning Finds Flatter Solutions at the Edge of Stability
- Variational Polya Tree
- Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
- Variational Supervised Contrastive Learning
- Variational Task Vector Composition
- Variational Uncertainty Decomposition for In-Context Learning
- VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image
- V-CECE: Visual Counterfactual Explanations via Conceptual Edits
- VCM: Vision Concept Modeling with Adaptive Vision Token Compression via Instruction Fine-Tuning
- Vector Database Watermarking
- Vector Quantization in the Brain: Grid-like Codes in World Models
- Venus-MAXWELL: Efficient Learning of Protein-Mutation Stability Landscapes using Protein Language Models
- VERA: Variational Inference Framework for Jailbreaking Large Language Models
- VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code
- VeriThinker: Learning to Verify Makes Reasoning Model Efficient
- VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
- Versatile differentially private learning for general loss functions
- Versatile Transferable Unlearnable Example Generator
- Vertical Federated Feature Screening
- VESSA: Video-based objEct-centric Self-Supervised Adaptation for Visual Foundation Models
- VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers
- VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
- Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
- VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold
- vHector and HeisenVec: Scalable Vector Graphics Generation Through Large Language Models
- VIBE: Annotation-Free Video-to-Text Information Bottleneck Evaluation for TL;DR
- Vicinal Label Supervision for Reliable Aleatoric and Epistemic Uncertainty Estimation
- Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation
- ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
- ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs
- VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models
- VideoCAD: A Dataset and Model for Learning Long‑Horizon 3D CAD UI Interactions from Video
- VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
- Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
- VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
- VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
- VideoLucy: Deep Memory Backtracking for Long Video Understanding
- VideoMAR: Autoregressive Video Generation with Continuous Tokens
- Video Perception Models for 3D Scene Synthesis
- Video-R1: Reinforcing Video Reasoning in MLLMs
- Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
- VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
- VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
- Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
- Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations
- VideoTitans: Scalable Video Prediction with Integrated Short- and Long-term Memory
- VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
- VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
- Video World Models with Long-term Spatial Memory
- Vid-SME: Membership Inference Attacks against Large Video Understanding Models
- Vietnamese Women in Computer Science (BTF)
- ViewCraft3D: High-fidelity and View-Consistent 3D Vector Graphics Synthesis
- ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models
- VIKING: Deep variational inference with stochastic projections
- VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
- VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
- Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning
- VIPAMIN: Visual Prompt Initialization via Embedding Selection and Subspace Expansion
- Virtual Fitting Room: Generating Arbitrarily Long Videos of Virtual Try-On from a Single Image
- Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
- VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction, Characterization and Recognition
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
- Vision-centric Token Compression in Large Language Model
- Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation
- Vision Function Layer in Multimodal LLMs
- Vision Language Models: Challenges of Real World Deployment
- Vision‑Language‑Vision Auto‑Encoder: Scalable Knowledge Distillation from Diffusion Models
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
- Vision Transformers Don't Need Trained Registers
- Vision Transformers with Self-Distilled Registers
- ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
- ViSPLA: Visual Iterative Self-Prompting for Language-Guided 3D Affordance Learning
- Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models
- Visual Diversity and Region-aware Prompt Learning for Zero-shot HOI Detection
- Visual Instruction Bottleneck Tuning
- Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
- VisualLens: Personalization through Task-Agnostic Visual History
- VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
- Visual Structures Help Visual Reasoning: Addressing the Binding Problem in LVLMs
- Visual Sync: Multi‑Camera Synchronization via Cross‑View Object Motion
- Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought
- VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
- VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
- VITRIX-CLIPIN: Enhancing Fine-Grained Visual Understanding in CLIP via Instruction-Editing Data and Long Captions
- VITRIX-UniViTAR: Unified Vision Transformer with Native Resolution
- VividFace: A Robust and High-Fidelity Video Face Swapping Framework
- VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
- VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models
- VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models
- VLM in a flash: I/O-Efficient Sparsification of Vision-Language Model via Neuron Chunking
- VLMLight: Safety-Critical Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning Architecture
- VLM-R³: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
- VLMs can Aggregate Scattered Training Patches
- VLMs have Tunnel Vision: Evaluating Nonlocal Visual Reasoning in Leading VLMs
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
- VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
- VL-SAM-V2: Open-World Object Detection with General and Specific Query Fusion
- VMDT: Decoding the Trustworthiness of Video Foundation Models
- Vocabulary-Guided Gait Recognition
- Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
- VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play
- Volume Transmission Implements Context Factorization to Target Online Credit Assignment and Enable Compositional Generalization
- VORTA: Efficient Video Diffusion via Routing Sparse Attention
- VoxDet: Rethinking 3D Semantic Scene Completion as Dense Object Detection
- VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information
- VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
- VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
- VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
- VR-Drive: Viewpoint-Robust End-to-End Driving with Feed-Forward 3D Gaussian Splatting
- VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
- VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences
- Vulnerable Data-Aware Adversarial Training
- Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation
- Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
- WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
- WaLRUS: Wavelets for Long range Representation Using State Space Methods
- Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
- WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting
- WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
- Wasserstein Convergence of Critically Damped Langevin Diffusions
- Wasserstein Transfer Learning
- Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
- Watermarking Autoregressive Image Generation
- WaveAR: Wavelet-Aware Continuous Autoregressive Diffusion for Accurate Human Motion Prediction
- Wavelet Canonical Coherence for Nonstationary Signals
- Wavy Transformer
- Weak-shot Keypoint Estimation via Keyness and Correspondence Transfer
- Weak-to-Strong Generalization under Distribution Shifts
- WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world scenarios
- WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
- Weaver: Shrinking the Generation-Verification Gap by Scaling Compute for Verification
- WebDancer: Towards Autonomous Information Seeking Agency
- WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
- Web-Scale Collection of Video Data for 4D Animal Reconstruction
- Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability
- We Should Chart an Atlas of All the World's Models
- What are you sinking? A geometric approach on attention sink
- What Can RL Bring to VLA Generalization? An Empirical Study
- What Can('t) Transformers Do?
- What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization
- What Does It Take to Build a Performant Selective Classifier?
- What Do Latent Action Models Actually Learn?
- What do you know? Bayesian knowledge inference for navigating agents
- What Expressivity Theory Misses: Message Passing Complexity for GNNs
- What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
- What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
- What Makes a Good Video: Next Practices in Video Generation and Evaluation
- What Makes a Reward Model a Good Teacher? An Optimization Perspective
- What Makes Math Problems Hard for Reinforcement Learning: A Case Study
- What Matters in Data for DPO?
- What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses
- What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
- What Really is a Member? Discrediting Membership Inference via Poisoning
- What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes
- What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
- What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers
- When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery
- When and how can inexact generative models still sample from the data manifold?
- When Are Concepts Erased From Diffusion Models?
- When Can Model-Free Reinforcement Learning be Enough for Thinking?
- When Causal Dynamics Matter: Adapting Causal Strategies through Meta-Aware Interventions
- When Data Can't Meet: Estimating Correlation Across Privacy Barriers
- When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective
- When Does Curriculum Learning Help? A Theoretical Perspective
- When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
- When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product
- When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners
- When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses
- When majority rules, minority loses: bias amplification of gradient descent
- When Models Don’t Collapse: On the Consistency of Iterative MLE
- When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
- When No Paths Lead to Rome: Benchmarking Systematic Neural Relational Reasoning
- When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions
- When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
- When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
- When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
- When Worse is Better: Navigating the Compression Generation Trade-off In Visual Tokenization
- Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models
- Where Does It Exist from the Low-Altitude: Spatial Aerial Video Grounding
- Where Graph Meets Heterogeneity: Multi-View Collaborative Graph Experts
- Which Algorithms Have Tight Generalization Bounds?
- Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions
- Whitened Score Diffusion: A Structured Prior for Imaging Inverse Problems
- Whole-Body Conditioned Egocentric Video Prediction
- Who Reasons in the Large Language Models?
- Whose Instructions Count? Resolving Preference Bias in Instruction Fine-Tuning
- Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
- Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers
- Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
- Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
- Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training
- Why Do Multi-Agent LLM Systems Fail?
- Why Do Some Language Models Fake Alignment While Others Don't?
- Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
- Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
- Why Playing Against Diverse and Challenging Opponents Speeds Up Coevolution: A Theoretical Analysis on Combinatorial Games
- Why Popular MOEAs are Popular: Proven Advantages in Approximating the Pareto Front
- Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints
- Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
- WildCAT3D: Appearance-Aware Multi-View Diffusion in the Wild
- Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
- WISA: World simulator assistant for physics-aware text-to-video generation
- Wisdom is Knowing What not to Say: Hallucination-Free LLMs Unlearning via Attention Shifting
- With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You
- WKV-sharing embraced random shuffle RWKV high-order modeling for pan-sharpening
- WMCopier: Forging Invisible Watermarks on Arbitrary Images
- WolBanking77: Wolof Banking Speech Intent Classification Dataset
- Women in AI Research WiAIR (BTF)
- Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
- Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
- Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications
- Workshop on Mechanistic Interpretability
- Workshop on Multi-Turn Interactions in Large Language Models
- Workshop on Scaling Environments for Agents
- World-aware Planning Narratives Enhance Large Vision-Language Model Planner
- WorldMem: Long-term Consistent World Simulation with Memory
- WorldModelBench: Judging Video Generation Models As World Models
- World Models as Reference Trajectories for Rapid Motor Adaptation
- World Models Should Prioritize the Unification of Physical and Social Dynamics
- WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
- Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals
- WritingBench: A Comprehensive Benchmark for Generative Writing
- Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models
- X-Field: A Physically Informed Representation for 3D X-ray Reconstruction
- XIFBench: Evaluating Large Language Models on Multilingual Instruction Following
- xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories
- X-Mahalanobis: Transformer Feature Mixing for Reliable OOD Detection
- X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability
- XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation
- YEAST: Yet Another Sequential Test
- Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding
- YOLOv12: Attention-Centric Real-Time Object Detectors
- You Can Trust Your Clustering Model: A Parameter-free Self-Boosting Plug-in for Deep Clustering
- You Only Communicate Once: One-shot Federated Low-Rank Adaptation of MLLM
- You Only Spectralize Once: Taking a Spectral Detour to Accelerate Graph Neural Network
- Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator
- Zebra-Llama: Towards Extremely Efficient Hybrid Models
- ZEBRA: Towards Zero-Shot Cross-Subject Generalization for Universal Brain Visual Decoding
- ZeCO: Zero-Communication Overhead Sequence Parallelism for Linear Attention
- ZeroPatcher: Training-free Sampler for Video Inpainting and Editing
- ZeroSep: Separate Anything in Audio with Zero Training
- Zero-Shot Blind-Spot Image Denoising via Cross-Scale Non-Local Pixel Refilling
- Zero-Shot Context Generalization in Reinforcement Learning from Few Training Contexts
- Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework
- Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model
- Zero-Shot Performance Prediction for Probabilistic Scaling Laws
- Zero-shot protein stability prediction by inverse folding models: a free energy interpretation
- Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks
- Zero-shot World Models via Search in Memory
- ZeroS: Zero‑Sum Linear Attention for Efficient Transformers
- Zeroth-Order Optimization Finds Flat Minima
- ZEUS: Zero-shot Embeddings for Unsupervised Separation of Tabular Data
- ZigzagPointMamba: Spatial-Semantic Mamba for Point Cloud Understanding
- zip2zip: Inference-Time Adaptive Tokenization via Online Compression
- Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs
- ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS