Track: Mexico City Poster Session 4

Poster

Breakthrough Sensor-Limited Single View: Towards Implicit Temporal Dynamics for Time Series Domain Adaptation

Mingyang Liu · Xinyang Chen · Xiucheng Li · Weili Guan · Liqiang Nie

Unsupervised domain adaptation has emerged as a pivotal paradigm for mitigating distribution shifts in time series analysis. The fundamental challenge in time series domain adaptation arises from the entanglement of domain shifts and intricate temporal patterns. Crucially, the latent continuous-time dynamics, which are often inaccessible due to sensor constraints, are only partially observable through discrete time series from an explicit sensor-limited single view. This partial observability hinders the modeling of intricate temporal patterns, impeding domain invariant representation learning. To mitigate the limitation, we propose EDEN (multiple Explicit Domain Enhanced adaptation Network), expanding the raw dataset to multi-scale explicit domains, multi-subspace explicit domains and multi-segment explicit domains. EDEN enhances domain adaptation with three coordinated modules tailored to integrate multiple explicit domains: (1) Multi-Scale Curriculum Adaptation implements progressive domain alignment from coarse-scale to fine-scale. (2) Quality-Aware Feature Fusion evaluates feature quality in multi-subspace explicit domains and adaptively integrates temporal-frequency features. (3) Temporal Coherence Learning enforces segment-level consistency with multi-segment explicit domains. The representation enriched by multiple explicit domains bridges the gap between partially observed discrete samples and the underlying implicit temporal dynamics, enabling more accurate approximation of implicit temporal patterns for effective cross-domain adaptation. Our comprehensive evaluation across 6 time series benchmarks demonstrates EDEN's consistent superiority, achieving average accuracy improvements of 4.8% over state-of-the-art methods in cross-domain scenarios. Code is available at the anonymous link: .

Poster

Explainably Safe Reinforcement Learning

Sabine Rieder · Stefan Pranger · Debraj Chakraborty · Jan Kretinsky · Bettina Könighofer

Trust in a decision-making system requires both safety guarantees and the ability to interpret and understand its behavior. This is particularly important for learned systems, whose decision-making processes are often highly opaque. Shielding is a prominent model-based technique for enforcing safety in reinforcement learning. However, because shields are automatically synthesized using rigorous formal methods, their decisions are often similarly difficult for humans to interpret. Recently, decision trees have become a customary way to represent controllers and policies. However, since shields are inherently non-deterministic, their decision tree representations become too large to be explainable in practice. To address this challenge, we propose a novel approach for explainable safe RL that enhances trust by providing human-interpretable explanations of the shield's decisions. Our method represents the shielding policy as a hierarchy of decision trees, offering top-down, case-based explanations. At design time, we use a world model to analyze the safety risks of executing actions in given states. Based on this risk analysis, we construct both the shield and a high-level decision tree that classifies states into risk categories (safe, critical, dangerous, unsafe), providing an initial explanation of why a given situation may be safety-critical. At runtime, we generate localized decision trees that explain which actions are allowed and why others are deemed unsafe. Altogether, our method makes the safety aspect of safe-by-shielding reinforcement learning explainable. Our framework requires no additional information beyond what is already used for shielding, incurs minimal overhead, and can be readily integrated into existing shielded RL pipelines. In our experiments, we compute explanations using decision trees that are several orders of magnitude smaller than the original shield.

Spotlight Poster

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

Fengxiang Wang · Mingshuo Chen · Yueying Li · Di Wang · Haotian Wang · Zonghao Guo · Zefan Wang · Shan Boqi · Long Lan · Yulin Wang · Hongzhen Wang · Wenjing Yang · Bo Du · Jing Zhang

Ultra-high-resolution (UHR) remote sensing (RS) imagery offers valuable data for Earth observation but poses challenges for existing multimodal foundation models due to two key bottlenecks: (1) limited availability of UHR training data, and (2) token explosion caused by the large image size. To address data scarcity, we introduce **SuperRS-VQA** (avg. 8,376$\times$8,376) and **HighRS-VQA** (avg. 2,000$\times$1,912), the highest-resolution vision-language datasets in RS to date, covering 22 real-world dialogue tasks. To mitigate token explosion, our pilot studies reveal significant redundancy in RS images: crucial information is concentrated in a small subset of object-centric tokens, while pruning background tokens (e.g., ocean or forest) can even improve performance. Motivated by these findings, we propose two strategies: *Background Token Pruning* and *Anchored Token Selection*, to reduce the memory footprint while preserving key semantics. Integrating these techniques, we introduce **GeoLLaVA-8K**, the first RS-focused multimodal large language model capable of handling inputs up to 8K$\times$8K resolution, built on the LLaVA framework. Trained on SuperRS-VQA and HighRS-VQA, GeoLLaVA-8K sets a new state-of-the-art on the XLRS-Bench. Datasets and code were released at https://github.com/MiliLab/GeoLLaVA-8K.

Poster

How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning

Haotian Gao · Zheng Dong · Jiawei Yong · Shintaro Fukushima · Kenjiro Taura · Renhe Jiang

Spatio-temporal forecasting is essential for real-world applications such as traffic management and urban computing. Although recent methods have shown improved accuracy, they often fail to account for dynamic deviations between current inputs and historical patterns. These deviations contain critical signals that can significantly affect model performance. To fill this gap, we propose $\textbf{ST-SSDL}$, a $\underline{S}$patio-$\underline{T}$emporal time series forecasting framework that incorporates a $\underline{S}$elf-$\underline{S}$upervised $\underline{D}$eviation $\underline{L}$earning scheme to capture and utilize such deviations. ST-SSDL anchors each input to its historical average and discretizes the latent space using learnable prototypes that represent typical spatio-temporal patterns. Two auxiliary objectives are proposed to refine this structure: a contrastive loss that enhances inter-prototype discriminability and a deviation loss that regularizes the distance consistency between input representations and corresponding prototypes to quantify deviation. Optimized jointly with the forecasting objective, these components guide the model to organize its hidden space and improve generalization across diverse input conditions. Experiments on six benchmark datasets show that ST-SSDL consistently outperforms state-of-the-art baselines across multiple metrics. Visualizations further demonstrate its ability to adaptively respond to varying levels of deviation in complex spatio-temporal scenarios. Our code and datasets are available at https://github.com/Jimmy-7664/ST-SSDL.

Poster

Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression

Xi Zhang · Xiaolin Wu · Jiamang Wang · Weisi Lin

Large Language Models (LLMs) have demonstrated remarkable capabilities but typically require extensive computational resources and memory for inference. Post-training quantization (PTQ) can effectively reduce these demands by storing weights in lower bit-width formats. However, standard uniform quantization often leads to notable performance degradation, particularly in low-bit scenarios. In this work, we introduce a Grouped Lattice Vector Quantization (GLVQ) framework that assigns each group of weights a customized lattice codebook, defined by a learnable generation matrix. To address the non-differentiability of the quantization process, we adopt Babai rounding to approximate nearest-lattice-point search during training, which enables stable optimization of the generation matrices. Once trained, decoding reduces to a simple matrix-vector multiplication, yielding an efficient and practical quantization pipeline. Experiments on multiple benchmarks show that our approach achieves a better trade-off between model size and accuracy compared to existing post-training quantization baselines, highlighting its effectiveness in deploying large models under stringent resource constraints. Our source code is available on GitHub: https://github.com/xzhang9308/GLVQ.
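
To make the two operations named in the abstract concrete, here is a minimal sketch of Babai rounding and of decoding by matrix-vector multiplication for a single 2-dimensional weight group. The generation matrix `G`, the weight vector `w`, and all function names are illustrative assumptions, not taken from the paper or its repository; the training of `G` is not shown.

```python
# Minimal sketch of Babai rounding on a 2-D lattice (illustrative only).
# G is a hypothetical generation matrix whose columns span the lattice.

def inverse_2x2(g):
    """Invert a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = g
    det = a * d - b * c
    if det == 0:
        raise ValueError("generation matrix must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def babai_round(g, w):
    """Approximate the nearest lattice point: round the coordinates G^{-1} w."""
    coords = matvec(inverse_2x2(g), w)
    return [round(c) for c in coords]

def decode(g, z):
    """Decoding is a single matrix-vector product G z."""
    return matvec(g, z)

G = [[0.5, 0.25], [0.0, 0.5]]  # assumed generation matrix for one weight group
w = [0.8, -0.3]                # assumed weights in that group

z = babai_round(G, w)          # integer code that would be stored
w_hat = decode(G, z)           # dequantized weights
print(z, w_hat)
```

For this toy input the integer code is `[2, -1]` and the decoded vector is `[0.75, -0.5]`. Babai rounding is only an approximation of the nearest lattice point, and its quality depends on how well conditioned the basis in `G` is.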

Poster

Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models

Yi Liu · Dianqing Liu · Mingye Zhu · Junbo Guo · Yongdong Zhang · Zhendong Mao

The widespread adoption of large language models (LLMs) across industries has increased the demand for high-quality and customizable outputs. However, traditional alignment methods often require retraining large pretrained models, making it difficult to quickly adapt and optimize LLMs for diverse applications. To address this limitation, we propose a novel \textit{Residual Alignment Model} (\textit{RAM}) that formalizes the alignment process as a type of importance sampling. In this framework, the unaligned upstream model serves as the proposal distribution, while the alignment process is framed as secondary sampling based on an autoregressive alignment module that acts as an estimator of the importance weights. This design enables a natural detachment of the alignment module from the target aligned model, improving flexibility and scalability. Based on this model, we derive an efficient sequence-level training strategy for the alignment module, which operates independently of the proposal module. Additionally, we develop a resampling algorithm with iterative token-level decoding to address the common first-token latency issue in comparable methods. Experimental evaluations on two leading open-source LLMs across diverse tasks, including instruction following, domain adaptation, and preference optimization, demonstrate that our approach consistently outperforms baseline models.
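
The importance-sampling view can be illustrated with a generic sampling-importance-resampling loop: candidates are drawn from a proposal, scored with importance weights, and one is resampled in proportion to its weight. This is a toy sequence-level sketch under stated assumptions; `propose`, `weight_fn`, the response strings, and the weight values are hypothetical stand-ins for the upstream model and the alignment module, and the paper's token-level resampling algorithm is not reproduced here.

```python
import random

def resample(candidates, weights, rng):
    """Pick one candidate with probability proportional to its weight."""
    return rng.choices(candidates, weights=weights, k=1)[0]

def aligned_sample(propose, weight_fn, n_candidates, rng):
    """Draw candidates from the proposal, then resample by importance weight."""
    candidates = [propose(rng) for _ in range(n_candidates)]
    weights = [weight_fn(c) for c in candidates]
    return resample(candidates, weights, rng)

# Hypothetical stand-ins: a uniform "upstream model" over three responses
# and a fixed table playing the role of the estimated importance weights.
RESPONSES = ["curt reply", "helpful reply", "rude reply"]
WEIGHTS = {"curt reply": 1.0, "helpful reply": 8.0, "rude reply": 0.1}

def propose(rng):
    return rng.choice(RESPONSES)

def weight_fn(response):
    return WEIGHTS[response]

rng = random.Random(0)
draws = [aligned_sample(propose, weight_fn, 4, rng) for _ in range(1000)]
share_helpful = draws.count("helpful reply") / len(draws)
print(f"share of 'helpful reply': {share_helpful:.2f}")
```

The proposal alone would return the helpful response about one third of the time; resampling by weight shifts the output distribution toward it without modifying the proposal itself, which is the sense in which the two components are detached.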

Poster

Leveraging Robust Optimization for LLM Alignment under Distribution Shifts

Mingye Zhu · Yi Liu · Zheren Fu · Yongdong Zhang · Zhendong Mao

Preference alignment methods are increasingly critical for steering large language models (LLMs) to generate outputs consistent with human values. While recent approaches often rely on synthetic data generated by LLMs for scalability and cost-efficiency reasons, this reliance can introduce distributional shifts that undermine the nuanced representation of human preferences needed for desirable outputs. In this paper, we propose a novel distribution-aware optimization framework that improves preference alignment despite such shifts. Our approach first leverages well-learned classifiers to assign a calibration value to each training sample, quantifying its alignment with the target human-preferred distribution. These values are then incorporated into a robust optimization objective that minimizes the worst-case loss over regions of the data space most relevant to human preferences. By explicitly focusing optimization on the target distribution, our approach mitigates the impact of distributional mismatch and improves the generation of responses that better reflect intended values.

Poster

LLM at Network Edge: A Layer-wise Efficient Federated Fine-tuning Approach

Jinglong Shen · Nan Cheng · Wenchao Xu · Haozhao Wang · Yifan Guo · Jiajie Xu

Fine-tuning large language models (LLMs) poses significant computational burdens, especially in federated learning (FL) settings. We introduce Layer-wise Efficient Federated Fine-tuning (LEFF), a novel method designed to enhance the efficiency of FL fine-tuning while preserving model performance and minimizing client-side computational overhead. LEFF strategically selects layers for fine-tuning based on client computational capacity, thereby mitigating the straggler effect prevalent in heterogeneous environments. Furthermore, LEFF incorporates an importance-driven layer sampling mechanism, prioritizing layers with greater influence on model performance. Theoretical analysis demonstrates that LEFF achieves a convergence rate of $\mathcal{O}(1/\sqrt{T})$. Extensive experiments on diverse datasets demonstrate that LEFF attains superior computational efficiency and model performance compared to existing federated fine-tuning methods, particularly under heterogeneous conditions.

Poster

Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent

Peter Richtarik · Simone Maria Giancola · Dymitr Lubczyk · Robin Yadav

We contribute to the growing body of knowledge on more powerful and adaptive stepsizes for convex optimization, empowered by local curvature information. We do not go the route of fully-fledged second-order methods, which require the expensive computation of the Hessian. Instead, our key observation is that, for some problems (e.g., when minimizing the sum of squares of absolutely convex functions), local curvature information is readily available, and can be used to obtain surprisingly powerful matrix-valued stepsizes, and meaningful theory. In particular, we develop three new methods — LCD1, LCD2, and LCD3 — where the abbreviation stands for local curvature descent. While LCD1 generalizes gradient descent with fixed stepsize, LCD2 generalizes gradient descent with Polyak stepsize. Our methods enhance these classical gradient descent baselines with local curvature information, and our theory recovers the known rates in the special case when no curvature information is used. Our last method, LCD3, is a variable-metric version of LCD2; this feature leads to a closed-form expression for the iterates. Our empirical results are encouraging and show that the local curvature descent improves upon gradient descent.

Poster

MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling

Liang Yin · Xudong Xie · Zhang Li · Xiang Bai · Yuliang Liu

Scene text retrieval has made significant progress with the assistance of accurate text localization. However, existing approaches typically require costly bounding box annotations for training. Besides, they mostly adopt a customized retrieval strategy but struggle to unify various types of queries to meet diverse retrieval needs. To address these issues, we introduce Multi-query Scene Text retrieval with Attention Recycling (MSTAR), a box-free approach for scene text retrieval. It incorporates progressive vision embedding to dynamically capture the multi-grained representation of texts and harmonizes free-style text queries with style-aware instructions. Additionally, a multi-instance matching module is integrated to enhance vision-language alignment. Furthermore, we build the Multi-Query Text Retrieval (MQTR) dataset, the first benchmark designed to evaluate the multi-query scene text retrieval capability of models, comprising four query types and 16k images. Extensive experiments demonstrate the superiority of our method across seven public datasets and the MQTR dataset. Notably, MSTAR marginally surpasses the previous state-of-the-art model by 6.4% in MAP on Total-Text while eliminating box annotation costs. Moreover, on the MQTR benchmark, MSTAR significantly outperforms the previous models by an average of 8.5%. The code and datasets are available at https://github.com/yingift/MSTAR.

Poster

PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring

Wang · Xiao Yang · Qingyong Hu · Jack Tang · Can Liu · Dengbo He · Yuntao Wang · Yingcong Chen · Kaishun Wu

Robust and unobtrusive in-vehicle physiological monitoring is crucial for ensuring driving safety and user experience. While remote physiological measurement (RPM) offers a promising non-invasive solution, its translation to real-world driving scenarios is critically constrained by the scarcity of comprehensive datasets. Existing resources are often limited in scale, modality diversity, the breadth of biometric annotations, and the range of captured conditions, thereby omitting inherent real-world challenges in driving. Here, we present PhysDrive, the first large-scale multimodal dataset for contactless in-vehicle physiological sensing with dedicated consideration of various modality settings and driving factors. PhysDrive collects data from 48 drivers, including synchronized RGB, near-infrared camera, and raw mmWave radar data, accompanied by six synchronized ground truths (ECG, BVP, Respiration, HR, RR, and SpO2). It covers a wide spectrum of naturalistic driving conditions, including driver motions, dynamic natural light, vehicle types, and road conditions. We extensively evaluate both signal‑processing and deep‑learning methods on PhysDrive, establishing a comprehensive benchmark across all modalities, and release full open‑source code with compatibility for mainstream public toolboxes. We envision PhysDrive will serve as a foundational resource and accelerate research on multimodal driver monitoring and smart‑cockpit systems.

Poster

Pin the Tail on the Model: Blindfolded Repair of User-Flagged Failures in Text-to-Image Services

Gefei Tan · Ali Shahin Shamsabadi · Ellen Kolesnikova · Hamed Haddadi · Xiao Wang

Diffusion models are increasingly deployed in real-world text-to-image services. These models, however, encode implicit assumptions about the world based on web-scraped image-caption pairs used during training. Over time, such assumptions may become outdated, incorrect, or socially biased--leading to failures where the generated images misalign with users' expectations or evolving societal norms. Identifying and fixing such failures is challenging and, thus, a valuable asset for service providers, as failures often emerge post-deployment and demand specialized expertise and resources to resolve them. In this work, we introduce $\textit{SURE}$, the first end‑to‑end framework that $\textbf{S}$ec$\textbf{U}$rely $\textbf{RE}$pairs failures flagged by users of diffusion-based services. $\textit{SURE}$ enables the service provider to securely collaborate with an external third-party specialized in model repairing (i.e., Model Repair Institute) without compromising the confidentiality of user feedback, the service provider’s proprietary model, or the Model Repair Institute’s proprietary repairing knowledge. To achieve the best possible efficiency, we propose a co-design of a model editing algorithm with a customized two-party cryptographic protocol. Our experiments show that $\textit{SURE}$ is highly practical: $\textit{SURE}$ securely and effectively repairs all 32 layers of Stable Diffusion v1.4 in under 17 seconds (four orders of magnitude more efficient than a general baseline). Our results demonstrate that practical, secure model repair is attainable for large-scale, modern diffusion services.

Poster

RankMatch: A Novel Approach to Semi-Supervised Label Distribution Learning Leveraging Rank Correlation between Labels

Zhiqiang Kou · Yucheng Xie · Hailin Wang · Junyang Chen · Jingq Wang · Ming-Kun Xie · Shuo Chen · Yuheng Jia · Tongliang Liu · Xin Geng

Pseudo label based semi-supervised learning (SSL) for single-label and multi-label classification tasks has been extensively studied; however, semi-supervised label distribution learning (SSLDL) remains a largely unexplored area. Existing SSL methods fail in SSLDL because the pseudo-labels they generate only ensure overall similarity to the ground truth but do not preserve the ranking relationships between true labels, as they rely solely on KL divergence as the loss function during training. These skewed pseudo-labels lead the model to learn incorrect semantic relationships, resulting in reduced performance accuracy. To address these issues, we propose a novel SSLDL method called \textit{RankMatch}. \textit{RankMatch} fully considers the ranking relationships between different labels during the training phase with labeled data to generate higher-quality pseudo-labels. Furthermore, our key observation is that a flexible utilization of pseudo-labels can enhance SSLDL performance. Specifically, focusing solely on the ranking relationships between labels while disregarding their margins helps prevent model overfitting. Theoretically, we prove that incorporating ranking correlations enhances SSLDL performance and establish generalization error bounds for \textit{RankMatch}. Finally, extensive real-world experiments validate its effectiveness.
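
The idea of using only the ranking of labels while disregarding their margins can be sketched with a pairwise hinge-style loss: a prediction is penalized only when it orders a pair of labels differently from the pseudo-label. This is an illustrative assumption about what such a loss could look like, not the loss defined in the paper; the function name and the toy distributions are hypothetical.

```python
def pairwise_rank_loss(pred, pseudo):
    """Average hinge penalty over label pairs whose predicted order
    contradicts the order in the pseudo-label. The size of the gap
    between pseudo-label values (the margin) is ignored on purpose."""
    loss = 0.0
    pairs = 0
    n = len(pred)
    for i in range(n):
        for j in range(n):
            if pseudo[i] > pseudo[j]:          # pseudo-label says i ranks above j
                pairs += 1
                loss += max(0.0, pred[j] - pred[i])  # penalize only order violations
    return loss / pairs if pairs else 0.0

pseudo = [0.5, 0.3, 0.2]          # pseudo label distribution
same_order = [0.6, 0.25, 0.15]    # different values, same ranking
swapped = [0.25, 0.6, 0.15]       # first two labels in the wrong order

print(pairwise_rank_loss(same_order, pseudo))
print(pairwise_rank_loss(swapped, pseudo))
```

The first prediction differs from the pseudo-label in value but not in order, so it incurs no penalty; the second is penalized only for the swapped pair. A KL-based loss would penalize both, which is the contrast the abstract draws.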

Poster

Reasoning Is Not a Race: When Stopping Early Beats Going Deeper

Mohan Zhang · Jiaxuan Gao · Shusheng Xu · Yi Wu

We study the use of Process Reward Models (PRMs) for guiding Long Chain-of-Thought (CoT) reasoning in large language models. Although PRMs deliver fine-grained feedback in standard tasks, PRM-guided beam search does not consistently outperform PRM-free approaches in long CoT reasoning. We trace this shortfall to a "step quality degradation": the expected step quality shows concave behavior, yielding unimodal or monotonically declining trends. To counteract this, we propose Z-Score Guided Early Stopping (ZGES), which halts search at the detected quality peak using local PRM-reward z-scores. Across multiple math benchmarks and model scales, ZGES outperforms both standard PRM-guided beam search and the PRM-free methods. Ablation studies further highlight the advantages and robustness of ZGES’s adaptive stopping mechanism.
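
A z-score based stopping rule of the kind described above might look like the following sketch. The abstract does not specify the exact rule, so the sliding window, the threshold, and the decision to stop at the step before a sharp drop are all assumptions made for illustration; `early_stop_step` and the reward values are hypothetical.

```python
from statistics import mean, pstdev

def early_stop_step(rewards, window=4, z_threshold=-1.5):
    """Return the index of the step to stop at, or None to keep searching.

    Each new reward is compared with the preceding `window` rewards; a
    strongly negative local z-score is treated as having passed the peak."""
    for t in range(window, len(rewards)):
        recent = rewards[t - window:t]
        spread = pstdev(recent)
        if spread == 0:
            continue  # no variation in the window, z-score undefined
        z = (rewards[t] - mean(recent)) / spread
        if z < z_threshold:
            return t - 1  # last step before the detected drop
    return None

# Hypothetical per-step PRM rewards: quality rises, peaks, then degrades.
rewards = [0.50, 0.55, 0.60, 0.58, 0.62, 0.40, 0.35]
stop_at = early_stop_step(rewards)
print(stop_at)
```

On this toy sequence the sketch stops at index 4, the last step before the sharp drop. Using a local window rather than a global mean keeps the rule sensitive to the shape of the curve near the current step, which matches the "local" qualifier in the abstract.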

Poster

Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

Shuangyi Chen · Yuanxin Guo · Yue Ju · Hardik Dalal · Zhongwen Zhu · Ashish Khisti

Parameter-Efficient Fine-Tuning (PEFT) methods like Low-Rank Adaptation (LoRA) optimize federated training by reducing computational and communication costs. We propose RoLoRA, a federated framework using alternating optimization to fine-tune LoRA adapters. Our approach emphasizes the importance of learning up and down projection matrices to enhance expressiveness and robustness. We use both theoretical analysis and extensive experiments to demonstrate the advantages of RoLoRA over prior approaches that either generate imperfect model updates or limit expressiveness of the model. We provide a theoretical analysis on a linear model to highlight the importance of learning both the down-projection and up-projection matrices in LoRA. We validate the insights on a non-linear model and separately provide a convergence proof under general conditions. To bridge theory and practice, we conduct extensive experimental evaluations on language models, including RoBERTa-Large and Llama-2-7B, across diverse tasks and FL settings to demonstrate the advantages of RoLoRA over other methods.
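
One way to see why alternating between the two LoRA factors can help in a federated setting is the following small numerical sketch. It shows a general linear-algebra fact: averaging the factors separately does not give the average of the products, unless one factor is shared and held fixed during the round. Whether this is exactly the argument the paper makes is an assumption; the matrices and helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def mean_matrices(ms):
    """Element-wise mean of equally shaped matrices."""
    n = len(ms)
    rows, cols = len(ms[0]), len(ms[0][0])
    return [[sum(m[i][j] for m in ms) / n for j in range(cols)] for i in range(rows)]

# Rank-1 adapters for a 2x2 weight: up-projection B is 2x1, down-projection A is 1x2.
A_shared = [[1.0, 2.0]]                          # frozen and identical on all clients
B_clients = [[[1.0], [0.0]], [[3.0], [2.0]]]     # each client's trained B
A_clients = [[[1.0, 2.0]], [[3.0, 0.0]]]         # each client's A if A were also trained

# Round where only B is trained: averaging B is the same as averaging the updates B A.
exact = mean_matrices([matmul(b, A_shared) for b in B_clients])
aggregated = matmul(mean_matrices(B_clients), A_shared)

# Round where both factors are trained: averaging A and B separately is inexact.
true_mean = mean_matrices([matmul(b, a) for b, a in zip(B_clients, A_clients)])
naive = matmul(mean_matrices(B_clients), mean_matrices(A_clients))

print(exact == aggregated, true_mean == naive)
```

Freezing a factor permanently avoids the mismatch but leaves that factor unlearned; alternating which factor is trained from round to round keeps each aggregation exact while still updating both.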

Spotlight Poster

Robust Graph Condensation via Classification Complexity Mitigation

Jiayi Luo · Qingyun Sun · Beining Yang · Haonan Yuan · Xingcheng Fu · Yanbiao Ma · Jianxin Li · Philip S Yu

Graph condensation (GC) has gained significant attention for its ability to synthesize smaller yet informative graphs. However, existing studies often overlook the robustness of GC in scenarios where the original graph is corrupted. In such cases, we observe that the performance of GC deteriorates significantly, while existing robust graph learning technologies offer only limited effectiveness. Through both empirical investigation and theoretical analysis, we reveal that GC is inherently an intrinsic-dimension-reducing process, synthesizing a condensed graph with lower classification complexity. Although this property is critical for effective GC performance, it remains highly vulnerable to adversarial perturbations. To tackle this vulnerability and improve GC robustness, we adopt the geometry perspective of graph data manifold and propose a novel Manifold-constrained Robust Graph Condensation framework named MRGC. Specifically, we introduce three graph data manifold learning modules that guide the condensed graph to lie within a smooth, low-dimensional manifold with minimal class ambiguity, thereby preserving the classification complexity reduction capability of GC and ensuring robust performance under universal adversarial attacks. Extensive experiments demonstrate the robustness of MRGC across diverse attack scenarios.

Poster

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Bowen Yang · Bharat Venkitesh · Dwaraknath Gnaneshwar Talupuru · Hangyu Lin · David Cairuz · Phil Blunsom · Acyr Locatelli

Long-context large language models (LLMs) have achieved remarkable advancements, driven by techniques like Rotary Position Embedding (RoPE) (Su et al., 2023) and its extensions (Chen et al., 2023; Liu et al., 2024c; Peng et al., 2023). By adjusting RoPE parameters and incorporating training data with extended contexts, we can train performant models with considerably longer input sequences. However, existing RoPE-based methods exhibit performance limitations when applied to extended context lengths. This paper presents a comprehensive analysis of various attention mechanisms, including RoPE, No Positional Embedding (NoPE), and Query-Key Normalization (QK-Norm), identifying their strengths and shortcomings in long-context modeling. Our investigation identifies distinctive attention patterns in these methods and highlights their impact on long-context performance, providing valuable insights for architectural design. Building on these findings, we propose a novel architecture featuring a hybrid attention mechanism that integrates global and local attention spans. This design not only surpasses conventional RoPE-based transformer models with full attention in both long and short context tasks but also delivers substantial efficiency gains during training and inference.

Poster

S-Crescendo: A Nested Transformer Weaving Framework for Scalable Nonlinear System in S-Domain Representation

Junlang Huang · Chen Hao · Li Luo · Yong Cai · Lexin Zhang · Tianhao Ma · Yitian Zhang · Zhong Guan

Simulating high-order nonlinear systems requires extensive computational resources, especially in modern VLSI backend design, where bifurcation-induced instability and chaos-like transient behaviors pose challenges. We present S-Crescendo, a nested transformer weaving framework that synergizes the S-domain with neural operators for scalable time-domain prediction in high-order nonlinear networks, alleviating the computational bottlenecks of conventional solvers based on the Newton-Raphson method. By leveraging the partial-fraction decomposition of an n-th order transfer function into first-order modal terms with repeated poles and residues, our method bypasses the conventional Jacobian matrix-based iterations and efficiently reduces computational complexity from cubic $O(n^3)$ to linear $O(n)$. The proposed architecture seamlessly integrates an S-domain encoder with an attention-based correction operator to simultaneously isolate the dominant response and adaptively capture higher-order nonlinearities. Validated on order-1 to order-10 networks, our method achieves up to 0.99 test-set $R^2$ accuracy against HSPICE golden waveforms and accelerates simulation by up to 18$\times$, providing a scalable, physics-aware framework for high-dimensional nonlinear modeling.
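
For readers unfamiliar with the decomposition mentioned above, the standard textbook form is shown below for a strictly proper transfer function with distinct poles. The symbols $H$, $h$, $r_i$, and $p_i$ are notation assumed here, not taken from the paper, and the paper's exact parameterization may differ.

```latex
H(s) = \sum_{i=1}^{n} \frac{r_i}{s - p_i}
\quad\Longleftrightarrow\quad
h(t) = \sum_{i=1}^{n} r_i \, e^{p_i t}, \qquad t \ge 0 .

% A pole p_i repeated with multiplicity m_i contributes the terms
\sum_{k=1}^{m_i} \frac{r_{i,k}}{(s - p_i)^{k}}
\quad\Longleftrightarrow\quad
\sum_{k=1}^{m_i} r_{i,k} \, \frac{t^{k-1}}{(k-1)!} \, e^{p_i t} .
```

Each modal term can be evaluated independently, so computing the response at a given time costs a number of operations linear in the order $n$, in line with the $O(n)$ figure quoted in the abstract.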

Poster

SPICED: A Synaptic Homeostasis-Inspired Framework for Unsupervised Continual EEG Decoding

Yangxuan Zhou · Sha Zhao · Jiquan Wang · Haiteng Jiang · Shijian Li · Tao Li · Gang Pan

The human brain achieves a dynamic stability-plasticity balance through synaptic homeostasis, a self-regulatory mechanism that stabilizes critical memory traces while preserving optimal learning capacities. Inspired by this biological principle, we propose SPICED: a neuromorphic framework that integrates the synaptic homeostasis mechanism for unsupervised continual EEG decoding, particularly addressing practical scenarios where new individuals with inter-individual variability emerge continually. SPICED comprises a novel synaptic network that enables dynamic expansion during continual adaptation through three bio-inspired neural mechanisms: (1) critical memory reactivation, which mimics brain functional specificity and selectively activates task-relevant memories to facilitate adaptation; (2) synaptic consolidation, which strengthens these reactivated critical memory traces and enhances their replay prioritization for further adaptations; and (3) synaptic renormalization, which is periodically triggered to weaken global memory traces to preserve learning capacities. The interplay within synaptic homeostasis dynamically strengthens task-discriminative memory traces and weakens detrimental memories. By integrating these mechanisms with a continual learning system, SPICED preferentially replays task-discriminative memory traces that exhibit strong associations with newly emerging individuals, thereby achieving robust adaptations. Meanwhile, SPICED effectively mitigates catastrophic forgetting by suppressing the replay prioritization of detrimental memories during long-term continual learning. Validated on three EEG datasets, SPICED shows its effectiveness. More importantly, SPICED bridges biological neural mechanisms and artificial intelligence through synaptic homeostasis, providing insights into the broader applicability of bio-inspired principles.

Poster

Spurious-Aware Prototype Refinement for Reliable Out-of-Distribution Detection

Reihaneh Zohrabi · Hosein Hasani · Mahdieh Soleymani · Anna Rohrbach · Marcus Rohrbach · Mohammad Hossein Rohban

Out-of-distribution (OOD) detection is crucial for ensuring the reliability and safety of machine learning models in real-world applications, where they frequently face data distributions unseen during training. Despite progress, existing methods are often vulnerable to spurious correlations that mislead models and compromise robustness. To address this, we propose SPROD, a novel prototype-based OOD detection approach that explicitly addresses the challenge posed by unknown spurious correlations. Our post-hoc method refines class prototypes to mitigate bias from spurious features without additional data or hyperparameter tuning, and is broadly applicable across diverse backbones and OOD detection settings. We conduct a comprehensive spurious correlation OOD detection benchmarking, comparing our method against existing approaches and demonstrating its superior performance across challenging OOD datasets, such as CelebA, Waterbirds, UrbanCars, Spurious Imagenet, and the newly introduced Animals MetaCoCo. On average, SPROD improves AUROC by 4.8% and FPR@95 by 9.4% over the second best.

Poster

Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression

Jiarui Jiang · Wei Huang · Miao Zhang · Taiji Suzuki · Liqiang Nie

State-space models (SSMs), particularly Mamba, have emerged as an efficient Transformer alternative with linear complexity for long-sequence modeling. Recent empirical works demonstrate that Mamba's in-context learning (ICL) capabilities are competitive with those of Transformers, a critical capacity for large foundation models. However, theoretical understanding of Mamba’s ICL remains limited, restricting deeper insights into its underlying mechanisms. Even fundamental tasks such as linear regression ICL, widely studied as a standard theoretical benchmark for Transformers, have not been thoroughly analyzed in the context of Mamba. To address this gap, we study the training dynamics of Mamba on the linear regression ICL task. By developing novel techniques tackling non-convex optimization with gradient descent related to Mamba's structure, we establish an exponential convergence rate to the ICL solution, and derive a loss bound that is comparable to that of Transformers. Importantly, our results reveal that Mamba can perform a variant of \textit{online gradient descent} to learn the latent function in context. This mechanism is different from that of Transformers, which are typically understood to achieve ICL through gradient descent emulation. The theoretical results are verified by experimental simulation.
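
For reference, plain online gradient descent on a linear regression task processes the context one example at a time and updates a running weight estimate after each one. The sketch below shows that textbook procedure, not the specific variant the paper attributes to Mamba; the step size, the toy context, and the function names are assumptions.

```python
def online_gradient_descent(examples, dim, lr=0.1):
    """One pass of online gradient descent on squared error.

    After seeing (x, y), the estimate moves along the negative gradient of
    0.5 * (w.x - y)^2 for that single example only."""
    w = [0.0] * dim
    for x, y in examples:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical in-context examples generated by the latent function y = 2*x1 - x2.
context = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([1.0, -1.0], 3.0),
] * 50

w_hat = online_gradient_descent(context, dim=2)
query = [3.0, 1.0]
print(w_hat, predict(w_hat, query))
```

The estimate is updated sequentially and never revisits earlier examples, which is what distinguishes the online procedure from a batch gradient step computed over the whole context at once.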

Poster

Unsupervised Federated Graph Learning

Lele Fu · Tianchi Liao · Sheng Huang · Bowen Deng · zhangchuanfu · Shirui Pan · Chuan Chen

Federated graph learning (FGL) is a privacy-preserving paradigm for modeling distributed graph data, designed to train a powerful global graph neural network. Existing FGL methods predominantly rely on label information during training, while effective FGL in an unsupervised setting remains largely unexplored. In this paper, we address two key challenges in unsupervised FGL: 1) Local models tend to converge in divergent directions due to the lack of shared semantic information across clients, so the first challenge is how to align representation spaces among multiple clients. 2) Conventional federated weighted aggregation easily degrades the performance of the global model, which raises the second challenge, namely how to adaptively learn the global model parameters. In response to these two challenges, we propose a tailored framework named FedPAM, which is composed of two modules: Representation Space Alignment (RSA) and Adaptive Global Parameter Learning (AGPL). RSA leverages a set of learnable anchors to define the global representation space; local subgraphs are then aligned with these anchors through fused Gromov-Wasserstein optimal transport, achieving representation space alignment across clients. AGPL stacks local model parameters into third-order tensors and adaptively integrates the global model parameters in a low-rank tensor space, which facilitates fusing high-order knowledge among clients. Extensive experiments on eight graph datasets demonstrate that the proposed FedPAM is superior to classical and state-of-the-art methods.
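As background for the second challenge, the sketch below illustrates conventional federated weighted aggregation, the baseline the abstract contrasts with: the server averages client parameters with weights proportional to each client's local sample count. It is not FedPAM's AGPL module. The function name `weighted_aggregate` and the flat-list parameter representation are assumptions made for illustration.

```python
def weighted_aggregate(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_params: list of equal-length lists of floats, one per client.
    client_sizes:  list of local sample counts, one per client.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(size / total * params[i]
            for params, size in zip(client_params, client_sizes))
        for i in range(dim)
    ]


# Two clients: the second holds three times as much data as the first.
global_params = weighted_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

The weights here are fixed by sample counts alone, so clients whose local models have drifted apart are still combined linearly. That is the behavior an adaptive scheme would aim to improve on.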


Mexico City Poster Session 4

Don Alberto 4
