Statistical Frontiers in LLMs and Foundation Models

Workshop

Anastasios Angelopoulos · Stephen Bates · Alexander D'Amour · Jessica Hullman · Fanny Yang · Sophia Sun · Tatsunori Hashimoto

Sat 14 Dec, 9 a.m. PST

We propose a workshop on the emerging frontier at the intersection of statistics and foundation models. Rigorous evaluation of large foundation models such as LLMs is necessary for reliable deployment, but it poses a towering challenge due to a lack of datasets and the black-box nature of many such models. The proposed workshop brings together the community working on understanding and improving LLMs with new statistical methodologies, and explores topics including benchmarking, measuring and correcting bias, automatic evaluation, watermarking, model and data auditing, and uncertainty quantification.

Timezone: America/Los_Angeles

Schedule

Sat 9:00 a.m. - 9:30 a.m.	Opening Remarks ( Intro ) > SlidesLive Video	🔗
Sat 9:30 a.m. - 10:15 a.m.	Invited Talk #1: Bernhard Schölkopf ( Invited Talk ) > SlidesLive Video	🔗
Sat 10:15 a.m. - 11:15 a.m.	Unstructured Time ( Unstructured Time ) >	🔗
Sat 11:15 a.m. - 12:00 p.m.	Invited Talk #2: Mihaela van der Schaar ( Invited Talk ) > SlidesLive Video	🔗
Sat 12:00 p.m. - 12:45 p.m.	Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction ( Poster ) > link Link	Drew Nguyen · Reese Pathak · Anastasios Angelopoulos · Stephen Bates · Michael Jordan 🔗
Sat 12:00 p.m. - 12:45 p.m.	Enhancing Semantic Clustering for Uncertainty Quantification & Conformal Prediction by LLMs ( Poster ) > link Link	Ramneet Kaur · Colin Samplawski · Adam Cobb · Anirban Roy · Brian Matejek · Manoj Acharya · Daniel Elenius · Alexander Berenbeim · John Pavlik · Nathaniel Bastian · Susmit Jha 🔗
Sat 12:00 p.m. - 12:45 p.m.	UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models ( Poster ) > link Link	Siyuan Wu · Yue Huang · Gao Chujie · Dongping Chen · Qihui Zhang · Yao Wan · Tianyi Zhou · Xiangliang Zhang · Jianfeng Gao · Chaowei Xiao · Lichao Sun 🔗
Sat 12:00 p.m. - 12:45 p.m.	Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models ( Poster ) > link Link	Negin Raoof · Litu Rout · Giannis Daras · Sujay Sanghavi · Constantine Caramanis · Sanjay Shakkottai · Alex Dimakis 🔗
Sat 12:00 p.m. - 12:45 p.m.	Harnessing Large Language Models for Market Research: A Data-augmentation Approach ( Poster ) > link Link	Mengxin Wang · Dennis Zhang · Heng Zhang 🔗
Sat 12:00 p.m. - 12:45 p.m.	CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++? ( Poster ) > link Link	Vaishnavi Bhargava · Rajat Ghosh · Debojyoti Dutta 🔗
Sat 12:00 p.m. - 12:45 p.m.	Mind the Gap: A Surgical Study on the Self-improvement Capabilities of LLMs ( Poster ) > link Link	Yuda Song · Hanlin Zhang · Udaya Ghai · Carson Eisenach · Sham Kakade · Dean Foster 🔗
Sat 12:00 p.m. - 12:45 p.m.	Protected Test-Time Adaptation via Online Entropy Matching ( Poster ) > link Link	Yarin Bar · Yaniv Romano · Shalev Shaer 🔗
Sat 12:00 p.m. - 12:45 p.m.	Weak-to-Strong Confidence Prediction ( Poster ) > link Link	Yukai Yang · Tracy Zhu · Marco Morucci · Tim G. J. Rudner 🔗
Sat 12:00 p.m. - 12:45 p.m.	Just rephrase it! Uncertainty estimation in closed-source language models via multiple rephrased queries ( Poster ) > link Link	Adam Yang · CHEN CHEN · Konstantinos Pitas 🔗
Sat 12:00 p.m. - 12:45 p.m.	Automated Social Science: Language Models as Scientist and Subjects ( Poster ) > link Link	Kehang Zhu · John Horton · Benjamin Manning 🔗
Sat 12:00 p.m. - 12:45 p.m.	Scheduling in LLM Inference with Blowed-up Memory Constraints ( Poster ) > link Link	Zijie Zhou · Jiashuo Jiang 🔗
Sat 12:00 p.m. - 12:45 p.m.	CLUE: Concept-Level Uncertainty Estimation for Large Language Models ( Poster ) > link Link	Yu-Hsiang Wang · Andrew Bai · Che-Ping Tsai · Cho-Jui Hsieh 🔗
Sat 12:00 p.m. - 12:45 p.m.	AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? ( Poster ) > link Link	Han Bao · Yanbo Wang · Jiayi Ye · Yue Huang · Xiangqi Wang · Xiangliang Zhang 🔗
Sat 12:00 p.m. - 12:45 p.m.	MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark ( Poster ) > link Link	Elliot Epstein · Kaisheng Yao · Jing Li · Xinyi Bai · Hamid Palangi 🔗
Sat 12:00 p.m. - 12:45 p.m.	Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs ( Poster ) > link Link	Alexander von Recum · Christoph Schnabl · Gabor Hollbeck · Marvin von Hagen · Silas Alberti · Philip Blinde 🔗
Sat 12:00 p.m. - 12:45 p.m.	A Statistical Approach to Quantifying LLM Human Alignment ( Poster ) > link Link	Harbin Hong · Liu Leqi · Sebastian Caldas 🔗
Sat 12:00 p.m. - 12:45 p.m.	H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models ( Poster ) > link Link	Nhi Pham · Michael Schott 🔗
Sat 12:00 p.m. - 12:45 p.m.	CriticAL: Model Criticism Automation with Language Models ( Poster ) > link Link	Michael Li · Noah Goodman · Emily Fox 🔗
Sat 12:00 p.m. - 12:45 p.m.	Robust Conformal Prediction Using Privileged Information ( Poster ) > link Link	Shai Feldman · Yaniv Romano 🔗
Sat 12:00 p.m. - 12:45 p.m.	LLMs as Emotion Analyzers for Causal Models: Partial Identification with Fuzzy Interval Data ( Poster ) > link Link	Huidi Ma · Wendao Xue · Yifan Yu 🔗
Sat 12:00 p.m. - 12:45 p.m.	Detecting Watermark Spoofing Attacks ( Poster ) > link Link	Eliot Cowan · Max Daniels 🔗
Sat 12:00 p.m. - 12:45 p.m.	MisMo: More is More in Alignment ( Poster ) > link Link	Benjamin Feuer · Micah Goldblum · Teresa Datta · Raz Besaleli · Samuel Dooley · Max Cembalest · John Dickerson 🔗
Sat 12:00 p.m. - 12:45 p.m.	Learning to Generate Verbalized Confidences ( Poster ) > link Link	Sophia Hager · Nicholas Andrews 🔗
Sat 12:00 p.m. - 12:45 p.m.	Black-box Uncertainty Quantification Method for LLM-as-a-Judge ( Poster ) > link Link	Nico Wagner · Michael Desmond · Rahul Nair · Zahra Ashktorab · Elizabeth Daly · Qian Pan · Martín Santillán Cooper · J Johnson · Werner Geyer 🔗
Sat 12:00 p.m. - 12:45 p.m.	Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP ( Poster ) > link Link	Sedigheh (Sarah) Eslami · Gerard de Melo 🔗
Sat 12:00 p.m. - 12:45 p.m.	FEET: A Framework for Evaluating Embedding Techniques ( Poster ) > link Link	Simon Lee · John Lee 🔗
Sat 12:00 p.m. - 12:45 p.m.	Advancing Conversational Psychotherapy: Integrating Privacy, Dual-Memory, and Domain Expertise with Large Language Models ( Poster ) > link Link	XiuYu Zhang · Zening Luo 🔗
Sat 12:00 p.m. - 12:45 p.m.	Distribution-based sensitivity analysis for large language models ( Poster ) > link Link	Paulius Rauba · Qiyao Wei · Mihaela van der Schaar 🔗
Sat 12:00 p.m. - 12:45 p.m.	To Believe or Not to Believe Your LLM ( Poster ) > link Link	Yasin Abbasi Yadkori · Ilja Kuzborskij · András György · Csaba Szepesvari 🔗
Sat 12:00 p.m. - 12:45 p.m.	SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation ( Poster ) > link Link	Misha Khodak · Lester Mackey · Miro Dudik · Alexandra Chouldechova 🔗
Sat 12:00 p.m. - 12:45 p.m.	LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking ( Poster ) > link Link	Yifan Zeng · Ojas Tendolkar · Raymond Baartmans · Qingyun Wu · Lizhong Chen · Huazheng Wang 🔗
Sat 12:00 p.m. - 12:45 p.m.	Learning to Localize: Practical Algorithms for Online Weighted Conformal Prediction ( Poster ) > link Link	Tiffany Ding · Anastasios Angelopoulos · Michael Jordan · Ryan Tibshirani 🔗
Sat 12:00 p.m. - 12:45 p.m.	Towards Probabilistically-Sound Beam Search with Masked Language Models ( Poster ) > link Link	Anna Sappington · Robert Calef · Creston Brooks · Charlie Cowen-Breen 🔗
Sat 12:00 p.m. - 12:45 p.m.	A Step Towards Mixture of Grader: Statistical Analysis of Existing Automatic Evaluation Metrics ( Poster ) > link Link	Yun Joon Soh · Jishen Zhao 🔗
Sat 12:00 p.m. - 12:45 p.m.	Towards the Effect of Examples on In-Context Learning: A Theoretical Case Study ( Poster ) > link Link	Pengfei He · Yingqian Cui · Han Xu · Hui Liu · Makoto Yamada · Jiliang Tang · Yue XING 🔗
Sat 12:00 p.m. - 12:45 p.m.	Uncertainty-Penalized Direct Preference Optimization ( Poster ) > link Link	Sam Houliston · Alexander Immer · Alizée Pace · Gunnar Rätsch 🔗
Sat 12:00 p.m. - 12:45 p.m.	Pearls from Pebbles: Improved Confidence Functions for Auto-labeling ( Poster ) > link Link	Harit Vishwakarma · Yi Chen · Sui Jiet Tay · Satya Sai Srinath Namburi · Frederic Sala · Ramya Korlakai Vinayak 🔗
Sat 12:00 p.m. - 12:45 p.m.	Consistency-based Black-box Uncertainty Quantification for Text-to-SQL ( Poster ) > link Link	Debarun Bhattacharjya · Balaji Ganesan · Michael Glass · Junkyu Lee · Radu Marinescu · Katya Mirylenka · Xiao Shou 🔗
Sat 12:00 p.m. - 12:45 p.m.	Statistically Valid Information Bottleneck via Multiple Hypothesis Testing ( Poster ) > link Link	Amirmohammad Farzaneh · Osvaldo Simeone 🔗
Sat 12:00 p.m. - 12:45 p.m.	Skilling laws: scaling laws for LLM benchmark performance ( Poster ) > link Link	Felipe Maia Polo · Seamus Somerstep · Leshem Choshen · Yuekai Sun · Mikhail Yurochkin 🔗
Sat 12:00 p.m. - 12:45 p.m.	Monty Hall and Score Optimization in Conformal Prediction to Improve LLMs for MCQs ( Poster ) > link Link	Harit Vishwakarma · Alan Mishler · Thomas Cook · Niccolo Dalmasso · Natraj Raman · Sumitra Ganesh 🔗
Sat 12:00 p.m. - 12:45 p.m.	A teacher-teacher framework for clinical language representation learning ( Poster ) > link Link	Feiqing Huang · Shenghan Zhang · Sara Sweet · Tianxi Cai 🔗
Sat 12:00 p.m. - 12:45 p.m.	Hessian-Free Laplace in Bayesian Deep Learning ( Poster ) > link Link	James McInerney · Nathan Kallus 🔗
Sat 12:00 p.m. - 12:45 p.m.	When is Differentially Private Finetuning Actually Private? ( Poster ) > link Link	Roy Rinberg · Martin Pawelczyk 🔗
Sat 12:00 p.m. - 12:45 p.m.	Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees ( Poster ) > link Link	Yu Gui · Ying Jin · Zhimei Ren 🔗
Sat 12:00 p.m. - 12:45 p.m.	Towards Optimal Statistical Watermarking ( Poster ) > link Link	Baihe Huang · Hanlin Zhu · Banghua Zhu · Kannan Ramchandran · Michael Jordan · Jason Lee · Jiantao Jiao 🔗
Sat 12:00 p.m. - 12:45 p.m.	Optimizing Adversarial Samples for Tighter Privacy Auditing in Final Model-Only Settings ( Poster ) > link Link	Sangyeon Yoon · Wonje Jeung · Albert No 🔗
Sat 12:00 p.m. - 12:45 p.m.	ICScore: Metrics for Evaluating Interestingness and Creativity of Stories ( Poster ) > link Link	Junha Lee · Jaeshin Cho · Youngjin Cho · Hyewon Jin · Hyemin Lee · Min Song 🔗
Sat 12:00 p.m. - 12:45 p.m.	Conformal Reasoning: Uncertainty Estimation in Interactive Environments ( Poster ) > link Link	Eric Frankel · Stella Li · Lillian Ratliff · Yulia Tsvetkov · Sewoong Oh · Pang Wei Koh 🔗
Sat 12:00 p.m. - 12:45 p.m.	Adaptive and Robust Watermark for Generative Tabular Data ( Poster ) > link Link	Dung Ngo · Daniel Scott · Saheed Obitayo · Vamsi Potluru · Manuela Veloso 🔗
Sat 12:00 p.m. - 12:45 p.m.	Poster Session #1 ( Poster Session ) >	🔗
Sat 2:00 p.m. - 2:45 p.m.	Invited Talk #3: Weijie Su ( Invited Talk ) > SlidesLive Video	🔗
Sat 3:00 p.m. - 3:45 p.m.	Invited Talk #4: Virginia Smith ( Invited Talk ) > SlidesLive Video	🔗
Sat 3:45 p.m. - 4:30 p.m.	Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs ( Poster ) > link Link	Ruijia Niu · Dongxia Wu · Rose Yu · Yian Ma 🔗
Sat 3:45 p.m. - 4:30 p.m.	An empirical study of in-context uncertainty quantification with conformal prediction ( Poster ) > link Link	Zhe Huang · Simone Rossi · Rui Yuan · Thomas Hannagan 🔗
Sat 3:45 p.m. - 4:30 p.m.	Evaluating language models as risk scores ( Poster ) > link Link	André F. Cruz · Moritz Hardt · Celestine Mendler-Dünner 🔗
Sat 3:45 p.m. - 4:30 p.m.	A Watermark for Black-Box Language Models ( Poster ) > link Link	Dara Bahri · John Wieting · Dana Alon · Donald Metzler 🔗
Sat 3:45 p.m. - 4:30 p.m.	Mitigating Hallucination in Large Language Models with Explanatory Prompting ( Poster ) > link Link	Alexander Braverman · Weitong Zhang · Quanquan Gu 🔗
Sat 3:45 p.m. - 4:30 p.m.	Source Attribution for Large Language Model-Generated Data ( Poster ) > link Link	Xinyang Lu · Jingtan Wang · Zitong Zhao · Zhongxiang Dai · Chuan Sheng Foo · See-Kiong Ng · Bryan Kian Hsiang Low 🔗
Sat 3:45 p.m. - 4:30 p.m.	Mitigating LLM Hallucinations via Conformal Abstention ( Poster ) > link Link	Yasin Abbasi Yadkori · Ilja Kuzborskij · David Stutz · András György · Adam Fisch · Arnaud Doucet · Iuliya Beloshapka · Wei-Hung Weng · Yao-Yuan Yang · Csaba Szepesvari · Taylan Cemgil · Nenad Tomasev 🔗
Sat 3:45 p.m. - 4:30 p.m.	SCIURus: Shared Circuits for Interpretable Uncertainty Representations in Language Models ( Poster ) > link Link	Carter Teplica · Yixin Liu · Arman Cohan · Tim G. J. Rudner 🔗
Sat 3:45 p.m. - 4:30 p.m.	Taming False Positives in Out-of-Distribution Detection with Human Feedback ( Poster ) > link Link	Harit Vishwakarma · Heguang Lin · Ramya Korlakai Vinayak 🔗
Sat 3:45 p.m. - 4:30 p.m.	Length Optimization in Conformal Prediction ( Poster ) > link Link	Shayan Kiyani · George J. Pappas · Hamed Hassani 🔗
Sat 3:45 p.m. - 4:30 p.m.	Conformal Prediction Adaptive to Unknown Subpopulation Shifts ( Poster ) > link Link	Nien-Shao Wang · Sai Praneeth Karimireddy 🔗
Sat 3:45 p.m. - 4:30 p.m.	Bayesian Concept Bottleneck Models with LLM Priors ( Poster ) > link Link	Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan 🔗
Sat 3:45 p.m. - 4:30 p.m.	Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks ( Poster ) > link Link	Rachel Longjohn · Giri Gopalan · Emily Casleton 🔗
Sat 3:45 p.m. - 4:30 p.m.	A Framework for Evaluating LLMs Under Task Indeterminacy ( Poster ) > link Link	Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova 🔗
Sat 3:45 p.m. - 4:30 p.m.	Evaluating Generative AI Systems is a Social Science Measurement Challenge ( Poster ) > link Link	Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs 🔗
Sat 3:45 p.m. - 4:30 p.m.	Privately Learning from Graphs with Applications in Fine-tuning Large Pretrained Models ( Poster ) > link Link	Haoteng YIN · Rongzhe Wei · Eli Chien · Pan Li 🔗
Sat 3:45 p.m. - 4:30 p.m.	Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks ( Poster ) > link Link	Zizhang Chen · Pengyu Hong · Sandeep Madireddy 🔗
Sat 3:45 p.m. - 4:30 p.m.	Predictive Inference in Multi-environment Scenarios ( Poster ) > link Link	John Duchi · Suyash Gupta · Kuanhao Jiang · Pragya Sur 🔗
Sat 3:45 p.m. - 4:30 p.m.	Back-to-Basics Revisited: Benchmarking an Expanded Set of RLHF Algorithms ( Poster ) > link Link	Lucas Spangher · Rama Kumar Pasumarthi · Nick Masiewicki · Peter Grabowski · Eugene Ie · William Arnold · Daniele Calandriello · Bilal Piot 🔗
Sat 3:45 p.m. - 4:30 p.m.	Conformal Language Model Reasoning with Coherent Factuality ( Poster ) > link Link	Maya Gambhir · Maxon Rubin-Toles · Keshav Ramji · Aaron Roth · Surbhi Goel 🔗
Sat 3:45 p.m. - 4:30 p.m.	Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe ( Poster ) > link Link	Ezgi Korkmaz 🔗
Sat 3:45 p.m. - 4:30 p.m.	ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents ( Poster ) > link Link	Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal 🔗
Sat 3:45 p.m. - 4:30 p.m.	Formal Analysis and Unification of Generalization in Deep Reinforcement Learning ( Poster ) > link Link	Ezgi Korkmaz 🔗
Sat 3:45 p.m. - 4:30 p.m.	Interactive Semantic Interventions for VLMs: A Human-in-the-Loop Approach to Interpretability ( Poster ) > link Link	Lukas Klein · Kenza Amara · Carsten Lüth · Hendrik Strobelt · Mennatallah El-Assady · Paul Jaeger 🔗
Sat 3:45 p.m. - 4:30 p.m.	MarkMyWords: Analyzing and Evaluating Language Model Watermarks ( Poster ) > link Link	Julien Piet · Chawin Sitawarin · Vivian Fang · Norman Mu · David Wagner 🔗
Sat 3:45 p.m. - 4:30 p.m.	Deep Limit Model-free Prediction in Regression ( Poster ) > link Link	Kejin Wu · Dimitris Politis 🔗
Sat 3:45 p.m. - 4:30 p.m.	Fast yet Safe: Early-Exiting with Risk Control ( Poster ) > link Link	Metod Jazbec · Alexander Timans · Tin Hadži Veljković · Kaspar Sakmann · Dan Zhang · Christian Andersson Naesseth · Eric Nalisnick 🔗
Sat 3:45 p.m. - 4:30 p.m.	Conversational Question-Answering for process task guidance in manufacturing ( Poster ) > link Link	Ramesh Manuvinakurike · Elizabeth Watkins · Celal Savur · Anthony Rhodes · Sovan Biswas · Richard Beckwith · Gesem Mejia · Saurav Sahay · Giuseppe Raffa · Lama Nachman 🔗
Sat 3:45 p.m. - 4:30 p.m.	Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection ( Poster ) > link Link	Giorgos Iacovides · Wuyang Zhou · Danilo Mandic 🔗
Sat 3:45 p.m. - 4:30 p.m.	Auto-Evaluation with Few Labels through Post-hoc Regression ( Poster ) > link Link	Benjamin Eyre · David Madras 🔗
Sat 3:45 p.m. - 4:30 p.m.	vTune: Verifiable fine-tuning Through Backdooring ( Poster ) > link Link	Eva Zhang · Akilesh Potti · Micah Goldblum 🔗
Sat 3:45 p.m. - 4:30 p.m.	Diffusion-Powered Image Super-Resolution That You Can Actually Trust ( Poster ) > link Link	Daniel Csillag · Eduardo Adame · Guilherme Tegoni Goedert 🔗
Sat 3:45 p.m. - 4:30 p.m.	A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners ( Poster ) > link Link	Bowen Jiang · Yangxinyu Xie · Zhuoqun Hao · Xiaomeng Wang · Tanwi Mallick · Weijie Su · Camillo Taylor · Dan Roth 🔗
Sat 3:45 p.m. - 4:30 p.m.	Scalable Subsampling Inference for Deep Neural Networks ( Poster ) > link Link	Kejin Wu · Dimitris Politis 🔗
Sat 3:45 p.m. - 4:30 p.m.	HuLLMI: Human vs. LLM Identification with Explainability ( Poster ) > link Link	Prathamesh Dinesh Joshi · Sahil Pocker · Raj Dandekar · Rajat Dandekar · Sreedath Panat 🔗
Sat 3:45 p.m. - 4:30 p.m.	A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts ( Poster ) > link Link	Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach 🔗
Sat 3:45 p.m. - 4:30 p.m.	Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation ( Poster ) > link Link	Siyuan Wang · Zhuohan Long · Zhihao Fan · Xuanjing Huang · zhongyu wei 🔗
Sat 3:45 p.m. - 4:30 p.m.	Obtaining Conformal Prediction-like guarantees by standard concentration: an observation ( Poster ) > link Link	Emmanouil Seferis 🔗
Sat 3:45 p.m. - 4:30 p.m.	Reexpress: Similarity-Distance-Magnitude Calibration ( Poster ) > link Link	Allen Schmaltz 🔗
Sat 3:45 p.m. - 4:30 p.m.	LLMs for Causal Inference ( Poster ) > link Link	Jonathan Choi 🔗
Sat 3:45 p.m. - 4:30 p.m.	Uncertainty Quantification for Inverse Problems with Generative Priors under Distribution Shift ( Poster ) > link Link	Sara Fridovich-Keil 🔗
Sat 3:45 p.m. - 4:30 p.m.	Estimating and Correcting for Misclassification Error in Empirical Textual Research ( Poster ) > link Link	Jonathan Choi 🔗
Sat 3:45 p.m. - 4:30 p.m.	Are Police Biased? An NLP Approach ( Poster ) > link Link	Jonathan Choi 🔗
Sat 3:45 p.m. - 4:30 p.m.	Poster Session #2 ( Poster Session ) >	🔗
Sat 4:30 p.m. - 5:15 p.m.	Closing remarks and Discussions ( Discussions ) >	🔗