Workshop
Foundation Models for Science: Progress, Opportunities, and Challenges
Wuyang Chen · Pu Ren · Elena Massara · Yongji Wang · N. Benjamin Erichson · Laurence Perreault-Levasseur · Bo Li · Swarat Chaudhuri
Sun 15 Dec, 8:30 a.m. PST
The integration of artificial intelligence (AI) and machine learning (ML) into scientific discovery represents a pivotal shift in traditional methodologies. Historically, scientific exploration has been systematic and logical, but AI and ML promise to transform fundamental discoveries. This shift enhances interdisciplinary dialogue and stimulates innovative problem-solving, enriching the scientific community's ability to tackle complex problems. Foundation models, such as GPT-3 and CLIP, have revolutionized computer vision and natural language processing, providing versatile, pre-trained bases for various applications. Leveraging these models addresses critical challenges like long-term planning and multi-modal reasoning, essential for applications in robotics and dialogue systems. The integration of AI-for-Science and foundation models offers a transformative force in scientific domains, solving complex problems and enabling domain-specific adaptations. This synergy is poised to radically improve the modeling of complex phenomena, making it a crucial investment for future scientific advancements. This workshop aims to bring together experts to discuss and collaborate on transformative questions and challenges in advancing scientific problems through foundation models.
Schedule
Sun 8:30 a.m. - 9:10 a.m. | Invited Talk 1: Paris Perdikaris (Invited Talk) | Paris Perdikaris
Sun 9:45 a.m. - 10:25 a.m. | Invited Talk 2: Michael Mahoney (Invited Talk) | Michael Mahoney
Sun 10:30 a.m. - 11:10 a.m. | Invited Talk 3: Laure Zanna (Invited Talk) | Laure Zanna
Sun 11:15 a.m. - 11:55 a.m. | Invited Talk 4: Shirley Ho (Invited Talk) | Shirley Ho
Sun 12:00 p.m. - 12:10 p.m. | Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval (Oral) | Philip Fradkin · Puria Azadi Moghadam · Karush Suri · Frederik Wenkel · Maciej Sypetkowski · Dominique Beaini
Sun 12:10 p.m. - 12:20 p.m. | ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy (Oral) | Kian Kenyon-Dean · Jerry Wang · John Urbanik · Konstantin Donhauser · Jason Hartford · Saber Saberian · Nil Sahin · Ihab Bendidi · Safiye Celik · Marta Fay · Juan Rodriguez · Imran Haque · Oren Kraus
Sun 12:20 p.m. - 12:30 p.m. | GFlowNet Pretraining with Inexpensive Rewards (Oral) | Mohit Pandey · Gopeshh Subbaraj · Emmanuel Bengio
Sun 2:20 p.m. - 3:00 p.m. | Invited Talk 5: Max Welling (Invited Talk) | Max Welling
Sun 3:30 p.m. - 4:10 p.m. | Invited Talk 6: Danielle Maddix Robinson (Invited Talk) | Danielle Maddix
Sun 4:15 p.m. - 4:25 p.m. | Towards Interpretable Scientific Foundation Models: Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts (Oral) | Charles O'Neill · Christine Ye · Kartheik Iyer · John Wu
Sun 4:25 p.m. - 4:35 p.m. | Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting (Oral) | Zhiwei Zhang · Shaojun E · Fandong Meng · Jie Zhou · Wenjuan Han
Sun 4:35 p.m. - 4:45 p.m. | Learning temperature-aware representations from millions of annotated protein sequences (Oral) | Mingchen Li · Liang Zhang · Zilan Wang · Bozitao Zhong · Pan Tan · Jiabei Cheng · Bingxin Zhou · Liang Hong · Huiqun Yu
- | Provable in-context learning of linear systems and linear elliptic PDEs with transformers (Poster) | Frank Cole · Yulong Lu · Tianhao Zhang · Riley O'Neill
- | Specialized Foundation Models Struggle to Beat Traditional Supervised Learning Baselines (Poster) | Ritvik Gupta · Zongzhe Xu · Wenduo Cheng · Alexander Shen · Junhong Shen · Ameet Talwalkar · Misha Khodak
- | Uncertainty and Generalizability in Foundation Models for Earth Observation (Poster) | Raul Ramos-Pollán · Freddie Kalaitzis · Karthick Panner Selvam
- | Self-supervised Multimodal Model for Astronomy (Poster) | Mariia Rizhko · Joshua Bloom
- | In-Context Learning for Function Approximation with DeepSet-ONet (Poster) | Shao-Ting Chiu · Junyuan Hong · Ulisses M. Braga-Neto
- | Vision foundation models: can they be applied to astrophysics data? (Poster) | Erica Lastufka · Mariia Drozdova · Vitaliy Kinakh · Slava Voloshynovskiy
- | A COMPARATIVE STUDY OF NEURAL ODE AND UNIVERSAL ODE MODELS IN SOLVING CHANDRASEKHAR’S WHITE DWARF EQUATION. (Poster) | Raymundo Vazquez Martinez · Raj Dandekar · Rajat Dandekar · Sreedath Panat
- | Leveraging foundation models for data-limited ecological applications (Poster) | Kyle Doherty · Max Gurinas · Erik Samsoe · Charles Casper · Beau Larkin · Philip Ramsey · Brandon Trabucco · Ruslan Salakhutdinov
- | Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences (Poster) | Niklas Schmidinger · Lisa Schneckenreiter · Philipp Seidl · Johannes Schimunek · Sohvi Luukkonen · Pieter-Jan Hoedt · Johannes Brandstetter · Andreas Mayr · Sepp Hochreiter · Günter Klambauer
- | Generating and Validating Agent and Environment Code for Simulating Realistic Personality Profiles with Large Language Models (Poster) | Nathan Cloos · M Ganesh Kumar · Adam Manoogian · Christopher Cueva · Shawn Rhoads
- | SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature (Poster) | David Wadden · Kejian Shi · Jacob Morrison · Aakanksha Naik · Shruti Singh · Nitzan Barzilay · Kyle Lo · Tom Hope · Luca Soldaini · Zejiang Shen · Doug Downey · Hannaneh Hajishirzi · Arman Cohan
- | VSMNO: Solving PDE by Utilizing Spectral Patterns of Different Neural Operators (Poster) | Fengrui Jing · Hongzhen Ding · Taosong
- | Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires (Poster) | Paidamoyo Chapfuwa · Ilker Demirel · Lorenzo Pisani · Javier Zazo · Elon Portugaly · H. Zahid · Julia Greissl
- | Enhancing Detail Recovery in ICF Radiographs: A Transformer-based Approach with ViXReg (Poster) | Nga T Nguyen-Fotiadis · Bradley Wolfe · Zhehui Wang
- | Small Molecule Optimization with Large Language Models (Poster) | Menua Bedrosian · Philipp Guevorguian · Tigran Fahradyan · Gayane Chilingaryan · Hrant Khachatrian · Armen Aghajanyan
- | Scientific Knowledge Graph and Ontology Generation using Open Large Language Models (Poster) | Alexandru Oarga · Matthew Hart · Andres M Bran · Magdalena Lederbauer · Philippe Schwaller
- | Metalic: Meta-Learning In-Context with Protein Language Models (Poster) | Jacob Beck · Shikha Surana · Manus McAuliffe · Oliver Bent · Tom Barrett · Juan Jose Garau-Luis · Paul Duckworth
- | SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding (Poster) | Sihang Li · Jin Huang · Jiaxi Zhuang · Yaorui Shi · Cai Xiaochen · Mingjun Xu · Xiang Wang · Linfeng Zhang · Guolin Ke · Hengxing Cai
- | CLOUD: A Scalable Scientific Foundation Model for Crystal Representation Learning (Poster) | Changwen Xu · Zhu · Venkatasubramanian Viswanathan
- | A Mamba-Based Foundation Model for Chemistry (Poster) | Emilio Vital Brazil · Eduardo Soares · Victor Yukio Shirasuna · Renato Cerqueira · Dmitry Zubarev · Kristin Schmidt
- | MAMORX: Multi-agent Multi-Modal Scientific Review Generation with External Knowledge (Poster) | Guanchao Wang · Pawin Taechoyotin · Tong Zeng · Bradley Sides · Daniel Acuna
- | ChemDFM: A Large Language Foundation Model for Chemistry (Poster) | Zihan Zhao · Da Ma · Lu Chen · Liangtai Sun · Zihao Li · Yi Xia · Hongshen Xu · Zichen Zhu · Su Zhu · Shuai Fan · Guodong Shen · Kai Yu · Xin Chen
- | Bridging biomolecular modalities for knowledge transfer in bio-language models (Poster) | Mangal Prakash · Artem Moskalev · Peter DiMaggio · Steven Combs · Tommaso Mansi · Justin Scheer · Rui Liao
- | Improving generalisability of 3D binding affinity models in low data regimes (Poster) | Julia Milena Buhmann · Ward Haddadin · Alan Bilsland · Lukáš Pravda · Hagen Triendl
- | AtmosArena: Benchmarking Foundation Models for Atmospheric Sciences (Poster) | Tung Nguyen · Prateik Sinha · Advit Deepak · Karen A McKinnon · Aditya Grover
- | SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems (Poster) | Patrick Emami · Zhaonan Li · Saumya Sinha · Truc Nguyen
- | SciDFM: A Large Language Model with Mixture-of-Experts for Science (Poster) | Liangtai Sun · Danyu Luo · Da Ma · Zihan Zhao · BaocaiChen · Zhennan Shen · Su Zhu · Lu Chen · Xin Chen · Kai Yu
- | Survey: Adaptive Physics-informed Neural Networks (Poster) | Edgar Torres Rios · Mathias Niepert
- | Contextualizing biological perturbation experiments through language (Poster) | Menghua Wu · Russell Littman · Jacob Levine · Lin Qiu · Tommaso Biancalani · David Richmond · Jan-Christian Huetter
- | Assessing interaction recovery of predicted protein-ligand poses (Poster) | David Errington · Constantin Schneider · Cédric Bouysset · Frédéric Dreyer
- | Generative Models in Protein Engineering: A Comprehensive Survey (Poster) | Xinhui Chen · Yiwen Yuan · Joseph Liu · Chak Tou Leong · Xiaoye Zhu · Jiaqi Chen
- | IgBlend: Unifying 3D Structure and Sequence for Antibody LLMs (Poster) | Cédric Malherbe · Talip Ucar
- | SeisLM: a Foundation Model for Seismic Waveforms (Poster) | Tianlin Liu · Jannes Münchmeyer · Laura Laurenti · Chris Marone · Maarten V. de Hoop · Ivan Dokmanić
- | Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks (Poster) | Victor Yukio Shirasuna · Eduardo Soares · Emilio Vital Brazil · Karen Fiorella Gutierrez · Renato Cerqueira · Dmitry Zubarev · Kristin Schmidt
- | Can we pre-train ICL-based SFMs for the zero-shot inference of the 1D CDR problem with noisy data? (Poster) | Mingu Kang · Dongseok Lee · Woojin Cho · Kookjin Lee · Anthony Gruber · Nathaniel Trask · Youngjoon Hong · Noseong Park
- | Maven: A Multimodal Foundation Model for Supernova Science (Poster) | Gemma Zhang · Thomas Helfer · Alex Gagliano · Siddharth Mishra-Sharma · V Villar
- | ProtDiff: Function-Conditioned Masked Diffusion Models for Robust Directed Protein Generation (Poster) | Vishrut Thoutam
- | Understanding Protein-DNA Interactions by Paying Attention to Protein and Genomics Foundation Models (Poster) | Dhruva Rajwade · Erica Wang · Aryan Satpathy · Alexander Brace · Hongyu Guo · Arvind Ramanathan · Shengchao Liu · Animashree Anandkumar
- | Multi-View Mixture-of-Experts for Predicting Molecular Properties Using SMILES, SELFIES, and Graph-Based Representations (Poster) | Eduardo Soares · Indra Priyadarsini S · Emilio Vital Brazil · Victor Yukio Shirasuna · Seiji Takeda
- | A Foundation Model for Metagenomic Sequences (Poster) | Ollie Liu · sami jaghouar · Johannes Hagemann · Jeff Kaufman · Willie Neiswanger
- | A Large Encoder-Decoder Polymer-Based Foundation Model (Poster) | Eduardo Soares · Nathaniel Park · Emilio Vital Brazil · Victor Yukio Shirasuna
- | OPI: An Open Instruction Dataset for Adapting Large Language Models to Protein-Related Tasks (Poster) | Hongwang Xiao · wenjun lin · Hui Wang · Zheng Liu · Qiwei Ye
- | BiRNA-BERT: Adaptive Tokenization for Efficient RNA Language Modeling (Poster) | Toki Tahmid · Haz Sameen Shahgir · Sazan Mahbub · Yue Dong · Md. Shamsuzzoha Bayzid
- | Solaris: A Foundation Model for the Sun (Poster) | Harris Abdul Majid · Pietro Sittoni · Francesco Tudisco
- | Is Tokenization Needed for Masked Particle Modelling? (Poster) | Matthew Leigh · Samuel Klein · Francois Charton · Tobias Golling · Lukas Heinrich · Michael Kagan · Margarita Osadchy
- | DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis (Poster) | Hamidreza Eivazi Kourabbaslou · André Hebenbrock · Raphael Ginster · Steffen Blömeke · Stefan Wittek · Christoph Hermann · Thomas Spengler · Thomas Turek · Andreas Rausch
- | ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary (Poster) | Yutong Li · Lu Chen · Aiwei Liu · Kai Yu · Lijie Wen
- | Understanding Drought through Spatial-Temporal Learning (Poster) | Xuwei Tan · Qian Zhao · Yanlan Liu · Xueru Zhang
- | LLM Agent for Fire Dynamics Simulations (Poster) | Leidong Xu · Danyal Mohaddes · Yi Wang
- | Language Models for Text-guided Protein Evolution (Poster) | Zhanghan Ni · Shengchao Liu · Animashree Anandkumar
- | Cell ontology guided transcriptome foundation model (Poster) | XINYU YUAN · Zhihao Zhan · Zuobai Zhang · Manqi Zhou · Jianan Zhao · Boyu Han · Yue Li · Jian Tang
- | Developing a Foundation Model for Predicting Material Failure (Poster) | Agnese Marcato · Javier E. Santos · Aleksandra Pachalieva · Kai Gao · Ryley Hill · Esteban Rougier · Qinjun Kang · Jeffrey Hyman · Abigail Hunter · Janel Chua · Earl Lawrence · Hari Viswanathan · Daniel O'Malley
- | A Safety-aware Framework for Generative Enzyme Design with Foundation Models (Poster) | Xiaoyi Fu · Tao Han · Yuan Yao · Song Guo
- | Scale-consistent learning with neural operators (Poster) | Zongyi Li · Samuel Lanthaler · Catherine Deng · Yixuan Wang · Kamyar Azizzadenesheli · Animashree Anandkumar
- | Solving Out-of-Distribution Challenges in Optical Foundation Models using Self-Improving Data Augmentation (Poster) | Mingqian Ma · Taigao Ma · L. Jay Guo
- | Pulsar Candidate Classification with Multimodal Large Language Models (Poster) | Fuyong Zhao · Yuyang Li · Yanhao Wang · Hui Li · Mei Chen · Panfeng Chen · Ningchen Sun · Cunshi Wang · Jifeng Liu
- | PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics (Poster) | Yuxuan Liu · Jingmin Sun · Xinjie He · Griffin Pinney · Zecheng Zhang · Hayden Schaeffer
- | Stylish and Functional: Guided Interpolation Subject to Physical Constraints (Poster) | Yan-Ying Chen · Nikos Arechiga · Chenyang Yuan · Matthew Hong · Matt Klenk · Charlene C. Wu
- | BarcodeMamba: State Space Models for Biodiversity Analysis (Poster) | Tiancheng Gao · Graham Taylor
- | Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery (Poster) | Yifan Wu · Yuntao Yang · Zirui Liu · Zhao Li · Khushbu Pahwa · Rongbin Li · W. Jim Zheng · Xia Hu · Zhaozhuo Xu
- | Adapting Segment Anything Model (SAM) to Experimental Datasets via Fine-Tuning on GAN-based Simulation: A Case Study in Additive Manufacturing (Poster) | Anika Tabassum · Amir K Ziabari
- | Learning temperature-aware representations from millions of annotated protein sequences (Poster) | Mingchen Li · Liang Zhang · Zilan Wang · Bozitao Zhong · Pan Tan · Jiabei Cheng · Bingxin Zhou · Liang Hong · Huiqun Yu
- | Towards Interpretable Scientific Foundation Models: Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts (Poster) | Charles O'Neill · Christine Ye · Kartheik Iyer · John Wu
- | Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting (Poster) | Zhiwei Zhang · Shaojun E · Fandong Meng · Jie Zhou · Wenjuan Han
- | GFlowNet Pretraining with Inexpensive Rewards (Poster) | Mohit Pandey · Gopeshh Subbaraj · Emmanuel Bengio
- | ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy (Poster) | Kian Kenyon-Dean · Jerry Wang · John Urbanik · Konstantin Donhauser · Jason Hartford · Saber Saberian · Nil Sahin · Ihab Bendidi · Safiye Celik · Marta Fay · Juan Rodriguez · Imran Haque · Oren Kraus
- | Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval (Poster) | Philip Fradkin · Puria Azadi Moghadam · Karush Suri · Frederik Wenkel · Maciej Sypetkowski · Dominique Beaini
- | SpectraFM: Tuning into Stellar Foundation Models (Poster) | Nolan Koblischke · Jo Bovy