
Workshop
Sara Hooker · Rosanne Liu · Pablo Samuel Castro · FatemehSadat Mireshghallah · Sunipa Dev · Benjamin Rosman · João Madeira Araújo · Savannah Thais · Sunny Sanyal · Tejumade Afonja · Swapneel Mehta · Tyler Zhu

Sat Dec 03 06:45 AM -- 03:00 PM (PST) @ Room 394-395

This workshop aims to discuss the challenges and opportunities of expanding research collaborations in light of the changing landscape of where, how, and by whom research is produced. Progress toward democratizing AI research has been centered around making knowledge (e.g. class materials), established ideas (e.g. papers), and technologies (e.g. code, compute) more accessible. However, open, online resources are only part of the equation. Growth as a researcher requires not only learning by consuming information individually, but hands-on practice whiteboarding, coding, plotting, debugging, and writing collaboratively, with either mentors or peers. Of course, making "collaborators" more universally accessible is fundamentally more difficult than, say, ensuring all can access arXiv papers because scaling people and research groups is much harder than scaling websites. Can we nevertheless make access to collaboration itself more open?

#### Schedule

**Sat 6:45 a.m. - 7:00 a.m.** Opening Remarks

**Sat 7:00 a.m. - 7:30 a.m.** Invited Keynote 1 (Keynote): Yoshua Bengio

**Sat 7:30 a.m. - 8:15 a.m.** Panel 1: The Rise of Community-driven Research (Discussion Panel)

**Sat 8:15 a.m. - 8:30 a.m.** Coffee Break

**Sat 8:30 a.m. - 8:45 a.m.** Petals: Collaborative Inference and Fine-tuning of Large Models (Oral)

Many NLP tasks benefit from using large language models (LLMs) that often have more than 100 billion parameters. With the release of BLOOM-176B and OPT-175B, everyone can download pretrained models of this scale. Still, using these models requires high-end hardware unavailable to many researchers. In some cases, LLMs can be used more affordably via RAM offloading or hosted APIs. However, these techniques have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research. In this work, we propose Petals, a system for collaborative inference and fine-tuning of large models that joins the resources of multiple parties. We demonstrate that this strategy significantly outperforms offloading for very large models, running inference of BLOOM-176B on consumer GPUs at $\approx$ 1 step per second. Unlike most inference APIs, Petals also natively exposes the hidden states of served models, allowing its users to train and share custom model extensions based on efficient fine-tuning methods.

*Alexander Borzunov · Dmitry Baranchuk · Tim Dettmers · Max Ryabinin · Younes Belkada · Artem Chumachenko · Pavel Samygin · Colin Raffel*

**Sat 8:45 a.m. - 9:00 a.m.** EleutherAI: Going Beyond "Open Science" to "Science in the Open" (Oral)

Over the past two years, EleutherAI has established itself as a radical initiative aimed at both promoting open-source research and conducting research in a transparent, openly accessible, and collaborative manner. EleutherAI's approach to research goes beyond transparency: by doing research entirely in public, anyone in the world can observe and contribute at every stage. Our work has been received positively and has resulted in several high-impact projects in Natural Language Processing and other fields. In this paper, we describe our experience doing public-facing machine learning research, the benefits we believe this approach brings, and the pitfalls we have encountered.

*Jason Phang · Herbie Bradley · Leo Gao · Louis Castricato · Stella Biderman*

**Sat 9:00 a.m. - 9:15 a.m.** Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives (Oral)

Open Artificial Intelligence (Open AI) collaboratives offer alternative pathways for how AI can be developed beyond well-resourced technology companies, and for who can be a part of the process. To understand how and why they work and what additionality they bring to the landscape, we focus on three such communities, each centered on a different kind of activity around AI: building models (the BigScience workshop), tools and ways of working (The Turing Way), and ecosystems (Mozilla Festival's Building Trustworthy AI Working Group). First, we document the community structures that facilitate these distributed, volunteer-led teams, comparing the collaboration styles that drive each group towards its specific goals. Through interviews with community leaders, we map user journeys for how members discover, join, contribute, and participate. Ultimately, this paper aims to highlight the diversity of AI work and workers that have come forth through these collaborations and how they offer a broader practice of openness to the AI space.

*Jennifer Ding · Christopher Akiki · Yacine Jernite · Anne Steele · Temi Popo*

**Sat 9:15 a.m. - 10:15 a.m.** Poster Session

**A Lesson From a Student Community: To Each Their Role (Poster)**

As the frontiers of machine learning (ML) continue to expand, the gap between the public understanding of ML and state-of-the-art research widens. While laboratory researchers benefit from easily accessible and encouraged collaboration with domain experts, the same cannot be said of newcomers to the field. At the undergraduate level, where socioeconomic inequality means some students have stronger backgrounds than their peers, increasing the accessibility of practical, hands-on opportunities in machine learning is essential to narrowing this gap. In this paper, we detail the approach of Machine Learning @ Berkeley (ML@B), a university-based undergraduate student organization that aims to bridge this gap by encouraging collaboration with established figures in the field as well as within the organization itself. We hope that the perspectives we gained from ML@B provide insights into successfully integrating undergraduates into a technical environment and fostering an academic culture that encourages collaboration.

*Lizzie Lau · Valeriy Rotan · Ashwin Reddy · Michael Equi*

**Understanding Post-Baccalaureate Cultural Gaps: Building Equitable Ecosystems for AI Research and What We Can Learn from Federal TRIO Programs (Poster)**

This paper surveys the problem space around cultural barriers in research collaboration, specifically for machine learning (ML). We review (1) unequal representation in ML/AI and STEM, (2) socioeconomic influences on the retention of scientists and researchers, and (3) existing educational opportunity programs for people from disadvantaged backgrounds, with an emphasis on post-baccalaureate support. We provide evidence that students from disadvantaged backgrounds not only experience barriers to gaining intellectual and technical expertise but also often experience cultural gaps that impede their participation in graduate programs and inclusion in research collaborations. We discuss relevant research on cultural differences and the ways that some U.S. Federal TRIO programs explicitly address them, highlighting standardization as one means of demystifying academic and research cultures. We conclude with recommendations toward understanding post-education culture gaps, with the goal of finding better solutions for increasing diversity in research collaborations.

*Sherol Chen · Jared Ali · Rick Sommer · Danny Kim · Jean Griffin · Tammy Chang · Julia Choi · Kwame Webster · Warren Yamashita*

**Expanding Access to ML Research through Student-led Collaboratives (Poster)**

We present a model of a student-led community of researchers to highlight the impact of pursuing collaborative machine learning research on the group's members individually as well as on achieving shared goals. We provide concrete examples of the guiding principles that led the collaborative to evolve from a reading group into a research group and eventually to launch a non-profit software product that helps non-technical stakeholders leverage artificial intelligence (AI), improving access to advanced technologies and promoting open science. Our goal is to lay out a template for launching similar small-scale collaborative organisations at institutes around the world.

*Deep Gandhi · Raghav Jain · Jay Gala · Jhagrut Lalwani · Swapneel Mehta*

**Petals: Collaborative Inference and Fine-tuning of Large Models (Poster)**

(Abstract as in the 8:30 a.m. oral above.)

*Alexander Borzunov · Dmitry Baranchuk · Tim Dettmers · Max Ryabinin · Younes Belkada · Artem Chumachenko · Pavel Samygin · Colin Raffel*

**DSN's Multi-stakeholder, Inclusive and Integrated Collaboration Framework for Sustainable Social Impact Research and Innovation in Emerging Markets (Poster)**

Data Science Network (DSN) is a non-profit based in Nigeria that focuses its efforts on high-impact research and capacity building for the next generation of African data scientists. The organization harnesses trans-disciplinary and transnational partnerships to create data-driven solutions for public health, agriculture, public safety, education, and inequality, among others. DSN's unique organizational structure hinges on the flow of connections and resources among its Knowledge Centres, Enabling Centres, Supporting Centres, and Catalyzing Centres. The results of the organization's unique model have won global awards from UNESCO and the Economic Computation Conference, and the model has been cited by the African Union as a reference point for a home-grown, African-centric approach to the sustainable application of research practice to social problems. This position paper discusses DSN's structure and key performance indicators.

*Olubayo Adekanmbi · Oluwatoyin Adekanmbi*

**EleutherAI: Going Beyond "Open Science" to "Science in the Open" (Poster)**

(Abstract as in the 8:45 a.m. oral above.)

*Jason Phang · Herbie Bradley · Leo Gao · Louis Castricato · Stella Biderman*

**Democratizing RL Research by Reusing Prior Computation (Poster)**

Learning tabula rasa, that is, without any prior knowledge, is the prevalent workflow in reinforcement learning (RL) research. Unfortunately, the inefficiency of deep RL typically excludes researchers without access to industrial-scale resources from tackling computationally demanding problems. Furthermore, as RL research moves toward more complex benchmarks, the computational barrier to entry will only increase. To address these issues, we present reincarnating RL (RRL) as an alternative workflow, or class of problem settings, in which prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another. RRL can democratize research by allowing the broader community to tackle complex RL problems without requiring excessive computational resources. To demonstrate this, we present a case study on Atari games showing how superhuman Atari agents can be trained in only a few hours, as opposed to a few days, on a single GPU. Finally, we address reproducibility and generalizability concerns in this research workflow. Overall, this work argues for an alternative approach to RL research, which we believe could significantly improve real-world RL adoption and help democratize it further.

*Rishabh Agarwal*

**Semantics Extraction and Analytics from SEBI Regulations and Case Files: Industry-academia Collaboration (Poster)**

Extracting insights from text documents and developing predictive models for analytics is of critical importance in several domains. However, it is a challenging task owing to the diversity in the linguistic characteristics of large-scale text corpora, exacerbated by a lack of labeled data. We present a case study on extracting semantics from complex legal and regulatory documents and applying them to analytical tasks such as violation detection and penalty estimation. Our system was developed in a joint academic-industry collaboration and benefited from the partners' complementary research strengths. Specifically, the domain expertise and problem-formulation process of the industrial setting were combined with the exploratory research and experimental rigor of the academic world to develop a system that can help legal actors improve their productivity. We outline our collaboration mechanism, detail the techniques used and functionalities developed, and discuss the key takeaways that can benefit the research community.

*Natraj Raman · Pulkit Parikh · Lini Thomas · Kamalakar Karlapalem*

**Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives (Poster)**

(Abstract as in the 9:00 a.m. oral above.)

*Jennifer Ding · Christopher Akiki · Yacine Jernite · Anne Steele · Temi Popo*

**Train Offline, Test Online: A Real Robot Learning Benchmark (Poster)**

Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data. We take on these challenges via a new benchmark: Train Offline, Test Online (TOTO). TOTO provides remote users with access to shared robots for evaluating methods on common tasks, along with an open-source dataset of these tasks for offline training. Its manipulation task suite requires challenging generalization to unseen objects, positions, and lighting. We present initial results on TOTO comparing five pretrained visual representations and four offline policy learning baselines, remotely contributed by five institutions. The real promise of TOTO, however, lies in the future: we release the benchmark for additional submissions from any user, enabling easy, direct comparison to several methods without the need to obtain hardware or collect data.

*Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta*

**At the Intersection of Conceptual Art and Deep Learning: The End of Signature (Poster)**

MIT wanted to commission a large-scale artwork that would serve to "illuminate a new campus gateway, inaugurate a space of exchange between MIT and Cambridge, and inspire our students, faculty, visitors, and the surrounding community to engage with art in new ways and to have art be part of their daily lives." Among other things, the art was to reflect the fact that scientific discovery is often the result of many individual contributions, both acknowledged and unacknowledged. This paper details a collaboration between a widely exhibited artist, computer scientists, and the broader community to produce a set of collective signatures. After collecting signatures from two communities, the university and the surrounding city, the computer scientists developed generative models and a human-in-the-loop feedback process to work with the artist to create an original signature-like structure representative of each community. These signatures are now large-scale steel, LED, and neon light sculptures that appear to sign two new buildings in Cambridge, MA.

*Kathleen Lewis · Divya Shanmugam · Jose Javier Gonzalez Ortiz · Agnieszka Kurant · John Guttag*

**Separation of Research Data from Its Presentation (Poster)**

This is a position paper proposing the idea of separating research data from its presentation, or view. Today, researchers must not only produce the results of their research but must also think about how to present them, and in many cases this means determining how well the results can be displayed within the format of a single paper. This has made research results unnecessarily difficult to read and has led to research going unaccepted due to poor presentation. We therefore propose separating the creators of research data from the creators of its display. This can lead to the following benefits: adoption of display formats suited to different purposes, utilization of buried data, improved reproducibility of studies, and a gradual transition to better display formats.

*Shiro Takagi*

**BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model (Poster)**

The BigScience Workshop was a value-driven initiative that spanned one and a half years of interdisciplinary research and culminated in the creation of ROOTS, a 1.6TB multilingual dataset that was used to train BLOOM, one of the largest multilingual language models to date. In addition to the technical outcomes and artifacts, the workshop fostered multidisciplinary collaborations around large models, datasets, and their analysis. This in turn led to a wide range of research publications spanning topics from ethics to law, data governance, modeling choices, and distributed training. This paper focuses on the collaborative research aspects of BigScience and takes a step back to look at the challenges of large-scale participatory research, with respect to participant diversity and the tasks required to successfully carry out such a project. Our main goal is to share the lessons we learned from this experience, what we could have done better, and what we did well. We show how the impact of such a social approach to scientific research goes well beyond the technical artifacts that were the basis of its inception.

*Christopher Akiki · Giada Pistilli · Margot Mieskes · Matthias Gallé · Thomas Wolf · Suzana Ilic · Yacine Jernite*

**Managing the Whole Research Process on GitHub (Poster)**

This is a position paper proposing the idea of managing the entire research process on GitHub. The current machine learning research community faces a variety of problems, such as the poor quality and low reproducibility of peer review at international conferences. These problems are caused by a lack of transparency in the research process and a lack of accessibility, where not everyone can participate in any given part of the research process. Thus, we propose that any information that arises in the research process be posted on GitHub and that contributions to the research be managed like those in an open-source software project. This could provide a springboard for addressing the challenges of machine learning by clarifying contributors, allowing fine-grained contributions, improving reproducibility, enabling post-publication peer review, enhancing diversity, and protecting ideas.

*Shiro Takagi*

**Challenges and Opportunities of Large Transnational Datasets: A Case Study on European Administrative Crop Data (Poster)**

Expansive, informative datasets are vital in providing foundations and possibilities for scientific research and development across many fields of study. Assembling grand datasets, however, frequently poses difficulties for the authors and stakeholders alike, with a variety of considerations required throughout the collaboration effort and development lifecycle. In this work, we discuss and analyse the challenges and opportunities we faced throughout the creation of a transnational European agricultural dataset containing reference labels of cultivated crops. Together, this forms a succinct framework of important elements one should consider when forging a dataset of one's own.

*Maja Schneider · Christian Marchington · Marco Körner*

**Sat 10:15 a.m. - 10:30 a.m.** Train Offline, Test Online: A Real Robot Learning Benchmark (Oral)

(Abstract as in the poster session above.)

*Gaoyue Zhou · Victoria Dean · Mohan Kumar Srirama · Aravind Rajeswaran · Jyothish Pari · Kyle Hatch · Aryan Jain · Tianhe Yu · Pieter Abbeel · Lerrel Pinto · Chelsea Finn · Abhinav Gupta*

**Sat 10:30 a.m. - 10:45 a.m.** BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model (Oral)

(Abstract as in the poster session above.)

*Christopher Akiki · Giada Pistilli · Margot Mieskes · Matthias Gallé · Thomas Wolf · Suzana Ilic · Yacine Jernite*

**Sat 10:45 a.m. - 11:00 a.m.** Managing the Whole Research Process on GitHub (Oral)

(Abstract as in the poster session above.)

*Shiro Takagi*

**Sat 11:00 a.m. - 11:15 a.m.** Coffee Break

**Sat 11:15 a.m. - 12:00 p.m.** Panel 2: How have funding and incentive structures impacted the history of computer science? How have they shaped the landscape of how we contribute to and assess the merits of research today? (Discussion Panel)

**Sat 12:00 p.m. - 12:30 p.m.** Invited Keynote 2 (Keynote): Mohammad Emtiyaz Khan

**Sat 12:30 p.m. - 12:45 p.m.** Coffee Break

**Sat 12:45 p.m. - 1:30 p.m.** Panel 3: Geographic disparities in where research is produced (Discussion Panel)

**Sat 2:00 p.m. - 3:00 p.m.** Social (Discussion)

#### Author Information

##### FatemehSadat Mireshghallah (University of California San Diego)

Computing Innovation Fellow 2020; Research Assistant at the University of Utah; Postdoctoral Fellow at UCLA starting January 2020. Her research interests are responsible and interpretable AI, NLP, and algorithmic fairness.

##### Sara Hooker (Cohere For AI)

I lead Cohere For AI, a non-profit research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and we are focused on creating more points of entry into machine learning research. Prior to Cohere, I was a research scientist at Google Brain, working on training models that go beyond test-set accuracy to fulfill multiple desired criteria: interpretability, compactness, fairness, and robustness. I enjoy working on research problems where progress translates to reliable and accessible machine learning in the real world.

##### Sunny Sanyal (The University of Texas at Austin)

I am a PhD student in the Department of Electrical and Computer Engineering at the University of Texas at Austin. Although I am broadly interested in computer vision tasks, my primary research interests lie in vision-and-language pretraining, knowledge distillation, and self-supervised learning. In the summer of 2022 I interned with Amazon's Alexa team, where I worked on a new feature for Amazon Echo devices. I graduated with an M.Eng. in Information and Communication Engineering from Chongqing University of Posts and Telecommunications, Chongqing, China, in 2019, and received a B.Tech in Electronics and Communication Engineering from Maulana Abul Kalam Azad University of Technology (formerly West Bengal University of Technology), Kolkata, India. During my undergraduate studies, I gloriously failed to scale up my startup, Tronix India, and later worked at TechMahindra, an Indian multinational IT firm. I received the Chinese Government Scholarship (nominated by MHRD, India) for my master's studies. I also received an honorary mention in the 2018 IEEE ComSoc student competition, an Excellent Master's Thesis Award in 2019, and an Outstanding International Student Award in 2019. In addition, I have served as a reviewer, TPC/PC member, and publicity co-chair for several top journals and conferences.