Dataset and Benchmark Track 1
The Datasets and Benchmarks track serves as a novel venue for high-quality publications, talks, and posters on highly valuable machine learning datasets and benchmarks, as well as a forum for discussions on how to improve dataset development. Datasets and benchmarks are crucial for the development of machine learning methods, but also require their own publishing and reviewing guidelines. For instance, datasets often cannot be reviewed in a double-blind fashion, and hence full anonymization will not be required. On the other hand, they do require additional specific checks, such as a proper description of how the data was collected, whether it shows intrinsic bias, and whether it will remain accessible.
Queer in AI Workshop 1
Queer in AI’s demographic survey reveals that most queer scientists in our community do not feel completely welcome in conferences and their work environments, with the main reasons being a lack of queer community and role models. Over the past years, Queer in AI has worked to address these gaps, yet we have observed that the voices of marginalized queer communities - especially transgender and non-binary folks and queer BIPOC folks - have been neglected. The purpose of this workshop is to highlight issues that these communities face by featuring talks and panel discussions on the inclusion of neurodiverse people in our communities, the intersection of queer and animal rights, as well as worker rights issues around the world.
The main topics of the workshop will revolve around:
- the intersection of AI, queer identity and neurodiversity
- queer identity and labor rights and organization
- AI and animal rights
- queer identity and caste-based discrimination
Additionally, at Queer in AI’s socials at NeurIPS 2021, we will focus on creating a safe and inclusive casual networking and socializing space for LGBTQIA+ individuals involved with AI. There will also be additional social events; stay tuned for more details. Together, these components will create a community space where attendees can learn and grow from connecting with each other, bonding over shared experiences, and learning from each individual’s unique insights into AI, queerness, and beyond!
New in ML 1
Is this your first time at a top conference? Have you ever wanted your own work recognized by this huge and active community? Do you encounter difficulties in polishing your ideas, experiments, paper writing, etc.? Then this session is exactly for you!
This year, we are organizing the New in ML workshop, co-located with NeurIPS 2021. We are targeting anyone who has not yet published a paper at the NeurIPS main conference. We have invited top researchers to review your work and share their experience with you. The best papers will get oral presentations!
Our biggest goal is to help you publish papers at next year’s NeurIPS conference, and generally provide you with the guidance you need to contribute to ML research fully and effectively!
Duolingo is the most popular way to learn languages in the world. With over half a billion exercises completed every day, we have the largest dataset of people learning languages ever amassed. In this talk I will describe all the different ways in which we use AI to improve how well we teach and how to keep our learners engaged.
Demonstrations 1
Demonstrations must show novel technology and must run online during the conference. Unlike poster presentations or slide shows, interaction with the audience is a critical element. Therefore, the creativity of demonstrators to propose new ways in which interaction and engagement can fully leverage this year’s virtual conference format will be particularly relevant for selection. This session has the following demonstrations:
- Interactive Exploration for 60 Years of AI Research
- SenSE: A Toolkit for Semantic Change Exploration via Word Embedding Alignment
- Training Transformers Together
- GANs for All: Supporting Fun and Intuitive Exploration of GAN Latent Spaces
- Lesan - Machine Translation for Low Resource Languages
Dataset and Benchmark Poster Session 1
The Datasets and Benchmarks track serves as a novel venue for high-quality publications, talks, and posters on highly valuable machine learning datasets and benchmarks, as well as a forum for discussions on how to improve dataset development. Datasets and benchmarks are crucial for the development of machine learning methods, but also require their own publishing and reviewing guidelines. For instance, datasets often cannot be reviewed in a double-blind fashion, and hence full anonymization will not be required. On the other hand, they do require additional specific checks, such as a proper description of how the data was collected, whether it shows intrinsic bias, and whether it will remain accessible.
LatinX in AI (LXAI) Research @ NeurIPS 2021
The workshop is a one-day event with invited speakers, oral presentations, and posters. The event brings together faculty, graduate students, research scientists, and engineers for an opportunity to connect and exchange ideas. There will be a panel discussion and a mentoring session to discuss current research trends and career choices in artificial intelligence and machine learning. While all presenters will identify primarily as Latinx, all are invited to attend.
Competition Track Day 1: Overviews + Breakout Sessions
The program includes a wide variety of exciting competitions in different domains, with some focusing more on applications and others trying to unify fields, focusing on technical challenges or directly tackling important problems in the world. The aim is for the broad program to make it so that anyone who wants to work on or learn from a competition can find something to their liking.
In this session, we have the following competitions:
* The Open Catalyst Challenge
* The Robustness and Uncertainty under Real-World Distributional Shifts Challenge
* VisDA21: Visual Domain Adaptation
* HEAR 2021: Holistic Evaluation of Audio Representations
* The WebQA Competition
* Diamond: A MineRL Competition on Training Sample-Efficient Agents
New in ML 2
Is this your first time at a top conference? Have you ever wanted your own work recognized by this huge and active community? Do you encounter difficulties in polishing your ideas, experiments, paper writing, etc.? Then this session is exactly for you!
This year, we are organizing the New in ML workshop, co-located with NeurIPS 2021. We are targeting anyone who has not yet published a paper at the NeurIPS main conference. We have invited top researchers to review your work and share their experience with you. The best papers will get oral presentations!
Our biggest goal is to help you publish papers at next year’s NeurIPS conference, and generally provide you with the guidance you need to contribute to ML research fully and effectively!
If data is power, this keynote asks what methodologies and frameworks, beyond measuring bias and fairness in ML, might best serve communities that are, otherwise, written off as inevitable ‘data gaps’? To address this question, the talk applies design justice principles articulated in 2020 by scholar Costanza-Chock to the case of community-based organizations (CBOs) serving marginalized Black and Latinx communities in North Carolina. These CBOs, part of an 8-month study of community healthcare work, have become pivotal conduits for COVID-19 health information and equitable vaccine access. As such, they create and collect the so-called ‘sparse data’ of marginalized groups often missing from healthcare analyses. How might health equity, a cornerstone of social justice, be better served by equipping CBOs to collect community-level data and set the agendas for what to share and learn from the people that they serve? The talk will open with an analysis of the limits of ML models that prioritize the efficiencies of scale over attention to just and inclusive sampling. It will then examine how undertheorized investments in measuring bias and fairness in data and decision-making systems distract us from considering the value of collecting data with rather than for communities. Outlining an early learning theory proposed by Russian psychologist Lev Vygotsky (1978), the presentation will argue that focusing on the demands of collecting community members’ data and observing the social interactions that are computationally hard to measure but qualitatively invaluable to see are necessary to advance socially just ML. The talk will conclude with recommendations for how to reorient computer science and machine learning to a more explicit theory and practice of data power-sharing.
Latinx in AI Social
Latinx in AI wants to host a social event with the goal of creating new connections between the NeurIPS community and Latinx researchers across the world. Our event will consist of networking sessions between attendees in Zoom or Gathertown. While our event will target researchers who identify as Latinx, our doors are open to anyone who wants to engage with our community.
The organizers of the event will be:
- Luis Lamb, Federal University of Rio Grande do Sul, luisl@latinxinai.org
- José Luis Lima de Jesus Silva, Linköping University, jsilva@latinxinai.org
- Andrés Muñoz Medina, Google Research, andresm@latinxinai.org
Can Technology Be Used to Help Combat Maternal Mortality?
“Women in the United States are more likely to die from childbirth or pregnancy-related causes than other women in the developed world” according to the Public Health Grand Rounds podcast by the Centers for Disease Control. Although maternal mortality is a global issue, the focus will be on maternal mortality in the United States due to familiarity with domestic practices related to this topic. For instance, in 2015 the maternal mortality rate in Finland was 3 deaths per 100,000 live births, versus 14 deaths per 100,000 live births in the United States, according to the World Factbook from the Central Intelligence Agency. The number is high for the United States because of the number of Black women who have died, and continue to die, in the US from pregnancy-related causes. “Black women have a maternal mortality rate three times higher than that of white women” according to National Geographic. Two questions may arise for many when learning of this data: why are these deaths occurring at such high rates, and are these deaths preventable? The Public Health Grand Rounds podcast further states that “research suggests half of these deaths are preventable”. Research also suggests that there are patterns that could be identified to help prevent these deaths.
AI could be a pathway forward for preventing maternal mortality. We as a community have a unique opportunity to brainstorm and discuss strategies for combating maternal mortality. The discussion will highlight the United States as a test case, and the insights obtained could also be used in other parts of the world. The current idea for a proposed workflow for using AI to address this issue entails using wearable devices that could monitor women for potentially critical periods of time after childbirth and take certain measurements. The collected data could be analyzed using machine learning or deep learning, and classifications from this data could be used to alert members of a medical care team.
During this session, through conversation, dedicated reflection time, and the creation of action plans, we as a community will have an opportunity to explore ways that AI, in conjunction with other resources, can be used to address this ongoing and worsening issue.
Women in Research Roundtable
This roundtable aims to highlight the experiences of women researchers in tech and promote an engaging discussion around how women can excel in the research science field. GRT will incorporate a minimum of 4 women researchers who are considered experts in their field at Amazon. Attendees will be encouraged to ask questions while a moderator facilitates guided discussion points throughout the 1-hour session.
Do We Know How to Estimate the Mean?
In this talk I discuss mean estimation based on independent observations, perhaps the most basic problem in statistics. Despite its long history, the subject has attracted a flurry of renewed activity. Motivated by applications in machine learning and data science, the problem has been viewed from new angles, from both statistical and computational points of view. We review some recent results on the statistical performance of mean estimators that allow heavy tails and adversarial contamination in the data, focusing on high-dimensional aspects.
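As a rough illustration of what a mean estimator that tolerates heavy tails can look like, here is a minimal one-dimensional sketch of the classical median-of-means estimator. The abstract does not say which estimators the talk covers, so the choice of technique, the function name `median_of_means`, and the parameter `num_blocks` are assumptions made for this example only.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Estimate the mean as the median of per-block averages.

    Illustrative sketch: assumes `samples` is a list of floats in
    arbitrary (e.g. i.i.d.) order. Any leftover samples that do not
    fill a whole block are ignored.
    """
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    # Average each contiguous block of equal size.
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    # A few extreme values can spoil only a few block means,
    # and the median is insensitive to those.
    return statistics.median(block_means)


# Nine ordinary observations and one extreme value.
data = [1.0] * 9 + [1000.0]
print(statistics.fmean(data))    # empirical mean, pulled up by the outlier
print(median_of_means(data, 5))  # median of the five block means
```

With `num_blocks` set to 1 the estimator reduces to the ordinary empirical mean; more blocks trade some efficiency on well-behaved data for robustness to extreme values.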