
Machine Learning for the Developing World
William Herlands · Maria De-Arteaga

Fri Dec 08 08:00 AM -- 06:30 PM (PST) @ S7
Event URL: https://sites.google.com/site/ml4development/

Six billion people live in developing world countries. The unique development challenges faced by these regions have long been studied by researchers in fields ranging from sociology to statistics and from ecology to economics. With the emergence of mature machine learning methods in the past decades, researchers from many fields - including core machine learning - are increasingly turning to machine learning to study and address challenges in the developing world. This workshop delves into the intersection of machine learning and development research.

Machine learning presents tremendous potential for development research and practice. Supervised methods can provide expert telemedicine decision support in regions with few resources; deep learning techniques can analyze satellite imagery to create novel economic indicators; NLP algorithms can preserve and translate obscure languages, some of which are only spoken. Yet there are notable challenges with machine learning in the developing world. Data cleanliness, computational capacity, power availability, and internet accessibility are more limited than in developed countries. Additionally, the specific applications differ from what many machine learning researchers normally encounter. The confluence of machine learning's immense potential with the practical challenges posed by developing world settings has inspired a growing body of research at the intersection of machine learning and the developing world.

This one-day workshop is focused on machine learning for the developing world, with an emphasis on developing novel methods and technical applications that address core concerns of developing regions. We will consider a wide range of development areas including health, education, institutional integrity, violence mitigation, economics, societal analysis, and environment. From the machine learning perspective we are open to all methodologies with an emphasis on novel techniques inspired by particular use cases in the developing world.

Invited speakers will address particular areas of interest, while poster sessions and a guided panel discussion will encourage interaction between attendees. We wish to review the current approaches to machine learning in the developing world, and inspire new approaches and paradigms that can lay the groundwork for substantial innovation.

Fri 8:45 a.m. - 9:00 a.m.
Introductory remarks (Introduction)
Artur Dubrawski
Fri 9:00 a.m. - 9:30 a.m.

Mobile money platforms are gaining traction across developing markets as a convenient way of sending and receiving money over mobile phones. Recent joint collaborations between banks and mobile-network operators leverage a customer's past mobile phone transactions in order to create a credit score for the individual. These scores allow access to low-value, short-term, un-collateralized loans. In this talk we will look at the problem of launching a mobile-phone based credit scoring system in a new market without either labeled examples of repayment or the marginal distribution of features of borrowers in the new market. The latter assumption rules out traditional transfer learning approaches such as a direct covariate shift. We apply a Three Population Covariate Shift method to account for the differences in the original and new markets. The three populations are: a) Original Market Members, b) Original Market Borrowers who self-selected into a loan product, and c) New Market Members. The goal of applying a generalized covariate shift to these three populations is to understand the repayment behavior of a fourth: d) New Market Borrowers who will self-select into a loan product when it becomes available.

Skyler D. Speakman
Fri 9:30 a.m. - 9:40 a.m.
Contributed talk: Unique Entity Estimation with Application to the Syrian Conflict (Contributed talk)
Beidi Chen
Fri 9:40 a.m. - 9:50 a.m.
Contributed talk: Field Test Evaluation of Predictive Models for Wildlife Poaching Activity in Uganda (Contributed talk)
Shahrzad Gholami
Fri 9:50 a.m. - 10:00 a.m.
Contributed talk: Household poverty classification in data-scarce environments: a machine learning approach (Contributed talk)
Varun Kshirsagar
Fri 10:00 a.m. - 10:30 a.m.

Present advances in Artificial Intelligence and Machine Learning offer unique opportunities for solving impactful developing world problems. At the AI and Data Science research lab at Makerere University and UN PulseLab Kampala in Uganda, we have spent 8 years trying to marry good computational techniques with good developing world problems. In this talk I will give examples of some of the projects we are working on or have worked on. These will include automating disease diagnosis in crops and humans, crowd-sourcing surveillance, traffic monitoring, and using public radio data to infer humanitarian crises. I will talk about the relative strengths of the different types of data that can be reliably collected in the developing world and some deployment options that (seem to) work.

Ernest T Mwebaze
Fri 11:00 a.m. - 11:30 a.m.
Emma Brunskill (Stanford) (Invited speaker)
Emma Brunskill
Fri 11:30 a.m. - 11:40 a.m.
P. Anandan (Wadhwani Institute of AI) (Presentation)
Fri 11:40 a.m. - 12:30 p.m.
Posters (Poster session)
Biswarup Bhattacharya, Darius Lam, Sandeep Vidyapu, Shreya Shankar, Therese Anders, Bryan Wilder, Muhammad R Khan, Yunpeng Li, Nazmus Saquib, Varun Kshirsagar, Anthony Perez, Pengfei Zhang, Shahrzad Gholami, Rediet Abebe
Fri 12:30 p.m. - 2:00 p.m.
Lunch (Break)
Fri 2:00 p.m. - 2:30 p.m.
Jen Ziemke (International Crisis Mappers) (Invited speaker)
Jen Ziemke
Fri 2:30 p.m. - 3:00 p.m.

We are living inside a data revolution that is transforming the way we understand and interact with each other and the world - and it has only just begun. Every field is now having its “data moment,” giving mission-driven organizations brand new opportunities to harness data to advance their work. In fact, the same algorithms that companies use to boost profits can help these organizations boost their impact. From poverty alleviation to healthcare access to improved education, machine learning has the potential to move the needle on seemingly insurmountable issues, but only if there is close collaboration between data scientists and subject matter experts. Since DataKind was founded in 2011, its volunteers have delivered over $25 million in pro bono services to social change organizations worldwide - from helping organizations deliver vaccines more effectively, to creating chatbots that connect people to critical services during a natural disaster, to helping at-risk students reach graduation, to using satellite imagery to estimate poverty and identify crop diseases, and more. This talk will focus on the ways that DataKind engages with nonprofits across industries and economies, with particular emphasis on techniques, tools, and approaches that can provide guidance to ML in the developing world. We’ll dive into the exciting potential of big data to tackle big social issues and how data scientists can apply their skills for the greater good.

Caitlin Augustin
Fri 3:00 p.m. - 3:30 p.m.
Coffee break (Break)
Fri 3:30 p.m. - 4:00 p.m.

Recent technological developments are creating new spatio-temporal data streams that contain a wealth of information relevant to sustainable development goals. Modern AI techniques have the potential to yield accurate, inexpensive, and highly scalable models to inform research and policy. As a first example, I will present a machine learning method we developed to predict and map poverty in developing countries. Our method can reliably predict economic well-being using only high-resolution satellite imagery. Because images are passively collected in every corner of the world, our method can provide timely and accurate measurements in a very scalable and economical way, and could revolutionize efforts towards global poverty eradication. As a second example, I will present some ongoing work on monitoring food security outcomes.

Stefano Ermon
Fri 4:00 p.m. - 5:00 p.m.

This panel discussion brings together core machine learning researchers and developing world application domain experts in a conversation regarding challenges, opportunities and future directions of ML4D.

Author Information

William Herlands (Carnegie Mellon University)
Maria De-Arteaga (Carnegie Mellon University)
