Workshop on Computer Assisted Programming (CAP)
Augustus Odena · Charles Sutton · Nadia Polikarpova · Josh Tenenbaum · Armando Solar-Lezama · Isil Dillig

Sat Dec 12 08:30 AM -- 04:10 PM (PST)
Event URL: https://capworkshop.github.io/

There are many tasks that could be automated by writing computer programs, but most people don’t know how to program. Program synthesis, the study of how to automatically write programs from user specifications, addresses this gap. Building tools for computer-assisted programming could thus improve the lives of many people (and it’s also a cool research problem!). There has been substantial recent interest in the ML community in this problem, as evidenced by the increased volume of program synthesis submissions to ICML, ICLR, and NeurIPS.

Despite this recent work, a lot of exciting questions are still open, such as how to combine symbolic reasoning over programs with deep learning, how to represent programs and user specifications, and how to apply program synthesis within computer vision, robotics, and other control problems. There is also work to be done on connecting ML research with research on Programming Languages (PL) through collaboration between the two communities, and there remains the challenge of establishing benchmarks that allow for easy comparison and measurement of progress. The aim of the CAP workshop is to address these points. This workshop will bring together researchers in programming languages, machine learning, and related areas who are interested in program synthesis and other methods for automatically writing programs from a specification of intended behavior.

Sat 8:30 a.m. - 8:40 a.m.
Welcome Talk
Augustus Odena
Sat 8:40 a.m. - 9:10 a.m.

Title: New directions in Programming by Examples

Abstract: Programming by examples (PBE) involves synthesizing programs in an underlying domain-specific language from input-output examples. Our journey in developing usable PBE systems has motivated two kinds of advances: (a) development of algorithms that can synthesize intended programs in real-time and from very few examples, (b) variants of the classical PBE problem including predictive synthesis and modeless synthesis.

We have leveraged logical reasoning techniques and their integration with machine learning techniques to develop effective PBE solutions for several domains, including string/datatype transformations, table extraction from semi-structured documents (e.g., custom text files, webpages, PDF), and repetitive edits in code. These solutions have shipped inside various mass-market products including Excel, PowerBI, Visual Studio, and SQL Server Management Studio. In this talk, I will describe these applications, technical advances, and the form factors inside different products.
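As a rough illustration of the programming-by-examples setting described in this abstract, the sketch below enumerates short pipelines over a tiny string DSL and returns the first one consistent with all input-output examples. The DSL, the primitive names, and the `synthesize` function are hypothetical and chosen only for illustration; they are not the algorithms or languages used in the systems the talk covers.

```python
from itertools import product

# Hypothetical toy DSL: each primitive maps a string to a string.
PRIMITIVES = {
    "lower": str.lower,
    "upper": str.upper,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}


def run(program, text):
    """Apply the primitives named in `program` from left to right."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples, max_length=3):
    """Return the shortest pipeline consistent with every (input, output) pair.

    Brute-force enumeration, shortest programs first; returns None if no
    pipeline of at most `max_length` primitives fits the examples.
    """
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# Two examples are enough to pin down "upper-case the last word" here.
examples = [("Ada Lovelace", "LOVELACE"), ("Alan Turing", "TURING")]
program = synthesize(examples)
print(program)                      # ('upper', 'last_word')
print(run(program, "Grace Hopper"))  # HOPPER
```

Preferring shorter programs is one simple ranking choice for resolving ambiguity when many programs fit the examples; practical systems generally need far more efficient search and more careful ranking than this exhaustive loop.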

Bio: Sumit Gulwani is a computer scientist connecting ideas, research & practice, and (with) people with varied roles. He invented the popular Flash Fill feature in Excel and has shipped program synthesis innovations across multiple Microsoft products (Office, SQL, Visual Studio, PowerShell, PowerQuery), having authored 65+ patent applications. He has co-authored 10 award-winning papers (including test-of-time awards from ICSE and POPL) amongst 130+ research publications across multiple computer science areas and delivered 50+ keynotes/invited talks. He has received the Robin Milner Young Researcher Award, ACM SIGPLAN Outstanding Doctoral Dissertation Award (PhD from UC-Berkeley), and President’s Gold Medal from IIT Kanpur.

Sumit Gulwani
Sat 9:10 a.m. - 9:40 a.m.


The dream of classical program synthesis is to generate programs from complete, formal specifications of their expected behavior. An increasingly favored paradigm of synthesis is inductive program synthesis, where specifications of program behavior are provided in the form of examples. Inductive program synthesis not only helps make program synthesis more tractable, but also has the potential to democratize programming!

Unfortunately, inductive synthesis engines encounter challenges like overfitting, ambiguity, and brittleness, similar to other inductive learning engines. PL researchers have typically attacked these problems by applying syntactic biases to the search space in the form of tailored domain-specific languages, grammars and ranking functions. In this talk, I will show how one can further enhance the generalizability and robustness of such synthesis engines by applying semantic biases to the search space.


Roopsha Samanta is an Assistant Professor in the Department of Computer Science at Purdue University. She leads the Purdue Formal Methods (PurForM) group and is a member of the Purdue Programming Languages (PurPL) group. Before joining Purdue in 2016, she completed her PhD at UT Austin in 2013, advised by E. Allen Emerson and Vijay K. Garg, and was a postdoctoral researcher at IST Austria from 2014-2016 with Thomas A. Henzinger. She is a recipient of the 2019 NSF CAREER award.

Her research interests are in program verification, program synthesis, and concurrency. She likes to work at the intersection of formal methods and programming languages to develop frameworks that help programmers write reliable programs. Her current research agenda is centered around two themes: formal reasoning about distributed systems and semantics-guided inductive program synthesis.


Roopsha Samanta
Sat 9:40 a.m. - 10:10 a.m.
Spotlight Session 1 (Spotlight Talks)
Augustus Odena, Maxwell Nye, Disha Shrivastava, Mayank Agarwal, Vincent J Hellendoorn, Charles Sutton
Sat 10:10 a.m. - 11:00 a.m.
Poster Session 1 (Posters)
Sat 11:00 a.m. - 11:30 a.m.

Title: Neural Attribute Grammars for Semantics-Guided Program Generation (Swarat Chaudhuri, UT Austin)


I will talk about Neural Attribute Grammars (NAG), a framework for deep statistical generation of source code modulo language-level semantic requirements (such as type safety or initialization of variables before use). Neural models for source code have received significant attention in the recent past. However, these models tend to be trained on syntactic program representations, and consequently, often generate programs that violate essential semantic invariants. In contrast, the NAG framework exposes the semantics of the target language to the training procedure for the neural model using attribute grammars. During training, the model learns to replicate the relationship between the syntactic rules used to construct a program, and the semantic attributes (for example, symbol tables) of the context in which the rule is fired. In the talk, I will give some concrete examples of NAGs and show how to use them in the conditional generation of Java programs. I will demonstrate that these NAGs generate semantically "sensible" programs with significantly higher frequency than traditional neural models of source code.

(This talk is based on joint work with Rohan Mukherjee, Chris Jermaine, Tom Reps, Dipak Chaudhari, and Matt Amodio.)

Bio: Swarat Chaudhuri is an Associate Professor of computer science at the University of Texas at Austin. His research studies topics in the intersection of machine learning and programming languages, including program induction, probabilistic programming, neurosymbolic programming, programmatically interpretable/explainable learning, learning-accelerated formal reasoning, and formally certified learning. Swarat received a bachelor's degree from the Indian Institute of Technology, Kharagpur, in 2001, and a doctoral degree from the University of Pennsylvania in 2007. Before joining UT Austin, he held faculty positions at Rice University and the Pennsylvania State University. He is a recipient of the National Science Foundation CAREER award, the ACM SIGPLAN John Reynolds Doctoral Dissertation Award, and the Morris and Dorothy Rubinoff Dissertation Award from the University of Pennsylvania.

Swarat Chaudhuri
Sat 11:30 a.m. - 12:00 p.m.

Title: Increasing the Power of [Human+Program Synthesis] through Interface Design

Abstract: Program synthesis is a powerful tool for generating programs, but in the hands of users, its potential can be severely limited by unanticipated usability obstacles. In this talk, I will describe several key usability obstacles and new synthesis-powered interaction mechanisms that help users get past these obstacles to their goal: a program that behaves the way they want it to.

Bio: Elena Glassman is an Assistant Professor of Computer Science at the Harvard Paulson School of Engineering & Applied Sciences and the Stanley A. Marks & William H. Marks Professor at the Radcliffe Institute for Advanced Study, specializing in human-computer interaction. At MIT, she earned a PhD and MEng in Electrical Engineering and Computer Science and a BS in Electrical Science and Engineering. Before joining Harvard, she was a postdoctoral scholar in Electrical Engineering and Computer Science at the University of California, Berkeley, where she received the Berkeley Institute for Data Science Moore/Sloan Data Science Fellowship.

Elena Glassman
Sat 12:00 p.m. - 12:30 p.m.
Spotlight Session 2 (Spotlight Talks)
Augustus Odena, Kensen Shi, David Bieber, Ferran Alet, Charles Sutton, Roshni Iyer
Sat 12:30 p.m. - 1:00 p.m.

Title: Growing generalizable, interpretable knowledge with wake-sleep program learning

Abstract: Two challenges in engineering program synthesis systems are: (1) crafting specialized yet expressive domain-specific languages, and (2) designing search algorithms that can tractably explore the space of expressions in this domain-specific language. We take a step toward the joint learning of domain-specific languages and the search algorithms that perform synthesis in those languages. We propose an algorithm which starts with a relatively minimal domain-specific language, and then enriches that language by compressing out common syntactic patterns into a library of reusable domain-specific code. In tandem, the system trains a neural network to guide search over expressions in the growing language. From a machine learning perspective, this system implements a wake-sleep algorithm similar to the Helmholtz machine. We apply this algorithm to AI and program synthesis problems, with the goal of understanding how domain-specific languages and neural program synthesizers can mutually bootstrap one another.
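The idea of enriching a language by "compressing out common syntactic patterns" can be sketched very roughly as follows. This is a deliberately simplified illustration, not the algorithm from the talk: programs are assumed to be nested tuples, only exact repeated subexpressions are considered (no parameterized abstractions), and the function names and the scoring rule are hypothetical.

```python
from collections import Counter


def subtrees(expr):
    """Yield every compound (tuple) subexpression of a nested-tuple program."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr[1:]:
            yield from subtrees(child)


def size(expr):
    """Count nodes: one per operator application and one per leaf."""
    if isinstance(expr, tuple):
        return 1 + sum(size(child) for child in expr[1:])
    return 1


def best_abstraction(programs):
    """Pick the repeated subexpression whose reuse would save the most nodes."""
    counts = Counter(t for p in programs for t in subtrees(p))
    repeated = {t: n for t, n in counts.items() if n > 1}
    if not repeated:
        return None
    # Replacing a subtree with a single named leaf saves (size - 1) nodes per use.
    return max(repeated, key=lambda t: (size(t) - 1) * repeated[t])


def rewrite(expr, pattern, name):
    """Replace every occurrence of `pattern` in `expr` with the new primitive."""
    if expr == pattern:
        return name
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(rewrite(c, pattern, name) for c in expr[1:])
    return expr


# Hypothetical corpus of solved tasks, written as (operator, *arguments).
corpus = [
    ("add", ("mul", "x", "x"), "1"),
    ("sub", ("mul", "x", "x"), "y"),
    ("neg", ("mul", "x", "x")),
]
pattern = best_abstraction(corpus)
compressed = [rewrite(p, pattern, "square_x") for p in corpus]
print(pattern)     # ('mul', 'x', 'x')
print(compressed)  # [('add', 'square_x', '1'), ('sub', 'square_x', 'y'), ('neg', 'square_x')]
```

In this toy setting the new primitive `square_x` makes every program in the corpus shorter, which in turn would make later search in the enriched language cheaper. The neural guidance component mentioned in the abstract is not modeled here.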

Bio: Kevin Ellis works across program synthesis and artificial intelligence. His research focuses on using machine learning to develop better program synthesis algorithms, and on applications of program synthesis to graphics and natural language. He recently finished his PhD at MIT, co-advised by Josh Tenenbaum and Armando Solar-Lezama, and is working as a research scientist at Common Sense Machines before starting as an assistant professor at Cornell in summer 2021.

Kevin Ellis
Sat 1:00 p.m. - 2:30 p.m.
Poster Session 2 (Posters)
Sat 2:30 p.m. - 3:00 p.m.

Title: Automatic Program Repair using Getafix

Abstract: Developers spend a significant amount of their time fixing bugs. Fixes often are repetitive, so it appears that some portion of this work should be automated. Indeed, some recent approaches offer automation, but these typically explore a large space of potential fixes by making varying combinations of mutations, trying them all until one passes the test suite. This is not only computationally expensive, but the suggested fixes may not look natural to a developer. We present Getafix, a tool that offers readable bug fixes without requiring massive computational resources. Getafix learns from your bug fix history. It extracts past code changes that fixed bugs and learns, in an off-line phase, a set of templates from those fixes. As new bug reports appear, Getafix uses these templates to create and rank a set of suggestions in mere seconds, as well as offer fixes that resemble human-made fixes. At Facebook, Getafix has been used to auto-fix bugs reported by static analysis tools like Infer.

Satish Chandra, Augustus Odena, Charles Sutton
Sat 3:00 p.m. - 3:30 p.m.

Title: Deep Learning for Program Synthesis from Input-Output Examples

Abstract: There has been an emerging interest in applying machine learning-based techniques, especially deep neural networks, for program synthesis. However, because of some unique characteristics of the program domain, directly applying deep learning techniques developed for other applications is generally inappropriate. In this talk, I will present my work on program synthesis from input-output examples, aiming at synthesizing programs with higher complexity and better generalization. I will first discuss our work on execution-guided synthesis, where we develop approaches to leverage the execution results of both partial and full programs. In the second part of my talk, I will discuss our work on neural-symbolic architectures for compositional generalization.

Xinyun Chen
Sat 3:30 p.m. - 4:00 p.m.
Panel (Virtual Panel)
Augustus Odena, Charles Sutton, Roopsha Samanta, Xinyun Chen, Elena Glassman
Sat 4:00 p.m. - 4:10 p.m.
Closing Talk
Augustus Odena, Charles Sutton

Author Information

Augustus Odena (Google Brain)
Charles Sutton (Google)
Nadia Polikarpova (University of California, San Diego)
Josh Tenenbaum (MIT)

Josh Tenenbaum is an Associate Professor of Computational Cognitive Science at MIT in the Department of Brain and Cognitive Sciences and the Computer Science and Artificial Intelligence Laboratory (CSAIL). He received his PhD from MIT in 1999, and was an Assistant Professor at Stanford University from 1999 to 2002. He studies learning and inference in humans and machines, with the twin goals of understanding human intelligence in computational terms and bringing computers closer to human capacities. He focuses on problems of inductive generalization from limited data -- learning concepts and word meanings, inferring causal relations or goals -- and learning abstract knowledge that supports these inductive leaps in the form of probabilistic generative models or 'intuitive theories'. He has also developed several novel machine learning methods inspired by human learning and perception, most notably Isomap, an approach to unsupervised learning of nonlinear manifolds in high-dimensional data. He has been Associate Editor for the journal Cognitive Science, has been active on program committees for the CogSci and NIPS conferences, and has co-organized a number of workshops, tutorials and summer schools in human and machine learning. Several of his papers have received outstanding paper awards or best student paper awards at the IEEE Computer Vision and Pattern Recognition (CVPR), NIPS, and Cognitive Science conferences. He is the recipient of the New Investigator Award from the Society for Mathematical Psychology (2005), the Early Investigator Award from the Society of Experimental Psychologists (2007), and the Distinguished Scientific Award for Early Career Contribution to Psychology (in the area of cognition and human learning) from the American Psychological Association (2008).

Armando Solar-Lezama (MIT)
Isil Dillig (UT Austin)
