We provide a new multi-task benchmark for evaluating text-to-image models and perform a human evaluation comparing two of the most common models, one open source (Stable Diffusion) and one commercial (DALL-E 2). Twenty computer science AI graduate students evaluated the two models on three tasks, at three difficulty levels, across ten prompts each, providing 3,600 ratings. Text-to-image generation has progressed rapidly, to the point that many recent models can create realistic high-resolution images for a variety of prompts. However, current text-to-image methods, and the broader body of research in vision-language understanding, still struggle with intricate text prompts that contain many objects with multiple attributes and relationships. We introduce a new text-to-image benchmark that contains a suite of fifty tasks and applications that capture a model’s ability to handle different features of a text prompt. For example, a task may ask a model to generate a varying number of the same object to measure its ability to count, or provide a text prompt with several objects that each have a different attribute to measure its ability to match objects and attributes. Rather than subjectively evaluating text-to-image results on a set of prompts, our new multi-task benchmark consists of challenge tasks at three difficulty levels (easy, medium, and hard) along with human ratings for each generated image.
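The 3,600 figure follows from the evaluation design stated in the abstract: 20 raters, 2 models, 3 tasks, 3 difficulty levels, and 10 prompts. The sketch below enumerates that grid as a sanity check. It assumes every rater scores one image per model for every prompt, which the abstract does not state explicitly; the task names and the rater and prompt indices are placeholders, not identifiers from the benchmark.

```python
from itertools import product

# Model names and difficulty levels come from the abstract.
# Task names are hypothetical placeholders; the abstract does not list them.
MODELS = ["Stable Diffusion", "DALL-E 2"]
TASKS = ["task_1", "task_2", "task_3"]
DIFFICULTIES = ["easy", "medium", "hard"]
NUM_PROMPTS = 10   # prompts per (task, difficulty) cell
NUM_RATERS = 20    # graduate student evaluators


def rating_slots():
    """Enumerate one rating slot per (rater, model, task, difficulty, prompt).

    Assumes a fully crossed design: every rater rates every generated image.
    """
    return list(
        product(range(NUM_RATERS), MODELS, TASKS, DIFFICULTIES, range(NUM_PROMPTS))
    )


slots = rating_slots()
total_ratings = len(slots)
print(total_ratings)  # 20 * 2 * 3 * 3 * 10 = 3600
```

Under this assumption each rater contributes 180 ratings, and each model receives 1,800.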
Author Information
Vitali Petsiuk (Boston University)

I am a PhD candidate in Computer Science working in the Image and Video Computing group at Boston University, advised by Professor Kate Saenko. My research lies in the field of Explainable AI for Computer Vision and Natural Language Processing models. During my research internships at Adobe, I worked on developing novel methods for making CV and NLP models more interpretable, with applications in Document Understanding. Prior to joining Boston University, I received my M.S. and B.S. degrees in Computer Science and Applied Mathematics from Belarusian State University. During my studies there, I did research on Graph Theory and Semantic Segmentation for 2D and 3D Lung Imaging.
Alexander E. Siemenn (Massachusetts Institute of Technology)
Saisamrit Surbehera (Columbia University)
Qi Qi Chin (Harvard University)
Keith Tyser (Boston University, MIT Lincoln Laboratory)
Gregory Hunter (Columbia University)
Arvind Raghavan
Yann Hicke (Cornell University)
Bryan Plummer (Boston University)
Ori Kerret
Tonio Buonassisi (Massachusetts Institute of Technology)
Kate Saenko (Boston University & MIT-IBM Watson AI Lab, IBM Research)
Armando Solar-Lezama (MIT)
Iddo Drori (BU, MIT, Columbia University)
More from the Same Authors
-
2020 : Session B, Poster 23: Galaxytsp: A New Billion-Node Benchmark For TSP »
Iddo Drori -
2021 Spotlight: Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos »
Reuben Tan · Bryan Plummer · Kate Saenko · Hailin Jin · Bryan Russell -
2021 Spotlight: Program Synthesis Guided Reinforcement Learning for Partially Observed Environments »
Yichen Yang · Jeevana Priya Inala · Osbert Bastani · Yewen Pu · Armando Solar-Lezama · Martin Rinard -
2021 : AutumnSynth: Synthesis of Reactive Programs with Structured Latent State »
Ria Das · Zenna Tavares · Josh Tenenbaum · Armando Solar-Lezama -
2021 : Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation »
Aadarsh Sahoo · Rameswar Panda · Rogerio Feris · Kate Saenko · Abir Das -
2021 : Extending the WILDS Benchmark for Unsupervised Adaptation »
Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang -
2021 : Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency »
Samarth Mishra · Kate Saenko · Venkatesh Saligrama -
2021 : Predicting Critical Biogeochemistry of the Southern Ocean for Climate Monitoring »
Ellen Park · Jae Deok Kim · Nadege Aoki · Yumeng Cao · Yamin Arefeen · Matthew Beveridge · David Nicholson · Iddo Drori -
2021 : Synthesis of Reactive Programs with Structured Latent State »
Ria Das · Zenna Tavares · Armando Solar-Lezama · Josh Tenenbaum -
2021 : Predicting Atlantic Multidecadal Variability »
Glenn Liu · Peidong Wang · Matthew Beveridge · Young-Oh Kwon · Iddo Drori -
2022 : Neurosymbolic Programming for Science »
Jennifer J Sun · Megan Tjandrasuwita · Atharva Sehgal · Armando Solar-Lezama · Swarat Chaudhuri · Yisong Yue · Omar Costilla Reyes -
2022 : Lemma: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions »
Zhening Li · Gabriel Poesia Reis e Silva · Omar Costilla Reyes · Noah Goodman · Armando Solar-Lezama -
2022 : Fifteen-minute Competition Overview Video »
Kate Saenko · Samarth Mishra · Dina Bashkirova · Vitaly Ablavsky · Sarah Bargal · Rachel Lai · Piotr Teterwak · James Akl · Fadi Alladkani · Donghyun Kim · Berk Calli -
2022 Competition: VisDA 2022 Challenge: Sim2Real Domain Adaptation for Industrial Recycling »
Dina Bashkirova · Samarth Mishra · Piotr Teterwak · Donghyun Kim · Rachel Lai · Fadi Alladkani · James Akl · Vitaly Ablavsky · Sarah Bargal · Berk Calli · Kate Saenko -
2022 : Challenge Introduction »
Dina Bashkirova · Samarth Mishra · Piotr Teterwak · Donghyun Kim · Sarah Bargal · Diala Lteif · Kate Saenko -
2022 : Q & A »
Swarat Chaudhuri · Jennifer J Sun · Armando Solar-Lezama -
2022 Tutorial: Neurosymbolic Programming »
Swarat Chaudhuri · Jennifer J Sun · Armando Solar-Lezama -
2022 : Neurosymbolic Programming »
Swarat Chaudhuri · Jennifer J Sun · Armando Solar-Lezama -
2022 : Identifying Structure in the MIMIC ICU Dataset »
Qi Qi Chin -
2022 : Accelerating the Discovery of Rare Materials with Bounded Optimization Techniques »
Alexander E. Siemenn · zekun ren · Qianxiao Li · Tonio Buonassisi -
2022 Poster: DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations »
Ximeng Sun · Ping Hu · Kate Saenko -
2022 Poster: Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing »
Nataniel Ruiz · Sarah Bargal · Cihang Xie · Kate Saenko · Stan Sclaroff -
2022 Poster: How Transferable are Video Representations Based on Synthetic Data? »
Yo-whan Kim · Samarth Mishra · SouYoung Jin · Rameswar Panda · Hilde Kuehne · Leonid Karlinsky · Venkatesh Saligrama · Kate Saenko · Aude Oliva · Rogerio Feris -
2022 Poster: FETA: Towards Specializing Foundational Models for Expert Task Applications »
Amit Alfassy · Assaf Arbelle · Oshri Halimi · Sivan Harary · Roei Herzig · Eli Schwartz · Rameswar Panda · Michele Dolfi · Christoph Auer · Peter Staar · Kate Saenko · Rogerio Feris · Leonid Karlinsky -
2021 Workshop: Distribution shifts: connecting methods and applications (DistShift) »
Shiori Sagawa · Pang Wei Koh · Fanny Yang · Hongseok Namkoong · Jiashi Feng · Kate Saenko · Percy Liang · Sarah Bird · Sergey Levine -
2021 Poster: OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization »
Kuniaki Saito · Donghyun Kim · Kate Saenko -
2021 Poster: Program Synthesis Guided Reinforcement Learning for Partially Observed Environments »
Yichen Yang · Jeevana Priya Inala · Osbert Bastani · Yewen Pu · Armando Solar-Lezama · Martin Rinard -
2021 Poster: Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos »
Reuben Tan · Bryan Plummer · Kate Saenko · Hailin Jin · Bryan Russell -
2021 : VisDA21: Visual Domain Adaptation + Q&A »
Kate Saenko · Kuniaki Saito · Donghyun Kim · Samarth Mishra · Ben Usman · Piotr Teterwak · Dina Bashkirova · Dan Hendrycks -
2021 Poster: Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing »
Aadarsh Sahoo · Rutav Shah · Rameswar Panda · Kate Saenko · Abir Das -
2020 : Invited Talk (Armando Solar-Lezama) »
Armando Solar-Lezama -
2020 : Poster Session B »
Ravichandra Addanki · Andreea-Ioana Deac · Yujia Xie · Francesco Landolfi · Antoine Prouvost · Claudius Gros · Renzo Massobrio · Abhishek Cauligi · Simon Alford · Hanjun Dai · Alberto Franzin · Nitish Kumar Panigrahy · Brandon Kates · Iddo Drori · Taoan Huang · Zhou Zhou · Marin Vlastelica · Anselm Paulus · Aaron Zweig · Minsu Cho · Haiyan Yin · Michal Lisicki · Nan Jiang · Haoran Sun -
2020 Workshop: Workshop on Computer Assisted Programming (CAP) »
Augustus Odena · Charles Sutton · Nadia Polikarpova · Josh Tenenbaum · Armando Solar-Lezama · Isil Dillig -
2020 Poster: Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment »
Ben Usman · Avneesh Sud · Nick Dufour · Kate Saenko -
2020 Poster: Program Synthesis with Pragmatic Communication »
Yewen Pu · Kevin Ellis · Marta Kryven · Josh Tenenbaum · Armando Solar-Lezama -
2020 Poster: Learning Compositional Rules via Neural Program Synthesis »
Maxwell Nye · Armando Solar-Lezama · Josh Tenenbaum · Brenden Lake -
2020 Poster: Uncertainty-Aware Learning for Zero-Shot Semantic Segmentation »
Ping Hu · Stan Sclaroff · Kate Saenko -
2020 Poster: Universal Domain Adaptation through Self Supervision »
Kuniaki Saito · Donghyun Kim · Stan Sclaroff · Kate Saenko -
2020 Poster: Auxiliary Task Reweighting for Minimum-data Learning »
Baifeng Shi · Judy Hoffman · Kate Saenko · Trevor Darrell · Huijuan Xu -
2020 Poster: Neurosymbolic Transformers for Multi-Agent Communication »
Jeevana Priya Inala · Yichen Yang · James Paulos · Yewen Pu · Osbert Bastani · Vijay Kumar · Martin Rinard · Armando Solar-Lezama -
2020 Poster: AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning »
Ximeng Sun · Rameswar Panda · Rogerio Feris · Kate Saenko -
2019 : Lunch + Poster Session »
Frederik Gerzer · Bill Yang Cai · Pieter-Jan Hoedt · Kelly Kochanski · Soo Kyung Kim · Yunsung Lee · Sunghyun Park · Sharon Zhou · Martin Gauch · Jonathan Wilson · Joyjit Chatterjee · Shamindra Shrotriya · Dimitri Papadimitriou · Christian Schön · Valentina Zantedeschi · Gabriella Baasch · Willem Waegeman · Gautier Cosne · Dara Farrell · Brendan Lucier · Letif Mones · Caleb Robinson · Tafara Chitsiga · Victor Kristof · Hari Prasanna Das · Yimeng Min · Alexandra Puchko · Alexandra Luccioni · Kyle Story · Jason Hickey · Yue Hu · Björn Lütjens · Zhecheng Wang · Renzhi Jing · Genevieve Flaspohler · Jingfan Wang · Saumya Sinha · Qinghu Tang · Armi Tiihonen · Ruben Glatt · Muge Komurcu · Jan Drgona · Juan Gomez-Romero · Ashish Kapoor · Dylan J Fitzpatrick · Alireza Rezvanifar · Adrian Albert · Olya (Olga) Irzak · Kara Lamb · Ankur Mahesh · Kiwan Maeng · Frederik Kratzert · Sorelle Friedler · Niccolo Dalmasso · Alex Robson · Lindiwe Malobola · Lucas Maystre · Yu-wen Lin · Surya Karthik Mukkavili · Brian Hutchinson · Alexandre Lacoste · Yanbing Wang · Zhengcheng Wang · Yinda Zhang · Victoria Preston · Jacob Pettit · Draguna Vrabie · Miguel Molina-Solana · Tonio Buonassisi · Andrew Annex · Tunai P Marques · Catalin Voss · Johannes Rausch · Max Evans -
2019 Poster: Write, Execute, Assess: Program Synthesis with a REPL »
Kevin Ellis · Maxwell Nye · Yewen Pu · Felix Sosa · Josh Tenenbaum · Armando Solar-Lezama -
2019 Poster: Adversarial Self-Defense for Cycle-Consistent GANs »
Dina Bashkirova · Ben Usman · Kate Saenko -
2018 Poster: Learning to Infer Graphics Programs from Hand-Drawn Images »
Kevin Ellis · Daniel Ritchie · Armando Solar-Lezama · Josh Tenenbaum -
2018 Poster: Learning Libraries of Subroutines for Neurally–Guided Bayesian Program Induction »
Kevin Ellis · Lucas Morales · Mathias Sablé-Meyer · Armando Solar-Lezama · Josh Tenenbaum -
2018 Spotlight: Learning to Infer Graphics Programs from Hand-Drawn Images »
Kevin Ellis · Daniel Ritchie · Armando Solar-Lezama · Josh Tenenbaum -
2018 Spotlight: Learning Libraries of Subroutines for Neurally–Guided Bayesian Program Induction »
Kevin Ellis · Lucas Morales · Mathias Sablé-Meyer · Armando Solar-Lezama · Josh Tenenbaum -
2018 Poster: Verifiable Reinforcement Learning via Policy Extraction »
Osbert Bastani · Yewen Pu · Armando Solar-Lezama -
2018 Poster: Speaker-Follower Models for Vision-and-Language Navigation »
Daniel Fried · Ronghang Hu · Volkan Cirik · Anna Rohrbach · Jacob Andreas · Louis-Philippe Morency · Taylor Berg-Kirkpatrick · Kate Saenko · Dan Klein · Trevor Darrell -
2018 Poster: Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections »
Xin Zhang · Armando Solar-Lezama · Rishabh Singh -
2016 : Invited Talk: Domain Adaption for Perception and Action (Kate Saenko, Boston University) »
Kate Saenko -
2016 Poster: Sampling for Bayesian Program Learning »
Kevin Ellis · Armando Solar-Lezama · Josh Tenenbaum -
2015 Workshop: Transfer and Multi-Task Learning: Trends and New Perspectives »
Anastasia Pentina · Christoph Lampert · Sinno Jialin Pan · Mingsheng Long · Judy Hoffman · Baochen Sun · Kate Saenko -
2015 Poster: Unsupervised Learning by Program Synthesis »
Kevin Ellis · Armando Solar-Lezama · Josh Tenenbaum