Open Catalyst Challenge

Abhishek Das · Muhammed Shuaibi · Aini Palizhati · Siddharth Goyal · Adeesh Kolluru · Janice Lan · Ammar Rizvi · Nima Shoghi · Anuroop Sriram · Brook Wander · Brandon Wood · Zachary Ulissi · Larry Zitnick



Advances in renewable energy processes are urgently needed to address climate change and energy scarcity around the world. Many of these processes, including the generation of electricity through fuel cells and the generation of fuel from renewable resources, are driven by chemical reactions. The use of catalysts in these reactions plays a key role in developing cost-effective solutions by enabling new reactions and improving their efficiency. Unfortunately, the discovery of new catalyst materials is limited by the high cost of computational atomic simulations and experimental studies. Machine learning has the potential to reduce the cost of computational simulations by orders of magnitude. Filtering potential catalyst materials based on these simulations would allow more promising candidates to be selected for experimental testing, which could greatly accelerate the rate at which new catalysts are discovered.

The 2nd edition of the Open Catalyst Challenge invites participants to submit results of machine learning models that simulate the interaction of a molecule with a catalyst's surface. Specifically, the task is to predict the energy of an adsorbate-catalyst system in its relaxed state, starting from an arbitrary initial state. From these values, the catalyst's impact on the overall rate of a chemical reaction may be estimated, which is a key factor in filtering potential catalyst materials. Competition participants are provided with training and validation datasets containing over 6 million data samples from a wide variety of catalyst materials, as well as a new test dataset specific to the competition. Results will be evaluated, and winners determined, by verifying the predicted relaxed energies against the computationally expensive approach of Density Functional Theory (DFT). Baseline models and helper code are available on GitHub: