Scientists Develop Machine Learning Framework to Accelerate Catalyst Discovery

Scientists at the U.S. Department of Energy's (DOE) Brookhaven National Laboratory developed a new machine learning framework that can accelerate the search for better catalysts - the materials that speed up chemical reactions - and offer more reliable results.

Finding high-performing catalysts, which are used to accelerate processes from chemical manufacturing to energy production, can be a slow, expensive process, often relying on years of trial-and-error or massive computational resources. To add to the difficulty, ideal catalyst candidates are rare.

"Imagine driving somewhere new without using GPS," said Brookhaven Lab chemist Wenjie Liao. "You'll probably get there eventually, but you'll take long detours and waste time. The discovery of catalysts can be like that."

Researchers in the Catalysis Reactivity and Structure group in Brookhaven Lab's Chemistry Division tackled those discovery challenges with a new multi-layer machine learning approach that screens catalysts step by step, mimicking how scientists evaluate performance in real experiments.

The team successfully used the chemical conversion of carbon dioxide (CO2) to methanol - a type of alcohol that can be used as fuel - as a case study for the new approach, which outperformed conventional models. The study also shed new light on how scientists can control the key chemical reaction steps that tune two important features of an effective catalyst for that process: activity and selectivity.

A paper describing their work was recently published in Chem Catalysis.

The best catalysts must be active enough to drive reactions efficiently, but selective enough to favor the desired product over unwanted byproducts.

"Highly active and selective catalysts save energy and costs," said Brookhaven Lab chemist Ping Liu, who is also an adjunct professor at Stony Brook University. "An active catalyst means it doesn't require high pressure or high temperatures to speed up a reaction, and a selective catalyst means it doesn't require purification, which can be costly, to get the product you want."

Machine learning models promise faster catalyst discovery, but they face hurdles that the Brookhaven scientists set out to overcome in their study. Existing single-layer models have been limited by the high cost of generating the large databases needed for analysis, by low data quality and by unevenly distributed data, the researchers said. Additionally, conventional models have not been trained with the chemical understanding needed to make accurate predictions about catalysts.

"Simpler one-layer models overlook the domain expertise needed to reliably predict a good catalyst," Liu said. "Based on all these limitations, we developed a multi-layer binary machine learning approach that targets complex reaction networks for real catalysis, which has never been considered before in this kind of model."

Case Study: Turning CO2 Into Methanol

Instead of asking a single model to predict catalyst performance all at once, the Brookhaven team's method breaks the problem into a series of simpler decisions. To test their approach, the researchers studied the performance of copper-based catalysts used to convert CO2 into methanol.

The researchers trained multiple models using synthetic datasets generated from kinetic Monte Carlo simulations, which kept the computational cost low, according to the study. These simulations capture how chemical reactions unfold over time, including competition between multiple reaction pathways - an important feature often missing from simpler models.
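
For readers unfamiliar with kinetic Monte Carlo, the idea of competing pathways can be illustrated with a toy example. The sketch below is not the researchers' simulation: it assumes a single surface intermediate that can react along one of two pathways, and the function name, rate constants and product labels are all hypothetical. At each step it draws a random waiting time from the total rate and then picks a pathway with probability proportional to that pathway's rate.

```python
import random


def kmc_competing_pathways(k_methanol, k_byproduct, n_intermediates, seed=0):
    """Toy kinetic Monte Carlo run for two competing first-order pathways.

    All names and rate constants are illustrative assumptions, not values
    or code from the study.
    """
    if k_methanol < 0 or k_byproduct < 0 or k_methanol + k_byproduct <= 0:
        raise ValueError("rates must be non-negative and not both zero")

    rng = random.Random(seed)
    t = 0.0
    counts = {"methanol": 0, "byproduct": 0}
    remaining = n_intermediates

    while remaining > 0:
        # Every remaining intermediate can react through either pathway.
        rate_m = k_methanol * remaining
        rate_b = k_byproduct * remaining
        total = rate_m + rate_b

        # Advance the clock by an exponentially distributed waiting time.
        t += rng.expovariate(total)

        # Choose which event fires, weighted by its share of the total rate.
        if rng.random() * total < rate_m:
            counts["methanol"] += 1
        else:
            counts["byproduct"] += 1
        remaining -= 1

    # Fraction of intermediates that ended up as the desired product.
    selectivity = counts["methanol"] / n_intermediates if n_intermediates else 0.0
    return t, counts, selectivity


if __name__ == "__main__":
    elapsed, counts, selectivity = kmc_competing_pathways(3.0, 1.0, 2000, seed=1)
    print(elapsed, counts, selectivity)
```

In this toy model the selectivity settles near the ratio of the methanol rate to the total rate, which shows in miniature why the balance between competing pathways, and not just the speed of one step, shapes the outcome.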

"This helps improve the accuracy and reliability of the model," said An Nguyen, a visiting graduate student from Stony Brook University. "Each layer is related to how we think about catalysts as chemists, how we break it down into different categories with chemical or catalysis understanding."

In their case study, the researchers' multi-layer approach asked whether a catalyst could drive the reaction to convert CO2 to methanol, a desired product, and if it performed as well as - or even better than - the copper-based catalyst widely used in industrial and academic applications.
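
The step-by-step screening described above can be sketched as a cascade of yes/no checks. This is only an illustration of the general idea: in the study each layer is a trained machine learning model, whereas the stand-in rules, the descriptor names and the copper reference values below are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    name: str
    methanol_rate: float  # hypothetical activity descriptor, arbitrary units
    selectivity: float    # hypothetical fraction of product that is methanol


# Illustrative reference values only; not numbers from the study.
COPPER = Candidate("Cu-reference", methanol_rate=1.0, selectivity=0.5)


def produces_methanol(c: Candidate) -> bool:
    # Layer 1: does the catalyst make any methanol at all?
    return c.methanol_rate > 0.0


def at_least_as_active(c: Candidate) -> bool:
    # Layer 2: is it at least as active as the copper reference?
    return c.methanol_rate >= COPPER.methanol_rate


def at_least_as_selective(c: Candidate) -> bool:
    # Layer 3: is it at least as selective as the copper reference?
    return c.selectivity >= COPPER.selectivity


LAYERS: List[Callable[[Candidate], bool]] = [
    produces_methanol,
    at_least_as_active,
    at_least_as_selective,
]


def screen(candidates: List[Candidate],
           layers: List[Callable[[Candidate], bool]] = LAYERS) -> List[Candidate]:
    """Pass candidates through each binary layer in order, keeping survivors."""
    survivors = list(candidates)
    for layer in layers:
        survivors = [c for c in survivors if layer(c)]
    return survivors


if __name__ == "__main__":
    pool = [
        Candidate("A", methanol_rate=0.0, selectivity=0.9),
        Candidate("B", methanol_rate=1.5, selectivity=0.4),
        Candidate("C", methanol_rate=1.2, selectivity=0.7),
    ]
    print([c.name for c in screen(pool)])
```

Because each layer answers one simple question, a candidate that fails early never reaches the later, stricter checks, and only the rare candidates that pass every layer remain.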

Applying the new framework, the team successfully screened catalyst designs that were both more active and more selective than copper catalysts. The method consistently outperformed conventional single-layer machine learning models, which struggled to find rare, high-performing candidates.

The framework also revealed which reaction steps mattered most. The analysis showed that transitions between competing reaction pathways - rather than individual steps alone - play a critical role in controlling both activity and selectivity.

"The multilayer approach allows us to dig deeper into the understanding between what we identified as key features and reaction behaviors," Liu said. "We identified key steps that control both the activity and selectivity for CO2 to methanol, providing new insight into this process."

The process of converting CO2 into methanol, known as hydrogenation, is already a commercial process. This work could be a step towards improving the workflow for industry partners, the researchers said. The framework can be adapted to other processes.

To develop the new framework, the researchers used computational resources from the Center for Functional Nanomaterials, a DOE Office of Science user facility at Brookhaven; the Scientific Computing and Data Facilities at Brookhaven; and SeaWulf, a high-performance computing cluster at Stony Brook University.

The research was supported by the DOE Office of Science.
