In recent years, theoretical chemistry has transformed our understanding of the world around us through efficient and accurate calculation of materials properties and catalytic rates or mechanisms. Nevertheless, outstanding challenges remain in going beyond understanding to the computational design of new materials and the prediction of new mechanisms. This is especially true in challenging materials spaces such as open-shell transition metal chemistry, where electronic structure calculations with density functional theory (DFT) are needed because of the relatively large number of electrons to consider but in practice likely lack sufficient accuracy for design. Our group is developing new systematically improvable methods, from systematic multi-scale modeling to machine learning, to enable computational design and mechanistic prediction in open-shell transition metal chemistry. We expect these tools to advance the role of computational chemistry in the physical sciences.
Machine learning accelerated materials discovery
The number of molecules and materials that could potentially be studied far exceeds what is tractable for experiments and even for fast electronic structure calculations. A smarter approach to materials design that bypasses traditional trial and error is needed. Our group is developing computational workflows and novel representations for the machine learning (ML)-accelerated discovery of new transition metal complexes and metal–organic frameworks. Most real-world design challenges involve trade-offs between desired properties, and the time to discovery is dictated by the difficulty of finding a single material that satisfies all design objectives. Using multi-objective Bayesian optimization techniques, we are accelerating traditional discovery approaches by factors of 1000 or more, identifying exceptional materials from millions of candidates for applications from energy storage to catalysis and photoactive materials. We are pairing machine learning models trained on DFT data with models trained on experimental data from the literature to inform our computational workflows about what is experimentally realizable. These tools promise to shorten the time to discovery from decades to weeks.
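The trade-off between competing design objectives can be illustrated with a minimal sketch of a Pareto (non-dominated) filter. This is not the group's workflow and it is not Bayesian optimization itself; it only shows how, once several objectives are scored, the candidates that no other candidate beats on every objective can be picked out. The candidate names, the two objectives, and all scores below are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Tuple[str, Sequence[float]]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, scores in candidates:
        # A candidate never dominates itself, so comparing against the
        # full list (including itself) is safe.
        if not any(dominates(other, scores) for _, other in candidates):
            front.append(name)
    return front


# Hypothetical candidates with made-up scores for two objectives,
# e.g. (objective_1, objective_2); these are illustrative numbers only.
candidates = [
    ("complex_A", (0.9, 0.2)),
    ("complex_B", (0.6, 0.6)),
    ("complex_C", (0.5, 0.5)),   # dominated by complex_B
    ("complex_D", (0.1, 0.95)),
]

print(pareto_front(candidates))
```

In a multi-objective optimization loop, a filter of this kind would typically be applied to model predictions rather than to known values, with the next calculations chosen among candidates expected to extend the front; that selection step is not shown here.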
Enzyme catalysis
Enzymes catalyze nearly every reaction in the cell with exquisite selectivity and with rate enhancements unmatched by non-biological catalysts, and understanding enzyme dysfunction is essential to improving human health. Computational modeling has played an essential role in revealing the mechanisms of newly discovered enzymes and identifying when subtle changes in enzyme structure lead to divergent mechanisms. Nevertheless, the large size of enzymes (ca. thousands of atoms) and the slow time scales over which protein motions occur (ca. picoseconds to seconds) mandate careful choices in balancing cost and accuracy in enzyme modeling, and these choices can influence computational predictions. Our lab aims to understand and overcome the limitations of computational modeling by developing systematic modeling techniques to achieve predictive chemical accuracy in enzyme catalysis. We use these techniques to identify when seemingly similar enzymes catalyze divergent reactions or how enzymes use their distinct architectures to catalyze the same reaction on divergent substrates. This work is aided by high-throughput screening workflows that allow us to study enzymes both with unprecedented accuracy and at larger scale. Much of this work is carried out in close collaboration with enzymologists to aid mechanistic understanding in biological catalysis. We will leverage the knowledge derived from these large-scale studies to design biomimetic catalysts.
Making theoretical chemistry autonomously improvable
Traditional quantum chemical studies have typically been small-scale investigations of well-known materials, for which careful benchmarking against experiment or higher levels of theory is often possible. In the discovery of new materials and mechanisms, such benchmarks are seldom available. Thus, it becomes essential to have techniques that detect when methods are insufficiently accurate in new materials spaces. We are addressing this challenge through the development of artificial intelligence models that detect the presence of difficult multi-reference character that would limit the applicability of more common methods (i.e., DFT). We have also used properties of the quantum mechanical wavefunction to build a density functional “recommender” that can select the best method for a given compound and property. We are also identifying low-cost strategies for the prediction of high-quality energetics at DFT cost or lower using novel machine learning model representations. Through a judicious combination of machine learning models and novel wavefunction descriptors, these workflows will soon be able to decide which calculations to run, a capability that experts normally take years to acquire through experience and intuition.