2022 Oct 17;1518(1):106–119. doi: 10.1111/nyas.14913

Computational modeling to assist in the discovery of supramolecular materials

Kim E Jelfs
PMCID: PMC10091946  PMID: 36251351

Abstract

Computational modeling is increasingly used to assist in the discovery of supramolecular materials. Supramolecular materials are typically built primarily from organic components that self‐assemble through noncovalent bonding, and they have potential applications in selective binding, sorption, molecular separations, catalysis, optoelectronics, sensing, and as molecular machines. In this review, the key areas where computational prediction can assist in the discovery of supramolecular materials, including structure prediction, property prediction, and the prediction of how to synthesize a hypothetical material, are discussed, before exploring the potential impact of artificial intelligence techniques on the field. Throughout, the importance of close integration with experimental materials discovery programs is highlighted. A series of case studies from the author's work across several supramolecular material classes is then presented, before finishing with the outlook for the field.

Keywords: computational chemistry, material design, materials discovery, supramolecular chemistry


The review discusses how computational modeling can be used to assist in the discovery of supramolecular materials, including structure prediction, property prediction, and synthesis route prediction. The potential impact of artificial intelligence and the outlook for the field are also discussed.


INTRODUCTION

The discovery of new materials can transform our lives, for example, by facilitating the miniaturization of batteries that has revolutionized electronics, allowing small handheld devices with enormous computing power. However, the discovery process is inherently slow, typically taking decades. An exploration of how new materials are discovered suggests that, while understanding the design principles for a materials field allows incremental improvements in a material class, novel material structures are rarely discovered this way. 1 Serendipitous discoveries of new material classes are rare, but what is evident is that the time lag between the synthesis of a new material and the discovery of its full application potential is almost always decades. 1 This slow discovery rate, set against the problems facing humanity of resource scarcity and climate change, underlines the need to find ways to accelerate the material discovery process.

For the last few decades, computational approaches focused on assisting materials discovery have continued to develop and, alongside massively increased computer power, this is allowing more systems, larger systems, and longer timescales to be simulated. However, there remain significant challenges to the widescale application and success of computer‐guided material discovery. For example, assessing thermodynamic, kinetic, and chemical stability is challenging, as is ensuring that predicted materials are synthetically realizable through known synthetic routes. Similarly, it is difficult to incorporate sufficient complexity in the models to accurately reproduce material properties, as well as to consider factors that influence device‐level performance. However, we have additional technology increasingly at our disposal, including artificial intelligence (AI) techniques, such as machine learning (ML), that can allow rapid property calculation, efficient optimization, and alternative suggestions that a human chemist or material scientist might not propose. In addition, automation has increasingly been used in the laboratory over the last few decades, and while it is challenging to apply to many material synthesis and characterization steps, with further robotics developments, there is significant potential to impact materials discovery.

In this review, the focus is on materials built mostly or entirely from organic components, principally containing carbon, nitrogen, hydrogen, oxygen, and other light atoms. It is arguable that computer‐assisted discovery of such organic materials lags behind that of inorganic materials, where there is more defined strong directional bonding assisting in the prediction process. For organic‐based materials, there is difficulty in the prediction of self‐assembly of organic molecular (or polymeric) subcomponents that is typically directed by only weak(er) noncovalent intermolecular forces. Very small changes to the structure of the individual components can have very large effects on their solid‐state (or solution‐state) arrangement, which naturally has a large impact on the material properties. This makes it very challenging to attempt either forward prediction, from molecular components to solid‐state properties, or inverse design, to go from desired properties backward to the necessary components to achieve those properties.

Supramolecular materials, the focus of this review, are a particular example of organic‐based materials where the material architecture consists of molecules self‐assembled without direct chemical bonding. Examples of supramolecular systems include receptors, cation‐ and anion‐binding hosts, self‐healing polymers, compounds that can sorb gases or molecules, such as network solids like polymers and metal‐organic frameworks (MOFs) and covalent‐organic frameworks (COFs), supramolecular catalysts, interlocked molecules, molecular devices, such as molecular electronics and sensors, liquid crystals, dendrimers, and supramolecular gels. Some examples are shown in Figure 1. This review will first discuss the areas where computational prediction can assist in the discovery of supramolecular materials, before moving on to discuss example case studies across different material classes. These case studies will be focused on a few examples of supramolecular materials and their applications from my own research and are not aimed at being comprehensive and exhaustive. Finally, I will discuss the outlook for this field as new technologies are increasingly coming to the point of fruition.

FIGURE 1.


Examples of supramolecular materials. From the top left: a porous organic cage, a metal‐organic cage, a covalent‐organic framework (COF), metal‐organic framework (MOF), a fragment of a porous organic polymer, and a pseudo‐rotaxane molecule. The first five systems can form host/guest supramolecular systems. Carbons are shown in gray, nitrogen in blue, oxygen in red, boron in pink, iron in orange‐brown, zinc in purple, and bromine in maroon. Hydrogens are omitted.

COMPUTATIONAL PREDICTION

Computational materials chemistry approaches have a long history of assisting in rationalizing, at an atomistic level, the origin of experimental observations. This insight and understanding of structure–property relationships has allowed for guided optimization of target properties. Ideally, however, the computation could be used in another way, where new materials could be tested on a computer to identify only the most promising materials for synthesis. Given the difference in timeframes, hours–days–weeks for computation to weeks–months–years for experimental synthesis and characterization, along with the difference in cost, the potential benefit of reliable computation is clear. However, the process for a prediction is not trivial; the unpredictability of the self‐assembly process in supramolecular materials means that inverse design from a desired set of functions back to a material that can achieve those functions can largely be ruled out for now. Instead, a process of screening would need to occur, where one first predicts the material structure and then the properties with further calculations. 2 Thereafter, materials with a promising set of properties can be selected for experimental testing.

Structure prediction

The goal of any structure prediction process is to predict the solid‐state, or in some cases solution‐state, structure of the material from knowledge of only the material's subcomponents. For a supramolecular material, this would mean being provided with only a two‐dimensional chemical sketch of the precursors and being able to predict first how those precursors would react together, then the three‐dimensional molecular conformation, and finally how that molecule would arrange in the solid state. While the reaction type is typically known from the initial selection, and the molecular conformation can be predicted relatively easily with standard conformer searching algorithms combined with a classical mechanics description of the molecule (that is, without the need for a quantum mechanical description), the last task is the most challenging. Typically, a molecule can have many potential solid‐state arrangements, known as polymorphs, that differ only slightly in energy, and thus small errors in the energetic description of the system in the model can result in an incorrect prediction.

While predicting the crystal packing of a molecule, a process referred to as crystal structure prediction (CSP), is challenging, it is not impossible and significant progress has been made over the last decade. CSP for organic molecules is widely used for pharmaceutical polymorph prediction as an aid to experimentally finding and characterizing the polymorphs relevant to the development of pharmaceutical products and protecting intellectual property. 3 Toward this aim, and to allow comparison of approaches and to survey the state‐of‐the‐art field, a series of blind tests have been held. 4 For these blind tests, participating groups are given several molecular structures and several months to predict their crystal structure. Over the six tests to date, the results have shown significant improvement despite the increasing complexity of the molecules, for example, with increasing conformational flexibility or hydrate structures. 4

While the exact method for CSP varies, the process typically involves testing, either randomly or through another optimization approach, thousands or hundreds of thousands of hypothetical crystal structures for the provided molecule(s). 3 The relative energy of the different hypothetical polymorphs is then assessed. Generally, an energy assessment is initially done at a classical mechanics level, necessarily computationally low cost due to the large number of structures for which the energy must be calculated. However, a key component of the improved performance of CSP approaches in the last 10 years has been the addition of electronic structure calculations as a final, more accurate energy ranking of the hypothetical polymorphs. 4 This typically involves using density functional theory (DFT) calculations with an effective correction for the poorly described dispersion interactions in DFT—the latter point being key for the very accurate energy description that will inevitably be very sensitive to errors in the nonbonded interactions in these self‐assembled materials. 5 , 6 Assuming an accurate energetic description, then it can reasonably be expected that the lowest energy hypothetical polymorphs are the ones that will be experimentally observable. 7
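The final ranking step described above can be illustrated with a minimal sketch: given lattice energies for a set of hypothetical polymorphs, convert them to energies relative to the global minimum and keep only those within a chosen window. The function name, the structure labels, and the window value are illustrative assumptions, not part of any CSP package or of the methods cited here.

```python
def low_energy_polymorphs(energies, window):
    """Return (label, relative_energy) pairs within `window` of the minimum.

    energies : dict mapping a structure label to its lattice energy
               (any consistent unit, e.g. kJ/mol)
    window   : energy cutoff above the global minimum (same unit);
               the value to use is a user choice, assumed here
    """
    e_min = min(energies.values())
    relative = [(label, e - e_min) for label, e in energies.items()]
    kept = [item for item in relative if item[1] <= window]
    # Sort so that the most stable hypothetical polymorph comes first.
    return sorted(kept, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical energies for three predicted structures.
    example = {"A": -120.0, "B": -118.5, "C": -110.0}
    print(low_energy_polymorphs(example, window=5.0))
```

In practice, the energies fed into such a ranking would come from the most accurate method that is affordable for the shortlisted structures.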

Of course, not all supramolecular materials are crystalline in the solid‐state, and many operate in solution, where it can be expected that solvent effects are important to consider to accurately predict the materials’ properties. The situation can be further complicated by solvent being present during the synthesis but absent from the final solid‐state structure. This can include the templating effect of solvent on a structure with specific interactions directing to a given structure, or space‐filling effects where it is the bulk of the solvent molecule(s) and not specific interactions influencing the structure, for example, inflating a porous molecule. We have previously developed an approach to reproduce the “inflated” open structures of porous organic molecules that lie hundreds of kJ mol−1 above the lowest energy nonporous conformation. 8 The open, higher energy, but metastable conformations are not found by typical conformer search approaches, and so instead, we developed constrained molecular dynamics (MD) approaches that enforce open structures while still searching for lower energy open conformations. There are also approaches that can quantitatively predict solvent influence on noncovalent interactions and, thus, properties such as functional group interaction profiles. 9 The method by, and the extent to which, solvent interactions need to be included in calculations on either structure or property is system‐ and problem‐dependent but will need careful consideration and is often very challenging to effectively consider.

Structure prediction of amorphous materials is possible via a variety of different routes. This can include coarse‐grained models that simplify the description of a system by modeling groups of atoms together rather than directly describing the interactions of each atom (or indeed of electrons). 10 This may be necessary to simulate the larger length scales required to consider lower long‐range order. Alternatively, for amorphous polymer systems, for example, porous polymer membranes, it may be necessary to simulate with all atoms to describe the small pore openings and specific interactions responsible for molecular separation performance. In this case, polymerization software can be used that starts with a random packing of the molecular constituents of the polymer, before allowing iterative bond formation combined with dynamical annealing procedures to reach realistic structures. 11 While not the focus of the research discussed in later sections here, describing the dynamical behavior of many supramolecular systems is key to accurately considering their structure in a model, and consequently their properties. An example of this is from coarse‐grain modeling of bioinspired supramolecular polymers that are dynamically exchanging their monomers, which is key to their self‐healing behavior and ability to reconfigure in response to stimuli. 12

Property prediction

If the structure of a supramolecular material is known, either from databases of known structures or from structure prediction, then the property of the material can be predicted through simulation. The accuracy of theoretical predictions relative to experimental measurements is highly dependent on the specific system and property, with only qualitatively correct results in some cases (which can still be very useful), but quantitatively accurate results in others. This underlines the importance of field expertise and of computational researchers working closely with experimental experts to ensure a valuable level of accuracy is reached. Properties that can be calculated include, but are not limited to, guest binding or sorption and thermodynamic or kinetic selectivity, binding energies and constants, porosity, transport and diffusion rates, activation energies for catalysis, electronic energy levels, UV‐Vis spectra, optical or band gap, and charge transfer integrals or mobility.

While in the following section prediction using AI will be discussed, here the focus is on prediction through direct molecular simulation. Of course, the type of simulation to be used depends on multiple factors, including the number of atoms in the structural model, the number of materials that are to be screened for their properties, the level of accuracy required, and the type of property being predicted. There are several different types of theory levels for simulations, starting from coarse‐graining, then atomistic or forcefield calculations, where each atom is directly considered with the intramolecular and intermolecular interactions between atoms described with simple functional forms, for example, bonds as balls and springs with Hooke's law to describe the energy change with bond length. Forcefield calculations use classical mechanics as they do not directly consider electron motion. Forcefields are parameterized either against experimental data or more accurate simulations. While ideally, a forcefield will be transferable across many different systems, this is typically at the cost of a reduction in accuracy as forcefields will perform best closest to the systems they were parameterized on. However, as supramolecular materials consist of organic components, there may be reasonable accuracy across a series of materials from one forcefield.
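The ball‐and‐spring picture mentioned above can be written down directly. The following is a minimal sketch of a harmonic (Hooke's law) bond‐stretch term, E = ½ k (r − r0)²; the function names and the parameter values are illustrative assumptions and are not taken from any particular forcefield.

```python
def harmonic_bond_energy(r, r0, k):
    """Hooke's-law bond-stretch energy: E = 0.5 * k * (r - r0)**2.

    r  : current bond length (angstrom)
    r0 : equilibrium bond length (angstrom)
    k  : force constant (kJ mol^-1 angstrom^-2)
    """
    return 0.5 * k * (r - r0) ** 2


def harmonic_bond_force(r, r0, k):
    """Restoring force along the bond: F = -dE/dr = -k * (r - r0)."""
    return -k * (r - r0)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    r0, k = 1.5, 2000.0
    for r in (1.4, 1.5, 1.6):
        print(r, harmonic_bond_energy(r, r0, k), harmonic_bond_force(r, r0, k))
```

A full forcefield sums many such terms (bonds, angles, torsions, and nonbonded interactions); only the bond term is shown here.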

If a forcefield calculation is not sufficiently accurate, or cannot describe the feature of interest, for example, chemical reactivity or optical activity, then electronic structure calculations can be used. DFT calculations are commonly used and while they are much more computationally expensive than forcefield calculations, DFT is still computationally cheaper than higher‐level electronic structure methods. DFT calculations make use of quantum mechanics and describe the properties of many‐electron systems with functionals (functions of functions) of the electron density. The selection of the functional is key to accurate property calculation. As supramolecular systems are, by their very nature, held together by intermolecular interactions, and the dispersion component of these is not described well by standard DFT calculations, it is very important to include a description of dispersion in another way. This can be achieved through (empirical) corrections, such as Grimme's dispersion corrections, 13 or exchange‐hole dipole moment models. 14 For consideration of optical properties, such as excitation energies and photoabsorption spectra, time‐dependent DFT can be used. 15 There are also semiempirical methods that sit in between forcefield and DFT calculations in terms of their computational cost. We have found these very useful for supramolecular materials, where the computational cost of DFT calculations can be prohibitive because the systems typically have a large number of atoms and we are often considering many systems. One semiempirical method that we have made frequent use of is GFN2‐xTB from Grimme and coworkers; it is a semiempirical tight‐binding method, designed for fast calculations of structures, intermolecular interactions, and optoelectronic properties. 16 We have previously calibrated GFN2‐xTB against DFT calculations to increase the accuracy of the method for particular tasks. 17

When assisting in supramolecular material discovery, one is often postrationalizing experimentally observed properties to learn structure–property relationships, and thus design rules for further optimization of the material, but the ultimate goal of computational researchers in this field is to be predictive in advance of the experiment. In this case, it is likely that many (hundreds, thousands, millions) structures will need to be considered. The computational cost of this, even with “cheap” property calculations, will be considerable and, thus, workflows need to be developed that reflect this, unless it is possible to apply a “brute force” approach where the properties of all the materials under consideration can be calculated. A common approach is to use a filtering method, where one starts with a computationally cheap, but less accurate, method applied to the largest number of structures in order to remove materials that clearly do not meet the property requirements, before moving to a more accurate, but more computationally expensive, method. Filtering methods can have multiple layers of methods before the final set of promising materials is identified. At that point, typically an experimental supramolecular chemist will select materials for synthesis and experimental testing. Filtering methods do need to be applied with care and with knowledge of the accuracy of the computational methods. Beyond filtering methods, a variety of different optimization approaches can be used to efficiently explore the search space of possible materials, minimizing the number of material property calculations that need to be performed.
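The layered filtering idea above can be sketched as a short funnel. This is a minimal illustration under stated assumptions: the function name `funnel_screen`, the toy candidates, and both "methods" are hypothetical stand‐ins for real calculations.

```python
def funnel_screen(candidates, stages):
    """Apply screening stages in order, cheapest first.

    candidates : iterable of candidate materials (any objects)
    stages     : list of (property_function, passes) pairs, where
                 property_function(candidate) returns a number and
                 passes(value) returns True if the candidate survives
    Returns the survivors and the number of evaluations made per stage.
    """
    survivors = list(candidates)
    evaluations = []
    for property_function, passes in stages:
        evaluations.append(len(survivors))
        survivors = [c for c in survivors if passes(property_function(c))]
    return survivors, evaluations


if __name__ == "__main__":
    # Toy candidates: each carries a hidden "true" property value.
    candidates = [{"id": i, "true": i / 10.0} for i in range(100)]

    def cheap_estimate(c):      # hypothetical low-cost method, error of +/-1
        return c["true"] + (1.0 if c["id"] % 2 else -1.0)

    def accurate_estimate(c):   # hypothetical high-cost method
        return c["true"]

    stages = [
        (cheap_estimate, lambda v: v >= 7.0),     # loose cut allows for cheap-method error
        (accurate_estimate, lambda v: v >= 8.0),  # the actual requirement
    ]
    shortlist, evaluations = funnel_screen(candidates, stages)
    print(len(shortlist), evaluations)
```

The first‐stage threshold is deliberately looser than the requirement, by the assumed error of the cheap method, so that promising candidates are not discarded early. This reflects the point above that filters must be applied with knowledge of each method's accuracy.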

While AI‐based optimization algorithms are beginning to be used, we have previously developed an evolutionary algorithm‐based approach for seeking optimal supramolecular material properties. 18 Evolutionary algorithms mimic evolution in nature: in our case, the properties of an initial “generation” of supramolecular materials are calculated, with those with better properties (“fitter individuals”) being more likely to proceed to the next generation. Chemical modifications are performed on the materials that mimic crossover and mutation, and over the generations, the “fitness” of the materials will improve. A particular feature of our group's work has been to try to use one of the complicating factors of supramolecular materials, their multicomponent nature, in our favor from a screening perspective. Calculating the properties from a solid‐state structure has disadvantages, both from the time‐consuming nature of having to first perform structure prediction and from the increased computational cost of property calculation on systems with many atoms. The modular nature of supramolecular materials opens the possibility of first (or only) screening the properties of an individual module of the material, as this should be very descriptive of the bulk properties. We have demonstrated this can be surprisingly effective, even in scenarios where it would be expected that the solid‐state packing would significantly influence the properties. 19
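The evolutionary loop described above (selection of fitter individuals, crossover, mutation) can be sketched generically. This is a toy illustration and not the published implementation from reference 18: candidates are represented simply as tuples of indices into a hypothetical building‐block library, and the fitness function is supplied by the user.

```python
import random


def evolve(fitness, n_blocks, n_positions, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Toy evolutionary search over candidates built from building blocks.

    A candidate is a tuple of `n_positions` integers, each an index into a
    (hypothetical) library of `n_blocks` building blocks.
    """
    rng = random.Random(seed)
    population = [tuple(rng.randrange(n_blocks) for _ in range(n_positions))
                  for _ in range(pop_size)]

    def tournament():
        # Fitter individuals are more likely to be chosen as parents.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        best = max(population, key=fitness)
        next_generation = [best]  # elitism: never lose the best candidate
        while len(next_generation) < pop_size:
            parent1, parent2 = tournament(), tournament()
            # Crossover: combine building blocks from both parents at a cut point.
            cut = rng.randrange(1, n_positions) if n_positions > 1 else 0
            child = list(parent1[:cut] + parent2[cut:])
            # Mutation: occasionally replace one building block at random.
            if rng.random() < mutation_rate:
                child[rng.randrange(n_positions)] = rng.randrange(n_blocks)
            next_generation.append(tuple(child))
        population = next_generation
    return max(population, key=fitness)


if __name__ == "__main__":
    # Hypothetical fitness: prefer building-block index 7 at every position.
    print(evolve(lambda c: -sum((x - 7) ** 2 for x in c), n_blocks=10, n_positions=3))
```

In a real workflow, the fitness call would be a property calculation on the assembled material, which is usually the dominant cost, so caching fitness values for repeated candidates is worthwhile.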

A key class of materials explored in our group are (micro)porous materials, where their porosity can be used for a variety of applications, including encapsulation, selective guest uptake, and molecular separation through diffusion and sensing. There are several different types of calculations that can be used for determining the properties of porous materials. These include algorithms for void analysis that can determine the size of the internal cavities, the degree of interconnection of the pore channels, surface areas, and pore volumes. Grand canonical Monte Carlo (GCMC) calculations can be used to calculate the guest uptake at a given temperature and pressure, allowing simulations of sorption isotherms and the ability to determine guest selectivities. 20 GCMC calculations are generally applied to rigid, static structures, and thus care must be taken to consider if ignoring the true flexible nature of any porous host will give erroneous results. 20 MD simulations can be used to explore the dynamic motion of a guest through a host system, including consideration of the flexible nature of the porous host. 20 MD simulations can provide information on guest diffusion, including guest diffusion rates to determine selectivity. In scenarios where the guest diffusion is slow, enhanced sampling methods, such as metadynamics or umbrella sampling, can be applied, 21 which encourage the simulations to explore the full energy landscape rather than only local minima–in our case encouraging guest diffusion to occur more often. To reproduce behavior when a system is not in an equilibrium state, nonequilibrium MD can be used, 22 for example, to explore ion selectivity in a porous material within a battery device by applying an electric field.
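As one concrete example of extracting a guest diffusion rate from an MD trajectory, the sketch below computes the mean squared displacement (MSD) of guest molecules and converts its slope to a diffusion coefficient with the three‐dimensional Einstein relation, MSD(t) ≈ 6Dt. The function names and input layout are assumptions made for illustration; positions are assumed to be unwrapped (not folded back into a periodic cell) and sampled at equal time intervals.

```python
def mean_squared_displacement(trajectories):
    """MSD per frame, averaged over guests, measured from each first frame.

    trajectories : list with one entry per guest molecule; each entry is a
                   list of (x, y, z) positions at equally spaced times
    """
    n_frames = len(trajectories[0])
    msd = []
    for frame in range(n_frames):
        total = 0.0
        for traj in trajectories:
            total += sum((a - b) ** 2 for a, b in zip(traj[frame], traj[0]))
        msd.append(total / len(trajectories))
    return msd


def diffusion_coefficient(times, msd):
    """Einstein relation in 3D: MSD(t) ~ 6 D t, so D = slope / 6.

    The slope is taken from a least-squares straight-line fit of MSD
    against time.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_m = sum(msd) / n
    slope = (sum((t - mean_t) * (m - mean_m) for t, m in zip(times, msd))
             / sum((t - mean_t) ** 2 for t in times))
    return slope / 6.0
```

The fit is only meaningful over the time range where the MSD is linear in time; for slowly diffusing guests, that regime may not be reached in a standard simulation, which is where the enhanced sampling methods mentioned above become relevant.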

Using the approaches and software we have developed for porous organic materials, we have also explored organic electronic systems. There are many potential properties of interest here, but they include the HOMO and LUMO levels, dipole moments, ionization potential, electron affinity, and optical gaps, and here the most used methods are DFT or time‐dependent DFT calculations. Here, careful consideration of the selection of the theory level is needed to ensure reasonable accuracy. Another consideration, for example, in semiconductors, is the material's charge mobility, which we typically calculate assuming a charge‐hopping regime for transport using DFT calculations to calculate charge transfer integrals and nonadiabatic Marcus theory to calculate charge mobility. 23
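For the charge‐hopping picture mentioned above, the standard nonadiabatic Marcus expression for the hopping rate between two sites is k = (2π/ħ) J² (4πλk_BT)^(−1/2) exp[−(ΔG + λ)²/(4λk_BT)], where J is the charge transfer integral and λ the reorganization energy. The sketch below evaluates it in eV units and includes the Einstein relation for converting a diffusion coefficient to a mobility. The function names are assumptions, and the step from hopping rates to a diffusion coefficient, which depends on the packing model, is deliberately left out.

```python
import math

HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV s
KB_EV_K = 8.617333262e-5      # Boltzmann constant in eV / K


def marcus_rate(coupling_ev, reorganization_ev, temperature_k, delta_g_ev=0.0):
    """Nonadiabatic Marcus hopping rate in s^-1.

    coupling_ev       : charge transfer integral J (eV)
    reorganization_ev : reorganization energy lambda (eV)
    delta_g_ev        : free energy change of the hop (eV); 0 for
                        hopping between equivalent sites
    """
    kt = KB_EV_K * temperature_k
    prefactor = (2.0 * math.pi / HBAR_EV_S) * coupling_ev ** 2
    density = 1.0 / math.sqrt(4.0 * math.pi * reorganization_ev * kt)
    barrier = (delta_g_ev + reorganization_ev) ** 2 / (4.0 * reorganization_ev * kt)
    return prefactor * density * math.exp(-barrier)


def einstein_mobility(diffusion_cm2_s, temperature_k):
    """Einstein relation mu = e*D/(kB*T), returned in cm^2 V^-1 s^-1."""
    return diffusion_cm2_s / (KB_EV_K * temperature_k)


if __name__ == "__main__":
    # Hypothetical inputs: J = 50 meV, lambda = 0.2 eV, T = 300 K.
    print(marcus_rate(0.05, 0.2, 300.0))
```

The rate scales with J², so errors in the computed transfer integrals propagate quadratically into the predicted mobility.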

A combination of CSP approaches and property calculation can be used to calculate and then explore energy–structure–function maps, 24 where the relative energies of hypothetical polymorph structures are known, as well as their properties. This has been applied to both porous organic molecular systems 25 and organic semiconductors. 26 As well as providing a lot of information on structure–property relationships, the use of energy–structure–function maps can assist in the identification of new phases with promising properties, providing experimenters with the confidence to put considerable effort into the synthesis of those materials. This approach has allowed the experimental realization of new phases of materials with record‐breaking surface areas, even though the molecule itself was known—it had just not been crystallized in the record‐breaking polymorph structure. 25

While not the focus of our group's research, it is important to mention that the calculation of bulk material properties alone may not be sufficient for consideration of the material in a device. Factors, such as defects, processing conditions, and macroscopic features, such as interfaces and grain boundaries, can all heavily influence the material performance. Device‐level modeling of materials, for instance, multicomponent devices, such as solar cells and batteries, can be performed to explore the optimal device setup, 27 , 28 , 29 , 30 and this should be considered alongside modeling of bulk material properties.

Progress using AI

AI has been making an impact across many fields, although within chemistry, its wider application is often thwarted by a lack of the quantity and quality of data required to train ML models. Arguably, the greatest use of ML within the materials discovery field is supervised ML of material properties. A supervised ML model can predict properties on the order of seconds, orders of magnitude faster than an electronic structure calculation. This completely changes the number of materials for which properties can be predicted, as well as the size of the structural models that can be treated. In addition to the selection of an appropriate algorithm and its setup, it is necessary to determine descriptors that capture the key chemical, structural, and electronic characteristics of the material. Descriptors are often numerical representations, typically in vector form, and for many properties, the descriptors will need to be three‐dimensional in nature to provide chemical insight. There are myriad choices for molecular descriptors, and it is generally best to test many different ones for their performance. Descriptors of the solid‐state or intermolecular arrangements are more challenging, and it may be that further ones need to be developed that are particularly suited to capturing key features of supramolecular systems.
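To make the descriptor‐to‐property idea concrete, the following is a deliberately simple sketch of supervised prediction: a k‐nearest‐neighbor regressor that predicts a property for a new material as the average over the most similar labeled materials in descriptor space. The choice of algorithm, the function name, and the toy data are assumptions for illustration; real applications would use carefully chosen descriptors and a validated model.

```python
import math


def knn_predict(train_descriptors, train_properties, query, k=3):
    """Predict a property as the mean over the k nearest training points.

    train_descriptors : list of equal-length numeric descriptor vectors
    train_properties  : list of known property values (the labels)
    query             : descriptor vector of the material to predict
    """
    # Pair each Euclidean distance with its label, nearest first.
    neighbors = sorted(
        (math.dist(d, query), p)
        for d, p in zip(train_descriptors, train_properties)
    )[:k]
    return sum(p for _, p in neighbors) / len(neighbors)


if __name__ == "__main__":
    # Hypothetical two-component descriptors and property labels.
    descriptors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    properties = [1.0, 2.0, 3.0, 100.0]
    print(knn_predict(descriptors, properties, (0.1, 0.1), k=3))
```

Because the prediction depends entirely on distances between descriptor vectors, the quality of the descriptors matters at least as much as the choice of algorithm.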

Supervised ML models can be used to make property predictions when trained by labeled data where the output is known, for example, a set of material structures with known properties. However, supervised ML can require 10,000s to 100,000s of training data points depending on the algorithm being used, and while there have been significant developments in large‐scale databases of material structures and their properties, these have largely been focused on inorganic materials and for the field of supramolecular materials, such databases are essentially absent. It may, however, be possible to first build a database through molecular simulations on material structures and then use the calculated properties as the training data. An alternative route to finding sufficient ML training data is to leverage the enormous quantity of scientific literature in journal articles, technical documents, and patents. As the scientific literature is largely written in a formulaic way, there is the potential to apply text‐mining algorithms based upon natural language processing and named entity recognition to automate the data extraction that would be intractable manually. 31 Cole and coworkers have developed ChemDataExtractor, which is trained for extraction of chemical information, including, for example, identification of the names and properties within a chemical text, and can be extended to new tasks. Within the field of supramolecular materials, there are challenges in that there is not always complete consistency in reporting methods; key information may sometimes be found in the supporting information, which is more challenging to text‐mine; and the large complex supramolecular structures are sometimes only represented with images, which then require image recognition software to convert to a typical format for a chemical structure.

ML algorithms can also be used for efficient optimization over a search space to quickly find the material with the optimal properties. One increasingly used type of algorithm is Bayesian optimization; 32 for an unknown function, such as how a property relates to a large number of material variables, Bayesian optimization can be used to select the next set of materials to test. The approach balances exploration of new areas where there is high uncertainty in the model predictions against exploitation, where the model seeks to improve the function based on known information. The approach has begun to be used in areas of materials chemistry, either combined with experiment 33 or to reduce the number of calculations required in simulation. 34 Unsupervised ML can learn patterns in data that are not labeled (in contrast to supervised ML, where the data are labeled). Unsupervised ML can be used to provide insight into the relationships in data, for example, by clustering or classifying structures with related motifs or properties; examples include clustering thousands of hypothetical polymorphs from a CSP calculation of organic electronic materials according to whether or not they have similar packing motifs, 35 or classifying the arrangements of supramolecular assemblies. 36
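The exploration/exploitation balance of Bayesian optimization described above can be sketched in a few dozen lines. This is a minimal, dependency‐free illustration under stated assumptions: a zero‐mean Gaussian process with a squared‐exponential kernel acts as the surrogate model, candidates are described by a single scalar variable, and the next candidate is chosen by an upper confidence bound (mean + kappa × standard deviation). All function names and default values are hypothetical choices, not those of any cited work.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def gp_posterior(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    mean = sum(k * w for k, w in zip(k_star, solve(K, y_train)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def bayesian_optimize(objective, candidates, n_initial=2, n_iterations=8, kappa=2.0):
    """Maximize `objective` over a discrete candidate list.

    Large kappa favors exploration (uncertain regions); small kappa favors
    exploitation (regions the surrogate already predicts to be good).
    """
    tested = list(candidates[:n_initial])
    values = [objective(x) for x in tested]
    for _ in range(n_iterations):
        untested = [x for x in candidates if x not in tested]
        if not untested:
            break

        def ucb(x):
            mean, std = gp_posterior(tested, values, x)
            return mean + kappa * std

        nxt = max(untested, key=ucb)
        tested.append(nxt)
        values.append(objective(nxt))
    best = max(range(len(tested)), key=lambda i: values[i])
    return tested[best], values[best], tested
```

Each call to `objective` stands in for an expensive experiment or simulation, so the point of the surrogate is to spend those calls only where the model is either promising or uncertain.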

The above screening approaches have focused upon workflows where a large number of hypothetical materials are constructed, with property prediction following structure prediction. These are likely to rely on the use of databases of known materials or the use of libraries of building blocks. An issue with libraries is that if designed by (synthetic) chemists, it is highly unlikely that any very novel structures will be predicted, and the search space has already been defined—often quite narrowly based on only a handful of building blocks. Generative models are a type of unsupervised ML that when trained on an existing dataset can then produce new examples of that data. Generative models in chemistry can be used to provide a different route to the generation of new candidate systems, 37 and can also provide some potential for routes that would allow true inverse design rather than high‐throughput screening. Generative ML models generally require large training datasets and decompose materials and their structures to a continuous vector representation known as latent space. This latent space can either be used to generate similar materials to those in the training set or the space can be sampled to generate new material structures, potentially while seeking to optimize the materials’ properties. Examples of generative models include variational autoencoders and generative adversarial networks, both making use of neural networks. 37

Overall, there is a lot of potential for ML to impact computational supramolecular materials discovery, but this is hindered in particular by the lack of available data and so is still in an early stage. However, with increased awareness of the need to generate high quality and quantity datasets, including data on “failures,” alongside algorithmic developments focused on operating in “low data” regimes, such as one‐shot learning that may need only dozens of data points, 38 the use of ML in supramolecular chemistry is set to grow.

Synthetic route prediction

Arguably, the most challenging stage in trying to use computation to guide materials discovery programs is the translation of a computational prediction to realization in the laboratory. There can be many reasons behind this, including cost, lack of trust in the prediction, or the infeasibility of the material, for example, owing to a lack of chemical or thermodynamic stability. Another significant reason is that a computer‐predicted material does not come with a synthesis procedure for its experimental realization. 39 This problem can be tackled in a variety of ways, including working closely with experimental researchers (see further discussion below), using viable chemical precursors, either because they are commercially available or have known inexpensive and simple synthesis routes, or using synthesis prediction software. Computer‐aided synthesis planning (CASP) has a long history, focused on the synthesis of organic molecules. 40 For materials, further efforts are needed, although it is notable that the first stage in supramolecular material synthesis is the synthesis of the organic molecular precursors. AI has been making an impact in the CASP field with the development of retrosynthesis planners, facilitated by efforts to collate large‐scale databases of known reactions. While feasible routes can now be found with these planners, finding and ranking the best routes is still challenging and many planners are only commercially available. For a material, the synthesis route may sometimes be a simple one‐pot reaction, but finding the synthesis variables for optimal properties may be more challenging due to the size of the search space. For example, MOFs can have very variable surface areas dependent on the solvent and temperature used for the synthesis and so efforts have been made to use computation to efficiently guide the exploration of the synthesis search space. 41

Working together with experimenters

At any stage of a computational materials discovery process, it is best to work in synergy with experimental partners. 42 Our group has had considerable success in working closely together with synthetic groups, working to fully understand the problems faced by experimental researchers and identifying new ways in which we can contribute to accelerating their progress. This ensures that the most relevant properties are prioritized (for example, the solubility of the supramolecule may be key to processing in a device), that the synthesizability of both the material precursors and the final supramolecular material is considered, and that the computation can assist in optimizing the material and its synthesis conditions. It is more efficient if experiment and computation are integrated with feedback loops throughout the process, rather than occurring only sequentially or with computation being applied only after synthesis.

CASE STUDIES ACROSS MATERIAL CLASSES

In this section, some case studies from our group will be highlighted that demonstrate various aspects of computational assistance in supramolecular materials discovery. The case studies cover several different classes of materials, as approaches and software developed to screen one supramolecular material can often be easily adapted and transferred to new materials and properties. We have worked most extensively on porous organic molecules, specifically porous organic cages, which are discrete molecules with an internal cavity that has more than two entrance/exit routes for guests. 43 , 44 Porous organic cages have potential application in encapsulation, molecular separations, and sensing. Coordination cages, or metal‐organic polyhedra, also have cage‐like structures but, with the inclusion of coordinating complexes, can have additional application in catalysis. 45 , 46 We have also explored organic molecules and linear polymers with desired (opto)electronic features as semiconductors for organic field effect transistors, as organic dyes for dye‐sensitized solar cells, and for photocatalytic water‐splitting. Finally, we have explored amorphous network polymers that when formed in membranes can be used in molecular separations, including for solvent and liquid separations and for ion separations, for instance, within flow batteries.

Software for computational supramolecular materials

Fundamental to our ability to perform any computational discovery on supramolecular materials has been the development of software to assist in the assembly, structure prediction, and property prediction stages. To that end, we have developed several pieces of open‐source software. This software includes pyWindow, developed by Miklitz and Jelfs, which analyzes the internal void of systems, such as porous organic cages, determining the size of the internal cavity, the number of pore windows, and the size of the windows. 47 pyWindow is sufficiently computationally efficient that it can be applied to analyze the output of MD simulations, allowing the exploration of how pore windows breathe over time both in the presence and absence of guest molecules. Our key group software is the supramolecular toolkit (stk), which was reported by Turcani et al. 48 and further developed more recently. 49 Online documentation and tutorials are available to support new users. stk is written in Python and can assemble molecular components into a range of different supramolecular assemblies, including porous organic cages in a range of topologies, metal‐organic cages, metal‐organic complexes, linear polymers, COFs, and interlocked molecules. This range of materials and topologies is easily extendable. Thereafter, stk interfaces with other software to perform molecular structure prediction and then property calculation; again, it is easily extendable to new calculations through interfacing with other software. stk already interfaces with pyWindow and GFN2‐xTB. stk also contains an evolutionary algorithm to allow for searching the chemical space of supramolecular materials with target properties.

Porous organic cages

A significant part of our group's research has been to expand on ideas of how to use computation to assist in porous organic cage discovery. 50 Through using stk's evolutionary algorithm, Berardo et al. predicted promising porous organic cage targets with target pore sizes of 16 Å, as well as identifying design rules for such systems, such as <20% of their bonds being rotatable. 18 The evolutionary algorithm was implemented such that a crossover switched the molecular precursors of two parent cages and a mutation swapped one of an individual's precursors with one from the library, either randomly or so that the new precursor was similar to the one it was replacing. This mixture of random and similar mutations was found to be effective at searching the large search space for cages. The fitness function that the evolutionary algorithm seeks to optimize is determined by the user and can include a series of properties with different weightings, with specific features either selected for or penalized. The approach and setup for the evolutionary algorithm were validated by demonstrating that it could rediscover a known cage, CC3, from a search space of tens of thousands of cages (Figure 2). The approach could also be used to allow different cage topologies to compete to see whether specific topologies were selected when the algorithm was targeting a given fitness function.
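The crossover and mutation operations described above can be sketched in a few lines. This is a toy illustration under stated assumptions and is not the stk implementation: each precursor is represented by a single integer standing in for a molecule, "similarity" is simply the difference between two integers, and the target, weights, and selection scheme are hypothetical. In a real search, the fitness would be computed from modeled cage structures.

```python
import random


def crossover(parent_1, parent_2):
    """Swap the precursors of two parent cages to give two offspring."""
    return (parent_1[0], parent_2[1]), (parent_2[0], parent_1[1])


def mutate(cage, libraries, rng, similar):
    """Replace one precursor with another from the same library.

    If similar is True, the replacement is drawn from the library members
    closest to the current precursor (toy one-number descriptor);
    otherwise it is drawn at random from the whole library.
    """
    position = rng.randrange(2)
    current = cage[position]
    options = [bb for bb in libraries[position] if bb != current]
    if similar:
        nearest = min(abs(bb - current) for bb in options)
        options = [bb for bb in options if abs(bb - current) == nearest]
    mutated = list(cage)
    mutated[position] = rng.choice(options)
    return tuple(mutated)


def fitness(cage, target=(7, 3), weights=(1.0, 0.5)):
    """Toy weighted fitness: zero for a perfect match, negative otherwise."""
    return -sum(w * abs(bb - t) for w, bb, t in zip(weights, cage, target))


def evolve(libraries, generations=30, population_size=8, seed=0):
    """Run a small elitist evolutionary search and return the fittest cage."""
    rng = random.Random(seed)
    population = [
        (rng.choice(libraries[0]), rng.choice(libraries[1]))
        for _ in range(population_size)
    ]
    for _ in range(generations):
        offspring = []
        for _ in range(population_size):
            parent_1, parent_2 = rng.sample(population, 2)
            offspring.extend(crossover(parent_1, parent_2))
        # Mix "similar" and random mutations with equal probability.
        mutants = [
            mutate(cage, libraries, rng, similar=rng.random() < 0.5)
            for cage in population
        ]
        # Keep the fittest unique cages from parents, offspring, and mutants.
        pool = set(population + offspring + mutants)
        population = sorted(pool, key=fitness, reverse=True)[:population_size]
    return population[0]


# Two hypothetical precursor libraries of ten members each.
libraries = (list(range(10)), list(range(10)))
best = evolve(libraries)
```

The weights in `fitness` play the role of the user‐defined weightings mentioned in the text: terms could be added to select for, or penalize, further properties.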

FIGURE 2.

Rediscovery of known porous organic cages with an evolutionary algorithm. The evolutionary algorithm sought to maximize the fitness value, which was a combination of a match to the CC3 pore size and symmetry of the windows. Shown is the fitness value distribution (indicated by color, where yellow is a poor score and purple is a good score) of 49,700 porous organic cages when trying to rediscover the known cage CC3. Cages with higher fitness are represented with darker, larger dots. CC3 is shown in A (with the highest score) and B–D are three other cages with different fitness values. The x and y axes show the similarity of the two building blocks to those of CC3, with 0 representing no similarity and 1 representing a perfect match. Reproduced from Ref. 18 with permission from the Royal Society of Chemistry.

The predictions from Berardo et al. had an issue, in that most of the predicted porous organic cages were not chemically feasible to synthesize, particularly when time and cost were considered. This was due to the lengthy, multistep syntheses that would be required to synthesize many of the precursors. The precursor library used in the study was taken from the Reaxys database, with the intention that this would at least mean that there was a reported route for their syntheses. However, a literature route being available does not mean the route is simple or low cost, as would be desired for an experimental materials discovery program. When the evolutionary algorithm was run, the fitness function only included factors, such as void size and molecular symmetry, and thus there was no factor that penalized synthetic complexity. Existing algorithms that rank synthetic complexity and accessibility have not been developed with materials synthesis in mind and thus we found that they were not particularly useful for consideration in our prediction pipeline. Therefore, Bennett et al. instead developed the Materials Precursor Score (MPS), a model that aims to predict chemists’ intuition for synthetic accessibility. 51 This was applied to the same evolutionary algorithm runs for porous organic cages as in the Berardo et al. study and was found to significantly improve the synthetic accessibility of the predictions. MPS was developed by building an ML model that used training data from experimental materials chemists’ ranking of organic precursor molecules. These data were collected by building an app that showed the user a molecule and asked them to rank it as synthetically accessible at a reasonable scale (>1 g) in less than five steps or not.

Ideally, we would like to be able to be given only the two‐dimensional chemical sketch of the organic molecule precursors of a supramolecular material, and from that be able to predict how they will react together (e.g., molecular mass), their molecular conformation, and their solid‐state assembly. From the solid‐state structure, the properties can be calculated. We had separately worked on all these different stages, but in work by Greenaway et al. in 2019, we were able to put all these stages together for the first time. 52 This began with the idea of combining three different precursors into a porous organic cage, compared to only two precursors in most reported cages. This added an additional layer of complexity, in that the precursors could react with “social self‐sorting,” where all three components combined into one cage molecule, or “narcissistic self‐sorting,” where they separate into different two‐component cages. So, the first prediction task was which of these sorting options would be observed, and this was approached by assembling the different possibilities, searching for their low energy conformations with a forcefield‐based approach, and then calculating their relative energies using DFT calculations. These revealed that the three‐component assemblies were energetically competitive and, indeed, they were then experimentally observed. The next stage was to conduct CSP calculations to predict the molecules’ crystal packing. This revealed preferences for racemic over enantiopure packing, alongside preferred packing modes in each case. Further experimental work to crystallize and determine the molecules’ single crystal X‐ray diffraction crystal structures confirmed that the packing preference predictions were correct.

A particular prediction challenge with porous organic cages is to predict which topology, and, thus, molecular mass, will be observed for a given reaction of two precursors—for instance, whether a [2 + 3], [4 + 6], or [8 + 12] reaction will be preferred. 53 We have approached this by conformer search calculations followed by DFT to compare the relative energies of the different reaction outcomes. In research by Greenaway et al. in 2018, this approach was tested on a large scale. 54 Seventy‐eight precursor pairings were selected and high‐throughput calculations to predict the assemblies were run alongside high‐throughput automated experimental screening. This revealed that the topology prediction approach was effective but could be incorrect in cases where the energy differences between the different assemblies were small (<15 kJ mol−1). This is a result of the simplification of the calculations, where the solvent is neglected but could sway the result in scenarios of small energy differences. Of the 78 reactions attempted experimentally, only 33 were successful (42%), despite the precursors being selected by experimental chemists with high hope of success. This demonstrates the ongoing challenge to predict which synthesis reactions are successful and under which conditions. Greenaway et al. compared DFT‐predicted formation energies against the reaction success result and observed that the DFT calculations picked up key trends, for example, that one precursor that was energetically unfavorable had almost entirely unsuccessful experimental reactions. However, the formation energies were not perfect indicators, in that some very energetically favorable reactions were not experimentally successful; this could be due to the lack of solvent in the calculations, or because the reaction gets kinetically trapped before reaching the final product, something not considered in the calculations that only examine the energetics of the reactions and final product.
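The comparison of candidate topologies by relative energy, together with the observation that predictions become unreliable when the energy gap is small, can be sketched as follows. This is an illustration only: it assumes that the energies have already been normalized so that assemblies of different size are directly comparable, and the numerical values and function name are hypothetical. The 15 kJ mol−1 threshold is the value quoted in the text.

```python
def predict_topology(relative_energies, threshold=15.0):
    """Pick the lowest-energy topology and flag near-degenerate cases.

    relative_energies maps a topology label to an energy in kJ/mol that
    is assumed to be normalized so that topologies of different size
    are comparable. Returns (label, gap to runner-up, confident flag).
    """
    ranked = sorted(relative_energies.items(), key=lambda item: item[1])
    (best, e_best), (_, e_next) = ranked[0], ranked[1]
    gap = e_next - e_best
    return best, gap, gap >= threshold


# Hypothetical relative energies for one precursor pairing.
energies = {"[2+3]": 42.0, "[4+6]": 0.0, "[8+12]": 9.5}
topology, gap, confident = predict_topology(energies)
```

Here the [4 + 6] assembly is predicted, but the gap to the next topology is below the threshold, so the prediction would be flagged as one where neglected effects, such as solvent, could change the outcome.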

A key property prediction for porous organic cages is whether they are shape persistent, which is the feature of maintaining an internal void even in the absence of solvent. This is an important prediction given the vast majority of systems lack shape persistence and it could take 1–2 years to discover this through experiment. 55 Computationally, this prediction is relatively simple, in that if forcefield‐level MD simulations show the lowest energy conformation is open, rather than collapsed with no void, then it can be expected that the cage is shape persistent. However, this requires the ability to conduct those simulations and we typically use the OPLS3 56 forcefield, which is commercial, due to its transferability across new systems; this forcefield is not available to most experimental labs. Therefore, Turcani et al. built a supervised ML model to predict whether a cage is shape persistent or not from knowledge only of the two‐dimensional connectivity of its precursors. 57 As sufficient experimental data were not available (only a few hundred cages have been experimentally reported), the training data were built by constructing cage models using stk and then running >66,000 short forcefield‐level MD simulations and using pyWindow to classify the molecules’ shape persistency (whether there was a void in the lowest energy sampled conformation from the MD simulation). All of this could be automated with scripts such that the task of 66,000 MD simulations was not arduous for the researcher. The ML model not only predicts cage shape persistency in seconds but was also made available as a web app so that no computational experience was required to access the predictions. More recently, Yuan et al. used the same dataset to train a graph neural network for the same prediction, achieving better performance as well as some explicability as to the substructures that promote shape persistency. 58
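The labeling rule used to build the training data (is there a void in the lowest energy sampled conformation?) can be written as a short function. This is a sketch under stated assumptions: the conformer energies and cavity diameters below are hypothetical numbers standing in for MD and void‐analysis output, and the cut‐off below which a cavity is treated as collapsed is an assumed parameter, not a value from the original work.

```python
def is_shape_persistent(conformers, min_cavity_diameter=1.0):
    """Label a cage from its sampled conformers.

    conformers is a list of (energy, cavity_diameter) pairs. The cage is
    labeled shape persistent if its lowest-energy conformer retains a
    cavity larger than min_cavity_diameter (an assumed cut-off, in Å).
    """
    _, cavity = min(conformers, key=lambda conformer: conformer[0])
    return cavity > min_cavity_diameter


# Hypothetical sampled conformers: (relative energy, cavity diameter in Å).
open_cage = [(-10.0, 5.4), (-8.2, 0.3), (-7.9, 5.1)]
collapsed_cage = [(-12.5, 0.2), (-9.0, 4.8)]

labels = [is_shape_persistent(open_cage), is_shape_persistent(collapsed_cage)]
```

Labels generated in this way for many cages would then form the targets for a supervised classifier that takes only precursor connectivity as input.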

Metal‐organic cages

With the above capabilities for porous organic cages in hand, one obvious extension was to apply these approaches to assist in the discovery of metal‐organic cages. 59 This is more challenging than for porous organic cages, as the different metals and their coordination can be challenging to model without electronic structure methods, particularly if one wants an approach that is transferable to new topologies and coordination chemistries. In addition, the effects of solvent and counterions can be important to consider in the calculations, which significantly increases the complexity. However, we have been able to develop an approach that has been applicable and validated against experiment for several different systems. 60 , 61 With a validated approach, Tarzia et al. were able to apply this to the high‐throughput screening of new low‐symmetry Pd2L4 cages. 62 Low‐symmetry cages involve ligands where the coordinating groups at either end of the ditopic ligand (L) are chemically distinct. This adds complexity in that there are four possible isomers depending upon the relative orientation of the ligands with respect to each other. In this combined computational and experimental workflow (Figure 3), we used GFN2‐xTB to provide a low‐cost approach to assess the assembly outcomes of 60 different unsymmetrical ligands, calculating the relative energies of the total 240 different isomer cages. Several ligands were then selected for synthesis based upon a stronger energetic preference for a single cis isomer as well as how close the Pd center was to an ideal square‐planar geometry. Several of the predicted new low‐symmetry Pd2L4 cages were experimentally realized.
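The energetic part of this selection step can be sketched as a simple filter over the four isomer energies of each ligand. This is an illustration only: the isomer labels other than cis, the energy values, and the minimum energy gap are hypothetical, and the second selection criterion (the deviation of the Pd center from square‐planar geometry) is omitted.

```python
def prefers_cis(isomer_energies, min_gap=6.0, cis_label="cis"):
    """Return True if the cis isomer is lowest in energy by at least min_gap.

    isomer_energies maps an isomer label to a relative energy in kJ/mol;
    min_gap is an assumed cut-off.
    """
    cis_energy = isomer_energies[cis_label]
    others = [e for name, e in isomer_energies.items() if name != cis_label]
    return min(others) - cis_energy >= min_gap


# Hypothetical relative energies (kJ/mol) for the four isomers of two ligands.
screen = {
    "ligand_1": {"isomer_a": 20.1, "isomer_b": 12.4, "cis": 0.0, "trans": 9.8},
    "ligand_2": {"isomer_a": 3.0, "isomer_b": 0.0, "cis": 1.2, "trans": 4.4},
}

shortlist = [name for name, energies in screen.items() if prefers_cis(energies)]
n_isomer_cages = sum(len(energies) for energies in screen.values())
```

For the 60 ligands in the study, the same bookkeeping gives 60 × 4 = 240 isomer cages to evaluate, which is why a low‐cost method is needed for the energy calculations.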

FIGURE 3.

High‐throughput screening of metal‐organic cages. (A) The joint experimental and computational workflow shows relative time frames of each step. (B) Unsymmetrical cage ligand ijk formed from three building blocks from the library (shown in (C) and (D)), where j is a core building block, and i and k are two different coordinating end groups for the ligand, with the connecting points between the building blocks shown with purple circles. (E) Assembly of the Pd2L4 structure from the unsymmetrical ligand and Pd using stk. Figure reused from Ref. 62 with permission from Wiley.

Molecular organic electronics

A further extension of our work has been to molecular organic systems, where we have explored energy–structure–function relationships for chiral organic semiconductors. Experimentally, it is very challenging to distinguish the influence of molecular chirality, molecular changes, and crystal packing on the properties, in particular charge transport. This is because changing any one of these almost inevitably changes the others: slight changes to molecular structure, such as changing a chlorine group to fluorine, can completely change the crystal packing. This type of emergent behavior is distinct from many inorganic solid‐state systems where there are many isostructural systems, for example, perovskites, where it is possible to substitute different cations without changing the structure. Using CSP, we had previously explained the hundred‐fold change in charge mobility when moving from the racemic to enantiopure aza[6]helicene system. 63 This was due to the racemic and enantiopure structures packing in completely different crystal arrangements, which influenced the charge transport routes. Thus, we carried out CSP on a single carbo[6]helicene to explore how changing the crystal structure alone, while keeping the same molecular structure, influenced the charge mobility. 26 We found that a particular packing motif was almost always responsible for high electron mobility and that this packing motif was found in about two‐thirds of the low‐energy hypothetical polymorphs. On this basis, in work by Schmidt et al., we conducted computational molecular‐level screening of >1300 different functionalized [6]helicenes assuming this common packing motif in an attempt to optimize electron mobility without the need for time‐consuming full CSP calculations on this large number of systems. 64 We used DFT calculations to calculate the transfer integral and inner reorganization energy for each system as indicative metrics for high charge mobility in a crystal. This identified that it was very difficult to out‐compete carbo[6]helicene, but that some fluoro[6]helicenes were promising. CSP calculations were used to verify the results for the most promising candidates (Figure 4).
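To see why the transfer integral and the reorganization energy are useful indicators, it helps to look at a model that combines them. One widely used choice is the semiclassical Marcus expression for the charge hopping rate between two equivalent molecules; the sketch below assumes this model for illustration, and the text above does not state which expression, if any, was used in the screening. The input values are hypothetical.

```python
import math

HBAR_EV_S = 6.582119569e-16    # reduced Planck constant in eV s
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def marcus_hopping_rate(transfer_integral, reorganization_energy, temperature=300.0):
    """Marcus self-exchange hopping rate in s^-1 (both energies in eV).

    k = (2*pi/hbar) * J^2 * (4*pi*lambda*kB*T)^(-1/2) * exp(-lambda / (4*kB*T))
    """
    kt = K_B_EV_PER_K * temperature
    prefactor = (2.0 * math.pi / HBAR_EV_S) * transfer_integral**2
    density = 1.0 / math.sqrt(4.0 * math.pi * reorganization_energy * kt)
    return prefactor * density * math.exp(-reorganization_energy / (4.0 * kt))


# Hypothetical values: doubling J, or doubling lambda, relative to a reference.
reference = marcus_hopping_rate(0.05, 0.2)
larger_coupling = marcus_hopping_rate(0.10, 0.2)
larger_reorganization = marcus_hopping_rate(0.05, 0.4)
```

Within this model, the rate grows with the square of the transfer integral and falls as the reorganization energy increases, so candidates with a large transfer integral and a small reorganization energy are the ones of interest.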

FIGURE 4.

Exploring the energy landscapes of chiral organic semiconductors. The energy landscapes calculated via CSP calculations for (A) mono‐ and (B) di‐fluoro[6]helicenes identified from screening >1300 molecules as potential electron transport materials. As full CSP calculations are computationally costly, a molecular‐level screening process was developed that assumed a translational one‐dimensional packing motif that is both common in previously reported [6]helicene experimental structures and present in approximately two‐thirds of hypothetical polymorphs of [6]helicenes where CSP has been carried out. The lowest 10 polymorphs are numbered, and cyan‐colored points contain the assembly motif assumed in the screening workflow, whereas pink points do not have that motif. Figure reused with permission from Ref. 64.

An interesting question is whether generative AI algorithms can produce different ideas or make “wild card” suggestions that a human might not have considered. Yuan et al. tested how well recurrent neural networks (RNNs) perform at producing organic molecules with target optoelectronic properties. 65 As there are no sufficiently large databases of molecules with these properties, the RNN was first trained on a library of more than 1 million organic molecules so that it learnt how to reproduce the syntax of chemical structures. Thereafter, a library of ∼200 molecules with the desired optoelectronic properties was used for transfer learning, so that the RNN produced new optoelectronic molecules with the target properties. In this case, the algorithm “learnt” to do functional transformations, such as switching oxygen for sulfur and adding methyl groups, which are approaches used by human chemists in that field for property tuning at a molecular level. While “wild card” suggestions were not found, it remains impressive that the algorithm can learn these key approaches in the field, and this may indicate that chemists have fully explored the molecular transformation possibilities for this system.

Polymers for photocatalysis and molecular separations

Polymers can have many different applications and we explored with Zwijnenburg and coworkers the chemical space of ∼350,000 different organic linear polymers using stk and a calibrated GFN2‐xTB approach for calculating their optoelectronic properties, including ionization potential, electron affinity, and optical gap. 66 This large dataset allowed for the training of a neural network to predict these properties in seconds as well as an exploration of the property limits for this material class and data‐led insight into the effect of functional groups on these optoelectronic properties. We have also applied a similar approach to study linear polymers for photocatalysis 17 and small molecules as potential dyes for dye‐sensitized solar cells. 67

Polymer membranes have significant potential in a range of separations, including of liquids and of ions, for example, in redox flow batteries. A challenging aspect of the design of these systems is that their amorphous, disordered nature means that there is a lack of understanding of their atomistic structure and the structure–property relationships. Therefore, in this field, we have focused on structure prediction, particularly of polymers of intrinsic microporosity (PIMs) that are designed to pack together inefficiently. We use the Polymatic software developed by Colina and coworkers: 11 after a simulation cell is loaded with the monomers, polymer bonds are iteratively formed and, finally, the structure is annealed. The final structures can be analyzed for their void structure, or through further simulations for other properties to explain the experimental observations (Figure 5). We have used this approach to rationalize trends in properties with changing monomer units for PIMs for liquid separations, 68 the separation of hydrocarbons, 69 and selective ion separation for flow batteries. 70 , 71 , 72 We have also recently extended our approach to the simulation of the structures of amorphous MOFs, allowing an explanation of the level of disorder and defects through comparison to experimental scattering measurements 73 , 74 and unique insight into the effect of defects and disorder on properties that would not be possible through experiment alone. 75

FIGURE 5.

Amorphous polymer membrane models analyzed for porosity. An amorphous model of a polymer built from a contorted monomer (reactants shown in the top right). (Left) The atomistic model, excluding hydrogens, with voids colored blue. (Middle) The pore network of the model, colored green where the pores are interconnected, and so accessible to the diffusion of a small guest, and red where the voids are not interconnected, and so inaccessible to the guest. (Right) Voids are colored according to their size, with larger voids yellow/red and smaller voids dark blue, allowing insight into the pore size distribution and uniformity of the pore system. Adapted with permission from Ref. 68.

Beyond insight into structure–property relationships in porous polymer membranes, we would ideally be able to rapidly predict properties from knowledge of the monomer structures alone, rather than needing to first predict structure. While one can imagine the use of ML for this task, we do not currently have data on the scale needed to train these models. However, Yuan et al. have recently used ML to fill in missing data (a process known as “imputation”) for incomplete databases of polymer permeability data. 76 In these databases, there are reports of permeability for some guests, but not all, meaning not all potentially important selectivity performances are known. Using the multiple imputation by chained equations algorithm, the missing permeabilities could be filled in and then used to calculate all selectivities. One can imagine using this to revisit historical data to identify promising polymers that were not fully experimentally characterized, but also so that from only one or two initial experimental data points for a new system, all key selectivities could be calculated in seconds, thus saving experimental effort and directing characterization to the most promising applications. The results also showed that, where data were available for only a single guest, some guests were better indicators of the missing values than others.
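Once a missing permeability has been filled in, the selectivities follow directly. The sketch below assumes the common definition of ideal selectivity as the ratio of two single‐guest permeabilities; the guests, the numerical values, and the choice of which entry is treated as imputed are all hypothetical, and the imputation step itself is not shown.

```python
from itertools import permutations


def ideal_selectivities(permeabilities):
    """Return the ratio P_a / P_b for every ordered pair of guests."""
    return {
        (a, b): permeabilities[a] / permeabilities[b]
        for a, b in permutations(permeabilities, 2)
    }


# Hypothetical permeabilities in a common unit. The CH4 value stands in for
# an entry that was missing from a database and has been filled by imputation.
polymer = {"CO2": 1200.0, "N2": 60.0, "CH4": 80.0}
selectivity = ideal_selectivities(polymer)
```

With three guests there are six ordered pairs, and two of the three guest pairings only become available once the third permeability has been imputed, which illustrates how a single filled‐in value opens up several selectivities.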

OUTLOOK

The computational discovery of supramolecular materials is relatively underexplored when compared with inorganic materials, where there are large‐scale screening efforts and many different initiatives to collate databases of structures and properties. This underexploration is a result of the increased structural complexity of these soft materials, where there are complicating factors, such as the influence of solvent, a large range of structural forms, and often significant dynamic behavior influencing properties. In our own work, we will continue to expand our open‐source software to new classes of materials built from organic building blocks and to new properties and applications. The field could better exploit the potential of AI techniques with community efforts to collate databases of supramolecular structures and their properties, which will also require efforts to standardize reporting of structure, properties, and synthesis routes. Such databases would allow the development of models to predict properties rapidly and would help to prevent missed opportunities, where a new material is tested only for the application originally in mind and its promise in another area is missed. Automation techniques, while challenging to apply in many areas of supramolecular chemistry, provide potential routes not only to larger‐scale experimental screening but also to the generation of data for ML. Continuing to increase the integration of experimental and computational materials research programs will also increase the effectiveness of the discovery process. Progress is also being assisted by increasing computational and coding literacy among experimental researchers, meaning improved understanding between researchers and the ability to fully exploit the increasingly available open‐source software and data. Overall, there is a bright future for the contribution of computational modeling in the field of supramolecular chemistry.

COMPETING INTERESTS

The author declares no competing interests.

ACKNOWLEDGMENTS

I acknowledge funding from the European Research Council under FP7 (CoMMaD, ERC Grant No. 758370) and the Royal Society for a University Research Fellowship.

Jelfs, K. E. (2022). Computational modeling to assist in the discovery of supramolecular materials. Ann NY Acad Sci., 1518, 106–119. 10.1111/nyas.14913.

Funding information

H2020 European Research Council, Grant/Award Number: CoMMaD, ERC Grant No. 758370; Royal Society, Grant/Award Number: University Research Fellowship

REFERENCES

  • 1. Cheetham, A. K. , Seshadri, R. , & Wudl, F. (2022). Chemical synthesis and materials discovery. Nature Synthesis, 1, 514–520. [Google Scholar]
  • 2. Jansen, M. , & Schon, J. C. (2004). Rational development of new materials putting the cart before the horse? Nature Materials, 3, 838. [DOI] [PubMed] [Google Scholar]
  • 3. Price, S. L. (2014). Predicting crystal structures of organic compounds. Chemical Society Reviews, 43, 2098–2111. [DOI] [PubMed] [Google Scholar]
  • 4. Reilly, A. M. , Cooper, R. I. , Adjiman, C. S. , Bhattacharya, S. , Boese, A. D. , Brandenburg, J. G. , Bygrave, P. J. , Bylsma, R. , Campbell, J. E. , Car, R. , Case, D. H. , Chadha, R. , Cole, J. C. , Cosburn, K. , Cuppen, H. M. , Curtis, F. , Day, G. M. , DiStasio, R. A. Jr , Dzyabchenko, A. , … Groom, CR. (2016). Report on the sixth blind test of organic crystal structure prediction methods. Acta Crystallographica, B72, 439–459. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Otero‐De‐La‐Roza, A. , & Johnson, E. R. (2012). A benchmark for non‐covalent interactions in solids. Journal of Chemical Physics, 137, 054103. [DOI] [PubMed] [Google Scholar]
  • 6. Maurer, R. J. , Freysoldt, C. , Reilly, A. M. , Brandenburg, J. G. , Hofmann, O. T. , Björkman, T. , Lebègue, S. , & Tkatchenko, A. (2019). Advances in density‐functional calculations for materials modeling. Annual Review of Materials Research, 49, 1–30. [Google Scholar]
  • 7. Price, S. L. (2014). Lattice energy, nailed? Science, 345, 619–620. [DOI] [PubMed] [Google Scholar]
  • 8. Santolini, V. , Tribello, G. A. , & Jelfs, K. E. (2015). Predicting solvent effects on the structure of porous organic molecules. Chemical Communications, 51, 15542–15545. [DOI] [PubMed] [Google Scholar]
  • 9. Driver, M. D. , Williamson, M. J. , Cook, J. L. , & Hunter, CA. (2020). Functional group interaction profiles: A general treatment of solvent effects on non‐covalent interactions. Chemical Science, 11, 4456–4466. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Joshi, S. Y. , & Deshmukh, S. A. (2020). A review of advancements in coarse‐grained molecular dynamics simulations. Molecular Simulation, 47, 1–18. [Google Scholar]
  • 11. Abbott, L. J. , Hart, K. E. , & Colina, C. M. (2013). Polymatic: A generalized simulated polymerization algorithm for amorphous polymers. Theoretical Chemistry Accounts, 132, 1334. [Google Scholar]
  • 12. Bochicchio, D. , Salvalaglio, M. , & Pavan, G. M. (2017). Into the dynamics of a supramolecular polymer at submolecular resolution. Nature Communications, 8, 147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Grimme, S. , Antony, J. , Ehrlich, S. , & Krieg, H. (2010). A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT‐D) for the 94 elements H‐Pu. Journal of Chemical Physics, 132, 154104. [DOI] [PubMed] [Google Scholar]
  • 14. Price, A. J. A. , Bryenton, K. R. , & Johnson, E. R. (2021). Requirements for an accurate dispersion‐corrected density functional. Journal of Chemical Physics, 154, 230902. [DOI] [PubMed] [Google Scholar]
  • 15. Maitra, N. T. (2016). Perspective: Fundamental aspects of time‐dependent density functional theory. Journal of Chemical Physics, 144, 220901. [DOI] [PubMed] [Google Scholar]
  • 16. Bannwarth, C. , Ehlert, S. , & Grimme, S. (2019). GFN2‐xTB—An accurate and broadly parametrized self‐consistent tight‐binding quantum chemical method with multipole electrostatics and density‐dependent dispersion contributions. Journal of Chemical Theory and Computation, 15, 1652–1671. [DOI] [PubMed] [Google Scholar]
  • 17. Wilbraham, L. , Berardo, E. , Turcani, L. , Jelfs, K. E. , & Zwijnenburg, M. A. (2018). High‐throughput screening approach for the optoelectronic properties of conjugated polymers. Journal of Chemical Information and Modeling, 58, 2450–2459. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Berardo, E. , Turcani, L. , Miklitz, M. , & Jelfs, K. E. (2018). An evolutionary algorithm for the discovery of porous organic cages. Chemical Science, 9, 8513–8527. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Miklitz, M. , Jiang, S. , Clowes, R. , Briggs, M. E. , Cooper, A. I. , & Jelfs, K. E. (2017). Computational screening of porous organic molecules for xenon/krypton separation. Journal of Physical Chemistry C, 121, 15211–15222. [Google Scholar]
  • 20. Jelfs, K. E. (2021). Computer simulation of porous materials. Royal Society of Chemistry. [Google Scholar]
  • 21. Yang, Y. I. , Shao, Q. , Zhang, J. , Yang, L. , & Gao, Y. Q. (2019). Enhanced sampling in molecular dynamics. Journal of Chemical Physics, 151, 070902. [DOI] [PubMed] [Google Scholar]
  • 22. Frentrup, H. , Avendaño, C. , Horsch, M. , & Müller, E. A. (2012). Modelling fluid flow in nanoporous membrane materials via non‐equilibrium molecular dynamics. Procedia Engineering, 44, 383–385. [Google Scholar]
  • 23. Marcus, R. A. (1956). On the theory of oxidation‐reduction reactions involving electron transfer. I. Journal of Chemical Physics, 24, 966–978. [Google Scholar]
  • 24. Day, G. M. , & Cooper, A. I. (2017). Energy–structure–function maps: Cartography for materials discovery. Advanced Materials, 30, 1704944. [DOI] [PubMed] [Google Scholar]
  • 25. Pulido, A. , Chen, L. , Kaczorowski, T. , Holden, D. , Little, M. A. , Chong, S. Y. , Slater, B. J. , McMahon, D. P. , Bonillo, B. , Stackhouse, C. J. , Stephenson, A. , Kane, C. M. , Clowes, R. , Hasell, T. , Cooper, A. I. , & Day, G. M. (2017). Functional materials discovery using energy–structure–function maps. Nature, 543, 657–664. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Rice, B. , Leblanc, L. M. , Otero‐De‐La‐Roza, A. , Fuchter, M. J. , Johnson, E. R. , Nelson, J. , & Jelfs, K. E. (2018). A computational exploration of the crystal energy and charge‐carrier mobility landscapes of the chiral [6]helicene molecule. Nanoscale, 10, 1865–1876. [DOI] [PubMed] [Google Scholar]
  • 27. Dai, C. , Liu, Y. , & Wei, D. (2022). Two‐dimensional field‐effect transistor sensors: The road toward commercialization. Chemical Reviews, 122, 10319–10392. [DOI] [PubMed] [Google Scholar]
  • 28. Yan, Q. , & Kanatzidis, M. G. (2022). High‐performance thermoelectrics and challenges for practical devices. Nature Materials, 21, 503–513. [DOI] [PubMed] [Google Scholar]
  • 29. Zhang, L. , Feng, R. , Wang, W. , & Yu, G. (2022). Emerging chemistries and molecular designs for flow batteries. Nature Reviews Chemistry, 6, 524–543. [DOI] [PubMed] [Google Scholar]
  • 30. Baloch, A. A. B. , Aly, S. P. , Hossain, M. I. , El‐Mellouhi, F. , Tabet, N. , & Alharbi, F. H. (2017). Full space device optimization for solar cells. Scientific Reports, 7, 11984. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Kononova, O. , He, T. , Huo, H. , Trewartha, A. , Olivetti, E. A. , & Ceder, G. (2021). Opportunities and challenges of text mining in materials research. iScience, 24, 102155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Shields, B. J. , Stevens, J. , Li, J. , Parasram, M. , Damani, F. , Alvarado, J. I. M. , Janey, J. M. , Adams, R. P. , & Doyle, A. G. (2021). Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590, 89–96. [DOI] [PubMed] [Google Scholar]
  • 33. Burger, B. , Maffettone, P. M. , Gusev, V. V. , Aitchison, C. M. , Bai, Y. , Wang, X. , Li, X. , Alston, B. M. , Li, B. , Clowes, R. , Rankin, N. , Harris, B. , Sprick, R. S. , & Cooper, A. I. (2020). A mobile robotic chemist. Nature, 583, 237–241. [DOI] [PubMed] [Google Scholar]
  • 34. Pyzer‐Knapp, E. O. , Chen, L. , Day, G. M. , & Cooper, A. I. (2021). Accelerating computational discovery of porous solids through improved navigation of energy–structure–function maps. Science Advances, 7, eabi4763. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Yang, J. , De, S. , Campbell, J. E. , Li, S. , Ceriotti, M. , & Day, G. M. (2018). Large‐scale computational screening of molecular organic semiconductors using crystal structure prediction. Chemistry of Materials, 30, 4361–4371. [Google Scholar]
  • 36. Gardin, A. , Perego, C. , Doni, G. , & Pavan, G. M. (2022). Classifying soft self‐assembled materials via unsupervised machine learning of defects. Communications Chemistry, 5, 82. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Bilodeau, C. , Jin, W. , Jaakkola, T. , Barzilay, R. , & Jensen, K. F. (2022). Generative models for molecular discovery: Recent advances and challenges. WIREs Computational Molecular Science, 12(5), e1608. [Google Scholar]
  • 38. Altae‐Tran, H. , Ramsundar, B. , Pappu, A. S. , & Pande, V. (2017). Low data drug discovery with one‐shot learning. ACS Central Science, 3, 283–293. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Szczypiński, F. T. , Bennett, S. , & Jelfs, K. E. (2021). Can we predict materials that can be synthesised? Chemical Science, 12, 830–840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Coley, C. W. , Green, W. H. , & Jensen, K. F. (2018). Machine learning in computer‐aided synthesis planning. Accounts of Chemical Research, 51, 1281–1289. [DOI] [PubMed] [Google Scholar]
  • 41. Moosavi, S. M. , Chidambaram, A. , Talirz, L. , Haranczyk, M. , Stylianou, K. C. , & Smit, B. (2019). Capturing chemical intuition in synthesis of metal‐organic frameworks. Nature Communications, 10, 539. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Greenaway, R. L. , & Jelfs, K. E. (2021). Integrating computational and experimental workflows for accelerated organic materials discovery. Advanced Materials, 33, 2004831. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Hasell, T. , & Cooper, A. I. (2016). Porous organic cages: Soluble, modular and molecular pores. Nature Reviews Materials, 1, 10686. [Google Scholar]
  • 44. Zhang, G. , & Mastalerz, M. (2014). Organic cage compounds – From shape‐persistency to function. Chemical Society Reviews, 43, 1934–1947. [DOI] [PubMed] [Google Scholar]
  • 45. Inokuma, Y. , Kawano, M. , & Fujita, M. (2011). Crystalline molecular flasks. Nature Chemistry, 3, 349–358. [DOI] [PubMed] [Google Scholar]
  • 46. Tranchemontagne, D. J. , Ni, Z. , O'Keeffe, M. , & Yaghi, O. M. (2008). Reticular chemistry of metal–organic polyhedra. Angewandte Chemie, 47, 5136–5147. [DOI] [PubMed] [Google Scholar]
  • 47. Miklitz, M. , & Jelfs, K. E. (2018). pywindow: Automated structural analysis of molecular pores. Journal of Chemical Information and Modeling, 58, 2387–2391. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Turcani, L. , Berardo, E. , & Jelfs, K. E. (2018). stk: A python toolkit for supramolecular assembly. Journal of Computational Chemistry, 39, 1931–1942. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Turcani, L. , Tarzia, A. , Szczypiński, F. T. , & Jelfs, K. E. (2021). stk: An extendable Python framework for automated molecular and supramolecular structure assembly and discovery. Journal of Chemical Physics, 154, 214102. [DOI] [PubMed] [Google Scholar]
  • 50. Jelfs, K. E. , & Cooper, A. I. (2013). Molecular simulations to understand and to design porous organic molecules. Current Opinion in Solid State & Materials Science, 17, 19–30. [Google Scholar]
  • 51. Bennett, S. , Szczypiński, F. T. , Turcani, L. , Briggs, M. E. , Greenaway, R. L. , & Jelfs, K. E. (2021). Materials precursor score: Modeling chemists’ intuition for the synthetic accessibility of porous organic cage precursors. Journal of Chemical Information and Modeling, 61, 4342–4356. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Greenaway, R. L. , Santolini, V. , Pulido, A. , Little, M. A. , Alston, B. M. , Briggs, M. E. , Day, G. M. , Cooper, A. I. , & Jelfs, K. E. (2019). From concept to crystals via prediction: Multi‐component organic cage pots by social self‐sorting. Angewandte Chemie, 131, 16421–16427. [DOI] [PubMed] [Google Scholar]
  • 53. Santolini, V. , Miklitz, M. , Berardo, E. , & Jelfs, K. E. (2017). Topological landscapes of porous organic cages. Nanoscale, 9, 5280–5298. [DOI] [PubMed] [Google Scholar]
  • 54. Greenaway, R. L. , Santolini, V. , Bennison, M. J. , Alston, B. M. , Pugh, C. J. , Little, M. A. , Miklitz, M. , Eden‐Rump, E. G. B. , Clowes, R. , Shakil, A. , Cuthbertson, H. J. , Armstrong, H. , Briggs, M. E. , Jelfs, K. E. , & Cooper, A. I. (2018). High‐throughput discovery of organic cages and catenanes using computational screening fused with robotic synthesis. Nature Communications, 9, 2849. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Jelfs, K. E. , Wu, X. , Schmidtmann, M. , Jones, J. T. A. , Warren, J. E. , Adams, D. J. , & Cooper, A. I. (2011). Large self‐assembled chiral organic cages: Synthesis, structure, and shape persistence. Angewandte Chemie, 50, 10653–10656. [DOI] [PubMed] [Google Scholar]
  • 56. Harder, E. , Damm, W. , Maple, J. , Wu, C. , Reboul, M. , Xiang, J. Y. , Wang, L. , Lupyan, D. , Dahlgren, M. K. , Knight, J. L. , Kaus, J. W. , Cerutti, D. S. , Krilov, G. , Jorgensen, W. L. , Abel, R. , & Friesner, R. A. (2016). OPLS3: A force field providing broad coverage of drug‐like small molecules and proteins. Journal of Chemical Theory and Computation, 12, 281–296. [DOI] [PubMed] [Google Scholar]
  • 57. Turcani, L. , Greenaway, R. L. , & Jelfs, K. E. (2019). Machine learning for organic cage property prediction. Chemistry of Materials, 31, 714–727. [Google Scholar]
  • 58. Yuan, Q. , Szczypiński, F. T. , & Jelfs, K. E. (2022). Explainable graph neural networks for organic cages. Digital Discovery, 1, 127–138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Tarzia, A. , & Jelfs, K. E. (2022). Unlocking the computational design of metal–organic cages. Chemical Communications, 58, 3717–3730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Lewis, J. E. M. , Tarzia, A. , White, A. J. P. , & Jelfs, K. E. (2020). Conformational control of Pd2L4 assemblies with unsymmetrical ligands. Chemical Science, 11, 677–683. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Zou, Y.‐Q. , Zhang, D. , Ronson, T. K. , Tarzia, A. , Lu, Z. , Jelfs, K. E. , & Nitschke, J. R. (2021). Sterics and hydrogen bonding control stereochemistry and self‐sorting in BINOL‐based assemblies. Journal of the American Chemical Society, 143, 9009–9015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Tarzia, A. , Lewis, J. E. M. , & Jelfs, K. E. (2021). High‐throughput computational evaluation of low symmetry Pd2L4 cages to aid in system design. Angewandte Chemie, 60, 20879–20887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Yang, Y. , Rice, B. , Shi, X. , Brandt, J. R. , Correa Da Costa, R. , Hedley, G. J. , Smilgies, D.‐M. , Frost, J. M. , Samuel, I. D. W. , Otero‐De‐La‐Roza, A. , Johnson, E. R. , Jelfs, K. E. , Nelson, J. , Campbell, A. J. , & Fuchter, M. J. (2017). Emergent properties of an organic semiconductor driven by its molecular chirality. ACS Nano, 11, 8329–8338. [DOI] [PubMed] [Google Scholar]
  • 64. Schmidt, J. A. , Weatherby, J. A. , Sugden, I. J. , Santana‐Bonilla, A. , Salerno, F. , Fuchter, M. J. , Johnson, E. R. , Nelson, J. , & Jelfs, K. E. (2021). Computational screening of chiral organic semiconductors: Exploring side‐group functionalization and assembly to optimize charge transport. Crystal Growth & Design, 21, 5036–5049. [Google Scholar]
  • 65. Yuan, Q. , Santana‐Bonilla, A. , Zwijnenburg, M. A. , & Jelfs, K. E. (2020). Molecular generation targeting desired electronic properties via deep generative models. Nanoscale, 12, 6744–6758. [DOI] [PubMed] [Google Scholar]
  • 66. Wilbraham, L. , Sprick, R. S. , Jelfs, K. E. , & Zwijnenburg, M. A. (2019). Mapping binary copolymer property space with neural networks. Chemical Science, 10, 4973–4984. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Heath‐Apostolopoulos, I. , Vargas‐Ortiz, D. , Wilbraham, L. , Jelfs, K. E. , & Zwijnenburg, M. A. (2021). Using high‐throughput virtual screening to explore the optoelectronic property space of organic dyes; Finding diketopyrrolopyrrole dyes for dye‐sensitized water splitting and solar cells. Sustainable Energy & Fuels, 5, 704–719. [Google Scholar]
  • 68. Jimenez‐Solomon, M. F. , Song, Q. , Jelfs, K. E. , Munoz‐Ibanez, M. , & Livingston, A. G. (2016). Polymer nanofilms with enhanced microporosity by interfacial polymerization. Nature Materials, 15, 760–767. [DOI] [PubMed] [Google Scholar]
  • 69. Thompson, K. A. , Mathias, R. , Kim, D. , Kim, J. , Rangnekar, N. , Johnson, J. R. , Hoy, S. J. , Bechis, I. , Tarzia, A. , Jelfs, K. E. , McCool, B. A. , Livingston, A. G. , Lively, R. P. , & Finn, M. G. (2020). N‐Aryl–linked spirocyclic polymers for membrane separations of complex hydrocarbon mixtures. Science, 369, 310–315. [DOI] [PubMed] [Google Scholar]
  • 70. Ye, C. , Tan, R. , Wang, A. , Chen, J. , Comesaña Gándara, B. , Breakwell, C. , Alvarez‐Fernandez, A. , Fan, Z. , Weng, J. , Bezzu, C. G. , Guldin, S. , Brandon, N. P. , Kucernak, A. R. , Jelfs, K. E. , McKeown, N. B. , & Song, Q. (2022). Long‐life aqueous organic redox flow batteries enabled by amidoxime‐functionalized ion‐selective polymer membranes. Angewandte Chemie, 61(38), e202207580. https://onlinelibrary.wiley.com/doi/10.1002/anie.202207580 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Tan, R. , Wang, A. , Malpass‐Evans, R. , Williams, R. , Zhao, E. W. , Liu, T. , Ye, C. , Zhou, X. , Darwich, B. P. , Fan, Z. , Turcani, L. , Jackson, E. , Chen, L. , Chong, S. Y. , Li, T. , Jelfs, K. E. , Cooper, A. I. , Brandon, N. P. , Grey, C. P. , … Song, Q. (2020). Hydrophilic microporous membranes for selective ion separation and flow‐battery energy storage. Nature Materials, 19, 195–202. [DOI] [PubMed] [Google Scholar]
  • 72. Ye, C. , Wang, A. , Breakwell, C. , Tan, R. , Grazia Bezzu, C. , Hunter‐Sellars, E. , Williams, D. R. , Brandon, N. P. , Klusener, P. A. A. , Kucernak, A. R. , Jelfs, K. E. , McKeown, N. B. , & Song, Q. (2022). Development of efficient aqueous organic redox flow batteries using ion‐sieving sulfonated polymer membranes. Nature Communications, 13, 3184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Sapnik, A. F. , Bechis, I. , Collins, S. M. , Johnstone, D. N. , Divitini, G. , Smith, A. J. , Chater, P. A. , Addicoat, M. A. , Johnson, T. , Keen, D. A. , Jelfs, K. E. , & Bennett, T. D. (2021). Mixed hierarchical local structure in a disordered metal–organic framework. Nature Communications, 12, 2062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Sapnik, A. F. , Bechis, I. , Bumstead, A. M. , Johnson, T. , Chater, P. A. , Keen, D. A. , Jelfs, K. E. , & Bennett, T. D. (2022). Multivariate analysis of disorder in metal–organic frameworks. Nature Communications, 13, 2173. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Bechis, I. , Sapnik, A. , Tarzia, A. , Wolpert, E. , Addicoat, M. , Keen, D. , Bennett, T. , & Jelfs, K. (2022). Modeling the effect of defects and disorder in amorphous metal−organic frameworks. 10.26434/chemrxiv-2022-fs1kk [DOI] [PMC free article] [PubMed]
  • 76. Yuan, Q. , Longo, M. , Thornton, A. W. , McKeown, N. B. , Comesaña‐Gándara, B. , Jansen, J. C. , & Jelfs, K. E. (2021). Imputation of missing gas permeability data for polymer membranes using machine learning. Journal of Membrane Science, 627, 119207. [Google Scholar]

Articles from Annals of the New York Academy of Sciences are provided here courtesy of Wiley
