Effect of Partial Charges on the Data-Driven Analytics of CO2 Adsorption Metrics
Abstract
Carbon capture from point-source emitters, such as fossil-fuel power plants, is a critical tool for reducing the atmospheric concentration of greenhouse gases. Solid sorbents for CO2 capture offer a promising route to overcome the cost-benefit hurdles of amine-based liquid absorption. Among the solid materials considered so far, Metal-Organic Frameworks (MOFs) stand out for their chemical diversity and customisable porous structure. A good carbon capture material must exhibit not only high adsorption capacity but also high selectivity for CO2 over more abundant post-combustion gas components such as N2. The sheer number of possible MOF materials makes screening candidates for a given scenario a significant challenge: simulation techniques such as Grand Canonical Monte Carlo (GCMC) are too computationally expensive to calculate the molecular adsorption properties of every possible MOF. An alternative is to derive correlations between the desired adsorption metrics and geometrical, topological and/or chemical descriptors, which are significantly faster and easier to calculate. These descriptors can then be used as features to build a data-driven model that predicts the adsorption performance of candidate materials without expensive physics simulations. The target properties required for model training, however, depend on the parameters of the GCMC method. In this technique, molecular adsorption is modelled as a stochastic process in which the atomic interaction energy plays a fundamental role. This energy can be described as the sum of van der Waals and Coulomb contributions, and MC moves are accepted or rejected according to the energy change associated with each trial configuration.
For this reason, accurately estimating the partial atomic charges is critical both to the success of these simulations and to the target adsorption properties used to find correlations and train a data-driven model. In this work, we explore the influence of the charge calculation method on the emergence of correlations and on our ability to build data-driven predictive models for adsorption. We tested several methods, ranging from very fast charge-equilibration methods to very accurate DFT-based methods, each with its own advantages and drawbacks. Our results show how an optimal balance can be struck between slow (yet accurate) and fast (but less accurate) methods.
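The role of partial charges in the energy model described above can be illustrated with a minimal sketch. It assumes a 12-6 Lennard-Jones form for the van der Waals term and the standard Metropolis criterion for a displacement move; the abstract does not specify the functional forms or force field used, so the function names, parameters and units here are illustrative assumptions. Insertion and deletion moves in GCMC use modified acceptance rules that also involve the fugacity and volume, and these are not shown.

```python
import math
import random

# Coulomb constant in kJ mol^-1 Angstrom e^-2 (charges in units of e, distances in Angstrom)
COULOMB_K = 1389.35458
# Gas constant in kJ mol^-1 K^-1
GAS_CONSTANT = 0.0083144626


def pair_energy(r, epsilon, sigma, q_i, q_j):
    """Energy (kJ/mol) of one atom pair: 12-6 Lennard-Jones plus Coulomb.

    Hypothetical helper: epsilon (kJ/mol) and sigma (Angstrom) are assumed
    Lennard-Jones parameters; q_i and q_j are partial charges in units of e.
    """
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_K * q_i * q_j / r
    return lennard_jones + coulomb


def metropolis_accept(delta_e, temperature, rng=random):
    """Accept a trial displacement with probability min(1, exp(-dE / RT))."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / (GAS_CONSTANT * temperature))
```

Because the Coulomb term scales with the product of the two charges, any change in the charges assigned to a framework atom by a different charge calculation method shifts the energy of every trial configuration, and therefore the acceptance probabilities and the resulting adsorption metrics.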