Publications relevant to MOPAC

Proteins

An approach to creating a more realistic working model from a protein data bank entry Supplementary material (not very useful!) (4.7Mb, in ZIP format)

Brandon CJ, Martin BP, McGee KJ, Stewart JJP, Braun-Sand SB (2015) An approach to creating a more realistic working model from a protein data bank entry. J Mol Modeling 21:1:11

Abstract

An accurate model of three-dimensional protein structure is important in a variety of fields such as structure-based drug design and mechanistic studies of enzymatic reactions. While the entries in the Protein Data Bank (http://www.pdb.org) provide valuable information about protein structures, a small fraction of the PDB structures were found to contain anomalies not reported in the PDB file. The semiempirical PM7 method in MOPAC2012 was used for identifying anomalously short hydrogen bonds, C–H⋯O/C–H⋯N interactions, non-bonding close contacts, and unrealistic covalent bond lengths in recently published Protein Data Bank files. It was also used to generate new structures with these faults removed. When the semiempirical models were compared to those of PDB_REDO (http://www.cmbi.ru.nl/pdb_redo/), the clashscores, as defined by MolProbity (http://molprobity.biochem.duke.edu/), were better in about 50% of the structures. The semiempirical models also had a lower root-mean-square-deviation value in nearly all cases than those from PDB_REDO, indicative of a better conservation of the tertiary structure. Finally, the semiempirical models were found to have lower clashscores than the initial PDB file in all but one case. Because this approach maintains as much of the original tertiary structure as possible while improving anomalous interactions, it should be useful to theoreticians, experimentalists, and crystallographers investigating the structure and function of proteins.
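The root-mean-square deviation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes both structures list the same atoms in the same order and are already superimposed; the function name and the coordinates are hypothetical and are not taken from the paper or from MOPAC.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Assumes the structures are already superimposed (no fitting is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between matching atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in angstroms for three atoms (illustration only).
xray = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
optimized = [(0.0, 0.0, 0.1), (1.5, 0.1, 0.0), (1.6, 1.5, 0.0)]
print(round(rmsd(xray, optimized), 3))
```

A lower value means the optimized model stays closer to the starting structure, which is the sense in which the abstract uses it.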

Accuracy issues involved in modeling in vivo protein structures using PM7 Files mentioned (1.4Mb, in ZIP format)

Martin BP, Brandon CJ, Stewart JJ, Braun-Sand SB (2015) Accuracy issues involved in modeling in vivo protein structures using PM7. Proteins: Structure, Function, and Bioinformatics 83 (8):1427-1435

Abstract

Using the semiempirical method PM7, an attempt has been made to quantify the error in prediction of the in vivo structure of proteins relative to X-ray structures. Three important contributory factors are the experimental limitations of X-ray structures, the difference between the crystal and solution environments, and the errors due to PM7. The geometries of 19 proteins from the Protein Data Bank that had small R values, that is, high accuracy structures, were optimized and the resulting drop in heat of formation was calculated. Analysis of the changes showed that about 10% of this decrease in heat of formation was caused by faults in PM7, the balance being attributable to the X-ray structure and the difference between the crystal and solution environments. A previously unknown fault in PM7 was revealed during tests to validate the geometries generated using PM7. Clashscores generated by the Molprobity molecular mechanics structure validation program showed that PM7 was predicting unrealistically close contacts between nonbonding atoms in regions where the local geometry is dominated by very weak noncovalent interactions. The origin of this fault was traced to an underestimation of the core-core repulsion between atoms at distances smaller than the equilibrium distance.

A comparison of X-ray and calculated structures of the enzyme MTH1 Files mentioned (10.3Mb, in ZIP format)

Ryan H, Carter M, Stenmark P, Stewart JJ, Braun-Sand SB, Journal of Molecular Modeling (2016) 22 (7):1-18

Abstract

Modern computational chemistry methods provide a powerful tool for use in refining the geometry of proteins determined by X-ray crystallography. Specifically, computational methods can be used to correctly place hydrogen atoms unresolved by this experimental method and improve bond geometry accuracy. Using the semiempirical method PM7, the structure of the nucleotide-sanitizing enzyme MTH1, complete with hydrolyzed substrate 8-oxo-dGMP, was optimized and the resulting geometry compared with the original X-ray structure of MTH1. After determining hydrogen atom placement and the identification of ionized sites, the charge distribution in the binding site was explored. Where comparison was possible, all the theoretical predictions were in good agreement with experimental observations. However, when these were combined with additional predictions for which experimental observations were not available, the result was a new and alternative description of the substrate-binding site interaction. An estimate was made of the strengths and weaknesses of the PM7 method for modeling proteins on varying scales, ranging from overall structure to individual interatomic distances. An attempt to correct a known fault in PM7, the under-estimation of steric repulsion, is also described. This work sheds light on the specificity of the enzyme MTH1 toward the substrate 8-oxo-dGTP; information that would facilitate drug development involving MTH1.

A method for predicting individual residue contributions to enzyme specificity and binding site energies, and its application to MTH1 Supplementary material

James J. P. Stewart, Journal of Molecular Modeling (2016) 22: 259.

Abstract

A new method for predicting the energy contributions to substrate binding and to specificity has been developed. Conventional global optimization methods do not permit the subtle effects responsible for these properties to be modeled with sufficient precision to allow confidence to be placed in the results, but by making simple alterations to the model, the precision of the various energies involved can be improved from about ±2 kcal·mol⁻¹ to ±0.1 kcal·mol⁻¹. This technique was applied to the oxidized nucleotide pyrophosphohydrolase enzyme MTH1. MTH1 is unusual in that the binding and reaction sites are well-separated, an advantage from a computational chemistry perspective that allows the energetics involved in docking to be modeled without the need to consider any issues relating to reaction mechanisms. In this study, two types of energy terms were investigated: the non-covalent interactions between the binding site and the substrate, and those responsible for discriminating between the oxidized nucleotide 8-oxo-dGTP and the normal dGTP. Both of these were investigated using the semiempirical method PM7 in the program MOPAC. Individual contributions from each residue to both the binding energy and the specificity of MTH1 were calculated by simulating the effect of mutations. Where comparisons were possible, all calculated results were in agreement with experimental observations. This technique provides a new insight into the binding mechanism that enzymes use for discriminating between possible substrates.

An investigation into the applicability of the semiempirical method PM7 for modeling the catalytic mechanism in the enzyme Chymotrypsin Supplementary material

James J. P. Stewart, Journal of Molecular Modeling (2017) 23: 154. doi:10.1007/s00894-017-3326-8

Abstract

The complete catalytic cycle for the serine protease α-chymotrypsin was investigated in an attempt to determine the suitability of using the semiempirical method PM7 in the program MOPAC for investigating enzyme-catalyzed reactions. All six classical intermediates were modeled using standard methods, and were characterized as stable minima on the potential energy surface. Using a modified saddle point optimization method, five transition states were located and verified both by vibrational and by intrinsic reaction coordinate analysis. Some individual features, such as the hydrogen bonds in the oxyanion hole, the nature of various electrostatic interactions, and the role of Met192 were examined. This involved designing and running computational experiments to model mutations that would allow features of interest to be isolated.

Three features within the enzyme were examined in detail: the reaction site itself, where covalent bonds were made and broken, the electrostatic effects of the buried aspartate anion, a passive but essential component of the catalytic triad, and the oxyanion hole, where hydrogen bonds help stabilize charged intermediates.

With one minor exception, all phenomena investigated agreed with previously-reported descriptions. This result, along with the fact that all the techniques used were relatively straightforward, leads to the recommendation that PM7 and similar methods, such as PM6-D3H4, are appropriate for modeling similar enzyme-catalyzed reactions.

Experimental and Computational Snapshots of C-C Bond Formation in a C-Nucleoside Synthase Supplementary material

Wenbo Li, Georgina C. Girt, Ashish Radadiya, James J. P. Stewart, Nigel G. J. Richards, and James H. Naismith, Open Biology (2022)

Abstract

Remarkably little is known about the structural and mechanistic enzymology of C-C bond formation in C-nucleoside and C-nucleotide biosynthesis. One of these enzymes, ForT, catalyzes the coupling of 4-amino-1H-pyrazoledicarboxylic acid to MgPRPP with the concomitant loss of CO₂ and inorganic pyrophosphate. The transformation catalyzed by ForT is of chemical interest because it is one of only a few examples in which C-C bond formation takes place via an electrophilic substitution of a small, aromatic heterocycle. In addition, ForT is capable of discriminating between the aminopyrazoledicarboxylic acid and an analog in which the amine is replaced by a hydroxyl group; a remarkable feat given the steric and electronic similarities of the two molecules. Here we report biophysical measurements, structural biology, and quantum chemical calculations that provide a detailed molecular picture of ForT-catalyzed C-C bond formation and the conformational changes that are coupled to catalysis. Our findings set the scene for employing ForT in the biocatalytic production of novel, anti-viral C-nucleoside and C-nucleotide analogs.

A semiempirical method optimized for modeling proteins Supplementary material

James J. P. Stewart and Anna C. Stewart, Journal of Molecular Modeling (2023) 29:284 https://doi.org/10.1007/s00894-023-05695-1

Abstract

In recent years, semiempirical methods such as PM6, PM6-D3H4, and PM7 have been increasingly used for modeling proteins, in particular enzymes. These methods were designed for more general use, and consequently were not optimized for studying proteins. Because of this, various specific errors have been found that could potentially cast doubt on the validity of these methods for modeling phenomena of biochemical interest such as enzyme catalytic mechanisms and protein-ligand interactions. To correct these and other errors, a new method specifically designed for use in organic and biochemical modeling has been developed.

Two alterations were made to the procedures used in developing the earlier PMx methods. A minor change was made to the theoretical framework, which affected only the non-quantum theory interatomic interaction function, while the major change involved changing the training set for optimizing parameters, moving the focus to systems of biochemical significance. This involved both the selection of reference data and the weighting factors, i.e., the relative importance that the various data were given. As a result of this change of focus, the accuracy in prediction of heats of formation, hydrogen bonding, and geometric quantities relating to non-covalent interactions in proteins was improved significantly.

Prediction of enzyme inhibition (IC₅₀) using a combination of protein/ligand docking and semiempirical quantum mechanics Supplementary material

Robert C. Glen, Jason C. Cole, and James J. P. Stewart, Journal of Molecular Modeling (2025) 31:209 DOI: https://doi.org/10.1007/s00894-025-06423-7

Abstract

The ability to predict the relative binding energies of ligands to a biological receptor would be of great value in drug discovery. However, accurately calculating the predicted binding energies is limited by the high accuracy required, by the presence of multiple minima on the potential energy surface, and by issues specific to the intrinsic properties of the binding site, such as details of the geometry of the ligand-protein complex. To address these issues, a systematic analysis of potential sources of error was carried out which resulted in a few relatively small changes being made to the MOPAC program.

A set of 77 ligands was constructed for which experimentally-determined IC₅₀ values were available. For each of the ligands, prediction of the protein-ligand interaction energy was carried out in two distinct stages. In the first stage, the protein-ligand docking program GOLD was used to generate several distinct conformations of the ligand bound to a protein. The geometries of these systems were then optimised using the MOPAC program. A comparison of the relative binding energies of the ligands with the reported IC₅₀ values showed a very poor predictive power. By partitioning the ligand set into two subsets, and eliminating six ligands that were inconsistent with the experimental results, a large increase in accuracy was obtained.
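One simple way to compare calculated binding energies with reported IC₅₀ values is a rank correlation. The sketch below is an assumption for illustration, not the analysis used in the paper. It computes a Spearman coefficient for data without tied values, and the function names and numbers are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(x)
    # Sum of squared rank differences between the two variables.
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical data: more negative binding energy should pair with lower IC50.
binding_energy = [-42.1, -38.5, -35.0, -30.2]  # kcal/mol, illustration only
ic50_nm = [12.0, 85.0, 40.0, 900.0]            # nM, illustration only
rho = spearman(binding_energy, ic50_nm)
print(round(rho, 2))
```

Because the coefficient depends only on rank order, it gives the same result whether IC₅₀ or log IC₅₀ is used.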
