Reactions are fundamental events in chemistry, biochemistry, and thus in life. As such, a cheminformatics toolkit cannot do without a reaction framework. This chapter will outline the reaction data model present in the CDK. It will first outline the core data interfaces, and how they can be used.
A single reaction consists of the reacting chemicals and the products of the reaction. Optionally, a reaction can be catalyzed. This idea is captured in the IReaction interface, which directly extends the IChemObject interface. Let's consider the following reaction:
2 H3COH --(H3O+)--> H3COCH3 + H2O
This reaction between two methanol molecules is catalyzed by acid and results in methoxymethane and water. To encode this into a CDK data model, we need to set the reaction coefficient, the reactants, products, and catalyst. The latter is called an agent in the data model. We know how to create molecules and that will not be repeated here. Given these atom containers, we create this reaction with:
Script 8.1 code/MethanolReaction.groovy
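// methanol, acid, methoxymethane, and water are existing IAtomContainers
reaction = new Reaction()
reaction.addReactant(methanol, 2.0d) // stoichiometric coefficient of 2
reaction.setDirection(IReaction.Direction.FORWARD)
reaction.addAgent(acid) // the catalyst is stored as an agent
reaction.addProduct(methoxymethane)
reaction.addProduct(water)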
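For readers who want to run the example end to end, the following is a minimal sketch of a complete script. It assumes the CDK is on the classpath and that the four molecules are built from SMILES strings with the SmilesParser; the SMILES strings and the variable names are choices made for this illustration and are not taken from the original script.

```groovy
import org.openscience.cdk.Reaction
import org.openscience.cdk.interfaces.IReaction
import org.openscience.cdk.silent.SilentChemObjectBuilder
import org.openscience.cdk.smiles.SmilesParser

// Assumption: molecules are created from SMILES for brevity
def parser = new SmilesParser(SilentChemObjectBuilder.getInstance())
def methanol = parser.parseSmiles("CO")
def acid = parser.parseSmiles("[OH3+]")
def methoxymethane = parser.parseSmiles("COC")
def water = parser.parseSmiles("O")

def reaction = new Reaction()
reaction.addReactant(methanol, 2.0d)   // two equivalents of methanol
reaction.setDirection(IReaction.Direction.FORWARD)
reaction.addAgent(acid)                // the catalyst
reaction.addProduct(methoxymethane)
reaction.addProduct(water)

// The coefficient is stored per molecule and can be read back
println "Methanol coefficient: " + reaction.getReactantCoefficient(methanol)
println "Reactants: " + reaction.reactantCount
println "Products: " + reaction.productCount
println "Agents: " + reaction.agents.atomContainerCount
```

Note that the two methanol molecules are represented by a single atom container with a coefficient of 2, so the reactant count is one, not two.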
This example shows that we can set the reaction direction too. We can list the directions that are defined by the Direction enum:
Script 8.2 code/ReactionDirections.groovy
IReaction.Direction.each {
println it
}
which prints the currently available options:
FORWARD
BACKWARD
BIDIRECTIONAL
NO_GO
RETRO_SYNTHETIC
RESONANCE
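The direction can also be read back and compared against the enum constants. The snippet below is a small sketch of that, assuming the CDK is on the classpath; the printed message is only illustrative.

```groovy
import org.openscience.cdk.Reaction
import org.openscience.cdk.interfaces.IReaction

def reaction = new Reaction()
reaction.setDirection(IReaction.Direction.BIDIRECTIONAL)

// getDirection() returns the enum constant that was set
def direction = reaction.getDirection()
if (direction == IReaction.Direction.BIDIRECTIONAL) {
    println "This reaction is an equilibrium"
}

// Like any Java enum, a constant can be looked up by its name
def parsed = IReaction.Direction.valueOf("BACKWARD")
println parsed
```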
There are matching get methods to access all reactants and products:
Script 8.3 code/ReactionGetters.groovy
println "Reactants: "
for (reactant in reaction.reactants.atomContainers()) {
  formula = MolecularFormulaManipulator
    .getMolecularFormula(reactant)
  println MolecularFormulaManipulator
    .getString(formula)
}
println "Products: "
for (product in reaction.products.atomContainers()) {
  formula = MolecularFormulaManipulator
    .getMolecularFormula(product)
  println MolecularFormulaManipulator
    .getString(formula)
}
This script takes advantage of the MolecularFormulaManipulator class (see Section 4.4) and outputs the molecular formula of the reactants and products:
Reactants:
CH4O
Products:
C2H6O
H2O
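Agents and coefficients can be accessed in the same way. The following sketch rebuilds the methanol reaction from SMILES strings (an assumption made to keep the script self-contained) and lists the agents and the reactant coefficients; formulaOf is a hypothetical helper closure introduced only for this example.

```groovy
import org.openscience.cdk.Reaction
import org.openscience.cdk.silent.SilentChemObjectBuilder
import org.openscience.cdk.smiles.SmilesParser
import org.openscience.cdk.tools.manipulator.MolecularFormulaManipulator

// Assumption: the reaction is rebuilt here from SMILES strings
def parser = new SmilesParser(SilentChemObjectBuilder.getInstance())
def reaction = new Reaction()
reaction.addReactant(parser.parseSmiles("CO"), 2.0d)
reaction.addAgent(parser.parseSmiles("[OH3+]"))
reaction.addProduct(parser.parseSmiles("COC"))
reaction.addProduct(parser.parseSmiles("O"))

// Hypothetical helper: molecular formula string of an atom container
def formulaOf = { container ->
    MolecularFormulaManipulator.getString(
        MolecularFormulaManipulator.getMolecularFormula(container))
}

println "Agents: "
for (agent in reaction.agents.atomContainers()) {
    println formulaOf(agent)
}
println "Reactant coefficients: "
for (reactant in reaction.reactants.atomContainers()) {
    println "${reaction.getReactantCoefficient(reactant)} x ${formulaOf(reactant)}"
}
```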
There are a few file formats that can store reactions. This short section gives some quick pointers on which formats these are, and how files in those formats can be read into the data model. The full IO details are presented in Chapter 12.
The first, and likely the most common format, is the MDL RXN file format. This format basically consists of a special concatenation of MDL molfiles. The MDLRXNReader will read the content of such files into an IReaction object:
Script 8.4 code/ReactionMDLRXN.groovy
MDLRXNReader reader = new MDLRXNReader(
new File("data/anie.201203222.rxn").newReader()
);
IReaction reaction = new Reaction();
reaction = reader.read(reaction);
reader.close();
println "Reactants: " + reaction.reactants.atomContainerCount
println "Products: " + reaction.products.atomContainerCount
From there on, we can easily extract the reaction details:
Reactants: 1
Products: 1
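When no RXN file is at hand, the reader can be tried out on text produced by the matching MDLRXNWriter. The following is a sketch of such a round trip through an in-memory string, assuming the CDK is on the classpath; the molecules are built from SMILES strings for brevity, and the reaction is a simplified version of the methanol example without the agent or coefficient.

```groovy
import org.openscience.cdk.Reaction
import org.openscience.cdk.interfaces.IReaction
import org.openscience.cdk.io.MDLRXNReader
import org.openscience.cdk.io.MDLRXNWriter
import org.openscience.cdk.silent.SilentChemObjectBuilder
import org.openscience.cdk.smiles.SmilesParser

// Assumption: a small reaction built from SMILES strings
def parser = new SmilesParser(SilentChemObjectBuilder.getInstance())
IReaction original = new Reaction()
original.addReactant(parser.parseSmiles("CO"))
original.addProduct(parser.parseSmiles("COC"))
original.addProduct(parser.parseSmiles("O"))

// Write the reaction as RXN text into a string buffer
def buffer = new StringWriter()
def writer = new MDLRXNWriter(buffer)
writer.write(original)
writer.close()

// Read the RXN text back into a fresh IReaction
def reader = new MDLRXNReader(new StringReader(buffer.toString()))
IReaction copy = reader.read(new Reaction())
reader.close()

println "Reactants: " + copy.reactants.atomContainerCount
println "Products: " + copy.products.atomContainerCount
```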
There is also a CML extension for reactions [1]. Because CML files can contain many kinds of information, we read an IChemFile from the file, and extract the IReaction from that:
Script 8.5 code/ReactionCMLReact.groovy
CMLReader reader = new CMLReader(
new File("data/anie.201203222.cml").newInputStream()
);
IChemFile file = new ChemFile();
file = reader.read(file);
reader.close();
sequence = file.getChemSequence(0)
model = sequence.getChemModel(0)
reactions = model.getReactionSet()
reaction = reactions.getReaction(0)
println "Reactants: " + reaction.reactants.atomContainerCount
println "Products: " + reaction.products.atomContainerCount
But once we are down to the IReaction, we are back in business:
Reactants: 1
Products: 1
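The script above takes the first sequence, model, and reaction, which assumes the file contains exactly what we expect. A more defensive variant walks the whole hierarchy and skips models without a reaction set. The sketch below illustrates this on an IChemFile that is built in memory rather than read from a CML file, so it runs without any data file; the empty reaction is only a placeholder.

```groovy
import org.openscience.cdk.ChemFile
import org.openscience.cdk.ChemModel
import org.openscience.cdk.ChemSequence
import org.openscience.cdk.Reaction
import org.openscience.cdk.ReactionSet

// Assumption: build the same hierarchy a reader would return
def reactionSet = new ReactionSet()
reactionSet.addReaction(new Reaction())   // placeholder reaction
def model = new ChemModel()
model.setReactionSet(reactionSet)
def sequence = new ChemSequence()
sequence.addChemModel(model)
def file = new ChemFile()
file.addChemSequence(sequence)

// Walk file -> sequences -> models -> reaction sets -> reactions
def found = []
for (seq in file.chemSequences()) {
    for (mod in seq.chemModels()) {
        def set = mod.getReactionSet()
        if (set == null) continue         // model holds no reactions
        for (rxn in set.reactions()) {
            found << rxn
        }
    }
}
println "Reactions found: " + found.size()
```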