Reactions are fundamental events in chemistry, biochemistry, and thus in life. As such, a cheminformatics toolkit cannot do without a reaction framework. This chapter outlines the reaction data model present in the CDK, starting with the core data interfaces and how they can be used.

A single reaction

A single reaction consists of the reacting chemicals (the reactants) and the products of the reaction. Optionally, a reaction can be catalyzed. This idea is captured in the IReaction interface, which directly extends the IChemObject interface. Let’s consider the following reaction:

2 H3COH → H3COCH3 + H2O   (catalyst: H3O+)

This reaction between two methanol molecules is catalyzed by acid and results in methoxymethane and water. To encode this into a CDK data model, we need to set the reaction coefficient, the reactants, products, and catalyst. The latter is called an agent in the data model. We know how to create molecules and that will not be repeated here. Given these atom containers, we create this reaction with:

Script 8.1 code/MethanolReaction.groovy

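reaction = new Reaction()
reaction.addReactant(methanol, (double)2.0)  // coefficient 2: two methanol molecules
[methoxymethane, water].each { reaction.addProduct(it) }  // product containers; names assumed
reaction.addAgent(acid); reaction.setDirection(IReaction.Direction.FORWARD)  // catalyst; FORWARD assumed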

This example shows we can set the reaction direction too. We can list the available directions with the Direction enum:

Script 8.2 code/ReactionDirections.groovy

IReaction.Direction.each {
  println it
}

which currently prints these options:


There are matching get methods to access all reactants and products:

Script 8.3 code/ReactionGetters.groovy

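// Print the molecular formula of every reactant, then of every product
println "Reactants: "
for (reactant in reaction.reactants.atomContainers()) {
  formula = MolecularFormulaManipulator.getMolecularFormula(reactant)
  println MolecularFormulaManipulator.getString(formula)
}
println "Products: "
for (product in reaction.products.atomContainers()) {
  formula = MolecularFormulaManipulator.getMolecularFormula(product)
  println MolecularFormulaManipulator.getString(formula)
}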

This script takes advantage of the MolecularFormulaManipulator class (see Section 4.4) and outputs the molecular formula of the reactants and products:
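Molecular formulas and coefficients together are enough to check that a reaction is balanced. The following is a minimal, library-independent Java sketch and not CDK API: the class BalanceCheck and its methods are hypothetical names, and the element counts for methanol (CH4O), methoxymethane (C2H6O), and water (H2O) are typed in by hand instead of being read from atom containers. The catalyst is left out because it is not consumed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the CDK: element balance of a reaction. */
class BalanceCheck {

    /** Sums coefficient * atom count per element over all species on one side. */
    static Map<String, Double> totals(List<Map<String, Integer>> formulas,
                                      List<Double> coefficients) {
        Map<String, Double> sum = new HashMap<>();
        for (int i = 0; i < formulas.size(); i++) {
            double coefficient = coefficients.get(i);
            for (Map.Entry<String, Integer> atom : formulas.get(i).entrySet()) {
                sum.merge(atom.getKey(), coefficient * atom.getValue(), Double::sum);
            }
        }
        return sum;
    }

    /** True when both sides contain the same elements in the same amounts. */
    static boolean isBalanced(Map<String, Double> left, Map<String, Double> right) {
        if (!left.keySet().equals(right.keySet())) {
            return false;
        }
        for (String element : left.keySet()) {
            // Small tolerance so that fractional coefficients compare sensibly
            if (Math.abs(left.get(element) - right.get(element)) > 1e-9) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Element counts typed in by hand (assumption of this sketch)
        Map<String, Integer> methanol = Map.of("C", 1, "H", 4, "O", 1);
        Map<String, Integer> methoxymethane = Map.of("C", 2, "H", 6, "O", 1);
        Map<String, Integer> water = Map.of("H", 2, "O", 1);

        Map<String, Double> reactants = totals(List.of(methanol), List.of(2.0));
        Map<String, Double> products =
            totals(List.of(methoxymethane, water), List.of(1.0, 1.0));

        System.out.println("Balanced: " + isBalanced(reactants, products));
    }
}
```

With a coefficient of 2 for methanol both sides hold two carbon, eight hydrogen, and two oxygen atoms; with a coefficient of 1 the check fails, which is why the coefficient has to be stored with the reactant.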


Reaction from File

There are a few file formats that can store reactions. This short section gives some quick pointers on which formats these are and how files in those formats can be read into a data model. The full IO details are presented in Chapter 12.

MDL RXN files

The first, and likely most common, format is the MDL RXN file format. This format essentially consists of a special concatenation of MDL molfiles. The MDLRXNReader reads the content of such files into an IReaction object:

Script 8.4 code/ReactionMDLRXN.groovy

MDLRXNReader reader = new MDLRXNReader(
  new File("data/anie.201203222.rxn").newReader()
)
IReaction reaction = new Reaction();
reaction = reader.read(reaction);  // fill the empty reaction from the file
println "Reactants: " + reaction.reactants.atomContainerCount
println "Products: " + reaction.products.atomContainerCount

From there on, we can easily extract the reaction details:

Reactants: 1
Products: 1
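To make the "concatenation of molfiles" idea concrete, here is a minimal Java sketch that inspects the layout of a V2000 RXN file without the CDK. It assumes the usual V2000 layout: a first line starting with $RXN, three header lines, a counts line holding the number of reactants and products in two three-character fields, and a $MOL line before each embedded molfile. The class RxnLayout and its methods are hypothetical names, V3000 files are not handled, and for real work the MDLRXNReader remains the tool to use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the CDK: inspects a V2000 RXN file. */
class RxnLayout {

    /** Returns {reactantCount, productCount} from the fifth line ("rrrppp"). */
    static int[] counts(String rxn) {
        String[] lines = rxn.split("\\R", -1);
        if (lines.length < 5 || !lines[0].startsWith("$RXN")) {
            throw new IllegalArgumentException("Not a V2000 RXN file");
        }
        String countLine = lines[4];
        if (countLine.length() < 6) {
            throw new IllegalArgumentException("Counts line too short");
        }
        int reactants = Integer.parseInt(countLine.substring(0, 3).trim());
        int products = Integer.parseInt(countLine.substring(3, 6).trim());
        return new int[] {reactants, products};
    }

    /** Splits the file into its embedded molfiles, one per "$MOL" delimiter. */
    static List<String> molfiles(String rxn) {
        List<String> blocks = new ArrayList<>();
        String[] parts = rxn.split("(?m)^\\$MOL\\R");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the RXN header
            blocks.add(parts[i]);
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Skeleton file; "A", "B", "C" stand in for real molfile content
        String rxn = "$RXN\n\n  prog\n\n  2  1\n$MOL\nA\n$MOL\nB\n$MOL\nC\n";
        int[] c = counts(rxn);
        System.out.println("Reactants: " + c[0]);
        System.out.println("Products: " + c[1]);
        System.out.println("Molfile blocks: " + molfiles(rxn).size());
    }
}
```

The counts line tells a reader how many of the following molfile blocks belong to the reactant side; the remaining blocks are products.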

CMLReact files

There is also a CML extension for reactions [1]. But because CML files can contain a lot of information, we read an IChemFile from this file, and extract the IReaction from that:

Script 8.5 code/ReactionCMLReact.groovy

CMLReader reader = new CMLReader(
  new File("data/anie.201203222.cml").newInputStream()
)
IChemFile file = new ChemFile();
file = reader.read(file);  // read the whole document, then drill down to the reaction
sequence = file.getChemSequence(0)
model = sequence.getChemModel(0)
reactions = model.getReactionSet()
reaction = reactions.getReaction(0)
println "Reactants: " + reaction.reactants.atomContainerCount
println "Products: " + reaction.products.atomContainerCount

But once down to the IReaction, we are back in business:

Reactants: 1
Products: 1
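For a rough idea of what the reader walks through, the sketch below counts the reactant and product elements in a CML string using only the XML parser that ships with Java. It is an illustration under assumptions, not how the CMLReader works internally: the class CmlReactCounts is a hypothetical name, and the embedded document is a hand-written skeleton using the CMLReact element names reactantList, reactant, productList, and product, with empty molecule elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical helper, not part of the CDK: counts elements in a CML string. */
class CmlReactCounts {

    /** Counts elements with the given local name, whatever their namespace prefix. */
    static int count(String cml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(cml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS("*", localName).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse CML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written skeleton; real files carry full <molecule> content
        String cml =
            "<cml xmlns='http://www.xml-cml.org/schema'>"
          + "<reaction>"
          + "<reactantList><reactant><molecule id='r1'/></reactant></reactantList>"
          + "<productList><product><molecule id='p1'/></product></productList>"
          + "</reaction></cml>";
        System.out.println("Reactants: " + count(cml, "reactant"));
        System.out.println("Products: " + count(cml, "product"));
    }
}
```

Matching on the local name keeps the count independent of whether the file uses a default namespace or a prefix such as cml:.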


  1. Holliday GL, Murray-Rust P, Rzepa HS. Chemical Markup, XML, and the World Wide Web. 6. CMLReact, an XML Vocabulary for Chemical Reactions. JCIM. 2006 Jan;46(1):145–57. doi:10.1021/CI0502698