Going from one CDK release to another brings in API changes. While the project tries to keep the number of changes minimal, they are inevitable. This chapter discusses some of these API changes and shows, with code examples, how to update your code. The following sections discuss the migration between various versions.
The set of changes includes changed class names. For example, the CDK 1.2 class MDLWriter is now called MDLV2000Writer to reflect the V2000 version of the MDL formats.
The timeout in AllRingsFinder has been replaced by a threshold that reflects the percentage of PubChem for which the algorithm finishes. Use the new AllRingsFinder(Threshold) constructor instead.
This section highlights the important API changes between the CDK 1.4 and 2.0 series. Innovations of CDK 2.0 are described in [1].
Several classes have been removed in this version, for example because they were superseded by other code or were considered redundant.
The NoNotificationChemObjectBuilder and the matching implementation classes are removed. Please use the SilentChemObjectBuilder instead.
The same, of course, applies to all implementation classes. For example, NNMolecule is removed.
The IMolecule interface and all implementing classes have been removed. They were practically identical in functionality to the IAtomContainer interface, except for the implication that an IMolecule was for fully connected structures only. This separation was found to be complicated, and was therefore removed. Please use the IAtomContainer interface instead.
Generally, IMolecule, IMoleculeSet, Molecule, and MoleculeSet can be replaced with the ‘atomcontainer’ equivalents. Additionally, for IMoleculeSet you may also have to replace use of methods like getMoleculeCount() with their matching getAtomContainerCount().
Strictly speaking, the MDL formats span a set of file types, and an SD file is different from a molfile. This is reflected in the reader name change: IteratingMDLReader is now called IteratingSDFReader.
The method findMatchingAtomTypes of the CDKAtomTypeMatcher gained an ‘s’ and was previously called findMatchingAtomType. The new name is more consistent, reflecting the fact that it perceives atom types for all atoms.
Some classes and methods have the same API, but behave slightly differently than before. For example, the SmilesGenerator now requires that all atoms have implicit hydrogen counts set. This can be done with the CDKHydrogenAdder as explained in Section 15.4.
The advantage of the builders in the CDK is that code can be independent of data class implementations (and we have three of them in CDK 1.6, at this moment). Over the past years more and more code started using this approach, but that does mean that more and more class constructors take an IChemObjectBuilder. CDK 1.6 has two more constructors that now take a builder.
The DescriptorEngine constructor has changed and now takes an IChemObjectBuilder, which is needed to initialize descriptor instances.
The second constructor that now needs an IChemObjectBuilder is that of the SMARTSQueryTool. Here it is passed on to the SMARTSParser, which needs it for the data structure it uses for the matching.
The getInstance() method of the ModelBuilder3D class now also requires an IChemObjectBuilder.
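The reason these constructors take a builder can be sketched in plain Java. The following is a minimal illustration of the pattern only: Atom, AtomBuilder, SilentAtom, NotifyingAtom, and SymbolParser are hypothetical stand-ins invented for this sketch and are not CDK types. AtomBuilder plays the role that IChemObjectBuilder plays in the CDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are CDK types.
interface Atom {
    String symbol();
}

// Plays the role of IChemObjectBuilder: it is the only place that
// decides which concrete data class gets instantiated.
interface AtomBuilder {
    Atom newAtom(String symbol);
}

// Two interchangeable data class implementations.
final class SilentAtom implements Atom {
    private final String symbol;
    SilentAtom(String symbol) { this.symbol = symbol; }
    @Override public String symbol() { return symbol; }
}

final class NotifyingAtom implements Atom {
    private final String symbol;
    NotifyingAtom(String symbol) { this.symbol = symbol; }
    @Override public String symbol() { return symbol; }
}

// A tool that has to create data objects takes the builder in its
// constructor, so it never names SilentAtom or NotifyingAtom itself.
final class SymbolParser {
    private final AtomBuilder builder;

    SymbolParser(AtomBuilder builder) {
        this.builder = builder;
    }

    List<Atom> parse(String... symbols) {
        List<Atom> atoms = new ArrayList<>();
        for (String symbol : symbols) {
            atoms.add(builder.newAtom(symbol));
        }
        return atoms;
    }
}
```

With this in place, new SymbolParser(SilentAtom::new) and new SymbolParser(NotifyingAtom::new) run the same parsing code but produce different data classes, which is the independence the builders are meant to give.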
A significant change in the CDKAtomTypeMatcher behavior is that it now returns a special ‘X’ atom type when no atom type could be perceived. See Section 13.2.
Some previously static methods are no longer static, and now require an instance of the class.
The UniversalIsomorphismTester is an example of a class that now needs to be instantiated. However, the class is easy to instantiate. For example:
Script 20.1 code/Isomorphism.groovy
butane = MoleculeFactory.makeAlkane(4)
isomorphismTester = new UniversalIsomorphismTester()
println "Is isomorphic: " +
  isomorphismTester.isIsomorph(
    butane, butane
  )
A major API change happened around the IsotopeFactory. Previously, this class was used to get isotope information, which it reads from a configurable XML file. This functionality is now available from the XMLIsotopeFactory class. However, to improve the speed of getting basic isotope information as well as to reduce the size of the core modules, CDK 1.6 introduces an Isotopes class, which contains information extracted from the XML file, but is available as a pure Java class. The API for getting isotope information is mostly the same, but the instantiation is much simpler, and also no longer requires an IChemObjectBuilder:
Script 20.2 code/IsotopesDemo.groovy
isofac = Isotopes.getInstance()
uranium = 92
for (atomicNumber in 1..uranium) {
  element = isofac.getElement(atomicNumber)
}
The IFingerprinter API was changed to accommodate two types of fingerprints: the bit fingerprint, outlined by the IBitFingerprint interface, and the count fingerprint, defined in the ICountFingerprint interface. The IFingerprinter interface now defines getRawFingerprint(IAtomContainer), getCountFingerprint(IAtomContainer), and getBitFingerprint(IAtomContainer).
These methods return various kinds of fingerprints. For example, getRawFingerprint(IAtomContainer) returns a Map with strings representing the various parts of the fingerprint as well as the matching count, and it is this map that is used as input to the getCountFingerprint(IAtomContainer) method, which returns this information as an ICountFingerprint implementation. If the count for each bit is not important, the getBitFingerprint(IAtomContainer) method can be used, which returns an IBitFingerprint implementation.
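The relation between the three kinds of fingerprint can be sketched with plain Java collections. This is a conceptual illustration only, not the CDK implementation: the class FingerprintFoldingSketch and its methods are hypothetical, and folding a feature string to a position with String.hashCode() is an assumption made for the sketch. A Map of feature strings to counts stands in for the raw fingerprint.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the CDK.
final class FingerprintFoldingSketch {

    // Folds a feature string onto a position in [0, size).
    // Using String.hashCode() here is an assumption of this sketch.
    static int position(String feature, int size) {
        return Math.floorMod(feature.hashCode(), size);
    }

    // Count view: position -> summed count of all features folded there.
    static Map<Integer, Integer> toCounts(Map<String, Integer> raw, int size) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> entry : raw.entrySet()) {
            counts.merge(position(entry.getKey(), size),
                         entry.getValue(), Integer::sum);
        }
        return counts;
    }

    // Bit view: a position is set when a feature with a positive
    // count folds onto it; how often it occurred is discarded.
    static BitSet toBits(Map<String, Integer> raw, int size) {
        BitSet bits = new BitSet(size);
        for (Map.Entry<String, Integer> entry : raw.entrySet()) {
            if (entry.getValue() > 0) {
                bits.set(position(entry.getKey(), size));
            }
        }
        return bits;
    }
}
```

The count view keeps how often each feature was seen, while the bit view only records presence. That is why a count fingerprint can be derived from the raw map directly, and why a bit fingerprint is sufficient when the counts are not important.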
Because the previous IFingerprinter interface did not include the counting of how often a bit was set, implementing the new getRawFingerprint(IAtomContainer) method will likely take some effort, but the other two methods can in many cases just wrap other methods in the class, as shown in this example code:
Script 20.3 code/FingerprinterMigration.java
// excerpt from a class implementing IFingerprinter
public ICountFingerprint getCountFingerprint(
  IAtomContainer molecule
) throws CDKException {
  // wrap the raw feature counts
  return new IntArrayCountFingerprint(
    getRawFingerprint(molecule)
  );
}
public IBitFingerprint getBitFingerprint(
  IAtomContainer molecule
) throws CDKException {
  // wrap the BitSet returned by the existing method
  return new BitSetFingerprint(
    getFingerprint(molecule)
  );
}
The SMILES stack is replaced in this CDK version. This introduces a few API changes, outlined here. The new code base is much faster and more functional than what the CDK had before. Below are typical uses of the new SmilesGenerator API.
Generating unique SMILES is done slightly differently, but elegantly:
Script 20.4 code/UniqueSMILES.groovy
generator = SmilesGenerator.unique()
smiles = generator.createSMILES(mol)
println "$smiles"
Because SMILES with lower case element symbols reflecting aromaticity carries less explicit information, I do not recommend using it. Still, I know that some of you are keen on using it, for various, sometimes logical, reasons, so here goes. Previously, you would use the setUseAromaticityFlag(true) method for this, but you can now use instead:
Script 20.5 code/AromaticSMILES.groovy
generator = SmilesGenerator.generic().aromatic()
smiles = generator.createSMILES(mol)
println "$smiles"
Aromaticity is now calculated differently; see Section 18.5.