Fun with OPSIN

10 November 2012 - CRD update

In the meantime we are making progress on our Chemical Reaction Database (CRD). Entering chemical reaction data into a database is time-consuming, and every tool we can get our hands on to speed things up is very welcome. In a previous episode we looked at Open Babel for easy generation of images from a SMILES string.
Another tool is OPSIN, introduced in 2011 by the Unilever Centre for Molecular Science Informatics in Cambridge (DOI: 10.1021/ci100384d). It is an open-source web service for converting IUPAC systematic names into SMILES, InChI and CML strings. Chemical journals do not publish these strings, so getting the SMILES code for every molecule involved in a reaction would otherwise require paying a visit to the PubChem sketcher and drawing the molecule first.
The OPSIN tool should make life easier. The Computer Lab surrendered the necessary code (under 20 lines!) to the CRD project and the first OPSIN results are really good. Collecting SMILES, InChI and CML from a single chemical name input takes a fraction of a second. So far it has managed 2'-aminoacetophenone, 2-aminoacetophenone, sodium hydroxide and 3-Methyl-1H-indazole without complaints. Di-µ-bromobis(tri-tert-butylphosphine)dipalladium, however, returns a 400 error. The SMILES string is routed directly to the Open Babel tool, so actually drawing molecules is a thing of the past.
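
For readers who want to try this themselves, here is a minimal Python sketch of such a lookup. It is not the code used in the CRD project. The endpoint pattern (https://opsin.ch.cam.ac.uk/opsin/<name>.json) and the JSON keys "smiles", "inchi" and "cml" are assumptions about the public OPSIN web service, and the helper names opsin_url and name_to_structure are made up for illustration. A name that the service rejects with a 400 simply yields None.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public OPSIN web service.
OPSIN_BASE = "https://opsin.ch.cam.ac.uk/opsin/"


def opsin_url(name, fmt="json"):
    """Build the lookup URL for a chemical name (hypothetical helper)."""
    # Percent-encode everything, including spaces, primes and brackets.
    return f"{OPSIN_BASE}{quote(name, safe='')}.{fmt}"


def _http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def name_to_structure(name, fetch=_http_get):
    """Return a dict with SMILES, InChI and CML, or None if the name is rejected.

    `fetch` can be swapped for a stub so the function can be tried offline.
    """
    try:
        body = fetch(opsin_url(name))
    except HTTPError as err:
        if err.code in (400, 404):  # name could not be interpreted
            return None
        raise
    data = json.loads(body)
    # The key names are assumed; missing keys come back as None.
    return {key: data.get(key) for key in ("smiles", "inchi", "cml")}
```

Calling name_to_structure("sodium hydroxide") then returns a dictionary whose "smiles" entry can be handed straight to Open Babel for drawing.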