This week the journal Nature reported on the latest open-access initiative in chemistry, called chemspider, at www.chemspider.com (DOI). The database aims to be the chemical counterpart of biology's PubChem and already contains over 20 million chemical compounds. Both free-access initiatives intend to make life miserable for Chemical Abstracts, run by the American Chemical Society. The chemspider venture is privately funded and draws its revenue from online advertising.
A basic search for a chemical returns an extensive list of synonyms, the chemical structure, SMILES, InChI, and a section on predicted properties. A promising tool is the semi-graphical substructure search, similar to the one implemented in the online Aldrich catalogue. The chemical elements search lets you select, for example, every compound that contains nitrogen, oxygen and sulfur but no carbon or phosphorus (74 hits).
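The logic of such an element filter is easy to sketch. The snippet below is only an illustration of the idea, not how chemspider implements it: the function names `parse_formula` and `matches_elements` are made up for this example, and the parser assumes plain formulas without brackets, charges or hydrates.

```python
import re
from collections import Counter

# One element symbol (capital letter, optional lowercase letter) plus an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Count the atoms of each element in a simple formula such as 'C6H6'.

    Assumption: no parentheses, charges or hydrate dots in the formula.
    """
    counts = Counter()
    for symbol, number in TOKEN.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def matches_elements(formula, required, excluded):
    """True if the formula contains every required element and no excluded one."""
    elements = set(parse_formula(formula))
    return required <= elements and not (excluded & elements)


required = {"N", "O", "S"}
excluded = {"C", "P"}

# Sulfamic acid (H3NO3S) passes; taurine (C2H7NO3S) contains carbon and fails.
for formula in ["H3NO3S", "C2H7NO3S", "H2SO4"]:
    print(formula, matches_elements(formula, required, excluded))
```

A real database would of course run this kind of test against an index rather than parse every formula on the fly.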
There is a downside. The application depends on aggregating information from many other databases, which introduces the risk that a single mistake in one database is perpetuated in many others. The application also relies on contributions from members of the chemical community, and active user participation is a bottleneck in any Internet venture.
The system is not yet flawless.
When searching for compounds with the chemical formula C6H6, chemspider comes up with 24 isomers of benzene: not only the classic isomers such as Dewar benzene, benzvalene, prismane and fulvene, but also many linear polyenes and polyynes. So far so good. But four of the structures are actually ions, such as the dianion of cyclohexadiene, which should not count as discrete compounds. The cyclohexane hexaanion, also filed under C6H6, is highly esoteric.
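Charged species like these could in principle be flagged from the SMILES string alone, because SMILES writes formal charges inside square brackets. The following is a rough sketch under that assumption; `net_charge` is a name invented for this example, the SMILES strings are written by hand for illustration and are not taken from chemspider, and a proper cheminformatics toolkit would be the safer choice for real data.

```python
import re

# Contents of every bracket atom, e.g. 'CH-' in '[CH-]' or 'Fe+2' in '[Fe+2]'.
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
# A run of '+' or '-' signs, optionally followed by a number ('+', '--', '+2').
CHARGE = re.compile(r"(\++|-+)(\d*)")


def net_charge(smiles):
    """Sum the formal charges written in the bracket atoms of a SMILES string."""
    total = 0
    for content in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(content)
        if match is None:
            continue  # bracket atom without a charge, e.g. [13C]
        signs, digits = match.groups()
        magnitude = int(digits) if digits else len(signs)
        total += magnitude if signs[0] == "+" else -magnitude
    return total


# Hand-written examples: benzene, and one way to draw a C6H6 dianion.
for smiles in ["c1ccccc1", "[CH-]1C=CC=C[CH-]1"]:
    print(smiles, net_charge(smiles))
```

Any hit with a nonzero net charge could then be listed separately from the neutral isomers.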
A simple search for the formula C47H51NO14 (taxol) results in 53 hits, but all of the compounds listed are in fact taxol. The datasheet gives no indication that the taxol molecule is chiral, and although the boiling point is listed as 957 °C, the compound will probably decompose well below that temperature.