Reaction mapping exercise

13 October 2023 - Cheminformatics

Having a collection of organic reactions is one thing, but having a collection of reliable organic reactions is quite another. When is a reaction reliable? The reactants and the products must make sense: ideally they form a balanced equation, functional groups not involved in the reaction emerge unchanged, and topology and stereochemistry are conserved. Of course the reagents, solvents and catalysts used must be realistic, and the reaction time and temperature reasonable.

This issue is as yet unsolved as far as the CRD is concerned. Collecting reactions is not manual labour but heavily automated, and copying errors can occur and have certainly occurred. Hurdles to tackle are optical character recognition (OCR) errors (is it THF or THl?), paragraph errors, missing-compound errors (the infamous "compound 10"), heavy use of jargon ("Xphos Pd G2") and errors in assigning a role to a reaction participant.

Enter RXNMapper, the AI reaction mapper introduced in 2021 by IBM Research (Schwaller et al. DOI). The demo section at the rxnmapper.ai website brings home the central message: feed it any reaction SMILES and out pops a reaction image with all atoms numbered in the reactants and again in the products, making it clear where each atom ends up in a chemical conversion. Can RXNMapper be used to weed out improbable reactions? It is not obvious from the demo, but each reaction mapping result is accompanied by a confidence score. Surely a probable reaction will score high and an improbable one low? That would certainly help in the quest for a sane dataset.
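As a minimal sketch, this is how a single reaction SMILES can be mapped with the rxnmapper Python package; the example reaction SMILES below is purely illustrative and not taken from the CRD:

```python
# Minimal sketch: map one reaction SMILES and report the confidence score
# (pip install rxnmapper). The example reaction is an illustrative assumption.
from rxnmapper import RXNMapper

rxn_mapper = RXNMapper()

# Reaction SMILES without atom maps; reagents and solvents may be omitted
rxns = ["CC(C)S.Fc1cccnc1F>>CC(C)Sc1ncccc1F"]

results = rxn_mapper.get_attention_guided_atom_maps(rxns)

for res in results:
    print(res["mapped_rxn"])   # reaction SMILES with atom-map numbers added
    print(res["confidence"])   # confidence score between 0 and 1
```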

The reaction mapper code is open-source and available on GitHub. It can be run on a consumer macOS laptop; the total running time for 244K reactions (reagents and solvents omitted from the reaction SMILES) was about a week. The end result is a CSV file with reaction handles and a confidence score. In a visualization the confidences are binned in 0.05 intervals from zero to one. If all 244K reactions were reliable one would expect all of them in the 0.95 - 1.0 interval, but it is obvious that only 40% of the reactions fall in the top 10% range, a disappointing overall result. But are all reactions in the 0 - 0.2 interval unreliable? Not really. In fact, in a random sample it is difficult to find dodgy reactions.
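A rough sketch of how such a batch run and the 0.05-interval binning could be set up is shown below; the input and output file names, the column names and the batch size are assumptions, not the actual setup used here:

```python
# Hypothetical batch run: map reactions in chunks, write handle + confidence
# to a CSV and tally 0.05-wide confidence bins. File and column names are assumed.
import csv
from collections import Counter
from rxnmapper import RXNMapper

rxn_mapper = RXNMapper()
BATCH = 32  # assumed batch size

# Assumed input: a CSV with columns "handle" and "rxn_smiles"
with open("reactions.csv", newline="") as f:
    pairs = [(row["handle"], row["rxn_smiles"]) for row in csv.DictReader(f)]

bins = Counter()
with open("confidences.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["handle", "confidence"])
    for i in range(0, len(pairs), BATCH):
        chunk = pairs[i:i + BATCH]
        results = rxn_mapper.get_attention_guided_atom_maps([smi for _, smi in chunk])
        for (handle, _), res in zip(chunk, results):
            conf = res["confidence"]
            writer.writerow([handle, f"{conf:.4f}"])
            # a confidence of e.g. 0.97 lands in the 0.95 - 1.00 bin
            bins[min(int(conf / 0.05), 19)] += 1

for b in sorted(bins):
    print(f"{b * 0.05:.2f} - {(b + 1) * 0.05:.2f}: {bins[b]}")
```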

Bona fide reactions can be found: a nucleophilic substitution here, acylations here and a Mitsunobu reaction here (depicted below).

Weird but valid reactions are found for example here. Invalid but-you-have-to-look-twice reactions can be found for example here, valid but-with-an-irrelevant-reagent reactions can be found for example here, invalid reactant mix-ups can be found for example here, and blatantly invalid reactions can be found for example here (raspberry ketone?). A special prize is reserved for the entry here (depicted below), again a reaction that managed to evade all quality checks. It pretends to be a reaction involving sorbitol, but the patent is actually about fentanyl-laced lollipops.

In conclusion, is the confidence score a relevant metric for reaction sanity? Not yet. It is reassuring that the confidence plot looks similar to the plot included in the supporting information of the IBM article, but binning 60% of the dataset? A rethink is needed on how to construct the reaction SMILES and on the selection of the dataset. The quest continues.