For several years in BI(g) Pharma, I was responsible for determining how ongoing changes in US legislation on controlled substances affected the screening collection. The following came about from thoughts on the legal definition of 'positional isomer' and how to identify structures that might fall under that definition. It is based on a manuscript that was approved for publication, but didn't make it into print.
Identification of Positional Isomers subject to US controlled substance legislation
In the course of developing a computational tool to flag structures that could be covered under the US legislation on controlled substances, we were challenged by the definition provided for positional isomers.1
‘As used in Sec. 1308.11(d) of this chapter, the term "positional isomer" means any substance possessing the same molecular formula and core structure and having the same functional group(s) and/or substituent(s) as those found in the respective schedule I hallucinogen, attached at any position(s) on the core structure, but in such manner that no new chemical functionalities are created and no existing chemical functionalities are destroyed relative to the respective schedule I hallucinogen. Rearrangements of alkyl moieties within or between functional group(s) or substituent(s), or divisions or combinations of alkyl moieties, that do not create new chemical functionalities or destroy existing chemical functionalities, are allowed i.e., result in compounds which are positional isomers. For purposes of this definition, the "core structure" is the parent molecule that is the common basis for the class; for example, tryptamine, phenethylamine, or ergoline. Examples of rearrangements resulting in creation and/or destruction of chemical functionalities (and therefore resulting in compounds which are not positional isomers) include, but are not limited to: ethoxy to alpha-hydroxyethyl, hydroxy and methyl to methoxy, or the repositioning of a phenolic or alcoholic hydroxy group to create a hydroxyamine. Examples of rearrangements resulting in compounds which would be positional isomers include: tert-butyl to sec-butyl, methoxy and ethyl to isopropoxy, N,N-diethyl to N-methyl-N-propyl, or alpha-methylamino to N-methylamino.’
The definition is narrower than the IUPAC definition of isomer,2 which reads ‘One of several species (or molecular entities) that have the same atomic composition (molecular formula) but different line formulae or different stereochemical formulae and hence different physical and/or chemical properties’. It is broader than the simplest interpretation of positional isomers as species with the same functional groups attached to different positions on the same chain or scaffold. The definition is relevant to some sixty-eight substances covered in Schedule I (d), an additional five substances temporarily listed in Schedule I, four substances in Schedule III and one substance in Schedule IV as of 3/14/2017.3
The concept that a query structure should match a target in which alkyl groups have been split, possibly recombined and reattached to a core complicates any computational matching process. As an example, in Figure 1, for the controlled substance 2,5-dimethoxy-4-(n)-propylthiophenethylamine (1), whose positional isomers are also controlled,3 the propylthio group can be deconstructed to an ethylthio group, with the residual carbon reattached anywhere on the molecule as a methyl group as in 2, or to a methylthio group, with the residual two carbons reattached as an ethyl or two methyl substituents as in 3. Compounds 2 and 3 are two examples out of a possible sixty-one positional isomers of 1 that are consistent with the definition.
As a result, a straightforward virtual fragmentation of the query (1) and target structures (2 and 3) into a core scaffold and attachments, as in 4 (Figure 1), followed by matching of the fragment sets, will not flag these structures as positional isomers, even though such a process would flag ‘standard’ positional isomers. The definition of the core structure as ‘the parent molecule that is the common basis for the class’ presents an additional complication. In most cases where positional isomers are controlled, the core structure differs from the computationally accessible Murcko scaffold;4 for 1, for example, the Murcko scaffold is a benzene ring whereas the core structure is phenethylamine. This presents a further challenge for scaffold-based matching. While it is conceivable to enumerate all the positional isomers of a query and match these against a target structure, we wanted a more general solution.
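The failure of plain fragment-set matching can be seen in a few lines of Python. This is only an illustrative sketch: the attachments are hand-written SMILES-like strings rather than the output of a real fragmentation, the function name is hypothetical, and the placement of the extra methyl group in the target is an assumption, since the definition allows it anywhere on the molecule.

```python
from collections import Counter

# Attachments read outward from the benzene ring, written by hand.
# Assumption: the target carries its extra methyl directly on the ring.
query_attachments = ["OC", "OC", "SCCC", "CCN"]        # compound 1
target_attachments = ["OC", "OC", "SCC", "C", "CCN"]   # an isomer like compound 2


def same_fragment_sets(first, second):
    """Compare two attachment lists as multisets, ignoring order."""
    return Counter(first) == Counter(second)


# A 'standard' positional isomer only moves attachments around the ring,
# so its multiset of attachments is unchanged and it is flagged.
print(same_fragment_sets(["OC", "OC", "SCCC", "CCN"],
                         ["SCCC", "OC", "CCN", "OC"]))   # True

# Splitting propylthio into ethylthio plus methyl changes the multiset,
# so the legally defined positional isomer is missed.
print(same_fragment_sets(query_attachments, target_attachments))  # False
```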
In considering possible solutions to flagging such positional isomers, we arrived at the conceptual workflow outlined in Figure 2, with query structure 1 and target structure 2 shown as exemplars. In a preliminary step, identical molecular formulae flag target structure 2 as an isomer, and therefore a possible positional isomer, of query 1. Iterative, exhaustive removal of methyl groups from both query and target is then followed by fragmentation to Murcko scaffolds and attachments. Connection points are removed from the Murcko scaffold, but retained in the attachments. Identity at this stage signifies positional isomerism as defined. Computationally, this is achieved by matching the contracted representations of the scaffold and attachments. The connection points in the attachments are retained since this allows the differentiation of, for example, ethoxy and alpha-hydroxyethyl substituents as outlined in the description. In the example in Figure 2, identity is also apparent before fragmentation, at the point of methyl group removal; however, this would not hold if the two methoxy substituents in 2 were 1,2- as opposed to 1,4-disposed.
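The preliminary isomer check in this workflow amounts to comparing element counts. The following is a minimal Python sketch, not the protocol used for the work described here; the function names are hypothetical, and the parser assumes simple formulae without brackets, charges or isotope labels. C13H21NO2S is the molecular formula of query 1.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Return element counts for a simple formula such as 'C13H21NO2S'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_isomer(query_formula, target_formula):
    """True when both formulae contain exactly the same atoms."""
    return parse_formula(query_formula) == parse_formula(target_formula)


# Element order in the string does not matter, only the counts.
print(is_isomer("C13H21NO2S", "C13H21O2NS"))  # True
# One carbon and two hydrogens fewer: not an isomer, so no further checks.
print(is_isomer("C13H21NO2S", "C12H19NO2S"))  # False
```

Only targets that pass this inexpensive filter need to go through methyl removal and fragmentation.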
Computationally, after qualification as isomers based on molecular formulae, the following SMIRKS transformations are applied exhaustively to the query and target sets.
These transformations remove methyl groups attached to carbon atoms and to uncharged nitrogen atoms. Removing methyl groups from nitrogen but not from oxygen accords with the guidance that ‘hydroxy and methyl to methoxy’ does not constitute positional isomerism, whereas ‘alpha-methylamino to N-methylamino’ does. We considered that removing a methyl group from a nitrogen atom bearing four non-hydrogen substituents would create and/or destroy chemical functionality: for example, a quaternary salt would become a tertiary amine, and a tertiary amine N-oxide would become a hydroxylamine tautomer. This process is embedded in a larger Pipeline Pilot protocol that also encompasses the isomers, esters, ethers and salts of the legislation and is designed to flag possibly controlled matches across the broader scope of the legislation. The portion of the protocol that handles the positional isomers is illustrated in Figure 3.
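The effect of the methyl-removal rule on individual attachments can be illustrated without a cheminformatics toolkit. The sketch below is not the SMIRKS or the Pipeline Pilot protocol; it is a toy model under stated assumptions. Each attachment is a hand-built tree of heavy atoms rooted at a connection point "*", which is assumed to be a scaffold carbon; all bonds are assumed to be single; and a nitrogen labelled "N+" stands in for any nitrogen that must keep its methyl groups. All names are hypothetical.

```python
from collections import Counter


def strip_methyls(node, parent=None):
    """Exhaustively remove terminal carbons bonded to C, N or the connection point."""
    element, children = node
    stripped = (strip_methyls(child, element) for child in children)
    kept = [child for child in stripped if child is not None]
    if element == "C" and not kept and parent in ("C", "N", "*"):
        return None  # methyl on carbon or uncharged nitrogen: removed
    return (element, tuple(sorted(kept)))  # sorted children give a canonical form


def attachment_profile(attachments):
    """Multiset of stripped attachments, dropping bare connection points."""
    stripped = (strip_methyls(a) for a in attachments)
    return Counter(a for a in stripped if a[1])


def chain(*elements):
    """Build an unbranched attachment, e.g. chain('O', 'C') for methoxy."""
    node = None
    for element in reversed(elements):
        node = (element, (node,) if node else ())
    return ("*", (node,))


methyl = ("C", ())
ethyl = ("C", (methyl,))
propyl = ("C", (ethyl,))
ethoxy = chain("O", "C", "C")
hydroxyethyl = ("*", (("C", (("O", ()), methyl)),))   # alpha-hydroxyethyl
diethylamino = ("*", (("N", (ethyl, ethyl)),))
methylpropylamino = ("*", (("N", (methyl, propyl)),))

# N,N-diethyl and N-methyl-N-propyl both reduce to a bare amino attachment.
print(strip_methyls(diethylamino) == strip_methyls(methylpropylamino))  # True
# Ethoxy reduces to *-O-C, alpha-hydroxyethyl to *-C-O: not equivalent.
print(strip_methyls(ethoxy) == strip_methyls(hydroxyethyl))             # False
# Hydroxy plus methyl does not match methoxy, because O-methyl is retained.
print(attachment_profile([chain("O"), chain("C")])
      == attachment_profile([chain("O", "C")]))                         # False
```

Because the connection point is kept as the root of each tree, ethoxy and alpha-hydroxyethyl remain distinguishable after stripping, which is the reason given above for retaining connection points in the attachments.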