The objectives are to understand how regulatory agencies can use digital technology to support their assessment and decision-making processes and, conversely, to recognise how industry can use the same type of technology to predict the outcome of an assessment and adjust drug development in order to avoid failure.
The European Medicines Agency (EMA) recommends initiatives to develop digital technology and artificial intelligence and to understand how the resulting data and analyses can be used to support regulatory decisions. In this context, Sheffield University conducted a scientific and regulatory project to correlate CHMP assessments with computer software output in the context of the orphan drug legislation. The main objective was to establish a robust web application that assessors and industry across Europe can easily use to calculate measures of structural similarity with software containing selected 2D fingerprints. From the measure of structural similarity, the software calculates the probability that a medicine for a rare disease will be considered similar to one or more orphan drugs already authorised. It does so using validated mathematical models built from similarity assessments previously collected in a similarity survey involving assessors. As a result, assessors can include this information in their assessment reports on similarity, and industry can predict the likely outcome of a CHMP assessment and adapt its strategy in terms of regulatory scientific requirements.
The project started with an analysis of the correlation between human and computed judgements of similarity for 100 pairs of molecules chosen from the DrugBank 3.0 database. The human similarity assessments for these pairs were obtained from a total of 143 experts from Europe, Asia and the US, who were asked to state whether each pair was, or was not, similar. The percentage of experts judging a pair to be similar was then compared with the Tanimoto coefficient computed using a range of different types of descriptors (1D, 2D and 3D), with the aim of identifying the descriptors that correlated most closely with the human judgements.
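For readers unfamiliar with the Tanimoto coefficient, the following is a minimal sketch of how it is commonly computed for binary fingerprints: the number of bits set in both fingerprints divided by the number of bits set in either. The fingerprints below are hypothetical sets of bit positions, not real descriptors from the project, and the handling of two empty fingerprints is a convention assumed here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is given as a collection of the bit positions that are set.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints share nothing to compare.
        return 0.0
    return len(a & b) / union


# Hypothetical fingerprints for a pair of molecules (illustrative only).
fp_x = {1, 4, 7, 9, 12}
fp_y = {1, 4, 8, 9, 15}

# 3 shared bits out of 7 distinct bits.
similarity = tanimoto(fp_x, fp_y)
print(f"Tanimoto similarity: {similarity:.3f}")
```

The coefficient ranges from 0 (no bits in common) to 1 (identical fingerprints), which makes it convenient to compare with the percentage of experts who judged a pair to be similar.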
Logistic regression models were developed for each type of descriptor, relating the computed Tanimoto similarity for a pair of molecules to the probability that the human experts would regard that pair as similar. The resulting regression models were then validated using a separate test set containing 100 pairs of molecules that had previously been evaluated by the European Medicines Agency in the context of the authorisation of medicines for rare diseases. The best models were able to reproduce over 95% of the human judgements. This success rate increased to 98-99% with a simple data fusion approach in which a pair of molecules is classified as similar (or non-similar) when three or more of the individual fingerprints are in agreement.
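The two steps described above, a logistic model per fingerprint followed by a vote across fingerprints, can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: the fingerprint names, the intercept and slope values, the use of five fingerprints, the 0.5 probability threshold and the treatment of ties are all hypothetical. Only the rule that three or more fingerprints must agree comes from the text.

```python
import math


def similarity_probability(tanimoto, intercept, slope):
    """Logistic model: probability that experts judge a pair to be similar."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * tanimoto)))


def fuse_votes(probabilities, threshold=0.5, min_agree=3):
    """Classify a pair when at least `min_agree` fingerprints agree.

    Returns "similar", "non-similar", or None when no side reaches
    `min_agree` votes or the vote is tied (assumed behaviour).
    """
    votes_similar = sum(p >= threshold for p in probabilities)
    votes_non_similar = len(probabilities) - votes_similar
    if votes_similar >= min_agree and votes_similar > votes_non_similar:
        return "similar"
    if votes_non_similar >= min_agree and votes_non_similar > votes_similar:
        return "non-similar"
    return None


# Hypothetical (intercept, slope) pairs, one per fingerprint type.
MODELS = {
    "fp_a": (-5.0, 10.0),
    "fp_b": (-4.0, 8.0),
    "fp_c": (-6.0, 12.0),
    "fp_d": (-5.0, 10.0),
    "fp_e": (-3.0, 6.0),
}

# Hypothetical Tanimoto scores for one pair of molecules, one per fingerprint.
scores = {"fp_a": 0.62, "fp_b": 0.55, "fp_c": 0.40, "fp_d": 0.71, "fp_e": 0.30}

probabilities = [
    similarity_probability(scores[name], *MODELS[name]) for name in MODELS
]
decision = fuse_votes(probabilities)
print(decision)
```

With five fingerprints, requiring three in agreement amounts to a simple majority vote, so every pair receives a classification. With a different number of fingerprints some pairs could remain undecided, which this sketch signals by returning `None`.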