Getting to the heart of synthetic data
Words by Danny Buckland
Synthetic data can streamline clinical trials and enable ideas to be tested without draining R&D budgets, but what are the pros and cons of this new approach?
The breakthrough therapies that are promising to conquer rare diseases have one inherent challenge on their road to approval – a lack of patients to provide control arms. It’s like rounding a bend in a sleek sports car to encounter a mountain. The direction of travel is clear, the speed is not a problem, but how do you break through to the promised land?
Enter synthetic data. Rows of finely tuned data, extrapolated from real-world evidence, creating the perfect first responders for clinical trials and more. Willing, compliant and available to rinse and repeat in seemingly any research scenario.
Perks and possibilities
For companies locked into exhaustive global trawls for suitable patients, the allure of synthetic data algorithms is irresistible. Traditional clinical trial methods are becoming increasingly expensive and patient recruitment is a constant barrier, particularly in the pursuit of therapies for rare diseases.
“Synthetic data offers a way to increase confidence before large investments in clinical work are made,” says Alistair Stuart, Customer Director Health and Life Sciences, Faculty AI, a machine learning specialist with an extensive client list from Arup to Oxford Neuroscience. He believes it holds “significant advantages to the alternative”, which is using patient data from healthcare systems or other clinical trials.
As it is modelled from existing data and has no personal details, synthetic data overcomes privacy issues and gives scientists greater scope to explore and challenge hypotheses and ‘what ifs’ to drive innovation. Data specialists have presented compelling evidence of accelerated clinical trials and the identification of further applications using simulated data sets, a practice that has a strong track record in the finance and retail sectors.
But, despite its promise, the regulators are cautious. This isn’t retail and drugs are not ill-fitting garments or odd grocery product replacements that you can send back. There is also concern about any bias built into models becoming amplified and throwing results off-target. Synthetic data also has to live up to its promise to improve diversity in clinical trials.
Gen Li, Founder and President, Phesi, a clinical development analytics specialist at the leading edge of synthetic data, prefers the terminology ‘digital patient profile’ or ‘digital trial arms’ to underscore the value and potential of the process.
“Synthetic does not create the right image of what can be achieved,” he says. “Companies have had some impressive results from our technology because we are able to input so much data and interpret it in ways that were previously unimaginable.”
Phesi holds data on 60 million patients from more than 400,000 clinical trials, a resource that drives deeper insights, according to Li, a former Head of Productivity at Pfizer Worldwide Clinical Development, who founded the company in 2007.
A standout achievement, he says, came from a small biotech company that was developing a therapy for people with the blood disorder beta thalassemia and hoping to capture a share of a $600m market. With the application of 10,000 Phesi digital patients, the research timescale accelerated rapidly and revealed extra applications in myelodysplastic syndromes, opening up another sector worth $2bn.
“Not surprisingly, their stock price increased by 500 times,” says Li. “Using terms such as ‘synthetic control arms’ causes misperceptions that the data can only be used as a control arm when the benefit is much more profound, penetrating different aspects of R&D and helping us avoid mistakes and failures.”
Obstacles to adoption
Research from Gartner UK, a management consulting firm, predicts that 60% of all data used in AI will be synthetic by 2024. “There is a lot of excitement about synthetic data and a big push,” notes Stuart. However, he is quick to add that “we will have to see if these Gartner figures come to bear”.
Li sees the barriers to the adoption of synthetic data in healthcare as a lack of awareness within the pharmaceutical industry and the cautious approach of the FDA and the EMA. These are two key hurdles, but creating data free from bias is the steepest peak to climb.
“If there are underlying biases in the data then they can be amplified as you are generating huge volumes of data off of a much smaller data set,” explains Stuart. “Synthetic data can be more prone to bias than real-world data, so its use needs to be carefully planned as part of an overall evidence generation strategy.”
This is a view echoed by Ash Rishi, Founder, COUCH Health, a creative health engagement agency, who welcomes the potential of synthetic data to create cheaper, faster and more diverse trials, but points out that bias can be an issue. “There is a danger that biases could become even bigger with the use of synthetic data, and this could have negative implications for communities,” he says. He stresses that the algorithms used to generate the data must be validated and tested.
Undoubtedly, the dawn of synthetic data is here, and despite its potential growing pains, it will be crucial to healthcare innovation, promising a new world of possibilities for drugmakers and patients. As Rishi points out: “Diversity is critical to equitable healthcare.” If synthetic data can live up to its potential and be built without bias, healthcare systems, industry and patients will all be winners.
This feature appears in GOLD 27.