DNA in medicine and research

Some enthusiasts for genomic medicine argue that in the future everyone will have their genome stored on a vast database and on their mobile phone, with medicines, drugs and lifestyle advice linked to their genetic make-up. This idea raises major privacy concerns because every individual and their relatives could be tracked and identified using their DNA. It also raises important questions about whether such vast databases can be justified in terms of their claimed benefits to health. 

An increasing amount of research is being dedicated to genomics by public, private and public-private enterprises. Moreover, DNA is increasingly being collected by government authorities, including law enforcement and immigration authorities, who may also have access to commercial genetic databases under certain circumstances. As genetic databases grow, there is also a growing need to ensure that regulation keeps pace with database expansion and sharing in order to respect the human right to privacy.

How is genetics relevant to health?

Health and illness are influenced by a variety of factors, including environmental, social and genetic factors. There are a few disorders that have relatively simple genetic causes; some of the most common include Down’s Syndrome, Cystic Fibrosis, Huntington’s and Sickle Cell disease. Some other illnesses, such as neurodegenerative diseases, include a minority of cases that have largely genetic causes. For example, an estimated 1-5% of Alzheimer’s disease is early-onset and this includes some largely inherited forms of the condition. The vast majority of cases, however, are considered to be complex diseases, resulting from a mixture of environmental and genetic factors. There are genetic variants, for example, that can increase the risk of developing Alzheimer’s, but these variants are also present in people who don’t go on to get the condition, indicating that factors other than genes are at play. Similarly, while most cancers do not have simple genetic causes, some, like breast cancer, have rarer, largely inherited forms. About 5% of cases of breast cancer are associated with mutations in a gene (BRCA1) that are estimated to increase the risk of developing breast cancer by 50-85%.

Globally, the top causes of death from ill health are non-communicable diseases such as cardiovascular disease, respiratory conditions, cancers and diabetes, as well as neonatal conditions (e.g. birth trauma and pre-term complications). The world’s biggest killer is currently heart disease, which was reported by the World Health Organisation to be responsible for an estimated 16% of deaths in 2019. Communicable diseases, i.e. those caused by an infection, are responsible for a higher proportion of illness and death in lower income settings, but non-communicable diseases are on the rise globally. This rise is associated with increased urbanisation and westernised lifestyles, as unhealthy diets, smoking, reduced physical activity, pollution and other factors become more dominant. Indeed, environmental and social factors are critical determinants of many common diseases, both communicable (e.g. malaria and the coronavirus pandemic) and non-communicable (e.g. obesity and cancer). Low socio-economic status, linked to poor access to healthy diets, unpolluted green spaces, healthcare and social services, along with stress, discrimination, and low job status or security, are also important systemic drivers of ill health that serve to generate health inequalities. Limiting health interventions to biological mechanisms such as genes thus risks sidelining the social measures that are needed to improve overall health.

Predicting risk of common diseases

By comparing genetic data with health information, genome projects hope to show how genes and environment interact to cause common illnesses. There has, however, been a general failure to find simple genetic causes for the vast majority of common diseases. This is exemplified by the Human Genome Project, completed in 2003, which enthusiasts predicted would revolutionise healthcare by identifying genetic causes, and thus treatments or cures, for most if not all common diseases. The project has been disappointing in its clinical impact, with very few of the disease-causing variants discovered having translated into improved clinical care.

It has since been more widely recognised that, in most diseases, multiple genes each play only a small and complicated role, in combination with environmental and social factors. For example, there is good evidence that environmental pollutants, dietary factors and chronic stress have causative roles in common diseases like cancers, diabetes and hypertension.

How much the risk of a disease is based on genetic factors has classically been estimated by ‘twin studies’. Comparing how often a trait is shared by both twins in pairs of identical twins with how often it is shared in pairs of non-identical twins has been the basis for estimating how much a trait is genetically determined. However, to date there has been what is termed ‘missing heritability’, where the genetic variants linked to a trait fail to account for its predicted genetic component. Meanwhile, our understanding of other modes of inheriting traits is increasing, including epigenetic and cultural inheritance as mechanisms of heritability that are also potential drivers of evolution.
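As a rough illustration of how a twin-based estimate can be derived, one classical approach doubles the difference between the trait correlation in identical twin pairs and in non-identical twin pairs. This is Falconer's formula, which the text above does not name and which is assumed here. The sketch below uses invented correlation values and is not a description of any particular study's method.

```python
def falconer_heritability(r_identical: float, r_non_identical: float) -> float:
    """Estimate heritability as 2 * (r_identical - r_non_identical).

    r_identical: trait correlation within identical (monozygotic) twin pairs.
    r_non_identical: trait correlation within non-identical (dizygotic) pairs.
    Assumes both kinds of twins share their environments to the same degree.
    """
    for r in (r_identical, r_non_identical):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    return 2.0 * (r_identical - r_non_identical)


# Invented correlations, for illustration only.
estimate = falconer_heritability(r_identical=0.75, r_non_identical=0.5)
print(estimate)
```

An estimate of this kind depends heavily on the equal-environments assumption, which is one reason twin-based figures can exceed what identified genetic variants are able to explain.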

Despite the lack of translation of genetics research into widespread medical advances, genetic analyses such as sequencing have become significantly cheaper and more efficient. Sequencing technologies are roughly 20 times cheaper than they were 15 years ago. Supportive techniques such as statistical analysis, computational modelling and big data analysis have also evolved significantly, as have companies and start-ups aiming to capitalise on the growing industry.

In light of improved data acquisition and processing, and a more nuanced understanding of the role of genetics in health, the field has since pivoted towards looking for the small contributions of combinations of many disease variants in influencing health and disease. 

Searching for contributions of combinations of genetic variants, to give what is called a ‘polygenic risk score’ (PRS), recognises the understanding that the vast majority of common disease cannot be attributed to single gene variants. Polygenic risk scores, which are the sum of effects of all the genetic variants associated with a given trait in an individual, have thus become a seductive avenue of research that claims to provide potential for personalised management strategies to prevent or treat common disease, something that previous incarnations of genetics research were unable to do.  

Polygenic risk scores are calculated computationally. The models weight each variant by its estimated effect, so that variants thought to have bigger effects count for more than those thought to have smaller effects. For example, a polygenic risk score for breast cancer should give a genetic risk based on the combination of at-risk variants a person carries. Proponents of this latest phase of genetics research argue, once again, that testing for polygenic risk scores will revolutionise cancer prevention and treatment. Although such arguments are currently under-evidenced for common diseases, they are enthusiastically supported by government agendas in various countries, where large genome studies are on the rise.
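The "sum of effects" idea can be sketched in a few lines. This is a minimal illustration, assuming each variant has a published effect-size weight and that a person carries 0, 1 or 2 copies of the risk allele. The variant names and weights are invented and do not correspond to any real score.

```python
from typing import Mapping


def polygenic_risk_score(genotype: Mapping[str, int],
                         weights: Mapping[str, float]) -> float:
    """Return the weighted sum of risk-allele counts across variants."""
    score = 0.0
    for variant, weight in weights.items():
        # Simplifying assumption: a variant missing from the genotype
        # is treated as zero copies of the risk allele.
        copies = genotype.get(variant, 0)
        if copies not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant}: {copies}")
        score += weight * copies
    return score


# Hypothetical variants and weights, for illustration only.
weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": 0.125}
person = {"variant_a": 1, "variant_b": 2, "variant_c": 0}
score = polygenic_risk_score(person, weights)
print(score)
```

The resulting number only has meaning relative to the distribution of scores in a reference population, and, as with any such score, it reflects statistical association rather than a demonstrated cause.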

Assessing the link between a disease and combinations of thousands or even millions of genetic variants, however, does not establish whether the variants in a polygenic risk score actually cause disease risk, only whether they correlate with the disease of interest. Teasing out gene-environment effects, and even purely environmental effects, remains an unfinished task that limits the utility of such assessments in improving healthcare in the vast majority of cases. The clinical accuracy needed to fulfil the latest promises of using PRS to improve healthcare has yet to be attained. Indeed, many scientists argue that PRS will never have sufficient predictive value to be useful in healthcare, because genetic factors do not play a sufficiently important role in common diseases. Nonetheless, this new “data-driven biology” forms part of an emerging medical-industrial complex that promises to deliver personalised medicine.

Testing tumours

A growing area of research is the sequencing of cancer genomes, that is, the genomes of tumour cells or neighbouring tissues. Cancers are mostly caused by the accumulation of genetic mutations. As such, tumour cells carry many genetic changes, for example in gene regions involved in cell division, promoting uncontrolled replication of cells and subsequent tumour development. By sequencing the DNA from tumours, researchers aim to further understand how cancers develop, identify potential treatments, or identify patients whose tumours may be helped by, or be resistant to, a particular therapy depending on what changes have occurred in their tumour. Detected genetic mutations may also be used as biomarkers to monitor treatment progress. The mutations found in tumour cells may have been inherited from parents (germline mutations) or acquired during a person’s lifetime (somatic mutations), for example through exposure to carcinogenic compounds that cause DNA damage. One of the difficulties for this type of cancer treatment is that some genetic changes can be unique to each person or tumour, or even to different cells within a tumour. While masses of data are being accumulated from cancer genomes, interpreting such data for clinical application remains a challenge. Identifying correct treatments for such a diversity of genetic changes is also difficult, and some changes that involve large-scale alterations, such as rearrangements or translocations of chromosomal segments, would not be picked up by standard sequencing techniques. Further complexities arise from the observation that numerous mutations associated with tumours are also found in healthy cells that rarely progress to malignancy. Cancer cells also evolve resistance so they can survive treatment and the tumour can grow and spread again.

Resistance to treatments can occur via various mechanisms by which cells adapt, for example resisting cell death, inactivating drugs, reducing the absorption of drugs, altering drug metabolism, increasing DNA repair to counter the DNA damage induced by chemotherapies, and changing the activity of genes (via epigenetic mechanisms) to override the effects of cancer therapies. Such challenges mean that the use of sequencing for personalised cancer treatment currently remains more of an investigational strategy than one with clear evidence of clinical value. While the identification of particular genes known to be relevant to specific treatments can improve clinical outcomes and provide research insights, the utility of whole genome sequencing for personalised treatments remains unclear.

While this research focuses on the DNA of the tumour, any genetic changes require comparison with the non-cancerous, normal genome of the individual. As such, DNA collection for researching tumours often also involves collection of a patient’s DNA, which may be saved as an additional research resource for any future investigation. 
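The comparison between tumour and normal DNA can be pictured as a simple set operation: variants seen in both samples are treated as inherited (germline), while variants seen only in the tumour are treated as acquired (somatic). The sketch below is a deliberately simplified illustration of that idea. The variant labels are invented, and real pipelines must also handle sequencing error, sample contamination and coverage differences.

```python
from typing import Set, Tuple


def classify_tumour_variants(tumour: Set[str],
                             normal: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split tumour variants into (germline, somatic) sets."""
    germline = tumour & normal   # present in tumour and normal tissue
    somatic = tumour - normal    # present in the tumour only
    return germline, somatic


# Hypothetical variant labels, for illustration only.
tumour_variants = {"chr1:1000:A>G", "chr7:5500:C>T", "chr17:4300:G>A"}
normal_variants = {"chr1:1000:A>G", "chr2:2200:T>C"}

germline, somatic = classify_tumour_variants(tumour_variants, normal_variants)
print(sorted(germline))
print(sorted(somatic))
```

This also shows why the patient's normal genome has to be collected: without it, there is nothing to subtract from the tumour's variants.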

Distracting from public health alternatives

Badly performed genetics research has the potential to distract from more proven, effective treatments or solutions for many common diseases, particularly prevention. It may also waste finite healthcare resources on spurious links between genes and diseases or behaviour. New medicines that improve health are always a vital goal; however, the cost or affordability of medical innovations is often not considered. This raises an issue of fairness and distributive justice, given the need for affordable healthcare for all patients and all diseases.

Questions remain regarding prospective applications such as individualised screening programs. Currently, the evidence is that polygenic risk scores, the latest in genomic analyses that aim to give a risk score based on algorithms combining thousands or even millions of genetic variants, are yet to reach clinical accuracy. Additional suggestions, for example to screen for chemo-preventative treatments, or to use pharmacogenomics to assess the efficacy of drugs for individuals, are currently not applicable to the majority of medical treatments. Moreover, the utility of genomic information rests on its coupling with patient tissue data, raising infrastructural and capacity issues around tissue integrity, transport and appropriate preservation, and around coordinating these supportive processes with biopsies, diagnostics, consent and data integration.

The question of affordability for genomic innovations is further contextualised by a profit-driven drug development model that aims to maximise profit and shareholder value. High prices are thus needed to maintain this business model. Such business models have global implications, with the majority of pharmaceutical companies based in a few high income countries that set the baseline for prices across the world. Careful scrutiny is thus needed to ensure that the latest push for personalised medicine delivers not only clinical benefits to the patient, but also wider benefits to the health of the population, based on the affordability and cost-effectiveness of any future genomics programs.


One avenue of genomics research for personalised medical treatments is to sequence people’s DNA in order to predict how they may respond to a drug. The safety and efficacy of many pharmaceutical treatments are influenced by how the drug is metabolised, which in turn is influenced by wide-ranging factors including nutritional status, intestinal flora, tobacco or alcohol consumption, food constituents, age, gender, health status, interactions between drugs and even the time of day a treatment is taken, alongside some genetic factors. Drug response phenotypes are thus influenced by a complex interplay between environmental and bodily factors, from the molecular to the system level (genetic, epigenetic, protein, metabolic, cell, tissue, organ and whole body), with each level interacting across complex networks and with the wider environment. Identified genetic biomarkers are thus estimated to have a limited impact (10-15%) on drug responses. Moreover, insufficient testing of drug safety, the limitations of clinical trials in detecting all adverse reactions, medical errors related to monitoring, administration and incorrect drug selection, and the underreporting of side effects all form part of the multifactorial causes of adverse drug events.

Another factor affecting adverse drug reactions is the limited capacity of clinical trial protocols to detect rare side effects. The limited duration of drug testing performed on a limited number of trial participants means that side effects of new drugs are often missed until after commercialisation. The increase in adverse reactions is thus at least in part down to the drug development and testing process, as opposed to inherent genetic factors in patients. Indeed, adverse drug reactions (ADRs) are one of the leading causes of death in countries such as the U.S. A recent European study also found that over 40% of ADRs reported to a European database were for reactions not yet listed on the label of the drug in question.

Common foods and drinks, including grapefruit juice for example, are well known to interfere with various drugs, including those for high cholesterol, high blood pressure, organ-transplant rejection, abnormal heart rhythms and allergies, as well as corticosteroid and anti-anxiety drugs. Grapefruit juice is thought to increase drug bioavailability by blocking drug metabolism, thus increasing the risk of drug toxicity. Conversely, cruciferous vegetables (such as broccoli, cabbage and sprouts), charcoal-broiled foods and high protein intake can increase drug metabolism and thus may lower the effect of certain drugs. Evidence also shows there is a two-way relationship between the gut microbiome (microbes, such as bacteria, which live in the gut) and an individual’s response to drugs. Drugs can affect the composition of someone’s microbiome and, similarly, an individual’s gut microbiome can influence how they respond to a drug (and is now also understood to influence a variety of diseases). Gut microbes may alter a drug’s structure, bioavailability, bioactivity or toxicity, a phenomenon now referred to as pharmacomicrobiomics. In turn, which microbes are present is thought to vary a lot from person to person, influenced by lifestyle, dietary, ecological and other factors. A known example is the Parkinson’s disease drug levodopa, whose efficacy and safety are modulated by gut microbes. Similarly, the efficacy of cancer immunotherapies, which work by stimulating a person’s immune system to fight cancer, is also thought to be modulated by gut microbes. A recent study found over 170 drugs to be metabolised by bacteria, including malaria, chemotherapy and Parkinson’s disease drugs.

All the wide-ranging factors that influence drug efficacy and safety are not only important to consider when determining the importance of genetics in influencing drug treatments, but also with regard to their potential to complicate or skew data from studies assessing the genetic influence on treatments. Additional complexities are raised by intra-individual variability, which means responses can differ even within the same person and thus cannot be explained by genetic factors. 

To date, the translation of pharmacogenomic studies to the clinic has been disappointing. One example is the blood thinner warfarin. Warfarin can be highly toxic, and is sold commercially as a rat poison. As such, correct dosing is key for safe treatment. While genetic factors are well established to influence the metabolism of warfarin, there is a lack of consensus on the validity and reliability of genetic testing, with clinical benefit yet to be well established across trials and studies. While some tests have been developed, uptake in clinical practice has generally been low. Not all variation in drug responses correlates with adverse drug reactions, further limiting the relevance of focusing on genetic determinants of drug responses for improving clinical outcomes. Moreover, cost-effectiveness has yet to be well established for various genetic tests, including for warfarin, and the evidence on how they may improve outcomes remains questionable. This, along with the availability of alternative drug treatment options, has inhibited routine adoption in the clinic. Drugs such as clopidogrel have genetic tests available, but clinicians may instead prescribe alternative drugs to avoid having to go through genetic testing procedures.

Is sequencing newborn babies useful for health? 

Genomic screening of newborn babies is being actively explored with the aim of offering personalised medical care, aiding in the diagnosis of sick newborns or providing a resource to be used throughout an individual’s life.

There are different types of genetic testing that may be performed on newborn babies. Genetic testing or sequencing could be used to aid in diagnosing and identifying treatments for sick newborns, building on the screens already performed. Different countries have different policies for screening newborn babies for disease. The UK, for example, tests for 9 conditions that affect early life survival and health, all of which have treatments that can improve life chances for an affected baby. These are not genetic tests but look at blood markers, though some of the conditions tested for are genetic. If sequencing is introduced for symptomatic sick babies, symptoms can serve to guide genetic testing and the interpretation of data. Sick children who lack a diagnosis, for a rare disease for example, may thus benefit from targeted forms of genetic sequencing, such as gene panels that assess putative gene variants. In contrast, whole genome sequencing of babies without symptoms risks yielding lots of data with uncertain and unactionable results.

A recent study reported, for example, that 25% of cases among sick newborns received a molecular diagnosis, but relatively few cases resulted in a specific treatment to reverse the condition. The majority of illnesses do not result from clear known genetic causes, and many known genetic variants have unclear health implications. Sequencing provides little predictive power in determining risk for most common diseases. Sequencing entire genomes thus has the potential to yield masses of data with unknown or uncertain significance and meaning.

Another way in which newborns could have their DNA sequenced is under plans to perform mass screening of all newborns, whether or not they present with sickness. Mass genetic screening, however, raises serious complexities and uncertainties around potential benefits and usefulness, and marks a shift away from targeted screening of sick babies. The lack of predictive power for assessing disease risk, combined with the increased level of resources that would be required not just for the screening but for any follow-up care, e.g. monitoring and genetic counselling, suggests that a cautious and targeted approach to newborn testing is preferable to any mass screening program.

Another contentious issue with regard to mass screening of newborns is whether it should also include variants associated with adult-onset disease. The utility of knowing such information is highly questionable. It raises ethical concerns regarding potential discriminatory or insurance uses as a child reaches adulthood, as well as the child’s right to choose whether or not to have a genetic test when they are old enough to do so. Finding information on potential later life disease risk can bring ongoing stress and anxiety about future uncertainty, and also risks increasing family distress at a time of vital bonding with a new baby. It can serve to create ‘patients in waiting’, which may increase the medicalisation of infants and the administration of unnecessary interventions and monitoring, which in turn increases healthcare costs. It also raises issues of consent, with consent only possible by proxy from the baby’s parents or caregivers, even for information that will only be relevant to an individual once they have reached adulthood. With any long-term retention of data for research purposes, giving informed consent at one static time point to cover unknown potential future uses, including the sharing of data for secondary use by, for example, commercial companies or government authorities, raises concerns that are yet to be fully resolved. What constitutes ‘fully informed’ consent in such contexts involves complex ethical questions increasingly being raised by such big data collection projects.

Any program that offers routine mass sequencing of newborns will also effectively result in eventual universal collection of the genomes of entire populations. Thus, an additional consideration is the wider implications for how such information could be repurposed for other applications, for example in generating universal DNA databases or national ID systems that incorporate genetic information. Such databases could allow every individual and their relatives to be identified and tracked. 

Additional complexities lie in the long-term implications of expanding newborn screening. 

Any screening program that expands the number of conditions being tested for also has the potential to exacerbate health inequalities if follow-up services are not accessible to everybody.


The rise of mass genomic testing for research and medicine raises serious ethical concerns regarding the privacy of the genetic information of the individual donor, and also family members who share genetic information with the individual. Proper consent mechanisms are key for allowing participants to know, in advance of donating their DNA, if their data will be shared with third parties, whether that may be commercial companies, or government authorities such as law enforcement and immigration authorities.  

The use of genetic information is changing the context of biomedical research, expanding data collection and analysis of genetic as well as other health-related information. This expansion is challenging the current mechanisms by which researchers and medical professionals gain fully informed consent from participants. For example, DNA collected for research may be retained for future use, although what this use is cannot be known at the time of DNA collection. This raises problems with regard to current mechanisms of consent, which usually consist of a single static time point where consent is given, prior to any DNA collection. Moreover, the anonymisation of data is becoming increasingly difficult to guarantee with large datasets, especially when there is increasing potential to combine data with other sources to allow for the re-identification of participants. This is important when considering that future use may also include access to datasets by government authorities, or data being sold or shared for secondary use by commercial companies. It is questionable whether any consent mechanism is indeed fully informed if it does not also provide options for ‘informed withdrawal’ when taking into consideration potential future uses. Future use also raises questions about the feedback of data. For example, if a parent gives consent for the sequencing of their newborn baby, and this data is used when the participant is an adult, leading to the discovery of genetic information that might pertain to health, ethical questions arise about feeding results back to the participant, who had not explicitly given this form of consent. While some people may want to know if they have a genetic risk of a disease that may affect them in adult life, others may not wish to do so.

The rise of genetic research warrants new mechanisms of consent that can deal with issues such as informed withdrawal, patient perspectives on the sharing of anonymised data, any return of results to participants, and acceptability of interacting with the datasets online, if additional forms of consent are developed. For example, researchers have been developing ‘dynamic consent’ models that involve digital participation, communication and engagement of participants with their data. This is designed to allow participants to re-visit and review consent decisions when they choose. As health systems move to digital forms of healthcare provision, regulations and policy around informed consent need to be clearly established and developed with the public to ensure privacy of genetic information, alongside other medical information, can be fully protected. 

Are commercial tests being used for medicine and research?

In some countries such as France and Germany, commercial direct-to-consumer tests for assessing health issues are not allowed, while in other countries such as the US, a limited number of tests are commercially available. For example, the company 23andMe (at the time of writing) offers tests for breast cancer (selected variants), late onset Alzheimer’s disease, Parkinson’s disease, celiac disease and some rare blood disorders, as well as pharmacogenomic tests of an individual’s response to certain drugs. Commercial companies in the US had been providing completely unregulated tests for health conditions until 2013, when the Food and Drug Administration (FDA) halted the sale of these tests. This halt was based on the lack of evidence that tests were accurate, or safe with regard to the implications of receiving such a test result, with risks such as performing unnecessary interventions, e.g. surgery, in the event that a result suggests a risk of disease. Consumer tests have often made inaccurate and misleading health claims, facilitated by unregulated environments that require little proof of efficacy (see Commercial DNA tests for more information). The requirement for FDA approval has since greatly limited the number of health tests that can be marketed, though a few tests have become available following FDA approval. Nonetheless, there are important limitations to these tests: for example, they can be poor at detecting rare genetic variants, and some (such as 23andMe’s breast cancer test) do not detect all variants. Moreover, pharmacogenomic tests are of limited value for those who are not taking the drugs being tested for, or even for those who are already taking the drug in question, and thus can be a waste of money for those purchasing tests for this purpose. Further, not all genetic tests are regulated by the FDA. Some are marketed as ‘lifestyle tests’ as a means to avoid regulation.

Recent EU laws have been developed to curb the use of commercial tests that lack clinical accuracy and utility in bringing actionable improvements in healthcare. These are to come into effect in 2022, but with an extended transition period to 2025. Considering that genetic information is currently a poor predictor of most health and disease outcomes, ensuring that tests are properly regulated and assessed for efficacy is vital. Moreover, genetic testing done by medical professionals, where other health information is taken into account and genetic counselling and follow-up care can be provided, reduces the risks of spurious test results and ensures better care for individuals.

Genomic data from commercial companies may also be used for research purposes. Some companies may explicitly ask if consumers wish to opt-in for their data to be used for research, while others do not. In general, opting in to research means your data may be shared with other companies. Furthermore, direct-to-consumer companies do not generally give any rights to consumers if their genomic data leads to commercialisation or patents. 
