Cell-free DNA (cfDNA) has emerged as a groundbreaking tool in genetic testing, offering a non-invasive window into the human genome. This circulating DNA, released by cells into the bloodstream, carries valuable information about gene expression and DNA methylation. As research progresses, scientists are uncovering the potential of cfDNA fragmentomics, a field that examines the size, distribution, and end motifs of DNA fragments, to revolutionize disease diagnosis and screening.
The study of cfDNA fragmentomics is opening new avenues in cancer detection, prenatal testing, and tissue-of-origin identification. By analysing the unique fragmentation patterns associated with different cell types and disease states, researchers are developing more sensitive and specific diagnostic tools. This approach combines insights from epigenetics, nucleosome positioning, and DNA fragmentation to provide a comprehensive view of an individual’s health status. The following sections will explore the science behind cfDNA fragmentation, its signatures in various diseases, and the analytical methods used in this cutting-edge field.
The science behind cfDNA fragmentation
cfDNA shows promise as a powerful tool for non-invasive diagnostics and disease monitoring. To understand its potential, it’s crucial to explore the underlying mechanisms of cfDNA release and fragmentation.
Mechanisms of cfDNA release
cfDNA molecules originate primarily from cell death processes such as apoptosis and necrosis, with active release from living cells as a further source. These DNA fragments carry genetic and epigenetic information from their tissues of origin, making them valuable biomarkers for various conditions.
Factors affecting fragment size and patterns
The fragmentation of cfDNA is not a random process but is influenced by several factors:
- Nucleosome structure: cfDNA fragments typically have a characteristic size of approximately 167 base pairs, which corresponds to the ~147 bp of DNA wrapped around a single nucleosome core plus ~20 bp of linker DNA. This size distribution reflects the protection offered by nucleosomes during the fragmentation process.
- Nuclease activity: Various nucleases play crucial roles in cfDNA fragmentation. Key enzymes include DFFB (DNA fragmentation factor subunit beta), DNASE1 (deoxyribonuclease 1), and DNASE1L3 (deoxyribonuclease 1 like 3). Each nuclease has distinct preferences and activities:
  - DFFB cleaves double-strand DNA into high molecular weight fragments and then into oligo-nucleosomal fragments.
  - DNASE1 preferentially cleaves nucleosome-free naked DNA.
  - DNASE1L3’s activity correlates with DNA methylation.
- End motifs: The cleavage of cfDNA by nucleases is not random but tends to occur at specific bases, resulting in characteristic sequences called end motifs. These end motifs can provide insights into the nucleases involved in fragmentation and the tissue of origin.
- Chromatin structure: The accessibility of DNA to nucleases is influenced by chromatin compaction, leading to non-random fragmentation patterns. Open chromatin regions are more susceptible to cleavage, while tightly packed heterochromatin is more protected.
Nucleosome positioning and DNA methylation
Nucleosome positioning and DNA methylation play significant roles in shaping cfDNA fragmentation patterns:
- Nucleosome footprints: cfDNA fragmentation patterns can reveal the nucleosome landscape of the cells from which they originated. This is because DNA wrapped around histone octamers is protected from digestion during apoptosis.
- DNA methylation: Recent studies have uncovered a strong link between DNA methylation and cfDNA fragmentation. Low levels of DNA methylation (hypomethylation) can increase nucleosome accessibility, altering where nucleases cut during DNA fragmentation. This leads to variations in both the size distribution and the cut-site sequences of cfDNA.
- Gene expression: The degree of cfDNA fragmentation at promoters correlates with the expression levels of the corresponding genes. Actively expressed genes are typically located in open chromatin regions flanked by well-positioned nucleosomes, influencing the fragmentation patterns observed in cfDNA.
- Epigenetic information: The fragmentation patterns of cfDNA can be used to infer epigenetic features, such as nucleosome footprints and gene expression. This information can be valuable for identifying the tissue origin of cfDNA, particularly in pathological states like cancer or pregnancy.
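The link between promoter coverage and gene expression can be illustrated with a small calculation. The sketch below is a minimal Python illustration, not a published pipeline: the function names, the core window around the transcription start site (TSS), and the flank width are all assumptions chosen for readability. It compares mean fragment coverage in a window around the TSS with mean coverage in the flanks; a ratio well below 1 would be consistent with a nucleosome-depleted, actively expressed promoter.

```python
from typing import Iterable, List, Tuple

# A fragment is (start, end) on a single chromosome, end-inclusive.
Fragment = Tuple[int, int]


def mean_coverage(fragments: List[Fragment], positions: Iterable[int]) -> float:
    """Average number of fragments overlapping each position."""
    positions = list(positions)
    if not positions:
        return 0.0
    covered = sum(
        1
        for pos in positions
        for start, end in fragments
        if start <= pos <= end
    )
    return covered / len(positions)


def relative_tss_coverage(
    fragments: List[Fragment],
    tss: int,
    core: Tuple[int, int] = (-150, 50),  # assumed core window relative to the TSS
    flank: int = 1000,                   # assumed flank half-width in bp
) -> float:
    """Mean coverage in the core window divided by mean coverage in the flanks.

    Returns NaN when the flanks have no coverage at all.
    """
    core_window = range(tss + core[0], tss + core[1] + 1)
    flank_positions = [
        p for p in range(tss - flank, tss + flank + 1) if p not in core_window
    ]
    flank_mean = mean_coverage(fragments, flank_positions)
    if flank_mean == 0:
        return float("nan")
    return mean_coverage(fragments, core_window) / flank_mean
```

In practice the fragment coordinates would come from aligned sequencing reads, and the ratio would be computed per gene and fed into a downstream model rather than interpreted in isolation.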
Understanding these complex interactions between DNA methylation, nucleosome positioning, and nuclease activity provides valuable insights into cfDNA fragmentation mechanisms. This knowledge forms the foundation for developing more sensitive and specific diagnostic tools based on cfDNA fragmentomics, opening new avenues for non-invasive disease detection and monitoring.
Fragmentomic signatures in disease
The study of cfDNA fragmentation patterns, known as fragmentomics, represents a potentially powerful method for non-invasive disease diagnosis and monitoring. These fragmentation signatures provide valuable insights into various pathological conditions, including cancer, foetal abnormalities, and tissue-specific disorders.
Cancer-specific fragmentation patterns
Cancer-associated cfDNA exhibits distinct fragmentation patterns that can be leveraged for early detection and monitoring. These patterns arise from alterations in chromatin structure, nuclease activity, and DNA methylation in cancer cells. Researchers have identified several key features of cancer-specific cfDNA fragmentation:
- Fragment size distribution: Cancer-derived cfDNA tends to have a shorter fragment size compared to non-cancerous cfDNA. This characteristic has been exploited to improve assay sensitivity in cancer detection by focusing on shorter cfDNA fragments.
- End motifs and preferred ends: The profile of cfDNA cleavage site motifs represents another class of biomarker for liquid biopsy in oncology. Studies have also identified tumour-associated preferred end coordinates in the cfDNA of patients with hepatocellular carcinoma, with a reported sensitivity of >80% at >90% specificity.
- Fragmentation patterns within open chromatin regions: cfDNA within open chromatin regions is more susceptible to fragmentation. Researchers have trained machine learning models on the fragmentation patterns within these regions and evaluated their performance in cancer diagnosis.
- Integration of multiple fragmentation patterns: To address the instability of individual fragmentation patterns, an ensemble classifier integrating all fragmentation patterns has been developed. This approach has demonstrated notable improvements in cancer detection and tissue-of-origin determination.
Tissue-of-origin identification
Fragmentomic analysis has proven capable of identifying the tissue of origin for cfDNA fragments, which is particularly useful when screening for cancer in asymptomatic individuals with no previously known tumour. Several approaches have been developed to achieve this:
- Nucleosome positioning: The genome-wide map of in vivo nucleosome occupancy of cfDNA, based on deep sequencing, can be utilised to perform tissue-of-origin analysis. The windowed protection score (WPS) has been defined to determine nucleosome occupancy at a given genomic coordinate.
- Gene expression prediction: Plasma DNA coverage in promoter regions can be used to predict gene expression. A nucleosome promoter analysis utilising machine learning and whole-genome sequencing datasets has been developed to determine the expression status of a gene based on coverage at the transcription start site.
- Cell type contribution analysis: By analysing cfDNA coverage patterns, researchers have identified aberrant cell type contributions in plasma cfDNA from cancer patients. For example, in colorectal cancer patients, intestinal cell types accounted for 33% of the significantly up-ranked epithelial cells.
- Ultra-low-coverage analysis: Even with ultra-low-coverage sequencing data (<0.3-fold coverage), cell type-specific disease signatures can be detected. In a study of multiple myeloma patients, plasmacytoid dendritic cells, plasma cells, plasmablasts, B cells, and myeloid progenitor cells were amongst the top 20 up-ranked cell types compared to controls.
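The windowed protection score can be sketched in a few lines. The version below follows a common formulation, assumed here rather than taken from the text above: for a window centred on a genomic coordinate, count the fragments that span the whole window and subtract the fragments that have an endpoint inside it. The 120 bp default window, the boundary handling, and the function name are assumptions for illustration.

```python
from typing import List, Tuple

# A fragment is (start, end) on a single chromosome, end-inclusive.
Fragment = Tuple[int, int]


def windowed_protection_score(
    fragments: List[Fragment],
    position: int,
    window: int = 120,  # assumed window size in bp
) -> int:
    """Fragments spanning the window minus fragments ending inside it."""
    half = window // 2
    win_start = position - half
    win_end = position + half

    spanning = 0
    ending_inside = 0
    for start, end in fragments:
        if start <= win_start and end >= win_end:
            # The fragment covers the whole window, so the DNA was protected.
            # Assumption: a fragment ending exactly on a window edge counts here.
            spanning += 1
        elif win_start <= start <= win_end or win_start <= end <= win_end:
            # A cut happened inside the window, so the DNA was exposed.
            ending_inside += 1
    return spanning - ending_inside


fragments = [(100, 266), (110, 270), (180, 300)]
score_protected = windowed_protection_score(fragments, 185)  # mostly spanned
score_exposed = windowed_protection_score(fragments, 300)    # mostly cut
```

High scores suggest a position protected by a nucleosome, while low or negative scores suggest accessible DNA between nucleosomes. Scanning the score along a region gives the kind of occupancy track used for tissue-of-origin analysis.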
Foetal DNA fragments in maternal blood
Fragmentomic analysis has also found applications in non-invasive prenatal testing (NIPT), providing insights into foetal DNA characteristics and improving diagnostic accuracy:
- Fragment size distribution: Foetal cfDNA fragments are typically shorter than maternal cfDNA fragments. This size difference has been used to improve assay sensitivity in NIPT by focusing on shorter cfDNA fragments.
- Genetic-epigenetic tissue mapping: Long cfDNA fragments have enabled single-molecule methylation analysis to deduce the tissue of origin of individual plasma molecules. This approach has been used to determine maternal inheritance in the foetus without the need for sensitive molecular counting and dosage-based techniques.
- Challenges in NIPT: Accurate NIPT results can be obscured by multiple gestations, maternal germline copy number variations (CNVs), and absence of heterozygosity (AOH). In twin pregnancies, the cfDNA contributed by each individual foetus is usually lower than that of a singleton foetus affected by an aneuploidy, which reduces the sensitivity of the NIPT assay.
These fragmentomic signatures in various diseases demonstrate the potential of cfDNA analysis as a powerful diagnostic and monitoring tool. By leveraging the unique fragmentation patterns associated with different cell types and disease states, researchers are developing more sensitive and specific diagnostic approaches across multiple medical fields.
Analytical approaches in fragmentomics
The field of fragmentomics has seen significant advancements in analytical techniques, enabling researchers to extract valuable information from cfDNA fragmentation patterns. These approaches have proven instrumental in disease diagnosis, monitoring, and understanding the underlying biological processes.
Size profiling techniques
Size profiling of cfDNA fragments has emerged as a powerful tool for disease detection and monitoring. The length of cfDNA fragments can reflect an individual’s physiological state, with notable differences observed between normal and pathological conditions. For instance, cancer-derived cfDNA tends to have a shorter fragment size (dominant peak at ~143 bp) compared to cfDNA from normal cells (dominant peak at ~167 bp). Similarly, foetal cfDNA fragments are typically shorter than maternal cfDNA fragments, with dominant peaks at ~146 bp and ~166 bp, respectively.
These size differences have been leveraged to develop various applications:
- Enrichment of short fragments: Researchers have improved the detection of pancreatic cancer by focusing on short mutant cfDNA fragments.
- Aneuploidy detection: The proportion of short cfDNA fragments has been used to detect foetal chromosomal aneuploidies, such as trisomy 21 and trisomy 18, with a reported sensitivity of 100%.
- Foetal fraction estimation: Qiao et al. improved foetal fraction estimation in NIPT by enriching shorter cfDNA fragments.
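A size profile of this kind reduces to simple counting once fragment lengths are available. The following Python sketch is illustrative only: the function name and the short (100 to 150 bp) and long (151 to 220 bp) size ranges are assumptions, not thresholds given in the studies above. It reports the modal fragment length and the fraction of short fragments among those falling in either range.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def size_profile(
    lengths: Iterable[int],
    short_range: Tuple[int, int] = (100, 150),  # assumed "short" range in bp
    long_range: Tuple[int, int] = (151, 220),   # assumed "long" range in bp
) -> Dict[str, float]:
    """Summarise a set of cfDNA fragment lengths."""
    counts = Counter(lengths)
    if not counts:
        raise ValueError("no fragment lengths supplied")

    # Modal length; ties are broken by taking the shortest length.
    max_count = max(counts.values())
    modal_length = min(l for l, c in counts.items() if c == max_count)

    short = sum(c for l, c in counts.items() if short_range[0] <= l <= short_range[1])
    long_ = sum(c for l, c in counts.items() if long_range[0] <= l <= long_range[1])
    total = short + long_

    return {
        "modal_length": modal_length,
        # Fragments outside both ranges are ignored in the fraction.
        "short_fraction": short / total if total else 0.0,
    }


profile = size_profile([167] * 5 + [143] * 3 + [166] * 2 + [300])
```

With real data, the lengths would typically be taken from the insert sizes of properly paired reads, and the short fraction would be compared against a reference distribution from healthy or non-pregnant controls.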
End motif analysis
The analysis of cfDNA end motifs has provided valuable insights into the fragmentation process and its potential as a biomarker. End motifs refer to the short terminal nucleotide sequences of cfDNA fragments. Recent studies have identified several key features:
- Preferred end coordinates: Certain genomic locations are preferentially cleaved during cfDNA generation, resulting in plasma DNA preferred ends. These preferred ends can be selective for foetus-derived or maternal-derived cfDNA, correlating with foetal fraction.
- Cancer-specific end motifs: Jiang et al. found that the most commonly occurring end motif (CCCA) was less likely to be present in cfDNA of patients with hepatocellular carcinoma compared to healthy subjects. Additionally, the diversity of cfDNA end motifs significantly increased in cancer patients.
- Breakpoint motifs: Analysis of upstream and downstream nucleotides of cfDNA fragment break points has been used to predict early lung cancer.
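End motif profiles can be computed by tallying the first few bases of each fragment. The sketch below is a minimal Python illustration under stated assumptions: fragment sequences are already oriented 5' to 3', motifs are 4-mers, and diversity is summarised as normalised Shannon entropy (one possible way to quantify motif diversity; the function names are hypothetical).

```python
import math
from collections import Counter
from typing import Dict, Iterable


def end_motif_frequencies(fragment_seqs: Iterable[str], k: int = 4) -> Dict[str, float]:
    """Relative frequency of each 5' end k-mer across fragments."""
    counts: Counter = Counter()
    for seq in fragment_seqs:
        motif = seq[:k].upper()
        # Skip fragments that are too short or contain ambiguous bases.
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def motif_diversity(freqs: Dict[str, float], k: int = 4) -> float:
    """Shannon entropy of motif frequencies, scaled to the range 0 to 1.

    0 means a single motif dominates entirely; 1 means all 4**k motifs
    are equally frequent.
    """
    entropy = -sum(p * math.log(p) for p in freqs.values() if p > 0)
    return entropy / math.log(4 ** k)


seqs = ["CCCATTGA", "CCCAGGTT", "CCTGAAAC", "ccca", "NNNNACGT", "AC"]
freqs = end_motif_frequencies(seqs)
diversity = motif_diversity(freqs)
```

A drop in the frequency of a dominant motif such as CCCA, or a rise in the diversity value, would be the kind of shift the studies above associate with cancer-derived cfDNA, although any decision threshold would need to be calibrated on real cohorts.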
Machine learning algorithms for pattern recognition
The complexity of cfDNA fragmentation patterns has necessitated the use of advanced computational approaches, particularly machine learning algorithms, for pattern recognition and analysis:
- Ensemble classifiers: To address the instability of individual fragmentation patterns, researchers have developed ensemble classifiers that integrate multiple fragmentation patterns. These approaches have shown notable improvements in cancer detection and tissue-of-origin determination.
- Fragmentation pattern evaluation: Cristiano et al. developed an approach called DELFI (DNA evaluation of fragments for early interception) to evaluate fragmentation patterns across the genome. They found that fragment profiles of individuals with cancer are more variable than those of healthy individuals.
- Nucleosome footprint analysis: Machine learning techniques have been applied to analyse nucleosome footprints in cfDNA. Snyder et al. generated maps of genome-wide nucleosome occupancy in vivo and found that short cfDNA fragments harbour the footprints of transcription factors.
- Gene expression prediction: Ulz et al. identified discrete regions in the transcription start site where nucleosome occupancy resulted in different coverage depths of expressed and silenced genes. This discovery has been applied to predict pregnancy complications by inferring expressed genes in pregnant women.
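Genome-wide fragmentation profiles of the kind these methods evaluate can be turned into numeric features before any model is trained. The Python sketch below is written in the spirit of such approaches and is not the published DELFI pipeline: the 5 Mb bin size, the short and long size ranges, and all function names are assumptions. It computes a short-to-long fragment ratio per genomic bin and then measures how closely a sample's profile tracks a reference profile.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, List, Tuple

Bin = Tuple[str, int]  # (chromosome, bin index)


def binned_short_long_ratio(
    fragments: Iterable[Tuple[str, int, int]],  # (chrom, start, length)
    bin_size: int = 5_000_000,                  # assumed bin size
    short_range: Tuple[int, int] = (100, 150),  # assumed "short" range in bp
    long_range: Tuple[int, int] = (151, 220),   # assumed "long" range in bp
) -> Dict[Bin, float]:
    """Ratio of short to long fragments in each genomic bin."""
    short: Dict[Bin, int] = defaultdict(int)
    long_: Dict[Bin, int] = defaultdict(int)
    for chrom, start, length in fragments:
        key = (chrom, start // bin_size)
        if short_range[0] <= length <= short_range[1]:
            short[key] += 1
        elif long_range[0] <= length <= long_range[1]:
            long_[key] += 1
    # Bins without long fragments are dropped to avoid division by zero.
    return {key: short.get(key, 0) / n for key, n in long_.items() if n > 0}


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(vx * vy)


def profile_correlation(sample: Dict[Bin, float], reference: Dict[Bin, float]) -> float:
    """Correlation between two profiles over the bins they share."""
    shared = sorted(set(sample) & set(reference))
    return pearson([sample[b] for b in shared], [reference[b] for b in shared])
```

A low correlation with a healthy reference profile would indicate a more variable fragmentation profile. In a fuller workflow, the per-bin ratios themselves would serve as input features to a classifier, possibly alongside size and end motif features in an ensemble.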
These analytical approaches in fragmentomics have significantly enhanced our ability to extract meaningful information from cfDNA, paving the way for more accurate and non-invasive diagnostic tools across various medical fields.
Conclusion
The field of cfDNA fragmentomics has opened up exciting possibilities in genetic testing and disease diagnosis. By examining the unique fragmentation patterns of cell-free DNA, researchers have gained valuable insights into various health conditions, including cancer and foetal abnormalities. The analysis of fragment size, end motifs, and nucleosome positioning has led to more sensitive and specific diagnostic tools, paving the way for non-invasive screening methods.
As research in this area continues to advance, we can expect to see further improvements in early disease detection and personalised medicine. The combination of fragmentomics with other genomic and epigenomic approaches has the potential to revolutionise healthcare by providing a more comprehensive view of an individual’s health status. This progress highlights the importance of ongoing research and development in the field of cfDNA analysis to unlock its full potential in improving patient outcomes.