HC Editorial Staff
10/09/2025
For decades, scientists have been working to map the human genome, the complete set of genetic instructions that make us who we are. However, this map has always been incomplete. While we have become good at identifying small, single-letter changes in our DNA, a vast and complex layer of genetic variation has remained largely hidden. A study published in Nature is changing that. By applying long-read sequencing, an international team of researchers has roughly doubled the known catalogue of large-scale genetic variants, offering an unprecedented look into human diversity and the genetic underpinnings of disease.
Most of our understanding of human genetic variation has come from studying single-nucleotide variants (SNVs), the "typos" in our DNA where a single letter differs between individuals. While important, this approach is akin to trying to understand a massive library by looking only for misspelled words in a few books. A much richer source of variation comes from structural variants (SVs): large-scale changes to our DNA in which sections are duplicated, deleted, inverted, or inserted in new places.
For a long time, these SVs have been a challenge to study. Traditional short-read sequencing technologies, which read DNA in tiny fragments, struggled to piece together these large-scale changes. It was like trying to solve a jigsaw puzzle with thousands of identical, tiny pieces. This technological limitation left a massive gap in our understanding of the human genome and its connection to health and disease.
To overcome these limitations, a research consortium led by institutions including the European Molecular Biology Laboratory (EMBL) and the Research Institute of Molecular Pathology (IMP) turned to a newer technology: long-read sequencing (LRS). Unlike short-read methods, LRS can read thousands, or even tens of thousands, of DNA bases in a single pass. This yields much longer, more contiguous stretches of genetic information, making it far easier to identify complex structural changes.
The team applied this powerful technology to 1,019 diverse human genomes from 26 different populations across five continents, sourced from the well-known 1000 Genomes Project. The global and diverse nature of this sample was crucial, as previous studies often focused on less varied populations, which limited our understanding of worldwide genetic diversity. In addition to LRS, the study utilized a sophisticated combination of linear and graph-based genome analyses to identify and categorize the SVs with unprecedented detail.
The study’s findings are nothing short of extraordinary. The team successfully identified over 167,000 new structural variants, a discovery that effectively doubles the known catalogue of structural variation in the human pangenome. This "hidden treasure" of genetic information revealed just how much diversity exists between us, showing that on average, each person in the study carried 7.5 million additional DNA letters in the form of SVs.
The research also provided new insights into the biological mechanisms that create these SVs. It found that these variants arise not from a single process but from a spectrum of recombination-mediated mechanisms. The scientists identified new ways in which transposons, often called "jumping genes," can move segments of DNA to new locations, generating fresh variants and contributing to our genetic uniqueness.
This study has profound implications for the future of medicine. First and foremost, the resulting dataset is an open-access resource for the entire scientific community. This improved reference map of the human genome should benefit future clinical studies, particularly in prioritizing disease-associated variants. The researchers demonstrated that using the new dataset as a reference significantly improves the accuracy of identifying pathogenic variants, which is essential for accurate genetic diagnoses of diseases like cancer.
The study also provides a clear "roadmap" for building a truly complete human pangenome, a reference that captures a wide range of human diversity rather than a single, linear sequence. This is a crucial step toward personalized medicine. The findings also carry significant policy implications, highlighting the need for public health initiatives and large-scale genetic projects to adopt long-read sequencing so that the genetic basis of disease can be better understood across diverse populations.
Topics of interest
Health
Reference: Schloissnig S, Pani S, Korbel JO, et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature [Internet]. 2025;631(7901):22-29. Available from: https://doi.org/10.1038/s41586-025-09290-7