Understanding a human pangenome map (GS Paper 3, Science and Technology)
Why in news?
- A new study describes a pangenome reference map built using genomes from 47 anonymous individuals (19 men and 28 women), mainly from Africa but also from the Caribbean, the Americas, East Asia, and Europe.
What is a genome?
- The genome is the blueprint of life: the complete set of genes, together with the regions between them, contained in our 23 pairs of chromosomes. Each chromosome is one contiguous strand of DNA.
- The genome consists of 23 different strings, each composed of millions of individual building blocks called nucleotides or bases. The four types of building blocks (A, T, G and C) are arranged and repeated millions of times in different combinations.
- Genome sequencing is the method used to determine the precise order of the four letters and how they are arranged in chromosomes. Sequencing individual genomes helps us understand human diversity at the genetic level and how prone we are to certain diseases.
What is a reference genome?
- When genomes are newly sequenced, they are compared to a reference map called a reference genome. This helps to understand the regions of differences between the newly sequenced genome and the reference genome.
- The making of the first reference genome in 2001 was one of this century’s scientific breakthroughs. It helped scientists discover thousands of genes linked to various diseases; better understand diseases like cancer at the genetic level; and design novel diagnostic tests.
- Although a remarkable feat, the reference genome of 2001 was only about 92% complete and contained many gaps and errors. It was also not representative of all human beings, as it was built mostly from the genome of a single individual of mixed African and European ancestry.
- Since then, the reference genome map has been refined and improved to contain complete end-to-end sequences of all 23 human chromosomes.
- Although complete and error-free, the finished reference genome map does not represent all of human diversity.
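The comparison of a newly sequenced genome against a reference can be illustrated with a toy sketch. It assumes two short, made-up sequences of equal length and reports only single-letter differences; real comparisons rely on alignment software that also handles insertions and deletions. The function name `find_mismatches` and both sequences are hypothetical.

```python
def find_mismatches(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every differing position."""
    if len(reference) != len(sample):
        # Toy simplification: insertions and deletions are not handled here.
        raise ValueError("this sketch only compares equal-length sequences")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up sequences, for illustration only.
reference = "ATGCCGTA"
sample = "ATGACGTT"

differences = find_mismatches(reference, sample)
print(differences)  # [(3, 'C', 'A'), (7, 'A', 'T')]
```

Positions are counted from zero, so the output says the sample differs from the reference at the fourth and eighth letters.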
What is a pangenome map?
- Unlike the earlier reference genome, which is a linear sequence, the pangenome is a graph.
- The graph of each chromosome resembles a bamboo stem: at the nodes, the sequences of all 47 individuals converge because they are similar, while the internodes, of varying lengths, represent genetic variations among individuals from different ancestries.
- To create complete and contiguous chromosome maps in the pangenome project, the researchers used long-read DNA sequencing technologies, which produce contiguous reads tens of thousands of nucleotides long.
- Longer reads help assemble the sequences with minimal errors and span the repetitive regions of the chromosomes, which are hard to sequence with the short-read technologies used earlier.
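The difference between a linear reference and a graph can be made concrete with a small sketch. It assumes a toy graph of four made-up DNA segments, in which two shared segments play the role of the bamboo nodes and two alternative segments play the role of an internode where individuals differ. All names (`segments`, `edges`, `paths`, `spell_path`) and sequences are hypothetical and do not reflect the data structures used by the actual project.

```python
# Segment id -> DNA letters (made-up sequences, for illustration only).
segments = {
    "s1": "ATGC",  # shared by every individual (a "node" of the bamboo stem)
    "a1": "G",     # one variant seen at the variable stretch
    "a2": "TTA",   # an alternative, longer variant at the same stretch
    "s2": "CCGT",  # shared again by every individual
}

# Which segment may follow which: the graph branches after s1 and rejoins at s2.
edges = {"s1": ["a1", "a2"], "a1": ["s2"], "a2": ["s2"], "s2": []}

# Each individual's chromosome is one walk through the graph.
paths = {
    "individual_A": ["s1", "a1", "s2"],
    "individual_B": ["s1", "a2", "s2"],
}


def spell_path(path: list[str]) -> str:
    """Concatenate the segments along a path, checking every step follows an edge."""
    for current, following in zip(path, path[1:]):
        if following not in edges[current]:
            raise ValueError(f"no edge from {current} to {following}")
    return "".join(segments[segment_id] for segment_id in path)


genomes = {name: spell_path(path) for name, path in paths.items()}

# Segments that lie on every individual's path are the shared "nodes".
shared = set.intersection(*(set(path) for path in paths.values()))

print(genomes["individual_A"])  # ATGCGCCGT
print(genomes["individual_B"])  # ATGCTTACCGT
print(sorted(shared))           # ['s1', 's2']
```

A linear reference could store only one of the two spelled-out sequences, whereas the graph keeps both variants and records where they agree.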
Why is a pangenome map important?
- Although any two humans are more than 99% similar in their DNA, there is still about a 0.4% difference between any two individuals. This may be a small percentage, but considering that the human genome consists of 3.2 billion individual nucleotides, the difference between any two individuals is a whopping 12.8 million nucleotides.
- A complete and error-free human pangenome map will help understand those differences and explain human diversity better. It will also help us understand genetic variants in some populations, which result in underlying health conditions.
- The pangenome reference map has added nearly 119 million new letters to the existing genome map and has already aided the discovery of 150 new genes linked to autism.
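The 12.8 million figure quoted above follows directly from the two numbers given in the text. The short check below uses whole-number arithmetic, expressing 0.4% as 4 per 1,000 to avoid floating-point rounding; the variable names are chosen here for illustration.

```python
GENOME_SIZE = 3_200_000_000   # nucleotides in the human genome, as stated in the text
DIFFERENCE_PER_THOUSAND = 4   # 0.4% expressed as 4 per 1,000

# Integer arithmetic keeps the result exact.
differing_nucleotides = GENOME_SIZE * DIFFERENCE_PER_THOUSAND // 1000

print(f"{differing_nucleotides:,}")  # 12,800,000
```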
What does it hold for India?
- Although the project is a leap forward, genomes from many populations are still not part of it. For example, the current version of the pangenome map still lacks broader sampling from Africa, the Indian subcontinent, indigenous groups in Asia and Oceania, and West Asia.
- Even though the current map does not contain genome sequences from Indians, it will help map Indian genomes better against the error-free and complete reference genomes known so far.
Way Forward:
- Future pangenome maps that include high-quality genomes from Indians, including from many endogamous and isolated populations within the country, will shed light on disease prevalence, help discover new genes for rare diseases, help design better diagnostic methods, and aid the discovery of novel drugs against those diseases.