Male chromosome mapped

Two teams of international scientists, working independently, have successfully mapped the human male’s Y chromosome.


The first team, nearly 90 researchers from the Telomere-to-Telomere (T2T) Consortium, decoded all 62,460,029 ‘letters’ of DNA that make up the Y chromosome, adding more than 30 million new letters and identifying 41 previously unknown protein-coding genes. The second team examined Y chromosomes from 43 men from 21 different parts of the world, evaluating individual variations to determine how these have evolved over time.

Both papers were published in Nature this week. The findings close many gaps in the existing Y chromosome reference and provide insights into new DNA sequences and the signatures of conserved regions, as well as a greater understanding of the molecular mechanisms that contribute to the complex structure of the male chromosome.

Bioinformatician and corresponding author Dr Adam Phillippy, a member of the T2T Consortium and Head of Genome Informatics at the US National Human Genome Research Institute, explained that the human Y chromosome has been difficult to sequence and assemble owing to its complex structure.

“More than half of this chromosome was missing from the current human reference genome assembly, GRCh38, and as a result, our understanding of the Y chromosome was incomplete, limiting what we knew about its composition, complexity and how it might vary across populations,” he said.

The new map corrects multiple errors in the Y chromosome of the current human reference genome assembly, adds over 30 million base pairs of sequence to that reference, reveals the complete structures of several gene families and identifies 41 new protein-coding genes.

“The results also correct assumptions made in microbiome studies, in which previously unknown human Y chromosome sequences were mistakenly assigned as bacterial sequences,” Dr Phillippy said.

In a separate paper, Dr Charles Lee and colleagues assembled human Y chromosomes from 43 male individuals representing 21 different world populations to provide a more detailed view of the genetic variation between Y chromosomes arising from human evolution.

Most cells in the body are diploid, meaning they have two copies of each chromosome: 22 pairs of autosomal chromosomes plus a pair of sex chromosomes – either two Xs or one X and one Y.

However, gametes (sperm and eggs) are haploid, with only one copy of each autosomal chromosome, plus one sex chromosome – an X or Y – to ensure that the resulting embryo will have the correct number of chromosomes.
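The chromosome arithmetic described above can be checked with a short sketch. The function names here are hypothetical and used only for illustration; the counts follow directly from 22 autosome pairs plus the sex chromosomes.

```python
AUTOSOME_PAIRS = 22  # autosomal chromosome pairs in a human diploid cell


def diploid_count(autosome_pairs=AUTOSOME_PAIRS):
    """Chromosomes in a diploid cell: two of each autosome plus two sex chromosomes."""
    return 2 * autosome_pairs + 2


def haploid_count(autosome_pairs=AUTOSOME_PAIRS):
    """Chromosomes in a gamete: one of each autosome plus one sex chromosome."""
    return autosome_pairs + 1


def fertilise(egg_sex, sperm_sex):
    """Combine two gametes; return (total chromosomes, sex-chromosome pair)."""
    if egg_sex != "X":
        raise ValueError("an egg always carries an X chromosome")
    if sperm_sex not in ("X", "Y"):
        raise ValueError("a sperm carries either an X or a Y chromosome")
    return 2 * haploid_count(), egg_sex + sperm_sex


if __name__ == "__main__":
    print(diploid_count())      # 46
    print(haploid_count())      # 23
    print(fertilise("X", "Y"))  # (46, 'XY')
```

Two haploid gametes of 23 chromosomes each restore the diploid total of 46, and the sperm's sex chromosome decides whether the pair is XX or XY.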

During meiosis, the chromosomes from the mother pair up with the father’s chromosomes and undergo recombination, where sections of genetic material are exchanged, before being separated into unique haploid daughter cells.

Genes spaced far apart on a chromosome are expected to recombine regularly over generations – with genes at opposite ends of a chromosome likely originating from different parts of the family tree – while genes that are close together are less likely to be split up during meiosis, forming clusters of genetic material that came exclusively from one parent, known as haplotypes.
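As a rough illustration of how recombination depends on distance, the sketch below uses Haldane's mapping function, a standard textbook model that is not drawn from either Nature paper. It assumes crossovers occur independently along the chromosome, and the function name is hypothetical.

```python
import math


def haldane_recombination_fraction(distance_morgans):
    """Expected recombination fraction between two loci under Haldane's model.

    distance_morgans is the genetic map distance in Morgans (an assumed input,
    not a figure from the article). The model assumes no crossover interference.
    """
    if distance_morgans < 0:
        raise ValueError("map distance cannot be negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


if __name__ == "__main__":
    # Closely linked loci rarely recombine; distant loci approach the 50% ceiling.
    for d in (0.01, 0.1, 0.5, 2.0):
        print(f"{d:>5} Morgans -> {haldane_recombination_fraction(d):.4f}")
```

Under this model, loci about 0.01 Morgans apart recombine in roughly 1% of meioses, while widely separated loci approach, but never exceed, 50%, which is the same as independent inheritance.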

However, even though all parts of the autosomal chromosomes and the X chromosome can undergo recombination when paired during meiosis, the Y chromosome never pairs with another Y chromosome, resulting in large sections that do not recombine – which means that it can be used to trace the paternal lineage of specific haplotypes.

In wider use, the term haplotype can also describe groups of DNA variations that tend to be inherited as a set – these can be whole genes, or polymorphisms in coding or noncoding regions.

The human reference genome, the most widely used resource in human genetics, is a linear composite of merged haplotypes from more than 20 people, with data from a single individual comprising most of the sequence.

“However, understanding the composition and appreciating the complexity of the Y chromosomes in the human population requires access to assemblies from many diverse individuals,” Dr Lee explained.

“We have combined PacBio HiFi and ONT long-read sequence data to assemble the Y chromosomes from 43 males, representing the five continental groups from the 1000 Genomes Project.

“While both the GRCh38 (mostly R1b-L20 haplogroup) and the T2T Y represent European Y lineages, 49% of our Y chromosomes represent African lineages and include most of the deepest-rooting human Y lineages.

“This newly assembled dataset of 43 Y chromosomes thus provides a more comprehensive view of genetic variation at the nucleotide level across over 180,000 years of human Y chromosome evolution.”