Glycomics – the next frontier – explored by GWAS

This post covers a new genome-wide association study (GWAS) of blood protein N-glycosylation, “Defining the genetic control of human blood plasma N-glycome using genome-wide association study” by Sharapov et al., posted on bioRxiv.

The 2012 National Research Council (US) Committee report “Transforming Glycoscience: A Roadmap for the Future” stated:


“Glycans are one of the four fundamental classes of macromolecules that comprise living systems, along with nucleic acids, proteins, and lipids, and are made up of individual sugar units linked to one another in a multitude of ways. Understanding the structures and functions of glycans is central to understanding biology” [ref].

Glycosylation is a common post-translational modification of proteins. Glycans are directly involved in the pathophysiology of every major disease. Although most proteins are glycosylated, we know almost nothing about the mechanisms that regulate glycosylation.

The new GWAS by Sharapov et al. sheds some light on this mystery. Identifying the genetic factors that alter glycosylation provides a basis for novel diagnostic and pharmaceutical applications.


A genome-wide association study of the human blood plasma N-glycome composition in up to 3811 people discovered and replicated twelve loci.

What the study found

First, the study replicated six loci previously reported by Huffman et al., showing great consistency between studies. Then, it discovered and replicated twelve loci associated with a wide range of glycan traits. Finally, the authors put these findings into their biological and biomedical context, creating a network that shows the tight connections between the different genetic loci and glycans.

Square nodes represent genetic loci labeled with the names of candidate gene(s); circle nodes represent glycan traits. Green highlights candidate genes located in genomic regions that were previously found to be associated with the IgG N-glycome. Yellow highlights candidate genes located in genomic regions associated with the plasma N-glycome. Pink highlights glycan traits that mostly contain glycans linked to immunoglobulins. Blue highlights traits that are mostly formed by glycans linked to other (non-immunoglobulin) proteins. Blue/pink highlights glycan traits formed by a mixture of glycans linked to immunoglobulin and non-immunoglobulin proteins. Arrows represent a genetic association (P-value < 1.66E-9) between a gene and a specific glycan trait.
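To make the structure of such a network concrete, here is a minimal Python sketch of how locus-to-trait edges could be derived from association results using the P-value cutoff quoted in the legend. It is not the authors' pipeline. The function name, the record layout, and the toy records are hypothetical, and the placeholder loci and traits are not results from the study. The note that 1.66E-9 is close to 5E-8 divided by 30 is an assumption about how the cutoff may have been chosen; the post itself does not say.

```python
from collections import defaultdict

# Significance cutoff quoted in the figure legend.
P_THRESHOLD = 1.66e-9

# Assumption (not stated in the post): the cutoff is close to a conventional
# genome-wide threshold of 5e-8 divided by 30 independent tests.
assumed_threshold = 5e-8 / 30


def build_network(associations, threshold=P_THRESHOLD):
    """Map each locus to the set of glycan traits that pass the cutoff.

    `associations` is an iterable of (locus, trait, p_value) tuples;
    this record layout is hypothetical.
    """
    network = defaultdict(set)
    for locus, trait, p_value in associations:
        if p_value < threshold:
            network[locus].add(trait)
    return dict(network)


# Made-up records purely for illustration; NOT results from the study.
toy_associations = [
    ("LOCUS_A", "trait_1", 3.0e-12),
    ("LOCUS_A", "trait_2", 4.0e-10),
    ("LOCUS_B", "trait_2", 1.0e-15),
    ("LOCUS_B", "trait_3", 2.0e-6),  # fails the cutoff, so no edge is drawn
]

network = build_network(toy_associations)

# Traits connected to both loci are what make the network "tightly connected".
shared = network["LOCUS_A"] & network["LOCUS_B"]
```

In this toy input, `trait_2` is the only trait linked to both placeholder loci, which is the kind of shared node that ties different loci together in the figure.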


The study’s highlights

For instance, genetic variation at the FUT3/FUT6 locus is a major genetic factor (in terms of both the proportion of variance explained and the number of glycans affected) for the glycosylation of non-immunoglobulin proteins. According to current knowledge, the enzymes encoded by these genes catalyze fucosylation of antennary GlcNAc, resulting in glycan structures that are not found on IgG. This is consistent with the spectrum of glycan traits associated with the FUT3/FUT6 locus (Figure 2). Eight of the twelve replicated loci contain genes that encode enzymes directly involved in glycosylation (FUT3/FUT6, FUT8, B3GAT1, ST6GAL1, B4GALT1, ST3GAL4, MGAT3, and MGAT5). There is a clear overlap in genetic control between plasma and IgG glycosylation. Moreover, loci and genes are starting to emerge that likely reflect other, more complex aspects of the plasma glycosylation process.

This study, while using a smaller sample but more precise UPLC technology and recent GWAS imputation panels, confirmed the association of five known loci and identified and replicated an additional seven new loci, demonstrating that genetic control of plasma protein N-glycosylation is a complex process that involves a network of interacting proteins.

Summary statistics from this plasma N-glycome GWAS for 113 glycan traits are available for interactive exploration at the GWAS archive.