Wednesday, November 4, 2020

A gene seems to be lost in chicken - is it really true?

Plg-RKT, the plasminogen receptor with a C-terminal lysine, was first identified a decade ago (in 2010) in a proteomics study. Given its recent discovery, the number of research papers focused solely on this protein is limited. A list of the papers dealing with the discovery and functional characterization of this protein, as well as its relevance to understanding the healthy and diseased states of the body, is provided in the references. Prior to its characterization, Plg-RKT was known as C9orf46 because of its location on human chromosome 9 (open reading frame 46). The original discovery of the role of Plg-RKT and most of the subsequent work on this protein come from the lab of Lindsey Miles (Professor of Cell and Molecular Biology) at The Scripps Research Institute, La Jolla, CA. The 2010 paper notes:

Our isolation of peptides corresponding to C9orf46 homolog is, to our knowledge, the first demonstration of the existence of this protein. We have designated the protein, Plg-RKT, to indicate a plasminogen receptor with a C-terminal lysine and having a transmembrane domain.

Presence of the C-terminal lysine in this protein seems to be highly conserved across mammals and birds. This lysine residue is exposed on the cell surface and is recognized by plasminogen. The known functions of the Plg-RKT gene can be summarized as follows:

  1. Regulation of macrophage phenotype
  2. Mammary development and lactation
  3. Regulation of efferocytosis
  4. Metabolic homeostasis and adipose function
  5. Mediation of Lipoprotein(a) endocytosis
  6. Regulation of cell surface plasminogen activation

Given the evidence for multiple important functional roles of this gene, it seems unlikely that it is dispensable. Several other receptors for plasminogen do exist and could potentially play a compensatory role. The orthologs of Plg-RKT are well conserved, and a 1-to-1 ortholog, CG13404 (FBgn0030559), is annotated in Drosophila melanogaster. A recent preprint implicates this orthologous gene in susceptibility to Coxiella burnetii infection in Drosophila melanogaster, based on a GWAS that relies on the DGRP. Although no ortholog is annotated in yeast, two homologs, tag-280 (WBGene00044322) and tag-281 (WBGene00044323), are annotated in C. elegans and remain uncharacterised.

To further identify potential gene losses in chicken, we obtained a list of genes that are co-expressed with Plg-RKT in human samples or otherwise known to interact with Plg-RKT and evaluated whether their orthologs are present in chicken. 

| Sl. No | Human gene stable ID | Gene name | Chicken ortholog | Remark |
|---|---|---|---|---|
| 1 | ENSG00000062038 | CDH3 | ENSGALG00000051984 | Pseudogene annotation on Ensembl, but annotated mRNA with ORF at KY120273.1 |
| 2 | ENSG00000137975 | CLCA2 | ENSGALG00000050155 | Ortholog found |
| 3 | ENSG00000149547 | EI24 | ENSGALG00000038097 | Ortholog found |
| 4 | ENSG00000126749 | EMG1 | ENSGALG00000014568 | Ortholog found |
| 5 | ENSG00000068438 | FTSJ1 | | Is this lost? Chicken Chr 12 and Chr 13 breakpoint |
| 6 | ENSG00000189280 | GJB5 | ENSGALG00000054289 | Ortholog found |
| 7 | ENSG00000108010 | GLRX3 | ENSGALG00000010464 | Ortholog found |
| 8 | ENSG00000196743 | GM2A | ENSGALG00000027534 | Ortholog found |
| 9 | ENSG00000138271 | GPR87 | ENSGALG00000010377 | Ortholog found |
| 10 | ENSG00000113161 | HMGCR | ENSGALG00000014948 | Ortholog found |
| 11 | ENSG00000053747 | LAMA3 | ENSGALG00000015056 | Ortholog found |
| 12 | ENSG00000172172 | MRPL13 | ENSGALG00000041863 | Ortholog found |
| 13 | ENSG00000131467 | PSME3 | ENSGALG00000002937 | Ortholog found |
| 14 | ENSG00000087494 | PTHLH | ENSGALG00000017295 | Ortholog found |
| 15 | ENSG00000176225 | RTTN | ENSGALG00000013745 | Ortholog found |
| 16 | ENSG00000104549 | SQLE | ENSGALG00000036915 | Ortholog found |
| 17 | ENSG00000056972 | TRAF3IP2 | ENSGALG00000015026 | Orthology not annotated |
| 18 | ENSG00000087245 | MMP2 | ENSGALG00000003580 | Ortholog found |
| 19 | ENSG00000100985 | MMP9 | ENSGALG00000006992 | Ortholog found |

Most of the above genes have clear 1-to-1 orthologs in chicken. The origin and diversification of the plasminogen activation system have been explored by examining homologs of 15 genes, which fall into the following groups:

  • PLG, HGF and MST-1
  • HABP2, HGFAC, tPA and uPA
  • SERPINE1 (PAI-1), SERPINE2, SERPINE3 and SERPINI1
  • PAI-2
  • VTN
  • 3LU and uPAR

When the orthologs of these genes are searched for in chicken, we again find most of them. The exceptions are PLAUR and SERPINE1. Prior work has suggested that these genes are lost in chicken. In addition to these loss events, we see duplication of PLAU-like and PLG-like loci. The potential loss of PLAUR could be interesting, as PLAUR is known to interact with PLG.

| Gene stable ID | Gene name | Chicken gene stable ID |
|---|---|---|
| ENSG00000173531 | MST1 | ENSGALG00000002722 |
| ENSG00000122861 | PLAU * | ENSGALG00000050317 |
| ENSG00000122861 | PLAU * | ENSGALG00000046993 |
| ENSG00000148702 | HABP2 | ENSGALG00000008905 |
| ENSG00000163536 | SERPINI1 | ENSGALG00000009470 |
| ENSG00000122194 | PLG * | ENSGALG00000028886 |
| ENSG00000122194 | PLG * | ENSGALG00000004293 |
| ENSG00000135919 | SERPINE2 | ENSGALG00000005135 |
| ENSG00000011422 | PLAUR | |
| ENSG00000104368 | PLAT | ENSGALG00000003709 |
| ENSG00000253309 | SERPINE3 | ENSGALG00000017017 |
| ENSG00000109758 | HGFAC | ENSGALG00000015623 |
| ENSG00000019991 | HGF | ENSGALG00000033974 |
| ENSG00000106366 | SERPINE1 | |
| ENSG00000109072 | VTN | ENSGALG00000003589 |
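
The bookkeeping behind the table above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the post: the rows are transcribed by hand from the table, and the function name `summarize_orthologs` is made up for this example. A gene with no chicken ID is flagged as a candidate loss, and a gene with more than one chicken ID as a candidate duplication.

```python
# (human gene symbol, chicken Ensembl gene ID or None), transcribed from the table above.
ROWS = [
    ("MST1", "ENSGALG00000002722"),
    ("PLAU", "ENSGALG00000050317"),
    ("PLAU", "ENSGALG00000046993"),
    ("HABP2", "ENSGALG00000008905"),
    ("SERPINI1", "ENSGALG00000009470"),
    ("PLG", "ENSGALG00000028886"),
    ("PLG", "ENSGALG00000004293"),
    ("SERPINE2", "ENSGALG00000005135"),
    ("PLAUR", None),
    ("PLAT", "ENSGALG00000003709"),
    ("SERPINE3", "ENSGALG00000017017"),
    ("HGFAC", "ENSGALG00000015623"),
    ("HGF", "ENSGALG00000033974"),
    ("SERPINE1", None),
    ("VTN", "ENSGALG00000003589"),
]


def summarize_orthologs(rows):
    """Return (missing, duplicated) gene symbols from (symbol, chicken_id) rows."""
    hits = {}
    for symbol, chicken_id in rows:
        ids = hits.setdefault(symbol, [])  # register the gene even without a hit
        if chicken_id:
            ids.append(chicken_id)
    missing = sorted(s for s, ids in hits.items() if not ids)
    duplicated = sorted(s for s, ids in hits.items() if len(ids) > 1)
    return missing, duplicated


missing, duplicated = summarize_orthologs(ROWS)
print("No annotated chicken ortholog:", missing)
print("More than one chicken locus:", duplicated)
```

An empty ortholog column only means that no ortholog is annotated; as the PLGRKT case shows, a missing annotation still has to be checked by hand before calling it a loss.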

We next compiled the list of all the plasminogen receptors from previous reviews.

| Sl No | Gene stable ID | Gene symbol | Chicken gene stable ID | Gene name | Remarks |
|---|---|---|---|---|---|
| 1 | ENSG00000074800 | ENO1 | ENSGALG00000002377 | alpha-enolase | First plasminogen receptor to be identified. See: Activation of plasminogen into plasmin at the surface of endothelial microparticles: a mechanism that modulates angiogenic properties of endothelial progenitor cells in vitro (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2495018/) |
| 2 | ENSG00000189403 | HMGB1 | ENSGALG00000042875 | Amphoterin | |
| 3 | ENSG00000197747 | S100A10 | ENSGALG00000028774 | P11 | |
| 4 | ENSG00000107020 | PLGRKT | Is this really lost? | Plg-RKT | |
| 5 | ENSG00000182718 | ANXA2 | ENSGALG00000003770 | Annexin A2 | |
| 6 | ENSG00000170421 | KRT8 | ENSGALG00000050400 | Cytokeratin 8 | Orthology not annotated |
| 7 | ENSG00000005961 | ITGA2B | ENSGALG00000054766 | Integrin Alpha-IIb/beta-3 | |
| 8 | ENSG00000138448 | ITGAV | ENSGALG00000002655 | Integrin AlphaVbeta3 | |
| 9 | ENSG00000169896 | ITGAM | Orthologs found in lizard and alligator but not in birds. Duplication of ITGAX (see: Structural analysis of the CD11b gene and phylogenetic analysis of the alpha-integrin gene family demonstrate remarkable conservation of genomic organization and suggest early diversification during evolution (https://www.jimmunol.org/content/150/2/480.long)) | Integrin Subunit Alpha M | Integrin αMβ2 Orchestrates and Accelerates Plasminogen Activation and Fibrinolysis by Neutrophils (https://www.jbc.org/content/279/17/18063.long) |
| 10 | ENSG00000160255 | ITGB2 | ENSGALG00000007511 | Integrin subunit beta 2 | |
| 11 | Histone genes occur in clusters and all copies retain high levels of sequence similarity. See: Molecular Evolution of the Nontandemly Repeated Genes of the Histone 3 Multigene Family (https://academic.oup.com/mbe/article/19/1/68/1066713) | | | Histone 2B | Phosphatidylserine as an anchor for plasminogen and its plasminogen receptor, Histone H2B, to the macrophage surface (https://onlinelibrary.wiley.com/doi/full/10.1111/j.1538-7836.2010.04132.x) |

The journal Immunogenetics has previously published gene loss stories in cetaceans, in 2019 [Convergent inactivation of the skin-specific C-C motif chemokine ligand 27 in mammalian evolution (https://link.springer.com/article/10.1007/s00251-019-01114-z)] and 2018 [Cetacea Are Natural Knockouts for IL20 (https://pubmed.ncbi.nlm.nih.gov/29998404/)]. However, both IL20 and CCL27 are well-studied genes, and the observed losses spanned several species, including independent losses. The authors could also provide a fairly convincing explanation for why these genes were lost in cetacean species, along with evidence from re-sequencing datasets and RNA-seq experiments. Loss of NLRC4 and NAIP in pigs was reported in 2017 [Pig lacks functional NLRC4 and NAIP genes (https://link.springer.com/article/10.1007/s00251-016-0955-5)]. As the pig is a domesticated species, changes in its immune repertoire have implications for the pork industry. Interestingly, this paper refers to the lack of RIG-I in chicken and how the immune response differs because of this.

Given all this background information, we wanted to make sure we provided enough evidence for the pseudogenisation of PLGRKT to convince the reviewers. The recently published online PseudoChecker tool failed to find the remnants of the PLGRKT gene in chicken. Nor was it able to find the intact gene in duck when the human exons and CDS were used as the reference. So PLGRKT can be added to the list of fewer than 5% of genes that PseudoChecker is supposedly unable to find. Fortunately, we have been told that real lossomicists (scientists whose specialty is finding gene loss events) use exon-by-exon tblastx followed by careful scrutiny to prove gene loss. So we did this and found very clear evidence for the remains of exon 3 and a largely intact exon 4 in chicken. A side-by-side comparison with results from duck cDNA is provided here: https://github.com/ceglab/PLGRKT/tree/master/tblastx. Taken together, this and the evidence presented by Sharma et al. suggest that this gene is truly lost in chicken.
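
An exon-by-exon tblastx search of this kind can be scripted roughly as follows. This is a minimal sketch, not the pipeline in the GitHub repository linked above. It assumes NCBI BLAST+ is installed, that each exon sits in its own FASTA file, and that the tabular output format (`-outfmt 6`) is used. The file names, exon labels and hit values are invented purely to show the parsing and are not results.

```python
def tblastx_command(exon_fasta, genome_fasta, evalue=1e-5):
    """Build a BLAST+ tblastx call for one exon against a genomic sequence."""
    return [
        "tblastx",
        "-query", exon_fasta,
        "-subject", genome_fasta,
        "-evalue", str(evalue),
        "-outfmt", "6",  # 12 tab-separated columns, bitscore last
    ]


def best_hits(tabular_text):
    """Map each query ID to its (subject, percent identity, bitscore) top hit."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, pident, bitscore)
    return best


def exons_without_hits(exon_ids, hits):
    """Exons with no hit at all; these need manual scrutiny, not an automatic call."""
    return [exon for exon in exon_ids if exon not in hits]


# Hypothetical file names, only to show what the command looks like.
command = tblastx_command("exonA.fa", "target_region.fa")
print(" ".join(command))

# Invented tabular output for three hypothetical exons (exonB has no hit).
TOY_OUTPUT = (
    "exonA\tchrZ\t85.0\t30\t4\t0\t1\t90\t1000\t1089\t1e-10\t60.0\n"
    "exonA\tchrZ\t70.0\t20\t6\t0\t1\t60\t5000\t5059\t1e-04\t35.5\n"
    "exonC\tchrZ\t92.3\t26\t2\t0\t4\t81\t2000\t2077\t1e-12\t71.2\n"
)
hits = best_hits(TOY_OUTPUT)
no_hit = exons_without_hits(["exonA", "exonB", "exonC"], hits)
print(hits)
print("Exons without any hit:", no_hit)
```

The script only organises the search output. Deciding whether a weak or partial hit is the remains of an exon still needs the careful manual scrutiny described above.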

In addition to the work done by the Miles lab, recently published papers have come from the University of Otago (see the talk describing the work here; PLGRKT starts around 30 minutes into the video: https://www.youtube.com/watch?v=0bpNZSZdeQU) and the Medical University of Vienna (see video here: https://www.youtube.com/watch?v=xjPmTDkhWr8).

References

1. Plasminogen and the Plasminogen Receptor, Plg-RKT, Regulate Macrophage Phenotypic, and Functional Changes (https://www.frontiersin.org/articles/10.3389/fimmu.2019.01458/full)

2. The Plasminogen Receptor, Plg-RKT, and Macrophage Function (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3484331/)

3. The Plasminogen Receptor, Plg-RKT, is Essential for Mammary Lobuloalveolar Development and Lactation (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5965281/)

4. The Novel Plasminogen Receptor, Plasminogen ReceptorKT (Plg-RKT), Regulates Catecholamine Release (https://www.jbc.org/content/286/38/33125.full)

5. Plasminogen receptors and their role in the pathogenesis of inflammatory, autoimmune and malignant disease (https://onlinelibrary.wiley.com/doi/pdf/10.1111/jth.12064)

6. Deficiency of Plasminogen Receptor, Plg-RKT, Causes Defects in Plasminogen Binding and Inflammatory Macrophage Recruitment in vivo (https://pubmed.ncbi.nlm.nih.gov/27714956/)

7. Plasminogen and the Plasminogen receptor, Plg-RKT, regulate efferocytosis and macrophage reprogramming (https://www.fasebj.org/doi/abs/10.1096/fasebj.2018.32.1_supplement.280.4)

8. The Plasminogen Receptor, Plg-RKT, Regulates Metabolic Homeostasis and Promotes Healthy Adipose Function (https://www.ahajournals.org/doi/abs/10.1161/circ.134.suppl_1.19088)

9. Proteomics-based discovery of a novel, structurally unique, and developmentally regulated plasminogen receptor, Plg-RKT, a major regulator of cell surface plasminogen activation (https://ashpublications.org/blood/article/115/7/1319/26700/Proteomics-based-discovery-of-a-novel-structurally)

10. Regulation of Macrophage Migration by a Novel Plasminogen Receptor Plg-RKT (https://pubmed.ncbi.nlm.nih.gov/21940822/)

11. New Insights Into the Role of Plg-RKT in Macrophage Recruitment (https://pubmed.ncbi.nlm.nih.gov/24529725/)

12. Plasminogen Receptors: The First Quarter Century (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3938387/)

13. New Insight on the Role of Plasminogen Receptor in Cancer Progression (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521684/)

14. Plasminogen Receptors in Human Malignancies: Effects on Prognosis and Feasibility as Targets for Drug Development (https://pubmed.ncbi.nlm.nih.gov/31755385/)

15. Differential expression of Plg-RKT and its effects on migration of proinflammatory monocyte and macrophage subsets (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6688429/)

16. Recycling of Apolipoprotein(a) After PlgRKT-Mediated Endocytosis of Lipoprotein(a) (https://www.ahajournals.org/doi/full/10.1161/circresaha.116.310272)

17. Natural Genetic Variation in Drosophila melanogaster Reveals Genes Associated with Coxiella burnetii Infection (https://www.biorxiv.org/content/10.1101/2020.05.21.109371v1.full)

18. Origin and diversification of the plasminogen activation system among chordates (https://bmcevolbiol.biomedcentral.com/articles/10.1186/s12862-019-1353-z)

 


Monday, November 2, 2020

Genome sequencing and assembly of Mesua ferrea or Nagakeshara

The first draft genome assembly of the plant Mesua ferrea, colloquially known as Nagakeshara, has been published by Patil et al., 2020. The draft genome was assembled from high-coverage (~180X) Illumina sequencing data using the latest genome assembly software. The plant is important in traditional medicine, is used as a biofuel, and has also acquired religious significance. In fact, it is the state flower of Tripura and the state tree of Mizoram, both in North-East India. The de novo assembly generated by Patil et al., 2020 is 614 megabase pairs (Mbp) in size and has an N50 of 392 kilobase pairs (Kbp). The assembly quality is thought to be comparable to that of other published Malpighiales genomes.
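
For readers unfamiliar with the N50 statistic quoted above: it is the length L such that contigs (or scaffolds) of length L or longer together contain at least half of the assembled bases. A minimal sketch of the calculation follows. The function name `n50` and the toy contig lengths are illustrative only and have nothing to do with the Mesua ferrea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy contig lengths: total 54, and 10 + 9 + 8 = 27 reaches half of it.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
print("N50 of toy assembly:", toy_n50)
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about whether the contigs are assembled correctly.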

Some genome assemblies aspire to even higher N50 values to be considered high quality. To achieve such high contiguity, these projects tend to rely on PacBio or Nanopore sequencing data, and some also use optical mapping data. However, these advanced sequencing methods are expensive and the equipment for them is hard to find. The manuscript published by Patil et al., 2020 instead adds an additional dimension to the study by performing a comparative analysis of the demographic histories of several forest plants using the PSMC program. Notably, the parameter settings used to run PSMC are reported and properly optimized. This is in contrast to a slew of papers that tend to ignore the parameter settings to be used.
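
One PSMC setting that is easy to overlook is the `-p` pattern, which controls how atomic time intervals are grouped into free parameters. The sketch below only decodes such a pattern string. It uses the example pattern `4+25*2+4+6` from the PSMC documentation and is not taken from Patil et al., whose actual settings are not listed in this post. The function name `describe_psmc_pattern` is made up for illustration.

```python
def describe_psmc_pattern(pattern):
    """Return (atomic_intervals, free_parameters) for a PSMC -p pattern.

    A term "n*k" means n parameters, each spanning k atomic intervals;
    a bare term "k" means one parameter spanning k atomic intervals.
    """
    intervals = 0
    parameters = 0
    for term in pattern.split("+"):
        if "*" in term:
            repeats, span = (int(part) for part in term.split("*"))
        else:
            repeats, span = 1, int(term)
        intervals += repeats * span
        parameters += repeats
    return intervals, parameters


# Example pattern from the PSMC documentation, not the one used in the paper.
n_intervals, n_parameters = describe_psmc_pattern("4+25*2+4+6")
print(n_intervals, "atomic intervals grouped into", n_parameters, "free parameters")
```

Decoding the pattern this way makes it easier to see how many population-size parameters a given run is actually estimating.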

A previous version of the manuscript, titled "CoalQC - Quality control while inferring demographic histories from genomic data: Application to forest tree genomes" and dealing with various technical aspects, continues to languish on the bioRxiv repository. This may be a good testament to the fact that good English writing skills and proper structuring of the manuscript are more important than technical correctness when publishing in higher impact factor journals. Appeals to the contrary are interesting but unlikely to make much of an impact.