Wednesday, December 11, 2019

Correcting the nucleotide sequence of the tiger genome at the base-pair level

The tiger is the national animal of not just India but also South Korea, Malaysia and Bangladesh. The importance accorded to this animal reflects its grandeur. Unfortunately, the historic range of tigers has diminished drastically over the past century, leading to tigers being classified as an endangered species. Because the tiger is an endangered large cat, considerable effort has been directed at its conservation. Recent conservation efforts have turned to genomic tools to answer new questions (for example, see: "Conservation priorities for endangered Indian tigers through a genomic lens").

Most conservation studies that use genetic tools have dealt with the magnitude of genetic diversity, demographic history and their interaction with geographic extent. These approaches have helped develop strategies to control illegal trade and the associated poaching. However, the use of expensive genomic tools to aid conservation efforts is still not mainstream. Despite discussions regarding conservation genomics and its utility, concrete examples of genomics making a difference on the ground are still few and far between.

When the genomes of primates such as human and chimpanzee were sequenced almost two decades ago, the promise of comparative genomics in identifying human-specific traits was of great interest. A very compelling example of differences between human and chimp within an exonic region is exon 2 of the PRM1 gene. The alignment of the human and chimp genomes for this region is given below:

Human      GGTGCTGCCGCCCCAGGTACAGACCGCGATGTAGAAGACACTAATTGCACAAAATAGCACATC
Chimpanzee GGTGCTGCCGCCGCAGGTCCAGAATGAGACGTAGAAGACACTAATTGCACAGAATAGCACATC

Originally, this pattern of four amino-acid altering differences within a single exon was reported by Sabeti et al. 2006 (Positive Natural Selection in the Human Lineage).
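
The nucleotide differences between the two rows of the alignment above can be tallied directly. The sketch below is only an illustration and not the method used by Sabeti et al.: it assumes the two rows are already aligned column by column without gaps, and the function name `mismatch_positions` is made up for this post. It counts nucleotide differences only; mapping them to amino-acid changes would additionally need the strand and reading frame, which are not given here.

```python
# Aligned PRM1 exon 2 sequences copied from the alignment above.
human = "GGTGCTGCCGCCCCAGGTACAGACCGCGATGTAGAAGACACTAATTGCACAAAATAGCACATC"
chimp = "GGTGCTGCCGCCGCAGGTCCAGAATGAGACGTAGAAGACACTAATTGCACAGAATAGCACATC"


def mismatch_positions(seq_a, seq_b):
    """Return 0-based alignment columns where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


diffs = mismatch_positions(human, chimp)
print(len(diffs), "differing columns:", diffs)
```

Because the function only compares columns, it can be reused on any pair of rows from a gap-free alignment.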


Human                    CGCCCCAGGTACAGACCGCGATGTAGAAGACACTAATTGC
Bonobo                   CGCCGCAGGTCCAGACTGAGACGTAGAAGACACTAATTGC
Chimpanzee               CGCCGCAGGTCCAGAATGAGACGTAGAAGACACTAATTGC
Gorilla                  CGCCGCAGGAACAGACTGAGACGTAGAAAACACTAATTGC
Orangutan                CGCCGCAGGTACAGACTGAGATGTAGAAGACACTAATTGC
Gibbon                   CGCCCCAGGTACAGGCTGAGACGTAGAAGACACTAATTGC
Sooty mangabey           CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Drill                    CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Olive baboon             CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Gelada                   CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Crab-eating macaque      CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Macaque                  CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Pig-tailed macaque       CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Vervet-AGM               CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Angola colobus           CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Ugandan red Colobus      CGCCGCAGGTACAGGCGGAGGTGTAGAAGATACTAATTGC
Black snub-nosed monkey  CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Golden snub-nosed monkey CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAATTGC
Ma's night monkey        CGCCGCAGGTATAAGCCGCGGTGTAGAAGACACTAATTGC
Marmoset                 CGCCGCAGGTACAAGCTGCCATGTAGAAGATACTAATTGC
Capuchin                 CGCCGCAGGTACAGACTGAGGTGTAGAAGATACTAATTGC
Bolivian squirrel monkey CGCCGCAGGTACAAGCTGAGGTGTAGAAGATACTAATTGC
Tarsier                  CGCCGCTCCTTCCGGCTGAGGTGTAGAAGATACTGA-CGC
Mouse Lemur              CGCCGCAGGTACAGGTGTAGAAGAAGAAGATACTAAATGC
Greater bamboo lemur     CGCCGCAGGTACAGG------TGTAGAAGATACTAAATGC
Coquerel's sifaka        CGCCGCAGGTACAG---GTGTAGAAGAAGATACTAAATGC
Bushbaby                 CGCCGCAGGTACAGGCTGAGGTGTAGAAGATACTAAACGC

Using a multiple sequence alignment that spans 27 primate species, we are able to further delineate the changes that have occurred in the human lineage from those that have happened in the chimp lineage. The PRM1 gene codes for a protamine protein that acts as a substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis. Striking patterns of positive selection and associated changes in sperm morphology have been documented in various species. Identifying such amino-acid altering substitutions between species would contribute to a better understanding of each species and help define the entity that is the focus of conservation.
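
One simple way to assign a human-chimp difference to a lineage is to compare it with outgroup species. The sketch below is an illustration under stated assumptions, not the analysis behind the alignment above: it uses only four rows of the alignment, treats gorilla and orangutan as outgroups, and applies a bare parsimony rule (if both outgroups match the chimp base, the change is placed on the human lineage, and vice versa). The function name `polarize` is hypothetical.

```python
# Four rows copied from the primate alignment above (40 columns each).
alignment = {
    "Human":      "CGCCCCAGGTACAGACCGCGATGTAGAAGACACTAATTGC",
    "Chimpanzee": "CGCCGCAGGTCCAGAATGAGACGTAGAAGACACTAATTGC",
    "Gorilla":    "CGCCGCAGGAACAGACTGAGACGTAGAAAACACTAATTGC",
    "Orangutan":  "CGCCGCAGGTACAGACTGAGATGTAGAAGACACTAATTGC",
}


def polarize(aln, a="Human", b="Chimpanzee", outgroups=("Gorilla", "Orangutan")):
    """Assign each a/b difference to a lineage using a simple parsimony rule."""
    calls = {a: [], b: [], "ambiguous": []}
    for col, (base_a, base_b) in enumerate(zip(aln[a], aln[b])):
        if base_a == base_b:
            continue
        out_bases = {aln[name][col] for name in outgroups}
        if out_bases == {base_b}:
            calls[a].append(col)            # outgroups match b, so a changed
        elif out_bases == {base_a}:
            calls[b].append(col)            # outgroups match a, so b changed
        else:
            calls["ambiguous"].append(col)  # outgroups disagree or match neither
    return calls


for lineage, columns in polarize(alignment).items():
    print(lineage, columns)
```

A real analysis would use the full species tree and a model-based ancestral reconstruction rather than this two-outgroup shortcut; columns reported as ambiguous are exactly the ones where the shortcut breaks down.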

Given such insights at the molecular level, genome sequencing of any species has the potential to reveal interesting new information about it. The genome of the tiger was first reported by Cho et al. 2013 (The tiger genome and comparative analysis with lion and snow leopard genomes). Subsequent studies, including many high-profile papers, have used the tiger genome in comparative analyses to identify patterns of protein evolution.

Mittal et al. 2019 (Comparative analysis of corrected tiger genome provides clues to its neuronal evolution) report corrections to the assembly sequence of the first tiger genome, published by Cho et al. 2013 and currently available as PanTig1.0 on Ensembl as part of release 98 (September 2019). In addition to support from the raw read data, the authors rely upon ancestral states inferred from multiple sequence alignments and re-sequencing data from other individuals to ensure that the corrections they make are valid. Having been on bioRxiv for almost a year, this corrected tiger genome will hopefully motivate a speedy update of the tiger genome assembly on Ensembl. The underlying program used for genome correction is called SeqBug. It is freely available for download on its own GitHub page and is an improved version of BCD.



Sunday, August 18, 2019

Wombats are herbivores with CDCA and 15-alpha-OH as the major bile salts

The wombat looks like an overgrown rat, or even a cat with rat-like features. However, it is neither a rodent nor a carnivore. Because it is a marsupial whose geographic distribution is confined to Australia, many of us have probably never seen one. It does, however, look similar to the koala in some ways. Wombats use their claws and front teeth for burrowing as well as for eating tough vegetation, and they feed on roots and bark. A very slow metabolism has been documented and is thought to help them survive in arid environments.
Image: Vombatus ursinus, Maria Island National Park

The bile composition of the wombat (Vombatus ursinus) has been quantified using HPLC. It mainly consists of CDCA and 15-alpha-OH bile acids. It is unclear whether the other two species, the Northern and Southern hairy-nosed wombats (Lasiorhinus krefftii and L. latifrons), have a similar bile content. Given the frequent changes in bile composition between closely related species, it is possible that bile composition differs in these other species.

Shinde et al. explore the signatures of relaxed selection in the CYP8B1 gene and find strong patterns of relaxed selection in the wombat CYP8B1 gene. The time between posting on bioRxiv and acceptance of the paper was fairly short, given the fast turnaround time of the Journal of Molecular Evolution. All the code used for the analysis, along with detailed instructions, is posted on the github-CYP8B1 page that goes with the paper. In addition to the striking pattern of relaxation seen in the wombat CYP8B1 gene sequence, the manuscript also explores a few other aspects related to cetaceans, birds and Afrotheria, as well as the technical challenges associated with detecting relaxed selection and gene loss. By investigating population-level variability of the gene in chicken, we are able to identify the CYP8B1 gene, which is not annotated in the latest build of the Gallus gallus genome. Located beside the ACKR2 gene seen in the picture below, it is conserved across a large number of chicken breeds despite having acquired a stop codon in the genome of the individual used for the genome assembly. Future versions of the chicken genome will hopefully annotate this gene.

Figure 1: Lack of annotation for the CYP8B1 gene in the chicken genome.
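
A stop codon acquired inside a coding sequence, as described above for the chicken reference individual, can be spotted with a simple in-frame scan. The sketch below is a generic illustration and not the pipeline from the paper: it assumes the standard genetic code, a sequence that starts in frame, and toy sequences invented for this post (they are not real CYP8B1 sequence). The function name `premature_stops` is hypothetical.

```python
# Stop codons of the standard nuclear genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds):
    """Return 0-based codon indices of in-frame stop codons before the final codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Toy sequences invented for illustration only.
intact = "ATGGCTCGATTGTAA"      # ATG GCT CGA TTG TAA
disrupted = "ATGGCTTGATTGTAA"   # ATG GCT TGA TTG TAA

print(premature_stops(intact))
print(premature_stops(disrupted))
```

Running the same scan on the gene sequence from many individuals is one way to tell a stop codon private to a single genome from one that is shared across a population.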

Saturday, February 2, 2019

Incomplete Bhojeshwar Temple - a treasure trove of insights into ancient temple construction

India is without any doubt a country filled with temples. Every temple that I have been to has a large number of religious visitors and very few purely "cultural" visitors. Fortunately, I recently visited a very interesting temple. This temple is incomplete, and has been for almost a thousand years, and it tends to attract many non-religious visitors as it does not have traditional pooja (worship).

It is unclear why the temple construction was abandoned. Anecdotal explanations for why the construction stopped range from war and natural calamities to superstition and even divine intervention. Nonetheless, the fact that the temple's construction is frozen in time has made it possible for experts to study the temple construction methods of the 11th century. This is similar to freezing natural phenomena in time in order to study them. Studying genomes to understand evolutionary processes seems very similar to studying an incomplete temple to understand construction methods.