Monday, December 26, 2016

Why god hates ring species?

If god has an inordinate fondness for beetles, then he must surely hate ring species. As eminent evolutionary biologist, prominent god denier and all-round good guy, retired Professor Jerry Coyne, notes in his blog, there are no ring species. At least, no ring species that meet the textbook definition. He describes how genetic evidence has shown that even the most well-regarded ring species have some level of allopatry.

Being a long-time follower of his blog, Why Evolution Is True, as well as of his numerous papers and books that have established the field of speciation, I was overjoyed to write a digest for the journal Evolution on this topic. The other great name in speciation, Trevor Price, has worked on ring species, making this one of those cool things that has captivated the imagination of very smart people. "Barriers to gene flow and ring species formation" is a recent article that uses agent-based model simulations to show how ring species formation requires narrow corridors. These narrow corridors are highly susceptible to disruption due to local changes. Such disruption can result in allopatry.

"Tiny" amounts of allopatry can lead to a ring species not meeting the textbook definition. So obviously, god hates ring species :) and that is why he "designed" such a characteristic feature. On a more serious note, this paper does highlight the importance of using simulations in the field of Evolution. Hopefully, we will see increasing use of such approaches in the future.

Thursday, November 17, 2016

Stable polyploidy in yeast identified through experimental evolution

The paper we discussed today in the journal club is on a topic that has always been very fascinating. Polyploidy, especially speciation driven by polyploidy, is thought to be very common among plants. Janaki Ammal "attributes the higher rate of plant speciation in the northeast Himalayas compared to the northwest to polyploidy" in her paper "The effect of the Himalayan uplift on the genetic composition of the flora of Asia". In fact, a lot of her work revolves around polyploidy. Some of this work is fascinating and seems relevant even today. One could continue her research program today and still be very current.

Coming back to the paper, it is about yeast, not plants, which makes polyploidy easier to study because yeast is easy to manipulate. The title of the paper is "Experimental Evolution Reveals Interplay between Sch9 and Polyploid Stability in Yeast". After constructing the strains, experimental evolution was conducted for 1000 generations to evolve the required strains. All that is needed for figures 1 and 7 is flow cytometry. Figures 3 and 5 are both relative fitness assays and seem doable but need a lot of work. However, figures 2 and 4 need an array platform and might be out of reach.

The fact that they are able to pin down the Sch9 gene and the TORC pathway takes the paper to the next level, although it is not clear what the mechanism is or how general and widespread this pathway's role in stabilizing polyploidy can be. It would of course be interesting to see if the natural polyploid isolates from the Evolution Canyon show changes in this very pathway.

Wednesday, November 9, 2016

Heterogeneous genome differentiation : crow hybrid zones

Our paper about the analysis (mostly characterization, with some new ideas) of the genome-wide differentiation landscape in more than 100 crow genomes is now available to read. With a title that reads "Evolution of heterogeneous genome differentiation across multiple contact zones in a crow species complex", we hopefully convey the appropriate message. The message would be that it is rather easy for a well-managed and well-funded lab to choose a study system and build it up to the level of a model system rather quickly. What makes the study unique is the presence of replicate hybrid zones, which lets us contrast the phenotypes at different evolutionary distances and try to disentangle how selection acts across the genome.

We say "..parallelism by pathway rather than by repeated single-gene effects." I like this part in that it suggests multiple genes that can lead to the same phenotype. However, pinning down the actual genes and the causative variants is easier said than done. Systems that are easier to manipulate with much more historical context are still not resolved to this resolution. This is not to say that this work is not needed. It definitely is needed before we go to the next step. Numerous bird species have come to a stage where these resources have been developed. Integrating these data into a theoretical framework that spans multiple study systems will probably still take time.

The main result of the paper is basically figure 2, which tries to subtract out the background signals and find signatures that are unique to each hybrid zone. Similar attempts are underway in various other species with slightly more rigorous models. Some incorporate recombination rate maps, others use sequence conservation across species. A resolution of how common selective sweeps (hard, soft, partial etc.) are in natural populations as well as the methodology to detect them will probably still have to play out as functional validation methods are developed.

Saturday, September 24, 2016

Janaki Ammal's last day in the USA

A letter dated May 8th, 1931, from Cambridge (in the USA) to Cobb Blanchard is a short one. Janaki Ammal (JA for short) is happy to be spending her time in this "interesting part" of the country. She also mentions that she has not had time to think about Ann Arbor since leaving, and regrets having "left many a thing unfinished and undone."

Just 3 days before this, on May 5th, 1931, a typewritten, unsigned letter was sent to JA. The letter talks about Grace (Cobb's daughter) looking at JA's portrait and recognizing her. It has some more questions about what to do with some refunded money and a typewriter that was left behind. Cobb thanks JA for a homespun cover that she left for Cobb, remarking that it is "lovely". We further come to know that JA took a bus from Ann Arbor to Washington (for sightseeing) and then on to Cambridge and Boston. It seems the May 8th letter is a response to the letter sent on May 5th.

On June 8th, 1931, Cobb writes a reply to the May 8th letter. In it, Cobb talks about Eileen's accident and hopes for her recovery, and goes on to various things regarding her family. It's probably worth noting that she refers to her husband Frank being tired due to the extra duties of looking after the children. She also thanks JA for a postcard of sea gulls sent from New York. This letter talks about the third child (a boy) and his arrival into the family.

On June 28th, 1931, JA writes a reply to this letter from the John Innes Horticultural Institution. As a true cytologist she writes "I am so happy that atleast one of the three has the Y chromosome of the family". Dorothy and Grace are the older girls referred to in the previous letters. JA is also very happy with the atmosphere for working at John Innes. More than anything else, she is impressed by C. D. Darlington and writes that he "is a really brilliant man with a delightful sense of humor and an infinite capacity for talk and discussion." At tea time and lunch, JA has heard about the two books being written by Darlington: one is a textbook of cytology and the other is about travels in Persia.

In the same letter JA talks about having looked at root tips from her egg plants and seeing 46 to 48 chromosomes. She is excited to think that they are tetraploids. Her egg plants are of special importance here, as they are named "Janaki Brengal" and referred to as her first love in the Michigan Alumnus, Volume 42, page 532, who's who. The article in the Michigan Alumnus also notes that she lived in the Martha Cook Building.

Thursday, September 22, 2016

Janaki Ammal Correspondence with Cobb Blanchard - Part 1

I have managed to photograph all the correspondence involving Janaki Ammal from the Bentley archives, so I have around 150 pages' worth of material to read through and discuss. This is going to be spread over numerous blog posts with my own comments and observations. Finally, I hope to put all this information together to paint a picture of Janaki Ammal as seen through these letters.

The first letter I am going to talk about is dated December 24th, 1931, from the Department of Botany, Presidency College, Madras. It is addressed to Frieda Cobb Blanchard, who along with Harley Harris Bartlett worked on evening primrose research. Cobb earned her doctorate in 1920 while studying Mendelian inheritance in certain strains of Oenothera. This letter is written by Janaki Ammal after she has returned to India and taken up a position as a Research Fellow at the Presidency College.

The letter begins by thanking Cobb Blanchard for the data about her egg plant cultures sent along with the letter. Then she inquires about the little girls (Cobb's daughters). As an Indian, I can appreciate the similarity of life back in the 1930s to what we have now when she writes "I led a very gypsy life in Malabar visiting all the members of the E.K clan and being introduced to all the new arrivals". It's hard for me to decipher whether this sentence reflects some amusement on her part at having to visit so many members of her clan.

She writes "I have just fitted up a small cytological lab." Her teachers in botany, Prof. Tyson and Dr. Ekambaram, are keen that they perform cytological studies. She is also supposed to guide 3 boys working for their MSc and give a few lectures in cytology.

Apart from this position, she is also in contact with a sugar cane expert at the agricultural station in Coimbatore who wants her help in studying the cytology of Indian sugar canes. In November she visited this cane station and fixed some material for cytological examination. She also talks about how life in Madras is very interesting and that they are having the "All India Women's Educational Conference". Most of the delegates are being housed at Queen Mary's College, where she is at that time. It's probably worth noting that the All India Women's Conference was formally registered in 1930. However, I did not find mention of this event in Madras in their history.

The 37's: How I came to find more information about Janaki Ammal

"Rust", or high levels of ferric oxide, led the crew of Star Trek: Voyager to an old automobile floating through space. They find even more evidence of human objects on a planet halfway across the galaxy, in the Delta Quadrant. True to their nature of exploration, they follow these pieces to find the 37's. "The 37's" are actually earth humans that were "abducted by aliens" in the 20th century and transported to the Delta Quadrant. Some of the progeny of these individuals manage to overthrow the alien masters and establish a human civilization in the far-off Delta Quadrant. While I could begin to wonder if a population from such a small group of people would have enough diversity to sustain a whole new earth, it has to be pointed out that the aliens have abducted very different people: a World War 2 Japanese soldier, an African-American farmer and even Amelia Earhart.

Yes, the first female aviator to fly solo across the Atlantic Ocean, who disappeared leaving behind numerous theories about what happened. This episode of Star Trek is a fitting tribute to Amelia and her legacy of inspiring generations of men and women that followed. Another lady who was active during the same time and traveled halfway across the world was Janaki Ammal, a scientist of repute. Apart from various academic achievements, a Padma Shri award has been conferred on her by the government of India. Her story is one of pioneering inspiration and grit.

Recently, while walking past the Rackham graduate school (Janaki Ammal was a Barbour scholar at the University of Michigan), I was somehow reminded of her. After some digging around, I came to know that the Bentley archives has maintained some of the correspondence involving her. Hopefully, I will be able to share the details of the contents and discuss some of the interesting parts on this blog. The archive managers were super-helpful in finding the material and some of the digitized photographs. Given below are two of the photographs which include Janaki Ammal.

She is rather easy to spot as she is the only Indian lady in both pictures, even without the writing below the second picture.

My hope is to get hold of the 15 – 20 letters' worth of correspondence between Dr. Janaki Ammal and the Blanchard family in Box 5 of the Matthaei Botanical Gardens records and some additional correspondence in Box 11 of the University Herbarium records. It will be especially interesting to read about her experiences during the Second World War.

Tuesday, August 9, 2016

LTEE - E.coli and long term evolution - Lenski and his E coli

The E. coli long-term evolution experiment (LTEE) was started in 1988 and has now been continued for more than 65,000 generations. Changes in fitness, evolution of new phenotypes and the associated changes in the genome have been monitored for 12 initially identical populations that have now diverged. We recently discussed the analysis of a dataset spanning 50,000 generations that showed an abundance of non-synonymous over synonymous changes. Even after 50,000 generations, "beneficial" (non-synonymous) changes outnumber the synonymous changes.

Re-plotting the data provided in the paper, we see a clear pattern of non-synonymous/synonymous >1 for the non-mutator populations with a peak at around 20,000 generations, followed by a decline. The point mutator populations have the opposite pattern with more synonymous changes than non-synonymous changes. 

The plot below shows a clear difference in the rate of accumulation of mutations between the "mutator" and non-mutator strains. Both these patterns are noted in the paper and are used to say "Our experimental results thus support a selectionist view of molecular evolution, complementing indirect evidence based on comparative genomics in bacteria, Drosophila and humans. Of course, the LTEE may differ from many natural populations in important respects including its low mutation rate, the absence of sex or horizontal gene transfer, and a stable environment. As we showed, high mutation rates tend to obscure the role of selection in molecular evolution."

Just to pretend that I found something new in the data, I plot the G-score (which is used to assess the degree of parallelism between clones) across the genome. In the figure below, it looks like we have a cluster of high G-scores between positions 3,248,576 and 3,481,685. This cluster has 7 of the 15 highest G-score genes reported in Table 1. Although these are not consecutive genes, the pattern does seem a bit striking.

Saturday, July 9, 2016

Speciation and divergence at the level of the transcriptome, genetic editing and microRNA

Eviatar Nevo has studied speciation in the "Evolution Canyon" to understand the process of evolution at different scales. In a recent paper from his group, sympatric speciation as a model for the origin of species is investigated using Spalax galili as a model system. Their hypothesis is that the chalk- and basalt-dwelling populations owe their respective phenotypes to DNA mutations (SNPs) or to differences in DNA/RNA editing.

".. comprehensively screened the basalt and chalk genomes and transcriptomes for DNA and RNA editing and identified both differential DNA and RNA editing." They identify a few candidate genes that differ in the level of editing. While this does not establish a causal link, it is extremely interesting to think of non-SNP traits as potential markers. They also focus on microRNA and differences in codon usage preferences.

This picture shows the face of a Spalax species. The entire clade consists of numerous species and has been the focus of intense study over the years. Blind mole rats have been the subject of numerous studies that have looked at how the eye and eye proteins function.

Sunday, June 19, 2016

The Pursuit of Happiness or Happyness

Most people want to be happy. Except for the rare idiosyncratic individual, the world really is a place filled with people who are in the pursuit of happiness. Happiness is a weird and sometimes elusive thing. The movie The Pursuit of Happyness does a good job of capturing the elusive nature of Happyness.

Would eating the Gumbo make you happy? It has as much history as Thomas Jefferson does, or maybe even more.
How about modelling the evolution of directed graphs along a phylogeny? Would it make any difference what those directed graphs represent? They could be anything really: a map of the brain (connectome), parasite transmission networks, protein-protein interaction networks or simply a simulation.
In the end we are still only chasing after it.

Who has found it?

The unlikeliest of people have found it. No, it has nothing to do with the Pittsburgh skyline and everything to do with the mind.

Monday, May 30, 2016

Dolphins, crows and apes - as clever as it gets

Dolphins, just like crows and apes, are very smart. In some ways they represent the pinnacle of the independent evolution of intelligence in species that dwell in the oceans, fly in the sky and walk on land respectively. What can the brains of these distinct yet similar taxa tell us about intelligence? Will they be able to provide the crucial insight needed to understand intelligence, thought and the brain? How can they guide artificial intelligence research?

All these may seem far-fetched questions for another day. However, we are not too far away. A study published in 2013, "Large-scale network organization in the avian forebrain: a connectivity matrix and theoretical analysis", was able to generate a preliminary map of an avian forebrain. Work in this field is progressing at an incredible pace. It might be worth noting that one of the co-authors, Murray Shanahan, is actually a Professor of Cognitive Robotics and has written the book "The Technological Singularity". So the fields are not so far apart after all.

A few months ago, while writing up my PhD thesis "Speciation genomics: A perspective from vertebrate systems", I began to realize how intricately linked the world is. One needs to understand the evolutionary genetics of phenotypic traits to be able to understand speciation and adaptation. This understanding of genetics will someday play an important role in looking at traits like "intelligence". We may in fact be able to unravel the great mysteries of the brain and its evolution.

Understanding the brain, its evolution and its genetics definitely has its own merits. The next step into the world of artificial intelligence and culture actually seems exciting at this point. What kind of morality would different machine cultures create? Would speciation "co-evolve" with culture in artificial systems as much as it seems to in the natural world?

Sunday, May 1, 2016

Ensembl Perl API to get all intron lengths in Human genome

The script below gets all gene stable ids from Ensembl and prints out the intron lengths for each transcript of every gene. Along with the intron length, the flanking exon ids are also printed. One can get the upstream and downstream intron length for each exon using the output.

The output would look like this:
Gene Id          Transcript Id    Previous Exon    Next Exon        Intron length
ENSG00000084674  ENST00000233242  ENSE00000932268  ENSE00000932269  717
ENSG00000084674  ENST00000233242  ENSE00000932269  ENSE00000932270  2338
ENSG00000084674  ENST00000233242  ENSE00000932270  ENSE00000932271  112
ENSG00000084674  ENST00000233242  ENSE00000932271  ENSE00000719046  1100
ENSG00000084674  ENST00000233242  ENSE00000719046  ENSE00000718984  261
ENSG00000084674  ENST00000233242  ENSE00000718984  ENSE00000932272  863
ENSG00000084674  ENST00000233242  ENSE00000932272  ENSE00000932273  1663
ENSG00000084674  ENST00000233242  ENSE00000932273  ENSE00000718481  1240
ENSG00000084674  ENST00000233242  ENSE00000718481  ENSE00000542194  482

 use strict;
 use warnings;
 use Bio::EnsEMBL::Registry;
 use Bio::SeqIO;
 use Getopt::Long;
 my $registry = 'Bio::EnsEMBL::Registry';
 ## Load the databases into the registry
 ## (the host was left blank in this post and has to be filled in before running;
 ## Ensembl's public MySQL server is ensembldb.ensembl.org)
 $registry->load_registry_from_db(
  -host => '',
  -user => 'anonymous'
 );
 ## Get the gene adaptor for human
 my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
 ## Fetch the stable ids of all genes
 my @gene_ids = @{$gene_adaptor->list_stable_ids()};
 foreach my $geneid (@gene_ids) {
  #print "$geneid\n";
  my $gene = $gene_adaptor->fetch_by_stable_id($geneid);
  foreach my $transcript (@{ $gene->get_all_Transcripts }) {
   foreach my $intron (@{ $transcript->get_all_Introns }) {
    ## Columns: gene id, transcript id, previous exon id, next exon id, intron length
    print $gene->stable_id,"\t",$transcript->stable_id,"\t",$intron->prev_Exon->stable_id,"\t",$intron->next_Exon->stable_id,"\t",$intron->length,"\n";
   }
  }
 }
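
Getting the upstream and downstream intron length for each exon from this output could be sketched roughly as follows. This is only a minimal sketch: the helper name flanking_introns is made up for illustration, the input is assumed to be the five whitespace-separated columns shown above, and "upstream" and "downstream" are taken in transcript order (the intron before and after the exon). The sample rows are the first three lines of the output above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): takes output rows of the form
# "gene transcript prev_exon next_exon intron_length" and returns a hash ref
# $flank->{transcript}{exon}{upstream|downstream} = intron length.
sub flanking_introns {
    my @rows = @_;
    my %flank;
    for my $row (@rows) {
        my ($gene, $transcript, $prev_exon, $next_exon, $length) = split ' ', $row;
        # The intron lies downstream of the exon that precedes it ...
        $flank{$transcript}{$prev_exon}{downstream} = $length;
        # ... and upstream of the exon that follows it.
        $flank{$transcript}{$next_exon}{upstream} = $length;
    }
    return \%flank;
}

my @rows = (
    "ENSG00000084674 ENST00000233242 ENSE00000932268 ENSE00000932269 717",
    "ENSG00000084674 ENST00000233242 ENSE00000932269 ENSE00000932270 2338",
    "ENSG00000084674 ENST00000233242 ENSE00000932270 ENSE00000932271 112",
);

my $flank = flanking_introns(@rows);
# Columns printed: transcript, exon, upstream intron length, downstream intron length
for my $transcript (sort keys %$flank) {
    for my $exon (sort keys %{ $flank->{$transcript} }) {
        my $up   = $flank->{$transcript}{$exon}{upstream}   // 'NA';
        my $down = $flank->{$transcript}{$exon}{downstream} // 'NA';
        print join("\t", $transcript, $exon, $up, $down), "\n";
    }
}
```

The first and last exon of a transcript have no intron on one side, which is why the sketch prints NA there. The same exon can appear in several transcripts with different neighbours, so the lengths are kept per transcript.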

Wednesday, April 27, 2016

Visualizing isoforms of a gene as an undirected graph using sna package

The sna (social network analysis) package in R provides an easy to use interface for handling network data structures. Apart from numerous statistics that can be calculated on the graph, it is possible to visualize the graph using the "gplot" command.

Here, we use a Perl script to convert the information regarding exon positions along the gene and the transcript structure into the nos format. We are then able to see that the gene glutamate-cysteine ligase, catalytic subunit (GCLC) is actually made up of 5 different components.
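
If "components" in the paragraph above is read as connected components of the exon graph, the count can also be checked without plotting. The following is a rough sketch under that assumption: count_components is a name made up here, the toy matrix is invented for illustration and is not the GCLC data, and every edge is treated as undirected.

```perl
use strict;
use warnings;

# Hypothetical helper: counts connected components of a graph given as a
# square 0/1 adjacency matrix (one array ref per row), ignoring edge direction.
sub count_components {
    my @adj  = @_;
    my $n    = scalar @adj;
    my @seen = (0) x $n;
    my $components = 0;
    for my $start (0 .. $n - 1) {
        next if $seen[$start];
        $components++;
        # Depth-first walk over everything reachable from $start
        my @stack = ($start);
        $seen[$start] = 1;
        while (@stack) {
            my $node = pop @stack;
            for my $other (0 .. $n - 1) {
                next if $seen[$other];
                if ($adj[$node][$other] || $adj[$other][$node]) {
                    $seen[$other] = 1;
                    push @stack, $other;
                }
            }
        }
    }
    return $components;
}

# Toy example (made-up data): exons 1-2-3 are chained, exons 4-5 are linked.
my @toy = (
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
);
print count_components(@toy), "\n";   # two components, so this prints 2
```

An exon that is never linked to another exon counts as a component of its own in this sketch.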

Perl script to write the gene in nos format: (GeneRanked_exons contains list of exons with their positional rank in the gene. test.exon.order contains the list of exons in each transcript ordered by the exon positional rank in the transcript)
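 ## Usage: perl <this script> GeneRanked_exons.txt test.exon.order > ENSG00000001084_graph.txt
 use strict;
 use warnings;
 ## Map each exon id to its positional rank in the gene
 my %exons=();
 my $maxexon=0;
 open(my $ranked, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
 while (my $row = <$ranked>) {
  chomp $row;
  my @values=split(' ',$row);
  ## Assumption: column 1 is the exon id and column 2 its rank (these lines were lost from the post)
  $exons{$values[0]}=$values[1];
  $maxexon=$values[1] if $values[1] > $maxexon;
 }
 close $ranked;
 ## Link exons that follow each other within the same transcript
 my %matrix;
 open(my $order, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!";
 my $row = <$order>;
 chomp $row;
 my @values=split(/\t/,$row);
 my ($exoncount1,$trans1,$trancount1)=($exons{$values[0]},$values[2],$values[3]);
 while (my $row = <$order>) {
  chomp $row;
  @values=split(/\t/,$row);
  my ($exoncount2,$trans2,$trancount2)=($exons{$values[0]},$values[2],$values[3]);
  #print "$exoncount1\t$trans1\t$exoncount2\t$trans2\t$trancount1\t$trancount2\n";
  ## Assumption: $ftrancount2 was not defined in the surviving code; it is taken
  ## here as the rank just before $trancount2, so only consecutive exons are linked
  my $ftrancount2=$trancount2-1;
  if(($trans1 =~ m/^$trans2$/i)&&($trancount1==$ftrancount2)){ $matrix{$exoncount1}{$exoncount2}=1;}
  ## Assumption: the current exon becomes the previous one for the next row
  ($exoncount1,$trans1,$trancount1)=($exoncount2,$trans2,$trancount2);
 }
 close $order;
 ## Write the network in nos format
 print "1\n";
 print "$maxexon $maxexon\n";
 for (my $i=0;$i<$maxexon;$i++){
  print "0 0 ";
 }
 print "\n";
 for (my $k=1;$k<=$maxexon;$k++){
  for (my $i=1;$i<=$maxexon;$i++){
   my $j=0;
   if(exists $matrix{$k}{$i}){$j=$matrix{$k}{$i};}
   print "$j ";
  }
  print "\n";
 }
 ## Example of the output format:
 #     #1
 #     #4 4
 #     #0 0 0 0 0 0 0 0
 #     #0 1 0 0
 #     #0 0 1 1
 #     #0 1 0 0
 #     #0 0 1 0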
The output from the printNetwork command can be read into R using the read.nos function of the sna package. The graph can then be visualized using below lines in R.

This produces a graph that looks like this (each red dot is an exon, with the number beside it being its positional rank):
While the actual gene on Ensembl looks like this: