Thursday, February 26, 2015

Dog genome annotation quality has improved during the past 30 Ensembl releases

The quality of a genome annotation changes with each new Ensembl release. But how much does it actually change?

We can quantify the change in annotation quality by tracking specific parameters such as the number of genes, transcripts and exons. Here, the annotation of the dog genome is compared across Ensembl releases 48 to 78 to understand the effect of a change in genome assembly as well as the role of annotation curation.
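For readers who want to try this kind of count themselves, here is a minimal sketch of how such parameters could be tallied from one release's annotation. It assumes the annotation is available as GTF-formatted lines carrying gene_id and transcript_id attributes; the function name summarize_gtf and the toy records with IDs G1, G2, T1, T2, T3 are made up for illustration and are not the script or data behind the figures in this post.

```python
from collections import defaultdict


def summarize_gtf(lines):
    """Tally genes, transcripts and exon records from GTF-formatted lines.

    Hypothetical helper for illustration only. Genes and transcripts are
    derived from the gene_id / transcript_id attributes, so the count does
    not depend on explicit "gene" or "transcript" feature lines.
    """
    genes = set()
    transcripts_per_gene = defaultdict(set)
    exon_records = 0

    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        feature, attribute_text = fields[2], fields[8]

        # Parse 'key "value";' pairs from the attribute column.
        attrs = {}
        for item in attribute_text.split(";"):
            item = item.strip()
            if not item:
                continue
            key, _, value = item.partition(" ")
            attrs[key] = value.strip().strip('"')

        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        genes.add(gene_id)

        transcript_id = attrs.get("transcript_id")
        if transcript_id:
            transcripts_per_gene[gene_id].add(transcript_id)

        # Exons shared by several transcripts are counted once per record.
        if feature == "exon":
            exon_records += 1

    n_genes = len(genes)
    n_transcripts = sum(len(ids) for ids in transcripts_per_gene.values())
    return {
        "genes": n_genes,
        "transcripts": n_transcripts,
        "exons": exon_records,
        "transcripts_per_gene": n_transcripts / n_genes if n_genes else 0.0,
        "exons_per_gene": exon_records / n_genes if n_genes else 0.0,
    }


# Toy records with made-up identifiers (not real dog annotation).
EXAMPLE = [
    '#!toy header line',
    '1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    '1\ttoy\tCDS\t120\t200\t.\t+\t0\tgene_id "G1"; transcript_id "T1";',
    '1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    '1\ttoy\texon\t100\t250\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    '2\ttoy\texon\t500\t900\t.\t-\t.\tgene_id "G2"; transcript_id "T3";',
]

if __name__ == "__main__":
    print(summarize_gtf(EXAMPLE))
```

Running the same function on the annotation file of each release and plotting the resulting numbers against the release number would give curves of the kind discussed below. Note that this sketch counts exon records, so an exon shared by two transcripts is counted twice.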

First, we plot the number of annotated genes (y-axis) in the dog genome against the Ensembl release (x-axis). The number of genes increases steadily with each release, with one exception: a sharp drop in release 68. At first glance this drop is surprising, but it becomes less so once we note that the move from release 67 to 68 was accompanied by a change in genome assembly from BROADD2 to CanFam3.1.
Other features of the genome annotation, such as the number of transcripts per gene, median coding length, average exons per gene and 3' UTR count, change in a very different way (see the figure below).


While such large changes in the genome annotation between releases might be scary, the engine of science chugs on. In the broader scheme of things, whether the dog has 23,062 genes or 24,660 genes should hopefully not matter much to dog breeders, who seem to be driven by the discoveries made possible by the dog genome and its annotation.
