Tuesday, December 6, 2022

Why do we need to find and quantify distinctive features of a lineage?

Lineage-specific genes, or LSGs, have received a great deal of attention in genomic studies of evolution. However, the challenges involved in assembling and annotating genomes have made it difficult to obtain accurate counts and definitive lists. Nonetheless, it has been shown that "Many, but not all, lineage-specific genes can be explained by homology detection failure." Several well-known examples of lineage-specific genes have been the focus of detailed study. For instance, the loss of several functionally related genes in baker's yeast, Saccharomyces cerevisiae, has been linked to the loss of corresponding phenotypes. Conclusively establishing why a gene appears lineage-specific is also difficult. An LSG may be inferred because of gene loss, rapid sequence divergence, gene duplication, or other evolutionary events. Ruling out the various possibilities and disentangling cause and effect is not always possible.

Apart from LSGs, other lineage-specific changes have been noted, for example in cis-regulatory elements and in interaction partners. What is the contribution of these different lineage-specific changes? Which changes have played a large role in phenotypic evolution? Are these changes correlated, and if so, in what ways? Questions like these, about the relevance and implications of lineage-specific changes (LSCs), can be answered only when the changes can be quantified with confidence, after ruling out various bioinformatics artifacts. Once we have answers to these questions, we may begin to understand the evolution of genetic changes from a different perspective.

Lineage-specific accumulation of repeats in genomes has also been studied, although mostly in non-genic regions. For instance, in a comparison of plant genomes, Patil et al. identified large differences in the repeat content of closely related species. Do similar changes occur in protein repeats? How widespread are such changes genome-wide? These are intriguing questions, especially since some genes are known to undergo rapid changes in protein repeat content. Among the various genes that are worthy of study (GWoS), immune genes occupy the prime place. Hence, it is no wonder that Teekas et al. have carried out a comprehensive study of protein repeat evolution in immune genes. In their article titled "Lineage-specific protein repeat expansions and contractions reveal malleable regions of immune genes," they screen the annotated genomes of vertebrates for orthologs that contain protein repeats. Having identified orthologous repeats in orthologous genes, they quantify the expansion or contraction of those repeats. Since these changes in repeat length seem to carry a phylogenetic signal, they use PIC (Phylogenetically Independent Contrasts) to identify the changes that are most distinctive in specific lineages. Whether any of these changes have functional consequences remains to be seen. The approach developed by Teekas et al. opens the door to large-scale identification of candidate proteins that may act as "tuning knobs" of evolution. On the other hand, if these changes in repeat length have no consequence whatsoever, they reveal the regions of immune genes that are plastic to such changes.
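
For readers unfamiliar with PIC, the idea can be illustrated with a minimal sketch of Felsenstein's independent contrasts on a toy tree. This is not the pipeline used by Teekas et al.; the tree shape, branch lengths, species names, and repeat lengths below are made-up values chosen only for illustration, and the function name `independent_contrasts` is hypothetical.

```python
import math


def independent_contrasts(tree, trait):
    """Return standardized contrasts for a rooted, fully binary tree.

    A leaf is a species name (str). An internal node is a pair of
    (child, branch_length) tuples. `trait` maps species name -> value.
    All names and the tree encoding are assumptions for this sketch.
    """
    contrasts = []

    def visit(node):
        # Returns (estimated trait value, extra branch length to add above).
        if isinstance(node, str):
            return trait[node], 0.0
        (left, v_left), (right, v_right) = node
        x_left, extra_left = visit(left)
        x_right, extra_right = visit(right)
        # Lengthen branches to account for uncertainty in estimated values.
        v_left += extra_left
        v_right += extra_right
        # Standardized contrast between the two daughter lineages.
        contrasts.append((x_left - x_right) / math.sqrt(v_left + v_right))
        # Branch-length-weighted average becomes the ancestral estimate.
        x_node = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
        extra = v_left * v_right / (v_left + v_right)
        return x_node, extra

    visit(tree)
    return contrasts


# Hypothetical example: repeat lengths (in residues) for three species.
toy_tree = (((("sp_A", 1.0), ("sp_B", 1.0)), 1.0), ("sp_C", 2.0))
repeat_length = {"sp_A": 10.0, "sp_B": 12.0, "sp_C": 20.0}
toy_contrasts = independent_contrasts(toy_tree, repeat_length)
print(toy_contrasts)
```

A tree with n tips yields n - 1 contrasts, one per internal node. Contrasts with unusually large absolute values point to the splits where the trait, here repeat length, changed most relative to the branch lengths involved.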