Long after our simulation study on RNA-seq, it seems that it is finally time to see some real-world use for it. Being cited many times might suggest the study has some utility, but actually helping real-world applications such as curing cancer or malaria seems a better reward.
Downloading from SRA a dataset of reads (from a certain bird species) mapped onto the Zebra Finch genome and calculating the mean mapping quality per gene shows that genes with an Ensembl ID greater than ~14,000 have a lower mean mapping quality than the rest of the genome.
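For readers who want to repeat this kind of check, here is a minimal sketch of the per-gene calculation in plain Python. It is not the script used for the analysis above: the function name `mean_mapq_per_gene`, the tuple layout of the reads, the gene names, and all the numbers are made up for illustration. It assumes the alignments have already been parsed into (chromosome, position, MAPQ) tuples and that gene coordinates are 1-based and inclusive.

```python
from collections import defaultdict


def mean_mapq_per_gene(reads, genes):
    """Return {gene_id: mean MAPQ} for reads whose start falls inside a gene.

    reads: iterable of (chrom, pos, mapq) tuples (hypothetical layout).
    genes: dict of gene_id -> (chrom, start, end), 1-based inclusive.
    """
    totals = defaultdict(lambda: [0, 0])  # gene_id -> [sum of MAPQ, read count]
    for chrom, pos, mapq in reads:
        # Linear scan over genes: fine for a toy example, too slow for a
        # whole genome, where an interval index would be the better choice.
        for gene_id, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                totals[gene_id][0] += mapq
                totals[gene_id][1] += 1
    return {gene_id: s / n for gene_id, (s, n) in totals.items()}


# Toy data, invented for illustration only.
genes = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr1", 500, 600),
}
reads = [
    ("chr1", 150, 60),
    ("chr1", 180, 40),
    ("chr1", 550, 0),
    ("chr1", 560, 20),
    ("chr2", 150, 60),  # falls in no gene, so it is ignored
]

result = mean_mapq_per_gene(reads, genes)
print(result)
```

Genes without any overlapping reads are simply absent from the result, which avoids dividing by zero.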
This result could be due to the increased mis-mapping seen in our simulation study (see Figure 4). Would discarding multi-mapped reads (zero mapping quality) get rid of this bias? Should this be an issue of major concern? Hopefully, follow-up studies will appreciate the importance of such bias and try to avoid it or correct for it.
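One rough way to probe the first question is to compare, for each gene, the mean mapping quality before and after dropping zero-quality reads, together with the fraction of reads that get dropped. The sketch below does this on invented numbers; the helper names `mean_mapq` and `zero_fraction` and the two example genes are hypothetical, and it assumes the aligner reports multi-mapped reads with a mapping quality of exactly 0, which is not true of every aligner.

```python
def mean_mapq(mapqs, min_mapq=0):
    """Mean MAPQ over reads with MAPQ >= min_mapq, or None if none remain."""
    kept = [q for q in mapqs if q >= min_mapq]
    if not kept:
        return None
    return sum(kept) / len(kept)


def zero_fraction(mapqs):
    """Fraction of reads with MAPQ 0 (assumed here to be the multi-mapped ones)."""
    if not mapqs:
        return None
    return sum(1 for q in mapqs if q == 0) / len(mapqs)


# Invented MAPQ values for two hypothetical genes, for illustration only.
low_id_gene = [60, 60, 40, 0]
high_id_gene = [60, 0, 0, 20]

for name, mapqs in [("low-ID gene", low_id_gene), ("high-ID gene", high_id_gene)]:
    print(
        name,
        "mean (all reads):", mean_mapq(mapqs),
        "mean (MAPQ >= 1):", mean_mapq(mapqs, min_mapq=1),
        "fraction MAPQ 0:", zero_fraction(mapqs),
    )
```

In this toy case the gap between the two genes narrows after filtering but does not disappear, because the remaining reads of the second gene still include low-quality alignments. Whether the same holds for the real dataset is exactly the open question.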