Monday, May 17, 2021

On the virtues of identifying the correct publication units and the ills of Salami slicing

Publishing a paper is much more than just doing good research and writing it up. It takes a lot of thought to identify the correct publication unit, craft a captivating story, and deliver it with the right tone. A vivid memory that has stuck in my mind is a lecture on publication ethics that discussed the idea of salami slicing. The phrase "salami-slicing" refers to splitting a manuscript into numerous small pieces to increase the publication count. At the time, it seemed to me like the most evil thing a scientist could do, and it reeked of greed and cunning.

Recent events have made me reconsider the quick judgment I had jumped to. Some alternative reasons why "salami-slicing" can happen are listed here (not as justification):

  1. A story can become too long and convoluted when it is not limited to the proper amount of content.
  2. Reviewers might be inclined to make comments like "This manuscript is about multiple things, and although the subjects are certainly appropriate for XYZ journal" etc. "At least 5 disparate projects are included in the paper...". Such comments can motivate, or rather ensure, splitting the manuscript into multiple parts.
  3. The cost of doing research continues to increase in most biology-related domains. Pouring all of these resources into one mega monolith might not be liked by funding bodies or other relevant authorities. This focus on paper count rather than on the quality or thoroughness of the research is a worrying prospect.
Having explained some background that doesn't justify "salami-slicing", let me provide details of what Patil et al. did. First, the manuscript titled "CoalQC - Quality control while inferring demographic histories from genomic data: Application to forest tree genomes", dealing with various technical aspects of PSMC, was posted on the bioRxiv repository in March 2020. Next, Patil et al. managed to publish the first part of the study in the journal Gene, titled "The genome sequence of Mesua ferrea and comparative demographic histories of forest trees", in October 2020. However, the technical parts dealing with repeats, genome assembly, and parameter settings remained unscrutinized by the powerful intellectual gaze of peer reviewers. After a struggle through numerous journals that were willing to publish the manuscript without demanding article processing charges (APCs), the second part is now published in the journal Heredity, titled "Repetitive genomic regions and the inference of demographic history".

The date of acceptance (17th April 2021) for this second part is of great significance. It came within days of the 130th birth anniversary (14th April) of an Indian anthropologist who wrote the book "WWTS?". Obviously, he is better known for his many other achievements. This book, published in 1946, is almost 300 pages long (including the appendices) and was sold at a cost of Rs. 12/8. Many things have changed in the years since. We now have WGS data to tell us about human population history. However, the spirit of the original book and its relevance continue to haunt India. If nothing else, the book delves into the past and challenges many ideas held dear and venerated by a few. This possibility of challenging and questioning dogma is what distinguishes scientific thought from non-scientific thought. In some ways, the second part of the CoalQC manuscript challenges the existing demographic inference methodology. The fact that such critical evaluations of widely used methods are accepted and add to the discussion is of great value. This article by Patil et al. is now available online as "Repetitive genomic regions and the inference of demographic history". Some additional material from the preprint that we never published forms the basis for a blog post (Leaping from frogs to plants - in quest of repeats) at the Nature Ecology and Evolution community.
