Introduction

According to the Oxford English Dictionary, a retraction is "the action of withdrawing a statement, accusation, etc., which is now admitted to be erroneous or unjustified… recantation; an instance of this; a statement of making such a withdrawal". In a comprehensive review, Fang et al. (2012) found that scientific misconduct played a more prominent role than error among the articles retracted from the biomedical literature. According to the US Office of Research Integrity, scientific misconduct means “fabrication, falsification or plagiarism in proposing, performing or reviewing research or in reporting research results” (https://ori.hhs.gov/definition-misconduct).

Publication ethics is a code of conduct and framework developed for the journal publication process (Ali and Ali 2017). Violation of publication ethics is a worldwide concern and takes various forms, such as plagiarism and replication, coerced authorship, fake affiliation, and fraud. Specifically, the key problems faced by the editors of biomedical journals (Parasuraman et al. 2015) are plagiarism, multiple submissions (submission of the same manuscript to more than one journal simultaneously) and duplicate submissions (submission of a single manuscript to the same journal more than once). The number of breaches of publication ethics that occur and are published is increasing (Ali and Ali 2017). This rapid growth in retracted papers is driven primarily by misconduct (Fang et al. 2012; Parasuraman et al. 2015). Retractions can be viewed either positively or negatively. According to Fanelli (2013), the increasing number of retractions is a good sign that scientists / researchers and journal editors are becoming better at recognising and eliminating fraudulent or erroneous papers.

The main reasons behind scientific misconduct are “publish or perish” pressure and “a gap in knowledge” (Mousavi and Abdollahi 2020). Among retraction notices, plagiarism (including self-plagiarism) is the major reason (Elango et al. 2019; Chambers et al. 2019). More specifically, image duplication was prevalent in biomedical research publications: about one in 25 papers showed evidence of image duplication (Bik et al. 2016). The growth of internet facilities has paved the way both for scientific misconduct, such as plagiarism or duplicate publication, and for its detection. A troubling trend is the rising number of retracted scientific papers (Fang and Casadevall 2011). Very recently, Rapani et al. (2020) found that India was the top contributor to retracted publications in the dental literature, with about 28%, and other studies placed India among the top five countries for retractions (Bhatt 2020; Tang et al. 2020). These findings motivated this study.

With this background, the aim of the present study is to analyze the retracted articles in the biomedical literature authored by Indian scientists. In particular, this paper addresses the following research questions:

  • How much Indian biomedical literature is retracted?

  • What are the general characteristics of retracted Indian biomedical literature (authorship, collaboration type, funding information)?

  • Which journals issued the most retractions, and what are their impact factors?

  • Who initiates the retraction?

  • What is the average time needed for retraction?

  • What are the reasons for retraction, and how long does retraction take for each reason?

  • Which type of misconduct (plagiarism or fake data) is prevalent among funded research?

Previous studies

Nowadays, there is growing interest in examining retracted publications by discipline or subject as well as by country or territory.

Wasiak et al. (2018) identified that 84% of retracted publications in the field of radiation oncology were published after 2000. Bar-Ilan and Halevi (2018) found that more than 50% of the articles were retracted within 1 year. According to Bozzo et al. (2017), 61% of retracted cancer publications were retracted due to academic misconduct: plagiarism, duplicate publication, and fraud. Rapani et al. (2020) found that almost 28% of retracted dental literature was contributed by Indian authors. Faggion Jr et al. (2018) found that nearly 57% of retracted dental literature was published between 2012 and 2016. Nogueira et al. (2017) found that redundant publication was the predominant reason for retraction in the field of dentistry. Pantziarka and Meheus (2019) found that the vast majority (86%) of authors had only one retracted publication in the cancer literature. Bik, Casadevall & Fang (2016) found that 3.8% of published papers in the field of biomedicine contained problematic figures. Rubbo et al. (2017) found that the majority of retractions (~ 65%) in the field of engineering were issued by editors. Stricker and Günther (2019) found that 0.82 papers per 10,000 journal articles in the field of psychology were retracted due to scientific misconduct. According to Wang et al. (2019), retractions due to fraud or suspected fraud took the longest time, while those due to authorship disputes took the shortest. Most of the retractions in the obstetrics literature were due to content-related issues (Bennet et al. 2010). Rai and Sabharwal (2017) found that 30% of retractions in the field of orthopaedics were due to plagiarism.

Aspura, Noorhidawati & Abrizah (2018) found that, among the retracted publications of Malaysian authors, almost three-fourths of retractions were issued by Quartile 1 and 2 journals. Chen et al. (2018) identified that 40% of retractions in the Chinese biomedical literature were due to plagiarism and error. According to Lei and Zhang (2018), the most frequent reasons for retracted publications by Chinese researchers were plagiarism, fraud, and faked peer review. Moradi and Janavi (2018) studied the Iranian retracted papers indexed in Web of Science and found that plagiarism was the major reason for retraction. Glasnović et al. (2019) found that most of the retracted articles by Croatian authors were published in the field of biomedicine. Elango et al. (2019) analyzed retractions in Indian science based on the SCOPUS database and found that most retractions were due to plagiarism, including self-plagiarism. Scientific misconduct and duplication were the most common reasons for retraction in the Spanish biomedical literature (Dal-Ré 2020). Very recently, Palla et al. (2020) analyzed the retracted papers in Health Sciences from China and India based on data from www.retractiondatabase.org.

Recently, a few studies have examined individual journals and authors. Bik et al. (2018) found that 6.1% of papers published in the journal Molecular and Cellular Biology between 2009 and 2016 contained duplicated images. Wray and Andersen (2018) found that more than 50% of retractions in the journal Science were due to errors. Saikia and Thakuria (2019) examined various aspects of the retracted articles authored by Yoshitaka Fujii. McHugh and Yentis (2019) analyzed how long it took to retract the papers authored by Scott Reuben, Joachim Boldt and Yoshitaka Fujii.

Data and methods

Search strategy

PubMed was used to identify and collect information about retracted articles in the biomedical literature contributed by Indian scientists. It is a freely accessible database which contains bibliographic records of medical and biomedical literature. Related works similar to the present study (Chen et al. 2018; Wang et al. 2019; Campos-Varela et al. 2020) have also used PubMed as a source. PubMed assigns separate publication types to retracted publications and retraction notices. The following search strategy was used: (“retracted publication” [Publication Type] AND “INDIA” [Affiliation]). The database was accessed on 02.11.2020.
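The search strategy above can also be expressed as a programmatic query. The following is a minimal sketch, assuming the NCBI E-utilities ESearch endpoint is used; the study itself does not say whether the web interface or an API was used, and the function name `build_esearch_url` and the `retmax` value are illustrative choices, not part of the original method. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Search term as given in the text, written with straight quotes.
SEARCH_TERM = '"retracted publication"[Publication Type] AND "INDIA"[Affiliation]'

# NCBI E-utilities ESearch endpoint (assumption: the study may have used
# the PubMed web interface rather than this API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 1000) -> str:
    """Return an ESearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the query string with field tags
        "retmode": "json",   # ask for a JSON response
        "retmax": retmax,    # maximum number of PMIDs to return
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(SEARCH_TERM)
print(url)
```

Because PubMed is updated continuously, running such a query today would not reproduce the counts obtained on 02.11.2020.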

Data extraction

After downloading the retrieved records of retracted publications, every record was screened. All the corresponding retraction notices were consulted to collect information about retraction year, reason for retraction, sources of retraction, journals and publishers. The original articles were also consulted to collect information about publication year, number of authors, watermark and funding information. Only research supported by external agencies was counted as funded; financial support in the form of a fellowship was included, whereas financial support in the form of a teaching assistantship was not. The time needed to retract an article was calculated as the difference between the publication year and the retraction year. Additionally, the latest Journal Citation Reports (2020) was consulted to collect the impact factor and quartile of the journals that published the retracted articles.
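The retraction-time calculation described above can be sketched as follows. This is only an illustration under the stated year-level definition; the analysis in the study was done in MS-Excel, and the function name `retraction_time` is hypothetical. The two example pairs of years are the ones reported later in the text.

```python
def retraction_time(publication_year: int, retraction_year: int) -> int:
    """Years elapsed between publication and retraction (year-level data only)."""
    if retraction_year < publication_year:
        # A notice cannot precede the article it retracts.
        raise ValueError("retraction year precedes publication year")
    return retraction_year - publication_year


# Earliest notice mentioned in the text: published 1990, retracted 1992.
earliest = retraction_time(1990, 1992)
# Longest interval mentioned in the text: published 1994, retracted 2016.
longest = retraction_time(1994, 2016)
print(earliest, longest)
```

Working with years only means that an article retracted within the same calendar year has a retraction time of 0, which matches the lower bound of the range reported in the results.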

Data analysis

All the analyses were done in MS-Excel. The number of retracted publications per 10,000 publications was calculated by dividing the number of retracted publications by the total number of publications authored by Indian scientists and indexed in PubMed, and multiplying the result by 10,000.
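As a worked illustration of this rate, the sketch below computes retractions per 10,000 publications. The count of 508 retracted articles comes from the study, but the total of 500,000 publications is a purely hypothetical figure chosen for the example; the paper does not report the exact denominator.

```python
def retractions_per_10k(retracted: int, total_publications: int) -> float:
    """Retracted publications per 10,000 indexed publications."""
    if total_publications <= 0:
        raise ValueError("total_publications must be positive")
    return retracted / total_publications * 10_000


RETRACTED = 508                 # reported in the study
HYPOTHETICAL_TOTAL = 500_000    # illustrative assumption, NOT from the study

rate = retractions_per_10k(RETRACTED, HYPOTHETICAL_TOTAL)
print(round(rate, 2))
```

A rate of 10 per 10,000 is the same quantity as 0.1%, which is why a share that looks negligible as a percentage can still be high relative to other retraction rates.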

Coding the reasons for retraction

The classification of reasons for retraction is based on the retraction notices, and the reasons have been classified into nine categories (Elango et al. 2019). The reasons and related phrases are provided in Table 1. Among the reasons, plagiarism includes self-plagiarism, and error / mistake includes both researchers’ error and administrative error. Fake data includes image / data manipulation; generally speaking, images are considered data in science (according to the US Office of Research Integrity). Some retraction notices give more than one reason, and only the primary reason has been taken for further analysis. An example of such a notice is “Due to highly unethical practices, which include serial self-plagiarism, data manipulation and falsification of results found across multiple papers”. From this retraction notice, only the primary reason “self-plagiarism” has been collected.

Table 1 Reason for retraction and related phrases
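The coding rule described in this section can be sketched as a simple phrase match. This is a rough illustration, not the procedure used in the study: the phrase lists below are hypothetical stand-ins for the contents of Table 1, only three of the nine categories are shown, and treating the first-mentioned reason as the primary one is an assumption inferred from the single example given in the text.

```python
from typing import Optional

# Hypothetical phrase lists; the real phrases are those listed in Table 1.
REASON_PHRASES = {
    "plagiarism": ["plagiarism", "plagiarised", "plagiarized"],
    "fake data": ["data manipulation", "image manipulation",
                  "falsification", "fabrication"],
    "duplicate publication": ["duplicate publication", "redundant publication"],
}


def primary_reason(notice: str) -> Optional[str]:
    """Return the category whose phrase appears earliest in the notice.

    Assumption: the first reason mentioned is treated as the primary one.
    Returns None when no known phrase is found.
    """
    text = notice.lower()
    best = None  # (position, category) of the earliest match so far
    for category, phrases in REASON_PHRASES.items():
        for phrase in phrases:
            pos = text.find(phrase)
            if pos != -1 and (best is None or pos < best[0]):
                best = (pos, category)
    return best[1] if best else None


notice = ("Due to highly unethical practices, which include serial "
          "self-plagiarism, data manipulation and falsification of results "
          "found across multiple papers.")
print(primary_reason(notice))
```

Notices that state no reason, such as "This article has been retracted", return None here and would need to be coded by hand.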

Results and discussion

There are 508 retracted articles (as on 02.11.2020) authored by Indian authors, which account for nearly 6.2% of the retracted publications indexed in the PubMed database. The number of retracted articles is very low compared to the number of publications contributed by Indian scientists in the database (~ 0.1%). However, this amounts to more than 10 retracted articles per 10,000 Indian biomedical publications indexed in PubMed, which is a worrying figure: it is almost four times higher than the overall retraction rate in PubMed (Campos-Varela et al. 2020).

In the analysis, some interesting findings have been observed. (1) The Saudi Journal of Anaesthesia (Footnote 1) retracted ten articles in a single retraction notice, and all ten articles were submitted by the same corresponding author. (2) A corrigendum (Footnote 2) was published for a previously published article (Footnote 3), and both the original article and the corrigendum were retracted due to plagiarism. (3) A correction (Footnote 4) was published, but it did not convince the editors, resulting in the retraction of the original article (Footnote 5). (4) Both an erratum (Footnote 6) and an editorial (Footnote 7) have been retracted. It is to be noted that these were written and retracted by the editor-in-chief, and it is the only editorial item retracted among the Indian biomedical literature. (5) Very recently, a COVID-19 related article (Footnote 8) was retracted by the Korean Journal of Anesthesiology due to plagiarism.

Characteristics of retracted articles

In this part, the general characteristics of the retracted articles are discussed, along with the number of retractions per author, collaborating countries and top journals.

The major characteristics of the retracted articles are described in Table 2: publication year of original articles, funding, number of authors and collaboration type, impact factor and watermark. Almost two-thirds of the retracted articles were published after 2010 and only nine articles were published before 2000. This suggests that biomedical literature published after 2010 may be more prone to problems, whether fraud or error (Lei and Zhang 2018). More than one-third (34%) of retracted publications were externally funded by various funding agencies, and funding information could not be traced in 7% of retracted publications. Of the 173 funded retracted articles, the majority (> 71%) were published after 2010. Such retracted articles supported by public money waste not only the money but also the human resources of editors, publishers and reviewers. More than 95% of retracted articles were published with co-authors, and 39% of them were written by 3 to 4 authors. Only 5% of retracted articles were written by a single author, which is almost 50% lower than in a study based on SCOPUS (Elango et al. 2019) and in retracted papers in life sciences (Palla et al. 2020). The number of authors in retracted biomedical articles contributed by Indian scientists ranged between 1 and 32. In terms of collaboration, 10% of retracted articles were written with international co-authors and more than 56% of retracted articles were co-authored within a single institution. Nearly one-fourth of retracted articles had authors from two or more institutions within the country. Retracted articles were published in 291 different journals; of these, 95 journals have no impact factor and the remaining 196 have impact factors between 0.426 and 74.699, with an average of 4.22. Almost one-third of retracted articles were published in non-impact factor journals, a result that contradicts Campos-Varela et al. (2020). More than 50% of retraction notices were issued by journals in quartiles 1 and 2. This suggests that journals with high impact factors take an interest in correcting the faulty scientific literature (Rubbo et al. 2017). Nearly 80% of retracted articles display the watermark “retracted publication”; the watermark information could not be found for 17.5% of retracted articles because of subscription-based access or because the article had been withdrawn. It is suggested that retracted articles should be made freely available, both to avoid citations to retracted articles and to raise awareness of the retraction status of a particular article.

Table 2 Overview of retracted articles

There are 1741 unique authors among the 2315 authorships of the 508 retracted articles. Table 3 provides the number of retracted articles per author. Nearly 85% of authors had a single retracted article, which suggests that repeat offenders are uncommon. A similar result was found in the cancer literature (Pantziarka and Meheus 2019); in contrast, Samp et al. (2012) found that 40% of retracted studies in the drug literature were authored by two individuals.

Table 3 Number of retractions per author
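The distinction between authorships and unique authors, and the per-author tally behind Table 3, can be sketched as below. The author lists are invented placeholders, and matching authors by exact name string is a simplifying assumption; the study does not describe how author names were disambiguated.

```python
from collections import Counter

# Hypothetical author lists, one per retracted article (placeholder names).
sample_articles = [
    ["Author A", "Author B"],
    ["Author A", "Author C", "Author D"],
    ["Author E"],
]


def retractions_per_author(articles):
    """Count retracted articles per author, once per article."""
    counts = Counter()
    for authors in articles:
        counts.update(set(authors))  # set() guards against duplicated names
    return counts


counts = retractions_per_author(sample_articles)
authorships = sum(len(set(authors)) for authors in sample_articles)
unique_authors = len(counts)
# Share of authors with exactly one retracted article.
single_share = sum(1 for n in counts.values() if n == 1) / unique_authors
print(authorships, unique_authors, single_share)
```

In this toy data there are six authorships but five unique authors, because one author appears on two retracted articles.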

More than 10% of the retracted Indian biomedical literature had international author(s), and the list of international collaborating countries is provided in Table 4. Not surprisingly, the United States tops the list with 19, followed by Japan with 5, and Iran and South Korea with 4 each. These four countries accounted for almost 60% of the retracted internationally collaborated articles. The United States tops the list because it is the major collaborative partner country for Indian authors in many scientific fields (Elango and Ho 2017). Almost two-thirds of the retracted internationally collaborated articles were published with authors from the G7 countries, whose dominance has been observed in many research areas (Ho 2014; Elango et al. 2013).

Table 4 Collaborating countries

The 508 retracted articles were published in 291 different journals. The fifteen journals that had at least five retracted publications contributed by Indian scientists are listed in Table 5. More than 25% of retracted articles were published in these top fifteen journals. The highest number of retracted articles was published in Plos One (26), followed by the Journal of Biological Chemistry (24). These top two journals belong to the second quartile. Five non-impact factor journals also figure in the list of top journals.

Table 5 Journals with at least five retracted articles

High impact journals such as the New England Journal of Medicine (IF = 74.699) and JAMA-Journal of the American Medical Association (IF = 45.54) also had retracted publications by Indian scientists. However, these journals retracted only one or two publications by Indian scientists, far fewer than the journals with the highest numbers of retractions. This suggests that there is less chance of scientific misconduct or distortion among the articles published in high impact journals.

Characteristics of retraction notices

In this part, the characteristics of retraction notices are discussed: retraction time, sources of retraction, and reasons for retraction. The 508 retraction notices were issued between 1992 and 2020 (as on 02.11.2020). The earliest retraction notice was issued in 1992 for a publication from 1990.

Retraction time for the 508 articles ranged between 0 and 22 years, with a large gap after 15 years, the next value being 21 years (Fig. 1). Nearly 80% of retraction notices were issued within 0 to 4 years of publication, and the largest share (28%) were issued in the year following publication of the original article. This result is in agreement with Rubbo et al. (2017). One retraction notice was issued in 2016 for an article published in 1994, for the reason “duplicate publication”: at 22 years, it is the longest time taken to retract a publication authored by Indian scientists.

Fig. 1 Retraction time between publication and retraction

Information on the issuer / initiator / requester of each retraction notice was collected and is provided in Table 6. Almost 85% of retraction notices contain information about the initiator of the retraction and the remaining 15% do not. The latter share is very low compared to the 53% reported in another study (Vuong 2020). More than 60% of retractions involved editor(s) and only 20% of retractions were initiated by author(s), which is very low compared to a study on retracted articles in biomedicine during 1997–2009 (Budd et al. 2011). Only a small share of retractions were issued by publishers. The finding that the majority of retraction notices were initiated by editors is in agreement with Rubbo et al. (2017) and contradicts Moylan & Kowalczuk (2016), where the majority of retractions were initiated by authors.

Table 6 Sources of retraction

Based on the retraction notices, the reasons for retraction have been classified into nine categories (Elango et al. 2019) and are shown in Table 7. Out of the 508 retraction notices, the reason could not be found in 38 (7.5%), either because no reason was given (example statement: “This article has been retracted”) or because the article was withdrawn (example statement: “This article has been withdrawn at the request of the author(s) and/or editor”). Nearly two-thirds of retraction notices were issued due to plagiarism (34.45%) and fake data (29.92%), including image manipulation. Similar results have been observed in the field of obstetrics and gynaecology (Chambers et al. 2019). These two categories of reasons are prevalent among the retracted Indian biomedical literature (Fig. 2) and both constitute research misconduct (according to the US Office of Research Integrity). Earlier, Fang et al. (2012) found that plagiarism and duplicate publication were the leading causes of retractions from India. Now, fake data has replaced duplicate publication. Only a small share of retraction notices were issued due to errors (4.72%), which is very low compared to retracted publications in the biomedical literature (Wang et al. 2019). Nearly 11% of retraction notices were issued due to duplicate publication, which is almost 50% lower than for retractions in cancer research (Bozzo et al. 2017) and in nursing and midwifery research (Al-Ghareeb et al. 2018). Nearly one-third of retraction notices were due to plagiarism (including self-plagiarism), which is very high compared to retractions of genetics articles (Dal-Ré & Ayuso 2019), retractions in dentistry (Nogueira et al. 2017) and retractions from Korean medical journals (Huh et al. 2016).
To curb plagiarism among the academic and research community, the University Grants Commission (UGC), the higher education regulation agency in India, has adopted its first regulation on academic plagiarism, with four levels of punishment. Most of the retractions were due to misconduct. This may be attributed to a lack of awareness of medical research ethics (Kulkarni et al. 2015) and a lack of legal restrictions; in addition, some scientists may be more prone to misconduct because of their desire to publish more and in higher impact journals (Parvatam 2019). The increasing trend of research misconduct is mainly due to the publish-or-perish situation in India, which is driven not only by promotion or incentives for individual faculty but also by the need of institutions to perform in rankings such as the National Institutional Ranking Framework at the national level and the Times Higher Education World University Rankings, QS World University Rankings, and so on at the global level. To create awareness of publication ethics and research misconduct, the UGC has introduced a two-credit course, “publication ethics and misconduct”, which is mandatory for all PhD students.

Table 7 Reasons for retraction

Fig. 2 Year vs. misconduct

The average time from publication to retraction was 2.86 years; the longest time was taken for publications with fake data and the shortest for authorship disputes.

Table 8 provides information about the sources of retraction versus the type of misconduct: plagiarism and fake data. Editor(s) initiated the retractions in most cases, and authors were involved in a smaller number of cases. Most author-initiated retractions concerned fake data.

Table 8 Sources vs. misconduct

Fig. 3 shows that plagiarism is prevalent in articles with fewer authors, while data fabrication is prevalent in articles with more authors.

Fig. 3 Number of authors vs. misconduct

Fig. 4 shows that data fabrication is prevalent in the externally funded retracted articles, whereas plagiarism is prevalent in the non-funded ones.

Fig. 4 Funding type vs. misconduct

Just under fifty percent of retraction notices due to plagiarism were issued by non-impact factor journals, whereas the figure is 7% for fake data. Fig. 5 shows that data fabrication / falsification is prevalent in first and second quartile journals, whereas plagiarism is prevalent in fourth quartile and non-impact factor journals. There is no difference in third quartile journals.

Fig. 5 Journal quartile vs. misconduct

Conclusion

The present study on retracted publications in the biomedical literature authored by Indian scientists offers some useful insights into a topic of growing interest: retraction. Even though the number of retracted articles is very low compared to the volume of biomedical literature published by Indian scientists, the number of retracted articles per 10,000 publications is high. Most of the retracted Indian articles were published after 2010. Notably, 10% of retracted articles were published in the top two journals: Plos One and the Journal of Biological Chemistry. The majority of retractions were due to scientific / research misconduct in the form of plagiarism and fake data. Plagiarism is prevalent in low quality journals, whereas fake data is prevalent in high quality journals. Author productivity suggests that repeat offenders are uncommon among the Indian biomedical literature. Most of the retracted articles were written with co-authors, and more than 50% of retracted articles were collaborations within a single institution. The majority of retractions were initiated by editor(s). Alarmingly, more than 34% of retracted articles were funded by external funding agencies, and the majority of the funded research had the issue of data fabrication: at the least, publicly funded research should be free from any kind of misconduct. It is strongly suggested that funding agencies consult indexing databases or retraction databases such as www.retractiondatabase.org when considering project proposals.

There are some limitations in this study. First, individual authors and institutions have not been discussed, although such an analysis might be useful for policy decision makers. Second, citation analysis of retracted articles has not been done, although it might also be useful for researchers given the growing concern about retractions.