Why Should You Trust Research Published in Psychological Science?
Scientific institutions should work to earn trust from the scientific community and from the public. With prominent threats to trust in science around the world, it is especially important to make clear why scientific institutions are worthy of trust. Our goal at Psychological Science is to be an outlet that readers, authors, and reviewers value and trust. We are very lucky that authors submit excellent work to us, that reviewers agree to donate their time, and that readers share, build on, and help correct the work that we publish. In this editorial, I outline our recent efforts to continue to earn that trust.
In my noneditor roles, I often argue that research should not be evaluated on the basis of the name of the journal in which it is published. Yet in my role as Editor in Chief of Psychological Science, I do strive to make the journal’s name a valid heuristic for judging the articles within it. This is sometimes uncomfortable. At best, journal name can be a noisy proxy for the quality of the individual articles. Certainly, whenever we have the time and expertise to evaluate an article on its own merits, that is clearly preferable. However, journal name should be a reasonable heuristic for evaluators to rely on when they lack the time or expertise to evaluate individual articles. After all, one point of having different journals is that the journal in which an article is published tells you something about the qualities of the article.
One of my jobs as Editor in Chief of Psychological Science is to think about what the journal’s name should signal and how we can make that signal valid. There are two main qualities I hope publication in Psychological Science signals: that the research is interesting and important and that it is trustworthy.
We don’t want you (readers, authors, reviewers) to just take our word for that. Too often, journals’ reputations are unearned, resting on flawed metrics such as impact factors or simply on the inertia of prestige. But journal prestige can and should be earned. There are many things journals can do to give the community concrete, verifiable indicators of their priorities. We ask our authors to be transparent and accountable, so we should be, too. The goal of this editorial is to make the case for your trust. We hope that by demonstrating our commitment to getting it right, and backing that up with concrete actions that you can see and judge, we will make you more likely to submit your best work to our journal, give us your time and expertise as reviewers, and be excited to read the articles we publish.
What Can We Do to Earn Your Trust?
First, we can invest more in vetting the articles we publish to help authors present the most accurate, well-calibrated, and clear version of their research. We will make mistakes. Furthermore, we want to make some mistakes, or at least the right kinds of mistakes. We don’t want to play it so safe that we never publish uncertain findings. But we want to minimize preventable errors, and we want to accurately portray the credibility of findings that are at high risk of being wrong. Findings that are provocative and stimulating, but far from definitive, should be framed as such. Framing them this way minimizes the harm to public trust in science, and to trust in our journal, if those findings turn out to be wrong. It is also the responsible thing for scientists to do (Campbell, 1988).
Second, we can invest in postpublication critique and correction so that when we inevitably publish some things that are wrong or miscalibrated, we correct ourselves. And if our errors are excessive, we reconsider the policies and practices that led to those systemic problems. For both of these processes, we can give our community information about, and evidence of, our quality-control processes so that you can evaluate for yourselves whether we’ve earned your trust.
Vetting articles before we publish them
The first challenge that a manuscript faces when submitted to Psychological Science is that it must pass through not one, not two, but three editors before going out for external review. Most submissions (about 75%) are rejected without external review. For most of these desk rejections, at least two editors will have weighed in on the decision. At the desk-rejection stage, transparency is not typically a major factor. Instead, we are evaluating whether the research rigorously addresses an important question and whether the conclusions are supported by the evidence. The manuscripts that make it past this stage will have passed the close scrutiny of three editors (me, a senior editor, and an associate editor). I’m certainly biased, but I think our team of senior and associate editors is incredibly sharp, and just passing through this filter is something to be proud of.
As we evaluate manuscripts, the senior editors and I each write up our comments and pass them on so that an associate editor sending a manuscript out for review already has the benefit of a few extra pairs of eyes. The external reviewers then provide crucial, specialized evaluations. In addition, while a manuscript is out for external review, the Statistics, Transparency, and Rigor (STAR) team conducts a “light transparency check” to ensure that it meets our transparency requirements or to flag it for consideration for an exemption (e.g., when sensitive data cannot be shared ethically).
Of the manuscripts sent out for external review, a revision is invited for about 40% (representing about 10% of all submissions). A manuscript that makes it to this stage has a good chance of ultimately being accepted—we accept about 80% of all manuscripts for which a revision is invited. Thus, when a manuscript reaches this stage, we suggest to authors that they take steps to make the path to acceptance smoother. Specifically, we remind authors that if their manuscript is conditionally accepted, it will undergo an “in-depth transparency check” in which the STAR editors check the transparency requirements more closely. For quantitative research, this includes checking the computational reproducibility of the results: Can our STAR editors reproduce the results in the manuscript using the authors’ own data and analysis scripts? Authors can save themselves a lot of time by having someone outside of the author team independently reproduce their results before it gets to this stage.
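The rates quoted so far can be checked against one another with a little arithmetic. The following is only an illustrative sketch: the figures are the rounded "about" percentages given in this editorial, and the function and variable names are hypothetical, chosen for this example rather than taken from any journal system.

```python
# Approximate rates quoted in this editorial; each is a rounded "about" figure.
# All names below are hypothetical and used for illustration only.
DESK_REJECT_RATE = 0.75       # share of submissions rejected without external review
REVISION_INVITE_RATE = 0.40   # share of externally reviewed manuscripts invited to revise
ACCEPT_GIVEN_REVISION = 0.80  # share of revision-invited manuscripts eventually accepted


def funnel(desk_reject, revision_invite, accept_given_revision):
    """Return each stage of the review funnel as a fraction of all submissions."""
    sent_for_review = 1.0 - desk_reject
    revision_invited = sent_for_review * revision_invite
    accepted = revision_invited * accept_given_revision
    return {
        "sent_for_review": sent_for_review,
        "revision_invited": revision_invited,
        "accepted": accepted,
    }


stages = funnel(DESK_REJECT_RATE, REVISION_INVITE_RATE, ACCEPT_GIVEN_REVISION)
for name, share in stages.items():
    print(f"{name}: {share:.0%} of all submissions")
```

Under these rounded inputs, about 25% of submissions go out for external review, about 10% receive an invitation to revise, and about 8% are ultimately accepted.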
One of the advantages of being a fairly selective journal (we accept about 8% of all submissions) and having a dedicated team of handling editors, STAR editors, and reviewers is that we can invest a lot in the relatively few manuscripts that we select to publish. After a year at the helm, I am convinced that this escalating approach to peer review (investing more and more scrutiny the closer a manuscript gets to acceptance) is the right approach and one that should be adopted more widely, especially at prestigious and very selective journals. Any journal that accepts a small fraction of submissions and that benefits from the prestige that comes with that selectivity should be expected to vet those submissions thoroughly and to give readers detailed information about what has been vetted and what hasn’t. To that end, with each article, we publish a “Research Transparency Statement” that gives readers crucial information about what is available.
Our commitment to our authors and readers is that we aim to make sure that every article we publish has been checked not only by external reviewers with domain expertise but also by our STAR team for transparency (i.e., data, code, and materials should be shared unless we judge that an exemption is warranted) and computational reproducibility of quantitative results. The STAR editors are an incredibly talented and dedicated group who apply their highly specialized (and highly marketable) skills for very little reward. We are lucky to have them, and many authors have remarked on how much they have learned from the STAR editors’ feedback.
Of course, transparency is only the beginning. What’s just as important is that the articles we publish make careful, calibrated claims that reflect the evidence. Indeed, the vast majority of editorial decisions come down to qualities of the manuscript other than transparency. Although the transparency checks are the focus of our STAR editors, the issues that occupy most of our senior and associate editors’ attention relate to research design, operationalization, interpretation, generalizability, theory, and inference. Just as the STAR editors were selected for their expertise in evaluating transparency, integrity, robustness, and reproducibility, the senior and associate editors were selected because of their expertise in these fundamentals of scientific rigor.
Evaluating everything from importance to transparency to validity is a tall order—it is a lot to put on a single handling editor, even with the help of reviewers and STAR editors. For that reason, our team of editors has an online workspace in which we can ask each other for help and take advantage of the breadth of expertise across the team. In addition to ad hoc discussions of tricky issues, we also create a channel for each manuscript that reaches conditional acceptance. In that channel, the handling editor, the STAR editor, Tom Hardwicke (senior STAR editor), and I can discuss the manuscript and decision. I’ve now observed dozens of these discussions, and what I see is thoughtful, constructive consideration of the inevitable tradeoffs that are present in any research.
Our goal is not to publish only perfect research (our journal would be empty if we aimed for that). We aim to stress test the manuscripts as much as we reasonably can to catch as many errors as possible, evaluate whether the limitations are well justified or can be addressed, and ensure that the article we publish presents a clear, accurate, and well-calibrated report of the research.
This is a lot of work—for us, for our reviewers, and perhaps most of all, for our authors who make it to this stage. The authors whose work appears in the first few issues of 2025 have been especially patient and generous with us as we work out the kinks in our systems. Thanks to their feedback and the lessons we’ve learned from those early cases, we’re making major improvements. Because this process has been more arduous than we expected, the first few issues of 2025 are shorter than usual. This doesn’t reflect a lower acceptance rate but simply a longer lag to publication for early manuscripts handled by our team, mostly because of the checks we conduct after conditional acceptance. We are working to reduce that lag.
The concrete challenges we’re working on include (a) how to ensure that preregistrations are checked while also ensuring that studies that are not preregistered are scrutinized at least as much as, if not more than, those that are; (b) how to make the computational reproducibility checks more efficient, particularly when the research we publish uses a broad range of statistical methods and software; (c) how to evaluate exemptions to our transparency requirements in a way that is fair and consistent; and (d) how to encourage submissions of, and ensure high-quality peer review for, research beyond traditional quantitative, hypothesis-testing research (e.g., qualitative or mixed-methods research, descriptive research, explicitly exploratory research). You can expect to see our policies and practices around these issues evolve over the next few years.
Of course, those are the challenges that are easy to articulate. The perennial challenges with peer review are the questions that require more critical thinking and subjective judgment: How valid are the authors’ conclusions? Does the research design test the question? Does the evidence support the claims? Are there alternative explanations or threats to the validity of the authors’ inferences? There is no checklist or badge for these qualities, but they are the most important qualities to evaluate in peer review.
Although our systematic and thorough transparency checks set us apart from many other journals, an equally important reason you should trust us is that we have a team of editors and a peer-review process that focus intently on these more qualitative judgments. And, of course, the two go hand in hand: our transparency standards make it easier for our editors and reviewers to evaluate these qualities. If a reviewer can see the materials, data, and code, they are more likely to catch a confound or a design flaw.
But, again, you shouldn’t just take our word for it. We want readers to judge for themselves whether we’re vetting manuscripts carefully and appropriately during peer review. To that end, we now publish the peer-review history (reviews and decision letters but not reviewers’ names unless they choose to sign their reviews) with every article we publish. This way, readers can see what kinds of issues we focus on during peer review. We don’t expect readers to just blindly trust our peer-review process—we show our work. We are proud of the excellent work that our reviewers and editors put into peer review, and we think showing readers that work will only earn us, and the articles we publish, more trust. And it’ll increase the chances that any errors or biases in our peer-review process can be identified and fixed.
Vetting articles after we publish them
Even with excellent editors and reviewers, and with checks in place, we will make mistakes. We can’t catch every error, and some things that were considered best practice at the time of publication will turn out to be flawed.
Moreover, although we have principled reasons for choosing our policies and practices, there is shockingly little empirical evidence regarding the effectiveness of different peer-review policies and practices. There is a famous saying that if the system of peer review were itself subject to peer review, it would not pass (Wager, 1999). The scientific community’s belief in peer review is based more on faith than on evidence (Smith, 2006). The growing field of metaresearch is contributing empirical evidence to our understanding of peer review, and one day we may be able to make decisions about journal policies and practices that are informed by high-quality evidence (although of course values should also continue to play an important role). In the meantime, journals should be humble about whether their peer-review processes are achieving their goals and whether they might have unintended side effects.
One way to gauge whether we’re getting it right is to conduct audits of our published work. The Reproducibility Project: Psychology (Open Science Collaboration, 2015) was one example of an audit on a single dimension (replicability). The results of that audit suggested that Psychological Science had a lot of room for improvement on the dimension of replicability, and my predecessors took important steps to bolster our confidence in the replicability of published results (Bauer, 2022, 2024; Eich, 2014; Lindsay, 2015, 2017, 2019). It would be great to know how the journal would fare if such an audit were carried out on articles published this year.
More importantly, how would we fare if our articles were audited on a much broader range of dimensions? Replicability is just one metric by which journals can be assessed, and it’s arguably a very low bar (Vazire et al., 2022). What would happen if a rigorous, systematic audit were carried out evaluating the construct validity of the research we publish? Or the validity of causal inferences? Or the representation of a broad range of participant populations? In other words, we should want to know whether we are publishing the quality and breadth of research that we boast about, whether there are common errors or flaws across the articles we publish, and whether there is evidence of bias or blind spots in the topics, populations, methods, or approaches that we publish.
Ideally, we would invest in this metaresearch ourselves. Unfortunately, we don’t have the capacity to do everything that should be done. Thus, we rely on generous, community-spirited metaresearchers who conduct such audits as a service to the field and to the journals they study. The Institute for Replication (i4replication.org) is one group organizing grassroots audits of behavioral-science journals, including Psychological Science. Their focus is mainly on computational reproducibility. We are lucky to be included in their project, and we’re looking forward to seeing the results of their investigations.
We welcome other audits of our published articles, including those focusing on dimensions beyond replicability and reproducibility. Researchers who conduct rigorous audits of articles published in Psychological Science are encouraged to contact me about submitting their manuscripts to our journal, either as registered reports or regular research articles. If exceptions to our submission guidelines (e.g., word limits) are necessary to accommodate your research, please don’t hesitate to get in touch. We don’t typically entertain requests to deviate from our submission guidelines, but publishing high-quality, independent audits of how we are doing as a journal is the best reason I can think of for breaking the rules.
In addition, we always welcome commentaries, including critiques of previously published Psychological Science articles. However, this path to critiquing and correcting the scientific record does not seem especially popular. In 2024, we received only five commentary submissions. In my experience editing journals, this is not unusual. Although it’s a common trope that science is self-correcting, the effectiveness of existing self-correcting mechanisms, including the commentary article type, is dubious (Hardwicke et al., 2021; Vazire & Holcombe, 2022; Whamond et al., 2024). Nevertheless, those rare scientists who provide rigorous criticism and important corrections to the published literature are doing a tremendous service both to the field and to the journal. We would love to find ways to make those contributions more common, and more impactful, and we welcome your suggestions.
In the meantime, perhaps the best way to live up to the promise (and public expectation) that science is self-correcting is to change how we think of peer review. First, we should strengthen peer review by making it closer to what people expect it to be. With the technology and resources available, especially to prestigious, selective journals, there is no reason not to perform thorough checks and ensure that what we publish is accurate and well calibrated to the best of our ability. Second, we should be humble about what peer review cannot do. It is just the beginning of the vetting process, and vetting and correction should not stop just because research is published in a journal. We should be honest that research published in even the “best” journals is still subject to correction, and we should welcome continued scrutiny. Journals that invest in and facilitate both pre- and postpublication quality checks, error detection, and correction are the ones that deserve the most trust.
As a small step in that direction, we selected our team of STAR editors in part by reaching out to researchers who had conducted rigorous metaresearch, or postpublication critiques, examining the transparency, integrity, statistical validity, and reproducibility of our published articles. Some of our STAR editors’ work directly contributed to policy changes at the journal or to retractions of articles. These are exactly the people we should be grateful for and want on our team. A good maxim: When you find an incisive critic of your work, hire them.
Conclusion
Scientific journals will sometimes publish research that turns out to be wrong. That is not a reason to distrust journals or the peer-review process. But journals do have an obligation to take reasonable steps to vet what they are publishing and to make those efforts visible and verifiable to readers. Journals with the most prestige and influence have the most obligation to do so.
At Psychological Science, we want to make sure we earn, and keep, the trust and prestige that we enjoy in the scientific community. We also want to know when we got it wrong, fix our mistakes, and improve our processes. The steps I’ve outlined above are in the service of these goals. I hope we can continue to earn your trust, and our privileged role in the field, and that we’re giving you good reasons to choose Psychological Science when deciding where to submit, review, and read.
Simine Vazire
Editor in Chief