Here’s an interesting paper (may require login) from the Journal of the Royal Society of Medicine. From the abstract:
Design 607 peer reviewers at the BMJ were randomized to two intervention groups receiving different types of training (face-to-face training or a self-taught package) and a control group. Each reviewer was sent the same three test papers over the study period, each of which had nine major and five minor methodological errors inserted.
Results The number of major errors detected varied over the three papers. The interventions had small effects. At baseline (Paper 1) reviewers found an average of 2.58 of the nine major errors, with no notable difference between the groups. The mean number of errors reported was similar for the second and third papers, 2.71 and 3.0, respectively. Biased randomization was the error detected most frequently in all three papers, with over 60% of the reviewers who rejected the papers identifying this error. Reviewers who did not reject the papers found fewer errors, and the proportion finding biased randomization was less than 40% for each paper.
The thing is, I am having a relatively difficult time convincing myself that the comparison they made is the interesting one. When reviewing a paper, are we really ever looking for all of the errors in the piece, or just enough to determine whether to accept or reject the article? So, how interesting is the difference in the “number of errors found” among those who rejected the paper? To me, not very. This doesn’t undermine their conclusion:
Conclusions Editors should not assume that reviewers will detect most major errors, particularly those concerned with the context of study. Short training packages have only a slight impact on improving error detection.
My question: do you find the study’s question interesting, or would you have sliced the data a different way?
(HT: Michelle Poulin)