How to combine p-values to avoid a sentence of life in prison

I find the use of statistics in the justice system a thrilling subject, especially when you find out that people like Lucia de Berk have been handed life sentences based solely on flawed statistics coming from experts like Mr. Henk Elffers. So in this post I’ll talk about what he did wrong and how to avoid this kind of huge boo-boo in our statistical lives.

Lucia reads post, photo by Carole Edrich (Photo credit: Wikipedia)

The use of statistics in the justice system actually has a long history: the amazing mathematician / engineer / physicist / philosopher of science Henri Poincaré already had to correct the misuse of statistics in the infamous Dreyfus trial.

But it was in the Lucia de Berk trial that wrongly combining p-values earned her a life sentence. I won’t go into the details of the trial; for that there are many other places, like Mr. Richard D. Gill’s web page account of the trial and a video that is worth a look. Instead I will focus on how to appropriately deal with a bunch of p-values to make sense of our data.
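To make the idea concrete, here is a minimal sketch of one standard way to combine p-values, Fisher's method, next to the naive shortcut of multiplying them and reading the product as a p-value. This is only an illustration under stated assumptions: the p-values are taken to be independent, the input numbers are made up and are not figures from the trial, and the function names `naive_product` and `fisher_combine` are my own.

```python
import math

def naive_product(pvalues):
    """Multiply the p-values together (NOT a valid combined p-value)."""
    return math.prod(pvalues)

def fisher_combine(pvalues):
    """Fisher's method for k independent p-values.

    Under the null, X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom. For an even number of degrees of freedom the
    upper tail has a closed form:
        P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is x / 2
    term, total = 1.0, 1.0  # i = 0 term of the series
    for i in range(1, k):
        term *= half_stat / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total

# Hypothetical p-values, for illustration only.
example = [0.05, 0.05, 0.05]
print("naive product:", naive_product(example))
print("Fisher's method:", fisher_combine(example))
```

With a single p-value, Fisher's method returns that p-value unchanged, while with several it gives a combined value that is larger than the raw product. The raw product always shrinks as more p-values are multiplied in, even when none of them is remarkable, which is why it overstates the evidence.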

15 to 42 percent of medical research findings are false positives (Yet Another Calculation)

A while ago I found a very interesting paper from Leah R. Jager and Jeffrey T. Leek, via a post in the Simply Statistics blog, arguing that most published medical research is true, with a rate of false positives among reported results of 14% ± 1%. Their paper came as a response to an essay from John P. A. Ioannidis and several other authors claiming that most published research findings are false.

After dealing with some criticisms, Mr. Leek made a good point in his post:

“I also hope that by introducing a new estimator of the science-wise fdr we inspire more methodological development and that philosophical criticisms won’t prevent people from looking at the data in new ways.”

And thus, following this advice, I didn’t let criticisms prevent me from looking at the data in a new way. So for this problem I have devised a probability distribution for p-values, which I then fit to the data via MLE to infer the rate of false positives.
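As a rough illustration of this kind of approach, here is a minimal sketch that fits a p-value distribution by maximum likelihood and reads a false positive rate off the fit. It is not the distribution devised in the full post: as a stand-in I assume a simple two-component mixture, Uniform(0, 1) for true nulls and Beta(a, 1) for real effects, fitted with EM on simulated data. The names `fit_mixture` and `false_positive_rate`, and every number below, are hypothetical.

```python
import math
import random

def fit_mixture(pvalues, iters=300):
    """EM fit of f(p) = pi0 + (1 - pi0) * a * p**(a - 1) on (0, 1].

    pi0 is the share of true nulls (uniform p-values); a < 1 pushes the
    alternative component toward small p-values. This mixture is an
    assumption made for illustration only.
    """
    pi0, a = 0.5, 0.5  # arbitrary starting point
    n = len(pvalues)
    for _ in range(iters):
        # E-step: probability that each p-value came from a true null
        w = [pi0 / (pi0 + (1 - pi0) * a * p ** (a - 1)) for p in pvalues]
        # M-step: closed-form updates for both parameters
        pi0 = sum(w) / n
        alt_weight = sum(1 - wi for wi in w)
        a = -alt_weight / sum((1 - wi) * math.log(p) for wi, p in zip(w, pvalues))
    return pi0, a

def false_positive_rate(pi0, a, alpha=0.05):
    """Share of results with p < alpha that are true nulls under the fit."""
    null_mass = pi0 * alpha              # uniform CDF at alpha
    alt_mass = (1 - pi0) * alpha ** a    # Beta(a, 1) CDF at alpha
    return null_mass / (null_mass + alt_mass)

# Simulated data: 30% true nulls, 70% real effects with a = 0.2.
rng = random.Random(0)
data = []
for _ in range(5000):
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    data.append(u if rng.random() < 0.3 else u ** (1 / 0.2))

pi0_hat, a_hat = fit_mixture(data)
print("estimated pi0:", pi0_hat, "estimated a:", a_hat)
print("false positive rate at 0.05:", false_positive_rate(pi0_hat, a_hat))
```

The sketch ignores the fact that published p-values are usually only reported when they fall below a significance threshold, so a real analysis would need a truncated version of the likelihood.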

Figure: p-values PDF and CDF

So this is my take: a 15.33% rate of false positives, with a worst-case scenario of 41.75% depending on how mischievous researchers are. In any case, and contrary to what other authors claim, most medical research seems to be true.
