Many analysts reveal irreplicability of results

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Many analysts reveal irreplicability of results

Post by Bird on a Fire » Wed Jul 29, 2020 8:02 pm

In another worrying development for the integrity of published scientific results, a project in which 70 teams of neuroscientists analysed the same fMRI dataset found that their results were wildly inconsistent.

https://medium.com/the-spike/seventy-te ... d96c23dbf4

The implications probably stretch beyond analysis of fMRI:
The NARPS paper ends with the warning that “although the present investigation was limited to the analysis of a single fMRI dataset, it seems highly likely that similar variability will be present for other fields of research in which the data are high-dimensional and the analysis workflows are complex and varied”.

The Rest of Science: “Do they mean us?”

Yes, they mean you. These crises should give any of us working on data from complex pipelines pause for serious thought. There is nothing unique to fMRI about the issues they raise.
FWIW I recently teamed up with a friend and contributed an analysis to a similar project for ecology & evolutionary biology. I'm interested (if a bit anxious!) to see what the results are. On the one hand, analytical pipelines are generally less complex, though this is rapidly changing with the advent of high-dimensional data from e.g. biologgers (not to mention the -omics cans of worms). But on the other, field data is absolutely rife with confounding effects, many of which are difficult to measure, and it often seems to me a bit of a personal choice whether or not certain things get included in the analysis.
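To make the "personal choice" point concrete, here's a purely made-up toy in Python (invented variables, invented effect sizes, nothing to do with my actual project): the same simulated dataset, two defensible model specifications, two very different answers about the effect of interest.

```python
import numpy as np

# Toy illustration of "analyst degrees of freedom". The variables, effect
# sizes and the confounder are all invented for this sketch.
rng = np.random.default_rng(42)
n = 200

habitat_quality = rng.normal(size=n)                       # the confounder
predator_density = 0.7 * habitat_quality + rng.normal(scale=0.5, size=n)
bird_abundance = (1.0 * habitat_quality
                  - 0.1 * predator_density
                  + rng.normal(size=n))                    # true predator effect: -0.1

def ols_coefs(y, predictors):
    """Ordinary least-squares coefficients (intercept first)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Analyst A models abundance on predator density alone.
beta_a = ols_coefs(bird_abundance, [predator_density])
# Analyst B also adjusts for habitat quality.
beta_b = ols_coefs(bird_abundance, [predator_density, habitat_quality])

print(f"Analyst A's predator effect: {beta_a[1]:+.2f}")    # comes out strongly positive
print(f"Analyst B's predator effect: {beta_b[1]:+.2f}")    # close to the true -0.1
```

Neither analyst has done anything obviously wrong; they've just made different, individually defensible choices about what belongs in the model.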

I think this kind of thing has interesting implications for open data and data publication - if 70 analysts can get 70 different results from the same dataset, the importance of enabling independent verification of results is clear.
We have the right to a clean, healthy, sustainable environment.

secret squirrel
Snowbonk
Posts: 551
Joined: Wed Nov 13, 2019 12:42 pm

Re: Many analysts reveal irreplicability of results

Post by secret squirrel » Thu Jul 30, 2020 9:04 am

The fundamental problem is that the systems involved are tremendously complicated, and unlike, say, high energy physics, we don't have hugely expensive machines running large numbers of experiments to get 'reasonable' data. So a) the statistical tools we have are too wimpy to answer the kind of questions we want to ask with the data we have available, and b) many scientists don't use them properly anyway.
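A toy simulation of point (a), nothing to do with the actual fMRI data and with every number invented: a small but real effect, noisy measurements and a modest sample size, and the answer you get flips around from study to study.

```python
import numpy as np
from scipy import stats

# Toy simulation: a small but real effect, noisy measurements, a modest
# sample size. Every number here is invented for illustration.
rng = np.random.default_rng(0)
true_effect, sigma, n_per_group, n_studies = 0.2, 1.0, 30, 1000

n_significant, estimates = 0, []
for _ in range(n_studies):
    control = rng.normal(0.0, sigma, n_per_group)
    treated = rng.normal(true_effect, sigma, n_per_group)
    _, p = stats.ttest_ind(treated, control)
    estimates.append(treated.mean() - control.mean())
    n_significant += p < 0.05

print(f"{n_significant / n_studies:.0%} of simulated studies reach p < 0.05")
print(f"estimated effects range from {min(estimates):+.2f} to {max(estimates):+.2f}"
      f" (true effect {true_effect:+.2f})")
```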

Aitch
Snowbonk
Posts: 545
Joined: Tue Dec 03, 2019 9:53 am
Location: St Aines

Re: Many analysts reveal irreplicability of results

Post by Aitch » Thu Jul 30, 2020 9:09 am

Bird on a Fire wrote:
Wed Jul 29, 2020 8:02 pm
...

I think this kind of thing has interesting implications for open data and data publication - if 70 analysts can get 70 different results from the same dataset, the importance of enabling independent verification of results is clear.
I'm probably missing something here, but if 70 scientists can get 70 different results, what guarantee is there that independent verification is going to come up with something any more 'accurate' or 'right' or whatever you want to call it?
Some people call me strange.
I prefer unconventional.
But I'm willing to compromise and accept eccentric.

shpalman
Princess POW
Posts: 8241
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Many analysts reveal irreplicability of results

Post by shpalman » Thu Jul 30, 2020 9:11 am

Aitch wrote:
Thu Jul 30, 2020 9:09 am
Bird on a Fire wrote:
Wed Jul 29, 2020 8:02 pm
...

I think this kind of thing has interesting implications for open data and data publication - if 70 analysts can get 70 different results from the same dataset, the importance of enabling independent verification of results is clear.
I'm probably missing something here, but if 70 scientists can get 70 different results, what guarantee is there that independent verification is going to come up with something any more 'accurate' or 'right' or whatever you want to call it?
Independent verification should be able to point out that the dataset in this case is just noise, so the result the first analyst got when they wrote the paper is meaningless.

Alternatively, what actually happens in the Nature paper is that one of the hypotheses stands out as being confirmed.
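Roughly the aggregation idea, as a sketch only - the per-team "decisions" here are randomly generated placeholders, not the figures reported in the paper:

```python
import numpy as np

# Sketch of the aggregation idea only: the per-team "decisions" below are
# randomly generated placeholders, NOT the results reported in the paper.
rng = np.random.default_rng(1)
n_teams = 70
hypotheses = [f"H{i + 1}" for i in range(9)]

# Give each hypothesis some underlying rate at which a team would call it
# significant, and make one of them (H5, arbitrarily) a clear consensus case.
call_rates = rng.uniform(0.05, 0.35, size=len(hypotheses))
call_rates[4] = 0.85
decisions = rng.random((n_teams, len(hypotheses))) < call_rates   # True = "effect reported"

for name, frac in zip(hypotheses, decisions.mean(axis=0)):
    flag = "  <- stands out" if frac > 0.7 else ""
    print(f"{name}: {frac:.0%} of teams report an effect{flag}")
```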
having that swing is a necessary but not sufficient condition for it meaning a thing
@shpalman@mastodon.me.uk

AMS
Snowbonk
Posts: 466
Joined: Mon Nov 11, 2019 11:14 pm

Re: Many analysts reveal irreplicability of results

Post by AMS » Thu Jul 30, 2020 11:49 am

Crikey.

Verification doesn't just mean "does a different team get the same result" though. It also encompasses making predictions from the data and designing (well-controlled) experiments to test those predictions. Not always easy, especially in neuroscience, but in my field, we use a lot of gene expression data ("x is upregulated in cancer" type stuff) and it's definitely a lot of work to follow up on the Big Data, which can be generated much faster than is useful sometimes!
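As a toy example of why the follow-up matters (simulated numbers, no real genes or cohorts, nothing from our actual data): screen enough genes in one cohort and something will look differentially expressed by chance; re-test that one gene in an independent cohort and the signal usually disappears.

```python
import numpy as np
from scipy import stats

# Toy illustration only: both "cohorts" are simulated with NO true
# tumour/normal difference for any gene, just to show why follow-up matters.
rng = np.random.default_rng(7)

def simulated_cohort(n_tumour=20, n_normal=20):
    """Log-expression of one gene in tumour vs normal samples (no real effect)."""
    return rng.normal(5.0, 1.0, n_tumour), rng.normal(5.0, 1.0, n_normal)

# "Discovery": screen 2000 genes in one cohort and keep the most striking one.
p_discovery = [stats.ttest_ind(*simulated_cohort())[1] for _ in range(2000)]
best = int(np.argmin(p_discovery))
print(f"discovery cohort: gene #{best} looks differentially expressed, p = {p_discovery[best]:.4f}")

# "Validation": re-test that single gene in an independent cohort.
p_validation = stats.ttest_ind(*simulated_cohort())[1]
print(f"independent cohort, same gene: p = {p_validation:.3f}")
```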

kerrya1
Clardic Fug
Posts: 184
Joined: Tue Nov 12, 2019 11:13 am

Re: Many analysts reveal irreplicability of results

Post by kerrya1 » Wed Sep 02, 2020 2:58 pm

I think there are two ways of looking at this:

1) Verifying a published paper and the conclusions the authors reached - To achieve this you need a full set of supporting documentation and metadata, not just the dataset but also detailed methodologies for data creation, processing, and analysis, copies of algorithms used or software written, specific details of the hardware and software used, exact lists of reagents including manufacturers' details, etc, etc (see the sketch after this list). Only when you put all the pieces of the puzzle together is there any chance of verifying the published output.

2) Replicating the results to prove that the conclusions are "real" and not an artifact of the analysis - this requires the complete dataset plus enough documentation and metadata to understand what the original researchers did, but the expectation is that a replicating team would then apply their own methodologies to the data to see if they can achieve the same results.
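For (1), something like this is the kind of machine-readable record I mean - a minimal sketch in Python, with all the file names, fields and parameters invented as placeholders:

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical provenance record to publish alongside a dataset and analysis.
# All file names, field names and parameters below are invented placeholders.
def sha256_of(path):
    p = Path(path)
    return hashlib.sha256(p.read_bytes()).hexdigest() if p.exists() else "FILE NOT FOUND"

record = {
    "created": datetime.now(timezone.utc).isoformat(),
    "python": sys.version,
    "platform": platform.platform(),
    "dataset_sha256": sha256_of("field_survey_2020.csv"),    # placeholder file name
    "analysis_script_sha256": sha256_of("fit_models.py"),    # placeholder file name
    "analysis_parameters": {"model": "glmm", "random_effect": "site"},
}

with open("provenance.json", "w") as f:
    json.dump(record, f, indent=2)
print(json.dumps(record, indent=2))
```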

Of course there is little support for, or glamour in, doing either verification or replication studies, so unless there is clear evidence of mistakes or malpractice the original results are often just accepted.
