Wednesday, February 22, 2012

How to Design the Rice Experiment

"The first principle is that you must not fool yourself—and you are the easiest person to fool."
– Richard Feynman

The rice experiment, as popularized by businessman Masaru Emoto, is a good example of how not to design a scientific experiment. I will explain why at the end. First, I will explain how I would design a rice experiment. I am not a scientist, but I try to stay scientifically literate. If anyone has suggestions on how to improve this design, let me know.

To be clear, I am not planning on doing this experiment, as I will explain afterward.

My Hypothetical Rice Experiment:

1: Prepare good words and bad words on opaque adhesive labels. The labels need to be opaque enough so that they cannot be read through the backs of glass jars.

2: For a control, I would prepare labels with no words. I would also prepare labels with neutral words. I would also prepare good, bad, and neutral words in a foreign language that I don't understand. All the labels would be prepared under the same hygienic conditions and cut in identical shapes. This is the heart of the experiment. Let's pause for a moment and ask why this is so important...

A controlled experiment should answer the question, "Compared to what?" If, all other things being equal, some interesting variable changes, what will happen? That's the question. The tricky part is to make all other things be equal.

For a wonderful description of eliminating variables, listen to or read "Cargo Cult Science" by Richard Feynman. Now back to our hypothetical experiment...

3: Have all the words recorded separately in a ledger. This will help keep me honest.

4: Sterilize dozens of jars, seals, and lids. This will zero out the bacteria count. Let them dry.

5: Have someone other than me apply the labels to the jars. This is the start of a double-blind.

6: Have that person cover the labels with identical strips of lightly adhesive opaque paper. At the end of the experiment, these covers will be removed. This will double-blind the experiment.

7: Have a third person rearrange the jars before delivering them to me. This will randomize the experiment and complete the double-blind.

8: The ledger from step 3 records what words are used, though I don't know which ones are on which jars. The words should be categorized at the outset: good, bad, neutral, or blank. Words should also be categorized as English or foreign. (The foreign words might well be kept secret from me, just in case.) The word categories have to be established at the outset to prevent fudging afterward.

9: Set the labeled empty jars in a relatively non-hygienic place so they can attain similar levels of light contamination. Totally sterilized jars may preserve rice indefinitely.

10: Cook some rice and put the same amount in each jar. A few ounces on the bottom will do. All we want is to be able to look inside the jar without the labels and their covers blocking the view.

11: Set the jars in an array that I can check every day. The jars would be numbered so I can track the progress of each jar individually.

12: See which jars get moldy first. Keep watching as others succumb. I would set a deadline of maybe 60 days.

13: The reveal. After the 60 days, look at the jars and their corresponding labels. Compare with the ledger and mark each jar as good, bad, neutral, or blank. Suppose the results are:
- 12 good English words and 12 good foreign words = all pristine
- 12 neutral English words, 12 neutral foreign words, and 12 blank = all somewhat moldy
- 12 bad English words and 12 bad foreign words = all very moldy

This, or something close, would be an extremely significant result. But I expect the onset of mold will be random and will not track with any good word or bad word labels in any statistically significant way. Mold will slightly favor one category of words over another, just as a matter of statistical noise. The math for this is well worked out. The more jars I use for each category, the smaller this statistical noise becomes. If I do the experiment with 30 jars for each category, I would get very high resolution, low noise results.

14: Submit for peer review. I would explain the process described above. My test would satisfy the basics of what we want from a well-designed experiment: it's double-blinded, randomized, controlled, and uses an OK sample size. Negative results would be expected and not terribly interesting. (Sometimes negative results are groundbreaking, like the Michelson-Morley experiment that set the stage for Einstein's Relativity.) In the rice experiment, a positive result would be extremely surprising. The way this is designed, a positive result would have a rock-solid foundation. Just one more step would be needed:

15: See if anyone reproduces the results under similar experimental conditions. If no one can reproduce my results, there's a good chance I falsified my data or was just plain sloppy.

16: If the results are positive, conduct the experiment for the James Randi Educational Foundation (JREF). If it does show evidence of paranormal activity that can be verified under scientific controls, I will win $1,000,000. And that's a lot of money! I would like to have that prize! But it's been available for decades and no one has won it yet.
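
The bookkeeping behind the ledger, the shuffle, and the reveal can be sketched in a few lines of Python. This is only a rough illustration of the idea, not a lab protocol. The category names, the function names, and the choice of 12 jars per category are my own placeholders. The point is that the key linking jar numbers to word categories is created once, kept away from the observer, and joined to the mold observations only after the deadline.

```python
import random

# Hypothetical category names; the real ledger would be fixed at the outset.
CATEGORIES = [
    "good_english", "good_foreign",
    "neutral_english", "neutral_foreign", "blank",
    "bad_english", "bad_foreign",
]

def assign_jars(jars_per_category=12, seed=None):
    """Return (sealed_key, observer_sheet).

    sealed_key maps jar number -> category and stays with a third party.
    observer_sheet holds only jar numbers, which is all the observer sees.
    """
    rng = random.Random(seed)
    labels = [cat for cat in CATEGORIES for _ in range(jars_per_category)]
    rng.shuffle(labels)  # the third person rearranging the jars
    sealed_key = {jar: cat for jar, cat in enumerate(labels, start=1)}
    observer_sheet = sorted(sealed_key)
    return sealed_key, observer_sheet

def reveal(sealed_key, moldy_day):
    """Group observations by category, to be run only after the deadline.

    moldy_day maps jar number -> day mold first appeared.
    Jars missing from moldy_day are recorded as None (never molded).
    """
    by_category = {cat: [] for cat in CATEGORIES}
    for jar, cat in sealed_key.items():
        by_category[cat].append(moldy_day.get(jar))
    return by_category
```

With the assumed 12 jars in each of the 7 categories, this produces 84 numbered jars, the same total as the hypothetical results listed in step 13.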

How to Backpedal:

Let's say I was invested (monetarily or emotionally) in the results coming out positive—but they came out negative. There are some tricks, fallacies of special pleading, I can play on myself. These might help me to dismiss my own results or fudge them in my favor:

Anomaly hunting: Maybe some seals were red and some were beige. Maybe the red ones were moldier to a slightly higher degree than statistical noise would predict. Maybe the vibrations inherent in color caused the differences in moldiness. Of course, that's not what we were testing; that's a pattern identified after the fact. If you want to test for color, put that in the ledger at the outset. Don't shoehorn a pattern after the fact.

For some real adventures in anomaly hunting, look at the number juggling people apply to the Egyptian pyramids. You can take a rich batch of numbers and combine them into all sorts of flukes that match physical constants or astronomical distances. James Randi shows how you can do the same anomaly hunting with the Washington Monument in his great book Flim-Flam.
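
To see why after-the-fact patterns are so easy to find, here is a small simulation sketch. Everything in it is an assumption made for illustration: 84 jars, a 50% chance of mold for every jar regardless of label, and a "striking" pattern defined as a gap of at least 0.2 in the moldy fraction between two halves of the jars. Each irrelevant trait (seal color, shelf position, and so on) is modeled as a random split of the jars into two equal groups.

```python
import random

def fluke_rate(num_attributes, jars=84, gap=0.2, trials=1000, seed=0):
    """Estimate how often at least one irrelevant trait looks 'striking'.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    half = jars // 2
    hits = 0
    for _ in range(trials):
        # Null hypothesis: every jar has the same 50% chance of molding.
        moldy = [rng.random() < 0.5 for _ in range(jars)]
        for _ in range(num_attributes):
            # An irrelevant trait splits the jars in half at random.
            order = list(range(jars))
            rng.shuffle(order)
            group_a = sum(moldy[j] for j in order[:half]) / half
            group_b = sum(moldy[j] for j in order[half:]) / (jars - half)
            if abs(group_a - group_b) >= gap:
                hits += 1
                break  # one striking trait is enough to "find" a pattern
    return hits / trials

print("one trait checked:   ", fluke_rate(1))
print("twenty traits checked:", fluke_rate(20))
```

Under these assumptions, a single trait named in advance should look striking only a small fraction of the time, while a search across twenty traits turns up at least one "pattern" far more often. That is the whole trick behind anomaly hunting.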

Blame science or Western thinking: This is the common tack of accusing the skeptical mindset of spoiling the results. The experiment above is designed without appealing to any particular cultural heritage. The design is based on me preventing myself from introducing bias. If scientific thinking is such a party-pooper, how has it been so successful in shaping every little bit of technology we use?

Science, skepticism, critical thinking—these have produced plenty of reliable results, like cars and air travel. Telekinesis, for example, has not delivered comparable goods for human transport.

Those YouTube Videos and Why I Will Not Conduct My Own Experiment Design:

The rice experiment, as popularized online, has no controls, no blind (let alone double-blind), and operates on the smallest possible sample size. It is a race to see which rice gets moldy first. If the bad word rice gets moldy first (it's a 50-50 shot) a naïve person might claim confirmation. If the good word rice gets moldy first, a naïve person might think, "I must have done something wrong," or "I got so impatient waiting for mold, maybe my impatience threw it off." Such a person may be less likely to post their results on YouTube.
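
The sample size problem can be put in numbers. The sketch below assumes that labels have no effect, so every ordering of mold onset among the jars is equally likely, and asks how often every bad word jar molds before every good word jar purely by chance. The function names are mine, and the simulation draws onset times as independent uniform random numbers, which is a simplifying assumption.

```python
import math
import random

def clean_sweep_exact(jars_per_group):
    """Chance that all bad-word jars mold before all good-word jars
    when labels do nothing and every ordering is equally likely."""
    return 1 / math.comb(2 * jars_per_group, jars_per_group)

def clean_sweep_simulated(jars_per_group, trials=20000, seed=0):
    """Same quantity, estimated by simulation as a sanity check."""
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        # Assumption: onset times are independent draws from one distribution.
        bad = [rng.random() for _ in range(jars_per_group)]
        good = [rng.random() for _ in range(jars_per_group)]
        if max(bad) < min(good):
            sweeps += 1
    return sweeps / trials

for n in (1, 3, 12):
    print(n, clean_sweep_exact(n), clean_sweep_simulated(n))
```

With one jar per word, the exact chance is 1 in 2, which is the coin flip of the YouTube version. With three jars per word it drops to 1 in 20, and with twelve it is 1 in 2,704,156. A clean result from two jars tells you almost nothing; the same clean result from two dozen would be very hard to explain away as luck.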

Now, I have no plans to conduct my hypothetical experiment. It's a lot of work putting together a well-controlled study. And I'm very confident the results would be uninteresting. You might say, "Put your money where your mouth is. Do the experiment!" In a sense I am putting my money where my mouth is. If I'm wrong, I am giving away, for free, a great way to win a cool million from the JREF. Have at it.