Way back in 2012, I discovered Netgalley and requested a few books. But somehow, I lost track of my acceptances and eventually forgot all about Netgalley. My ARC of The Half-Life of Facts lay neglected, forgotten, and unread. The years went by, and eventually, I rediscovered Netgalley, only to find that I had a black mark on my record: I’d never read or reviewed The Half-Life of Facts. Although I’ve been slowly repairing my reviewing ratio, I only discovered recently that I could actually go back and right the wrongs of 2012. And so I purchased The Half-Life of Facts, and I’m glad I did.
As can be inferred from the title, The Half-Life of Facts deals with the idea that facts--the concepts generally accepted by scientists of the day--are constantly in flux. Arbesman is essentially introducing scientometrics, that is, the scientific study of scientific studies. Concepts such as mathematical theorems are inherently true and thus unchanging. However, the “facts” of chemistry, physics, and biology are constantly being overturned. For example, there’s a whole collection of “Lazarus taxa,” that is, species “known” to be extinct but which have since been rediscovered. As our technology improves, so does our ability to approximate the truth--and so does the rate at which we overturn old “facts.” According to Arbesman, the surviving population of facts undergoes exponential decay with respect to time. This decay can be observed in a variety of ways, such as examining paper citation rates over time or studying the rate at which medical facts are disproved. I don’t accept all of the theories presented--to my mind, many of them rest upon correlation rather than causation--but they are interesting nonetheless.
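The decay model here is the same one used for radioactive half-lives, so it’s easy to sketch. A minimal illustration in Python (the 45-year half-life below is purely an invented example, not a figure from the book):

```python
import math

def surviving_fraction(t, half_life):
    """Fraction of facts still accepted as true after t years,
    assuming exponential decay with the given half-life."""
    return 0.5 ** (t / half_life)

def decay_constant(half_life):
    """Equivalent decay-constant form: N(t) = N0 * exp(-lambda * t)."""
    return math.log(2) / half_life

# After two half-lives, a quarter of the original facts survive.
print(surviving_fraction(90, 45))  # 0.25
```

The two forms are interchangeable: `0.5 ** (t / half_life)` equals `exp(-decay_constant(half_life) * t)` for any `t`.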
Possibly because the main idea is so simple, Arbesman quickly moves on to other related areas of scientometrics. For example, he also talks about innovation and “S-curve theory,” which hypothesizes that successive innovations in a field can be plotted as a series of connected logistic curves, which together smooth out to look exponential. He also talks about the importance of population density, which he claims can cause superlinear growth of technology. In one of the most interesting sections of the book, he discusses how mutation-tracking techniques from evolutionary genetics can be used to determine the age and origins of papers and books. Some of the studies using these techniques seem a bit suspect to me; for instance, one researcher uses them to assert that almost all written works from the early Middle Ages have survived--but he uses the Venerable Bede to calculate the half-life of these documents, and it’s safe to assume that holy relics were treated a little differently than ordinary documents. Other uses of the technique are quite compelling, though. For example, according to a study that Arbesman cites (and presumably read), researchers read only about 20% of the works that they cite in their texts. This has led to amusing situations such as the fictional James Moriarty from the Sherlock Holmes stories being cited in real-life texts.
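The S-curve claim is easy to check numerically: stack a few logistic curves, each representing a later technology with a higher ceiling, and the combined curve grows roughly exponentially. A toy sketch (every parameter below is invented for illustration, not taken from the book):

```python
import math

def logistic(t, ceiling, midpoint, steepness=1.0):
    """One innovation's S-curve: slow start, rapid growth, saturation."""
    return ceiling / (1.0 + math.exp(-steepness * (t - midpoint)))

def stacked(t):
    """Three successive 'technologies', each arriving later with a higher ceiling."""
    return logistic(t, 1, 5) + logistic(t, 4, 15) + logistic(t, 16, 25)

values = [stacked(t) for t in range(0, 31)]
```

With ceilings growing geometrically (1, 4, 16) and midpoints evenly spaced, the sum roughly quadruples every ten time units--individually each curve saturates, but the envelope keeps climbing.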
The most shocking section for me was on the issue of multiple independent discovery. While facts do propagate, they often don’t propagate fast or far enough, leading to independent discoveries of the same facts. In school, I learned about Barabási and Albert’s famous paper on preferential attachment. Little did I know that preferential attachment had been discovered at least four times before. Similarly, the Erdős-Rényi random graph I learned about and used had actually been studied some twenty years earlier. Arbesman puts it like this:
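Preferential attachment itself is simple enough to simulate: each new node links to existing nodes with probability proportional to their current degree, which produces the rich-get-richer hubs the model is famous for. A minimal sketch of the standard construction (not Barabási and Albert’s actual code; all parameters are illustrative):

```python
import random

def preferential_attachment(n, m=2, seed=42):
    """Grow a graph of n nodes where each new node attaches to m
    existing nodes, chosen with probability proportional to degree."""
    rng = random.Random(seed)
    # Start from a small fully connected seed of m+1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    degree = [m] * (m + 1)
    for new in range(m + 1, n):
        # Sample m distinct targets, weighted by current degree.
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(new), weights=degree)[0])
        for t in targets:
            edges.append((t, new))
            degree[t] += 1
        degree.append(m)
    return degree, edges

degree, edges = preferential_attachment(200)
```

Running this and sorting `degree` shows a handful of heavily connected early nodes and a long tail of nodes with degree near `m`--the heavy-tailed distribution that keeps getting rediscovered.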
As Stigler’s Law of Eponymy states: “No scientific discovery is named after its original discoverer.” Naturally, Stephen Stigler attributes this law to Robert Merton.
Arbesman also discusses the other major reason for the propagation of false facts: publication bias. In a landmark paper, “Why Most Published Research Findings Are False,” John P. A. Ioannidis discussed how the academic setup is a recipe for false facts. Researchers gain acclaim through discovery of interesting phenomena, which can lead to a pattern that xkcd eloquently describes:
To put it another way: if you flip five coins a thousand times, wearing a different coloured shirt each time, you’re practically guaranteed that at least one run will come up all heads. If you publish on that one run, you can claim “statistical significance” linking the shirt colour to the five heads--as long as you don’t mention the other 999 trials. This isn’t even always intentional: researchers want results badly, and they don’t always account for repeated experiments in their calculations. Sadly, the academic environment isn’t great at self-correcting. Researchers don’t gain fame and fortune from replicating other people’s experiments, whether they confirm or refute the original paper. They’re far more likely to use the first paper as “weak evidence” for their own tenuous findings.
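The arithmetic behind that coin example is worth spelling out. A single run of five fair coins is all heads with probability 1/32, so across 1000 independent runs the chance of at least one all-heads run is 1 - (31/32)^1000--as close to certainty as makes no difference:

```python
# Probability that one run of five fair coins comes up all heads.
p_run = 0.5 ** 5  # 1/32

# Probability of at least one all-heads run across 1000 independent runs.
p_at_least_one = 1 - (1 - p_run) ** 1000

print(p_at_least_one)  # ~0.99999999999998 -- practically guaranteed
```

This is exactly the multiple-comparisons problem: report only the one “significant” run and the 999 boring ones vanish from the record.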
If you’re looking for a light, casual introduction to the basics of scientometrics, then The Half-Life of Facts is definitely worth a look. Just make sure to read it before the book’s own half-life has run its course.