In his inaugural address, Barack Obama promised to restore science to its "rightful place." This has partly occurred, as evidenced by this month's release of 13 new human embryonic stem-cell lines. The recent brouhaha over the guidelines put forth by the government task force on breast-cancer screening, however, illustrates how tricky it can be to deliver on this promise. One big reason is that people may not like or even understand what scientists say, especially when what they say is complex, counterintuitive or ambiguous.
As we now know, the panel of scientists advised that routine screening for asymptomatic women in their 40s was not warranted and that mammograms for women 50 or over should be given biennially rather than annually. The response was furious. Fortunately, both the panel's concerns and the public's reaction to its recommendations may be better understood by delving into the murky area between mathematics and psychology.
Much of our discomfort with the panel's findings stems from a basic intuition: since earlier and more frequent screening increases the likelihood of detecting a possibly fatal cancer, it is always desirable. But is this really so? Consider the technique mathematicians call a reductio ad absurdum, taking a statement to an extreme in order to refute it. Applying it to the contention that more screening is always better leads us to note that if screening catches the breast cancers of some asymptomatic women in their 40s, then it would also catch those of some asymptomatic women in their 30s. But why stop there? Why not monthly mammograms beginning at age 15?
The answer, of course, is that they would cause more harm than good. Alas, it's not easy to weigh the dangers of breast cancer against the cumulative effects of radiation from dozens of mammograms, the invasiveness of biopsies (some of them minor operations) and the aggressive and debilitating treatment of slow-growing tumors that would never prove fatal.
The exact weight the panel gave to these considerations is unclear, but one factor that was clearly relevant was the problem of frequent false positives when testing for a relatively rare condition. A little vignette with made-up numbers may shed some light. Assume there is a screening test for a certain cancer that is 95 percent accurate; that is, if someone has the cancer, the test will be positive 95 percent of the time. Let's also assume that if someone doesn't have the cancer, the test will be positive just 1 percent of the time. Assume further that 0.5 percent — one out of 200 people — actually have this type of cancer. Now imagine that you've taken the test and that your doctor somberly intones that you've tested positive. Does this mean you're likely to have the cancer? Surprisingly, the answer is no.
To see why, let's suppose 100,000 screenings for this cancer are conducted. Of these, how many are positive? On average, 500 of these 100,000 people (0.5 percent of 100,000) will have cancer, and so, since 95 percent of these 500 people will test positive, we will have, on average, 475 positive tests (.95 x 500). Of the 99,500 people without cancer, 1 percent will test positive for a total of 995 false-positive tests (.01 x 99,500 = 995). Thus of the total of 1,470 positive tests (995 + 475 = 1,470), most of them (995) will be false positives, and so the probability of having this cancer given that you tested positive for it is only 475/1,470, or about 32 percent! This is to be contrasted with the probability that you will test positive given that you have the cancer, which by assumption is 95 percent.
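For readers who prefer to check the arithmetic directly, the worked example above can be sketched as a few lines of Python, using the same made-up numbers (95 percent sensitivity, 1 percent false-positive rate, 0.5 percent prevalence):

```python
# Reproducing the article's vignette with its made-up numbers.
sensitivity = 0.95      # P(test positive | has cancer)
false_pos_rate = 0.01   # P(test positive | no cancer)
prevalence = 0.005      # P(has cancer): one in 200

screened = 100_000
with_cancer = prevalence * screened                           # 500 people
true_positives = sensitivity * with_cancer                    # 475 tests
false_positives = false_pos_rate * (screened - with_cancer)   # 995 tests

# P(cancer | positive test) -- Bayes' rule as a simple ratio of counts
posterior = true_positives / (true_positives + false_positives)
print(round(posterior, 3))  # 0.323, i.e. about 32 percent
```

The point of writing it out is that the counterintuitive 32 percent figure is nothing more than 475 true positives divided by 1,470 total positives; no deeper machinery is involved.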
The arithmetic may be trivial, but the answer is decidedly counterintuitive and hence easy to reject or ignore. Most people don't naturally think probabilistically, nor do they respond appropriately to very large or very small numbers. For many, the only probability values they know are "50-50" and "one in a million." Whatever the probabilities associated with a medical test, the fact remains that there will commonly be a high percentage of false positives when screening for rare conditions. Moreover, these false positives will receive further treatments, a good percentage of which will have harmful consequences. This is especially likely with repeated testing over decades.
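The effect of repeated testing over decades can also be made concrete. Carrying over the vignette's assumed 1 percent false-positive rate, and treating the tests as independent (an idealization), the chance that a healthy woman sees at least one false positive compounds with each screening:

```python
# With an assumed 1 percent false-positive rate per test, the chance of
# at least one false positive over n independent tests is 1 - 0.99^n.
false_pos_rate = 0.01

def chance_of_any_false_positive(n_tests: int) -> float:
    """Probability of at least one false positive in n independent tests."""
    return 1 - (1 - false_pos_rate) ** n_tests

for n in (10, 20, 30):  # e.g. annual screening over one to three decades
    print(n, round(chance_of_any_false_positive(n), 3))
# Roughly 10, 18 and 26 percent respectively.
```

Even a test that is wrong only one time in a hundred will, over thirty rounds, mislead about a quarter of the healthy people who take it.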
Another concern is measurement. Since we calculate the length of survival from the time of diagnosis, ever more sensitive screening starts the clock ticking sooner. As a result, survival times can appear to be longer even if the earlier diagnosis has no real effect on survival.
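A hypothetical example (the ages below are invented for illustration) shows how this lead-time effect inflates survival statistics without changing the outcome:

```python
# Lead-time bias: the tumor proves fatal at the same age either way;
# earlier detection only starts the survival clock sooner.
# All ages here are made-up numbers for illustration.
age_at_death = 70
age_detected_by_symptoms = 67
age_detected_by_screening = 63  # assumed four-year lead time

survival_without_screening = age_at_death - age_detected_by_symptoms  # 3 years
survival_with_screening = age_at_death - age_detected_by_screening    # 7 years

# Measured "survival after diagnosis" more than doubles,
# yet the patient dies at 70 in both scenarios.
```

This is why longer survival-after-diagnosis times, by themselves, do not prove that earlier screening saves lives.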
Cognitive biases also make it difficult to see the competing desiderata the panel was charged with balancing. One such bias is the availability heuristic, the tendency to estimate the frequency of a phenomenon by how easily it comes to mind. People can much more readily picture a friend dying of cancer than they can call up images of anonymous people suffering from the consequences of testing. Another bias is the anchoring effect, the tendency to be overly influenced by any initially proposed number. People quickly become anchored to such a number, whether it makes sense or not ("we use only 10 percent of our brains"), and they're reluctant to abandon it. If accustomed to annual mammograms, they're likely for that reason alone to resist biennial (or even semiannual) ones.
Whatever the role of these biases, the bottom line is that the new recommendations are evidence-based. This doesn't mean other right-thinking people would necessarily come to the same judgments. To oppose the recommendations, however, requires facts and argument, not invective.
John Allen Paulos, professor of mathematics at Temple University, is the author most recently of "Irreligion."