Is Cancer Mainly Bad Luck? (Web Exclusive)

NIEHS Scientists Are Not Convinced

A study in the January 2, 2015, issue of Science suggesting that cancer is mainly bad luck (Science 347:78-81, 2015) spurred National Institute of Environmental Health Sciences (NIEHS) biostatisticians Clarice Weinberg and Dmitri Zaykin to take a closer look at how the data were interpreted. Their analysis suggests that, as working with big data becomes the norm in biomedical research, it’s crucial to pay careful attention to analytical methods and to how results are interpreted.

The authors of the Science article claimed that “only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions [and that] the majority is due to ‘bad luck,’ that is, random mutations in normal, noncancerous stem cells.”

NIEHS biostatisticians Clarice Weinberg and Dmitri Zaykin challenged how data were interpreted in a study, published in Science, that concluded that cancer was just bad luck.

But Weinberg and Zaykin found fault with the Science paper’s logic. In a commentary published in the Journal of the National Cancer Institute, they called its conclusion, that most cases of cancer are fundamentally unpreventable because they are the result of chance, “unwarranted” (J Natl Cancer Inst 107:djv125, 2015).

The Science authors reported an association between the logarithms of two variables: average lifetime incidence rates of 31 site-specific types of cancer, and the number of stem-cell divisions for the associated tissues. According to their analysis, two-thirds of the variation in cancer risk could be explained by the number of stem-cell divisions.

“There is value in pointing out this relationship in that the errors due to replication are probably causatively important,” Weinberg said. But “they went too far when they implied that inherited genetics and environmental factors could only explain about one-third of cancer.”

Weinberg, who is a senior investigator and head of the NIEHS Biostatistics and Computational Biology Branch, and Zaykin, a senior investigator in that branch, raised three statistical concerns: (1) the Science authors overinterpreted their R² value; (2) using aggregated lifetime risk values likely led to overstating the impact of random errors in stem-cell replication; and (3) the notion that causes can be assigned fractions of risk that add up to 1.0 is false.

Statisticians use a value called the coefficient of determination, R² for short, to summarize how strongly two measured variables are linearly related. For a straight-line fit, it is the square of the correlation coefficient. The closer the value is to 1.0, the more closely correlated the two variables are.

The Science article reported an R² value of 0.65 and suggested that, therefore, 65 percent of the variation in cancer risk among different tissues could be explained by the number of stem-cell divisions in those tissues.

Terminology is crucial, according to Weinberg. “‘Explaining’ is statistical jargon that has little to do with causation; it has to do with the relationship between X and Y,” she said. “You can’t conclude from an R² value how much of that causal relationship is due to the variable that X represents.”
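One way to see why R² cannot be read as a causal share is that it ignores the overall level of risk. The sketch below uses invented numbers, not the Science data; the lists `log_divisions` and `log_risk` and the helper `r_squared` are hypothetical names chosen for illustration. It multiplies every tissue's risk by 10, as a factor acting equally on all tissues would, and compares R² before and after.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation, i.e. R^2 for a straight-line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical log10 values for seven tissues (invented for illustration).
log_divisions = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
log_risk = [-5.0, -4.2, -4.5, -3.0, -2.8, -2.0, -2.6]

baseline = r_squared(log_divisions, log_risk)

# A tenfold increase in every tissue's risk adds 1 to each log10 risk.
log_risk_tenfold = [y + 1.0 for y in log_risk]
tenfold = r_squared(log_divisions, log_risk_tenfold)

print(f"R^2 at baseline risk:    {baseline:.3f}")
print(f"R^2 with all risks x10:  {tenfold:.3f}")
```

The two printed values agree, because adding a constant to every log risk shifts the points without changing how tightly they follow a line. In this toy setting, a large change in how much cancer occurs leaves R² untouched.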

Next, Weinberg and Zaykin expressed their concern about the use of aggregated lifetime risk data. “The R squared from an analysis of cancer types based on aggregated risk for each type obscures the contributions of individual risk factors to each cancer type,” they wrote.

Finally, they argued against partitioning causes into fractions that add up to 1.0. Referring to phenylketonuria, an inherited condition in which the body cannot properly metabolize phenylalanine, they wrote, “The fraction attributable to genetics is 1.0, while the fraction attributable to environment is also 1.0, because the outcome requires both a dysfunctional metabolic gene and an environmental exposure [dietary phenylalanine].”
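The phenylketonuria argument can be checked with a few lines of arithmetic. The sketch below is a toy model, not an epidemiological method from the commentary: it assumes the outcome occurs only when both causes are present, and the population counts and function names are invented for illustration. For each cause, it asks what share of cases would disappear if that cause alone were removed.

```python
def develops_condition(defective_gene, dietary_phenylalanine):
    """Toy model: the outcome requires BOTH the gene and the exposure."""
    return defective_gene and dietary_phenylalanine


def count_cases(people):
    return sum(develops_condition(gene, diet) for gene, diet in people)


def attributable_fraction(people, remove_cause):
    """Share of observed cases that would vanish if one cause were removed."""
    observed = count_cases(people)
    remaining = count_cases([remove_cause(person) for person in people])
    return (observed - remaining) / observed


# Hypothetical population of (defective_gene, dietary_phenylalanine) pairs.
population = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 94
    + [(False, False)] * 2
)

# Counterfactual 1: nobody carries the defective gene.
genetic = attributable_fraction(population, lambda p: (False, p[1]))
# Counterfactual 2: nobody is exposed to dietary phenylalanine.
environmental = attributable_fraction(population, lambda p: (p[0], False))

print(genetic, environmental, genetic + environmental)  # prints 1.0 1.0 2.0
```

Removing either cause eliminates every case, so each fraction is 1.0 and the two sum to 2.0 rather than 1.0.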

“Weinberg and Zaykin make important points,” said NIEHS Scientific Director Darryl Zeldin. “The claim that two-thirds of the cancers studied are caused by bad luck could lead to an overemphasis on development of treatments at the expense of crucial research into prevention and the role that environmental and genetic factors play in the origins of cancer.”

NIEHS biostatistician Shyamal Peddada also weighed in, posting an online Science comment that the use of log-transformed data led to a faulty conclusion. “A correlation analysis of the raw data reveals a weak correlation,” he wrote. A similar point was made on PubMed Commons, which has received a dozen comments on the Science paper.
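The effect of a log transformation on a correlation can be illustrated with made-up numbers. The sketch below does not reproduce Peddada's analysis or use the Science data; the values were invented, and chosen so the contrast is easy to see. It computes the Pearson correlation once on the log10 scale and once on the raw scale.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log10 values for seven tissues (invented for illustration).
log_divisions = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
log_risk = [-5.0, -4.2, -4.5, -3.0, -2.8, -2.0, -2.6]

r_log = pearson_r(log_divisions, log_risk)

# Undo the transformation to get the same points on the raw scale.
divisions = [10.0 ** v for v in log_divisions]
risk = [10.0 ** v for v in log_risk]

r_raw = pearson_r(divisions, risk)

print(f"correlation on log scale: {r_log:.2f}")
print(f"correlation on raw scale: {r_raw:.2f}")
```

With these numbers the log-scale correlation is strong and the raw-scale correlation is weak. On the raw scale the tissue with the most divisions dominates the calculation, so the result depends heavily on that single point.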

Shyamal Peddada weighed in on the Science article, too.

The public debate underscores the important contributions that biostatistics, data access, and open dialogue can make in ensuring that new scientific findings are properly interpreted so they can be confirmed and followed up by future studies.


C. Tomasetti and B. Vogelstein, “Cancer etiology. Variation in cancer risk among tissues can be explained by the number of stem cell divisions,” Science 347, 78-81 (2015).

C.R. Weinberg and D. Zaykin, “Is bad luck the main cause of cancer?” J Natl Cancer Inst 107, djv125 (2015).