Covid-19 infections in vaccinated people: the use of statistics without context leads to false conclusions
Simpson’s paradox explains the risks of comparing data without taking all variables into account, which can spawn the kind of misleading headlines seen recently in the news
Over the past few days there have been several headlines claiming an increase in coronavirus deaths among people who have been fully vaccinated. Some media outlets have even gone so far as to state that “vaccinated people are six times more likely to die from Covid-19 variants.” While the increase in the number of vaccinated people contracting coronavirus is real, and to be expected given that the available vaccines are not perfect, these reports are misleading. They show that the blind use of statistical results – that is to say, use without context – can lead to erroneous conclusions.
The majority of these errors stem from partial or biased analysis of the data offered by scientific studies, such as that published by Public Health England (PHE) on July 9. This investigation counted the number of cases in the UK due to the delta variant, which at the time of compilation accounted for 97% of new infections. The report observed that the number of fatalities among people who had received two doses of vaccine was on the rise: 118 of 257 deaths, or 46%. However, that does not imply that the vaccines are not working. In fact, these results are what would generally be expected with any treatment with a certain probability of error.
Let us imagine a perfect scenario in which everybody has been fully vaccinated. A PHE analysis estimates that the Pfizer vaccine prevents hospitalizations due to the delta variant in 96% of cases. In this case, 100% of hospitalizations – and of fatalities – would be among vaccinated people, but the risk of suffering serious consequences from infection would be 25 times lower than without the vaccine, which is to say it would be reduced by 96%.
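The arithmetic behind this scenario can be sketched in a few lines of Python. This is only an illustration: the function name `vaccinated_share` is hypothetical, and it assumes that vaccinated and unvaccinated people are equally exposed, so that only coverage and the 96% efficacy quoted above matter.

```python
def vaccinated_share(coverage: float, efficacy: float) -> float:
    """Expected fraction of hospitalizations that occur among the vaccinated.

    Assumption (not from the PHE report): both groups are equally exposed,
    so the baseline risk cancels out of the ratio.
    """
    vaccinated_cases = coverage * (1.0 - efficacy)
    unvaccinated_cases = 1.0 - coverage
    return vaccinated_cases / (vaccinated_cases + unvaccinated_cases)


EFFICACY = 0.96  # PHE estimate quoted above (Pfizer, hospitalization, delta)

# The share of hospitalizations among the vaccinated grows with coverage...
for coverage in (0.5, 0.8, 0.95, 1.0):
    share = vaccinated_share(coverage, EFFICACY)
    print(f"coverage {coverage:.0%}: {share:.1%} of hospitalizations are vaccinated")

# ...but each vaccinated person's risk stays at (1 - efficacy) of the
# unvaccinated risk, whatever the coverage.
risk_reduction_factor = 1.0 / (1.0 - EFFICACY)
print(f"individual risk is about {risk_reduction_factor:.0f} times lower")
```

With full coverage the share reaches 100% even though nothing about the vaccine has changed, which is why that share says little about how well the vaccine works.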
On the other hand, according to the PHE, close to 36% of deaths – 92 out of 257 – were among people who had not been vaccinated. Calculating the percentage of fatalities among infected vaccinated and unvaccinated people gives a figure of 1.09% for those who had been vaccinated, whereas “only” 0.13% of unvaccinated people died after being infected. These are the numbers that have been used in the most alarmist headlines, or in those with an anti-vaccination agenda, to draw conclusions such as “vaccinated people are six times more likely to die.” Again, these statements are incorrect and derive from an erroneous or malicious interpretation of the available information.
In the first instance, these reports confuse two different probabilities. Those in the PHE study concern “dying when vaccinated and infected,” not “dying when vaccinated,” as the headline claims. This nuance is highly important, because the comparison is made between very different groups and ignores where the vaccine is at its most effective: preventing infections. And, as many studies concur, the probability of being infected differs greatly depending on whether one has been vaccinated, even against the delta variant. For example, the Pfizer vaccine has an average efficacy rate of 88% after administration of the second dose.
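The difference between the two probabilities can be made concrete with a rough sketch. It takes the fatality rates among infected people quoted above and treats the 88% figure as efficacy against infection. The baseline infection risk is a purely hypothetical value that cancels out of the ratio, and age is still ignored at this stage.

```python
# Fatality rates among *infected* people, as quoted from the PHE figures.
CFR_VACCINATED = 0.0109
CFR_UNVACCINATED = 0.0013

# Assumption: the 88% efficacy figure is applied to the risk of infection.
EFFICACY_AGAINST_INFECTION = 0.88

# Hypothetical baseline chance of infection for an unvaccinated person.
# Its value is illustrative only; it cancels out of the final ratio.
BASELINE_INFECTION_RISK = 0.05


def death_risk(p_infection: float, case_fatality: float) -> float:
    """P(die) = P(infected) * P(die | infected)."""
    return p_infection * case_fatality


p_infection_unvaccinated = BASELINE_INFECTION_RISK
p_infection_vaccinated = BASELINE_INFECTION_RISK * (1.0 - EFFICACY_AGAINST_INFECTION)

# Ratio that ignores the chance of being infected in the first place.
naive_ratio = CFR_VACCINATED / CFR_UNVACCINATED

# Ratio of the probabilities of dying, infection risk included.
full_ratio = (
    death_risk(p_infection_vaccinated, CFR_VACCINATED)
    / death_risk(p_infection_unvaccinated, CFR_UNVACCINATED)
)

print(f"ratio among the infected only: {naive_ratio:.1f}")
print(f"ratio once infection risk is included: {full_ratio:.2f}")
```

Under these assumptions the apparent excess risk among the vaccinated largely disappears as soon as the lower chance of infection is counted, before age is even considered.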
The probability of infection, vaccination coverage and other factors can be explored in greater depth, but even using only the aforementioned statistics, closer scrutiny of the data shows that the argument is still false. In many cases, like this one, it is necessary to take into account a third variable with a strong causal link to the issue: age. Indeed, if we divide the population into those younger and older than 50, the percentage of fatalities among vaccinated people is 0.036% in the first group and 2.2% in the second, while among unvaccinated people the figures are around 0.03% and 5.6%, respectively. As such, within each age group the vaccinated do not fare worse: among the under-50s the two rates are very low and practically indistinguishable, and among the over-50s the fatality rate of the vaccinated is less than half that of the unvaccinated.
Thus, the conclusions the study draws are the opposite of what they would be without taking the division by age group into account. This apparently counterintuitive result is an example of what is known as Simpson’s paradox, or the Yule-Simpson effect, which occurs when the association between two variables (the rates of fatality and vaccination) changes completely when a third variable (age) is introduced.
The explanation for this phenomenon lies in the differences in incidence of the illness among groups. We know that the effects of Covid-19 are more serious in older people and, precisely because of that, the rate of vaccination in this age range is greater: in the UK this figure stands at over 80%. However, although the vaccine increases the likelihood of survival among this group, it remains lower than in younger age groups. As the proportion of older people is much greater in the vaccinated and infected group (48.3%) than among the unvaccinated and infected (1.76%), the overall difference stated above can be observed, but this does not imply, in any way whatsoever, an increase in mortality among the vaccinated.
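The mechanism can be reproduced as a weighted average over the two age groups. The sketch below uses the rates and age shares quoted in this article, with one labeled assumption: the fatality rate for unvaccinated under-50s is taken as about 0.03%, the order of magnitude implied by the overall 0.13% figure and the 1.76% share of over-50s. The names are illustrative only.

```python
def overall_rate(rate_under_50: float, rate_over_50: float, share_over_50: float) -> float:
    """Fatality rate of a whole group as a weighted average of its age strata."""
    return (1.0 - share_over_50) * rate_under_50 + share_over_50 * rate_over_50


# Rates and shares among infected people, as quoted in the text.
VACCINATED = {"rate_under_50": 0.00036, "rate_over_50": 0.022, "share_over_50": 0.483}

# Assumption: 0.0003 for the under-50s is the value implied by the overall rate,
# (0.0013 - 0.0176 * 0.056) / (1 - 0.0176), which is roughly 0.0003.
UNVACCINATED = {"rate_under_50": 0.0003, "rate_over_50": 0.056, "share_over_50": 0.0176}

crude_vaccinated = overall_rate(**VACCINATED)
crude_unvaccinated = overall_rate(**UNVACCINATED)
print(f"crude rate, vaccinated:   {crude_vaccinated:.2%}")
print(f"crude rate, unvaccinated: {crude_unvaccinated:.2%}")

# Give the unvaccinated the same age mix as the vaccinated, so that the
# two groups are compared like for like.
adjusted_unvaccinated = overall_rate(
    UNVACCINATED["rate_under_50"],
    UNVACCINATED["rate_over_50"],
    VACCINATED["share_over_50"],
)
print(f"unvaccinated, with the vaccinated age mix: {adjusted_unvaccinated:.2%}")
```

The crude rates come out close to the 1.09% and 0.13% quoted earlier, yet once both groups share the same age mix the unvaccinated rate is the higher of the two. The gap in the crude figures therefore comes from who is in each group, not from the vaccine.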
Issues such as Simpson’s paradox appear fairly often in real-life problems and highlight the dangers of working with proportions, in particular among groups of very different sizes, or with sub-groups that have different properties. It is a clear example of the importance of not drawing conclusions from statistical studies without having all the relevant data. As such, despite what alarmist and poorly informed voices may say, the PHE study and many others make clear that getting vaccinated is advisable in order to minimize the consequences of a possible delta variant infection.
José Luis Torrecilla is an assistant professor at the Autonomous University of Madrid (UAM).