STUART THEOBALD: Data can’t be taken at face value during these strange times

This column was first published in Business Day. 

In normal times we become, perhaps unwisely, comfortable with what statistics mean. We see a number — GDP growth, unemployment — and immediately think we know something about how the world is changing.

But Covid-19 has caused what statisticians call a regime change, because the underlying conditions that generate the data fundamentally changed during the lockdown. Simultaneously, the act of data gathering itself was disrupted.

Consider last week’s Stats SA unemployment numbers. The headline statistic from the Quarterly Labour Force Survey is that unemployment had fallen to 23.3% from 30.1%. In “normal” times, champagne would be flowing at this remarkable improvement, especially in a period of major economic contraction. But the figure is meaningless because the world changed fundamentally.

The more useful figure is that the number of employed decreased by 2.2-million to 14.1-million. This is a 14% fall in employment from the first quarter to the second and is shocking. But we can’t determine from that how many people are unable to find a job. One of the features of those defined as unemployed is that they are actively seeking work. During lockdown you couldn’t actively seek work, which is why the unemployment rate appears to have decreased.

The number of “not economically active” people reported by Stats SA increased to 20.6-million from 15.4-million the quarter before. As a result, the number of officially unemployed fell from 7.1-million to 4.3-million. But that’s only because people were confined to their homes.

What if we add the 2.2-million lost jobs to the unemployed in the previous quarter when there wasn’t a lockdown (until the last week)? That would give us 9.3-million unemployed. We would need, though, to also adjust the size of the labour force to make sense of that number as an unemployment rate.

If we use the size of the labour force in the first quarter, rather than the much-reduced labour force of the second quarter when people were stuck at home, then we end up with an unemployment rate of 39.7% compared with 30.1% in the first quarter. That is a more meaningful figure for the sake of comparison.
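The arithmetic behind that adjusted rate can be retraced in a few lines. What follows is a minimal sketch in Python that uses only the rounded figures quoted above; the variable names are illustrative. Because the inputs are rounded, the first-quarter rate comes out at about 30.3% rather than the official 30.1%, while the adjusted rate lands at about 39.7%.

```python
# Rounded figures quoted in the column, in millions of people.
employed_q2 = 14.1
jobs_lost = 2.2
unemployed_q1 = 7.1
unemployed_q2 = 4.3

# Employment in the first quarter, before the 2.2-million fall.
employed_q1 = employed_q2 + jobs_lost

# Labour force = employed + officially unemployed.
labour_force_q1 = employed_q1 + unemployed_q1
labour_force_q2 = employed_q2 + unemployed_q2

# Official-style rates, recomputed from the rounded inputs.
rate_q1 = unemployed_q1 / labour_force_q1
rate_q2 = unemployed_q2 / labour_force_q2

# Adjusted view: count every lost job as a newly unemployed person
# and keep the first-quarter labour force as the denominator.
adjusted_unemployed = unemployed_q1 + jobs_lost
adjusted_rate = adjusted_unemployed / labour_force_q1

print(f"Q1 rate from rounded inputs: {rate_q1:.1%}")
print(f"Q2 rate from rounded inputs: {rate_q2:.1%}")
print(f"Adjusted Q2 rate: {adjusted_rate:.1%}")
```

The point of the exercise is the denominator: the headline rate fell mainly because the labour force itself shrank from about 23.4-million to about 18.4-million.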

But what of the change in collection methodology? Usually, Stats SA does face-to-face data collection, but it had to switch to telephone-based collection because of the lockdown. It tried to phone the same households it usually talks to face-to-face, but it didn’t have phone numbers for some, other numbers were wrong or went unanswered, and some households had moved.

All told, the national response rate to the survey fell from 87.7% of the sample to 57.1% (the sample has about 33,000 “dwelling units”). What difference does this make? The lower response rate means the statistics are less robust. There is a greater margin for error. Some bias is also likely because telephone numbers are probably not evenly distributed in the population: those without phone numbers are more likely to be unemployed, for example.
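To give a rough sense of how much wider the margin for error becomes: under the simplifying assumption of simple random sampling, sampling error shrinks with the square root of the number of responses. The sketch below applies that rule of thumb to the response rates quoted above. It is an approximation only, since the real survey uses a more complex sample design, and it says nothing about the separate problem of bias.

```python
import math

# Figures quoted in the column.
sample_size = 33_000   # approximate number of dwelling units sampled
response_q1 = 0.877    # response rate with face-to-face collection
response_q2 = 0.571    # response rate with telephone collection

responses_q1 = sample_size * response_q1
responses_q2 = sample_size * response_q2

# ASSUMPTION: simple random sampling, so standard error is
# proportional to 1 / sqrt(number of responses).
widening = math.sqrt(responses_q1 / responses_q2)

print(f"Responses fell from about {responses_q1:,.0f} to {responses_q2:,.0f}")
print(f"Sampling error is roughly {widening - 1:.0%} wider")
```

On these assumptions the margin for error widens by roughly a quarter, before any allowance for who was missed.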

To compensate for this source of bias, Stats SA “rakes” the data, which means it adjusts the weights of different response groups to compensate for the different characteristics of respondents compared with the normal sample. To do that, it looked at the results of the first quarter and compared outcomes on various measures for respondents with phones compared with those without phones. This resulted in “bias-adjustment factors” it could apply to the results of the phone survey to extrapolate for the population as a whole.
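The idea of a bias-adjustment factor can be illustrated with a toy calculation. This is not Stats SA’s actual procedure, which adjusts weights across many response groups. It is a single-ratio sketch, and every number in it is hypothetical, invented purely to show the mechanics.

```python
def bias_adjustment_factor(rate_all: float, rate_phone: float) -> float:
    """Ratio of a measure among all respondents to the same measure
    among respondents who have phones (hypothetical helper)."""
    return rate_all / rate_phone


# HYPOTHETICAL first-quarter values, when both groups could be surveyed.
q1_rate_all = 0.30     # made-up rate among all respondents
q1_rate_phone = 0.25   # made-up rate among respondents with phones

# HYPOTHETICAL second-quarter value from the phone-only survey.
q2_rate_phone = 0.20

factor = bias_adjustment_factor(q1_rate_all, q1_rate_phone)

# Scale the phone-only result up to stand in for the whole population.
q2_rate_adjusted = q2_rate_phone * factor

print(f"Adjustment factor: {factor:.2f}")
print(f"Adjusted second-quarter rate: {q2_rate_adjusted:.1%}")
```

The adjustment is only as good as the assumption that the first-quarter ratio between the two groups still holds in the second quarter.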

This is the best one can hope for in the circumstances, but given the scale of disruption caused by the lockdown, we can’t be sure that the bias between respondents with and without phones measured in the first quarter held in the second. For example, respondents without phones may have been particularly vulnerable to losing their jobs in the lockdown, so the gap between the two groups may have widened. We just can’t know. The “real” unemployment figure could be quite different.

GDP figures released earlier in the month were also bedevilled by technical details of this sort. Some media rushed to print with the headline that GDP had “fallen 51%” in the second quarter. Any sensible reading of that would be that the economy had halved in size in the second quarter compared with the first. That was not the case by a long shot. In fact, the far better measure is the change in the size of the economy compared with the matching quarter in the previous year, adjusting for inflation.

That change was a fall of 17.1%. The 51% is the “seasonally adjusted, annualised rate”, which assumes the quarterly percentage change would be compounded in the next three quarters at the same rate. Normally that statistic works fine because trends do tend to hold. But now, obviously, the second quarter was unique and the loss of economic activity a one-off, not a trend for the year.
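The link between a quarterly change and its annualised rate is simple compounding, and it can be inverted to see what single-quarter fall the 51% headline implies. Here is a minimal sketch, assuming the usual compounding formula of (1 + q)^4 - 1 for a quarterly change q; the function names are illustrative, and the result is approximate because 51% is itself a rounded figure.

```python
def annualise(quarterly_change: float) -> float:
    """Compound one quarter's change over four quarters."""
    return (1 + quarterly_change) ** 4 - 1


def implied_quarterly(annualised_change: float) -> float:
    """Invert the compounding to recover the single-quarter change."""
    return (1 + annualised_change) ** 0.25 - 1


headline = -0.51  # the "fallen 51%" annualised figure
quarterly = implied_quarterly(headline)

print(f"Implied quarter-on-quarter change: {quarterly:.1%}")
print(f"Re-annualised as a check: {annualise(quarterly):.1%}")
```

The implied fall from the first quarter to the second is a little over 16%, which is severe but nowhere near a halving of the economy.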

Our desire to make sense of how the world has changed means we are hungrier for data than usual. But it is now important to go back to first principles to understand what the data is telling us about this new world, compared with what it used to tell us about the old one. Otherwise, we will come away profoundly misinformed.

Theobald is chair of Intellidex.