The gender pay gap “is an idiotic statistical lie”, writes Ben Shapiro, because “it fails to take into account job choice, time in the workforce, hours worked, or any other factor that would remove the vast bulk of the pay gap.” This argument has gained wide popularity over the last few years among political commentators including, though certainly not limited to, Christina Hoff Sommers (the Factual Feminist), Dave Rubin (The Rubin Report) and Paul Joseph Watson (Infowars editor). It is routinely deployed in conversations about gender discrimination and the treatment of women in labour markets. However, the prevailing belief that the gender pay gap is a myth, on the grounds that no significant gender discrimination remains once certain factors have been accounted for, is statistically fallacious and reflects a primitive understanding of economics and econometrics.
The argument here is not that sexist discrimination against women is the sole driver of the gender pay gap, as some assert; the picture is more nuanced than that, as those familiar with the literature will understand. Instead, this post is a response to those who contend that gender differences in earnings are a statistical myth because wage disparities between men and women disappear, or become substantially smaller, once relevant statistical controls are incorporated into multivariate wage regressions. The truth is that including such controls does not stop the gender wage gap from being evidence of labour market discrimination, because incorporating the controls creates endogeneity issues that make coefficient estimates far less accurate and causal inference more difficult.
To understand why, it would be helpful to first explore the following simple wage model:
$$\ln W_{it} = X_{it}\beta + \varepsilon_{it}$$
$$\varepsilon_{it} = v_i + u_{it}$$
where $\ln W_{it}$ denotes the log wage of individual $i$ at time $t$, and $X_{it}$ is a vector of explanatory variables for that individual and period, including, for example, gender, human capital investment (such as years of education) and occupational controls, whilst $\beta$ is the vector of coefficients on those right-hand-side variables. The error term $\varepsilon_{it}$ contains an unobserved heterogeneity component $v_i$, which captures individual-specific skills (for example, ability), and an idiosyncratic component $u_{it}$. Now, estimating $\beta$ by the standard ordinary least squares (OLS) procedure requires a stringent set of assumptions. The one relevant to the argument here is that the estimates of $\beta$ remain unbiased and consistent so long as the explanatory variables are uncorrelated with the unobserved heterogeneity component of the error term.
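To make that condition concrete, here is a short sketch of the standard textbook result for pooled OLS. It assumes that the explanatory variables are stacked as a row vector for each individual and period, that the usual regularity conditions hold, and that the idiosyncratic part of the error is uncorrelated with the regressors; none of these details are spelled out in the model above.

```latex
\hat{\beta}_{OLS}
  = \Big(\sum_{i,t} X_{it}' X_{it}\Big)^{-1} \sum_{i,t} X_{it}' \ln W_{it}
\quad\Longrightarrow\quad
\operatorname{plim}\, \hat{\beta}_{OLS}
  = \beta + \big[E(X_{it}' X_{it})\big]^{-1} E(X_{it}' v_i)
```

The second term is zero only when the regressors are uncorrelated with the unobserved heterogeneity component. Whenever one of them is correlated with it, that term is non-zero and the estimates no longer converge to the true coefficients.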
The argument here rests on the endogeneity bias that unobserved heterogeneity introduces into the wage regression. That is to say, variables such as occupational choice and educational investment are, to an extent, determined by how they would affect one’s expected wage income. If gender discrimination in the labour market is significant, it will not necessarily be captured (purely) through employers’ wage-setting practices; instead, discrimination and cultural norms, i.e. factors beyond a woman’s control, will also affect her human capital investment decisions and labour supply choices at every stage leading up to her eventual labour market outcome. The aforementioned variables are therefore, to an extent, caused by discrimination, which means that factors such as occupational choice are endogenous to the econometric analysis. In other words, controlling for occupational choice and human capital decisions when investigating the gender wage gap will tend to bias the estimated coefficient on the gender variable towards zero, because part of the effect of discrimination is absorbed by the controls themselves. This understates the significance and extent of discrimination and reduces the accuracy of multivariate wage regressions.
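The mechanism can be illustrated with a small simulation. This is only a sketch under invented assumptions: every number below (a direct wage penalty of 0.10 log points, discrimination lowering women’s occupational index by 0.5, the returns to occupation and ability) is hypothetical and chosen purely for illustration, as are the function names `simulate` and `ols`. Ability plays the role of the unobserved heterogeneity component and is deliberately left out of both regressions.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Generate hypothetical data in which discrimination acts through two channels."""
    rng = np.random.default_rng(seed)
    female = rng.integers(0, 2, size=n).astype(float)
    ability = rng.normal(size=n)  # unobserved heterogeneity, never used as a regressor
    # Assumed: discrimination and norms lower women's occupational index by 0.5
    occupation = ability - 0.5 * female + rng.normal(size=n)
    # Assumed: direct wage penalty of 0.10, return to occupation 0.30, return to ability 0.20
    log_wage = (2.0 - 0.10 * female + 0.30 * occupation
                + 0.20 * ability + rng.normal(scale=0.3, size=n))
    return female, occupation, log_wage


def ols(y, *regressors):
    """Return OLS coefficients, intercept first, via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


female, occupation, log_wage = simulate()

# Gender coefficient with no controls: picks up the direct and occupational channels
gap_raw = ols(log_wage, female)[1]
# Gender coefficient after "controlling" for the endogenous occupation variable
gap_controlled = ols(log_wage, female, occupation)[1]

print(f"gender coefficient, no controls:        {gap_raw:.3f}")
print(f"gender coefficient, occupation control: {gap_controlled:.3f}")
```

Under these assumed parameters the total effect of discrimination on log wages is -0.10 + 0.30 × (-0.5) = -0.25, which the uncontrolled regression should roughly recover. The controlled regression should give a coefficient close to -0.05, which is smaller in magnitude than even the direct penalty of -0.10. The reason is that, within a given occupation, the women in this simulated population have higher unobserved ability on average than the men, so the gender variable becomes correlated with the omitted ability term.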
In closing, endogeneity bias makes it very difficult to evaluate the magnitude and significance of discrimination, which means that the existence of gender discrimination in labour markets cannot be disproved even once the above-mentioned statistical controls have been included. Nor is it necessary to rely on gender wage gap studies to conclude that discrimination exists as a causal factor; the research literature already finds significant discrimination against women at various stages of, and in various occupations within, labour markets (for example, Reuben et al. and Neumark et al.). The notion that appropriate statistical controls (necessarily) rule out a pay disparity between men and women resulting from gender discrimination, as Ben Shapiro, Christina Hoff Sommers and others suggest, is therefore predicated on a fallacious understanding of basic econometrics and statistics, as briefly demonstrated above.