The Bivariate Probit Model in Strategy and Management Research: Applications and Potential
Ke Gong and Scott Johnson
Abstract
In the early days of the COVID-19 pandemic, an area could only report its first positive cases if the infection had spread into the area and had subsequently been detected. A standard probit model assumes a single underlying process for an observed outcome and therefore does not correctly account for these two distinct latent processes. A similar issue confounds research on other binary outcomes such as corporate wrongdoing, acquisitions, hiring, and new venture establishments. The bivariate probit model enables empirical analysis of two distinct latent binary processes that jointly produce a single observed binary outcome. One common challenge in applying the bivariate probit model is that it may not converge, especially with smaller sample sizes. We use Monte Carlo simulations to give guidance on the sample characteristics needed to accurately estimate a bivariate probit model. We then demonstrate the use of the bivariate probit to model infection and detection as two distinct processes behind county-level COVID-19 reports in the United States. Finally, we discuss several organizational outcomes that strategy scholars might analyze using the bivariate probit model in future research.
Keywords: Dichotomous outcomes; partial observability; bivariate probit; convergence difficulty; Monte Carlo simulation; COVID-19
Introduction
Strategic management researchers often study binary organizational outcomes such as corporate wrongdoing, acquisitions, hiring, and new venture establishments. In many cases, the observable binary outcome is jointly determined by two distinct latent processes (Poirier, 1980). For example, the process that makes a firm more likely to engage in fraud may be distinct from the process that makes a fraudulent firm more likely to be caught subsequently (and thus observed in the data). Ignoring these two distinct latent processes can bias results and lead to incorrect conclusions. For example, Wang (2013) demonstrates that in a bivariate probit analysis, research and development (R&D) investment increases the likelihood of committing fraud but decreases the likelihood of that fraud being detected; in a standard probit model, these two opposite effects cancel each other out, leading to inaccurate policy implications. The recent COVID-19 pandemic and the resulting data availability in the United States allow for an analysis of a similar dynamic. The observed detection of COVID-19 cases in a US county is driven by the likelihood of infection in that county and the likelihood of detecting infection conditional on its presence. At the county level, the drivers of infection may differ from the drivers of detection, just as the drivers of an organizational event may differ from the drivers of the detection of that event.
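The cancellation described above can be illustrated with a small simulation. The following is a minimal sketch, not a reproduction of Wang (2013): the coefficients (0.8 and -0.8), the error correlation (0.3), the sample size, and all function and variable names are hypothetical values chosen for illustration. A single covariate raises the latent propensity to commit an act and lowers the latent propensity for that act to be detected, and the outcome is observed only when both occur.

```python
import math
import random


def simulate(n=20000, beta_commit=0.8, beta_detect=-0.8, rho=0.3, seed=0):
    """Simulate two latent binary processes behind one observed outcome.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Standard normal errors with correlation rho across the two processes.
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        committed = beta_commit * x + e1 > 0   # latent process 1
        detected = beta_detect * x + e2 > 0    # latent process 2
        observed = committed and detected      # the only outcome in the data
        rows.append((x, committed, observed))
    return rows


def rate(rows, column, keep):
    """Share of rows with a True value in `column` among rows where keep(x) holds."""
    kept = [row[column] for row in rows if keep(row[0])]
    return sum(kept) / len(kept)


rows = simulate()
commit_high = rate(rows, 1, lambda x: x > 0)
commit_low = rate(rows, 1, lambda x: x <= 0)
observed_high = rate(rows, 2, lambda x: x > 0)
observed_low = rate(rows, 2, lambda x: x <= 0)

print(f"commission rate, high x vs low x: {commit_high:.3f} vs {commit_low:.3f}")
print(f"observed rate,   high x vs low x: {observed_high:.3f} vs {observed_low:.3f}")
```

Under these assumed values, the latent commission rate differs sharply between high and low values of the covariate, while the observed rate is nearly identical in the two groups, so a single-equation model of the observed outcome would find little or no effect. The exact cancellation here is an artifact of the symmetric coefficients chosen for the sketch; in real data the two effects would only partly offset each other.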
Despite its benefits in theoretically and methodologically distinguishing between the distinct latent processes, the bivariate probit is often avoided for two reasons: difficulty in specifying the model and unstable convergence. The difficulty in specifying the model stems from the requirement for exclusion restrictions: each latent process needs at least one variable that predicts it but is unrelated to the other latent process. For example, a bivariate probit model of fraud would require at least one variable that predicts the commission of fraud but is unrelated to the detection of fraud, and at least one variable that predicts the detection of fraud but is unrelated to the commission of fraud. The identification of these variables requires strong theoretical arguments. Instrumental variable models impose similar exclusion requirements. We hope that the difficulties in specifying the model will diminish as authors and reviewers become more acquainted with this kind of reasoning (Hill, Johnson, Greco, O'Boyle, & Walter, 2021).
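The specification requirement can be made concrete by writing out the partial-observability model in the form associated with Poirier (1980). The notation below is an assumption introduced here for illustration: the subscripts 1 and 2 index the two latent processes (for example, commission and detection), $z_i$ is the single observed outcome, and $\Phi_2$ denotes the standard bivariate normal distribution function with correlation $\rho$.

```latex
\begin{align}
y_{1i}^{*} &= x_{1i}'\beta_1 + \varepsilon_{1i}, & y_{1i} &= \mathbf{1}\left[y_{1i}^{*} > 0\right], \\
y_{2i}^{*} &= x_{2i}'\beta_2 + \varepsilon_{2i}, & y_{2i} &= \mathbf{1}\left[y_{2i}^{*} > 0\right], \\
z_i &= y_{1i}\, y_{2i}, & \Pr(z_i = 1) &= \Phi_2\!\left(x_{1i}'\beta_1,\; x_{2i}'\beta_2;\; \rho\right),
\end{align}
```

where $(\varepsilon_{1i}, \varepsilon_{2i})$ are standard bivariate normal with correlation $\rho$. Because only $z_i$ is observed, the two equations cannot be told apart if $x_{1i}$ and $x_{2i}$ contain exactly the same variables. In this notation, the exclusion restrictions amount to requiring that $x_{1i}$ include at least one variable absent from $x_{2i}$, and that $x_{2i}$ include at least one variable absent from $x_{1i}$.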
Even if the bivariate probit model f...