Helmut Farbmacher - Research

Work in progress

"On the use of the Lasso for instrumental variables estimation with some invalid instruments"
(with Neil Davies, George Davey Smith and Frank Windmeijer)
Discussion Paper 16/674, University of Bristol [ pdf | software ]
Revise and resubmit at the Journal of the American Statistical Association.

    We investigate the behaviour of the Lasso for selecting invalid instruments in linear instrumental variables models for estimating causal effects of exposures on outcomes, as proposed recently by Kang, Zhang, Cai and Small (2016, Journal of the American Statistical Association). Invalid instruments are those that fail the exclusion restriction and enter the model as explanatory variables. We show that for this setup, the Lasso may not select all invalid instruments in large samples if they are relatively strong. Consistent selection also depends on the correlation structure of the instruments. We propose a median estimator that is consistent when fewer than 50% of the instruments are invalid, and whose consistency does not depend on the strength of the instruments or their correlation structure. This estimator can therefore be used for adaptive Lasso estimation. The methods are applied to a Mendelian randomisation study to estimate the causal effect of BMI on diastolic blood pressure using data on individuals from the UK Biobank, with 96 single nucleotide polymorphisms as potential instruments for BMI.
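The intuition behind a median-type estimator can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' software: it assumes that each instrument yields its own ratio estimate (reduced-form coefficient divided by first-stage coefficient) and that the median of these ratios is taken as the estimate. The function name `median_iv_estimate`, the simulated design, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def median_iv_estimate(y, d, Z):
    """Median of instrument-specific ratio estimates (illustrative sketch).

    Gamma: OLS coefficients of the outcome y on the instruments Z (reduced form).
    gamma: OLS coefficients of the exposure d on Z (first stage).
    """
    Zc = Z - Z.mean(axis=0)  # demean so no intercept is needed
    Gamma, *_ = np.linalg.lstsq(Zc, y - y.mean(), rcond=None)
    gamma, *_ = np.linalg.lstsq(Zc, d - d.mean(), rcond=None)
    return float(np.median(Gamma / gamma))

# Hypothetical design: 9 instruments, 3 of which are invalid (fewer than 50%).
rng = np.random.default_rng(0)
n, k, beta = 20000, 9, 0.5
Z = rng.normal(size=(n, k))
gamma_true = np.full(k, 0.4)        # first-stage strength (assumed)
alpha = np.zeros(k)
alpha[:3] = 0.3                     # direct effects of the invalid instruments
u = rng.normal(size=n)              # unobserved confounder
d = Z @ gamma_true + u + rng.normal(size=n)
y = beta * d + Z @ alpha + u + rng.normal(size=n)

est = median_iv_estimate(y, d, Z)
print(f"median estimate: {est:.3f} (true effect: {beta})")
```

In this design the ratios of the six valid instruments cluster around the true effect, while those of the three invalid instruments are shifted away from it, so the median falls among the valid ones.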

"Increasing the credibility of the twin birth instrument" (with Raphael Guber and Johan Vikström)
Working Paper 2016:10, Institute for Evaluation of Labour Market and Education Policy (IFAU), Uppsala. [ pdf | code ]
Journal of Applied Econometrics, forthcoming.

    Twin births are an important instrumental variable for the endogenous fertility decision. However, in many economic settings, twins are not exogenous either, as dizygotic twinning is known to be correlated with maternal characteristics and fertility treatments. Following the literature in medicine and epidemiology, we assume that monozygotic twinning is a random event resulting from the spontaneous division of a single fertilized egg. We use this exogenous variation to construct a new instrumental variable, which corrects for the selection bias even though monozygotic twinning is usually unobserved in survey or administrative datasets. We use longitudinal administrative data from Sweden and US census data and show that the usual twin instrument is related not only to observed but also to unobserved determinants of economic outcomes, while our new instrumental variable is not. We demonstrate the relevance of our new instrument in two labor market applications and find that the classical twin instrument underestimates the true negative effect of fertility on labor force participation and earnings. This finding is in line with the observation that high earners are more likely to delay childbearing and hence have a higher risk of having dizygotic twins.

"Semiparametric Count Data Modeling with an Application to Health Service Demand" (with Philipp Bach and Martin Spindler)
Working Paper No. 12/15, Health, Econometrics and Data Group (HEDG), University of York. [ pdf ]
Econometrics and Statistics, forthcoming.

    Heterogeneous effects are prevalent in many economic settings. As the functional form between outcomes and regressors is often unknown a priori, we propose a semiparametric negative binomial count data model based on the local likelihood approach and generalized product kernels, and apply the estimator to model demand for health care. The local likelihood framework allows us to leave the functional form of the conditional mean unspecified while still exploiting basic assumptions in the count data literature (e.g., non-negativity). The generalized product kernels allow us to simultaneously model discrete and continuous regressors, which reduces the curse of dimensionality and increases the estimator's applicability, as many regressors in the demand for health care are discrete.

Published and accepted papers

"Increasing the credibility of the twin birth instrument" (with Raphael Guber and Johan Vikström), Journal of Applied Econometrics, forthcoming.

"Semiparametric Count Data Modeling with an Application to Health Service Demand" (with Philipp Bach and Martin Spindler), Econometrics and Statistics, forthcoming.

"Heterogeneous effects of a nonlinear price schedule for outpatient care" (with Peter Ihle, Ingrid Schubert, Joachim Winter and Amelie Wuppermann), Health Economics, forthcoming.

"Testing under a special form of heteroscedasticity" (with Heinrich Kögel), Applied Economics Letters, 2017, 24(4), 264-268. [ code ]

"The many weak instrument problem and Mendelian randomization" (with Stephen Burgess, Neil Davies, Stephanie von Hinke Kessler Scholder, George Davey Smith and Frank Windmeijer), Statistics in Medicine, 2015, 34(3), 454-468. [ software ]

"Extensions of hurdle models for overdispersed count data", Health Economics, 2013, 22(11), 1398-1404. [ software ]

"Per-period co-payments and the demand for health care: Evidence from survey and claims data" (with Joachim Winter), Health Economics, 2013, 22(9), 1111-1123.

"GMM with many weak moment conditions: Replication and application of Newey and Windmeijer (2009)", Journal of Applied Econometrics, 2012, 27(2), 343-346. [ software ]

"Estimation of hurdle models for overdispersed count data", Stata Journal, 2011, 11(1), 82-94. [ software ]