Helmut Farbmacher - Research
Work in progress
"On the use of the Lasso for instrumental variables estimation with some invalid instruments"
(with Neil Davies, George Davey Smith and Frank Windmeijer)
Discussion Paper 16/674, University of Bristol [ pdf | software ]
Revise and resubmit at the Journal of the American Statistical Association.
We investigate the behaviour of the Lasso for selecting invalid instruments in linear instrumental variables models for estimating causal effects
of exposures on outcomes, as proposed recently by Kang, Zhang, Cai and Small (2016, Journal of the American Statistical Association). Invalid instruments
fail the exclusion restriction and enter the model as explanatory variables. We show that in this setup, the Lasso may not select all invalid
instruments in large samples if they are relatively strong. Consistent selection also depends on the correlation structure of the instruments. We propose
a median estimator that is consistent when fewer than 50% of the instruments are invalid, and whose consistency does not depend on the strength of the instruments
or their correlation structure. This estimator can therefore be used for adaptive Lasso estimation. The methods are applied to a Mendelian randomisation study
to estimate the causal effect of BMI on diastolic blood pressure using data on individuals from the UK Biobank, with 96 single nucleotide polymorphisms
as potential instruments for BMI.
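As an illustration of the median idea described above, the following minimal sketch computes one ratio estimate per instrument (reduced-form coefficient divided by first-stage coefficient, both from regressions on all instruments jointly) and takes their median. It is a simplified sketch under stated assumptions, not the paper's software: the function name `median_ratio_estimate`, the simulated design, and all parameter values are hypothetical, and the adaptive Lasso step is not shown.

```python
import numpy as np


def median_ratio_estimate(y, d, Z):
    """Median of instrument-specific ratio estimates (illustrative sketch).

    y: outcome (n,), d: exposure (n,), Z: instruments (n, k).
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # Demean all variables so that no explicit intercept is needed.
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    dc = d - d.mean()
    # Reduced form: outcome regressed on all instruments.
    reduced_form = np.linalg.lstsq(Zc, yc, rcond=None)[0]
    # First stage: exposure regressed on all instruments.
    first_stage = np.linalg.lstsq(Zc, dc, rcond=None)[0]
    # One ratio estimate of the causal effect per instrument.
    ratios = reduced_form / first_stage
    return float(np.median(ratios)), ratios


# Hypothetical simulated example: 7 instruments, 3 of them invalid (< 50%).
rng = np.random.default_rng(0)
n, k = 20_000, 7
beta = 0.5                                      # assumed true causal effect
gamma = np.ones(k)                              # assumed first-stage effects
alpha = np.array([0, 0, 0, 0, 1.0, 1.0, 1.0])   # direct effects of invalid instruments
Z = rng.normal(size=(n, k))
u = rng.normal(size=n)
e = 0.5 * u + rng.normal(size=n)                # endogeneity: e is correlated with u
d = Z @ gamma + u
y = beta * d + Z @ alpha + e

estimate, ratios = median_ratio_estimate(y, d, Z)
print("median estimate:", estimate)
print("instrument-specific ratios:", np.round(ratios, 3))
```

In this design the ratios for the valid instruments cluster around the assumed effect, while those for the invalid instruments are shifted by their direct effects; because fewer than half are invalid, the median falls among the valid ones.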
"Increasing the credibility of the twin birth instrument" (with Raphael Guber and Johan Vikström)
[ pdf | code ]
Journal of Applied Econometrics, 2018, 33(3), 457-472.
Twin births are an important instrumental variable for the endogenous fertility decision. However, in many economic settings, twins are not exogenous either,
as dizygotic twinning is known to be correlated with maternal characteristics and fertility treatments. Following the literature in medicine and epidemiology,
we assume that monozygotic twins are a random event occurring from the spontaneous division of a single fertilized egg. We use this exogenous variation to
construct a new instrumental variable, which corrects for the selection bias even though monozygotic twinning is usually unobserved in survey or administrative
datasets. We use longitudinal administrative data from Sweden and US census data and show that the usual twin instrument is not only related to observed but
also to unobserved determinants of economic outcomes, while our new instrumental variable is not. We demonstrate the relevance of our new instrument in two
labor market applications and find that the classical twin instrument underestimates the true negative effect of fertility on labor force participation and
earnings. This finding is in line with the observation that high earners are more likely to delay childbearing and hence have a higher risk of having dizygotic twins.
"Finite sample properties of the Anderson and Rubin (1949) test" (with Maurice Bun and Rutger Poldermans)
[ pdf ]
Revise and resubmit at Econometric Reviews.
Most studies nowadays use uncentered (as opposed to centered) moment conditions to form the weighting matrix for the GMM version of the Anderson and Rubin (AR) test statistic. Remarkably, neither version of the GMM-AR statistic covers the usual definition of the AR statistic under homoskedasticity (IV-AR). We propose a finite sample correction for the GMM-AR test statistic, which nests the usual IV-AR statistic and performs distinctly better in finite samples. Moreover, we derive an asymptotic distribution of the IV-AR under homoskedasticity but non-normal errors which has correct size even if the number of instruments is as large as the sample size.
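For reference, the textbook IV-AR statistic under homoskedasticity can be sketched as follows. This is only the standard statistic, not the finite sample corrected GMM-AR statistic proposed in the paper; the function name `anderson_rubin_stat` and the toy numbers are hypothetical, and the sketch assumes that any constant or exogenous controls are already included in Z or partialled out.

```python
import numpy as np


def anderson_rubin_stat(y, X, Z, beta0):
    """Textbook homoskedastic AR statistic for H0: beta = beta0 (sketch).

    y: outcome (n,), X: endogenous regressors (n, p), Z: instruments (n, k).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    Z = np.asarray(Z, dtype=float).reshape(len(y), -1)
    n, k = Z.shape
    # Residual implied by the null hypothesis.
    e0 = y - X @ np.atleast_1d(np.asarray(beta0, dtype=float))
    # Project the null residual on the instruments.
    coef = np.linalg.lstsq(Z, e0, rcond=None)[0]
    fitted = Z @ coef
    resid = e0 - fitted
    # Explained variation per instrument over unexplained variation per
    # degree of freedom; F(k, n - k) under H0 with normal errors.
    return (fitted @ fitted / k) / (resid @ resid / (n - k))


# Tiny hand-checkable example with hypothetical numbers.
y_toy = np.array([1.0, 2.0, 3.0, 4.0])
X_toy = np.zeros((4, 1))
Z_toy = np.ones((4, 1))
ar_toy = anderson_rubin_stat(y_toy, X_toy, Z_toy, [0.0])
print("AR statistic:", ar_toy)
```

The statistic only requires the instruments to be valid under the null, which is why it remains usable when instruments are weak.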
"Semiparametric Count Data Modeling with an Application to Health Service Demand" (with Philipp Bach and Martin Spindler)
[ pdf ]
Econometrics and Statistics, forthcoming.
Heterogeneous effects are prevalent in many economic settings. As the functional form between outcomes and regressors is often unknown a priori, we propose
a semiparametric negative binomial count data model based on the local likelihood approach and generalized product kernels, and apply the estimator to
model demand for health care. The local likelihood framework allows us to leave the functional form of the conditional mean unspecified while still
exploiting basic assumptions in the count data literature (e.g., non-negativity). The generalized product kernels allow us to simultaneously model discrete
and continuous regressors, which reduces the curse of dimensionality and increases its applicability as many regressors in the demand for health care
Published and accepted papers
"Semiparametric Count Data Modeling with an Application to Health Service Demand" (with Philipp Bach and Martin Spindler), Econometrics and Statistics, forthcoming.
"Increasing the credibility of the twin birth instrument" (with Raphael Guber and Johan Vikström), Journal of Applied Econometrics, 2018, 33(3), 457-472.
"Heterogeneous effects of a nonlinear price schedule for outpatient care" (with Peter Ihle, Ingrid Schubert, Joachim Winter and Amelie Wuppermann), Health Economics, 2017, 26(10), 1234-1248.
"Testing under a special form of heteroscedasticity" (with Heinrich Kögel), Applied Economics Letters, 2017, 24(4), 264-268. [ code ]
"The many weak instrument problem and Mendelian randomization" (with Stephen Burgess, Neil Davies, Stephanie von Hinke Kessler Scholder, George Davey Smith and Frank Windmeijer), Statistics in Medicine, 2015, 34(3), 454-468. [ software ]
"Extensions of hurdle models for overdispersed count data", Health Economics, 2013, 22(11), 1398-1404. [ software ]
"Per-period co-payments and the demand for health care: Evidence from survey and claims data" (with Joachim Winter), Health Economics, 2013, 22(9), 1111-1123.
"GMM with many weak moment conditions: Replication and application of Newey and Windmeijer (2009)", Journal of Applied Econometrics, 2012, 27(2), 343-346. [ software ]
"Estimation of hurdle models for overdispersed count data", Stata Journal, 2011, 11(1), 82-94. [ software ]