
Hi Dr. Correa,

Thanks for writing. I've taken a quick look at your paper and its OSF page, and you all clearly know what you're doing and have some great work here. I am not your target audience (I am not an academic), but if I were peer reviewing this, I'd suggest one of two avenues.

1) As Angrist and Krueger put it, "In our view, good instruments often come from detailed knowledge of the economic mechanism and institutions determining the regressor of interest...progress comes from detailed institutional knowledge and the careful investigation and quantification of the forces at work in a particular setting." With this in mind, is it possible to contact the maintainer of Sci-Hub, email her your paper, and ask if she can think of any plausibly exogenous sources of variation in papers' availability on Sci-Hub? Some possibilities that come to mind for me:

* In my own experience, papers without DOIs are often harder to find on Sci-Hub. Are there closed-access journals that don't mint DOIs on which you could use propensity score matching to create something of an apples-to-apples comparison?

* Is there an identification strategy somewhere in journals with optional APCs for which only some of the papers are open access?

* Were there ever any Sci-Hub outages that lasted many months? If so, did citations of paywalled articles subsequently decline relative to their open-access peers?
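To make the outage idea concrete: it amounts to a difference-in-differences comparison, where the change in citations for paywalled papers around an outage is set against the change for open-access papers over the same window. Below is a minimal sketch of that arithmetic, not a proposed analysis. The function name, the group and period labels, and every number are made up for illustration, and a real version would need a regression with standard errors and a check that the two groups were trending in parallel beforehand.

```python
from statistics import mean


def diff_in_diff(rows):
    """Difference-in-differences estimate from (group, period, citations) rows.

    group is "paywalled" or "open"; period is "pre" or "post" relative to a
    hypothetical Sci-Hub outage. These labels are assumptions of this sketch.
    """
    cells = {}
    for group, period, citations in rows:
        cells.setdefault((group, period), []).append(citations)
    cell_means = {key: mean(values) for key, values in cells.items()}

    # Change over time within each group.
    paywalled_change = cell_means[("paywalled", "post")] - cell_means[("paywalled", "pre")]
    open_change = cell_means[("open", "post")] - cell_means[("open", "pre")]

    # The open-access change stands in for what would have happened to
    # paywalled papers without the outage (the parallel-trends assumption).
    return paywalled_change - open_change


# Entirely made-up citation counts, for illustration only.
toy_rows = [
    ("paywalled", "pre", 10), ("paywalled", "pre", 12),
    ("paywalled", "post", 8), ("paywalled", "post", 10),
    ("open", "pre", 11), ("open", "pre", 13),
    ("open", "post", 11), ("open", "post", 13),
]

estimate = diff_in_diff(toy_rows)
print(estimate)
```

A negative estimate would be consistent with paywalled papers losing citations when Sci-Hub was unavailable, but only to the extent that the outage timing was unrelated to anything else driving citations.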

2) If none of this is possible, I suggest removing all causal language from your article and sticking to a predictive model. Observational research is appropriate for exploring and identifying causal hypotheses rather than confirming them; no statistical technique that attempts to control for unobserved population heterogeneity will persuade me otherwise (I am admittedly an extremist on this point, but my views hew closely to those of Gerber, Green and Kaplan (2003): http://www.donaldgreen.com/wp-content/uploads/2015/09/Gerber...). The Lewbel (2012) paper you cite looks interesting but, to paraphrase Gerber et al., statistical techniques can't account for nonstatistical sources of uncertainty: in the absence of randomization or exogenous variation, it is inherently unknowable whether you've specified the 'correct' model, so we can't say anything about the bias of your estimation procedure.

Observational research is great! Generating hypotheses is as important as confirming them. It's just that some research designs license causal inference, and some do not.

Best of luck.


