The National Longitudinal Study of Adolescent Health, aka Add Health, has been in use for more than a decade. Thousands of researchers have used it. This is fantastic. There are great economies of scale in the data collection.
Sadly, we researchers have wasted years doing things that others have already done. Anyone beginning a new project must first clean their data. Add Health doesn’t require as much cleaning as some other, messier sources of data, thanks to people like Joyce Tabor, James Moody, Ken Frank, and many others. Still, I think research would be sped up quite a lot, and communication greatly enhanced, if people shared their code more widely. Therefore I’ve created my first GitHub code repository, which prepares the variables from the widely used in-school questionnaire portion of Add Health.
This will be of most use to people using R, but the prepared data could be exported for use in other software. The script also includes cross tabulations and fairly detailed comments, which I hope will help people think about the data. Sometime soon I’ll upload more code.
I recommend Jeremy Freese on reproducibility in sociological research here and here. Andy Abbott’s best objections don’t apply to a widely used data source like Add Health.
P.S. Do share links to other code repositories in the comments!