A big part of the open science paradigm (which I am trying to apply to my research) is making research as reproducible as possible – from data collection to publication and beyond. I’ve worked pretty hard over the last few years to become reasonably competent at scripting and programming, largely in the R programming language. The application of such a skill is useful for a number of reasons:

  1. Project review – Allows others to critique my work, find bugs, etc.
  2. Reproducible analysis – Provides all the steps, assumptions, etc. to generate the main results of a paper.
  3. Code repurposing – Can act as an explicit example of how to do a particular analysis.
  4. Personal evaluation – Acts as an archive of how my coding skills have improved over time.

To better hone my programming and data science skills, I have been working with the University of Wyoming’s Data Science Center, where other students and I get hands-on experience using high-performance computing, as well as other state-of-the-art data analysis and modeling tools.

I’m still a bit shy about sharing my code, but as I develop some upcoming projects I will likely share a number of them on GitHub, as well as in other long-term code and data repositories.

Code repositories

Below are some common code (and data) repositories used in the context of reproducible research.