Prior to this internship, my coding abilities were amateur at best. After this internship, I have not only become fluent in R, but also more confident in my software abilities in general. I have my mentors Davis Vaughan and Max Kuhn to thank for helping me build this agency, as well as the 2019 intern cohort and the #rstats community, but long story short, if you’re interested in learning how to contribute to open source, ship production-level code, or build software products, I would highly recommend applying. The training I had was exceptional.
The application is short, and the bulk of it lies in simply having a web presence. I sent my application on the last day of the submission window. Since I did not make it past the interviews last year, I was hesitant about applying, but the desire to intern was still there, so I applied again.
My website at the time was a jenky hand-coded HTML landing page with links to several Shiny apps I made for an introductory stats course. As there are many resources available to quickly get up and running with a beautiful, responsive, mobile-friendly website, please avoid making the same mistake I did. My favorite resource remains Alison Hill‘s summer of blogdown.
I heard back a week after applying and interviewed with Barret Schloerke and Davis Vaughan. The interviews took place on Zoom and lasted 30-45 minutes. They began with introductions and included 10-15 minutes of background questions, 15-20 minutes of technical questions, and a few minutes at the end for me to ask questions. I panicked a little with the technical portion as I hadn’t anticipated it, but I thought out loud and asked clarifying questions along the way. About a week later, I received an email with the great news.
Interning remotely is not for everyone, but I loved it: I could set my own schedule, so I traveled and saw family and was still productive. Working remotely, however, takes discipline and some getting used to: it’s not like you can turn around, tap your colleague on the shoulder, ask them a question, and get an immediate response.
In my first two weeks, my mentors answered every message I sent them within seconds, but past this initial period, a response might take minutes or hours. (The days where replies came hourly were rare; in fact, on those days, Davis and Max made it explicitly clear that they were either at a conference or in deep work, and would reply at a specific time.)
What I found most effective was asking easy questions in large team Slack channels (where I’d get an immediate response) and hard ones in private channels. Hard questions generally require someone reviewing my code base for a substantial amount of time. To be honest, knowing that I would have to wait to get feedback was great, because that gave me an excuse to either go out for a long coffee date or work on several side projects.
On the whole, my daily routine consisted of logging on around 7am, coding for 4 hours, hitting the gym, eating lunch, coding for another 3-4 hours, and if I had errands to run in the afternoon, I’d just make up those 1-2 hours after dinner. At the end of each work day, I’d log what I accomplished through an internal Achiever app. This not only helped me track my progress, but also let Davis and Max check in occasionally and provide any feedback or resources that could help. I was way more productive than I’ve ever been.
The big wins for this summer were (1) building a new R package ( butcher) that reduced the size of modeling objects saved to disk, and (2) getting this package onto CRAN. (If this tool might be of any interest, please check out this overview.) My key takeaway is that I wish I had learned package development earlier. It would have made so much of my research efforts (in grad school) more reproducible, easier to consume, and easier to extend. A lot of the scripts I wrote prior to this internship were single, long-form functions that I simply copy-and-pasted if there were any internal pieces I had to change. Had I developed my research scripts as a package, I would have to modularized my code in such a way that each subunit was designed to be reusable and testable, and thus much more robust.
I cannot stress the importance of unit testing enough. There is some debate around what the optimal level of testing is relative to each line of code, but the general consensus I received (at least internally) was 80%. And for anyone that’s never built a package before, or for anyone like me who still looks up everything up, check out Jenny Bryan’s whole game chapter.
It’s possible to get a skeleton package up and running, with version control, under an hour. Other important milestones, prior to submitting to CRAN, were:
- Setting up continuous integration
- Styling and documenting my code
- Creating an accompanying
pkgdown(complete with a hex sticker by Allison Horst)
Getting the package onto CRAN was incredibly straightforward: the package underwent 3 iterations before it was finally accepted, but the edits were legitimate and minimal.
My internship was unique (at least for me) because I wasn’t treated like an intern. My contributions mattered, and the relationships I developed with the interns and the employees at RStudio are irreplaceable. In fact, the interns and I still touch base every week or so, and whenever we might be going through something, or hitting any hurdles with our work, we’re there (at least virtually) for each other.