Gestalt Internship

Photo by Dinh Pham

Daniel Chen

Of all the projects listed for the 2019 internships, the one that stood out to me the most was Grader Enhancements, mainly because I knew how deceptively difficult the problem is.

It’s now named gradethis because grader made it to CRAN in my first week of the internship. The package tries to grade code in a learnr document, but the real magic of the package isn’t just reporting a correct or incorrect answer; it’s the ability to provide meaningful feedback to the learner. For example, if the solution to a question was sqrt(log(1)) and the student provided sqrt(log(2)), it would report the answer as “incorrect” but also return “I expected 1 where you wrote 2. Try it again; next time’s the charm!”.

If it weren’t for Hadley’s lobstr talk a year earlier, I wouldn’t have even known about ASTs (abstract syntax trees) as a way to approach the problem. I probably would have thought it involved writing the gnarliest regular expression pattern ever (I clearly don’t come from a computer science background).
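To give a feel for why ASTs help here, below is a minimal sketch in base R. It is not how gradethis is implemented; first_difference is a hypothetical helper that walks two quoted expressions in parallel and reports the first place they disagree. It assumes both expressions have the same overall shape and ignores edge cases such as empty arguments (for example x[1, ]).

```r
# Hypothetical helper (not part of gradethis): compare two quoted
# expressions node by node and describe the first mismatch.
first_difference <- function(user, solution) {
  # Collapse deparse() output so long expressions give a single string.
  show <- function(x) paste(deparse(x), collapse = " ")

  # If either side is a leaf (symbol or constant), compare directly.
  if (!is.call(user) || !is.call(solution)) {
    if (identical(user, solution)) {
      return(NULL)
    }
    return(sprintf("I expected %s where you wrote %s.",
                   show(solution), show(user)))
  }

  # Two calls with a different number of elements: report the whole call.
  if (length(user) != length(solution)) {
    return(sprintf("I expected %s where you wrote %s.",
                   show(solution), show(user)))
  }

  # Otherwise recurse into the function name and each argument in turn.
  for (i in seq_along(user)) {
    msg <- first_difference(user[[i]], solution[[i]])
    if (!is.null(msg)) {
      return(msg)
    }
  }
  NULL
}

first_difference(quote(sqrt(log(2))), quote(sqrt(log(1))))
```

Because the comparison happens on the parsed tree rather than on text, differences in whitespace or formatting never enter the picture, which is exactly what a regular expression would struggle with.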

Ever since I’ve been involved with the R (and Python) community, I’ve always been surprised by how much (free) material is out there for you to learn. Everyone in the community fosters a welcoming learning environment, and the internship was yet another way to level up my R skills. From a technical perspective, the most jarring part of the internship was transitioning from “a user of the R language” to a “developer”. I’ve taught workshops and written R packages, but nothing came close to the type of code I had to write for my internship.

Some of the major topics were:

  1. Using vapply instead of sapply: You might be okay in an analysis script with using sapply, but vapply guarantees the type of vector that is returned.
  2. Testing: This goes for package unit tests but also writing internal testing code to make sure the object you are working with is actually what you think it is. This was done with the testthat package for running unit test suites, and the checkmate package to write powerful assert statements to check objects.
  3. Non-standard evaluation: A good part of the internship was just reading and writing toy example code to get a sense of what NSE is in R. There’s a reading list of links in one of my earlier posts.
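The vapply point is easiest to see with a toy example. This is a small base R sketch (the variable names are mine, purely for illustration) showing two cases where sapply quietly changes behaviour and vapply does not.

```r
words <- c("a", "bb", "ccc")

# With ordinary input the two functions give the same integer vector.
sapply(words, nchar)
vapply(words, nchar, integer(1))

# With empty input, sapply has nothing to simplify and returns a list,
# while vapply still returns the integer vector it promised.
empty_s <- sapply(character(0), nchar)
empty_v <- vapply(character(0), nchar, integer(1))
is.list(empty_s)
is.integer(empty_v)

# vapply also fails loudly when the function returns an unexpected type.
wrong_type <- tryCatch(
  vapply(words, nchar, character(1)),
  error = function(e) e
)
inherits(wrong_type, "error")
```

In an analysis script you would probably notice the odd result right away; inside a package, the stricter contract means the surprise shows up as an error at the source rather than three functions later.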

To give a sense of how deep of a hole I needed to get myself out of just to understand what was going on, it took me about a month to fix the package and implement the first issue.

gradethis was refactored out of learnr, so the entry point into the grading functions was tightly coupled to what learnr passed into the grading function. If it weren’t for all the unit tests that were already there, refactoring the package and changing the API would’ve been impossible. What made unit tests essential to the work was that print statements and browser calls wouldn’t work as expected (if at all), because learnr is a very complex piece of software that requires knowledge of how knitr and shiny work. If I needed a browser statement, I’d actually have to replicate the error in a unit test and put the browser statement into the test itself. This wasn’t something that Jenny Bryan talks about.

I eventually got the hang of things and gave a small internal lightning talk showing it working. Along the way I polished up the API, and even found a bug in the == operator, which I reported to r-devel. You can also use the package as a standalone grader! One of my proudest accomplishments was putting together a function that standardizes function calls with all the default arguments. This way something like vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1), USE.NAMES = TRUE) and vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1)) are both marked as “correct”.
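The general idea behind standardizing a call can be sketched in a few lines of base R. This is not the gradethis implementation; standardise_call is a hypothetical helper that uses match.call to name every supplied argument and formals to fill in defaults the user left out. It assumes the function is a regular closure (primitives such as sqrt have no formals), appends filled-in defaults at the end of the call, and skips defaults that are NULL.

```r
# Hypothetical helper (not the gradethis function): name all supplied
# arguments and add any defaults the call did not spell out.
standardise_call <- function(expr, env = parent.frame()) {
  fn <- eval(expr[[1]], env)        # look up the function being called
  matched <- match.call(fn, expr)   # attach formal names to the arguments

  defaults <- as.list(formals(fn))
  # Formals without a default (and "...") deparse to an empty string.
  has_default <- nzchar(vapply(
    defaults,
    function(d) paste(deparse(d), collapse = ""),
    character(1)
  ))

  for (arg in names(defaults)[has_default]) {
    if (!(arg %in% names(matched))) {
      matched[[arg]] <- defaults[[arg]]
    }
  }
  matched
}

with_default <- quote(
  vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1),
         USE.NAMES = TRUE)
)
without_default <- quote(
  vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1))
)

a <- standardise_call(with_default)
b <- standardise_call(without_default)
identical(deparse(a), deparse(b))
```

Nothing is evaluated except the function name, so stringr does not need to be installed for the comparison to run; the two calls are compared purely as expressions.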

What made the internship such a wonderful experience was how many people I saw and spoke with, even though everyone is remote. The interns rotated every week to host a virtual coffee hour, so we would meet one another. The common theme of our discussions was usually along the lines of “I have no idea how I got here”, “everyone is amazing”, “we don’t want to go”. Needless to say, we all loved working there.

I had the chance to meet a few interns in person while at conferences over the summer, but it was great to meet the rest of the interns and RStudio employees at rstudio::conf.

But why “Gestalt Internship”? It wasn’t just an internship where I got to work on an R package. I had an entire R and Carpentries community behind me to get me there. I didn’t just learn how to be a better R programmer; I worked on an education tool that would benefit future learners in R. I’ve left the internship with a better understanding of topics that I can pay forward to the rest of the community. After being on the Education team, I’ve found my way to my own dissertation topic, and I’m now studying data science education in medicine. It’s more than just the “street cred” of being an intern at RStudio; I feel even more responsibility to take all that I’ve learned to help the community grow.

So if you ask, “what was it like interning at RStudio?”, it’s anything and everything you could possibly hope for.
