Of all the projects listed for the 2019 internships, the one that stood out to me the most was Grader Enhancements, mainly because I knew how deceptively difficult the problem is.
It’s now named `gradethis` because `grader` made it to CRAN in my first week of the internship.
The package tries to grade code in a `learnr` document, but the real magic of the package isn’t just reporting a correct or incorrect answer, it’s the ability to provide meaningful feedback to the learner.
For example, if the solution to a question was `sqrt(log(1))` and the student provided `sqrt(log(2))`, it would report the answer as “incorrect” but also return “I expected 1 where you wrote 2. Try it again; next time’s the charm!”.
If it weren’t for Hadley’s lobstr talk a year earlier, I wouldn’t have even known to use ASTs (abstract syntax trees) to approach the problem.
I probably would have thought it involved writing the gnarliest regular expression pattern ever (I clearly don’t come from a computer science background).
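As a rough illustration of why an AST helps here, the sketch below walks two unevaluated expressions side by side and reports the first place they differ. It uses only base R; `first_difference()` is a hypothetical helper written for this example and is not how `gradethis` is actually implemented.

```r
# Hypothetical helper (not gradethis code): walk two expressions in parallel
# and return the first pair of nodes that differ, or NULL if they match.
first_difference <- function(user, solution) {
  # A call behaves like a list: element 1 is the function, the rest are arguments.
  if (is.call(user) && is.call(solution) && length(user) == length(solution)) {
    for (i in seq_along(user)) {
      mismatch <- first_difference(user[[i]], solution[[i]])
      if (!is.null(mismatch)) return(mismatch)
    }
    return(NULL)
  }
  if (identical(user, solution)) return(NULL)
  list(expected = solution, found = user)
}

# quote() captures the code as a tree instead of running it.
solution <- quote(sqrt(log(1)))
user     <- quote(sqrt(log(2)))

mismatch <- first_difference(user, solution)
message_text <- sprintf(
  "I expected %s where you wrote %s.",
  deparse(mismatch$expected), deparse(mismatch$found)
)
message_text
```

Because the comparison happens on the parsed tree, differences in whitespace or formatting never enter the picture, which is the part a regular expression would struggle with.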
Ever since I’ve been involved with the R (and Python) community, I’ve always been surprised by how much (free) material is out there for you to learn. Everyone in the community fosters a welcoming learning environment, and the internship was yet another way to level up my R skills. From a technical perspective, the most jarring part of the internship was transitioning from “a user of the R language” to a “developer”. I’ve taught workshops and written R packages, but nothing came close to the type of code I had to write for my internship.
Some of the major topics were:
- Using `vapply` instead of `sapply`: You might be okay using `sapply` in an analysis script, but `vapply` guarantees the type of vector that is returned.
- Testing: This goes for package unit tests but also writing internal testing code to make sure the object you are working with is actually what you think it is. This was done with the `testthat` package for running unit test suites, and the `checkmate` package to write powerful assert statements to check objects.
- Non-standard evaluation: A good part of the internship was just reading and writing toy example code to get a sense of what NSE is in R. There’s a reading list of links on one of my earlier posts.
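To make the `vapply` point concrete, here is a small base R sketch. The inputs are made up for illustration; the behaviour shown is that `sapply` chooses its return type from the data it happens to receive, while `vapply` enforces the template you give it.

```r
# With empty input, sapply() falls back to a list ...
empty <- character(0)
sapply_result <- sapply(empty, nchar)

# ... while vapply() still returns the type named in its template.
vapply_result <- vapply(empty, nchar, integer(1))

class(sapply_result)
class(vapply_result)

# vapply() also fails loudly when a result does not fit the template:
# range() returns two values, but the template asks for one.
outcome <- tryCatch(
  vapply(list(1:2, 1:3), range, integer(1)),
  error = function(e) "error"
)
outcome
```

In package code, that early failure is usually preferable to a surprising type showing up several function calls later.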
To give a sense of how deep a hole I needed to get myself out of just to understand what was going on, it took me about a month to fix the package and resolve my first issue.
`gradethis` was refactored out of `learnr`, so the entry point into the grading functions was highly coupled to what `learnr` passed into the grading function.
If it weren’t for all the unit tests that were already there, refactoring the package and changing the API would’ve been impossible.
What made unit tests essential to the work was that `print` statements and `browser` calls wouldn’t work as expected (if at all) because `learnr` is a very complex piece of software that needs knowledge of how `knitr` and `shiny` work.
If I needed `browser` statements, I’d actually have to replicate the error in a unit test and put a `browser` statement into the actual test call.
This wasn’t something that Jenny Bryan talks about.
I eventually got the hang of things, and gave a small internal lightning talk of it working.
Along the way I polished up the API, and even found a bug in the `==` operator, which I reported to r-devel.
You can also use the package as a standalone grader!
One of my proudest accomplishments was putting together a function that standardizes function calls with all the default arguments.
This way something like `vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1), USE.NAMES = TRUE)` and `vapply(X = LETTERS[1:3], FUN = stringr::str_to_upper, character(1))` are marked as “correct”.
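The idea of standardizing a call can be sketched in base R with `match.call()` and `formals()`. This is a minimal illustration under stated assumptions, not the function that ships in `gradethis`: `standardize_call()` is a hypothetical name, it ignores anything passed through `...`, it will not work on primitive functions, and the example swaps `stringr::str_to_upper` for base R's `toupper` so it runs without extra packages.

```r
# Hypothetical helper: name every argument, then fill in unstated defaults.
standardize_call <- function(call, env = parent.frame()) {
  fn <- eval(call[[1]], env)        # look up the function being called
  matched <- match.call(fn, call)   # attach names to positional arguments
  defaults <- formals(fn)
  args <- as.list(matched)

  # Add each formal the caller left out, as long as it has a default value.
  for (name in setdiff(names(defaults), c(names(matched), "..."))) {
    # A formal without a default is stored as the empty symbol.
    no_default <- is.symbol(defaults[[name]]) &&
      !nzchar(as.character(defaults[[name]]))
    if (!no_default) args[name] <- list(defaults[[name]])
  }
  as.call(args)
}

with_default    <- quote(vapply(X = LETTERS[1:3], FUN = toupper, character(1), USE.NAMES = TRUE))
without_default <- quote(vapply(X = LETTERS[1:3], FUN = toupper, character(1)))

a <- standardize_call(with_default)
b <- standardize_call(without_default)
identical(a, b)
```

Once both calls are in the same fully named form, comparing them is a plain `identical()` check rather than a special case for every optional argument.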
What made the internship such a wonderful experience was how many people I saw and spoke with, even though everyone is remote. The interns rotated every week to host a virtual coffee hour, so we would meet one another. The common theme of our discussions was usually along the lines of “I have no idea how I got here”, “everyone is amazing”, “we don’t want to go”. Needless to say, we all loved working there.
I had the chance to meet a few interns in person while at conferences over the summer, but it was great to meet the rest of the interns and RStudio employees at rstudio::conf:
The @rstudio interns gather for a picture at #rstudioconf #rstats.
— Dⓐniel Chen @ rstudio::conf 🐍🏴 ☠️ (@chendaniely) January 31, 2020
AKA this is why parallel programming is hard. pic.twitter.com/QslxOjAPTV
But why “Gestalt Internship”? It wasn’t just an internship where I got to work on an R package. I had an entire R and Carpentries community behind me to get me there. I didn’t just learn how to be a better R programmer; I worked on an education tool that would benefit future learners in R. I’ve left the internship with a better understanding of topics that I can pay forward to the rest of the community. After being on the Education team, I’ve found my way to my own dissertation topic and am now studying data science education in medicine. It’s more than just the “street cred” of being an intern at RStudio; I feel even more responsibility to take all that I’ve learned to help the community grow.
So if you ask, “what was it like interning at RStudio?”, it’s anything and everything you could possibly hope for.