Bryan Shalloway | From summarizing projects to setting tags, uses of parsing R files

Transcript

This transcript was generated automatically and may contain errors.

Today, I'm going to be talking about some of the ways that parsing out the functions and packages in your files can be helpful for organizing and viewing projects. And I'll also be introducing a new package I wrote called funspotr that provides some helpers for parsing, analyzing, and organizing functions in different project structures.

Okay, so I hope I'm not the only one here who maybe follows some of their favorite R developers a little too closely. So for example, before going to sleep, rather than scrolling through Instagram, I'll just scroll through the recent GitHub activity of my favorite R developers. And then when a new blog post or new book comes out that I'm super excited about, it's like a major event for me, and I just want to jump straight in. And I would love to say that I'm just this sponge of information, and that I immediately know exactly what I'm doing, but usually it ends up looking a little bit more like this.

And you know, there's all those things coming at you, and you're trying to figure out how these things relate to each other. And I've come up with this kind of heuristic where I'll just start writing down the functions that I see as I'm going through the new book or blog or whatever it is. And I'm happy that I've seen other people in the R community who share this same pattern. Here's a tweet from Alex Cookson, where he says, "Any other rstats people find drob's TidyTuesday screencasts useful? I made a spreadsheet with timestamps for hundreds of specific tasks he does." And then another example from Jeff Rothschild, who made something similar for different developers in the tidymodels space, particularly Julia Silge and other authors there.

And I think that creating a reference table like this for yourself can be really helpful in terms of building a mental model for what's going on and noting down these different functions that you can go back to and check on, to give yourself a little bit of context for the materials that you're working through. One downside with doing this is it just takes a long time. And you also may miss some specific examples or packages as you're going through and creating these by hand. And I think that this is where something like funspotr can be useful, either in supplementing these materials or in creating an initial reference table for yourself.

So I'm going to start by just showing what a typical funspotr workflow looks like and what the output looks like. I'm going to give you a few seconds to kind of soak this in. And this is also another opportunity, if you find you can't read the code, to scoot closer up. There are plenty of seats in the front. So I'll just give you a few seconds to see if you can soak this in.

Okay, so what I'm doing here is I'm specifying the GitHub repo where Julia Silge's blog lives. And then I'm just pulling out all of the individual functions and packages that are used there. And then we have links to the relative paths within the repo as well as the URLs where these files actually exist. And then I think passing something like this into an HTML lookup table can give you a nice little reference table for reviewing whatever material you're in the process of going through. So here I'm just looking up the cases where she's used or written about random forests in the past. And this gives you a nice little way of looking this up.
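As a rough illustration of the idea described above, here is a minimal base R sketch that pulls function calls out of R code and builds a small lookup table with relative paths and URLs. It is not funspotr's implementation or API: `spot_calls()` is a hypothetical helper, the two "posts" are made-up in-memory snippets, and the repo URL is a placeholder. It relies only on `parse()` and `utils::getParseData()`.

```r
# Hypothetical helper (not part of funspotr): list the functions called in a
# string of R code by reading the parser's token table.
spot_calls <- function(code) {
  exprs <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(exprs)
  sort(unique(tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]))
}

# Made-up stand-ins for files in a blog repo, keyed by relative path.
posts <- c(
  "posts/random-forest.R" =
    "library(ranger)\nfit <- ranger(mpg ~ ., data = mtcars)\nsummary(fit)",
  "posts/plots.R" =
    "library(ggplot2)\nggplot(mtcars, aes(mpg, wt)) + geom_point()"
)

# Placeholder URL prefix (an assumption, not a real repo).
base_url <- "https://github.com/example-user/example-blog/blob/main/"

# One row per function per file, with the relative path and a full URL.
rows <- lapply(names(posts), function(path) {
  funs <- spot_calls(posts[[path]])
  data.frame(
    fun = funs,
    relative_path = rep(path, length(funs)),
    url = rep(paste0(base_url, path), length(funs))
  )
})
fun_table <- do.call(rbind, rows)

# Look up every file where a given function appears.
lookup <- fun_table[fun_table$fun == "ranger", ]
print(lookup)
```

Parsing only reads the code, so none of the packages mentioned in the snippets need to be installed. A table like `fun_table` could then be handed to whatever HTML table widget you prefer to make it searchable.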

Another place where this is useful is just in those cases where you kind of want to sift through some set of information. You may not even remember the function or package that you saw her write about. You just remember, okay, there was something that was useful that I think she did. And you just kind of want to look through a reference sheet. And I think that's also where these types of tables can be really helpful and provide nice little supplements to whatever other method you use for searching information or helping you kind of in your learning process.

And then if you actually go and look at what the tag section of my website looks like, all my posts are organized based on whatever packages are used within them, which I think is a nice, clean, coherent way of setting things up.
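To make the tagging idea concrete, here is a small base R sketch that derives a set of tags for each post from the packages its code uses. This is an assumption about how such tags could be computed, not a description of how the website or funspotr actually does it: `spot_pkgs_sketch()` is a hypothetical helper, and the posts are made-up snippets. It counts packages attached at the top level with `library()` or `require()`, plus packages referenced as `pkg::fun`.

```r
# Hypothetical helper: packages a script attaches or references explicitly.
spot_pkgs_sketch <- function(code) {
  exprs <- parse(text = code, keep.source = TRUE)

  # Top-level library(pkg) / require(pkg) calls.
  is_attach <- function(e) {
    is.call(e) && length(e) >= 2 &&
      (identical(e[[1]], quote(library)) || identical(e[[1]], quote(require)))
  }
  attached <- vapply(
    Filter(is_attach, as.list(exprs)),
    function(e) as.character(e[[2]]),
    character(1)
  )

  # Packages used through the pkg::fun syntax.
  tokens <- utils::getParseData(exprs)
  namespaced <- tokens$text[tokens$token == "SYMBOL_PACKAGE"]

  sort(unique(c(attached, namespaced)))
}

# Made-up posts; the code is only parsed, never run.
posts <- list(
  "tidy-forests.R" = "library(tidymodels)\nlibrary(ranger)\nrand_forest()",
  "quick-plot.R" = "library(ggplot2)\nqplot(1:3, 1:3)\ndplyr::glimpse(mtcars)"
)

# One character vector of tags per post.
tags <- lapply(posts, spot_pkgs_sketch)

for (post in names(tags)) {
  cat(post, "-> tags:", paste(tags[[post]], collapse = ", "), "\n")
}
```

A sketch like this misses packages loaded in less direct ways (for example inside functions or through meta-packages), so treat the result as a starting point for tags rather than a complete list.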