Benjamin Arancibia - The Expanse - Navigating the R Package Universe

Oct 31, 2024
3:53

Transcript

This transcript was generated automatically and may contain errors.

All right. Hi everyone. My name is Ben Arancibia and I'm Director of Data Science at GSK. And there I specialize in enabling the use of R by building tools and working with study teams to integrate R into their workflows. I have to say it, this is my opinion, it's not my company's.

All right. So, The Expanse. It's a cult classic sci-fi show and a series of books. And it's about how people communicate across vast areas of space to achieve goals. And I think a lot about this with my role in building tools. Communication, similar to in space, is really hard. And this might be true for your teams, it might be true for many teams. But I think it's especially true for my team, where we're trying to blend individuals who have a PhD in statistics and R package developers, who are very, very different and have very different skill sets.

Communication challenges across skill sets

So, when I was working with PhD-level statisticians on an R package called BEAST, which focuses on Bayesian dynamic borrowing, we often used the same words with different meanings. For example, they asked for executable code. So, I thought, oh, an .exe file. No, what they meant was just code that you can run interactively. Another example is this statement about covariates: "We need to take into account covariate adjustments for Bayesian dynamic borrowing." I have a background in stats, not at a PhD level, so I thought I knew what they meant. And without knowing it at the time, I had absolutely no idea.

And we were talking about two different definitions and understandings of what covariate adjustment means. So, later on, we talked about these two different things, what we each meant, and we had a really good culture of collaboration. So, when these things pop up, you often have people actually looking and asking things like this.

Establishing team values

So, how do you avoid this? Setting up your team values at the beginning to really have a focus on that culture of collaboration is crucial. So, setting up values. That sounds really easy, doesn't it? Yeah, it's not. Unfortunately, it's really, really difficult. So, it's hard. What do you do? What are your values? What's the right balance of too many or too few? What should they be? And these are like really tough questions that need to be discussed and decided upon so you can use them when you have those tough questions or you have those kind of obstacles pop up.

Here are the five values that we established at the beginning of the BEAST package. My favorite is radical transparency. Basically, it's a way to be constructive about how to answer tough questions. And it's how you make the best data products, in my opinion. How we use these values is we would often reference them in conversations. So, like, "In the spirit of radical transparency, I think we should take a different approach." Or, "In the spirit of trust, you make the decision, since it's your area of expertise." And that's really crucial, especially when you're trying to blend the skill sets of PhD statisticians and R package developers.

So, once you figure out the values as a team, you can go from asking things like, what? To this.

And that's it. That's what I wanted to talk about, values. So, if you'd like to know more, come talk to me. Shoot me a message on LinkedIn. Or, if you're really into Bayesian dynamic borrowing, which I realize is very niche, check out BEAST, which is open source. That's the package I referenced. It's where we try to implement these values. And I'm really proud of this data product that we're able to let everyone use. So, thank you.