James Blair: Part 1 — Portfolios, practice, and staying curious
Transcript
This transcript was generated automatically and may contain errors.
Welcome to The Test Set. Here we talk with some of the brightest thinkers and tinkerers in statistical analysis, scientific computing, and machine learning, digging into what makes them tick, plus the insights, experiments, and OMG moments that shape the field.
Alright, James, so glad to have you on The Test Set. Thanks for joining us. I'm here with Hadley Wickham and Wes McKinney, and we're so excited to talk to you, I think, all about your journey into data science. For a little bit of background, you're a Senior Product Manager at Posit on cloud integrations, is that right?
Cycling, 3D printing, and hobbies
My bike enthusiasm? Yeah, I don't know if it's a healthy obsession or an unhealthy one, but I got into cycling probably 10 years ago. I was commuting to work by bike, and then I went remote and had just bought a new bike and was like, now what do I do with this thing that I just bought? And so I started riding for fun and found that the more that I rode, the more that I enjoyed it. And it became this, you know, Hadley calls, Hadley talks a lot about the pit of success. This became a pit of like something, a lot of time, and a lot of money and a lot of carbohydrates. But I love doing it. It's something that I kind of like grounds me, I guess, right? There's a good community with cycling. I'm part of a team that races. So during the summer, I have races that I go to and, and it's just like, it's fun to have something that keeps me active that I enjoy doing besides just, I don't know, trying to be motivated to be active.
It's funny, I have a road bike, and then I have a mountain bike that I've had for a couple of years. And I feel like a lot of people that I at least interact with kind of grew up mountain biking, which is its own whole discipline, right? Like, I consider myself a decent road cyclist, to some degree. Mountain biking, I'm totally a fish out of water even still. And so I have fallen in many pits on the mountain bike and have a few of the scars to prove it. But it's fun to kind of mix and match. And I live in a neat part of the country. I'm in Utah. So I have like 50 miles of mountain bike trails within a quarter mile of my house that I can get on to. So it's easy and convenient for me to get out and ride outside and test myself and try new things and hopefully not crash too terribly.
Ribbon? Yeah, so I was trying to figure out how to frame this, right? I was like, I can put the bike on the wall. That's interesting. That's all printer filament. So I have also become oddly obsessed with 3D printing, to some degree. And so that's all printer filament that used to be in these cupboards, and then they overran it. And so I, incidentally, 3D printed these little holders that hold them on top of the cupboards in these nice little organized ways. Two of those doors are total cycling stuff. So all my cycling gear is in there. And then two of them are printer stuff. So just other things related to printing and taking care of and building stuff from 3D. And a lot of stuff that I've purchased in this kind of hobby and have yet to figure out how to use. I'm a chronic accessorizer, which does not combine well with cycling at all, because cycling has an exorbitant amount of things you can buy. So it's not a good thing. But I tend to get really excited about something and then I'm like, I need all this stuff. And then I have all this stuff and I'm like, I should someday learn how to use all this stuff. So I'm slowly working my way through it.
I have. Yeah. So nothing of consequence, right? 3D printing, I think, is fascinating, but I'm not going to trust any critical structural component to something I've printed. But there's a little inset decal that goes on the axle. And whenever I take my wheel off to put my bike in the car, sometimes I over-tighten the axle, and it pushes it out. And so I lost it a while ago. So I printed a new one. And because I thought it'd be fun, and because it's cycling and everything's extra, I printed it in carbon fiber filament. It doesn't make a difference. It's a tiny little piece that weighs, I don't even know, a fraction of a fraction of anything. But I also name my bikes, and I like superheroes. So my road bike is Black Panther. And so it's the Black Panther logo in this little decal that I have on the front axle of my bike, which I thought was kind of a fun touch.
From robotics dreams to data science
All right. I love when we can get into the intersection of the hobby Venn diagram, you know, explore that intersectional space. I think I saw you mention that early on you were really thinking about going into robotics, and then you ended up aiming at journalism and then data science. I'm so curious, maybe you could recap for folks that path into data science. Because one thing that struck me was the move from robotics to journalism is so intriguing.
I'll try to be brief to some extent, but I, from a young age, I really thought robotics was really interesting. I had this like long standing dream of being this robotics engineer. I think as young as like six or something, like there's old videotapes of me being like, I want to go to MIT and study robotics. And that was kind of the dream that I had for a long time. And then there wasn't any sort of thing that like pushed me away from that necessarily. But in high school I ended up, somehow I ended up on the school paper. I don't, I don't remember how this happened, but I had a fantastic journalism teacher and I was, I did high school. The first part of high school was in Alabama. I had this fantastic teacher who just really embodied journalism, like the pursuit of truth and what it meant to report and to report on news and how to identify things that were newsworthy and ask good questions and all these things. And I just, I really loved it. And so that kind of pushed me into this direction of maybe this is something I want to pursue as a career. And I found that I really enjoyed meeting with people, interviewing people, thinking about how to tell a story that was compelling. And I still to this day, remember like headlines of some of the articles that I wrote in high school, right? Like they just, it was like, it was this deeply meaningful process of finding something interesting and then just exploring it to the very bottom and then telling that story that I found really fascinating.
So did that for a while. That led me to think like maybe journalism was a career choice for me. I did an internship with a local newspaper after I moved to Utah. So halfway through high school moved to Utah, did an internship with a local paper. That was really great, like strong advocate for internships solely because I think it helps you kind of figure out, do you really want to do this? And in my case, it helped me figure out like, I don't think I want to do this as much as it is, as much fun as it is to kind of like really investigate things. There's also a lot, particularly on the journalism side where it was like, you kind of have to work your way. You're not always, I think it's very few people that find themselves in a position where they're like really delivering kind of significant investigative reporting on a regular basis. Whereas like I was writing a bunch of reviews of local events that have happened, which was fine. And I like, it helped me understand the industry and kind of helped me understand things a little bit better. So pivoted away from that, I was really into, and this feeds into the 3D printing thing a little bit too.
I was really into like 3D graphics and design and that kind of world for a while too. And so when I went and started college, I did my undergraduate at BYU. They have a really fantastic animation program. And so I had decided going into it that I wanted to pursue animation. That felt like an interesting place. Pixar seemed like this really exciting place to potentially work. And this whole industry just seemed really intriguing to me. Because of BYU's program and the reputation they have, it's a very, very competitive program. So the first year that I was there was a bunch of kind of prereqs before you formally apply to the program. And all of that was nothing to do with computers. It was all like hand-drawn figure drawings, rudimentary two-dimensional animation, which was really fascinating. Not anything I'd really focused on before. And also taught me that I'm like, I'm a pretty terrible artist. Like I just was not good at it. And I think like anything, you can develop those skills, you can develop those. But I just like, I was with people that were so gifted. And I would look at their sketches and I'd look at what they were doing. And I was like, I am so clearly not on the same, I'm not even in the same like city that they're in, let alone like the same ballpark, right? And so that kind of made me have this little crisis of, okay, well now what? Like what am I going to do instead?
And I decided that I really loved teaching and the process of helping people through this process of discovery that is learning. And so I was trying to figure out if there was something there. I ended up deciding to pursue psychology. And the goal was to do instructional design. I was like, it could be really cool to build curriculums for all kinds of things. One of the things that was encouraged amongst psychology students was to have a statistics minor to bolster your opportunities for grad school. So that's what I did. I was like, cool, sign me up for statistics, which I had never really put any thought or emphasis on before. And suddenly there's this whole new field that I just loved. And I realized that the only thing about psychology that I found really engaging for me was analysis. Like, now that I've done the study or now that I've reached some results, I want to get in and look at what I've learned. What are the relationships? What did we learn? What did we not learn? Those kinds of things. And so I ended up flipping the two. I changed my major to be statistics, my minor to be psychology. And then that introduced me really early on in the statistics program to R as a programming language. And for me, when I look back on my whole career, that was the one point that really is an inflection point, where all of a sudden I fell into something that I just absolutely loved. I'd never programmed before. I think I'd taken maybe one C++ class on a whim and I hated it, absolutely hated it. And all of a sudden there was this language that, to me, made a ton of sense.
And particularly from a statistical standpoint, it gave me the ability to explore statistics in a very hands-on way that was fundamentally different from the mathematical proofs and concepts that we would discuss in lectures, which are undeniably important. But for me, I just really struggled to engage with that kind of learning. And then all of a sudden with programming, I was like, oh my goodness, I can see what I'm learning about in real time. And I have this total environment where I can experiment with things and I can build my own functions and my own simulations and all this stuff.
So that was this huge turning point for me. And then I graduated and as I was approaching graduation, it was right at the height of, this is 2016, so it's kind of right at the height of data science as a discipline where people were talking a lot about it. Programs were being developed around it. There was a lot of enthusiasm for data science, which in my mind was just like statistics by a different name. And so I applied to a data science graduate program and was accepted and did that. And that's kind of put me on this path of data science, programming, R specifically as a starting point into all of that. And yeah, so very like winding journey, but one that I wouldn't trade because I've learned a lot through the whole process.
The Shiny app that changed everything
Yeah, it's so cool. I mean, it makes me think about, I did something similar with psychology and stats and that pipeline of like, I do remember like our stats classes at the time were all in SPSS. So like a very point and click. And what you said about R really resonates with me where you're really like hands-on and you're able to like explore and kind of like get in there. But also I feel like there's something magical too about going from that to like now being able to put up a website is such a kind of interesting path.
Yeah, I think one of the other key pieces here was, while I was still in undergrad, I did an internship with a really small bioinformatics startup. I found them through a career fair and I was like, I'm studying statistics. And the guy that I was talking to was like, we might be able to use that. Cool. Why don't you come see what you can do with us? So I showed up, and they really didn't have anything they wanted me to do. They just thought I could be useful, which for some people I think would be kind of the worst case scenario. Like, please tell me what to do. For me, I've always been fairly autonomous. And so I showed up and was like, okay. And they were like, do stats for us. You know, we need some stats. I was like, cool. And what ended up happening was I took this opportunity to learn Shiny. So I spent every day of this internship for a couple of weeks just diving into the Shiny documentation. I'd heard about it before, but I'd never done anything. But I was like, you know what? I think I might be able to use this Shiny package and build something that could be of use to this company. So I dove into the Shiny documentation and within the first month or six weeks that I was there had put together this Shiny dashboard that pulled all of their Google Analytics from their platform and put it all in this central place where they could monitor traffic and engagement with the tools that they were building. And everybody was blown away. And I felt so empowered, right? I was like, oh my, look at what R can do. Look what I can do. This is so great. I can make things that are useful using this kind of quirky little statistical programming language.
The value of a master's degree in data science
Yeah. Yeah. It's so cool. I, one thing I'm curious about is I know when I talk to people about data science and sort of data science education, I feel like people often ask like the value of a master's or some of these like advanced education. Like, do you have advice for people who are thinking about a master's or what to do after say like undergrad?
So I really enjoyed my, the program that I went through. It was, I was part of the first cohort through the program. So they were kind of figuring things out as they went. In fact, it was a master's in analytics when I started. And then I graduated with a master's in data science. At some point, they were like, you know what, if we call this data science, we could make it cooler. So that's what they, which was great. I think that's such a hard question, especially in today's world, right? The way that the world is evolving, we live in this world of like AI and what that means both today and in the future. For me, like, I, I still think there's a lot of value there. Part of that value came from the fact that I used this program as a chance to really try to like stretch myself to the rest of the data science ecosystem that I really hadn't had time to dive into previously. So a good concrete example of this is I had done everything in R, right? Like I knew that Python existed as a language that was kind of it. And so when I started this program, they told us upfront, they said, Hey, all of our content is structured so that you can kind of choose how you want to do things. You can use R, you can use Python. And I made a conscious decision to do all my schoolwork in Python, just as an effort to kind of like, I knew the concepts and I knew that if I really got stuck, I could like prove something out in R and then try to work myself backwards from that to figure out the equivalent on the Python side. And so for me, that exercise, doing that with peers, being in like being in a classroom setting was, was really helpful and useful.
It was also like my experience was a bit unique because the program I did was out of San Francisco, but instead of like moving to the Bay Area, I just lived in Utah and plane commuted to grad school for two years, which like honestly, financially made a ton of sense and, and was kind of a nice way to do things. Like I had a young family at the time. And so instead of trying to figure out how we could make it work in the Bay, we just lived where we had been living. And I took every other Saturday and flew to San Francisco for the day. So it was like, it was a busy couple of years and I racked up a ton of Alaska miles, but, but it was fun. Like I, I, I enjoyed it. And for, again, to answer the question for me, I find a ton of value in it. I think partially from the curriculum and the instruction, but also just the experience of like being with peers at that particular point in time, trying to figure out like what, what is data science to some extent. And I think maybe we have a better answer to that question today, although it still is somewhat ambiguous, I think, particularly as we consider the landscape of AI and everything else. But, but I, I benefited and continue to benefit from the experience. So for me, it was well worth, well worth the time and investment.
Yeah. And I tell them like, I don't know, but I will say, I think the, like the average masters like across disciplines is like generally like a pretty shitty quality. Like that's just such a good way for universities to make more money. So I think they've like good masters programs do exist, but you really have to hunt for them. And, you know, you should be able to, especially for like a data science masters, you should be able to ask them for their data on like what happens to their graduates. Like, I think you need to think of it as an investment. You're going to pay like X dollars to get this thing. Like, what are you going to get out of it? Like, how does it affect your, your job chances?
I've heard similar things. I know at a lot of universities, their data science and statistics master's programs have become big sources of income and tuition for the programs, because typically the master's students pay for their master's and the PhD students are paid for out of grants and research funds. And so essentially the PhD students cost money, the master's students make money. But my advice generally is, it depends. I think if the program has a good track record of helping students get jobs and transition successfully into the professional world, that's an important data point. It can be helpful for people who are making some type of a career or education segue. Like maybe they started out in a field where they didn't have a lot of statistical training or data science training, and they're looking to segue into being more of a data scientist. And maybe on paper, with their education and their work experience, they don't have the right kind of resume to make themselves appealing to employers looking for data science candidates. Then having that type of program on your resume can be a big benefit. But of course, in 2025, the big question is whether that's all going to be going out the window with the emergence of AI.
And certainly I think data science is one of the areas of technical work that requires the most human judgment, that's the most nuanced, and is perhaps the least well-suited to automation and humans being replaced by AI agents. That doesn't mean that a lot of startups and companies aren't going to try, but I think compared with a lot of the areas where people are writing code and doing data analysis, data science is one of the areas where humans are the most essential to being in the loop. And so I speculate there will be a lot fewer entry-level data science positions than there were in the past, just like there will be fewer entry-level software engineering roles and things across the board, but there's still a need. So I'm sure that AI will have a disruptive effect on the education industry. And so it will be interesting to see what things look like when the dust settles. And I don't know whether it's three years from now or five years from now or 10 years from now. Andrej Karpathy thinks we're 10 years away from artificial general intelligence. So I don't know what happens when we reach that point, but it's definitely going to be interesting.
Yeah. Maybe to recap some of what people said, it sounds like, James, you're saying it was really useful to have two years or some stretch of time to really focus on learning data science and have classmates and a cohort to work with. And it sounds like you're saying there's a need to be discerning about master's programs, since there's a sort of strategic component to them in universities and things. I have to say, combined with what Wes was saying, choosing to invest $50,000, or whatever a master's costs these days, into data science right now does feel like a high-risk kind of bet. And I'm kind of, I don't know, 70% joking when I say this, but learning how to weld is probably much cheaper than $50,000, or doing something physical. I'm mostly saying that programmers and data scientists are going to be around, but it doesn't hurt to diversify your options and make sure you can do something in the real world too. Like, James will be fine because he can 3D print things.
I was just going to say, I think part of the benefit for me was the timing of it all, right? I get the world today is so different than it was 10 years ago. And, and part of it for me was just that it, it kind of, especially young in my career helped legitimize the aspirations that I was aiming for. And, and I look at, I look at kind of where my career has evolved since that point, since finishing up that master's program, like I, I finished, I graduated from that program, spent a little time in California, touring around and then started at RStudio right away. And so RStudio was the first job that I had after I concluded that program. And I think part of that was what kind of set me up to, to come into the role that I came into at RStudio and then to evolve from there. But again, like the landscape now versus then is like, it's foreign today. It really is. It's a, it's a totally different, it's a totally different world.
Curiosity, portfolios, and staying hands-on
I think there's like, I think what we've seen, or at least what I've seen is there's, there's formal education, which I do think plays a role. I think, I think there's reasons to be cautious and very intentional about the pursuit of that. But I also think that there's a lot of, there's, there's continual value in sort of your own hands-on experience. People talk a lot about like the value of a portfolio, which you could debate, like how good is it to have an impressive GitHub to show somebody? And I don't, I don't think that's really the point though. I think the point is, at least from my perspective, you learn a lot through your own practice and like your own implementation of things. And so even if nothing else, even if like you're not showing these things off, I think just putting your hands, like putting yourself to work on interesting projects, exploring data, learning how to use tools that exist today, you know, in today's world, learning how to use AI to Wes's point to kind of supplement, but not replace the work of a data scientist is extremely useful and can, and can easily be accomplished outside the constraints of a formal educational program. And I, and I think there's a ton of value in sort of that exercise of like, let's just practice things, right? Like find an interesting problem, find some data about it, look at that data, figure out what the tools in the market today let you do with that data. And then critically as the data scientist or, or aspirational data scientists, ask the hard questions about that data and then figuring out how to find answers that are reproducible and, and can be defensible, right? Like that, to me, that's the critical piece here is we live in a world of so much noise that part of our job as data literate individuals, whether you want to call us data scientists or statisticians, whoever, right? Like whatever label you put on it, at the end of the day, the job is what, what is true? Like, what is this data actually saying? 
And why are we convinced that that's the case? And how can we use X, Y, or Z tool, R, Python, this package, that package, whatever, how can we use that to defend our position, to defend whatever decision we've made based on what we've learned?
And to me, like the thing that's even more true today is like, we just have this human tendency to like find data that confirms our like prior beliefs. And like, regardless of whether that's looking at data or like working with AI, like that's something you've just got to be so aware of and kind of constantly like fighting against. So to me, like that's at the core of like good data science is like really kind of pushing against that tendency to like just stop at the answer that you want to see and really think like, does this actually make sense? Like, is there some other reason I could be seeing this, this, this pattern? Like, I think that's such a great contrast point to the way that AI operates today, right? Like AI just wants to be agreeable and wants to pursue the right answer. And so this sort of idea of critical thinking and being able and willing to take a step back and kind of flip things on their head and say, but what about, or what if, or wait a minute, what if it was, you know, what if we looked at it this way instead, or what if the question was this? Like that's, that to me is what makes us so critical in this world. And maybe us is not the right, that's what makes like curious individuals so critical in this world.
And that's why, when I talk to people who are interested in the field or trying to figure out what data science is about, and I'm not the only person that holds this belief, I think curiosity is such a fundamental requirement, and in today's AI world, I think you have to kind of hold on to curiosity. It's something you have to intentionally just say, I'm going to remain curious, because otherwise it's really easy to open up ChatGPT or whatever tool you want and just be carried away by the vibes. But you can kind of lose yourself in that. And that's dangerous. And so I think you have to have this willingness to say, I'm going to remain engaged. And part of that engagement is I'm going to remain fiercely curious about whatever I'm doing.
And a big part of that for me is working on problems that are interesting to you, right? Like working on, working with data that you find personally engaging for X, Y, or Z reason. I spend a lot of time looking at cycling data. Like I like cycling. And so for me, that data is meaningful. Like, is it earth shattering? No. Am I solving critical problems? No. But for me, I know what questions to ask. I know how, like, I understand the meaning behind the numbers because I'm on a bike every day. And so I know what X, Y, or Z looks like and feels like. And so there's like this connection to what I'm doing that fundamentally helps me become more curious and like, ask deeper questions.
Yeah, that really resonates. I, I do feel like too, in a lot of situations where I've seen data scientists really work super well is where there are like numbers coming in, but they also have tons of experience, like with the domain at that point, or a lot of the kind of like flow of data through the company and how it taps into reality. So they can kind of like connect it back. Like, like to you being on a bike and then looking at bike data, I feel like a lot of data scientists have this sort of, they've like been in a room with the like stakeholders and people who need to make decisions. And so they can kind of like interpret the data well, or know which data to kind of like focus on more or less than.
Stay tuned for part two of our conversation with James Blair on the next episode of The Test Set. The Test Set is a production of Posit PBC, an open source and enterprise tooling data science software company. This episode was produced in collaboration with creative studio Agi. For more episodes, visit thetestset.co or find us on your favorite podcast platform.
