Adam Austin @ The Hartford | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Well, thank you all so much for joining. Welcome to the Data Science Hangout. My name is Isabella Velazquez. I'm stepping in for Rachel Dempsey as she shreds down some mountains in Europe. So if you are joining for the first time, the Data Science Hangout is an open space for the data science community to connect and ask questions about leadership, learn about what's happening in different industries, and discover just generally what is happening in the field of data. So these Hangouts are recorded and they're posted to the Posit YouTube channel, so you could re-watch past Hangouts whenever you would like.
And together we're dedicated to creating a welcoming environment for everyone, regardless of your background or your experience. So really we'd love to hear from everybody. There are three ways you can jump in, provide your perspective, or ask questions. You can raise your hand on Zoom and I will call on you. You could also put a question in the chat and I will ask you to read it out loud. And if you would prefer that I read it out loud, you could put an asterisk next to it and I will read it. Third, we have a Slido link where you can ask questions anonymously and a great team of moderators checking those out and making sure that we funnel them to our featured speaker. So if you'd like to connect with folks after the Hangout, there is also a Data Science Hangout LinkedIn group. And so with that, I'm so excited to pass it to our featured guest today, Adam Austin, Director of Data Science at the Hartford. Adam, I'd love to have you introduce yourself and share a bit about you and your role and also how you like to have fun in your free time.
Sure, yeah, thanks for having me, everybody. So nice to meet you all. My name is Adam Austin. I'm repping RStudio today, so vintage t-shirt. Yeah, so I'm a Director of Data Science at the Hartford, which is an insurance company based out of Hartford, Connecticut. And their primary business is a lot of commercial lines insurance, so both middle and large and also small lines of business. But also, we do have a personal lines division as well. And that's where I sit. So I actually help build some of the pricing models that go into our rating plans on the personal lines home side of things. So today, I'm actually not here representing the Hartford, so I'm going to be giving some of my own thoughts and perspectives on data science and industry in general, but happy to chat about insurance sort of from a high level. So I do have some hot takes coming your way, so your mileage may vary on some of that. But yeah, I'll be kind of speaking from my own experience, mostly sort of like in working in large multifaceted businesses.
I'd be interested, you know, as we chat and hearing some of your alternative experiences, maybe you have some pushback on some of the things I'm saying, so chime in. Always happy to wrestle with new ideas. My other professional interests include things like data visualization, package development, language interoperability. I love the whole like R versus Python thing. So I know that's like stoking the flame wars, and I shouldn't do that, but it's fun to think through for me. And we did say happy Leap Day. It's the 24th anniversary of the release of R 1.0 today. So apparently it was released on Leap Day 24 years ago. So big day for me, so hence the RStudio theme. My personal life, what do I like to do outside of work? I have four kids. I have one on the way, and that keeps me very, very busy. But I do, if I have some free time, like to get out and ride my bike a little bit in the warmer seasons. In COVID, my new hobby became craft coffee, so I accumulated an unreasonable number of coffee brewers. So feel free to ask about that.
Career path and the importance of relationships
Thank you so much, Adam. Adam, I'd love to hear, like, can you share a bit about your career path? And in particular, what advice you would give someone who's looking to transition from different kinds of fields or industries? Yeah, for sure. That's a very broad question. Sort of the themes that I've been thinking a lot about in terms of data science, and I call this out sort of in my bio, is that data science is really all about relationship building.
So I was asked to give a guest lecture a couple years ago for students in an intro to data science course at UC San Diego. And so I felt a little bit, you know, kind of humbled by the invitation, and I thought, well, what would I say to the young kids coming into the field, right? Like, what are some of the things they might not learn about data science in their coursework, in the curriculum? And I thought, well, I really feel like the things that I had to learn on the job that were hardest to grasp were all about just what it means to have business relationships and sort of where data science actually fits in to the bigger picture.
And I guess I like to start with sort of like this historical view of what data science is to help me think through this, too. So I think it was the Harvard Business Review in about 2012 was when they came out and said data science is the sexiest job of the 21st century. You probably all heard that said before, right? If you've worked in data science, you're kind of thinking, why would you ever say that? Because I think the reality was a little bit different. So I think when data science sort of rose up, I think the term was coined in like 2008 or something. And as it was on the rise, we really didn't have ML Ops. We really didn't have much in the way of DevOps. I think data engineering itself was just as new of a field. And companies really kind of jumped on the DS hype train really hard after that article was published.
And organizations take a long time to mature analytically. And so I think some companies, I would say a lot of companies are there now, like our big tech companies. But a lot of companies are still going through this maturation. And especially in 2012, 2014, when I came on the scene, it was a time when companies were still figuring out that whole process. But when this article was published, that kind of opened the floodgates for all these newly minted master's students like myself in stats to come into these companies, these newly forming departments, trying to make sense of what data they had available and kind of what to do with that.
There were, I think, at the time when data science was sort of centralizing in a lot of these companies, you had a lot of embedded analysts that were doing what we would now call data science as part of their job. It just wasn't seen that way, or I guess it just wasn't called that with a label. And so as these specialized roles and departments were springing up, everybody was kind of trying to grapple with what data science meant. And I think that kind of like set the stage for a lot of interesting corporate dynamics where data science was trying to figure out what it was going to do in this business. And business people were trying to figure out who these new people were and what they were up to. And I think that led to a lot of sort of misunderstandings and a lot of confusion.
And I think it has taken me a while to come to understand, come to terms with this idea of data science as being a tool for business. It's one business function; it's not that every business is now going to be immediately data-driven, right? So, you know, I think about what advice would I give? Now I'll finally get to your question. What advice would I give somebody? It's really about, you know, understanding what the business needs are. Don't build something before you're ready to understand where it's going to go and how it's going to get used. And I think that's sort of the crux of where I've made some of my mistakes.
I remember after I left my first job, I was working for a vehicle manufacturer, and I was there for a couple years. I think we did some good work, but as I left the job, I was sort of reflecting on, you know, some of the things that didn't work out. I think I was surprised by a lot of the projects that failed, just how many there were, I guess, things where we tried to implement something and it just fell apart. And I actually took notes on some of this stuff because it was kind of amusing to me. It was a series of model names that I had tried to build, and I had written notes like, this model worked, but the executives didn't understand it. Or I couldn't build the right business case around this, and the data was bad. Or the business stakeholder just stopped responding to my emails. Or the partner support fell through, and the project couldn't reach maturity. It was all just kind of notes like this, which was really interesting.
And then as I went back and I looked at the things that did work, it was all about really simple answers to questions that my business partners were asking. So, you know, they needed to explore, for example, they needed to explore what are all the possible reasons we're seeing this one failure in this component of a truck. And they had all these hypotheses, and all they really needed to see at the end of the day was just a map of where the failures were happening and when they were happening. And that kind of sparked them to say, okay, now we know what the cause is. You know, it's probably temperature related. And they pursued that, and they found out that, yeah, in fact, it was temperature. And so it just saved them a bunch of time in that way. And it was just a simple, you know, map. And then, you know, it was just processes like that where you realize it's not always that you have to build a complicated model to answer a question. It's just figure out the right solution, even if it's not the, I guess, best solution, if that makes sense.
AI in data science workflows
A question from Slido, forward thinking, how about AI? Do you find AI making its way into you or your team's workflows? And how so?
Yeah, this is a really interesting question that I was just talking with somebody about. Because this feels like a really huge paradigm shift. And the way I was thinking about this is my parents' generation sees people like me and younger, and there's this label they apply to us. It's the digital natives. I don't know if you've heard that, where you kind of grow up around technology, you grow up around computers, and you just have sort of a mental model of how those things work that is maybe different from somebody who started using them later in life. And so that's kind of like how my parents see this great divide between older and younger generations.
And I always thought, is there going to be a similar kind of paradigm shift or seismic change in the way we think about things between me and my kids' generation? And I think maybe this new wave of generative AI is probably it in some ways, because it's really a different paradigm, a different way of thinking about technology and the access to information we have around us. And I begin to wonder whether that's going to be this sort of like permanent change in the way that we think about how we relate to the world, which I don't want to get too philosophical about or make too big a statement, but it does feel this way a little bit.
Because before you have, you know, as you're thinking about how do we introduce technology into our processes, you know, I would say 20 years ago it was, you know, evaluate what can be automated, sort of take all of the old functions you used to do, build this fixed core of sort of like technology and automation, and then sort of restructure your workflows around that fixed core. I think with technologies like gen AI, it's not like you have a deterministic fixed set of processes that take place at the core of your business or your work, and you structure everything around that. It's almost like, it's more like a relationship in a way. I guess going back to that theme of relationships, it feels like AI itself is going to, you know, like the way we'll use gen AI is going to be more like an assistant or, you know, an extra person you're working with, someone you have to kind of figure out how do you delegate tasks, how do you interact with the outputs and interpret those and polish those up for what you need, because it's not really as deterministic a system as you would have seen before.
And so it's, yeah, it's going to be an interesting couple of years as we kind of figure out what to do with this thing, but I can say that, you know, my company is looking at, you know, how you integrate this into your workflows. I know a lot of companies are thinking about what do you do with this, how do you make it, how do you build it in a way that is going to augment people's capacity, and how do you validate that it's giving you the right kinds of outputs, because there's lots of concerns around fairness and bias. And you think about how everyone has called things like XGBoost models black boxes. Well, I think gen AI is like the ultimate black box, right? So I think you're also going to see a lot of scrutiny around, you know, how can you interpret what these things are doing, and how do you ensure that they're providing accurate, and fair, and safe outputs.
Building a portfolio and avoiding cookie-cutter projects
Abigail, would you like to jump in, or I'm happy to read it out.
Yeah, thanks, so, oh, I still have background, sorry about that. So I asked you at some point about advice for undergrad, and you mentioned, like, do a small GitHub repo, or, you know, profile, whatever. I think about this a lot, so what do you look for with, like, you know, undergrads, recent grads, you know, junior people, like, what are you looking for on those GitHub profiles?
Oh, so, like, in, like, a portfolio? Yeah. Oh, sure, yeah. Yeah, I mean, it's gonna depend on, I guess I assume that the context is sort of, like, if I'm a hiring manager, and I'm looking to bring somebody in, so somebody is fresh out of school, they don't have any kind of, like, business experience, right, so I have to evaluate in some way the things you've done. So I would say, like, you know, if you've had some internship experience, you can list that experience, you can kind of tell me sort of, like, what are the things that you accomplished, you know, what were some of the things you were able to produce or deliver on. But if you just didn't get that internship experience, or, you know, you don't have things that you can sort of demonstrate on your resume, for example, it's great to have that portfolio.
What I'd be looking for is I want to just go in and see that, okay, you have, you know, some sort of, like, idea of structure of how the project should progress, so just the way I think about it is breaking projects into sort of data wrangling, cleaning some of the EDA processes, and then some of the modeling, and then some of the reporting, and then kind of tying it all together into a reproducible environment where I can sort of build that environment, I can see the steps to run to sort of execute and build that same model, and produce those outputs. And that kind of tells me you thought through the whole, like, end-to-end process, and you kind of thought through how is somebody else going to be able to take this and use it. That's probably where I would, I would say, like, that's a good portfolio project for data science in my mind.
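To make that concrete, here's a minimal sketch of that end-to-end structure in Python, using synthetic data and illustrative function names (none of this comes from the talk itself): a wrangling step, a small EDA step, a modeling step, and a reporting step, with a fixed seed so the whole run is reproducible.

```python
# Minimal sketch of an end-to-end portfolio pipeline: wrangle -> EDA -> model -> report.
# Standard library only, with synthetic data, so anyone can re-run it and get the same result.
import random
import statistics

def wrangle(raw):
    """Drop records with missing values and coerce types."""
    return [(float(x), float(y)) for x, y in raw if x is not None and y is not None]

def explore(data):
    """A tiny EDA step: summary statistics you would look at before modeling."""
    xs = [x for x, _ in data]
    return {"n": len(data), "mean_x": statistics.mean(xs), "sd_x": statistics.stdev(xs)}

def fit(data):
    """Simple linear regression with one predictor, via the closed-form solution."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x in xs)
    return {"slope": slope, "intercept": my - slope * mx}

def report(summary, model):
    """The 'reporting' step: a plain-text artifact someone else can read."""
    return (f"n={summary['n']}, mean_x={summary['mean_x']:.2f}; "
            f"y ~ {model['intercept']:.2f} + {model['slope']:.2f}*x")

if __name__ == "__main__":
    random.seed(42)  # fixed seed -> reproducible run
    raw = [(x, 2.0 * x + 1.0 + random.gauss(0, 0.1)) for x in range(50)]
    clean = wrangle(raw)
    print(report(explore(clean), fit(clean)))
```

The point isn't the model, which is deliberately trivial; it's that each stage is a named, separable step someone else could follow and re-execute.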
There's a mention in the chat from Jacob, just also, like, trying to avoid cookie-cutter projects if at all possible. That's a good point. Yeah, if it's, like, the Titanic data set, right, there are so many examples of that out there. I think a good thing to do that I've heard once is just find something that interests you, like, take a topic that you like. So, you know, once I was trying to build a workshop for people, and I thought, what's a really interesting, maybe a little bit messy set of data that I can find and bring into this workshop for folks to play with? And I went out, and I found, I think it was on the UCI Machine Learning Repository, you know, the source of all good data sets. It was, like, a bike share data set where you had different columns, sort of, like, you know, time of day, which day of the year it was, you know, when this bike was checked out, and when it was returned, and maybe, like, the location of where it was checked out and returned, you know, how long it was taken out. And just kind of things like that, where you could start to explore, like, you know, who's using this on which days, and are they work commuters? Are they leisure riders? So you can kind of begin to explore and say, you know, what are the attributes that determine when bikes will be used the most, and where are they going, and how can you think about turning that into, like, a business problem where it's, you know, optimizing the location of bikes and things like that. But yeah, just find a topic that you really enjoy and are interested in, and that will kind of lead you to the right kinds of questions and analyses.
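A toy sketch of the kind of question described above, on synthetic records rather than the actual UCI bike-share data (whose exact column names aren't assumed here): group rides by hour and workday/weekend, and see whether the peak hours look like commuting or leisure.

```python
# Hedged sketch: do workday peak hours look like commuters, and weekend peaks like leisure?
# Records are hypothetical (hour, is_workday, rides) tuples, not the real dataset's schema.
from collections import defaultdict

def peak_hours(records, workday, top=2):
    """Total rides per hour for workdays (or weekends), returning the busiest hours."""
    totals = defaultdict(int)
    for hour, is_workday, rides in records:
        if is_workday == workday:
            totals[hour] += rides
    return sorted(totals, key=totals.get, reverse=True)[:top]

# Toy records: commuter-shaped demand on workdays, midday demand on weekends.
records = []
for hour in range(24):
    workday_rides = 50 if hour in (8, 17) else 5   # rush-hour spikes
    weekend_rides = 30 if 11 <= hour <= 15 else 5  # midday leisure plateau
    records.append((hour, True, workday_rides))
    records.append((hour, False, weekend_rides))

print(peak_hours(records, workday=True))   # rush hours -> likely commuters
print(peak_hours(records, workday=False))  # midday peak -> likely leisure riders
```

Even an aggregation this simple is the sort of EDA that leads to the business question (where should bikes be positioned, and when?) rather than starting from a model.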
Moving from ticket-taking to value creation
My question was, what would you, what advice would you give a younger or newer data team on how to either avoid or stop being just a ticket-taking center, and become a value creation center that is doing projects that are meaningful and that have meaningful KPIs and have backers and stuff like that? That's a transition that I, I find a lot of teams have a hard time making. And as I interview with, with places, potential places, that's something I ask about interviews. Like, how do you intake stuff? Do you have an intake process? If you don't, I worry that you're, you're not protecting people's time, stuff like that. So, I'm wondering your thoughts.
Yeah, I saw this interesting LinkedIn post the other day where somebody, I think he worked at a big tech company, and he was saying, he's like, data scientists, stop being, or stop taking requirements. And, and it was all about sort of get yourself a seat at the table, make sure that you are defining strategic direction and, and pushing that forward. And I, I had like a mixed reaction to that because I think on the one hand, yeah, that's right. Like, I think, I think a data driven organization needs to be thinking about how to advance their data capabilities. And that's the only way that you reach this, like, analytics maturity.
And when I say analytics maturity, what I mean is sort of like, a lot of organizations go through this process of starting out where they're sort of using descriptive analytics, where it's a lot of sort of reporting and understanding what happened in the past and that sort of thing. And then they move into more predictive analytics, where it's sort of like model building, you know, maybe using their predictions before business processes somewhere. And then they move into the space of sort of like prescriptive analytics, where it's like, okay, here's, here's, you know, the actions we should be taking as a result of these, of these more complex analytics initiatives.
So I think it partly depends on kind of like where you are in that, in that analytics maturity curve. But I do think it's, it's okay in some respects to, sort of build within the requirements of a business partner's set, broader set of goals. And so I'll explain kind of what I mean by that. A business is going to have this sort of like overarching goal, and that's usually like selling stuff, right? And then to achieve that goal, you, you kind of break that down into these core sets of business strategies. And then within each strategy, which might be owned by different kinds of like business divisions, you have a number of decisions in order to, to execute on initiatives that support the strategy. And then, and then to make those decisions, you have to have evidence that you're going to do the right thing. And that evidence, you know, I think among the evidence you have, one piece of that is your data. And among all of your data evidence, one piece of that is things that your data science team can do, can show, can prove.
You know, you'll have like different data science functions, whether they're so like centralized, but broken into like different groups based on sort of who they're supporting in the business, or they're maybe more like embedded within a team, but you kind of have a function that you're supporting. And so I think it's important to, this is where it's, it's not quite requirement taking, but it's working with your business partners to understand what their strategies are that, that ultimately roll up to that overarching goal for that division, right? Because they need to achieve some kind of, like you said, the KPIs and the metrics, right? And so if, if they're the ones who are generally going to set the strategy, and then what you need to do is, in those cases, work with them and say, like, what are those decisions that you're making? What are the sort of metrics or, or quantities that, that will tell you how to make those decisions? And then where do I plug in from there?
So it is a little bit about understanding the requirements of the business from that standpoint. However, you can also push back, and this stuff, I think, is where data scientists can have a lot of value, but also get themselves in a lot of trouble. You can push back on those metrics or on those ideas and say, well, what if we measured this thing over here instead? You know, would that be valuable? And I think, like, the value add there is, is now you can strategically shift the way they're thinking about how to measure things, and the, you know, kinds of data that they collect, and what they can do with that information. You also want to be careful, because in my experience, it's easy to sort of get a little too critical of the way that business is done. Sometimes we have to sort of, like, live into the way things are done, and then introduce alternative ways of looking at it at the same time.
Thanks. I wanted to add a comment, because this has been a huge part of my journey. I've been a data scientist and data analyst for the past 10 years, and I've been a data analyst at GitHub for the past two years, focusing on sales. When we talk about not just being a ticket-receiving center, the model that I try to think of, especially when I'm working with salespeople, is that it's a collaboration, but they're the lead, and that's sometimes the hard thing. When we think about the project that we're working on, if it's something we're interested in, but it doesn't solve a business question for a stakeholder, the project doesn't matter. In the end, the project always has to be tailored to that business question from the stakeholder. It's something that's really helped me, just asking the basic questions of, what are they trying to do?
Yeah, really important point, and I think that's where a lot of early data science groups struggled, I know I did, is to let the other people lead, in a lot of ways, because we're really good at doing analysis, so we can tell when something is going wrong, and so it's hard to let people lead when you think there's a better way. But this reminds me of a really nice blog post that Ted Laderas wrote just a couple days ago, he calls it "yes, and is the foundation of collaboration." He takes this idea from improv comedy, it's this idea called "yes, and," and if you haven't heard of this, it's basically, you know, when you're doing improv comedy, you might have sort of a scenario, but you don't really have an idea of what's going to happen, right, you don't know the direction this thing's going to take, there's no script, obviously. And so there's this idea in improv called "yes, and," in which sort of, you know, somebody will say something, or introduce a new idea to this scenario, and then it's your partner's job to sort of take that idea and carry it forward, push the idea forward, keep running with it, and not to say no, because that kind of interrupts the flow of a comedy sketch.
So, so in this blog post, Ted says that in collaboration, you build on what your partner is doing, and that's like this, I think that one sentence really stuck out to me, like, yeah, I think that's kind of where I see data science and relationships really kind of like taking off. It's sort of, yeah, like you said, letting them lead, building on what they're doing by, by enhancing it, by adding new ways of measurement, maybe being more precise, and maybe finding new ways to, to identify, you know, sort of the, the things that lead to the ultimate business success. And the other piece of that is like data science, I think at the end of that, that movie, Soylent Green, where the guy is being carried away, and he's like, Soylent Green is people, I always think about that, because I'm like, data science is people. And I think like data science, you know, it's, it's all about the people, I think people are political, I think of politics as sort of like this combination of power and motivations. I don't think that data science influences power directly necessarily, but it can certainly engage with people's motivations, and so once you identify what their motivations are, and, you know, then you can work within the context of those motivations to be more successful.
Domain expertise and becoming a trusted advisor
Oh, thank you, Isabella, and Adam, really appreciate your remarks. Quick question, by way of background, I am, I spent four decades in the industry as a practitioner, and one decade as a university professor, and now I'm kind of retired from both, but I still, you know, am involved, we'll just say that I take on projects from time to time. Two questions for you, Adam, if you'd be so kind. The first is, I really appreciate your remarks about, for example, looking at GitHub profiles as you're evaluating, especially early career candidates, and people that might be just starting off in the field. I thought your approach is very purposeful and insightful, and I hope that everyone adopts something similar. But would you comment on your perspectives, particularly in a regulated industry like insurance, how long do you feel it takes somebody with good data science skills, let's say that's the x dimension, to also acquire good domain skills and deep enough understanding in the domain, which I'll call the vertical axis, to the point where they can be a trusted advisor to our business partners?
Yeah, thanks, Rick. The timing question is tricky, it's another one of those things where I'm always going to say it depends, and it's not just because I'm trying to cover my butt. But I think your first question is, how long does it take to sort of get up to speed within your business, especially if you're coming in sort of early career, in order to be that trusted advisor, to be able to work with people. Yeah, it's going to depend a lot on the way that the business is going to onboard people. I've worked in big companies and small companies, and I think the difference is, you know, in the small companies, the onboarding is a little bit less structured, and you're kind of left to your own devices to kind of get set up. But in big companies, oftentimes, and not always, you're going to have an onboarding program, or maybe even a training program that will kind of introduce you, at least high level, to the concepts you're going to need. So my first insurance job, there was a lot of new hire training, which was a really cool program, where we learned a lot about, you know, how we do programming here, how you might do version control, all the technical stuff, but then also, like, what is insurance, you know, and how does it work, how does it break down?
The other thing you're going to have to do, I think, when you're early in your career, and I don't say this lightly, because I know it's not the most fun thing, is probably just attend a lot of meetings, and pay close attention to kind of what's happening, and understand sort of who the people are in those meetings, what they care about, how they're relating to one another. And then identifying, you know, when you're in these meetings, what numbers are they sort of monitoring, or checking, why are they checking those things, you know, and then as you talk to your mentors, or your peers, you can say, what kinds of data do we have that kind of relate to this. And that gives you kind of a good starting point for thinking about, okay, are we measuring this the right way, and then what other ways do we think about this. And then you go to these meetings, and you can understand sort of what people are after, right? It goes back to that question of, I mean, you have a goal; to reach your goal, you're building strategies, and you're making decisions to support those strategies. So what decisions are they making? And then you just ask a lot of questions about sort of how those decisions are being made, why they're being made, and then, you know, what other evidence they might need to make better decisions.
I think maybe that's the key to sort of get into a place where they start trusting you, is you're saying, I can, you know, I can bring some value, but I want you to kind of tell me what it is you need, instead of just building something and saying, here it is, or would it be helpful if you knew x, y, and z, because someone's always going to say, yes, I want you to, you know, yes, I want that knowledge. People always want more knowledge, even if they have no idea what they're going to do with it, so if you can work to frame something in terms of, let's find the actionable pieces of information, you know, by asking them what they need to make better decisions, I think that gives you good footing to start with.
Data science versus software engineering and agile frameworks
Yeah, so maybe, yeah, let's recap the question. So, it's the issue between data science and computer science, or which other fields? Not so much the fields as the, as, kind of, the people, I guess. So, like, maybe if you've had any experience, like, collaborating in groups that have people from these different perspectives. Let me think about that, because you're talking about fields where people are much more objective about things relative to fields where they're more driven by other motivations, or?
I guess statistics is, I mean, my own education is in statistics, and I feel like we've all kind of gotten gobbled up by the data science machine in some ways. But I don't, yeah, I don't draw too much distinction between, and I'm sure I'll get flak for this, saying this in a public forum. I don't draw too much distinction between stats and data science, even though I do understand that there is a distinction to be made if you wanted to.
But the, yeah, I feel like data science versus computer science, I think the difference you'll see there is the way we think about sort of what happens to our work, and what happens to our code, and what that means for how we develop a technical project. So I feel, you know, I think data science itself is really inherently like a research field, and that's where I think about especially like relationships in business with your business partners being super important, because a big part of data science is to try something, and you're going to fail a lot of the time. And that's okay. That's just part of how it works. But when you have these good, strong business partner relationships, that's okay. Because it's inherently research, I feel like a relationship can help you sort of cushion your failures. You know, like, if you have a good relationship, you can establish credibility in advance, and then you can try something and just see if it works.
Computer science, oftentimes, I think about as developing, say, a software application. It's less research; it's a little bit more requirements based. And those development cycles look very different. I think about the way that data science tries to emulate computer science in a lot of ways, or I should say software engineering, specifically, by using things like an agile framework for development, even though I don't think data science necessarily fits well within that agile framework, because you don't know what you're going to find, right? If you're building a piece of software, you can line up the requirements, you can understand which components are going to fit where in that picture, and then you can break down how it's all going to come together. With data science, it's more like: our goal is this one thing, say, to make these predictions, but we're just going to open up our data set and see what's in there and see if it's even viable, see if we even have the data in the first place. Was the data collected for a specific business purpose that just isn't suited for what we're trying to do? There are a ton of unknowns relative to our friends who are doing software engineering.
Yeah, that's helpful. Thank you. I think we have a lot to learn from each other in terms of thinking this through. As data scientists, we're building something that's not necessarily going into a production system. It might, it might not. And that really changes the way we write the code in the first place and get to that end product. That can be really chaotic a lot of the time, and I think we have a lot to learn about how we structure an analysis so that it's easy to follow, it's reproducible, all those things.
Alan, you had a really great observation. Would you mind sharing a bit about your experience too?
So this is only partly thought out, and I'll try to make this a constructive comment rather than not. I think there's a really challenging set of differing expectations to deal with, based on how people come into the work: whether they come into it from a really technical orientation, or a really statistical or research or business kind of orientation, where they expect the work to be a particular sort of thing. And it may be, but it often isn't. And that results in real dissonant experiences. But I also have a critique of what gets privileged in data science: it tends to privilege the technical and the fancy and the shiny new, methodological things or tool-based things. There's always a new data stack. There's always a new tool approach. And I think that sets up expectations that are often not met by what the work actually is, and it means the field is populated and increasingly dominated by the shiny technical, rather than the really well-thought-out research and business-driven questions and the people who are invested in those things.
And I just think that's a really challenging place to realistically set expectations, because day-to-day so much of the work is not going to be, hey, we replaced our tech stack with the new vector database and look how much faster it is. It's more like, I don't know how to answer this question for somebody, and the data is a mess, and they need me to help them. Those experiences and those expectations are really different.
Yeah, that's an interesting point: the outcomes, the things you're driving at, are going to be totally different, and they can be measured in totally different ways. Data science is sort of meant to have this direct, measurable, quantifiable business impact. If you're doing something in software or more computer science-y work, where it's about, like you said, speeding something up, or changing a backend thing that makes life better for everybody but is really hard to measure, you think about your outcomes in a totally different way.
And I think when data scientists try to improve processes and the way they work behind the scenes, it's kind of a hard sell sometimes to leadership, who are used to thinking about data science output as products that go into some kind of business decision cycle. So if someone says, hey, I think we could do much better work if we spent six months cleaning up all these processes in the backend, that's essentially pausing; maybe you want to pause some engagement with a part of the business because you need to clean your own house. That can be a tough sell to put on your data science roadmap, for example.
Data science in the insurance industry
Hey, Adam. I'm just wondering, it's kind of interesting. I'm in the consumer packaged goods industry, so for me, when I think of people working in insurance, I tend to just default to the actuarial side of things, right? So I'm wondering if you can help me understand where data science plays a role in the insurance industry. Are data scientists required to have a deeper understanding of things like stochastic processes, stochastic calculus, or things of that nature to do projects, or do you simply do a lot of modeling with ensemble models or the traditional data science models you would think of?
Yeah, interesting question. To directly answer it: there are generally not specific educational requirements for data science around certain kinds of calculus or math. It never hurts to have some actuarial experience, but I wouldn't say the way data science is deployed within insurance, and I'm in P&C insurance, is that narrow. It's really in support of so many initiatives that it can be almost anything. So for example, I'm building pricing models that support the ratemaking, and those are generally defined around state regulations; we're doing things a certain way because that's the way we get these things filed in our states and then available as a product. But there's also data science in support of underwriting, which is less of a regulated process. There's data science in support of sales. There's data science in support of claims and operations. And you can think of things like telematics: telematics data science is going to be completely separate from something like a pricing model, even though telematics outputs might factor into that equation. If you're studying things like where vehicles are traveling and what the driver behavior component is, that's a totally different class of problems, I would say. So it really runs the gamut.
Yeah, generally not. But the actuaries are going to scrutinize our outputs and make sure that all the numbers coming out of our models make sense in the overall picture of the business, because they're heavily involved in things like reserving amounts: how much money do we need on hand for any given period of time. So our outputs are definitely an input to their analysis.
Hiring, entry-level expectations, and staying current
Yeah, it's definitely very related to this discussion, but it's kind of a balance. When we're discussing how industry experience or knowledge of the business problem is so helpful for tackling real data science problems that businesses are having, universities are now offering data science degrees, which you don't have to come to from any background other than pure math. Students get these curated data sets to perform their modeling exercises in graduate school, and they have no experience acquiring that data or cleaning it in any way. They're just like, yeah, I put my model in our little dummy cloud instance for my grad program, and I've deployed a model in production. So it's balancing the expectation of what an entry-level role would truly look like for someone you do want to be a data scientist for your organization. Really just that expectation, and then also curating your job listings to bring in the candidates that you think have the right data science skills for what you're looking for.
Yeah, it's one of those things when I do interviews, because everyone's coming from a data science program now, and I'm always intensely interested: what are they doing in data science school? Because I never came up through that route, right? I'm always like, oh, what language are they teaching? How are you thinking about going end to end through a project? I think the answer varies a lot depending on the program.
Yeah, I mean, I think minimally you'd want to see some understanding of how you tackle data problems, and how you think about what a specific kind of data problem does to the way you approach an analysis. Like I mentioned earlier, for me it's always totally fine if you don't have specific internship experience, because I know that's not something everyone can get. But in that case, maybe have a portfolio where you can demonstrate at least some thinking along that process: I get some data, and I need to at least be looking for things. It's not that you always have to be able to address every single kind of data problem, but you need to show that you can go in and think critically about that data. I think that's the most important part, right? How do you identify the critical thinking? And that's where I personally tend to shy away from the LeetCode style of evaluation, because I don't think it really gets at how somebody thinks critically about a problem. It's more like, walk me through, help me understand how you're thinking about a certain situation, and how you would dig into something and ask the questions. So for someone coming out of school, entry level, I think it's more about: can you at least ask the right questions so that you know what's in this package of data? Or, my business partner is interested in this question; can you think through how we might get at an answer to it?
Thank you. Yeah, I'm also interested, and I'll show my hand here if anybody ever interviews with me: I'm always asking people how they're keeping up with changes. Because one of the things is, you go to school, and don't let that be the last time you think about data science approaches or technologies or tools or packages or whatever. Find a way to make sure you're keeping up with what's happening. For me, it's social media. I honestly don't know any other answer to that question other than, like, be on Twitter or whatever. I've actually moved over to Mastodon, though. But yeah, it's sort of like, how are you thinking about keeping your skills current and investing in your own development?
Yeah, I'm ashamed to admit that I wait for it to enter my consciousness instead of seeking it out. That's okay, I mean, we're all busy, so I totally get it. Sometimes I just doomscroll; that's how I get my data science updates. Not always a good thing either.
Thank you so much, Adam. Thank you, everybody, for all your engagement in the chat and on Zoom. I really, really appreciate it. Adam, just in closing, speaking of social: what is the best way of keeping in touch with you and learning what you're up to? Yeah, find me on LinkedIn. I think there's a link in my bio on the Posit Data Science Hangout page. And I'm also on Mastodon, at @ataustin@fosstodon.org, I guess it is. I'll put that in the chat too. It was a true pleasure. Thank you so, so much. All right. Thank you very much. Great chatting with you all.
