The skydiver to data scientist pipeline | Kevin Dalton | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heron, and this is a recording of our weekly community call that happens every Thursday at 12pm US Eastern time. If you're not joining us live, you're missing out on the amazing chat that goes on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience. Can't wait to see you there.
Hey everybody, happy Thursday and welcome to the hangout. I'm filling in for Libby today, if you can't tell. If we haven't had a chance to meet yet, I'm Rachel, I lead customer marketing at Posit, and I'm so excited to bring in our featured leader today, Kevin Dalton, senior data scientist at Great American Insurance Group. Kevin, would you introduce yourself and maybe share a little bit about the work that you do today, as well as something you do for fun?
Sure, absolutely. So welcome everybody, excited to be here at the hangout; I've been a big fan for a lot of years. My name is Kevin, I'm an insurance data scientist. So what does that mean in practice? It means I do a lot of work with actuaries; insurance companies like to consider themselves really old-school data science, we've been doing this for a long time. My background was originally in economics. I was trained as an economist, financial economics. I worked in the insurance industry for many years, then took some time off to become a professional skydiver. So I worked as a skydiver for about 15 years, which is pretty fun, though not as glamorous or as well paid as you might imagine. And now I've been back working in insurance data science focused areas for about six years. So that's a brief history of me.
I work for Great American Insurance Group. We're a very large American insurer, and I work in the predictive analytics group. We're what's generally referred to as a corporate resource: we're a very decentralized company, so underwriting groups can come to us for their data science and analytics needs. We have several different groups, one of which builds what would be considered traditional insurance actuarial models; that's our bread and butter. How do we price? How do we segment? How do we market? Those kinds of things. But we also have, I think, some of the more exciting things now, which are computer vision, natural language processing, and agent models in the AI sense as well. So that's what I do day to day. I'm an individual contributor, so about half of my time is spent on theory and half on implementing it. I'm also big into MLOps now. So hopefully that wasn't too much or too fast.
Oh, that's great. I think that's the first professional skydiver we've had. Are there any other professional skydivers in the chat? I think some people maybe want to go skydiving with you.
It's a good time. Oh, what do I like to do for fun? I actually am a big kind of nerd about programming and stuff like that. So I've been writing a lot of code recently, but I have little kids and I like to go out and be outdoors with them. So I live in Boulder, which if you know anything about Boulder, you know, it's all, everyone's outdoors with their dogs all the time. So it's fun.
Love it. Well, Kevin, I had the privilege to meet you a few months ago at the posit::conf registration desk. And I just want to say thank you for your kind words about the hangout. I was looking back at my notes on my phone, because I can't remember anything from conf, and I was like, get Kevin on as a featured leader! Lots of exclamation marks. So thank you for being here.
No problem. Yeah, I'm a big fan of RStudio, and of Posit now. I told Connor this: I remember I was sitting there when they announced it, and I was like, this is so smart, I'm so glad they're doing this. I think you guys have done amazing work for the community. I've been involved with the R community for a long time, and you guys are doing the work that needs to be done. I really appreciate it.
Little P and big P production
Oh, thank you. One of the other notes I had from that conversation was about little-p and big-P production, which is something I want to talk about today. But before we even do that: what does it actually mean to put models into production at an insurance company?
Yeah, that's a great question. I appreciate it. You'll see, and I've seen this a lot, everything from the very unsophisticated to the very sophisticated. A model for an insurance company can be a very simple linear model, or it can be something very sophisticated like a telematics model, and most of it is heavily weighted towards the simple linear models. We're trying to segment different risks based on different classifications. I always use the 'blue cars are better than red cars' example, right? What do we know about our young drivers, those kinds of things.
And we have this great analysis. Now, how do we get it on the road? How do we put that out there? For a long time, insurance models were very static, and some of them still are. In the US they have to be filed; they have to go before a regulator who signs off on them. So it was okay to hard-code them, or put them in a spreadsheet, or put the coefficients in there. That could be "being in production," right? You could get a data set every month and say, hey, score this for me, and that would be that. Now it's moving more towards automated production, automated inference, towards cloud-centric, real-time inference engines where an underwriter, for example, can input data into an underwriting system, and part of that underwriting system takes that data, hits an inference engine, and gives them a score back. That's where I see the industry moving. But a lot of it is still, you know, sometimes my production system is a notebook I give new data to. Whatever works. We're moving towards that, but notebooks are hard to productionalize, as I'm sure you all know, and it's not easy to go back and say, why did we do this? Which is where we want to be now. But that's what it means for insurance companies.
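The inference-engine pattern Kevin describes can be sketched in a few lines. This is a hypothetical toy rating model behind a JSON endpoint, not any real underwriting system: the coefficients, field names, and scoring formula are all made up, and a real deployment would sit behind Posit Connect or a container service rather than Python's built-in server.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up coefficients for a toy rating model.
COEFS = {"intercept": 0.05, "blue_car": -0.02, "young_driver": 0.10}

def score(risk: dict) -> float:
    """Score one risk record from its (hypothetical) rating indicators."""
    return round(COEFS["intercept"]
                 + COEFS["blue_car"] * risk.get("blue_car", 0)
                 + COEFS["young_driver"] * risk.get("young_driver", 0), 4)

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The underwriting system POSTs risk data, gets a score back.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        resp = json.dumps({"score": score(body)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(resp)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve on an ephemeral port in a background thread, then call it once.
server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}",
    data=json.dumps({"blue_car": 1, "young_driver": 1}).encode(),
    headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as r:
    print(json.loads(r.read()))   # {'score': 0.13}
server.shutdown()
```

The contrast with the static approach is the point: the coefficients live in one service, and every consuming system gets the same answer in real time instead of copying a spreadsheet around.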
Thank you. How do you actually explain that to people at the company that there's this need for two different types of production?
It's a great, great question. I haven't really had to explain it. I think the need has just been so obvious to everyone: why does it take so long to get something back? Why are we waiting? Why is it hard? It doesn't have to be, but you have to be willing to do the engineering, put the systems in place, think about the tools you're going to use, you know, Posit Connect, Posit Workbench, whatever it's going to be, or you're going to code it yourself. So my communication role is more about how we do it rather than why. Everyone wants to go faster. I didn't come up with this phrase, but I really like it: analytics at speed. Everyone wants to go faster and you can't go faster unless you have this productionalized pipeline.
Everyone wants to go faster and you can't go faster unless you have this productionalized pipeline.
Small language models and Bayesian modeling
Thank you. I'm starting to get a lot of questions coming into Slido, so I'm having to get back into the multitasking mode that Libby is so good at. But Edward, I see you asked a question that has had lots of upvotes here. Do you want to jump in next?
All right. Let me go to Slido and get the words. So, a lot of companies are integrating large foundation models, like LLMs, into their processing and analytics pipelines. But as an owner of an old computer, at a company that's kind of AI-averse, I do a lot of my LLM work with small language models that can run on my local machine, and I get great effect out of it. It's smaller, but have you, at a large company, found any application for smaller language models in production pipelines?
Not specifically those smaller trained models. Certainly I do some of that for fun. I just got an NVIDIA DGX Spark myself, right? So I've got a little ability to run these small models, and I'm trying that out. Ours are bigger at this point, I think, just because we have enterprise resources, but I second everything you said. I think it's really cool that you're doing that. So, thank you.
I see some questions coming in about skydiving.
Absolutely. Actually, I talk about that all the time, so I'll link these two, because I see somebody asked, what does a professional skydiver do? Most professional skydivers teach, so they take tandems. A lot of people have interacted with a tandem, right? You go to a skydiving place, we hook you up like a BabyBjörn, we go up in the airplane and we jump out, two of us, one parachute kind of thing. That's the day-to-day bread and butter. What I did a lot, because of my background, was military instruction, teaching military parachuting. And I always say that most of it is like being a guidance counselor. Skydiving is pretty easy. It looks like you're doing a lot, but mostly you're just falling, which everyone can do. Trust me, gravity takes over for that. Most of it is the psychology of it. And that has carried over into a new skill for me, well, something I've been working on a lot: the people side of data science, trying to understand my customers, trying to understand my co-workers. And I mean this in the best possible way: there are just a lot of quirky people in data science, and they're awesome and good to work with, but the communication styles are so different sometimes. Being able to see so many different people and talk to them, skydiving has really helped me do that. So I wouldn't have traded it; the people skills I learned have been very good.
But mostly I was a human carnival ride, if that makes sense.
I love it. I can imagine you have people who get up there and maybe need to be talked into it a little bit. I used to tell them, I'm not going to make you go, but just remember the skydive's free, it's the plane ride you paid for. So, I'm helpful: you're more than welcome to ride back down with the plane, but we're keeping your money.
Well, I see there is another anonymous question I'm going to jump to. That was, what recent innovative ML models in the insurance sectors have you brought live, if you're allowed to share?
Yeah, I am. I think some of the things that are innovative: we've been doing a lot of work with Bayesian hierarchical models in the insurance space, and actually getting that to work at scale is something that's been near and dear to my heart. We have some very weird distributions in insurance. If I can put my statistics geek hat on for a second: we have very non-normal, very right-skewed distributions. Most insurance policies are zeros, they don't have any losses, but the ones that do can be very right-skewed. So being able to put that together in a Bayesian framework and get it to work at scale has been near and dear to my heart, and really challenging to do, especially since we have things like the Tweedie distribution, which some of you may or may not know about, which has no closed-form density. So we have to do it all numerically. We've been working on that. That's really cool.
And, as you can imagine, we were working on natural language processing before it was really cool, before there was ChatGPT and those kinds of things, back with the original BERT models, because we take in a lot of text, a lot of unstructured documents. So we've been working to get those on the road too, and those are really cool.
Thank you. As someone who's not a data scientist, I've noticed that whenever Bayesian is mentioned, there's this community around it that gets very excited. And I was wondering, why is it like that? Or can you explain to me a little bit more about what it is?
Yeah. I look at it as probabilistic programming. It's a different way. For those not familiar, the way statisticians frame it is that there are two camps: the Frequentist camp and the Bayesian camp. I don't think that's true so much anymore, because graduate statistics programs are now pretty much built around this probabilistic, Bayesian framework, and it goes a long way. I think it's become super popular because of Richard McElreath's book, Statistical Rethinking. A lot of non-statisticians and non-data scientists have picked it up, in epidemiology and elsewhere. And it's just the right way to look at things. It's one of those things you get passionate about; I'm very passionate about it. I always explain to people: everyone's a Bayesian. It's how you cross the street. Well, here's my prior, and here's my likelihood. And I get a big kick out of people explaining their modeling strategy to me: oh, well, I have a thought, and then I have some data, and then I'm going to do this. I go, okay, so you've just described a prior and a likelihood. Why don't we just do it that way?
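The "you've just described a prior and a likelihood" point can be shown in a few lines of arithmetic with a conjugate Beta-Binomial update. All the numbers here are made up for illustration:

```python
# Prior belief about a claim rate: Beta(2, 8), centered on 20%.
prior_a, prior_b = 2, 8
prior_mean = prior_a / (prior_a + prior_b)

# Data (the likelihood): 7 claims observed out of 30 policies.
claims, policies = 7, 30
data_rate = claims / policies

# With a conjugate prior, Bayes' rule is just adding counts:
post_a, post_b = prior_a + claims, prior_b + (policies - claims)
posterior_mean = post_a / (post_a + post_b)

print(prior_mean, round(data_rate, 4), posterior_mean)
# The posterior mean (0.225) lands between the prior (0.2) and the data (~0.233).
```

That pull between "what I thought" and "what I saw" is exactly the informal strategy people describe to Kevin; the hard part he mentions is that real insurance likelihoods are not conjugate, which is why the models have to be fit numerically.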
But it's always been super hard to do, unless you wanted to do some toy example. Now it's not. Well, it's still hard to do, but now we can do non-toy examples. There's only one R community I've found more passionate than the Bayesian group, and that's the spatial group. They're fun, too.
R vs Python and using Positron
Michael, I see you asked a question over in Slido. Do you want to jump in here next?
Yeah, sure. Thanks. I've been using R for, gosh, at least 15 years. At the firm I'm at now, when I started we were fairly small, maybe 70 people or so; now we're close to a thousand. So I had the luxury at that point of being able to dictate what we use, and a lot of the core infrastructure was R. But I feel like I'm in a bit more of a minority now. Trying to integrate workflows for people who are more focused on Python, things like reticulate can get you pretty far, but sometimes that only takes you so far. So I'm curious to hear from you, or others in chat, who at their core spend a lot of time in R but need to work with folks who are maybe more Python: am I at the point where I need to force myself to just get better at Python and come to terms with that? Or have people found other effective ways of compartmentalizing, using certain things for what they're good at? It's hard, because if everyone's not speaking exactly the same language, there are frictions, especially when it comes to production and deployment. The research side is maybe a little less critical, but even there you'd like to be able to reuse research code as much as possible, and generative AI certainly makes it easier to refactor. Anyway, I'm curious if there are strategies you've found effective at bridging the gap.
That's a great question, and I don't know if I have any great strategies, but I can tell you what I do. I use a lot of Python, I use a lot of R, and I go back and forth between the two. I might misquote Emil here from a conversation a couple of months ago, but he said the difference between R and Python is which flavor of C wrappers you prefer. And I kind of look at it the same way. I always tell people: what's the right tool for your job? Sometimes it's R and sometimes it's Python. People thought that was a funny joke, but I think it's true. I learned C a million years ago when dinosaurs walked the earth, and like you, I started using R when it was basically S-PLUS, before there was even RStudio. To me the two languages seem very similar and very familiar. And my local strategy recently has been: use Positron.
And I have to admit that when it came out, I started using it. As soon as they released it, I downloaded and built it from source. And I have a little note, I'm going to frame it one day, that says: who is this for? Right? Because I just knew how passionate people are about RStudio, and I use VS Code a lot. And now I finally, completely, 100% get it. I'm working on a project now that's all in the same repo: it's got uv in it, it's got rv in it, which I'm a huge fan of, and Positron has both environments going. I'm writing Python code, I'm writing R code. I like to tell people that maybe 20 years from now, there'll be two dialects of the same data science language. But to circle back to your question: my strategy is to use the right tool for the job. Sometimes that's Python, and often it's R. If you sit down and think about it, you can accomplish the same things in both languages; it's just a question of which is more convenient for you. But thanks, it's a great question.
Thanks, Kevin. And I was just about to ask you about Positron, so I'm glad you already got there. What was the main thing that took you from "who is this for?" to using it?
I think I just started to use it, right? I don't know what your experience has been with VS Code and R. I had high hopes for it, because I use a lot of Jupyter notebooks in Python and I really like that, but it never really worked well; it was just hard to use. I think Positron, and the work that team has done to make it R-centric as well as Python-centric, has been great. It's my daily driver. I find myself using VS Code less and less, unless I have a pure Python or C focused project, and even then I know I could do the same thing in Positron. I joked with Connor once: you guys are going to be maintaining RStudio forever still though, right? Because there's such a passionate crowd around it, and I think it's great; I've used RStudio for a long time and I thought it was really good. But I think Positron's the next step. And yeah, I did wonder who it was for. I said that because I know people who are like, you will take RStudio when you pry it from my cold dead hands, right? And I'm sure they're out there. Yeah, absolutely. And we love RStudio too.
Tool stack at Great American Insurance Group
We love it. Actually, while we're talking about tools a bit, could you share a little bit about what your tool stack looks like at Great American Insurance Group?
Sure. To the extent I can talk about this: we're building out our Posit Workbench and Posit Team deployment now. Like a lot of enterprises, we're heavily invested in Snowflake, so that's primarily what our tech stack looks like. But individual contributors also have the ability to work locally, and we're trying to move that over into the MLOps side. Down in the weeds, as I alluded to, my own stack now is a lot of Python with uv, which is written in Rust and which I'm a huge fan of, and R with rv, which I know is still kind of newish, but for me it works really, really well; dare I say it, better than renv. I really like it. So that's my tech stack, built up from there. I do still write some Fortran and some C, so there are compilers in there too.
Thank you. I might need some help from people in the chat, sharing the links to those packages. But so you actually run Posit within Snowflake. Is that right?
We do. Yeah, we run it as a containerized service inside Snowflake, which is a great setup. We're big Snowflake users; Snowflake is super powerful from a data engineering side and from an inference side, it can do a lot of good things, and Snowpark Container Services is great. You can serve Python models from there, you can serve R models from there, once you work out some of the idiosyncrasies of the way Python treats lists versus the way R treats lists. Feel free to hit me up if you ever run into that roadblock; I spilled a lot of brainpower on it, but you can do it. And it's great: you get all the authorization, you get all of that for free, which is part of why we like it. Our hope is to move to the full Posit Team stack and to an actual deployment, you know, from a small-p deployment to a big-P production environment kind of thing.
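The list idiosyncrasy Kevin mentions usually bites at the JSON boundary: R has no scalar type, so whether a length-1 vector arrives as `1.5` or `[1.5]` depends on serializer settings (for example jsonlite's `auto_unbox` option). A hypothetical Python-side normalizer, field names invented for illustration, is one way to defuse that class of bug:

```python
import json

def as_list(value):
    """Force scalar-or-list JSON fields into list form at the service boundary."""
    return value if isinstance(value, list) else [value]

# A payload as an R service might emit it: one field unboxed to a scalar,
# one kept as an array.
payload_from_r = json.loads('{"score": 1.5, "factors": ["age", "territory"]}')

# Normalize every field so downstream Python code can always iterate.
normalized = {k: as_list(v) for k, v in payload_from_r.items()}
print(normalized)   # {'score': [1.5], 'factors': ['age', 'territory']}
```

Doing this once at the edge, rather than sprinkling `isinstance` checks through the scoring code, keeps both language runtimes honest about the contract.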
Model guardrails and drift monitoring
My question for Kevin is I come from a data engineering and AI engineering background. And one of the first things that we did when deploying models is we would put guardrails around them just as you know, in case it starts spitting out nonsense that nothing falls apart. What sort of guardrails are you using that are specific to insurance to guard against that kind of drift?
Let me see what I can talk about. In general terms: we have a model monitoring process that works as real data comes in, to make sure it meets certain standards, that as the data crosses the threshold from data engineering to model inference it meets certain requirements. There are a couple of different packages out there, in both Python and R, that do really well at funneling that in and making sure everything's good to go. And a lot of it, we stress to the data scientists and the developers: your model can't be brittle to this sort of thing. If you're predicting blue cars versus red cars, it can't break when somebody says "space shuttle," right? It can't just break and say, I don't know what to do with a space shuttle; it has to fail intelligently. So we do that. Different kinds of insurers have different standards: a health insurer, which we're not, is going to have very specific legal and ethical standards around what kind of data they can use and what sort of model drift is acceptable. We have less of that; we're insuring buildings and cars and things like that. But we're also very sensitive to models of all stripes being, quote unquote, discriminatory, so we have some setups around that. So to answer your question, it's on three fronts: the data; then making sure the inference itself isn't out of line with what we'd expect, and fails intelligently rather than brittlely when it can; and then model checks for discrimination.
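The "fail intelligently on a space shuttle" idea can be sketched as a thin guard around the scoring call. This is a hypothetical toy, not Great American's monitoring stack: the field names, ranges, and fallback policy are all assumptions.

```python
# Categories and ranges the (imaginary) model was trained on.
KNOWN_COLORS = {"blue", "red", "green"}
FALLBACK_SCORE = None   # could instead be a portfolio-average score plus a flag

def guarded_score(record: dict) -> dict:
    """Validate a record before scoring; degrade gracefully instead of raising."""
    issues = []

    color = record.get("car_color")
    if color not in KNOWN_COLORS:
        issues.append(f"unseen car_color: {color!r}")   # the "space shuttle" case

    age = record.get("driver_age")
    if not isinstance(age, (int, float)) or not (16 <= age <= 110):
        issues.append(f"driver_age out of range: {age!r}")

    if issues:
        # Fail intelligently: a fallback score plus the reasons, so the
        # monitoring process can count and inspect these events.
        return {"score": FALLBACK_SCORE, "ok": False, "issues": issues}

    # Stand-in for the real inference call.
    return {"score": 0.12 if color == "blue" else 0.30, "ok": True, "issues": []}

print(guarded_score({"car_color": "blue", "driver_age": 40}))
print(guarded_score({"car_color": "space shuttle", "driver_age": 40}))
```

In a real pipeline the `issues` list would feed the drift dashboards Kevin describes, so a sudden spike in unseen categories shows up as a data problem rather than a silent scoring problem.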
One of the things I've been working on lately, speaking of generative AI, is using generative AI to build synthetic data sets that test for model drift: data sets designed to be edge cases, designed to come up with clever ways to break things. It's almost like having it write unit tests for you, except now it writes model monitoring tests. So beyond unit tests for software, and beyond integration tests, there are model tests. Groups like YData are doing great things with generating simulated, synthetic data sets; using generative AI to bang on a model, basically, and see where it falls apart. But that's a great question. Thanks.
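The shape of a "model monitoring test" built from synthetic edge cases can be sketched without any LLM at all. Here plain random sampling stands in for the generative model that proposes adversarial records, and the scorer and field names are invented; the invariant under test is the one Kevin describes, that garbage input must hit a fallback path, never raise.

```python
import random

# Values an LLM-driven generator might propose as adversarial field values.
EDGE_VALUES = [None, "", "space shuttle", -1, 1e9, float("nan")]

def make_edge_case():
    """Build one synthetic record mixing valid and adversarial values."""
    return {"car_color": random.choice(["blue", "red"] + EDGE_VALUES),
            "driver_age": random.choice([40] + EDGE_VALUES)}

def score(record):
    """Toy scorer: returns a float, or None as an intelligent-failure sentinel."""
    try:
        age = float(record["driver_age"])
        if not 16 <= age <= 110:
            return None                          # out-of-range: fall back
        return (0.1 if record["car_color"] == "blue" else 0.3) + 0.001 * age
    except (TypeError, ValueError, KeyError):
        return None                              # unparseable input: same sentinel

random.seed(0)
cases = [make_edge_case() for _ in range(200)]
fallbacks = sum(score(c) is None for c in cases)
print(f"{fallbacks} of {len(cases)} synthetic records hit the fallback path")
```

Swapping the random generator for an LLM prompted to invent realistic-but-broken policy records gives the "unit tests for model drift" Kevin is describing, while the assertion stays the same: score or sentinel, never a crash.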
I do, but I'm not an expert in the field; that's the road I'm going down. I'm not a developer by training. Certainly I've picked up a couple of things, and I'm always working to round out my education. Unit tests, for example, which are probably simple for a lot of people, took me forever of banging my head to get, and now it's so easy, right? I'm developing with two developers working beside me that happen to be generative AI, and I'm like, okay, great, write a unit test, and it happens as we're going. People say, oh, you can have an opinion about that. My opinion is that it just makes me way better as a developer. Or it makes me a developer, period, versus whatever I am now. And yeah, the synthetic data I think is fascinating, because even well-tested models fail, and I see them fail because of edge cases. So the ability to go out there and hit those edge cases I think is super cool. Thank you. Awesome. Yeah, thank you.
Synthetic data and public data sources
I don't do anything with geopolitical risk. I think I know where you're going with that; there are insurance carriers that do that as a specialty, but I've never worked in it. I've worked in a lot of areas of insurance, just never anything like that. We do use a lot of public data sources, certainly for spatial statistics; there are a lot of great APIs out there. We have a large crop insurance division, and there are a lot of public data sets available for that. So, recent hiccups aside, the US government produces a lot of spatial data, a lot of data specific to what you use. But I'm always searching, so I'd actually turn this question around: why don't you tell me some great sources you're using?
Yeah, we'd love to hear that in the chat. And I guess on this same topic, I think there was another Slido question that was, if you want to learn more about using the tools you mentioned for building synthetic data, where would you recommend that people start? Are there specific packages that you really love or like other than just Googling build synthetic data set?
No, I think Hugging Face is a great place; there are some great smaller models there. It's part of the reason I bought my NVIDIA DGX local supercomputer, much to my wife's chagrin: what is this package in the mail? I put these models on it and run them locally, so I'm not paying $200 a month to generate these data sets. There used to be a company, I think it's called YData now, whose package was called pandas-profiling; now it's ydata-profiling. It would take in Pandas data sets and give you a profile of them. Now they've added AI to their name and pivoted over to helping build generative AI synthetic data sets. That's a good place; that's where I started, and I'm sure there are better places. Also, if I can make a plug for it, I've been working with some of the new Codex GPT models as a large-scale model to guide me with that. But Hugging Face is a great place to find some of these models, I think.
Fraud detection and AI's impact on insurance
So hi, Kevin, Trupti here. I worked for an auto insurance client back in 2019, and we worked on identifying underwriting fraud: misstated mileage, an incorrect number of drivers, or an incorrect zip code used to lower the premium amount. We used clustering techniques to identify the clusters where there was fraud. With AI evolving, what kinds of techniques are being used now? I would like to learn more about that.
Yeah, that's a great question, and I haven't worked on that recently; the last time was probably around the same time you worked on it, using the same sort of clustering techniques. So unfortunately I'm just going to have to say I don't really know what's being done there lately, because I've been working on other things. But it's a great question, and I'm sure some very clever people are working on it, because fraud is a huge problem. We build fraud models, like banks do, to find those sorts of things, and the level of sophistication just keeps going up and up. But I haven't used anything generative-AI based to look for underwriting fraud. I see. Yeah.
Broadly, how is AI affecting your day to day work in the insurance industry?
Well, it is affecting the insurance industry, though I can't speak for the whole industry; I see it day to day a lot. There are a lot of startups; insurtech, I guess, is what they call our little branch of fintech, and they're thick on the ground, especially for unstructured data. We generated a lot of paper back in the day, and now we generate a lot of PDFs, and the ability to take those in and process them quickly is huge; most of the money is being thrown at that. From a data science perspective, it's changed my day to day tremendously. Like I said, on both the theory and the development side, I sit there with at least one, probably two, different models running alongside me, and I don't think it's a shame to admit it. I'm certainly not a great developer or coder, or even a great statistician, but I'm constantly using those tools to make me better at what I do. I think that's changed for everyone, certainly on the development side; you can see it in the tools that are out there, in Positron, right, it's being implemented there. I think it's a huge deal for your career to keep learning, certainly in this field: learning the basics, how to train my own models, how to run my own models locally, how to build agentic AI and train your own agents; LangGraph and LangChain are really great for that too. And I don't do that just because I like it, although it is fascinating to me; I do it to ask, okay, how can I be a better data scientist, a better developer? How else would I approach this problem? And I think some of the foundational models out there are getting spookily good at it.
I mean, for a glorified, you know, for a glorified piece of linear algebra, it's getting pretty good.
Career advice and domain expertise
Yeah, I think the piece of advice that's most important to me, and that I would give, is that, especially with the tools out there now, what you really need to focus on is your business area. If you want to work in insurance, learn about insurance. The data science and the statistics are there; you can learn them, and you should always keep learning them, but really focus on understanding the domain. When I came back in from skydiving to learn data science, I always heard, well, domain expertise, whatever, right? Like, we're here to build models, do theory, and get the math and the statistics right. But I think I underplayed it. Now I'm an individual contributor; I enjoy the theory, I enjoy the models, and I hope that's what I get to keep doing and developing. But the ability to understand the problem itself probably remains understated. So keep learning, and get as much expertise in your domain as you can. That would be my advice.
The ability to understand the problem itself probably remains understated.
Yeah. Sorry, go ahead. How would you advise people to do that before they've had a chance to work in that industry?
I don't think you can. There's no easy way to bootstrap your way into it, I think, for people joining data science in an industry.
Yeah, I've been a professional recruiter for more than 30 years, and I would recommend everything. Send notes to people saying, hey, can I shadow you for a day, and take that day off to do it. I saw something recently I just loved: reading gets you the general stuff; articles get you the applied stuff. Read and study articles, go to meetings and meetups, form a group if you have to, but don't buy into any of that "oh gee, you can't do it" talk. You can do anything. Put your energy and your enthusiasm into it, and it's amazing what you can actually do. And as a person who's hired more than 15,000 people, I can tell you I don't necessarily hire for skills. I hire for passion, drive, enthusiasm. So if you say, hey Russ, I did this on Saturday on my own time, guess what? You're screaming at me that you're interested and want to transition into such-and-such. Great. So yeah, there are lots of ways: read, study, join groups, shadow people, all that kind of stuff, so that when you walk in, you can do it. And one other thing, I'll make this really quick: the vast majority of people, when they apply for a job, send a resume, and their resumes are crappy, by the way; that's a whole other issue. But the people who get hired send a resume, a letter of recommendation, stuff they've posted on GitHub and LinkedIn. They show: hey, I care, I'm passionate, I know how to do this. I might never have gotten paid to do this, but ta-da, I'm Babe Ruth and I'm going to hit a grand slam every time you give me the ball.
Cool, thanks. So I agree with everything Russ just said about reading everything, joining groups, all of that. The one thing I would add is: pick a related dataset and do a small project, as small as it might be. The reason is that you begin to understand how the data play into some of the questions you might have to deal with in the real world. And I say this as a person who has kind of the opposite experience of Russ: he's hired 15,000 people; I've been hired, not quite 15,000 times, but I've been through a lot of industries, from aviation to clinical sciences to economics and what have you. The way I've been able to successfully go from one industry to another, though I think I'm done doing that, is by doing small projects before I actually start the job. That also gives me a little insight into what I still need to learn before jumping in, so it tells you where your weaknesses are as well.
Thank you. Kevin, glad to have you back. Are you good now? There you go. Okay, sorry about that. So what did I miss? No, those all sounded like great advice. I will just add that I haven't myself used the approach of building a portfolio. I see it out there, and I don't think it's bad advice. I'll just speak to what I'm trying to do for myself, which is get involved in open source. If you can find a hook into the open source software and analytics side of the community where you'd like to develop domain expertise, you can learn a lot. You can learn a little about the data, and it certainly gives you an opportunity to show your work as a data scientist and developer. That's all I would add.
Evergreen skills and change management
I see that Direction had asked a question in Slido: as an experienced data scientist, which skills would you say are evergreen, irrespective of the tools that you're using? Not really about the tech stack, but skills like soft skills and good data practices.
Yeah, that's an amazing question, and a very good one. Like I said, I think you can learn the tech, you can learn the statistics, you can learn what you need to know; you can even learn the domain expertise. What's going to allow you to last in the field, if that's your goal, and it's trite, and I know people say it over and over, is the people skills. It's the ability to work, and I mean no disrespect and say this in no pejorative way, with a quirky bunch of sometimes neurodiverse people who are developers and data scientists and data engineers. The ability to work with them and communicate with them is something that will put you in good stead, and it's a very soft skill. I myself have been taking a lot of courses and seeking a lot of mentorship in change management. People don't like change, surprisingly enough, right? Like, all change is bad change. So how do you get them over that? How do you move them past it? That's a phenomenal question; thank you very much for it.

That is always a big thing for us, change management. Tell us, what are you learning about change management?
You know, I apply it in a corporate setting. I think it's just a great skill to have in life, but in a corporate setting, right, we work across different groups, and sometimes it's herding cats: people say, I don't have to do what you say, et cetera. So the ability to lead people without being an appointed leader, or to influence people without formal authority, I think is extremely important. I can't overstate it. If you want to last in the industry, you're either going to have to be so good at writing code and doing data science that people just can't ignore you, or you're going to have to be able to do most of that and then also be great with people and great with the domain. Honestly, I haven't hired 15,000 people; I have hired some people. But I think we can get there on the tech stack. I would much rather hire someone I know I can work with, who will work well with the team, and who I can count on to have soft people skills.

