The actuary to data scientist pipeline | Bill Wilkins | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heron, and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you're not joining us live, you're missing out on the amazing chat that goes on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.
All right, I am so happy to introduce our featured leader today. It is Bill Wilkins, SVP Advanced Analytics with Practical Applications at Safety National Casualty Corporation. Bill, welcome. Thank you for joining us. And I would love it if you could introduce yourself. Tell us a little bit about what you do and what you like to do outside of work for fun.
All right. Well, thank you. And thank you for having me on. It's a lot of fun. I do know a lot of Safety folks try to participate in the community, so, you know, I do believe it's a good forum for folks to join. My role at Safety is a little nebulous. I've been with the company 17 years, but I've been in the insurance industry for 40. I started out as an actuary, but even before then, you know, I had some programming background. I got trained as an underwriter, trained in regional operations. So I've had a very broad experience within the insurance industry.
But as my training is as an actuary, you know, I've had some very lucky opportunities over the last 40 years to work with some very talented people. And, you know, actuaries are essentially the data scientists within the insurance community. They were the original ones. Now folks are adding more, because actuaries have a very specific view of the world. And luckily I don't hold most of that. That's why I'm where I'm at today. You know, I believe in the actuarial training, but I also believe in the broad brush that the data science community can have.
So my role now: I went from doing pricing, reserving, and data science work to helping with predictive analytics. And I'll give you an example of what we do at Safety. But, you know, now my job is really just to think out of the box, come up with data, come up with connections, come up with technology, whatever we can, to figure out what the next question is that we need to answer. And it's a fun role, frankly. You know, I'm a problem solver by nature.
So, you know, you'll also see the shark in the background. That's about a 22-foot-long tiger shark. I wasn't in a cage; I was with my daughter. We're both divers. I'm a master diver, she's a dive master, so she gets paid for taking people diving. One of the things we like to do is dive. We love the ocean. She's going to save the world; she's a marine biology major and is going to help bring back the coral so we all can live. But, you know, I've always been a calculated risk taker, whether it's trying to fly off the roof of my parents' house, or rappelling down a building for charity, jumping out of airplanes, diving. I like to push the envelope. I like to see if there's a problem that we can solve.
The needle in the haystack problem
And, you know, I said I'd tell you about Safety's problem. They have what I call the needle-in-the-haystack problem. There are about three to four million workers' comp claims a year. Safety National's forte is handling the really large ones, a million dollars or above. Well, it sometimes takes 15, 20, even 30 years before a claim will even get close to a million dollars. So what our team does, to help us do better as an organization, is try to find those at 12 or 18 months instead of waiting 10 to 15 years. Because our belief is that the sooner we can get into the process, the better the outcome for everybody involved. You know, we want people to get back to work, we want them to get the best medical care.
But, you know, out of those three or four million, there might be 1,500 claims above a million dollars in that same year. So we're looking for a really small number of claims out of a really big population. And it's a great exercise. It turned a couple of my data scientists on their heads at first, trying to figure out how to accurately measure, because of the high variability within each client. We've moved on now that they have that one solved, so we're working on other things to help the business. And it ranges across a whole lot of things.
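Bill's needle-in-the-haystack numbers make a nice illustration of why this kind of problem trips up standard evaluation. A minimal sketch in Python, with all figures hypothetical and roughly following the ones he cites; nothing below is Safety National's actual model or data:

```python
import numpy as np

# Illustrative numbers based on the figures mentioned above:
# roughly 3.5 million workers' comp claims a year, about 1,500
# of which eventually exceed $1 million.
n_claims = 3_500_000
n_large = 1_500
base_rate = n_large / n_claims  # about 0.04% positive class

# With a base rate this small, raw accuracy is useless: a model that
# predicts "not large" for every single claim is still 99.96% "accurate".
naive_accuracy = 1 - base_rate

# A more operational yardstick is precision-at-k: if the claims team can
# only review the k highest-scored claims, what fraction are true hits?
def precision_at_k(scores, labels, k):
    """Fraction of true positives among the k highest-scored claims."""
    top_k = np.argsort(scores)[::-1][:k]
    return float(labels[top_k].mean())
```

Precision-at-k mirrors the real constraint here, which is adjuster time rather than overall accuracy; that is one way the high-variability measurement problem Bill mentions shows up in practice.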
What actuaries do
I think we'll probably have a few more questions about what some of these terms mean. But I would love it if we could get some context around what an actuary does. Because you started in actuarial work, and you did that for a really long time. What exactly does it mean to work in pricing?
Pricing is what most people pay. You know, when you look at your automobile premiums or your homeowners premiums, actuaries have typically been the people who come up with those numbers. There's a whole set of techniques, there are exams, and there are multiple types of actuaries. There's what they call the property casualty, or general insurance, actuary, who handles business like automobile, homeowners, and workers' compensation insurance, if you've ever been injured on the job. Then there are life actuaries, who handle essentially life insurance. There are health actuaries, who handle the health insurance products that most of your employers have.
They have a spectrum of things. I've taken enough exams that I happen to be in all those societies currently, for the United States. And there are international actuaries. You know, they have insurance around the world, and it all works differently. And our job, through the education process, is to know how to keep all that going and functioning. The most important actuary out there is called a reserving actuary, or the appointed actuary. Their job is to make sure the insurance company stays around so it can pay all the bills. So if you want to look into being one, I don't really advise it. It has been a great career for me. I love the education process. But the exam process is very difficult.
Yeah, I see Kylie in the chat saying she's heard the tests are really hard. I've heard that too. People study for them for a really long time. I think we probably have some actuaries or former actuaries in our core group of Data Science Hangout crew as well. And we have a question coming up in Slido. Reminder, everybody, we're asking questions in Slido today.
And it has some asterisks next to it, so I'm going to ask it. Mita asks: actuaries have existed for a long time, so how do you think the evolving data science technology has been affecting risk and pricing strategies within insurance? That's actually a very good question.
Unfortunately, the answer is: not at all. It's really funny. The technology has been so powerful, but the actuarial profession is still trying to figure out how it should adapt. I was actually in a conversation about this just two weeks ago, trying to think about where we should be going. Because the most important thing to an actuary is the data. You know, I've been doing this a long time, and the data isn't a whole lot different than what we had before. It's the computing power. The math has always been there. We just haven't been able to utilize a lot of the math. And so what the technology is doing is enabling us to speed things up.
But insurance is highly regulated. And so pricing is under a very strict set of conditions that, even with good data science, you can't get around. And one thing in particular that you're going to hear a lot about (I've been hearing about it for the last several years, but it's going to get even bigger) is the unintentional bias in the data. You can have as many techniques as you want, but if the data is somehow collected wrong, or it's not a full sample of what the world really looks like, there are a lot of issues. And it's really becoming a big issue, not just on the ethics side but on the legal side.
Actuary to data scientist pipeline
I think there's a follow-up question in the chat that I saw, which was: do you think the pathway to being an actuary can be through data science? So, like, risk analyst, years of experience, becoming an actuary after that. Do you generally see data science to actuary, or do you see actuary to data scientist, which is what I see?
What I'm seeing is actuary into data scientist. Now, like I said, it's learning the techniques. I myself have taken data science courses after I got all my actuarial designations, because learning all that material just makes me a better actuary. The problem with the actuarial track is there's a lot of specialized knowledge. You really become a SME. That's the distinction between a data scientist and an actuary: actuaries have very specialized knowledge relative to the general data science course. There is going to be a melding. That, again, was part of the discussion a couple of weeks ago: there are so many like attributes, so where can we really cross them over?
I think the people I have worked with in the past have actually been incorporated into the data science teams, or just had their roles change. So they started as a pricing analyst or an actuary inside an insurance company, and over time the title of their role sort of just shifted. They're doing the same thing, but now they're called a data scientist.
Advanced analytics and pricing models
I have another question in Slido really quickly around pricing. Bonnie has some asterisks here, so I'll ask this: what type of advanced analytics models is your team using behind the pricing model? Do insurers actually run experiments, like A/B testing, around pricing?
This has to be a very careful answer, okay? Because actuaries are constrained to a very specific set of things, as I said. So yes, there are models, and there is testing. However, we have to have a very specific outcome that is aligned to exposure to loss, so we have to be very careful about what we're running and how we do it. There are companies out there doing some unique modeling relative to true behavior in terms of buying potential, but an actuary cannot do that. It has to be strictly exposure to loss. And each group does different things, because automobile is different depending on the type of company you're with. If you're with a State Farm or an Allstate, you have a very specific type of clientele. But if you're with, say, a First Acceptance, which does what's called nonstandard, or The General (I don't know if you've ever seen The General television commercials), they're typically what they call a nonstandard carrier. You go to them when you're having trouble getting insurance from State Farm or Allstate. And so they're different, because the exposure is different, and you have to align the techniques with the variability in there.
Yeah. And just from being in the insurance industry myself for a while and being an insurance agent, I can tell you that for bigger companies, they probably have different sections of their business where they insure different risk populations. So as a peek behind the scenes, you might be insured by the same insurance company as your friend down the street, but you might be insured under completely different risk pools and have a different policy behind the scenes, kind of from the same company.
What people misunderstand about actuaries
I have some more questions coming up in Slido, Bill, and I'd love to ask them. I have a question from Sanjay, and Sanjay, I do not see an asterisk. Would you like to unmute and ask your question yourself, if you're available? I'll give you the opportunity to do that.
Yeah, so hi, this is Sanjay. (Hey, Sanjay.) I've heard a lot about actuaries. What do you think is the thing people misunderstand about actuaries? Like, what's the thing we usually don't understand, as data scientists looking at it from the outside?
Well, what most people don't understand about actuaries is that they're mostly on the spectrum. If you look at the group that I started with, the profile has shifted over time. When I started back in the 80s, we really were a big group of nerds, like you'd see on The Big Bang Theory. But as time has gone on, and we've been able to expand education and make things more readily available, the pool of people within the actuarial group has changed. When I first started, it wasn't just people with math backgrounds or statistics backgrounds. Because we were looking for people, we took music majors, we took behavioral scientists, and we've been able to grow that pool of talent. So other than the love of taking really hard exams, you don't have to be as specialized as you used to be to be an actuary. It's a much broader group than people give it credit for. The old joke was, you can tell he's an outgoing actuary because he's looking at your shoes instead of his in the conversation. We're well past that. I think the group is more akin to the broad swath of data scientists than to the whole insurance-nerd thing of 30 years ago.
Incorporating catastrophic events into models
Awesome. Thanks, Bill. Thanks, Sanjay, for that great question. I have an anonymous one that came up in Slido that I think is really interesting. It's: how do actuaries incorporate once-in-a-lifetime, once-in-a-generation, or catastrophic events like COVID or the global financial crisis into their models? What we would call system shocks.
There's a type of actuary that I was at one point, called an enterprise risk management actuary. I used to be the chief risk officer for Safety National. And oddly enough, we think about exactly those things. Back in 2012, I wrote a paper on a pandemic, how it would hit, and I built a model to figure out how a pandemic would work and how much money we could potentially be losing. So we think about those once-in-a-lifetime events quite a lot. And I've built terrorism models, I've built riot models.
The nice thing is we're free to think about the stuff that could affect us, whether it's flood or fire. I think actually the biggest thing about the insurance industry now is that people have the wrong expectations of insurance. You know, the fires that happened in Los Angeles, the floods that happened in North Carolina. These horrible events were preventable to a certain extent, but there was a conflict between increasing the tax base and doing the right things to stop catastrophes. And so we need to shift the conversation, because it's really easy to come up with bad scenarios; there are whole groups of modelers who do just that. And frankly, there are no more once-in-a-lifetime events. Things are going to speed up because of the effects of humanity on the earth. So a lot of things are going to start speeding up and transitioning, both in physical risk and in the liability that comes with it.
And frankly, there are no more once-in-a-lifetime events. Things are going to speed up because of the effects of humanity on the earth.
Building community across a global organization
Rachel, I actually really like your question about community. I'm very invested in community, too. Are you available to unmute and ask? Yeah, absolutely. I'm just always curious, when I see huge companies, how people get to know what other people across the company are doing. I hadn't realized before meeting you, Bill, that Safety National is a subsidiary of Tokio Marine Holdings. So I was just curious, do your data scientists and actuaries have a shared community across subsidiaries, or how do you collaborate with each other?
That's actually a really good question. I think Tokio Marine is in 40 or 50 countries around the world, and I've worked for other multinational carriers. It really does come down to the people, whether they're actuaries or data scientists. We have a community of roughly 200 data scientists that meets every couple of months, virtually. It started very organically. I knew this guy, he knew this person, she knew that person. We just said, hey, let's hang out and start talking and see if there's stuff we have in common. It started out with maybe 10 people, and then it was "hey, I'd like to join," and "I'd like to join." All of a sudden, it just starts to mushroom. It takes some time, and you have to figure out how to find the conversations that people want. But that type of organic thing is like your data community here. It just starts out, and it'll grow once people find the right conversations and how they want to talk to each other.
We've just had our second in-person global meeting for Tokio Marine. It really was focused on how we can help others. We got together, and it was really funny, because there's no data science department within Tokio Marine as a whole. It's just us. We get together, we have the conversations, and we're picking problems to help the company solve. We do have some executive support, but it's really about getting to know people. That's how it's been my whole career. I worked for a company that was actually 16 different companies. We just started getting together and talking. You build friendships, and it takes time. I've known a lot of these people for a decade, so it just takes time. Once it's up there, it's such a beautiful thing. I know it'll exist when I leave. It'll keep going.
Career advice and passion projects
I really like the focus on just trying to help people or help people solve problems. That's a huge reason why I'm in communities. I see people who have problems who need help solving them, and I'm like, we could all do this better together if we got together.
Yeah, it's a little bit along the lines of the previous one. My question, I'll just read it: I just saw the insurance company, or insurance industry, and I was wondering, given the flexibility of your role, how much influence over the type of problems slash work do you have? Do you have to solely focus on finding high-cost claimants, or do you have the ability to look at passion projects or try to answer different types of questions? And how do you reduce the bias in that?
It's an interesting question. I get to work on things that make sense or will make sense. I'm not necessarily solving things for today, but it could be for tomorrow. I do have some very passionate things because I really believe in what Safety National does, which is wanting to help people. That's what we go for. When I used to run the data analytics group, we didn't do just the projects that helped our claims department identify the claims. We actually put together products for our clients. We handle very large employers, so Apple, Starbucks, State of Texas. My team was putting together products to help them lower their costs and help their employees.
As a really good example, finger sticks are the number one issue in a lot of hospital systems. They waste a lot of time and a lot of energy, with people getting hurt because of that. My team was actually isolating that and working with our risk management group to help prevent it, saving people injury, time, and potential exposure to biochemicals. I get to do stuff like that. I don't want to say it's random, but you never know when the next really good idea is going to come. I get to work on the next good idea and then turn it over to somebody much smarter than I am to try to make it work really well. That's the joy of my job. That's what I think insurance should be doing.
The bias question is a little bit harder. There are the technical items, but what I find the hardest part about bias is the definition of bias. I spoke on this last year. There is a technical definition of bias. However, that's not how people use it today. Everything is biased. So is it good bias or bad bias? Because there's good bias, which means you're very predictive. Bad bias means you're being predictive for all the wrong reasons. Depending on what you're going for, you want to minimize what I call the bad bias. There are a couple of different standard techniques to do that, but the hardest part is really understanding the data and getting back to that data. Is it incomplete? Was it captured a specific way? That's a much longer conversation than we can have here.
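The "good bias versus bad bias" distinction Bill draws can be made concrete with a simple per-group error check. This is a hedged sketch, not one of the standard techniques he alludes to; the group labels and data are invented:

```python
import numpy as np

# A minimal sketch of separating "good" from "bad" bias: a model should
# be predictive overall (good bias) without its errors concentrating in
# one group for reasons unrelated to risk (bad bias).

def error_rate_by_group(y_true, y_pred, groups):
    """Misclassification rate computed separately for each group label."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return {
        g: float(np.mean(y_true[groups == g] != y_pred[groups == g]))
        for g in np.unique(groups)
    }
```

A large gap between groups doesn't prove bad bias on its own, but it is exactly the flag that sends you back to how the data was collected, which is Bill's point about incomplete samples.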
Third-party data and big data in insurance
Actually, third-party data is going to become more and more critical. And what I'm really excited about is the ability to take words and numbers and put them together. With the explosion of generative AI, we're able to do things we really wanted to do 30 years ago, because we have a lot of text in insurance. We have a lot of numbers, but we also have a lot of text, and a lot of good text once you clean it up some. And individual insurance companies can do a lot with their own data, but there was a recent example of a product out there that takes medical information and litigation information and puts it all together to really hone a strategy for individual companies. And it's going to get more and more powerful as the computers are able to process more and more. So big data is going to play a role.
And that third-party data matters, because insurance companies, unfortunately, while they might be large, have their own personality. Every company has its own personality, whether it has a duck going "Aflac" or Flo in her nice white uniform. They take on these personas, and their data takes on that persona as well. Eventually, I would like to have something that crosses the industry, because I do think we're missing out on critical insights for the populace as a whole. People do jump between insurance companies, but what I see missing is a population-as-a-whole type of data structure. Now, a lot of insurance companies wouldn't like that, because it could mean some competition, but I think we need more of that to level the playing field.
Causal vs. prediction models in risk evaluation
I learned a long time ago, a number is just a number, and it's going to be wrong. Every model is wrong. What you have to understand is what's driving the model. If you don't understand what's driving the model, it could be the best model in the world, but it's still going to be wrong. I actually think you have to have a wide array of both in the process. Unless you can only have one model, there's no such thing as the one right model. What I like about using a platform like Posit's is that it allows us to do multiple things all around the prediction process, because it shouldn't be one or the other. It should be whatever makes sense for the problem at hand. Again, you always have to know the assumptions. As a good example, actuaries have very specific techniques in reserving. Each one will give you a different answer on the exact same data. If you understand how those models work and what assumptions go into them, you can actually level them to see if they're roughly the same. That's the important part. It's understanding what you're doing and what the question should be.
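To illustrate Bill's point that different reserving techniques give different answers on the same data, here is a rough sketch of one classic technique, a volume-weighted chain-ladder projection. The loss triangle is invented, and real reserving work layers far more judgment on top:

```python
import numpy as np

# Cumulative paid losses by accident year (rows) and development age
# (columns); NaN marks values not yet observed.
triangle = np.array([
    [100.0, 150.0, 180.0],
    [110.0, 165.0, np.nan],
    [120.0, np.nan, np.nan],
])

def chain_ladder_ultimates(tri):
    """Project each accident year to ultimate with age-to-age factors."""
    # Volume-weighted development factors between adjacent ages,
    # using only the years where both ages are observed.
    factors = []
    for j in range(tri.shape[1] - 1):
        both = ~np.isnan(tri[:, j]) & ~np.isnan(tri[:, j + 1])
        factors.append(tri[both, j + 1].sum() / tri[both, j].sum())
    # Roll each year's latest observed value through the remaining factors.
    ultimates = []
    for row in tri:
        observed = row[~np.isnan(row)]
        latest = observed[-1]
        for f in factors[len(observed) - 1:]:
            latest *= f
        ultimates.append(latest)
    return ultimates

ultimates = chain_ladder_ultimates(triangle)
```

A different factor selection (simple averages, or excluding an outlier year) would produce different ultimates from this same triangle, which is exactly why understanding the assumptions behind each technique matters.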
Adapting to AI and regulatory uncertainty
How are you adapting your actuarial models slash approaches to address the evolving risks introduced by recent advancements in AI and the resulting complexity, regulatory uncertainty, as well as ambiguity in liability slash accountability structures?
No, no. Actually, it's a very interesting question, because it's evolving. The one thing you don't want to do is overreact, because as folks are finding out now, one group of politicians took a path prior to the election, and now that path is going in a different direction. The uncertainties have always been there. That's where you want to use scenario models, scenario-planning types of activities, from a risk standpoint. Scenario modeling is not just good for climate change; it's good for all kinds of things. What you want to do is test the various things. Again, it ties back to the prior answer: you want to know what the assumptions are and what you're doing.
I've built a lot of models over the years, and they are getting more sophisticated. The problem is that when you get more sophisticated, it doesn't take much to create a real issue within the model. I do believe in keeping models as simple as possible. I would rather take 10 very simple models than one really good complex model. Within all the models, or whatever model you have, you want to really outline the assumptions, stress the assumptions, vary those assumptions, and figure out where all those things land. The uncertainty is always going to be there. The question is, have you laid out a pathway for that potential uncertainty? It's okay to say we don't know that this is going to be the answer. When I wrote the pandemic paper, I got lucky. I got about 80% of how the pandemic would work right and about 20% wrong. That's pretty good, but it was just a guess. You have to be comfortable knowing that you took it down a path. That's all you really need to do.
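The "outline the assumptions, stress the assumptions, vary those assumptions" advice can be sketched with a deliberately simple loss model run over a scenario grid. Every parameter below is made up for illustration:

```python
# Run one simple model under several scenarios rather than trusting a
# single complex model; compare the spread of outcomes.

def expected_annual_loss(frequency, severity, trend, years):
    """Expected loss under a simple compound severity-trend assumption."""
    return frequency * severity * (1 + trend) ** years

# Hypothetical scenario grid: each entry varies one assumption.
scenarios = {
    "baseline":        dict(frequency=100, severity=50_000, trend=0.03),
    "high_inflation":  dict(frequency=100, severity=50_000, trend=0.10),
    "shock_frequency": dict(frequency=150, severity=50_000, trend=0.03),
}

# Project each scenario five years out.
results = {
    name: expected_annual_loss(**params, years=5)
    for name, params in scenarios.items()
}
```

The point is not any single number but the spread: if one stressed assumption swings the result far more than the others, that is the assumption to go scrutinize.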
Ethics, regulations, and model accuracy
Well, if I didn't have to worry about ethics and regulations, we could get some very sophisticated, highly accurate models where no one would buy the insurance. Insurance is a pooling mechanism. That's how it started out. When people started to try to compete, that's when a lot of this segmentation really started to come in. You can only segment so far, because some people are just bad risks, and the rest of us have to make up for them. Again, insurance is actually a community product. We put our money into a pool just in case something bad happens. To some people, something bad happens all the time. The question is, how specific do you want to get? I mean, do you want people to have insurance? If you want people to have insurance, you have to lose some of that specificity. Again, it comes down to what your goal is. I believe in the mechanism of insurance. Do I want to pay more than I have to? No, but I'm going to have to, so everybody can get it. What's your passion for wanting to help other people?
Explainability vs. accuracy
Yeah. My comment in the chat was just that I find that models that impact actual people's health outcomes or safety outcomes or cost outcomes, speaking as a data scientist embedded in an actuarial department, need to be explainable before anything else. Those simple models, where you know exactly how the model got there, become so much more important than "we threw it into a black box model, we were 97% accurate, and we can't tell you why." If you're going to deny someone coverage, you need to be able to justify that. I was curious how you find that paradigm and that balance within some of the work that you do as an actuary or as an insurance provider. What does it look like for you to strike that balance between explainability and accuracy?
Without transparency, there is no model. You cannot charge someone something. You cannot deny someone something unless there's full transparency. There is absolutely no black box at all. I'm going to say our executive management is so on board with that. If we cannot explain it to a third grader, we can't do it.
If we cannot explain it to a third grader, we can't do it.
Posit tools and operational use cases
I'd rather have my data science team do that, because they're much smarter and better at it than I am. For us, I'm going to... I still call it RStudio. Sorry. They have gotten a lot of traction using those kinds of tools for what I call the operational side of the world. We have a premium audit department. For those who don't know, large accounts give us premium upfront, but depending on how many people they have or don't have employed, their premium can change either up or down. At times you could be talking hundreds of thousands of dollars swinging in either direction. We are obliged by law to make sure that they either pay, or we give back, the amount that we should. But because we have so many clients, one of my team's projects was to go out and basically figure out who they should talk to first: who has a significant variance from year to year and who doesn't. It's that type of use case, improving operational efficiency, because it allows us to get to those clients who we either owe money to or who owe money to us. It helps us figure out where we should be as an organization. That operational use case is really well suited, if you structure it right, for Posit types of tools.
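The premium-audit triage project Bill describes, figuring out who to talk to first based on year-to-year variance, could be sketched like this. Client names and premium figures are invented, and the real project's logic is surely richer:

```python
# Score each client by the size of its year-over-year premium swing and
# work the largest swings first. All data below is illustrative.

clients = {
    "client_a": [1_000_000, 1_050_000],  # modest swing
    "client_b": [800_000, 1_400_000],    # large swing: audit first
    "client_c": [500_000, 480_000],
}

def yoy_swing(premiums):
    """Absolute change between the two most recent annual premiums."""
    prior, current = premiums[-2], premiums[-1]
    return abs(current - prior)

# Work queue for the premium audit team, biggest variance first.
audit_queue = sorted(clients, key=lambda c: yoy_swing(clients[c]),
                     reverse=True)
```

Ranking by dollar swing is the simplest possible prioritization; a real version might normalize by account size or model expected audit adjustment, but the operational idea, sorting the queue by likely impact, is the same.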
AI and machine learning in the actuarial world
I'm old school. I still call it machine learning; there's no such thing as artificial intelligence. I think it's very useful. Actually, I just wrote an internal paper (I can't share it outside, because it's just for us) on how it should be used within the actuarial world in certain respects. The tools are going to make... this is a double-edged sword. The tools are great. I love them to death, because they make me more efficient, because I can do certain things. What I worry about with the tools is the lack of prior knowledge. I've been doing this 40 years. I've been able to see things and understand why they work. I've actually made the comment to a colleague: I started with punch cards. Then I went to a dumb terminal. Then I went to the first personal computer. Then I went to networked computers. Through all that technology, what I still have, and what I don't see in a lot of people today, is an understanding of the underlying concepts. You can take a tool and throw a lot of models at it, but you really do need that underlying conceptual stuff first. I think AI is going to really speed things up and make people better, as long as they have the underlying knowledge they need with the tool. My biggest concern is that we're not teaching people enough about some of the harder things they haven't done yet.
The future of insurance
Given your comments about how things are no longer, quote, unquote, once in a lifetime and will be speeding up, what do you see as the role of insurance, broadly speaking, going into the future? It seems like things are going to get harder and harder to pool in a way that works out in the math.
Actually, it is. I sit on what's called the General Research Council for the Society of Actuaries. We started talking about this because certain risks, I don't think, fit into the insurance mechanism anymore. We need, as a group of really smart people, to figure out how to get them taken care of, because they still need to be taken care of. You don't want to leave anyone in a bad state. The question becomes, who are the right people in the room, and what are the things that need to be done? Because you see it happening in Florida. You're going to see it really happening in California. The state is going to have to start taking a bigger burden, which means everyone in that state is going to be taking a bigger burden, because the insurance mechanism wasn't built for mass casualty events. If you look at all the mechanics behind it, it really is supposed to be the low-probability, higher-severity type of thing. Instead of an entire community burning down, it's a house in that community. That's how it was originally built. It's transformed over a generation. We, as a group, need to do something. You're going to need to involve regulators. You're going to need to involve the insurance industry. But you're going to need to involve the communities, too, to see how best to do it, because every community has something different. You're sometimes going to have to reach outside of our geopolitical sphere to get a solution. It's a complicated thing. We are trying to think about it now, but some things are going to be uninsurable.
Awesome. Thank you, Bill. Thank you so much for all of your insights and sharing everything with us today. I will wrap it up there because we have a minute left. I want to say thank you to Bill and to everybody for showing up. I would really like to let you know who's coming up next week as well. We have Victoria Prince, Senior Manager of Statistics at Takeda Pharmaceuticals. I'm going to put in the chat here for you a link to her page on the Data Science Hangout website, so you can go check her out. Find her on LinkedIn. Thank you so much, everybody, for making this a wonderful discussion. We will see you next week. Everyone have a wonderful day.
