Four steps for managing data teams | Toby Hall | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Herron, and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you're not joining us live, you're missing out on the amazing chat that goes on. So find the link in the description where you can add our call to your calendar, and come hang out with the most supportive, friendly, and funny data community you'll ever experience.
I am super excited to introduce our guest today. Our guest is Toby Hall, EVP and CIO at Delta Dental of Michigan. Toby, I would love it if you could introduce yourself, tell us a little bit about what you do and what you like to do for fun.
Sure. I'm excited to be here. My name is Toby Hall. As Libby said, I'm the Executive Vice President and CIO for Delta Dental of Michigan. We are actually the headquarters here for three different Deltas: Delta Dental of Michigan, Ohio, and Indiana. We're a little bit like the Blue Cross system, where notionally there's a Delta Dental in every state. And we have forever run our entire business on a homegrown IT system, and I lead the team that owns and maintains it. I've got about 500 team members, 510 or so. By training, I'm an actuary. That's where I first was exposed to R. I spent the first 20-some years of my career doing actuarial work, many of them as chief actuary for Delta.
And what is something that you like to do for fun? Oh, yeah. So for fun, I have two kiddos, so a lot of time is spent chasing them, going to their different sporting events and school events. Reading, and, ironically, staying up on R. I still do a lot of R coding, just as a hobbyist on the side.
Delta Dental's data and teams
So Toby, tell me a little bit about your day to day, but also the data teams at Delta Dental of Michigan. What is the type of data that they work with and what is it that they solve with it or what decisions are made based on it?
Yeah, that's a great question. So my day to day, I would say there really isn't a day to day. It's always different. With the breadth of the application that we support and the number of team members, there's always something going on, something different. It's a lot of project management type tasks, though. With insurance, we're very heavily regulated, so there's a lot of data security work that takes up a lot of my time. And then, just in the last 18 months, obviously AI has been the big buzzword. One of the teams that reports up to me is our data science team.
We stood up a data science team a few years ago. It's a team that, economically, you have to be at a certain scale for it to make sense. To do data science right, you need a balanced team: domain experts, statistics and modeling experts, technology experts. It's really hard to find that one unicorn person or small team that can do it all. So to enter that space and do it well, you need a critical mass of a team, four, five, six people. And those resources are not the cheapest resources, so you've got to have a certain scale to your business for that to make economic sense.
And we crossed over that bridge a few years ago and got approval to build out a data science team. The data they use is insurance data, so it's pretty much what you would expect. It's data about our network providers, dentist data: where they're practicing, how many patients they're seeing, how long they've been in our network, that type of data. Then claims data is a big one. The way we ought to be looking at it in a highly transactional world is that a claim is effectively a receipt for the care that got rendered. So the provider data tells us who performed the service, the claim tells us what was done, and the eligibility data tells us who got the service, who was in the chair getting the dental treatment.
And there's a lot of different use cases for that. At times we're looking at it as cross-sectional data, sometimes it's more longitudinal. We've had some great wins lately with quality of care type models and data work that we've built. We've built some great apps that our teams can use out in the field, powered in the background with a lot of heavy lifting and a lot of hard work, a lot of Python being done in the back on that.
We've just recently started bringing in third-party data where we can buy or rent data from third parties. And we're still trying to get our arms around the best uses of that. We're probably not as mature as we need to be, but we're getting much better.
Third-party data and fraud detection
When we think third-party data in our world, it's honestly more around how can we get a more full picture of provider behavior. And we're really looking for either end of the curve. We have an obligation, a moral obligation to be looking out for fraud, waste, and abuse. The other end of the curve though, what are those dentists that are providing just excellent care, excellent patterns of behavior, conservative dentistry, they're keeping their patients healthy for the long-term. Those are the dentists we want to look at. We want to reward those dentists.
We did get data around financial behaviors. And no surprise, you find a dentist that's got tax liens, has filed for bankruptcy, and maybe has borrowed money a bunch of times in the last couple of months, and yeah, that's a dentist we want to take a look at. On the other side, you've got dentists who have gotten clinical awards. Well, we want to know that too. Those are dentists we want to look at on the other end of the bell curve.
R vs. Python and tool choices
The four main languages or tool sets used in that area: we do get a lot of data pulled and manipulated through SAS, which was here when I came here in 2003. I'm not sure that, if we were building it out today, we would be using SAS, frankly. But it serves a purpose. Then R, Python, and just core grassroots SQL kind of work. Those would be the four biggies. And I would argue we're at the point now where it's sort of modeler's choice, whatever you're most comfortable with.
I am seeing a huge shift, though, in the last five years. Youngsters getting out of college are very much on the Python wagon. R is a little bit less common now that I'm out of the actuarial area. If I put my toe back in the actuarial area, I think I'd see a lot more R than I do today.
AI governance and the "guilty until proven innocent" approach
The departments of insurance, which regulate us in our states, are fairly skeptical of AI at the moment. Some states, I would say, are outright hostile towards it. Definitely asking questions about AI and your AI governance and how you're thinking about that. The way we've chosen to do it is a little bit of a guilty until proven innocent approach, where you're effectively not allowed to use AI until you've gone through an approval process that says, yes, you can.
It's a two-stage process. The first approval: we have a use case for AI, and it goes to a team that thinks about it through an ethics, bias, and governance lens. They'll ask questions like: what data is being used? Is that data staying within a dedicated tenant, or is it being used to train a public model, so open versus closed LLM? Does the data ever leave shore? What tool are you going to use? That approval process is really trying to answer two questions: are we OK with, and proud of, what's happening with the data for this use case? And should AI be used in this use case at all?
And sometimes we're finding it's somebody wanting to do something fun and cool, and there's actually a better way to do it that's not AI anyway. So stage one is: what tool, and what is happening with the data? Stage two, assuming it passes stage one, is more where the CTO and the architects get involved. Are we going to have to integrate? Are we going to have to punch a hole in a firewall to get to an external tool? So it's a little bit more around the technical: how would we do it, and is it worth our time to do it? It's a two-stage process. The assumption is you cannot use AI, and then further down the road you can prove your case and get approval to use it.
I also sit on an AI task force for the IAA, the International Actuarial Association. In Europe, AI is being thought of more from the environmental or carbon-footprint angle. And they're saying, yeah, we shouldn't always use AI; it's not always something that has to happen. So it has been interesting to hear that awareness, that AI might work and might be a candidate solution, but it doesn't always have to be the solution.
The fundamental pieces of AI have been around forever, since the fifties. We've just never had big enough data sets and enough compute power to do it. So this is a weird case where the technology has caught up with the science. It used to be the other way around. And I think we're all struggling with that just a little bit.
Four steps for managing data teams
Managing, I think, is a four-step process, and we overthink it very, very frequently, but it's the same four steps on rinse and repeat. Step number one is put the right people in the right role, and that's an ongoing battle. The minute you get your chessboard set up, the game changes or the opponent moves or whatever. You've got to move people around.
And then as a leader, you have to give them the right vision of what you're trying to accomplish. Make sure it's really clear and we're all aligned on what that is and why we're doing it. Give them the right culture to get it done, and then the biggest one, number four, is get out of their way. If you've done the first three, you've got the right people, you gave them the right vision, you gave them the right culture, you should be delighted to get out of their way and let them do their thing. They will knock it out of the park every time. If you're not willing to just get out of the way, you probably need to look in the mirror, because either you didn't do steps one through three correctly, which could be true, or your ego won't let you get out of the way and you're in there micromanaging when you shouldn't be.
Matching skills to projects and persuading leadership
The first one, how do you put the right people in the right role? I think, again, it comes back to knowing your team. You've got to know their interests. You've got to know their skills, strengths, weaknesses. And I'm a firm believer in the stretch assignment that you think they can accomplish and pull off. However, you've got to make sure you're not stretching them so much that it's demoralizing.
I'm a firm believer that employees need to own their own career development. You, as their manager, support them. You get roadblocks out of their way, but they have to own their own career. The second question, though, was management being ready for new techniques. You do have to do a little bit of a marketing thing. Being able to communicate the results and explain them is huge. I would argue that's the lion's share of whether it actually succeeds and goes to production or not.
I used to say it with actuaries. I'll say it with data scientists. I will take a mediocre data scientist with excellent communication skills over an excellent data scientist with mediocre communication skills any day of the week.
Hiring for traits, not skills
What I've decided lately because of that is the right way to go here is to interview for traits, not skills. Skills, as long as you're somewhat technically inclined, you can pick up along the way. I mean, that's one of the beautiful things about Python, and maybe to a lesser degree R: it's pretty easy to pick up if you're technically inclined, and things like RStudio make it easy to do. So I would say we're less cranked up about discrete skills, more about traits. And by traits, I mean: are you curious? Do you ask good questions? Are you a lifelong learner? How are you going to work in a team? How do you attack a big problem?
We spend a lot of time in the interview and screening process trying to get to those traits rather than discrete skills. My other reason is we found, when we were really cranked up about skills, like do you have the latest Java certifications, those types of things, that by the time a skill is realized in the marketplace, it's adopted by a university curriculum, they pump out a couple of graduating classes, there are some certifications achieved, that skill is probably five or six years out of date already.
Building a data science team from scratch
When we first built the data science team, I knew who I wanted to lead it to get it up off the ground. We also knew during the interview process that if you saw the "it", and you know it when you see it when you're doing interviews, and it doesn't exactly align, you're better off bringing the person in anyway and working with it. Maybe that person exemplified the number three skill you were looking for instead of number one, but if they're a really good person and you think they're going to be a good fit, go ahead and take the risk. Do it anyway.
Part of the luxury that we had in that is it wasn't an existing data science team. We were building it from scratch, trying to define what it was going to be. So if we had to pivot what the vision of that team was because of the team we recruited, that was fine. It wasn't like we were trying to fit into an existing structure. So that was a little bit of added flexibility. Now that the team is up and running, I think we're a little more conscious and a little more aware of where there's gaps or where there's a need to backfill.
Staying current and continuous learning
Yeah, this is a fantastic question. And I am blessed with a team that's very curious and self-motivated. You know, we recruit very curious, intelligent, self-motivated folks on the front end, so it kind of takes care of itself, to be honest.
I would say, yeah, okay, I'll give you three. One: when new challenges come your way, be very careful saying no to challenges. That creates an impression and a brand for yourself that may not be great. Number two: when you build your network, which everyone will tell you to do, it's not a transaction. I would rather you build three or four deep, lasting relationships than 12 or 15 transactional ones. And then the third one, which sounds very cliche, but it's shocking how many people don't do it, is the continuous learning. Just the fact that you're all on this hangout tells me that's not a concern with the folks here. And it doesn't mean you have to spend three hours a day on it. Ten or twenty minutes a day, done consistently over months and months, will yield incredible results. It's like compound interest for the brain.
I do Tidy Tuesday, slash its Python counterpart, every week. And nothing will force you to learn something new or refresh yourself every single week like working on something like that. Every single week the data is different, and every single week I force myself to refresh on something different or push myself. It's not even about the data set; it's about what skill this week. I was like, all right, we're going to refresh on fuzzy matching, and within an hour I'd refreshed on something I hadn't done since grad school. So highly recommend Tidy Tuesday, y'all.
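As a flavor of the kind of refresher he describes: a minimal fuzzy-matching sketch in Python using only the standard library's difflib. The provider names here are made up for illustration, and this is just one simple approach (real pipelines often use dedicated string-matching libraries), not a description of Delta Dental's actual tooling.

```python
from difflib import get_close_matches

# Hypothetical clean reference list and messy names to reconcile against it
reference = ["Smith Family Dental", "Lakeside Orthodontics", "Downtown Dental Care"]
messy = ["smith family dental llc", "Lakeside Orthodontic", "Dwntown Dental"]

for name in messy:
    # get_close_matches scores candidates with SequenceMatcher ratios;
    # cutoff=0.6 drops weak matches, n=1 keeps only the best one
    matches = get_close_matches(name.title(), reference, n=1, cutoff=0.6)
    print(name, "->", matches[0] if matches else "no match")
```

Even a toy example like this forces you to re-learn the knobs: how the similarity ratio behaves, where the cutoff should sit, and what to do with the no-match cases.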

