Data-driven people analytics | Josh VanderLeest | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Welcome back to the Data Science Hangout, everyone. If we haven't met, my name is Libby. I'm a Community Manager with Posit. I help host the Data Science Hangout and foster our Hangout community. And if you're not familiar with Posit, Posit builds enterprise solutions and open source tools for people who do data science with R and Python. And we are also the company formerly called RStudio.
I am joined by the creator of the of the Hangout and my lovely co-host, Rachel. Hi, everybody. I'm Rachel, as Libby said. And I'm actually usually, well, I'm usually in Boston, but I'm in Minneapolis this week for our company-wide workweek. So Libby will be our main host today, but I'll be helping out behind the scenes here.
Yeah. The Hangout is our open space where we hear what's going on in the world of data science across all of our different industries. We chat about data science leadership. We connect with other people that are facing similar things to us. And we get together every Thursday, same time, same place, with very few exceptions.
If you are watching a recording on YouTube and you want to join us in the future, we would really love to have you here live because you get to ask questions that way. Just as a call out to anybody who is adding it to their calendar right now, make sure it adds for the right time. It's 12 p.m. Eastern time.
Alrighty. I wanted to thank everybody for making this a friendly and welcoming place. You have all made it that way, and we are committed to keeping it that way. If you have feedback about your experience today, whether it's good or bad, we want to hear from you, and there's going to be a survey that makes that really easy.
At the Hangout, we love hearing from you. The Hangout is a community discussion. This is not a presentation, a slide deck. This is a community discussion. We want to hear from you, no matter your level of experience, your title, your industry, what language you work in, or whether or not you even use a coding language. And we also really encourage you to connect with each other in the chat.
There are three ways to jump in and ask a question today. Because this is a community-led discussion, we will not have questions to ask our wonderful co-host guest here today unless we all jump in and ask them together. So you can raise your hand on Zoom. You can put questions in the Zoom chat. There's also going to be a Slido link where you can ask a question anonymously, and we would love to have you do that as well.
With that, I am so excited to be joined by our co-host today, Josh VanderLeest, a manager of data analytics and people analytics at Progressive Insurance. Josh, it's so great to have you. We would love to have you introduce yourself and tell us a little bit about what you do for fun.
Josh's background and journey into analytics
Sure, yeah. What I do for fun? Well, yeah. So hi, everyone. I'm Josh. I'm based in Cleveland, Ohio. I work for Progressive Insurance. Been here for seven and a half, coming on eight years now. I'm in the people analytics space, but sometimes that means data engineering, data science, analytics, consulting, all sorts of things that I can get into. Something I do for fun: if you've never been to Cleveland or the surrounding area, you might not know that we have one of the best park systems and a national park nearby. So I like to hike, find new waterfalls and new things to see and do.
So I went to undergrad in Michigan and initially I was focused just on psychology. I wanted to be a therapist or psychologist, something around there, like an internship around that space. And turns out I don't like that at all. So I took a very different route and I was like, well, I think I like stats. I like cool charts and things. And so I got more and more into the statistics, cognitive psychology. I started working for a consulting firm called Center for Social Research. And so we were just doing really any kind of community-based research, understanding effects of the watershed, doing surveying for local communities on effective learning, just kind of all sorts of things.
And I stumbled on something called, and this is so wordy, industrial-organizational psychology. So if you know people analytics, hopefully you know IO psych for short. IO psychology, while kind of small, is growing rapidly. If you look it up, you usually see it ranked as, like, the number one career in the social sciences and things like that. So it's a growing space. If you haven't seen it before, give IO psychology a Google. You'll probably end up on the SIOP website, which is the Society for Industrial-Organizational Psychology.
So I ended up going to graduate school for IO psychology. And while I was there, I continued to really like consulting for companies. That's how I ended up in Ohio. I moved to University of Akron for my PhD in IO psych. And I didn't love the theory. I really liked the practical stuff. I liked writing code. I liked creating impactful reports that people actually used. And I ended up leaving after getting my master's.
And I ended up finding, well, it was a data analyst role at Progressive. Back then we called it human capital analytics, but I thought that was just a terrible name, so we did end up changing it to people analytics. And starting out, it was around just report requests, like, hey, I want to know what turnover is in IT, or, hey, we need this stat for this report.
But slowly kind of grew and developed. When I joined the team, there were four, five of us, and then now there are 18. And I started as an analyst, and now I sort of carved out a niche more in the data science side of things, research and analysis. I lead a team of three, but a majority of my day is individual contributor work, developing, reporting solutions, applications, using R, Python, and all that kind of stuff.
And I'm just so grateful that I am still in this space. I love what I do. I get to do all sorts of things. I think that's what's fun about people analytics is it touches so many different areas of the business. Even at Progressive Insurance, where we have 70,000 employees, I get to do everything, all people data, surveying, recruiting, hiring, employee retention, I get to do it all.
What people analytics teams work on
In a larger company, you know, 70,000 people at Progressive, and at other companies like us, you usually have multiple functions within people analytics. So I say I work in people analytics, but I'm one of four teams. We're all small teams, but we have business intelligence, business intelligence developers. These are the people really focused on creating data solutions, sometimes building apps. Right, we have data coming from all these different source tools. How do we get them into Snowflake or whatever database you're all using?
We have data governance: definitions, making sure people have what they need, making sure the right people get to what they need. And then reporting solutions. This is our largest team, still not me. They are creating reports that any HR-related function might need. So, you know, we have maybe 300 to 400 HR consultants out in the field supporting IT, CRM, contact centers, claims, right?
And then me. I'm that last one, which we call research and analysis. So think data science. We do all employee surveying and then support AI solutions. Trying not to be too wordy, you know, trying to make it practical: research for decision-making. Hey, we want to change the way we give people PTO. How can you help us predict the impact of that? Employee surveying, legal support when it's needed, and data-driven DEI. Is it working? How do we do it? How is it effective? How do we integrate it?
An example of a project: surveying is a fun one. So with the rise of generative AI, suddenly it's become a whole lot easier to summarize huge amounts of comments. So, right, we just had a survey go out to everyone about engagement and culture. How can I summarize, call it a hundred thousand comments, for our CEO to get some kind of understanding of, oh, this is what people are talking about, and here's how it looks different from past years? So summarizing and reporting on surveys using traditional analytics, Quarto, Shiny apps, but then also using the more hip (not sure if it's useful or not yet) generative AI to summarize large swaths of comments.
And then, yeah, the turnover is a good example. We have a project trying to better use hiring resume data to understand where people might fit best at Progressive. And then also try to understand, hey, who's most at risk of being unhappy or turning over earlier in their career and kind of improving the onboarding experience for folks.
From NLP to LLMs for survey analysis
So the survey comments, I mean, great question. So we still do some of it. Before LLMs, we did, we're still doing what we would call NLP, natural language processing, but it was at different levels of sophistication. Simple as word counts. What are the words mentioned in these surveys for each question? We would do bigrams, trigrams, which just means like, what are two words that are being mentioned next to each other the most often? It was a lot of like data wrangling to get the data in some kind of interesting cleaned up way.
And then think about single word, bigram, trigram. That was, I think it's called tidytext, the R package. Still a really useful approach and not something we've just thrown away. And so then you have word counts, and you have sentiment that you can just pull out of sentences or words based on huge data sets of, like, this word tends to be negative or tends to be positive.
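The word-count and bigram approach described here came from the tidytext R package. As a rough illustration of the same counting idea in Python, here is a minimal sketch; the tokenizer, stopword list, and sample comments are all made up for the example:

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(comments, n=2,
                 stopwords=frozenset({"the", "a", "to", "is", "and", "of"})):
    """Count n-grams (word pairs for n=2, triples for n=3) across all
    comments, skipping n-grams made up entirely of stopwords."""
    counts = Counter()
    for comment in comments:
        words = tokenize(comment)
        for i in range(len(words) - n + 1):
            gram = tuple(words[i:i + n])
            if not all(w in stopwords for w in gram):
                counts[gram] += 1
    return counts

# Illustrative survey comments, not real data
comments = [
    "Work from home flexibility is great",
    "I value the flexibility to work from home",
    "Onboarding was confusing at first",
]
top = ngram_counts(comments, n=2).most_common(3)
```

Passing `n=3` gives trigrams; a real pipeline would also use a proper stopword list and stemming or lemmatization, which tidytext wires up for you in R.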
The one other thing we would do was structural topic modeling. So there's an amazing R package called stm. I still truly think it's one of the best approaches to summarizing big amounts of comments. If you haven't gotten to the LLM space yet, it's still maybe even better than LLMs, but very intimidating to start working with. So it was a combination of tidytext from Julia Silge and team and then stm to do structural topic modeling.
Structural topic modeling is similar to LDA. If you've been in that space before, it's sort of just take all the texts, take all the comments and look at in a vector space, what are the words that tend to be mentioned near each other? So people that are talking about mentioned the word flexible and work remote and pick up my kid from school. Like those are all similar and words that are mentioned in similar spaces. And so STM is really good at picking up those relationships and creating, hey, we think you have 80 topics in this corpus and here are the top words mentioned for each of those 80 topics.
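stm itself is an R package and does far more than this, but the underlying "words mentioned near each other" signal it exploits can be illustrated with a crude co-occurrence count in Python. This is purely a sketch of that signal, not structural topic modeling; the comments and length cutoff are invented:

```python
from collections import Counter
from itertools import combinations
import re

def co_occurrence(comments, min_len=4):
    """Count how often each pair of (longer) words appears in the same
    comment. Words that repeatedly co-occur (flexible/remote/schedule)
    are the raw signal that topic models like STM or LDA build on."""
    pair_counts = Counter()
    for comment in comments:
        # unique longer words per comment, sorted so pairs are canonical
        words = sorted(set(w for w in re.findall(r"[a-z]+", comment.lower())
                           if len(w) >= min_len))
        for a, b in combinations(words, 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Illustrative comments echoing the flexibility example above
comments = [
    "flexible schedule lets me pick up my kid from school",
    "remote work keeps my schedule flexible",
    "flexible hours and remote options",
    "the cafeteria food could be better",
]
pairs = co_occurrence(comments)
```

A real topic model then clusters these co-occurrence patterns into topics and, in STM's case, relates topic prevalence to document covariates such as department or tenure.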
Yeah. Topic modeling. So we still do that sometimes, but to be honest, I find large language models are pretty good at coming up with those 80 topics and very good at the feature engineering. Hey, I have this comment, please come up with the right topics that are represented in it. So that was the before. And the after isn't to throw everything away and use LLMs for every step, but to use them where it makes sense. There's obviously the challenge of, hey, every API call is costing Progressive money. How do we do this responsibly and efficiently? And most importantly, keep it as accurate as before, if not better.
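One common way to control the per-call cost mentioned here is batching many comments into each prompt and caching results so reruns don't re-bill. This is a hypothetical sketch only: `call_llm` is a placeholder stand-in for whatever approved client your company uses, and the batch size is arbitrary:

```python
import hashlib
import json

def call_llm(prompt):
    """Hypothetical stand-in for a real LLM API call; in practice this
    would wrap your company's approved client and model."""
    return f"[summary of {prompt.count(chr(10)) + 1} lines]"

class CachedSummarizer:
    """Group comments into one prompt per batch and cache by content
    hash, so identical batches never trigger a second billed call."""
    def __init__(self, batch_size=50):
        self.batch_size = batch_size
        self.cache = {}

    def summarize(self, comments):
        summaries = []
        for i in range(0, len(comments), self.batch_size):
            batch = comments[i:i + self.batch_size]
            key = hashlib.sha256(json.dumps(batch).encode()).hexdigest()
            if key not in self.cache:  # only pay for unseen batches
                prompt = ("Summarize the themes in these comments:\n"
                          + "\n".join(batch))
                self.cache[key] = call_llm(prompt)
            summaries.append(self.cache[key])
        return summaries
```

Accuracy still has to be checked separately, for example by spot-comparing LLM topic labels against a sample that was hand-coded or run through the older topic-model pipeline.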
Resources for learning people analytics
My favorite resource that I send to pretty much everyone that asks about people analytics is called peopleanalytics-regression-book.org. The author is Keith McNulty. And if you're a LinkedIn person, he's great on there. Keith McNulty. He's one of my favorites on LinkedIn. And what I appreciate about him is he's pretty agnostic about approach. You know, you don't have to use R, you don't have to use Python. He's got examples everywhere.
The other thing is, if you're really into people analytics, that's the place to start. But honestly, I usually am just sending folks the SIOP website too. So siop.org. I know it's not exactly people analytics, but it's people analytics adjacent and I went to school for it. So, you know, I'm biased. What I like about SIOP is they have meetups, similar to the Posit ones coming up, where it's industry research and consulting. And I love seeing where those do or do not intersect.
Impactful workplace changes from people analytics
When I started years ago, I looked at how Progressive does people development and onboarding. There was work to do. And to me, the biggest piece that was missing is: when we're in person, onboarding is fine, because you know who to talk to. When we all went virtual, we really lost that part of onboarding, making connections, finding people like you.
We had to rethink the onboarding process at Progressive. It no longer worked to just like, here's your computer. Enjoy. Like I'll give you a project in a week. And so, you know, I'm in people analytics. I'm not on the ground, like the HR rep, making sure things are going well. I'm a couple steps removed, but what I saw was we don't have a good grasp of what's going well and what's going poorly.
A more recent people analytics insight that made a difference, and this is one of my proudest accomplishments of last year: we introduced our very first onboarding survey program, where we now check in with you several times through your first year and just ask if you have what you need and how we can help you. And I know it sounds lame, like, really, another survey? But I promise you this one is great. And what we've seen come from it is our leaders seeing, oh, this thing is consistently being mentioned as a pain point. And it makes it really easy to put in interventions and say, let's change the way we do this.
Let's come up with a buddy system for people in their first year. Let's find them a cohort where they can meet every week. And so what I think is really useful about the way we've approached this is it can't be one-size-fits-all at Progressive, because our jobs are so distinct by function. Like I said, we have people going to body shops, people getting their cars and going places, some people taking calls, sales reps, product managers. The onboarding experience is so unique and different.
Attrition modeling and working with stakeholders
We have the same challenge. So we, like, my team, we're in corporate HR. So I think of us as enterprise-wide, and we have that same challenge of, well, I can create a really sophisticated, validated attrition model, but I'm not the one in CRM on the floor doing the interventions. We have counterparts out in the business, right?
I run into this exact issue: oh, these three groups came up with their own turnover model and it doesn't seem like anyone's really validating them. How do we ensure they're of high caliber? There's only so much I can do. So we think of ourselves a little bit as giving them the right tools, because they may not have the technical expertise that we do. So for example, we've created a suite of, here's the toolkit we suggest you use to do attrition modeling.
For now, it's a toolkit of here are the things you should be thinking about, here's the way we recommend modeling attrition. Because I think a lot of people don't currently really understand the right way to do it. And if you don't, you might be doing a really bad job and not realize it. For example, it's easy to use something like logistic regression for something that actually has a big time component to it, and now you've just modeled time rather than actually modeling the event. So things such as: here's how you can use survival analysis, here's how you handle time-dependent and time-independent covariates. And suddenly you have a really predictive, really solid model.
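As a sketch of why survival analysis fits attrition better than plain logistic regression: it handles censoring, meaning employees who simply haven't left yet. In practice you would reach for the survival package in R or lifelines in Python; below is a minimal hand-rolled Kaplan-Meier estimator just to show the mechanics, with invented tenure numbers:

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve estimate.
    durations: tenure (e.g., months) for each employee
    observed:  True if they actually left (the event occurred),
               False if still employed at the study cutoff (censored).
    Returns [(time, survival_probability), ...] at each event time."""
    at_risk = len(durations)
    surv = 1.0
    curve = []
    events = sorted(zip(durations, observed))
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        n = at_risk  # number still at risk entering time t
        while i < len(events) and events[i][0] == t:
            if events[i][1]:
                deaths += 1
            at_risk -= 1  # both leavers and censored exit the risk set
            i += 1
        if deaths:
            surv *= (n - deaths) / n
            curve.append((t, surv))
    return curve

# Two leavers at 6 and 12 months, two still employed (censored)
curve = kaplan_meier([6, 12, 12, 18], [True, True, False, False])
```

A logistic model on "left within a year: yes/no" would have to either drop or mislabel the censored employees; the survival formulation keeps them in the risk set for exactly as long as they were observed.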
None of these models are gonna be perfect. There's so much individual difference that you don't have data to back up. And we also don't want to be big brother-ish and like, hey, we saw you weren't sending enough emails or whatever. In our case, even simple things seem to be helping. So use a somewhat simple survival analysis model, predict your high risk folks, and then just have HR set up a check-in with them.
Progressive really focuses on transparent leadership, meaning I feel like I can actually trust the CEO and talk to them. And so even that skip level, we found, using that as an intervention tool, using the model that I collaborated on with claims, for example, has been effective at reducing some of our high-risk folks. But the tricky part there is how do you keep it simple enough to actually validate? Is that actually helping or not?
And that's, to me, the final step: validating a model and validating intervention effectiveness. It's really hard to get buy-in. Like, why should I care? I used it. It's probably working. And I'm like, oh, come on, we don't know if it really is or not. So getting buy-in on that last step of model validation is really hard outside of the data science community. It's just not something people can really resonate with.
Ethical concerns with AI and employee data
The AI space is something newer to the people analytics team. So I'm on the responsible AI cross-functional team at Progressive, which is honestly a ton of fun. So they have really been forward thinking on like making sure folks are doing this responsibly. We have a cross-functional team that even has an HR perspective where we review every single AI model we deploy at Progressive. Every single one we have listed, we rate it for risk and rate it for how it's being used. And is it responsible and ethical? And we better always say yes.
The other thing to get to, again, before I actually answer it, is there's this really important distinction happening between models that a company deploys and models your vendor deploys. More and more vendors are trying to squeeze their way in and do the AI for you. And they'll say, don't worry about the responsible part, we got this, it's responsible. I would say if they can't give you an explainable AI model, don't do business with them. And I know it's not always your decision, but at least push back and say, we need to understand how your model is validated and run, where it's hosted. Is it in the US? Is it following regulations? How does it make decisions?
And even if it's a black box, at least in the employee space, you can still do the very traditional adverse impact analysis on the decisions it makes. You can always do the very simple 80% rule and a quick p-value on the decisions the system is making. So now, to the part that actually answers it: how do we responsibly deploy AI with employee data? I'm not convinced we need AI for everything, which is a weird thing for me to say, because that seems to be all I do lately. But I think a lot of us are jumping way over the good, easy-to-get, well-defined data straight to the AI, and then we're just missing all this stuff.
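The 80% (four-fifths) rule and a quick p-value mentioned here can be computed with nothing beyond the standard library. This sketch uses a pooled two-proportion z-test for the p-value; the group labels and counts are illustrative:

```python
import math

def adverse_impact(selected_a, total_a, selected_b, total_b):
    """Check the four-fifths (80%) rule on selection rates for two
    groups, plus a two-sided pooled two-proportion z-test p-value."""
    rate_a = selected_a / total_a
    rate_b = selected_b / total_b
    # Four-fifths rule: lower selection rate must be >= 80% of the higher
    ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
    # Pooled two-proportion z-test
    p = (selected_a + selected_b) / (total_a + total_b)
    se = math.sqrt(p * (1 - p) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail prob.
    return {"ratio": ratio, "passes_80pct": ratio >= 0.8, "p_value": p_value}

# Hypothetical counts: 30/100 selected in group A vs. 60/100 in group B
result = adverse_impact(30, 100, 60, 100)
```

A selection-rate ratio of 0.5 fails the 80% rule, and with these sample sizes the difference is also statistically significant, which is the kind of flag that would trigger a closer review of a vendor's black-box system.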
And then you end up with a model that's poorly predictive. You can't explain it and it's not being validated. So you take steps back and say, do you have well-documented data of high quality? Are you doing the basic reporting and analytics? And then maybe you have that last step of, okay, let's use AI to drive decisions. And still, I bet you could just use a linear model, get 96% of the way there, have it be much more explainable, and you'll feel better about it.
LLMs in the hiring process
I think you mentioned comparing people's hiring resumes to fit in their jobs. I'm not convinced it's all that useful yet. When you think about the data that you've actually sent in when you apply for a job, it's a resume. I know what you've done, types of experience, how long you were there, et cetera. It doesn't really give me that much information. And, you know, there was this era in IO psych where we were all in on biodata, which is essentially everything you could pull out about a person, and it only gets you so far.
And it really doesn't carve out enough to predict, oh, that person's going to do great in the job. So I'm still actually kind of a, call it, traditionalist on how we find the right candidate, where I still think it's resume screen just for min-qual types of things, and then really having the right assessments throughout the hiring funnel. I still think a bit more traditional assessments that measure the right cognitive abilities, personality fit, and job fit don't need fancy AI.
Don't just give me a personality assessment. Give me a half an hour experience of actually doing that job. And it rates me along the way. Now I know, oh, that job is not what I thought it was going to be. I don't think I want to do that. And now you have some data to know if I'd be good at that job anyways. So I am all in on a valid predictive assessment. That's a pretty good experience for the candidate and gives them a realistic job preview along the way.
For me, where we've looked into that is more like using LLMs for feature engineering. Hey, we want to look at candidates that have this type of experience because now they're at Progressive and we can send them, hey, we think that this might be a good next step for you. So that's where I think they're more useful for now is feature engineering, pull out types of experience, lengths of experience, those kinds of things for more formative and development, less around selection decisions. I would not use LLMs in selection decisions or anything like that. I don't trust those types of things yet.
A/B testing and experimentation in people analytics
I only have so much influence. So I have this challenge where sometimes my HR partners will be like, hey, we implemented a thing. Let us know if that worked. I'm like, oh, if you had told me before, we could have done an A/B test, right? So often there are just process changes, and they're happening so often there's just no way to do an A/B test. So practically, not usually.
Progressive senior leadership highly values data. And I can even tell them they're wrong. If I have data to prove it, they will believe me and change things, which I think is amazing. There are places we do A-B testing. It's usually smaller and maybe less risky. We do these inclusion quarterlies. We call it inclusion quarterly. It's just about different kinds of things. And so the question was, how can we encourage participation? And so we did an A-B test around our approach to encourage people to go to the event and get engaged. So in this case, our experimental condition, we put a block on everyone's calendars. Not that they had to go to the event, but we put a block there so that no one could put meetings on that day.
I think companies in general undervalue the existing research in this space. We already know what works and what doesn't in onboarding, in development, in performance. We know a lot of this stuff; big companies are just stubborn and think we need to do the research ourselves. So I honestly think A/B testing is not always worth it, because I know the answer, they're just not listening. So it's more about giving a compelling story and effectively looking at external research before you bother doing A/B testing in that space.
And the other thing I want to mention is that sometimes, practically speaking, it's just not possible. So you have to really think about the right way to do A/B testing. For example, if I just did a pure random sample, now some people on a team are experiencing the thing and some aren't, and everyone talks to each other. So there's a lot of that bleed if you don't think about A/B testing in the right hierarchical way: making sure you're doing it with groups that are comparable, that probably aren't actually interacting, and still doing it responsibly so you're not affecting some group negatively.
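The "bleed" problem described here is usually addressed by randomizing at the cluster (team) level instead of the individual level, so coworkers who talk every day share a condition. A minimal sketch, with made-up employee and team IDs:

```python
import random

def cluster_randomize(employees, seed=42):
    """Assign the experimental condition at the team level rather than
    per person, reducing contamination ('bleed') between conditions.
    employees: list of (employee_id, team_id) tuples.
    Returns {employee_id: 'treatment' | 'control'}."""
    teams = sorted({team for _, team in employees})
    rng = random.Random(seed)   # fixed seed keeps the assignment reproducible
    rng.shuffle(teams)
    treated_teams = set(teams[: len(teams) // 2])
    return {emp: ("treatment" if team in treated_teams else "control")
            for emp, team in employees}

# Hypothetical roster: 20 employees spread across 4 teams
roster = [(i, i % 4) for i in range(20)]
assignment = cluster_randomize(roster)
```

The trade-off is statistical: because outcomes within a team are correlated, clustered designs need more participants than individual randomization for the same power, which is part of why he notes it is not always worth doing.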
Career advice
I have a boring career. I started at Progressive in one department and never left, because I like it too much. So I have smaller things, not some big here's-how-to-rethink-your-career advice. One is, for your own year-in, year-out performance, keep a brag sheet, a brag log. The concept is basically: throughout the year, you're writing code, you're working with people; document what worked really well, so that when you want that promotion, you have your data-driven documentation of why you deserve it.
I think some of the most brilliant people do such a terrible job advocating for themselves. Brag about yourself. So document every year, set a calendar invite once a month, once a week where you give yourself 15 minutes to write down, here's what I did really well this week. It could be small, it could be big, effective PR or pull requests. You found code that was wrong. You implemented the new onboarding program, whatever. Brag sheet.
I put it in my day, a half-hour recurring, every Friday at 10 a.m., and I just label it. That's what I do. It doesn't have to be fancy. I just write a couple of things in an ugly formatted OneNote, or whatever you use at your company, and then by the end of the year, boom, you've got it ready.
Thank you so much, Josh, for joining us and imparting all of your people analytics wisdom with us. We had so many questions left over that we could not ask them all. Maybe you and I can get together after this, and we can rapid-fire get some answers. Sure, I'd be happy to. This was a lot of fun. Thank you, everyone, for the participation. It was an absolute honor. Thank you.
Absolutely. Thank you, everybody, for spending your lunchtime with us or your breakfast with us. We hope that you enjoyed it. When you close this, there's going to be a survey to kind of give us feedback or ask us questions. Please, please, please do that. Your feedback means so much to us, and we will see you again next week on Thursday. Have a wonderful week.
