Alan Schussman - Getting Data Done with a Pragmatic Data Team

Oct 31, 2024
20:24

Transcript

This transcript was generated automatically and may contain errors.

Posit conf, this is cool. Okay, so Annapurna. Annapurna is the 10th tallest mountain in the world. It was, in 1950, the first 8,000 plus meter peak to be summited, by an expedition of mountaineers led by Maurice Herzog. It's this classic tale of, like, disaster in the mountains.

The thing I think a lot about when I think of Annapurna is that after a year of planning this expedition, the team got to Nepal. They spent a year getting permits, training. This, of course, was the culmination of years of training, right, to be elite, elite world-class mountaineers. They get on the ground in Nepal, and they're not sure how to get to the mountain.

And, in fact, this is super apt, I think, for thinking about data science. It's not just that they don't know how to get to the base of Annapurna, they're not sure if they want to climb Annapurna. They've got their eyes on this other mountain, Dhaulagiri, that's also an 8,000 plus meter peak. So, they know that they want to climb one of these two 8,000 meter peaks, but they're not sure which one. But they brought all their gear, and they trained, and they're really ready. So, the first problem is, like, we've got to get to the mountain.

Are we really data scientists?

So, I think about this situation a lot when I have conversations with my team that go something like, it's all around this theme of, like, are we really data scientists? So, questions like, when we get to be data scientists, will we be able to, and it's use a certain method, it's employ a certain kind of statistical technique. Or, can we really be doing data science if we're not, and this is when we tend to get into specific tools, if we're not using, like, the newest tool, if we're not using a specific method, if we're not using a specific application stack? This gets into really specific, like, in the weeds of, like, the way that we want to practice our work.

Or questions like, am I really doing data science if I, like, I spent all this month solving infrastructure problems? I spent the whole week figuring out, like, database drivers. Like, every time I fire up a project that has to connect to Oracle, I kind of hold my breath, because I wonder, am I going to spend the week instead solving that database driver problem?

So, I talk about this kind of stuff with my team, and my instinct, I believe in a really expansive definition of data science that encompasses a lot of different kinds of activities. And so, I say, yeah, we're data scientists, because I want to be really encouraging and really affirming of the work that we're doing. So, in doing this coaching with my team, I say, yeah, we're data scientists, and then I throw a bunch of rakes out on the ground, and I, like, I step on all of them, because I immediately start to qualify it with things like, even when some of the work may not look like data science. No, that's not the way I want to reinforce this message. Or I say, yeah, we're data scientists, although I know you want to do more of machine learning, or AI stuff, or PROC LIFEREG. And that's my one SAS joke that I'm allowed to make, because it's where I learned survival analysis.

The problem with this conversation, and the problem that I step into, even though I don't want to, is that I think we have this sense in this business that there's a kind of value-laden continuum along which data science is the peak to really push that entry point. And there's all these other kinds of activity that as data scientists, well, maybe we have to do to get there, but that doesn't really define the work necessarily. And I think that that becomes a really harmful cycle of feeling like I'm always not quite doing the valuable work that I want to be doing. Or the thing that has the most payoff, the most value for my organization, or the most value for me personally, from a skill building perspective, is not where I have to spend my time.

The reality of pragmatic data teams

So I spent a lot of time thinking about how do we make an environment where for teams like mine, we don't feel like we're constantly getting pushed around on this value continuum of what's the work that we're doing. So a little bit about my team, and I think this really describes a whole lot of teams doing this sort of work. Sometimes we get to be strategic. We get to be in conversations about what's our technical roadmap? What kinds of tools do we want to use in a year? How do we get there? How do we develop along that path? Really day to day, we do a ton of just ad hoc problem solving. We've got to get data to a consumer. Along the way, we've got to transform it in some challenging ways. And maybe we have to do that in a slightly different way every week.

This is the kind of situation that my team ends up in a lot because we span a whole bunch of different functions. So we don't have this continuous pipeline of where we're refining a method and refining a method and refining a method. We're solving different problems for different projects every month. And that can make it really hard to feel like we're pushing along a single continuum of skills.

I think this characterizes lots of data teams. A lot of data teams, maybe most data teams, are in the space where we can sort of ideate about that far end of the spectrum. That's the sort of rarefied air of data science as I think the industry promotes. And as I think even a lot of our leaders tend to promote when they ask us things like, why isn't our organization doing more machine learning? Well, you need to bring us some use cases that demand it, and then we can get into that work. But in the meantime, we've got all kinds of problems that we need to solve. And at the same time, we're too big. And we have too much of an enterprise technology stack to be able to respond in nimble ways to some of the developments that are really exciting the way that smaller teams can respond.

So we work in this space where the model for the job is often not the way that we can practice the job. And I think that creates this really, really profound sometimes sense of sort of fear of missing out, and the sense of kind of a contest culture over who gets to practice with the tools that are the really exciting tools in this space. And so I think a bunch about how do we make data science jobs more healthy for data scientists, for people working in these jobs?

Setting expectations honestly

And I have a few ways that over a few years, I think I've had some success. The first one, and this applies both to folks who are hiring, to team leaders, and also to folks who are, to data scientists, folks who are looking for jobs. Both sides, all of these folks involved need to set expectations really carefully and really honestly so that we're describing the work that we have in our teams in an accurate way.

So here's some bullet points from job descriptions for roles on my teams. And you'll see that there's a progression here. "Develop and build data assets and models." Okay, in the service of like business insights for our stakeholders. This is the sort of like data science top line where people go, hey, that sounds good. Building data sets, gathering value from data, presenting it to stakeholders. This sounds cool. "Become a subject matter expert for needed processes and data domains." Okay, this still sounds like it's in the realm of things that I want to practice, although maybe it's getting a little close into the like sort of application side of things or the business process side of things. And that's not necessarily what some folks are signing up for. And then there's the end where we have to solve problems in this space all the time. "Build and support capability": work on integrations, work on pipelines, work on getting data from place to place so that we can partner with analysts and with others to do good work with it.

This kind of conversation is the sort of conversation that I've had with everyone who wants to come onto my team. And some people say, great, I like this array of things. I could really be successful here. Other people say, no, there's too much here that I don't want to do. The sooner we can have that conversation and the sooner both sort of parties can think about it realistically and kind of gut check with themselves honestly, like, is this a job I want, the better. Because if you go down the road, having had a mismatch, you get into really dangerous territory.

The most challenging situation I've been in as a leader of a team is with a person sort of working peripherally to our team, but in a related space, for whom this work of kind of aligning expectations of the job with the expectations that they had for what the job ought to be, like that work wasn't done successfully enough. And you get several months down the road in a project and you realize that you need this person to do work that they're not ready to do or maybe not willing to do. And in the meantime, the work is suffering. It creates this really almost catastrophic kind of scenario for the rest of the team as well, observing like the breakdown of effectiveness that the team feels.

And so the peak of this conflict that I've experienced was this almost catastrophic effect on work that we'd been doing for years that can really just hit a wall when the folks who you need to help deliver work can't or won't. And that's a failure of being realistic and honest on both sides. Leaders have got to honestly present what is the job. And as job seekers, as data scientists, you've got to honestly sort of check in with yourself on, can I do the work as is being described here? If not, you've really got to think it's not going to work and you're going to go down the road and you're going to hit a wall.

Balancing business needs with team aspirations

So that's the first piece, set expectations really carefully. The second piece is as a leader, this is where you get to start to be a little bit more active. Once you've got people on your team, this is where you can start to think about meeting the needs of your business and balancing them with the aspirations of the folks on your team. And this is sort of the balance between the what and the how. The what of course is like, what does our business need us to do? Well, they need us to deliver this project that has to do with building some really complex data assets from a bunch of different sources, with modeling and deriving some new attributes in that data, with working up a method for sort of making that data available to consumers through an API or through a platform or whatever. That's what the business needs. That's why we have a job.

And I know that's supposed to be the thing that is like motivating is create shareholder value. But the thing that as a leader, you get to do when you've got a team who understands those expectations is we get to, I get to spend a lot of time talking with the folks on my team about, okay, how do we get to what it is that you want to make? This is where we get freedom. And in my organization, we've got a really explicit process, this annual cycle of goal setting where, you know, you put out the big rocks each year. This year, we've got to rebuild this great big complex thing. We've got to come up with a market model. We've got to do some supply chain stuff. You put all those things, you know, that's your portfolio of work. We pair each of those things with a question about, okay, how do you want to grow while you do the work this year?

And that seed of conversation lets us really dig into person by person on the team, what are the skills that you want to develop while we do the work that we know that we've got to deliver this year? As a team, we have strategic goals for doing more code-based development, for modernizing some infrastructure, for developing some improved methods of communicating with data. And so we can take those team-level goals and we can wrap those into individual goals for every person, for every one of the what's. So it's, I want to write more Python. I want to get more into Git for collaborative version control with my team, right? These are the kinds of things that as you get to practice them, I think you get to sort of instantiate in your work, the day-to-day sense of I'm doing data science the way that I see it practiced. Even if you're not like fitting models every day, right? You're doing valuable work and you're getting to do it in a way that is rewarding and that is skill building.

And the critical thing here is that as a team, we've aligned on the goal, the requirement that as we meet our business needs, we're also going to develop as people. And that's a real responsibility for me to be checking in with as we go throughout the year.

Guarding against common pitfalls

Okay. So we've set up these goals for people to develop themselves, to grow skills throughout the year, as we meet our business needs. But there's danger in believing that like that's it. And so as a leader, the next things that I have to do is make sure, sort of have to gut check, have to check against some instincts that I think start to surface as you say, okay, I'm going to lead a pragmatic data team. We're going to do what the business needs us to do so that we can fulfill these roles. There's an instinct that you find yourself in to say yes to everything. Got to push back against that.

Being pragmatic doesn't mean that you just do everything that the business needs. Ideally, you're growing in credibility so that you can push back on the things that don't really make sense, right? You need to become a stakeholder. You need to be at the table with the business to push back. So we don't say yes to everything. It also really importantly, it doesn't mean that the folks on your team are somehow sublimating their ambitions around data science into this channel that you've got. And if you get a sense that people are feeling that way, you've got to pivot. You've got to steer so that you can avoid that cycle because of the sense of like, well, this leads you to the kind of situation where people think they're putting in their time. We don't want people to feel like they're putting in their time. If we're doing this goal setting well, it shouldn't feel that way. But we've got to watch for the sense of like, you know, I'm throwing myself on the grenade for the greater good. Like that doesn't serve anybody.

There's also a risk when you're being pragmatic of the sense of like, we're going to build a thing and we're going to own it forever. There's a trade-off here that's worth considering. If we build a thing using the tools available to us or using the capabilities that we want to stretch into, maybe we own some infrastructure for a while, but we got to learn a bunch along the way. Sometimes that's a really good trade-off. Sometimes that really lets the team get into a new space, make something effective, and then iterate from there to like, what is the next sort of level of sophistication in that? But it's worth, again, asking that question. If we build this and it works, are they going to keep wanting it from us? And if so, are we going to be ready to keep providing it? That's one of those checkpoints that we have to check in with ourselves to make sure that we've got, at least that we've thought about that, that we've thought about that potential risk.

Creating visibility for the team

So we're doing good planning. We're meeting business needs and hopefully people feel like they're getting to develop their skills as we go. This creates, I think, another kind of virtuous cycle. And the one thing I haven't talked about yet is broader visibility. Here's the thing that I wish I'd done like two years ago, which is create a forum, a regular event where the folks on my team basically get to be on a stage.

What we found in the past seven or eight months since starting a monthly event where my team just, we just get together in front of an audience and talk about what we're doing, is it creates a community. It creates this platform where they get to talk about what they're excited about, what they're learning, the ways in which they've solved particular problems. And the response from our kind of community of stakeholders is so encouraging that it's really, really gratifying. And it's really, really fun as a leader to see the members of your team sort of getting up, showing off what they're doing and getting that reinforcement cycle from the folks they work with. And that culminates in the sense of, hey, we're developing, we're getting better, we're earning the opportunity to stake out more of our own path. We're getting more and more credible, more and more successful.

Okay. So we've got our gear, we've got our tools. We've gone from this sense of how do we get to the mountain to at least feeling supported, I hope, in knowing that we're working towards shared goals. And that along the way, we're developing the skills that really feel rewarding and gratifying in doing the work that we do. As a leader, if we're doing that well, then I think we're doing okay for the folks on our team. And as data scientists, if we can get into a team like that and feel successful and feel like we're growing, then I think we ought to keep doing that. So that's what I got. Thank you. Appreciate this session. Appreciate everybody. Here's how to find me.

Q&A

All right. Thank you. So we have a couple of questions all queued up. Do you have any tips on avoiding burnout from consistently putting out fires, completing ad hoc requests, in addition to managing your main projects?

Yeah, it's really hard. I think if folks have a sense that all they're doing is putting out fires, one thing to do is to try to share that work more. So try to introduce a way to be collaborative so that you don't just have a single point of contact who's the only person who can respond to those sorts of things. That sometimes requires some growth in a team. And so if you can make a value case to have help, then I think that that can help with that sort of situation. But it's tough. We don't want people to feel like their job is just taking tickets.

All right. Where do you draw the line between data scientists and data analysts?

I think that's part of the dilemma. I think that there's a lot of overlap. In some places, it's who's your audience. In some places, the audience for the data scientists are the analysts. In other places, there's a whole lot of overlap. In our organization, analysts are the folks who sit with their businesses, whereas we sit in a central function. So that's one way to do that differentiation. The value payoff of what you do, though, I think there's a ton of overlap. And I think that's okay.