Resources

Data Science Hangout | Jacqueline Nolis, Saturn Cloud | Structuring Teams to Empower the Business

Oct 19, 2021
1:00:00


Transcript

This transcript was generated automatically and may contain errors.

Welcome, everyone, to the Data Science Hangout. I'm your host, Rachel Dempsey. For anyone who's just joining for the first time, this is an open space for current and aspiring data science leaders to connect and chat about some of the more human-centric questions around data science leadership. But really, it's just an opportunity to focus on questions that are most important to you all as well. So you can jump in live or put questions in the chat. And we also have a Slido link so you can ask questions anonymously, too. Just a quick note, this will be a recorded session. I'm so excited to be joined today by my co-host Jacqueline Nolis, head of data science at Saturn Cloud. Jacqueline has also led data science, machine learning and AI projects at Microsoft, Adobe, Airbnb, and T-Mobile. And she also co-wrote the book on building a career in data science. So happy to have you here today. I would love to have you introduce yourself and maybe just share a little bit about some of the work that you're doing today.

Sure. So hi, everyone. I'm Jacqueline Nolis. I got an undergrad and master's in mathematics. Then I went, I want to go get a job using data to help businesses. And this was before data science was a term. So I did that for a while. Then I said, nah, I want to go to academia. I got a PhD and said, no, I want to go back into industry. So then I consulted for many years, including at a bunch of the companies Rachel mentioned. And then recently, maybe not that recently anymore, I became the head of data science at Saturn Cloud. Saturn Cloud is a cloud platform for data scientists to run Jupyter notebooks and things like that and use Dask clusters for distributed computing. And I will say I've used R for about 10 years now. It's been a really fun journey. R is my favorite language to use these days, even if the company I currently am employed by is more of a Python place.

And Jacqueline, I think it's also helpful for people maybe to understand some of the projects that you work on at Saturn Cloud as well and like what your team looks like too.

Sure. Well, okay. So throughout my career, I have started and led data science teams. Like when I was a consultant, I created a data science department at my company that had like eight data scientists and managers managing them. Then when I went to T-Mobile and consulted there, I helped them start their AI team. So I've done a lot of helping teams start, and then growing teams. At Saturn Cloud, because we are a product for data scientists, I'm actually less of a do-analytics-and-build-models person and more product development: what do data scientists want? And, you know, how can we build documentation and examples and help our customers solve their problems, that kind of work. And so that's what the team I lead does now. But the thinking of it is actually fairly similar. It's not as different as you'd expect from purely "I'm gonna ARIMA a neural network" kind of data science.

Awesome. So I get to ask the questions in the beginning when people are thinking about what they want. So I'm curious, what's something that you're really excited about in data science right now?

Um, that is a great question. I have found it really fun to watch how, in like the last three years, I would say, maybe a little more, it's gotten really easy to use the most cutting-edge models. It used to be that, well, you could use a linear regression or a logistic regression, or maybe if you really put some work into it a random forest, but if you wanted to do something advanced, you needed a blah, blah, blah. And now it's like, man, with PyTorch, TensorFlow, XGBoost, LightGBM, the difference between the most powerful models and the models that are practical enough that you can use them on a day-to-day basis has gotten really small. I think that's really cool, because there's less of, oh, you only use a blah, blah, blah, you haven't even tried to use a blah, blah. There's less of that, because we can all just kind of use everything now. So maybe the thing about data science is it feels like there's less gatekeeping than there used to be. And I think that's great.

Centralized vs. decentralized data science teams

That's awesome. And I know I got to talk to you a little or I feel like I got to talk to you at the RStudio webinar where we had the building effective data science teams. Something that I thought was really interesting from that conversation was this whole idea of like distributed versus centralized data science teams. And I know you had a kind of set idea on that. And I think it'd be interesting for everyone to hear a bit more about that, too.

Yeah, so I will actually tell a story around this. So this is back when I was the head of a data science team at a consulting firm. And one of these Gartner, Forrester kind of companies was asking us, what is your stance on distributed versus centralized data science? Meaning, do you think at a company, the data scientists should all live on individual teams, like you should have the data science for marketing and the data science for blah, blah, blah? Or should you have one central data science team that all the data scientists report to, and that's more like a consulting firm that kind of goes out and helps? And at the time, when I was working on this, and we were a consulting company, we didn't want to harm and offend anyone, we did a very consulting company thing. We just said, you know, we think there are reasons why you'd want to use all of them, and we just absolutely punted on it, like, who's to say what is the right thing. And now in 2021, I feel actually pretty definitively, like in most situations, the decentralized method is better, which is to say, having your data science teams really close to the people they're helping is better than having data scientists close to each other. Because when the data scientists get close to each other, what do they do? They really like talking to each other about really complex models and building really complex things. When you have the data scientists close to the teams they're working with, then you actually can solve the more practical problems.


Well, I've either been at a small company where we just have one data science team because there's only 50 people, or, well, actually, I've kind of been on both. I've been at companies where there's just one central data science team. And I've been at companies where each team has its own data science team. I do think there are times where all of them backfired. I worked at a certain, I consulted for a certain large tech company. And within one product, they had like 10 data science teams all basically doing the same thing, all fighting with each other. And like, that's not a good use of decentralizing. But yeah, I think having your data scientists close to the people they're trying to help is still generally the right call.

So specialization right to an extent is a positive thing. How do you feel about having the person that is awesome at forecasting sitting in just one of those business unit teams, versus the person that right has spent a lot of time doing cluster analysis and then figuring that out? When you have these individuals who are really, really good at things, and then they get stuck kind of in a corner, which is great for that team that has them, but not so great for everybody else in the company. How do you see that balance playing out?

I think that what you just said is the justification for having some sort of internal consultancy. Like if you really have a rockstar forecaster, put them on the team of people that goes around and just talks to the other people to get their forecasting up. I think if you have 100 data scientists at a company, you could have 10 or something who are that style of, I'm just going to go around and make all of our forecasting better. And then the other 90 are actually more in the field doing it. That said, I think there's actually a lot of risk in having people specialize. I've been in a lot of situations where people have a desire to specialize too quickly. And like, oh, we're gonna have this person, like, there's so much risk in having this one person be the forecast person, and they're going to do forecasts and no one else knows how to do forecasts but that one person. Then if that person gets bored of forecasting, or if they quit, your company's in a really bad position. You're much better off, I feel like, avoiding those sorts of scenarios and making it so everyone knows a little forecasting. I think I've seen occasional problems where people don't specialize enough, but much more often problems where people specialize too much.

How do you also have the team, like, I'm imagining when it's decentralized, and you said it's not as many data scientists all talking together, how do they also help train each other on different topics, or keep up to date on certain things if they're working in separate departments?

I don't know, like, I think you can get clever with Slack channels and monthly meetings where we all get together and just socialize, right? If everyone's friends, the formal process doesn't super matter. Because if something goes wrong, you just go ask your friend. So that's more about creating human connections than formal processes, I would say. But I don't know, I think that also, at some level, I usually learn more about data science from blog posts than I do from co-workers. So it's less about having all your co-workers talk, some internal information maybe, but a lot of it's also just making sure your data science teams have the time to go out and learn new things and stay informed.

Standardizing tools and languages

Are you pretty open across your team to like whatever tools people want to use? Or how do you standardize on certain tools?

Okay, so I will say I have watched a lot of teams crash into the ground because there's one person who likes SPSS. And so they know SPSS, so they're going to write it in SPSS rather than learning the language everyone else uses. I have been that person, not with SPSS, but with like F#. I have committed that sin. And I have personally crashed projects to the ground because of this. I think there is a very real risk of letting people go in and do whatever they want, and then having chaos. I also think concurrently that if you never let people use the tools they like, your employees are sad. And also your product is less good because people like particular things for a reason. So I think anytime you put in a new technology, it is a group decision, it has to be some level of group thinking. Like the last company I was with, we were primarily using Python, and we were like, are we going to use R, and do we have enough people to support R, blah, blah, blah. That is a conversation and not just, well, I like R, so I'm going to use it. And I think this is especially hard if you're one of the very early people on the team, because if you're the only data scientist when things start, every decision you make about, well, I like this best is also a decision for all the future people. So it really matters that you think about this stuff.

Jacqueline, with the idea of the decentralized, I'm a big proponent of that. One thing that I wish I could get to be more successful is letting people move around a company. I've been successful over my career, actually getting decentralized, but then often the groups, the managers over the groups that have the data scientists don't want them to ever move anywhere. And I'm not sure how to get more buy-in from the higher executives to help push that along. Do you have any tips there?

No, but I will. Not only do I not have tips, but I'll answer a different question that's related to your question. I will first say I've never actually done what you've described. I've always stayed in a particular team until I've, like, left, basically, for whatever reason. That said, I think what you're describing is something I have dealt with a lot, which is when I think there's a culture change that needs to happen at an organization, and I don't know how to get people to notice that. In your case, the culture change is, people lock in too much on, hey, I like this employee, so they're going to stay with me forever and not, hey, it's better to let people move and grow. That is a culture change. I have a lot of times been at companies where I'm like, we are doing something that is disadvantageous to our whole company, but the people at the top don't understand it, and I need to help them do that.

Here's the bad news, which is I've been in lots of those situations. I don't know in how many of them I've actually succeeded in getting anyone higher up than me to really change a culture. I don't know, maybe you can notice this about me, I'm a very outgoing and loud person, and so I will really bring up these sorts of issues. And occasionally I can get people above me to make changes, but more often than not, they're like, well, I don't know, and there's a certain power imbalance where I can never make people with more power do things. And I've had to learn how to be very strategic about deciding when to really push on stuff, when to just let stuff go and not care about it, or when to say this is so catastrophically bad, it is no longer worth it for me to be here. So I don't know if I necessarily have tips on how you can get those sorts of changes to happen, but what I would say is maybe the better question isn't how can you make them happen, it's when you're in an environment where you can't make the things you want happen, how do you react to that?

Yeah, actually, sorry. I've been in a lot of these situations where I've had people above me really making bad calls, including some like diversity stuff where like, it's like, I just, yeah, it's very frustrating. Like I put a lot of heart and soul into it and then just have people above me be like, eh, no, and it's just like, yeah, it's very demoralizing.

I will say the last five years of my career have in large part been a form of me learning how to navigate these situations, because I think that really, as you get to like principal levels and up, it's just more and more, hey, the people at the top have some ideas that maybe you don't agree with and what are you going to do with that?

So, oh, not only do they have ideas you don't agree with, they hire you to fix those problems. And they're like, we want you to fix these problems. Also, we won't let you fix these problems. What are you going to do? And how to handle that has been a thing I've really been focusing on.

And is that something that you also are training people that work for you to do? Or like how to have those conversations?

A bit. A bit. I'm also trying to not be those people, right? Like, I think there's just a certain amount of, how can I be a good leader and make sure that the people under me are given both clear direction on what I want them to do and a clear understanding of what I, as their leader, can and cannot do and why. And make sure those are in sync. And I really have been trying more on that. But I do think training people to be good at navigating that is important as well.

Prototypes going to production

I work with a lot of people that ask for my strategic input, maybe scoping out a data science tool, whether it's a package or a Shiny app or things like that. And I give them some high level recommendations, and they start prototyping it. And this prototype can go quite quickly from something just meant to show how something worked to then they just build and build and build on top of it until you get to a situation where the leadership wants to use this in production. But it is most definitely not ready for production because of some shortcuts they took and things like that. So I guess from your experience, what have you done to still encourage innovation in your teams or with your colleagues? But yet, when you sense that maybe something's about to get really popular really quickly, how do you empower those developers to think about, well, it's probably going to be used in production?

So I will say that in a sense, this is actually very related to the last question. But in your case, at some level, people at your company have an incentive to put stuff out the door because it makes them look good. You also have an incentive not to have this blow up in your face or become tech debt, you know, in the next year, where suddenly you're like, oh, this is a huge hurdle, and if we'd just put five, you know, extra minutes of effort in. So you have these different incentives, and it's kind of your job to be the person who, even if you don't have the most power, has to do this thing of, well, how can I, as a person who doesn't have the most power, try and guide the team to doing the right thing? And because you can't just tell people, no, you can't roll this out yet, the best thing you can do is really a soft power, discussion sort of a thing of, well, let's talk about what the risks are of rolling it out here, and why we care about unit testing and these sorts of things. Let's say it's unit tests. Let's say you make a cool API, and you haven't put a unit test in it, and it's like, cool, let's just put it in prod. With something like unit tests, you could go to a person and be like, you really need to put unit tests in it. And they'll be like, whatever, teach, I don't know what that means. That sounds like a lot of work. So the more you can explain to them why, hey, you're going to want testing to support these things, blah, blah, the better. But then also, if you can actually write some unit tests, set up the scaffolding. A lot of times the reason why people don't do things isn't because they don't want to make stuff good. It's because they don't necessarily know how. And sometimes even just setting up a little bit of the scaffolding can really go a long way. And I think that's kind of what I think of as a principal data scientist's job. It is entirely to do that. It's to help get people started in the right direction.
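As a rough illustration of the "set up the scaffolding" idea, here is a minimal sketch of what a first test file for a prototype API might look like. Everything in it is an assumption made for illustration and not something from the conversation: the `score_request` function, the `tenure_months` field, and the stand-in scoring formula are all hypothetical.

```python
import unittest


def score_request(payload: dict) -> dict:
    """Hypothetical prototype endpoint logic: turn a request payload into a score."""
    if "tenure_months" not in payload:
        raise ValueError("payload must include 'tenure_months'")
    tenure = float(payload["tenure_months"])
    if tenure < 0:
        raise ValueError("'tenure_months' must be non-negative")
    # Stand-in for a real model: newer customers get a higher score.
    score = 1.0 / (1.0 + tenure)
    return {"score": score}


class ScoreRequestTests(unittest.TestCase):
    """A small starting scaffold that teammates can extend with their own cases."""

    def test_returns_score_between_zero_and_one(self):
        result = score_request({"tenure_months": 12})
        self.assertGreater(result["score"], 0.0)
        self.assertLessEqual(result["score"], 1.0)

    def test_missing_field_is_rejected(self):
        with self.assertRaises(ValueError):
            score_request({})

    def test_negative_tenure_is_rejected(self):
        with self.assertRaises(ValueError):
            score_request({"tenure_months": -3})
```

Saved in a file, these tests could be run with `python -m unittest`. The point of the sketch is only that a handful of tests like this gives people a pattern to copy, which is often the missing piece.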

Yeah, let me give a very concrete example of this. It is generally best if you have code in a shared repository that people commit to and push and pull. It is generally worse if you have a data science team where everyone has everything on their own laptop with no structure and different formats. So at one of my consulting things, I'd gone from being on a team where everyone uses Git and we all have a clear system with folders and blah, blah, blah, to a team of 30 data scientists where 90% of them didn't use Git. Some of them, like I said, use SPSS, blah, blah, blah. And that is bad. That is bad for a lot of reasons. If someone were to leave, that would be really bad for the company. We can't even keep track of what code actually is running in prod because everyone has their own stuff. Now, when I come in, I objectively know that the team I've just been working on before with Git and folders and whatever, CI/CD, that is objectively better. But to try and do the work of getting everyone to switch to using Git and a shared language, that is a massive amount of work. I could not do that. I don't even think the person running that team of 20 could have gotten everyone to do that because the problem was just so big at that point. Had they started three years earlier, it would have been fine. But at that point, 20 people, all these different systems, it just would be very, very difficult. So I do think there's a certain point of no return of like, we just have to accept that this is not going to be a system that runs well and hope that as they build future systems, it will.

Internal documentation and communication

So there's actually, there's an anonymous question that just came in on how do you structure internal documentation or communication channels so people can be empowered by the internal data science team?

So I have maybe two hot takes that I think are maybe not generally, like, standard, what people believe. One is I think that a lot of times data science teams write too much documentation. And two, I don't think they put it in the right place or they don't think enough about what the right place for it is. Which is to say, let's say I am a data science team and we have some stakeholders who work with us. One thing you might say is, well, the stakeholders keep asking us questions. So what I'm going to do is I'm going to write a doc on our wiki page, and it's going to be a huge doc with every question answered in it. And a data science team will spend weeks building this giant repository of all this information. And then what do you know? The first thing is no one looks at it because it has too much information. Even if they did look at it, everyone has new weird questions that show up, so those docs wouldn't really answer them. The docs are written to avoid having to take meetings and stuff with these people. But in fact, you still have to take just as many meetings. Now you've just spent weeks of time building something that you then also have to maintain.

So I think there's a natural data science feeling of like, well, if I just write everything down, people will stop talking to me. I don't think that is generally true. I think the best kind of documentation data scientists can do is documentation for themselves. Within your code, having READMEs, having your functions documented, having good testing, good variable names. That's what I do really think helps quite immensely. And making sure all this code lives in a shared place where it's not like we have a GitHub repo for one thing and a GitLab repo for another, blah, blah, blah. Just having things organized is just as valuable as documenting. And then the other thing I think is that, yeah, you really can't document your way out of talking to stakeholders. And the best thing you can do is just getting it so stakeholders know what the most effective way to communicate with you is. Is it that if they have a request, they drop it in the JIRA board? Is it, hey, this is the project manager, they always take the meetings? Just having that clear system. So you can't minimize the work, but by having it in a flow, you can at least manage it, as opposed to fire-drill requests coming from everywhere. I think that stuff is far, far more valuable than, yeah, let's write lots of docs.


What kind of system do you use for requests?

So I do think that you have a person on your team who's the project manager person, and maybe that's actually a project manager. Maybe that's the principal data scientist, whatever, but someone whose job it is to triage requests. Then, when anyone has a request they want to make of you, they either talk to that person or drop it in the JIRA board or whatever board. And then that person looks at it, tries to understand it, goes back to them, has a meeting about it to really clean it up. And then from there, the data science team tackles it. I think some sort of intake method like that works best. I think creating a standard form that they fill out, that kind of stuff doesn't really work because questions will never fit into a form, because they always have their weird edge cases. And I also think, yeah, just having a person who everyone knows is the point of contact on this just makes it a lot easier. And like, I don't know, I find that job really fun. I love talking to people about what their problems are. So it's not necessarily a bad job, but it is something where you should have one person who's really the point of contact on it.

Being the first data scientist

You mentioned going in, starting all of these teams from scratch. Was there a common denominator, this common thing that you would experience in companies where there's maybe not a data centric culture or there's not a big understanding of what data science is? Was there a common thing that, you know, hurdle or something you had to overcome to kind of get buy-in or, you know, just to even start the team off?

Yeah. So a couple of things. I mean, usually if they are hiring you, there's got to be like some big buy-in at the company in the sense of like, you wouldn't have been hired if at least someone somewhere didn't think data science would be useful. So I used to worry that like, am I going to get buy-in for this? That's kind of gone away. Usually it's like, there is buy-in, but there's a total lack of idea of how to actually make it work. Which is to say, and I've had this happen a few times where you come to a company, you start a team, but the question is like, is there actually data we can use? Is this data actually useful? Right? Like, so if they hire you with the expectation of like, you're going to build machine learning models on the revenue data to blah, blah, blah. And then it turns out like two months in that, like, there's not enough data here that you'd ever be able to build models on it or stuff like that. That stuff keeps me up at night, but I don't think there's any real way to solve it besides before getting hired, really try and like probe on if this seems like a good chance for data science and really just being flexible with once you do start.

I will also say to the earlier point, when you are the first data scientist, everything you do sets a precedent. If you pick R or you pick Python, now everyone's going to use R or Python. If you use the Git repo or blah, blah, blah. So more than at any other time, you being careful about what you're doing is going to pay off in the future. And also, at the same time, and here's the twist, you can't just be like, I'm going to build a beautiful cathedral of Git repos, CI/CD, blah, blah, blah, because you'll spend the next six months of your job setting up all this tooling with no one else to use it yet. And after six months of not actually doing work and just building tooling, you will be let go. So it is a strategic game of, where do you actually invest, given you can only do so much investing while concurrently trying to produce revenue. I think being the first data scientist is extremely fun, but that is quite a stressful thing.

It sounds like maybe that has happened before where you get there and you don't have the data that you actually need to do what they want you to do. Like how did you actually handle that?

Well, you have to have some real frank conversations with your leadership about, hey, you hired me to do things X, Y, and Z, but things X, Y, and Z are impossible. Let's talk about what is possible and how I can do it. And if you have a strong set of leaders, they'll be like, you are a data science expert. I trust you on that. Let's have that conversation and pivot. And if they are weak leaders, they will say, well, you just don't know enough about data science. We're going to keep doing the same thing because we know more than you. And then that is like a, I don't know, not great situation.

Denying requests and organizational hierarchy

Sure, absolutely. So I work at a hospital where we have a fairly new data science team. And the problem with hospitals is that they tend to be very hierarchical, in that everybody's first idea is always to go as high up in the hierarchy as possible to ask what we should solve. Whereas I maybe have the understanding that the good ideas actually come from the people who work with the patients and do all of the stuff, whose work we're supposed to be helping. And I haven't figured out a good way to triage, where people I work with are not necessarily comfortable saying to the higher-ups that this might not be the best use of our resources. And I also am not senior enough to maybe be able to do that myself. So if you have any ideas on how to, as you say, try to bring the reality into the conversation, that would be really cool.

So I will first say you said, well, I'm not necessarily senior enough. I think that is an astute observation, which is I think a lot of times people get on these calls and say, like, you can be the lowest person in a company, and if you pitch your idea just right, people are gonna love it. And that's not really true. Positions are real. And sometimes being more senior gets stuff done. So I think it is good that you are aware that that stuff exists, and you're navigating around that. That said, the most luck I have had in these sorts of situations is just trying to pivot the pitch away from, hey, more senior person, you're wrong, and here's why (although that's maybe how I feel), and much more toward, hey, how can I show you, senior person, the things I am seeing that are making me think that the more junior way of doing things is right? So the senior person is like, we need to build a whole new data platform or whatever. And you're like, no, we really just need to go survey some hospital beds, right? The more you can be like, well, let me explain why, let me show you some data, the better. It's not, hey, I need to change your mind. It's, I'm going to bring you into this new, even better idea.

That kind of thinking helps, which is to say, most of what you're asking about, and I think it has repeatedly come up on this call, is organizational dynamics. And the things that have helped me the most in my career to get good at these are books on negotiation, like Getting to Yes and Difficult Conversations, just thinking about and answering the question of how do I convince another person to do the thing that I think we should do? That is totally unrelated to data science and very much business philosophy, argument, meta conversation, you know. And I think just really thinking about it that way and trying to look at those resources helps.

Hiring and upskilling

As a data science leader, what are the pros and cons of hiring new talent or trying to upskill existing talent?

So I will say, I don't super feel like it's a binary. As a leader, I always want to be upskilling my employees, which is to say, every day, regardless of hiring other people or anything like that, I should always be trying to make sure my employees are working on things that are challenging them and helping them grow. And they learn new skills, both because that gets me more things that they can work on at the company, and it's good for them because they get new skills and have presumably more fun doing it. So I think that the upskilling is just a perpetual push forward of people. When it comes to the pros and cons of hiring new talent, of when it is valuable to your team, I really struggle with this, because I do think that when you hire more people, that actually is a really big burden on your team. You have to give them work to do and make sure they are content and communicate with them. So I think I tend to be a little bit more conservative in my hiring, just because I find that terrifying. I'm more terrified of that than I am of not having enough people, which is, I think, just my personality showing up. But it's a very real calculation of how much you want to take on the risk of not having enough work and your team getting too big and blah, blah, blah, versus the risk of missing opportunities because you didn't have people around.

Data analyst vs. data scientist

Oh, okay. Great question. I would say a data analyst's job is to take raw data and get it into a consumable format for an executive, right? So if you have a SQL table, it's an analyst's job to turn it into an Excel file that has just the summary monthly numbers and put that in a dashboard and these sorts of things. This is a challenging job because like, well, what are you doing with sales data? What about returned objects? And what about this? And like, well, if we do that, then that actually changes last month's numbers. And that would be weird. Can we do that? Like, there's a lot of work and thought that has to go into getting executives that data.

A data scientist, in my opinion, is a person who is more focused on, they get a data set and an abstract question of like, why are my customers churning? What is the best offer to give my customers? Things like that. And a data scientist's job is to like build models, do regressions, forecast, whatever, to try and answer those things. So a data scientist is more about using models to answer questions and do inference, whereas a data analyst doesn't really build models. They just take data and group, aggregate, summarize, and visualize in thoughtful ways so that people within the company can use it. And I think companies have like a thousand times more data analyst work than data scientist work. And I think there are a lot of data scientists out there who are like, I would never do data analyst work. And I know because I used to be one and I was very snobby about this early in my career. And now I do a lot of it and it's fine. But there's like a mental hurdle you have to get over of like, things aren't less interesting or numerically challenging just because they don't have a regression in them.

Roles on a data science team

So as far as data science roles, I guess teams, you mentioned kind of a project manager earlier, data engineer, data scientist. What do you recommend as far as the types of roles in a data science team?

It super depends on what your team is doing. If your team is primarily like making reports, doing analysis, like trying to answer these questions with data, that is different than if you're building actual models that are going into production, that sort of thing. I mean, obviously you need data scientists. I think after you have like two or three data scientists, you probably want to have like a principal person, a person whose job it is to coach and help out the other data scientists. I also do think having someone whose job it is to do project management, which could very well be that principal, that's valuable too. I think having a manager is good if you get big enough. Okay, so I'm going to start over. Here's what you need. You need data scientists to do the work. You need someone whose job it is to help those people technically when they're not sure about stuff, which could be convincing them to do unit tests or whatever, and checking for that. You need someone who's a manager whose job it is to figure out that the work coming in makes sense, right? Like a manager's job is to ask, hey, are we working on the right stuff? Are we asking the right tests? Are our employees growing and are they getting the right comp or whatever? And you need someone who's a project manager, which is literally like, hey, person X from this department asks us to do this. Does that request make sense? Do we need to flesh it out? How does that fit into our list of work? That's like five roles on that team. A lot of those can be bunched into the same person, but depending on what your company is like, it may make sense to do that bunching differently.

Transitioning from analyst to data scientist

If you're a data analyst today, what tips do you have on making that transition over to data science?

So I think it's a lot about learning, learning the regressions, learning the models, getting more comfortable, because analysts are more likely to use SQL and Excel and those sorts of tools than R and Python. And the more you can get comfortable with R and Python, the better. Like there are boot camps, right? A data science bootcamp is a thing designed to get you those skills you need real quickly. But a lot of times when people ask this question, it's because they are an analyst and they are interested in this stuff and they learned some R in their free time, and now they're trying to figure out, okay, but how do I actually get the better job? And at that point, it's a lot of crafting your resume and stuff to actually highlight that, hey, analyst work is very similar to data science work. So look, it's not really much of a stretch at all for you to switch positions. So at some level, it's as much about marketing yourself as it is about getting the right skills and things like that, or using internal opportunities, if you can switch teams and that sort of thing, to see if someone could switch you within your org to get more of that work. I will say, I co-wrote a whole book that has discussions about these sorts of things, and you can get it at bestbook.cool with the offer code buildbook40% for 40% off.

Explaining departures and data science consulting

But when you do determine that it's time to leave a company, how do you explain that departure when looking for the next job?

Okay, so we do have a whole chapter on leaving your company. I mean, I guess it super depends. Oftentimes, people don't even ask, because it's often not that big a deal. But if they do ask, you can be like, hey, you know, I've been at the company for a few years, and I was looking for a new opportunity. And that's fine. I've been in situations where I actually left because of the stuff we talked about earlier, like I had an incredibly toxic boss or whatever. And what I do there is, when I go to my interviews and they say, well, why did you leave? I'm like, well, you know, after being there for six months, it was just very clear that there wasn't necessarily a fit in terms of what they wanted data scientists to do and what I thought was the right stuff to do. It just wasn't quite the right fit. And that's like a wink, wink, nudge, nudge for, it was a bad, toxic environment, but no one questions that. So there are a lot of ways you can say I don't like working there anymore that still sound like, look, we're being professional and fun, aren't we? And I think that you just kind of have to practice saying the words a few times, then go into the meeting, and it's fine.

Pitching data science to executives

One is if you're talking to an executive leader, how would you explain the core value that a data science team brings to the organization?

So how would I pitch it if, like, I get asked into the office by a CEO of a big name company, like a Fortune 40 company, whatever, and they're like, Jacqueline, what is data science? Why does our company need data science? I would say, look, companies, you generally collect lots of data: what products people are buying, how they are using them, blah, blah, blah. You also have to make a ton of decisions, both in the sense of like, should we use that data to do stuff, or in the abstract sense of like, should we launch a new product or not? You could totally make that decision in a vacuum. And that's fine. And companies have done that for a long time. But wouldn't it be a little bit better if you used all that data to try and do things like, hey, for the customers who will purchase from us regardless of if we give them a coupon, let's not just hand them a coupon and lose money. So I don't think data science ever gets you like, your company went from a failing company to a huge success because of data science. But I do think you can, on the margin, improve your product a reasonable amount, and make things a bit better by having data science. So I don't think data science is the right thing for every company to suddenly grow huge. But I really do think, especially for bigger companies with lots of data and lots of opportunities to improve stuff, that data can be just a very nice way to kind of flesh things out.

Data science consulting

You mentioned that you worked in data science consulting, you know, at one point in your career. I was wondering if you have any sort of tips, tricks, best practices that are unique to finding success as a data science consultant, and in the consulting world, dealing with a lot of non-technical stakeholders as well, the sales process of selling your value.

Yeah. So I will say, I do talk about it in the book. I have loosely toyed with the idea of writing like a five chapter ebook just on consulting data science, because so many people have your question of like, I find this stuff interesting. How would I, a data scientist, get into consulting? So, super high level, how do companies hire people? They have employees, full time employees. They have what's generally called contractors or vendors, who are like, hey, we just need you for a certain number of months to just work as part of our team. And then they also have consultants, like more consulting consultants, like we're bringing you in for a very specific task. So it is possible, as an individual, to be in that last group, where people know you as a good person for this kind of stuff, so they're just going to bring you in. By far, the hardest part of all this is getting companies to know and trust you. Like, finding the work is a thousand times harder than doing the work. For me, I had worked at consulting companies for many years. So I had built connections with companies I'd consulted for at one point, and then I was able to talk to them as an individual. So that is how I built the network. And more than anything, what I recommend if you're interested in this kind of stuff is try and find someone who would hire you, and then do it in your free time, like do it in your evenings and weekends and that sort of stuff. Because if you do it that way, it's a nice way of testing it out, and if it works, that's a nice side income, which can be really valuable. And I recommend charging real money, like charge more than just whatever your hourly is, like double or triple that. And then if you do that enough times, and you like it and stuff, then you can start thinking about making it your full time life.

But that's a hard thing. I also think working at a consulting firm is a very valuable way to learn these sorts of things in terms of like, well, how do you make that first contract? And what is the sales cycle like? There's just a lot to that that is unrelated to data science and is just a standard consulting thing. But it just takes time to figure out the beats of it. Which is to say that being a data science consultant is very little about being a good data scientist and extremely much about being a good salesperson, thinking through contracts, thinking about like, what's going to happen if this company won't pay you for four months. It's that kind of management and all that kind of stuff, which can be fun and eventually very financially lucrative if you can pull it off, but it is also tricky, and I no longer do it.

That's super helpful. Thank you, Jacqueline. And as we get to the top of the hour, too, I just want to make sure to ask, what's the best way for people to get in touch with you? If they have follow-up questions, is it Twitter or LinkedIn?

A great question. I don't really check LinkedIn. Please Twitter DM me at Skye Tetra, S-K-Y-E-T-E-T-R-A. That is the best way to get me. And a lot of my talks and blog posts and stuff, I have on my personal website. If you just go to jnolis.com, you can see a lot of my past material.

Awesome. Thank you so much, Jacqueline. It was really fun chatting with you. I really appreciate your time, and looking forward to hopefully having you on here at a future session, too.

Perfect. Yes, this is a lot of fun, so thank you for inviting me. It's mostly just getting to rant about stuff I like ranting about, so it's great.

Awesome. Thank you, Jacqueline. Have a great rest of the day, everyone.