Data science and automation in publishing | Sophia Tee | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Welcome back to the Data Science Hangout. If we have not met before, I'm Libby. I am a Community Manager working with Posit to help foster and grow our beautiful, wonderful community here at the Hangout. And I'm also a Posit Academy Mentor. So I help professionals do more with data by learning how to use R and Python. And I am joined today by my lovely co-host Rachel. Would you like to introduce yourself?
Hi, everybody. I'm Rachel Dempsey. I lead Customer Marketing at Posit and started the Hangout I think over three and a half years ago now. So so excited to see all of you here today.
Yeah, I think it was in 2021. I was an early shower-upper to that. And I'm so thankful to Rachel for making this happen. And we are so happy to have you joining us today.
So the Hangout is our open space to hear what's going on in the world of data across all of our different industries, because many of us are from super diverse places and different roles. And we really just want to chat about data science leadership, connect with other people who are facing similar things that we are. And we get together almost every Thursday, same time, same place here on Zoom.
If you are watching this recording on YouTube, and you would like to join us live, please look in the description box below. There's going to be details for you to add it to your calendar. I want to thank everybody who has made this the friendly and welcoming place that it is today and continues to be. We are all super dedicated to keeping it that way.
So if you ever have feedback about your experience that you'd like to share anonymously, good or bad, or maybe even just suggestions for who you'd like to see, or what topics you'd like us to dive deeper on, Rachel just shared an anonymous feedback form. Please go fill that out. We would absolutely love to hear from you on that form and also on LinkedIn as well. And at the Hangout, we love hearing from you here in the chat today and live.
Doesn't matter your years of experience, what your title is, what industry you work in, what languages you use or don't use. And we really encourage you to connect with each other in the chat. So share a little bit about yourself, say hello, where you are, what role you're in, and what you like to do for fun outside of work. And I also really encourage you to share your LinkedIn, share a link to your website, something where people can find you because a lot of times we don't have our full information on Zoom and we meet wonderful people and then they disappear forever, right? If you are hiring, please share a role, share where it's located, a little bit about it, how people can contact you. And if you are looking for a role, please likewise let our community know what you're looking for.
There are three ways you can jump in today. You can raise your hand on Zoom, we will call on you, and you can ask your question live. You can put your question in the Zoom chat, put a little asterisk next to it, and Rachel or I will read it for you. Maybe you're someplace busy, maybe your mic doesn't work. That's totally fine. And if you want to ask anonymously, there is a link to Slido where you can do that, and I think Rachel will put that in the chat for us.
All right. With all of that said, I am so excited to be joined by our guest today, Sophia Tee, Senior Director of Data Science at Penguin Random House. Sophia, welcome. I would love for you to tell us a little bit about yourself and what you like to do for fun outside of your work.
Introducing Sophia
Hi, everyone. Thank you, Libby, for the introduction. So as she said, I'm a Senior Director at Penguin Random House. So my team primarily deals with marketing automation as well as a bunch of different evaluations that we do for the marketing and sales teams. So we have about 10 data scientists in the U.S. team and a bunch of different data scientists all over the world.
And yeah, I mean, the whole team, some of them do pricing work, some of them do forecasting. So depending on which division that they're working with primarily, they're using different models. Outside of work, so I'm a mom of a six-year-old and a nine-year-old, so I'm always busy at a lot of kids' birthday parties, I guess. But I also really enjoy doing art. I recently took an intro to drawing class. I'm, you know, not great, and it's very much still a side hustle. I should probably keep my day job. But yeah, that's what I do in addition to yoga.
Career journey to publishing
Awesome. Thank you so much for sharing that. And I know that you have a pretty varied background. I do too. I always love hearing about people's really interesting career journeys. So I think that your background, you were in like finance and marketing and tech, and now you're in publishing. So what drew you to data science initially, and then how did you end up in publishing?
Yeah, great question. So as you saw on my resume, you know, I started out at a hedge fund, DE Shaw. When I was graduating from undergrad, I actually went through kind of like the regular recruiter process, and DE Shaw was one of those companies that came on campus. I was in a very specific program at Northwestern called Mathematical Methods in the Social Sciences, and I double majored in econ as well. And so those were like feeder programs to a lot of financial companies. The financial companies would come on campus and recruit all these people. And so I kind of felt that that was the track that I should go on, even though I didn't have a lot of passion for finance.
So when I ended up at the hedge fund, you know, I kind of felt the job was a little bit, you know, less than exciting. But the pay was great, and we had so many amazing perks. Back then in 2007, before the crash, every single hedge fund just, you know, had too much money to spare. And so we went on these lavish trips every year, and it was great. So I spent four years there. And then finally, I said to myself, what am I doing? Is this what I want to do for the rest of my life? And obviously, the answer was no. So I really wanted to do something that was more meaningful.
And so I tried to think of, what are my skill sets? And you know, how can I marry them with what I truly want to do? And I felt at that time that I really wanted to help people. So I was thinking that since I was good at math, right, I wanted to be able to apply math or statistics to whatever new industry I ended up in. And I felt that doing a master's in statistics would get me there, or at least, you know, leave many doors open. The idea of data science didn't even cross my mind at that time; I think data science was kind of a nascent term back then. It was more than a decade ago. And so at Yale, you know, we used a lot of R, actually. It was the first coding language I ever used, R and SAS.
And so the first job I got out of Yale was essentially at a marketing mix agency. We would provide our services to large CPG companies, you know, companies like L'Oreal and Avon, and talk to the chief marketing officers and say, these are the marketing tactics that work for your company, and here is how we can optimize your spend for you. After that, I jumped over to a digital agency where I did a mixture of marketing mix modeling as well as digital attribution. And digital attribution now, I think it might be called a different name, but essentially what it is, is analyzing a consumer's touch points throughout their online journey and trying to figure out which touch points actually led to the conversion online. And you have to understand that back in the day, it was the Wild Wild West, where people could be tracked throughout their online journey. Now I don't think it's as easy. But that's basically what digital attribution is. It's a giant logistic regression, one or zero, whether they converted or not.
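The logistic-regression framing Sophia describes can be sketched in a few lines. Everything below is a hypothetical illustration, not the agency's actual pipeline: the touchpoint names and the simulated journeys are invented for the example.

```python
# Hypothetical sketch of digital attribution as a logistic regression:
# each row is one user journey, each column a marketing touchpoint
# (1 = exposed, 0 = not), and the target is conversion (one or zero).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
touchpoints = ["display", "paid_search", "email", "social"]

# Simulated journeys: exposure flags plus a conversion outcome that,
# by construction, leans on paid_search and email.
X = rng.integers(0, 2, size=(5000, len(touchpoints)))
logits = -2.0 + X @ np.array([0.2, 1.1, 0.9, 0.1])
y = (rng.random(5000) < 1 / (1 + np.exp(-logits))).astype(int)

model = LogisticRegression().fit(X, y)

# Fitted coefficients act as a crude credit score per touchpoint.
for name, coef in zip(touchpoints, model.coef_[0]):
    print(f"{name:12s} {coef:+.2f}")
```

In this toy setup the model assigns the largest coefficients to the touchpoints that actually drove conversion in the simulation, which is the basic idea behind regression-based attribution.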
So after that, I got a big break and got an in-house position at Samsung, where I was essentially the first female data scientist on their corporate strategy team. And that was an incredible experience for me, because it was the first time I actually got to see our recommendations being discussed in the boardroom with, you know, the CEO and all the board, and being used. It could be as ad hoc as, like, the S7 is coming out tomorrow, tell us your forecasts. It could be anything that the CEO or the CMO wanted at any time. So obviously, it was a very fast-paced job, but I think it was also a very rewarding and eye-opening one. I spent about two and a half years there.
Then I went to Verizon, where I was a principal data scientist in the supply chain. The ethos of supply chain is always, how can we get the right thing to the right place at the right time? And so my job was essentially building these giant optimization models to ensure that every one of the 1,600 stores around the United States had the right product at the right time. I did that for a couple years, and then I got promoted. I essentially rose up to become a senior manager of the business intelligence team, where I managed about 15 people. And so that was a big step for me in terms of, you know, management responsibility.
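The "right product, right place, right time" problem is classically modeled as a transportation linear program. This is a toy sketch of that idea under made-up numbers, not Verizon's actual model: the warehouse supplies, store demands, and shipping costs here are all assumptions.

```python
# Toy transportation LP: ship units from warehouses to stores at
# minimum cost while meeting each store's demand exactly and never
# exceeding each warehouse's supply.
import numpy as np
from scipy.optimize import linprog

cost = np.array([[4.0, 6.0, 9.0],   # warehouse A -> stores 1..3
                 [5.0, 4.0, 7.0]])  # warehouse B -> stores 1..3
supply = np.array([80.0, 70.0])     # units available per warehouse
demand = np.array([40.0, 50.0, 30.0])  # units required per store

n_w, n_s = cost.shape
c = cost.ravel()  # decision variables: flow[w, s], flattened row-major

# Supply constraints: total flow out of each warehouse <= its supply.
A_ub = np.zeros((n_w, n_w * n_s))
for w in range(n_w):
    A_ub[w, w * n_s:(w + 1) * n_s] = 1.0

# Demand constraints: total flow into each store == its demand.
A_eq = np.zeros((n_s, n_w * n_s))
for s in range(n_s):
    A_eq[s, s::n_s] = 1.0

res = linprog(c, A_ub=A_ub, b_ub=supply, A_eq=A_eq, b_eq=demand,
              bounds=(0, None))
flows = res.x.reshape(n_w, n_s)
print(flows, res.fun)
```

A real store-replenishment model would be vastly larger (thousands of stores, many SKUs, time periods), but the structure is the same: a cost vector, supply and demand constraints, and a solver.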
And then after that, I made my way to publishing, and you might be wondering, why publishing? It's actually kind of a funny story. My husband was, and still is, a lawyer for Penguin Random House. And he happened to see a presentation from our SVP of data science, and, you know, he told me that I should definitely talk to him, because the things they were doing at Penguin Random House sounded really, really interesting. And so I did. And they happened to have a position open at that time. So it was sort of fortuitous, but I ended up here.
Data science use cases at Penguin Random House
I love that. That's amazing. Thank you so much for that context. And yeah, I'm also interested in the use cases in publishing. And there's so many different book lovers in the hangout. And we always share favorite books in the chat. So I was wondering if you could share about some of the problems that your team is working on now, or maybe an example of a data science project.
Yeah, definitely. So one of the big things that we're working on is automating the entire process of campaign creation. Because, you know, Penguin Random House, and I'm sure a lot of other companies as well, spend tens of millions of dollars on advertising every year. The process of actually creating a campaign can be very manual. And so what we've done is automate that process from start to finish. And so what used to be a two-week-long process is now a 30-minute process.
So that is one of the projects that I'm primarily on. And it's not just the resourcing cost savings and time savings; the actual performance of those campaigns has been equal or actually better. So we're increasing revenue and decreasing costs at the same time.
Yeah, yeah. So I think I'm not at liberty to talk about the more AI portions of it. But essentially, you can think of it as a mix of automation techniques, combining several different packages.
Automation and job displacement
Yeah, so, first of all, thanks for popping on the call with us. I'm a huge advocate for automating the manual; it just pains me to do something manually that could be automated. One of the questions I always get is, well, why should we automate this? Will it cost somebody their job? And I always say that the time can be better spent elsewhere. But now that I have somebody else with a use case with significant time savings, how do you see your team reallocating those time savings?
So we are not the ones reallocating the time savings that are created because of this automation; a lot of times, the time savings are with the marketing team. They actually like that we're doing this, because they feel that some of the things that they're doing, the extremely manual things, can be very repetitive, and it's not where their core skill set is. They really want to do the more creative things, rather than the things that are super manual and repetitive. But I definitely feel you when you say that people are a little bit, you know, worried about automations, because they've been doing the same thing for the last decade or so, right? And you come in and you automate that away. But I think that rephrasing it a little bit might help. Sometimes it's about finding other opportunities for the company, where they can add value. And I think that's where the company's moving towards: where can humans add the most value, rather than doing the things that can be automated by a machine?
Forecasting and machine learning in publishing
So my question has to do with remainders. Publishing has lots of metrics that a lot of other industries don't see, and maybe, you know, wouldn't even think of. I actually remember working on remainders quite a lot and trying to forecast remainders for different titles that were coming up. So, tying that to a broader question: how has the introduction of, you know, machine learning methodology and new data science tools improved some of these forecasts that you all do, with remainders, with marketing, with other things?
Yeah, so when you say remainders, I'm assuming the stock that is left over is what you're talking about, right? So yeah, I think that was a huge problem. About two years ago, the supply chain team actually met up with us and asked us to help them improve their forecast accuracy. And so, you know, we have built a model that has drastically improved that. I'm not sure about when you were there, but I think the amount of data that we're collecting now is probably more, and in a better format as well. So we have information like, you know, metadata about the author, and things that help our forecast include estimates from the sales teams and things like that. So the variables going into the model are not just more accurate; there are also just more of them. So yeah, I think with machine learning, we've been able to actually help them improve their forecast accuracy quite a bit.
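As a hedged sketch of the forecasting idea above (a model that folds author metadata and sales-team estimates into a demand forecast), here is a minimal gradient-boosting example on synthetic data. The feature names and the data-generating process are assumptions for illustration only, not Penguin Random House's actual features or model.

```python
# Minimal demand-forecast sketch: combine a sales-team estimate with
# title metadata in a gradient-boosted regressor. All data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
author_backlist = rng.lognormal(8, 1, n)    # prior sales of the author
sales_estimate = rng.lognormal(9, 0.8, n)   # sales team's own forecast
is_hardcover = rng.integers(0, 2, n)
X = np.column_stack([author_backlist, sales_estimate, is_hardcover])

# Synthetic "true demand": mostly the estimate, nudged by metadata.
y = (0.8 * sales_estimate + 0.1 * author_backlist
     + 500 * is_hardcover + rng.normal(0, 300, n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print(f"R^2 on held-out titles: {model.score(X_te, y_te):.2f}")
```

The point of the sketch is the shape of the problem, not the numbers: richer, better-formatted inputs (metadata plus human estimates) give the model more signal than a sales history alone.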
Yeah, I mean, my experience was 20 years ago, so not nearly the amount of data that you all have now. So I guess a follow-up then, because there are a lot of folks in publishing that are very traditional in the way they do things, and there tends to be some institutional inertia, in my experience. I'm wondering what kind of success you've had in, you know, helping stakeholders understand newer methods that help you forecast better, or anything like that? And how do you go about it?
Yeah, no, that's a great question. So I think certain divisions are more likely to want to, you know, adopt some of our machine learning model recommendations. Others are a little bit slower, like you said, because, you know, publishing has been around for a hundred years, and they've only had data scientists recently. But having said that, I think that the company generally has very much a growth mindset. The CEO, you know, very much encourages us to experiment with new techniques and stay abreast of new technologies, because we definitely don't want to be left behind, you know, in terms of all these new techniques.
So one, I think the whole company has this culture of innovation and experimentation. And then two, I think that if you talk to people at their level, and you understand what their business is, it makes it much easier to get them to agree with you. So try to understand first what their KPIs are. And then when you're talking about your model, you can talk about how your model meets their KPIs, because they are just as interested as you are in driving conversions, right? At the end of the day, if you can show with testing that the model is actually generating more conversions, then there's no reason why they would not want to implement your model for their own performance agreements, right? So I think that just understanding their business and trying to see it from their point of view is really important.
Experimentation and new tools
Yeah, I mean, I think that, you know, it's always good to really do your research, especially with new techniques and tools, because people are, you know, going to be a lot more skeptical. And if you're advocating for a tool to be used in a widespread way within your company, you have to expect that the spotlight is going to be shined on it, and on every single aspect of it, from security to legal to, you know, marketing. Everyone's going to have their own opinions. So just make sure that you really dotted your i's, you know.
Yeah, no, that's a great question. I think that it's sort of part of our strategy to be staying abreast of the current technology. And so there's always an ask to, you know, look at what's happening in the industry and report back: what exactly can be used or not used, or how are other people doing things that we may not be doing? And that, I think, is the best way, because it becomes official; it's part of your learning agreement. So I think that's how we make sure that we're not spending so much time on the day-to-day that we forget to experiment with new technology.
Campaign automation deep dive
Sorry, which model were you talking about? The one where it went from two weeks to 30 minutes? Yes, exactly. So there is a model involved in that process, but a majority of the time that was reduced was not because of a model; it was because of the automation. It was as much of a software engineering project as it was a data science one, because essentially we automated some processes, for example, creating the image that goes on the Facebook ad, right? That used to be pieced together manually, or we had to get, you know, a company to actually do it for us. But now it's all automated. So that's really where the process efficiency came in.
I was gonna ask just one follow-up on the piece of automating content. Something that we had talked about before, which was really cool about automating it, is that you mentioned a book that is selling 10 copies a week might not warrant the creative team's time. But now that you have this new process, you've actually unlocked some additional capabilities for some of these books that aren't as popular, maybe?
Yeah, so that was for a different project that we automated, essentially for Amazon's product detail pages. You know, you have these advertisements that you see at the bottom when you scroll down the page, and we automated that process to make it easier for the marketers to release these advertisements at scale. The way that Amazon's algorithm works is sort of a black box to us, but we do know that if we have those ads at the bottom, it does surface our products more frequently. And essentially, there's really no cost to us in doing these advertisements, except for the time it takes to create them. So that's where automations can really help.
Team structure and hiring
Awesome, thank you. That was a great follow-up. Sophia, I am wondering, you mentioned that your team, at least in the US, is about 10 people. Do you feel like that is a big team to you, or a small team? Do you feel like there are advantages to keeping things small, or in a core group? And are you all dispersed all over the US?
Yes, it is. I think it's a very small team. I mean, I think that it's all relative, right? Compared to Verizon, you know, it's a tiny team. Verizon probably has 10 data scientists just for the supply chain team. So, you know, do the math, and it's probably hundreds, if not thousands, in the entire data science and AI organization. But I've also heard, you know, other companies, CVS, all these healthcare companies, have a ton of data scientists, right? So I think that this would be comparatively small. But I almost think it's, in a way, a feature rather than a bug, because as a director, it does free up more of my time, instead of managing people, to actually be innovating. Like what Rachel was saying, there's a lot more time and space for me to be looking at new technologies. So I don't mind.
I see Alan asked a great question in the chat just now. Alan, do you want to jump in now?
Yeah, I think it's good timing. I'm just back after corralling my dog, who's losing her mind downstairs. I know we're zeroing in on this campaign automation thing continually, but I'm really curious if you could talk more about the roles that different people on the team perform in a project like that. My guess is that there are software development kinds of things there, or application development, in addition to the data roles around the campaign information itself. So I'm curious if those folks are all on your team, or if there's coordination, and what that coordination is like. And in general, who wears what kinds of hats, or sits in what kinds of roles, for a project that's really broad?
Yeah, so for that project specifically, the people that sit on my team are in charge of the model, obviously, but also the software engineering and productionizing of the model as a whole. The two people that report to me are both machine learning engineers, but they are basically full-stack engineers. They know everything from creating the model to putting it into production. So it's a little different from, you know, a big company like Verizon, where there's an IT department to actually productionize your model for you; because we're such a small team, we do it ourselves. And the actual coordination of campaigns, and things like that, all lies within the marketing team. So there's a whole lot of collaboration between our team and their team on a daily basis, I would say, because we are rolling out these campaigns company-wide, and it would definitely not be possible without them. And you know, they have been awesome in getting support from all the different imprints and divisions. It's a very large company, so it's a lot of coordination.
Great, thanks. It's got to be really empowering that you're able to work as autonomously as you can, right up through deployment, without lots of process bottlenecks or, you know, handoffs to other teams. Yeah, exactly. Thank you.
Yeah, thank you, Alan. He always has the greatest questions, and I kind of have a follow-up; it looks like Sonia just asked a similar question. Sophia, I feel like you were recently hiring for a position. I don't know if it's still open, but I know that you were looking. So I was curious, along the lines of Alan's question, what do you look for in a new team member when you're looking to build your team?
Yeah, we are looking for someone. It wouldn't be reporting directly to me, but we'll be working very closely together because of this project that we were just talking about. For this role specifically, we're looking for somebody with recommendation model experience, preferably, but if not recommendation, any kind of ranking model would do as well. But yeah, if anybody's interested, definitely reach out. I'm on LinkedIn, so it's pretty easy to find me, I think. So generally, of course, one, technical acumen is very important, because you have to go through several rounds of technical interviews with some people on the team. And then for the last part, myself and another senior director on the team would also do a behavioral interview. Essentially, we just look, you know, for a cultural fit, someone hopefully that loves books, because it will be a little bit awkward if you're in publishing and you hate books. But also just, you know, somebody who's flexible, because we're a small team, and so sometimes you might be put on several different projects that might not be your own. So somebody who's just very willing to learn and experiment with other tools as well.
Books and recommendations
Thank you. Well, on the topic of loving books, I see Joseph, you asked a question about books in the chat. Do you want to jump in?
Yeah, sure. I heard Penguin, and I've just been waiting for all the book talk. I'd love to hear what you're reading now, or if there's something that you read recently that you have a recommendation on. And then separately, since this is the Data Science Hangout, if you or anyone else has a good, you know, fictional book that features data or data science to recommend.
I'll answer your first question, but I'll let the audience answer the second one. So the first question: I'm right now reading Between the World and Me by Ta-Nehisi Coates. I'm really enjoying it so far. You know, growing up in Singapore and then coming to the US, I feel like I live in a little bubble. And so it's really opened my eyes to a whole new world. And I think it's just really important to this country's discussion about race and equality. I've been quite moved by it.
And so the second question, about the data science books: does anyone in the audience have a recommendation? This is such a great question, and I'm hoping to see some people's suggestions, or for folks to chime in, because I'm at a loss. I read a lot of fantasy books, and sci-fi, and I don't know if I have any data science characters I can think of.
Yeah, I find Foundation very fascinating. Hari Seldon, and psychohistory. It's all stats, you know, machine learning, analysis, that kind of thing, pretty far in the future. It's fun.
I love it. I love to see all these books shared in the chat here. And Libby, we should probably reshare your collection of all the books shared in the Hangout too. Oh my gosh, it's been so long. I need to update it. Okay, yeah, I'll go grab it.
Mauro, I see you had a question. Yes, yeah, I totally love Audible books, because I can do things while I enjoy and learn. So my question: I often run out of good books to read, because I run out of ideas, really. So I would like to ask you for a recommendation of a process. How do you find, you know, a book that is worth your time?
Well, there is a Today's Top Books section at PenguinRandomHouse.com that our team put together. So it's a little bit of a shameless plug here. But Today's Top Books basically uses, you know, a bunch of different metrics. We look at Wikipedia page views, Google page views, you know, Goodreads, things like that, and we compile it into a giant aggregator. And it's supposed to come up with the trendiest, buzziest books of the moment. So a lot of people with book clubs, for example, would go there and find their books. But if you sign up for our newsletter on PenguinRandomHouse.com, you will also get personalized book selections based on your preferences and based on your browsing history. And that, again, comes from the data science team's recommendation model.
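A minimal sketch of the kind of aggregator described: normalize each buzz metric across titles, then combine them into a single trend score. The metric names, values, and equal weights below are assumptions for illustration, not the actual Today's Top Books formula.

```python
# Toy trend-score aggregator: min-max normalize each metric to [0, 1]
# across all titles, then take a weighted average per title.
def trend_scores(titles, weights=None):
    """titles: {title: {metric: raw_value}}; returns titles ranked
    from buzziest to least buzzy."""
    metrics = sorted({m for vals in titles.values() for m in vals})
    weights = weights or {m: 1.0 for m in metrics}

    # Per-metric min and max across all titles, for normalization.
    lo = {m: min(v.get(m, 0) for v in titles.values()) for m in metrics}
    hi = {m: max(v.get(m, 0) for v in titles.values()) for m in metrics}

    def score(vals):
        total = 0.0
        for m in metrics:
            span = hi[m] - lo[m]
            norm = (vals.get(m, 0) - lo[m]) / span if span else 0.0
            total += weights[m] * norm
        return total / sum(weights.values())

    return sorted(titles, key=lambda t: score(titles[t]), reverse=True)

ranked = trend_scores({
    "Book A": {"wikipedia_views": 9000, "goodreads_adds": 120},
    "Book B": {"wikipedia_views": 500,  "goodreads_adds": 4000},
    "Book C": {"wikipedia_views": 8000, "goodreads_adds": 3500},
})
print(ranked)  # Book C leads: strong on both metrics
```

Normalizing first matters because raw metrics live on wildly different scales; without it, whichever source has the biggest numbers would dominate the ranking.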
Using Posit tools
If it's okay, I'm going to ask maybe a little bit of a selfish question for me, because I've been starting to work on a few more of our customer spotlights. And I was wondering, Sophia, if you could share a little bit about the ways that you're using Posit.
Yes. So as I mentioned, you know, I've been an R user since my graduate school days. When I came on to Penguin Random House, one of the first things that I did was build a giant marketing mix model using R. And that model really is the foundation for all of the recommendations that we make to the marketing, sales, and publicity teams about which tactics actually work. Eventually, you know, we productionized the model, and it's been run on a pretty regular cadence. It's still being actively used for various sales teams' requests. So, you know, I think R is definitely really important, not just for discovery, but also for actual models in production.
Data science skills for publishing
Somebody asked, what data science skills do you believe are most important for the publishing industry? And what should potential recruits be focusing on?
Yeah. So actually, it's not too different from data science skills generally. Obvious ones are coding skills. Not just, I guess, coding, but also being organized in your code, so that when you work in a big team, your code can be shared and understood by a lot of different people. And so if you're out, or people need to make changes, you know, it's not a very significant endeavor. I would also say the ability to productionize a model, which is not something that most data scientists have. Because, as I mentioned before, we do the full stack, from creating the model to productionizing it, that is important for us.
And then for publishing, I mean, listen, at the end of the day, it doesn't matter how great your model is; if you can't explain it, it's really hard to sell it. So I think being able to speak in plain English about what you did, what the outcome is, and what you expect people to do about it is really important. A simple model is sometimes just as good, but it's really the implications of the model that people are interested in. And oftentimes, you know, they will ask you a lot of questions and specifics about the raw data, some of which may not even cross your mind as a data scientist, but it's helpful to have good business context. So you know what questions to ask before you even dive into modeling, because when you're presenting it to somebody at a much higher level, they're going to expect that you know what you were doing. So it's not just about the data modeling, definitely; it's a lot of context gathering.
Transitioning from individual contributor to manager
And there's another great Slido question, which I'm curious about as well, and that is: how have you found the transition from individual contributor to manager?
Yeah. So I was always worried about that, because I thought, you know, as an individual contributor, that it's so different to be a manager and be, you know, on the podium telling people what to do. So initially, it was a little bit tricky. Not so much when I was at the digital agency, but especially at Verizon, when I went from, you know, managing two people to managing 15. A lot of the people on that 15-person team, you know, were decades older than me, and it doesn't help that I have sort of a baby face, and I'm short. They had been there for, you know, 30 years, and here I was, a fresh-faced newbie managing them. So, you know, that was a bit of a transition. But I do think that confidence really helps. I had an amazing mentor when I was at Verizon, and one of the things he told me was, you know, just be confident. It's a little silly, but what they say about fake it till you make it really does work. If you just, you know, believe in yourself, and you exude that confidence, people will naturally gravitate towards you and be more willing to listen to you. But if you say everything in a very meek voice, you're sort of just, you know, throwing yourself under the bus.
So I think it's really good to have a mentor, somebody who can see things from the outside and tell you what you're potentially doing wrong, or how you can improve what you're already doing right. But at the end of the day, a lot of times I was thinking, what would this person do if he were in my shoes? Because you have to remember, everybody starts as an individual contributor, right? Unless your father is the CEO, nobody starts off as a manager. So everybody starts from the same place as you, and they had to overcome the same things. And if you were chosen to become a manager, or anything up from there, there was a reason: somebody thought you had the capability of doing it. So why wouldn't you believe it yourself? That's the process that went through my mind. And honestly, I don't even recall why I was so nervous at the time, because now it just feels like second nature.
Finding a mentor
I love that. Thank you so much, Sophia. Actually, let me ask a quick follow-up, because you mentioned the importance of finding a mentor, and I wanted to talk about mentorship for a little bit. How would you recommend we go about finding a mentor?
Yeah, that's another great question. I've been thinking about that, and I hate to say it, but a lot of it is luck. But if you were to run a selection process: ideally it's somebody a bit more pivotal in the organization who can point you to the right resources, preferably a couple of levels above you, somebody who's been through the rungs and knows what it takes to get up there. But that's actually not necessary, right? It can just be somebody you admire, or a peer you feel you can really learn from, who sees you and wants to help you. It's really about selecting somebody who you think has your back at the end of the day. The reason I say it's better if it's a skip-level person is that that person actually has the authority to bring you up with them. But even if it's not that person, it's somebody who's got your back, somebody who can see your good qualities and is able to bring them out in you. That's who I would select.
Handling forecast deviations
So I remember you talking about situations where you're asked to do some kind of forecast or prediction, say for a new product, and sometimes the stakes are high. When you roll out your model and the eventual result deviates from your prediction, how do decision makers typically respond? How critical have they been in your experience? Like, "Hey, you said this was going to happen and it didn't go the way you said." Maybe it was because of some factor that wasn't considered, something that blindsided everyone, or maybe someone acted on your prediction and it didn't happen. Have you had situations like that, and how do you respond to them?
Yeah, that's a good question. I don't actually manage the team that does the forecasting now, so I don't know the day-to-day...

No, I'm speaking about your previous roles; I think you mentioned Samsung, your experience there.

Right, yeah. It's a little tricky. Nobody has really called out our predictions in that way. The only thing I can remember at Samsung was when we reported out on a marketing mix model and the results weren't what the CMO thought they would be. She was not happy about the analysis and was very critical of it. In the end, nothing happened; they can scream and yell as much as they want, but the data is the data, and data scientists can only tell you what the data says. But I did learn from that experience. I'm a lot more careful now when it comes to reporting values to anybody, obviously including an executive team. You always have to remember who you're speaking to and what that means for data KPIs. If you're going to deliver news like that, you definitely want to make clear that it's based on a model, so you're not giving them exact numbers. It's not a point estimate; it's something that has a confidence interval. So it's very helpful to show what the interval might look like, so they won't come back and say, "Hey, you didn't hit exactly this number." If you give them a confidence interval, the actual result is bound to lie somewhere within it. I think that's a lot more helpful than just giving them a point estimate.
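The practice Sophia describes, reporting a range rather than a single number, can be sketched in a few lines of Python. This is a minimal illustration, not her team's actual method: all numbers are made up, and the interval is a simple normal approximation built from hypothetical past forecast errors.

```python
import statistics

# Hypothetical errors from past forecasts (actual minus predicted, in model units)
past_errors = [-4.0, 2.5, -1.0, 3.5, 0.5, -2.0, 1.5, -0.5]

point_forecast = 120.0  # the model's single-number prediction (made up)

# Normal-approximation 95% interval: point estimate +/- 1.96 * error std dev
err_sd = statistics.stdev(past_errors)
lower = point_forecast - 1.96 * err_sd
upper = point_forecast + 1.96 * err_sd

print(f"Forecast: {point_forecast:.0f} "
      f"(95% interval: {lower:.1f} to {upper:.1f})")
```

Presented this way, a result that lands anywhere inside the interval is consistent with the forecast, which is exactly the protection against "you didn't hit exactly this number" that she describes.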
Fostering curiosity and innovation
Thank you, Sophia. And somehow we are nine minutes from the top of the hour; I don't know how this hour goes by so quickly. But I wanted to ask a career-advice question from a management perspective. What steps do you take as a leader to foster a culture of curiosity, experimentation, and innovation within your team? Or what would you like to do more of?
Yeah, definitely. So within my team, and I think I might have mentioned this before, we actually have official things we need to report out on that require us to do experimentation. If a new tool comes up, my manager might say, "Hey, can we do some research into this, compare it with the other tools we've looked at before, and see if it's a viable product?" So that's the official route. But I also really encourage the people who work for me to go to conferences, meet other people, and find out how they're attacking certain problems. With the time constraints it's not always possible, though, so that's definitely something I hope to do more of in 2025.
Other data science applications at Penguin
Thank you. So I loved getting to learn that today's top books was part of the work your team is doing. And I see somebody else asked a question on Slido: what other aspects of publishing is data science involved in at Penguin? Are there a few others you could share that we haven't talked about yet?
Yeah, I'm trying to think. A lot of it is strategy. For example, we could help with tagging authors with their occupations: if an author was also a singer versus a politician, which of those occupations actually sells? Whose memoirs would sell more because of their secondary occupation? Things like that, where we help automate previously very manual processes. It could be a lot of different things, but even something that small does impact strategy as a whole.
Keeping up with new technologies
So you mentioned working with new technologies. How do you keep up with the latest technologies and trends? And to add on to that, what are you excited about? What new stuff are you seeing that's getting you going?
So I do attend a variety of different conferences, and I get invited to a lot of different talks, and I think that's helpful. Definitely also just reading up in the news about various things that are happening. And lately, with ChatGPT, it's incredibly easy to summarize what's going on; it's like a cheat sheet for everything, right? So I've used that a bunch. And I'm really excited to see where this is all going. Is everybody going to be on Google? Is everyone going to be on ChatGPT? Where is the future heading? That's what I'm most excited about, because obviously it's going to impact retail as a whole. And is OpenAI going to try to monetize this somehow by making us pay to surface higher in the algorithm? Where search is going is all very, very fascinating to me.
Measuring ROI on data science projects
Thank you so much, Sophia. I have maybe one last question here, and I haven't fully formed it in my head. But when we were talking about going from two weeks to 30 minutes, I feel like that number really sticks out for people. A lot of people are trying to come up with their own ROI metrics, and it's so hard to do. I was just wondering, how did you go about that? What is the process you use internally?
Yeah, that's always tricky, because a lot of our projects do not have very clear KPIs like that. When it comes to process automations, it's much easier to get those concrete numbers. But for something like the marketing mix analysis I was talking about, it's really hard to measure. Even if the marketer says, "I put this recommendation into action on this date," and you measure the ROI pre and post implementation, the difference may not necessarily be due to your recommendation, because so many other things could have been happening at the same time. So for marketing mix I think it's a little trickier, but for process automations it's naturally much easier to calculate the actual time savings. I do make an effort, every year or so, to do a wrap-up and put everything into numbers on a spreadsheet, so that I can see, from my point of view, what the highest priorities should be and how to allocate my resources. But yeah, it's not always very obvious.
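The time-savings arithmetic Sophia mentions for process automations can be put into a small back-of-the-envelope sketch. Every number here is a hypothetical placeholder (cadence, rate, and durations are assumed, not from the conversation beyond the two-weeks-to-30-minutes example):

```python
# Back-of-the-envelope ROI for a process automation:
# a task that took 2 weeks of analyst time now runs in 30 minutes.
HOURS_PER_WEEK = 40

manual_hours = 2 * HOURS_PER_WEEK   # 80 hours per run, done manually
automated_hours = 0.5               # 30 minutes per run, automated
runs_per_year = 12                  # assumed monthly cadence

hours_saved = (manual_hours - automated_hours) * runs_per_year

hourly_cost = 60.0                  # assumed fully loaded analyst rate, USD

print(f"Hours saved per year: {hours_saved:.1f}")
print(f"Estimated annual savings: ${hours_saved * hourly_cost:,.0f}")
```

A spreadsheet wrap-up of the kind she describes is essentially this calculation repeated per project, which makes the relative priorities easy to rank even when absolute dollar figures are rough.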
Yeah, it's something that's been top of mind for me lately: even when someone comes on as a new customer, how do we help identify the first project they're going to work on? What's the process now, and how long does it take? And then afterwards, what do they get to, and what's the impact?
Right, right. I don't know if this is helpful, but we've been doing a bunch of A/B tests for one of the automations we've done, comparing it to another sample of titles, balanced by format and division and all that. How did it do once we had that advertising in place? Things like that would be a bit more controlled, I guess.
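The comparison Sophia describes, automated advertising on one set of titles versus a matched control group, could be evaluated with a standard two-sample test. This is only a sketch of that idea with invented sales numbers, not Penguin's actual analysis; a real version would also handle the balancing by format and division.

```python
import math
import statistics

# Hypothetical weekly sales for titles with the automated ad placement
# (treatment) vs a matched control group of titles.
treatment = [132, 145, 151, 128, 160, 149, 137, 155]
control = [120, 131, 125, 118, 140, 129, 122, 135]

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

lift = statistics.mean(treatment) - statistics.mean(control)
t_stat = welch_t(treatment, control)
print(f"Mean lift: {lift:.1f} units/week, Welch t = {t_stat:.2f}")
```

In practice a library routine such as SciPy's `ttest_ind` with `equal_var=False` would give the same statistic plus a p-value; the hand-rolled version here just makes the calculation visible.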
Closing advice
Thank you so much. I know we have one minute left here, and we've already gotten so much great advice. But is there any career advice or anything left you'd want to share with us? Maybe some advice you'd give yourself 10 years ago?
I'm trying to think of myself 10 years ago, and I'm having trouble pulling up an image. I would say, just don't be afraid of trying something new. A lot of times people have called me too brazen. For example, some jobs put on the description, "must have 10 years of experience or more," and back then I had maybe five years of experience. I never cared about that. Don't let people tell you what you can or cannot do. You try it, and the worst thing that can happen is that they don't accept you. I always think you should be a little brazen when it comes to trying something new.
