MLOps for a billion pizzas a year | Zack Fragoso @ Domino's Pizza | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Hi everybody, welcome back to the Data Science Hangout. If we haven't had a chance to meet yet, I'm Rachel, I lead Customer Marketing at Posit. I'm so excited to have you joining us here today. The Hangout is our open space to hear about what's going on in the world of data across all different industries, chat about data science leadership, and connect with others who are facing similar things as you.
We get together here every Thursday at the same time, same place, so if you're watching us in the future and want to join us live, there'll be details to add it to your calendar below on YouTube, but just make sure that it adds it for 12 Eastern Time so that you can join us live. I know some people had some issues around like daylight savings time or with the time zones, so it's every Thursday from 12 to 1 Eastern Time.
I always love to ask, is this anybody's first Data Science Hangout? Say hi in the chat because we'd love to welcome you in and say hello there. We are all dedicated to keeping this a friendly and welcoming space for everyone and love to hear from you no matter your years of experience, languages you work in, titles, or industry. It's also totally okay if you just want to listen in and be a part of the Zoom chat party that you'll see happening here, but there are also three ways you can jump in and ask questions or provide your own perspective too.
So first, you can raise your hand on Zoom and I'll call on you. Second, you can put questions in the Zoom chat here, and if you're in a place with a loud background or something and you want me to read it, just put a little asterisk or star, as I call it, next to it. And then third, we have the Slido link, which Curtis will share or maybe already shared, where you can ask questions anonymously too.
Before we jump in here, I have two other notes I wanted to share with everybody. The lowest price for the Posit conference expires tomorrow, so I just want to make sure everybody knows that. If you are thinking of joining us for the conference in August in Seattle, I encourage you to go take advantage of that before the price goes up tomorrow.
And then next, I am on a personal mission this year to help connect people who are using Posit across their own companies. It really pains me when I find out somebody didn't know they had Posit Connect somewhere in their company and I could have helped connect them. Our LinkedIn group might make it easier for you to find people from your company on your own, but I would also love to help introduce you. So I made a short, quick Google form if you are ever curious and want me to help connect you with other people from your company.
But with all that, thank you so much for joining us today, and I am so happy to be joined by my co-host, Zack Fragoso, head of MLOps at Domino's Pizza. Maybe we kick off the party in the chat by sharing your favorite kind of pizza. But Zack, I'd love to have you introduce yourself, share a little bit about your role, and something you like to do outside of work, too.
Hey, everyone. Thank you for having me today, Rachel. Thanks for the invite. I've been really looking forward to this. I've participated in a few of these, and it's really cool to be on the other side of it presenting. So thanks for the opportunity.
My name is Zack Fragoso. I'm coming to you here from sunny Ann Arbor, Michigan today. I'm here to talk about the Domino's data journey that we've been on, and as I was writing some notes for my talk, I thought, well, my journey is kind of woven in there, too. So I'd be happy to talk about my career trajectory and the opportunities and roadblocks I've faced along the way as well.
But just to get started with a quick intro, I'm the head of MLOps at Domino's. In that capacity, my charter is to make sure that we have the right technology and process available to our data science team to deploy our models into our production environments and integrate them with whatever downstream platform we're integrating with, whether that's our e-commerce platform or our in-store technology platform.
My team also covers global store siting, using geospatial data and machine learning models to identify the best places to build new Domino's stores, and supply chain optimization as well: looking across our network of stores and supply chain centers and finding ways to optimize that network to save costs, but also to help the environment, which is an impact I'm really excited about.
I play guitar, and I just saw Dune II last night. I finished the book over the weekend because I frantically wanted to finish it before the movie left theaters. I literally went to the last IMAX showing of Dune II last night at 9 p.m., so I was up until like midnight; it was the last IMAX showing before Godzilla takes over the IMAX theater near me.
The Domino's data science team
I've been lucky enough to be a part of a really exciting growth journey at Domino's. I joined the Domino's data science team in 2018, and around that time I think we had 10 to 15 data scientists on the team. As the team stands now, we have around 80 data scientists. So pretty rapid growth over the last few years, really focused on the business impact we've been able to deliver.
We have 80 team members: some in Seattle, some in Ann Arbor, some in Amsterdam, and some in Singapore. So we're truly a global data science team, which is exciting. We have some really cool partnerships, one of them being with the Posit team, which we're really excited about, but also NVIDIA, Microsoft, and Docker. We have teams specializing in pricing, personalization, retail technology, store siting, forecasting, supply chain optimization, and MLOps. So we cover almost every area of the business.
I think if I were to give one piece of advice for someone going through this journey, or about to embark on it, it's that everything starts with the data. I mean, it's so cliche now, but it's truly garbage in, garbage out. So focus on having really high quality data that not only exists but is in an accessible format for data scientists to leverage to extract that business value. For any team that's starting, that's got to be the starting point.
I mean, it's so cliche now, but it's truly garbage in, garbage out.
MLOps challenges at scale
I'm going to touch on two challenges. The first one is scalability. When you talk about ML and AI and MLOps, there's a lot that can be done on a data scientist's laptop, and great accuracy and recall can be achieved on models on an individual's laptop, but a lot of challenges arise when trying to scale that to a large system.
Domino's, probably not a shocker to anybody, but we sell a lot of pizza: close to a billion pizzas every year across the globe. So everything that we do has to be looked at through the lens of scale. There's that volume scale, but our business is also highly concentrated. If you look at the pizza business, 80% of our orders are between 5 and 8 p.m. That's at a daily level, but even if you look across the year, days like Halloween, the Super Bowl, and New Year's represent a disproportionate amount of our business. So we have to scale to be able to serve something like 6,000 orders per minute during the Super Bowl. If we're doing machine learning at the order level, our models not only need to be accurate, but performant at that scale.
The other one is process. Data scientists are experimenters at heart. They want to take the time to experiment with the data and with different modeling types, and fitting that into a process that balances flexibility and creativity with control, versioning, and reliability is tough. Finding the right balance is a journey that we're still on, to be honest.
Maybe one example there is looking at different model types and the balance between accuracy and performance. A good example that I always go to is random forests: they're really great at achieving accuracy, but traditionally slow on the inferencing side. So we might make the decision to use a deep neural net, even if that's a little overkill for the accuracy we need, because it's more performant from a scalability perspective. Understanding those trade-offs and having our data scientists keep them in mind as they do their model development and experimentation is, I think, important.
Delivery routing and store network optimization
I think those are actually the areas of opportunity that we really look for, because they're oftentimes a win-win. When we can get the customer a hotter, fresher pizza, it's a better experience for the customer, but we can also potentially reduce the miles that we're driving on the road and improve our labor utilization rates. Those are the kinds of efficiencies that we're really after: wins from both a customer and a cost perspective.
So, yeah, we look at that from several different angles. One of them is the store siting angle: where do we put our stores relative to our customers to optimize our network? Domino's has around 6,800 stores in the United States, and around 80 percent of the U.S. population lives in a Domino's delivery area. So we look at our stores not as single units but as a network of stores, and ask how we optimize that network to better serve our customers. That's at the macro level, but we're also looking at it from a delivery routing perspective: which orders should be grouped together on routes to better serve the customers at the most micro level?
Data contracts and feature engineering
And that is often step number one when we're embarking on a new data science or AI/ML project: data contract negotiation. It starts with that feature engineering, data exploration, and initial model exploration phase, where our data scientists look at, hey, what is the most parsimonious set of features that we need to achieve the model accuracy that we're looking for?
Going through that experimentation, having other data scientists review it, making sure that we're not asking for too much. Then immediately after that, phase number two is taking that to our business partners. That includes our enterprise data warehouse team, our e-commerce team, or our in-store platform team, and saying, hey, this is the model we want to build and this is the data that we're going to need to service that model at inference. And then it becomes a negotiation; we literally call it data contract negotiation between the partners. There are gives and takes, because passing data, especially when you're talking about real-time inferencing, has a real cost to it. So taking those things into consideration and balancing that with what's needed to achieve a certain level of accuracy is a big part of the initial planning for any project we have.
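To make the idea concrete, a negotiated data contract can be thought of as a schema that both sides validate inference payloads against. This is a rough sketch, not Domino's actual tooling; the field names are invented for illustration:

```python
# Hypothetical sketch: encode a negotiated "data contract" as a schema and
# validate incoming inference payloads against it. Field names are invented;
# this is not Domino's actual contract format.

CONTRACT = {
    "store_id": int,             # which store the order belongs to
    "orders_on_make_line": int,  # load ahead of this order
    "drivers_on_duty": int,      # available delivery capacity
    "order_total": float,        # basket size
}

def validate_payload(payload: dict) -> list[str]:
    """Return a list of contract violations (empty list means the payload conforms)."""
    errors = []
    for field, expected_type in CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}, "
                          f"got {type(payload[field]).__name__}")
    # Reject extra fields too: every field passed at inference time has a real cost.
    for field in payload:
        if field not in CONTRACT:
            errors.append(f"unexpected field: {field}")
    return errors

good = {"store_id": 1042, "orders_on_make_line": 7, "drivers_on_duty": 3, "order_total": 24.99}
bad = {"store_id": "1042", "drivers_on_duty": 3}

print(validate_payload(good))  # []
print(validate_payload(bad))
```

The point of the "reject extra fields" branch is the give-and-take described above: the contract caps what can be passed, not just what must be.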
Supply chain forecasting and GPU-accelerated routing
I think about two different flavors of model that we implement to optimize our supply chain network. One is forecasting; that's key to all of this. When you're talking about optimizing supply chain, we can't fix anything that already happened, so we want to optimize into the future. That necessitates time series forecasting models, and we have a whole laundry basket of forecasting models that we implement, from the hourly demand signature level to the store level to the system level.
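As a minimal illustration of the kind of baseline such a forecasting stack might start from (the demand numbers are made up; this is not a production Domino's model), simple exponential smoothing over an hourly order-count series:

```python
# Minimal illustration (not a production model): simple exponential smoothing
# over an hourly order-count series. Later observations get geometrically more
# weight, controlled by alpha.

def exponential_smoothing(series, alpha=0.3):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    level = series[0]
    for obs in series[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

hourly_orders = [12, 15, 14, 40, 85, 90, 60, 20]  # invented demand signature
print(round(exponential_smoothing(hourly_orders), 1))
```

A real system would layer seasonality (the 5-to-8 p.m. peak, Super Bowl days) on top of a level like this, but the idea of an explicit, cheap baseline carries over.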
The next step is really combining that with more combinatorial optimization algorithms. So not even traditional AI or ML, but things like mixed integer programs or vehicle routing solvers. One of the cool things that we're doing in that area, again to address the earlier topic of scalability: traditional MIP or vehicle routing solvers are meant to solve logistics problems overnight, because orders come in, then the next day you need trucks, and they take the whole day to deliver those routes and come back to the depot.
Well, for us, we need to route delivery drivers who are taking something like 1,000 routes a minute when you look across our entire system. So we need to solve those vehicle routing problems really fast. We've partnered with NVIDIA to leverage their cuOpt solver, which is really cool: the first GPU-enabled vehicle routing solver. It's able to solve those problems really fast and get very, very close to the ideal solution.
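For a sense of what a vehicle routing heuristic does at its simplest, here is a toy nearest-neighbor sketch. This is not cuOpt's API and not Domino's code; the coordinates are invented, and real solvers handle far richer constraints (capacities, time windows) with much stronger search:

```python
# Toy illustration only: a naive nearest-neighbor heuristic for ordering the
# stops on a single delivery route from a store. GPU solvers like NVIDIA's
# cuOpt explore vastly better solutions under real constraints.
import math

def nearest_neighbor_route(store, stops):
    """Greedily visit the closest remaining stop; returns the visit order."""
    route, current, remaining = [], store, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route

store = (0.0, 0.0)
stops = [(5.0, 5.0), (1.0, 0.5), (2.0, 2.0)]
print(nearest_neighbor_route(store, stops))  # closest stop first
```

Even this greedy ordering shows why routing is combinatorial rather than a learned model: the output is a discrete sequence, not a prediction.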
Communicating data quality to leadership
Man, I see both sides of this coin, to be honest with you, which I think is at the root of what makes this a challenging problem. You have what I talked about earlier: garbage in, garbage out. The accuracy of your model, which I'll use as a proxy for value, assuming more accuracy is more value, depends on the data going in; whether it's messy, or even the scope of the data, where what you have is clean but you're missing key pieces of information. That impacts your accuracy, and it's therefore going to limit the amount of value, in some cases, that you can deliver.
On the other side of that coin, leadership has a point when they say, hey, we're not looking for the maximum amount of accuracy that we can get from your model; we believe there's value in a lower accuracy model. I think the answer to this question lies in establishing a baseline and really understanding what you're trying to beat. In most cases, that's a human decision, or a very basic, almost algebra-type calculation.
Take, for example, our delivery times. One simple way to look at it: if we didn't know how many drivers were available, how many people were working in our store, or anything about the order or all the orders ahead of it on the make line, we could still build a baseline model that just looks at the store's last 20 minutes. What was their average delivery time? Just use that. Knowing that that's the baseline helps you have that target. Then you can say, hey, we either do or don't have the data that we need to perform better than whatever that baseline is. I think that's maybe something that is more tangible to leadership.
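The baseline described above fits in a few lines. This is a hypothetical sketch with invented data, not Domino's code:

```python
# Sketch of the baseline: predict a store's next delivery time as the average
# of its deliveries from the last 20 minutes. Data is invented for illustration.

def baseline_estimate(deliveries, now, window_minutes=20):
    """deliveries: list of (timestamp_minutes, delivery_time_minutes) tuples."""
    recent = [t for ts, t in deliveries if now - window_minutes <= ts <= now]
    if not recent:
        return None  # no recent signal; a real system would fall back to a default
    return sum(recent) / len(recent)

history = [(100, 28.0), (105, 31.0), (112, 25.0), (60, 45.0)]  # last tuple is stale
print(baseline_estimate(history, now=115))  # averages only the three recent deliveries
```

Anything a richer model adds (driver counts, make-line load) has to beat this number to justify its data cost, which is the tangible framing for leadership.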
Capturing the value chain and ROI
I think one of the skills that I had to develop as I transitioned from individual contributor data scientist to senior data scientist to people leader in the data science space is being really, really good at capturing the value chain. Because in data science, you might not be doing the thing that immediately impacts the business value; you might be doing something two rungs down. I'll bring up the delivery time estimate as an example.
Why is giving a customer a better delivery time estimate valuable to Domino's? Well, if they get a better estimate, we're more likely to meet their expectations. If we meet their expectations, they're more likely to order more pizza. If they order more pizza, the company makes more money. That's a really stupid-simple example, but it can get really complex depending on what your business is. So being really good at illustrating and capturing that value chain, I think, is an important skill to develop. And it's kind of an art and a science.
I think one of the skills that I had to develop as I transitioned from individual contributor data scientist to senior data scientist to people leader in the data science space is being really, really good at capturing the value chain.
The other question was on the mix of internal and external data. I feel very fortunate at Domino's that we have what I think of as a really well curated data science mart, where we have access to a lot of information about our stores, our orders, and our customers. It's not perfect, but it's in a very good place in terms of accessibility for our data science team. So the amount that we have to look at third party data is not huge; it's definitely primarily Domino's own data. But there are certainly areas where you look for third party data. On the store siting team, we purchase geospatial demographic data, credit card data, weather data, and event-based data. So I would say predominantly Domino's, but certainly we bring in third party data in certain cases.
Points for Pie and favorite use cases
No, that's really cool, honestly. Certainly our research science team is involved in consumer research around new products. If you notice, Domino's doesn't take a limited-time-offer approach where we're putting out things and testing them like that. We really put a lot of effort into our product research, really ensuring that if we're going to release a new product, it's going to be accretive to the business and well received by customers. So we're not the brand to be putting out new products all the time, but when we do, they usually do really well because we do a lot of research ahead of time.
Now, this is a really cool, fun LLM case study idea: hey, if we looked at generative AI, what kind of crazy pizzas or products could we generate? That sounds like a fun one; it kind of sparked my thinking on that.
It was the Points for Pie program that launched in 2019. If you didn't see it or don't remember, we would give you Domino's loyalty points for any pizza that you ordered; it didn't have to be a Domino's pizza. The way you redeemed those points was you would go into our app and scan your pizza. So we built a pizza/not-pizza model to detect whether a pizza was present, and then you would get the loyalty points for that pizza. That's one of my favorites. It got really good PR. Not everything that we do as data scientists gets to end up in a marketing window that gets seen by everybody, so that was a really cool opportunity.
Model monitoring and drift detection
This is a big area of opportunity for our team. We do daily model accuracy assessment: whichever data science team owns a production model is responsible for monitoring that model's accuracy and flagging any drift that they see in our ability to make predictions.
And we do have some cool cases of detecting anomalies. For example, with our delivery time estimation model, this just happened recently: there was a new manager in one of the stores, and he switched everyone's position code to driver. I don't know if it was easier for him or whatever, but we started giving really, really fast delivery time estimates for that store, and they were wildly inaccurate. Customers were getting pizzas in like an hour, and we would say they were going to be there in five minutes. So we shut down the model for that store.
But where I want to take us, from the leadership perspective on this group, is really creating that single pane of glass. I almost think of the Star Trek command center kind of thing: hey, this is my control panel for all my machine learning models that are in production, where I can assess the general health of each service. Is it up and running? Is the throughput what we expected? But also, is the data coming in what we expected, and is the accuracy what we expected? So we can land there and say, hey, Model X is green and healthy; Model Y is yellow, we need to look into that; and Model Z is red, we need to shut that down right now. That's my dream, my vision. Making that a reality is on the roadmap.
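The green/yellow/red roll-up in that vision can be sketched as a simple status function. This is hypothetical; the metric names and thresholds are invented for illustration, not an actual monitoring spec:

```python
# Hypothetical sketch of the "single pane of glass" roll-up: collapse a model's
# operational metrics into a green/yellow/red status. Thresholds are invented.

def model_health(uptime_ok, throughput_ratio, accuracy_drop):
    """uptime_ok: service reachable; throughput_ratio: observed/expected
    requests; accuracy_drop: fraction of accuracy lost versus baseline."""
    if not uptime_ok or accuracy_drop > 0.10:
        return "red"     # shut it down / page someone now
    if throughput_ratio < 0.8 or accuracy_drop > 0.03:
        return "yellow"  # degraded; needs a look
    return "green"       # healthy

print(model_health(True, 0.95, 0.01))  # healthy model
print(model_health(True, 0.70, 0.02))  # throughput lagging
print(model_health(False, 1.0, 0.0))   # service down
```

A dashboard would evaluate something like this per model and surface only the non-green rows, which is what makes dozens of production models tractable at a glance.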
Languages, tools, and Posit Connect
So I want to say I am not a programming language purist; we try to be open and flexible about the languages that team members want to use. That being said, we've landed on Python as our production software language. The decision was made early on to standardize on one language for our production-level work, just from an efficiency standpoint: having reviewers review one language, not multiple languages, and not having to write two sets of documentation for everything that we're doing. It was a really practical decision to land on one language for the production stuff. But in terms of R&D, experimentation, and data wrangling, we are customers of Posit Connect, so even some web app type development we do in R.
Yeah, one of the really cool use cases around Shiny apps on our Posit Connect platform is in the area of model transparency and demystifying the black box of models a little bit. When we're working on a new model, our business partners have a lot of questions: how would the model react if X? How would it react if Y? How would the model react if we get a school lunch order and there are a hundred orders placed at one time? How would it react if we had a snowstorm? What kind of labor would it recommend?
So what we've done for some of our forecasting models is create a Posit Connect application that has that model on the backend and user inputs for the model's inputs. The user can go experiment, run these what-if scenarios, and see what the output would be. That's really helped us build comfort with our partners, and I think it's a cool use case for the platform.
Advice for PhD students entering industry
There are two things I wish I could tell myself in my first year out of my PhD and into industry. One, and I think it's related to another question we had today, is about model accuracy. When you're in academia, your objective is to find the best-fitting model in order to move the field forward. That's where your focus needs to lie: building the best model possible.
But when you get into industry, the best model isn't necessarily the one that is going to be the most valuable. Maybe there's a simpler model that is more parsimonious and quicker to get out into production and start generating value. There is a case for going with that simpler, quicker-to-deploy model rather than taking the time to do the full experimentation and achieve the most optimal accuracy possible. I had a few scenarios where I was just taking too long to deliver because I was in this academic mindset of needing to find the best solution, not the most valuable solution.
And the other area is, at least in my academic experience, intellectual debate becomes second nature. I really cherish the times in my PhD education where I got to challenge my classmates' thinking and push back, and what came out of that was usually mutually beneficial to everyone involved. You can't do that in industry. Not everybody is thinking that way, and not everyone is expecting to join a meeting and be part of an academic debate. That's one thing I learned early on: hey, why does no one want to talk to me? You can't challenge everyone's ideas all the time. So that's another lesson I wish I knew.
Engaging people on the emotional plane
So I think what's really important is to engage people on the emotional plane first. People have an emotional attachment to their work; it's where they spend most of their time. Most people, I would say, see the people that they work with more than they see their significant other sometimes. The work that you do is such a huge part of your identity, and the projects that you work on become a part of your identity. So when you're challenging someone's ideas, sometimes you're challenging their identity and the value they see in themselves. Really understanding that, and meeting people on the emotional plane before you dive into the logical plane, I think, is a good strategy that I've tried to utilize. I'm not always perfect at it, but that's my pro tip.
So when you're challenging someone's ideas, sometimes you're challenging their identity and the value they see in themselves. Really understanding that, and meeting people on the emotional plane before you dive into the logical plane, I think, is a good strategy that I've tried to utilize.
Career advice and leading people
One of the best pieces of advice that I ever got from my mentor was that you can't have everything all at once. You have yourself, your work, and your family, and when you're planning your career development, you need to look at it through the lens of your entire life and what it means in those other aspects of your life, not just what it means for you at work.
Maybe there's a time when, let's say, you have young kids and you need to pull back at work a little bit and put more of your energy into the family bucket. It might not be like that forever; you might decide after a certain amount of time to put more of yourself into your work bucket to advance your career. Looking at your career development through the lens of your entire life, and how it impacts not just you but the people around you, is the compass that I try to use when I'm thinking about my career.
I think the biggest challenge that I see, and I know I've experienced this and see it in others as well, is that if you're coming out of a technical background in your education, you don't get much exposure to how to lead people. When you go from a technical role to a people leader role in the technical space, that's often the gap: how do you motivate people? How do you have difficult conversations? How do you align people's skill sets with their interests, and is that even important? Those are the things that I often see missing. The technical stuff, I've found, is usually not the challenge.
IoT and in-store data
So we do have data points at that level. There are items in the store that employees engage with that end up as data points in our database. For example, when they clear the make line, meaning they made a pizza, before they put it in the oven, they have to click a button that says I'm done making this pizza. That's a really important data point for us because it helps us optimize points in the store.
So to expand on this: of course we want more of that. Never ask a data scientist whether they want more data, because the answer is always yes. We would love more data from IoT-type devices, but we face two challenges there. One is cost. It's not like IoT in a factory, where you might have 200 different steps that you can throw a device on and get a good pulse. We need to do that across 6,800 stores that are geographically dispersed literally everywhere in the country. So getting those IoT devices into the stores is a logistical challenge and a cost challenge, because those devices are usually not cheap. Secondly, our stores are franchise owned, so we would not only need to overcome those logistics and costs, we'd also need to convince franchisees that there's a benefit in doing that.
Well, thank you so much, Zack, for joining us today. I've loved this discussion, and thank you all for the great questions as well.
