MLOps for a billion pizzas a year | Zack Fragoso @ Domino's Pizza | Data Science Hangout
Transcript
This transcript was generated automatically and may contain errors.
Hi everybody, welcome back to the Data Science Hangout. If we haven't had a chance to meet yet, I'm Rachel, I lead Customer Marketing at Posit. I'm so excited to have you joining us here today. The Hangout is our open space to hear about what's going on in the world of data across all different industries, chat about data science leadership, and connect with others who are facing similar things as you.
We get together here every Thursday at the same time, same place, so if you're watching us in the future and want to join us live, there'll be details to add it to your calendar below on YouTube, but just make sure that it adds it for 12 Eastern Time so that you can join us live. I know some people had some issues around like daylight savings time or with the time zones, so it's every Thursday from 12 to 1 Eastern Time.
I always love to ask, is this anybody's first Data Science Hangout? Say hi in the chat because we'd love to welcome you in and say hello there. We are all dedicated to keeping this a friendly and welcoming space for everyone and love to hear from you no matter your years of experience, languages you work in, titles, or industry. It's also totally okay if you just want to listen in and be a part of the Zoom chat party that you'll see happening here, but there are also three ways you can jump in and ask questions or provide your own perspective too.
So first, you can raise your hand on Zoom and I'll call on you. Second, you can put questions in the Zoom chat here, and if you're in a place with a loud background or something and you want me to read it, just put a little asterisk or star, as I call it, next to it. And then third, we have the Slido link, which Curtis will share or maybe already shared, where you can ask questions anonymously too.
Before we jump in here, I have two other notes I wanted to share with everybody. The lowest price for the Posit conference expires tomorrow, so I just want to make sure everybody knows that. If you are thinking of joining us for the conference in August in Seattle, I encourage you to go take advantage of that before the price goes up tomorrow.
And then next, I am on a personal mission this year to help connect people who are using Posit across their own companies. It really pains me when I find out somebody didn't know they had Posit Connect somewhere in their company and I could have helped connect them. Our LinkedIn group might make it easier for you to find people from your company on your own, but I would also love to help introduce you. So I made a short, quick Google form if you are ever curious and want me to help connect you with other people from your company.
But with all that, thank you so much for joining us today, and I am so happy to be joined by my co-host, Zack Fragoso, head of MLOps at Domino's Pizza. Maybe we kick off the party in the chat by sharing your favorite kind of pizza. But Zack, I'd love to have you introduce yourself, share a little bit about your role, and something you like to do outside of work, too.
Hey, everyone. Thank you for having me today, Rachel. Thanks for the invite. I've been really looking forward to this. I've participated in a few of these, and it's really cool to be on the other side of it presenting. So thanks for the opportunity.
My name is Zack Fragoso. I'm coming to you here from sunny Ann Arbor, Michigan today. I'm here to talk about the Domino's data journey that we've been on, and as I was writing some notes for my talk, I thought, well, my journey is kind of woven in there, too. So I'd be happy to talk about my career trajectory and the opportunities and roadblocks I've faced along the way as well.
But just to get started with a quick intro, I'm the head of MLOps at Domino's. In that capacity, my charter is to make sure that we have the right technology and process available to our data science team to deploy our models into our production environments and integrate them with whatever downstream platform we're integrating with, whether that's our e-commerce platform or our in-store technology platform.
My team also covers global store siting, using geospatial data and machine learning models to identify the best places to build new Domino's stores, and supply chain optimization as well: looking across our network of stores and supply chain centers and finding ways to optimize that network to save costs, but also to help the environment, which is an impact I'm really excited about.
I play guitar, and I just saw Dune II last night. I finished the book over the weekend because I frantically wanted to finish it before the movie left theaters. I literally went to the last IMAX showing of Dune II last night at 9 p.m., so I was up until like midnight; it was the last IMAX showing before Godzilla takes over the IMAX theater near me.
The Domino's data science team
I've been lucky enough to be a part of a really exciting growth journey at Domino's. I joined the Domino's data science team in 2018, and around that time I think we had 10 to 15 data scientists on the team. As the team stands now, we have around 80 data scientists. So pretty rapid growth over the last few years, really focused on the business impact we've been able to deliver.
We have 80 team members: some in Seattle, some in Ann Arbor, some in Amsterdam, and some in Singapore. So we're truly a global data science team, which is exciting. We have some really cool partnerships, one of them being with the Posit team, which we're really excited about, but also NVIDIA, Microsoft, and Docker. We have teams specializing in pricing, personalization, retail technology, store siting, forecasting, supply chain optimization, and MLOps. So we cover almost every area of the business.
I think if I were to give one piece of advice for someone going through this journey, or about to embark on it, it's that everything starts with the data. I mean, it's so cliche now, but it's truly garbage in, garbage out. So focus on having really high quality data that not only exists but is in an accessible format for data scientists to leverage to extract that business value. For any team that's starting, that's got to be the starting point.
I mean, it's so cliche now, but it's truly garbage in, garbage out.
MLOps challenges at scale
I'm going to touch on two challenges. The first one is scalability. When you talk about ML and AI and MLOps, there's a lot that can be done on a data scientist's laptop, and great accuracy and recall can be achieved on models on an individual's laptop, but a lot of challenges arise when trying to scale that to a large system.
Domino's, probably not a shocker to anybody, but we sell a lot of pizza: close to a billion pizzas every year across the globe. So everything that we do has to be looked at through the lens of scale. There's that volume scale, but our business is also highly concentrated. If you look at the pizza business, 80% of our orders are between 5 and 8 p.m. That's at a daily level, but even if you look across the year, days like Halloween, the Super Bowl, and New Year's represent a disproportionate amount of our business. So we have to scale to be able to serve something like 6,000 orders per minute during the Super Bowl. If we're doing machine learning at the order level, our models not only need to be accurate, but performant at that scale.
The other one is process. Data scientists are experimenters at heart. They want to take the time to experiment with the data and with different modeling types, and fitting that into a process that balances flexibility and creativity with control, versioning, and reliability is tough. Finding the right balance is a journey that we're still on, to be honest.
Maybe one example there is looking at different model types and the balance between accuracy and performance. A good example that I always go to is random forests: they're really great at achieving accuracy, but traditionally slow on the inferencing side. So we might make the decision to use a deep neural net, even if that's a little overkill for the accuracy we need, because it's more performant from a scalability perspective. Understanding those trade-offs and having our data scientists keep them in mind as they do their model development and experimentation is, I think, important.
Delivery routing and store network optimization
I think those are actually the areas of opportunity that we really look for, because they're oftentimes a win-win. When we can get the customer a hotter, fresher pizza, it's a better experience for the customer, but we can also potentially reduce the miles that we're driving on the road and improve our labor utilization rates. Those are the kinds of efficiencies that we're really after: wins from both a customer and a cost perspective.
So, yeah, we look at that from several different angles. One of them is the store siting angle: where do we put our stores relative to our customers to optimize our network? Domino's has around 6,800 stores in the United States, and around 80 percent of the U.S. population lives in a Domino's delivery area. So we look at our stores not as single units but as a network of stores, and ask how we optimize that network to better serve our customers. That's at the macro level, but we're also looking at it from a delivery routing perspective: which orders should be grouped together on routes to better serve the customers at the most micro level?
Data contracts and feature engineering
And that is often step number one when we're embarking on a new data science or AI/ML project: data contract negotiation. It starts with that feature engineering, data exploration, and initial model exploration phase, where our data scientists look at, hey, what is the most parsimonious set of features that we need to achieve the model accuracy that we're looking for?
Going through that experimentation, having other data scientists review it, making sure that we're not asking for too much. Then immediately after that, phase number two is taking that to our business partners. That includes our enterprise data warehouse team, our e-commerce team, or our in-store platform team, and saying, hey, this is the model we want to build and this is the data that we're going to need to service that model at inference. And then it becomes a negotiation; we literally call it data contract negotiation between the partners. There are gives and takes, because passing data, especially when you're talking about real-time inferencing, has a real cost to it. So taking those things into consideration and balancing that with what's needed to achieve a certain level of accuracy is a big part of the initial planning for any project we have.
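To make the idea concrete, a negotiated data contract can be thought of as a schema that both sides validate inference payloads against. This is a rough sketch, not Domino's actual tooling; the field names are invented for illustration:

```python
# Hypothetical sketch: encode a negotiated "data contract" as a schema and
# validate incoming inference payloads against it. Field names are invented;
# this is not Domino's actual contract format.

CONTRACT = {
    "store_id": int,             # which store the order belongs to
    "orders_on_make_line": int,  # load ahead of this order
    "drivers_on_duty": int,      # available delivery capacity
    "order_total": float,        # basket size
}

def validate_payload(payload: dict) -> list[str]:
    """Return a list of contract violations (empty list means the payload conforms)."""
    errors = []
    for field, expected_type in CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}, "
                          f"got {type(payload[field]).__name__}")
    # Reject extra fields too: every field passed at inference time has a real cost.
    for field in payload:
        if field not in CONTRACT:
            errors.append(f"unexpected field: {field}")
    return errors

good = {"store_id": 1042, "orders_on_make_line": 7, "drivers_on_duty": 3, "order_total": 24.99}
bad = {"store_id": "1042", "drivers_on_duty": 3}

print(validate_payload(good))  # []
print(validate_payload(bad))
```

The point of the "reject extra fields" branch is the give-and-take described above: the contract caps what can be passed, not just what must be.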
Supply chain forecasting and GPU-accelerated routing
I think about two different flavors of model that we implement to optimize our supply chain network. One is forecasting; that's key to all of this. When you're talking about optimizing supply chain, we can't fix anything that already happened, so we want to optimize into the future. That necessitates time series forecasting models, and we have a whole laundry basket of forecasting models that we implement, from the hourly demand signature level to the store level to the system level.
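As a minimal illustration of the kind of baseline such a forecasting stack might start from (the demand numbers are made up; this is not a production Domino's model), simple exponential smoothing over an hourly order-count series:

```python
# Minimal illustration (not a production model): simple exponential smoothing
# over an hourly order-count series. Later observations get geometrically more
# weight, controlled by alpha.

def exponential_smoothing(series, alpha=0.3):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    level = series[0]
    for obs in series[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

hourly_orders = [12, 15, 14, 40, 85, 90, 60, 20]  # invented demand signature
print(round(exponential_smoothing(hourly_orders), 1))
```

A real system would layer seasonality (the 5-to-8 p.m. peak, Super Bowl days) on top of a level like this, but the idea of an explicit, cheap baseline carries over.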
The next step is really combining that with more combinatorial optimization algorithms. So not even traditional AI or ML, but things like mixed integer programs or vehicle routing solvers. One of the cool things that we're doing in that area, again to address the earlier topic of scalability: traditional MIP or vehicle routing solvers are meant to solve logistics problems overnight, because orders come in, then the next day you need trucks, and they take the whole day to deliver those routes and come back to the depot.
Well, for us, we need to route delivery drivers who are taking something like 1,000 routes a minute when you look across our entire system. So we need to solve those vehicle routing problems really fast. We've partnered with NVIDIA to leverage their cuOpt solver, which is really cool: the first GPU-enabled vehicle routing solver. It's able to solve those problems really fast and get very, very close to the ideal solution.
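For a sense of what a vehicle routing heuristic does at its simplest, here is a toy nearest-neighbor sketch. This is not cuOpt's API and not Domino's code; the coordinates are invented, and real solvers handle far richer constraints (capacities, time windows) with much stronger search:

```python
# Toy illustration only: a naive nearest-neighbor heuristic for ordering the
# stops on a single delivery route from a store. GPU solvers like NVIDIA's
# cuOpt explore vastly better solutions under real constraints.
import math

def nearest_neighbor_route(store, stops):
    """Greedily visit the closest remaining stop; returns the visit order."""
    route, current, remaining = [], store, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route

store = (0.0, 0.0)
stops = [(5.0, 5.0), (1.0, 0.5), (2.0, 2.0)]
print(nearest_neighbor_route(store, stops))  # closest stop first
```

Even this greedy ordering shows why routing is combinatorial rather than a learned model: the output is a discrete sequence, not a prediction.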
Communicating data quality to leadership
Man, I see both sides of this coin, to be honest with you, which I think is at the root of what makes this a challenging problem. You have what I talked about earlier: garbage in, garbage out. The accuracy of your model, which I'll use as a proxy for value, assuming more accuracy is more value, depends on the data going in; whether it's messy, or even the scope of the data, where what you have is clean but you're missing key pieces of information. That impacts your accuracy, and it's therefore going to limit the amount of value, in some cases, that you can deliver.
On the other side of that coin, leadership has a point when they say, hey, we're not looking for the maximum amount of accuracy that we can get from your model; we believe there's value in a lower accuracy model. I think the answer to this question lies in establishing a baseline and really understanding what you're trying to beat. In most cases, that's a human decision, or a very basic, almost algebra-type calculation.
Take, for example, our delivery times. One simple way to look at it: if we didn't know how many drivers were available, how many people were working in our store, or anything about the order or all the orders ahead of it on the make line, we could still build a baseline model that just looks at the store's last 20 minutes. What was their average delivery time? Just use that. Knowing that that's the baseline helps you have that target. Then you can say, hey, we either do or don't have the data that we need to perform better than whatever that baseline is. I think that's maybe something that is more tangible to leadership.
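The baseline described above fits in a few lines. This is a hypothetical sketch with invented data, not Domino's code:

```python
# Sketch of the baseline: predict a store's next delivery time as the average
# of its deliveries from the last 20 minutes. Data is invented for illustration.

def baseline_estimate(deliveries, now, window_minutes=20):
    """deliveries: list of (timestamp_minutes, delivery_time_minutes) tuples."""
    recent = [t for ts, t in deliveries if now - window_minutes <= ts <= now]
    if not recent:
        return None  # no recent signal; a real system would fall back to a default
    return sum(recent) / len(recent)

history = [(100, 28.0), (105, 31.0), (112, 25.0), (60, 45.0)]  # last tuple is stale
print(baseline_estimate(history, now=115))  # averages only the three recent deliveries
```

Anything a richer model adds (driver counts, make-line load) has to beat this number to justify its data cost, which is the tangible framing for leadership.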
Capturing the value chain and ROI
I think one of the skills that I had to develop as I transitioned from individual contributor data scientist to senior data scientist to people leader in the data science space is being really, really good at capturing the value chain. Because in data science, you might not be doing the thing that immediately impacts the business value; you might be doing something two rungs down. I'll bring up the delivery time estimate as an example.
Why is giving a customer a better delivery time estimate valuable to Domino's? Well, if they get a better estimate, we're more likely to meet their expectations. If we meet their expectations, they're more likely to order more pizza. If they order more pizza, the company makes more money. That's a really stupid-simple example, but it can get really complex depending on what your business is. So being really good at illustrating and capturing that value chain, I think, is an important skill to develop. And it's kind of an art and a science.
I think one of the skills that I had to develop as I transitioned from individual contributor data scientist to senior data scientist to people leader in the data science space is being really, really good at capturing the value chain.
The other question was on the mix of internal and external data. I feel very fortunate at Domino's that we have what I think of as a really well curated data science mart, where we have access to a lot of information about our stores, our orders, and our customers. It's not perfect, but it's in a very good place in terms of accessibility for our data science team. So the amount that we have to look at third party data is not huge; it's definitely primarily Domino's own data. But there are certainly areas where you look for third party data. On the store siting team, we purchase geospatial demographic data, credit card data, weather data, and event-based data. So I would say predominantly Domino's, but certainly we bring in third party data in certain cases.
Points for Pie and favorite use cases
No, that's really cool, honestly. Certainly our research science team is involved in consumer research around new products. If you notice, Domino's doesn't take a limited-time-offer approach where we're putting out things and testing them like that. We really put a lot of effort into our product research, really ensuring that if we're going to release a new product, it's going to be accretive to the business and well received by customers. So we're not the brand to be putting out new products all the time, but when we do, they usually do really well because we do a lot of research ahead of time.
Now, this is a really cool, fun LLM case study idea: hey, if we looked at generative AI, what kind of crazy pizzas or products could we generate? That sounds like a fun one; it kind of sparked my thinking on that.
It was the Points for Pie program that launched in 2019. If you didn't see it or don't remember, we would give you Domino's loyalty points for any pizza that you ordered; it didn't have to be a Domino's pizza. The way you redeemed those points was you would go into our app and scan your pizza. So we built a pizza/not-pizza model to detect whether a pizza was present, and then you would get the loyalty points for that pizza. That's one of my favorites. It got really good PR. Not everything that we do as data scientists gets to end up in a marketing window that gets seen by everybody, so that was a really cool opportunity.
Model monitoring and drift detection
This is a big area of opportunity for our team. We do daily model accuracy assessment: whichever data science team owns a production model is responsible for monitoring that model's accuracy and flagging any drift that they see in our ability to make predictions.
And we do have some cool cases of detecting anomalies. For example, with our delivery time estimation model, this just happened recently: there was a new manager in one of the stores, and he switched everyone's position code to driver. I don't know if it was easier for him or whatever, but we started giving really, really fast delivery time estimates for that store, and they were wildly inaccurate. Customers were getting pizzas in like an hour, and we would say they were going to be there in five minutes. So we shut down the model for that store.
But where I want to take us, from the leadership perspective on this group, is really creating that single pane of glass. I almost think of the Star Trek command center kind of thing: hey, this is my control panel for all my machine learning models that are in production, where I can assess the general health of each service. Is it up and running? Is the throughput what we expected? But also, is the data coming in what we expected, and is the accuracy what we expected? So we can land there and say, hey, Model X is green and healthy; Model Y is yellow, we need to look into that; and Model Z is red, we need to shut that down right now. That's my dream, my vision. Making that a reality is on the roadmap.
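The green/yellow/red roll-up in that vision can be sketched as a simple status function. This is hypothetical; the metric names and thresholds are invented for illustration, not an actual monitoring spec:

```python
# Hypothetical sketch of the "single pane of glass" roll-up: collapse a model's
# operational metrics into a green/yellow/red status. Thresholds are invented.

def model_health(uptime_ok, throughput_ratio, accuracy_drop):
    """uptime_ok: service reachable; throughput_ratio: observed/expected
    requests; accuracy_drop: fraction of accuracy lost versus baseline."""
    if not uptime_ok or accuracy_drop > 0.10:
        return "red"     # shut it down / page someone now
    if throughput_ratio < 0.8 or accuracy_drop > 0.03:
        return "yellow"  # degraded; needs a look
    return "green"       # healthy

print(model_health(True, 0.95, 0.01))  # healthy model
print(model_health(True, 0.70, 0.02))  # throughput lagging
print(model_health(False, 1.0, 0.0))   # service down
```

A dashboard would evaluate something like this per model and surface only the non-green rows, which is what makes dozens of production models tractable at a glance.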
Languages, tools, and Posit Connect
So I want to say I am not a programming language purist; we try to be open and flexible about the languages that team members want to use. That being said, we've landed on Python as our production software language. The decision was made early on to standardize on one language for our production-level work, just from an efficiency standpoint: having reviewers review one language, not multiple languages, and not having to write two sets of documentation for everything that we're doing. It was a really practical decision to land on one language for the production stuff. But in terms of R&D, experimentation, and data wrangling, we are customers of Posit Connect, so even some web app type development we do in R.
Yeah, one of the really cool use cases around Shiny apps on our Posit Connect platform is in the area of model transparency and demystifying the black box of models a little bit. When we're working on a new model, our business partners have a lot of questions: how would the model react if X? How would it react if Y? How would the model react if we get a school lunch order and there are a hundred orders placed at one time? How would it react if we had a snowstorm? What kind of labor would it recommend?
So what we've done for some of our forecasting models is create a Posit Connect application that has that model on the backend and user inputs for the model's inputs. The user can go experiment, run these what-if scenarios, and see what the output would be. That's really helped us build comfort with our partners, and I think it's a cool use case for the platform.
Advice for PhD students entering industry
There are two things I wish I could tell myself in my first year out of my PhD and into industry. One, and I think it's related to another question we had today, is about model accuracy. When you're in academia, your objective is to find the best-fitting model in order to move the field forward. That's where your focus needs to lie: building the best model possible.
But when you get into industry, the best model isn't necessarily the one that is going to be the most valuable. Maybe there's a simpler model that is more parsimonious and quicker to get out into production and start generating value. There is a case for going with that simpler, quicker-to-deploy model rather than taking the time to do the full experimentation and achieve the most optimal accuracy possible. I had a few scenarios where I was just taking too long to deliver because I was in this academic mindset of needing to find the best solution, not the most valuable solution.
And the other area is, at least in my academic experience, intellectual debate becomes second nature. I really cherish the times in my PhD education where I got to challenge my classmates' thinking and push back, and what came out of that was usually mutually beneficial to everyone involved. You can't do that in industry. Not everybody is thinking that way, and not everyone is expecting to join a meeting and be part of an academic debate. That's one thing I learned early on: hey, why does no one want to talk to me? You can't challenge everyone's ideas all the time. So that's another lesson I wish I knew.
Engaging people on the emotional plane
So I think what's really important is to engage people on the emotional plane first. People have an emotional attachment to their work; it's where they spend most of their time. Most people, I would say, see the people that they work with more than they see their significant other sometimes. The work that you do is such a huge part of your identity, and the projects that you work on become a part of your identity. So when you're challenging someone's ideas, sometimes you're challenging their identity and the value they see in themselves. Really understanding that, and meeting people on the emotional plane before you dive into the logical plane, I think, is a good strategy that I've tried to utilize. I'm not always perfect at it, but that's my pro tip.
So when you're challenging someone's ideas, sometimes you're challenging their identity and the value they see in themselves. Really understanding that, and meeting people on the emotional plane before you dive into the logical plane, I think, is a good strategy that I've tried to utilize.
Career advice and leading people
One of the best pieces of advice that I ever got from my mentor was that you can't have everything all at once. You have yourself, your work, and your family, and when you're planning your career development, you need to look at it through the lens of your entire life and what it means in those other aspects of your life, not just what it means for you at work.
Maybe there's a time when, let's say, you have young kids and you need to pull back at work a little bit and put more of your energy into the family bucket. It might not be like that forever; you might decide after a certain amount of time to put more of yourself into your work bucket to advance your career. Looking at your career development through the lens of your entire life, and how it impacts not just you but the people around you, is the compass that I try to use when I'm thinking about my career.
I think the biggest challenge that I see, and I know I've experienced this and see it in others as well, is that if you're coming out of a technical background in your education, you don't get much exposure to how to lead people. When you go from a technical role to a people leader role in the technical space, that's often the gap: how do you motivate people? How do you have difficult conversations? How do you align people's skill sets with their interests, and is that even important? Those are the things that I often see missing. The technical stuff, I've found, is usually not the challenge.
IoT and in-store data
So we do have data points at that level. There are items in the store that employees engage with that end up as data points in our database. For example, when they clear the make line, meaning they made a pizza, before they put it in the oven, they have to click a button that says I'm done making this pizza. That's a really important data point for us because it helps us optimize points in the store.
So to expand on this: of course we want more of that. Never ask a data scientist whether they want more data, because the answer is always yes. We would love more data from IoT-type devices, but we face two challenges there. One is cost. It's not like IoT in a factory, where you might have 200 different steps that you can throw a device on and get a good pulse. We need to do that across 6,800 stores that are geographically dispersed literally everywhere in the country. So getting those IoT devices into the stores is a logistical challenge and a cost challenge, because those devices are usually not cheap. Secondly, our stores are franchise owned, so we would not only need to overcome those logistics and costs, we'd also need to convince franchisees that there's a benefit in doing that.
Well, thank you so much, Zack, for joining us today. I've loved this discussion, and thank you all for the great questions as well.
