Ralph Asher & Laura Darby Rose | R in Supply Chain Management | RStudio
Transcript
This transcript was generated automatically and may contain errors.
Welcome to the RStudio Enterprise Community Meetup. I'm Rachel Dempsey. I'll be your host for today's R in Supply Chain Meetup. I'm joined by my colleague and co-host for today, Curtis Kephart, if you want to say hello as well, Curtis.
And we're also joined by our two awesome speakers for today, Laura Darby Rose and Ralph Asher. If you just joined now, feel free to introduce yourselves through the chat window as well and say hello, maybe include where you're calling in from. But just to go through a brief agenda while we wait for a few others from the waiting room, we'll go through some introductions of the meetup, have two awesome presentations, Intro to Supply Chain Design with Ralph Asher and Forecasting Demand with R with Laura Rose.
We'd like to also use today's meetup to kick off a Supply Chain Working Group, too. And Curtis Kephart from our team will give us some more information on that and how to join if this is something that's interesting to you. And then lastly, we'll open it up for additional audience Q&A and just open discussion among everyone. I did mute everybody upon entry right now, but towards the end, we will unmute. But if you do have questions during any of the presentations, we do have a Slido link that you can use. So you can ask anonymously or include your name there as well.
And we'll share that in the Zoom window right now. That will really help us to organize questions, too, and you can upvote people's questions. And just a quick note that the recording will be shared up to the RStudio YouTube as well. So if you want to share it with anyone or go back and listen later, that will be shared up there. One other housekeeping note, if you want to turn on live transcription during the meeting, you can do so as well. And that's offered in the Zoom bar below.
For anyone who's joining for the first time today, I'd love to welcome you to this meetup group. This is a friendly and open meetup environment for teams to share the work that they're doing within their organizations, teach lessons learned, network with others, and really just allow us all to learn from each other. So thank you all for making this a welcoming community. I really want to create a space where everybody can participate and we can hear from everyone. I want to reiterate that we love to hear from everyone, no matter your level of experience in the field as well.
If you ever have suggestions or general feedback or want to speak at a future meetup, I'll share a few links in the chat window right after this as well. But with all that, I would love to introduce our first speaker today, Ralph Asher. Ralph is the founder of Data-Driven Supply Chain. And prior to founding Data-Driven Supply Chain, Ralph worked as an operations research scientist in corporate supply chain functions at Target and General Mills. And so with that, I will turn it over to you, Ralph.
Introduction to supply chain network design
All right. Thank you, Rachel, for the introduction. Hello to everyone who's on this webinar, wherever you may be. Just reading through the chat, sounds like we have a global presence already, and that's great. I really appreciate the opportunity to speak with you today. As mentioned, my name is Ralph Asher, and today I'll be talking about supply chain network design. What is it, first off, and how can we use open source languages like R and Python to design supply chains?
So a little about myself. Like many on this call, I have a math and science background. I have a bachelor's in physics and a master's in operations research. I started my career after college in the military. I was on active duty in the Marine Corps for several years, and I've been serving as a reserve Marine in a part-time capacity for about 10 years since then. After the military, I worked in supply chain design and simulation for eight years at General Mills and Target here in beautiful, cold Minneapolis, Minnesota. I left the corporate world last summer to found my own consulting firm. In addition to running my company and serving in the Marine Reserve, I'm also an adjunct at the University of Minnesota Business School, where I teach statistics. I've been an R user for 10 years, since 2012, and I've found the language to be an increasingly versatile tool for my work.
So supply chains, from the back office to the front page. The COVID-19 pandemic has really caused unprecedented disruption to global supply chains. Manufacturing shutdowns as well as labor shortages and a dramatic shift in consumer spending from services to goods have caused massive disruptions to the previously invisible flow of materials. This disruption can be seen in the container ships waiting off the coast of Southern California, awaiting a dock in Los Angeles and Long Beach. It can be seen in the reconsideration of reshoring American manufacturing capacity or the delays in building and repairing homes due to supply shortages. It can be seen in the inflation that you see at the store shelves and at the gas pump. And even Santa Claus can't fix these problems. And he has the most advanced logistics operation known to mankind, or Elfkind.
Instead, the task of ensuring supply chains continue to function, so we literally have food on our plate and fuel in our cars, that falls to the legions of hardworking supply chain management professionals. And as a data scientist, I'm really quite proud to be able to use my quantitative and technical skills to help these efforts.
So you may be asking yourself, how can analytics be used in the management of supply chains? Depending upon your background, this slide may be old news to you, but I find it's useful to discuss supply chain analytics in the framework of descriptive, predictive, and prescriptive analytics. Now, descriptive analytics, the biggest and most important part of the pyramid, includes business intelligence and diagnostic analytics of the supply chain. How many of my trucks arrived on time? How productive were my manufacturing plants and my warehouses? How are these metrics trending? And are any of them worrisome? Predictive analytics is next. How do we predict what's going to happen next in our supply chain? That includes things like sales forecasting, as we'll be discussing next, as well as forecasting costs for your supply chain inputs and some types of supply chain simulation. Finally, the top of the pyramid, which by volume should occupy the least amount of time, is prescriptive analytics. What should I do with my supply chain? This is my wheelhouse, and it includes things like supply chain network design, discussed today, as well as inventory optimization and other types of supply chain simulation.
Now, a key point for this audience, especially, is that open source technologies are really changing the landscape in supply chain management and supply chain analytics. When you look at each of these capabilities, you can mentally pick out which R or Python packages you could use for them. This really is a sea change from when I started in the supply chain analytics field about a decade ago, when the toolkit consisted of Excel for data analysis, SPSS for statistics, and commercial visualization software like Tableau.
So, what is supply chain network design? Put plainly, it's the strategic analysis of the locations and activities of the various elements of a firm's supply chain. For example, if I'm a manufacturer, where should I locate warehouses for my raw materials, the stuff I use to make other stuff, or my semi-finished goods, which are items that are only partway through the manufacturing process? Where should I put my manufacturing facilities, my plants? Where should I put my distribution centers, my warehouses for finished goods? And if I'm a retailer, how should I use my stores as a link in my supply chain? For example, many leading retailers, including my former employer, Target, use their stores as strategic assets for shipping e-commerce orders. Supply chain network design seeks to answer these big questions, all with the intent to understand the impacts on profitability, cost, service, and other key metrics, and with the goal to minimize or maximize one or two key metrics, usually cost or profitability.
Today, I'll be focusing on a relatively simple example of supply chain network design, a distribution network with two echelons, the distribution centers, or warehouses, and customers that are serviced by those warehouses. In designing a supply chain, we ask ourselves, given a customer base of demand aggregated to a city level, one, where should we open distribution centers to satisfy that demand? And two, what customers should be serviced by or aligned to each distribution center? While most real life supply chain design problems are much, much more complex than this example, I hope it will be informative on how practitioners approach these kinds of problems and how we can use open source technology for them.
Optimization modeling for supply chain design
So how do we do this? How can we answer these questions? We can answer them with optimization modeling. Optimization modeling is a mathematical technique that allows a practitioner to optimize a set of interconnected decisions where the decisions are constrained by real world limitations. Optimization solvers take the decision options formulated as a mathematical model and return the set of decisions that optimizes, minimizes, or maximizes an equation that includes the decisions as inputs. This equation is formally known as an objective function.
So in our case, the decisions we need to make are one, how many distribution centers should open and where should we locate them? And two, what customers should align to each distribution center? You can see that these decisions, they're interconnected because I can't align a customer to a distribution center in a certain city unless I also open a distribution center in that city. And the limitations on our decisions or our constraints, they're one, that all customer demand must be met. We may have a minimum or maximum number of distribution centers. We may really only realistically open distribution centers in certain locations. We may have a maximum volume per distribution center. And for certain reasons, we may put a maximum distance between a customer and its servicing distribution center.
So we'll shortly transition to the Shiny app I built for the webinar. But first I'll introduce this scenario. You are a supply chain data scientist at Heartland Widgets Incorporated. Your company is the market leader in widget sales across an eight-state area of the American Midwest. The pandemic-induced shift in consumer demand from services to goods, as well as stay-at-home orders, has led to a boom in demand for your company's flagship product, the at-home widget. Faced with record demand, the Director of Supply Chain Network Planning has asked you to, quote, reimagine the company's distribution network. You are to recommend where to open Heartland's distribution centers, as well as the customer-to-distribution-center alignment, to achieve a minimal cost.
Heartland Widgets has customers in 117 Midwestern cities, spread from the Dakotas in the west, all the way to my home state of Indiana in the east. Each city's demand is proportional to its population. So, for example, Chicago, Illinois, which is a big city, you may have heard of it, has more demand than Fargo, North Dakota. Heartland is considering 20 cities for potential distribution center sites. Because each distribution center will cost $1 million to open, it may not be cost effective to open distribution centers in all 20 cities, but we're keeping our options open.
So, let's get a little bit more technical here. The decisions we need to make for Heartland Widgets can be represented by variables. When we ask, what customer should we align to which distribution center, this is equivalent to asking a set of yes-no decisions that must be made in concert. For each customer city I, should I service it by a distribution center in city J, yes or no? In making our decisions, we're limited to having exactly one yes per customer city. It has to be serviced by exactly one distribution center. Relatedly, we need to make decisions about how many distribution centers to open and where to locate them. This is a series of if-then decisions related to our alignment decisions. If at least one customer is aligned to a distribution center in, for example, Omaha, Nebraska, then we must open a distribution center in Omaha.
Mathematical optimization and, more specifically, the techniques of linear and integer programming are ideal for making these kinds of interrelated decisions. When we use mathematical optimization, including for supply chain design, we're trying to minimize or maximize some equation that is a function of our decisions. In this example, Heartland Widgets wants to design the supply chain with the lowest total cost. The supply chain costs in this example are, one, the opening costs for distribution centers, which are priced at $1 million each, two, the handling costs at the distribution centers, which are one cent per unit, per widget, and three, the transportation costs from distribution centers to customers, which are five cents per unit-mile.
Now, a unit mile is moving one unit one mile. So, for example, if you have two widgets, two units, and you move them 100 miles, that's two times 100, 200 unit miles. At five cents each, that's $10 to move two units 100 miles. You add up these three components, cost components, and that leads to the total supply chain costs that we're trying to minimize.
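The unit-mile arithmetic above is easy to check in R. This is just an illustrative sketch using the rates quoted in the talk; the variable and function names are my own, not the app's:

```r
# Cost rates quoted in the talk
open_cost_per_dc   <- 1e6    # $1,000,000 to open each distribution center
handling_per_unit  <- 0.01   # $0.01 per widget handled
rate_per_unit_mile <- 0.05   # $0.05 to move one unit one mile

# Two widgets moved 100 miles = 200 unit-miles = $10 of transportation
units <- 2
miles <- 100
unit_miles     <- units * miles                    # 200 unit-miles
transport_cost <- unit_miles * rate_per_unit_mile  # 200 * $0.05 = $10

# The three components sum to the total cost we're minimizing
total_cost <- function(n_dcs, total_units, total_unit_miles) {
  n_dcs * open_cost_per_dc +
    total_units * handling_per_unit +
    total_unit_miles * rate_per_unit_mile
}
```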
Now, one of the tenets of supply chain design is that these cost components can be inversely related to each other. For example, the more distribution centers you have, the closer you are, on average, to your customers. And if you're closer to your customers, on average, you're going to spend less on getting the product to them on that transportation cost because it's just closer. However, the more distribution centers you have, the more you spend to open them in the first place. So, there's an inverse relationship between the distribution center opening cost and the transportation cost to get the product to your customer. Optimization modeling helps us figure out these relationships and find the optimal solution that takes into account the relationships between the costs.
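The interconnected decisions and constraints described above form a classic facility-location mixed-integer program. Here is a minimal sketch with the OMPR package, using tiny made-up data in place of the app's 117 customers and 20 candidate sites; the solver shown is GLPK, though any ROI plugin (including SYMPHONY, used in the talk) would work:

```r
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)  # swap in ROI.plugin.symphony to match the talk

n <- 4   # customer cities (toy size; the app uses 117)
m <- 3   # candidate DC cities (the app uses 20)
set.seed(42)
demand    <- sample(100:500, n)                   # units demanded per customer city
miles     <- matrix(runif(n * m, 50, 400), n, m)  # customer-to-DC distances
open_cost <- rep(1e6, m)                          # $1M to open each DC
transport <- 0.05 * miles * demand                # $0.05 per unit-mile * miles * units
handling  <- 0.01 * demand                        # $0.01 per unit handled

model <- MIPModel() |>
  add_variable(x[i, j], i = 1:n, j = 1:m, type = "binary") |>  # serve customer i from DC j?
  add_variable(y[j], j = 1:m, type = "binary") |>              # open a DC in city j?
  set_objective(
    sum_expr(open_cost[j] * y[j], j = 1:m) +
      sum_expr((transport[i, j] + handling[i]) * x[i, j], i = 1:n, j = 1:m),
    sense = "min") |>
  add_constraint(sum_expr(x[i, j], j = 1:m) == 1, i = 1:n) |>  # exactly one DC per customer
  add_constraint(x[i, j] <= y[j], i = 1:n, j = 1:m)            # only align to open DCs

result <- solve_model(model, with_ROI(solver = "glpk"))
get_solution(result, y[j])                         # which candidate cities get a DC
subset(get_solution(result, x[i, j]), value > 0.9) # the customer-to-DC alignment
```

The `x[i, j] <= y[j]` constraint is the if-then linkage described earlier: a customer can only be aligned to city j if a distribution center is actually opened there.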
R packages and the Shiny app demo
All right. So, because this is an R Meetup and I'm going to demo a Shiny app, I want to go over the packages I used. This is a tidyverse-centric solution for data prep, but I also use DT (DataTables) and vroom. shinydashboard is used for visualization, and I use rhandsontable to allow for interactive adjustment of input tables. Leaflet is used as the mapping solution, and the core of the optimization modeling rests with the OMPR package and the open-source SYMPHONY optimization solver. As an aside, the COIN-OR Foundation, the organization that maintains SYMPHONY and other open-source optimization tools like PuLP and Pyomo for you Python users, needs financial help, so please consider donating via that GitHub link.
All right. From here, I'm going to work from the app, but I have slides for backup in case of technical difficulties. I host my app on shinyapps.io. You'll see the address, but to avoid having technical issues, I'd ask that, you know, nobody try to load it while I'm giving the demo.
All right. So, this is a pretty sparse app because it's a fairly simple example. And first up is uploading our data. So, we really only have two sets of input data, our customer information and our distribution center information. All right. So, now I'm going to inspect this uploaded data and talk a little bit through it. First, our customer data. So, our customers are, as mentioned, 117 Midwestern cities. And for each of these, we have city name, state, latitude and longitude, and their demand, which is proportional to their population. And then we have this column called maximum miles to distribution center.
Finally, we have this column called include. And this is a nice thing about using our hands-on table because you can actually dynamically update the table that you use later on in your app. So, everything has a checkmark here. If I unchecked any of these, they would not be used in the optimization model. So, I'm going to display the customer map. You see these 117 cities across the Midwest.
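The "include" checkbox pattern described here, where editing the table dynamically changes what the optimization sees, can be sketched with rhandsontable in a Shiny app. The data, input IDs, and reactive names below are hypothetical stand-ins, not the app's actual code:

```r
library(shiny)
library(rhandsontable)

# Toy customer table; the logical "include" column renders as checkboxes
customers <- data.frame(city    = c("Chicago", "Fargo", "Omaha"),
                        demand  = c(9000, 400, 1500),
                        include = TRUE)

ui <- fluidPage(rHandsontableOutput("cust_tbl"))

server <- function(input, output, session) {
  output$cust_tbl <- renderRHandsontable(rhandsontable(customers))

  # Read the (possibly edited) table back into R before running the model
  active_customers <- reactive({
    req(input$cust_tbl)
    df <- hot_to_r(input$cust_tbl)
    df[df$include, ]   # unchecked rows drop out of the optimization
  })
}

# shinyApp(ui, server)  # launch interactively
```

`hot_to_r()` is the key piece: it converts the browser-side edits back into a data frame, so unchecking a row really does remove that city from the model input.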
Distribution centers, similarly: we have 20 candidate cities across the Midwest. And for each potential distribution center, we have a latitude, a longitude, and a capacity, which is the maximum volume that can be aligned to it. Each of them costs $1 million to open, and the handling cost is $0.01 per unit, as mentioned earlier. And this "allow" checkmark essentially says, do I want to consider this location?
So, finally, we have some inputs here before we run the optimization. So, I mentioned earlier that you can put a constraint around your minimum and maximum number of distribution centers. And so, I'm going to start off by saying I'm going to have a minimum of one – I need a warehouse somewhere – and a maximum of 20. Because there's only 20 options here, having 20 as your maximum is kind of like saying I'm willing to take any of them.
All right. So, when I click Run Optimization, what's happening now is my app is taking the customer information, the distribution center information, and the maximum and minimum number of distribution centers, and forming an optimization model on the back end, on the Shiny app server. It then finds the optimal solution, the minimal cost, and returns the outputs.
All right. So, let's look and see what those optimal solutions are. So, to service all of our customer demand for Heartland widgets, the optimal lowest cost solution is four distribution centers, one in Omaha, Nebraska, one here in Minneapolis, one in Chicago, and one in my old hometown, Indianapolis. You can see the cities that are serviced by each distribution center are represented by their color, you know, red for Omaha and blue for Indianapolis, green for Chicago, yellow for Minneapolis. And they're kind of geographically separated, as we would kind of expect, because you generally want to have your customers closer to your distribution center.
This table is an alignment table, which is just the tabular version of the map that we're seeing above. The farthest distance from a customer to its servicing distribution center is 415 miles, and that's from Rapid City, South Dakota, over here, all the way to Omaha. Take note of that, because we'll actually come back to that in a little bit.
So, what is the optimal cost here? The minimal total supply chain cost in this scenario is $7.5 million, over here. That includes $4 million in distribution center opening costs ($1 million per distribution center, and four of them), $13,000 for handling at the distribution centers, and about $3.5 million in transportation costs. So, our total of $4 million plus $13,000 plus $3.5 million equals $7.5 million, and that is our absolute lowest total cost.
Understanding constraints and trade-offs
Now, I'm going to make a small detour before we get into another scenario to explain more what we mean by constraints in supply chain network design. In optimization modeling, constraints are mathematical limitations on what set of decisions you can make that correspond to real life limitations. The more constraints you have in your model and in real life, the worse your objective function becomes. Optimization modeling, including supply chain network design, is an exercise in understanding the trade-offs involved in making constraints more or less restrictive or tighter, which is the term of art. In designing a supply chain, the tighter the constraints, the higher the total cost becomes. Supply chain design is partly an art around understanding the additional expenses associated with tighter constraints.
Next, we'll talk about what a couple of those constraints may look like. So, first, we see that the lowest total supply chain cost occurs when we have four distribution centers in these cities up here. So, what would happen if we actually force our model to have exactly three distribution centers? How will that change our cost? How will that affect what our network looks like? This is actually quite straightforward to see. All we need to do is go back to our minimum and maximum number of distribution center constraints. So, I'm going to change the minimum to three and the maximum to three.
And now we see that instead of four distribution centers, we have the required three. And it's the same thing, except now Indianapolis is gone. So, let's look at these outputs. Our DC opening costs are now $3 million because we only have three distribution centers and they're $1 million apiece. Our handling cost is the same because it's the same for all distribution centers and we have the same amount of volume. But our transportation cost, which was $3.5 million before, is now $4.6 million. So, while our distribution center opening costs went down $1 million, our transportation cost went up $1.1 million. And so, our total cost went from $7.5 million to $7.6 million.
Next, we'll see what happens when we enforce a maximum distance constraint between our distribution centers and our customers. This is often the case with food supply chains because food is perishable and you don't want your food outside of a climate controlled warehouse for too long before it gets to a store. And so, oftentimes when designing food supply chains, you would put in a distance constraint.
So, what would happen if we did force a 400-mile constraint? We still have four distribution centers as the optimal setting, but instead of a warehouse in Omaha, Nebraska, we would have it in Sioux Falls, South Dakota. Why? Because that is the distribution center that allows Rapid City, South Dakota, over here, which is a very remote customer, to be within 400 miles. And the impact on the cost structure is that instead of $3.5 million in transportation costs, we now have $4 million in transportation costs. So, by adding that additional constraint of 400 miles from your distribution center to your customers, your total supply chain cost rose from $7.5 million to $8 million.
Let's say we actually wanted to see the overall total cost at different number of distribution centers. I went and ran this model, forcing it to have exactly one distribution center, exactly two distribution centers, three distribution centers, all the way to nine. I then recorded the cost components between the handling cost, the opening cost, and the transportation cost. The handling cost in green is flat, no matter how many distribution centers you have, because it's the same volume, just split in different ways. This blue line is our distribution center opening cost, which is linear, because with the more distribution centers, you're spending another million dollars. Your transportation cost in red goes down, because as I mentioned, the more distribution centers you have, the closer you are on average to your customers, so your transportation cost goes down.
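The curve described here comes from re-solving the model with the minimum and maximum DC counts pinned to the same value k and recording the cost components each time. The sketch below uses stylized numbers (flat handling, $1M per DC, transport decaying roughly with 1/k) purely to illustrate the shape of the trade-off; these are not the app's actual outputs:

```r
# Stylized cost components by number of distribution centers (illustration only)
k         <- 1:9
opening   <- 1e6 * k          # linear: another $1M per DC opened
handling  <- rep(13e3, 9)     # flat: same total volume, just split differently
transport <- 14e6 / k         # falls as DCs sit closer to customers, on average
total     <- opening + handling + transport

data.frame(k, opening, handling, transport, total)
k[which.min(total)]           # the cost-minimizing DC count (4 with these numbers)
```

With these made-up rates the minimum lands at four DCs, echoing the talk's result, and the totals for nearby k values are close together, which is exactly the "flat bottom of the parabola" point made in the next paragraph.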
Your total supply chain cost, which is this purple line, is kind of a parabola. The minimal cost occurs at four distribution centers, as we saw before, but between, for example, two and five, there's not a tremendous amount of difference. If I was doing an actual study for a client, I'd be presenting something like this and saying, actually, while four distribution centers is your minimal cost network, between two and five, there's not much of a difference in cost, and there's pros and cons of each of those.
All right, well, that's most of the content, folks, so I really appreciate your time today. I just wanted to wrap up by saying and emphasizing that supply chain management really was a behind-the-scenes corporate function until the beginning of the COVID pandemic, but now it's front-page news. And analytics powered by open-source programming languages, it really is changing the nature of this field. Supply chain network design, it's kind of an obscure part of supply chain analytics, but I find it to be a powerful tool in devising and evaluating the organization's supply chain strategy, and hopefully today you just got a taste of what that might look like.
And thank you for listening. If you'd like to know more, please reach out.
Q&A with Ralph Asher
Awesome. Thank you so much, Ralph. And I just want to reiterate to everyone, I see a bunch of questions coming in through the Zoom chat as well. If you could use the Slido, that would be super helpful, so we can kind of help see the questions versus comments in the chat.
But Ralph, I'd love to ask a few questions that came in on Slido now, and then we can hold a few others until the end as well. But one I see was from Jerry. That is, how large of a problem can the open-source solvers handle from your experience? How often do you encounter models that would require a commercial tool?
Yeah, I mean, I can speak in generalities. I found that it's more just a question of what size of machine you put it on. I've run some pretty large optimization models on open-source solvers. If you have the right hardware, you're either paying for it in your time or paying for it in your dollars, and so if you're willing to wait an hour longer than you would otherwise, it's well worth it. Both in my consulting and in my corporate work, I've not encountered a problem so large that an open-source solver couldn't handle it, like it just wouldn't compute.
Yeah, yeah. And so you can kind of shape your objective function to reflect whatever combined metric you may care about. So in this case, the objective function that you saw was purely a cost function. But I've in the past done things like, well, here's my cost. And here's, for example, a customer service metric. That is also a function of my decision variables. And you can weight them. Now, when you see the output, it's a number, it doesn't really reflect like, you know, dollar bills on the table or anything like that. But it can kind of, and the more heavily you weight cost versus customer success or customer satisfaction, that combined metric can lean towards one way or the other.
Forecasting demand with R — Laura Darby Rose
But with that, I'd love to turn over to our second speaker, Laura Darby Rose. Laura is Manager of Demand at Mallinckrodt Pharmaceuticals, where she's responsible for statistical forecasting, forecast visualization, and forecast accuracy measurement. With that, I'll turn it over to you, Laura.
Thank you, Rachel. Okay, so I'm going to talk about our experience at Mallinckrodt with forecasting demand with R. So we went through a project about a year ago to replace a software-as-a-service product, I will not name that company, my boss has advised me not to name it, with an open-source R solution.
So a little bit about my background. I'm a demand manager, as Rachel mentioned. I work in the supply chain department. And in Mallinckrodt, we have a couple of different divisions. So I work for the specialty generics division. And we make basically generics that you might be prescribed when you go to the doctor or your nurse practitioner or whoever you might go to see. So a lot of pain management, acetaminophen, other things as well. So my main job responsibilities are stat forecasting, and then measuring forecast accuracy.
And I wanted to add a caveat here that I only had about five months of experience regularly programming in R when I started the project. And I share that with you all to hopefully encourage people who are just getting started in R or Python and, you know, thinking about doing a project that will really involve R or Python. If you know enough to write a script, you probably know enough to do a project that will really impact your company.
Okay, so how the project began. So imagine it's spring 2020. Actually, you don't have to imagine, right? It's probably all still fresh in our minds. And you're staying home, being a good citizen, and you're a little bored. So what did I do? I created a little Shiny app, a simple Shiny app for time series forecasting. And shout out to the R-Ladies organization; I actually got the inspiration for the app from an R-Ladies St. Louis meetup. So, you know, I'm working remotely and I want to show my manager that, hey, I'm being productive. So in one of our calls, I showed him the app and he was impressed with it. And he thought, well, maybe we can replace the SaaS that we were currently using for stat forecasting with this app or something like it.
So the SaaS that we were using cost about $200,000 per year, which for my company is not small change. We were definitely in cost-savings mode. It was also generating negative demand forecasts. What do you even do with that, right? We're not going to sell a negative number. The IT support for this company could not figure out what was going on. So it was costing us a lot of time and effort to modify these forecasts.
So after my manager's initial approval, I collected some demand data. We chose about 30 SKUs that we thought were representative of the business. And I did a rolling forecast accuracy analysis. We wanted to have an idea, before we proceeded further, of what kind of accuracy we could expect if we switched to the R forecast package, and specifically the ets function, which does exponential smoothing, commonly used in univariate supply chain forecasting, and also the ARIMA function.
So the ets function was actually comparable to what was used, or what we had enabled, in the SaaS, where various exponential smoothing models were selected by the demand forecaster. So I ran this rolling forecast accuracy analysis. And to give you an idea of how perhaps not efficient I was at the time, and I'm sure I'll be saying the same thing about myself in a couple of years, I had three for loops that I used to do this. And anybody who's programmed in R for a while knows that's generally not the best strategy. But it did work. And we were satisfied that, hey, just setting the ets algorithm to best fit, minimizing the AICc, was comparable to what we were getting in terms of forecast accuracy from our SaaS.
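A rolling-origin forecast accuracy analysis with the forecast package's ets function might look roughly like this. It uses a built-in series for illustration rather than the actual SKU demand data, and the horizon and origins are arbitrary choices:

```r
library(forecast)

y <- AirPassengers
h <- 3                                   # forecast horizon at each origin
origins <- seq(100, length(y) - h, by = h)

errs <- sapply(origins, function(o) {
  train <- subset(y, end = o)            # history up to the rolling origin
  test  <- subset(y, start = o + 1, end = o + h)
  fit   <- ets(train, model = "ZZZ", ic = "aicc")  # auto-select by AICc ("best fit")
  fc    <- forecast(fit, h = h)
  mean(abs(fc$mean - test) / test)       # MAPE-style error for this origin
})

mean(errs)   # average out-of-sample error across all rolling origins
```

The same loop with `auto.arima(train)` in place of `ets(...)` gives the ARIMA comparison, which is essentially the evaluation described in this section, just without the three nested for loops.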
Also, the benefit was, hey, we didn't need manual intervention from the demand analyst or manager. So we made the decision to move forward with this project. And then we had to get the buy-in from the IT group at Mallinckrodt. It wasn't difficult to get buy-in for a cost-savings project. They were also frustrated with the lack of support that we were getting from the SaaS company, and with the time we were all spending on essentially useless tasks.
Setting up the enterprise R environment
So up until then, I had written some other scripts that I used for various data wrangling tasks, the kind of stuff you do in supply chain analysis, right, or demand planning, which is my specific area of expertise. And we didn't want to just have me running scripts on RStudio Desktop, as useful as the free version of RStudio is. We wanted more enterprise-capable options. So we looked into RStudio Server Pro, and I believe the most recent version of that is called Workbench. Then we also contracted with Lander Analytics, and they helped with server setup and maintenance and then application installation. So it is a Linux server, for those of you who might be more Windows people. And at Mallinckrodt, we're more of a Windows shop, is what I've been told by our IT group, so definitely needed a little bit more guidance in setting that up. And we had both a production and dev server that were supported with them.
Okay, so we're moving forward in this project; we made the decision to go forward. The IT analyst who was responsible for maintaining the data used for the SaaS forecasting needed to collaborate with me to replace the data management. So not the forecasting, but those of you who might have used a SaaS at your company for forecasting or other things know that it can sometimes perform functions beyond the main one you purchased it for. So we needed to build in-house solutions.
So we had tables for both cleansed history and non-cleansed history. Cleansed meaning removing outliers; sometimes you might decide not to include all history for consideration in the stat forecast model, because there might have been a structural change, something like that. So we wanted to move away from Excel spreadsheets. There is definitely a time and a place for spreadsheets, but we just didn't feel that was in the best interest of the business. So we used the DBI package, which probably many of you are familiar with, and then the odbc package. And just using some basic SQL, we got those set up, and it was very easy to connect to the AS400 system and pull in the tables that I would need to estimate models and calculate forecasts.
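A minimal sketch of that DBI pattern; here an in-memory SQLite database stands in for the AS400 connection (which in production would be something like `dbConnect(odbc::odbc(), dsn = "...")`), and the table and column names are hypothetical:

```r
library(DBI)

# Stand-in for the AS400; in production this would be an odbc connection
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical cleansed-history table
dbWriteTable(con, "cleansed_hist",
             data.frame(item   = c("SKU1", "SKU2"),
                        period = c("2021-01", "2021-01"),
                        qty    = c(120, 85)))

# Basic SQL pull, as described above
hist <- dbGetQuery(con, "SELECT item, period, qty FROM cleansed_hist")
dbDisconnect(con)
```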
We also changed our outlier cleansing process, as I alluded to in the second point, because we had previously cleansed the history directly in the SaaS application. So we just moved to a pretty simple process with an Excel upload: if you needed to remove an outlier, you'd upload an Excel file to our AS400 system. We're actually changing that process now, and I'll get to that later. But it worked in the initial stages, and sometimes you have to go with that.
And then we also needed a table to store the stat forecasts generated from the R script. So as I said before, no Excel files involved; we were going to write that to a table. And then we also needed another table for storing historical forecasts, and that is to measure forecast accuracy.
Okay, so we were able to use RStudio Server Pro for other uses besides stat forecasting. As I alluded to earlier in my presentation, I had written other R scripts to do a lot of data analysis and wrangling tasks, the kind of things you don't want to spend a lot of time on if you can avoid it. Measuring weighted MAPE, which is what we use to measure our forecast accuracy. Checking the commercial forecast upload versus the download. I am responsible for loading the commercial partners' forecast to the ERP system, and depending on how things go, we might have phantom forecasts hanging out in the ERP system.
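As an aside, a volume-weighted MAPE can be computed in a few lines of base R. This is one common definition and an assumption on my part, since the talk doesn't spell out the exact weighting used:

```r
# Volume-weighted MAPE: weighting each SKU's absolute percent error by
# its actual volume simplifies to total absolute error over total actuals,
# so large SKUs dominate the metric
weighted_mape <- function(actual, forecast) {
  sum(abs(actual - forecast)) / sum(actual)
}

weighted_mape(actual = c(100, 400), forecast = c(90, 380))
# 0.06, i.e. a 6% weighted error
```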
Also, last year, well, I said last year, I guess it's almost two years ago now, we had a lot of orders for acetaminophen with the start of the pandemic, and our product director needed some help in understanding how to allocate orders versus forecasts. So it was very simple to set these jobs up to run on the server using cronR, which is, as you would expect from the name, an R package that interfaces with cron.
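A sketch of scheduling a job with cronR; the script here is a throwaway placeholder, and the frequency, id, and description are illustrative:

```r
library(cronR)

# Placeholder script; in practice this would be the forecasting
# script sitting on the server
demo_script <- tempfile(fileext = ".R")
writeLines('cat("forecast run\\n")', demo_script)

# Build the shell command cron will execute
cmd <- cron_rscript(demo_script)

# Register a monthly job; guarded so sourcing this sketch does not
# actually touch the server's crontab
if (interactive()) {
  cron_add(cmd, frequency = "monthly", id = "stat_forecast",
           description = "Monthly statistical forecast run")
}
```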
So the scripts save a lot of time. With all these jobs set up, we could compile a lot of data reports with no intervention, or maybe just a quick check from the demand managers. And the real test was when I was out on short-term disability this past summer. The jobs ran well, and I saved my coworker a lot of time in having to cover my responsibilities as well as his own.
My coworker left the company in November, so I've since been taking over his former duties. And I've been writing various R scripts to hopefully improve some of the processes. Scheduling these jobs on the RStudio Server Pro interface has really made it possible for me to do my job as well as his previous duties.
Project outcomes and future improvements
Okay, to summarize the project, you're probably wondering how long all this took. It took about 11 months. And I will say this: you could probably do this kind of project, where you totally replace the software you've been using, in less time. We happened to have a lot of time based on when the contract with the SaaS was going to be up, so we had several months in early 2021 where we were running both concurrently.
There are definitely improvements to be made to the forecasting processes, and I can tell you that as well as anybody at the company; that's what I deal with day in, day out. But the initial goal of replacing the SaaS with a solution at least as effective was completed, so we consider the initial stage to be a success. We measure on lag-three forecast accuracy, which, the way we count it, is essentially more like lag four. Accuracy has been about the same, depending on the month a little bit worse, a few percentage points. But we did meet our goal for 2021. So my boss is happy, and I'm happy.
And the cost savings is definitely worth a slight trade-off in accuracy. We are still using pretty simple models, like ARIMA and exponential smoothing. So we hope that as we introduce new models, we will improve accuracy over time.
And then I alluded to this earlier in the presentation. I started out with the forecast package in summer 2020. And then I found out, I guess it was probably old news, but it was new news to me at the time, about the tidyverts suite of packages, which is kind of the replacement for the forecast package. So I switched the code I had written over to that, and it took a little while to make sure I understood everything. But it's a great suite of packages that includes fable, feasts, tsibble, and tsibbledata, I think. I have found those to be really useful for automatic forecasting, if you have a lot of SKUs and you don't have time to go in and tweak parameters, figure out the best ARIMA model or the best exponential smoothing model, for instance, for univariate forecasting. The forecasts these models produce for supply chain are stable; they don't require a lot of intervention, and you're not going to get anything unbelievable or crazy.
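A hedged sketch of the automatic fitting fable enables; the data here is simulated, and in practice the tsibble would carry one key per SKU so `model()` fits every series in a single call:

```r
library(fable)
library(tsibble)

# Simulated monthly demand; real input would come from the AS400 tables
set.seed(1)
demand <- tsibble(
  month = yearmonth("2018 Jan") + 0:47,
  qty   = 100 + rnorm(48, sd = 10),
  index = month
)

# Fit ETS and ARIMA automatically, then forecast three months ahead
fc <- demand |>
  model(ets = ETS(qty), arima = ARIMA(qty)) |>
  forecast(h = "3 months")
```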
Okay, so I mentioned the Shiny app earlier. The impetus of the project started with a simple time series forecasting app. We made the decision to run the forecast script as a scheduled monthly job instead of an app. Part of the reason for that was it seemed simpler at the time, a time savings. Also, it worked out because we weren't really ready to go with RStudio Connect at the time, which we would have needed to host the app in an enterprise-capable way.
After that, I developed another Shiny app, kind of based on the first one, which uses a number of different packages: the tidyverts packages, of course, and DT, which I think Ralph mentioned in his app, to edit history and change parameters. It basically interfaces with our AS400 system via SQL, so through the app you can interactively edit the data in our company's systems. And this allows me, especially for those high-volume, really important SKUs, to go in and pick out the specifications I think are optimal.
So, future state. It's always good to be thinking about continuous improvement, right? I'm working on another Shiny app to visualize our generics data, that's our finished dosage data, with a pivot table, probably using the pivottabler package; I'm not sure yet, I might look into a few other packages. And then we'd look at either monthly or weekly doses or bottles, depending on what the user wants. I currently use a very slow tool on a legacy system I will not name. We use it for creating visualizations for the demand review slide deck, where we look at our customer demand, compare that to our forecast, see what's changed, look at our financials, that kind of thing.
And then another project we started last month is looking at the commercial team's forecast. They currently don't use much statistical forecasting, but they are interested in looking at their end-purchaser demand and how we can roll that up under distributor customers. Mallinckrodt is a manufacturer, so we sell to some big-time distributor customers you may have heard of in the news. And there are certain customers that have pricing contracts. Those distributor customers then sell to retail pharmacies and the like. So we're hoping to get a better picture of our demand, to tie that out and build a better forecast for them in that way.
And then in terms of model improvement, those of you who are familiar with the tidyverts packages probably know that you can do hierarchical forecasting as well. We explored that a little in the initial stages of the project, when we were trying to understand how much we wanted to experiment with different kinds of models. My boss thought it wasn't a good idea at the time, but now that we've completed the initial stage, we'll probably look into it a little more. And then also machine learning methods for time series. We would need to do that on weekly data; we don't have enough monthly data to really make it worth it. But modeltime is a good package, so that is something we are probably looking into exploring this year.
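For reference, a sketch of the hierarchical route in the tidyverts packages; the two-SKU hierarchy and all names here are hypothetical, and real data would have many SKUs, possibly nested under product families or divisions:

```r
library(fable)
library(tsibble)

# Hypothetical two-SKU hierarchy with simulated monthly demand
set.seed(1)
demand <- tsibble(
  month = rep(yearmonth("2018 Jan") + 0:35, times = 2),
  sku   = rep(c("SKU1", "SKU2"), each = 36),
  qty   = 100 + rnorm(72, sd = 10),
  index = month, key = sku
)

fc <- demand |>
  aggregate_key(sku, qty = sum(qty)) |>    # total plus per-SKU series
  model(ets = ETS(qty)) |>
  reconcile(coherent = min_trace(ets)) |>  # reconciled, coherent forecasts
  forecast(h = "3 months")
```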
So to summarize. A lot of what's on this slide is common sense, but I think it's worth reiterating. If you're interested in doing a project where you replace a SaaS with an R solution, start sooner rather than later. Something we learned, sometimes the hard way, is that the SaaS will perform other functions besides forecasting, so you need to come up with solutions for data storage and management. Those solutions might not be R-specific, but they need to be compatible with whatever you're doing in R for forecasting. Also, project management 101, I guess: make sure you have an estimate of the time involved on your part, as well as that of your collaborators, because you don't want to come up to the end of the project and find out they don't have enough time to give you to complete it.
And then, I wish somebody had told me this ahead of time: learning some project management skills is helpful. Probably a lot of us on this call have good technical skills, but when I was trying to map out this process, which is my subsequent point, I really had no idea what I was doing. It got done, but it could have been a lot more efficient.
Okay, and I feel like I might be a little bit of a broken record here, but you don't want to realize you haven't accounted for something you previously relied on the SaaS to do when you've almost completed the project. And as you'd probably expect, we learned that the hard way. My coworker used the SaaS for something, and I did not know exactly what he was doing. I think there was a miscommunication between him and perhaps an IT analyst, not to throw blame, but we realized, oh, we never accounted for this, and now we're practically finished with the project. We needed to make sure we had something that would allow him to do what he did with the SaaS. So you don't want to repeat our experience there. And then I'll finish, hopefully not on too pessimistic a note, with a paraphrase of Murphy's Law: if something can go wrong, it will. So plan for all the contingencies. And since we're all in supply chain planning, that shouldn't be too hard.
Q&A with Laura Darby Rose
Yeah, I can help with reading this. But thank you so much, Laura. That was an awesome presentation. Really appreciate it. But I did see one question that just came through quickly on the chat was, if you could please repeat the name of the package that you used, which allows the user to edit data interactively.
Oh, sure. Sure. DT. And Ralph mentioned this in his presentation: rhandsontable is another good one. And there are other ones out there. I was familiar with DT, and I had already gone down that path, so I was just like, I'm sticking with it. If I were building it from scratch, maybe I'd use rhandsontable. But yeah, there are several good packages out there.
Demand data. So we have two divisions. Yeah, invoice sales, and it's by request date. So it's not shipment; it's supposed to be a representation of what our true demand is, not taking into account any supply issues. We sell business to business, so we supply a lot of the pharmaceutical companies you've probably heard of in the news.
So R and Shiny is a clear winner for me. I have used a commonly, I can't name names, but a commonly used data visualization tool, the click-and-drag kind of thing, and I just don't find it suits me. I like to know what's going on behind the scenes, and when I'm writing that Shiny code, I know how things fit together. If I were using Tableau or another tool, and I know there are beautiful visualizations that can be done in Tableau, I wouldn't have the same sense of control. So that's my personal preference, and I'm not saying it's the only right one.
Thanks, Laura. And I'm going to try and also pull people into the discussion and allow people to unmute themselves. I know this is dangerous with 300 people. But Eduardo, I saw you had a great question that you put into the Slido as well, and I was wondering if you'd want to read it out loud and add some context. I can start by reading the question, too. It was: Laura, did you have to deal with uploading the forecast back to the ERP? If that was the case, were there any guidelines you had to follow with IT?
So you're talking about the stat forecast, I assume? I know I mentioned doing a check of an upload, so I'm assuming you mean the stat forecast, because that's what the models estimate and calculate. The nice thing about designing a system on your own, and again, maybe I'm a control freak, but I hope not, is that I had a chance to work with the IT analyst. He and I have a very good working relationship, so I could say, okay, this is the kind of table we want to build; I need this field, this field, and this field. So we had a custom solution built. I think we had to have permissions set so that you could write to the table with that account, which is a pretty simple fix. So, no, they were basically like, what do you need from IT? And I said, here's what I want, and they said okay.
Okay, yeah, I kind of did allude to that, I guess. This is something the IT team handles, though I guess I could theoretically do it in R. At the end of the month, before the table that receives the forecast is cleared, that forecast, all 67,000 lines of it now or whatever (we forecast three years out; nothing magic about three years, that's just how we do it), is copied to another table where I think we store 13 months of historical forecasts. And then when I measure statistical MAPE, the script pulls from that historical table and filters. I use some of the DBI interfaces, and in the parameter argument you can say, I want to pull in based on this criterion, so you pull in whatever might be your lag-three forecast or lag-one. That's how we measure it. But yeah, we do track our stat forecast accuracy pretty carefully.
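A sketch of that parameterised pull, again with an in-memory SQLite table standing in for the AS400 and hypothetical table and column names:

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical historical-forecast table (13 months in the real system)
dbWriteTable(con, "hist_forecast",
             data.frame(item = c("SKU1", "SKU1", "SKU1"),
                        lag  = c(1L, 2L, 3L),
                        qty  = c(95, 102, 110)))

# Pull only the lag-3 forecast via DBI's params argument
lag3 <- dbGetQuery(con,
                   "SELECT item, qty FROM hist_forecast WHERE lag = ?",
                   params = list(3L))
dbDisconnect(con)
```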
Open discussion and working group
Great. Thank you so much, Laura. And I see there are a ton of questions that have come in here too. So a few of them don't say if it's for Ralph or Laura either or both. So I
