Resources

RStudio Connect | Cut down on the grunt work. Deliver insights more effectively with RStudio

video
Dec 9, 2021
1:02:42

image: thumbnail.jpg

Transcript#

This transcript was generated automatically and may contain errors.

Hi, everybody. Thank you so much for joining us for today's RStudio Data Science Live event. My name is Tom Mock. I'm going to be your host for today. I'm our customer enablement lead at RStudio. I'm going to be going through a lot of different content specifically related to RStudio Connect.

We're also running kind of a fun experiment in streaming to many different locations. So a lot of the time when we've been talking to our customers or people interested in learning about open source data science, it might be coming from YouTube or LinkedIn or Twitter. So we're trying to meet everyone where they want to be in streaming to all those different platforms. So we're hoping everything goes really well today. But if you do have any hiccups, please bear with us.

For today, this will be recorded so you can always watch the recording afterwards on YouTube. And the content will be available in terms of the slides as well as the code I show here on the screen. So the slides are going to be at colorado.rstudio.com rsc.automate. And we can actually send that out through the chat. And then we'll also have the code examples I'm using available as well.

So again, my name is Tom Mock. We are trying out this new Data Science Live, which is kind of a webinar-style event, but going to many different locations. Before I get started, I also want to give a quick shout out to my colleague Alex Gold, who's one of our solutions engineering managers here at RStudio. He put together this great bike share example that we're going to be walking through in depth along with a few other pieces of code.

And then also my friend Katie Masiello, who's a customer success manager at RStudio. And she put together some great pointblank demo content that we're going to be showing today as well, as part of the data validation, ETL scripts, and other content.

Overview of RStudio Connect

So as far as the core idea of what we're going to be talking about today, we're going to be looking at RStudio Connect, which is kind of our premier platform for building and sharing, or I guess in this case, sharing data science products and building kind of insights among your organization. The core idea is that RStudio Connect makes it easy to share your data products that you build in both R and in Python. For your data scientists, they get to stay and use their preferred open source data science tooling. So again, R and Python, but they're also able to get these things off of their laptop and share these insights to impact decision making across the organization.

So they might be creating things like APIs with Plumber, or sharing datasets as a pin, or versioning models with pins, or deploying Shiny applications for interactive web applications. Or lastly, what we'll be talking a lot about today, sharing R Markdown reports, or automating and scheduling R scripts through R Markdown documents.

Now for our Python colleagues, in terms of the Python data scientists, they also have a rich workflow that they can use to publish things to Connect. Specifically, we'll be talking a little bit about Jupyter today, which is a great notebook that actually supports Python and R inside it, just like R Markdown can support R and Python. And there's other data products you can publish to Connect like Streamlit, Dash, and Bokeh for interactive web apps, or Flask and FastAPI for publishing RESTful APIs in Python.

Now, as far as getting these things out to your stakeholders, there's what we call kind of the easy button. You can just kind of go into a data product, click publish, and then send it out to your organization. Your decision makers, whether they're business stakeholders, executives, or other colleagues can then access the things you're creating via a web portal with authentication and with kind of automatic scaling within your server component.

Notebooks and automation

So the idea here is you can automate the execution of arbitrary, basically any R and Python code you want with notebooks. So you can have R Markdown for native R code or Jupyter for native Python code in the RStudio Connect ecosystem.

Now, when I talk about automation in a little bit or scheduling, what I'm really saying here is that Connect gives you this ability to schedule the re-execution of that code. So not only can you create the initial document, publish that, or share it wherever you want, but you can also have it re-execute on a schedule. So I have lots of scripts that I run either daily or weekly that clean up some data, pull it together, and then save the data out, basically ETL jobs.

What Connect gives me the ability to do is to specify things like a specific time zone. I'm in, you know, Texas, so I'm in GMT minus six, and I want things to execute at a specific time. So let's say 10:37 a.m. on December 9th, which is about 30 minutes from now, give or take a few minutes. And I want this to run every single day, and I want it to republish the output, basically overwrite itself when it's published.

I also have the ability to do things like send an email from Connect, and this would actually send out an email to myself or to colleagues. Now, when I say arbitrary R code or arbitrary Python code up here, that really means essentially any R and Python code you can write, you can embed into these documents. So I could have things like HTML widgets or leaflet plots or interactive kind of basic JavaScript inside my R Markdown document.

Now, you might ask, why notebooks as opposed to maybe a .R or a .py file? Well, notebooks are very powerful in that you can achieve the primary goal, which might be actually executing your Python or R code, plus very useful side effects. So you could also have things like writing out to disk or to a separate output. So you might be generating an RDS file or CSV file, or a pickle file in Python, or graphics or tables or whatever else as your primary goal. But your useful side effect is that you're also creating a nice document or a nice report that comes along with it.

And that's what the power of a notebook gives you. So you get this rich self-contained output and report. Basically, the notebook is the data product in addition to whatever it's actually doing. It's actually self-documenting. It's showing you when it occurred. It's telling you all these things about like the code you executed and the context it was run in.

So importantly, R Markdown and Jupyter almost always show the code that's used to execute along with all the embedded outputs in line. So from a reproducibility standpoint, you kind of know what's going on and what the code was that was run when this was scheduled. Now, you can hide some of that code execution if you're reporting to business users.

And then lastly, R Markdown and things scheduled on RStudio Connect can also generate additional side effects. Things like sending an email. So not only can it execute the notebook, execute the R code or the Python code, it's also sending out this email or kind of many emails to all your colleagues, updating them only when something happens. So kind of meeting where they are in their inbox rather than asking them to go visit, say, a dashboard or something else.

Why RStudio Connect

But why RStudio Connect? Obviously, part of the beauty of open source data science is that you can build your own tooling. RStudio Connect is a great solution in that you can use multiple languages there. So R and Python natively, along with embedded SQL or Spark through sparklyr or PySpark. You have environment isolation, in terms of you're limiting the packages and libraries to specific versions, so capturing that environment. And when you publish it, that environment is kind of trapped and isolated with it.

You have your enterprise-grade authentication, so you can make IT happy and have all the things you're sharing be secure. From a data science perspective, you also have things like push-button deployment for easily deploying it out, as well as things that are more robust like Git or continuous integration or continuous deployment from, say, Azure DevOps or something.

Live demo: Connect platform

So I've hopped into Connect. You might have actually seen for that brief moment that I'm showing my slides here, and that I have this ability to schedule and distribute data science products. This is actually a data science product running on RStudio Connect, so my slides are on Connect. But more typically, rather than slides, I might actually create something like a basic report that's being scheduled.

So here I've got an older script that I've been running for a while, just looking at the weather forecast in Boston. So RStudio's headquarters are in Boston, so let's take a look at that. So this is a very basic report: it's just got a title, a little bit of explanation about a data set, and a very basic plot.

The actual benefit is that this is taking that data and writing it out to what we call a pin, an external data set that's being updated. So this could be running on RStudio Connect, and you could actually be saving data to Connect, or you could be saving data out to Amazon S3, or to Azure Data Lake, or to SharePoint, or wherever you really want to save it.

You can see here on the right inside Connect that I already have the document scheduled, so it's going to be running in my time zone in Chicago. I started it just before Thanksgiving in preparation for today's kind of presentation, and I have it run every morning at 6 a.m. Just in case I ever had to go to the Boston office, I want to know before I fly out, what's the weather like? And it's going to run every single day and publish an output after it's generated.

Now, this is overwriting itself, so you can see that for today, it was run as of December 9th, but with Connect, I can also go up here and look at the history, so I can see every run that occurred and look at older timeframes. So I have these snapshots of what the report looked like, even though the data is being updated to a separate environment.

Another benefit of Connect, more from the admin perspective, is if I go in here to the admin panel and look at scheduled content, I can see all the different things that have been scheduled for my entire team. So you can see there's a lot of stuff that we've scheduled here for our demo server, and some of these run very frequently. So some of them are occurring down to the minute, and some of them are occurring maybe once a day or a few times a day.

What to automate

So there's a spectrum of needs here in terms of things you might publish or automate. So you might have data updates like I just showed, like you have this ETL script where it's taking data from an API, saving it, and then appending it to a database or to some flat file where it's storing it for later. You might have a report that's running. So you just want to report on that data that's being updated.

Another common use case we see is automated model training, updates, or batch scoring. So here you're either training a new model, updating an existing model, or sending that model out to actually do some scoring.

And then lastly, something I'll talk about in a little bit is automated email delivery with warnings or updates or just nice-to-knows. So not only do you have the actual report running and Connect handling the environments and the creation of those reports, but sometimes you want conditional reports. Basically, you want to know only when something fails or only when something goes above or below a certain threshold. So by combining emails along with your reporting or your automation of data science, you can basically get your warning only when it's needed. So you have a high signal-to-noise ratio and you're not just getting updates all the time, but only when something occurs.

Pins for data sharing

So here, as you might see, this is actually a .R file. So I'm loading the pins library, which allows me to take datasets and push them out to a remote location. And I'm loading dplyr just so I have the pipe, but maybe I want to do some data manipulation as well. So number one, I'm actually going to form a connection to what we call a board, where a board is a collection of pins, or a collection of datasets that you're storing. There's native connectors to things like AWS S3 or Azure Data Lake Storage, as well as Google Cloud, Kaggle, and a few other locations.

So I say: this board that I just connected to, take a dataset, in this case mtcars, a little toy dataset, and write it out as a pin to this remote location. So if I execute all this code, it's going to connect to Colorado, which is our demo server, take that dataset, and write it out as mtcars.

Now, you might say, like, nothing exciting happened. I just saved the dataset out somewhere. But the benefit being, is that it's now hosted on RStudio Connect. So if I go to content for our RStudio Connect demo server, I can actually show you that dataset we just created. So now I have this dataset. And of course, I can download the dataset if I want to by clicking on it. It gives me a nice little preview. But more importantly, I have this code, meaning I can read it from anywhere. So I can read this from a scheduled document. I can read it from a different computer or from a different server. I can give this to someone else and they can use it.

So the pins package makes it easy to publish things like data, models, and other objects and make it easy to share across projects and with your colleagues. So what I showed you was using Connect as a board and writing a data frame out to it. So we first formed this connection. I'm using board_rstudio_connect(), but it could very easily be a different cloud provider or somewhere else besides RStudio Connect.

And then I take that board and I write it out somewhere else. So here I write out our tidy sales data, save it as sales_summary, and it's going to be of type RDS. So I can basically specify the file type. Maybe I want a CSV so it's interoperable between Python and R, or I want to upload a Parquet file so it's more efficiently stored.

And then the real benefit here is that in a downstream automated report, I can actually connect back to that board and read it in. So you might have heard of people having friction when they're scheduling something or trying to automate a task, and they're not able to access the data. So having pins available to access very specific files, update them in place, or overwrite them is a very powerful workflow.
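As a rough sketch, that write-then-read workflow with the pins 1.0 API might look like this (the server, account name, and sales data are hypothetical; on Connect, credentials typically come from the CONNECT_SERVER and CONNECT_API_KEY environment variables):

```r
library(pins)
library(dplyr)

# Connect to a board hosted on RStudio Connect
board <- board_rstudio_connect()

# Hypothetical tidy sales data, summarized before pinning
sales_summary <- mtcars %>%
  group_by(cyl) %>%
  summarize(avg_mpg = mean(mpg))

# Write it out as an RDS pin; type = "csv" or "parquet" also work
pin_write(board, sales_summary, name = "sales_summary", type = "rds")

# In a downstream scheduled report, read the pin back by its full name
sales <- pin_read(board, "tom.mock/sales_summary")
```

The same code works against other boards (for example board_s3() or board_azure()) by swapping only the board constructor.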

The other benefit is that while pins is really used a lot as a way to share data across a team, you can upload basically any arbitrary R object. So that means you could save model objects. You could train a model, save it as an RDS, and save that as a pin. You can then version the pin, and you can access it downstream and look at the very different uses of that model, whether it's loading it into an API or loading it into a Shiny application and then using it there.

Data sources: databases and web APIs

We're going to talk a little bit more about rvest and the httr package for accessing web data. rvest specifically allows us to access websites or data embedded in websites and kind of download or scrape it. And httr actually allows us to query APIs. So you might have a RESTful API that you're sending some credentials to and accessing data from. We also have dbplyr and sparklyr for accessing, writing to, or reading from SQL databases or Spark clusters.

Rather than working with a bunch of flat files, you might be working with a SQL database, and you can write either native SQL code in R notebooks or in Python notebooks, or you can use something like dbplyr, which is a database backend for dplyr, to actually translate your R code into SQL on the backend. So this is very powerful: you use the exact same syntax for in-memory operations with the dplyr you know and love, but you can also query SQL databases with that same dplyr code.

So first, just like we showed with pins, you need to form a connection. Basically, you need to give it your credentials: who am I, and what am I going to connect to? So in this example, we're connecting to a MySQL database, and then I'm making a connection to a very specific table within that database. With that connection and the tbl() command, I can then run my dplyr queries and it'll actually execute them in the database.
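A minimal sketch of that connection pattern (the host, database, and table names here are all hypothetical, and in practice the credentials should come from environment variables or a secrets manager, never hard-coded):

```r
library(DBI)
library(dplyr)

# Hypothetical MySQL connection; adjust driver and details for your database
con <- dbConnect(
  RMariaDB::MariaDB(),
  host     = "db.example.com",
  dbname   = "sales",
  user     = Sys.getenv("DB_USER"),
  password = Sys.getenv("DB_PASSWORD")
)

# Point at one specific table; nothing is pulled into memory yet
daily_sales <- tbl(con, "daily_sales")

# dplyr verbs are translated to SQL and executed inside the database
daily_sales %>%
  filter(region == "northeast") %>%
  count(product)
```

Only collect() (or printing a preview) actually brings rows back into R, which is what makes this pattern efficient for large tables.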

ETL automation

So you're going to extract data from a variety of sources: a database, your customer data, marketing data, business data, machine data, an API, whatever. You're just grabbing data from these heterogeneous data sources. Then, while the data is kind of streaming in, you're transforming it into a standardized, useful format so it's ready for your downstream analysis or your downstream applications and systems. Then you're going to load the data, the L in ETL, into the long-term store, most typically a database, warehouse, or data lake as a flat file like Parquet, for example.

As an alternative, you might actually do something like extract, load, transform, or ELT. Here we're extracting the raw data itself and then just loading it into a storage area, basically saving it as it is. We're not transforming it ahead of time. We're just saving the raw files or loading them in as they are. Downstream applications or other processes are then going to transform or otherwise process the raw or semi-structured data, as opposed to it being ready for analysis by your data analysts.

Having the ability to write out to files like Parquet is really useful here or to JSON, for example, as you're going to be getting potentially data from APIs that are stored as JSON or you want to have the ability to read files. R and Python both have rich support for doing this within the cloud environments.
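A small sketch of that pattern, pulling JSON from a hypothetical API endpoint and landing it untransformed as Parquet:

```r
library(httr)
library(jsonlite)
library(arrow)

# Hypothetical endpoint that returns JSON records
resp <- GET("https://api.example.com/v1/records")
records <- fromJSON(content(resp, as = "text"), flatten = TRUE)

# ELT style: land the raw data as Parquet for downstream processing
write_parquet(as.data.frame(records), "records_raw.parquet")
```

On Connect, a script like this would typically be wrapped in a scheduled R Markdown document, with the output written to a pin or cloud storage rather than the local disk.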

Data validation with Point Blank

So as you're pulling data in, obviously you want to check that the data is as you expected, and this is what pointblank in R is for: not only bringing the data in and doing your extraction and transformation, but validating the data as you're loading it.

So if you're scheduling dataset updates or automated data pipelines, you're bringing in data from all over the place: SQL databases, potentially flat files or individual files like CSVs, or web APIs. These get read into R or Python notebooks that are being executed on Connect. Once the report is executed, then you have a filled-out report. Basically, you can fill it with all this information: what was the data you brought in? What was the date? Was it successful? What did the data look like?

So pointblank, again, lets me do data validation on either in-memory or in-database data. So as part of your ETL process, you might pull in some data. As we're seeing here, this is bird strike data from a different API that Katie Masiello put together.

So here for the API, it's actually going and finding an API, querying it, saying here's the timeframe it's being queried at, throwing in our API key, which is being added here, and then getting some type of response. So we're looking at the response, exploring how long it took to run, and then doing a data validation here. Because it's done in R Markdown, you have this ability to have basically green and red indicators for whether it was successful or failed.

So you've used pointblank to create this agent. It's going to look over all the different parts of the table, interrogate it, and say: did it pass all the different validations? If any failed, it would give us a warning about where they failed.

Now, as far as whether the data makes sense, we might have more robust checks, like was the data within a range, or greater than or equal to some value? So pointblank also gives you the ability to query specific columns. So here we're checking that these values are between 0 and 500. We should never have negative values here, and 500 is the theoretical limit that we've ever seen.
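As a sketch of that kind of check, here's a pointblank agent using the package's built-in small_table dataset so it's self-contained (the specific columns and bounds are just illustrative, not the demo's actual validation rules):

```r
library(pointblank)

agent <- create_agent(tbl = small_table) %>%
  # values in column d should fall between 0 and 500
  col_vals_between(vars(d), left = 0, right = 500) %>%
  # dates should never be missing
  col_vals_not_null(vars(date)) %>%
  interrogate()

agent  # printing the agent renders the green/red validation report
```

Any step that fails shows up flagged in the rendered report, which is exactly what surfaces as those red indicators when the notebook runs on a schedule.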

The bike share end-to-end example

So here, we're actually doing something similar. We're bringing data in from an API, scheduling it to be cleaned, updated, and saved out to both a database and to a PIN, and then loading that into an API downstream that's running in production. And all of this is being executed inside of RStudio Connect.

So this is where we're going to spend most of our time remaining, is kind of diving into a little bit of this example and how you can kind of schedule these out and what they look like running on RStudio Connect.

So again, bringing data in, data import, cleaning, and training some models, saving it out to a database, and then serving it out as an API and a Shiny application that are being fed the downstream data. Because it's on Connect, I have this kind of landing page I've created, or that we've created, with Connect widgets, which basically gives me the ability to write about this project.

You can imagine that with a data science project of this scale, you have a bunch of different scheduled jobs. Not only do you have the ETL tasks, things we've covered a little bit before, but you also have modeling. So maybe you're scoring the data and doing batch scoring and writing out the values to a database or training the model itself, or you want to send out a specific email if a criteria is met or not met.

So here, again, you have a more bare-bones report. It's basically just saying, hey, here's the analysis. It ran about a week ago and was executed on a schedule. It's using the odbc package to form a connection to a database, and then we clean up the data and write it back out.

So we're grabbing data from the database, a pin, and an API, and then we're writing out the data with dbplyr. So the query that we're actually running is a group by, summarize, and inner join, and then we're writing out the raw SQL code and showing that query. So I don't know about you, but rather than writing this SQL code out by hand, I'd much rather write it in dplyr and get the rendered SQL code through dbplyr.
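A self-contained sketch of that translation, using dbplyr's in-memory SQLite helper in place of the real database (the table and column names are made up for illustration):

```r
library(dplyr)
library(dbplyr)

# memdb_frame() copies a data frame into an in-memory SQLite database,
# so the same code runs without the production connection
rides    <- memdb_frame(station_id = c(1, 1, 2), n_bikes = c(10, 12, 7))
stations <- memdb_frame(station_id = c(1, 2), name = c("Fenway", "Back Bay"))

query <- rides %>%
  group_by(station_id) %>%
  summarize(avg_bikes = mean(n_bikes, na.rm = TRUE)) %>%
  inner_join(stations, by = "station_id")

show_query(query)  # prints the SQL that dbplyr generated
collect(query)     # actually executes it and pulls the results into R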

Automated reporting for business users

So for this automated reporting, you can, again, schedule R Markdown and Jupyter, which can be used as a report. And these reports have many different purposes. They might be intended just for the data scientist or their team, meaning that they're going to show all the code. The metadata there is so that the data scientists and the data analysts working with the data know what's going on.

Alternatively, there might be reports for business users or executives, people who just want to consume the final output or they want to get some type of like insight or they want to get a report delivered to them as opposed to understanding what the code is doing. They're more interested in what the actual output is.

So for business users, these reports are going out to potentially like non-technical colleagues who are decision makers or business analysts or executives. And here, you're more worried about packaging up the data or your story, your visualization, your tables, but you're not as worried about showing the code, although some people may want to see the code.

So how do you make this report that fits your company style or one that's attractive and useful for them? We showed those reports earlier. They were pretty bare bones and not that many people would be excited about receiving just a black and white report with a lot of code being printed out. So we can do, and the examples I'll show here in one second, are using different themes or using different customization of the appearance.

So R Markdown theming, say, with the bslib package. This allows you to customize the entire appearance of R Markdown documents as well as Shiny applications or other kinds of downstream outputs.

So I'm taking the R Markdown document, just changing the HTML document output, and applying a bslib theme to it. So if I knit this document, this R Markdown doc, it's going to look a little bit different. Still fairly bare-bones, but you can see that it's got this kind of gray background and the fonts have been changed, but the graphics haven't been changed, in terms of we still have this white background on the graphic.

So we're missing a component here of translating the R Markdown theming into the graphics themselves. So there's one more step we can do here. We've themed essentially the R Markdown document, but we need to translate that theming into the plot, into Plotly or the table or ggplot.

So there's this chunk here with the thematic package, which allows me to pass all of this theming that I've applied to the R Markdown automatically into ggplot and Plotly graphics. So I've set this chunk's eval option to TRUE. I can knit this, and we'll take a look at it again now.
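Putting those two pieces together, the YAML theme plus a thematic setup chunk might look roughly like this (the Bootswatch theme and font choices are just examples, not the demo's exact settings):

```r
# In the YAML header of the .Rmd:
#
# output:
#   html_document:
#     theme:
#       bootswatch: minty
#       base_font:
#         google: "Open Sans"

# In a setup chunk, carry that document theme into ggplot2 and plotly output
thematic::thematic_rmd()
```

With thematic_rmd() enabled, plot backgrounds, text, and default color scales pick up the document's theme automatically, which is what removes that contrasting white plot background.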

So on this report, we can see that the graphic completely blends into the document, in terms of there's not a white background that's making it contrast. We've actually applied the theming of the overall document into the graphic. It's applied this alternative color scale, like a blue-green-yellow, that's a bit more visually appealing than the default colors that ggplot uses. And this works for both the interactive Plotly graphics and the ggplots that we see here.

So while these might seem like small changes that you're doing, they're actually very powerful for connecting with your business users and helping them actually want to see the reports or kind of working with them and the theming that you're working on.

You may not even want to worry about doing a lot of customization. You could use something like the distill package. So this is like scientific and technical writing native to the web. It's got amazing defaults, and it looks really, really nice. So this gives you the ability to very quickly take the exact same code you're running, but make it look better.

And now we have a nice-looking report with very good defaults. So this instantaneously, even though it's black and white, to me, it's much more visually appealing. I've got kind of this floating table of contents where I can navigate back and forth between the different components. So with essentially no work on my part, just by changing one chunk of the YAML header, I can make a nicer-looking report, or if I wanted to go very deep, I could go full customization with Bootstrap Lib and make these reports look even better.
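For comparison, switching to distill really is just a YAML change (the title here is hypothetical):

```yaml
---
title: "Boston Weather Forecast"
output:
  distill::distill_article:
    toc: true
    toc_float: true
---
```

Everything below the header, all the chunks and prose, stays exactly the same; the toc options give you that floating table of contents for free.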

Sending emails with Blastula

Now, the last portion we have in the last few minutes for today is talking about emails and sending out emails from RStudio Connect with the blastula package. So blastula makes it easy to produce and send HTML emails from R. This is very powerful in being able to combine things like ggplots or tables or data and embed them into emails that you're sending to your colleagues.

So emails can obviously be sent through SMTP, so you can do something like send them via Gmail, but more commonly we see a lot of customers use RStudio Connect to send these emails through the built-in mail server. So this uses your actually existing company email and uses that type of authentication as opposed to being limited to Gmail or something external to your company.

It's really easy to kind of render these emails. Basically, you call render_connect_email() with a specific body document, and then attach_connect_email() to render the email and attach it to the document's output. So it's not only going to show up inline, in terms of your actual email will be shown when they open their inbox, but you can attach different things to it, whether that's a copy of the HTML or a PDF report or an Excel file, whatever you want to attach.
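That render-and-attach step, sketched with blastula's Connect helpers (the file name and subject line are hypothetical; this goes inside the scheduled report itself):

```r
library(blastula)

# Render a separate .Rmd that defines the email body
email <- render_connect_email(input = "email-body.Rmd")

# Attach the email to this report's output on Connect, including the
# rendered report itself as an attachment
attach_connect_email(
  email,
  subject = "Daily sales summary",
  attach_output = TRUE
)
```

When Connect re-executes the report on its schedule, it picks up the attached email and sends it to whichever viewers and collaborators are subscribed.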

So part of what Connect is doing is letting you bring in things like your authentication groups so you can send to your entire executive team or the entire sales team or the entire data science team and send it out to all of them.

So here we have an R Markdown report. This R Markdown report is not special in any sense. We can knit it and take a look at the repository. The main thing we're looking at here is this is what the actual R Markdown document looks like. So you could schedule this to be executed and it would generate this, which is intended for the data scientist.

Here at the bottom, and I'll zoom in a little bit, it has some information saying we can actually attach the email. That was the render_connect_email() and attach_connect_email() we were doing. It kind of jumped the gun there, but it actually sent it out, and this is what the email would look like inside your actual email client. So this would actually be the body of the email. You not only have text and headers, but you can embed ggplot graphics, you can print data, or even embed specific tables.

So that was the preview. Now in terms of if I were inside Connect, like let's actually go to an email one. So I'm going to go back into this and talk a little bit about the batch sending of emails because I know that was a question that came up.

Because Connect has a connection to an email server that's been installed alongside Connect, I could email the report and say, just send me a copy of this report. I can basically say do it now. But more often what I would do is like schedule the document and have it send an email out to people. So every time it's run, it sends the email out.

As far as who it's being sent to, that is up to you. I'm the only person looking at this document right now, but maybe I want to add my colleague Kelly or I want to add the entire solutions team at RStudio to receive this. So now all these people being brought in from our authentication protocol will be receiving the email if I want them to.

So I can go back to schedule. I can say send email after update. And I can send it to all the viewers and all the collaborators. Basically send it to everybody who is interested and have all this batch email being sent out.

Conditional email sending

Now that might be useful in terms of sending these emails, but what if that email gets too noisy running every single day? Maybe I don't want to see it every day. And that's also where Connect can help you or Blastula can help you. So this idea of conditional execution is basically, yes, every time a doc is rendered, it can be useful to send out an email. But sometimes you want that higher signal to noise ratio basically saying only send an email if a criteria is met or not. So maybe data quality is below expectation, or your model is predicting a value outside a specific range, or your data set is too small. It should be thousands of rows, and it's 20 rows. Basically whatever logical criteria you want to define, you can build that logic into your R Markdown doc, schedule it, and then send out emails conditionally based on that.

So here are a few code examples. Here's one that basically says: if predictions are outside a range, or greater than a threshold value, then send me this email saying the model is drifting or the model values are too high, generate this other report, and email it to me. Otherwise, if the values are below that threshold, it just doesn't send the email. It still executes the code and still renders your report, but it doesn't warn anyone, because there's nothing to warn about.
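The pattern above can be sketched with blastula inside a scheduled R Markdown chunk on Connect. This is a minimal sketch, not the exact code from the demo: the threshold, `model`, `new_data`, and the `alert-email.Rmd` template are all hypothetical placeholders.

```r
# Conditional email logic inside a scheduled R Markdown doc on Connect.
# `model`, `new_data`, the 100 threshold, and "alert-email.Rmd" are
# illustrative placeholders, not from the actual demo.
library(blastula)

predicted <- predict(model, new_data)

if (max(predicted) > 100) {
  # Predictions out of range: build the alert email from a template
  # and attach it so Connect sends it after this render
  email <- render_connect_email(input = "alert-email.Rmd")
  attach_connect_email(
    email,
    subject = "Model predictions outside expected range"
  )
} else {
  # Report still renders and publishes; Connect just skips the
  # scheduled email for this run
  suppress_scheduled_email()
}
```

Either way the document renders on schedule; the branch only controls whether the email goes out.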

Python and Jupyter on Connect

I've shown a lot of R Markdown, but you can do very similar things with Jupyter. I don't want to leave those folks out in the cold: you can do similar things natively in Python. Here we have some stock examples where we use Matplotlib and pandas to pull data into these documents, generate some graphics, and save some data out.

So this native Python code in a Jupyter notebook can also be scheduled; it will run every weekday and then publish the output. You have the same ability to look at older versions, and the same access controls and scheduling for those documents.
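The kind of notebook cell Connect would run on that schedule looks roughly like this. It's a sketch with made-up data and file names; in practice the DataFrame would come from your real source (a database, a CSV, a pin).

```python
# Sketch of a scheduled notebook cell: pull data, draw a chart, save both
# artifacts. The data and file names here are illustrative stand-ins.
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render without a display, as on a server
import matplotlib.pyplot as plt

# Stand-in for reading from a shared source
df = pd.DataFrame({"day": range(1, 6), "rides": [120, 135, 150, 160, 180]})

fig, ax = plt.subplots()
ax.plot(df["day"], df["rides"])
ax.set_xlabel("day")
ax.set_ylabel("rides")
fig.savefig("rides.png")             # graphic published with the notebook
df.to_csv("rides.csv", index=False)  # data saved out for downstream use
```

Each scheduled run regenerates the graphic and the CSV alongside the rendered notebook.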

A second one might be model training. Rather than just relying on pandas, you could also bring in XGBoost or scikit-learn to train a model natively in Python: do the train/test split, train the model, and save it out. This can also be run on RStudio Connect as a Jupyter notebook, with the same scheduling options.
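That train/split/save workflow can be sketched with scikit-learn like this. The synthetic data, model choice, and output file name are assumptions for illustration, not the demo's actual code.

```python
# Sketch of model training in a scheduled Jupyter notebook, assuming
# scikit-learn is available. Data and hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
import joblib

# Stand-in data; in practice this would come from your real source
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # held-out accuracy

# Save the trained model so downstream content can load it
joblib.dump(model, "model.joblib")
```

Scheduling this notebook means the model retrains on fresh data each run, and the saved artifact is what downstream apps or reports pick up.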

Wrap-up and resources

So we're right here at the end. I've linked to all the different content here. Again, the slides are available, and I'll put those into the chat; that's the link on colorado.rstudio.com, and the slides have all the sublinks out to the code. There's information about blastula emails and custom emails, pins, and making beautiful reports with distill, thematic, and bslib. You can go a bit deeper into the bike share example, that piecemeal, in-depth example of an end-to-end data science solution.

And then here on the right, we have Connect scheduling, email, and updating, plus two other webinars that might interest you. These already occurred, but there's a Beyond Dashboard Fatigue webinar with additional content, again open source code you can adapt and use in your own organization. And Rethinking Reporting with Automation is another great webinar, from outside RStudio, that really focused on the business stakeholder.

With that, thank you so much for hanging around. That was a full hour, so it's always good that you stuck with us for that time. Thank you for being part of this experiment in doing more live webinar content, and hopefully you were able to join us from different locations, whether YouTube, LinkedIn, Twitter, or somewhere else.

So this page has all the source code. It links to the YouTube video, which is the primary long-term storage, and the slides I demoed today. And you can see all the demo code, which you can download, adapt, modify, or play around with. You could also clone it and edit it; it's released under a CC BY 2.0 license.

Oh, thanks so much, y'all. It looks like Ahmad gave me some help. Again, because I'm hosting this on Connect, I'm going to flip this over to "anyone can access it, no login required." So go ahead and take a look at it. Thank you, Ahmad, for the shout-out; it should be public and accessible for everyone now. Have a wonderful day, and thanks for joining us on another data science livestream.

So I'll say goodbye from here. Thanks again. And if you have other suggestions, feel free to reach out to me on Twitter at @thomas_mock or on LinkedIn. I'd love to hear from y'all in the community: what's the next thing you want to see? What type of content would be useful or helpful for your learning and your education? So thanks again, have a great day, stay safe, and happy holidays.