December 2022 Webinar: The R Workflow – Dr Ryan Johnson from Posit
Transcript
This transcript was generated automatically and may contain errors.
Hello everyone and welcome to our December NHSR Webinar, The R Workflow. My name is Lynne Howard and today I'm pleased to be joined by Ryan Johnson, who is a Customer Success Representative at Posit, formerly known as RStudio. Today's webinar is being recorded and will be available later on the NHSR Community website and YouTube page. If you have any questions, please put them in the chat and Ryan will either answer them as we're going along, or at the end of the session, depending on the amount of detail that is needed to answer them.
If you think of any questions after the webinar, please do feel free to contact us, either using our Twitter page or via our very popular Slack channel, both of which again will be shared in the chat, or you can find the links on the community web page, and community members will be more than happy to help you. At the end of the webinar, we're going to use Mentimeter to gather feedback, and that link again will be shared in the chat. Please do let us know how you found the session, and we do appreciate your feedback, so thank you. So without further ado, over to you, Ryan.
Great. Thank you, Lynne. Great intro. I apologize everyone for starting a little bit late. Hopefully we'll be able to play some catch up here and still sneak it in within the hour. But again, if we have questions as we go through, just feel free to pop those into the Teams chat. So today's session is going to be on the R workflow, and I know we're going to be collecting some survey data afterwards. I'd be really interested to hear what everyone thinks about this presentation because it's brand new. I actually had a previous version of it, but I didn't really like it that much, so I redid it for this presentation, so I'd certainly appreciate any feedback. It's going to be a bit of a whirlwind, so we're going to talk about a lot of different things, but the whole goal is just to expose you to a bunch of different topics and tools that you can use to improve your workflows in R.
Overview of the typical R workflow
All right, so we're going to dive right into it. I don't want to waste any time with any introductory stuff. We're just going to dive right into the typical workflow, and so this presentation will be focusing in on R, but I would wager a guess that most workflows are going to follow this trajectory, where you typically start with some dataset that can be a raw dataset, something like that, and then you're going to be analyzing it. Now, that analysis could be cleaning it up, it could be performing some machine learning, what have you, and then ultimately taking all those insights from the analysis and reporting it, and that report can look very different, whether it's a web application or static report.
So, focusing in just on the data to start, so with this workflow, there's a lot of questions about the data. Not all data is the same. So, what exactly is the data? What format? Is it a CSV file, a text file? Is it going to be some massive, you know, compressed file? Is it structured, or is it unstructured data? Next question is kind of where is the data? Is this going to be a public dataset, so something that really anyone in the world can access, or is it going to be private just to you or your group or your team? Are you working with local data, so something that's on your physical computer or server, or do you have to access this data in a remote location on some other server in the cloud? Is it in a database, so structured data in a nice clean dataset where you can just make calls to that database, or is it in a data lake, where everything is kind of thrown in together? Maybe it's behind an API, so you have to make specific calls to an API to get the exact data that you want.
How big is the data? Is this going to be a small dataset that can be easily consumed by your analyses, or is this going to be some massive dataset where you're going to have to think about some clever ways to deal with this huge data, and does the data change? You know, there's a lot of different datasets out there. Some are static. You know, today, it'll be the same dataset tomorrow, a year from now, 10 years from now, or this could be very fluid, so think about, for example, COVID data, and this changes basically every single day, so you have to think about stuff like that.
So, next in the workflow, you have your analyses, and this really can vary depending on what your goals are, so what type of analysis you're running. Are you doing that typical extract, transform, load, or ETL workflow where you're pulling in some raw data, doing something to it, and then loading it back to, like, a database or somewhere else? Are you doing some cleaning, modeling, simulations, data visualizations? Lots of different analyses. But a really important thing to think about is the compute power. Is this something that you can run on your individual laptop, you know, maybe just pumping out some quick plots, or is this going to be some massive, you know, simulations on multiple nodes, huge parallel processing jobs? These are things you have to think about as well.
And then, finally, we have reporting, probably the most fun part of the workflow, and this is ultimately how you're going to be informing the folks that, you know, want to consume the analyses you just performed. So, when you report something, you know, how is that report going to be delivered? Is it going to be hosted on a server, such as something like Posit Connect, which we'll talk about? Is this going to be something you email folks? You know, a lot of team members just love to live in their inbox, and sometimes you want to deliver those insights via an email. Or maybe you make your findings or your models accessible via an API, which we'll also talk a little bit about today. Similar to the dataset, does the report also need to be updated? Is this going to be a single report, where, again, it's going to have insights that are kind of a one and done, that won't change over time, or do you need to constantly refresh this report? And then, finally, what type of report are you going to be delivering? You know, static reports, things like R Markdown, Quarto, Jupyter Notebooks, these are all what we consider static reports, but then we also have web apps. And the web app that we're going to focus in on today is going to be built using the Shiny framework, which I would wager a lot of folks on the line are at least familiar with, but if not, that's okay, we're going to talk all about Shiny.
Introducing the demo Shiny application
So, to keep things nice and simple, this is the application we're going to be working with today, and hopefully, this application looks familiar, because this is the built-in Shiny application that comes with RStudio, all right? And so, what we're going to do here in a second is we're going to create this Shiny application and just walk through all the components to make sure everyone is familiar with this.
So, if you want to follow along, if you have an instance of RStudio open, whether that be in Workbench, or on your desktop, or in Posit Cloud, for example, feel free to follow along with this portion, or you can just sit back and relax, take it all in. So, I'm going to be accessing RStudio from within Posit Workbench, which is one of our professional tools, so it's a server-based implementation of the RStudio IDE. So, I'm going to click on Workbench.
All right, so we have this welcome screen here, and I'm going to go ahead and open up a brand new session, all right? So, I click on new session, and I have the option of all these various IDEs, so VS Code, Jupyter Notebook, JupyterLab, or RStudio. We're going to stick with RStudio, and we'll just keep everything else the same. So, I'm going to go ahead and start this session, and give it a few seconds to kick on, and it should plop us right into that RStudio session.
And while we go through this workflow, I'm also going to be touching upon some other best practices for when you're working on projects, all right, within RStudio, or really kind of any environment. And one of those best practices, so here we are within the RStudio IDE, we have a console on the left-hand side, our environment pane in the top right, and our file browser down here at the bottom. So, a good practice any time you're going to be working with a new project, or maybe you already have a pre-existing project, is always to leverage something known as RStudio projects. Now, if you look in the top right corner here, you can see project none, and we want to change that. We're going to make this a new project, all right. Now, why would you want to create a project? It's just a great way to keep everything isolated within kind of a centralized environment, all right. So, the scripts, the plots, reports, the packages, all that stuff is going to be contained within a single project. So, it keeps one project very isolated from another project, so you don't get any cross-talk and cause any conflicts.
So, let's go ahead and create a new project here within RStudio. We get a few options here. We can start with a brand new directory, which is what we're going to do. You can also, if you have some scripts and some data already in a directory somewhere on your file system, create a project within that directory, or you can pull in a new project from version control, something like GitHub, for example. I'm going to stick with a brand new directory. We're really going to start fresh here. Select new project. I'm just going to call this nhsrworkflow. You do get the option to create a Git repository. So, if you're going to be using version control, which we would highly recommend, you'd want to click this box. But for this demo, we're going to leave it unchecked. And then you have the option to use renv, which is a really great tool for keeping track of all the packages within this project. And again, it's highly recommended you do that as well, but we're going to leave it unchecked just for simplicity's sake for this demo. And we'll hit create project.
All right. So, now we're in a fresh RStudio project. And I know that because if you look in the top right corner here, you see nhsrworkflow. And you can also see my current path, which is listed right here in the top left. You can see I'm within that working directory. So, this is my new home directory for this project. And this is where we're going to put all of our scripts and analyses for this R Workflow. All right. So, we've created this fresh RStudio project. Let's go ahead and create that Shiny application. So, in the top left corner, you'll see this little green plus. This is a great way to just get started with various scripts, APIs, applications, documents, which we'll talk about here. You can also see Shiny web app. So, I'm going to go ahead and click on this. And I'm just going to call this test app nhs. I'm going to hit create.
All right. So, if you're not familiar with Shiny, this is a Shiny application over here on the left hand side. First and foremost, it is R code. All right. So, that's something that we believe very heavily at Posit is that all data science should be code based. Shiny is no exception here.
So, when you have a Shiny application open within RStudio, you do get this additional button right here to run this application locally within RStudio. So, I can do that.
And here we have the rendered Shiny application. Let me make my screen a little bit bigger. There we go. All right. So, here is my rendered Shiny application. And I can slide this far to the left. I can slide it to the right. And you can see it changes the number of bins in our histogram. All right. So, a very simple Shiny application, but it does demonstrate the power of Shiny, the ability to interact with your data and kind of get these live results.
Walking through the Shiny app code
So, because we're going to be leveraging this Shiny application for this demo, I want to make sure that we have a firm understanding of all the code that's going on behind the scenes. So, I'm just going to quickly run through it. We're going to start right here at line 10. Everything above line 10 is just comments from the author, so those aren't actually executed. You can see on line 10, we're going to load the Shiny package. After we do that, we define the user interface. All right. Now, for this session, we're not going to be really focusing in on the user interface. But I just want to quickly go over it so you know what's going on. All right. So, we're going to leverage a fluid page, a sidebar layout format. So, the first thing we have right here is our title panel. So, you can see that's reflected right here.
And then we only have one input, and that's going to be the slider input. We're giving it the input ID of bins. Then everything else is just going to be unique to the slider input, like the name, so number of bins, the min, max, and the default value, which you can see is set to 30. And then we have our main panel, which you can see is showing our histogram. All right. So, that's going to be shown right here as this plot. Again, for the most part, we're not going to be focusing on the user interface. But now you know a little bit more about it. But we are going to spend some time talking about the server. Server is where all of the code that runs behind the scenes of your Shiny application lives. So, we have a few things here. We're creating a plot. All right. We're saving it as this plot, which again gets sent to the user interface up here.
And then within this render plot, we have some data. We're going to use a built-in data set called faithful. Sorry to interrupt you, Ryan. Your screen seems to be cut off on the left-hand side. Okay. Let me see if I can move it a little bit. Does that help at all, or is it still?
No, that's not moved. Yep. We can see everything. Okay. Okay. Thank you. All right. So, we'll just have to give it a little squish, but that's okay. All right. Yeah. If anything like that happens again, definitely feel free to interrupt me. All right. So, just running through again all the code in our server function. So, we have our data set. This is the faithful Geyser data set. We'll talk about this in detail. We're going to save it to a variable called X. We have another line right here on line 41. This is really going to be our analysis, which we'll talk more about in detail. And then we have the actual code to draw the histogram using this function. And that's pretty much it. Down here at the very bottom is just a call to that Shiny app function to just run the Shiny application.
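For reference, the default application walked through above looks roughly like this. This is a sketch of RStudio's built-in Shiny template; the exact comments and styling may differ slightly between RStudio versions:

```r
library(shiny)

# User interface: one slider input and one plot output
ui <- fluidPage(
  titlePanel("Old Faithful Geyser Data"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bins", "Number of bins:", min = 1, max = 50, value = 30)
    ),
    mainPanel(plotOutput("distPlot"))
  )
)

# Server: recompute the histogram whenever the slider moves
server <- function(input, output) {
  output$distPlot <- renderPlot({
    x    <- faithful[, 2]  # waiting times between eruptions
    bins <- seq(min(x), max(x), length.out = input$bins + 1)
    hist(x, breaks = bins, col = "darkgray", border = "white")
  })
}

# Bundle the UI and server into an app object (run with runApp(app))
app <- shinyApp(ui = ui, server = server)
```

Sliding the bins input re-runs only the `renderPlot()` block, which is what makes the histogram update live.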
When to engineer your workflow
So, now that we have a little bit more of an understanding of what's going on behind the scenes of the Shiny application, a hypothetical question for everyone here on the line. No need to answer it. Just something to think about. Let's say your business or your company, whatever you're building, absolutely 100% depended on this application running correctly and quickly. Would you be comfortable if your team or your company depended or relied on this application that we just showed you? So, this is something to think about.
Now, I would make an argument to say, yes, sure. I think this application is pretty good. Why is that? Because it's consistent. The data doesn't change. I'll show you what that data looks like here in a second. It's very simple. It just has one input, one output, one data set. And it's pretty fast. So, you saw as I slid the bar to the left and right, it responded pretty quickly. These are all characteristics of a good production-ready Shiny application.
But not all apps are going to be this simple. So, when we talk about this specific application, what are we actually talking about? So, we have a report. And this report is going to be a Shiny application. And within that Shiny application, so when we were going over that code, we had all the code for the analysis, and we had all the code that imports the data set. So, everything's basically contained within the Shiny application. And that's totally fine. You know, for this application, because it's fast, because it's simple, because it's consistent, you don't need to change this application. So, it's really important to know when you should and shouldn't overengineer applications, or any type of reporting for that matter. So, a good rule of thumb is that, you know, don't overengineer workflows if you don't need to.
But like I mentioned in the last slide, you know, not all apps are going to be this simple. So, it is important to know when to not necessarily overengineer, but to think about different ways to engineer your applications, your reports, so that you can scale accordingly.
The faithful Geyser dataset
So, let me just get everyone up to speed on what exactly this data is. So, looking again at our server function in our Shiny application, we're using something known as faithful. All right? So, this is a data set built into R. So, when you download R onto your computer or server, the faithful data set's already there for you. And it's just a good data set to play around with and try out some visualizations. And specifically, we're going to be extracting the second column as a vector and saving it to the value of x. So, this is just kind of a snippet of what this data looks like. I'm showing the first 14 rows. And we have two columns in this data set. You can see eruptions, which is the first column, and the second one is this waiting column. The first column is the duration in minutes of each eruption of Old Faithful, which is a big geyser in Yellowstone National Park in the western United States, and the time in minutes until the next eruption is shown over here in the second column.
So, as a reminder, we're going to be extracting the second column as our data for this application. And I'm showing you that data right here. So, this is the second column of the Faithful Geyser data set extracted as a vector. And you can see it's a little over 270 numeric values in length. And that's it. It's a pretty simple data set. You can see the numbers right here. It all fits nicely onto the screen.
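The dataset exploration described above can be reproduced in a few lines of base R, since faithful ships with every R installation:

```r
# faithful is built into base R: 272 eruptions of the Old Faithful geyser,
# with eruption duration and waiting time, both in minutes
head(faithful)

# The app uses only the second column: waiting time to the next eruption
x <- faithful[, 2]

length(x)  # 272 numeric values
range(x)   # waiting times span 43 to 96 minutes
```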
But again, going back to that previous slide, what if this data changed every single day? Like maybe they just continued to add data every single time Old Faithful geyser erupted. What if the data wasn't built into R? Maybe it was stored somewhere else and you needed to import it into your workloads. What if others wanted to share this data with your other teammates? Sure, it's pretty easy when the data is built into R, but what about if it's not? And then what if this data was not this small? Maybe it was actually millions and millions of rows in length and hundreds of gigabytes in size. That definitely kind of changes how you're going to approach this data set.
Introducing Pins and Posit Connect
So that brings us to our first workflow. So rather than simply having a data set built into the Shiny application, we're going to take that data set and we're going to save it as something known as a pin. And this pin, we're going to pin it to something known as Posit Connect, which is one of our professional tools. This is kind of our publishing platform, which we'll talk more about here in a second. But let's talk a little bit about pins. Maybe some of you on the call have heard of it, and maybe some of you haven't. And that's okay. I think pins is a really underutilized tool which can help improve a lot of your workflows. So pins is an open source R package, just like Shiny, something that we've developed here at Posit. And what it allows you to do is publish, or pin, data, models, any other R object to a board, right? And that makes it really easy to share across projects and also with your colleagues. And so, like I just mentioned, you can pin these objects to boards, and these boards could be a variety of things. But for this workflow, we're going to leverage Posit Connect as our board. So just like you take a piece of paper and pin it to a cork board, you can take your data and pin it to a Connect board. It makes it easy to update your data, and you can version it. So it just makes your data a little bit more flexible.
So we're going to pin our data to Posit Connect, but we need an additional tool to basically house all the code in order to do this. And we're going to leverage something known as Quarto. So Quarto is something we're really excited about. It's a brand new tool that we announced at our conference back in July, I believe, June or July. And it's very similar to R Markdown. So if anyone on the call is familiar with R Markdown, you can basically consider it R Markdown 2.0. But it's really tailored to scientific and technical publishing. And what's unique about it, as opposed to R Markdown, is you can create these using whatever language you want. So you can use R, which is what we're going to do, but you can use Python, Julia, Observable, and you can use whatever IDE you want as well. So we're going to stick within the RStudio IDE, but you can also create Quarto documents using VS Code, Jupyter, or any other text editor.
And similar to Pins, you can also take Quarto documents and host them on Posit Connect and set them up for job scheduling, which is a really cool workflow and something we'll actually, I'll talk about here in a second.
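A minimal Quarto document for this workflow might look like the skeleton below. The title matches the name used in the demo, but the chunk contents are illustrative, not the exact file from the webinar:

````markdown
---
title: "NHS pin data"
format: html
---

Explanatory text written in Markdown goes here.

```{r}
library(pins)

# Extract the waiting-time column, as in the Shiny app
x <- faithful[, 2]
```
````

Rendering the document executes the R chunks top to bottom, which is what makes it usable as a schedulable job on Connect.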
Creating a Quarto document to pin data
All right, so what we're going to go ahead and do here is we're going to take our data set, that second column from the faithful geyser data set, and we're going to pin it to Posit Connect. And we're going to do that using Quarto. All right, so I'm going to come back here to the RStudio IDE. So we have our Shiny application. I'm going to go ahead and close out of this, and I'm going to open up a Quarto document. So you can see in these starter scripts, we have Quarto document. I'll go ahead and select this. I'm just going to call it NHS pin data, and hit create.
And here is our Quarto document. And you can see by default, it's leveraging our visual editor mode, which just makes working with these documents really nice and pretty. But you can also edit them using source mode, which looks much more like your typical R Markdown. But we'll stick with visual because I do think it's nice to play around with. It does come kind of pre-built with some code and some text in there, but I'm just going to go ahead and delete all of this so we can start fresh. All right, so let's go ahead and take that faithful geyser dataset, and we're going to create a pin and pin it to Posit Connect. So we're going to step through this bit by bit. The first thing we need to do is load our packages. All right, so I'm going to go ahead and insert an R code chunk here, and I'm going to load the pins package with library(pins). So because we're taking this faithful geyser dataset and pinning it, we need to make sure we have pins. I'm going to go ahead and run this and just make sure I have pins in my environment. Looks like it loaded just fine, but if it didn't, you'd just have to install it.
All right, after that, we're going to go ahead and filter and save our data. I'll add another R code chunk here, and, just like in our Shiny application, we're going to save our data as an x variable and assign it the second column of the faithful geyser dataset. So I can run this and just make sure that looks good. You can see in my environment pane I have x, and I can print it down here. Yep, that all looks correct.

So the goal now is to take this data and pin it to Posit Connect. That's going to be the next section here, pin to Posit Connect. So I'm going to go ahead and copy a few things from that GitHub repository I shared in the chat. And the first thing here is our board. So I mentioned that with pins, you need a place to actually pin your pins, and in this example, we're going to be using Posit Connect. I have the URL to our demo server of Posit Connect right here. We call it Colorado. I have no idea why, it's just what we call it. So this is the actual Posit Connect server we're going to be using. And you do need to supply a Connect API key just so that Connect knows who's pinning this data set. So we're going to use the board_rsconnect() function from pins to basically register this board. So I'll hit play here, and you can see it's connecting to Posit Connect, and that looks correct. All right, so we've now registered the board, that's good.

And now we're going to go ahead and write the pin, using a very intuitive function called pin_write(). All we have to do is supply the board, so we'll just leave that as board, the data set x, and then we can give it a name as well. So I'm going to go ahead and call it faithful geyser data. And that's it. So I'll go ahead and run that code chunk, and you can see it's going to write the data as an RDS file to the faithful geyser data pin. So that's it.
Think of this as like saving it to like a Dropbox or an S3 bucket. We're just taking a data set and we're saving it to Posit Connect so that others can use it or you could potentially use it in other workflows.
Now I'm going to switch over to Posit Connect here. I'm just going to refresh and show you what this pin looks like. All right, so here is that data set we just pinned. I can click on it. It's not going to show you much, but what it does show you, which is really helpful, is the code you need to import this data set into another script or another workflow: loading pins, registering the board, and, instead of pin_write(), using pin_read(). All right, so we're going to use this here in a second, but this is what a pin looks like once it's hosted on Connect.
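The pin write-then-read round trip can be sketched without a Connect server by using a local temporary board. Here board_temp() stands in for the board_rsconnect() / board_connect() call used in the demo, and the pin name mirrors the one given in the webinar:

```r
library(pins)

# Local temporary board as a stand-in for Posit Connect; against Connect
# you would register the board with a server URL and an API key instead
board <- board_temp()

# The same data the demo pins: the waiting-time column of faithful
x <- faithful[, 2]

# Write the vector as an RDS pin, then read it back by name
pin_write(board, x, name = "faithful_geyser_data", type = "rds")
y <- pin_read(board, "faithful_geyser_data")

identical(x, y)  # TRUE: the pin round-trips the data exactly
```

Swapping the board object is the only change needed to move from a local test to a shared Connect deployment, which is the point of the boards abstraction.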
Publishing to Posit Connect and job scheduling
Now actually, let me go back to this document really quick. So this is a Quarto document. Now this data set, it doesn't change. Every time I run this command, it's going to be the same data set. But again, think about what if your data changed every single day and you might want to rewrite this pin every single day so it's updated with that new data. So I'm going to go ahead and save this document. I'm going to call it test Quarto pin geyser. I'm going to hit save. Now the first thing I want to do here is I want to publish this Quarto document to Posit Connect. Just like we published a pin, we're going to publish it to Connect, but we're going to use a kind of a canonical publishing workflow. So we're going to click on this little blue button right here. And we want to publish this to RStudio or Posit Connect. Publish document with the source code. So if you want to set it up for job scheduling, you do need to make sure you include the source code. It's going to ask, you know, what Connect server. So we're going to use that Colorado Connect, which I mentioned before. We can leave the title the same. We're just going to publish the single Quarto document, which is that QMD ending. We'll hit publish. So if you've never published anything to Connect before, that's pretty much it. Once I hit publish, RStudio takes care of the rest. It's going to capture my environment. So what packages I'm using, what versions of those packages, what version of R am I using. It sends all that information to Posit Connect. Connect reads it, replicates my environment, and then publishes this Quarto document.
So let's give it a few more seconds to run.
And then once it's done, it should automatically pop open in Connect. And here we have that Quarto document now hosted on Posit Connect. And what I'll do first and foremost is I'm going to open this up to everyone here on the line. So I'm going to set the sharing settings to anyone, no login required, and hit save. I'm going to grab the URL here at the top, come back into the chat, and paste it here. So now everyone here on the line can see that Quarto document we just created.
And once we have it here, one of the important features I wanted to demonstrate is job scheduling. So you can see over here on the right-hand side, we have the schedule tab. And let's say I want to update this pin every single morning. So I can schedule it, select my time zone, when I want it to start, and run daily. I can run it every day, every weekday Monday through Friday, or every other day. And that all looks good. Hit save. And now this pin will automatically be updated every single morning at 8:41 AM.
All right, so just a powerful way to kind of improve your workflows, especially if you have data that needs to constantly be updated. You can set it up to run using Quarto. You can do this with R Markdown as well, and even Jupyter Notebooks.
Building a Plumber API for the analysis
Okay, so coming back to our slides, and apologies if I'm going a little quick here. I know we only have about 20 minutes left. But this is our starting point. So we had our Shiny application, and we had all of our analyses and the data within the Shiny application. And what have we done so far? Well, effectively, we've taken the data and moved it outside of the Shiny application. All right, so now this data lives in a pin hosted on Posit Connect, with the help of Quarto.
So let's move on to our analyses. And I mentioned previously, there's really only one analysis in the Shiny application, and that is the calculation of this bins variable. So you can see we're using the seq function: it takes the min value of the x data, so that faithful geyser data set, and the max value, so min and max, and it generates a vector of length input bins. So whatever that slider bar is set to, it's going to be that number of bins.
So what does this actually look like? So we have our Shiny application right here, and we have the number of bins set to seven. All right, we can see this number right here is going to be set to seven, and then we get this vector right here, which is actually of length seven plus one, and I'll explain why. All right, so there's bins, and this example of seven computes to this numeric vector. And there are actually eight values here, because they correspond to every single border of these bins. So starting over here on the left-hand side, this left bin edge, that's numeric value 43, and you have one, two, three, four, five, six, seven, all the way to the right-hand side, which is eight. And those all again correspond to this bin vector.
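The bins calculation just described can be run directly in base R. With the slider at 7, seq() returns 7 + 1 = 8 break points, one for each bin edge:

```r
x <- faithful[, 2]  # waiting times, minimum 43 and maximum 96 minutes

# Slider set to 7 bins -> 7 + 1 = 8 evenly spaced break points
bins <- seq(min(x), max(x), length.out = 7 + 1)
bins
# 43.00000 50.57143 58.14286 65.71429 73.28571 80.85714 88.42857 96.00000

# hist() then uses these values as the histogram's bin borders
hist(x, breaks = bins, col = "darkgray", border = "white")
```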
Not a very compute-heavy analysis, but just for the sake of conversation, what if these analyses were more compute-intensive? So you had this massive simulation or machine learning model you're computing, that could take a long time to run, and it could leverage a lot of CPU and memory. And what if you wanted to access the results using a different language? So maybe you've created a model, for example, but you want to use your model from Python or Julia or something like that. How could you do that? So we're going to do that using another tool called Plumber. Plumber is a way for you to create APIs using nothing but R code. So if you're like me, when I first started my R coding journey, the concept of an API was so foreign and so scary to me that I didn't even want to touch it. But Plumber makes things really easy. We're going to go through an example here of creating an API. And really, ultimately, what you're doing is taking your normal R code that you've already written, and you're decorating it. And I'll show you what the decorations look like here in a second. But the one thing you do need to know for creating a Plumber API is how to write an R function. We're going to go over that here now.
All right, so let me just go ahead, and we're going to create and publish a Plumber API. So I'm going to come back here to RStudio, and within RStudio Workbench, we're going to close out of this Quarto document. I'll clean my screen here using Ctrl-L. And let's go ahead and create a Plumber API. So starting with that same starter script dropdown menu, you can see we have Plumber API right here. So go ahead and click on this. I'll call this NHS. And the goal of this API is to compute the bins. So I'll just call that, and I'll put API here at the end. And hit Create.
All right. So this is an example Plumber API. Similar to when you create a Quarto document or a Shiny application using this dropdown, it has some stuff pre-populated in here, but we're not going to worry too much about that. So I'm going to go ahead and delete pretty much all of it except for library(plumber). There are also these comments up here, just for the author; we really don't need those either, so I'm going to go and delete those too. So we're really just starting with library(plumber). The first thing, as I mentioned before, is that in order to create an API using Plumber, you do need to write a function. The whole goal of this function is to calculate the number of bins for your histogram. So I'm going to just create an example function. We'll call it foo for right now, and we're going to use the function function to create this function. All right, that's a lot of "function"s. It's only going to take one argument, and that's going to be the number of bins. Then once you've defined your arguments, we basically open up the body of our function. So we use these curly brackets, and all the code you compute is going to happen within those curly brackets. The first thing we want to do is obtain that pinned dataset. I mentioned before we have that pin on Posit Connect, so let's go ahead and access that.
Now before we do that, we first have to connect to that board again. So I'm going to come up here and copy and paste some code. We're going to library pins, making sure the pins package is loaded, and this is basically the same code that we had in our last Quarto document; we just want to make sure we have the Connect board registered. Once we have that, we can read in pinned data, so we'll do that in the next step. For this dataset, we're going to stick with x, and we're going to do pins::pin_read, which is very similar to pin_write. To pin_read we pass the board, and then we do need to give it the name of our pinned dataset. So if I come back here, you can see this is the name of our pinned dataset: it's my first name followed by the name we gave it. So go ahead and copy that and paste it here. All right, so that's going to pull in the data from the pin on Posit Connect rather than pulling it from that built-in dataset. Now once we have that, we want to calculate the bin breaks. This is the code we extracted from that Shiny application: we're going to use that seq function, we're going to find the min value of x and the max value of x, and the length out is going to be the number of bins, so n_bins. And I just want to make sure this is numeric, because one important thing with APIs is that inputs are sometimes fed in as character values, so I want to make sure this is converted to numeric.
All right, and then we also want to make sure we add that plus one, since the breaks include both borders of every bin.
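Putting those steps together, the function might look something like the sketch below. The pin name `"ryan/nhs_data"` is a placeholder (the real name is your Connect username followed by the pin name), and it assumes the pin stores the numeric waiting-time vector; adapt both to your own Connect setup:

```r
library(pins)

# Register the Posit Connect board (by default this reads the
# CONNECT_SERVER and CONNECT_API_KEY environment variables)
board <- pins::board_connect()

foo <- function(n_bins) {
  # Pull the pinned data from Posit Connect instead of the built-in dataset
  # ("ryan/nhs_data" is an illustrative pin name)
  x <- pins::pin_read(board, "ryan/nhs_data")

  # Calculate the bin breaks; API inputs can arrive as character strings,
  # so coerce n_bins to numeric, and add one to get every bin border
  seq(min(x), max(x), length.out = as.numeric(n_bins) + 1)
}
```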
All right, and that should be pretty much it. So let's make sure this works as intended. I'm going to go ahead and source this foo function, and if I look in my environment pane here, you can see the function foo is now in there. And I can try it out down here. So let's run foo(7). All right, we get a little message here, don't worry too much about that, but you can see we have returned the numeric vector of our bin breaks. So that works out well; this function is performing as we intended. I'm going to go ahead and delete the name
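For reference, once the function works, decorating it for Plumber might look like the following. The `/bins` route name is my own choice for illustration, not from the session, and `"ryan/nhs_data"` is again a placeholder pin name:

```r
library(plumber)
library(pins)

board <- pins::board_connect()  # reads CONNECT_SERVER / CONNECT_API_KEY

#* Compute histogram bin breaks from the pinned dataset
#* @param n_bins Number of bins requested
#* @get /bins
function(n_bins) {
  x <- pins::pin_read(board, "ryan/nhs_data")  # placeholder pin name
  seq(min(x), max(x), length.out = as.numeric(n_bins) + 1)
}
```

With this file saved, clicking Run API in RStudio (or calling `plumber::plumb()` on the file) serves the endpoint, and a request to `/bins?n_bins=7` would return the bin breaks as JSON.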
