
Hands-on Session: GenAI to Enhance Your Statistical Programming - Phil Bowsher & Cole Arendt

video
Oct 12, 2024
38:47


Transcript

This transcript was generated automatically and may contain errors.

Hey, this is Phil and Cole. We're excited to do a live session for you today. Hey Cole, how's it going? I think we are live. So let me go ahead and share my screen, and we're going to talk through a couple things today, set the context, and then jump in as fast as we can. We only have 30 minutes today, so we're going to go pretty fast, try to give everyone a chance to play with some of these new and exciting tools that are in the space. So let me go ahead and share my screen.

So back in 2023, so about a year ago, I went to Cole and said, hey, there's a conference coming up. Why don't we do a talk together and focus on all the exciting ways that GenAI is impacting the statistical programming and the pharmaceutical space. And so we got together and we wrote a paper. I'll put it in the chat box. It's called AI Exploration and Innovation for the Clinical Data Scientist. This was presented at the PHUSE conference, I think back in the February, March timeframe. And you can find the paper here, and I'll put this in the chat so everybody can get to it.

And what we did in this paper is we broke down all of the tools that people were using at that time to aid programmers. And there's a lot of interest right now in how we can support programmers, especially those coming from commercial software, and aid them in the transition over to open source, especially R, using Python and tools like that. And so that was the perspective of the paper. But we decided as we were writing it that there were two new and exciting areas that we jumped on. One was the idea of, can you use local models in addition to the public LLMs? And then also, if you do that, or even if you build and have access to other public models, how do you have an interface into that? And so we have some sections in here around creating Shiny interfaces as well as Streamlit interfaces for doing that.

And so over the course of, you know, four or five months this year, Cole and I gave a couple different talks about these topics. And what they kept converging on was the power of Shiny for interfacing into LLMs. And so, Cole, I know this is something we've seen a lot with groups that we work with and with people in the public. There was also Joe Cheng's talk at the Posit Conference a couple weeks ago, and Winston spoke about this. So why don't you take the attendees through this diagram that you put together and explain it a bit?

Yeah, this is my crazy brainchild. But the thing that I thought was really interesting is that OpenAI and ChatGPT went crazy viral, to the point that people who know nothing about software and technology and statistics were talking about it. And I think the thing that they really nailed is this little triangle that I developed here. It wasn't just that they created a cool model; that's the bottom right corner. They did create a cool model, right? They've made a lot of good progress there. But the thing that they really nailed is they created an interface that people could build on top of, and that's the top of the triangle. They created an API that folks can build against when they're building their own software tools. It's consistent, it's structured, and it's useful if I wanted to build some app that used this model.

And so that, I think, is part of what made them enormously successful, and I've done this before: you build a tool that uses their model through that API. And if they improve the model, all I do is click a button and my tool gets better. I don't have to do any rework. So that extensibility and that platform kind of structure was really big. But then the other thing that they really nailed, and this was part of the virality of it, was the user experience: this chat app where you could talk to the model and be like, wow, this thing's really smart. That was, I think, really innovative, and it was what kind of made the power of the model click for everyone.


And so these three things all kind of work together really nicely. And this is kind of the worldview that we think about a lot when we think about data products, that you want to think about how your users are interacting with what you've built. A lot of times they don't care, you know, how you built the model or how you tuned it. They want to have an interface that makes sense and provides value to them. But then the other thing is to build these interfaces so that you don't have to rebuild your UX every time the model changes. And so that's kind of the API interface and more kind of structured interactions. So anyways, all three of these go into a good data science ecosystem, in our opinion.
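The structured interface described here is, at the wire level, just JSON over HTTP. As a stdlib-only sketch (not code from the talk; the endpoint path follows OpenAI's chat-completions convention, and the model name and URLs below are placeholders), talking to any OpenAI-compatible backend looks like this:

```python
# Sketch: the "API corner" of the triangle. Any backend that speaks the
# OpenAI chat-completions wire format -- OpenAI itself, a proxy, or a
# local model server -- works with the same client code; only the base
# URL changes. Model name and URLs are placeholders.
import json
import urllib.request

def build_chat_request(base_url: str, prompt: str,
                       model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build the POST request for a single chat turn."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(base_url: str, prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(base_url, prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request shape is fixed, an app built against this interface keeps working when the model behind the URL improves, which is the extensibility point made above.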

Shiny for Python and LLM interfaces

And I think, as the talks evolved, what Cole and I did was focus on the landscape for GenAI and the open source ecosystem. Then we migrated that over into a hands-on workshop. And the last talk that we gave, in June, was about how you use the tools that are on my screen for building Shiny applications. We made videos of almost all of these, so we're going to leave those to you. And if there's some time in our 30-minute session today, we'll try to show you how to get up and running with some of these. But what we want to focus on today is this UI space: how do we create custom interfaces into LLMs, into models?

And so there's a lot of great ways to do this. There's Shiny for R, which has been around for many, many years. It's really changed the game and it's having an awesome impact on the pharmaceutical space; if you've seen Teal or other tools like that, it's really, really great. Oftentimes there's also interfacing with Python, and you can use an R package called reticulate to help with the interoperability of Python and R working together. But what's exciting and what we want to talk to you about today is Shiny for Python. Shiny for Python came out, I think, about a year or two ago at our conference. And so what we want to talk about today is how you get up and running creating a Shiny for Python application, especially in this realm of LLMs and interfacing with custom LLMs.

Maybe it's your own backend that you've built using the cloud and LLMs. Maybe you have local models that you're wanting to interface with. Regardless of what it is, Shiny is a great UI experience, and you can do that both in R and in Python. And that's what we're going to look at today. So, Cole, with that, why don't we transition over to your repository and show them how they can get up and running with it and play with it today using some of the new tools at their disposal.

Getting started with GitHub Codespaces

All right. So there's a GitHub repo that we want to start with, and in order to do the hands-on part of this workshop (again, 30 minutes is a pretty quick turnaround), you are going to need a GitHub account. The URL is github.com, slash my name, Cole Arendt (sorry, the last name is a little different), slash shiny dash chat.

And there are two examples here that we'll get started on. But as most of you probably know, you need a place to work, and that's a kind of pernicious problem with demos and workshops and stuff like that. And so what we're using today is Codespaces. So you see this button here at the top of the repo for opening GitHub Codespaces. If you click on that, it will use your GitHub account and say, hey, do you want to open this in a Codespace? I've already done this, so I have a Codespace that I can resume; you probably will have to create a new one. The default settings are perfect. And this is a GitHub product; there's a free tier, and then I'm sure there's paid stuff too. But basically the idea is it allows you to use VS Code directly in your browser.

And so that's what I'm going to do. I'm going to resume the Codespace. It's actually loaded right over here, but I'm just going to reopen it. And you should see something very much like this when you get there. Yours is going to take a little bit longer, unfortunately, because, again, I had the foresight of doing this. It takes like three to four minutes, and basically what happens is GitHub allocates you a URL. It'll look something like this, some prefix of names with some gobbledygook. And this is your private interaction environment.

There is some shared stuff that you can do. And then what it's going to do is it's going to provision you a place where you're running this environment. It's going to clone the repository and all of its code. It's also going to do some work for you on installing the Python packages that we're using, as well as the VS Code extensions. So if you're not familiar with VS Code, there's plenty to learn here. But we added some Python extensions, the Shiny extensions, which are fairly new. And then we also added something to provision the Python packages. So it should take just a few minutes.

I don't really have any method of feedback to know if anybody's running into trouble, so let me use my instincts here. Maybe give everyone a minute or two, especially if they're logging into GitHub or if they're new to GitHub. So a couple of quick questions and things to point out here, Cole. VS Code is this extensible IDE, and probably a lot of people on today's session are used to either using some type of commercial IDE or RStudio. Maybe explain a little bit: why are we using this for Shiny for Python?

Yes. So VS Code is an IDE written by Microsoft. There is a licensed version of it from Microsoft, and then there's an open source version. And so VS Code is a really great IDE. It's good for Python. It's common in the Python ecosystem. And so that's kind of the primary reason. RStudio does have a bunch of Python features, but it is definitely not a Python-first kind of ecosystem. But the other thing is that VS Code is something that can be built on top of. And so that's where you have all of these extensions.

So to answer the question there in the chat: you can install extensions if you want. There are actually some cool features inside of GitHub where you can install a bunch of your default extensions for your user and stuff like that. But VS Code is sort of bare without extensions, and so you add all these extensions. And that's something we at Posit are working towards: trying to make an IDE that has all of the goodness of VS Code without the kind of DIY, build-your-own-IDE experience. And so that's Positron.

So I don't know if you wanted to chat about that a little bit, Phil. Yeah, I think the key thing is that when I go into RStudio, it's kind of what I need it to be, and when I go into VS Code, I kind of need to make it what I want it to be. We're working on that with Positron. It's a new IDE from Posit. So I do think, probably going forward, there'll be an IDE like this that we'll use, and I'll put that in the chat box as well. Currently, it's supported as a desktop version. And it will be a very similar experience, but whereas VS Code is built mainly for software development and software engineers, we're taking that and building this for data scientists so that they can have that multilingual experience.

Exploring Shiny for Python examples

So the other thing, this is where we wanted to transition to the LLM world. The interface that everybody's familiar with is the chat interface, and so that's what we're going to do. And this is, again, one of the huge benefits of open source, right? I'm totally just stealing a package that one of my teammates built: Winston Chang, if y'all know him.

So, yeah. So this chatstream package is a user interface for building a chat application, and it's what everybody's used to. The idea is you can do it yourself; what you have to do is just wire up a back end. And so that's the piece here. You might be like, wow, can we really do something meaningful in 20 lines of code? Yes, you can. But we need something else first, and the first thing is that we need an OpenAI URL.

So you should run this line of code in the terminal if you're following along with us. Basically, the idea is, remember, we talked about that little triangle. There's an API that the user interface we're about to build can talk to, and that has a model behind it: it has OpenAI behind it, ChatGPT. But I removed all the authentication; I'm paying for your API requests for the next 15 minutes. And so if you do that, you should get a little .env file here. That's going to get loaded up, and we're going to use this magic URL to use ChatGPT.
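For reference, the .env file mentioned here is just KEY=VALUE lines, loaded so the app can read the URL from the environment. Real projects typically use the python-dotenv package; the following is a stdlib-only stand-in (the variable name OPENAI_URL matches the demo, but the value is a placeholder, not the demo's magic URL):

```python
# Sketch: loading a .env file into the process environment so the app
# can pick up OPENAI_URL. A stand-in for python-dotenv, not the repo's
# actual loading code; the URL value is a placeholder.
import os

def parse_dotenv(text: str) -> dict:
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

env = parse_dotenv("# written by the setup script\nOPENAI_URL=https://llm.example.com/v1\n")
os.environ.update(env)  # the Shiny app can now read os.environ["OPENAI_URL"]
```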

So the idea is, once you have that .env file and you go to run your Shiny application, you should get a chat interface with your model. And remember, I made the choices about where the API is, where the model is, what model we're using. I made all those choices for you. But the whole idea here is that you're in control of the interface. And so if you were doing this in your organization, you could swap out what model is being used. You could swap out where the API is, whether it's hosted inside your organization or outside it.

So I got it. I just got no CSS. So you can see the model is working. I just don't have any CSS loaded. So it's not as pretty as it should be. That's amazing. It was working fantastically earlier. But you can see I'm interacting here. I'm talking to a model.

So yeah, this is the idea, right? I wrote code, and it pulled in this package that builds a chat interface with a model. Where is that model? It's behind an API, an API that I'm hosting that has ChatGPT behind it. So if you want to do this after the next 15 minutes, you can go into the readme of this repo, and it talks about setting up a direct connection, like talking directly to OpenAI and their hosting of the model. In order to do that, you have to have an API key and an account and all that shenanigans.

So anyway, so that's the case for this. But this is still only in my local development environment. I can use this thing, but nobody else can. And so what we wanted to show you also is how to deploy this thing so that anybody can use it.

Yeah, I think the idea here is that you're at a pharma, and you have built some really cool stuff on the back end. Maybe you're using tools at AWS, or you have your own vector database, or maybe you have a local model, or something from Anthropic, or maybe you're using ChatGPT. And you want to do what ChatGPT did, what OpenAI did: you want to build a simple interface for employees at your company to use the power that you have on the back end. And Shiny for Python is a great interface and tool to do that. So what we want to show you today is that, without much sophistication, you can build an interface into these back ends that you have inside your company, or in your partnerships or relationships with these larger GenAI companies.


Deploying to Posit Connect Cloud

So if you go to connect.posit.cloud, there is a little bit of a login dance you have to go through. If you haven't been before, you will have to go through that, and you have to go through GitHub. So again, you're using your GitHub account. And the reason we do that is because of how you publish. This is what I was saying: we don't necessarily need a publishing extension; we can publish directly from Connect Cloud.

And what you do is you go in here and say, hey, I want to deploy a Shiny application. And then you choose your repository, and you can see I can deploy any of the different entry point files inside of the repository. So I want to deploy my chat application. And you can choose your version of Python, all that jazz. I'm also going to set my OPENAI_URL here with my magic URL.

So one thing I want to point out is I just put the link to the publishing tool that we're going to use in the chat box. So if you'd like to try it out, the quickest way since we've got about eight minutes to play with this is to use your GitHub credentials. So you can log in with the GitHub credentials. That's right, Cole.

And then, I think this is really cool. Once you've got that page, all you have to do is point it to Cole's repository, make sure it's pointed to the main branch and the app.py file, and then set the configuration variable: that OPENAI_URL that we set in the terminal, we're going to set here in this interface. And then, Cole, without further ado, go ahead and kick it off.

Yeah. Yes, we can publish, and now we're sharing this out with the world. And I really love how stinking fast this thing is. Anybody who has sat around watching dplyr compile will just absolutely love how fast this is, because it's done already, which is insane. And so now this is a URL that anybody can use.

So cool. Why don't you see how it does at creating a Shiny for Python app.

Yeah, so I will. I will confess, I played with this earlier today. And it struggled. It kept creating Streamlit apps and Dash apps. Yeah, see, it wants Dash. But if it makes you feel any better, I also found some other funny things. And, oh, wow, it got it. Good job. But check this out. It's not very confident, though. It was convinced it was Saturday, the last time I was talking to it.

So I think this connects back to the beginning, where at the very beginning of the workshop, we showed five different ways to create Shiny apps with GenAI. One of those, and probably the most popular one, is Copilot. And I'll put a link in the chat box on how to set that up inside of RStudio. Originally, we were going to show that, but it's pretty straightforward: you just go into the global options and enable it, and then you have a GenAI programming assistant right inside of the RStudio environment. There's another tool called chattr that lets you use OpenAI. But, you know, with what we mentioned about Positron earlier, there'll probably be some nice extensions for that too.

Wrapping up and key takeaways

I really want to highlight this. There are a couple of things, I think, you could take away from this. There are probably people in the audience who are like, well, it's not like we did that much, right? And you can't do that much in 30 minutes. But the key is that what we're trying to show you is building blocks that you can then use to build something awesome. The idea is you bring the awesome to that equation, right? But these building blocks are really good tools.

And somebody made a comment about, you know, can this be used with such and such model? And the thing I want to highlight is, again, this is just using a structured interface that OpenAI made popular. Any model that uses that interface, or any random API that somebody like me wrote that happens to use that interface, which is what this is, it's a random thing that I cooked up one night, will work. You just swap out that URL and point it somewhere else. It's like, hey, this is my other API that is a front end, a wrapper around Bedrock or whatever other model you want to use. And so the idea is to identify these building blocks and then piece them together to build something really cool that makes an impact on your company, the world, whatever. You've got to bring the ideas, though, because these are just building blocks that you can then put together.
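The "just swap out that URL" point can be made concrete by treating the endpoint purely as configuration. A small hypothetical helper (the function name and all URLs are illustrative, not from the talk):

```python
# Sketch: the endpoint as configuration. The UI never hardcodes a
# vendor; deployment chooses the backend by setting OPENAI_URL.
# All URLs here are placeholders.
import os

def resolve_chat_endpoint(base_url: str) -> str:
    """Normalize a base URL into its chat-completions endpoint."""
    base = base_url.strip().rstrip("/")
    if not base.startswith(("http://", "https://")):
        raise ValueError(f"not an http(s) URL: {base_url!r}")
    return base + "/chat/completions"

# Swapping OpenAI for a Bedrock wrapper (or a local model server)
# is a one-variable change, not a UI rewrite:
endpoint = resolve_chat_endpoint(
    os.environ.get("OPENAI_URL", "https://api.openai.com/v1")
)
```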


I think that where we'd point you next is that Joe and Winston gave some amazing talks at the Posit conference that we had a couple weeks ago. I don't know if those videos are made available yet, but I think the plan is to make them freely available on YouTube and definitely watch those. I've been interacting with pharmas a lot the last couple weeks and they have been starting to use those concepts and tools, using Shiny for doing data analysis and also using Shiny to help you build Shiny apps. And so there's some really great talks.

But hopefully today highlighted how you can get going creating Shiny for Python apps and then also how you can use that in the space that you have around Gen AI and LLMs, which seems to be a popular topic. So I think we have about two or three minutes for questions. And if anybody wants to post those in the chat, I can ask those out loud for Cole and myself, and then we will pass it to the first speaker for today.

Yeah, so somebody, I think, is asking about the cloud hosting. I know that connect.posit.cloud is pretty much brand new, Cole. It's free, I think, right? There's a free tier, and maybe paid tiers also. Do you know much about that yet? I don't. Yeah, I'm pretty sure it's free right now. Same thing with where we were developing earlier; this has a free tier as well. It's hosted. The thing is, with most of these services, you're using somebody else's computer, basically. If anybody's ever seen that sticker: there is no cloud, it's just somebody else's computer. So a lot of times they'll let you use their computer up to a point. When you start trying to train a thousand models and that kind of stuff, they're like, hey, pony up, it's time to pay. So anyway, there's definitely a free tier.

Connect Cloud right now is free because it's in alpha. But yeah, lots of good tools out there, lots of things that you can download for free onto your desktop. You do want to be careful with hosted services, because you don't want to rack up a big bill or anything. I think the thing here, Cole, is we wanted to show you what it's like to publish, but most people inside their company are going to have strict controls around publishing and how they do that. So you probably want to build these Shiny for Python apps in a managed space where you also have a publishing environment or platform that you use. But hopefully this gives you an idea of how you can ship things off onto the web, whether it's internal or external.

And somebody asked, Cole, how do I start the app after I created the .env? Yeah. So when you want to run the app, once you have the .env file, there's this helpful little run button, and that is made possible via the Shiny extension. So if you don't have that extension installed, you want to install it, and it'll recognize that, hey, this is a Shiny app, and you can run it. But if you notice what happens here, it's just a Python command, right? python -m shiny run, and then a bunch of arguments. So you're using Python to run the Shiny app, but the little run button makes it a lot easier.

I've got two quick comments here. So someone asked about how secure is the private data? Because a lot of times, the idea with LLMs is working with your own PDFs, your own documents. And that's why Cole and I originally explored the local models, but some organizations now have relationships with the big tech companies where they're doing these types of things. So typically, inside your company is where you're going to have a lot of this software in managed VPCs, virtual private clouds. So the security is just going to be inherited based on the IT team and how they set things up. And usually, you just take Posit software there or wherever you're managing things in-house.

So the next talk is starting; I think Cole and I can hang out while everybody transitions over. The talk that Cole and I gave at PHUSE sparked a lot of this and was originally the idea for the conference, and the next speaker, from Roche, is going to talk about what Roche is doing internally to create chatbots to help people with programming. So I'd encourage you to leave this session and jump over to the next one, where we will be kicking things off for the conference today. And thank you so much for coming. Hopefully, this gave you some cool ideas on how you can use Shiny. And we will see you in October for R/Pharma coming up. So thanks a lot. Thanks, y'all. All right. See you later.