Workflow Demo Live Q&A - November 27th!

video
Nov 28, 2024
24:06

Transcript

This transcript was generated automatically and may contain errors.

Hey everybody, thank you for jumping over here to the Q&A room, we'll give people another minute here to join us.

Alright let me bring Nick over here on stage. Hey Nick, can you hear me okay? There we go. Perfect, well let's give people maybe 20 more seconds to jump over. And Isabella and Hannah, thanks for helping out backstage, if you wouldn't mind going to let people know in the demo room where we are for Q&A as well.

Awesome, well we can get started here and people will join us as they do, but thank you all so much for joining us for today's Workflow Demo and happy early Thanksgiving to everybody this week. Thank you so much to Nick for such a great session. As a reminder, we do host these Workflow Demos the last Wednesday of every month and they are all recorded, so I'll share the link in the chat if you want to add the recurring event to your calendar.

Just a note, next month won't be on the last Wednesday of the month because of Christmas, so it will be the week before, but there are now over 20 different workflows that we've featured, from building model annotation tools, to pins workflows, to beautiful business reports, and so you can go and access all those recordings in the YouTube playlist as well. And once I am done with my spiel here, I'll put those links in the chat. But while I know many people joining today are current customers, if you're new to Posit Team and Posit Connect, which we saw today, and you'd like to try it out for free or want to chat more with our team, I'll also share a link where you can book a call to chat with us, but feel free to DM me as well.

I know a lot of times in big organizations, you might not know that your company already has our products in one department.

But let me jump into some of the questions here, and I just want to let you know, you can use the YouTube chat here to ask questions, or if you prefer to ask a question anonymously, you can use this link here on the screen and you can go and ask that question over in Slido.

Why Orbital stands out

Okay, so let me jump into some of the questions, Nick. But I first have to ask you, because you said in the beginning, Orbital is the coolest package you've ever seen. Why is it the coolest package you've ever seen?

That's a great question. I think it's because it really solves a problem that I ran into a ton when I was a data scientist and when I was running data science teams, which is that previously it took absolutely forever. I mean, it's just a lot of extra work. Once you fit a model, to deploy it somewhere you've got to find a place to deploy it, you've got to create a deployment package, and you've got to create a way that the results of that model can be given to your stakeholders. There are certain tools that make that a little bit easier than others, things like Posit Connect. But again, it's another step. Whereas with Orbital, it's one line of code. I mean, you can basically fit a model and have it in production in a couple of minutes, which is super cool.

And I think it's really going to level up teams' ability to take the great models that they're building every day and have them out to their stakeholders the same day, which is super cool.

Handling new data and database compatibility

So let me jump over to a few questions that were asked earlier in the chat during the demo. Somebody said: in the distributed workflow using Orbital, how does new data get entered into the analysis? Is it in another database that uses the same variable names?

Great question. Yes, you can absolutely do something like that. You can create a new table in your database with the same variable names and run the same Orbital-generated SQL code against that. Absolutely possible. One of the cool advantages of what I showed you in that demo, deploying the model as a Snowflake view, is that Snowflake views, or really any SQL views in any SQL flavor, are computed basically when you call them.

So they're not tied to specific data. They can just be run on basically any data. So one of the really cool things about having your model as a Snowflake view is that you could append new data to that table that I showed you. You could just be adding new data to that table consistently, and every time you call the view, it's going to run on all the data in that table. So that gives you a couple of different options to deploy it, whichever one makes sense for you.
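
The point about views being computed when they are called can be illustrated with a minimal sketch. It uses Python's standard-library sqlite3 module as a stand-in for Snowflake, and the table name, column names, and coefficients are all made up for illustration. The SQL is hand-written and is not Orbital output.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER PRIMARY KEY, income REAL, debt REAL)")
conn.executemany(
    "INSERT INTO loans (income, debt) VALUES (?, ?)",
    [(50.0, 10.0), (80.0, 20.0)],
)

# A hand-written linear model stored as a view; the coefficients are hypothetical.
conn.execute(
    """
    CREATE VIEW loan_predictions AS
    SELECT id, 1.5 + 0.02 * income - 0.05 * debt AS pred
    FROM loans
    """
)


def count_predictions(connection):
    """Return how many rows the prediction view currently yields."""
    return connection.execute("SELECT COUNT(*) FROM loan_predictions").fetchone()[0]


before = count_predictions(conn)

# Append a new row to the underlying table; the view definition is not touched.
conn.execute("INSERT INTO loans (income, debt) VALUES (?, ?)", (120.0, 5.0))

after = count_predictions(conn)
latest_pred = conn.execute(
    "SELECT pred FROM loan_predictions ORDER BY id DESC LIMIT 1"
).fetchone()[0]

print(before, after)  # prints: 2 3
```

Because the view stores the query rather than its results, the appended row is scored the next time the view is queried, with no redeployment step.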

Okay, maybe a quick question here. Somebody asked, Can I do this in Python?

No, this is R-only functionality. The big reason behind that is simply that a lot of the magic of Orbital comes from the tidymodels universe. Tidymodels provides an incredibly awesome, very opinionated framework for how you construct models and the preprocessing steps. And that really opinionated framework is what let us build, and specifically, I shouldn't say us, let Emil build Orbital and build it to do all the magic you just saw.

However, doing it in Python is something that's actually very interesting to us. If you have questions about that, feel free to reach out to us. You know, we're constantly looking for new ways to push the open source frontier in both R and Python. And that's something we're interested in doing.

Okay. Another question asked earlier was: we do not have Snowflake, we're using Redshift. Could this be applied to Redshift as well? Any drawbacks in using Redshift?

Yes, you can absolutely do this in Redshift as well as Snowflake, because Orbital translates to standard ANSI SQL. So you could do this in Redshift, you could do this in Postgres. Really, the only drawbacks are going to be the drawbacks of using Redshift versus Snowflake for the rest of your data modeling work or the rest of your data warehousing work. But absolutely, you can use any database flavor you're already using.

Okay, so I think that reiterates this question as well. So you said you can use any database flavor that you are interested in.

Yeah, that's a great question. The only thing that I would say, and I haven't had a chance to personally test it: again, Orbital is generating ANSI SQL. So basically, that's going to cover most SQL flavors or most database backends. I would be cautious around using it with things that go away from that kind of general SQL. I'm thinking of things like Apache Presto, which provides SQL, but it's a little bit of a different SQL flavor. It may work, but I don't want to promise that it will work.

Champion vs challenger and speed gains

Okay, thank you. Lots of great questions coming in. I'm trying to just copy them over as captions here. Another question on Slido was: can it support tagging models in a champion versus challenger approach, so that a retrained model is only versioned and deployed if it is better than an older model?

Great question. I really encourage you to check out the blog post on that one. We actually have an exploration of how you would do a champion versus challenger evaluation using scheduled content in Posit Connect. But yes, absolutely, you could do that. We go into a little bit more depth in the blog post; I didn't have time for it in this presentation.
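
As a rough illustration of the champion versus challenger idea from the question, here is a minimal sketch in plain Python. It is not the approach from the blog post and does not use Posit Connect or Orbital. The function names, the choice of RMSE as the metric, and the `min_improvement` threshold are all assumptions made for the example.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pick_model(actual, champion_preds, challenger_preds, min_improvement=0.0):
    """Return "challenger" only if it beats the champion by more than min_improvement.

    Both models are scored on the same held-out data; ties keep the champion.
    """
    champion_rmse = rmse(actual, champion_preds)
    challenger_rmse = rmse(actual, challenger_preds)
    if challenger_rmse < champion_rmse - min_improvement:
        return "challenger"
    return "champion"


# Hypothetical held-out values and predictions from each model.
holdout = [1.0, 2.0, 3.0]
current_model = [1.5, 2.5, 3.5]
retrained_model = [1.1, 2.1, 3.1]

winner = pick_model(holdout, current_model, retrained_model)
print(winner)  # prints: challenger
```

A scheduled job could run a comparison like this after each retrain and deploy the retrained model only when it comes out as the winner.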

One I wanted to ask you is: what are the speed gains I can expect from my models?

I will say it's going to depend specifically on the models you're looking to deploy, the data you're looking at, and what you've been doing previously. I will say, just anecdotally, the speed gains we've seen from some of our customers who've already adopted this are phenomenal. I've been working with a customer who was previously using a very large server to fit their models, and on that very large server it was taking on the order of a couple of hours to fit specific models. By moving to Orbital, and specifically they were using Snowflake as a database back end, they saw some of those models, which previously had taken about three hours, go down to running in about 24 seconds.

So it's a really cool capability, and a lot of that comes from the ability to basically package up all of the feature engineering, all the preprocessing steps that need to go into your models, and the actual model fitting itself, and then do it all within the database. So you're not incurring the latency, the network cost of moving data from a database to a server and then back again. It's all just happening in that database, so it runs really fast.

Demo code and supported models

Okay, let me see. Another question was: could you possibly share the demo .qmd file that you went over in this presentation?

Yeah, we're actually working on creating a public-facing GitHub demo asset, something where you can download the full QMD. I will say, if you want to grab the raw code, a lot of that raw code is in the blog post. So if you want to check out our blog post, which we've linked, please check that out. Feel free to grab any of the code from that you want. Great place to get started.

And does Orbital encode GAM or GM models into SQL? And I might need to know what that is.

It does not right now. We'll actually link this in the chat. If you go to the Orbital documentation page and go up to Articles, you can see a list of currently supported models. We are exploring adding more models to that list right now.

One other question from earlier was: when would I use this over deploying a model endpoint to Connect as an API?

Awesome question. Orbital really provides an excellent way to do batch prediction workloads. Batch prediction being: you're running a large set of predictions all at once, and you don't need those predictions to return in real time. So you're not making a decision immediately, like while a customer is doing something. If you need something for that kind of real-time prediction, real-time inference, you might have heard it called online inference, that's when something like hosting a model as an endpoint on Posit Connect is a really great option, because that lets you do single predictions really, really quickly.

Is there a reason to use tidypredict over Orbital?

Orbital actually builds on tidypredict. The actual converting of the model itself and all the model coefficients into SQL comes from the tidypredict package. What Orbital does is extend the tidypredict package to also support encoding preprocessing steps and feature engineering steps into SQL as well. I can't think of many situations in which I would want to use tidypredict over Orbital, because feature engineering is such a key part of model development.
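
To make the idea of turning model coefficients plus a preprocessing step into SQL more concrete, here is a minimal sketch in Python. It is not how tidypredict or Orbital are implemented. The function `linear_model_sql`, the column names, the coefficients, and the scaling values are all hypothetical, and SQLite from the standard library is used only to check the generated expression.

```python
import sqlite3


def linear_model_sql(intercept, coefs, scaling=None):
    """Build a SQL expression for intercept + sum(coef * feature).

    `scaling` optionally maps a column name to (mean, sd); that column is
    standardized inline, so the preprocessing step lives in the same SQL.
    Column names are assumed to be trusted identifiers.
    """
    scaling = scaling or {}
    terms = [repr(float(intercept))]
    for column, coef in coefs.items():
        if column in scaling:
            mean, sd = scaling[column]
            feature = f"(({column} - ({float(mean)!r})) / ({float(sd)!r}))"
        else:
            feature = column
        terms.append(f"(({float(coef)!r}) * {feature})")
    return " + ".join(terms)


# Hypothetical model: income is standardized before its coefficient is applied.
expr = linear_model_sql(
    intercept=2.0,
    coefs={"age": 0.5, "income": 3.0},
    scaling={"income": (50.0, 10.0)},
)

# Check the generated expression against one row in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (age REAL, income REAL)")
conn.execute("INSERT INTO people VALUES (40.0, 70.0)")
predicted = conn.execute(f"SELECT {expr} AS pred FROM people").fetchone()[0]

print(expr)
print(predicted)
```

For the single row above, the expression works out to 2 + 0.5 * 40 + 3 * ((70 - 50) / 10), so the preprocessing and the prediction happen in one query inside the database.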

On the Orbital documentation you can actually find a list of which algorithms are supported. Right now it's limited to regression models. I know we're actively working on including classification models as well. And again, I definitely encourage you to go check out the full list, but we support what I tend to think of as the most common regression models. So things like linear models. We support random forests. We support XGBoost. So a lot of the common regression models you find in use right now are supported by Orbital. And adding to that list is something we're actively exploring.

Closing and introductions

So while I'm going to see if there are any questions that I missed, I want to ask a question to all of you. And feel free to use Slido to answer if you don't want to put it into the YouTube chat. But as I mentioned, we host these workflow demos once a month, the last Wednesday of every month. And I'm just curious, what different workflows would you all like to see?

OK, thank you so much, Isabella and Hannah, again for helping me in the background here. Are there any questions that I may have missed?

Let me do one more check here into the chat. And Nick, I know you introduced yourself in the very beginning when you were giving the demo, but not when we all came into this Q&A room. And I'll introduce myself as well. But if we haven't had a chance to meet, I'm Rachel Dempsey. Sorry for not introducing myself in the beginning. I lead customer marketing here at Posit. And so I host a variety of different events where we're bringing customers in the community together to share workflows and use cases with each other.

So I did just want to call out, I also host a data science Hangout that we host every Thursday, except for tomorrow because of Thanksgiving. But each week, we're joined by a different featured leader from the community to answer questions from you all as well. So if you haven't been to that, I want to personally invite you to join us, and I'll share that in the chat. But Nick, do you want to maybe reintroduce yourself as well if anybody ended up in the chat and wasn't in the demo?

Yeah, I'm Nick Pelkin. I'm a senior solution architect here at Posit. So what I do is help our customers find the best way to leverage the Posit tools and our partners like Snowflake to best achieve their data science goals.

OK, really quickly, I'm doing one last check for questions. And the only other one I see that hasn't been answered is: can we use Orbital in RStudio without Snowflake?

Yes, you absolutely can. Orbital is an open source package. You can absolutely grab the Orbital package and use it to convert your existing tidymodels workflows to SQL. The real magic of Orbital comes, though, from having access to a database back end.

Thank you all so much for taking the time out of your day to join us. I really appreciate all these great questions as well. And thank you so much, Nick, for an awesome demo.

Thanks for having me. Thanks, everyone, for all the great questions. And have a very happy Thanksgiving.

Thank you all so much. Have a great Thanksgiving. We'll see you all for the next workflow demo.