Andrew Bates (replacing Mike Stackhouse) - Making an App a System

Oct 31, 2024

11:50

Transcript

This transcript was generated automatically and may contain errors.

I'm not Mike Stackhouse. He's the one who wrote this. Unfortunately, he wasn't able to make it. So if I do a terrible job, just send all your complaints to him.

What we're going to talk about is how to build a system around Shiny that lets us build a lot of applications that follow a similar template.

Pharma context and constraints

So probably not everybody is in pharma. A little background: in pharma, there's this thing called GxP, which covers things like good clinical practice. There are a lot more constraints on the things that you can do. Data access is a really big one, because you don't want just anybody accessing the data.

And it can be hard to do new things and add new systems. So one of the implications of that is you've kind of just got to work with what you have. One of those things is that your data is often on a file system rather than in a database. So that's kind of where we're coming from with this. It might look a little bit unusual if you're not in pharma, but that's the reason why we're focused on file systems.

So an alternative title would be more like how to make a reusable app within the constraints of pharma.

The copy-paste problem

So the main idea here is, let's say you have a biometrics lead. We have an app for study A. They see it, they like it, and they say, hey, I want that for study B as well.

So one thing you could do is like copy the app for study A and turn it into study B and then change the things that are a little bit different.

But you can run into issues when you find a bug in the copy for study B. What are you going to do then? Well, you could fix the bug and then copy the fix back into study A. That could work.

But if you have study C, D, E, F, it's going to be a lot more challenging and it's going to be a lot easier to make mistakes and mix things up.

So really your user story might be, I also want an app for study C, and I want to include a new chart as well. I want something a little bit different, not quite the same thing.

So again, we could copy and paste. But how do we get that across? So let's say it's a new chart that we're adding. How do we get that to all these different applications we have? We don't just have study A, B, and C. We have a dozen.

Copying and pasting to all these places is not really a good idea. So the focus here today is how to generalize this idea of having an app that you want to be able to adapt with just little tweaks. And how do you propagate those changes and keep everything in sync?

The package-based solution

So the way we recommend doing this is to take your original app and turn it into a package, and then use that package as a dependency in all your other child apps. So the idea is kind of like a template. Your main app is like a template that's included in your package. Maybe you have modules that you export from your package. You use those throughout all the child apps.

So if you find a bug in one of your child apps, you don't have to go and update everywhere. You just go update your parent package and then you can propagate that out to all the children. All you have to do is update the package version.
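
As a rough illustration of the parent/child arrangement described above, here is a minimal sketch. It is written in Python with only the standard library rather than in R, and every name in it (`make_app`, `demographics_module`, `dose_chart_module`, the `arm` and `dose` fields) is hypothetical. It is not how any particular package implements this.

```python
# --- hypothetical "parent package": shared by every study ---
def demographics_module(rows):
    """Shared module: number of records per treatment arm."""
    counts = {}
    for row in rows:
        counts[row["arm"]] = counts.get(row["arm"], 0) + 1
    return counts


def make_app(study, extra_modules=None):
    """Template app: the shared modules plus optional study-specific ones."""
    modules = {"demographics": demographics_module}
    modules.update(extra_modules or {})
    return {"study": study, "modules": modules}


# --- child apps: thin wrappers around the parent package ---
def dose_chart_module(rows):
    """Study C only: total dose per arm (the 'new chart' in the user story)."""
    totals = {}
    for row in rows:
        totals[row["arm"]] = totals.get(row["arm"], 0) + row["dose"]
    return totals


app_a = make_app("Study A")
app_c = make_app("Study C", extra_modules={"dose_chart": dose_chart_module})

rows = [
    {"arm": "placebo", "dose": 0},
    {"arm": "active", "dose": 10},
    {"arm": "active", "dose": 20},
]
print(app_a["modules"]["demographics"](rows))
print(app_c["modules"]["dose_chart"](rows))
```

Both apps share the same `demographics_module` object, so a fix made there reaches every child that picks up the new version of the parent, while the extra chart stays local to study C.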

And another aspect, kind of related to data, is that you want to be able to configure your apps slightly differently. Again, they're not going to be exact copies. You want a little flexibility there to tweak them.

Data processing and YAML configuration

So in terms of the data, what we recommend is doing as much as you can of your data wrangling and data processing outside of the app. And Posit Connect has this really cool feature where you can run things on a schedule. So you can put it in an R script or an R Markdown document, for example, and then run that whenever your data updates. That way you never have to actually change your app.
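
A minimal sketch of that split between a scheduled job and the app, again in Python with only the standard library. The file names, column names, and the functions `scheduled_job` and `app_startup` are assumptions made up for illustration. In the setup described in the talk, the job would be an R script or R Markdown document scheduled on Posit Connect.

```python
import csv
import json
import tempfile
from pathlib import Path


def scheduled_job(raw_path, cache_path):
    """Runs on a schedule: do the wrangling once and save the result."""
    with open(raw_path, newline="") as f:
        rows = list(csv.DictReader(f))
    visits_per_subject = {}
    for row in rows:
        subject = row["subject"]
        visits_per_subject[subject] = visits_per_subject.get(subject, 0) + 1
    Path(cache_path).write_text(json.dumps(visits_per_subject))


def app_startup(cache_path):
    """Runs inside the app: only read the prepared file, no wrangling."""
    return json.loads(Path(cache_path).read_text())


# Stand-in for the study's file system (hypothetical files).
workdir = Path(tempfile.mkdtemp())
raw = workdir / "raw.csv"
raw.write_text("subject,visit\n001,1\n001,2\n002,1\n")
cache = workdir / "prepared.json"

scheduled_job(raw, cache)      # e.g. once a week, when the data updates
prepared = app_startup(cache)  # every time the app starts
print(prepared)
```

The app only ever touches the prepared file, so new data means rerunning the job, not changing or redeploying the app.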

But then you have the problem of, if your data is a little bit different between each application, how does the application know how to use it? So the way we recommend connecting your data to your application, and generalizing these applications, is a YAML file that basically specifies all your configuration.

So for example, you might use your subject ID as your subject variable, and maybe another study uses a different variable. All you have to do is change that one place, and then your child app is going to work the same way.

So within the app, instead of saying, you know, subject ID, I want to use that column, you do more like on the bottom here, where you just call the YAML key and it returns the value for you. So it's more like a function argument than specifying the column name.

So each child app or child repository is going to have three main components. You're going to have the app, the R Markdown or R script that runs your data processing, and the YAML file that configures the details of the application.
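
The lookup described above can be sketched roughly as follows. In the talk the configuration lives in a YAML file in each child repository. To keep this Python sketch dependency-free, a plain dict stands in for the parsed YAML. The key `subject_var` and the column names `USUBJID` and `SUBJID` are hypothetical.

```python
# Stand-ins for each child app's parsed YAML configuration (hypothetical keys).
config_a = {"subject_var": "USUBJID"}
config_b = {"subject_var": "SUBJID"}


def unique_subjects(rows, config):
    """Shared app code: the column name comes from config, never hard-coded."""
    col = config["subject_var"]
    return sorted({row[col] for row in rows})


data_a = [{"USUBJID": "001"}, {"USUBJID": "002"}, {"USUBJID": "001"}]
data_b = [{"SUBJID": "A-7"}, {"SUBJID": "A-9"}]

print(unique_subjects(data_a, config_a))
print(unique_subjects(data_b, config_b))
```

The same function serves both studies. Only the one configuration entry differs, which is what makes the key behave like a function argument.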

And so by having it as a package, we can version the package, so we can have different apps that depend on different versions. Maybe you have an app that you don't want to update, so you can keep it on an old version. Maybe another study team wants the most current version.

And it also helps with validation, because you really just need to test and validate the parent package. And then for any child app, all you need to look at is the differences, so you don't need to run all your tests again.

The matte package

So Mike and Maya created this package called matte that you can find on the Atorus GitHub. And it's an example of how to implement these ideas. It's really, really basic and it's in early development right now, but it has the functionality where you can have your parent app, like your template, and generate your child app from that. There are some example YAML files in there so you can see what kind of metadata you'd want to use to get that mapping from data to app.

There's also a template for running your data processing.

So they're planning on expanding it. Basically, as we use this idea and find problems or find better ways to do it, they're going to implement those in this package.

Summary and benefits

So to recap, the benefit of this generalized strategy is that you only have one place where you need to make updates. And you can use the existing system that we have of package versioning to help with validation.

And it also gives you flexibility. Your child apps are not exact copies. You have some different tweaks that you can make and different features that you can use.

Q&A

That's all I've got.

Thank you so much, Andrew. I have a question from someone in our virtual audience. Do you mind talking a little bit more about caching a Shiny app for faster startup times?

Yeah, so it's caching the data. So let's say your data updates maybe once a week. You'd run your job once a week to do whatever wrangling you need to do, and then you save that. So that way you're not doing it within the app itself. So that's a great question.

And you mentioned that using a scheduled script in R Markdown would be a good way to do that?

Yeah, and if you look at the package as an example, that's how we're doing it, because it's easy. Most likely, if you're doing a lot of Shiny app development, you probably have Posit Connect. And it's just an easy way to use the existing tools.

What are your general guidelines on how much of the app code goes in the package?

It's pretty much as much as possible. Right, because that's anything you're going to want to generalize.

Do you rely on the golem framework to package the initial app?

I mean, that's up to you. I think the matte package has support for setting up for golem and for Rhino, maybe something else. It doesn't really take an opinion on that.

And I have one more question here. Can you say a few words about the various Shiny frameworks, golemverse versus Rhinoverse? Do you prefer one over the other?

I like both. I mostly use golem because it's been around a little longer and that's what I've used. I like some of the ideas in Rhino, like how it uses the box package, so there's a little bit more control over your dependencies. I just haven't used it a lot in practice.

Thank you so much. Thank you.

Featured software