Mika Braginsky - DataPages for interactive data sharing using Quarto
Transcript
This transcript was generated automatically and may contain errors.
Hi everyone, I'm Mika. I'm a software developer at Stanford University and I'm going to be telling you about a set of tools and templates that my collaborators and I have been developing to help improve data sharing. Imagine you're a researcher and you've painstakingly created a high-value data set. For example, maybe you've taken body measurements from hundreds of penguins, and some people know where this is going. So now you want to share your data with the world.
And you might want to do that in a way that follows guidelines that are known as FAIR, making sure that your data is findable, accessible, interoperable, and reusable. So how do you share your data? The easiest and most common thing that people do is just put a static file with your data somewhere people can access, like a GitHub repo. And this is really quick and easy to do, but it doesn't provide much support for your audience. If they want to look at your data, they have to clone your repo or download the file, figure out what's going on in there, and do any visualization that they might want to do manually themselves.
And I bet many of us can relate to the frustrating experience of opening someone else's data spreadsheet and being like, what on earth is going on here? So on the other hand, you could put a lot of time and effort into making a custom repository, like maybe a database with an API, and you can make a custom website or a Shiny app with fancy visualizations. And all this stuff would be super helpful for your users, but a ton of work for you.
So personally, I've done both of these things many times for different projects, and I wanted to help researchers and other people who manage data get the best of both worlds. So I'll be telling you about how to create a data page, which is basically a website for your data that's both easy to make and has a lot of rich features that make it easy to use.
Components of a data page
So what are the components of a data page? First off, you use the Redivis data platform to store your data set, which immediately, and for free, means that your data is versioned. And then you use our custom data pages Quarto format to connect this data to a Quarto website that you can configure and customize. At the center of this website are interactive visualizations created using Observable. You can also add ggplot or whatever other visualizations you want. And all this results in a static website, so it can be hosted on GitHub Pages, but it has rich interactive functionality.
Getting started
I'm going to walk you through more concretely the steps you would follow to spin up a new data page. To get started, all you would need to do is use the data page GitHub template repo to create your own repo. And this gives you a template Quarto website that uses our custom data pages Quarto project type and format. And you then customize this website by adding information and formatting specific to your project and connecting it to the Redivis table with your data.
What you get
What do you get from doing all this? You get a website. The home page of your data page introduces your data set and features these automatically created interactive visualizations. So this means that users can get a feel for your data right away. And you can customize this through configuration, for example through YAML variables, or by adding any custom content you want, or by swapping out these templated visualizations. And a major area of focus for us is adding more configurability and more types of visualizations for you.
In addition to the home page, other pages include the data page where there's an embedded table browser inserted through a custom shortcode. So users who might want to actually look at your data in more detail can use this browser to preview your data and your metadata. Even more involved users can go to the analysis page to see an automatically generated code snippet that tells them how to access the data directly from R or from Python. And then finally on the about page you can edit all this YAML to add links, to add citation information, and then you can add whatever content you want. And of course since this whole thing is a Quarto website you can add other pages, you can customize the styling, and use all the amazing features that Quarto now has and will have in the future.
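To give a feel for what an automatically generated access snippet on the analysis page might look like, here is a minimal Python sketch. It is not the actual DataPages implementation. The function name and the owner, dataset, and table names are hypothetical, and the generated text assumes the Redivis Python client's `redivis.user(...).dataset(...).table(...).to_pandas_dataframe()` call chain, which should be checked against the Redivis documentation.

```python
def python_access_snippet(owner: str, dataset: str, table: str) -> str:
    """Return Python source text that would load one Redivis table into pandas.

    Hypothetical helper for illustration only: it builds the snippet as a
    string and never contacts Redivis itself.
    """
    return "\n".join([
        "import redivis",
        "",
        # repr() quotes the names so the generated line is valid Python
        f"table = redivis.user({owner!r}).dataset({dataset!r}).table({table!r})",
        # Assumed client call for reading the table as a pandas DataFrame
        "df = table.to_pandas_dataframe()",
    ])


if __name__ == "__main__":
    # Placeholder identifiers, not a real Redivis dataset
    print(python_access_snippet("example_user", "penguins", "measurements"))
```

A user would paste the printed text into their own script or notebook. Running it requires the `redivis` package and access to the referenced table.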
If you go to datapages.github.io you'll see a gallery of the different data pages we've made and find out how to get started with making your own. Thank you.
