Kirill Müller | dm: Analyze, build and deploy relational data models | RStudio (2022)
Transcript
This transcript was generated automatically and may contain errors.
Hi, I'm Kirill and I want to tell a few stories about the dm package and show recent developments.
In a nutshell, dm helps when you have multiple tables in R. Some call this a relational data model. We had computers for sale in the 50s, we started to think about databases in the 60s, and in 1970 Edgar Codd published the relational model. That is way before I was born, and it is still in use today. An example: the points at the right are energy consumers in Switzerland, stored as rows in the table at the left, and there is a lot of detail in all those related tables, and this looks scary.
With our client, we wanted to build a Shiny app that allows filtering across all those related tables. This shows the Shiny app in action. All we had back then was a tool to visualize the relationships. We did not have tools that made the task less scary, so we went ahead and implemented a routine that did all those joins and filters in something like 800 lines of code. It worked, but it was tedious, it was difficult to maintain, it was error-prone, and we thought we could do better than that. That's how dm was born.
With the dm package, the filtering logic was reduced to a variant of this. So from 800 lines down to 6, and the best thing is: if the data model changes, the code stays the same. I call it a success. Others call it a success too: "Amazing." "It's a treasure." "It's a lifesaver." Thanks, Gary. Thanks, Hadley. Thanks, Niels. Thanks to everybody who filed issues and who contributed code to the dm package. You are awesome.
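The roughly six-line pattern is not shown in the transcript; a minimal sketch of the idea, using dm's bundled nycflights13 example model instead of the client data, might look like this (assuming dm >= 1.0, where `dm_filter()` takes per-table conditions):

```r
library(dm)

# dm's built-in example data model (the real app used client data)
dm <- dm_nycflights13()

# One filter condition propagates along the foreign keys to every
# connected table in the model; no hand-written joins needed.
dm_jfk <- dm |>
  dm_filter(airports = (faa == "JFK"))

# Every related table is now restricted to JFK-linked rows.
dm_jfk |>
  pull_tbl(flights)
```

If the data model gains or loses tables, this code stays the same, which is the point being made above.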
The best thing is, we are on CRAN in version 1. We are on R-universe and GitHub as well, so if you want the latest and greatest, you may want to go to R-universe. If you don't want to mess up your local installation, click this Posit Cloud button. This gives you, well, should I say RStudio Cloud? An IDE in your browser with everything installed, so you can just test. So there is no excuse not to try today.
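For reference, the two installation routes mentioned here would look roughly like this (the R-universe repository URL follows the usual r-universe.dev naming convention for the cynkra organization; verify it before use):

```r
# Release version from CRAN (version 1.x):
install.packages("dm")

# Latest development build from the maintainers' R-universe:
install.packages("dm", repos = "https://cynkra.r-universe.dev")
```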
Some people are wary about dependencies. We are not among those people. In fact, dm wouldn't have been possible without all those great packages we depend upon, without igraph, without everything. Thanks. This is us.
Why multiple tables need dm
So we're talking about multiple tables. Which of these tools has a concept of an active dataset? You start a program, you open a file, and this is your dataset and you work with it. One file, one dataset. Many files, many datasets. In R we don't have the active dataset. That's a good thing.
It took me a while to realize there's a difference between data for analysis and data for cleaning and management. For analysis, all we really want is tidy data: rows, columns, cells with atomic values, everything lined up cleanly, ready to put into our model. Except there is duplication in those highlighted parts at the left, and we can't have that for data management or data cleaning. This leads to problems further down the road. If we try a different approach, wide form, this is better: there is no duplication, except that those beautiful column names make me want to turn this back into long form the first time I see it.
So we want, and often need, multiple tables, and we want dm, because that's what dm is doing: if you have more than one table, dm makes your life easier. We are at a database session, so many of you will know about primary keys and foreign keys; this is just to make sure we're all on the same page, very briefly. The dataset we saw in the previous slides could be decomposed into two tables, measurements at the left, entities at the right, so that multiple measurements correspond to a single entity. The entities have a primary key, the id column. The measurements have a foreign key, also in the id column, linking to the entities. And in fact, this is what a dm object contains.
Tables, which could be local data frames or lazy tables on the database via the dbplyr package. Maybe arrow works as well, but I'm not sure; more on that on the last slide. We have primary keys, id, and we have foreign keys that link tables together. That's a dm object. But do we really need a new package? For two tables? For five tables? Even for ten tables? Will it actually make your life easier? Will it integrate into your existing workflows? Will it help you in your work? I believe the answer is yes, yes, yes, and I'll show you how.
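The measurements/entities decomposition just described can be built by hand; a toy sketch with made-up data (the table and column names here are illustrative, not from the talk):

```r
library(dm)

# Two tables: one entity, multiple measurements per entity
entities <- data.frame(id = c(1, 2), name = c("a", "b"))
measurements <- data.frame(id = c(1, 1, 2), value = c(3.1, 2.7, 4.2))

# A dm object = tables + primary keys + foreign keys
my_dm <- dm(entities, measurements) |>
  dm_add_pk(entities, id) |>
  dm_add_fk(measurements, id, entities)

my_dm |>
  dm_draw()  # visualize the relationship
```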
New Shiny app for building data models
So this is a very new thing, a Shiny app we have built, and the team at cynkra helped a lot with that, so thanks for all the support I got when preparing this. This is the AdventureWorks dataset, if you're familiar with that, and this is the Shiny app in action. Starting from the database, we can, by pointing and clicking, select the tables we care about, and we can create code that we can run immediately, store in an R script, or put into a function. David will talk more about how to turn this into a real R package, but for us, this kind of works. You run this Shiny app once, pretty much like the RStudio IDE import assistant, and it gives you code that you can reuse.
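The database starting point mentioned here is typically `dm_from_con()`. A self-contained sketch with an in-memory SQLite stand-in (the talk used AdventureWorks on a real server; SQLite cannot report key constraints, hence `learn_keys = FALSE`):

```r
library(dm)
library(DBI)

# In-memory stand-in for a production database
con <- dbConnect(RSQLite::SQLite())
dbWriteTable(con, "entities", data.frame(id = 1:2, name = c("a", "b")))

# dm_from_con() lists the tables; with learn_keys = TRUE it would also
# ask the backend for primary/foreign key constraints where supported
dm_db <- dm_from_con(con, learn_keys = FALSE)
dm_db

dbDisconnect(con)
```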
This is just a summary: selecting tables gives you code which you can copy to your console at once. For the data frames use case, when you start with data frames, data frames don't know about primary or foreign keys; they have just rows and columns. So we have to teach dm what those keys are, and we can do this with the Shiny app as well: we select the tables, we select the columns that constitute the primary keys and the foreign keys, and we get an instant preview in the diagram.
And here we get the code, plus we can also experiment with colors. I find this very helpful for guiding the eye in a more complex data model. There's an undo function if we mess up. If you're satisfied, the "copy to console" button will give you the code for this data model from your data frames. That's new, that's in dm version 1.0.1. To sum up: it works for databases, for tables on the database, and for data frames. Try it out.
Using dm in consulting work
In our consulting work we are using dm, well, you saw the expression, whenever we have more than one table; dm is just the basic thing that we do. We used it to help with a Shiny app that handles contracts. This is a situation where, for a contract partner, we have in the first stage contracts at the level of individual countries, and in the second stage we pool contracts together to get an overarching quote for this customer.
And this shows the database UI that we have contributed. We have projects, families, quotes, and this maps to this data model that we designed with the dm package and applied to the database as soon as we were ready. We could also continuously update it, because the data model is specified in code. The first stage is everything that's orangish at the bottom. What we can do very easily with dm: for a specific quote, which we identify by its primary key, the quote ID, we can get all the related data in one swoop, load it locally, update what we need based on the input from the Shiny app, and put it back to the database. So this is create, read, update, delete for you in Shiny, made easy with the dm package on a complex data model.
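The read-update-write cycle described here can be sketched as follows; this is a hypothetical outline, not the client code, and the `quotes` table and `quote_id` column are assumed names. `dm_db` stands for a dm object built on the database connection:

```r
library(dm)
library(dplyr)

# READ: one dm_filter() call follows the keys through the whole model,
# collect() then loads all related rows into local data frames
quote_dm <- dm_db |>
  dm_filter(quotes = (quote_id == 42)) |>
  collect()

# UPDATE: modify the local tables based on the Shiny input, then write
# the changes back, e.g. with dplyr's rows_update() on the database:
# rows_update(tbl(con, "quotes"), updated_quotes,
#             by = "quote_id", in_place = TRUE)
```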
Deconstruct and reconstruct for data analysis
For data analysis: this is the nycflights13 data; we use it throughout the documentation of the dm package. If we aggregate the flights table, let's say we group by origin and summarize, computing the number of flights per origin airport, or say the mean arrival delay, the result is a table that is conceptually linked with the airports table. This group-by-summarize creates an implicit foreign key into the airports table, because the data in the origin column maps to what's in the airports table. This link is useful if we later want to bring in details from airports or from other related tables, maybe not shown here.
How does the dm package make it easy to keep track of your keys? We have a new feature we call deconstruct-reconstruct. You can go from a dm object, from a data model, to individual tables that still know about their primary and foreign keys. I'm not going to tell you how, but we can talk about it later. We can apply data transformations on those separate tables with dplyr or whatever works for you, and we can put this back into a relational data model, a dm object.
A bit of code. The new dm_deconstruct() function does nothing except print code: from a dm object that has tables, primary keys and foreign keys, it gives you code that gets you those separate tables, which you then can compute on. keyed = TRUE is kind of the key here. The dm() constructor on those objects will result in an object that still has all those primary and foreign keys in place. And this also works when we have analyzed the data: here, the new table will contribute one primary key and one foreign key.
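Putting the pieces together, a sketch of the deconstruct-transform-reconstruct cycle on the nycflights13 example (assuming dm >= 1.0, where keyed tables carry their keys through dplyr verbs):

```r
library(dm)
library(dplyr)

dm <- dm_nycflights13()

# dm_deconstruct(dm) prints pull_tbl() calls like these;
# keyed = TRUE keeps the key information on each table
flights <- pull_tbl(dm, flights, keyed = TRUE)
airports <- pull_tbl(dm, airports, keyed = TRUE)

# Ordinary dplyr; the origin column retains its foreign-key link
# into airports, so the summary contributes one PK and one FK
flights_per_origin <- flights |>
  group_by(origin) |>
  summarize(n = n(), mean_arr_delay = mean(arr_delay, na.rm = TRUE))

# The dm() constructor reassembles a data model, keys included
dm2 <- dm(airports, flights_per_origin)
dm2
```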
I'm very excited about this feature. We had a similar concept called zooming before, but that, of course, doesn't integrate as well with existing workflows. So I think this is a step towards adapting dm to existing code, so that you can benefit from dm even if you're not starting on a greenfield.
ETL across multiple data models
Why one data model? Why just one data model? We can do more. This is an example from another project with another customer. The nodes, the boxes in this diagram, are whole data models. The arrows represent data transformations. This is an ETL process, extract, transform, load, targeting whole data model objects, going from dataset to dataset, giving better guarantees, moving to the right in terms of data quality, data integrity, and so forth. We don't have a specific framework to do this; right now this is all custom. This could be a blog post.
But we have a pretty good idea how to do this, because there is an inherent chicken-and-egg problem here. When populating a database, we first need to create the structure, the tables with the columns and the relationships, in an empty form, before we can fill them. And how do you know the structure? The dm_ptype() function helps us here: it gives us the prototype of a dm object, meaning tables and relationships minus the data, which we can then write to the database and populate.
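A sketch of this structure-first deployment, again using the bundled example model in place of real project data:

```r
library(dm)
library(DBI)

dm <- dm_nycflights13()
con <- dbConnect(RSQLite::SQLite())

# dm_ptype() keeps tables, columns and key relationships,
# but drops all rows
empty <- dm_ptype(dm)

# copy_dm_to() deploys the empty structure to the database
# (and sets key constraints where the backend supports them)
deployed <- copy_dm_to(con, empty, temporary = FALSE)

# ...then populate the tables, e.g. with dm_rows_append()

dbDisconnect(con)
```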
Bridging normalized and nested data
I want to spend the last few minutes on a new idea that didn't make it into dm 1.0, but that we are also excited about. Normalized data has been around for ages and will stay around. At the other end of the spectrum, we have deeply nested data: when you query a web API, what you typically get is JSON, deeply nested. And somewhere in the middle, we have nested tables. With the tidyr package, think of nest(), unnest(), pack(), unpack(). Arrow has great support for this in their data format. What we can do in dm today is go from normalized data, from a dm object, to nested tables, to a single nested table, and back. More or less losslessly, more or less in all cases; we're working on that.
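The dm functions for this round trip are the wrap/unwrap pair; a sketch, assuming dm >= 1.0 and with the caveat from the talk that the round trip is only more or less lossless:

```r
library(dm)

dm <- dm_nycflights13()

# dm_wrap_tbl() collapses the whole model into a single table with
# nested/packed columns, rooted at the chosen table...
wrapped <- dm_wrap_tbl(dm, root = flights)

# ...and dm_unwrap_tbl() goes back, using a ptype as the recipe
# for which tables and keys to restore
dm2 <- dm_unwrap_tbl(wrapped, ptype = dm_ptype(dm))
dm2
```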
But this idea looks very appealing, in particular with a very nice package which I think is really underrated. We can go from deeply nested data to nested tables in a controlled way, and the way back to JSON is understood. So, with all those bits and pieces in place, I hope very soon to be able to present a way to seamlessly move between all those formats, so all of this becomes one. And perhaps nested tables are a good middle ground, because they would allow us to go back to one single table. That's just easier than all those boxes and arrows.
We are cynkra. The team, as I said, helped a lot with preparing the Shiny app and making sure this all works; thanks. We're very excited to be a sponsor here. If you have detailed questions, would like to chat, or would like one of those cheat sheets, come by. Happy to talk or to take questions here. Thank you.
