Resources

Tom Mock | Quarto for the Curious | RStudio (2022)

video
Oct 24, 2022
20:48

Transcript

This transcript was generated automatically and may contain errors.

Thanks for coming to my talk, really excited to kind of peel back the curtains a little bit on Quarto and show it off to the world for some of the first times. My name is Tom Mock and if you want to follow along with these slides you can do so at rstd.io/quarto-curious.

So really for the longest time and why I'm so excited about talking about it now is that for the longest time we weren't talking about Quarto, we were kind of working on it behind the scenes, getting things ready for that first stable release. In April of this year we actually did our first kind of beta release or some of the like initial soft release.

Alison Hill wrote up this great blog post called We Don't Talk About Quarto, Until Now. Alison was one of the folks who used to work with the development team before she moved over to Voltron Data, and it just really kind of opened the floodgates to people hearing about Quarto, talking a little bit on Twitter, or generally just generating some hubbub.

For the most part people were really excited, you know, they're just like, oh, I hear about Quarto, let me try it out, I really like it. So Kelly said, you know, she stayed up all night switching over her R Markdown materials over to Quarto. And her kind of takeaway was it's pretty rad, my dudes, I like it. So she's enjoying it.

And there's even some excitement in the Python community. So if you're familiar with Jeremy Howard or the nbdev team or fast.ai, they're actually moving some of their work over into Quarto and using that along with Jupyter notebooks. In fact, Hamel is going to be talking about that tomorrow in his talk, talking about the next generation version of nbdev.

Now while there's a lot of excitement in both the R and the Python community, there were also people who had a few more questions. And I think that's where a lot of y'all are. Y'all are the Quarto curious and you've come to my talk to hear a little bit about what is Quarto, why should I care, what's different about it, what's going on with R Markdown and all of that.

First off, don't worry, R Markdown's not going away. We're still going to maintain it. If bugs come up, we'll definitely fix them. Ultimately, the question that people have is when can I switch to Quarto? When is it ready? Is it stable? What's going on with it?

What is Quarto?

So ultimately, we'll answer that question by saying, what is Quarto? So Quarto is the next generation version of R Markdown for everyone. And that everyone kind of has an asterisk to it in terms of not 6 billion people or whatever is around in the world, but for really any data science language or any future language. As JJ talked about in his keynote this morning, we want to build a 100-year company. And while that today might mean working in R and Python and Julia and JavaScript and other languages, in the future, we want kind of a language agnostic tool that can work with anything, anything that becomes popular in the future.

To make it a bit more clinical instead of just saying the next generation of R Markdown, Quarto, if we define it, is an open source scientific and technical publishing system built on Pandoc. So Pandoc is a command line tool that's also used by R Markdown. So again, there's going to be a lot of familiarity as we go through these slides, and we'll build up a mental model of how Quarto is a little bit different than R Markdown and some of the improvements.

Quarto itself is a command line interface, or what is called a CLI. So you can access Quarto via your terminal, whether in RStudio or any text editor or any terminal on your computer. In this example, I'm just calling quarto --help to see what can I do with Quarto once I've installed it on my machine.

It'll print out something like this, and I've abbreviated it so we can see the slides nice and big, but the commands here that we're looking at are: I can use quarto render to render documents from the plain text format out to, say, HTML or PDF or to a website. I can use quarto preview, which will actually render the document and then maintain a background web server so as I make changes, it shows those in real time on the actual rendered output. And because it's a publishing system, I can not only render it locally, but I can publish it to the web, places like GitHub Pages or Netlify or something else we'll talk about in the future called Quarto Pub.
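As a rough sketch of the two commands just described, the Python below only assembles the command lines and prints them; it runs nothing unless `run` is called and the `quarto` executable is on the PATH. The file name `report.qmd` and the helper functions are hypothetical and are not part of Quarto itself.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def render_cmd(source: str, to: Optional[str] = None) -> List[str]:
    """Build `quarto render <source> [--to <format>]`."""
    cmd = ["quarto", "render", source]
    if to is not None:
        cmd += ["--to", to]
    return cmd


def preview_cmd(source: str) -> List[str]:
    """Build `quarto preview <source>` (render, then serve a live preview)."""
    return ["quarto", "preview", source]


def run(cmd: List[str]) -> Optional[int]:
    """Run the command only if the executable is installed; return its exit code."""
    if shutil.which(cmd[0]) is None:
        return None  # Quarto is not on PATH, so nothing is executed
    return subprocess.run(cmd, check=False).returncode


# "report.qmd" is a hypothetical file name used only for illustration.
print(shlex.join(render_cmd("report.qmd", to="html")))
print(shlex.join(render_cmd("report.qmd", to="pdf")))
print(shlex.join(preview_cmd("report.qmd")))
```

Keeping the command construction separate from `run` means the same lists can be printed, logged, or handed to a build script.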

How Quarto differs from R Markdown

To really kind of build up our mental model, though, and help differentiate a little bit between R Markdown and Quarto, we need to kind of talk about what R Markdown is and what is happening behind the scenes. When you use R Markdown in your daily work, you're probably just thinking, oh, yeah, I open up an R Markdown inside RStudio, type some code, type some text, and then I render it and I get something beautiful out.

Behind the scenes, what's happening here in the middle is actually a little bit of kind of magic. It's not real magic. It's code. But it's things that are happening as intermediary steps. So when you render your R Markdown, the R code is evaluated by a package called knitr. That will evaluate your R code, generate graphics, generate tables, embed HTML widgets and all the other magic that you're building. This generates an intermediary Markdown file or a .md file. That is then passed to Pandoc, which actually converts the plain text .md into something like a report, so a Word document, a PDF report or HTML page, or a presentation, just like the presentation I'm giving today, something in HTML, or a more complex project like a website or a book.

So this process is great and it's been around for almost ten years. You know, knitr is an old package, it's robust, we've been using it a long time. Something to note, though, is that it has this hard dependency on R. So to use R Markdown, whether you're using R or other languages, you have to have R, you have to have knitr, you have to have R Markdown. And some of the formatting or preprocessing that occurs to the document actually occurs with literal R code written in the R Markdown package.

For Quarto, we have a similar diagram, and for some of you, if you're not looking closely, this might look like the exact same workflow. But there's a few differences we'll note. For one, rather than using a .Rmd, I'm now using a .qmd, or a Quarto Markdown document. It's going to feel almost identical to R Markdown, which is great. And we're still using knitr. So knitr will evaluate our R code from our .qmd, generate an intermediary Markdown file, but now we have another word added to Pandoc. We have Pandoc with Lua filters. So we've moved any preprocessing or postprocessing into Quarto, that command line interface, or we apply them with Pandoc, rather than having to call them from R.

What this does is still allows us to create the same reports, presentations, and projects that are available, but we don't have to have R in the system to use this. You could use Quarto on a fresh computer and not have R anywhere in it, and just render Markdown as opposed to code. So people could use it for scientific communication without actually having to execute code.

But because, again, we're trying to become this company that is building for all of data science and building for scientific communication in general, we want to do more than just evaluate R code. So Quarto also extends to other languages. So Quarto can now use Jupyter and Jupyter kernels as the engine rather than just R and knitr. So you can imagine Jupyter kernels include things like Julia or Python. We also have JavaScript you can execute via Quarto. So you have this availability of not just being limited to R, but collaborating with other people in Python, or if you're a Python dev yourself, also using Python within your Quarto documents.

And again, the output that you're creating is not any different. You're just using a different engine because you're using a different programming language. And if I'm using Python, all I have to have is Python on my system. If I'm using Julia, all I have to have is Julia on my system. And I can switch between them if I wanted to without having to change the overall document.

For some other people who are working with, say, Julia or Python or familiar with Jupyter, they don't want to use a plain text format. They might not even use RStudio. They want to use Jupyter notebooks or JupyterLab as their environment. And that's possible, too. So you can use a literal .ipynb notebook, or what's called a Jupyter notebook, and whether using Julia or Python, you can execute code inside of that and with essentially the same rendering process, create the exact same output.

So whether using this plain text QMD or a more complex kind of binary format like Jupyter, all of those can create the same presentations, the same reports, and the same projects that you want to create.
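The mental model built up in this section can be written down as a small lookup. This is only an illustration of the stages named in the talk (source, engine, intermediate Markdown, Pandoc with Lua filters, output); the function name and stage labels are hypothetical, and the code does not call Quarto, knitr, Jupyter, or Pandoc.

```python
from pathlib import PurePath

# Computational engines described in the talk, keyed by a short name.
ENGINES = {
    "knitr": "knitr (R)",
    "jupyter": "Jupyter kernel (Python, Julia, ...)",
}


def render_stages(source: str, engine: str) -> list:
    """Return the conceptual stages a document passes through when rendered."""
    suffix = PurePath(source).suffix.lower()
    if suffix not in {".rmd", ".qmd", ".ipynb"}:
        raise ValueError(f"unsupported source type: {suffix!r}")
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    if suffix == ".ipynb" and engine != "jupyter":
        raise ValueError("notebooks are executed by a Jupyter kernel")
    return [
        source,
        ENGINES[engine],
        "intermediate Markdown (.md)",
        "Pandoc + Lua filters",
        "output (HTML, PDF, Word, slides, website, book)",
    ]


# Hypothetical file names; only the engine stage differs between them.
for src, eng in [("report.qmd", "knitr"), ("report.qmd", "jupyter"), ("analysis.ipynb", "jupyter")]:
    print(" -> ".join(render_stages(src, eng)))
```

Everything after the engine stage is the same in all three cases, which is the point the talk makes about getting the same output regardless of language.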

The QMD format up close

If we think of what a QMD looks like, it's a little bit like your working mental model of, say, an R Markdown document. We have the YAML header or the metadata that says do this with my document. So format HTML, create an HTML static page. Here we have a side-by-side choice of choosing an engine of knitr in R or, say, choosing an engine of Jupyter with Python 3 or Julia to execute that code. And again, you're executing it natively. You're not wrapping it in R code or wrapping it in Python code to execute the other language. You're just evaluating it in its normal engine.

You can then write some code. So we have here a group by and summarize with mtcars in dplyr and then in a Python library called siuba, both of which look relatively similar, but one's evaluating purely in R and one purely in Python. Then we have a bunch of text or markdown that we're using to create the actual prose or the text inside the document, and you can see this spans both columns. There's no difference in the markdown you're writing or the way that you use Quarto to kind of typeset your document. That is the same between any language you're using.
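To make the side-by-side slide concrete, here is a minimal sketch that assembles two .qmd documents as text: one using the knitr engine with dplyr, one using a Jupyter Python kernel with siuba. The exact code on the slide is not in the transcript, so the chunk contents, titles, and the `make_qmd` helper are assumptions for illustration. The embedded R and siuba snippets are plain strings here and are not executed.

```python
# Three backticks, built programmatically so the chunk fences inside the
# generated document cannot clash with the fence around this example.
FENCE = "`" * 3

PROSE = "Prose is plain Markdown and is identical for every engine."


def make_qmd(title: str, engine_yaml: str, chunk_lang: str, code: str) -> str:
    """Assemble a .qmd: YAML header, Markdown prose, one executable chunk."""
    lines = [
        "---",
        f'title: "{title}"',
        "format: html",
        engine_yaml,
        "---",
        "",
        PROSE,
        "",
        f"{FENCE}{{{chunk_lang}}}",
        code,
        FENCE,
        "",
    ]
    return "\n".join(lines)


R_CODE = """library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg))"""

PY_CODE = """from siuba import _, group_by, summarize
from siuba.data import mtcars
(mtcars
  >> group_by(_.cyl)
  >> summarize(mean_mpg = _.mpg.mean()))"""

r_doc = make_qmd("mtcars with dplyr", "engine: knitr", "r", R_CODE)
py_doc = make_qmd("mtcars with siuba", "jupyter: python3", "python", PY_CODE)

print(r_doc)
print(py_doc)
```

Only the engine line in the header and the language of the chunk differ between the two documents; the header layout and the prose are shared.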

Key advantages of Quarto

So again, whether you're using R or Python or Jupyter or whatever else, you can write these documents and, again, get the exact same output whether you're changing your engine or not. Cool. So we got a plain text format. It looks kind of similar to R Markdown. I'm still not sold. What's the difference? So let's talk about that.

Number one, Quarto is one install, what we call batteries included. So when you actually install the latest version of RStudio that was released on Monday, Quarto is just there. It's installed with RStudio. So as soon as you download the newest version of RStudio or upgrade it in the future, you have access to Quarto. You don't have to install anything else. You can just start using it. You would need an engine. If you wanted to use R, then you install R. If you want to use Julia, you install Julia. Or Python, you install Python. But for at least RStudio users, it's baked into using it.

The other half of that batteries included, though, is that there's a whole bunch of formats that have been created over the past, you know, five or ten years of R Markdown. So you have your basic formats like HTML documents or Word documents, PDF, different presentations like this that are made in reveal.js or even advanced layout like distill. That's several different R Markdown packages. In Quarto, it's just all there. That one install of Quarto brings along all the different formats. And it also includes the even more complex formats, so things like websites or blogs or books or interactive documents. All of that can also be done just with the one install of Quarto. So rather than having to manage a bunch of different R Markdown ecosystems that were developed at different times and might have different syntax, you have batteries included, one install with the shared syntax.

There's still a few things we want to add into Quarto to reach full parity with R Markdown. Specifically flexdashboard is one of the ones that we're really passionate about adding quickly. And that's a way of creating dashboards with Quarto, just as with the older version of flexdashboard in R Markdown.

Choosing your editor

The other part is that you want to stay in the comfort of your own workspace. If I asked you to, like, bake a cake and I told you to go do it in the corporate kitchen here, it might be really hard, versus going to your own home and baking it at your own kitchen table, probably a bit easier. Similarly, with Quarto, you get to choose and stay within the comfort of your own workspace. So you can obviously use it inside RStudio. It's preinstalled there. It's going to, again, feel almost identical to R Markdown.

So all of that work you've done to learn R Markdown, or if you're learning R Markdown or Quarto for the first time, learning Quarto, can be applied within the R Markdown or, sorry, within the RStudio ecosystem. You still have a button for render. You don't have to go and use the CLI, but it's still there for you, and it's actually embedded into different functions that are available through RStudio.

But again, when you say, oh, I want to collaborate with my colleague or my friend, and they're saying, oh, I work in Julia, and I don't really like using RStudio, like, that's not the editor I want to use, or I'm a Python dev and I really want to work in a Jupyter notebook. Quarto works just as well in a Jupyter notebook, so you can use that as your editor if you want to. Again, we've got kind of on the left this idea of that's the IPython notebook you're working in. You're writing pure, normal Jupyter notebook code. The only thing that's different is at the top, you now have a raw chunk with the YAML that's defining what's the format, what do you want to affect. And then on the right, you have a Quarto preview, so when you're typing and saving, it'll actually re-render and show you the output instantly as it builds it out.
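The notebook layout described here, a raw cell holding the YAML followed by ordinary code cells, can be sketched with nothing but the standard library. The structure below follows the version 4 notebook JSON layout; the title, the cell contents, and the `make_notebook` helper are hypothetical.

```python
import json


def make_notebook(front_matter: str, code: str) -> dict:
    """Build a minimal notebook: one raw YAML cell, then one code cell."""
    return {
        "cells": [
            {
                "cell_type": "raw",  # the raw cell carries the document options
                "metadata": {},
                "source": front_matter.splitlines(keepends=True),
            },
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": code.splitlines(keepends=True),
            },
        ],
        "metadata": {
            "kernelspec": {
                "name": "python3",
                "display_name": "Python 3",
                "language": "python",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


FRONT_MATTER = '---\ntitle: "Notebook rendered with Quarto"\nformat: html\n---\n'
nb = make_notebook(FRONT_MATTER, "print('hello from a notebook cell')\n")

# The text that would be saved as a .ipynb file.
notebook_text = json.dumps(nb, indent=1)
print(notebook_text)
```

The code cell stays ordinary notebook code; only the raw cell at the top is specific to Quarto.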

And again, maybe you don't like using Jupyter, maybe that's not what you're wanting to look at. Maybe you use VS Code as your editor and you're really passionate about that because you also work in, say, JavaScript or other languages. We actually developed a Quarto extension for VS Code, so you get this rich interactive experience both in like a notebook format, so using, again, a plain QMD as opposed to a Jupyter notebook. And you can also do Quarto preview and get that real-time output as well. So whatever editor you want or whatever editor you have to collaborate with your colleagues on, they can stay in the comfort of their own workspace, you can stay in your own comfortable workspace and you can still collaborate across a document with a shared syntax.

The other part of building for RStudio and for other editors like VS Code is we have things like rich auto-completion, which wasn't always available and is something that's added in Quarto. So I'm editing a document, I might remember, oh, yeah, I'm using format HTML, but what else can I do here? And if I press Ctrl+Space inside RStudio or VS Code, it will pull up all the options that are possible, almost like with R code inside the editor, where you can hover over it and see what does this function actually do. That works in YAML and it also works in chunk options. Again, there's dozens of chunk options you can use, and while some of them are very niche, you're not going to use them all the time, those are very powerful for knowing exactly what you need to do and seeing, oh, I need to change figures, I just start typing figures and it will auto-complete to what's available. And again, whether you're working in Python or R or RStudio or VS Code, this is available for you.

Interactivity and formats

So we've got some auto-completion in Python and R, maybe we want to talk more about formats and we want to get excited about that. So this is actually an HTML widget, you'll notice that I'm pulling it over, this is actually where we are today, so we're at National Harbor, I can click on this and it says we're here together. So I can drag that over a little bit, it will load, the Internet's a bit bad in here so it's not loaded all the way, but importantly you can see I'm actually interacting with it with R code. So you can do Shiny, you can do HTML widgets or other things you've built and put those into Quarto, and your Python colleagues can also embed what are called Jupyter widgets. So similarly, you have the ability to create interactivity on the client side, both with Python and with R within a single Quarto document or in separate Quarto documents.

Another part of interactivity that Quarto adds, though, is beyond just using R and Python, it even embeds a version of JavaScript called Observable JavaScript. This is an extension to vanilla JavaScript created by Mike Bostock, who's also the author of D3, which might be one you're more familiar with in JavaScript land. And now we have a reactive JavaScript library that might look to you a little bit like Shiny, you know, I have the ability to create and interact with different tags, and it will actually change the graphic, or I can do things like filter the data to have specific bill lengths for this penguins dataset, and again, this is baked into Quarto so you don't have to install another thing, this interactivity is added for the document itself.

Unified syntax and existing documents

So we've got this kind of idea of interactivity, different editors, we can also kind of unify a lot of the syntax that was created across years of R Markdown. So again, this is next-gen R Markdown, it's not being uninstalled from the web and removed from the Internet, we're still maintaining it, we'll fix any bugs, but our new dev effort is going to go towards Quarto and to knitr, so knitr will still power R Markdown, still power Quarto, but we're focusing on Quarto as this next generation of kind of data science and technical science communication.

You might be asking yourself, okay, well, I've got a lot of old R Markdowns that I want to use. So for some of you, keep using them, we're still going to maintain R Markdown, you don't have to switch if you don't want to, you can keep using it. But Quarto can actually render R Markdown documents, so you can take your existing R Markdown documents and render them over to whatever format you want via Quarto. So whether you're staying in R Markdown or switching them over, you can still render them out.

And in fact, this unification of different formats, let's say we were trying to render to HTML and to PDF. For HTML, we use things like CSS and raw HTML, for PDF, we're going to be using LaTeX and other things like that. Normally you'd have to write a lot of manual code in there to convert between them. But for this document, I'm not writing any CSS or any LaTeX, I'm only using Quarto, and I can create a nicely typeset document like this. It's got a picture in the sidebar, it's got a table of contents and all sorts of other footnotes and things. That document in PDF looks almost identical. You're obviously constrained to what PDF can do, so you don't have hover text and you don't have the same type of interactivity as HTML, but without having to write anything else besides Quarto code or Quarto Markdown, I'm able to generate these two very well typeset documents from a single source, so single source publishing for this.
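Single-source publishing comes down to listing more than one format in the YAML header of one document. The sketch below builds such a header as text; the title, the chosen `toc` option, and the `format_front_matter` helper are assumptions for illustration, not the header used for the document shown in the talk.

```python
def format_front_matter(title: str, formats: dict) -> str:
    """Build a YAML header that lists several output formats for one source."""
    lines = ["---", f'title: "{title}"', "format:"]
    for name, options in formats.items():
        lines.append(f"  {name}:")
        for key, value in options.items():
            # YAML booleans are lowercase, unlike Python's True/False.
            text = str(value).lower() if isinstance(value, bool) else str(value)
            lines.append(f"    {key}: {text}")
    lines.append("---")
    return "\n".join(lines)


header = format_front_matter(
    "Single-source report",
    {"html": {"toc": True}, "pdf": {"toc": True}},
)
print(header)
```

With a header like this, one format can be selected at render time, for example with `quarto render report.qmd --to pdf`, where `report.qmd` is again a hypothetical file name.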

And again, Quarto is extended for other languages, so this typesetting and all the different things it brings are made with Lua filters rather than an R package, so things that people write in Python language and in R or in Julia are cross-language and cross-format compatible. So we're meeting these users in their native language, so when you are collaborating, you're not fighting with each other, but you're merging around a common syntax.

Those people might ask, well, what to do about my existing IPython notebooks? I don't want to get rid of them. The answer is essentially the same. I can Quarto render any IPython notebook and send it over to the format. The only difference is an IPython notebook might actually store computation inside of it, so you can either tell Quarto, hey, take this document and render it top to bottom or using the embedded execution, just format it out to something else or change the format to the final output. And in fact, Quarto has a helper tool to allow you to convert between a plain text QMD and an IPython notebook itself, so you can swap back and forth whether using IPython notebooks or Quarto without having to change the source code. You can actually convert back and forth between them.
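The two notebook options mentioned here map onto the command line roughly as follows: `quarto render` on an .ipynb uses the outputs already stored in the notebook unless `--execute` is passed, and `quarto convert` switches between .ipynb and .qmd. The sketch only builds the command lists. The helper names and file names are hypothetical, and the assumption that the converted file sits next to the source with the extension swapped should be checked against the installed Quarto version.

```python
from pathlib import PurePath


def render_notebook_cmd(source: str, execute: bool = False) -> list:
    """Build `quarto render <notebook>`; add --execute to re-run the cells."""
    cmd = ["quarto", "render", source]
    if execute:
        cmd.append("--execute")
    return cmd


def convert_cmd(source: str) -> list:
    """Build `quarto convert <source>` (.ipynb to .qmd, or .qmd to .ipynb)."""
    return ["quarto", "convert", source]


def converted_name(source: str) -> str:
    """File name the conversion is assumed to produce next to the source."""
    path = PurePath(source)
    swap = {".ipynb": ".qmd", ".qmd": ".ipynb"}
    if path.suffix not in swap:
        raise ValueError(f"expected .ipynb or .qmd, got {path.suffix!r}")
    return str(path.with_suffix(swap[path.suffix]))


# Hypothetical file names used only for illustration.
print(" ".join(render_notebook_cmd("analysis.ipynb")))
print(" ".join(render_notebook_cmd("analysis.ipynb", execute=True)))
print(" ".join(convert_cmd("analysis.ipynb")), "->", converted_name("analysis.ipynb"))
print(" ".join(convert_cmd("report.qmd")), "->", converted_name("report.qmd"))
```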

Extensions and publishing

You can also extend Quarto, so we have things like shortcodes, so if you wanted to add something into the open source Quarto, you could use a shortcode like this that will insert things like Font Awesome icons, or you can add filters that will actually change the overall appearance of a document, and again, this is not limited to any language, so any extensions that you write in a Python world work in R and anything in R works in Python. You can also add entirely new formats, so maybe you want to create a format for your company. You can create your own cool company format for reveal.js. So these extensions allow you to go further with Quarto as we move forward and add more things.

And then lastly, I mentioned this idea of Quarto Pub, so Quarto can actually publish the thing you just rendered to the Internet, so you can publish to GitHub or to Netlify or Connect, but we're now announcing that Quarto Pub exists. It's a free resource that allows you to publish anything you create locally and put that out on the web at a URL you choose and manage all of your documents that you're publishing out. So this is very exciting. I would recommend signing up quickly to go grab your preferred user name, and you can use that.
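The publishing destinations named above correspond to provider names accepted by `quarto publish`. As before, this is a sketch that only assembles the command. The `publish_cmd` helper and the file name are hypothetical, and publishing itself needs an account and an interactive login that the code does not attempt.

```python
from typing import List, Optional

# Destinations mentioned in the talk, mapped to `quarto publish` provider names.
PROVIDERS = {
    "Quarto Pub": "quarto-pub",
    "GitHub Pages": "gh-pages",
    "Netlify": "netlify",
    "Connect": "connect",
}


def publish_cmd(destination: str, source: Optional[str] = None) -> List[str]:
    """Build `quarto publish <provider> [<source>]` for a named destination."""
    if destination not in PROVIDERS:
        raise ValueError(f"unknown destination: {destination!r}")
    cmd = ["quarto", "publish", PROVIDERS[destination]]
    if source is not None:
        cmd.append(source)  # a single document, or omit to publish the project
    return cmd


# "report.qmd" is a hypothetical file name used only for illustration.
print(" ".join(publish_cmd("Quarto Pub", "report.qmd")))
print(" ".join(publish_cmd("GitHub Pages")))
```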

The last part I'll say is, again, you can use Quarto in RStudio Connect and RStudio Workbench already, so if you do have those products available for you, we have some examples of using them, and you're welcome to go check them out. And with that, we're getting close to the end of the talk, so I want to give a shout-out to the entire Quarto dev team. Quarto is crafted with love and care by the same team that works on Quarto and R Markdown, and you can contribute yourself if you want to, whether an extension or to the Quarto CLI itself.

I've got a summary of the talks that are available and other things coming up, and some takeaways with the batteries included, the shared syntax, the ability to choose your editor and your language, and the fact that R Markdown is not going anywhere, we're still maintaining it, but we're going to be building in a lot of our new features into Quarto. With that, thank you for your time. I'm happy to take questions after the fact down in the lounge, or you can reach out to me on Twitter. Thank you, and have a great and safe conference.