Joshua Cook - Quarto: A Multifaceted Publishing Powerhouse for Medical Researchers

Transcript

This transcript was generated automatically and may contain errors.

Give me a hand in welcoming Joshua Cook, who will be talking about Quarto as a multifaceted publishing powerhouse.

Thank you so much for having me. I look up to a lot of the people in this room, so it's a real honor to present today. My name is Joshua Cook. I'm from the University of West Florida. Here's a little bit about me.

I recently graduated this past May with my master's in data science. Before that, I was in clinical research, and then before that, I was in biomed. Right now, I'm an adjunct professor at the university. I teach anatomy and physiology, and I work with a lot of Ph.D. researchers and a lot of physicians. I was a research quality analyst and a clinical research coordinator for a year before that. Actually, for three years before that.

Medical research. I've heard a lot about pharma. I want to insert my background a little bit here. I have never worked in pharma. I've never done clinical trials data. I'd like to. I'd like to learn it and get into it, but my experience has mostly been with phase one clinical trials or benchtop studies. That's mostly in academia. Sometimes you'll get it done in healthcare and industry.

When I talk about medical researchers, I'm talking about M.D.'s, D.O.'s, nurse practitioners, occasionally Ph.D.'s, or if you're responsible for preparing documents for somebody in this category.

The medical research dissemination process

The general process. You start with study planning. You initiate your study. There's some data collection, data analysis, and then dissemination. Pretty similar to clinical research, except that we don't have as strict guidelines in terms of our data structure, and then these are usually, like I said, in academia and stuff like that.

The publication needs are a little bit different. Our main goal is not a regulatory submission. I've only ever worked on one or two of those briefly before. My main goal in my roles has been statistical reports, and sometimes there'll be a simulation study before the report. Interim reports while a study's going on. Interim presentations, if a doctor or a Ph.D. is going to give a presentation at a conference and the study's still ongoing, and then when it's done, a final report, a final presentation for them to go present again.

They usually want the manuscript submitted as soon as possible. They want a blurb for their website. They want it on their blog, and then there's usually this new idea around patient-participant material. The patients themselves are starting to ask for their results back, so whenever you have people who are taking part in a clinical trial, and they've been in it for six months, they want to know what their labs have been looking like. Well, it's very, very hard to extract that if you're entering data into an EMR or some dense medical system, so being able to very quickly extract that information for that patient, and then give the doctor a presentation to have on hand to talk through with them at their next visit, is really helpful. Finally, sometimes there will be a regulatory submission.

Just in summary, this is a lot of work to keep up with for medical researchers, people who are not used to coding, people who don't do submissions, especially if changes are made. Many, many times a doctor will look at my statistical report and say, "It's great," and then they're getting ready for their presentation, they're actually out there, they're not at the institution anymore, and they're texting me, "I need you to update figure one as quickly as possible." Well, I do that for them, but then every single document that I made before that is out of date.

So that's really what my presentation is focused on: how do we use Quarto so that when I update one document for the physician or the researcher, everything else updates with it, and we have a change log, something to say what it looked like before.

So here's the traditional method. You draft an initial document, everybody approves, all documents are generated, I've got a report, a presentation, et cetera. Then I get a random edit request, I update the report, and then I frantically copy-paste everything that I updated into all the subsequent versions. So yes, the doctor wanted figure one to be different, but now I've got to change it everywhere else. During that time, something gets missed, an error is made, or some more edits have been requested since the first request. Then the author, the medical researcher, will go out and present, and they accidentally present something that's not true, or some outdated piece of information. They get frustrated with me, the stakeholders are frustrated, and then we continuously repeat this process. It looks like some of you have experienced this before, so that's good.

As long as I make the change in my data processing QMD, whenever I go to re-render these documents, that change will be applied everywhere because I used shortcodes.

I've been in situations where things get messed up, where they have a certain copy that I generated last week, and then I generated this new copy, and everything's just a mess. This avoids that, especially if you're using GitHub, because then you can track your changes as everything is taking place.

And then furthermore, researchers sometimes don't know how to use reveal.js. They prefer PowerPoint or Keynote, and that's fine. You can just change the format to pptx, and then it will output those figures, those tables, and the inline code that you referenced into a PowerPoint format. That way, the only thing they really have to do is copy that slide from your new output into their existing presentation. That makes it very quick and easy. They don't have to worry about resizing. There's no messy screenshotting of reports, which is what physicians have done with me in the past: the figure is already in the updated report, but they just take a screenshot of what I generated and the whole thing's messed up. But yeah, everything will update for their new presentation.
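As a minimal sketch of that format switch (the title is hypothetical, and this is not the speaker's actual slide deck), the front matter of a presentation QMD can list both slide formats so the same source renders to reveal.js and to PowerPoint:

```yaml
# Front matter of a presentation .qmd file (hypothetical title)
title: "Interim results"
format:
  revealjs: default   # HTML slides
  pptx: default       # PowerPoint file rendered from the same source
```

Keeping both formats in one header means the figures, tables, and inline values come from the same code whichever output the researcher prefers.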

So in this case example, this is really important because all of these documents are frequently used by medical researchers. And you could expand it further to Quarto websites, blogs, or other types of projects, where maybe they have a website, maybe they have a research portfolio, maybe they're applying for grants. I've heard all of those things. You could add that in there as well. As you're updating your one QMD file, every subsequent Quarto output is going to update. If an error is identified or something else comes up, simply altering those files will cue them to refresh.
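One way to get this refresh-on-change behavior in a Quarto project is the `freeze` execution option. The following is a sketch under the assumption that the documents live in a single Quarto project with a `_quarto.yml` at its root; the project type shown is illustrative, not taken from the talk:

```yaml
# _quarto.yml at the project root (project layout is an assumption)
project:
  type: default

execute:
  # During a project render, re-run a document's code only when
  # that document's source file has changed; otherwise reuse the
  # stored results from the previous render.
  freeze: auto
```

With `freeze: auto`, untouched documents keep their previously computed results, which is what makes frequent project-wide re-renders cheap.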

And I'll show you a setting in a second that you have to put into the Quarto YAML to freeze the output at that time point. Then, anytime a change is detected, it will re-render that document. Another template may also be needed, because we've all been rejected from journals before. This is probably one of the biggest time savers that I've had: we'll submit to the Public Library of Science, and they'll reject us. And then we'll say, okay, submit to Nature. Well, then you have to go reformat the references, the figures, the file structure; everything needs to be changed. Instead, why not just change one line in your YAML header to switch the template over to Nature? Everything else will be automatically rearranged. The only catch is that some journals require you to submit figures and tables separately. In that case, you will want to disable generating them inside the document and just save them directly to files.
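A hedged sketch of that one-line template switch follows. It assumes the journal formats come from the community `quarto-journals` extensions and that those extensions have already been added to the project; the title and bibliography file name are hypothetical:

```yaml
# Front matter of the manuscript .qmd file (hypothetical title and file names)
title: "Study manuscript"
bibliography: references.bib
format:
  # Before the rejection: PLOS template
  # (assumes the quarto-journals/plos extension is installed)
  # plos-pdf: default

  # After the rejection: Nature template
  # (assumes the quarto-journals/nature extension is installed)
  nature-pdf: default
```

Only the format key changes between submissions; the analysis code, citations, and cross-references in the body of the QMD stay the same.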

So even if the researcher prefers traditional PowerPoint or Word, you can still output to those. It just gets a little bit tricky with tracking. I still recommend it, though, to protect against redundancy when updating your analysis code. In other words, we are effectively moving beyond copy-paste. We're making it so that they no longer have to take the report and screenshot whatever change I made. It's automatically in all their documents. If they have access to the directory that I'm outputting these documents to, maybe on a shared drive or something like that, I can just say, hey, every document that you need is now in the updated drive. It saves me a ton of work, and it's really efficient for submissions.