
Sean Nguyen - Beyond Dashboards: Dynamic Data Storytelling with Python, R, and Quarto Emails

Oct 31, 2024
18:42

Transcript

This transcript was generated automatically and may contain errors.

So imagine you're tasked by your stakeholders to develop this dashboard to get all the information that they could possibly want, right? So you spend weeks gathering the data, making sure everything's there, and you think that they're going to have all this information at their fingertips. And then when you actually deploy it, they never use it, and it ends up collecting dust.

And so, you know, I don't know about you, but I've experienced this multiple times. And part of it is just kind of the life cycle of developing a dashboard. And I think the underlying cause is dashboard fatigue. Because a lot of the time the stakeholders say they want all these different metrics, you go and compile them, but it's just too much data for them to handle and not enough time to process everything. And sometimes the key insights can get lost in the noise, because there's so much information to look at. And then sometimes they'll have analysis paralysis, because they don't know exactly what they want. From their side it's, you know, I'm running to the next meeting, I want to know this metric, so rather than go to the dashboard, I'm just going to email my data scientist to give me that figure.

And just a little bit of background about me. I'm a data scientist at S2G Ventures, a venture capital firm based in Chicago. We invest across oceans and seafood, food and agriculture, and clean energy. And I've built many, many dashboards in my time. And I found that, you know, sometimes you can build these dashboards, but you don't necessarily get the engagement that you want. And I kind of want to share my journey of how I was able to gain additional usage from dashboards, right?

The problem with dashboards and the case for email delivery

And I think part of the reason is that there's friction. And what I mean by friction is that, you know, logging into a Tableau or Power BI portal can be somewhat cumbersome for an executive who's on their phone constantly. And I wanted to kind of meet them where they are. And that's typically, at least for me, in Slack or Gmail or Outlook, right?

And so I want to introduce this idea of using emails to deliver insights to these executives. And then you can have dynamic subject lines that alert them to different KPIs or whatever metrics you want, so that they can actually act on the insight.

And so the analogy that I kind of want to bring up is the idea of the library, right? So the library has all the information that you'd want, but you kind of have to pull. You have to go in and find the information that you want. And I want to introduce the concept of pushing the data, right? So it's almost like you have a librarian come to you and deliver that data. And then in this analogy, we'd be using emails to dynamically deliver it to you whenever you want.

And you're able to do this by using Quarto. And then you're able to identify the specific parameters that you'd want in order to deliver it to them. And you can do this in both Python and R, and then you can leverage Posit Connect.

Case study: the CMO and personalized email delivery

So just to kind of make this more concrete, I'm going to use a case study, so to speak. And so imagine you have a chief marketing officer, Jane. She's got multiple dashboards, but she doesn't know where to search for the information that she wants, because she has like seven different dashboards that she asked for and limited bandwidth.

And so what we can do is identify the key data points that she wants at this particular time. And so for this example, let's say for this quarter, she's really interested in some campaign Y and she wants to know whenever there are new leads, right? So these are the key metrics that are going to guide our principles for developing these emails. And then you'll just want to make sure you personalize the delivery. So the two things I focus on are the when, so when we should send these emails to them, and then the so what, so what do we want them to do in response to said email, right?

And so how I do this is I use Quarto and then the pins package. And then we use Posit Connect, just like we saw in this morning's keynote. It's the platform that you can use to host all of your applications, Shiny apps, and reports easily within your enterprise.

And so I use pins extensively at S2G because we have multiple data sources that feed into our data warehouse, and in ours, we use Google Cloud, so we have BigQuery. But then within our Connect instance, I have a QMD file that runs every day and it's able to save model data as a pin. And then I have data that's prepared for specific use cases for different departments. So in my case, it would be marketing and finance. So I'd have marketing data that's ready to go whenever I need it, and then finance data, and so on and so forth.

The four steps to generating Quarto emails

And so once you have that, there are essentially four main steps that you need to generate these emails. So the first one is setting up your YAML. So this is an example of a QMD report. All you need to do in the YAML is change the format to email. And then the second one is just running your code. So I'll just reference the pins. So this one, I'll reference the marketing pin right here. And then you just run whatever R or Python code that you would normally run. And then third is establishing your conditional email logic. So this is the brains of delivering an email or not delivering it. And the fourth is the actual content of the email itself.

So within the context of R, I use the YAML and then we use the params argument right here. So then you can add parameters to your document. So this is a way for us to modularize our documents so that you can send different variations for, let's say, different departments. So let's say I have a user table right here with all the individuals within my organization. They belong to different departments. So you can pipe different parameters into the document and then reference them later. Right. So then right here, we'll have Jane, and she's part of marketing. You can reference this by using params right there. And what that'll do is render your code so that it'll just say marketing, hello, Jane, and filter on her email right there.

And so then the power of this is being able to use it to generate multiple documents for different departments, different users, different use cases. And so JD Ryan had a fantastic talk last year on generating multiple different documents by using the purrr package and the map family of functions. So you can kind of see here I can generate a finance report or a marketing or HR one, what have you. And so what I'm able to do is use this to generate different documents for different folks within the org.
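
As a rough sketch of that one-document-per-person idea, the snippet below builds one `quarto render` command per row of a user table. It only constructs the commands and prints them; nothing is executed. The file name `report.qmd`, the parameter names `name` and `department`, the second user, and the output file naming are all assumptions made for illustration, not details taken from the talk.

```python
import shlex


def build_render_commands(users, qmd="report.qmd"):
    """Return one `quarto render` argument list per user (nothing is run here)."""
    commands = []
    for user in users:
        # Hypothetical naming scheme so each render gets its own output file.
        output = f"{user['department']}-{user['name'].lower()}.html"
        commands.append([
            "quarto", "render", qmd,
            "-P", f"name:{user['name']}",              # -P sets an execution parameter
            "-P", f"department:{user['department']}",
            "--output", output,
        ])
    return commands


# Hypothetical user table; in the talk this comes from the organization's data.
users = [
    {"name": "Jane", "department": "marketing"},
    {"name": "Omar", "department": "finance"},
]

for cmd in build_render_commands(users):
    print(shlex.join(cmd))
```

Each argument list could be handed to `subprocess.run` on a machine with Quarto installed; keeping the command building separate from the execution makes the loop easy to check first.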

And then within Python, it's relatively similar. We don't use the YAML for the parameters up here. We just say the format is email, but then you insert the parameters into a Python code chunk, and you have the tags: parameters option right there. And then you kind of pipe them in similar to how you do in R. And then you reference them by just using the curly braces python inline syntax, and it works just the same.

The second component of the emails is just running your code. So this is running your code just as you normally would. Right here I have an alert for the params department. And then what this does is I can insert interactive graphs, like a ggplotly, right here, or reactable tables, so that it has interactivity. Whatever metrics you normally have in your analysis, you kind of have them embedded right there.

And then the third component is the conditional email logic. So let's say I have neglected dashboards, right? So you generated them, and you want the users to actually check them out. So I have a Shiny dashboard for my marketing team, and then I have a Shiny for Python one for the finance team. So then you can have different QMD files that will house the email you want to send. And then all you need to do is define your conditional logic. So for the marketing team, what they want to know is, whenever we have new leads, just ping me in Q3. But then the finance team wants to know when we're going to be over budget, you know, two weeks beforehand. So then that's the logic that we're going to be using to determine if we're going to send the email or not right there.
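
The talk does not show how the finance rule is computed, so the following is only one possible reading of "over budget two weeks beforehand": project spending forward 14 days at a constant daily rate and compare it with the budget. The function name, its arguments, and the linear projection are all assumptions.

```python
def will_exceed_budget(spent_to_date, daily_spend, budget, horizon_days=14):
    """Return True if spending is projected to pass the budget within the horizon.

    Assumes a constant daily spend, which is a simplification for illustration.
    """
    projected = spent_to_date + daily_spend * horizon_days
    return projected > budget


# Hypothetical figures: 90,000 spent, 1,000 per day, 100,000 budget.
print(will_exceed_budget(90_000, 1_000, 100_000))
```

The returned boolean is the kind of value that can drive the send-or-skip decision for the finance email.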

So then the fourth component, oops, sorry. So in terms of making this more concrete, we can use this guiding principle of just sending the email whenever we have leads. So what I have is some leads data from the pin, and then you just count the number of leads right there, and then you define a function. So right here, I'm just doing a send email with leads. And then you just have the logic being that if the number of leads is greater than zero and the date is in Q3, then return yes, else return no. So this is the actual function that we use to determine if it's going to send the email or not.
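
A minimal Python sketch of the rule just described: send only when there is at least one lead and the date falls in Q3. The function names are hypothetical, and Q3 is assumed to mean calendar months July through September; a fiscal calendar would need a different check.

```python
from datetime import date


def in_q3(day):
    """Assumption: Q3 is the calendar quarter July through September."""
    return 7 <= day.month <= 9


def send_email_with_leads(lead_dates, today):
    """Return True when there is at least one lead and today is in Q3."""
    n_leads = len(lead_dates)
    return n_leads > 0 and in_q3(today)


# Hypothetical leads data; in the talk this is read from a pin.
leads = [date(2024, 8, 1), date(2024, 8, 3)]
print(send_email_with_leads(leads, today=date(2024, 8, 15)))
```

Passing `today` in as an argument, instead of calling `date.today()` inside the function, keeps the rule easy to test against past and future dates.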

And then the fourth component is just the emails, right? So we talked about the YAML, and then now we need to get into the specific format that we embed within a Quarto doc to send the email. And so what we use is we create a .email div with the triple colons right here, and you can just put plain text in here. You can insert code chunks, and then you can embed a subject. So just note that you just add another div, but you do .subject. It's important to make sure that this is nested within the email. And then you can kind of add dynamic things in here if you'd like as well.

The third one is adding the schedule logic. So you do a .email-scheduled. And so this is where you put your conditional logic, where you're actually going to evaluate whether or not to send it. And so because Quarto supports multiple languages, the documentation says you just need to return something truthy or falsy. So it's like yes or no, Python True or False, R TRUE or FALSE, right? So just make sure whatever you do, you abide by that, and Quarto will take care of the rest. And so this is for Quarto version 1.4 and above. So if you are on an older version, just make sure you at least have version 1.4.

And then so once you define this code right here, you say send email with leads, and then you just evaluate it right here within the email-scheduled div. So because the number of leads is greater than zero, it's then going to send the email. And then this works similarly in Python. You just define your Python code chunk, have your function, and then it's basically the same thing. You evaluate it, and it'll render based on the criteria right here.
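
To show how the pieces described so far might sit together in one file, the sketch below writes a small QMD document containing the email format in the YAML, a tagged parameters chunk, an email div, a nested subject div, and an email-scheduled div. The title, parameter values, lead count, and file name are placeholders, and the layout is an illustration of the structure described in the talk, not the speaker's actual document. The code fence string is built indirectly only so that this example nests cleanly inside its own code block.

```python
from pathlib import Path
import tempfile

FENCE = "`" * 3  # three backticks, built indirectly to avoid nesting fences

QMD_TEMPLATE = """---
title: "Marketing leads"
format: email
---

{fence}{{python}}
#| tags: [parameters]
name = "Jane"
department = "marketing"
{fence}

{fence}{{python}}
#| echo: false
n_leads = 3  # placeholder; the talk reads this from a pin
send_email = n_leads > 0
{fence}

The full report content goes here.

::: {{.email}}

::: {{.subject}}
`{{python}} n_leads` new leads for `{{python}} department`
:::

::: {{.email-scheduled}}
`{{python}} send_email`
:::

Hi `{{python}} name`, there are `{{python}} n_leads` new leads.

:::
"""


def write_email_qmd(directory):
    """Write the sketch document into `directory` and return its path."""
    path = Path(directory) / "marketing-email.qmd"  # hypothetical file name
    path.write_text(QMD_TEMPLATE.format(fence=FENCE), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    text = write_email_qmd(tmp).read_text(encoding="utf-8")

print(text)
```

Note that the subject and email-scheduled divs are both placed inside the email div, matching the nesting the speaker calls out.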

Putting it all together

And then so this is an example of how you actually implement this in the email. So notice here, I'll have the email div, and then you have hi, and then you can reference the parameters. And then you can say, here are the number of leads that we have, and you can just reference it and then insert different plots. You'll notice here that I didn't put anything interactive, like a ggplotly, in here, because this is an HTML email. And then you can reference environment variables right here, like the rs.posit.connect.report.url. So you don't have to hard code it right there. You can just reference it.

And then one tip that I have found is that whenever you're rendering something like a gt table for the email, you just make sure you output it as raw HTML. So this preserves the formatting, or else it'll just be kind of a plain table. And so we have this as an email div, and then you can insert the subject right here. The nice thing about this is that you can insert code chunks in here, like the R code chunk. And so how this will render is it'll say, campaign Y is either higher or lower based on the function and what it returns. And then you can actually add your conditional logic right there. So you'll see that, remember, everything's within the email div. You don't necessarily have to use the email-scheduled div. I prefer to use it, because then you can actually control when these are delivered. And then you can have the subject be dynamic versus just static or hard-coded text.
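
The higher-or-lower subject line could be produced by a small helper like the one below. The function name, the comparison against a previous period, and the exact wording are assumptions; the talk only says the subject reports campaign Y as higher or lower.

```python
def campaign_subject(campaign, current, previous):
    """Build a subject line saying whether a metric rose or fell."""
    if current > previous:
        direction = "higher than"
    elif current < previous:
        direction = "lower than"
    else:
        direction = "level with"
    return f"{campaign} is {direction} last period"


# Hypothetical lead counts for the current and previous period.
print(campaign_subject("Campaign Y", current=120, previous=95))
```

The returned string is what would be placed inside the subject div, so the subject changes with the data on every render.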

Just to kind of put it all together, we can take this marketing QMD file, and you reference the pins. You have your report, your regular code that you normally have. And then that will yield this QMD report that's dynamic, that has all the hover tooltips. And then you can sort, search, arrange, and all that stuff. And then you insert your conditional email logic, and then you insert your email content. And what that does is then it renders a static version whenever the conditions are satisfied, if that makes sense. So then this is the dynamic version that they have it. This is the static. And then you'll notice that it's kind of pared down. It's not as full as this one, because it's just an email. But the nice thing is that you can embed URL links to, let's say, another dashboard that they should be checking out based on your parameters. So this is a mechanism by which you get people to actually go to the said dashboard or look at a specific report.

Delivery via Posit Connect and scheduling

In terms of delivery of these emails, what you can do is you go into your Connect console, and you go to the People tab, and then you're able to create groups. So in this case, I created a marketing group. You can define who's actually going to be in the group, and you can add or subtract people if you'd like, or you can create individual reports for people. And then you kind of provision access by saying, OK, I'm going to allow the marketing team to see this report, have this marketing URL right here, and then you can kind of schedule it. So this is where it renders. You can specify when it renders. You can have it render every hour, every day, every month, every quarter. And then what will control whether they get the email or not is based on the conditional logic that you establish right here. So this would only render if we have leads that are in Q3 or not.

So if you send this out, they're like, I'm getting too many emails. You can kind of tweak your logic right here. And so then the report can be up to date every single day because it's running. And then you change the evaluation of the parameters right there.

Considerations, tips, and continuous improvement

And then some considerations with Quarto emails. Because of the nature of emails, they tend to be static. So you kind of have to pare it down. You can't have super beautiful tables and large tables, because it's going to be pretty small. And then you'll have limited interactivity.

In terms of tips, I think the main thing that I found is just focusing on the key metrics and then using the dynamic subject lines right here. So you can make it so that it says the campaign is higher, lower, whatever. And that way it's not hard coded. And then just be selective in your delivery criteria. That way you're not inundating people with multiple emails. And then always think about the so what.

So when I deployed this at the firm, it's not like it just worked overnight. It's kind of a continuous process. So you're making sure you get the emails to them and then checking whether they're actually looking at the dashboard. So you can look at the Connect admin console to see whether they're using the dashboards or not. And you can adjust accordingly if they're getting too many emails or not enough. Or you can tweak the KPIs that are being delivered to them. And then with this continuous process, you'll get better feedback and better adoption.

And so just to kind of bring it back to that case study, we had the CMO, Jane, and she had multiple demands. And now she's able to see these insights at a glance and doesn't necessarily have to log into the dashboards, because she gets these alerts on her phone. And then she gets the key KPIs and ultimately has the ability to make faster decisions. And so next time you have any neglected dashboards, I invite you to explore using Quarto emails. Just make sure you get your conditional logic right. And with that, I'd be happy to take any questions.

Q&A

We have time for, I think, two questions. First one is a very good one. Where did it go? How do you assess that the emails are actually being used and helpful since you often can't see the traffic like you would with a site or a dashboard, like if they don't click through to that dashboard?

Yeah. So I think it's like a proxy, right? So you're essentially, you're hopefully getting them to go to, let's say, the report. You can actually check who's viewing the report, number of people, but yeah, it's true. It's like you can't necessarily do like the, did they actually open the email or not? The hope is that you can hopefully make it worthwhile so that they actually do something with it versus just pinging their inbox, you know, 50 times.

One more quick one. Is the code available for the example you showcased on GitHub anywhere? Also, "wow," in the comment. I think we're all very impressed.

I can make it available. I think, yeah, it was so annoying. I had to make it not like my actual code. It was annoying. But yeah, I'll post it on GitHub.

So yeah. Thank you, Sean.