Data Science Hangout | Bryan Butler, Eastern Bank | Using the Best Tool for the Job
Transcript
This transcript was generated automatically and may contain errors.
Welcome to the Data Science Hangout. I'm Rachel Dempsey, your host for today, and welcome back to all the familiar faces I've seen here the past few weeks. For anybody who's just jumping on for the first time, there's no agenda for these sessions. It's really just focused on the questions that are most important to you all. So you can jump in live or put questions in the chat. We also have a Slido link, so if you want to ask any anonymous questions, you can do that as well. And just a heads up that this session will be recorded, and we'll share the recording on YouTube. I'm joined by my co-host for today, co-host and friend from the Boston useR Group, Bryan Butler, who's the VP of Business Insights and Analytics and has been on a few of these sessions so far. Bryan, I'd love to have you introduce yourself and maybe share a bit about your team and the work that you do.
Sure. Hey, thanks, Rachel. And thanks, everybody, for joining. So I have a very small team of two: myself and one other, sort of a junior data scientist. We support pretty much the whole organization, and we're in the early stages. The company has only been doing something like data science, and I won't call it completely data science, for about 18 months. So it's really just getting started, mostly focusing on supporting the marketing department. But I do a lot of work for our president and the board. And, you know, we're trying to just really get started with innovation: how do you use data science in these traditional types of roles? And one of the things I want to say about that is, I often see on LinkedIn, how do I get a job in data science? How do I get started? And I always tell people, it's easier to bring data science skills to a regular job in marketing, ops, or finance than it is to get that first job listed as a data scientist. And, of course, everybody's always going for your famous tech companies and everything else, so those jobs are even more one-in-a-million, lottery-ticket types of stuff. So that's a little bit about what we do. My career has been the most direct path you can get into data science: started in the military, worked in chemistry, then went into insurance as a quant risk trader, ran my own business for a while as a restaurateur, was an independent contractor, and then got into data science. As you can see, it is the most direct route, as I always tell people.
Yeah, yeah, I left the corporate world in 2010. Actually, we had lived in Bermuda for a couple of years; that's where I was doing the risk trading. Came back, did one more year in the corporate world, and I was like, I'm tired of this. I'm tired of the commute. Let's chase a passion. And our passion was food and wine. So that's what we did. We took a departure and did that for a couple of years, actually. And I would never do it again. People come up to me and say, oh, that's so exciting, I really want to do it. I'm like, no, you don't. And, you know, even now, I think about it, and with COVID and everything, I'm not sure my business would have survived. It was hair-raising and white-knuckling enough without COVID going on, back in 2010, '11, '12, '13. So I really feel for the people in that industry and what's going on out there, and I understand what they're really going through. And it's hard.
I think one of the things that's been key to all of this, when I look back at it, is this passion for learning. And even now, I am literally constantly taking classes and learning. I think I'm enrolled in like four online classes now. And I'm actually enrolled in a curriculum for computer vision, which is great because we start from ground zero. And it came from the fact that, along the way, I became obsolete. And it's kind of scary. It was after my restaurant business. I was like, all right, let's get back into the corporate world. I had just finished some contracting work also, but there was a big break. And I was applying for jobs, and people were like, do you know SAS? And I was like, no, but I could do the job. These were jobs in the insurance industry, so it's like, I know the statistics, I know the actuarial side, I know your specialty lines, I used to run the products and things like that. But people were just like, no, we need a modern skill set. And it took me about eight months to find that job. And it was scary, I'll be honest with you. I actually took the Coursera curriculum from Johns Hopkins, the R course, which I think is like 12 courses or something like that, and crammed through it in about three or four months. And that's what landed me a job: being able to go into an interview and talk a lot more about that. And it was with a startup. So it was sort of, I call it, my first step to reintroduction to the corporate world. And that was back in 2014.
It was a couple of things. So there was a lot of that experience, because it was actually an insurance startup. But I was totally not hired to do that at all, which is kind of funny. That's not what they were hiring for. But I think taking the courses helped connect all the dots of what I had spent the last 12 years doing. You know, I took PhD-level courses in time series when I was in graduate school. So I had that time series background and used it a lot. I think it's one of the most underutilized or under-focused skills out there, like real time series, not just, oh, I use Facebook Prophet, or I use AutoML and it spits out some results. And the funny thing was, I had been doing predictive analytics for a while, and it was actually data science. I was using linear regression and doing data holdouts and logistic regression. So when I was talking to these guys in the interviews, they were like, yeah, you've done all this, you're doing all this stuff. And I was like, well, I use the R language. And one of the guys in that interview was hugely into R. He's a good friend of mine still to this day. We still geek out every once in a while. But it definitely moved the needle and made it happen.
Using the best tool for the job
I learned a huge amount. And that's actually almost before the tidyverse days, so I learned a lot of it through base R stuff. And so now I'm a hybrid. I see discussions on LinkedIn: are you tidyverse or not? Well, I'm like, you know what, it's like everything in life. Use the best that works. Don't box yourself in. Same thing with Python and R. I use both. In fact, I'm a huge fan of reticulate. I use it all the time now, because R Markdown is the best way to present. So I'll mix and match R and Python to do a lot of my presentations using sort of a web-based tab format in R Markdown. And it just gives you so much more freedom. And so I tell people, don't box yourself in.
If you think about a carpenter, you know, they have like four different nail guns, each one for a different thing. They have three different saws. It's the same thing with languages or tools. You know, even knowing a little C++ or other languages, it all fits together. You will find that one day you're going to reach for that tool and use it when you least expect it.
NLP use case — solving somebody's problem
I'm a big NLP person; I think a lot of people have found that out. And that is changing almost daily, it feels like. You know, just when I was getting good at TensorFlow 1.0, then 2.0 comes out, and that changes everything. And then there are improvements in neural networks, and then all of a sudden transformers come out, and that changes the whole game. But you've got to be able to apply it. And I think that's one of the biggest things. There are some people who say, oh, wouldn't it be cool if. If that's what your project starts out as, it's not going to be successful, especially in a business environment. It's more about, can you solve somebody's problem? And about six months ago, we were able to do something with NLP, which was to take all these fancy models and tools, weave them together with R Markdown, and actually give my CX guy a visualization tool for comments. So it took all the comments and ran them through a bunch of different processes. But at the end, he had this three-dimensional interactive visualization that color-coded the comments by theme. It used dimensionality reduction to get the first two dimensions, and then one of the scales was sentiment. And you could just spin this thing and flip it around and read the clusters and what people were saying. And it was really interesting, because it was so easy to really see what's going on. And so that was really a first move for the bank, for people to see something completely different that they'd never seen before.
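As an editorial illustration of the layout described here (two reduced dimensions plus sentiment as the third axis), the following is a minimal sketch, not the speaker's actual pipeline. It assumes each comment already has an embedding vector and a sentiment score; the function names, the toy data, and the use of power-iteration PCA as the dimensionality reduction are all assumptions made for the example.

```python
import random


def center(rows):
    """Subtract the column means so each dimension has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def leading_direction(rows, iterations=200, seed=0):
    """Approximate the top principal direction by power iteration on X^T X."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in rows[0]]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]                  # X v
        w = [sum(s * row[j] for s, row in zip(scores, rows))    # X^T (X v)
             for j in range(len(v))]
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return v


def remove_direction(rows, v):
    """Deflate: strip each row's component along the unit vector v."""
    return [[x - dot(row, v) * y for x, y in zip(row, v)] for row in rows]


def comment_coordinates(embeddings, sentiments):
    """Return (x, y, z) per comment: two principal components plus sentiment."""
    centered = center(embeddings)
    pc1 = leading_direction(centered)
    pc2 = leading_direction(remove_direction(centered, pc1))
    return [(dot(row, pc1), dot(row, pc2), s)
            for row, s in zip(centered, sentiments)]


# Made-up 4-dimensional "embeddings" and sentiment scores for six comments.
embeddings = [
    [1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 0.9, 1.0],
]
sentiments = [0.8, 0.6, -0.5, -0.7, 0.1, 0.0]
points = comment_coordinates(embeddings, sentiments)
```

Each tuple in `points` could then be handed to any 3D scatter tool, with the theme label used for color.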
Implementing incremental change
But doing those things is a challenge too, because you've got to teach people about something that they've never seen before. And a lot of people aren't comfortable with that. And especially in, I'll call them, older, traditional industries, like banking and insurance and some of the other ones, no one wants a revolution. They can't handle it. It's just too much change. But they can handle incremental steps. You give it to them a little bit at a time, and maybe in two years you'll find yourself a lot further down the road. And I think that's the way you've got to look at it with industries like that. It's a marathon. It's not a sprint. It's not going to happen in six months, or whatever fast timeline you read about that a Google or Facebook or someone is on. You're not them. We're not them.
And sometimes it's about focusing on the basics first, like data. At the company before the bank, I was in the healthcare industry. And they formed a core analytics team and everything else. We spent two years building a data warehouse. That's what they focused on. And it was the right decision. We hired data engineers, getting everything set up so you have one single source of truth. The biggest way to lose credibility with your executives is you come in and say something and show them some data, and they say, I saw something else elsewhere. And now you've got a problem. And so we focused on making sure that didn't happen. And then you can actually do the data science. When the data engineers are building you data models and everything else, then it's easy. Like, you don't want people writing custom SQL, because everybody writes it differently. The importance of data engineering is not stated enough out there, because that is the ultimate foundation.
Yeah, so it's all about, like I said, solving that business case. So somebody says, hey, I've got this survey. I did a year of market research also, and kind of revolutionized that in a way. They're like, hey, I have some survey data, can you analyze it for me? And so from there, to me, there are two aspects of it. You're not going to do that much machine learning in it, which is good. But then it's also just bringing better visualizations. Traditionally in marketing, you hear "top two box." And I'm like, no, no, no, you use all of your data. So for a question where people score you one to five, or one to 10, or whatever, you show that as a violin plot or something like that, which shows the full distribution of everything. And people are like, wow, I've never seen that before. But at the same time, you've got to be able to explain to them what they're looking at, too. And that goes back to, you can't give them too much change at once, because you'll get hung up explaining everything, and you'll never get anywhere with the actual, you know, what did we get out of this survey kind of thing.
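To make the "top two box" point concrete, here is a small sketch with made-up responses on a 1 to 5 scale. The function names and the two hypothetical questions are assumptions for illustration only; the point is that two questions can share the same top-two-box score while their full distributions look very different.

```python
from collections import Counter


def top_two_box(scores, scale_max=5):
    """Share of responses in the top two points of the scale."""
    top = sum(1 for s in scores if s >= scale_max - 1)
    return top / len(scores)


def distribution(scores, scale_max=5):
    """Share of responses at every point of the scale, including empty ones."""
    counts = Counter(scores)
    return {point: counts.get(point, 0) / len(scores)
            for point in range(1, scale_max + 1)}


# Two hypothetical survey questions with ten responses each.
question_a = [5, 5, 4, 4, 3, 3, 3, 3, 3, 3]   # lukewarm middle
question_b = [5, 5, 4, 4, 1, 1, 1, 1, 2, 2]   # polarized

for name, scores in [("A", question_a), ("B", question_b)]:
    print(name, top_two_box(scores), distribution(scores))
```

Both questions report the same top-two-box share, but the full distribution shows that question B has a large group of unhappy respondents that the summary hides. A violin plot is one way to draw that full distribution.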
Building a data science team — timelines and delivery
I think you've got to tackle the problems that are in front of them right away. And for instance, I think your timeline is a year, but you've got to deliver pieces along the way. So one of the things that we've been working on, and I've been there about a year, early on, was the customer lifetime value model. And we're making our progress through it. And we're delivering the first real chunk, which sounds like a basic part of it, but we've shown everybody, these are the lifetimes of all your segments, very much like a survival analysis. Because you can't do anything else with lifetime value until you know how long people live, you know, how long the customers hang around. But we've delivered little pieces every few months along the way to let people know we're still working on it. We're giving them the easy, low-hanging fruit while we work on this big strategic issue.
So you have to have two timelines. You have the strategic timeline, which is one to two years. And then you have your every-few-months timeline, where you deliver something, even if it's just a simple visualization, as long as it's different and as long as it's useful.
But, you know, it all depends on the organization you're in. Some organizations can handle it faster. And for others, it just depends on what they're used to seeing, I think. Because sometimes you get into "just show me what I've always seen, don't deviate." And that's a different thing.
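The "how long do customers hang around" question mentioned above is the kind of thing a survival curve answers. The sketch below is a plain Kaplan-Meier estimator written for illustration; it is not the bank's model, and the tenure figures are made up. A customer who is still active counts as censored (`churned` is `False`).

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time a churn occurred.

    durations: tenure per customer (for example, months)
    churned:   True if the customer left at that time, False if still active
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        left_at_t = 0      # customers who churned exactly at time t
        removed_at_t = 0   # churned or censored at time t
        while i < len(events) and events[i][0] == t:
            left_at_t += 1 if events[i][1] else 0
            removed_at_t += 1
            i += 1
        if left_at_t:
            survival *= 1 - left_at_t / at_risk
            curve.append((t, survival))
        at_risk -= removed_at_t
    return curve


# Hypothetical segment: tenure in months and whether the customer has left.
months = [6, 12, 12, 24, 36, 36]
left = [True, True, False, True, False, False]
curve = kaplan_meier(months, left)
```

Running the same function per customer segment gives one curve per segment, which is the "lifetimes of all your segments" view.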
But the other thing about my career is, because I'm a little older, I was often that first guy in doing data science. Six, seven years ago, I was the first person in this marketing research firm to transform their analytics from manual Excel and small data to algorithms, and to understanding that there are statistics involved. And so you can't just put an average out there and assume it represents the data set.
And I think the first thing I did for them was a linear regression. And I think anybody with a data science background is saying, yeah, but that's basic. But not everybody uses it. People don't use that every day. But it is an extremely powerful tool.
And I would say, along with that, keep it simple if you can. I know too many people who, solving a problem, the first thing they want to do is run an XGBoost model. Like, throw all the data in the soup and let's run an XGBoost model. But you're never going to be able to explain that to somebody. And so if you can do a simple regression, even if you think about a scatterplot as a visual with a regression line through it, you've done a regression for that. And people get that. And I think that's where you start.
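For readers who want to see how little is involved in the "scatterplot with a regression line" mentioned above, here is a minimal ordinary least squares sketch for one predictor. The data and the function name are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: the average of ys alone (7.0) says nothing about the trend.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"y = {intercept:.1f} + {slope:.1f} * x")
```

The slope and intercept are the two numbers behind the line people see on the scatterplot, which is why this is an easy model to explain.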
My big thing at the marketing research firm was doing that. And also I brought text analytics in. That's where I first got involved with it. I mean, as a marketing research firm, surveys and open ends are your lifeblood. And when I was talking to people, they were like, well, we take a sample of about 100 or 200, and then we try to do it all by hand. And I was like, so you're throwing out 90% of your data. And they were just kind of, yeah. So, well, no, we can't do that. And the thing about that role was, because I was the new person, people didn't know what I did. You get a chance to define your role a little bit. And that's why I say bring your data science to a traditional role, because you might be able to redefine it along the way and make it what you want it to be.
Communicating new ideas to stakeholders
You've got to start really simple, and I'll use an example of doing that from early in my career, literally almost 20 years ago, in insurance. I was doing a commercial property rating plan, and the data didn't work for, like, how much credit do I give somebody for a deductible? I mean, think about it: if I take a $500 deductible I get X off, and if I take $1,000, I get more. And this was commercial property, so you're talking about $5,000, $10,000, million-dollar deductibles. The data didn't work at all. And so I was like, I've got to do something; failure isn't an alternative here. And I did something from my finance days, called an option model. At the end of the day, an insurance contract is a call option. And a call option has something called a payoff function, which looks just like a hockey stick. So you have a straight line, and then when the loss exceeds your deductible, the insurance company starts paying. And I literally spent six weeks describing a hockey stick figure for people, just explaining the concepts. I'd go back and forth, and I'd show it to them five different ways, but I spent six weeks just selling that idea without even crunching a number. It was selling a concept.
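The hockey stick described above is the standard call option payoff, with the deductible playing the role of the strike. The sketch below only draws that payoff shape; it is not the rating methodology the speaker filed, and the deductible and loss amounts are hypothetical.

```python
def insurer_payment(loss, deductible):
    """Insurer pays only the part of the loss above the deductible."""
    return max(loss - deductible, 0.0)


deductible = 10_000  # hypothetical deductible
losses = [0, 5_000, 10_000, 15_000, 20_000, 25_000]
payoffs = [insurer_payment(loss, deductible) for loss in losses]

# Crude text plot: flat at zero up to the deductible, then rising one for one.
for loss, paid in zip(losses, payoffs):
    print(f"{loss:>7,} {'#' * int(paid // 2_500)}")
```

The flat part is the blade on the ice and the rising part is the handle. A higher deductible slides the bend to the right, which is why it earns a larger credit.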
And you just can't get frustrated. You've just got to be really open. And then you've got to read people too. As you get a little buy-in, like you get the acknowledgement, hey, I understand this, then you give them a simple example that they can work on. And I did that. I took one of their accounts, with data that they had seen from somebody else. That's the other thing: you've got to use something that they're comfortable with. We had just had a broker in, and the broker explained, here's the standard deviation of your portfolio. And I was like, yeah, you remember that? And they were like, yeah. So you keep it familiar. It was very simple, but it took me four or five months to sell that idea, going from "I don't have any data to do this with, so we're kind of screwed at this point" to "now I'm going to get creative, outside the box, with something nobody's ever seen, and I'm going to have to figure out how to sell it." And at the end of the day, it's in there. It was filed in all 50 states, this whole methodology that I sort of developed. And I've run into people from my past career down the road, and they're like, yeah, I took that idea. Like, I have that rating plan, or I have that in my plan, and it's moved to other people's companies. So it's just kind of funny how that can happen along the way.
Balancing leadership with hands-on work
I just had that conversation with my boss the other day. It is a tough balance. I've always believed, even from my days in the military, that you can't just be a leader, and you can't just be the doer either. As a leader or manager, you've got to figure out how to weigh it. At my level, you've got to figure out how to do both. I also do a lot of code reviews with my junior guy. And it's, hey, you know, this isn't efficient, and we kind of walk through certain ways, or, have you ever thought about doing it this way? On the other, really exciting stuff, I try to do a little piece somewhere, like the NLP thing, that whole 3D visualizer thing. It was one of these where I had a couple of days in between all these other projects, and I'm like, I'm just going to bang this thing out. And, you know, sometimes you end up working at night and all the other stuff, but it was part of it for me too.
I hadn't worked with transformers before, Google's Universal Sentence Encoder and all these other things, and the visualization all of a sudden just came to me one night. And I was like, oh, we can do this in 3D with sentiment. Because everybody in NLP talks about sentiment, and it's like, okay, that's fine, but what are you going to do with it? And for us, it was actually an axis in a plot, and it allowed you to all of a sudden take thousands of comments and visualize them on this plot, scaled by sentiment, and it was like, ah. And then when we color-coded them by themes and everything else, you could really see where the clusters of the topics were.
But going back to how you balance it: it's not easy. You've got to make an effort, and at the same time, you've got to manage upward. You know, working with your manager and saying, hey, I feel like for the last X months I haven't been able to deliver some of these things that I know are important, because we're focused on fighting fires. And so it's just having that conversation so they understand, because you don't want to get to the end of the year and have somebody say, what have you done, or why haven't you done this? That's disastrous, I think, for a lot of people. So it's this constant checking in: hey, this is what we're working on. You understand this is important.
And I actually keep a Jira board for my team. I know most people use it for software development and everything; I use it for "these are the projects that are hot." He actually taught me, because people were just throwing things at us. It became, at one point, like the only thing you could think about was what's due in the next hour. And I'm like, oh, my God, I can't handle this. This is just too much stress. And he showed me how to do the board. Because then you can show someone: I've got these 10 things. You've added an 11th. It's not going to get done. So which ones are we moving?
Leadership lessons from the military
Being a manager and a leader, you have to have those conversations, or your people will get killed and they will leave. I think that's one of the biggest things I took away from the military: how to manage and lead people. I was fresh out of the academy, a junior officer, and next thing you know, it's like, oh, you're in charge of people who've got 15, 16 years' experience. And understanding that dynamic, and how you manage those people, is critical. And one of the things as a manager and a leader is you have to be a buffer for your people. You've got to be the person that's between them and the rest of the organization. And I've seen so much bad management over my career, where the manager is either an amplifier, and that's the worst case, if you're an amplifier of all the noise out there for people, and turnover just goes through the roof, or you're just a pass-through, which means you're not even doing anything to deflect it from your folks. And then the expression that comes along with that is: you can delegate your authority, but you can't delegate your responsibility.
So if something goes wrong, if a project goes wrong for one of the people that work for you, it's on you as the manager and the leader. You have to eat it, you have to own it, you don't pass the blame. You know, you can have a talk with that person in private down the road. But when you're in that conference room or wherever else, that's your responsibility; you take it. And so I think that's the part of leadership and management that often doesn't get talked about. People say, I organize teams and I do all this kind of stuff and code and everything else. Yeah, but there's a real human responsibility portion of it that's just not taught. I think that's the biggest part. You can't teach that in a traditional university. You can't go to a seminar on it. I mean, for me, I got it drilled into me for nine years in the military, so it's not something you learn overnight.
Your first axiom is: take care of your people. That's the first thing. And so you've got to constantly ask yourself, am I taking care of my people? We say managers do things right; leaders do the right thing. So part of it is you're going to have to step out of your comfort zone as a leader, and that's one way to practice. But be smart about it. It's kind of like, I tell people, I think outside the box, I live outside the box. But at the same time, I make sure I don't get completely smoked, too, because you can't live out at the end of the limb the whole time. Someone will cut it off on you, and you will come crashing down.
How do you practice that? You've just got to find the small opportunities. You usually find them when something doesn't go right. I think that's what really defines people: when something just doesn't quite go right, how do you handle it? And the key is to not instantly react. I'll be honest, when I was a lot younger, I wasn't that great at that part. I would react right away. You don't always win, and you don't always look good to management, but I could sleep at night. And I think, at the end of the day, I felt good and comfortable with my decisions, and my people liked me, and they expressed it.
Those are hard ones. And you never do that, because you can't get it back. Once you lose that integrity, or once you go down that path, like fudging numbers and stuff, there's no coming back from it. And I mean, I've gotten into it with managers. When I was in the military, as a lieutenant, which is a pretty junior officer, I actually told admirals where to go. I told senior people, no, I am not going to do that. And in work environments, I've lost jobs because of that. Sometimes you've just got to walk away. My career has had a lot of companies and a lot of jobs and everything else, and sometimes it comes down to stuff like that. I am not going to be a part of that. I'm not going to hang around, because it will destroy you and it won't really get you anywhere.
You have to find a moment when it matters. So, again, something has to go really wrong. And you've got to find that way to have that conversation. In my first job as an officer on a ship, my chief engineer and I didn't get along. It was miserable. Everybody knew it, too. And so it was either I'm done, six months into my career, or this is how the next five years are going to be. And so I took a gamble and knocked on the door and said, can I talk to you? And we had probably one of those difficult conversations. And this was so early in my career. But it turned everything around. It was a risk I had to take. And I think you can do that. You can make that decision: knock on the door and have that conversation.
And the other thing is, if it's not going well, don't stay in there for the fight. Like, don't stay in there because you have to be right or whatever else. You've got to be able to know that, hey, this isn't working. But the best time is fresh off a bad experience or something like that. That's the only way you'll be able to get onto common ground to be able to talk about it. And, you know, sometimes you take responsibility for it, too. And that's a good way to start that conversation: hey, I know this really didn't go well. This is what I felt. And try to get discussion and not conflict. But it's not easy, I'll be honest. It's never easy.
Combining R and Python in practice
I was really happy that you mentioned combining different tools and using them to the full effect, like R and Python, and base and tidyverse. I still don't understand this purism of 100 percent base or 100 percent tidyverse, and not just doing the thing that works. I always let my people work however they intuitively understand it, as long as they get something working. That is better than spending days trying to get them to work in one specific way. So I just published this book with O'Reilly, Python and R for the Modern Data Scientist. And I was very happy to hear that you're doing R and Python. I was wondering if you can mention an example where you did that and where you really had a challenge, because I think it's hard to convince other people on the team, or the managers, to combine languages. And also something where you had a success in combining R and Python.
The easy one is, I have not used rpy2 to go in the opposite direction. So I had this project at the last company. It was to do time series forecasting for like 10 different metrics for the leadership team. And I loved R for time series, but it wasn't giving me the visualizations that I wanted. And maybe it's the way I was looking at it. But for instance, take the whole backtest of a model. You take your model and then you backtest it on all your data and see how it compares with the existing data. I had done a lot of work with Python time series, and it seemed way easier to do stuff like that. And I actually built a module, which I have on GitHub, that does all sorts of time series functions for you, using statsmodels and everything else. And so I took Python for the time series part and R for the visualization side of things. Because Matplotlib just doesn't cut it. And Seaborn is okay.
It was in this specific instance because I had to convince people that these models were accurate. How do you do that? Well, you show it to them. You show them, hey, here's the model. And it was really cool, because my boss had to go explain it to the CFO. And it was easy for him, because they could see that as the model moves through time, it was actually learning. In the first couple of data points, there's a lot of variance. And then it sort of hovers in, and eventually it begins to pick up all the little bumps and grooves in the time series. So that was the visualization piece of it. And then I used ggplot2 to do all the plots, with ggplotly, which is great, because one line of ggplotly converts a ggplot2 plot into an interactive plot. And that's what this thing was too. And then I wrapped it all up in R Markdown, using the DT package for tables, because people wanted to export to Excel and all this other stuff. And it really wrapped it all together pretty nicely. So we had this little app that I would build every month. And it would show, here's this month's forecast. Here's last month's and how it performed. And here's the month's forecast. We were doing three months out. And so they could see all three of the lines against the existing data. And you could see it also had the backtest of the model.
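The backtest described above, where the model is refit as it moves through time and each forecast is compared with what actually happened, can be sketched as an expanding-window loop. This is an illustration only: the speaker's module uses statsmodels, whereas the stand-in forecaster here is a plain historical mean, and the series and function names are made up.

```python
def mean_forecast(history):
    """Stand-in model: forecast the next value as the mean of the history."""
    return sum(history) / len(history)


def backtest(series, forecast, min_history=3):
    """Expanding-window backtest.

    At each step t, forecast series[t] using only series[:t], so the model
    never sees the value it is being scored against.
    Returns a list of (t, predicted, actual).
    """
    results = []
    for t in range(min_history, len(series)):
        predicted = forecast(series[:t])
        results.append((t, predicted, series[t]))
    return results


def mean_absolute_error(results):
    return sum(abs(predicted - actual)
               for _, predicted, actual in results) / len(results)


# Hypothetical monthly metric.
series = [10, 12, 14, 16, 18]
results = backtest(series, mean_forecast)
error = mean_absolute_error(results)
```

Plotting the `predicted` values against the `actual` values over `t` gives the picture of a model settling in as it sees more history. Swapping `mean_forecast` for a real model only requires a function that takes the history and returns the next value.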
The only way this was going to get done was between the two of them. Otherwise, I would have had to build some crazy app in React or something like that to get it to work. But with R Markdown, you just knit the code up. And with Plotly, the next thing you know, you've got an interactive application that goes to the web. I love just making a Shiny runtime in a Markdown document and not even using the full thing. Just having individual snippets. It's so incredibly easy.
And what's interesting, as an aside, a different part of it, is I actually gave a talk at the Boston useR Group on reticulate. And one of the things I showed is that to do a simple facet plot in ggplot, you can do it in four lines. To do that in Matplotlib, it was like a whole page of code. And I actually showed people: you have to write this much code versus this much to get the same plot. So visualization is key.
And I've actually used R in the past, actually in my first job back, like that startup job, to do vector autoregression. Because we were forecasting our business using, and it took in two inputs. It took in how many sales people we had, and then using the time series of the other day, and it actually worked extremely well. It was basically think about sales leads and sales. Those two series are highly integrated. And then we had an extra variable called how many people did we have that day on the floor. And that worked beautifully. The vars package in R, vector autoregression, getting the plots, it took a little bit to unwind it. Because sometimes you get some weird data types in R that can be challenging, like lists.
And so if you do the natural language processing with R, if you do this frequency threshold count function, it's in the tm package. What you'll find is that it gives you paired data that's, you know, word and its frequency. Like, if you do text and say, my phrase is customer service, show me all the terms that are correlated with customer service. But when it returns that data type, it's like a tuple that you can't unwind. And you have to do all this unlisting and everything else. I wouldn't blame R for that. I would blame the person that wrote the packages. They weren't thinking about using it.
But it's funny with the unpacking the frequency things, a guy came up to me and said, I've been trying to do that for a year. I was like, well, here's the code. It solves all your problems because that function is so valuable. That frequency, you know, basically it shows the correlation of words in your documents and it makes a very powerful plot. Like I always do that. When people say customer experience, or I do it either as a bigram or trigram, show me all the phrases that are highly correlated with it. And you can see some funny stuff. If you get a rich enough text, you will see some funny phrases.
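The idea of "show me all the terms that are correlated with this one" can be sketched in plain Python. This is an illustration only, not the tm function mentioned above and not the speaker's code: it works on single words rather than bigrams or trigrams, the helper names `pearson` and `associated_terms` are hypothetical, and the four documents are invented. The result comes back as flat (term, correlation) rows, so there is nothing to unlist afterwards.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def associated_terms(docs, target, min_corr=0.5):
    """Return flat (term, correlation) rows for terms that co-occur with `target`."""
    counts = [Counter(doc.lower().split()) for doc in docs]   # per-document word counts
    vocab = sorted(set().union(*counts))
    target_vec = [c[target] for c in counts]
    rows = []
    for term in vocab:
        if term == target:
            continue
        r = pearson(target_vec, [c[term] for c in counts])
        if r >= min_corr:
            rows.append((term, round(r, 2)))
    return sorted(rows, key=lambda row: -row[1])              # strongest association first


# Made-up comments, purely for illustration.
docs = [
    "service was slow",
    "service was rude and slow",
    "great rates",
    "friendly staff great rates",
]
for term, corr in associated_terms(docs, "service"):
    print(term, corr)
```

Because each row is already a plain pair, it can go straight into a bar chart or a table.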
So one of the things that got me down this route, or actually I was already down it, and I call it a tale of two notebooks. Because I was doing something for our marketing guy several years ago. And I was doing it in Python. But you can't give somebody a Jupyter notebook. It just doesn't visualize. And I could have wasted my time copying all the things out and making a PowerPoint deck and everything like that. But I hate doing that because now you're into this really manual process. And if you've got to redo it. So what I figured out was that I could do my Python in R Markdown, use my tabs, hide all the code and only show a couple of key graphs and do the Markdown to do like the slides. And he'll get exactly what I want him to get because I actually gave him the first notebook. And I said, look here, here, here, here. And, of course, they don't. And he was totally confused. So it's like, all right, let me change this. And so I began to copy and paste from Jupyter notebook into R Markdown.
And I do it a lot for presentation because the Markdown presentation format is the best out there. You know, there's nothing in Python that's as good. So I think about it as, R and R Markdown is just the best way to present anything. And I always use the HTML format with tabs. So it makes it look like a Web page. And I remember doing it. So I was doing an NLP display when I was at Hanover Insurance. So I was text mining all their claims. And I was presenting to the executive team. And it was in my HTML format. So somebody says, like, how do I print this out? And I was like, you don't. Because part of it, when I do it in that format, it looks like a Web application, and that's what I was doing. Like, this is the prototype to a Web app. And, you know, see, we have tabs and links, just like a Web page does. And here's the functionality. And here's the display you're going to get.
You know, as Rick was talking, what's interesting is the junior data scientist that I work with now. He started using R. And then he switched everything to Python. And now he won't use R. And I was like, you can't be that rigid. You can't be that rigid. You know, there's a guy out there, Matt Dancho, you know who he is. Big R guy. Incredible teacher. And it's really like, if I get these quick things like somebody wants a couple of slides, a couple of visualizations with some data tomorrow, I'm doing that in R. And that's what I tell the guy that works for me. I said, I don't care. This is going to be in R. So I can bang this out in about a couple of hours. And, you know, I've got a huge amount of base code. It's faster. I can knit it. You know, I just told him if I am doing anything on a quick hit scale, it's going to be in R.
Data storytelling and communicating visually
One is find the simplest visual for it first. Like one of my chemistry professors back in college used to always talk about it, call them football player analogies. Like the analogy is so easy, even a football player can understand it. And so it's got to be visual. So think about it as your first step is a simple visual. Don't try to put everything out there at once. And that's really that, you know, find the simplest visual you can give them to start the conversation. You don't have to sell it all in one shot.
I actually do that now as much as I can. Like I did the whole design for what an NLP application would look like processing our claims notes. And it was almost like flow charting diagrams and things in R Markdown. And people would easily follow it. Like here's an example of a bigram, you know, and stuff like that. You know, one of them was why do we use MongoDB for this rather than a regular one and things like that, where you can show very simple concepts that solve the problems and explain what problem you're solving. And I think that's the two things with that visual. Visual should explain what it is, but it also should explain why it solves your problem.
I think you can in every organization. I don't know if an organization is too broken. I mean, I've been able to practice in any organization. Some of them, I've only lasted one year though. So, you know, you got to take that. You can take that with a grain of salt. So, you know, I, there have been a lot of companies. There have been a handful of companies. I don't say a lot that I've only spent one year with them for whatever reason. Either they've made it clear that I don't fit. I don't have a career or there's no way I'm ever going to convince them that, you know, data science or something is the way to go. Even though, even though I'm making results for them, they're not buying, they're not recognizing the strength of it. And so sometimes, you know, your answer is to find somewhere else to go. And I think is, is data science,
