Data Science Hangout | Regis James, Regeneron | Achieving scalability & showing value of community
Transcript
This transcript was generated automatically and may contain errors.
So happy Thursday, everybody, welcome to the Data Science Hangout. Hope everybody's having a great week. I'm Rachel. If we haven't met yet, or if it's your first time joining, say hello in the chat so we can all welcome you in together.
This is an open space to chat about data science leadership, questions you're facing, and getting to hear about what's going on in the world of data across different industries. And so every week, we feature a different data science leader as my co-host to help us lead the discussion and answer questions from you all. Together we're all dedicated to making this a welcoming environment for everybody. And I love to hear from everyone, no matter your level of experience or area of work. It is totally okay just to listen in if you want.
There's always three ways that you can ask questions, though, and also provide your own perspective on certain topics. So you can always jump in by raising your hand on Zoom. You can put questions in the Zoom chat. And feel free to just put like a little star next to it if you want me to read it if you're in a coffee shop or something. And then we also have a Slido link where you can ask questions anonymously. And Hannah will share that in the chat here in just a second again.
We do share the recordings of each session. So they'll be up on the Posit YouTube, as well as the Data Science Hangout site. But with all that, thank you so much, Regis, for joining us as our co-host today. Regis James is a senior manager of biopharmaceutical data science at Regeneron Pharmaceuticals. And Regis, I'd love to have you introduce yourself and maybe start off by sharing a little bit about your role and something you like to do outside of work, too.
Regis's background and journey to Regeneron
Sure. Thanks for having me. So I've been on the side of the company that I'm on right now for almost nine months. But before that, and I'll kind of clarify what I mean by that, I came to Regeneron right out of my Ph.D. in 2016, and I was on the R&D side and I did a lot of collaborations with biologists who were generating tons of data, but it was kind of challenging to make decisions based on that data.
So what I did a lot was distill things down to their fundamentals to simplify the process of going from stuck to unstuck. And in June of last year, I moved over to be a bit closer to the patients, which has always been my passion ever since before I went to undergrad for bioengineering. But I wanted to be closer to the clinical trial side. So that's where I am now. And I'm working to help optimize clinical trials using AI and ML.
But there are a lot of different parts of the side of the company that I've moved to, which is called global development, that are not just clinical trials. There are a whole bunch of other things. And we are in the process of figuring out how to establish some of these collaborations with them as well.
Since the time that I was on the R&D side, I have also been maintaining a community of data scientists from around the entire company. And since Regeneron is global, it's actually a global organization. The reason this organization formed was that when I came right out of my PhD, the ecosystem that I used was R, in RStudio's (now Posit's) environment of the RStudio IDE and Shiny Server Pro, which has since evolved into RStudio Connect. So I needed to get that set up so that I could do the same level of collaborations with non-computational people that I did during my grad school work, but at Regeneron.
So when I set that up, other people heard about it and they were like, oh my gosh, this is amazing. Can we please do this? So then I started letting people on. But I don't want to say with great power comes great responsibility, but with great open source access comes great responsibility to support an ongoing community of people. So that ended up happening.
And what people were doing was building Shiny apps and a whole bunch of different other types of analytics that they were sharing with non-computational colleagues. And so I started hosting these showcases that would show what people had been building. And then they would kind of show it to a range of people around the company, which was kind of, it was feeding, it was a positive feedback loop because people started understanding, as my current boss calls it, the art of possible. So then they were like, wait, that's, does that mean we can do this? And can we do that? Can we do that? So then it kind of started growing.
So then I was only working in the R side and supporting that community. But then similarly, I can't remember which one happened first, but it was very close in time to when RStudio was like, there's other people in the world who use data science. I mean, the IDE, the workbench always supported a broad range of languages, but to make it more formally cross language, that happened around the same time that I moved the group from just R people to R and other programming languages that exist to enable data science. So that's the group that I convene every other Friday.
And yeah, we're having one tomorrow. It's internal, of course, but we talk about the problems that data scientists face and how we address them. And often, especially this year, it's been triggering spinoff meetings of other people who have common problems that exist across the company. So it's been going pretty well.
Writing a data science novel
In my free time, I mean, I am a lover of sci-fi things. So I have for a couple of years been working on a data science novel series that takes all of the geeky, nerdy, awesome things like cryptocurrency and CRISPR and a whole bunch of different things and integrates them into what I think is going to be an interesting story. And I'm planning on releasing that at some point in the future. So that's what I do in my free time.
And I've been able to attract my wife to also contribute. And she's also a scientist. So sometimes we have these brainstorming sessions about different plot points that could happen and making them realistic, but also having a bunch of action in there too. So that's the thing I do in my free time with my favorite person.
Facilitating community spinoffs
But you just said that a lot of people start having these spinoff meetings from the Friday sessions, which is amazing. And I was just wondering, how do you help facilitate that within the community?
So the spinoffs haven't always been there; they have kind of started a little bit more recently. Sometimes there has been a thing here or there. To be honest, everything's a work in progress. The community is about helping each other figure out how to figure stuff out. So in the most recent spinoff, I wasn't exactly sure where my role was. They were like, thank you, Regis. We are together. We're meeting each other because of the organization that you created. And I was like, yes, cool. But I'm still trying to figure out exactly what it is. And I mean, I don't think I should be in charge of everything. It's good for things to happen organically, but be triggered by existing problems that people have.
Of course, I can't go into detail about the thing that happened recently, but I can speak in terms that are general, yet specific enough for this data science community. It was about natural language processing in different contexts to identify things that we need to act on. And there are three different parts of the company that have the same problem. And we would not have come together without that.
So people were asking questions. What I typically do, though, is prep the speaker. So it's every other Friday. And before last year, I was setting a meeting. Actually, I was doing it even more lightly. I was just messaging them on Microsoft Teams to say, hey, are you ready to have XYZ so that we can kind of share it with the community?
But as things have gotten more complicated and people's schedules have gotten crazier, I now have my schedule set out for the whole year, basically until the end of 2023, every other Friday, except for holidays. I even have the page open right here on my other screen. I have a column of red, green and yellow icons indicating whether or not the scheduled date is locked down. Whenever I have it locked down, when the person agrees, then I schedule the prep meeting about two weeks before. So not the Monday of the gathering, but the Monday before, I meet with them to make sure we have everything together.
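The scheduling routine just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the start date, and the holiday handling (a holiday Friday is simply skipped) are hypothetical, and "the Monday before" is read here as the Monday of the week prior to the gathering.

```python
from datetime import date, timedelta


def biweekly_fridays(first_friday, last_day, holidays=()):
    """List every other Friday from first_friday through last_day, skipping holidays."""
    if first_friday.weekday() != 4:  # Monday is 0, so Friday is 4
        raise ValueError("first_friday must fall on a Friday")
    skipped = set(holidays)
    gatherings = []
    day = first_friday
    while day <= last_day:
        if day not in skipped:
            gatherings.append(day)
        day += timedelta(weeks=2)
    return gatherings


def prep_monday(gathering):
    """Monday of the week before the gathering (assumed reading of 'the Monday before')."""
    return gathering - timedelta(days=11)


# Hypothetical example: gatherings from early January through early February 2023.
for gathering in biweekly_fridays(date(2023, 1, 6), date(2023, 2, 3)):
    print(gathering.isoformat(), "prep on", prep_monday(gathering).isoformat())
```

The red, green and yellow status column would then just be one more field stored next to each date.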
That way it's really easy for the people who are in attendance to know what the takeaway point is, so they can apply it to their own work as quickly as possible. So I do a lot of honing there and I think it's been paying off because in the meetings, people have been really engaged and have been asking questions during the meetings because I've been able to whittle it down. The speaker that I have tomorrow, I said, I apologize. I don't mean to ask you to ruin your story, but it may be helpful if you'd say the end part at the beginning. And he's like, no, no, no, please, please. I do want to ruin it.
But having them basically ruin their stories and say the end at the beginning is, I think, what helps people get engaged. So there were different departments that got engaged because of the way I was able to flip the narrative and say the ending from the beginning. And so because of that part, I think people were able to just jump in and identify things that were needed.
Building the community ingestion pipeline
So one of the things I do, and I hope this doesn't put off anyone in the group who may be in attendance here: everyone is special, in case anyone watches this video, you're all special. But in order to achieve scalability, I do have to standardize things a bit. We use Confluence, so on the wiki that I've made, I have a lockdown page where I have some standardized messages that I send out to people at different phases of things. And I do customize, and I read what people say about themselves and I'm like, okay, this could be useful here. That could be useful there.
So one of the things that I do is maintain this whole ingestion pipeline, to identify new members of the community and then, of those members, who can help the community learn more about what the blockers are to the current responsibilities they have in their job. I mean, all companies have a directory where you can learn about who exists at the company. You can see people's job titles and stuff. So I can look into the directory and search for terms like data or data science or whatever. And then for the people that come out in the results, I look to see if they're already in my Microsoft Teams group, because I'm thinking that they could be potential new people who could benefit from being in the community.
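As a rough illustration of that directory step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Person` record, the keyword list, and the sample people are hypothetical, and no real company directory or Microsoft Teams API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    email: str
    job_title: str


# Hypothetical search terms; the talk mentions searching for "data" or "data science".
KEYWORDS = ("data",)


def find_candidates(directory, current_members, keywords=KEYWORDS):
    """Return people whose job title matches a keyword and who are not yet in the group."""
    member_emails = {email.lower() for email in current_members}
    candidates = []
    for person in directory:
        title = person.job_title.lower()
        matches = any(keyword in title for keyword in keywords)
        if matches and person.email.lower() not in member_emails:
            candidates.append(person)
    return candidates


# Made-up sample data standing in for a directory export and the group roster.
directory = [
    Person("ana@example.com", "Senior Data Scientist"),
    Person("bo@example.com", "Lab Technician"),
    Person("cy@example.com", "Data Engineer"),
]
current_members = {"ana@example.com"}

new_people = find_candidates(directory, current_members)
print([person.email for person in new_people])  # prints ['cy@example.com']
```

Running a check like this on a semi-regular basis gives the short list of people who would each get the single introductory email.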
They might be able to do their job better and also help other people do their job better if they can connect to others who have similar needs or can offer similar things. So I look in the company directory, and I do it on a semi-regular basis. So I can really help to onboard people who have just joined the company or maybe moved into a data science type role. So then I reach out to these people and I use my standardized thing like, hey, I just wanted you to know this group exists. And I only send one email. I don't spam people. And I say like, hey, this community was built just for people like you. This is what we do. Here's a link to the wiki that has the links to all the recordings, because we have a growing library of all the conversations, just like the data science hangouts that you have here.
So then of that subset, because it's not a hundred percent, sometimes it's like 20% that respond, they say, yes, this is amazing. Thank you for letting me know. I just came to the company. I didn't even know if there was a community. And I'm also doing it from my own perspective of when I joined: it was hard to see who else I'd be able to speak the same language with, and not the language where, when you talk to someone, their eyes glaze over. It's not for the non-computational, it's for the computational people.
So when they respond and they say, yes, they're interested, then I add them to the group, but I have another thing in my wiki for myself, my lockdown page, that is the intro, the welcome words. So I add them to the Teams group, then I add them to the Active Directory group, the listserv, and then I also forward the invitation to the ongoing gatherings. So every other Friday, and in the invitation text, I paste in, welcome to the group. Please introduce yourself in the introduction section. And I include a link, which I use a tiny URL for, because I was using the full URL from Confluence and it was chopping things off. But now that I use a tiny URL, when they click it, it takes them directly into not just the team for the group, but the channel of introductions.
So since I switched over, I've gotten a lot higher percentage of people who've been added to the group to start introducing themselves and say what they work on. So then I could see what people are working on. And then I integrate that with my understanding of what the output is, what people have voted on that they need help with from the Shiny app. That's the poll for what people are interested in. And that's an open-ended thing too, by the way. It's not a fixed set of things; in Shiny, you can configure it so that you can actually add things in. And that's got a MySQL backend. So whatever people pick from the dropdown menu, it stores everything in the backend.
And then since I'm using a global object, I'm all over the place, I apologize. But since I'm using a global object, whenever anyone updates something, it automatically updates the global object and that propagates it down. So it updates everybody's plot, regardless of what computer you're looking at. But I use that and then I integrate it with what people say in their introduction. And then I reach out to them and say, Hey, would you be interested in being one of the guests for an upcoming thing? And then I send them the link and they can see there's no pressure because I'm a year out. I've got someone every other week. So it's like, Hey, would this be something? And then I kind of just am gently having a conversation over time. And then sometimes people drop out and then I reach out to those people and say, Hey, would you be able to switch?
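The shared-poll idea can be sketched in plain Python. This is not the actual Shiny app and it does not use MySQL; it is an in-memory illustration under assumed names (`SharedPoll`, the topic labels, and the two "screens" are all hypothetical) of one tally that every viewer sees, with open-ended options.

```python
from collections import Counter


class SharedPoll:
    """One tally shared by every viewer; each vote is pushed to all subscribers."""

    def __init__(self, options):
        # Start every known option at zero so it still shows up in the plot data.
        self.counts = Counter({option: 0 for option in options})
        self._subscribers = []

    def subscribe(self, callback):
        """Register a viewer and send it the current tally right away."""
        self._subscribers.append(callback)
        callback(dict(self.counts))

    def vote(self, option):
        """Record a vote; options not seen before are added (open-ended poll)."""
        option = option.strip()
        if not option:
            raise ValueError("option must not be empty")
        self.counts[option] += 1
        snapshot = dict(self.counts)
        for callback in self._subscribers:
            callback(snapshot)


# Hypothetical topics; two lists stand in for two people's screens.
poll = SharedPoll(["NLP", "Dashboards"])
screen_a, screen_b = [], []
poll.subscribe(screen_a.append)
poll.subscribe(screen_b.append)

poll.vote("NLP")
poll.vote("Graph databases")  # a write-in option

print(screen_a[-1])
print(screen_a[-1] == screen_b[-1])  # both screens hold the same latest tally
```

In a real deployment the counts would be persisted to a database rather than held in memory, so the tally survives a restart.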
How community building comes naturally
I don't think I'm going to have a satisfying answer. Okay, so it just comes naturally. It's not fair, but it comes naturally. One of my favorite places to be is an airport, not because of the lines, but because, okay, let me back up. It does feed into all of this stuff, because of the things that I'm trying to make possible through my group.
So the purpose of the group that I've created is to help people build their own careers and also help the company help patients with data science more effectively. But you can learn from other people about how you can build your own career. If you see how they built their careers, sometimes you may be able to plug and play different things, mix and match. And also, hearing how other people have done the data science implementation in their own respective projects, you may be able to mix and match as well. Kind of like this series.
But naturally the way I see the world, though, is like a combination of The Matrix and Tron. In The Matrix, you see, you know, the guys on the ship who were looking at the code. And they could see how everything was connected and running, and causality and all of that stuff. And in Tron, there's the way the light cycles go around. And then you kind of see the story path.
I mean, I'm thinking in what are called world lines, from my study and from the novel. It's an object's four-dimensional path in spacetime. But that's how I see everyone, though. Every single person that I just walk around in the world, it's like they have a Tron light cycle path behind them. And so I want to understand their light cycle path, their world line, because I just think it's interesting. Everyone has a different path. I just think it's interesting to know, but also it can help other people.
So then I try to think about how people make decisions, because I'm fusing that with the data-driven decision-making that I try to make it possible for other people to learn about. And in order for people to make decisions, they need the right information that can help increase their confidence beyond the threshold of stuff. So my brain is kind of always running in the background. How do I reduce the activation energy so that people can make decisions to get engaged with the organization? And also, what frustrated me in the past? Why did I get frustrated when I was at a meeting and I was like, I don't care about this. This is not relevant to my life.
So we have over 300 people in the group, but we don't always have 300 people show up. We have like 50 to 60, sometimes a little bit more. And that's part of the process that I do, because I'm always thinking about data-driven decision-making, but from a mushy side. Because actually, both of my parents are guidance counselors, but my father's more of the emotional lifestyle kind for students. And my mother's more like the hardcore organizational career stuff. So that's the example I saw my whole life. So because of that, I'm just naturally doing both things.
So every single thing that I do is geared towards emotionally and logistically decreasing the burden that people face when they're trying to decide, should I do this or not? And so things just pop into my head. I just absorb things from TV shows, books, wherever I am. I hear other people and I'm like, oh, I could plug this in. I could plug that in. My brain is like a permanent graph database where I just propagate everything at all times. My wife isn't the biggest fan of not having separation of concerns in my brain, but
Showing the value of community to leadership
So, I mean, since my memories begin, so, like, I think 6 years old or something, it's always been the case. And I'm glad that so many people are still on the call and paying attention. I'm always working on editing, but this is also our community, so it's easier to understand.
But for as long as I can remember, I've been familiar with people's eyes glazing over when I explain things: adults, people like 10 times my age when I was a little kid, all the way up until now. I don't think it's going to go away. I mean, maybe if chat becomes even better, and then we become obsolete.
I think the way to prove the contribution, the benefit of a community, is I use the word stuck and the other word unstuck, because that's really the only thing the business, any business, any organization that's not even a business cares about. There's a bunch of stuff that the organization has to do. There are obstacles in the way. They are stuck. How do they get unstuck? That's all data science is for, really. Well, you can probably do it for exploration and a bunch of other stuff, but you can argue that maybe that's also for going from stuck to unstuck.
But the non-computational people who have the power to bestow upon you promotions and more money, or whatever you value more, more influence over the company to make other decisions, they really care about what is the value to the institution. If it's a business, business value: going from stuck to unstuck.
I forgot who was asking earlier about a graph database. It doesn't necessarily need to be instantiated in a graph database, but if you can show how the existence of that community propagates through the network that exists in the institution to such an extent that something goes from stuck to unstuck, and you can mark that, you can measure it, then you can say this only happened because of me. I mean, there are other types of organizations, which also should exist too, but sometimes they're based on interest. But interest is a lot more difficult to prove the benefit of than, for lack of a better word, pain-focused groups.
Like if your community is gathered around pulling thorns out of your colleagues' arms, then you can be like, we pulled 15 thorns out, or we taught another group how to pull a hundred thorns out because we were force multipliers. That, I think, is the way to prove the impact of the community. I'm not 100% sure it's going to work because there's still going to be people with eyes getting glazed over. But you can say, what do you hate about your job, you person with power that I need to prove the use of this community to? And then if you understand that, or maybe don't even ask them, but you may know because they're probably high enough profile people at your institution that you know what their values are likely to be.
If you can think about how the efforts that you're doing propagate to the reduction of pain points for those people and enable them to go from stuck to unstuck, then collect those things and do like what I did with the showcasing. You could show like, hey, people in this organization did this, this, this, this, and this. Then I think that could be a way to prove impact.
Data-driven decision making and reducing ego
I would love to be able to give you a quick answer, but there's only two minutes, and I don't fully know, because there's, there's like five chunks of the company that I'm in. And I'm getting a stronger and stronger hunch that each chunk is basically a different company, because this side of it, I haven't been here nine months yet, but it's, it's like a completely different company from the other side.
But it actually may fit into the three KPIs that I was talking about. I think one is like, be more comfortable with data-driven decision-making. That's a challenge currently, because there's a lot of, I don't know if you could tell from the slight tinge in my voice, the feeling of like, well, I know what I mean. So that's enough. And it would be great if I knew how to improve on how to communicate to people, such that they're emotionally okay with saying, oh, maybe I should look at the data instead of trusting my gut. I mean, trusting one's gut is important, but your gut is informed by, like, common sense.
And common sense is common to a subset of whatever your world line is that led you to that Tron point. But you have to factor in the entire 3D chess area of Tron, you know, like you have to factor in everybody's, or as many perspectives as possible. And so data-driven decision-making enables the abstraction to a broader set of perspectives. And that's hard. If I could figure out how to do that, get over it. I see it's an emotional hump for other people. And I wish I could improve on that.
I think ego is a good way of saying that. Maybe that's inflammatory, but it's about being more humble and realizing that there's a whole lot more going on that you don't know that you could know. Emotion is more ambiguous, but ego is more accurate.
Just want to say thank you, everybody, for all the great questions, and thank you so much for joining us today and sharing all your experience as well. I think you gave me LinkedIn as the best place to follow up with you. Is that best? Yeah. I can put it in the chat too. Yeah, my LinkedIn is just... Got it right ahead of you.
Also, YouTube lets you get your own at, like the @ handle. So I was like, I wonder if data driven decision making is available, and it is. I mean, it's a lot of letters, but youtube.com slash at data driven decision making, I own, which is kind of cool. Awesome. Thank you so much. Bye, everybody.
