Data Science Hangout | Yu Cao, Exeter Finance | Impacting business with data science

Feb 8, 2023
1:02:41

Transcript

This transcript was generated automatically and may contain errors.

Hi everybody, happy Thursday. Welcome to the Data Science Hangout. Hope everybody is having a great week. If it is your first time joining us here today, hello. Very nice to meet you. I'm Rachel. This is an open space for the whole data science community to chat about data science leadership, questions you're facing, getting to hear about what's going on in the world of data science at different companies across different industries. And so every week we feature a different data science leader as my co-host here to help lead our discussion and answer questions from you all. So together, we're all dedicated to making this a welcoming environment for everybody. And I love when we can hear from everyone no matter your level of experience or area of work.

It is totally okay to just listen in if you want or participate in the chat. There are always three ways you can ask questions or provide your own perspective on certain topics too. So one, you can always jump in by raising your hand on Zoom. Two, you can put questions into the Zoom chat. You can always put a little star next to it if you want me to read it instead. And then third, we also have a Slido link where you can ask questions anonymously. And I'm sure Hannah is sharing that in the chat here right now. Just to note, we do share the recordings of each session. So you can always go find them up on the Posit YouTube or also the Data Science Hangout site. And one last thing, we do have a LinkedIn group for the Hangout too.

But thank you so much, Yu, for joining us today. My co-host joining us is Yu Cao, VP of Data Science at Exeter Finance.

Hello, everybody.

Yay. And Yu, I'd love to kick things off with having you introduce yourself a little bit about your role and maybe because rock and roll was included in your bio, I'd like to know who your favorite band is too.

Okay, sure, sure. First, I want to thank Rachel and her team for organizing and hosting this, and also for giving me the opportunity to co-host this session. As a data scientist, I always believe that it's super important for people to stay connected to the whole community. And that's why today I want to have an opportunity to talk to all of you guys here. And then let me do a little self-introduction. I'm Yu Cao.

I'm from China, so you can tell that English is not my native language. If there is something I didn't say clearly, please feel free to let me know. Okay, so right now, I'm the VP of data science at Exeter Finance, which is a subprime auto lender. I'm also an EMBA student at Cornell University. And so today, these are the two topics I really want to share with you guys. We can talk about how we use data science in the finance industry, how we use data science to manage the portfolios, to manage the default risk. And also, I want to share with you guys why I believe that pursuing an EMBA degree is super important for career development, for the development of leadership for data science people.

But before we go into that topic, I can talk a little bit about myself. My background is biology. Actually, I have my undergraduate degree in biotechnology, and I have my master's in biochemistry in the United States. And after that, I decided, okay, I want to pivot to another field, which is operations research and data science. So I got my second master's degree in operations research in 2010. And after that, I worked in the analytics industry. I started as an analyst and transitioned to data scientist, then manager of data science, senior manager of data science, and eventually VP of data science now.

Rock and roll roots

And probably a little bit of fun about myself: when I saw Rachel's self-introduction, Rachel said, okay, I play guitar. I said, okay, this is just like me. I like guitar. I like rock and roll a lot. I play guitar as an amateur, but I like it a lot. I started in maybe 2000, when I was a high school student in China. I happened to see the famous video of Smells Like Teen Spirit by Nirvana on Chinese television. And this song resonated with my personality and my character so well. I started from there. I spent maybe four, five years listening to rock and roll crazily, I will say. Maybe I spent five, six hours every day just listening to this type of music.

You know, if you guys maybe remember, Apple used to have a 160 gigabyte iPod. It could hold more than 40,000 songs. And this was just a product for people like me. So I spent a lot of time on rock and roll music. I listened to the music. I wrote music reviews, album reviews for some online forums. I even published them in some music magazines. I did that for four, five years. And when I came to the United States, after I graduated with my master's degree, I got an entry level job as an analyst. And during that time, I would say I had two sides to my life. During the daytime, I worked as an analyst. I was very analytical, very rational, very nerdy. But after work, after 8 p.m., I got changed and put on all my punk rock gear.

And then, by the way, I'm from Dallas, Texas. I don't know whether there are other people in the audience from Dallas, Texas too. In Dallas, there is a famous district called Deep Ellum in the downtown area. There are a lot of music clubs, pubs, rock and roll shows, and all the concerts are held there. So I spent a lot of nights there. After 8 p.m., I just went there. I joined the concerts. I hung out with a lot of rock and roll bands there. I met a lot of bands I idolized, like Mudhoney, Meat Puppets, Sonic Youth, all these bands. So I just hung out with them, probably until 1 a.m., and the next day, I just went back to work. And such life continued for like five years until like 2015. There was a milestone. I got married. So after that, things changed a lot.

So instead of having my regular nightlife every day or every other day, probably I could go to a concert maybe every month, that's fine. I remember the first show I tried to bring my wife to. It was a drum show by the Huan Zhongzhai at the American Art Life Center. And after that, she told me, okay, I can't believe that you listen to that kind of noisy music. So I decided, okay, probably I need to stop going to concerts that often. Since then, I could only go to a concert like once a month. And then in 2017, another thing happened. My wife was pregnant with our first son, William. I can still remember, the last concert I went to was in May 2017. That's the last one. And William was born in January 2018.

And in 2020, William was almost two years old. I thought, okay, he's old enough. He can sleep at night by himself. Probably I can go back to my music life a little bit more. And then you know what happened, right? The pandemic. So everything closed. All these rock and roll clubs, like the famous Trees, Club Dada, Bomb Factory, everything was closed. So of course, I had to put my plan on hold. Sadly, right now, a lot of the clubs are still not really open again due to this shock.

So I have to ask you, do you ever mix music and data science now?

Absolutely, yeah. I can answer that very quickly. This is what I believe about data science, and it is also what I tell the people on my team: if you really want to do it very well, treat it like art, not just science. In order to be a very good data scientist, I believe you need to be creative. And I believe that you need to possess some artistic creativity. And for me, for myself, the creativity comes from the music.

Getting into data science

Okay, so obviously there is no end if we keep talking about my music life here. So probably we can start to talk about data science.

But I'm really curious, while we wait for questions to come in from everybody, what made you so interested in data science from the beginning?

Well, it just resonates with my nature. I'm an analytical person. I'm very focused on details, and I like to find patterns in data and use them to make predictions. And I like to work with people from academia, like from universities, and from the research teams of big tech companies. So everything fits perfectly for a data scientist. That's why I started my job as a data scientist.

What has the progression been like for you, like moving from an analyst to data scientist to leading a data science team?

The first thing is that when I worked as an analyst, most of my job was kind of ad hoc analysis. So I pulled data, I built reports, I did some other analysis and presented my results to the leadership team. However, I later found that instead of just doing something ad hoc, I wanted to do something more systematic. My second degree was in operations research, which is basically math modeling, so I still wanted to build models for all the analytical jobs. My first job was for a health care company, so at that time I said, okay, can we use all the data to predict the behavior of patients? I built my first predictive model during that time. And after that, I believed that this is what I want to do.

So that's why I wanted to pivot, and it's a small pivot, from analyst to data scientist. After that, I switched to a financial services company. Our company does subprime loans; it is a subprime auto lender. If you want to buy a car, you go to a dealership, and if you cannot pay cash, you need to apply for a car loan. By our definition, subprime means people with a credit score below 620. For that customer segment, there is a lot of risk. So you need to build a lot of predictive models to predict the risk, to price it, and to manage how we service the loan and how we originate the loan. This is why our entire subprime lending business is driven by data science.

And because of this, I feel that, okay, my knowledge and my interest play a super important role for the business. So that's why, okay, this is the industry I like. I started as a data scientist, and gradually I felt, okay, it's not just about modeling. You can just sit there building models and coding every day, but this is the most basic work. If you want to do it really well, you need to understand the business part. You need to understand what problem the business is facing and how you can formulate a business problem into a data science problem. Then building the model is, I would say, the simplest part.

With this, I actually started to spend my time collaborating with all the other teams in the company, trying to understand their work. How is the sales team working? How is the marketing team working? How are the loan origination teams working? And gradually, because I had more and more understanding of the business, I moved up in the career hierarchy, I'll call it. After I became a senior manager of data science, I felt that, okay, my data science knowledge is not enough. If you go to a higher level, it's not about coding, it's not about data science. Suddenly you need some hard skills from other areas. For example, you need to understand accounting, you need to understand finance, you need to understand marketing. And for all of these, I thought, okay, I probably need to learn something.

Of course, ideally, you can also outsource these skills. If you are a leader and you need some accounting, hire an accountant. If you need some financial knowledge, hire a financial manager or financial analyst. And if you need marketing, hire a marketing manager. These hard skills, you can outsource. But there is also another part, what I call soft skills, that you can only do by yourself. For example, how to build a productive team, or how to do business negotiations. All these skills, you cannot outsource to other people. Other people cannot do it for you. You need to do it yourself. And that's why I decided, okay, I need to go back to school. I need to pursue an EMBA program. Probably this is the best time for me.

Working with the business

Yeah, absolutely. And I know we shared in the chat, if you want to ask questions anonymously, you can use Slido. But feel free to just raise your hand here if you want to jump in the conversation too. But Yu, I love the way that you explained working with different parts of the business to actually understand the business problems. And I was wondering what that process looks like for you of building relationships with different people across the business? Is it like joining some of their meetings? Like, how do you go about that?

Actually, it all started with networking. I work with a very analytical, I would say, very nerdy team. So a lot of people say, okay, I just want to code. I just want to build a fantastic model. I do not want to spend too much time networking with people from other business lines. But actually, this is, I would say, not the right attitude. Because if you want to collaborate with other people, first, you need to get to know them. You need to understand how their business is. And probably most essential, you need to become their friend. So usually, I reach out to them and do a self-introduction. This is what I do. This is what my team can do. So let us go out for lunch. Let us grab a cup of coffee. Let us talk for 15 minutes. Let us see what kind of difficulties you are having now in your business. If you are a sales team, are you worrying about how to increase the volume of sales? If you are a loan originations team, are you worried about how to increase the capture rate? And then let us talk about that. Then I will see, okay, how can I connect my data science knowledge with the problem you are facing now? See whether I can build a model to help you solve the problem you have. Usually, the collaboration starts with this.

And when you build a model for them, how do you share that with them?

I would say that, again, that is a very important part, because a data scientist has to be a very good storyteller. It is easy to build a model today because we have AutoML, we have a variety of automatic pipelines. We can talk about that later. So we design our model development as a pipeline, like a streamlined process or something like that. Basically, even if you have no knowledge of data science, you just change some hyperparameters and click a button, of course in RStudio, and after two hours or two days, depending on the size of your data set, you can get a bunch of models. And most of the time, these models perform well.

But this is only the beginning. The difficult part is that you need to sell these models, like your own product. You need to let the business owners, the stakeholders, buy this model. You can build a very good model measuring dealer risk. However, you need to be able to let your dealer sales managers understand this model. And they should be able to explain it to the dealers. I cannot reach out to every dealer and say, OK, this is a LightGBM model that measures risk. They won't buy it. They'll say, no, we don't want to hear about it at all.

So the model needs to be transparent, needs to be very explainable. And to do that, usually, my approach starts from the easiest one. If we build a new model, for example, either to help the sales team or to measure the risk of dealers, usually, we start with a simple logistic regression model. Because, frankly, everybody understands a little bit about logistic regression, or pretends that they understand logistic regression. So usually, if you present a logistic regression model, there won't be a lot of pushback. They'll say, OK, I understand that. Just tell me about all the variables. And then if the directions of the variables make sense, we'll buy it. And we can explain that to the other users.
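As an illustration of the "directions of the variables" check described above, here is a minimal sketch in plain Python. It is not the speaker's actual workflow: the data is synthetic, the feature names (`credit_score_z`, `loan_to_value_z`) and the expected signs are hypothetical, and the fitting routine is a simple gradient descent written out so the example needs no external libraries.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent.

    Returns (intercept, coefficients). Illustrative only; a real project
    would normally use an established statistics library.
    """
    n, p = len(X), len(X[0])
    b0, w = 0.0, [0.0] * p
    for _ in range(epochs):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            z = b0 + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            g0 += err
            for j in range(p):
                g[j] += err * xi[j]
        b0 -= lr * g0 / n
        w = [wj - lr * gj / n for wj, gj in zip(w, g)]
    return b0, w


def check_directions(names, coefs, expected):
    """Return the variables whose fitted sign disagrees with the expected sign."""
    return [n for n, c in zip(names, coefs) if (c > 0) != (expected[n] > 0)]


# Synthetic data (an assumption for illustration): default risk falls with a
# standardized credit score and rises with a standardized loan-to-value ratio.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    score, ltv = rng.gauss(0, 1), rng.gauss(0, 1)
    p_default = 1.0 / (1.0 + math.exp(-(-0.5 - 1.2 * score + 0.8 * ltv)))
    X.append([score, ltv])
    y.append(1 if rng.random() < p_default else 0)

names = ["credit_score_z", "loan_to_value_z"]          # hypothetical names
expected = {"credit_score_z": -1, "loan_to_value_z": +1}  # business intuition

intercept, coefs = fit_logistic(X, y)
mismatches = check_directions(names, coefs, expected)

for name, coef in zip(names, coefs):
    print(f"{name}: {coef:+.2f}")
print("variables with unexpected direction:", mismatches)
```

An empty mismatch list is the kind of result a business stakeholder can sanity-check without knowing the fitting details: each variable pushes the predicted risk in the direction they would expect.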

And after that, when you have built some relationship or trust with your customers, the users of these models, then later, you can gradually push things forward. I'll say, OK, for the next step, we know some machine learning algorithms. The models we build using these algorithms perform better. They are more predictive. And they are explainable. I can show you how to explain them. And they are 100% compliant, because we are in a highly regulated industry. Everything we build is audited by people from the government every year. They will check all the models we build, whether anything is discriminatory, whether all the models we use are compliant, these kinds of things.

Transitioning to VP and growing the team

And we are sure that, OK, the model is fully compliant, is fully transparent, and is accepted by a lot of players in the industry. Because we are not the biggest company in the industry, we are probably the third or fourth largest subprime lender. So our leadership is a little bit conservative. That is the culture. So when we say, OK, we have a model, this is the algorithm we just figured out, they will ask, are we the first one using this in the industry? If the answer is yes, they'll probably say, oh, probably we should wait. And if we say, OK, no, probably Capital One is using it right now, or maybe Santander, maybe American's price is using this right now, they will say, OK, let's go ahead.

I see there's a lot of love for logistic regression in the chat there. But Alan, I see you just asked a question. Do you want to jump in?

Sure. Yes. I'm really curious about how your organization has treated your shifts in roles from individual contributor to a VP of data science. And I'm curious what that's meant for you in terms of figuring out, what is my mission at each stage? And whether you had to create that for yourself as you became VP, or if that was already a well-established role that you could slot into and know what success looks like there. How did you make those transitions? You talked a little bit about leadership and about learning the business stuff. But I'm wondering, did the organization help you? Or were you on your own to go, oh, I'm a VP now. What does that mean?

OK. Thank you very much, Alan. That's a wonderful question. To explain that, first of all, I will say that my leadership, my company, they are super supportive of my growth, of my transition. And they provide a lot of help, from the perspective of leadership, to help me get exposure to the leadership, and help me learn a lot of things, how to lead a team, how to work with other teams. And they even provide some financial help for my EMBA study.

And on the other side, again, it's not a very big company. I'm the first person to hold the title VP of data science in the company, so basically I need to build a lot of procedures and processes all by myself. Now I have a team of data scientists, and I need to define my own function. Okay, so you are VP of data science now. It doesn't make sense if you're still sitting there building models every day. You need to do something else, because you should delegate these jobs to your reports.

And then the goal I made for myself is that as a leader, as a VP of data science, the most important thing for me is to grow my own team. I have some data scientists reporting to me. My goal is that I want to let them know that, okay, one day you will be in my shoes. I won't be there forever, and one day you'll move up, up, up, and finally you'll be here. And in order to be prepared, I want you to start to, I would say, advise yourself. You need to make clear goals for what you want to achieve in the next three years, five years. Not just from a technical standpoint, not just, okay, I want to master how to build boosting models, or how to propose some innovations to our current model development process. I want them to make goals on their behaviors, like how to develop their own leadership skills, how to create opportunities for themselves to be a leader. And I will be 100% supportive of their goals.

Right now, this is my goal. So for me, when my boss measures my own performance, I will say that 50% should measure the output of myself and my team, and for the other 50% I want him to focus on how I advise on the growth of my own team, how everybody in my team is growing, how they are making progress in their own career path. I believe this is super important for me as a leader right now.

That's great. Thank you. I really, really like hearing both that you have support coming into the role, like good enablement, and also a bunch of room to kind of figure out what it looks like, and being supported to do all of that stuff on behalf of your team, in terms of their development. You know, that doesn't happen everywhere, and it's not easy, and so, yeah, I really appreciate hearing your thoughts on that. Thanks.

Yeah, again, the company is still small enough, so I have a lot of room. I can make a lot of decisions all by myself, and if you can call the shots, it means that you are free to build what you want to build.

Building a data science team from scratch

I was talking with someone just this week who is starting to, well, wants to build up their data science team within their organization, and wants to, like, move into that VP of data science role that doesn't exist yet today, and I'm wondering if you have any recommendations for, like, where to start with that, because I think there's so many different things for people to think about, like, when you start off at this blank slate where you could have any tools or any infrastructure, it can be kind of overwhelming to know where to start.

If you mean you want to build a data science team from scratch, I'll say that first, you figure out what your team can do for the business. If data science can do little for the business and can create very little value there, it doesn't make sense for you to build a team there. You probably need to go to a new company. And if there is a need, if you figure out, okay, data science can create this value, this value, this value for the business, then first you'll need to, I will say, do a public relations campaign. You need to let everybody, all the leaders in the company, understand this.

So, for example, in my EMBA education, they usually ask you to do a 360 feedback. They ask you to reach out to the leaders and your co-workers in the company, and you ask them to provide feedback on yourself. And then you can also say, okay, my goal is to build a data science team. My goal is to become a data science leader in this company in the next year, so what do I need to do? And this is most important: everything starts from the people, from buy-in from other leaders. If you can let other leaders reach agreement that, okay, what this guy says makes sense, and if we invest in data science, we can achieve this and this for the business, and it's highly valuable for the business itself, then the company will eventually invest in this.

And then another thing: if the person says, okay, I'm a data scientist, I want to grow into a leadership position, how can I convince the company that I'm the right person? Besides good results, of course, you also need to show them that you can always do a good job whenever they need it, and the other thing is that you need to show your ambition to the leadership. This is my takeaway, especially right now, when you know that the job market for data science is kind of hard, right? So, okay, this is the right time for you to show the company, I want to invest in myself, and I don't just want to be a data scientist doing coding and building models. I want to enlarge my influence, I want to learn more things about the whole business, I want to move up, I want to be a leader of a large organization. And usually, when you do that, the company will recognize your ambition, will recognize your value, and it will generate good results and good feedback from the leaders.

And actually, this is what I did. I remember that when I reached out to the leadership of the company, I said, okay, I want to pursue my EMBA degree, and I need your help. It will be a big commitment. I need financial support, and it will also take a lot of my time every day. And the leaders, after some discussion, said, okay, of course we know that in the next two years you'll be busy, your time will be slim, but we feel that it shows that you really want to understand more about the business, and you really want to do more to contribute to the entire organization. I think that's why they decided, okay, it's a good time to invest in you as an employee.

Staying current with machine learning

Thank you very much, it's really helpful. I see, Catherine, you had asked a question a little bit earlier in the chat, do you want to jump in next?

Yeah, I was curious how you stay on top of best practices in machine learning, or new techniques that you can use, data science techniques, as you transition up that ladder, and you start focusing more on big picture things. I know within my own career, I've been finding it more difficult to stay on top of like the new model that's, you know, going to achieve the same thing that I know about, but my knowledge is even six months out of date. So, how do you navigate that so that you stay informed, and can properly manage your team, or understand the models that they're giving back to you?

Okay, thank you, Catherine, that's a perfect question. This is related to another part of my job. I kind of act as the ambassador of the data science team of Exeter Finance. You know, I spent a lot of time in universities. I was first in a PhD program, but I left with a master's, and then switched to the analytics program. And because of that, I love to work with academic people. So I have built a lot of connections with people in universities, PhD students, postdoc researchers, professors, and research scientists from companies like Microsoft, Meta, I mean, Facebook. Usually, every other week or every month, I will go online and review the latest publications in machine learning and data science. Of course, I will search using some keywords to make sure that they're related to my topic.

And usually, after that, if I find something interesting, I will reach out to the author of the publication, and then I will have a discussion. I say, okay, I read something really interesting. Could you share some code? Maybe we can have a meeting and talk about it. If we want to try to implement the algorithm you proposed in our industry, let's see what we can do here to generate good results. For example, I remember in 20, let's see, yeah, 2020, I don't know whether you guys know, there is a Python package, an auto machine learning package, developed by a Microsoft team called FLAML, F-L-A-M-L. FLAML, what does it mean? I forgot what it means. Fast and lightweight AutoML. At that time, the product had just been released, maybe two or three months before, so it was completely new. It was kind of version 0.1.

And I happened to find this product, and I found it fit our needs perfectly, and I had a lot of conversations with the FLAML development team, back and forth, back and forth. They are super nice, they are super responsive, they work super hard, and they are all research scientists at Microsoft. And I told them, okay, I know that's a good product, but we are in the financial services industry, and if we want to use it, we need this feature, this feature, and this feature. And then they said, okay, we can take your idea, and I kept giving them feedback, okay, after this feature, this is how it helps our model development process, our workflow. And then gradually, their product got better and better and better. Right now, I believe, it's more than two years from the first version, the product is very popular, and it has become, I will say, a very critical component of our model development pipeline.

So I will say, that's the first part of my answer. If you want to stay on top, stay informed of all the latest research, all the latest progress in the machine learning industry, just try to read some papers, and try to keep in touch with researchers in the universities, because if there is some new discovery, new progress, it is driven by them.

And secondly, for my team, first, I also request that every month or every quarter, you need to spend some time to read these papers, this research progress. And I also give them some freedom, so probably you can use 10%, 15% of your time to do something you believe will be useful sometime in the future, but not necessarily useful right now. As long as you can justify it, as long as you can tell me why you want to check this algorithm, why you want to test this algorithm, it's okay. Probably we cannot use it right now. If you want to show me you can build a deep neural network model, something like that, and it will perform very well, that's great. Right now, our industry cannot use that model because it's not compliant, because of the explainability, but probably sometime in the future, you will be able to use it. We will have that preparation for the future. And again, in this whole process, I believe I help all of my team members develop their own creativity and thinking, because when they do the research, they all think about the business: what kind of model, what kind of algorithm should I try, to match with the business, to match with what I'm doing right now.

I love how much you point towards networking, when you are both talking about, like, working with the business, but also in keeping up to date, and I see Libby, I think you just called that out in the chat as well, and I think it's a skill that we don't often highlight through, like, I don't know, interviews or on a resume, like, your ability to network and get ahead through that.

Yeah, and I wanted to add that I like the idea of, you know, I like to keep continuing education, you know, as one of my personal career goals and as the goals for my team, but the way that you were describing it, making that kind of specific, like, I want to research this particular thing or learn about this type of model, can really help, you know, you understand where your team is spending their time and can give you some insight into those things to stay on top of, so I liked that as an idea. Thank you for sharing.

See, Santiago, you also had some thoughts there to share if you wanted to jump in, too.

Yeah, so I subscribe to a ton of newsletters, and it clutters your inbox. I have a whole separate email account just to stay on top of these things, but there are a ton of good resources that are free. One will summarize arXiv, or whatever that is, papers, like the top 10 papers of the last month or whatever. Most are deep learning related, some are stats. There's a couple of deep learning newsletters that are really good, and then I'm a member of the ASA, and they send out daily webinars, and there's all kinds of good stuff: geospatial, temporal, time series. There's one today on something later. And then through one of those, a quick shout out for something that I thought was really cool. Somebody named Cynthia Rudin at Duke and her team developed a series of models. One, I think it's called Ghost, I don't know if I'm saying that right, but it's really cool: interpretable, has high performance.

Yeah, sure, and her papers are interpretable too, for anyone that's interested in that.

The EMBA program and being a data science ambassador

But when we were talking about networking and different companies that you reached out to, the past few months we've featured a lot of different pharma customers that have been collaborating together on a few open source projects, and I was curious if there are things like that happening in the finance space?

Do you mean, like, in the finance space, network? Do I reach out to data leaders, data science leaders from other companies?

Oh, right now I would say no, because the financial service industry is a very small world, so people kind of know each other, and they like to hire people they have worked with before. So you don't need to network with other people, you just know everybody working in this small circle. But again, networking is, I would say, a super important concept, and that is also what I learned from the EMBA program. To network is not just to find some people who can help you. Another equally important part is that you can find opportunities for you to help others, so you will grow as someone who is being sponsored by other leaders, as well as someone who sponsors the growth of other people. For me, I believe this idea makes a lot of sense.

I think for a lot of people, networking can sound scary sometimes, and I'm wondering if you have any tips for people on how to first reach out to somebody.

I don't know, probably, I think it's part of people's nature. If you are a super extrovert, it's easy for you to talk to anyone, and if you are introverted, it doesn't matter. For me, with networking, first, you just need to be genuine, that's for sure. And again, don't go to a networking event with the idea, okay, I need to figure out a new modeling algorithm that I do not know right now. Just be open: okay, I just want to try to make some new friends. I think this is the starting point of everything, and then later, gradually, if you become friends with other people, then during the conversation ideas and opportunities will grow, will rise up naturally.

Thank you. So I think this was maybe 15 minutes or so ago, you mentioned being an ambassador to the data science program or data science community, and I was curious what that meant, and do you have, like, a data science community within Exeter?

Oh, no, but I mean, because data science is, you know, kind of a catchy word, a lot of people are very interested in it. People say, okay, we don't know a lot about data science, but we know it's used by almost every business. So when I work in any group, in any community, I will always try to explain to them what data science is, what I do. For example, again, going back to the EMBA program: I'm in the EMBA Americas program, which has maybe 170 students in the cohort, but I'm the only data science professional in that cohort. So people are very curious: what do you do, what do you want to pursue in the EMBA program? Your job looks very technical, so why do you want to join the EMBA program?

But that's why I believe I'm the ambassador here. Because, again, for the EMBA program, the background of your classmates or colleagues can be very diverse. For example, in my program, in the Dallas boardroom, people from Dallas, we have the owner of a fabrication business, we have a seasoned officer from the special operations of the American army, we have a head of sales of another financial organization, we have people from the pharma industry. That's why it's super interesting and super cool if you explain: okay, this is data science, this is what I do, this is what I believe data science can do for you, for your job, for your business. There's always a very good conversation there, and when you share your experience, your knowledge, with people who are holding a similar title to you, but from a completely different field, this is how you open your mind and how you grow from there.

And again, I did want to spend about five minutes talking about the EMBA program, because I believe it fits a lot of people here. If you want to pursue a high-quality EMBA program, but you do not want to quit your job and you do not want to move to another city, to where the school is, the design of the Cornell EMBA program is what you need. Because it's called Americas, it has distributed classrooms in maybe 10, about 20 big cities in North America, in both the United States and Canada. It's a joint program, provided by both Cornell University in the United States and Queen's University in Canada. So every other weekend, you go to the classroom in your city, and you meet your classmates in person, so you have this kind of face-to-face experience.

And then the class itself will be taught by a professor in a studio at either Cornell University or Queen's University, and it is team based. Because all the students there are leaders, executives, if you just need to take a class and do some individual homework, it doesn't matter, right? For me, I don't care about grades. I got an A, I got a C, it doesn't matter, at least I learned what I learned. But if it's team based, you need to work on a team project, so, you know, okay, I represent team Dallas here, I need to do a good job, I cannot just spend a little time. That is the design: it makes sure that all the students are highly engaged, and makes sure that it's competitive.

Moving to the cloud and current challenges

Maybe shifting gears just a little bit here, I've recently been talking to a few people who are maybe moving to the cloud, and I was curious if that's something that your team is doing?

This is our new direction. We have our on-prem server, a Linux server. It's a big server, shared by all the data scientists in my team, and we build models there, and we implemented them in our own IT environment. But starting from last year, we began to shift to Azure Machine Learning Studio and Azure Synapse, so the shift is happening now. We are still exploring all the functions, all the options we have there, but we believe this is the future.

I realized, Alan, we kind of went on the same wavelength there, at the exact same time. Do you want to jump in with your question too?

It was really similar, and I think you spoke to it a little bit already. Maybe the only thing to tack on would be, as you think about moving to the cloud, do you have a bunch of negotiation that you have to do within the organization for approvals, for security reviews, that kind of stuff? I'm just curious in general about the level of autonomy that you and your team have versus needing to play in the space that the rest of the company is ready to set up for you.

There are, but frankly, all the negotiations happen at a higher level, so I didn't make the decisions; that happens at the top level. Of course, we had a chance to review the products and to give them recommendations: okay, we have this concern on that function, so probably we need to figure that out. That's our input to that shift.

And they know that you're there as stakeholders? So, like, are they asking you, hey, we've got an enterprise plan for cloud, whatever, coming along, help us make sure it's good? Or do you have to be really assertive and remind them that you exist as a function and need to be able to give input?

Oh, I think we go the assertive way, so we make some clear requirements. Because we are in the financial industry, we have some sensitive customer PII, personal identification data. So we said, okay, data security is our top concern, and we gave them a specific list of needs: okay, we need to make sure of this, and this, and this. And then they figured out that, okay, we can help you meet all the requirements.

One anonymous question that's over on Slido is, what is the biggest challenge or project that you and your team are trying to tackle right now?

Traditionally, we are building a lot of machine learning models. Some are more difficult than others, but I won't say any specific ones are especially challenging. One thing we are doing right now is trying to build some optimization models. You know, my major is operations research, so it's not really machine learning; it's, for example, solving models like the traveling salesman problem, and I'm sure some people in the audience know that. This is a new field for my team, and I won't say it's challenging, but it's very interesting. So, for example, we have a sales team, and they need to visit different dealers, so it's a kind of classic traveling salesman problem. We have these dealers, we know their locations, so basically we have the distance between every two points, right? And then we know the starting point, probably the home base of the sales manager, and every day the manager can only travel a certain number of miles and can only visit a certain number of dealers. And then for each dealer, we have an expectation, a potential benefit from our visit. Then we need to build an optimization model to finally give a list and tell the dealer sales manager: okay, on Monday, this is the list of dealers we think you should visit. And we can also give them a map: okay, you can start from dealer A, and after A, you jump to dealer C, and so on.

This is the project we are working on right now. The challenging part is the implementation, because it can be very complicated: dealer sales managers may have different home bases, and maybe the territory he or she manages is very big, and you need to add some other concerns, like how much time he can travel in the morning and how much time he can travel in the afternoon. If you add all the constraints, the problem can get very complicated. So right now, this is the problem we are trying to tackle, and it's a challenging one.
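To make the routing problem described here concrete, the following is a minimal sketch only, not the model the team built. It assumes straight-line distances between made-up coordinates, a single home base, and a simple greedy rule (pick the reachable dealer with the best expected benefit per extra mile). The function name `plan_route` and all data are hypothetical, and a greedy rule like this does not guarantee an optimal route.

```python
import math

def plan_route(home, dealers, max_miles, max_visits):
    """Greedy daily route for one sales manager.

    home:    (x, y) coordinates of the home base
    dealers: dict of name -> ((x, y), expected_benefit)
    Returns the ordered list of dealers and the total miles driven,
    including the trip back to the home base.
    """
    route, miles, here = [], 0.0, home
    remaining = dict(dealers)
    while remaining and len(route) < max_visits:
        best = None
        for name, (loc, benefit) in remaining.items():
            leg = math.dist(here, loc)
            # Skip dealers we could not visit and still drive home within budget.
            if miles + leg + math.dist(loc, home) > max_miles:
                continue
            score = benefit / (leg + 1e-9)  # expected benefit per extra mile
            if best is None or score > best[0]:
                best = (score, name, loc, leg)
        if best is None:
            break  # no remaining dealer is reachable within the mileage limit
        _, name, loc, leg = best
        route.append(name)
        miles += leg
        here = loc
        del remaining[name]
    return route, miles + math.dist(here, home)

# Hypothetical example data: three dealers around a home base at the origin.
dealers = {"A": ((3, 4), 10), "B": ((6, 8), 4), "C": ((0, 50), 100)}
route, total_miles = plan_route((0, 0), dealers, max_miles=30, max_visits=2)
print(route, total_miles)
```

Extra constraints of the kind mentioned, such as separate morning and afternoon travel time or different home bases per manager, are the reason a real version would usually move to a proper optimization solver rather than a hand-written heuristic.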

Another question over on Slido is, is your group considered a revenue generator? Have you or your team developed something which made higher-ups take notice?

Yeah, I would say absolutely, because actually, I asked the same question when I took the accounting class in the MBA program. I said, okay, so our group, shall we be considered a revenue generator? And the professor said yes, because you generate a lot of, I mean, intelligent assets. And especially for financial services, for the lending industry, the business is driven by data science. You have a lot of money here, and a lot of people will reach out to you to say, could you lend me some money, I want to finance my vehicle. If you cannot use your data science to measure the risk well, to make the approval or decline decision wisely, then you will lose all your money very quickly. So, because of this, I believe that data science is definitely a core part of the revenue generation of our business.

Subprime lending and risk models

Okay, I'm not sure how to ask this question, but I just recently watched The Big Short, and so I'm curious if there are certain things that you have to do as a data science team to make sure that certain areas aren't lending to too many people who might not be able to pay back, like looking at individual areas as compared to others?

Could you rephrase the question? So, what do you mean by lending to too many people?

Well, I guess, yeah, Libby is helping me out there: managing the risk with subprime loans. Like, do you have to look at individual locations versus others, or individual dealers?

Got it, got it. Yeah, of course, yes. So, I can explain a little bit; the subprime industry is very interesting. Basically, the principle of the business is that we have a lot of money and we are trying to lend it out. We start from the safest customer segment, and we keep lending, lending, lending, and then the risk of the portfolio will increase, increase, increase, until it reaches the threshold, the level where, okay, the risk is too high, and we stop.
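As a rough illustration of the "lend from the safest segment until the risk is too high" idea, here is a minimal sketch under stated assumptions. It assumes each application already has a predicted default probability from some model, and it measures portfolio risk as the amount-weighted average of those probabilities. Both the risk measure and the function name `fund_until_threshold` are hypothetical, not the lender's actual method, and the numbers are made up.

```python
def fund_until_threshold(applications, max_portfolio_risk):
    """applications: list of (loan_id, amount, predicted_default_prob).

    Funds loans from safest to riskiest and stops at the first loan that
    would push the amount-weighted default risk above the threshold.
    """
    funded, total_amount, expected_loss = [], 0.0, 0.0
    for loan_id, amount, default_prob in sorted(applications, key=lambda a: a[2]):
        new_total = total_amount + amount
        new_loss = expected_loss + amount * default_prob
        if new_loss / new_total > max_portfolio_risk:
            break  # the risk is too high, so we stop lending here
        funded.append(loan_id)
        total_amount, expected_loss = new_total, new_loss
    portfolio_risk = expected_loss / total_amount if total_amount else 0.0
    return funded, portfolio_risk

# Hypothetical applications: (id, amount in dollars, predicted default probability).
applications = [
    ("a", 10_000, 0.02),
    ("b", 10_000, 0.30),
    ("c", 10_000, 0.06),
    ("d", 10_000, 0.10),
]
funded, risk = fund_until_threshold(applications, max_portfolio_risk=0.08)
print(funded, risk)
```

In this toy example the three safest loans are funded, and the riskiest one is left out because adding it would lift the weighted risk above the chosen threshold.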

Yeah, and then, again, in this whole process, the risk is measured by a lot of machine learning models we build. For example, we receive an application from a customer that says, okay, I need to borrow $10,000 to buy a car. The first thing we need to decide is, do we approve or decline? We use bureau data, we use some alternative data, like payday loans, these kinds of things, to build a model and make a decision, maybe in 10, 15 seconds, and then say, okay, we approve or decline this application. If I approve it, then there are some other decisions to make. If I approve it, what's the price? What's the APR we charge? What kind of deal can we approve? The customer says, okay, I want to borrow $40,000 for a BMW, but we build some models, and it's, sorry, we can only give you maybe $15,000 for a Toyota. This is determined by the model.

And after that, there is also another similar series of models. If we booked a loan, if we have the loan in the portfolio, we'll build a model to predict, okay, what's the default risk of this loan during its entire life cycle? We call it a post-funding model, something like that. And then we have models managing how we shall service this loan. If the customer is a very risky customer, shall we call them every month to say that, okay, you need to make the payment? Or is it a regular customer, and we can just let him or her manage the payment by himself or herself? So, anyway, there are a bunch of what we call scorecards, a bunch of credit risk models behind