RStudio Sports Analytics Meetup: NFL Big Data Bowl 2022 Winners discuss the Math behind the Path
Transcript
This transcript was generated automatically and may contain errors.
Hi everybody, it's so nice to see you back at today's RStudio community meetup. If we haven't had a chance to meet yet, and this is your first meetup, I'm Rachel, calling in from Boston today. It's so nice to meet you. Feel free to introduce yourselves through the chat window and say hello as well. I love getting to see where people are calling in from all over the world, and to also see people sharing helpful resources with each other there in the chat too. I host these meetups every Tuesday at noon Eastern Time, and they are all recorded and shared on the RStudio YouTube channel, if you ever want to go back and check out past sessions too.
If this is your first meetup, this is a friendly meetup environment for teams to share different use cases with each other and share lessons learned together. We're all dedicated to making this an inclusive and open environment for everyone, no matter your experience, industry, or background.
For a heads up about next week, we will have Julia Silge and Isabel Zimmerman with us here next Tuesday to talk about MLOps with vetiver in Python and R.
Today, I am so excited to have you all here for our sports analytics meetup with the NFL Big Data Bowl 2022 winners, led by Robyn, Brendan, Ryker, and Elijah, all here with me today. I've seen a lot on Twitter and LinkedIn about their team, and I'm excited for us all to be able to ask them questions here today as well.
Robyn, Brendan, Ryker, and Elijah are all graduate students in statistics at Simon Fraser University in Vancouver, Canada. They each have a passion for sports analytics and enjoy applying their stats knowledge to the game. First off, I want to say congratulations. And thank you so much for being here to share your experience with us today.
Introduction and team background
All right. Hi, everyone, and welcome to our presentation. We're extremely excited to be able to share our work with you all today. So today we'll be presenting on our 2022 NFL Big Data Bowl Grand Champion winning project. We'll provide details on the math behind the path and introduce some upgrades we've made since the conclusion of the competition.
So SFU has been a sports analytics powerhouse for years. Professors such as Tim Swartz and Luke Bornn have attracted students who have a passion for statistics and sports. Together they have revolutionized the department and have helped students land jobs with professional teams such as the Seattle Kraken, LA Rams, Sacramento Kings, Vancouver Canucks, and many, many more. SFU's Big Data Bowl success started in 2019, when a team of students won the college competition, and they were again finalists in 2021. Their work really inspired us to continue the SFU tradition, and their mentorship helped us throughout the entire process.
So before we get into the project, let's meet the team. My name is Ryker. I am a current Master of Statistics student at SFU. I formerly interned with the Vancouver Canucks of the NHL, and I'm currently an intern with the Detroit Lions of the NFL.
Hi, everyone. My name is Robyn. I'm a PhD candidate at Simon Fraser University, and I'm focusing on revolutionizing the game of curling with analytics and machine learning tools. In addition to the Big Data Bowl, I was part of the winning team of the Big Data Cup this year, where we used tracking data to evaluate passing in women's Olympic hockey.
Hi, everyone. My name is Brendan. I'm currently a Masters student at Simon Fraser University, and a data analyst intern at Zelus Analytics. I also won the Big Data Cup the year before Robyn, in the 2021 competition, where I worked with a few others on building out an expected possession value model for hockey, evaluating every single event in a play.
And hi, my name is Elijah. I'm also a Masters of Statistics student at SFU, and I'm currently an intern with the Pittsburgh Pirates.
Big Data Bowl overview
All right, for those of you who are unfamiliar, here's a quick overview of last year's Big Data Bowl competition. This was the competition's fourth year, and the subject was chosen to be special teams. Special teams includes plays like punt returns, kick returns, field goals, and extra points. We were provided data from three NFL seasons, including Next Gen Stats tracking data, PFF advanced scouting data, and other basic stats from the NFL. In total, there was $100,000 in prizes awarded to three college and five open finalists. And then we were lucky enough to be named the grand champions out of those eight finalists.
So now for a bit of inspiration behind the project. This is an entry into this year's 2022 Big Data Bowl by Zach Rogers, and it gives good motivation for our work. It puts you in the shoes of the punt returner. If you can imagine that you are the punt returner, what should you look for after receiving the punt? Our project aimed to answer a couple of main questions. Firstly, are returners making good decisions to increase yards? Are they finding optimal gaps and assessing tackle risk? And are they using their teammates and following their blocks? Secondly, by evaluating their decisions, we wanted to determine who the best returners are. Could we rank returners, which could then allow coaches to determine the best returner for a given situation?
Building the optimal path
Cool, thank you. So now that we know what the punt returner needs to consider, and what our goals are, we want to be able to show the best path to the end zone, like we see in green on the figure here. The observed path is shown in black, the returner and his team are shown in yellow and gray, while the kick team players are shown in pink.
So the first step in building our optimal path is to find gaps in the kick team's coverage. For this we use the Delaunay triangulation. This is a mesh-like structure that connects the points such that all players in the hull are connected to form triangles with no lines overlapping. This is shown with the purple lines between any two kick team players. And to account for the punt returner going around all the players and towards the sidelines, we add new gaps that connect the players on the outside of the hull to the boundaries of the zone of interest. These are shown by the vertical and horizontal purple lines here. Some of the interesting functions we used are tri.mesh and neighbours from the tripack package.
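The triangulation step can be sketched in a few lines. This is a hedged illustration rather than the team's actual code: it uses made-up kick-team coordinates and scipy's Delaunay (the team worked in R with tripack), and it treats each triangulation edge as a candidate window between two defenders, without the extra boundary gaps described above.

```python
# Sketch: find coverage "windows" as edges of a Delaunay triangulation.
# Player coordinates are invented for illustration.
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical (x, y) locations of six kick-team players, in yards
kick_team = np.array([
    [30.0, 10.0], [36.0, 25.0], [40.0, 40.0],
    [50.0, 15.0], [54.0, 30.0], [60.0, 45.0],
])

tri = Delaunay(kick_team)

# Each edge of the triangulation is a candidate window between two defenders.
edges = set()
for simplex in tri.simplices:            # each simplex is a triangle (3 vertex indices)
    for i in range(3):
        a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
        edges.add((int(a), int(b)))

print(f"{len(edges)} windows between defenders")
for a, b in sorted(edges):
    midpoint = (kick_team[a] + kick_team[b]) / 2
    print(a, b, midpoint)
```

In a full version you would also add the boundary windows connecting hull players to the sideline and the zone of interest, as the talk describes.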
Penalized expected arrival time
And now that we have these windows defined that the punt returner can move through on his way towards the end zone, we want to move towards quantifying the pressure along each of these windows. And we do this using our penalized expected arrival time algorithm. To give a bit of visual intuition behind this algorithm, if you look at that heat map in the top left of the plot here, you can see that towards the sideline, it's colored green, indicating that there's a lot of open space there, and it's going to take the members of the kicking team a long time to get to that position. Whereas as you move closer and closer to where all the players are, you see that it goes from green to yellow to red, indicating that there's more pressure and less time for the punt returner to move through that area.
And to kind of formalize this a little bit more, the goal of this algorithm is to obtain the minimum time we expect it to take a member of the kicking team in red here to reach a target location as indicated by the X here. And we're going to take a bunch of different target locations. But let's just focus on this single spot right here.
So with that target location, we're going to look at every kicking team player on the field or every player in red, that's trying to stop the punt returner from moving forward. And let's say we just look at number 97, for example, to start off here. So with number 97, what we're going to do is we're going to create a straight line connecting him to that target location. And once we do that, we're going to find all blockers that we can project onto this straight line at a 90 degree angle. And that's going to sort of be all of the blockers that can really reasonably impede his path from getting to that target location. So here we have number 82 and number 19. And for these blockers, we're going to take two different measurements. So first, we're going to take the length of that projection from 82 to the straight line, as indicated by D1 here. And that's going to be a measure of the lateral distance from the kicking team player. And then we're going to take the distance from number 97 to that projection. And that's going to be indicated by D2 here. And that's sort of the forward distance from the kicking team player to the blocker.
And with these two measurements, we're going to attribute time penalties for each of the blockers using what we call a projected Gaussian kernel. And that's characterized by the equation right here and the heat map below. And sort of what we're trying to do here is to create an equation where we're accurately capturing how much a blocker is likely to impede a member of the kicking team's path to that target location. Right, so you can see here that when you're very close and right in front of number 97, you're going to impose a very high penalty. Whereas as we move further and further out, you can be a bit more lateral from the kicking team player or from number 97, but you have less of a penalty overall, right? Because if you're, say, right in front of number 97, like one foot forward from him, but five feet to the left, then you're less likely to actually catch him or impede him than if you're, say, 10 feet forward from him and five feet to the left.
And once we assign these blocker penalties, we're going to simply add them all up. So with number 19 and number 82, they impose a 0.54 and 0.31 second penalty on number 97, respectively, giving us a total penalty of 0.85 seconds by these two blockers. Then we're going to calculate the expected time to the target location for number 97, given that he runs at a straight-line speed of seven yards per second, and that's going to be 2.34 seconds. Then we add these two values up. So the time that we expect him to take to get to that target location unimpeded is 2.34, and the blocker penalty that we expect is 0.85. And that's going to give us 3.19 seconds for number 97 to reach that target. And we're going to repeat this process over and over for every member of the kicking team.
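The per-defender calculation above can be sketched as follows. This is a minimal illustration under stated assumptions: the 7 yards-per-second straight-line speed comes from the talk, but the kernel constants (`sigma`, `max_penalty`) and the exact form of the projected Gaussian kernel are placeholders, since the talk doesn't give the fitted equation.

```python
# Sketch of the penalized expected arrival time (PEAT) for one defender.
# Kernel constants are illustrative assumptions, not the team's fitted values.
import numpy as np

SPEED = 7.0  # assumed straight-line speed of a kick-team player, yards/second

def peat(defender, target, blockers, sigma=2.0, max_penalty=1.0):
    """Expected arrival time of one defender at `target`, penalized by blockers.

    d2 = forward distance from the defender along the defender->target line
    d1 = lateral (perpendicular) distance of a blocker from that line
    Each blocker projectable onto the segment adds a Gaussian-shaped time
    penalty that decays with d1 and d2 (a stand-in for the projected kernel).
    """
    defender, target = np.asarray(defender, float), np.asarray(target, float)
    line = target - defender
    length = np.linalg.norm(line)
    unit = line / length
    normal = np.array([-unit[1], unit[0]])

    penalty = 0.0
    for b in np.asarray(blockers, float):
        rel = b - defender
        d2 = rel @ unit          # forward distance along the line
        d1 = abs(rel @ normal)   # lateral distance from the line
        if 0 <= d2 <= length:    # only blockers we can project onto the segment
            penalty += max_penalty * np.exp(-(d1 ** 2) / (2 * sigma ** 2)) \
                                   * np.exp(-(d2 ** 2) / (2 * (length / 2) ** 2))

    return length / SPEED + penalty

# Toy example: one defender 14 yards from the target, two blockers near the line
t = peat(defender=(0.0, 0.0), target=(14.0, 0.0),
         blockers=[(5.0, 1.0), (9.0, -3.0)])
print(round(t, 2))
```

The unimpeded time here is 14 / 7 = 2 seconds, and the two blockers add fractional-second penalties on top, mirroring the 2.34 + 0.85 example from the talk.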
And on the left here, you can see we have a little table with the top few players. And number 17 is the quickest to get to that target location, at 0.92 seconds in this case. And that's going to be our measure of the time we expect a member of the kicking team to take to get to that location, or our penalized expected arrival time. And we're going to repeat this process over and over for up to 20 evenly spaced points along each of these windows.
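Sampling the evenly spaced target points along a window, and taking the fastest defender at each point, might look like this. For brevity this sketch uses a plain, unpenalized distance-over-speed arrival time rather than the full penalized version above, and all coordinates are invented.

```python
# Sketch: up to 20 evenly spaced targets along a window p->q, and for each
# target the minimum (unpenalized, for brevity) arrival time over defenders.
import numpy as np

SPEED = 7.0  # assumed yards/second

def window_arrival_times(p, q, defenders, n_points=20):
    ts = np.linspace(0.0, 1.0, n_points)
    points = np.outer(1 - ts, p) + np.outer(ts, q)   # evenly spaced along p->q
    defenders = np.asarray(defenders, float)
    # distance from every defender to every sample point, then min per point
    dists = np.linalg.norm(points[:, None, :] - defenders[None, :, :], axis=2)
    return dists.min(axis=1) / SPEED

times = window_arrival_times(p=np.array([20.0, 5.0]), q=np.array([20.0, 45.0]),
                             defenders=[(25.0, 10.0), (30.0, 30.0)])
print(times.shape, round(float(times.min()), 2))
```

Each entry of `times` plays the role of one row of the arrival-time table in the talk: the quickest defender's expected time to that point along the window.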
A* search algorithm and optimal path
Great, so now that we have our mesh structure and a weighting system for it, we need to find a way to get the punt returner to the end zone. And to do this, we use the A* search algorithm.
So it works a little bit like a GPS, and just finds the safest route while avoiding traffic. Our algorithm looks at a heuristic function, which defines how many yards are remaining from the target to the end zone, and a cost function. This cost function describes the distance between the punt returner and the target, the danger associated with the target point, as well as how risky of a target the punt returner may be willing to consider traveling to. The algorithm explores various possible routes, abandoning a route whenever the total of the heuristic and cost functions becomes too large.
We can set a few constants to help us with the trade off between risk and reward for the punt returner. So today, we're going to really focus on the alpha term and see how changing it might adjust this trade off between risk and reward. So for our submission, if you have the chance to look at it, we used alpha of 0.5. So we consider this to be a good mixture of risk and reward for a punt returner wanting to gain yards while still avoiding the tackle, avoiding injury, stuff like that. If we were to change alpha to be one, the path considers only yards remaining to the end zone with no care of the danger of being tackled. So this would mean that you're going basically straight to the end zone and hopefully that you can push everyone out of the way. And then setting alpha to zero finds the safest path available without considering if it's really going to bring you closer to the end zone. So this typically will send you along the sidelines and push you the rest of the way.
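Here is a toy version of the search, offered as a hedged sketch: a textbook A* on a small grid rather than the team's triangulation mesh, with a made-up danger field and an `alpha` knob that blends progress toward the end zone against danger, mirroring the alpha = 0 / 0.5 / 1 behavior described above.

```python
# Toy A* with a risk/reward trade-off. Grid, danger values, and cost form
# are illustrative assumptions, not the team's exact formulation.
import heapq

def a_star(start, goal, danger, alpha=0.5):
    rows, cols = len(danger), len(danger[0])

    def heuristic(node):                       # "yards remaining" analogue
        return abs(goal[0] - node[0]) + abs(goal[1] - node[1])

    frontier = [(alpha * heuristic(start), 0.0, start, [start])]
    seen = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                # step cost blends distance (reward) against danger (risk)
                ng = g + alpha * 1.0 + (1 - alpha) * danger[nr][nc]
                heapq.heappush(frontier, (ng + alpha * heuristic((nr, nc)),
                                          ng, (nr, nc), path + [(nr, nc)]))
    return None

# 3x4 grid: heavy danger blocking the direct route along the top row
danger = [
    [0.0, 9.0, 9.0, 0.0],
    [0.0, 9.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
safe = a_star((0, 0), (0, 3), danger, alpha=0.0)    # detours around the danger
direct = a_star((0, 0), (0, 3), danger, alpha=1.0)  # heads straight for the goal
print(len(safe), len(direct))
```

With alpha = 1 the path ignores danger and takes the shortest route; with alpha = 0 it finds a longer route through only zero-danger cells, just as the talk describes for the sideline-hugging safe path.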
Fréchet path deviation
And now that we have this optimal path defined, let's take a look at a case here where we set alpha equal to 0.5. And we want to compare the optimal path versus the actual observed path for the punt returner. And to do that, we look over the next five yards for each of the paths. So as you can see on the top right here in the visual, we have sort of the highlighted black and green lines over the next five yards. And we want to calculate the Fréchet distance over this stretch. And Fréchet distance is a measure of difference between two paths, where intuitively you could think of it as one path being a man and the other path being a dog. And the Fréchet distance is the minimum length of a leash you need to traverse from one end of the paths to the other end.
You can get a bit of visual intuition behind this here. With the blue lines, you can see sort of the steps along each of these paths. And the teal line there is another step, but it's the step where the length is maximized. And that's sort of the minimum length of a leash that you would need to traverse from one end of the paths to the other.
And with this Fréchet distance, we also have to take into account a couple of other things, just in case there's less than five yards remaining in either the observed or the optimal path. If both paths are more than five yards, then we simply calculate the Fréchet distance, and that's our measure of deviation from the optimal path in that moment. But if either of the paths is between one and five yards, then we multiply the Fréchet distance by five and divide it by the minimum length of the two paths, to scale up and account for the fact that we aren't really seeing a full five-yard movement. And if there's less than one yard on either of the paths, then we omit the calculation, since we don't really have enough information to see how the punt returner was moving relative to the optimal path.
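The leash analogy corresponds to the discrete Fréchet distance, which has a standard dynamic-programming form; the sketch below implements it along with the under-five-yards rescaling rule just described. The example paths are invented.

```python
# Discrete Fréchet distance (the "leash length") between two sampled paths,
# plus the <5-yard rescaling rule from the talk.
import numpy as np

def discrete_frechet(P, Q):
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    n, m = len(P), len(Q)
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    F = np.full((n, m), np.inf)
    F[0, 0] = d[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(F[i - 1, j] if i > 0 else np.inf,
                       F[i, j - 1] if j > 0 else np.inf,
                       F[i - 1, j - 1] if i and j else np.inf)
            F[i, j] = max(prev, d[i, j])          # leash must cover this pair too
    return F[-1, -1]

def path_deviation(observed, optimal, obs_len, opt_len):
    """Fréchet deviation with rescaling: between 1 and 5 yards of movement,
    scale by 5 / min(length); under 1 yard, omit the calculation."""
    shortest = min(obs_len, opt_len)
    if shortest < 1.0:
        return None                               # not enough movement to compare
    f = discrete_frechet(observed, optimal)
    return f if shortest >= 5.0 else f * 5.0 / shortest

# Straight observed path vs. an optimal path that cuts 2 yards upfield
observed = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
optimal  = [(0, 0), (1, 1), (2, 2), (3, 2), (4, 2), (5, 2)]
print(round(path_deviation(observed, optimal, obs_len=5.0, opt_len=5.0), 2))  # → 2.0
```

Here the leash never needs to be longer than the 2-yard gap at the far ends of the paths, so the deviation is 2.0 yards with no rescaling needed.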
Key R functions used
And yeah, with this optimal path algorithm, we also just wanted to give a quick shout out to some of the key R functions that really helped us in this process. So first up, nest and map are super helpful for making this whole process very tidy and clean, and for being able to apply large-scale analysis using very complex operations within just a single data frame. So nest essentially allows you to group the data, say by each frame of the play, where a frame is simply a moment in time. But you have 20-odd rows of player locations. If you group by the play or by the frame, then you can nest the data into a column of data frames, which includes all of the player locations. So you have one row per frame, but within that row, you have sort of a data frame of all of the player locations. Then with the map function, you can start applying different functions to that data.
So you can apply functions to find the windows for the punt returner to move through, or calculate the penalized expected arrival time, or find the optimal path with the A* algorithm, using this map function, all within one data frame. And future_map is just an evolution beyond that, where you can parallelize the code so it runs a bit quicker. And dplyr, of course, was super helpful for us, just with building out sort of a tidyverse pipeline with key functions like mutate, select, and filter, and also the pipe operator, making the code nice and sleek. Then of course, all of our plots here are made with ggplot2 and gganimate, and those packages are just super awesome for making very nice, aesthetically pleasing plots with very little effort.
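The nest/map pattern described above can be sketched in Python with pandas as a stand-in (the team's actual pipeline used tidyr's nest, purrr's map and future_map, and dplyr in R): one element per frame, each nesting that frame's player rows, with a per-frame function mapped over them. The data here are randomly generated.

```python
# Sketch of the nest/map pattern with pandas, standing in for tidyr/purrr.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
tracking = pd.DataFrame({
    "frame": np.repeat([1, 2, 3], 22),     # 3 frames x 22 player rows
    "player": np.tile(np.arange(22), 3),
    "x": rng.uniform(0, 120, 66),          # field is 120 x 53.3 yards
    "y": rng.uniform(0, 53.3, 66),
})

# "nest": one entry per frame, each holding that frame's player locations
nested = pd.Series({frame: grp[["player", "x", "y"]].reset_index(drop=True)
                    for frame, grp in tracking.groupby("frame")})

# "map": apply a per-frame function over the nested data frames
# (a real pipeline would compute windows, arrival times, or the optimal path)
mean_x = nested.map(lambda df: df["x"].mean())
print(mean_x)
```

The same shape as the R workflow: one row per frame on the outside, a full per-frame data frame on the inside, and arbitrary functions mapped across frames.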
Q&A: graduate programs and methodology
Yeah, I can see a question that came over from Slido. And just a reminder, a reminder that you can ask questions over on Slido and I'll just put up the link right now on the screen. So you can ask questions there anonymously. Or you can ask through YouTube Live too. But one of the questions that came up was, I'll put on the screen, do you think being in a general statistics graduate program was beneficial to your work as opposed to more specific like data science or sports analytics?
That's a good question. There might be debates there. I think, so we'll talk a little bit more about this later, but for our work, a key aspect in building a team is definitely to have a good variety of backgrounds. So having someone who is a statistics graduate, to really understand the models and what's going on behind them, is great. And if you could get a computer scientist on there, they're super efficient and can write code probably a lot faster than a lot of us stats folk. Yeah, it's definitely good to always have someone on your team with a stats background to understand what's going on. But a mixture of various backgrounds and departments and origin stories is always great to see.
That's great. Thank you. One other question I see over on Slido, I'll put on the screen here, is: did you calculate PEAT at just a single point in time, at catch, or at multiple points during the return?
Yeah, that's a great question. So yeah, we recalculated the PEAT basically at every single frame of the play. And then for our Big Data Bowl submission, we were taking the median over the entire play, just because there are a few frames where, say, a punt returner is running a path that's maybe not the optimal path by our algorithm, but is still a very good option. After a few frames, the optimal path should adjust to that, but during the time when they're running a path different from the one the algorithm suggests, it has quite a big Fréchet deviation. So we took the median to account for that, and to adjust for our naivety, in that there could be two different but both very good options. So that's what we were doing for the Big Data Bowl. But we've also played around with metrics looking at it just at the catch, taking the mean, looking at, say, the frame with the first forward movement, and different metrics like that. But yeah, basically, we can calculate it over every frame.
Shiny app demo
OK, so to make our work a little bit more interactive, we built a Shiny app so that everyone here today can have a look at our optimal path on any return that you might want to see throughout all the years of data that we have. So as the user, you can select the season, the returner, the play, and set the risk alpha that you want the punt returner to consider when making his optimal path. You can watch the return play out by hitting the play button in the top right corner, or you can use the slider bar to look at any specific point along the return. You're given a description of what the return looks like and what it actually achieved. And then you have the option to save the image of the individual frame that you're looking at, save a GIF of the entire return, or download the data behind the return if you want to play with it yourself.
So maybe don't all try to log on right away, because we want to make sure the app stays responsive for you guys. We can start off by picking a year, let's pick 2020, maybe. And then, if anyone in the comments has a favorite punt returner that you might want to see from the 2020 season, feel free to comment now.
So any other questions, Rachel, to fill our next five seconds?
One question that I had was, like thinking about people getting started with the Big Data Bowl for 2023. And you just have all this data. How do you narrow down what you want to focus on?
Yeah, so typically, so we're not sure what they're going to announce this year for the Big Data Bowl, but I think they're leaning towards having a general topic each year. So for us, it was punt returns. This year, who knows what it might be? But yeah, definitely start by just looking at the data, seeing what you can find, and playing around. Make some histograms, make some pie charts, anything you can do just to understand the data. It's going to be a lot of data, and really hard to run. So often, just look at a couple plays, or one good play, and make sure everything works for that play before you scale it up, because for some of our stuff, I think Ryker was running things overnight. And then if we had to find one error, you'd fix it, and then you'd go and run everything again overnight. So you don't want to start there; you want to start small, maybe with one good play, and see what you can do.
Yeah, and definitely, you know, we had kind of a broad topic with special teams; there was a lot of different ways we could go with it. But figuring out what's going to be useful as well in a football sense. And then, like Robyn said, kind of getting a general idea how the data looks, what's going to be possible, and then kind of running with it from there.
Yeah, we definitely had a lot of other ideas, like are we going to do, you know, field goals, are we going to do this and that and then it's just kind of trial and error and getting close to what you think is the best idea.
Working with tracking data is definitely a great experience. Even if you're not ready to enter the Big Data Bowl, and maybe you don't have your star lineup ready, play around with the data and get used to working with tracking data. If you want to start a career in sports analytics, just getting those tools in your toolkit is definitely super useful.
McCole Hardman case study
So here, we're going to show Mecole Hardman's week 12 return against Tampa Bay, where we have an alpha of zero, so be as safe as possible; 0.5, kind of a good mix of risk and reward; and one, which basically just says run straight to the end zone and plow through everyone if you need to. So in our Big Data Bowl finals presentation, we concluded that Mecole Hardman was a fairly conservative player. And we can use our app to further analyze just how conservative he is.
So if you play the GIFs, the optimal path updates at each frame, constantly adapting to the situation that Mecole Hardman is facing and suggesting a new optimal path for each situation. And overall, we think Mecole Hardman could use this tool to improve his returns, because a lot of his play is probably closest to a risk level of zero, while, you know, we kind of want to gain yards as we do a return. But it depends on the scenario, right? He's a valuable player in his other role on the team, so you kind of want to keep him safe at the same time. Things like that could influence how you might determine what alpha you might want to use.
Team analysis and post-submission work
Yeah, okay, so I'm going to take over now and talk about what we ended up doing after the initial submission. Go to the next slide. Yeah, so the task that the NFL gave us was to look at a specific team and try to find out how we could use the work we did to improve their results in the coming seasons. So we picked Kansas City. As you can see, in 2018 they were really highly ranked in terms of average punt return yardage, and that kind of fell off in 2019 and 2020, which coincided with the returner being changed from Tyreek Hill to Mecole Hardman, who we just showed. And so after watching some of Mecole Hardman's punt returns, we chose the Chiefs as a team we thought we could improve.
And we can see why in this plot. So if you look at the Mecole Hardman plot on the far right, he often kind of just goes towards the sidelines. He's not cutting up and running up the middle of the field, where there are obviously going to be more defenders, but also more blockers, and there might be more space to maneuver. And we can contrast him with the other three punt returners shown, who rank really highly in our expected return yardage metric, or just on average punt return yardage. They're much stronger returners, and you can see that by the really long returns for some of them.
And so the way we quantify that is by looking at their average yards gained per punt return versus their Fréchet path deviation. So the Fréchet path deviation is kind of a measure of good decision making, so being close to the optimal path, versus average yardage gained per play, which is more of a pure skill metric: it's dependent on how fast you are, and obviously the strength of your blockers and the strength of the kicking team. So we can look at Mecole Hardman, who's kind of on the bottom right of that plot, which means he doesn't have a lot of skill and he's not making great decisions.
And that's, you know, in contrast to the other three highlighted names, who are the returners we saw on the last slide, and who are very strong returners. So the idea is, you know, if you decrease your Fréchet path deviation, that might lead you to gaining more yards per punt return.
Tips for a good Big Data Bowl submission
All right, and then just talking more generally about how to have a good submission for the Big Data Bowl. I think one of the biggest parts is, like Robyn mentioned earlier, having a very diverse team with many different skill sets. You know, I come from more of a math and physics background, very different from the background that maybe Robyn has, doing an undergrad in stats. And so that helped us; we brought different ideas to the table. And then brainstorming, as Robyn mentioned earlier, making visualizations, doing exploratory data analysis, seeing what kind of data you have, what kind of paths you might go down in terms of different projects, just running with something and not being afraid of scrapping it and going in a different direction if you see a better one.
All right. So for any of you who are interested in the 2023 or future Big Data Bowls, this is what we believe makes a good submission. So firstly, we think do a few things great, rather than several things good. The goal is really to stand out to the judges, and we think this is how you do that. Second, have multiple ways to digest ideas. So this could be through visuals, text, equations, titles, or even captions. The third is to be really clear and concise with your project. So have a goal in mind: what problem are you trying to solve? Then determine the method, how you're going to solve this problem, and then apply it. It's really important for your project to be useful to the NFL, and for you to explain how it could be useful. Lastly, have fun and learn as much as you can, not only from your own project, but from all of the many great submissions every year.
All right, so that's really everything we have for you guys today. On the left there, we have QR codes that take you to our original submission, the GitHub page, like we mentioned earlier, as well as the Shiny app that we showed you guys. We really appreciate the opportunity to talk to you about our work, and really want to thank everyone for coming out today. And obviously, thank you, Rachel, for having us. We'd be happy to answer any more questions that you guys may have.
Q&A: real-time use, code sharing, and challenges
Thank you all so much. This is awesome, really impressive. And being able to see it through all the animations in the presentation was great too.
I really liked the way that you included tips for people who are wanting to get started. And I was just thinking, on that last tip about focusing on a few great things instead of trying to do everything: were there a few things that you had to narrow down to get the project to where it is now?
Yeah, I think, like Elijah went over, we kind of built it up from the ground up, and had these convex hulls, and then developed it more into the optimal path. And once we got the idea for the optimal path, and Brendan and Robyn kind of took the lead on that side, that became a real focal point for us. So we did have a lot of those other original ideas, and we used those as features in our expected return yardage model. But they weren't as featured in the project; we really tried to hone in on the optimal path and make that the focal point.
And definitely, while you're working on your submission: I think for us, you're allowed up to 2,000 words, which doesn't go a long way. So definitely having visuals that kind of explain what's going on can save you a lot of time and space. If you do too many small things, you won't be able to explain them to the extent they deserve. So it's nice to be able to do this, where we can actually show people all the hidden elements that they didn't get to see, and would never really know about if we didn't go through this.
One other question that just came through on Slido was: what were the biggest challenges when going through the data?
First of all, the size of it, how much there was. And really trying to utilize every piece that they provided for us, trying to really diversify how we were building out the project. But like was mentioned a few times throughout the presentation, the amount of tracking data is really the challenging part. And then being able to turn that into our own metrics was also a challenge.
Yeah, I can't remember if it was the NFL or somewhere else, but I think they have the ability to go to, like, another 10 times finer grain of tracking data. And people are just like, please don't, we do not have the capacity to even use that, it's too fine. So hopefully they don't go too crazy. But it's an interesting area to work with, the tracking data.
And with that, too, I'd just like to say the NFL tracking data was probably the nicest data I've worked with, in how clean it was and just how easy it was to work with. There weren't really any errors. There might have been one time I saw two players with the same jersey number, but that's about it. It was pretty smooth; there were no discontinuities in the tracking data at all, really. So it just made it so much easier to work with.
Yeah, like Riker was mentioning, size was the biggest issue, just dealing with these millions of rows of data. But actually having to clean up the data was super easy.
Yeah, definitely having a team to work with can alleviate a lot of the challenges. The way we approached our project was we split it up: if you have a strong computer, you're doing the data cleaning, and if you have a weak computer, you're doing the research. We kind of split up the different tasks we had to face so that we could always have everyone doing something and progressing the project.
Getting started with player tracking data
How do you get started working with player tracking data? Were there resources that you turned to?
I think a lot of us had a bit of experience prior, whether it be in school or through internships or other projects. Like was mentioned, Brendan and I worked with the Big Data Cup the year before. For me, it was just kind of diving right into it and figuring it out as you go.
Yeah, I think for me, I'd only worked with event-level data before going into this competition. The key thing that I really focused on and needed was the visual. I won't understand that something's working if you just give me a bunch of numbers, but if I can have a visual to go with it, that definitely helps with the understanding of what's going on, like a million times.
Yeah, along those lines, having good knowledge of the tidyverse and using those nest and unnest functions and gganimate made it so much easier to understand the data. If you can put it into a nice clean pipeline where you can break it down into one row per frame or per player, whatever you need, then you can just visualize it with gganimate. It makes it so much easier to see what's actually going on, rather than just looking at the data and seeing x coordinate 34, y coordinate 43 or something. So that made it way easier to get going and see where issues were happening, if they were happening.
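The workflow described above can be sketched roughly like this. This is not the team's actual code; the data below is made up, and the column names (`frameId`, `nflId`, `x`, `y`) are assumptions that mirror the one-row-per-player-per-frame shape of the NFL tracking data:

```r
library(tidyverse)
library(gganimate)

# Toy tracking data: one row per player per frame (values are made up)
tracking <- tibble(
  frameId = rep(1:3, each = 2),
  nflId   = rep(c(1L, 2L), times = 3),
  x       = c(30, 50, 31, 49, 32, 48),
  y       = c(20, 25, 21, 24, 22, 23)
)

# nest() collapses the data to one row per frame, which makes it easy to
# inspect or map over individual frames in a clean pipeline
by_frame <- tracking %>%
  nest(players = c(nflId, x, y))

# Animate player positions over time with gganimate;
# coord_fixed keeps the field proportions (a 120 x 53.3 yard field)
anim <- ggplot(tracking, aes(x, y, colour = factor(nflId))) +
  geom_point(size = 4) +
  coord_fixed(xlim = c(0, 120), ylim = c(0, 53.3)) +
  transition_time(frameId) +
  labs(title = "Frame: {frame_time}", colour = "player")
```

Calling `animate(anim)` (or printing `anim`) renders the animation, stepping through the frames so you can watch the players move rather than reading raw coordinates.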
Building a team and community
I'm curious, do you all normally win your fantasy football leagues?
We're all in one together that just finished its first weekend. It was a bit rough for some of us.
One other question was, at any point during the project, did you feel like you hit a roadblock?
Yeah, well, with our project, we started in September but didn't really work on it super hard until December when our courses ended, because we had a couple of very difficult courses to get through that were sucking up a lot of our time. Once we got through that, there were a couple of days where I was thinking, man, are we going to get this done in time? And there were a few things, like the penalized expected arrival time algorithm: it took a little while to really think it through and come up with an algorithm to actually do that.
Yeah, I remember. I think it was due on the 7th of January or something, and I remember on January 4 we realized that we had calculated something completely wrong. So that running-the-code-overnight that Robin talked about, he had to redo it three days before it was due. That was a little bit stressful, but we're happy we turned it around and made it work.
That's great. So I know you've mentioned that forming your team with diverse backgrounds is really important. I'm wondering, for people who maybe are working in different industries right now but are also super interested in sports and the Big Data Bowl, if you have any suggestions for meeting people that could possibly be part of their team?
Yeah, so for the Big Data Bowl, I think we typically stayed within SFU. But for the Big Data Cup, maybe Brendan, you can talk about how you formed your team.

Yeah, so like Robin was mentioning, for this one I just emailed all the students who were interested in sports and we formed a team. But for the Big Data Cup, I just met my teammates online, on Twitter. I think Twitter is a great resource for sports analytics, because there are so many people on there sharing their ideas through blogs, videos, and various other forms of media. So that's a great way to get involved and interact with other people who are in the industry or trying to get into the industry. And yeah, I literally just met one of my teammates through Twitter, and then we said, how are we going to get other people, and we emailed one guy that he knew.
