Regina Lionheart - Making Waves with R, Python, and Quarto

video
Oct 31, 2024
18:55

Transcript

This transcript was generated automatically and may contain errors.

Hey, thank you everyone so much for coming to see my talk, and let's dive right in. That will be my only water-related pun for this talk.

Okay, so how would you build a sandcastle? You could make it really simple, you could make it really complex, lots of turrets and all kinds of interesting things on it, but all sandcastles have something in common. Eventually, the waves are going to get them.

So how can we give ourselves some extra time to make our sandcastles last? We could add a moat that'll buy us some time as the tide comes up. Materials matter as well. If we use loose, dry sand, that is not going to last as long as if we use something a bit sturdier, like these kids may have used concrete for theirs. But how do we know what's going to happen to our sandcastle? How can we provide something that is going to protect it from all of the many things that can happen to sand-based structures on water?

Introducing hydrodynamic modeling

So I'd like to introduce you to hydrodynamic modeling. This is a way we can build the sandcastle, the waves, the surface, the conditions, the wind, all of it, numerically. So this technology is already being used across Washington State, here where we are, where shoreline change is just a part of life for a lot of our coastal communities.

So Washington State has about 3,026 miles of coastline. We have a vertical tidal range of about 13 feet in some areas. Our currents can reach up to 12 feet per second. For reference, a fast current is usually about 5 feet per second. And we have tall coastal bluffs that erode and provide sediment. We have wide, high-energy ocean-facing beaches. And we have an incredibly rich near-shore ecosystem that contains everything from sea stars and cnidarians to our big charismatic orcas, which if you've been walking along our shoreline, you may have been lucky enough to see.

We also have an incredibly developed shoreline. Humans have been in this area for about 13,000 years. And when Western settlers came to the area in the 1800s, the shoreline modification changed at a very high rate. So a lot of shorelines look very different from how they used to. And these modifications interfere with natural processes that include erosion and beach building and habitat formation.

So some of these modifications have produced some pretty serious conflicts between man and world. So one example I'm going to show you of that is the town of North Cove, which is about three and a half hours southwest of here. This is the town of North Cove from satellite in 1995. And then just 10 years later, this is what it looks like. And then in 2020, it looked something like this. So we lost 3,000 feet of shoreline in just 25 years. At the highest rate, it was 100 feet per year. And I have visited this site and the houses are literally falling into the sea.

So it is important that we know what is going to happen to our shorelines and have some tools to help defend and mitigate some of the risks that our coastal communities face. At this point, you may be wondering if you are at the correct conference. This is an open science or open data science talk. And this is a use case about how we used open science to make hydrodynamic modeling software more accessible to our coastal communities.

Clayton Beach: the project site

So this project was in Bellingham at a site called Clayton Beach. And that's about 80 miles north of where we are now. It's like an hour and a half drive. So this is Clayton Beach. It's a really popular spot for locals to go picnic and be by the water. It's a very beautiful spot. It is also heavily modified.

So in the late 19th century, the Great Northern Railroad was built along this site. And that involved a lot of shoreline modifications and armoring. So a big hydraulic trench was dredged offshore and rock was blasted out of the side of the Chuckanut Formation, which is the rock type that we have up there. And all of that sidecast fill was blasted out back into the water and invariably brought back to the beach by our currents. So that resulted in this sort of false armoring and a heavily modified shoreline that remains to this day. So on the left, there's that 1800s picture. And on the right, that's from 2022. So you can still see all of that rock and the old pilings are still there.

So our job with this project was to try and understand how we could restore the beach within our stakeholder resources from a natural approach. So this photo is Clayton Beach taken by drone in 2006. And that red line you're seeing is armoring that was put in place when erosion reached too close for comfort for the railroad that curves near the beach. In 2006, that was 50 feet. And in 2016, it was 150. So we're entering this cycle of erosion and armoring and erosion and armoring, and we wanted to try and find a better way.

Running the wave models

So how we're going to do this is run hydrodynamic modeling over engineering surfaces that our engineers came up with. And once we have the surfaces, we can identify areas of high risk. So Clayton Beach is exposed to a very high fetch, which is the distance over which waves can develop. It is openly accessed through the Strait of Juan de Fuca. And there are mountain ranges that surround us, creating variable wind patterns and a lot of things that we needed to account for in our models.

So this is what a model domain looks like. This is called a SWAN model, which computes random, short-crested, wind-generated waves in a hypothetical environment. You're looking at depth there and energy dissipation over a certain space. And these models can do everything that real waves would do. So shoaling as they hit the bottom, refraction, changing the waves around items that they might hit, whitecapping, reflection, transmission, all of these wave behaviors can be modeled in the software. In other words, it can basically tell you what your sandcastle is going to look like by the afternoon.

These are the shallow water equations. This is how this modeling software works. They describe the movement of fluid under a free surface with low viscosity. And when you put these models into the real world, they start to describe the action of real waves. So this is called wave breaking, which is kind of like a person walking who trips, and their head goes over their heels. So the top of the wave is moving faster than the bottom. And once we have these equations, we put them over a geographic grid. And then those equations are run in space and time over the grid.
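For readers without the slide, one common textbook way to write the shallow water equations is the two-dimensional conservative form below. This is not necessarily the form shown in the talk. The symbols are assumptions introduced here: h is water depth, (u, v) is the depth-averaged velocity, g is gravitational acceleration, and b is the bed elevation. Viscosity, bottom friction, and wind forcing are left out.

```latex
\begin{aligned}
\frac{\partial h}{\partial t}
  + \frac{\partial (hu)}{\partial x}
  + \frac{\partial (hv)}{\partial y} &= 0 \\
\frac{\partial (hu)}{\partial t}
  + \frac{\partial}{\partial x}\left(hu^{2} + \tfrac{1}{2} g h^{2}\right)
  + \frac{\partial (huv)}{\partial y} &= -\,g h \frac{\partial b}{\partial x} \\
\frac{\partial (hv)}{\partial t}
  + \frac{\partial (huv)}{\partial x}
  + \frac{\partial}{\partial y}\left(hv^{2} + \tfrac{1}{2} g h^{2}\right) &= -\,g h \frac{\partial b}{\partial y}
\end{aligned}
```

The first line is conservation of mass and the other two are conservation of momentum in x and y.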

So what this ends up looking like, on the right hand side, you have a flat surface that's being modeled within a bathtub scenario. The water equations are run across that surface. And then all of that refracting and changing of the water is handled by the modeling software. On the left hand side, you can see the waves are moving along that coastline. When they hit the man-made armoring structure at the top, they refract around the sides. So this is just visualizing what those mathematics are doing.

So back to our site. There's Clayton Beach up on the right hand side. And here's the entire modeling domain that is containing the site where the mathematics will be run. And if you remember this fun gif from earlier, it really does look something like this. These waves are running over a computational grid. Fortunately for us, the modeling software handles all of that mathematics internally, because otherwise that would be a lot of math to do for that space.
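To give a feel for what "running the equations over a computational grid" means, here is a minimal sketch of a one-dimensional shallow water solver using the Lax-Friedrichs scheme. It is a toy, not how the modeling software in the talk works internally. The flat bottom, periodic boundaries, grid size, time step, and initial hump of water are all assumptions chosen for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def flux(h, hu):
    """Physical flux of the 1D shallow water equations for depth h and momentum hu."""
    u = hu / h
    return hu, hu * u + 0.5 * G * h * h


def lax_friedrichs_step(h, hu, dx, dt):
    """Advance depth and momentum one time step on a periodic grid."""
    n = len(h)
    new_h = [0.0] * n
    new_hu = [0.0] * n
    for i in range(n):
        left = (i - 1) % n   # periodic neighbour to the left
        right = (i + 1) % n  # periodic neighbour to the right
        fl_h, fl_hu = flux(h[left], hu[left])
        fr_h, fr_hu = flux(h[right], hu[right])
        new_h[i] = 0.5 * (h[left] + h[right]) - dt / (2 * dx) * (fr_h - fl_h)
        new_hu[i] = 0.5 * (hu[left] + hu[right]) - dt / (2 * dx) * (fr_hu - fl_hu)
    return new_h, new_hu


# Hypothetical setup: 100 cells of 1 m, 2 m of still water with a 0.2 m hump.
n_cells, dx, dt = 100, 1.0, 0.1
h0 = [2.0 + 0.2 * math.exp(-((i - 50) ** 2) / 20.0) for i in range(n_cells)]
h, hu = list(h0), [0.0] * n_cells
mass0 = sum(h0) * dx

for _ in range(50):
    h, hu = lax_friedrichs_step(h, hu, dx, dt)

mass = sum(h) * dx  # total water volume per unit width should be unchanged
```

The time step is picked so that the wave speed sqrt(g h) times dt stays below dx, which this scheme needs to remain stable. Real coastal models add bathymetry, wind, friction, and a second horizontal dimension on top of this basic idea.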

The data extraction challenge

So what do these models look like and what goes into them? For the ones for Clayton Beach, we had three separate oceanographic conditions, times four engineering surfaces, times nine timestamps of runtime, which gives us about 108 potential scenarios that we were running over this beach. For each one of those scenarios, we have 28 hydrodynamic variables of interest. And if any of you can do math very quickly, you'll see that is 3,024 individual items that we need to pull out manually.
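The count above can be reproduced by enumerating the combinations. The labels below are placeholders, since the talk does not name the individual conditions, surfaces, or variables.

```python
from itertools import product

# Placeholder labels; only the counts (3, 4, 9, 28) come from the talk.
conditions = [f"condition_{i}" for i in range(3)]
surfaces = [f"surface_{i}" for i in range(4)]
timestamps = [f"t{i}" for i in range(9)]
variables = [f"var_{i}" for i in range(28)]

# Every model scenario is one (condition, surface, timestamp) combination.
scenarios = list(product(conditions, surfaces, timestamps))

# Each scenario yields one extraction per hydrodynamic variable.
total_extractions = len(scenarios) * len(variables)

print(len(scenarios))      # 108
print(total_extractions)   # 3024
```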

So if I have an actions-per-minute rate of two clicks per second, that's about half an hour, which is not that bad. You can do that manually. Carpal tunnel is fine. But we're always asking new questions. What if we filled in that trench that's dredged offshore? What if we add more groynes and try to shift that sediment transport? What if we restore the eelgrass, if we have funding for that? And each time we do this, we have to rerun the models and extract everything again. And pretty soon you have this rapidly expanding database of sort of similarly named things that you have to query and organize.
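As a rough check on the "about half an hour" figure, the sketch below assumes one click per extraction, which is an assumption made here and almost certainly optimistic. It also shows how a single extra engineering surface grows the manual workload.

```python
def manual_minutes(n_conditions, n_surfaces, n_timestamps, n_variables,
                   clicks_per_second=2.0):
    """Minutes of clicking, assuming exactly one click per extracted item."""
    extractions = n_conditions * n_surfaces * n_timestamps * n_variables
    return extractions / clicks_per_second / 60.0


baseline = manual_minutes(3, 4, 9, 28)          # the scenario set from the talk
one_more_surface = manual_minutes(3, 5, 9, 28)  # one extra "what if" surface

print(round(baseline, 1))          # 25.2
print(round(one_more_surface, 1))  # 31.5
```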

Open science to the rescue

And there's a better way. So introducing the open science part of this. So instead of using the manual traditional hydrodynamics software, I used xarray and exported everything as a NetCDF, which is kind of a portable, self-describing data format that's really common for scientific data. So using xarray and that modeling platform that we were talking about earlier, that all is being combined into a single array that is now easily queryable.
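A minimal sketch of what "a single array that is easily queryable" can look like in xarray follows. The dimension names, variable names, coordinate values, and grid size are all hypothetical, and random numbers stand in for real model output. In practice the data would more likely come from the exported NetCDF files, for example via `xr.open_dataset`.

```python
import numpy as np
import xarray as xr

# Hypothetical labels and a tiny grid; the real model domain is far larger.
conditions = ["cond_a", "cond_b", "cond_c"]
surfaces = ["existing", "option_1", "option_2", "option_3"]
timesteps = list(range(9))
y = np.linspace(0.0, 400.0, 5)  # metres, placeholder northing
x = np.linspace(0.0, 500.0, 6)  # metres, placeholder easting

dims = ("condition", "surface", "timestep", "y", "x")
shape = (len(conditions), len(surfaces), len(timesteps), len(y), len(x))
rng = np.random.default_rng(0)  # stand-in for model output

ds = xr.Dataset(
    {
        "depth": (dims, rng.uniform(0.0, 20.0, shape)),
        "dissipation": (dims, rng.uniform(0.0, 1.0, shape)),
    },
    coords={
        "condition": conditions,
        "surface": surfaces,
        "timestep": timesteps,
        "y": y,
        "x": x,
    },
)

# Pick one scenario by label, then the grid cell nearest a point of interest.
scenario = ds["dissipation"].sel(condition="cond_a", surface="existing")
point = scenario.sel(x=210.0, y=90.0, method="nearest")

print(point.dims)    # ('timestep',)
print(point.values)  # dissipation through time at that cell
```

Selecting by label rather than by file name or menu click is what turns thousands of manual extractions into one scripted query.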

Another package that was really helpful was Cartopy. Cartopy places those modeling outputs in space. So we can see where the energy dissipation, those hydrodynamic modeling variables I was talking about earlier, we can see where they're happening. So this is an export of those two things combined. The xarray multidimensional array over Cartopy, and that's over Clayton Beach. So you can see the depth just displayed. And I did that with a single script and a single click out of those 3,024 extractions.

So zooming in on this image, you can see the red dot on that right hand side was an area of high erosion risk. And so we extracted data from that point and then we could start plotting it and asking questions about it.

So the existing dredged surface, this is energy dissipation as wave height increases. So you can see that as time goes up, the wave height is increasing and the dissipation itself is increasing. So small waves do not carry a lot of energy. You can't really transport anything on them and you're not really going to face much of an erosion risk. But as those waves grow in height, they are transporting a lot of energy and that energy is being imparted into the sediments resulting in erosion.

So we knew that wave height was a problem. But what happens if we restore the surface? Now with a naturally restored and filled surface created by our engineers, we can see that with increasing wave height that dissipation itself is way lower. So we have some data now about what restoration we might be able to do.

And we can do that across any variable. That was for energy dissipation, but we can also do wave steepness. Wave steepness is what I was talking about earlier. As that wave starts to break, it trips over itself and high steep waves impart a lot of energy. So you can see on that left hand side, it's a little dark, but that dark level and the dark color is indicating that the natural breaking of waves as they hit the shoreline wasn't happening over that dredged area. The waves never felt the bottom. So instead, that high dissipation energy was happening right on shore. But when we restored the conditions, that breaking of energy is being dissipated across the whole beach and we are not having this high energy spot where the waves are breaking and becoming really steep.
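Wave steepness is conventionally the ratio of wave height to wavelength. The sketch below uses the deep-water wavelength from linear wave theory, L0 = g T^2 / (2 pi), which is a simplification made here: near shore the wavelength shortens as waves feel the bottom. The example wave height and period are made up, and the roughly 1/7 limit is the usual deep-water rule of thumb for breaking.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2
DEEP_WATER_BREAKING_LIMIT = 1.0 / 7.0  # rule-of-thumb maximum steepness


def deep_water_wavelength(period_s):
    """Deep-water wavelength from linear wave theory: L0 = g T^2 / (2 pi)."""
    return G * period_s ** 2 / (2.0 * math.pi)


def steepness(height_m, wavelength_m):
    """Wave steepness as height divided by wavelength."""
    return height_m / wavelength_m


# Hypothetical wave: 1 m high with a 4 s period.
wavelength = deep_water_wavelength(4.0)
s = steepness(1.0, wavelength)
is_near_breaking = s >= DEEP_WATER_BREAKING_LIMIT

print(round(wavelength, 2))  # 24.98
print(round(s, 3))           # 0.04
print(is_near_breaking)      # False
```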

Combining R and Python for analysis

So this is great and all, but I'm fundamentally an R user and I needed the power of ggplot as well. So Python helped me leverage that data from a spatial approach to a more temporal approach and just more flexibility of graphs altogether.

There's Clayton Beach zooming in on that spot. These are two transects that we extracted so we can replace time on the x-axis with geography on the x-axis. And we can see snapshots of time on a spatial approach.

So ggplot, which I love, this is those two profiles that we were looking at. So on the left-hand side, you can see the bathymetry of that dredged trench from those two extracted profiles. And that's showing offshore dredged. When we fill in that offshore dredged with our engineering surfaces, we can see now that the bathymetry is a lot more natural, how like a normal beach would look.

And now across those profiles, we can again extract anything we want from those. So this is energy dissipation again, and there's our two profiles. And on the top thing, on the top set of graphs here, we have the offshore, the existing beach surface. So you can see that that high peak of energy dissipation is happening really close to shore because on the x-axis, we have distance from shore. So right on those sensitive onshore structures, we're getting a really high energy dissipation. On the bottom, when we fill in the restored beach surface, that energy is shifted offshore. So we can never destroy the energy, but we can move it somewhere else.
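The comparison described here amounts to finding where along the transect dissipation peaks for each surface. The numbers below are invented purely to illustrate the calculation and are not the project's data.

```python
def peak_location(distance_m, dissipation):
    """Return (distance from shore, value) at the maximum of a transect profile."""
    i = max(range(len(dissipation)), key=dissipation.__getitem__)
    return distance_m[i], dissipation[i]


# Invented profiles along one transect, sampled every 10 m from shore.
distance_m = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
existing = [0.20, 0.90, 0.50, 0.30, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.00]
restored = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45, 0.50, 0.35, 0.20, 0.10, 0.05]

existing_peak = peak_location(distance_m, existing)
restored_peak = peak_location(distance_m, restored)
offshore_shift_m = restored_peak[0] - existing_peak[0]

print(existing_peak)     # (10, 0.9)
print(restored_peak)     # (60, 0.5)
print(offshore_shift_m)  # 50
```

A positive shift means the peak has moved away from shore, which is the pattern the restored-surface plots in the talk are described as showing.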

And that vertical black line that you're seeing is a reference point for those rocks from the image earlier, that old sidecast fill. So we can just actually see that the energy dissipation is being moved away. And this was the selling image for our client to convince them that natural restoration is an option.

Making results accessible with Quarto

Lastly, these kinds of results need to be accessible to a lot of different kinds of people. We have engineers, coastal scientists, government officials, and the coastal communities who are actually dealing with this shoreline change. So Quarto ended up being the perfect tool for that. I was able to use my R chunks and my Python chunks in addition to all of that extra scientific, hydrodynamic information that you need to have to give this kind of work context.

So it went from that raw version to a nicely rendered sheet that I could pretty much render indefinitely for multiple different audiences with very few clicks. And we were able to get a single Quarto document that had all of the information that we needed for any audience that wanted to look at it.
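One way to render a single Quarto document for several audiences is to pass a parameter on the command line. The sketch below only builds the commands and does not run them. The file name `report.qmd`, the `audience` parameter, and the audience labels are hypothetical; the talk does not say how the rendering was set up, and the document itself would need to declare and use the parameter.

```python
def quarto_render_command(qmd_path, audience, output_format="html"):
    """Build a `quarto render` command for one audience (not executed here)."""
    return [
        "quarto", "render", qmd_path,
        "--to", output_format,
        "-P", f"audience:{audience}",  # hypothetical document parameter
        "--output", f"report-{audience}.{output_format}",
    ]


# Hypothetical audience labels based on the groups named in the talk.
audiences = ["engineers", "scientists", "officials", "community"]
commands = [quarto_render_command("report.qmd", a) for a in audiences]

for cmd in commands:
    print(" ".join(cmd))
```

Each command could then be handed to `subprocess.run` on a machine where Quarto is installed.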

And everything is looking bright for this field of research because Python is now producing packages that can handle pre-processing as well as post-processing. This is output from a package that can render the computational grid automatically instead of doing it by hand, and making that grid by hand is one of the most time-consuming parts of hydrodynamic modeling. So these are tools to protect our shorelines for the next generation.

This is my daughter, Nadia, making a cameo in this. Introducing her to shoreline work early. But it just makes this previously inaccessible type of work more accessible to lots of different kinds of people and saves us a ton of time as well.

So thank you so much for listening to this. This is my LinkedIn if you have any questions. And I would love to talk to you guys more. Thank you so much for listening to my talk.

Q&A

All right, a couple of questions. How did you become interested in this data or in this field? And how do you see this work developing in the future?

So my background is in biochemical oceanography and metabolomics in a lab-based setting. And I honestly wanted to spend less time at sea. So I moved to a more coastal setting. And I've always been really interested in wave dynamics and shoreline protection from a natural perspective, watching the shorelines change over Puget Sound for the time that I've lived here.

So as for where I see this going, I just think that the packages that accompany this kind of modeling are just exploding. It's amazing what people are doing. And some classic packages like xarray and the Tidyverse are playing really nicely with some of the new ones that are doing really complex pre- and post-processing steps for hydrodynamic modeling. So I think there's a bright future for this kind of stuff.

I guess a follow-up to that is how much of this type of work needs to be done by domain experts? And how much can be done by a new data science graduate?

I think a lot of the post-processing is accessible to pretty much anyone who has a decent understanding of data structures and how they work. The modeling and the pre-processing require a lot of domain expertise to understand where to pull out data for erosion risk areas, and an understanding of sediment transport to use your resources and the cost of modeling effectively. This is a very small model. This model took maybe 10 minutes to run. I've worked on another that took six hours to run because it uses 3D dynamics and water shearing over time. So some of those pre-processing steps require domain expertise. But a lot of the analysis and the images that I was showing up there, I'm relatively new to R and Python. And I think you guys are probably more experienced. So I think it's really accessible.

I think this is the last question. How far in advance do engineers need to interfere with issues like this? And how do they keep up with all the different problems along the shore?

That is often up to the clients. The eternal problem with shoreline armoring is that erosion is a natural process, especially here in Puget Sound because of our sedimentary rock formations. So there is always a trade-off between natural restoration and what we call hard armoring or bulkhead installation. And engineers have to weigh some of the benefits and the pros and cons of hard armoring. One of the effects of hard armoring is the destruction of the habitat that feeds salmon, which has a really big effect on the salmon industry. So there are a lot of people and opinions at play. But in general, I would say it depends on the location. So anywhere from like a year to if you're in the North Cove location, you know, you have days to think about what you're going to do before the water starts reaching your house.

All right. Yeah, that's all of our questions. Let's thank Regina one more time.