
The Future of Water Quality in Coeur d'Alene Lake (2022)

Chapter: Appendix B: In-Lake Analysis Methodology for Metals, Nutrients, and Dissolved Oxygen

Suggested Citation:"Appendix B: In-Lake Analysis Methodology for Metals, Nutrients, and Dissolved Oxygen." National Academies of Sciences, Engineering, and Medicine. 2022. The Future of Water Quality in Coeur d'Alene Lake. Washington, DC: The National Academies Press. doi: 10.17226/26620.

Appendix B

In-Lake Analysis Methodology for Metals, Nutrients, and Dissolved Oxygen

This appendix details the committee’s methodological trends analyses of CDA Lake data on dissolved oxygen and nutrients (Chapter 5) and metals (Chapter 6).

DISSOLVED OXYGEN TRENDS ANALYSIS METHODOLOGY

The first step in the analyses of dissolved oxygen trends in Chapter 5 was screening out problematic raw data and then standardizing the analyses within each sampling station to maximize data comparability. When analyzing the vertical profile data, the committee started by screening out any data collected on sonde upcasts (see Chapter 8). An upcast occurs when the sonde is retrieved from the bottom of the profile to the top while recording data; it can be identified by comparing the time stamps of sonde readings to the corresponding depths. Upcast data are invalid because a probe pulled upward through the water column creates a void behind it, which is filled with water that has been turbulently mixed and displaced upward. Downcast data, in contrast, are collected as the probe is lowered through the water column, so the sensors located on the bottom of the probe encounter laminar conditions. As a result, upcast data are very noisy and offset relative to downcast data. In most cases, only a small portion of the profile data was discarded; however, many profiles collected during 2016 were taken as upcasts. Upcast data should not be used, and those collecting profile data from CDA Lake in the future should be warned not to collect vertical profiles in the reverse order.
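The upcast screen described above can be sketched as a simple check on whether recorded depth increases or decreases as the time stamps advance. This is an illustrative sketch with hypothetical inputs, not the committee's actual screening code:

```python
import numpy as np

def is_upcast(times, depths):
    """Flag a sonde profile as an upcast: depth trends upward toward
    the surface (i.e., decreases) as the time stamps advance."""
    order = np.argsort(times)       # put the readings in time order
    d = np.asarray(depths, float)[order]
    # An upcast starts deep and ends shallow; a downcast is the reverse.
    return d[0] > d[-1]

# A downcast: depth increases with time, so the profile is kept.
print(is_upcast([0, 1, 2, 3], [1.0, 10.0, 20.0, 30.0]))   # False
# An upcast: depth decreases with time, so the profile is discarded.
print(is_upcast([0, 1, 2, 3], [30.0, 20.0, 10.0, 1.0]))   # True
```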

The first goal when analyzing the profile data was to characterize seasonal and depth trends in the dissolved oxygen data collected at each station. Once the times and depths with the lowest oxygen concentrations were identified, the committee used this information to design its long-term trend analyses. To look for general seasonal/depth trends, the committee sorted all of the profile data by Julian date (i.e., days 1–365 within a year) and plotted the data collected in different years as a single annual time series. Because there was considerable year-to-year heterogeneity in sampling dates and even sampling depths, the committee binned these data into weekly and 1-meter intervals. During these preliminary analyses, it became evident that the script used to interpolate the data for these plots (rLakeAnalyzer1) created odd outcomes

___________________

1 https://cran.r-project.org/web/packages/rLakeAnalyzer/rLakeAnalyzer.pdf


in regions of the overall time-depth sampling matrix that were poorly represented by real data. For example, at station C1 the majority of the vertical profiles ended at a depth of 41 m, but a small number extended down to 42–44 m (n = 6 for the entire C1 dataset). Because the deepest profile extended to 44 m, the script estimated results for all dates down to this depth. In the vast majority of cases, the results for depths of 42–44 m were based on interpolations, and some of these interpolated values were suspiciously different from the observed data at adjacent depths. Therefore, the committee changed all of the 42–44 m depths in the dataset to 41 m to eliminate these strange interpolated values. Because most profiles collected at the C4 station ended at approximately 39 m, all of the observations below that depth at this station were rounded to this depth. Similarly, the deeper data collected from station C5 were rounded to 13 m, and the deeper data from station C6 were rounded to 11 m. The committee then used the R script (rLakeAnalyzer) to create annual time-depth dissolved oxygen isopleths for stations C1 and C4, where the most data were available, and for stations C5 and C6, where dissolved oxygen depletion was most evident. Because these plots are based on a calculated time-by-depth matrix of oxygen concentrations, they should not be biased by inconsistencies in the temporal and spatial frequency of sample collection.
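The depth-rounding step described above can be sketched as capping each station's recorded depths at that station's most common maximum sampling depth. The dictionary below simply encodes the caps given in the text; the function itself is an illustrative sketch, not the committee's code:

```python
import numpy as np

# Most common deepest sampling depth (m) for each station, per the text.
CAP_DEPTH = {"C1": 41, "C4": 39, "C5": 13, "C6": 11}

def cap_depths(station, depths):
    """Round the occasional deeper readings to the station's modal
    maximum depth, so the interpolation script is not forced to
    extrapolate from a handful of unusually deep observations."""
    return np.minimum(np.asarray(depths), CAP_DEPTH[station])

print(cap_depths("C1", [40, 41, 42, 44]))   # [40 41 41 41]
```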

Once the annual patterns in depth-averaged dissolved oxygen concentrations were characterized, the committee analyzed the long-term dataset for inter-annual trends at the same sites. In this case, the data were binned by 1-meter and monthly increments. As already noted, the occasional deeper sonde readings were rounded to the most common deepest sonde reading for each sampling station. When examining the long-term vertical profile datasets, it was also apparent that roughly 20 percent of sampling dates had a maximum sampling depth shallower than the most common sampling depth noted previously. For example, at the C1 station, 40 m was the most common deepest sampling depth, but in 21 percent of the profiles the deepest depth sampled was shallower. Missing 40-m data were common before 2010 and rare afterward. Because the script used to interpolate the depth profiles is quite sensitive to a missing deepest depth, the committee estimated the oxygen concentrations for 40 m using the next available depth, which in 75 percent of cases was either 38 or 39 m. The committee estimated the missing 40-m data by generating statistical relationships between the oxygen concentrations at lesser depths and at 40 m when data were available for both, and applying these equations to estimate the 40-m values when direct observations were not available. For example, when predicting the 40-m oxygen concentrations from the 35-m data, the fit between the two datasets was strong (r2 = 0.96), with the predicted dissolved oxygen concentration at 40 m being 0.74 mg L−1 lower when the observed concentration at 35 m was 7.0 mg L−1 (the lower end of observed values) and 0.17 mg L−1 lower when the observed concentration at 35 m was 12.0 mg L−1 (the upper end of observed values).
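The gap-filling regression described above can be sketched as an ordinary least-squares fit between paired 35-m and 40-m observations, applied on dates when the 40-m reading is missing. The paired values below are hypothetical illustrations, not the committee's data or fitted coefficients:

```python
import numpy as np

# Hypothetical paired DO observations (mg/L) from dates with both depths.
do_35m = np.array([7.0, 8.5, 9.5, 10.5, 12.0])
do_40m = np.array([6.1, 7.8, 9.0, 10.1, 11.8])

# Fit DO(40 m) as a linear function of DO(35 m).
slope, intercept = np.polyfit(do_35m, do_40m, 1)

def fill_40m(obs_35m):
    """Estimate a missing 40-m value from the 35-m observation."""
    return slope * obs_35m + intercept

estimated = fill_40m(9.0)   # used only when the 40-m reading is absent
```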

These calculations generated a series of vertical dissolved oxygen profiles interpolated at 1-meter increments from the surface to the most common deepest depth sampled for each month from January 1991 to April 2021 for stations C1, C4, and C5. These profiles were then used to generate monthly matrices of surface and bottom dissolved oxygen concentrations for all of the years for which data were available. For stations C1 and C4, which were deep, the upper average was for 1–10 m and roughly represented the surface mixed layer or epilimnion when CDA Lake is stratified. The bottom layer for these stations was the deepest 10 m for which data were available. Ideally, the bottom layer average would only represent the area immediately above the sediment–water interface because the presence or absence of oxygen at this interface determines the fate of many biogeochemical reactions in the lake sediments. However, the exact depth of this interface was not known when analyzing these data, and the availability of oxygen data at the deepest depths in these profiles was not consistent. At the C5 station, which is much shallower, the surface was represented by the 1–5 m average and the bottom was represented by the 14–17 m average. The dissolved oxygen data for the C6 station were analyzed slightly differently than described above.2 For this station, the surface averages were calculated from all data collected at depths ≤ 4 m, and the bottom averages were

___________________

2 C6 data were analyzed slightly differently because it was the first station analyzed (it had the lowest O2) and because it was shallow enough that using the R script to interpolate the data would not have saved time.


calculated from all data collected at depths ≥ 8 m (i.e., 8–11 m) without any data interpolation. When two profiles were available for the same month (as was often the case for June–November since 2011), these values were averaged.
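The layer averaging described above can be sketched for a deep station such as C1: the surface value is the mean over 1–10 m and the bottom value is the mean over the deepest 10 m with data. The profile used here is a hypothetical illustration, not lake data:

```python
import numpy as np

def layer_means(depths, do_mgL):
    """Surface mean (1-10 m) and bottom mean (deepest 10 m with data)
    for one interpolated monthly dissolved oxygen profile."""
    depths = np.asarray(depths, float)
    do_mgL = np.asarray(do_mgL, float)
    surface = do_mgL[(depths >= 1) & (depths <= 10)].mean()
    deepest = depths.max()
    bottom = do_mgL[depths > deepest - 10].mean()
    return surface, bottom

depths = np.arange(1, 42)                      # 1-41 m, as at station C1
profile = np.linspace(11.0, 3.0, len(depths))  # DO declining with depth
surface, bottom = layer_means(depths, profile)
```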

METHODS FOR IN-LAKE TREND STATISTICAL ANALYSIS

Locally Weighted Scatterplot Smoothing

One important feature of any method for evaluating long-term trends in water quality data is that the method must be able to reveal the presence of non-monotonic trends. As discussed in Chapter 3 and Appendix A, for the river data, some of the constituents of interest (specifically, lead and phosphorus) at some sites, when characterized using the Weighted Regressions on Time, Discharge, and Season method, revealed increasing trends over a decade or two, followed by decreasing trends in the most recent decade. Given that pattern in river inputs to the Lake, it is reasonable to assume that similar non-monotonic trends are possible for lake chemistry data; hence, the committee used locally weighted scatterplot smoothing.

The first step in processing the data was to plot them and determine the date at which the detection limits appear to be relatively stable and the frequency of data collection is generally monthly or better, including both surficial and deeper water. All of the datasets considered here end at the end of calendar year 2020, but the starting year varies from as early as 1991 to as recent as 2004 (record lengths are given in the tables associated with these results). The data were separated into deeper water and shallower water. For C1 and C4 (sampled by the Idaho Department of Environmental Quality), the definition of deep is > 21 meters. For C5, C6, and SJ1 (sampled by the CDA Tribe), the shallower category is denoted as euphotic zone or epilimnion (typically with sampled depths of 6 to 10 meters); other samples are referred to as bottom water, and their sampling depths are not recorded in the database. Censored data (observations reported as “less than” some reporting limit) were treated as values equal to half the reporting limit. If the number of censored values were large, this would be an inappropriate method, but in these datasets the percentage of censored data points is rather low, so this approach makes little difference (and is certainly preferable to deleting the data or treating them as values of zero or of the reporting limit). Of the 19 datasets evaluated here, 6 had no censored data, the median percentage of censored data was 0.7 percent, and the maximum was 5 percent. The metals datasets from C6 and SJ1 had at least 52 percent of their data censored, and most had 94–100 percent censored; hence, they were eliminated from consideration.
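The censored-value substitution described above can be sketched as replacing each “less than” report with half its reporting limit while passing uncensored values through unchanged. The input format (strings like "<0.5") is a hypothetical illustration of how censored reports might be stored:

```python
def substitute_censored(values):
    """Replace censored reports such as '<0.5' with half the reporting
    limit; leave numeric (uncensored) values unchanged."""
    out = []
    for v in values:
        if isinstance(v, str) and v.startswith("<"):
            out.append(float(v[1:]) / 2.0)   # half the reporting limit
        else:
            out.append(float(v))
    return out

print(substitute_censored([1.2, "<0.5", 0.8]))   # [1.2, 0.25, 0.8]
```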

Locally weighted scatterplot smoothing uses the loess function in R (based on Cleveland et al., 2017) to show the change in concentration as a function of time. It was applied separately to the shallow and deep datasets. The smoothing parameter (called the “span”) was set to its default value of 0.75 and the fitting used a second-degree polynomial. The data were pre-processed by computing a median value for each day in the record, for both the shallow and deep water datasets, because there are often replicate samples taken on any given day. The fitting was applied to the logarithms of these daily median concentration values, and, as such, the fitted line is an estimate of the median of the data (rather than the mean). The reason the logarithms are used is to avoid oversensitivity to some of the extreme high values and to make the model residuals more nearly normal. The idea of this approach (loess on the logarithms) is to show the central tendency of the data as it changes over time.
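The smoothing approach above (loess on the logarithms of daily median concentrations) can be sketched with a minimal locally weighted regression. Note the simplifications: this sketch fits a local first-degree polynomial, whereas the committee's R loess fit used second degree, and the data are synthetic. It illustrates the idea, not the committee's implementation:

```python
import numpy as np

def loess_log_median(t, conc, span=0.75):
    """Tricube-weighted local linear fit to log10(concentration),
    evaluated at each observation time; back-transforming the fit
    yields an estimate of the median concentration over time."""
    t = np.asarray(t, float)
    y = np.log10(np.asarray(conc, float))
    n = len(t)
    k = max(2, int(np.ceil(span * n)))    # points in each local window
    fit = np.empty(n)
    for i, t0 in enumerate(t):
        d = np.abs(t - t0)
        h = max(np.sort(d)[k - 1], 1e-12)            # local bandwidth
        w = np.clip(1.0 - (d / h) ** 3, 0.0, 1.0) ** 3  # tricube weights
        b, a = np.polyfit(t, y, 1, w=np.sqrt(w))     # weighted linear fit
        fit[i] = b * t0 + a
    return 10.0 ** fit                    # back to concentration units

# Synthetic record: a slow exponential decline plus lognormal noise.
years = np.linspace(1991, 2020, 60)
noise = np.exp(np.random.default_rng(0).normal(0, 0.05, 60))
conc = 50.0 * np.exp(-0.02 * (years - 1991)) * noise
smoothed = loess_log_median(years, conc)
```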

Looking at the plotted data points, particularly from 2017 to 2020, many of the datasets suggest very steep downward trends in these final years. By design, loess curves avoid abrupt changes in slope and tend to de-emphasize rapid changes in values near the start or end of the dataset. One could apply a loess smooth with a substantially narrower smoothing window (lower span value). This would show the downward trends in the more recent years as being substantially steeper, but it would be difficult to argue that the many oscillations that would appear are realistic interpretations of changes in the system. The steep downward trend in the final years of some of these datasets is rather unexpected. Reevaluating the data after another one or two years have gone by will be critical to evaluating how the system is changing. The amounts of


change presented in the subsequent tables are based on these loess curves and are an average of the changes in the shallow and deep waters. For example, “Change over past 10 years” is computed as follows:

Change over past 10 years = (Δs + Δd) / 2, with Δs = 100 × [cs(tm) − cs(tm − 10)] / cs(tm − 10) and Δd = 100 × [cd(tm) − cd(tm − 10)] / cd(tm − 10)

where:

Δs and Δd are the change in percent for the shallow and deep water, respectively;

cs(tm) is the loess estimate of median concentration for the shallow data at time tm, which is the maximum of the time variable in the loess model expressed in decimal years (typically about 2020.95, near the end of calendar year 2020);

cs(tm − 10) is the estimated median at a time 10 years before tm; and

cd values are the loess estimates for the median concentration for the deep water data.

The loess model results are also used to provide estimates of the median at the end of 2020 for both the shallow and deep water data, in order to quantify differences among sites and between the shallow and deep water data.
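The tabulated change can be sketched directly from the smoothed series. The loess estimates below are hypothetical numbers chosen for illustration:

```python
def change_over_10yr(c_shallow_now, c_shallow_then, c_deep_now, c_deep_then):
    """Average of the shallow and deep percent changes over the
    10-year window, following the equation given above."""
    delta_s = 100.0 * (c_shallow_now - c_shallow_then) / c_shallow_then
    delta_d = 100.0 * (c_deep_now - c_deep_then) / c_deep_then
    return (delta_s + delta_d) / 2.0

# Hypothetical loess medians at times tm and tm - 10 (e.g., ug/L zinc).
print(change_over_10yr(30.0, 40.0, 45.0, 50.0))   # -17.5
```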

Seasonal Kendall Test

The second approach uses the Seasonal Kendall test (Hirsch et al., 1982), which is widely used in environmental science applications for datasets that are expected to exhibit seasonality. The test is based on comparisons of data for each season (typically a month) from each of the years in the record and does not make comparisons across seasons. For example, it compares June values from one year to June values from another year but not June values to October values. It assumes that the probability distribution for June is different from the probability distribution for October. It is a non-parametric test, which means that it requires very minimal assumptions regarding the data. Specifically, no probability distribution is assumed for the data, no particular seasonal pattern (such as a sine wave) is assumed, and the trend is not assumed to be linear with time. However, the test is only a test for monotonic trends (meaning that a dataset with a trend that looks like a letter “U” or an inverted “U” would not be seen as having a trend). It also is not sensitive to substantial differences in the amount of trend in different seasons. For example, if six months showed upward trends and six months showed downward trends, the result might indicate that there is no trend in the record. Censored data create no particular problems for the test provided the reporting limits do not change over time (censoring is minor here, and the reporting limits do not change greatly over time). The test is performed separately on the median value for each sampled day, both for the deep water dataset and the shallow water dataset.
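The mechanics of the Seasonal Kendall test can be sketched as follows: compute the Mann-Kendall S statistic and its null variance within each season (block), sum both across blocks, and apply the normal approximation. This is a bare-bones sketch (no tie correction, synthetic data), not a substitute for the EnvStats implementation the committee used:

```python
import math
import numpy as np

def mann_kendall_s(x):
    """Mann-Kendall S statistic and its null variance (no tie correction)."""
    x = np.asarray(x, float)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    return s, var

def seasonal_kendall(blocks):
    """Sum S and Var(S) over blocks (season-by-depth-group subsets) and
    return the overall S with a two-sided normal-approximation p value."""
    s_tot = var_tot = 0.0
    for x in blocks:
        s, var = mann_kendall_s(x)
        s_tot += s
        var_tot += var
    z = 0.0 if s_tot == 0 else (s_tot - np.sign(s_tot)) / math.sqrt(var_tot)
    return s_tot, math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p

# Two synthetic "blocks" (e.g., June shallow and June deep), both declining.
june_shallow = [9.0, 8.5, 8.2, 7.9, 7.1, 6.8, 6.0, 5.5]
june_deep = [6.0, 5.8, 5.1, 4.9, 4.2, 4.0, 3.5, 3.1]
s, p = seasonal_kendall([june_shallow, june_deep])
```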

The implementation used in this analysis (the kendallSeasonalTrendTest in the EnvStats package in R) is documented in Millard (2013). The purpose for using it here is simply to evaluate if the observed trend evidence is stronger than what we might expect by chance alone if the true conditions were totally without trend. The results are categorized in the tables by colors, where the deep blue (for downward trends) and deep red (for upward trends) colors indicate trends that are statistically highly significant (p < 0.01). The lighter shades of blue and red indicate trends for which the p values lie between 0.10 and 0.01 (moderate statistical significance). Those trends that do not attain a p < 0.10 are shown with no shading.

Commonly, when the Seasonal Kendall test is used, the seasons identified are equivalent to the calendar months. This application departs from that in two ways. The first departure arises because sampling in the months of January and February is rather limited. Thus, data from those two months were grouped into a single two-month period while the rest of the year the periods are equivalent to the calendar months. The other departure is based on the recognition that there is typically a difference in the distributions of the data between the shallow and the


deep data. This can be seen by the rather regular separation of the red and blue curves on some of the loess plots. As such, the “seasons” used here are actually defined by a particular depth group and month of the year. So, in this application, for each site and analyte pair, the number of “blocks” of data is 22. (In statistics, a “block” is a subset of the full dataset within which comparisons are made, but comparisons between blocks are not made.) The blocks are a combination of the 11 time periods and the two depth groups. The Seasonal Kendall test then performs a Mann-Kendall trend test on each of the 22 blocks of data and sums these results to form an overall test statistic, and the variance of that test statistic under the null hypothesis (the null hypothesis is that all of the 22 blocks are trend free) is the sum of the variances for each of the 22 blocks. The p-value for the test is based on the overall test statistic and its variance. It has been shown that this test statistic is well approximated by a normal distribution under the null hypothesis, regardless of the underlying distribution of the data.

The Seasonal Kendall test for each dataset is done twice: the first is on the full duration of the dataset and the second is on the last 10 years of the dataset (2011–2020). The Seasonal Kendall test does not directly inform us as to differences among the trends for the different blocks. For example, there may be trends in some months and not in others, and there could even be positive trends in some months and negative trends in others. These types of comparisons can all be made from the full set of outputs, but they were not evaluated or reported here. As part of a long-term program of trend studies to be carried out on the Lake, these kinds of differences would be a topic of interest.

In the analyses of lake dissolved oxygen (Table 5-10) and lake chlorophyll (Table 5-11), the Seasonal Kendall test was used. In the case of dissolved oxygen, the test was run on all 12 months but only for the deep water values. The slopes are based on the Theil-Sen slope estimator on the mean concentration for that month. Trend slopes are expressed as mg/L/yr, and the color codes are based on the significance of the trend for that month. These results are then summarized by the Seasonal Kendall test, and the overall slope is computed as the median of all pairwise slopes computed over all 12 months. These trend results were computed using the rkt function in the rkt R-package. The same approach was used for the shallow water chlorophyll data, except that the seasons were collapsed to a total of 10. That is, because data are very scarce in the months of November through February, November and December were combined into a single season, January and February were combined into a single season, and each of the eight remaining months was treated as its own season. Here again, the rkt function was used to analyze the data.
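The Theil-Sen estimator named above can be sketched as the median of all pairwise slopes in a record. The annual values below are hypothetical, not lake observations:

```python
import numpy as np

def theil_sen_slope(years, values):
    """Theil-Sen estimator: the median of the slopes computed between
    every pair of observations in the record."""
    t = np.asarray(years, float)
    y = np.asarray(values, float)
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i in range(len(t)) for j in range(i + 1, len(t))]
    return float(np.median(slopes))

# Hypothetical deep-water June DO means (mg/L) over five years.
print(theil_sen_slope([2016, 2017, 2018, 2019, 2020],
                      [6.0, 5.7, 5.6, 5.2, 5.0]))   # slope in mg/L/yr
```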

REFERENCES

Cleveland, W. S., E. Grosse, and W. M. Shyu. 2017. Local regression models. Pp. 309–376 in Statistical Models in S. Routledge.

Hirsch, R. M., J. R. Slack, and R. A. Smith. 1982. Techniques of trend analysis for monthly water quality data. Water Resour. Res. 18(1):107–121. doi:10.1029/WR018i001p00107.

Millard, S. P. 2013. Hypothesis Tests. In: EnvStats. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8456-1_7.



