This workshop session focused on strengths and weaknesses of current modeling efforts for hog production: state-space models and the new approach to modeling of biological processes. An example of a state-space model is the Kalman filter model (KFM), while the new approach, presented in Chapter 7, is the Sartore, Wei, Abayomi, Riggins, Corral, Sedransk (SWARCS) model. Formal discussants were Katherine Ensor (Rice University) and Christopher Wikle (University of Missouri). The session included open discussion. The moderator was Eric Slud (University of Maryland and U.S. Census Bureau).
Ensor acknowledged the difficulty of fully understanding the challenges of the project and how all the pieces fit together. She pointed to promising results from the state-space modeling perspective. State-space models allow prior information to be brought in and are frequently computationally manageable. She supported Andrew Lawson’s suggestion (see Chapter 8) about developing some type of switching between endemic and epidemic time periods. This approach allows the dynamics of the model to change, for example by using hidden Markov models. She also observed that collaboration with the Animal and Plant Health Inspection Service (APHIS) might help the National Agricultural Statistics Service (NASS) develop an epidemic model to use with a switching approach.
She added that she strongly favors a more model-based approach because of apparent data limitations. She asked how many time points are used for fitting the models that have been described. She reported that the more empirically based approaches using LASSO and seasonal time series models will be very unstable unless a long time series is used in estimation. Understanding and measuring uncertainty are also very important.
Wikle said he appreciated the complexity of the problem and praised the modeling efforts described. He said the problem reminded him of new work in ecological modeling called integrated population models (IPMs), which take many different data sources in a population-based model that accommodates biological dynamics (referred to as constraints by NASS) in a spatial setting. IPMs seem to provide an ideal framework for NASS, he suggested. They are often fully Bayesian, and there are frequentist versions.
Wikle also said NASS would benefit by incorporating spatial considerations into its modeling (see Chapter 10 for further elaboration). He expressed interest in modeling in both space and time and remarked that combining them requires more thought than back-engineering a time series model with spatial items. He observed that dynamics occur on scales that are not currently incorporated in NASS modeling. NASS is modeling biological constraints in the data, which is important, but the real biological dynamics occur on an individual and maybe even a herd scale. He encouraged thinking about the dynamics and where they occur.
He added that while most users care about point estimates, a coherent decision-making process requires that the model produce a point estimate that has quantifiable uncertainty. That means that an estimate is needed of the reliability of the point estimate. Reliability may need to be estimated through simulation or microsimulation. This is especially important if parameters are re-estimated at every time period, he stressed.
Ensor agreed, noting a concern that parameter uncertainty estimates have not been examined. Such estimates provide information about how well the model is working. In answer to her question about how many data points are used to fit models, Luca Sartore replied that the SWARCS model uses the entire time series of monthly and quarterly data from 2008 to the current quarter to prepare model estimates for the current quarter. His approach has so many parameters to estimate that the number of data points is not sufficient to get stable estimates. He used the LASSO approach with penalties to make the process stable. Ensor replied that using the whole history eliminates the opportunity to capture dynamics.
Sartore elaborated that the sequential generalized linear model uses a 4-year moving window and the KFM uses the entire time series, as his model does. Ensor urged the importance of recognizing the number of data points in a model and how they are used. Some of these methods are highly variable in their estimation. Even 10 years of data, which amounts to either 40 quarterly or 120 monthly data points, is quite small from a time series perspective, she said. However, if the model can capture the dynamics, she would favor a switching model or a model with dynamic parameters.
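The stabilizing role of the LASSO penalty mentioned in this exchange can be illustrated with a minimal sketch. It is a generic coordinate-descent LASSO for a linear regression, not the SWARCS estimation procedure; the function names `soft_threshold` and `lasso_cd`, the objective scaling, and the toy design matrix are assumptions made for illustration only.

```python
from typing import List, Sequence


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X: Sequence[Sequence[float]], y: Sequence[float],
             lam: float, n_iter: int = 200) -> List[float]:
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Hypothetical helper for illustration; no intercept, no standardization.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes b[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_b
    return b


# Toy data (made up): the second predictor has only a weak relation to y.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 0.1, 2.0, 0.1]
coef_penalized = lasso_cd(X_toy, y_toy, lam=0.1)
coef_unpenalized = lasso_cd(X_toy, y_toy, lam=0.0)
```

In this toy case the penalty sets the weak coefficient exactly to zero and shrinks the strong one, which is how a penalty can reduce the effective number of parameters when the series is short. The price is shrinkage bias, and the penalty weight `lam` still has to be chosen from the same limited data.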
Ensor and Wikle agreed that pursuing state-level models over time and rolling them up to the national level (bottom-up) might give better results than the current top-down approach. Ensor noted that it is important to recognize that because of data limitations, the bottom is the state and not the producers. Lee Schulz asked about the potential implications of a bottom-up approach given that external slaughter data, not available at the state level, are used to correct for biases in the survey data. Ensor agreed that this poses a constraint.
Slud asked about auxiliary information available at the state level, noting pork check-off data as one source. Ron Plain explained the mandatory check-off program. When animals go to slaughter, the packers make an assessment of 0.4 percent of market value. That money goes to the National Pork Board, part of which is allocated to the state where the animal was raised. The Pork Board publishes those data monthly. The data show how the dollars are allocated and how many hogs come from each state. They look like very good data when compared to the national-level slaughter data, he said. However, one issue is that many pigs change states: For example, they are born in North Carolina and are moved to another state to be fed and raised, but ownership does not change. If there is no ownership change, then no check-off is assessed.
Ensor asked whether the problems with the check-off data are known well enough so that they might be useful in modeling. Slud asked about other data that might illuminate the shortcomings, for example, survey data on state transfers of hogs for raising or state-to-state transfer information. Plain responded that in the past, state veterinarians’ offices had information on the shipment of hogs for nonslaughter purposes. The data were based on health certificates needed for interstate hog transfer reported to the state departments of agriculture. Dan Kerestes noted that NASS makes use of veterinarian data. He reported that the quality of those data varies among states and has deteriorated over the years, but the data are used to prepare production and disposition reports to account for pigs in every state and come up with cash receipts.
Nancy Kirkendall summarized discussion about check-off data that the committee had before the workshop, asking Plain to provide any needed corrections. She said that the check-off data report sales in three categories of hogs. The most relevant for current modeling purposes might be sales of market hogs because those animals are most likely intended for slaughter. That category of sale might be correlated with slaughter data by state for that size hog, possibly with a relatively short lag between the sale and the slaughter. As with any potential new data to be used in modeling, there would need to be some evaluation of correlations to see how the data might be useful.
Matthew Branan said APHIS works with the veterinarian data to provide an indication of animal movement. It also collects some information from National Animal Health Monitoring System studies. These are smaller than the National Animal Health Reporting System studies, so aggregate information is limited to regional or national-level groupings, such as a slaughterhouse-based level estimate rather than a state-level estimate.
Ensor asked about opportunities for a spatio-temporal modeling approach with simultaneous estimation of state and national information. Branan recognized the value, but noted that supporting data may not be available. The goal of the animal disease traceability program is to get that level of information, but for now modeling may be restricted by the data that are available. Ensor noted that with state-level state-space models, estimate uncertainties could be used as variance estimates in the measurement equation. At the national level, having the dynamic hidden Markov model switching between endemic and epidemic, and building up the epidemic side with information from APHIS, would likely provide value. Young said use of the KFM would require adaptation to give state-level estimates. Ensor suggested NASS explore borrowing strength, looking at space-time and incorporating uncertainty into the modeling.
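Ensor's point about using estimate uncertainties as variances in the measurement equation can be illustrated with a minimal sketch. It assumes a local-level (random walk plus noise) state-space model for a single state's series, a simplification chosen for illustration and not the KFM used by NASS; the function name `local_level_filter`, its parameters, and all numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def local_level_filter(
    y: Sequence[float],
    obs_var: Sequence[float],
    state_var: float,
    m0: float,
    p0: float,
) -> Tuple[List[float], List[float]]:
    """Kalman filter for y_t = x_t + e_t and x_t = x_{t-1} + w_t.

    obs_var[t] is Var(e_t), assumed here to be the sampling variance
    reported with the survey estimate y[t]; state_var is Var(w_t).
    Returns the filtered means and variances of x_t.
    """
    if len(y) != len(obs_var):
        raise ValueError("y and obs_var must have the same length")
    means: List[float] = []
    variances: List[float] = []
    m, p = m0, p0
    for y_t, r_t in zip(y, obs_var):
        # Prediction step: the random-walk state keeps its mean, gains variance.
        p_pred = p + state_var
        # Update step: a noisier survey estimate (larger r_t) gets less weight.
        gain = p_pred / (p_pred + r_t)
        m = m + gain * (y_t - m)
        p = (1.0 - gain) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances


# Made-up quarterly estimates with made-up sampling variances.
est = [100.0, 104.0, 97.0, 101.0]
est_var = [4.0, 4.0, 25.0, 4.0]  # the third quarter is much less precise
filt_mean, filt_var = local_level_filter(est, est_var, state_var=1.0, m0=100.0, p0=10.0)
```

Because the measurement variance changes from period to period, an imprecise survey estimate moves the filtered state less than a precise one, and the filtered variance gives the quantified uncertainty that the discussants asked for.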
Slud noted that state-level estimates and borrowing strength would be discussed further later in the workshop (see Chapter 10). He observed that switching could be incorporated in several ways. First, if an indication came in about a shock, it could be used to switch models. It could also be used as a covariate in a generalized linear model framework with other covariates. In either case, he said, past data from regions where that indicator variable was present would be needed to estimate the parameters of the model. That situation suddenly knocks down the length of the time series available for estimation. How to get enough data for estimation is a puzzle, he commented.
Sartore agreed, adding that the Markov switching model might also be useful to directly model the monthly estimates for pig crop and sows farrowing and also for modeling survival rates. Survival rates depend on the disease, when the outbreak starts, and how long it lasts. This situation is more complex at the state level because it depends on which states are near each other, where the disease starts, and how it spreads. All these can be modeled by the hidden Markov model, but the underlying process is more complex than is visible from the survey data.
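The switching idea discussed here can be illustrated with a minimal sketch of forward filtering in a two-regime hidden Markov model, where the regimes stand for endemic and epidemic periods. This is a generic textbook filter with Gaussian emissions and known parameters, not a model that NASS or the discussants specified; the function names, the transition probabilities, and the survival-rate values are all hypothetical.

```python
import math
from typing import List, Sequence


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filter_regimes(
    obs: Sequence[float],
    trans: Sequence[Sequence[float]],
    means: Sequence[float],
    sds: Sequence[float],
    init: Sequence[float],
) -> List[List[float]]:
    """Return P(regime at t | observations up to t) for each time t.

    trans[i][j] is the probability of moving from regime i to regime j;
    init is the regime distribution for the first observation.
    """
    n = len(init)
    probs = list(init)
    filtered: List[List[float]] = []
    for t, x in enumerate(obs):
        if t > 0:
            # Predict: push last period's probabilities through the transitions.
            probs = [sum(probs[i] * trans[i][j] for i in range(n)) for j in range(n)]
        # Update: weight each regime by how well it explains the observation.
        weights = [probs[j] * normal_pdf(x, means[j], sds[j]) for j in range(n)]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("observation has zero likelihood under every regime")
        probs = [w / total for w in weights]
        filtered.append(probs)
    return filtered


# Made-up survival rates: regime 0 = endemic, regime 1 = epidemic.
survival = [0.95, 0.94, 0.80, 0.78]
regime_probs = filter_regimes(
    survival,
    trans=[[0.95, 0.05], [0.20, 0.80]],
    means=[0.95, 0.80],
    sds=[0.02, 0.02],
    init=[0.9, 0.1],
)
```

The sketch treats the regime parameters as known. In practice they would have to be estimated, and the epidemic regime would be informed by few past periods, which is the data-scarcity problem Slud raised.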
Slud observed that Sartore’s model is already over-parameterized and adapting to shocks would result in a very small training dataset. Sartore agreed, saying that NASS would like to keep the model and estimation frequentist, which makes it difficult to account for missing information. He said perhaps a Bayesian approach could help because prior information from experts, the literature, or past time periods could be used. He said that the question is how to get stable, viable estimates quickly. Slud said although his inclination was to start with a frequentist model, this discussion made him consider Bayesian approaches.