Chapter 2 lays out a vision for the National Agricultural Statistics Service (NASS) in 2025 that includes evolving the role of the Agricultural Statistics Board (ASB) from integrating multiple data sources through a process that appears subjective to one of evaluating estimates prepared through a statistical model-based integration of these alternative information sources (Recommendation 2-1), preparing its county estimates using a transparent and well-documented process (Recommendation 2-2), and developing and publishing measures of uncertainty along with point estimates (Recommendations 2-3 and 2-4). Taking these steps will bring NASS into compliance with Standard 4.1 of the statistical standards promulgated by the U.S. Office of Management and Budget (2006). To adhere to these standards, NASS will need to develop, evaluate, and use statistical models; formulate a plan for ongoing evaluations to assess model and survey methodologies; and provide documentation on its website concerning its methodologies for developing county-level estimates (Recommendation 2-6).
Foundational for achieving this vision is for NASS to have a georeferenced list frame that can be kept up to date using available administrative information, that will support improved use of administrative data in NASS’s survey operations, and that will provide the geospatial information needed to support farm- and unit-level modeling. Development of a georeferenced list frame will be accomplished most expeditiously if NASS adopts the Common Land Unit (CLU), the geospatial convention already in use by the Farm Service Agency (FSA) and the Risk Management Agency (RMA), as its basic spatial unit (Recommendation 2-8). NASS also will need to be prepared to maintain alternative geospatial field boundary data, such as those from precision agriculture or the resource land units (RLUs) approved by RMA, in its databases (Recommendation 2-9). NASS will need to identify the CLUs that make up each NASS farm, which will likely require changes to the structure of the list frame to accommodate the georeferenced CLUs (Recommendation 2-10). NASS also will need to determine how to collect or identify CLU (or equivalent) data for farms that are not on FSA or RMA lists.
RECOMMENDATION 5-1: The National Agricultural Statistics Service should undertake a staged, systematic effort to implement the vision presented in Chapter 2 of this report.
Achieving this vision will take many years and a focus on achieving results. Senior NASS leadership will need to adopt the vision and clarify the overall goal. They will need to identify one or more champions who can identify others to be part of the process and can promote the importance of evolving to achieve the vision.
This chapter breaks down the vision and the panel’s recommendations into projects that could be accomplished by different groups of people within two stages of effort. This detail is provided to assist senior leadership in their planning and staff in better understanding the panel’s recommendations.
First-stage projects can start now. They include enhancing liaisons both within and outside the U.S. Department of Agriculture (USDA): within to leverage each agency’s unique areas of expertise, and outside to gain access to precision agriculture data in a way that reduces the burden on farmers and enhances accuracy (Recommendations 3-3, 3-4, 3-5, and 3-6). First-stage projects further include working to enhance, document, and use the current cash rents model (Recommendations 4-2 and 4-3); developing useful small-area models for acreage and yield; planning for the inclusion of CLUs and equivalent geospatial information (such as RLUs and boundary information from precision agriculture) in the Enhanced List Management Operations (ELMO) database; and linking FSA and RMA data to the NASS list frame (Recommendation 2-10).
Second-stage projects can be accomplished once sufficient progress has been achieved in linking FSA and RMA farms with the farms in the NASS list frame. These projects include adding a new Internet reporting option for linked FSA and RMA farms to take advantage of administrative data and reduce reporting burden (Recommendation 2-11), enhancing surveys to make use of administrative data in imputation (Recommendation 2-12) and estimation (Recommendation 2-13), and expanding the modeling effort to consider farm-level or unit-level models.
The goal of this chapter is to help NASS see how the panel’s vision can be realized. Moving to make this vision a reality will present many challenges for NASS, but the adoption of new ways of doing business is necessary to maintain the agency’s credibility and promote efficiency in its operations.
Project 1: Collaboration
NASS is a statistical agency within USDA, a department including many other agencies with expertise, data, and skills that complement those of NASS. NASS staff who collaborate on intra-USDA agency projects can identify ideas and models (broadly construed) that ultimately might improve NASS methods. It is likely, for example, that NASS cannot accomplish all the remote sensing work on its own and would benefit from an exchange of ideas with other geospatial experts within USDA. NASS modelers would similarly benefit from consultation and collaboration with modelers from sister agencies. NASS needs to collaborate broadly, especially with other USDA agencies that have substantial expertise in remote sensing and modeling, such as the Economic Research Service (ERS), the Agricultural Research Service (ARS), the Foreign Agricultural Service, RMA, FSA, and the Natural Resources Conservation Service. Specific steps to this end include the following:
- ERS and NASS administrators now meet monthly, and they could add approaches to achieving the vision to their agenda. ERS has extensive modeling and geographic information systems (GIS) capability that could be invaluable to NASS, while improvements in NASS data and estimates would benefit ERS, other agencies in USDA, and other data users.
- NASS leadership could encourage ERS to propose and establish joint projects that might lead to improved matching of the NASS list frame to FSA and RMA farms.
- NASS leadership could encourage ERS to extend its analysis of farm values (Nickerson et al., 2012) to include similar research on the impact of parcel-specific variables on cash rents so as to identify variables that might improve models (Recommendation 4-1).
- NASS leadership could establish working relationships between modelers within NASS and ERS modelers to facilitate skills enhancement.
- RMA and NASS could collaborate on sharing of data (Recommendation 3-1), on sharing of ideas and methodology for modeling, and on NASS’s collection of data from precision agriculture.
- RMA and NASS could consult concerning access to individually identifiable data and how NASS will protect confidential data (Recommendation 2-5).
- NASS could collaborate with RMA actuaries and statisticians to learn how they have accounted for heteroscedasticity and incorporated soil variability and geographic differences into their estimates of yield. The intent would be to see whether those methods could improve NASS’s approaches to modeling.
- NASS could consult with RMA and the Approved Insurance Providers (AIPs) on the development of an option for reporting of planted acres, failed acres, and production through use of precision agriculture.
- FSA and NASS could collaborate on sharing data and CLU identification.
- FSA and NASS could agree on how FSA data will be protected (Recommendation 2-5).
- NASS could consult with FSA on the determination of CLUs for farms that are not in the FSA data system.
- NASS could collaborate with FSA concerning definitions of CLUs, both to understand current definitions and procedures and potentially to influence FSA to adapt CLUs to NASS’s needs.
- NASS could consult with FSA on the development of an option for reporting of planted acres, failed acres, and production through use of precision agriculture.
ARS, ERS, and NASS could collaborate on the development and use of a crop model that can provide the most relevant variables for predicting yield of a given crop in a given county (based on soil, precipitation, growing degree days, etc.). The panel views this model as being similar to the Versatile Soil Moisture Budget (VSMB) used by Statistics Canada to prepare agroclimatic data for use as input to Statistics Canada’s Integrated Canadian Crop Yield Forecaster (ICCYF) (see Chapter 3). The VSMB is a yield model developed by Agriculture and Agri-Food Canada and modified within Statistics Canada. The model synthesizes soil, weather, and precipitation data from many ground weather stations in Canada. It is used in preparing estimates for crop- and Canadian Census Agriculture Region–specific variables, such as growing degree day (GDD) accumulation, growing season precipitation accumulation, water stress indices, and other variables scientifically recognized as being potentially useful for explaining crop growth through the growing season. A model such as the VSMB could prove very helpful to NASS in identifying and preparing relevant independent variables that could be used to improve both remote sensing models and area-level models for yield. NASS likely could obtain, use, and/or adapt such a model from work done by other agencies within USDA, such as ARS. The panel is not aware of NASS’s use of data from ground-based weather stations and ARS flux towers, which represent one approach for acquiring the information in a form that would be most useful as direct input to NASS modeling efforts, both for remote sensing yield indications (where augmenting the current methodology for yield estimation with new input variables may be quite simple) and for including weather- and soil-specific variables in area models.
NASS also will need to collaborate outside of USDA to keep abreast of emerging data sources (Recommendation 3-9)—possibly with Statistics Canada and/or other international agencies on the development of a crop-specific model, described above as a collaboration with ARS, and with independent software vendors—to achieve the goal of developing a precision agriculture reporting option (Recommendations 3-4 and 3-5). NASS also could collaborate with farmer cooperatives to develop new approaches to obtaining data in ways least burdensome to farmers (Recommendation 3-3). More generally, NASS will need to collaborate with farmers as NASS pursues these enhancements (Recommendation 3-6).
Project 2: Cash Rents Model
NASS needs to enhance, document, and develop guidance on how best to use the cash rents model–based estimates in developing official NASS county estimates. The cash rents model developed by Berg and colleagues (2014) integrates survey data from 2 years and was originally developed for use when the Cash Rents Survey was conducted annually. The model-based estimates were provided as input to the ASB process in 2013, 2014, and 2017. The 2014 farm bill stated that the survey should be conducted no less frequently than every other year. The survey was not conducted in 2015, but was conducted in 2016 and 2017. The model was adapted to account for the change in survey frequency, and results were provided to ASB in 2016. Especially for the case in which there is a 2-year gap between surveys, the model would be enhanced by using Bayesian approaches that would support relaxation of the assumption that the survey variances are the same in both years (Recommendation 4-2). NASS modelers also could collaborate with ASB and FSA (users of the estimates) to develop convincing summaries of survey and model uncertainties that would inform decisions about how model outputs (estimates) can best be used by ASB (Recommendation 4-3). NASS also will need to prepare model documentation and provide it on its website (Recommendation 2-6).
NASS will also need to decide how to improve its benchmarking to achieve consistency with previously published rental rates. As noted in Chapter 4, the panel favors including benchmarking as constraints in the modeling process.
Project 3: Development of Models for Crop Statistics
NASS needs to continue working on the development of area models that can be used to synthesize data and prepare accurate estimates of planted acres, harvested acres, and production or yield. This project has two key aspects: developing the models, and building the technical capacity for model development and improvement.
As described in Chapter 2, further development is needed before NASS will have models that can be used to provide county-level estimates suitable for publication after careful review. As it is now, development of such models would be a focus of the Research Division. Once modelers had identified candidate models, they could work with ASB (the user of the models and their results, and responsible for maintaining the quality of official NASS estimates) and with RMA and FSA (key users of NASS county-level estimates) to reach agreement on the utility of model-based estimates. (Success with the cash rents project described above could provide some guidance on how this can be accomplished.)
The Fay-Herriot type of subarea model currently under consideration by NASS represents an excellent start at model development. However, this type of model considers direct survey estimates to be primary and does not account for the error structure of other potential input variables. Use of a more complete measurement error structure might evolve the current approach into one that is capable of integrating multiple data sources. For more detail, see Appendix C, which also addresses unit-level models, as well as extending models to include space and time.
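To make the structure of an area-level model of this type concrete, the following is a minimal sketch of the basic Fay-Herriot shrinkage calculation, not NASS's subarea model. It assumes that the sampling variances of the direct estimates are known and, for simplicity, that the variance of the area random effect (`sigma2_u`) is also known; in practice that variance would be estimated. The function name, argument names, and example numbers are hypothetical. The sketch also shows the limitation noted above: only the direct survey estimate carries an explicit error variance, while the covariates are treated as error-free.

```python
import numpy as np

def fay_herriot_eblup(y, X, D, sigma2_u):
    """Shrinkage estimates under a basic area-level (Fay-Herriot type) model.

    y        : direct survey estimates, one per area (e.g., county)
    X        : area-level covariates, shape (n_areas, p), including an intercept
    D        : sampling variances of the direct estimates, treated as known
    sigma2_u : variance of the area random effect (ASSUMED known in this sketch)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)

    # Generalized least squares fit of the regression (synthetic) component.
    w = 1.0 / (sigma2_u + D)
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    synthetic = X @ beta

    # Weight on the direct estimate: near 1 when the survey estimate is
    # precise, near 0 when its sampling variance is large.
    gamma = sigma2_u / (sigma2_u + D)
    estimates = gamma * y + (1.0 - gamma) * synthetic
    return estimates, gamma, beta

# Hypothetical inputs: four counties, an intercept, and one covariate.
y = [100.0, 120.0, 90.0, 150.0]
X = [[1, 1.0], [1, 2.0], [1, 0.5], [1, 3.0]]
D = [4.0, 25.0, 100.0, 400.0]
estimates, gamma, beta = fay_herriot_eblup(y, X, D, sigma2_u=50.0)
```

Counties with large sampling variances are pulled most strongly toward the regression prediction, which is how the model borrows strength across areas.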
As summarized in Chapter 2, NASS already has experience with subarea-level modeling (Cruze et al., 2016; Erciulescu et al., 2016) using Bayesian approaches. The panel encourages NASS to continue exploring and extending Bayesian models to incorporate the full measurement error structure, especially for the key data sources (those that provide alternative estimates of one of the key variables). NASS needs to explore new ways of addressing such current issues as skewness and multicollinearity. It could consider including alternative independent variables such as the Normalized Difference Vegetation Index (NDVI), as it is already computed for many counties in support of the current NASS remote sensing estimate of yield. Note that ecological bias can be avoided (under certain assumptions) by including the within-area variance of a variable, as well as its mean, in the model (Wakefield, 2008). Adding spatial dependence is conceptually straightforward using Besag or Leroux spatial formulations (see Appendix C). Spatial random effects can be incorporated to leverage spatial similarity due to unmeasured variables. Stratification can be accounted for by including fixed effects in the model, and cluster sampling by including random effects. (Scott and Smith is an early reference; relevant references for the non-Gaussian setting and ecological bias are Bradley and colleagues [2016a] and Bradley and colleagues). Computation with the integrated nested Laplace approximation (INLA) (Rue et al., 2009) approach is fast (compared with Markov chain Monte Carlo [MCMC], which NASS has been using) and accurate. There is a reliable R implementation of the INLA method, although it is not a standard package.
The Bayesian approach to modeling naturally leads to intuitive measures of uncertainty. The fundamental output of a Bayesian analysis is a multivariate posterior distribution over all unknown quantities in the model. This distribution is typically of high dimension, so summarization is required. In particular, summaries of univariate posterior distributions of quantities of interest may be reported. For example, the posterior median (or posterior mean if the posterior on the quantity of interest is symmetric) may be quoted along with quantiles such as the 2.5 percent and 97.5 percent to give a 95 percent interval. In a Bayesian analysis, the posterior variance is a standard measure of uncertainty when the posterior distribution is normal.
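As an illustration of the summaries just described, the following minimal sketch computes a posterior median, mean, standard deviation, and equal-tailed interval from a set of posterior draws. The draws here are simulated from a right-skewed distribution purely as a stand-in for the output of an MCMC or INLA fit; the function name and the simulated values are hypothetical.

```python
import numpy as np

def summarize_posterior(draws, level=0.95):
    """Summarize a univariate posterior from Monte Carlo draws.

    Returns the posterior mean, median, standard deviation, and an
    equal-tailed interval (2.5 and 97.5 percent quantiles when level=0.95).
    """
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - level) / 2.0
    lower, median, upper = np.quantile(draws, [tail, 0.5, 1.0 - tail])
    return {
        "mean": float(draws.mean()),
        "median": float(median),
        "sd": float(draws.std(ddof=1)),
        "lower": float(lower),
        "upper": float(upper),
    }

# Simulated stand-in for posterior draws of a county-level quantity.
rng = np.random.default_rng(0)
draws = rng.lognormal(mean=5.0, sigma=0.3, size=10_000)
summary = summarize_posterior(draws)
```

For a skewed posterior such as this one, the median and the quantile-based interval are generally more informative than the mean and variance, which is consistent with the guidance above.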
Panel members discussed the importance of including soil productivity, and possibly weather, in models. As described to the panel, the experience of NASS modelers to date has been that such indices when included as independent variables are of marginal significance. Following the lead of RMA, NASS could develop better groupings of counties than Agricultural Statistics Districts (ASDs) and use them to borrow strength in model estimation. Sometimes a county on the border of a state is more like the adjacent county in the next state than it is like the other counties in its ASD. The section in Chapter 3 on RMA’s yield modeling addresses some of these points. NASS needs to publish county-level data, and there are legitimate reasons for the aggregation of county estimates within a state to add to previously published state estimates. If alternative grouping of counties leads to measurably more accurate results relative to ASDs and fewer outliers, NASS may choose to benchmark county estimates directly to state totals, computing ASD totals as the sum of benchmarked county estimates.
NASS has been investigating benchmarking options as part of its modeling efforts. The panel agrees that including benchmarking as a constraint in models will ultimately lead to more defensible estimates.
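For reference, the simplest form of benchmarking is a post hoc ratio adjustment, sketched below. This is not the approach the panel favors (building the benchmark into the model as a constraint); it is shown only to make explicit what the benchmarking requirement means: county estimates must sum to the previously published state total. The function name, county labels, and numbers are hypothetical.

```python
def ratio_benchmark(county_estimates, state_total):
    """Scale county estimates so that they sum to a published state total.

    county_estimates : dict mapping county identifier -> unbenchmarked estimate
    state_total      : previously published state-level total
    """
    unbenchmarked_total = sum(county_estimates.values())
    if unbenchmarked_total <= 0:
        raise ValueError("County estimates must sum to a positive value.")
    factor = state_total / unbenchmarked_total
    # Every county receives the same proportional adjustment.
    return {county: value * factor for county, value in county_estimates.items()}

# Hypothetical model-based planted-acre estimates and published state total.
counties = {"County A": 40_000.0, "County B": 60_000.0, "County C": 25_000.0}
benchmarked = ratio_benchmark(counties, state_total=130_000.0)
```

Because the same factor is applied everywhere, a ratio adjustment ignores differences in the precision of the county estimates, which is one reason a constraint imposed within the model may be more defensible.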
Building Technical Capacity
The second challenge for NASS is maintaining and expanding the technical expertise of NASS staff. In addition to the points discussed below, many of the collaborative efforts described earlier under Project 1 will support this goal by helping staff learn new approaches and incorporate them into NASS processes:
- NASS modelers need to work to understand the ASB process and incorporate relevant features and data in their model development efforts. This would be facilitated by assigning a modeler to be a member of the ASB (Recommendation 2-7).
- NASS needs to enhance its statistical, economic, and geospatial modeling expertise with new hires over time. (The panel recognizes that NASS has been attempting to do this and that it is not easy to identify and recruit highly qualified staff.)
- NASS needs to enhance the capabilities of current staff through ongoing training opportunities in such areas as new modeling approaches, software, and incorporation of model-based approaches that use auxiliary data into traditional survey methods. NASS could consider bringing in experts to provide this training for staff.
- NASS could arrange for ad hoc expert review committees and/or (non–Federal Advisory Committee Act [FACA]) advisory committees such as the American Statistical Association’s (ASA) Committee on Energy Statistics, sponsored by the Energy Information Administration, to bring in external expertise and advice.
- NASS technical staff need access to appropriate hardware and software. With regard to hardware, the methods most useful for addressing geospatial issues and modeling require significant storage and speed for computation. Staff also need to have access to the most up-to-date software, especially in a test environment. Ways in which software products that enhance NASS capabilities can be moved into operational environments are needed as well.
Project 4: Geospatial Database
Linking FSA and RMA data (CLUs and RLUs) to the NASS list frame is a high-priority, first-stage project. Achieving complete linkage with these administrative sources will be time consuming, so NASS’s work on this effort needs to begin right away, starting with the development of an approach that will build to success. One first step will entail planning how to accommodate the CLU, RLU, and field boundaries derived from precision agriculture, as well as FSA and RMA identifiers (IDs) in ELMO (or in files easily accessed by ELMO). Second, NASS has pursued studies in the past that involved identifying FSA farms that match NASS farms. If information from those studies concerning farms that match is still available, it may serve as a starting point for a matched list. For example, for several years the June Area Survey has made use of FSA CLUs, and remnants of the 2010 record linkage effort in Nebraska may be available. Third, NASS’s traditional matching approaches using owner/operator names can provide linkage for the least complicated farms, that is, those for which one NASS farm corresponds to one FSA farm. The Nebraska matching experiment revealed that an “easy” one-to-one exact match existed for almost 50 percent of farms. NASS could develop revised survey forms for matched farms that take advantage of the administrative data to simplify reporting and reduce respondent burden (Recommendation 2-11). With the promise of a simple reporting form, perhaps NASS can work with all respondents to encourage them to identify the FSA IDs with which they are associated. Completing the match through manual reviews is possible, but time consuming. Expensive, manual efforts at matching will best be dedicated to achieving matches for the largest farms.
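The “easy” one-to-one exact matches described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each NASS and FSA record carries an identifier, an operator name, and a county code, and it accepts a match only when exactly one record on each side shares the same normalized name and county. The record layout, function names, and example records are illustrative assumptions and do not represent NASS's or FSA's actual data structures or matching procedures.

```python
from collections import defaultdict

def normalize_name(name):
    """Uppercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in name.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def one_to_one_matches(nass_records, fsa_records):
    """Return {nass_id: fsa_id} for unambiguous exact matches.

    Each record is a tuple (record_id, operator_name, county_code).
    Keys shared by more than one record on either side are left unmatched
    so that they can be routed to probabilistic or manual review.
    """
    def build_index(records):
        index = defaultdict(list)
        for record_id, name, county in records:
            index[(normalize_name(name), county)].append(record_id)
        return index

    nass_index = build_index(nass_records)
    fsa_index = build_index(fsa_records)

    matches = {}
    for key, nass_ids in nass_index.items():
        fsa_ids = fsa_index.get(key, [])
        if len(nass_ids) == 1 and len(fsa_ids) == 1:
            matches[nass_ids[0]] = fsa_ids[0]
    return matches

# Hypothetical records for illustration only.
nass = [("N1", "Smith, John A.", "31001"), ("N2", "Jones Farms LLC", "31001")]
fsa = [("F9", "SMITH JOHN A", "31001"), ("F8", "Jones Farms, L.L.C.", "31001")]
linked = one_to_one_matches(nass, fsa)
```

Records left unmatched by a rule of this kind are the ones for which the more expensive manual review would be reserved, ideally prioritized by farm size.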
While the linkage effort may be complex, maintaining the linked database should be relatively simple. FSA and RMA update their data annually. The panel learned that FSA introduces some changes in CLU definitions over time—possibly because of changes in ownership or land use. At its second meeting, the panel heard from ERS analysts who have accomplished linkages between FSA data and other USDA data and maintained those linkages over time. They estimated that no more than 10 percent of CLUs are changed from one year to the next. NASS needs to plan how to incorporate these changes in CLUs into its databases, preferably using automated methods.
Project 5: Evolving the Role of the ASB and Updating Documentation
Ultimately, multiple data sources will be integrated through the use of formal statistical models that will provide the estimates to be reviewed by ASB. ASB will exercise judgment in deciding when model estimates fail to reflect the impact of unforeseen events (e.g., droughts or hurricanes) or of systemic changes (e.g., rapid farm consolidation) that are not well captured in the models. ASB will make the final determination as to whether any county estimate should not be published. As a means of providing quality control, ASB will drive the feedback loop with analysts by suggesting modifications to improve model performance and interpretation. Documentation of models and reasons for changes from the documented methodology will be available to the public so that increased transparency will bolster confidence in the robust nature of the process.
The panel believes the ASB process for cash rents can begin to evolve toward a more transparent approach now, as summarized above under Project 2 and in Chapter 4. The crop estimates program is more complex at present, and also requires that a candidate model be developed and adopted before serious evolution can begin.
For now, NASS could take the following steps:
- Establish templates for posting estimation methodology on the NASS website. Examples to draw on include the methodology documentation posted by the Bureau of Labor Statistics, the Bureau of Economic Analysis, and the Census Bureau.
- Document in detail how outliers are identified, how they are evaluated for whether they can be explained, and why decisions are made to revise them.
- Develop a list of key reasons for making changes, and use it to keep a record of changes made.
- Formulate new publication standards based on Recommendation 2-3, and determine how to apply them now and as models are adopted in the future.
- Determine how the software product DICE—on which ASB currently relies to support its review and evaluation process, including benchmarking of district and county estimates to previously published state totals—should be revised to support new processes.
SECOND-STAGE PROJECTS: PROJECTS OCCURRING ONCE FSA, RMA, AND NASS FARMS HAVE BEEN LINKED AND INCLUDED IN THE LIST FRAME OR OTHER ACCESSIBLE DATABASE
Projects in the second stage of implementation of the panel’s vision will depend on progress on first-stage projects described above. Collaborative efforts under first-stage Project 1 will lay the foundation by obtaining access to administrative and auxiliary data and facilitating the development and use of new data sources. Having the CLU as the basic spatial unit and incorporating CLUs for (most) farms in the NASS list frame will have many advantages for NASS, including the following:
- FSA will provide annual updates of owner, operator, and CLU data for all FSA farms in the database, supporting NASS’s efforts to keep the list frame up to date.
- Current-year FSA and RMA data on planted and failed acres by crop will be available to improve imputation and estimation for the County Agricultural Production Survey (CAPS) and Acreage, Production, and Stocks (APS) survey system.
- Additional types of models that require use of the geospatial information associated with CLUs or farms (groups of CLUs) can be explored.
By the end of the first-stage projects, the cash rents model will have been refined and built into a new and more transparent ASB process, and models that successfully integrate multiple data sources for crop estimates will have been developed and adopted. Second-stage projects include improving survey forms, improving survey imputation and estimation, developing unit-level models, and potentially redesigning NASS surveys to take advantage of newly available information.
Project 1: Improving Survey Forms
With the availability of linked FSA data, NASS could develop a new Internet survey form for CAPS/APS that would ask matched respondents to provide only the current-year data items that are not available from administrative data. For example, since the crops planted are known, each survey form would specifically request production or yield information only for relevant crops. The intent would be to reduce the burden of responding to NASS surveys (Recommendation 2-11). It is anticipated that collaboration with farmers, farmer cooperatives, FSA, RMA, and software developers that translate precision agriculture data into variables required by NASS will have been fruitful, and that NASS will be able to make arrangements for collecting precision agriculture data as an alternative response mode in CAPS/APS (Recommendation 3-5).
Project 2: Improving Survey Imputation and Estimation
As described in Chapter 2, once a reasonable number of FSA farms and their CLUs have been linked with the NASS list frame, the use of FSA current-year data on planted acres by crop for imputation will be feasible.1 As time passes and more FSA farms are linked, NASS might realize significant improvements in accuracy due to the inclusion of current-year data on crops planted. As nonsurvey precision agriculture data become available, they may also prove valuable in imputation for nonresponse. The current sample-based data collection and estimation method could be used, but nonresponse on planted and harvested acreage items could be imputed with data from the actual nonresponding farm or a “similar” farm (Recommendation 2-11).
1 As noted in Chapter 2, the panel was told that crop switching causes considerable concern during imputation. Imputation relies on past-year data for a missing farm to select crops for imputation. However, past-year data may not be related to the crops a farmer plants during the current year.
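The imputation idea described above can be sketched as a simple order of preference: use the survey report when there is one; otherwise use current-year administrative data for the same (linked) farm; and fall back on prior-year data only when neither is available. The sketch below is illustrative only. The data layout, function name, and source labels are hypothetical, and an operational procedure would also need to handle “similar” farm donors and harvested acreage.

```python
def impute_planted_acres(survey_reports, fsa_current_year, prior_year):
    """Fill in planted acres by crop for survey nonrespondents.

    survey_reports   : dict farm_id -> {crop: acres}, or None for nonresponse
    fsa_current_year : dict farm_id -> {crop: acres} for linked farms
    prior_year       : dict farm_id -> {crop: acres} from last year's data
    Returns (completed, sources), where sources records the origin of each
    farm's values so that imputed data remain identifiable.
    """
    completed, sources = {}, {}
    for farm_id, reported in survey_reports.items():
        if reported is not None:
            completed[farm_id], sources[farm_id] = dict(reported), "reported"
        elif farm_id in fsa_current_year:
            # Current-year administrative data reflect any crop switching.
            completed[farm_id] = dict(fsa_current_year[farm_id])
            sources[farm_id] = "fsa_current_year"
        elif farm_id in prior_year:
            completed[farm_id] = dict(prior_year[farm_id])
            sources[farm_id] = "prior_year"
        else:
            completed[farm_id], sources[farm_id] = None, "missing"
    return completed, sources

# Hypothetical example: farm 2 switched from corn to soybeans this year.
reports = {1: {"corn": 300.0}, 2: None, 3: None}
fsa = {2: {"soybeans": 180.0}}
last_year = {2: {"corn": 175.0}, 3: {"wheat": 90.0}}
completed, sources = impute_planted_acres(reports, fsa, last_year)
```

In this example the linked farm is imputed with its current-year soybean acreage rather than last year's corn acreage, which is the improvement over past-year imputation that the footnote on crop switching has in view.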
FSA and RMA data both define subpopulations of all farms, and NASS may be able to access data from precision agriculture databases on an ongoing basis for a substantial fraction of the total population. These auxiliary data for these well-defined subpopulations could be used to provide more efficient estimators relative to the current NASS sampling and estimation methods.
Data from precision agriculture databases, FSA program participants, and RMA purchasers of insurance clearly are not probability samples and are not likely to be representative of the population. Thus the data could not be treated as a direct replacement for or even an augmentation of the probability sample. However, methods for making use of data from nonprobability samples can be used in ways that are familiar to those accustomed to probability sampling methodology. All of these methods require that it be possible to link the data from the nonprobability samples to the list frame. With the new linkages established in first-stage projects, the methodologies described in Chapter 2 could be used to improve estimation in CAPS/APS and possibly even the Cash Rents Survey (Recommendation 2-13).
Project 3: Developing Unit-Level Models
Appendix C summarizes modeling strategies, including both area-level models (for counties) in NASS applications and unit-level models (for farms or CLUs). The Battese-Fuller model (Battese and Fuller, 1981), used to develop remote sensing estimates of acreage, and the Battese-Harter-Fuller model (Battese et al., 1988) are examples of unit-level models. NASS could consider models for its basic spatial unit (CLUs) or for farms (collections of CLUs). One disadvantage of farm-level modeling is that farms can be of vastly different sizes. One advantage of unit-level modeling is that it would support incorporation of parcel-specific variables and observations from remote sensing and precision agriculture with the potential to enhance model performance.
Project 4: Redesigning NASS Surveys
Potentially the most important project in the second stage could occur only after substantial linkage had been accomplished, and either production or yield data by crop were available from an alternative source (either from RMA in time for NASS to use the data in preparing county-level estimates, or from a sufficiently accurate yield model that had been developed, documented, and accepted and did not rely on current-year survey data). To the extent that NASS can acquire information already provided by farmers to USDA, it should be possible to greatly reduce respondent burden in surveys. Greater reliance on administrative data could allow redirection of scarce survey resources to farms not in FSA or RMA records or to areas where survey response is low. In this situation, NASS could consider revamping its survey program to take advantage of available administrative data and models and to refocus survey efforts on areas or commodities not well covered by alternative sources.
The panel’s vision for the future of NASS is for 2025. The panel debated a shorter time frame, concluding that a time frame with the greatest feasibility would be preferable. Many of the projects described in this chapter will take time and dedication to complete, and in the interim NASS will necessarily continue with its current ongoing schedule and workload. Roughly speaking, each stage might be viewed as taking approximately 3 years to complete, although individual projects might be completed sooner.
As noted above, attaining this vision will require a focus on achieving results. Most organizations that have tried to implement change have found that support by the most senior leaders is critical to success. Therefore, senior NASS leadership will need to adopt the panel’s vision and clarify overall goals, and they will need to establish and monitor projects. The panel proposes that senior leaders who potentially could serve as key champions include the head of ASB, to oversee the Board’s evolution to a more transparent and reproducible mode of operation; the director of research, to serve as the champion for model development activities; and the head of the list frame organization within NASS, to oversee the development of a georeferenced list frame. These champions will need to identify others to be part of the process and will need to promote the importance of the changes. They will need to plan to provide resources for continuing operations while simultaneously working to identify, develop, and implement these changes. They will need to lead in the collaboration with other USDA agencies, and identify and involve supportive midlevel managers and champions. They will need to prepare a timeline, scoping the component projects essential to implementing the vision, identifying necessary steps, and monitoring progress. And they will need to celebrate success.2
2 Celebrating success is an important aspect of leading change. For example, see http://www.brendabence.com/media-room/articles/The-Top-10-Reasons.pdf [October 2017].