This study has surveyed the current state of personnel and readiness data and their uses and has opened a window onto the application of analytics in the personnel and readiness environment. In the preceding chapters the committee described the challenges it observed and discussed the data and analytics methods used in industry and by other organizations. The study’s charge was to develop a roadmap and implementation plan for integrating data analytics in support of decisions within the purview of the Office of the Under Secretary of Defense (Personnel & Readiness), referred to throughout the report as P&R. The committee interpreted that charge as calling for high-level conceptual advice, because P&R must have the latitude to craft specific plans for strengthening its capabilities in, and use of, data science.
Accordingly, the committee believes that the first step is for P&R to develop a data and analytics framework, in accordance with the advice in this chapter, to better address the short- and long-term needs of the Secretary of Defense. Many of the Department of Defense’s (DoD’s) data sets have been built up in response to disjointed and one-time questions (e.g., from Congress) or to deal with narrow transactional issues. They are not typically the result of a data and analytics framework derived from a coherent picture of what the Secretary of Defense and others need to know to manage DoD.1 This chapter articulates three short-, medium-, and long-term goals to help P&R develop this framework: (1) improve data quality and sharing, (2) enhance data science methods, and (3) strengthen data science education. The first goal, data quality and sharing, can be addressed immediately; data science methods can be enhanced in the medium term; and strengthening workforce capabilities in data science can begin now but is a longer-term task.
1 The Defense Readiness Reporting System (DRRS) is somewhat of an exception, although even it is mostly the product of federating existing data systems, both to keep costs in check and to encourage participation and compliance.
DoD has made significant investments in improving its use of data and analytics for recruiting and pay over the past 40 years,2 by means such as sanctioning the creation of the Defense Manpower Data Center (DMDC) to provide a common set of data records and turning to the federally funded research and development centers (FFRDCs) for analytic assistance. These investments have helped DoD overcome difficulties, and arguably they have played a significant role in its long-term success. To continue that success, the analyses and the analysts supporting the Under Secretary of Defense for Personnel & Readiness need insight into the outlooks and motivations of personnel. However, these attributes may not be revealed by administrative records, and new data may be required. These new data needs must be defined and characterized, and the design of enhanced data systems must take into account how the data will be used, including their analytic applications.
One important consideration in planning enhancements to the data systems available to P&R is that inaccuracies and incompleteness often impede the full use of data in support of P&R decision making. For example, analysis of the burdens of deployment requires DMDC to work with a number of variables to determine whether an individual was serving in forward combat areas, but those variables do not always agree. Likewise, using the central records to review language competency will reveal that such information is often missing. The same will be true for indicators that officers have completed joint duty assignments (that is, assignments involving more than their parent service, which ultimately can be important for promotion), because such data are often kept in separate systems. Even for data fields that are well populated in the original submissions, DMDC reports that it expends considerable effort to clean the data to ensure their accuracy.
2 The impetus for many of these investments was the return of the United States to its tradition of relying entirely on paid volunteers to staff its military establishment. Recognizing that the shift from conscription to a reliance on volunteers required key decisions about compensation and the active management of recruiting, the nation’s leadership turned to outside analytic expertise in support of the Gates Commission, which buttressed President Nixon’s decision to end the draft. A symposium hosted by the Center for Naval Analyses in 2014 honored Walter Oi and his influence in establishing the all-volunteer force. The website for that symposium is https://www.cna.org/news/events/all-volunteer-force, accessed January 6, 2016. Emblematic of the community of interest that these decisions created is the series of commemorative volumes produced at each of the 10-year anniversaries of the all-volunteer force in its first three decades (Bowman et al., 1986; Fredland et al., 1996; Bicksler et al., 2004).
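The kind of reconciliation effort described above can be illustrated with a small sketch. The field names and the majority-vote rule below are hypothetical, chosen only to show how several disagreeing indicators of forward-area service might be resolved and flagged for human review; DMDC's actual procedures are more elaborate.

```python
from collections import Counter

def reconcile(record, fields, default=None):
    """Resolve disagreeing indicator variables by majority vote.

    record: dict of field -> value (value may be None/absent if missing)
    fields: the redundant indicator fields to reconcile
    Returns (resolved_value, needs_review_flag).
    """
    values = [record[f] for f in fields if record.get(f) is not None]
    if not values:
        return default, True          # nothing to go on: flag for review
    tally = Counter(values)
    value, _count = tally.most_common(1)[0]
    # Flag any record whose indicators do not unanimously agree.
    return value, len(tally) > 1

# Hypothetical deployment record with three redundant indicators.
rec = {"combat_zone_tax_exclusion": True,
       "hostile_fire_pay": True,
       "location_code_forward": False}
served_forward, review = reconcile(
    rec, ["combat_zone_tax_exclusion", "hostile_fire_pay",
          "location_code_forward"])
```

Here the majority of indicators say the individual served forward, so the resolved value is True, but the disagreement is flagged so an analyst can examine the underlying records.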
Although many of the data sources P&R relies on arise from administrative needs, few rest on the transactional record systems that may hold the most relevant information. Readiness data, for example, typically derive not from parts-ordering or repair transactions but from reports on equipment availability. Biases in reporting may occur when the underlying variable is judgmental in nature; for example, whether a piece of equipment is mission capable can reflect the judgment of a commander. Likewise, deployment data generally have depended not on records of arrivals in and departures from theater but on a downstream transaction (the pay record). Readiness data have long suffered from this weakness, which in part led to the establishment of DRRS, with its emphasis on access to the underlying data files. Other data sources (responses to questions on race and ethnicity, say) can potentially be affected by bias, especially data collected via surveys. Transactional data may be used at the Service level but are much less likely to be used by P&R. An important exception to this generalization is health care data, which this study did not examine in detail.
Most of the data sources easily available to P&R are snapshots of either current or retrospective conditions. For example, P&R does not have all-Service forecasts of enlistment contracts with which to gauge the status of recruiting, although it is conceivable these could be generated either from separate systems maintained by the individual Services or from questions asked during military entrance processing station (MEPS) testing. Rather, P&R receives data on actual enlistment results for the month as well as the status of enlistment “reservations.” The latter are tallies of those who have signed contracts to enlist at a future date, the so-called Delayed Entry Program (DEP); actual enlistment results and the size of the DEP are used as recruiting barometers. Some surveys offer direct forecast information (e.g., reenlistment intentions) or indirect information (e.g., the Youth Tracking Survey, with its questions on how American youth feel about military service), but these fall short of what P&R really needs.
Individuals might enrich a number of the data sources or, at a minimum, improve their accuracy and completeness, provided there are sufficient incentives for them to offer corrections. Systems that encourage individuals to self-report accurately constitute a potentially rich source of data for P&R on both program preferences and the outlook of servicemembers on their military careers, and such data can have strong predictive power. The Army’s pilot study “Green Pages,” which ran from 2010 to 2012, provided insights into the preferences of mid-career officers for assignments as well as additional detail on their backgrounds and accomplishments not otherwise available. (In the pilot study, officers provided resumes with much more detail than is contained in their administrative personnel records.) Likewise, the Navy administers Assignment Incentive Pay, in which a modest number of hard-to-fill enlisted assignments are put up for “bid,” and individuals may specify the additional compensation, if any, they would require in order to accept a particular assignment. This process implicitly provides useful information about enlisted preferences that could conceivably be used for predictive purposes, such as examining trends (Carter, 2015). Self-reporting has long been relied upon to generate longitudinal data series, such as the NLS79, the Millennium Cohort Study, and the new Military Family Life Survey. These longitudinal series are often the most valuable sources of information on behaviors.
Although the administrative process, self-reported or otherwise, might generate some of the underlying data promptly, it can take considerable time before those data appear in files that the policy analyst can employ. For example, the time DMDC invests in assembling and cleaning data from the central personnel records results in a lag in the analysis compared to when the events actually took place. Likewise, although surveys are now generally administered via the Internet, the process of collecting, cleaning, and organizing the data usually takes weeks if not months, again introducing latency in the analytic process. And if the data are to be employed by an outside organization, there may be additional delays in accessing them.
DoD has an opportunity to get more value out of existing data collections by creating an expectation of data repurposing and analytics reuse. P&R could capitalize on some of the transactional data available to the Services by establishing partnerships with them. For example, repurposing assignment orders could provide a portrait of one element of future readiness (e.g., how well trained and experienced are those who are moving to a given unit?) and serve as an indicator of unit turnover. Data from tests administered at the conclusion of training could be repurposed to monitor training effectiveness and predict future unit readiness.
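The repurposing of assignment orders suggested above might look like the following sketch, which aggregates inbound personnel by gaining unit. All field names (gaining_unit, years_of_service, mos_qualified) and unit designations are invented for illustration; real assignment-order records would differ.

```python
from collections import defaultdict

def inbound_profile(orders):
    """Summarize the experience of personnel moving to each unit.

    orders: iterable of dicts with hypothetical fields
      gaining_unit, years_of_service, mos_qualified (bool).
    Returns unit -> (headcount, mean years of service, qualified share).
    """
    groups = defaultdict(list)
    for order in orders:
        groups[order["gaining_unit"]].append(order)
    profile = {}
    for unit, members in groups.items():
        n = len(members)
        mean_yos = sum(m["years_of_service"] for m in members) / n
        qualified_share = sum(m["mos_qualified"] for m in members) / n
        profile[unit] = (n, mean_yos, qualified_share)
    return profile

orders = [
    {"gaining_unit": "1-502IN", "years_of_service": 2, "mos_qualified": True},
    {"gaining_unit": "1-502IN", "years_of_service": 8, "mos_qualified": True},
    {"gaining_unit": "3-187IN", "years_of_service": 1, "mos_qualified": False},
]
profile = inbound_profile(orders)
```

A time series of such profiles, compared with outbound orders, would give the turnover indicator the text describes.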
The assessment of noncognitive attributes through, for example, the Army’s Tailored Adaptive Personality Assessment System (TAPAS) (Stark et al., 2014) offers research and operational opportunities to improve selection and placement decisions that have historically relied on cognitive ability tests. Data collected could be used at the point of entry into military service as well as at mid-career to guide further investment in human capital. New monitoring instruments could also be devised (e.g., end-of-tour questionnaires that would expand on the health-related questionnaires now required of those returning from a combat theater).
Converting unstructured or semistructured data from a variety of sources into structured form could be another avenue of potential value for P&R analytics. For example, natural language processing could extract valuable and timely information from unstructured records (documents, messages, reports). Annotating existing data with additional information (for example, geocoding) could allow new classes of tools to be developed and used.
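A minimal sketch of this kind of conversion follows. It uses only simple pattern matching rather than full natural language processing, and the message format, field names, and gazetteer entries are all invented; a production pipeline would rely on trained entity-recognition models and a real geocoding service.

```python
import re

# Invented free-text status message; the format is purely illustrative.
message = ("Unit 2-87 reports 14 of 16 vehicles mission capable "
           "as of 2016-03-01 at Fort Drum.")

PATTERN = re.compile(
    r"Unit (?P<unit>\S+) reports (?P<up>\d+) of (?P<total>\d+) vehicles "
    r"mission capable as of (?P<date>\d{4}-\d{2}-\d{2}) at (?P<place>[\w ]+)\."
)

def structure(text, gazetteer):
    """Convert one semistructured report into a structured record,
    annotating the place name with coordinates (geocoding)."""
    match = PATTERN.search(text)
    if match is None:
        return None
    rec = match.groupdict()
    rec["up"], rec["total"] = int(rec["up"]), int(rec["total"])
    rec["latlon"] = gazetteer.get(rec["place"])  # None if place unknown
    return rec

# The gazetteer would normally come from a geocoding service.
record = structure(message, {"Fort Drum": (44.06, -75.76)})
```

The resulting record carries typed fields and coordinates, so it can feed the same aggregation and mapping tools as any other structured data source.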
The ideas suggested above are examples of new or repurposed data sources that P&R might explore and incorporate into its data and analytics framework. In each case, P&R will need to assess in detail whether the potential value mentioned is feasible and an important priority. It will also need to evaluate whether there are technical or management hurdles to overcome. These details, which can be quite nuanced, were not readily available to the committee and would best be examined by experts within P&R.
Finding: Despite the substantial amount of data available on DoD personnel, the data may not be appropriate for DoD’s analytic tasks, or they may necessitate considerable investment in constructing the variables of interest.
Finding: Analyses developed to support the Secretary of Defense are often disjointed, one-off activities undertaken to respond to immediate questions and may lack a plan for future use of data or analytic methods.
Finding: The reuse of operational data for analytic purposes can expose issues in data collection, recording, transmission, cleaning, coding, and loading. Problems are often not detected until the point of analysis, when anomalies crop up in results.
Recommendation 1: The Office of the Under Secretary of Defense (Personnel & Readiness) should develop a data and analytics framework, and a strategy to implement that framework, that addresses both the principal outcomes of its responsibilities and the short-term and long-term needs of the Secretary, based on the findings, recommendations, and discussions outlined in this report and in the Force of the Future proposals.
While P&R enjoys considerable access to data that help it address the issues for which it is responsible, substantial gaps remain, particularly for events that occur outside of DoD. Publicly available data (e.g., from social
media) and data maintained by other government agencies (e.g., the Department of Veterans Affairs [VA]; the Employment Cost Index and unemployment data from the Bureau of Labor Statistics; the American Community Survey and the Longitudinal Employer–Household Dynamics program from the Census Bureau) could all play pivotal roles in filling current information gaps. Many of the Services collect information on their members that is not reported to P&R (from, say, new-recruit surveys). The advent of the capacity to analyze large data sets has made it possible to use such previously untapped data to inform predictions.
One example of the potential benefit of sharing data arises in the Career Intermission Program that Secretary Carter has endorsed, which allows some servicemembers to take a 1- to 3-year sabbatical (Losey, 2016; Schechter, 2016; Serbu, 2016). Despite its appeal in principle, in practice the Services have struggled to realize the program’s potential. Understanding the problems each Service has encountered in implementation, and especially the experience of the Navy (the original proponent of the authority and the first to employ it), would help avoid missteps and identify early on what must be done if the program is to succeed.
While longitudinal data are often essential to understanding both behavior and policy choices, few of P&R’s data sources are constructed originally in that form. Rather, the analyst or the data agency (e.g., DMDC) will construct the longitudinal data set from the available “snapshot” files. The two big exceptions to this generalization are the Millennium Cohort Study (conceived as a longitudinal effort from the beginning, to track the health experiences of military personnel after the illness controversies that followed the first Persian Gulf War) and the earlier National Longitudinal Survey of Youth 1979, in which DoD participated.
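The construction of longitudinal data from snapshot files can be sketched as follows. The snapshot dates, identifiers, and the grade field are hypothetical; real personnel snapshots carry many more fields and require the kind of cleaning discussed earlier.

```python
def build_panel(snapshots):
    """Stack dated snapshot files into per-person longitudinal histories.

    snapshots: list of (date, records) pairs, where records is a list of
    dicts keyed by a person identifier ("id") plus status fields.
    Returns id -> list of (date, record) in chronological order.
    """
    panel = {}
    for date, records in sorted(snapshots):
        for rec in records:
            panel.setdefault(rec["id"], []).append((date, rec))
    return panel

def transitions(history, field):
    """List the dates on which a field changed value between snapshots."""
    changes = []
    for (_d1, r1), (d2, r2) in zip(history, history[1:]):
        if r1[field] != r2[field]:
            changes.append((d2, r1[field], r2[field]))
    return changes

snaps = [
    ("2015-12", [{"id": 1, "grade": "E4"}, {"id": 2, "grade": "E5"}]),
    ("2016-01", [{"id": 1, "grade": "E5"}, {"id": 2, "grade": "E5"}]),
]
panel = build_panel(snaps)
promotions = transitions(panel[1], "grade")
```

Events such as promotions, reenlistments, or separations fall out of the panel as changes between adjacent snapshots, which is precisely the information the snapshot files do not surface on their own.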
A particular opportunity arises in the data collection efforts of the VA. VA data illuminate aspects of an individual’s experience after leaving active duty (pursuit of further education via the GI Bill, treatment of medical issues that may derive from service, etc.). Harmonizing the data elements from DoD and the VA would benefit both agencies: DoD would gain a better understanding of the status of reserve component members, and the VA could better probe the antecedents of postservice health issues. More important, it would benefit veterans and provide them with the recuperative support they have earned.
Understanding the life trajectory of military personnel after they leave active duty, whether they are discharged, retired, or demobilized (i.e., Reserve Component members, including those who continue to serve in a Reserve capacity), could give P&R valuable information on the appeal of military service or its detractions (Wilmoth and London, 2013). Unemployment data, however, are collected by the Department of Labor without regard to DoD’s need to understand post-Service experience. Understanding the life trajectory would require larger sample sizes (to shrink the error associated with the estimates), different age range aggregations, knowledge of the reason for leaving active duty (e.g., completion of active duty obligatory service or discharge before completing obligatory service), and a marker for whether the person is enrolled in school.3 DoD needs an interagency mechanism to persuade its sister agencies to undertake the necessary changes, including a vehicle for funding when additional resources are required.
The transactional and survey data sources compiled by DoD are a valuable resource and should be maximally leveraged to support P&R policy decision making. There are legal, regulatory, cultural, and technical barriers to (appropriately and responsibly) sharing data. One particular challenge is balancing the ability to link disparate data sources for individuals against the simultaneous goal of protecting their privacy. On the technical side, choices of hardware and software architectures provide different mixes of advantages and challenges with respect to data access and management, and these trade-offs need to be considered in developing a data and analytics framework. In addition, because some of its most important data come from series established by other agencies, it behooves P&R to take a more active role in interagency data deliberations and to advocate for sample sizes and data constructs that better serve its needs. DoD will need to be willing to finance the additional costs of these requests.
The use of social media and other outside data in support of P&R decision making is worth further examination. Although DoD warns its personnel to avoid posting information regarding their responsibilities online, many individuals maintain accounts with some degree of public access. Government monitoring of such postings, while potentially of significant value, raises difficult questions about privacy and the appropriate role of government itself. For personnel who hold security clearances, the government already asserts some authority over private matters (e.g., drug tests, reporting of law enforcement actions). Internet postings are by definition public, but some may find their collection and analysis by the government unsettling. Further, there is a difference between analysis of posts by an outsider and the analysis of posts by DoD, which has private information about those making the posts. The latter situation ranks much higher on the “creepiness” scale (Tene and Polonetsky, 2014). Privacy-preserving data analysis tools, such as differential privacy, may be particularly helpful should P&R pursue analyses that draw from social media postings.
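The differential privacy tools mentioned above can be illustrated with the standard Laplace mechanism applied to a counting query. The data, the predicate, and the choice of epsilon are purely illustrative; this is a minimal sketch of the technique, not a deployable privacy system.

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(values, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    A counting query changes by at most 1 when one person's record is
    added or removed, so its sensitivity is 1 and the Laplace noise
    scale is 1/epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
# Hypothetical coded postings; two individuals mention deployment.
posts = ["deployed", "home", "deployed", "home", "home"]
noisy = private_count(posts, lambda p: p == "deployed", epsilon=0.5, rng=rng)
```

The released value is the true count of 2 plus calibrated noise, so aggregate sentiment can be reported while bounding what any release reveals about a single servicemember's postings.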
Whether it exploits social media or government records, P&R should explore whether contemporary text analysis techniques would allow it to understand the content in documents of interest. Could text analytics that allow the analysis of Internet postings help P&R better understand the burdens of deployments? Could such techniques, in a further example, give P&R the ability to evaluate, fairly and even-handedly, the joint duty experiences of those applying for credit? Not only might such text analytic methods provide an efficient approach to current document reviews, they might also make available new sources of data contained in documentary sources too voluminous to examine with other methods.
3 In many states, veterans are permitted to collect unemployment insurance at the same time they are pursuing schooling under the GI Bill. A number of National Guard members are students who presumably are returning to school.
Improving data sharing across the Services could also prove valuable for P&R. However, the standards specifying how data are collected are not comprehensive and systematic across the Services. Each Service may collect similar data but utilize them differently based on Service-specific needs. This divergence results in the Services tracking different data and metrics and making different policy decisions. To deal with multiple sources of data originating from multiple organizations, a sound strategy is to develop a process for agreeing on what new information to record, to set standards for how modifications should be made, and to render information into a common representational form.4 As noted in Chapter 3, however, the current situation represents a clear improvement over the situation that existed before DMDC’s creation.
DoD’s internal processes typically involve not the sharing of data but the reporting of results based on data. For that reason, in formulating a data and analytics framework as recommended in the previous section, a “preparation instruction” should be issued that tells subordinate personnel what data they must submit. Historically, such instructions have been a matter of considerable debate. Perhaps the biggest implementation challenge is the difference between the philosophical outlook of the “big data” community, in which raw data are generally available for authorized analysis, and the “command and control” ethos of DoD, in which information is indeed power. Access by more senior levels of the bureaucracy to the underlying data sets might prove contentious, to say nothing of access by any “outsiders.” The Services are reinforced in their preference for limiting access to data by their legal charge from Congress to organize, train, and equip military forces, while the role of the Secretary of Defense and his staff is to set policy and be politically responsible for the results.5 The desire to control data needs to be balanced against policy needs.
Improving data sharing with FFRDCs could accelerate their research processes and allow enhanced opportunities for verification and validation of study findings. Some of these data access challenges are discussed in the following section on Institutional Review Boards, and P&R may benefit from exploring opportunities to ease data transfer.
4 This process usually consists of establishing a cross-Service working group, often with representation from P&R, to reach agreement.
5 Congress has subsequently said that the grant of authority to a subordinate element of the department in no way precludes the Secretary from exercising it.
While those outside the federal government, especially the FFRDC staffs working for DoD, may request access to data available to P&R, there is no tradition of public-use data sets of the kind that, for example, characterizes the Census Bureau. DoD is starting to explore this possibility. The Person-Event Data Environment (PDE) that the Army Analytic Group and DMDC created is the principal current focus, although there are major challenges in making the PDE a widely useful tool. Technical and cultural challenges (such as possible data reidentification and other privacy compromises), a slow and complicated approval process for gaining access, lengthy reviews for data import and export, limited computational capabilities, concerns about data quality and comprehensiveness, and concerns about data ownership rules all deter use of the PDE. In addition, it is not clear that the architecture scales up in such a way that it can serve all of P&R’s needs, and forcing analysts to work through the PDE personnel (who then must work through the data owners) may erect a barrier between the analyst and the raw data. The substantial efforts undertaken by PDE personnel to prepare the data for linkage are not transparent and may inadvertently affect the results of analyses. The Defense Health Agency has also begun exploring the creation of a data set to which greater access might be granted.
From the policy analyst’s perspective, perhaps the most important innovation would be a forum or mechanism to channel feedback from users about the data constructs needed for typical analytic tasks and the nature of the variables that would best meet those needs. Some entity with clear lines of responsibility under the Confidential Information Protection and Statistical Efficiency Act (CIPSEA), such as an Office of People Analytics or DMDC in an expanded role, could provide such a forum. The PDE could become a central activity of such an office. Such a focus could immediately help P&R leverage better data and data analytics to support what the military cares about; the Services alone are not able to keep pace with the growing need for expertise in data science. Secretary Carter’s Force of the Future effort calls for such an office, which could connect to similar activities across the federal government, leveraging the best analytics, capabilities, and talent.
Finding: The existence of DMDC and a unified personnel file has greatly enhanced the ability of the Office of the Secretary of Defense (OSD) to understand the behavior of its personnel and to refine its policies so as to improve both retention and performance. The creation of the Civilian Personnel Data System was a similar achievement.
Finding: There are benefits to be gained by enabling deeper and richer collection and sharing of data, which support a richer picture of the individual. This could in turn allow for better matching of personnel to the needs at hand (e.g., with regard to desired data skills, language proficiencies, and experiences), improved identification of at-risk servicemembers, enhanced management of the force in terms of retention and training, and many other benefits.
Finding: The challenges of data sharing and repurposing are significant; in particular, different data definitions and formatting complicate data merging and linking. Business practices (e.g., methods, procedures, processes, and rules) vary from Service to Service and from one database to another.
Finding: Enhanced data sharing within DoD, across agencies, and with the research community at large could promote the creation of new statistical methods, tools, and products.
Finding: The existence of alternative data sources, such as social media, especially when they are tied to extensive information about individuals, may deliver deep insights relevant to the mission of P&R. Owing to concerns about privacy and appropriateness and to the difficulty of ensuring statistical validity, further pursuit of this path requires careful consideration and additional research.
Recommendation 2: The Office of the Under Secretary of Defense (Personnel & Readiness) should investigate the feasibility of exploiting alternative data sources to augment traditional methods for measuring collective sentiment, evaluating recruitment practices, and classifying individuals (for creditworthiness, perhaps, or for battle-readiness). Hand in hand with this effort there should be an investigation into privacy technology appropriate for these scenarios for data use.
Recommendation 3: The Office of the Under Secretary of Defense (Personnel & Readiness) should identify incentives to enhance data sharing and collection, such as the following:
- Tracking usage of data by source in repositories such as the Person-Event Data Environment and periodically reporting back to data providers on usage (e.g., number of uses, who the users are, and the nature of the study or analysis the data contributed to);
- Providing incremental funding on contracts that involve data collection and organization to cover the costs of archiving and documenting the data for other users; and
- Giving preference to projects for constructing or redesigning operational data systems that include explicit functionality to support data sharing.
Recommendation 4: The Office of the Under Secretary of Defense (Personnel & Readiness) should leverage opportunities to improve access, including better reuse of prior data, tools, and results, and should investigate incentives to increase interagency and inter-Service data sharing.
Recommendation 5: The Office of the Under Secretary of Defense (Personnel & Readiness) should establish a working group with representation from the Services and other elements of the Department of Defense, as appropriate, to
- Identify productive new fields and formats for personnel files, such as enabling the inclusion of unstructured data and free-form text in future records;
- Identify opportunities for data sharing between Services and the Office of the Under Secretary of Defense (Personnel & Readiness) and within Services and lower barriers to such sharing;
- Work with organizations that provide operational data or collect them for analysis to improve data quality by providing standard ways for data users to report problems with data collections and channel those reports back to data providers when appropriate;
- Clarify self-reporting rules and practices;
- Identify legal and regulatory barriers to the appropriate and responsible sharing of data; and
- Examine new hardware and software architectures that facilitate data access and data management.
Finding: The development of the Person-Event Data Environment is a positive step in making some data more easily accessible. However, certain technical and cultural factors deter the use of this tool.
Among its benefits, the PDE:
- Spreads the overall cost of data acquisition, cleaning, ingestion, and linking.
- Reduces the time researchers spend identifying and downloading data, since they work on the data in situ.
- Aims to improve the handling of sensitive data.
- Monitors data usage.
- Provides a group that supports users with data and tool issues.
At the same time, a number of factors deter use of the PDE:
- Sensitive personally identifiable information is susceptible to reidentification and other privacy compromises, such as revelation of sensitive traits or attributes.
- Linkage attacks (in which innocuous data in one data set are used to identify a record in a different data set containing both innocuous and sensitive data) can be carried out via external data sets brought into the PDE by researchers.
- Review processes for access to some data are lengthy.
- Delays in the review process for export of analysis results deter publication and peer review.
- The hurdles to becoming a PDE user mean that the current user community is much smaller than intended.
- Some users have been limited by the computational power, memory, and tools of the current installation.
- The PDE does not solve completeness and quality issues in the underlying data sources.
- There is no systematic mechanism for reporting data problems.
- Some PDE users say they have been given conflicting statements about the ownership of external data uploaded into the PDE.
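The linkage attack noted above can be made concrete with a small sketch in the style of Sweeney's classic demonstration: an "anonymized" release is joined to a public roster on quasi-identifiers. All records, names, and field values below are invented for illustration.

```python
def link(anonymized, external, keys):
    """Join an 'anonymized' data set to an external one on
    quasi-identifiers; a unique match reidentifies an individual."""
    matches = {}
    for ext in external:
        hits = [a for a in anonymized
                if all(a[k] == ext[k] for k in keys)]
        if len(hits) == 1:            # unique match: reidentified
            matches[ext["name"]] = hits[0]
    return matches

# Invented release: names dropped, but quasi-identifiers retained.
released = [
    {"zip": "13601", "birth_year": 1988, "sex": "M", "diagnosis": "PTSD"},
    {"zip": "13601", "birth_year": 1991, "sex": "F", "diagnosis": "none"},
]
# Invented public roster a researcher might bring into the environment.
roster = [
    {"name": "J. Doe", "zip": "13601", "birth_year": 1988, "sex": "M"},
]
reidentified = link(released, roster, ["zip", "birth_year", "sex"])
```

Because only one released record shares J. Doe's ZIP code, birth year, and sex, the join attaches the sensitive diagnosis to a named individual, which is why external data imports into an environment like the PDE require careful review.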
Recommendation 6: The Defense Manpower Data Center should assess how well the Person-Event Data Environment is working and whether it is serving its intended community. In doing so, the center should consider taking the following steps to improve the usability of the Person-Event Data Environment and enhance its value:
- Assess if current privacy and security policies are adequate, taking into account modern methods of attack and sources of auxiliary information that can aid in these attacks, such as multiple releases of statistics and data sets (Ganta et al., 2008), linkage attacks that make use of public sources (Sweeney, 1997; Narayanan and Shmatikov, 2008), and chronological correlations with public sources (Calandrino et al., 2011).
- Analyze data usage information, both for privacy purposes and for determining the value of assets.
- Do a better job of establishing and defining a user community for knowledge sharing. This includes improving relationships with the federally funded research and development centers doing work
for the Department of Defense and determining which researchers would benefit from the capabilities of the Person-Event Data Environment.
- Remove unnecessary barriers for researchers to gain access to the system.
- Enhance computational power, memory, and tools.
- Respond to concerns about the quality and comprehensiveness of available data.
- Develop an explicit process for reporting data problems.
- Clarify data ownership rights to external data that are uploaded and merged.
- Assess protocols for accessing personally identifiable information.
- Review the approval process for exporting analysis results.
- Consider widening access to the data and/or rebalancing Institutional Review Board requirements by establishing a differentially private interface.6
Institutional Review Boards
Institutional review boards (IRBs) normally oversee human-subjects research to ensure that the work is conducted ethically. The human-subjects protection regulations promulgated in 1981 (45 C.F.R. § 46) and the revisions known as the “Common Rule,” issued in 1991, delineate human-subjects research policies, and they apply to FFRDCs and other groups that work with DoD on P&R research projects; these regulations do not apply to P&R data analyses conducted solely for operational purposes. For most studies that assemble enough human data to infer useful results, IRB oversight is necessary.
The committee was told by a number of analysts from those nonfederal research organizations that they are sometimes required to satisfy multiple reviews by IRBs in both their own institutions and DoD, and they often feel this duplication is unnecessary. In addition, there is sometimes a lack of clarity as to which DoD IRB should conduct a review of protocols. In at least one case, a local commander successfully insisted that a study gathering data from his command had to be approved by his IRB, even though IRBs from both DoD and the FFRDC had already approved the study. The committee was told how this duplication adds steps and considerable time to an already lengthy process. Sometimes an IRB may require changes that then need to be resubmitted to and reviewed by other IRBs, further extending the timeline.
Similar problems have been noted by others, and the 2014 National
Research Council (NRC) report Proposed Revisions to the Common Rule for the Protection of Human Subjects in the Behavioral and Social Sciences includes numerous recommendations for improving IRB processes. For example, it recommends establishing single IRBs of record for multisite studies.
Another kind of barrier is the requirement imposed by the Paperwork Reduction Act that U.S. government agencies obtain approval from the Office of Management and Budget (OMB) before collecting information from groups of more than nine persons. The committee was informed that in some cases FFRDC analyses using focus groups had been limited to information gathered from only nine people, even though a considerably larger sample would have been desirable, because it was thought impractical to obtain OMB approval within the time allotted for the study.
Finding: Reviews by multiple Institutional Review Boards can significantly slow down the research process and add months or years to the time it takes for researchers to have access to DoD data. This creates a serious problem for responding to policy needs in a timely manner.
Recommendation 7: In order to support timely and efficient research, the Office of the Under Secretary of Defense (Personnel & Readiness) should encourage streamlining of Institutional Review Board processes that involve multiple organizations—for example, federally funded research and development centers and the Department of Defense.
Recommendation 8: The Department of Defense should carry out research on the feasibility of differential privacy methods for its personnel analytics. These methods could reduce the need for Institutional Review Board oversight.
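To make the idea behind Recommendation 8 concrete, the sketch below shows the classic Laplace mechanism for a counting query, one of the simplest forms of differential privacy. This is a minimal illustration, not a proposal for how DoD should implement such an interface; the personnel records, the query, and the epsilon value are all invented for the example. A counting query has sensitivity 1, so adding Laplace noise with scale 1/epsilon yields epsilon-differential privacy for that single query.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample zero-mean Laplace noise via inverse-CDF transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    """Differentially private count.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon
    provides epsilon-differential privacy for this query.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical personnel records (years of service only, for illustration).
records = [{"years_of_service": y} for y in range(30)]

# A smaller epsilon means more noise and stronger privacy protection.
noisy = dp_count(records, lambda r: r["years_of_service"] >= 20, epsilon=0.5)
```

An analyst querying through such an interface would receive only the noisy answer, never the underlying records; the privacy budget consumed across repeated queries is what a production system would have to track.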
Privacy and Confidentiality
As discussed in Chapter 5, the Fair Information Practice Principles (FIPPs) that regulate the federal government’s maintenance, collection, use, and dissemination of personal information in systems of record outline that an individual has a right to know which data are collected and how they are used, as well as to object to some uses and to correct inaccurate information. The collecting organization for its part must ensure that the data are reliable and kept secure. As data collection on individuals continues, and as data sharing becomes more common, these are important principles for DoD and P&R to keep in mind.
One way to do this is for DoD to adopt or adapt the privacy and governance structure developed by the Office of Management and Budget for
civilian statistical agencies. The Confidential Information Protection and Statistical Efficiency Act (CIPSEA) provides a uniform set of confidentiality protections for information collected while adhering to stringent privacy laws governing many agencies. It serves as a standardized guide for agencies to protect the release of survey participants’ information, both by obtaining their consent at time of collection to use their data in statistical data products and by protecting the information that would allow for identification of the survey respondent.
Recommendation 9: The Department of Defense should consider adopting or adapting the privacy and governance structure developed by the Office of Management and Budget for civilian statistical agencies. In particular, the department should follow the guidance on use of administrative records and establishing of statistical units under the Confidential Information Protection and Statistical Efficiency Act for both military and civil service personnel. In doing so, the department should examine the applicability of Fair Information Practice Principles in the treatment of Defense Manpower Data Center data.
Recommendation 10: The Defense Manpower Data Center, in its role as steward of the Person-Event Data Environment, should consider ways to adapt and use privacy and governance practices that the Office of Management and Budget has created for civilian use.
Currently, barriers exist to the effective use of analytic methods that support P&R’s policy decision making. For a variety of data sources and systems, stronger analytic approaches are needed to shorten the time required to go from raw observations to analyses to decisions. The committee found an uneven use of data science throughout the DoD personnel offices, with some areas having advanced skills and others just beginning to incorporate entry-level analytics to inform decision makers.
The potential for strengthening and expanding the use of data science methods in P&R aligns well with Secretary Carter’s recent comments on the Force of the Future. There, he outlined a number of goals that have a direct relationship to the use of data analytics in DoD, including the establishment of an Office of People Analytics (OPA) to better harness DoD’s big data capabilities in managing its talent. While few details have been released at the time of this report’s publication, OPA is designed to provide direct analytic support to the Services and OSD. Such support would give researchers and analysts better access to data on personnel characteristics and would allow them to conduct comprehensive analyses on how policy
or environmental changes will affect the performance or composition of the workforce. OPA will be prepared to partner with the Services and OSD on questions pertaining to recruiting, hiring and retention, succession planning, and training and would improve their ability to match the talents of individual personnel to the talents demanded by the jobs they are assigned to.
Improved data and descriptive, predictive, and prescriptive analytics, which are discussed in general terms in Chapter 4, could be more effectively used to explore and evaluate policy options. Following are the six principal outcomes for which P&R is responsible (introduced in Chapter 2) and the committee’s observations about how advanced data analytics could help strengthen the way they are addressed:
- Ensuring DoD can recruit, train, motivate, and retain the necessary numbers and qualities of personnel. Improved access to and sharing of data, and possibly the use of novel data sources, allow for a deeper understanding of how best to recruit, train, motivate, and retain the necessary numbers and qualities of personnel. The private sector faces many similar personnel challenges and has developed and deployed descriptive, predictive, and prescriptive analytics to address these workforce issues. There are unique considerations for DoD, but understanding the large-scale efforts in the private sector is a starting point. Important workforce issues can then be examined through variations of these methods, coupled with DoD data.
This report’s recommendations to establish a process for identifying domains in which data sharing would be of biggest benefit; develop a working group from the Services and other elements of DoD to improve coordination across groups; and leverage opportunities to improve and incentivize data sharing would all help this P&R area.
- Creating incentives that guide DoD to an “optimal” mix of personnel. Increased use of prescriptive analytics could help inform staffing-level decisions for DoD personnel. This might include the various methods associated with both stochastic models (e.g., analytical, numerical, and simulation methods) and mathematical optimization (e.g., stochastic optimization and stochastic optimal control methods) to study options and determine the best decisions.
Controlled experiments such as small-scale pilot projects could also help inform policy decisions relating to this area (e.g., the effectiveness of incentives). The outcome of these controlled experiments could then be used as input into the prescriptive analytics models and methods, thus improving the analytics for future use.
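As a toy illustration of the stochastic-modeling side of this bullet, the sketch below computes the expected steady-state force size when a fixed cohort enters each year and each member continues to the next year with a constant probability. The accession count and continuation rate are invented for illustration; real personnel-flow models would use year-of-service-specific continuation rates estimated from DoD data.

```python
def steady_state_force(annual_accessions: int, continuation_rate: float,
                       max_years: int = 30) -> float:
    """Expected force size under a simple cohort-flow model.

    Each year a cohort of `annual_accessions` enters; a member who has
    served y years is still present with probability continuation_rate**y.
    Summing expected survivors across all cohorts in the force gives the
    expected total strength.
    """
    return sum(annual_accessions * continuation_rate ** y
               for y in range(max_years))

# With 10,000 hypothetical accessions per year and an 85 percent annual
# continuation rate, the force converges toward accessions / (1 - rate)
# as the career-length cap grows.
size = steady_state_force(10_000, 0.85)
```

A policy analyst could rerun such a model with the continuation rate shifted by the estimated effect of an incentive (taken, for example, from a controlled experiment) to see how steady-state strength would respond, which is exactly the kind of input-output loop described above.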
- Ensuring DoD creates a force that is ready to carry out directed
actions. Readiness data can be explored using improved data analytics, thereby making it easier to detect underprepared units or capabilities and to estimate the impact of any deficiencies. This has the potential to target available resources more effectively and gain greater benefit from investments.
Having a strong data science capability for DoD could also be considered its own aspect of readiness. Since many data analytics advances are being employed in the private sector, it is reasonable to assume that they could also be developed within other military forces. The task of ensuring DoD maintains a force that is ready to carry out directed actions, even with a shrinking budget and growing demand for talent, could be helped by improved analytics.
- Influencing DoD’s decisions that affect the shape of military careers. Improved data access and sharing and improved analytics capabilities could significantly improve the knowledge base for shaping military careers. Better understanding the range of career paths and which positions tend to lead to certain outcomes would be of value to personnel making career choices and to leaders evaluating whether talent is being used well. The private sector has made considerable investments in understanding and shaping career paths, and DoD could benefit from exploring those efforts.
- Ensuring the services supporting DoD personnel are properly structured and provided. Monitoring the services supporting DoD personnel is a data-intensive challenge, and so steps to improve access to data and effective use of data analytics would be helpful. Increased use of prescriptive analytics, including the various methods associated with both stochastic models (e.g., analytical, numerical, and simulation methods) and mathematical optimization (e.g., stochastic optimization and stochastic optimal control methods) in particular could help assess if the services are properly structured and provided.
- Anticipating and responding to sensitive behavioral issues. Data analytics have the potential to help identify individuals at risk for sensitive behavioral issues and ensure that responses are proportionate and consistent. Improved understanding could be enabled both by improved access to existing data as well as to new sources of data. However, use of these data and methods requires careful consideration because of privacy implications, and data analytics must be carried out with care and in compliance with appropriate federal guidelines.
While the preceding examples illustrate some ways in which improved data analytic approaches could assist P&R with its mission, these are only
the beginning of what could be possible. As can be seen, some particularly striking opportunities could follow as capabilities for predictive and prescriptive analytics are built up. As discussed in Chapter 6, the committee does not believe current personnel analytics tools are a good match for DoD needs, in part because they are largely targeted at operational HR decision making and aimed at a market segment that does not match DoD. P&R may gain the most benefit from developing tools guided by P&R’s best understanding of the relative priorities of the various decisions within its mission. Many of P&R’s individual challenges are being addressed using some aspects of data analytics, but the collection of those individual studies falls short of a cohesive plan for how and why to use these approaches across the many domains that P&R informs.
Finding: A wide range of problems are being addressed for P&R using data analytic techniques and the rich data sources discussed in this report. These are often applied in response to specific questions but are not incorporated into a long-term plan.
Finding: Turnkey personnel analytic solutions and currently commercially available software are unlikely to meet P&R’s needs.
Recommendation 11: The Office of the Under Secretary of Defense (Personnel & Readiness) should assess which predictive and prescriptive analyses would benefit its mission over the longer term, taking into account its understanding of which specific decisions could, if evaluated by applying more powerful data and/or methods, better enable the Department of Defense to prepare for future demands it may face. Some possible steps that might follow include these:
- Emphasizing the use of prescriptive analytics in conjunction with predictive what-if scenarios;
- Enhancing prescriptive analytics usage and disseminating best practices across the entire department; and
- Adapting the prescriptive analytics methods successfully used in the private sector for workforce and talent management.
Often P&R needs additional data to evaluate potential policy solutions, and the use of controlled experiments can sometimes provide those data. Controlled experiments are an opportunity for P&R to test a hypothesis by adjusting a key variable in a subset of its population and then measuring the results of the change. Besides the challenge of structuring the experiment,
the resources required and the time delay in obtaining results can inhibit the use of controlled experiments. The urgency of solving a problem frequently leads to deploying solutions without a framework for testing their efficacy except through pre- and post-solution data analysis. The result is an institutional culture in which experiments are the exception rather than the rule (as opposed to, say, the civil health community); however, this is a missed opportunity.
Controlled experiments can be used for a variety of areas important to P&R (e.g., recruiting [Fricker et al., 2015]) and could be particularly helpful in buttressing the case for conclusions that contradict accepted propositions. A case in point is the enthusiasm for 2-year enlistments, endorsed by some political leaders as a means of shifting the supply of enlistees (particularly those from socioeconomic backgrounds that would not otherwise normally lead to enlisted military service). A controlled Army experiment demonstrated that any gain in enlistment supply would be swamped by attrition at the end of the service period, while high turnover and the short period of trained service within the 2-year term would impose unnecessary training costs. Most skill areas require 6 to 12 months of training before assignment to a unit, at which point the individual is still at the initial point of the learning curve (Buddin, 1991).
The NRC (2004) reviewed a variety of experiments and quasi-experiments (in addition to the Buddin study described above) that examined the effects of various recruiting incentives on enlistment, job choice, and other outcomes. For example, Fernandez (1982) reported a study of the effects on high-quality enlistment of differing types and amounts of postservice educational benefits. Polich et al. (1986) examined the effects of varying enlistment bonuses and differing enlistment term obligations on highly qualified enlistments. Sellman (1999) reported on a pilot program allowing recruits to attend 2 years of college prior to reporting for active duty.
Finding: The Department of Defense does not routinely employ controlled experiments to understand causes and effects of the Office of the Under Secretary of Defense (Personnel & Readiness) policies—for example, revisions to enlistment standards or choices affecting family welfare—to judge whether they produce the intended effects and provide benefits that justify their costs.
Recommendation 12: To the extent feasible and relevant, the Department of Defense should conduct carefully structured experiments to test the efficacy of policy.
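Analyzing such an experiment is often straightforward once it has been run. As a minimal sketch, the code below compares attrition rates between a treated group and a control group using the standard two-proportion z test (normal approximation); the counts are hypothetical, chosen only to show the mechanics, and real analyses would also account for clustering, covariates, and multiple comparisons.

```python
import math

def two_proportion_z(successes_a: int, n_a: int,
                     successes_b: int, n_b: int) -> float:
    """Z statistic for comparing two proportions (normal approximation).

    Uses the pooled proportion to estimate the standard error under the
    null hypothesis that both groups share the same underlying rate.
    """
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical pilot: 240 of 1,000 treated recruits attrite versus
# 300 of 1,000 controls.
z = two_proportion_z(240, 1000, 300, 1000)
# |z| > 1.96 would indicate a difference significant at the 5 percent level.
```

The same statistic, fed back into prescriptive models such as the staffing simulations discussed earlier, closes the loop between experimentation and policy analysis.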
The demand for individuals with data science skills is increasing rapidly across all organizations (inside and outside of DoD). For example, the U.S. Bureau of Labor Statistics reports that the expected rate of growth for statisticians is 34 percent (BLS, 2016a) and for operations research analysts (a kind of data science analyst) is 30 percent (BLS, 2016b) over the period 2014-2024 as compared with the average growth rate for all occupations of 7 percent. U.S. News and World Report ranks “operations research analyst” as the fourth best business job, the eighth best STEM job, and the twentieth best job overall by incorporating factors such as job growth and salary (Marquardt, 2015). Other kinds of surveys and rankings provide similar results. Individuals with data science skills that include descriptive, predictive, and prescriptive analytics experience are in strong demand for employment and salary growth. The private sector is fueling this growth; in response, more than 80 university programs have been created over the past 10 years to provide undergraduate and graduate analytics education (INFORMS, 2015).
There are significant opportunities within P&R for increased use of data science to improve a variety of aspects of decision making. It is notable that DoD employs very few individuals with expertise in statistics and optimization, and many quantitatively trained analysts are classified as operations researchers, regardless of their actual training.
Finding: Based on its collective experience with seeing data science mature in other organizations, the committee’s judgment is that P&R’s skills, depth, and resources in data analytics are not sufficient to recognize the full range of analytics opportunities and to implement these methods to better support decision making. It is always problematic to leverage scattered pockets of data science expertise, so raising the general level of awareness and skill would be more effective.
Recommendation 13: The Office of the Under Secretary of Defense (Personnel & Readiness) should create greater awareness of data science methods and disseminate them more thoroughly to its personnel to increase the general understanding of data science and the benefits of its use.
Recommendation 14: The Office of the Under Secretary of Defense (Personnel & Readiness) should enhance education in data science for its personnel, including civil service employees. This education could range from short courses in specific techniques for personnel who
already have the requisite foundational knowledge, to overview seminars for managers who need to be acquainted with what their analytical staff can undertake, to formal degree programs, whether at Department of Defense or civilian universities.
The heart of P&R’s responsibility is policy management for DoD personnel and assessment of their readiness to carry out the tasks the nation assigns. This mission requires high-quality data and the most up-to-date means to analyze them. High-quality, timely data are necessary to understand events now occurring (or that have occurred in the recent past) and to forecast what may happen next, to which policies must be ready to respond, if not to take preemptive action to thwart adverse outcomes. High-quality data are necessary to understand the structure of those events (that is to say, the causal factors), lest the policy choices deal with the symptoms observed rather than the underlying causes.
P&R today commands an extraordinary variety of data sets and data sources when compared to what is available to most cabinet agencies. But much of what is available is based on administrative records or records that are driven in key ways by administrative considerations. What is available can be shaped into data sets that respond to P&R’s policy analysis needs, albeit at some cost in resources and the degree to which the information is available on a timely basis. If P&R desires more timely information, and information that is more complete relative to its needs—including the ability to forecast and evaluate alternative policy decisions, which good policy debate requires—it will need to consider additional investments. And it will need to devise mechanisms for controlling data access that on the one hand protect the variety of equities involved (including privacy), but on the other respond more quickly and adequately to the analytic needs that its own requests have often generated. Secretary Carter’s Force of the Future recommends the creation of an Office of People Analytics. This report is intended to provide a way to confront these issues in a manner that will bring P&R, DoD, and the American military into a commanding position of excellence in managing personnel.
References
Bicksler, B.A., C.L. Gilroy, and J.T. Warner, eds. 2004. The All-Volunteer Force: Thirty Years of Service. Dulles, Va.: Brassey’s.
BLS (Bureau of Labor Statistics). 2016a. U.S. Department of Labor. Occupational Outlook Handbook. 2016-17 Edition, Statisticians. http://www.bls.gov/ooh/math/statisticians.htm. Accessed April 8, 2016.
BLS. 2016b. Occupational Outlook Handbook. 2016-17 Edition, Operations Research Analysts. http://www.bls.gov/ooh/math/operations-research-analysts.htm. Accessed April 8, 2016.
Bowman, W., R. Little, and G.T. Sicilia, eds. 1986. The All-Volunteer Force After a Decade: Retrospect and Prospect. McLean, Va.: Pergamon-Brassey’s.
Buddin, R. 1991. Enlistment Effects of the 2 + 2 + 4 Recruiting Experiment. Santa Monica, Calif.: RAND Corporation. http://www.rand.org/pubs/reports/R4097.
Calandrino, J.A., A. Kilzer, A. Narayanan, E.W. Felten, and V. Shmatikov. 2011. “You Might Also Like”: Privacy risks of collaborative filtering. Pp. 231-246 in Proceedings of the 2011 IEEE Symposium on Security and Privacy. May 22-25.
Carter, A. 2015. Remarks on “Building the First Link to the Force of the Future” as Delivered by Secretary of Defense Ash Carter. George Washington University Elliott School of International Affairs, Washington, D.C., November 18. http://www.defense.gov/News/Speeches/Speech-View/Article/630415/remarks-on-building-the-first-link-to-the-force-of-the-futuregeorge-washington.
Fernandez, R. 1982. Enlistment Effects and Policy Implications of the Educational Test Assistance Program. Report R-2935-MRAL. Santa Monica, Calif.: RAND Corporation.
Fredland, J.E., C. Gilroy, R.D. Little, and W.S. Sellman, eds. 1996. Professionals on the Front Line: Two Decades of the All-Volunteer Force. Washington, D.C.: Brassey’s.
Fricker, R.D., Jr., S.E. Buttrey, and J.K. Alt. 2015. Future Navy Recruiting Strategies. Naval Postgraduate School. http://faculty.nps.edu/rdfricke/docs/CNRC_technical_report.pdf.
Ganta, S.R., S. Kasiviswanathan, and A. Smith. 2008. Composition attacks and auxiliary information in data privacy. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. http://www.cse.psu.edu/~ads22/privacy598/papers/gks08.pdf.
INFORMS (Institute for Operations Research and the Management Sciences). 2015. “Analytics and OR/MS Education.” https://education.informs.org. Accessed November 17, 2015.
Losey, S. 2016. “Need to Know, 2016: Air Force Offers Three-Year Sabbaticals.” Air Force Times. January 1. http://www.airforcetimes.com/story/military/careers/air-force/2016/01/01/need-know-2016-air-force-offers-three-year-sabbaticals/77762608/.
Marquardt, K. 2015. “Best Business Jobs: Operations Research Analysts.” Overview. U.S. News and World Report. http://money.usnews.com/careers/best-jobs/operations-research-analyst.
Narayanan, A., and V. Shmatikov. 2008. Robust de-anonymization of large sparse datasets. Pp. 111-125 in Proceedings of the 2008 IEEE Symposium on Security and Privacy. doi: 10.1109/SP.2008.33.
NRC (National Research Council). 2004. Evaluating Military Advertising and Recruiting: Theory and Methodology. Washington, D.C.: The National Academies Press.
NRC. 2014. Proposed Revisions to the Common Rule for the Protection of Human Subjects in the Behavioral and Social Sciences. Washington, D.C.: The National Academies Press.
Polich, M., J. Dertouzos, and J. Press. 1986. The Enlistment Bonus Experiment. Report No. R-3353-FMP. Santa Monica, Calif.: RAND Corporation.
Schechter, E. 2016. “Defense Secretary Outlines Strategies, Goals.” Pensacola News Journal. January 30. http://www.pnj.com/story/news/military/2016/01/30/defense-secretary-outlines-strategies-goals/79536406/.
Sellman, W.S. 1999. Military Recruiting: The Ethics of Science in a Practical World. Invited address to the Division of Military Psychology, 107th annual convention of the American Psychological Association, Boston, Mass.
Serbu, J. 2016. “DoD Bids to Make Military Life More ‘Family-Friendly.’” Federal News Radio. January 29. http://federalnewsradio.com/defense/2016/01/dod-doubles-maternity-leave-bid-make-military-life-family-friendly/.
Stark, S., O.S. Chernyshenko, F. Drasgow, C.D. Nye, W.L. Farmer, L.A. White, and T. Heffner. 2014. From ABLE to TAPAS: A new generation of personality tests to support military selection and classification decisions. Military Psychology 26:153-164.
Sweeney, L. 1997. Weaving technology and policy together to maintain confidentiality. Journal of Law, Medicine and Ethics 25(2-3):98-110.
Tene, O., and J. Polonetsky. 2014. A theory of creepy: Technology, privacy, and shifting social norms. Yale Journal of Law and Technology 16(1):2.
Wilmoth, J.M., and A.S. London. 2013. Life-Course Perspectives on Military Service. New York: Routledge.