The committee was tasked to “gather, review, and discuss the business management and the behavioral science literature on generational attitudes and behaviors in workforce management and employment practices.” As discussed in Chapter 3, research on generations has a long history, but attention to generations and work-related attitudes and behaviors is fairly recent, with empirical studies notably on the rise in the past 20 years (Costanza et al., 2017). In its search for relevant scientific literature, the committee identified more than 500 articles that have been published since 1980 (see Appendix A for detail on the committee’s search strategy and the literature identified for this review). This chapter summarizes our findings and conclusions about the state of this body of research, referred to collectively here as generational research or generational literature. In our review, we drew on findings from previous reviews of this literature and critiques of the dominant methodologies used in these studies, and we conducted a pilot review of a small subset of the articles to confirm our findings.
Since the National Academies letter report (National Research Council [NRC], 2002) discussed in Chapter 1 was published nearly 20 years ago, the amount of empirical research on work-related generational claims has increased considerably. Several scholars have noted the paucity of empirical studies published before the late 1990s (Costanza et al., 2017; Parry and
As the idea of generational differences in the workforce has grown in popularity (see Chapter 3), new lines of inquiry, based primarily in the disciplines of psychology and business management, have adopted early sociological theories on generational shifts and social change (Mannheim, 1952; Riley, 1987; Ryder, 1965) as a framework for characterizing individual attitudes and behaviors. Empirical studies seeking to verify or identify generational differences have measured work-related attitudes and values, often with the assumption that attitudes and values directly influence behaviors in the workplace. Very few studies focus specifically on actual workplace behaviors,¹ because such data are difficult to collect. Instead, most research is based on self-reported responses to surveys. Moreover, research has varied greatly with respect to the questions (i.e., item responses) and length of surveys used to operationalize values or attitudinal variables. In general, many of the values measured can be categorized in terms of work ethic, work centrality and leisure, altruistic values, and extrinsic values (Twenge, 2010). With regard to attitudes, most empirical studies focus on job satisfaction, organizational commitment, and intent to leave (Costanza et al., 2012; Parry and Urwin, 2011). Other scholars have remarked on the paucity of studies examining generational differences in training, motivation, and leadership (Rudolph, Rauvola, and Zacher, 2018). There is, however, much research examining the relationship of age to work motivation (Inceoglu, Segers, and Bartram, 2012; Kooij et al., 2011) and job performance (Ng and Feldman, 2008, 2010b). Further discussion of the literature related to the aging workforce is included in Chapter 5.
Much of the generational work has been undertaken with the purpose of helping organizations better understand how to recruit, develop, retain, and/or motivate individuals from different generations (Williams, 2019). Some of this work has been undertaken to test the very notion of generational differences, often testing assumptions or findings of other studies by using more sophisticated methods or data sources (Costanza et
¹ Some studies have tried to better understand generational differences in job turnover, but most of these rely on survey responses about intentions to leave a job rather than on actual job switching. One study (Enam and Konduri, 2018) did examine changes in time engagement behaviors over time and found that those born between 1982 and 2000 were more likely than people born earlier to delay entry into the workforce and to remain students in higher education for longer. This study did not look at the time engagement behaviors of the subsequent generation, so it is not clear whether this finding is specific to “millennials” or part of a more general upward trend toward more education and later entry into work.
- “Although in some areas (e.g., work centrality) time-lag and cross-sectional studies are fairly congruent, in other cases they disagree. Where they are discrepant, the most logical explanation is that the cross-sectional study is also tapping differences due to age or career stage…. The other possibility is that the time-lag studies are finding a time period effect, i.e., all generations have changed over time in the same way” (Twenge, 2010, p. 206).
- “… current [cross-sectional] approaches adopted for the investigation of generations across most studies are fundamentally flawed…. Any study, whether quantitative or qualitative, that only considers a group of individuals at one point in time is unable to distinguish between age, period, and [generation] effects…. The only way to achieve any insight into these three different effects is to investigate generational differences using longitudinal data…. [Unfortunately,] there are limited numbers of datasets that (either quantitatively or qualitatively) ask the same questions of individuals, of different ages, as part of a panel or repeated cross-section, over decades” (Parry and Urwin, 2017, p. 142 and 146).
- “… none of the three commonly used approaches [in the generational literature] fully and accurately partitions the variance to age, period, and cohort in a generations context…. The linear dependency created when defining generations as the intersection of age and period creates an unresolvable identification problem, making it very difficult to isolate the unique effect of any one of the factors” (Costanza et al., 2017, p. 161).
Given these conceptual and methodological issues, discussed further below, it is exceptionally difficult to draw firm conclusions about generational differences in work-related variables. The current state of evidence suggests that any observed differences among workers are more likely to reflect age differences at the time of measurement or evolving social and work conditions as a result of historical events impacting all people (see Chapter 2) than true generational distinctions. Further, the few studies that have used datasets with measures of attitudes over time have found weak generation effects, indicating that the variability in work-related attitudes and behaviors within generational groups is likely to be larger than the variability among generations. That is, individuals from the same “generation” are just as likely to differ from one another as from individuals of different generations.
Conclusion 4-1: Many of the research findings that have been attributed to generational differences may actually reflect shifting characteristics of work more generally or variations among people as they age and gain experiences.
Research on generations has varied across disciplines, and the use of the term “generation” has typically fallen into two different perspectives: one focused on descent and lineage-based linkages, prominent in such fields as anthropology, and the other on shared experiences of a group of people of similar age, more prominent in sociology (Burnett, 2011; Joshi, Decker, and Franz, 2011). Much of the generational literature on work-related variables discussed in this report draws on early theories in sociology (Mannheim, 1952; Riley, 1987; Ryder, 1965) regarding generations and social change (see Chapter 3). Instead of studying social change, however, the bulk of the work-related generational research adapts these sociological theories to the study of individual attitudes and values. Three elements of these early theories have been used as a basis for recent empirical studies: (1) significant events occur within a society that broadly affect a societal group of individuals; (2) these events impact a particular cohort—a birth cohort in their formative years of late adolescence/early adulthood; and (3) events that happen during people’s formative years exert a continuous influence on their thoughts and behaviors as they age—hence the emergence of a generation (Parry and Urwin, 2017).
The scientific literature contains many variations of the definition of the concept of generations, based in part on the conceptualization of Mannheim.² These include
- “an identifiable group that shares birth years, age, location, and significant life events at critical developmental stages” (Kupperschmidt, 2000, p. 66);
- “a cultural field in which social agents participate to varying degrees dependent upon their structural location within society” (Gilleard, 2004, p. 117);
- “a group of individuals, who are roughly the same age, and who experience and are influenced by the same set of significant historical events during key developmental periods in their lives, typically late childhood, adolescence, and early adulthood” (Costanza et al., 2012, p. 377); and
- “[a group] of individuals born during the same time period who experience a similar cultural context and, in turn, create the culture (Gentile, Campbell, and Twenge, 2013)” (Campbell et al., 2015, p. 324).
² In the seminal work of sociologist Karl Mannheim, “The Problem of Generations” (1952), “generation” suggests a group of individuals of a similar age and a similar location who experience similar social, historical, and life events (Lyons and Kuron, 2014; Parry and Urwin, 2011).
In conducting empirical research, such theoretical concepts as generations need to be operationalized so they can be linked to variables that can be measured and studied. The definitions above suggest that the concept of generation is a complex mix of age, location, and context; however, it rarely is operationalized as such. The concept has been difficult to operationalize, and in many studies, birth cohort is used as a proxy for generation (Brink, Zondag, and Crenshaw, 2015; Parry and Urwin, 2011). For researchers, it is straightforward to classify individuals by birthdates but much more difficult to know whether those individuals of the same birth cohort were also exposed to the same sets of experiences. Many resources, popular and academic, add to the confusion by using the concept of generations interchangeably with that of age groups. This is particularly noticeable in resources that draw solely on one-time cross-sectional studies (see the discussion below).
The approach in most studies reviewed by the committee is to take predefined cohorts based only on birth year as representing distinct generations (Parry and Urwin, 2011). The labels and a range of birth years for each generation are generally assumed (see Chapter 3); across studies, however, there is substantial variation as to the exact starting and ending year for each group (see Costanza et al., 2012; Rudolph, Rauvola, and Zacher, 2017). For example, the label “baby boomers” is often used to apply to people born between 1946 and 1964, although, depending on the study and its criteria for categorizing research subjects, this range of birth years can be shorter or longer. As a result, generational categories are inconsistent across studies. Moreover, this lack of consensus on birth years for different generations indicates that there has been no empirical justification for any birth-year boundaries.
Using just birth years to define cohorts assumes that the influence of proximal historical events and social, cultural, and economic phenomena on those cohorts’ individual members has already been established, an assumption that is largely untested. Thus, the generational research tends to take as antecedent an undefined set of shared experiences assumed to have shaped the attitudes/values to be measured. While a few articles make reference to what those influences might be (e.g., growing up with new technologies or the lasting influence of significant events during adolescence, such as war, the moon landing, or the terrorist attacks of September 11, 2001), none of them defines or investigates the mechanisms through which a specific event or phenomenon directly shapes the variable of interest (e.g., job satisfaction, work centrality). Because this research has generally adapted existing theories and conceptualizations of generations without examining assumptions built into the definition of generational groups, its theoretical contribution has been somewhat limited.
In addition to the theoretical issues of defining generations and specifying the precise mechanisms that differentiate them, other methodological issues make generations a challenging topic to study: the data needed to address generational questions related to the workplace rigorously are often difficult to obtain, and appropriate statistical approaches for studying generations are complex. This section examines some of these methodological challenges of separating out age, period, and cohort effects, and then reviews analytic approaches currently used to draw generational inferences and the concerns they raise. It also looks at other methodological concerns involving measurement and sampling.
Challenges of Separating Age, Period, and Cohort Effects
Research aimed at identifying and determining the extent of generational influences in the workplace essentially tries to separate generation effects from age or period effects (see Box 4-1). These three concepts and the statistical and methodological challenges associated with isolating the influences of each are present in large bodies of work in the fields of demography, economics, epidemiology, political science, psychology, and sociology (see, e.g., Hobcraft, Menken, and Preston, 1985; Keyes and Li, 2012; Yang and Land, 2013). An understanding of these concepts is foundational for determining whether workforce attitudes, values, and behaviors are attributable to generational differences, developmental differences between younger and older workers, or broad historical or social forces that impact all workers regardless of their age and generation.
In much of the academic work in these fields, age, period, and cohort are statistical concepts reflecting the impact of proximal causal processes on observed differences among people. Just knowing there is evidence for a cohort or age effect, in a statistical sense, would not tell researchers why any such differences are observed among workers or anything about the processes that created them. For example, researchers might study muscle mass and find an age effect. The observation of declines in muscle mass that are attributable to age does not explain how the functioning changes in terms of biological processes. A presumably large number of mechanisms might be relevant. The point is that the statistical modeling does not provide clear guidance as to why and how muscle mass changes with age; it simply identifies an association between age and muscle mass differences. This is an important perspective to keep in mind when considering the literature that attempts to isolate age, period, and cohort effects and interpreting results from particular studies. Theoretical knowledge and insight into mechanisms are critical. As discussed above, however, little theoretical work has examined the mechanisms that could be responsible for differences among generations. Further, it turns out to be a challenging task even to separate cohort effects from age and period effects.
It is helpful to begin with an example to illustrate the challenges of separating out these effects. Table 4-1 shows the ages of individuals of different birth cohorts at different periods in time (i.e., year of observation). The ages shown in this table are restricted to those that might be observed commonly by workplace managers, such as individuals between the ages of 20 and 70. Careful inspection of Table 4-1 gives a sense of the relations among age, period, and cohort. For example, the 20-year-olds in 2010 were born in 1990, whereas the 40-year-olds in that year were born in 1970. The table illustrates that managers in the 2010 workplace would not have observed workers in the 2000 birth cohort given that such people would be only 10 years old. And forecasting whether any generational differences will persist into the future is hindered by the impossibility of a manager in 1990 having observed a member of the 2000 birth cohort. In short, there is a limit to how much cohort-related variation can be observed, and this holds true for both managers and social scientists.
TABLE 4-1 Observed Ages of Different Workplace Cohorts at Different Periods
|Period (Current Year)|
NOTES: NA = not applicable. For illustrative purposes, the table assumes that individuals below age 20 or above age 70 will not be observed in the workforce (NA). “Impossible” refers to the fact that it would be impossible to observe someone of that birth year at a given period.
SOURCE: Generated by the committee.
Table 4-1 also illustrates that researchers taking a cross-section of workers from any given year will immediately face a problem in drawing sound inferences (see the yellow-highlighted column in Table 4-1): members of a given cohort will differ from members of other cohorts not only by cohort but also by age. Thus, any differences among these workers could be due to the effects of either age or cohort; at any single point in time (or in any period), age and cohort are confounded, so there is no way to separate the influence of the two in such cross-sectional comparisons. Researchers cannot know whether the 20-year-olds are different from the 40-year-olds because of where they are in their life span (an age effect) or the unique experiences of their generation (a cohort effect). In other words, the differences found in cross-sectional studies are consistent with either an age or a cohort effect, and nothing in the design constrains the inferences any further. Many of the studies reviewed for this report use a cross-sectional design, and as such, they offer insufficient internal validity (i.e., the study design makes it impossible to eliminate alternative explanations for any findings) to answer questions about generation effects. Such studies are therefore of limited value when thinking about the utility of generational distinctions.
One way to approach this limitation of cross-sectional studies is to compare responses from a sample of 40-year-old participants taken at different points in time. For example, if data were available on the commitment of workers to their organization in 1990 and 2020, it might be possible to compare the 40-year-olds in 2020 with the 40-year-olds in 1990 (see the orange-highlighted cells in Table 4-1). This approach holds age constant. Here again, however, a problem arises: because the workers from the different birth cohorts are observed at different times, any differences between these workers could be due to the effects of period rather than generation. It is possible that all workers observed in 2020 are less committed to their jobs compared with workers in 1990.
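The relations among age, period, and cohort described above can be sketched in a few lines of Python. This is a minimal illustration, not a reproduction of Table 4-1: the specific birth cohorts and periods (10-year steps) and the helper name `observed_age` are assumptions chosen to match the examples in the text, and the 20-to-70 age window follows the table notes.

```python
# Illustrative sketch: the cohorts, periods, and helper names are assumptions
# chosen to mirror the examples in the text, not the committee's actual table.
MIN_AGE, MAX_AGE = 20, 70  # working-age window assumed in the table notes


def observed_age(cohort, period):
    """Return a cohort's age in a given year, or a status label."""
    age = period - cohort  # the defining identity: age = period - cohort
    if age < 0:
        return "Impossible"  # not yet born in that period
    if age < MIN_AGE or age > MAX_AGE:
        return "NA"  # outside the assumed workforce ages
    return age


cohorts = range(1930, 2001, 10)  # hypothetical birth years
periods = range(1990, 2021, 10)  # hypothetical years of observation

grid = {c: {p: observed_age(c, p) for p in periods} for c in cohorts}

# Print the grid: rows are birth cohorts, columns are periods.
print("cohort " + " ".join(f"{p:>10}" for p in periods))
for c in cohorts:
    print(f"{c:>6} " + " ".join(f"{str(grid[c][p]):>10}" for p in periods))

# Cross-section (one period): every observable cohort has exactly one age,
# so age and cohort cannot be told apart.
cross_section_2010 = {
    c: grid[c][2010] for c in cohorts if isinstance(grid[c][2010], int)
}

# Time-lag (one age): each period supplies a different cohort of 40-year-olds,
# so period and cohort cannot be told apart.
forty_year_olds = {p: p - 40 for p in periods}
```

In the single-year slice, knowing a worker's cohort fixes that worker's age; in the fixed-age comparison, knowing the year fixes the cohort. Either design therefore leaves two of the three factors perfectly confounded.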
This discussion demonstrates that researchers attempting to separate out age, period, and cohort effects must struggle with what is known in the literature as the “identification problem”: the exact linear relationship among age, period, and cohort (age = period − cohort, where age is years since birth, period is current year, and cohort is birth year [see Fosse and Winship, 2019]). The identification problem makes it challenging to design a study that can distinguish cohort from age effects or cohort from period effects without making certain assumptions. In general, researchers assume that one of the three factors (age, period, or cohort) has a roughly zero effect. For example, researchers wishing to draw inferences about generations from a cross-sectional study must assume that the age effect is zero, an assumption often debated since a large body of work points to age-related differences in many of the variables of interest to those studying generations (Inceoglu, Segers, and Bartram, 2012; Kanfer and Ackerman, 2004; Kooij et al., 2011; Parry and McCarthy, 2017; Roberts, Walton, and Viechtbauer, 2006).
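The following sketch shows why the identification problem has no purely statistical solution. It assumes a simple linear model with invented coefficients and a handful of hypothetical workers; the function name `predict` and all numbers are illustrative only. Two different sets of effects, one with an age effect and one with none, produce identical predictions for every worker.

```python
# Illustrative sketch with made-up coefficients; no real data are involved.
def predict(age_b, period_b, cohort_b, period, cohort):
    """Linear prediction for one worker; age is implied by period and cohort."""
    age = period - cohort
    return age_b * age + period_b * period + cohort_b * cohort


# Hypothetical workers as (period, cohort) pairs.
workers = [(1990, 1950), (2010, 1970), (2010, 1990), (2020, 1980)]

# Account 1: a pure age effect, with no period or cohort effect.
age_story = (0.05, 0.0, 0.0)

# Account 2: shift the coefficients by k. Because age - period + cohort = 0
# for every worker, adding k * (age - period + cohort) to the model changes
# the coefficients but leaves every prediction unchanged.
k = -0.05
no_age_story = (age_story[0] + k, age_story[1] - k, age_story[2] + k)

preds_age = [predict(*age_story, p, c) for p, c in workers]
preds_no_age = [predict(*no_age_story, p, c) for p, c in workers]

same = all(abs(a - b) < 1e-9 for a, b in zip(preds_age, preds_no_age))
print(no_age_story)
print(same)
```

Because the data cannot favor one account over the other, a choice between them has to come from an outside assumption, such as setting one effect to zero, which is the step the text describes.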
Thus, researchers must be careful to understand the limitations of their data in the context of the research question they are attempting to answer (Blalock, 1967; Cohn, 1972; Costanza et al., 2017; Fosse and Winship, 2019). Moreover, researchers need data from multiple periods along with data on multiple ages and cohorts to even begin this process. Figure 4-1 illustrates the possible findings from a hypothetical study with enough data to distinguish age, period, and cohort effects. The next section covers some of the different designs researchers have used to study generational differences in the workforce, including the over-used cross-sectional designs and improved designs for parsing age, period, and cohort effects.
Current Analytic Approaches
As discussed above, the vast majority of studies have sought to identify generational differences using cross-sectional designs. Other quantitative methods used in this research have included cross-temporal meta-analyses and complex multilevel statistical models applied to nested datasets (see Appendix A for details). These approaches entail different ways of attempting to isolate age, period, and cohort effects, with varying degrees of success (see also Table 4-2).
Cross-sectional surveys: the approach of comparing groups of people of different ages using an instrument administered to a single sample at a single point in time. As discussed previously, such a design confounds age and cohort effects. Period effects cannot be detected because all groups complete the survey at the same time, which holds period constant.
TABLE 4-2 Overview of the Analytic Approaches
| Analytic Approach | Data Requirements | Advantages and Disadvantages |
| --- | --- | --- |
| Cross-sectional surveys (e.g., see list of studies in Appendix A) | Data from a survey administered to a single sample across multiple ages or generations at a single time point, analyzed and summarized statistically (means, standard deviations) for each age group | Period effects are held constant, but cohort and age are confounded |
| Cross-temporal meta-analyses (e.g., Campbell, Twenge, and Campbell, 2017; Twenge, Campbell, and Freeman, 2012; Twenge and Campbell, 2001, 2008) | Descriptive statistics (sample means, standard deviations) from studies of people of the same age sampled at different time points | Controls for age effects, but period and cohort are confounded |
| Multilevel models (e.g., Donnelly et al., 2016; Jürges, 2003; Kalleberg and Marsden, 2019; Koning and Raterink, 2013; Kowske, Rasche, and Wiley, 2010) | Individual-level data collected from multiple survey panels repeatedly over extended periods of time | Partitions the variance attributable to age, period, and cohort (generation); relatively few available datasets have information relevant to workplace considerations; statistical assumptions are often complex and untested |
SOURCE: Generated by the committee with information from Costanza et al., 2017.
Cross-temporal meta-analyses: the approach of extracting descriptive statistics (often measures of central tendency, such as sample means) from studies conducted at different points in time. These descriptive statistics are combined using meta-analytic techniques and usually weighted for precision by the number of observations available for each time point. The objective is to test whether aggregated estimates vary because of when the data were collected. If the underlying studies used age-restricted samples (e.g., samples of high school students, college students, or new Army recruits), it is common for the results to be used to draw inferences about generations. As previously discussed, however, an observation of differences across different years could be attributable to period rather than cohort effects. For example, researchers might identify all studies that administered the same measure of organizational commitment across different years. The sample means for organizational commitment would be recorded from each study, along with the year of data collection. The sample means would be combined for each year to generate a more precise estimate of organizational commitment by pooling results from multiple studies. Researchers could then test whether the pooled averages for each year showed any systematic fluctuations across different years. Such a systematic pattern would suggest changes in organizational commitment over time; however, it would be impossible to discern whether any observed changes were due to period or generation effects.
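The pooling step in the organizational commitment example can be sketched as follows. The study summaries are invented for illustration, the function name `pooled_means_by_year` is hypothetical, and weighting by sample size alone is a simplifying assumption; published cross-temporal meta-analyses may use other weighting schemes.

```python
from collections import defaultdict

# Hypothetical study summaries: (year of data collection, sample mean, sample size).
# The numbers are invented for illustration only.
studies = [
    (1990, 3.9, 200),
    (1990, 4.1, 600),
    (2005, 3.8, 400),
    (2005, 3.6, 400),
    (2020, 3.5, 1000),
]


def pooled_means_by_year(rows):
    """Sample-size-weighted mean of the study means for each year."""
    weighted_sum = defaultdict(float)
    total_n = defaultdict(int)
    for year, mean, n in rows:
        weighted_sum[year] += mean * n
        total_n[year] += n
    return {year: weighted_sum[year] / total_n[year] for year in sorted(total_n)}


pooled = pooled_means_by_year(studies)
for year, mean in pooled.items():
    print(year, round(mean, 3))
```

If every sample consists of people of the same age, a downward drift in the pooled means holds age constant, but each year also brings a different birth cohort, so the drift alone cannot say whether period or generation is responsible.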
Multilevel models applied to nested datasets: Multilevel models are a family of statistical tools that are appropriate for studying databases in which some observations are nested within others, such as when data are collected from multiple individuals across multiple years. Statistically speaking, individual responses are then nested within each year in this design, which is essentially a series of repeated cross-sectional studies with data being collected across multiple years. Likewise, nesting can occur in longitudinal designs when the same people are observed repeatedly over time, as is typically the case in many disciplines. In this case, observations on different occasions are nested within people. Still other cases of nesting arise when observations are clustered within groups, such as students within schools or employees within workplaces. All of these kinds of designs raise the possibility that observations are not completely independent from one another because of the nesting. Multilevel models are statistical tools that allow researchers to address the nesting.
The statistical approach of using multilevel models to study generation effects is technical (see Yang and Land, 2013) and not without critics (see Bell and Jones, 2018). Concerns involve the statistical assumptions, coding of the data, and modeling constraints needed to achieve estimation of the model given the identification problem described in the previous section and the nature of the data that contain multiple ages observed across multiple time points. In the generational literature, such multilevel models are called age-period-cohort (APC) models or APC analysis (Fannon and Nielsen, 2019; Fosse and Winship, 2019; Winship and Harding, 2008). APC models are applied to datasets that include multiple ages and times of measurement. These datasets need not necessarily be longitudinal in the sense of following the same individuals across multiple time points. Instead, different people can provide information at different waves. For example, data from the General Social Survey (GSS) (https://gss.norc.org) are often used for APC analyses. The GSS has administered the same questions (e.g., “Taken all together, how would you say things are these days—would you say that you are very happy, pretty happy, or not too happy?”) across multiple years (e.g., from 1972 to 2018) to people of varying ages at each year (i.e., people between the ages of 18 and 89 or older). Data from the GSS can be analyzed with multilevel models to isolate how much variability in a given variable is attributable to the statistical effects of age, period, and cohort. The issue with this approach is that relatively few datasets with information relevant to workplace considerations are available for analysis. Moreover, as noted above, the modeling often rests on assumptions that could be challenged on both statistical and theoretical grounds (e.g., Bell and Jones, 2018; Luo et al., 2016).
Other Methodological Concerns
As discussed in the previous section, cross-sectional surveys are typically not useful for studying generational differences because they confound age and cohort effects, and one of the most promising approaches for addressing this limitation is to use APC statistical models on repeated datasets that span many years to provide multiple observations of people of different ages at different points in time. However, issues of measurement invariance and representativeness are also relevant when evaluating the literature on generational differences. These issues are salient to the study of generations because it is important to confirm that the tools (e.g., surveys) used to measure constructs (e.g., work-related values) of interest are able to support comparisons across the targeted generational groups (measurement invariance) and whether population-level inferences are justified given the sampling plan (representativeness).
Researchers interested in workplace characteristics typically focus on such topics as job satisfaction, intrinsic motivation, and organizational commitment, termed “constructs” in the social sciences. Most social scientists acknowledge, either implicitly or explicitly, that survey responses involve some degree of imprecision. The actual responses are reflections of the underlying construct of interest; they are not thought to be perfect indicators of the construct. In fact, questions about reliability and validity typically arise for all measures. Rigorous examinations of generational differences therefore require that survey responses (or any other measures) operate in the same way across time and across members of different generations. This issue is often the domain of measurement specialists (i.e., psychometricians) who are concerned with psychometric properties and the extent to which those properties change across points of comparison. Other social scientists might simply assume that measures have the same psychometric properties. Nonetheless, measurement equivalence/invariance (e.g., Horn and McArdle, 1992) is critical for drawing the appropriate conclusions from quantitative data in the generational literature.
The basic question with respect to measurement invariance is whether psychometric properties are consistent across groups or time points so that the observed scores reflect the same value of the construct (also known as a latent variable in some disciplines) whenever comparisons are made. If this condition of invariance is not met, sound inferences across studies are impossible. For example, consider a study comparing average scores on a multi-item survey of job satisfaction for younger employees (i.e., workers under 30) versus older employees (workers over 55). The measure of job satisfaction would be considered invariant if the observed scores referred to the same underlying level of job satisfaction for both groups of workers (i.e., if a score of 3.5 referred to the same level of satisfaction for a 25-year-old and a 55-year-old). If such a condition held, it would then be reasonable to draw inferences about observed differences in average levels of job satisfaction between the older and younger workers. If measurement invariance did not hold, however, the same observed scores would refer to different levels of job satisfaction in the two age groups. If invariance is not present, drawing inferences from comparison between groups is akin to the proverbial problem of comparing apples and oranges (Vandenberg and Lance, 2000).
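The job satisfaction example can be made concrete with a toy measurement model. This is only a sketch: it assumes a one-item linear model (observed score = intercept + loading × latent level), and the parameter values and function names are invented. It is not how invariance is tested in practice, which relies on structural equation or item-response models fitted to real data.

```python
# Illustrative sketch: a one-item linear measurement model,
#   observed = intercept + loading * latent,
# with invented parameters. Real invariance tests use SEM or IRT software.
def observed_score(latent, intercept, loading):
    """Observed survey score implied by a latent level of job satisfaction."""
    return intercept + loading * latent


def implied_latent(observed, intercept, loading):
    """Invert the measurement model to recover the latent level behind a score."""
    return (observed - intercept) / loading


# Hypothetical measurement parameters for two age groups.
younger = {"intercept": 1.0, "loading": 0.5}
older_invariant = {"intercept": 1.0, "loading": 0.5}     # same parameters
older_noninvariant = {"intercept": 1.5, "loading": 0.5}  # shifted intercept

score = 3.5
latent_younger = implied_latent(score, **younger)
latent_older_ok = implied_latent(score, **older_invariant)
latent_older_shifted = implied_latent(score, **older_noninvariant)

# Under invariance, a 3.5 means the same latent level in both groups.
# With the shifted intercept, the same 3.5 reflects a lower latent level.
print(latent_younger, latent_older_ok, latent_older_shifted)

# Equivalently, two workers with identical latent satisfaction would report
# different scores if the measure is not invariant.
score_if_shifted = observed_score(latent_younger, **older_noninvariant)
print(score_if_shifted)
```

Under the shifted intercept, a gap in average observed scores between the two groups could arise with no difference at all in underlying satisfaction, which is the apples-and-oranges problem the text describes.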
There are different levels of measurement invariance (Schmitt and Kuljanin, 2008; Vandenberg and Lance, 2000), imposing increasingly stringent requirements on the psychometric properties of scores. Invariance is evaluated using structural equation modeling techniques (e.g., Brown, 2015) or item-response theory methods (Tay, Meade, and Cao, 2015). Both of these techniques are predicated on the notion that observed scores on quantitative measures reflect differences in underlying latent (unobserved) variables. Both techniques formally acknowledge measurement imprecision and do not assume that observed scores are perfect reflections of the constructs they are intended to measure.
Many studies in the generational literature fail to test explicitly for measurement invariance, adding further ambiguity to attempts to draw conclusions from the existing literature. Only two studies included in the committee’s review directly examine measurement invariance in the context of generational differences (Meriac, Woehr, and Banister, 2010; Twenge et al., 2010). These studies show work values to be partially invariant across three generational groups and thus support the idea that cross-generational comparisons are meaningful. Inferences from these studies are based on much stronger psychometric ground with respect to making comparisons; however, the approaches taken are still constrained by the potential confounding of age, period, and cohort effects.
In addition to measurement consistency, it is important to consider the nature of the samples used for generational research. Inferences about populations are only as sound as the sampling strategy of a given study. Rigorous approaches to the selection of sample subjects can strengthen the external validity of studies (or the degree to which inferences from a study can be extended to larger populations of interest). The issue of representativeness is critical to external validity. In the case of generational research, the relevant issues are whether the samples are truly representative of the generations of interest and whether the diversity of the population is represented in the samples. Given demographic shifts, comparing samples of workers from the 1980s to the 2020s involves comparing samples that vary in terms of many characteristics, such as ethnicity, race, parental education, and income. Demographic differences and diversity therefore need to factor into interpretations of differences among generations. Put simply, researchers need to consider how demographic differences may confound generational comparisons.
Although researchers sometimes use the term “representative” to describe samples drawn from defined populations, the term is probably best applied to the process used to generate a given sample (see Stuart, 1968, as cited in Pedhazur and Schmelkin, 1991). In classic survey methodology, samples are drawn from defined populations of interest because, while researchers are interested in drawing conclusions about a population, they often lack the time, money, or other resources to collect information from every element or member of that population. In fact, the use of inferential statistics eliminates the need to study all members of a population.
The issue with representativeness becomes how well the samples ultimately generated by researchers are representative of the population of interest. A convenience sampling strategy uses no randomization, simply taking advantage of accessible members of a population (e.g., employees willing to fill out a survey, college students enrolled in introductory psychology courses); therefore, this strategy can lead to sampling a biased subset of the population. Because there is no formal way to estimate sampling errors in convenience samples (Pedhazur and Schmelkin, 1991), it is impossible to estimate how well the characteristics of such samples reflect the attributes of the population of interest. In lieu of drawing on formal statistical principles, then, researchers must make educated guesses. The bottom line is that representativeness is unknown and unknowable when convenience samples are used.
Probability samples are usually more difficult to collect and require that researchers determine the odds that any element of a population will be selected for inclusion in the sample. The simplest case is when all elements have the same nonzero probability of being selected. There are, however, complicated sampling strategies involving stratification and over- and undersampling, which are widely used in polling applications and epidemiology. The virtue of these probability-based methods is that sampling errors can be calculated, placing inferences drawn from extrapolating the sample to the population on much stronger footing. Still, nonresponses can bias a sample if the chances of not participating in a sample (i.e., by refusing to consent or being unable to complete a survey) differ across different subsets of the population. Thus, even in ideal cases in which organizations use scientific sampling, questions about representativeness can remain.
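The contrast between convenience and probability sampling can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the population, the share of easily reached ("accessible") workers, and the satisfaction scores are all hypothetical, and the sketch assumes that accessible workers differ systematically from the rest. It shows the simplest probability design (simple random sampling), for which a standard error can be computed, next to a convenience sample that reaches only accessible workers.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 100,000 workers, each with a satisfaction score.
# Assumption for illustration: workers who are easy to reach ("accessible")
# tend to report higher satisfaction than everyone else.
population = []
for _ in range(100_000):
    accessible = rng.random() < 0.2
    score = rng.gauss(3.8 if accessible else 3.2, 0.6)
    population.append((accessible, score))

true_mean = statistics.mean(score for _, score in population)

# Probability sample: every worker has the same nonzero chance of selection.
srs = [score for _, score in rng.sample(population, 1000)]
srs_mean = statistics.mean(srs)
# Estimated standard error of the mean under simple random sampling.
srs_se = statistics.stdev(srs) / len(srs) ** 0.5

# Convenience sample: only accessible workers are ever reached.
accessible_scores = [score for is_accessible, score in population if is_accessible]
convenience = rng.sample(accessible_scores, 1000)
convenience_mean = statistics.mean(convenience)

print(f"Population mean:         {true_mean:.2f}")
print(f"Random sample mean:      {srs_mean:.2f} (SE {srs_se:.3f})")
print(f"Convenience sample mean: {convenience_mean:.2f}")
```

The random sample comes with a quantifiable sampling error, whereas the bias of the convenience sample is visible here only because the simulated population mean is known. In real research that benchmark is unavailable, and the sketch does not address nonresponse, which can bias even a probability sample.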
The complexities of probability sampling and survey nonresponse are largely beyond the scope of this report. However, these issues are relevant to the analysis of the strength of the evidence for generational differences. Consumers of generational research thus need to evaluate whether issues of sampling and representativeness are approached in thoughtful ways. For example, how well do samples represent the generations of interest? Does the literature consider other sample characteristics (e.g., sex, race/ethnicity, education level) or just birth years, given that such characteristics may moderate or be alternative explanations for observed effects? When researchers ignore sampling issues, deficiencies in the rigor of the work are likely.
Although most of the empirical studies reviewed in this report are quantitative, there are also a number of relevant qualitative studies (see Appendix A). The qualitative approaches entail analyzing data in the form of natural language (i.e., words) and expressions of experiences (e.g., social interactions). The various methods3 differ in representing a diversity of philosophical assumptions, intellectual disciplines, procedures, and goals (Gergen, 2014). Nevertheless, these methods all share an iterative process of evolving findings (e.g., as driven by induction) and viewing subjective descriptions of experiences as legitimate data for analysis (Wertz, 2014).
Using an iterative process to draw inferences means that researchers tend to analyze data by identifying patterns tied to instances of a phenomenon and then developing a sense of the whole phenomenon as informed by those patterns. Seeing the patterns can shift the way the whole is understood, just as seeing a pattern in the context of a whole phenomenon can shift the way the pattern is understood (Levitt et al., 2018). These iterations are self-correcting; as new data are analyzed, the analysis corrects and refines the existing findings.
Among the 29 studies reviewed by the committee that use qualitative approaches to assess generational characteristics, 15 explicitly compare different generation groups. The sampling methods include purposive and convenience sampling. With purposive sampling, the sample is chosen for purposes of maximizing variability, generating typical or critical cases, covering extreme/deviant situations, or gauging expert opinions. The sample sizes range from single digits (e.g., in case studies) to more than 100 (e.g., in larger interview studies and discursive analyses). When convenience sampling is used, researchers often explicitly justify the legitimacy of that sampling in their specific research context. When researchers have the goal of identifying generation differences in certain domains, they often sample based on the birth cohort categorizations of various generations. Nevertheless, as discussed above, it is clear that such sampling cannot separate age effects from the intended generation effects.
3 A range of qualitative analytic approaches (such as narrative, grounded theory, phenomenological, critical, discursive, case study, and thematic analysis approaches) have been used in the generational literature (Lichtman, 2014).
In the generational literature, researchers typically use qualitative methods to answer two broad research questions: (1) Do generational differences exist in certain attributes, behaviors, attitudes, or values? and (2) Do people perceive generational differences in certain attributes, behaviors, attitudes, or values? The main qualitative data collection method used in the literature to address the first of these questions is interview. Assuming the interviewees represent the intended generational groups (either through interviewees’ self-identification or through arbitrary categorization based on the span of birth years), researchers derive the attributes, behaviors, attitudes, or values of interest from the interview responses and compare them across the intended generational groups. In addition to the potential methodological issues involved in analyzing interview responses (e.g., interpretation bias, coding unreliability), an obvious issue with this approach is that neither self-identification with the targeted generational groups nor arbitrary categorization based on the span of birth years can rule out the confounding effects of age and period discussed earlier. Accordingly, even if systematic differences are seen in interview responses across the intended generational groups, it is unclear whether those differences are due to generation, age, or period effects. This methodological weakness due to grouping applies as well to other qualitative data collection methods (e.g., observation, focus group discussion, document analysis). Therefore, qualitative methods do not offer sufficient internal validity in addressing the first research question above.
The main qualitative data collection methods used to address the second research question include interview, focus group discussion, and document analysis. Given the focus of this research question on the perception of generational differences, participants in interviews and focus group discussions need to be made aware of the concept of generations before they report the differences they perceive. The issue here is that the generation-related information given to participants may influence how they retrieve their memories and experiences or form their impressions and judgments. When document analysis is used, this issue is of less concern because the document content is typically archival in nature and is generated independently from the research purpose. Regardless of the qualitative methods used, however, if the sampling coverage is narrow, any findings about people’s perceptions of generations cannot be generalized to all people and may therefore better be treated as preliminary and used to inform subsequent quantitative investigations.
To facilitate communication of the findings obtained with qualitative methods, it is best practice for researchers to describe the origins or evolution of their data collection protocol so that other researchers can assess how the concept of generations was introduced to study participants and make judgments about interpretations of the findings. Further, researchers are advised to explicate in detail the process used for analysis, including some discussion of the procedures involved (e.g., coding, thematic analysis), adhering to the principle of transparency (Levitt et al., 2018). This discussion also would include describing coders or analysts and their training, as well as what software was used for the data analysis. It is important to identify clearly whether coding categories emerged from the analysis or were developed a priori. Triangulation across multiple sources of information, findings, or investigators is typically viewed as desirable in terms of generating strong support for the research claims. However, the committee found very little application of these best practices in the qualitative studies in the generational literature.
Qualitative methods can be used to achieve such research goals as developing theory, hypotheses, and attuned understandings; examining the development of a social construct; and illuminating social discursive practices (i.e., the way interpersonal and public communications are enacted) (Levitt et al., 2018). It is the committee’s belief that in research on worker attitudes and behaviors, the continued use of qualitative methods is to be encouraged. Because of the limitations discussed throughout this chapter, qualitative studies cannot verify the existence of generational differences. When appropriately designed and documented, however, they can help advance understanding, for example, of such work-related constructs as job satisfaction, as well as of generational perceptions that affect workplace behaviors. (See the further discussion of alternative perspectives for future research in Chapter 5.)
According to Creswell (2015), the use of mixed methods involves (1) collecting and analyzing both qualitative and quantitative data in response to overarching research aims, questions, and hypotheses; (2) using rigorous methods for both qualitative and quantitative research; (3) integrating or “mixing” the two forms of data intentionally to generate new insights; (4) framing the methodology with distinct forms of research designs or procedures; and (5) using philosophical assumptions or theoretical models to inform the designs. The committee’s review of the generational literature revealed six studies employing both quantitative and qualitative methods. However, these studies appear to have the same weaknesses identified above: insufficient internal and external validity in both the qualitative and quantitative portions to justify inferences about generational differences. Although rarely used appropriately in generational research, mixed-methods approaches could lead to additional insights not gleaned from qualitative or quantitative findings alone (Creswell, 2015). The value of using mixed methods accrues from the integration of qualitative and quantitative findings in a thoughtful way that leads to greater mining of the data and enhanced insights. In principle, the use of mixed methods has the potential to lend credibility and robustness to research designs.
Since the late 1990s, the number of empirical studies on generational differences in work values/attitudes has increased dramatically. These studies generally use birth cohorts to define generations and draw on popular labels to categorize groups in their samples. While popular notions of generations have become broadly familiar, the wide range of birth years used to identify various generational groups indicates a lack of consensus on how generations should be operationalized in research.
Most generational researchers have approached the empirical study of generational differences with the underlying assumption that the overall concept of “generations” is valid. They take at face value that an undefined set of shared experiences—social, political, cultural, and historical influences—have shaped the attitudes/values to be measured. To date, however, little theoretical or empirical justification has been offered to clarify the events and shared experiences assumed to define a generation. At best, the work usually is purely descriptive.
This research has been motivated by a desire to understand generational shifts in the workforce and their impacts on such employment practices as recruitment, retention, and training. While this is a worthwhile research pursuit, the existing generational literature has a number of limitations: (1) untested assumptions and conceptual variations regarding the concept of generation; (2) overreliance on cross-sectional studies and convenience samples, which have relatively weak internal and external validity with regard to the objectives of identifying generational differences and generalizing findings to all members of each generation; and (3) statistical challenges in separating out age, period, and cohort effects, even with the more rigorous research designs. Together, these limitations call into question whether researchers can draw sound inferences from the existing literature.
Conclusion 4-2: The body of research on generations and generational differences in the workforce has grown considerably in the past 20 years. Despite this growth, much of the literature suffers from a mismatch between a study’s objectives and its research design and underlying data, which threatens both the internal and external validity of the work. The research designs and data sources rely too heavily on cross-sectional surveys and convenience samples, which limits the applicability and generalizability of findings.
While much of the literature reviewed by the committee relies on one-time, cross-sectional surveys whose results confound age and cohort effects, some researchers have used multilevel models, discussed above, to distinguish cohort effects from age and period effects (e.g., Donnelly et al., 2016; Jürges, 2003; Kalleberg and Marsden, 2019; Koning and Raterink, 2013; Kowske, Rasche, and Wiley, 2010; Leuty and Hansen, 2014; Lippmann, 2008). Kalleberg and Marsden (2019) illustrate such a statistical approach to disentangling the effects of age, historical time period, and generation (i.e., cohort differences) on changes in work values in the United States. They use data from the GSS (1973–2016) and the International Social Survey Program (ISSP) (1989, 1998, 2006, 2016). These datasets consist of information collected from multiple cross-sectional samples designed to represent the U.S. population in the various years and thus provide repeated value measurements across ages and time. The authors analyze these data using hierarchical logistic regression analyses in which period and cohort differences are modeled using random effects (i.e., a multilevel model applied to repeated surveys administered across multiple years). Work values are conceptualized in two different ways given the data available in the two surveys. The first (measured in the GSS datasets) involves work as a “central life interest,” with respondents being asked whether they would continue to work or stop working if they were wealthy enough to have that option. The second (measured in the ISSP datasets) entails asking respondents to rate the importance to them of different features of jobs (which are measured as single items) (ratings range from “not at all important” to “very important”). The job features measured are both extrinsic (security, high income, potential for advancement) and intrinsic (interesting work, opportunity to help others, opportunity to help society), as well as flexible hours.
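For readers who want a concrete picture of a multilevel model with random period and cohort effects, one generic form is sketched below. This is an illustrative assumption about the general class of model, not the exact specification reported by Kalleberg and Marsden (2019). Here $y_{ijk}$ is a binary response (e.g., whether respondent $i$, surveyed in period $j$ and born in cohort $k$, would continue to work), $f(\cdot)$ is an unspecified function of age, and the variance symbols are hypothetical notation.

```latex
\operatorname{logit} \Pr(y_{ijk} = 1)
  = \beta_0 + f(\mathrm{age}_{ijk}) + u_j + v_k,
\qquad
u_j \sim N(0, \sigma_u^2), \quad
v_k \sim N(0, \sigma_v^2)
```

In a model of this kind, age enters as a fixed effect, while $u_j$ and $v_k$ capture period and cohort deviations as random effects. The sizes of $\sigma_u^2$ and $\sigma_v^2$ indicate how much of the variation in responses is associated with survey period and birth cohort, respectively, subject to the assumptions of the model.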
Kalleberg and Marsden (2019) find little evidence for pronounced generational (i.e., cohort) differences in work values. While these differences may be statistically detectable, they are substantively minor. This finding suggests that much speculation about the distinctive values of particular generations, such as being self-absorbed and narcissistic (Twenge, 2006) or less concerned with career advancement than with achieving greater work–life balance (Jenkins, 2018), lacks a strong empirical grounding, at least for the United States. Rather, these authors found that age differences were dominant in explaining differences in whether respondents would continue to work if they were wealthy enough not to have to do so. The idea that work is a central life interest declined with age until age 65, after which it increased somewhat. On the other hand, variations in the time periods during which people live are most closely related to changes in the importance they place on the various facets of jobs. Thus since the 1990s, people in the United States have tended to place greater importance on jobs that provide security, high income, and more opportunities for advancement. These patterns are consistent with the view that these job features have become more difficult for workers to attain in recent years.
The authors of many studies that claim to support generational differences could not disentangle whether age differences, changes between time periods, or distinctions between generations were the root cause of observed effects. Because of inherent challenges in studying cohort or generation effects, many researchers may have misattributed their own findings or the findings of others to generational differences. In so doing, researchers themselves have helped precipitate the conclusion that younger generations of workers are somehow different from previous generations. For instance, the analysis of differences in work values by Twenge and colleagues (2010) is commonly interpreted as providing evidence for generational differences. Yet their analysis was limited to a comparison of 16-year-olds in three different decades. Consequently, although the study provides evidence for time-related differences in work values, it is possible that had older individuals also been sampled over time, the researchers might have observed the same changes in the older group. This observation would have indicated that changes between time periods, not generational differences, better explain the observed differences. In fact, these authors acknowledge the ambiguity in their results and point to period effects as an alternative explanation for their findings.
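The source of this ambiguity is arithmetic: a person's birth cohort equals the survey year minus the person's age. The sketch below illustrates the consequence under stated assumptions. The survey years, effect sizes, and the linear form of the effects are hypothetical choices made for illustration and are not taken from any of the studies discussed. The first part mimics a design that samples a single age at several time points; the second part shows that, even with many ages, different combinations of linear age, period, and cohort effects yield identical predictions.

```python
import math


def predict(age, period, a, p, c):
    """Linear age-period-cohort prediction; cohort is the birth year."""
    cohort = period - age  # exact identity: birth year = survey year - age
    return a * age + p * period + c * cohort


# Part 1: one age group observed in three survey years (hypothetical years).
surveys = [(16, 1976), (16, 1991), (16, 2006)]


def changes_between_surveys(a, p, c):
    """Change in the predicted score from each survey to the next."""
    scores = [predict(age, period, a, p, c) for age, period in surveys]
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


period_only = changes_between_surveys(a=0.0, p=0.02, c=0.0)
cohort_only = changes_between_surveys(a=0.0, p=0.0, c=0.02)

# With age held fixed, a pure period effect and a pure cohort effect
# produce exactly the same pattern of change over time.
same_changes = all(
    math.isclose(x, y, abs_tol=1e-9) for x, y in zip(period_only, cohort_only)
)

# Part 2: a full grid of ages and survey years. Shifting the effects by any
# constant k (age + k, period - k, cohort + k) leaves predictions unchanged.
k = 0.5
grid = [(age, period) for age in range(20, 61, 10) for period in range(1980, 2021, 10)]
unidentified = all(
    math.isclose(
        predict(age, period, 0.1, 0.2, 0.3),
        predict(age, period, 0.1 + k, 0.2 - k, 0.3 + k),
        abs_tol=1e-9,
    )
    for age, period in grid
)

print("Period-only and cohort-only stories match:", same_changes)
print("Shifted effects give identical predictions:", unidentified)
```

Because the data cannot distinguish these alternatives on their own, any method that separates the three effects must rely on additional assumptions, which is why the tenability of those assumptions matters.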
As reviewed above, a small subset of studies have used APC methods to examine work-related attitudes and values. The study by Kalleberg and Marsden (2019) provides analyses of some of the very work values reviewed in the aforementioned study by Twenge and colleagues (2010). However, when these authors used APC analyses, they found that observed changes were not a function of generational differences at all; instead, period effects were at the root of the changes. The contrast between these two studies is telling, showing that when more rigorous methods are used, what appears to be attributable to generation effects can actually be attributable to period effects. Unfortunately, very few studies examining worker attitudes and values have used APC methods.
Many more studies have used APC methods to disentangle these effects in other domains (e.g., the examination of changes in self-esteem over time by Twenge and colleagues [2017]). These studies typically find that when time-based changes are analyzed, the period effects are much greater than the generation effects, and when generation effects are present, they tend to be small. Given these findings, the use of APC models in future research examining changes in work-related variables is the best way to offer less ambiguous conclusions. However, use of this approach may be constrained by issues with data availability. Likewise, it is important for researchers to specify carefully the statistical assumptions behind the multilevel model and to evaluate critically whether they are tenable. With respect to cross-temporal meta-analysis, this approach is imperfect in that it does not allow for the separation of period and cohort effects. However, it is a useful tool for determining whether a given construct has changed over time in general, and research examining psychological variables using cross-temporal meta-analysis continues to be useful for that purpose. Finally, cross-sectional studies with convenience samples have limited utility, and their findings cannot be used appropriately if the goal is to draw inferences about generational differences.
Recommendation 4-1: Researchers interested in examining age-related, period-related, or cohort-related differences in workforce attitudes and behaviors should take steps to improve the rigor of their research designs and the interpretation of their findings. Such steps would include
- decreased use of cross-sectional designs with convenience samples;
- increased recognition of the fundamental challenges of separating age, period, and cohort effects;
- increased use of sophisticated approaches to separate age, period, and cohort effects while recognizing any constraints on the inferences that can be drawn from the results;
- greater attention to the use of samples that are representative of the target populations of interest;
- greater attention to the design of instruments (e.g., surveys) to ensure that the constructs of interest (i.e., measured attitudes and behaviors) have the same psychometric properties across time and age groups; and
- increased use of qualitative approaches with appropriate attention to documenting data collection protocols and analysis processes.