Robert Ursano (Uniformed Services University of the Health Sciences) began his presentation by noting that SAMHSA has a requirement to collect data for a specific purpose, a task that differs from the goal of many researchers, which is to understand everything there is to know about trauma and stressor-related disorders. Nonetheless, he argued, consideration should be given to quick-turnaround surveys in response to national threats and disasters, which could assess the impact on affected communities, including PTSD. A large-scale survey such as the National Survey on Drug Use and Health would not be nimble enough to accommodate these types of needs as they arise, but Ursano said that he believes that federal agencies are in a better position than others to address them.
Ursano said that a range of outcomes are relevant to the discussions of trauma and are important to measure, whether the goal is in-depth research or estimating the number of people in need of services. He approached the discussion of key concepts from several different perspectives. In terms of mental health responses to trauma, disasters, and public health emergencies, the most prevalent distress responses to trauma exposure are a sense of vulnerability; changes in sleep; irritability and distraction; belief in exposure; multiple idiopathic physical symptoms and multiple unexplained physical symptoms; and isolation. He pointed out that irritability is important to measure as a separate dimension because of associated outcomes, such as increased motor vehicle accidents, family violence, and suicide.
Ursano said that in most studies the emphasis is on psychiatric illnesses, such as PTSD, depression, and complex grief. From among these outcomes, grief is the least often included, but it is an important outcome in the context of trauma. In looking at health risk behaviors, such as alcohol and drug use, Ursano emphasized that the question that needs to be asked is not whether someone is addicted to alcohol or drugs, but rather whether they have increased their use in the recent past, such as in the past week.
Ursano said that it is well documented that the greater the exposure to traumatic events in a community, the more psychiatric casualties there are. He also noted, as have other workshop participants, that research has shown that the more potentially traumatic events a person is exposed to, the higher the likelihood of developing a disorder. However, he noted that it is still important to measure levels of exposure and prevalence rates for disorders. He said that he liked the emphasis on potentially traumatic events by Dean Kilpatrick (Medical University of South Carolina) (see Chapter 2): to really understand outcomes, it is important to study the events and what happens in response to those events. He likened trauma to toxic exposure, with the need to understand the toxin. He noted that the DSM-5 revisions were an important step forward for research in this area.
To meet its goals, Ursano said that SAMHSA will need to collect dimensional, as well as categorical, data. In other words, as he had mentioned, it is necessary to understand whether a person has altered her or his drinking pattern, in addition to understanding whether the person meets criteria for alcohol addiction. He also emphasized the need to measure the “cascade of adversities” a person may be facing after exposure to a potentially traumatic event, in addition to measuring exposure to the event. Those adversities could include financial adversities or problems in the areas of housing, employment, or services.
Ursano also emphasized the importance of studying community-level resilience factors and exposures, whether that means a few blocks or a larger neighborhood. He noted that ZIP-code-level data already exist, and they can provide contextual information to understand potentially traumatic events and the associated morbidity and mortality. There are also contextual issues at the family level that could be important to capture. One example is the increased rates of child neglect in U.S. Army families that have been noted by researchers during recent wars.
Discussing the range of psychosocial responses to trauma and disaster, Ursano noted that there are many that warrant consideration for
measurement, including some that are not typically included in studies of this topic. He listed the following for consideration:
- resilience and altruism
- sleep problems
- increased alcohol use and smoking
- anger at government
- social isolation
- loss of faith in social institutions
Ursano pointed out that another way of thinking about what should be measured is from the perspective of health surveillance. If health surveillance is the primary goal, then the key measures may be different. For example, measuring distress and health risk behaviors rather than mental disorders may be more important. He reported that some of his research indicates that the question of whether a person has had difficulty balancing work demands with family concerns is a substantial predictor of the presence or absence of PTSD and depression. This question also provides data that can highlight a set of other potential needs in a family, which are not often assessed. Ursano summarized a potential list of post-disaster community mental health items as follows:
- psychiatric illness or symptoms
- health risk behaviors
- risk perception
- safety perception
- changes in behavior
- preparedness behaviors
Ursano also touched on the topic of resilience and listed the following concepts that have been highlighted by Dennis Charney as relevant:1 optimism, recovery skills, self-regulation of emotions, attachment and social support, altruism, and active or passive (instrumental) responses. For example, knowing how optimistic an individual is or knowing the level of optimism in that person’s neighborhood or ZIP code can provide useful information about the person’s probability of recovering from a large-scale disaster event.
Collective efficacy, or the extent to which members of a community take care of each other, is another predictor of PTSD, Ursano noted. A study that looked at the probability of PTSD among Florida public health workers found that higher levels of collective efficacy at the community level were associated with lower probabilities of PTSD.2
Focusing specifically on the concept of PTSD, Ursano agreed with previous speakers that exposure to potentially traumatic events is very common. By a certain age, most people have experienced a potentially traumatic event in their lifetimes, and the question is whether that leads to chronic PTSD or not. Acute PTSD is frequent even in people without a psychiatric history, but rapid recovery is the norm. Ursano cautioned about focusing on only those with functional impairment because this approach would be similar to trying to understand cardiac disease by only studying people who have myocardial infarctions.
Ursano identified capturing the trajectory of PTSD as a research area of interest for the future, both because of its implications for the need for interventions and because it poses a measurement challenge. Although it is possible to ask three or four questions about how things were last month, the month before, and the month before that, understanding trajectories would ideally require a longitudinal study. For example, a four-wave study would enable researchers to classify people into groups with different trajectories and study predictors, such as the characteristics of the event.
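The four-wave classification idea can be sketched in a few lines of code. This is an illustrative assumption on my part, not anything presented at the workshop: the trajectory labels and severity cutoff below are hypothetical, and published trajectory studies typically use latent growth mixture models rather than fixed rules.

```python
# Illustrative sketch: classify four waves of symptom scores into common
# trajectory groups using a hypothetical "clinically elevated" cutoff.
HIGH = 30  # hypothetical severity cutoff, chosen only for illustration

def classify_trajectory(waves):
    """Assign a trajectory label from four symptom scores (waves 1-4)."""
    early, late = waves[0] >= HIGH, waves[-1] >= HIGH
    if not early and not late:
        return "resilient"      # low symptoms throughout
    if early and not late:
        return "recovery"       # elevated early, remits by the last wave
    if not early and late:
        return "delayed onset"  # emerges after the early waves
    return "chronic"            # elevated at both ends

print(classify_trajectory([45, 38, 20, 12]))  # elevated early, then remits
```

With labels assigned this way, each group could then be related to predictors such as the characteristics of the event.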
As a final issue that is relevant to measuring exposure to traumatic events and PTSD, Ursano highlighted traumatic brain injury. He said that recent wars have underscored the importance of measuring traumatic brain injury, such as episodes of loss of consciousness or being dazed, as part of any data collection on this topic. In addition, data from the Army Study to Assess Risk and Resilience in Servicemembers show that about
1 Charney, D.S. (2004). Psychobiological mechanisms of resilience and vulnerability: Implications for successful adaptation to extreme stress. American Journal of Psychiatry, 161(2), 195-216.
2 Ursano, R.J., McKibben, J.B.A., Reissman, D.B., Liu, X., Wang, L., Sampson, R.J., and Fullerton, C.S. (2014). Posttraumatic stress disorder and community collective efficacy following the 2004 Florida hurricanes. PLoS ONE, 9(2), e88467. doi:10.1371/journal.pone.0088467.
one-half of all soldiers have had an episode of loss of consciousness due to traumatic brain injury, often from a concussion, before even joining the Army. In other words, traumatic brain injury is widespread, and research has found that the parts of the brain affected by traumatic brain injury are related to those affected by PTSD. Many postconcussive symptoms also overlap with PTSD. In addition, those symptoms are associated not only with PTSD but also with generalized anxiety disorder, event-related panic disorder, and event-related depression, and understanding these connections can enable better targeting of treatments. Ursano also pointed out that some studies have found traumatic brain injury to be a predictor of suicide.
Robert Pynoos (University of California, Los Angeles) commented that studies have also examined the impact of traumatic brain injury among children and found that subsequent to episodes that involved loss of consciousness, young children’s IQs dropped by several points and stayed lower for more than a year before they recovered.
Terrence Keane (Boston University School of Medicine and U.S. Department of Veterans Affairs, National Center for Posttraumatic Stress Disorder) noted that in some cases traumatic brain injury and PTSD are associated with the same events, but in other cases they are not. Sometimes there is subsequent alcohol and drug use, and the associations among these outcomes are not always easy to tease out. Ursano responded that this highlights the need to measure health risk behaviors, not just disorders. In other words, if the interest is in morbidity and mortality, then the question is what is a person doing that has increased his or her risk of morbidity and mortality?
Terrence Keane discussed approaches to measuring exposure to trauma, PTSD symptomology, and subclinical PTSD. He noted that subclinical PTSD first became of interest as part of the National Vietnam Veterans’ Readjustment Study (NVVRS) because researchers noticed that there was a large group of people who did not fit the definition of PTSD because of a single criterion: avoidance. With national samples of Vietnam veterans and their peers, the NVVRS found that many of the participants had been involved in the antiwar movement and the veterans’ benefits movement, which meant that they were not avoiding thoughts of the traumatic events. Keane said that although subclinical PTSD was a useful concept at the time and it continues to be used in various ways, it is not clear that the concept is still useful today. It is possible that the use of the term is in fact contributing to confusion.
Keane noted that there are many different measures of trauma exposure and PTSD symptomology, and he pointed out that further information is included in his presentation slides and can also be found online. He then listed several measures that he considers to have acceptable reliability and validity for exposure and symptoms.
For exposure, the options for self-report measures include
- Traumatic Life Events Questionnaire
- Traumatic Events Questionnaire
- Trauma History Questionnaire
- Life Events Checklist
- Stressful Life Events Screening Questionnaire
- Traumatic Stress Schedule
- Trauma Assessment for Adults–Self Report
- The Life Stressor Checklist–Revised
- Trauma History Screen
- Brief Trauma Questionnaire
In terms of self-report measures for symptoms, one of the main considerations highlighted by Keane is whether the measure has been updated for the DSM-5. Keane noted that some of the most common measures have already been updated or are in the process of being updated. He highlighted three measures that are fully updated for the DSM-5: the PTSD Checklist, the Life Events Checklist, and the Primary Care PTSD screen.
Keane noted that he considers the World Health Organization Composite International Diagnostic Interview (CIDI) to be a very comprehensive measure of symptoms. He said that he also likes the approach used by the National Stress Events Survey, which was developed with Kilpatrick’s leadership. Keane then listed several additional symptom measures that are available:
- PTSD Checklist, Civilian
- Davidson Trauma Scale
- Posttraumatic Stress Diagnostic Scale
- Trauma Symptom Checklist
- Modified PTSD Symptom Scale
- National Women’s Study PTSD Module
- Purdue PTSD Scale–Revised
- Screen for Posttraumatic Stress Symptoms
- Self-Rating Inventory for PTSD
- CIDI–PTSD Module
- Impact of Event Scale–Revised
- PTSD Symptom Scale–Interview
- Symptom Checklist–90 PTSD Scales
- Penn Inventory for PTSD
- Los Angeles Symptom Checklist
- Trauma Symptom Inventory
- Distressing Events Questionnaire
- Posttraumatic Symptom Scale
- Minnesota Multiphasic Personality Inventory–2 Keane PTSD Scale
- National Stress Events Survey
- Harvard Trauma Questionnaire
- Revised Civilian Mississippi Scale
Keane pointed out that the earlier discussions highlighted the role of comparability among surveys. If researchers could agree on a reasonably standardized approach, the resulting comparability would have some advantages for everyone. However, it is important to note that some of the measures were developed with a focus on specific types of traumatic events, such as sexual assault or interpersonal violence. These measures may not work well in other contexts.
Keane said that a primary consideration when selecting a measure is the amount of time that can be allocated to administering the items and the topics covered by the other questions on the survey. Some of the relatively short screening instruments are the Traumatic Stress Schedule, the Traumatic Events Questionnaire, the Brief Trauma Questionnaire, the Trauma Assessment for Adults, and the Trauma History Screen.
As others have noted, exposure to traumatic events can lead to a range of outcomes. Keane said that researchers need to carefully consider the extent of psychopathology and the comorbidity they intend to measure. The related concepts of functioning, impairment, and quality of life are also important. Another decision that is needed prior to selecting a measure is whether the goal is to understand current symptoms, perhaps by specifying a time frame, such as the past month or past 3 months, or to understand lifetime symptoms.
Other considerations include the sensitivity and specificity of the measure and utility analyses more broadly, to determine whether the questions are measuring the concepts of interest to the researchers. Keane commented that one could debate whether a “gold standard” exists to evaluate the measures: he does not believe that there is one.
The mode of administration is another factor that needs to be considered when selecting a measure. If interviewers are to be used, one consideration needs to be the time and cost involved in training them. For a national survey, this can be a large front-end expense. Keane said that he is becoming increasingly convinced of the value of web-based self-administered approaches, such as the ones described by Kilpatrick, particularly because they enable increased standardization.
Web administration could also make it more feasible to design a longitudinal study and follow the same sample over time in order to collect data on trajectories, levels of recovery, and resilience. However, Keane noted, in order to implement a longitudinal study successfully, it is important to take into consideration how multiple administrations could affect the measure and whether any drift could be expected.
Benjamin Saunders (Medical University of South Carolina) commented that some of the measures reviewed have been developed for use in clinical settings, while others were developed for research purposes. The clinical measures tend to be the ones that are more concise, and this characteristic needs to be taken into consideration.
Terry Schell (RAND) agreed that some of the measures were developed to assess the severity of symptoms in a clinical setting, and although there may be scoring algorithms for evaluating sensitivity and specificity, these measures were not designed for probable diagnoses. He said that it is not clear how important it is to SAMHSA to collect data on diagnostic prevalence in contrast with obtaining a more in-depth understanding of the role of posttrauma mental health problems or psychopathology on a continuous scale. The use of the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders in the Mental Health Surveillance Study would indicate that diagnosis was the sole topic of interest.
Larke Huang (SAMHSA) said that SAMHSA is looking at population health broadly and would like to understand the differences between people who are at risk, people who have mild problems, and people who have serious problems. Understanding comorbidities with the conditions that SAMHSA is mandated to look at is another important part of the current effort. Keane responded that it appears that to meet SAMHSA’s needs both dimensional and categorical measures would be needed. He added that it is well documented that trauma exposure among populations with serious mental illness is associated with more severe impairment; consequently, additional measures on serious mental illness are also important to include. Kilpatrick noted that it is also well known that people with serious mental health problems have higher rates of exposure to interpersonal violence.
Huang asked what is known about how two estimates of exposure from different studies fit together: those of exposure to certain types of traumatic events and those of exposure to any traumatic event. Kilpatrick
said that researchers’ understanding of what constitutes a traumatic event and the definitions have changed over the years. The DSM-III referred to a “psychologically distressing event that is outside the range of usual human experience,” but over time it became clear that these experiences are more common than previously assumed and that many people have been exposed to more than one traumatic event, some of which may be more toxic than others. It is also better understood that the effects are cumulative. This evolution explains in part the differences in estimates obtained by different studies, and it contributes to the increasingly more complex task of identifying the relevant events and then determining how multiple events may be related to outcomes of interest.
Keane added that many of the existing measures of exposure are dichotomous: for each type of event, they simply ask whether it happened or not. Some of the most recent measures follow up each affirmative answer by asking how many times it happened and when it last happened. These are useful additional dimensions to measure, although the number of times shows a detectable difference in the data only up to a certain point. Ursano commented that asking how old the respondent was when the traumatic event was experienced would be the most useful addition to measures.
Jonaki Bose (SAMHSA) asked the participants to clarify whether there could be potential benefits to asking a very small number of questions on this topic, for example, by adding the items to the National Survey on Drug Use and Health. Some very short scales, such as the Primary Care PTSD Screen, do exist, but many of the comments seem to suggest that these would not provide adequate information for SAMHSA’s purposes. Schell responded that the Primary Care PTSD Screen does not collect any information about the potentially traumatic event, only about symptoms. Keane added that keeping the topic of trauma in the center of attention is valuable in and of itself, but a well-designed study would set the stage for really understanding the prevalence of exposure and responses.
Schell discussed procedures for developing, scoring, and evaluating the performance of a trauma scale. He began his presentation by saying that measures of trauma exposure are very different from scales that one might develop on other topics and that applying psychometric theory and techniques to trauma scales can be counterproductive.
As background, Schell provided an overview of the psychometric theory of reflexive, or effect-indicated, measures, which are measures with items that are theorized to share a common cause. The common cause is the construct to be measured, and the items reflect the influence
of the construct or are the effect of the construct. The items are correlated with each other because they have the same cause, but they may otherwise be very dissimilar. For example, weight loss and suicidal thoughts are sometimes included on the same depression scale because they are both considered to be manifestations of a problem in a person’s brain, but they are otherwise dissimilar. Most standard psychological measures are reflexive, and causal assumptions of this type are the basis of most psychometric analyses in general, including classical test theory, factor analysis, and item response theory.
Items in a reflexive measure are correlated due to their shared cause, and the quality of the measurement can be inferred from the correlation between the items. Schell noted that summing correlated items converges on an error-free measure of the common cause as the number of items goes to infinity or as the correlation between items approaches 1. This occurs because the interest is in the covariance term, not the variance. Schell said that for most scales, adding items leads to a better measure. In other words, one gets a better measure of the common cause by averaging more items because, as more items are added, the covariance among the items has an increasingly large effect on the variance of the scale. He noted that reviewers of journal articles often ask authors to discuss Cronbach’s alpha, which is a measure of the extent to which the covariance terms dominate in the variance of the sum, which is a function of the number of items and the average correlation between them.
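The reflexive-scale logic described here can be illustrated with a small simulation. The code below is my own sketch, not anything from the presentation: items are generated as a shared latent factor plus independent error, so averaging more items tracks the common cause more closely, and Cronbach's alpha is computed from the item variances and the variance of the sum.

```python
# Simulation of a reflexive scale: every item = common cause + unique error.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
factor = rng.normal(size=n)  # the common cause (latent construct)

def scale_quality(k):
    """Correlation between a k-item mean score and the latent factor."""
    items = factor[:, None] + rng.normal(scale=1.5, size=(n, k))
    return np.corrcoef(items.mean(axis=1), factor)[0, 1]

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the sum)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

# More items -> a score that correlates more strongly with the common cause.
items10 = factor[:, None] + rng.normal(scale=1.5, size=(n, 10))
print(scale_quality(2), scale_quality(20), cronbach_alpha(items10))
```

With the (assumed) error scale used here, the two-item score correlates with the factor at roughly 0.7, the twenty-item score at roughly 0.95, illustrating the convergence Schell described.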
Advanced psychometric methods can enable a measure to converge to being error free more quickly than a simple sum of items: for example, one can give more weight in the sum to items that are more correlated with the other items, or one can subtract out of the scale the portion of variance that appears to be unique to an item, that is, the variance not caused by the common cause. However, Schell pointed out that an error-free measure of the common cause of the items is not necessarily an error-free measure of the intended construct. For example, the causal model could be wrong, or the measure may be reliable but not valid. There may be multiple shared causes, some of which the researcher did not intend to measure. An example of this is response bias, such as order effects. Schell said that he and his colleagues have noticed strong order effects when examining the data from some of the common PTSD scales, such that each item is correlated more strongly with the next item than would be expected under any of the available models. These serial correlations can affect studies that use factor analysis.
Schell argued that exposure to traumatic events is not a reflexive construct because the goal is not to measure the common cause but, rather, the opposite: to measure the common outcome. Exposure to a traumatic event can be described as a formative construct, a cause-indicated construct, or a composite construct. The events can be very different, yet they are often combined by researchers because they are seen as a class of events.
Schell said that summing items creates a measure of the items’ common cause, and for formative scales this approach cannot be used unless the events are uncorrelated and equally predictive of the defined outcome. For that reason, he argued, summing the items does not work for measuring exposure to traumatic events, even though it is commonly done in the field. A solution for scoring a formative scale is available when a theory specifies the criterion the scale is supposed to predict, as is the case for life event scales: the scoring weights can be estimated by regressing that criterion on the items. This approach is particularly useful if a study includes a measure of the effects of exposure to a traumatic event, as specified by the theory. However, he added, this approach is rarely used, and he reiterated that defining the scale is not possible without first defining the criterion it is supposed to predict, which in this case would be PTSD symptoms.
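As a rough illustration of this regression-based scoring (the data, event "toxicities," and weights below are simulated assumptions of mine, not results from any study), one can generate event indicators whose effects on the criterion differ, then recover a per-item weight by regressing the criterion on the items:

```python
# Scoring a formative scale by regressing the criterion on the event items.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
# Three hypothetical event indicators (0/1), deliberately uncorrelated.
events = rng.integers(0, 2, size=(n, 3)).astype(float)
true_weights = np.array([3.0, 0.5, 1.5])  # events differ in "toxicity"
symptoms = events @ true_weights + rng.normal(scale=1.0, size=n)

# Estimate item weights by least squares with an intercept.
X = np.column_stack([np.ones(n), events])
beta = np.linalg.lstsq(X, symptoms, rcond=None)[0]
weights = beta[1:]
print(np.round(weights, 1))
```

The fitted weights recover the differing importance of the events, which a unit-weighted sum (all weights equal) would have obscured.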
Schell said that the approach he described will weight the items in a way that helps figure out what combination best predicts the effect. For example, combat trauma could predict whether the person is in a wheelchair or not, and it could also predict whether the person has PTSD symptoms or not. However, the items are not weighted the same way for the prediction. This difference leads to essentially different scales, even with the same set of items. Schell suggested that instead of trying to think about items for traumatic events, it may be possible to think of a series of potentially traumatic events that have some relationship with PTSD symptoms, but that might have a different relationship with drug use, and a different relationship with other problems.
Schell pointed out that the concept of unidimensionality does not apply in this situation. In addition, each item is intended to have unique variance that is not error. In other words, one item may be useless for predicting PTSD but useful for predicting drug use. That situation would not mean that the item is filled with error: instead of thinking about items that are not correlated with the item total as bad items, in the context of trauma it is necessary to have items that are unique.
Another implication, Schell pointed out, is that in the case of formative scales that are theoretically defined by predictive criteria, the focus is generally on validity instead of reliability. Although test-retest reliability can be defined, it is usually not assessed. These types of items should not have high internal consistency reliability; otherwise they may be closing in on an error-free measurement of the same domain, rather than capturing different domains.
Schell summarized the three characteristics of formative scales in comparison with reflexive scales:
- optimally efficient when items/events are uncorrelated (which means that a shorter list could be used);
- less valid with higher values of Cronbach’s alpha when they have been scored as summed scales; and
- less influenced by items that are highly correlated with other items, rather than more influenced by them.
Schell acknowledged that formative scales are very difficult to work with. There is no reason to believe that the various indicators can be treated as homogeneous with respect to a risk factor. Rape has one set of risk factors, for example, which are different from the risk factors for auto accidents. These kinds of differences mean that a trauma scale cannot be used as a good outcome in a causal model.
Keane asked whether there are exceptions, such as externality and high risk taking, which could be considered latent variables underlying exposure to a variety of different types of traumatic events. Schell agreed that some types of exposure can have common causes, and impulsivity is an example. If the items on a trauma scale are summed, the result could be an impulsivity measure, which is why they should not be summed. However, if some of the items have common causes, they can be clustered into highly correlated dimensions, but this clustering is probably not worth doing unless there is a theoretical reason for it.
Schell said that although a regression equation is a potential solution for scoring these types of scales, it is not always possible or desirable to use one. There are several other approaches that may be reasonable. One would be to score according to strict construct definitions, when they are available. For example, in a study that involved measuring sexual assault based on the Uniform Code of Military Justice definition of sexual assault, the researchers did not need a criterion variable to know how to score that because the definition was very specific.
Respondents could just be presented with a list of events and asked whether they had experienced any of them. However, Schell said, he does not consider this a good approach for measuring trauma because a good enough definition of trauma is not available to enable one to decide what should be on the list and what does not need to be included. Relying on findings from earlier studies is certainly a possibility, and SAMHSA could design a study that looked at a comprehensive list of potentially traumatic events and their characteristics and figure out how to combine them to predict PTSD symptoms, drug use, and other outcomes of interest. Based on those data, Schell said, it may be possible to develop an approach to
scoring the scales. Then that scoring could be used even in data collections that do not measure the criteria.
Schell said that the most common approach is to combine events without summing them, but he reminded participants that this can reduce the variance to the point at which it looks as though everyone has been exposed to trauma. This possibility illustrates an unavoidable tradeoff between the completeness of the trauma measure and its usefulness for any possible analysis.
A rarely utilized option is to minimize covariance before summing by dropping, combining, or down-weighting redundant items. For example, if data were collected on six items about sexual assault and they are all highly correlated, then one could review the covariance matrix and keep only the best item. Schell said that, theoretically, it would also be possible to keep the full set of items but weight them in a way that is inversely proportional to their covariances. He added that he does not know of any study that has used this approach.
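One way to sketch this pruning idea (the code, data, and correlation threshold are my own assumptions, not a published procedure) is to scan the correlation matrix and keep only one representative from each cluster of highly intercorrelated items:

```python
# Pruning redundant items before summing: among highly intercorrelated
# items, keep one representative so the same event is not double counted.
import numpy as np

def prune_redundant(items, threshold=0.8):
    """Greedily drop items whose correlation with an already-kept item exceeds threshold."""
    corr = np.corrcoef(items, rowvar=False)
    kept = []
    for j in range(items.shape[1]):
        if all(abs(corr[j, k]) < threshold for k in kept):
            kept.append(j)
    return kept

rng = np.random.default_rng(2)
n = 1000
base = rng.integers(0, 2, size=n).astype(float)
# Items 0-2 are near-duplicates of one underlying event (2% response noise);
# item 3 measures a distinct, independent event.
noisy = lambda: np.where(rng.random(n) < 0.02, 1 - base, base)
items = np.column_stack([noisy(), noisy(), noisy(),
                         rng.integers(0, 2, size=n).astype(float)])

print(prune_redundant(items))  # one of the duplicates plus the distinct item
```

Inverse-covariance down-weighting, the variant Schell mentioned but had not seen used, would instead keep all items and shrink the weights of the correlated ones.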
Schell said that even when a theorized criterion has been measured, it may be desirable to use a unit-weighted summed scale. One approach would be to figure out which subset of items, when summed, is the best predictor of the outcome. This analysis would be similar to doing a regression in which the betas are constrained to be either 1 or 0, and it can be done if it does not involve a significant loss of power. The result is a shorter list of traumatic events that can be summed to produce a predictor of PTSD. However, Schell reminded the participants, there is no such thing as a single scale from formative items, and the same set of items might not work as a predictor of a different outcome.
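The constrained-regression idea can be sketched as a brute-force subset search (my illustration, with simulated data rather than any study's items): sum each candidate subset of items with unit weights and keep the subset whose sum best predicts the outcome.

```python
# Unit-weighted subset selection: regression with betas constrained to 0 or 1.
from itertools import combinations
import numpy as np

def best_unit_weighted_subset(items, outcome):
    """Return the item subset whose unit-weighted sum correlates best with the outcome."""
    k = items.shape[1]
    best, best_r = None, -1.0
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            score = items[:, subset].sum(axis=1)  # unit weights: just sum
            r = abs(np.corrcoef(score, outcome)[0, 1])
            if r > best_r:
                best, best_r = subset, r
    return best, best_r

rng = np.random.default_rng(3)
n = 1500
items = rng.integers(0, 2, size=(n, 5)).astype(float)
# In this simulation, only items 0 and 2 actually drive the outcome.
outcome = items[:, 0] + items[:, 2] + rng.normal(scale=0.5, size=n)

subset, r = best_unit_weighted_subset(items, outcome)
print(subset)  # the informative items
```

Consistent with Schell's caution, the subset selected against one outcome carries no guarantee of predicting a different outcome, so the search would need to be rerun per criterion.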
Schell concluded with an example of a measure of combat trauma that he and his colleagues have used on several occasions in military populations. The measure was derived from an initial set of 30 items by identifying the items that, when unit weighted, were the best predictors of PTSD. This approach worked well in the initial context, but when they tried to use it to predict physical aggression in people’s homes, it no longer worked. The analysis showed that some deployment-related traumatic events have positive effects on violence in the home, while others have negative effects, so the measure that worked for PTSD did not work for this context.