APPENDIX E

Adjustment of Observed Intake Data to Estimate the Distribution of Usual Intakes in a Group

An individual's actual intake varies considerably from one day to the next, but it is usual or long-term average intakes that are of interest in assessing and planning dietary intakes to ensure nutrient adequacy for individuals or groups. As explained in a previous report (IOM, 2000a), serious error in the assessment of nutrient inadequacy or excess can occur if the dietary intake data examined do not reflect usual intakes. This poses a major obstacle to the assessment of an individual's nutrient intake because his or her usual intake is generally poorly estimated from only a few days of observation, yet more extensive data collection is rarely feasible. Assessments of nutrient adequacy among groups are facilitated by the availability of statistical adjustment procedures to estimate the distribution of usual intakes from observed intakes, as long as more than one day of intake data has been collected for at least a representative subsample of the group. These procedures do not yield estimates of usual intake for particular individuals in the group, but the adjusted distribution of intakes is appropriate for use in analyses of the prevalence of inadequate or excess intakes in the group.

In recent years a number of different statistical procedures have been developed to estimate the distribution of usual intakes from repeated short-term measurements (Hoffmann et al., 2002). Two commonly used adjustment procedures are described here: the National Research Council (NRC) method and the Iowa State University (ISU) method. Both procedures are based on a common conceptual foundation, but the ISU method includes a number of statistical enhancements that make it more appropriate for use with
large population surveys. The NRC method is simpler and may be more appropriate than the ISU method for use with small samples (those with fewer than 40 to 50 individuals). However, neither method is without limitations.

THE NATIONAL RESEARCH COUNCIL METHOD

Conceptual Underpinnings

In assessing nutrient adequacy it is necessary to estimate usual intake. However, usual intake cannot be inferred from measures of observed intake without error. For any one individual,

$$\text{Observed intake} = \text{usual intake} + \text{measurement error}$$

The observed variance ($V_{\text{observed}}$) of a distribution of intakes for a group based on one or more days of intake data per individual is the sum of the variance in true usual intakes of the individuals who comprise the group (i.e., the between-person or interindividual variance, $V_{\text{between}}$) and the error in the measurement of individuals' true usual intakes. Error arises both because of the normal variation in individuals' intakes from one day to the next and because of random error in the measurement of intake on any one day. It is referred to as the within-person, day-to-day, or intraindividual variance ($V_{\text{within}}$) (NRC, 1986):

$$V_{\text{observed}} = V_{\text{between}} + V_{\text{within}}$$

The observed distribution of intakes will be wider and flatter than the true distribution of usual intakes as a result of the presence of within-person variance. However, assuming that the within-person variation is random in nature, the estimate of mean intake for the group will not be influenced by this variance.

If multiple days of intake data per individual are averaged, and the distribution of intakes in the group is constructed from the means of each individual's multiple intakes, then the error variance (i.e., within-person variance) diminishes as a function of the number of days of intake data per person.
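The effect of averaging can be written out explicitly. The following is a sketch only, assuming that each person $i$ contributes $d$ days of intake and that day-to-day deviations are independent with a common variance; the symbols $d$, $Y_{ij}$, $\mu_i$, and $\varepsilon_{ij}$ are introduced here for illustration and do not appear in the original text.

```latex
Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad
\bar{Y}_i = \frac{1}{d}\sum_{j=1}^{d} Y_{ij}, \qquad
\operatorname{Var}\!\left(\bar{Y}_i\right)
  = V_{\text{between}} + \frac{V_{\text{within}}}{d}
```

With $d = 1$ the error term is the full within-person variance; as $d$ grows, the error term shrinks toward zero and only the between-person variance remains.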
Thus, as the number of days of data per person increases, the distribution of observed intakes (expressed as the individuals' observed mean intakes over the days of data collection) becomes a better and better approximation of the true distribution of usual intakes in the group.

The NRC method (NRC, 1986) is typically applied to a data set
comprising multiple days of intake data for a sample of individuals, ideally with an equal number of observations per individual. This method of estimating the distribution of usual intakes works by first partitioning the observed variance into its between- and within-person components, and then shifting each point in the observed distribution closer to the mean by a function of the ratio of the square roots of the between-person variance ($V_{\text{between}}$) and observed variance ($V_{\text{observed}}$). In this way, the method attempts to remove the effect of within-person variation on the observed distribution. The variance of the adjusted distribution should represent $V_{\text{between}}$.

Application

The steps in the NRC method are outlined below. The method is illustrated using data on the zinc intakes of 46 women recorded over three nonconsecutive 24-hour dietary intake recalls (a subsample of women drawn from an earlier study by Tarasuk and Beaton [1999]).

Step 1. Examine normality of distribution and transform data if necessary.

This adjustment procedure depends on the properties of a normal distribution, yet the observed distribution of intakes for most nutrients is likely to be positively skewed. This is because the distribution is naturally truncated at 0 (i.e., reported intakes cannot fall below this value) but has no limit at the upper end. Thus it is imperative that the normality of the 1-day intake data be assessed. (This can be accomplished through the NORMAL option in PROC UNIVARIATE in SAS.) If departures from normality are detected, the data should be transformed to approximate a normal distribution. The most appropriate transformation will depend on the shape of the original distribution; a logarithm, square root, or cube root transformation may be suitable.

Note that for this example, the assessment of normality is conducted on all 138 days of recall data (i.e., 46 women multiplied by 3 days).
The Shapiro-Wilk statistic, W, provides one measure of the normality of the data (Tarasuk and Beaton, 1999). For the raw data, W = 0.85 (versus a value of 1 for normally distributed data), and the distribution departs significantly from normality (p < 0.0001). A visual inspection of the plotted data reveals that they are right-skewed. Through a process of trial and error, a more normal distribution is achieved by applying a cube root transformation to these data.
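The trial-and-error search for a transformation can be illustrated with a short Python sketch. It is not the procedure used in the study: the Shapiro-Wilk test is not available in the Python standard library, so sample skewness is used here as a rough stand-in for judging symmetry, and the intake values are hypothetical numbers chosen only to be right-skewed.

```python
import statistics


def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def cube_root(values):
    """Cube-root transform; assumes non-negative intakes."""
    return [v ** (1.0 / 3.0) for v in values]


# Hypothetical right-skewed 1-day zinc intakes (mg); NOT the study data.
intakes = [3.1, 4.0, 4.6, 5.2, 5.9, 6.3, 6.8, 7.4, 8.1,
           9.0, 10.2, 12.5, 15.8, 21.0, 30.4]

skew_raw = skewness(intakes)
skew_cbrt = skewness(cube_root(intakes))

# The transformed data should be noticeably less skewed than the raw data.
print(f"skewness raw: {skew_raw:.2f}, after cube root: {skew_cbrt:.2f}")
```

In practice a formal test such as Shapiro-Wilk, together with a plot of the data, would be used to compare candidate transformations, as the text describes.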
The W of the transformed data is 0.99 (p = 0.1812). The next two steps in this adjustment procedure are conducted using the transformed data.

Step 2. Estimate the within- and between-person variance.

Some statistical packages have procedures for partitioning the variance of the observed data into the within- and between-person variance components (e.g., PROC VARCOMP in SAS). This can also be easily accomplished using the analysis of variance procedures available in most statistical packages by conducting a simple one-way ANOVA with subject ID included as a categorical or class variable. A sample program for SAS is presented at the end of this appendix. When the raw data are transformed to better resemble a normal distribution, this step is conducted on the transformed data.

Two values are extracted from the ANOVA output. The mean square error or unexplained variance (i.e., the variance in the observed daily intakes that is not accounted for by between-subject differences) represents the within-subject variance in the 1-day data. The mean square model (i.e., the mean square associated with the subject ID variable entered into the ANOVA) represents the observed variance of the 1-day data. Because the adjustment procedure is applied to an individual subject's mean intakes over the period of observation, both the mean square model and mean square error need to be divided by the mean number of days of intake data per subject to obtain the $V_{\text{observed}}$ and $V_{\text{within}}$ for this distribution (i.e., $V_{\text{observed}}$ = mean square model/$n$ and $V_{\text{within}}$ = mean square error/$n$). $V_{\text{between}}$ can be estimated by subtracting $V_{\text{within}}$ from $V_{\text{observed}}$, as follows:

$$V_{\text{between}} = \frac{\text{mean square model} - \text{mean square error}}{n}$$

where $n$ is the mean number of days of intake data per subject in the sample. $V_{\text{between}}$ represents the "true" variance of the distribution of usual intakes.
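The partitioning in Step 2 can be sketched without a statistical package. The Python function below is an illustrative reimplementation of a one-way ANOVA with subject as the class variable, not the SAS program referred to in the text; the function name, the record layout (one pair per day of intake), and the data are assumptions made for the example. Negative differences are floored at zero, as the SAS program at the end of this appendix also does.

```python
from collections import defaultdict


def partition_variance(records):
    """One-way ANOVA with subject as the class variable.

    records: iterable of (subject_id, intake) pairs, one per day of intake.
    Returns (ms_model, ms_error, mean_days_per_subject).
    """
    by_subject = defaultdict(list)
    for subject_id, intake in records:
        by_subject[subject_id].append(intake)

    n_obs = sum(len(days) for days in by_subject.values())
    n_subjects = len(by_subject)
    grand_mean = sum(sum(days) for days in by_subject.values()) / n_obs

    # Between-subject (model) and within-subject (error) sums of squares.
    ss_model = sum(len(days) * (sum(days) / len(days) - grand_mean) ** 2
                   for days in by_subject.values())
    ss_error = sum((x - sum(days) / len(days)) ** 2
                   for days in by_subject.values() for x in days)

    ms_model = ss_model / (n_subjects - 1)
    ms_error = ss_error / (n_obs - n_subjects)
    return ms_model, ms_error, n_obs / n_subjects


# Hypothetical transformed intakes: three subjects, three days each.
records = [("A", 1.8), ("A", 2.0), ("A", 2.2),
           ("B", 2.4), ("B", 2.6), ("B", 2.8),
           ("C", 1.9), ("C", 2.3), ("C", 2.1)]

ms_model, ms_error, n = partition_variance(records)
v_observed = ms_model / n                      # variance of subject means
v_within = ms_error / n                        # error variance remaining in the means
v_between = max(ms_model - ms_error, 0.0) / n  # floored at zero
```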
Each of these variance estimates can be expressed as a standard deviation by simply taking the square root of the variance.

Table E-1 presents the output for the ANOVA procedure as applied to this example. The mean number of days of intake data per subject is three. In this example, $V_{\text{observed}} = 0.24633584/3$, $V_{\text{within}} = 0.13375542/3$, and $V_{\text{between}} = (0.24633584 - 0.13375542)/3$.
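The arithmetic for this example can be carried through in a few lines of Python. The mean squares are the values reported in Table E-1; the variable names, and the name `shrinkage` for the ratio of standard deviations used in Step 3, are introduced here for illustration only.

```python
import math

MS_MODEL = 0.24633584   # mean square model (Table E-1)
MS_ERROR = 0.13375542   # mean square error (Table E-1)
N_DAYS = 3              # mean number of days of intake per subject

v_observed = MS_MODEL / N_DAYS
v_within = MS_ERROR / N_DAYS
v_between = (MS_MODEL - MS_ERROR) / N_DAYS

sd_observed = math.sqrt(v_observed)
sd_between = math.sqrt(v_between)

# Ratio applied to each subject's deviation from the group mean in Step 3.
shrinkage = sd_between / sd_observed

print(f"V_observed = {v_observed:.5f}, V_within = {v_within:.5f}, "
      f"V_between = {v_between:.5f}, SD ratio = {shrinkage:.3f}")
```

On the cube-root scale the ratio comes out at roughly 0.68, so each subject's deviation from the group mean is reduced by about one third before back-transformation.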
TABLE E-1 ANOVA of Zinc Intake of 46 Adult Women, Shown for Data Transformed Using Cube Roots

Source            Degrees of Freedom   Sum of Squares   Mean Square   F Value   Pr > F
Model                     45            11.08511265     0.24633584     1.84     < 0.0069
Error                     92            12.30549834     0.13375542
Corrected total          137            23.39061099

Step 3. Adjust individual subjects' mean intakes to estimate the distribution of usual intakes.

Each subject's mean intake is now adjusted by applying the following formula:

$$\text{Adjusted intake} = \left[(\text{subject's mean} - \text{group mean}) \times \frac{SD_{\text{between}}}{SD_{\text{observed}}}\right] + \text{group mean}$$

where $SD_{\text{between}}$ is the square root of $V_{\text{between}}$ and $SD_{\text{observed}}$ is the square root of $V_{\text{observed}}$. This equation effectively moves each point in the distribution of observed intakes closer to the group mean, but it does not shift the group mean. If the distribution of 1-day data was transformed prior to partitioning the variance (Step 2), the equation is applied to the individual subject and group means calculated from the transformed data (Step 3), and the resultant distribution needs to be transformed back prior to use (see Step 4). If the data were not transformed, however, the adjusted intakes calculated from this equation now represent the estimated distribution of usual intakes.

Step 4. If the original data have been transformed, transform the adjusted intake back to the original units.

If the original data were transformed in order to satisfy the necessary assumption of normality, the adjusted data need to be transformed back into the original units prior to their use for nutrient assessment. Back-transforming refers to the application of the inverse function of the original transformation. In this example, the original data were transformed using cube roots; the back transformation raises each subject's adjusted intake to the power of three. The process of transforming data, adjusting it, and then back-transforming it is
necessary to preserve the shape of the original distribution for analysis purposes while removing the within-person variance.

Table E-2 presents a comparison of the distribution of the subjects' observed 3-day means with the adjusted intakes. The variance of the adjusted intake distribution is substantially less than the variance of the distribution of the observed 3-day means, as evidenced by the adjusted intake's lower standard deviation. In addition, the 25th and 75th percentiles of the adjusted intake distribution lie closer to its mean than do those of the observed 3-day means.

TABLE E-2 Observed Distribution of 3-Day Mean Zinc Intakes (mg) and Estimated (Adjusted) Distribution of Usual Intakes for a Sample of 46 Women

Zinc Intake            Mean   Standard Deviation   25th Percentile   50th Percentile   75th Percentile
Observed 3-day means   8.84         3.58                 6.11              8.49             10.97
Adjusted intake        8.03         2.20                 6.58              8.15              9.33

If the Estimated Average Requirement (EAR) cut-point method is applied to the adjusted distribution to assess the prevalence of inadequate zinc intakes among this sample, an estimated 26 percent of women (12/46) appear to have inadequate intakes (12 of the 46 adjusted means were below the EAR for zinc for women of 6.8 mg/day). This is lower than the 28 percent prevalence of inadequacy that would be estimated from the unadjusted data.

Special Considerations

Two features of the NRC method deserve special note because they pose challenges to analysts wanting to use this approach. First is the requirement for normally distributed data, and the second is the handling of incomplete data.

Normality

As noted earlier, the NRC method hinges on having normally distributed intake data or being able to transform the observed data into a normal distribution.
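Because the whole adjustment is carried out on the transformed scale, it may help to see Steps 3 and 4 and the EAR cut-point count in one place. The Python sketch below is illustrative only: the function names are hypothetical, the eight subject means are invented values on the cube-root scale (not the study data), and the two standard deviations are rounded values derived from the Table E-1 mean squares.

```python
def nrc_adjust(subject_means, sd_between, sd_observed):
    """Step 3: pull each subject mean toward the group mean by SD_between/SD_observed."""
    group_mean = sum(subject_means) / len(subject_means)
    ratio = sd_between / sd_observed
    return [group_mean + (m - group_mean) * ratio for m in subject_means]


def prevalence_below(values, cut_point):
    """EAR cut-point method: proportion of the group with intake below the EAR."""
    return sum(1 for v in values if v < cut_point) / len(values)


EAR_ZINC = 6.8  # mg/day, EAR for zinc for women as given in the text

# Hypothetical subject means of cube-root-transformed intakes; NOT the study data.
transformed_means = [1.70, 1.86, 1.90, 1.98, 2.05, 2.10, 2.18, 2.30]

# Rounded SDs on the cube-root scale, derived from the Table E-1 mean squares.
adjusted = nrc_adjust(transformed_means, sd_between=0.1937, sd_observed=0.2866)

# Step 4: back-transform by applying the inverse of the cube root.
usual_intakes = [a ** 3 for a in adjusted]

# For comparison only: the same means back-transformed without adjustment.
unadjusted_intakes = [m ** 3 for m in transformed_means]

prev_adjusted = prevalence_below(usual_intakes, EAR_ZINC)
prev_unadjusted = prevalence_below(unadjusted_intakes, EAR_ZINC)
print(prev_adjusted, prev_unadjusted)
```

In this invented sample the adjustment moves one subject from just below the EAR to just above it, which is the same mechanism that lowers the estimated prevalence in the zinc example. The adjustment leaves the group mean unchanged on the transformed scale; any shift in the mean after back-transformation comes from the transformation itself.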
If nonnormal data are not transformed prior to adjustment, or if the applied transformation fails to correct for the nonnormality of the data, then assessments of the prevalence of inadequacy or excess using the adjusted distribution will be inaccurate.

Some indication of the importance of this step comes from a closer look at the results of the adjustment procedure applied in the example presented above. Both the mean and the median of the adjusted distribution are slightly lower than the mean and median of the women's 3-day means (Table E-2), suggesting that the adjustment procedure has shifted the original distribution toward 0. This shift is a function of the transformation. Had the transformation more completely achieved the properties of a normal distribution, the observed mean and the adjusted mean would be equivalent.

It may be difficult, if not impossible, to normalize some observed nutrient intake distributions with simple power transformations. Observed distributions of vitamin A, in particular, are notorious for this problem (Aickin and Ritenbaugh, 1991; Beaton et al., 1983). In cases where the data fail to satisfy the assumptions of a normal distribution even when transformed, application of the NRC method and use of the resultant adjusted distribution for nutrient assessment is problematic (Beaton et al., 1997). Depending on the extent of the departure from normality, it may be preferable to not use the data for nutrient assessment. If assessments are conducted on data adjusted without fully satisfying the normality assumption, at minimum, the problem should be noted so that readers can interpret prevalence estimates with greater caution.

Handling Incomplete Data

The NRC method was originally developed for application to data sets with more than one day of intake data per subject. In describing the NRC method here, it has been assumed that an equal number of replicate observations are available for each member of the sample.
If there are subjects missing one or more days of intake data, this can be factored into the calculation of $V_{\text{between}}$, reducing the denominator of that equation. Nonetheless, it is assumed that few subjects fall into this category.

In large dietary intake surveys it is increasingly common to collect two or more days of intake data on a subsample of the larger sample and use the understanding of within- and between-person variance derived from this subsample to adjust the intake data of the entire sample. (The ISU method [Nusser et al., 1996] is well suited to handling such data.) In surveys involving smaller samples, however, this practice is much less common. The application of estimates of within- and between-person variance from a subsample to the larger sample obviously presumes that the subsample is representative of
the larger sample with respect to all characteristics that affect these variance estimates. If starting with a smaller sample, this representativeness may be more difficult to achieve through random sampling. With minor modifications to the NRC method outlined here it is possible to derive variance estimates from a subsample and apply this information to adjust the 1-day intake data for a larger sample. However, given the issue of representativeness, it is preferable to obtain two or more days of intake data on all subjects in a small sample and use all subjects' data in the adjustment procedure.

THE IOWA STATE UNIVERSITY METHOD

Working in conjunction with the U.S. Department of Agriculture, a group of statisticians at ISU developed a method to estimate usual intake distributions from large dietary surveys (Nusser et al., 1996). The method is implemented through a software package called SIDE (Software for Intake Distribution Estimation). It can be used to adjust observed intakes in large dietary surveys as long as two nonconsecutive or three consecutive days of intake data have been collected for a representative subsample of the group. For a full discussion of the ISU method of adjustment, see Guenther and colleagues (1997).

Based on the NRC method, the ISU approach includes a number of statistical enhancements (Guenther et al., 1997). Specifically, the ISU method is designed to transform the intakes for a nutrient to the standard normal distribution, applying procedures that go beyond the simple transformations that analysts can apply in the NRC method. The distribution of usual intakes is then estimated from this distribution of transformed intake values and the estimates are mapped back to the original scale through a bias-adjusted back transformation. The procedures represent a major advance over the NRC method and a number of other more complicated adjustment procedures that have been proposed (Hoffmann et al., 2002).
In addition, the ISU method is designed to take into account other factors such as day of week, time of year, and training or conditioning effects (apparent in patterns of reported intake in relation to the sequence of observations) that may exert systematic effects on the observed distribution of intakes. The ISU method can also account for correlation between observations on consecutive days and for heterogeneous within-person variances (e.g., in cases where the observed level of day-to-day variability in individuals' intakes is directly associated with their mean intake levels). While these refinements could
be built into the NRC method, in its simplest form the method does not account for autocorrelation or other systematic effects on within-person variation.

Another particularly valuable feature of the ISU method is its ability to apply sample weighting factors, common in large population surveys, so that the adjusted distribution of intakes truly estimates the distribution of usual intakes in the target population, not just the sample. Thus the ISU method is well suited for use with large survey samples.

In a recent evaluation of six different methods, Hoffmann and colleagues (2002) concluded that the ISU method had distinct advantages over the others. Most importantly, the method was applicable across a broad range of normally and nonnormally distributed intakes of food groups and nutrients.

Despite its strengths, however, the ISU method may not be as appropriate as the NRC method for use with small samples. The greater complexity of the ISU method requires a larger sample to ensure that the various steps in the adjustment procedure retain acceptable levels of reliability. A smaller sample can be used with the NRC method because the adjustment procedure is simpler (e.g., applying simpler methods of transformation and back-transformation and not accounting for heterogeneity of within-person variance).

OTHER CONSIDERATIONS IN THE APPLICATION OF ADJUSTMENT PROCEDURES

Defining Groups for Data Adjustment

Because nutrient requirements vary by life stage and gender group, assessments of nutrient adequacy are usually conducted separately for particular subgroups of the population. The statistical adjustment of intake data, whether done by the NRC or ISU method, should therefore also be conducted separately for each group for which the nutrient assessment will be conducted.
If intake data have been collected across more than one life stage and gender group, it is not appropriate to combine subgroups for the purpose of adjustment and then later subdivide the adjusted data for separate analyses. Similarly, if the intended analysis of nutrient inadequacy is by stratum within a single life stage or gender group (e.g., the assessment of nutrient inadequacy for particular population subgroups defined by income or education levels), then the adjustment of intake data should be conducted separately for each stratum.
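A minimal Python sketch of group-wise adjustment follows, using the NRC shrinkage formula for concreteness. The function name, group labels, subject means, and ratios are all hypothetical; in practice each group's ratio of $SD_{\text{between}}$ to $SD_{\text{observed}}$ would be estimated from that group's own variance partitioning, and any transformation would also be chosen per group.

```python
from collections import defaultdict


def adjust_by_group(subject_means, sd_ratio):
    """Apply the NRC adjustment separately within each group.

    subject_means: list of (group, subject_id, mean_intake) tuples.
    sd_ratio: dict mapping group -> SD_between / SD_observed for that group.
    Returns a dict mapping subject_id -> adjusted intake.
    """
    groups = defaultdict(list)
    for group, subject_id, mean_intake in subject_means:
        groups[group].append((subject_id, mean_intake))

    adjusted = {}
    for group, members in groups.items():
        # Each group is shrunk toward its own mean with its own ratio.
        group_mean = sum(m for _, m in members) / len(members)
        for subject_id, m in members:
            adjusted[subject_id] = group_mean + (m - group_mean) * sd_ratio[group]
    return adjusted


# Hypothetical groups, subject means, and ratios, for illustration only.
data = [("women_19_30", "w1", 7.0), ("women_19_30", "w2", 9.0),
        ("men_19_30", "m1", 11.0), ("men_19_30", "m2", 15.0)]
ratios = {"women_19_30": 0.5, "men_19_30": 0.75}

result = adjust_by_group(data, ratios)
print(result)
```

Pooling the two groups before adjustment would instead shrink every subject toward a single combined mean, which is the practice the text advises against.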
Adjusting Intake Variables Expressed as Ratios

To assess the macronutrient composition of diets and examine, for example, the proportion of energy derived from saturated fatty acids, it is necessary to examine the distribution of usual intakes for macronutrients expressed as ratios of total energy intake. The adjustment procedures described here can be applied to intakes expressed as nutrient:energy ratios or as nutrient:nutrient ratios. However, the ratio of interest should be computed for each day of intake data first; the observed ratios are then adjusted to estimate the distribution of usual intakes as ratios. For example, it is not appropriate to compute the adjusted distribution of energy and fat separately and then combine these distributions for analytic purposes.

Underlying Assumptions and Limitations of Adjustment Methods

One important difference in application of the two methods described here is that the ISU method of adjustment is typically applied to the distribution of intakes on day one of data collection, whereas the NRC method is applied to multiple-day means. In the design of large dietary surveys it is becoming increasingly common to collect a second day of intake data on only a subsample of the group. The ISU method is then applied to adjust the entire distribution of intakes on day one using the information about within-person variation that is gleaned from the subsample. In the application of the NRC method to smaller data sets, typically comprising multiple days of intake data for each member of the sample, multiple-day means are used as the basis for adjustment with the underlying assumption that all days have equivalent validity.
In data sets where a sequence effect is observed, with reported energy and nutrient intakes declining systematically across multiple days of data collection (Guenther et al., 1997), the adjustment of intakes to day-one data will result in a higher estimate of usual intake than an adjustment based on individuals' multiple-day means. If it can be assumed that intake on day one has been more accurately reported than on subsequent days, then the adjustment to day-one data will yield a less biased estimate of the distribution of usual intakes. Because good methods to establish the validity of self-reported intakes on particular days of data collection are lacking, it is difficult to determine whether day-one data or multiple-day means are better estimates of true intake. Indeed, the answer may differ depending on the particular group under study and the conditions of data collection.
Neither the NRC nor the ISU method of adjustment is capable of addressing problems of systematic bias due to underreporting of intakes. The approaches must assume that individuals have reported their food intake without systematic bias on day one, at least, for the ISU method, and across all days of data collection for the NRC method. If intakes have been underreported, the adjusted distribution of intakes will be biased by this underreporting.

Irrespective of the method of adjustment applied, it must also be assumed that reported food intakes have been correctly linked to a food composition database that accurately reflects the energy and nutrient content of the food. Systematic errors in the estimation of nutrient levels in foods consumed will bias the estimated distribution of usual intakes. In the case of nutrients for which food composition data are known to be incomplete, analysts must gauge the extent to which reported intakes will be biased. If intake cannot be estimated without substantial error, it is not appropriate to proceed with nutrient assessment.

Despite these limitations, the adjustment of observed distributions of intake for within-person variance to better estimate the distribution of usual intakes in a group represents a critical step in the assessment of nutrient adequacy or excess. In applying the steps in planning diets for groups, as described in this report, the focus is on planning for usual intakes. The assessments of nutrient adequacy and excess that are required to inform the planning process should be conducted on intake data that have been adjusted to provide the best possible estimate of the distribution of usual intakes in the group.
SAMPLE SAS PROGRAM FOR THE NRC METHOD

(Written by G.H. Beaton, University of Toronto, in December 1988 and modified in January 2002)

This program runs an ANOVA, estimates the partitioning of variance, and calculates the between-person, within-person, and total standard deviations (i.e., SDINTER, SDINTRA, and SDTOTAL, respectively) for the data set at hand with these estimates. The program then adjusts the observed distribution of mean intakes to remove remaining effects of within-person variation in intakes. The adjusted data can then be used as input data for the EAR cut-point or full probability assessment (IOM, 2000a). If the original data are transformed to better approximate a normal distribution, this program should be run on the transformed data and the final adjusted data back-transformed prior to the assessment of nutrient adequacy or excess. Note that the adjustments should be made independently for each stratification (e.g., males and females) and should be run on ratios after the ratio has been calculated.

    /* NOTE: THIS PROGRAM, AS WRITTEN, ASSUMES THAT THE INPUT DATA SET */
    /* HAS ONE RECORD FOR EACH DAY OF INTAKE. IF MORE THAN ONE DAY OF  */
    /* INTAKE FOR EACH SUBJECT APPEARS IN A SINGLE RECORD, THE DATA    */
    /* SET WILL NEED TO BE REORGANIZED BEFORE THE PROGRAM IS RUN.      */

    /* Added here: BY-group processing below requires data sorted by SUBJID */
    PROC SORT DATA=YOURDATA;
      BY SUBJID;

    /* One-way ANOVA with subject as the class variable */
    PROC ANOVA DATA=YOURDATA OUTSTAT=ANOVSTAT;
      CLASS SUBJID;
      MODEL NUTRIENT=SUBJID;   /* <<< Change variable name to nutrient of interest */

    /* Separate the model and error mean squares and degrees of freedom */
    DATA PARTIT1;
      SET ANOVSTAT;
      MS = SS/DF;
      MSERROR = MS;
      MSMODEL = MS;
      DFERROR = DF;
      DFMODEL = DF;
      IF _TYPE_ = 'ERROR' THEN MSMODEL = .;
      IF _TYPE_ = 'ANOVA' THEN MSERROR = .;
      IF _TYPE_ = 'ERROR' THEN DFMODEL = .;
      IF _TYPE_ = 'ANOVA' THEN DFERROR = .;
      KEEP MSMODEL DFMODEL MSERROR DFERROR;

    /* Collapse the two records into one */
    PROC UNIVARIATE NOPRINT;
      VAR MSMODEL DFMODEL MSERROR DFERROR;
      OUTPUT OUT=PARTIT2 MEAN=MSMODEL DFMODEL MSERROR DFERROR;

    /* Variance partitioning; MEANREPL is the mean number of days per subject */
    DATA PARTIT3;
      SET PARTIT2;
      MEANREPL = (DFMODEL+DFERROR+1)/(DFMODEL+1);
      ERRORDIF = MSMODEL - MSERROR;
      IF ERRORDIF LT 0 THEN ERRORDIF = 0;
      SDINTRA = MSERROR**0.5;
      SDINTER = (ERRORDIF/MEANREPL)**0.5;
      SDTOTAL = (SDINTER**2 + (SDINTRA**2/MEANREPL))**0.5;
      INDEX = 1;
      KEEP SDINTER SDTOTAL INDEX;

    /* Mean intake for each subject */
    PROC MEANS NOPRINT DATA=YOURDATA;
      VAR NUTRIENT;
      BY SUBJID;
      OUTPUT OUT=SUBJMEAN MEAN=SMEAN;

    DATA SUBJMEAN;
      SET SUBJMEAN;
      INDEX = 1;

    /* Group mean of the subject means */
    PROC UNIVARIATE NOPRINT;
      VAR SMEAN;
      OUTPUT OUT=MEANS MEAN=GMEAN;

    DATA MEANS;
      SET MEANS;
      INDEX = 1;

    /* Shift each subject mean toward the group mean */
    DATA ADJUST;
      MERGE SUBJMEAN PARTIT3 MEANS;
      BY INDEX;
      NRCADJ = GMEAN + (SMEAN - GMEAN) * SDINTER/SDTOTAL;
      KEEP SUBJID NRCADJ;
    RUN;

    /* THIS IS NOW THE ADJUSTED DATA TO BE USED IN ANALYSIS.      */
    /* NEED TO DO FOR EACH OF THE INTAKE VARIABLES IF THIS        */
    /* PROCEDURE IS TO BE EMPLOYED.                               */

    DATA FINAL;
      MERGE YOURDATA ADJUST;
      BY SUBJID;

    PROC PRINT;
      TITLE 'NUTRIENT DATA SHOWING INDIVIDUAL OBS, MEAN, NRC ADJUSTED';
    RUN;