Phthalates and Cumulative Risk Assessment: The Tasks Ahead

Appendix D

Evaluating Multiple End Points Simultaneously in a Mixture of Three Antiandrogens: A Case Study

An important step in risk assessment is the selection of end points for analysis. For single-chemical risk assessment, the most sensitive end point has often served as the basis of evaluation, although current guidance suggests a more nuanced approach (EPA 2002, 2005). Some authors have considered models that combine multiple end points in the same model and therefore avoid having to select the most sensitive end point. For example, Sammel et al. (1997) used a latent-variable model for mixed discrete and continuous correlated outcomes in which the posterior estimate of the latent variable may be interpreted as a measure of severity. Other authors have used pseudolikelihood estimation when combining continuous and ordinal outcomes to simplify the numerical challenges of using a joint density (see, for example, Faes et al. 2004). One advantage of pseudolikelihood approaches over conditional models is that estimation of a joint benchmark dose is possible; this lends itself to quantitative risk assessment (Geys et al. 1999, 2001; Regan and Catalano 1999; Faes et al. 2004).

The issue is more complex for risk assessment of chemical mixtures. Although a general kind of risk, such as reproductive or developmental risk, may be clear, different chemicals in the mixture may be associated with different sensitive end points. Furthermore, when data on studies and chemicals are combined, there is no guarantee that the same end points were even measured or that the data are available. Such missing-data concerns may result in numerical difficulties in latent-variable and multivariate models. For those reasons and others, a composite score (see, for example, Moser 1991; McDaniel and Moser 1993; Moser et al. 1995, 1997; Shih et al. 2003; Coffey et al. 2007) that combines multiple end points into a single score may be useful.
The objective of this appendix is to illustrate the development of a composite score in the analysis of the effects of a mixture of three antiandrogens on male differentiation in rats (data from Hass et al. 2007; Metzdorff et al. 2007).
Five end points are assessed: anogenital distance (AGD), nipple retention (NR), and three organ weights (weights of the ventral prostate, seminal vesicles, and levator ani/bulbocavernosus muscles [LABC]). Owing to the nature of the end points and the timing of their measurement, most pups were evaluated either for AGD and NR or for organ weights. The composite score adjusts for either case. As indicated in this example, the end points combined in the composite score may be of different types (for example, binary or categorical, count, or continuous or interval variables).

The approach used here is based on desirability functions. Desirability functions were first proposed by Harrington (1965) for use in optimizing the quality of a manufactured product that is measured by multiple end points. Harrington's approach is used to find the levels of the factors that optimize the overall quality of the many end points (Derringer and Suich 1980; Derringer 1994). It has been widely adopted in manufacturing and, among engineers involved in product optimization, is the most popular method for simultaneously analyzing many outcomes (Wu 2005). The method has also been applied to the titration of multiple-drug regimens in medical research (Shih et al. 2003) and in dose-response modeling in toxicology studies (Coffey et al. 2007).

Once a composite scoring method is specified, each animal is represented in the data analysis by a single score regardless of the number of variables measured. Dose-response curves are estimated for each chemical, and an additivity model is estimated. In this study, a fixed-ratio mixture of the three chemicals was also experimentally evaluated. It is of interest to determine whether there is evidence of interaction in the region of the mixing ratio and, even if there is evidence of an interaction, how different the dose-response curve of the mixture is from that predicted by dose addition.

METHODS

Experimental data.
Data, generously provided by Ulla Hass, are as described in Hass et al. (2007) and Metzdorff et al. (2007). In short, male sexual differentiation was studied after in utero and postnatal exposures to one or a mixture of three antiandrogens (vinclozolin, flutamide, and procymidone). The mixing ratio of the mixture was based on individual potencies for "causing retention of six nipples in male offspring" (Hass et al. 2007). Test chemicals and mixtures were administered by gavage to time-mated nulliparous, young adult Wistar rats from gestation day 7 to the day before expected birth and on postnatal days 1-16. Changes in AGD and NR in male offspring rats were evaluated. The ventral prostate, seminal vesicles, and LABC of one male per litter were excised and weighed.

Composite score. A composite score was calculated on the basis of the desirability-function method (see, for example, Harrington 1965; Coffey et al. 2007) for the five end points chosen for analysis. In short, for each variable, a
function is selected that transforms the observed response to a unitless score (0-1) based on the appropriateness (or desirability) of the response. The individual scores are then combined into a single composite score by using the geometric mean, and a standard statistical analysis can be performed. This flexible approach can handle multiple types of response variables and may include different desirability functions for each variable. Subjectivity in specifying the functions may be minimized by using consensus expert opinion.

Each of the five variables of interest was transformed to a continuous desirability function, $d_i$, with values ranging from 0 to 1, where a value of 0 designates the response as not at all desirable, and a value of 1 is assigned to the most desirable response. Although they are not included here, for categorical end points (such as a mild or moderate or severe histopathology score), a value of 0-1 is selected for each category. For continuous end points (such as AGD and NR), the basic shape of the function is determined by whether one is trying to maximize or minimize the response or to aim for a range of target values (see, for example, Shih et al. 2003). For example, a larger AGD value is expected for males, so a "larger is better" shape may be specified by using a logistic function:

$$d_i^{(\max)} = \left[ 1 + \exp\left( -\frac{Y_i - a_i}{b_i} \right) \right]^{-1},$$

where

$$a_i = \frac{Y_i^{*} + Y_{i*}}{2} \quad \text{and} \quad b_i = \frac{Y_i^{*} - Y_{i*}}{2 \ln\left( \frac{1 - \gamma_i}{\gamma_i} \right)}, \qquad Y_{i*} < Y_i^{*}.$$

The parameter $a_i$ is an average of the upper ($Y_i^{*}$) and lower ($Y_{i*}$) bounds of the response being targeted, $b_i$ controls the function spread, and $\gamma_i$ is defined so that the desirability at $Y_{i*}$ equals $\gamma_i$ and the desirability at $Y_i^{*}$ equals $1 - \gamma_i$. A minimizing desirability is obtained by reversing the sign of the exponential argument.
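As a hedged illustration of the logistic desirability function just defined, the following minimal Python sketch computes the center and spread from the two bounds and the tail value, then evaluates the "larger is better" form and its sign-reversed "smaller is better" counterpart. The function names, the restriction of the tail value to (0, 0.5), and the example bounds (2 and 8, with a tail value of 0.1) are assumptions made for illustration and are not taken from the appendix.

```python
import math


def logistic_params(lower, upper, gamma):
    """Return (a, b): the midpoint of the bounds and the spread of the curve."""
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")
    if not 0.0 < gamma < 0.5:
        # Assumption: gamma < 0.5 keeps the spread b positive.
        raise ValueError("gamma must lie in (0, 0.5)")
    a = (upper + lower) / 2.0
    b = (upper - lower) / (2.0 * math.log((1.0 - gamma) / gamma))
    return a, b


def d_max(y, lower, upper, gamma):
    """'Larger is better': gamma at the lower bound, 1 - gamma at the upper."""
    a, b = logistic_params(lower, upper, gamma)
    return 1.0 / (1.0 + math.exp(-(y - a) / b))


def d_min(y, lower, upper, gamma):
    """'Smaller is better': same curve with the sign of the exponent reversed."""
    a, b = logistic_params(lower, upper, gamma)
    return 1.0 / (1.0 + math.exp((y - a) / b))


if __name__ == "__main__":
    # Hypothetical bounds, for illustration only.
    for y in (2.0, 5.0, 8.0):
        print(y, round(d_max(y, 2.0, 8.0, 0.1), 3), round(d_min(y, 2.0, 8.0, 0.1), 3))
```

By construction, the maximizing score equals the tail value at the lower bound, 0.5 at the midpoint, and one minus the tail value at the upper bound; the minimizing score mirrors it.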
A target desirability function can then be constructed by multiplying minimizing ($d_i^{(\min)}$) and maximizing ($d_i^{(\max)}$) desirability functions so that $d_i = d_i^{(\max)} \times d_i^{(\min)}$. The parameters $a_i$, $b_i$, and $\gamma_i$ allow flexibility in defining the desirability function and the degree of conservativeness to incorporate. The shapes of the individual desirability functions are provided in Figure D-1A-E. The asterisks represent observed data points.

For AGD, a normalized score for the AGD index (Hass et al. 2007) was formed by using "mean AGD indices from unexposed male and female pups" to define the minimum (min) and maximum (max) responses. The normalized score was defined as
$$\mathrm{AGD}_{\mathrm{norm}} = \frac{\mathrm{AGD}_{\mathrm{index}} - \min}{\max - \min}.$$

Thus, a normalized value of 0 represents "complete feminization" and is associated with an undesirable response ($d_i = 0$); a normalized value near 1 represents the average unexposed-male AGD index, a desirable response ($d_i = 1$). The lower 1st percentile of the unexposed males had a normalized value of 0.56 with an interquartile range (IQR) of 0.24. The desirability function was selected so that a normalized value of 0.56 was assigned a score of 0.9; a normalized value 2 × IQR below 0.56 (that is, 0.08) was assigned a value of 0.1 (which equals $\gamma$ in the notation above; see Figure D-1A).

For NR, following Hass et al. (2007, Table 3), values of 1, 6, and 10 were considered low, medium, and high effects. A desirability function was selected (Figure D-1B) with assigned scores of 0.95, 0.66, and 0.24, respectively. Desirability functions for organ weights (ventral prostate, seminal vesicles, and LABC) in terms of percentage of control were also based on the lower 1st percentile of the unexposed group ($d_i = 0.9$), and the value 2 × IQR below the 1st percentile was assigned a value of 0.1 ($\gamma$ in the notation above). The resulting desirability functions are provided in Figure D-1C-E.

Those individual desirability functions were combined by using the geometric mean to arrive at a composite measure of overall desirability, $D$, so that

$$D = (d_1 \times d_2 \times \cdots \times d_k)^{1/k},$$

where $k$ is the number of end points used in the calculation. Although they are not used here, it is also possible to assign different weights to the individual desirability scores:

$$D = \left( d_1^{w_1} \times d_2^{w_2} \times \cdots \times d_k^{w_k} \right)^{1 / \sum_{j=1}^{k} w_j}.$$

Construction of an additivity model. The general strategy for the analysis of the data was to use the single-chemical data to fit a nonlinear logistic additivity model for the mean composite score, that is,

$$\mu_i = \alpha + \frac{1 - \alpha}{1 + \exp[-(\beta_0 + \beta_i x)]}$$

for the three single chemicals, where $x$ represents dose.
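The two composite formulas above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `composite_score`, the use of `None` for an end point that was not recorded, and the choice to compute the mean in log space are all assumptions. Skipping unrecorded end points and taking k as the number of scores actually used follows the description of how the composite adjusts for pups with fewer measurements.

```python
import math


def composite_score(scores, weights=None):
    """Weighted geometric mean of desirability scores; None entries are skipped."""
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    # Keep only the end points that were actually recorded.
    pairs = [(d, w) for d, w in zip(scores, weights) if d is not None]
    if not pairs:
        raise ValueError("at least one recorded score is required")
    if any(not 0.0 <= d <= 1.0 for d, _ in pairs):
        raise ValueError("desirability scores must lie in [0, 1]")
    if any(w <= 0.0 for _, w in pairs):
        raise ValueError("weights must be positive")
    if any(d == 0.0 for d, _ in pairs):
        return 0.0  # one fully undesirable end point zeroes the product
    total_w = sum(w for _, w in pairs)
    log_sum = sum(w * math.log(d) for d, w in pairs)
    return math.exp(log_sum / total_w)


if __name__ == "__main__":
    # Hypothetical scores: the second end point was not recorded.
    print(composite_score([0.9, None, 0.4]))
    # Hypothetical weights giving the first end point twice the influence.
    print(composite_score([0.5, 0.25], weights=[2.0, 1.0]))
```

With equal weights the function reduces to the unweighted geometric mean, so `composite_score([0.9, None, 0.4])` is the square root of 0.36. A score of exactly 0 on any recorded end point drives the composite to 0, which is a property of the geometric mean rather than a design decision of this sketch.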
Following the "single chemical required" method of analysis (see, for example, Casey et al. 2004), the additivity model was used to estimate the dose-response relationship along the fixed-ratio ray of interest (in terms of total dose with mixing ratios $a_i$) under the hypothesis of additivity:

$$
\begin{aligned}
\mu_{add} &= \alpha + \frac{1 - \alpha}{1 + \exp\left[ -\left( \beta_0 + \sum_{i=1}^{c} \beta_i x_i \right) \right]} \\
&= \alpha + \frac{1 - \alpha}{1 + \exp\left[ -\left( \beta_0 + \sum_{i=1}^{c} \beta_i a_i t \right) \right]} \\
&= \alpha + \frac{1 - \alpha}{1 + \exp[ -( \beta_0 + \theta_{add} t ) ]},
\end{aligned}
$$

where $t$ is total dose, $c$ is the number of chemicals in the mixture (here, 3), and $\theta_{add} = \sum_{i=1}^{c} \beta_i a_i$. The mixture data were also fitted to a nonlinear model in terms of total dose:

$$\mu_{mix} = \alpha + \frac{1 - \alpha}{1 + \exp[ -( \beta_0 + \theta_{mix} t ) ]}.$$

To control for litter effects, the dose-response data were analyzed with a generalized nonlinear mixed-effects model approach with litter as an added random effect. A quasi-Newton iterative algorithm (PROC NLMIXED in SAS, version 9.1) was used for estimation and inference. The test of additivity for the specified mixing ratio is equivalent to testing coincidence between the two models for the mixture. Because the other parameters ($\alpha$ and $\beta_0$) were assumed to be common to the two models, the hypothesis of coincidence is

$$H_0 : \theta_{add} = \theta_{mix},$$

which can be tested by using a t test with the appropriate variance estimated with the multivariate delta method.

RESULTS

The first step in the analysis is to determine the shapes of desirability curves for each end point under consideration. To illustrate the approach, the summary statistics from the distribution of the unexposed animals (the 1st percentile and the 1st percentile minus 2 × IQR) were used to establish two points on the curve and thereby specify the shape with a logistic function. The resulting curves are shown in Figure D-1. From the curves, the observed data are transformed into desirability scores of 0-1, where a value of 1 indicates no toxicity and a value of 0 indicates the most severe toxicity.
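To make the algebra of the additivity model concrete, the sketch below plugs the point estimates reported in Table D-2 into the logistic mean function. It only evaluates the fitted curves: it does not reproduce the mixed-effects fit or the delta-method variance, and the names (`mean_score`, `theta_add`, `ratio`) and the total doses printed are illustrative choices. The final ratio simply divides the slope difference by its tabulated standard error.

```python
import math

# Point estimates as reported in Table D-2 (not re-estimated here).
ALPHA = 0.287
BETA0 = 3.05
BETAS = {"vinclozolin": -0.036, "flutamide": -0.821, "procymidone": -0.045}
RATIOS = {"vinclozolin": 0.62, "flutamide": 0.02, "procymidone": 0.36}
THETA_MIX = -0.086
SE_DIFF = 0.004  # reported standard error of theta_mix - theta_add


def mean_score(total_dose, slope, alpha=ALPHA, beta0=BETA0):
    """Logistic mean composite score along a ray with the given slope."""
    return alpha + (1.0 - alpha) / (1.0 + math.exp(-(beta0 + slope * total_dose)))


# Under dose addition, the slope along the fixed-ratio ray is sum(beta_i * a_i).
theta_add = sum(BETAS[name] * RATIOS[name] for name in BETAS)

# Slope difference relative to its tabulated standard error.
ratio = (THETA_MIX - theta_add) / SE_DIFF

if __name__ == "__main__":
    print(f"theta_add = {theta_add:.3f}")
    for t in (0.0, 20.0, 40.0, 70.0):  # illustrative total doses, mg/kg
        print(t, round(mean_score(t, theta_add), 3), round(mean_score(t, THETA_MIX), 3))
    print(f"(theta_mix - theta_add) / SE = {ratio:.2f}")
```

The computed additivity slope rounds to -0.055, which agrees with the tabulated value. Because the two curves share the same maximal-effect and intercept parameters, they coincide at a total dose of 0 and separate only through the slope.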
For example (Table D-1), a pup in the highest-dose group of the mixture had an observed AGD index of 6.6, which was transformed to 0.12 in a normalized form. From Figure D-1A, a normalized AGD of 0.12 is associated with a desirability score of 0.14, indicating severe toxicity. For that pup, the calculations of the other four desirability scores followed in a similar manner. The end points demonstrating severe toxicity for the pup were AGD and NR, with scores of 0.14 and 0.12, respectively. The geometric mean of the five values resulted in a toxicity index of 0.27. Calculations are also demonstrated for a pup in the control group and for a pup in a moderate-dose group. In those three rats, the toxicity index decreased as the dose of the mixture increased, indicating that toxicity increased with dose.

TABLE D-1 Demonstration of Calculation of Toxicity Index for Three Rats in Control Group and Two Mixture Dose Groups(a)

                               From Control Group            From Mixture: Total Dose,     From Mixture: Total Dose,
                                                             109.19 mg/kg                  106.19 mg/kg
                               (Litter, 1; Block, 1; ID, 5)  (Litter, 59; Block, 2; ID, 1) (Litter, 135; Block, 4; ID, 1)
                               Observed    Desirability      Observed    Desirability      Observed    Desirability
End Point                      Response    Score             Response    Score             Response    Score
AGD_BW (Norm_AGD)              10.5 (0.88) 0.99              NR          -                 6.6 (0.12)  0.14
Nipples                        0           0.97              NR          -                 12          0.12
Ventral prostate (%control)    NR          -                 0.10        0.43              0.06        0.36
Seminal vesicles (%control)    NR          -                 0.30        0.67              0.20        0.56
LABC (%control)                NR          -                 0.40        0.80              0.22        0.50
Toxicity index                             0.98                          0.61                          0.27

(a) "Desirability score" can be read from Figure D-1 for observed response values. Observed responses are transformed to %control values for organ weights. Composite score is geometric mean of desirability scores of five end points, adjusted for cases with fewer scores. NR = not recorded.

Profile plots of the desirability scores of the five end points for each dose group of the mixture study are provided in Figure D-2. Each connected line segment across the end points represents the transformed data from a single pup. The desirability scores transform different end points (one normalized, one count variable, and three expressed in terms of percent control) into a unitless scale of 0-1 that can be compared across end points. The disconnected line segments in the plots illustrate that most pups were either evaluated with AGD and NR or had organ weights measured. In general, the control group and lowest mixture-dose group (7.87 mg/kg) had little indication of toxicity in any of the end points. However, as the dose increased to about 20 mg/kg, there was an indication of worsening NR, AGD was affected at about 40 mg/kg, and organ weights were not highly affected until the dose was about 70 mg/kg.

Similar plots are provided in Figure D-4 for each of the single-chemical dose-response studies. The toxicity of the single chemicals was similar to that of the mixture in that NR and AGD were more sensitive than organ weights as specified by the desirability functions.
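The Table D-1 entries can be checked by hand. The worked calculation below is a sketch that assumes the AGD curve is the logistic form given in the Methods, calibrated with a tail value of 0.1 at normalized AGD values of 0.08 and 0.56; the composite scores are recomputed from the two-decimal desirability scores printed in the table, so the last digit may differ from the tabulated index.

$$
\begin{aligned}
a_{\mathrm{AGD}} &= \frac{0.56 + 0.08}{2} = 0.32, \qquad
b_{\mathrm{AGD}} = \frac{0.56 - 0.08}{2 \ln(0.9 / 0.1)} \approx 0.109, \\
d_{\mathrm{AGD}}(0.12) &= \left[ 1 + \exp\left( -\frac{0.12 - 0.32}{0.109} \right) \right]^{-1} \approx 0.14, \\
D_{\text{control}} &= (0.99 \times 0.97)^{1/2} \approx 0.98, \\
D_{109.19} &= (0.43 \times 0.67 \times 0.80)^{1/3} \approx 0.61, \\
D_{106.19} &= (0.14 \times 0.12 \times 0.36 \times 0.56 \times 0.50)^{1/5} \approx 0.28.
\end{aligned}
$$

The first two composites reproduce the tabulated indices. The third comes out at about 0.28 from the rounded scores against the tabulated 0.27, a difference that is presumably due to the table having been computed from unrounded desirability scores.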
The composite score was calculated for each pup in the single-chemical and mixture studies by using the geometric mean of the individual desirability scores. The average litter responses across dose are displayed in Figure D-3 as asterisks. There is a clear dose-response relationship for each chemical and for the mixture. The nonlinear logistic model was fitted to these data, and the resulting parameter estimates are provided in Table D-2. In general, the fit of the dose-response curves to each study is adequate (Figure D-3). The maximal-effect parameter ($\alpha$ in Table D-2) for the single chemicals and mixture was estimated as 0.287. All the slope parameters ($\beta$s in Table D-2) are negative and significant, indicating that as the dose increases there is an increase in toxicity (a lower value of the composite score).

The additivity model and mixture model were fitted with a common maximal-effect parameter ($\alpha$) and intercept parameter ($\beta_0$), so a test of coincidence between the model for the mixture data and that predicted under additivity from the single-chemical data is a test for a difference in the slope parameters, $\theta_{mix}$ and $\theta_{add}$ (Table D-2). There is a significant difference in the slopes between
the two models (Table D-2 and Figure D-3D), with the mixture data demonstrating a greater response (a lower composite score) than that predicted under additivity. Although statistically significant, the difference between the two models is most notable in the higher dose range (Figure D-3D). The doses associated with an effect size of 2.5% for the two models are not significantly different (Table D-2).

TABLE D-2 Parameter Estimates Based on Nonlinear Logistic Model(a)

Chemical      Parameter                   Estimate   Standard Error   p value
              α                           0.287      0.015            <0.001
              β0                          3.05       0.151            <0.001
Vinclozolin   β1                          -0.036     0.003            <0.001
Flutamide     β2                          -0.821     0.068            <0.001
Procymidone   β3                          -0.045     0.003            <0.001
              θmix                        -0.086     0.006            <0.001
Additional Estimates
              θadd                        -0.055     0.004            <0.001
              θmix - θadd                 -0.031     0.004            <0.001
              EDadd(2.5)                  1.67       0.85             0.052
              EDmix(2.5)                  1.06       0.54             0.052
              EDadd(2.5) - EDmix(2.5)     0.60       0.31             0.057

(a) Estimate for scale parameter was 0.02, and variance of random effect due to litter was 0.002, with 95% confidence interval of 0.001-0.003. Estimated dose-response curves are in Figure D-3. Fixed mixing ratios for mixture were a1 = 0.62, a2 = 0.02, and a3 = 0.36 for vinclozolin, flutamide, and procymidone, respectively.

DISCUSSION

A composite score was developed here for male differentiation for five end points by using a so-called desirability-function method. An advantage of using such a score in evaluation of mixtures is that end points may be combined across studies and chemicals by transforming all end points into a common unitless scale of 0-1. The subjectivity of the initial step of specifying the desirability shapes may be minimized by specifying values on the curves from summary statistics in the control group or by using consensus from subject-matter experts (Coffey et al. 2007). Shih et al. (2003) reported on a simulation study demonstrating a degree of robustness in inference with moderate changes in the
shapes of the desirability functions. Furthermore, research is being conducted to develop methods to optimize desirability-function shapes and their relative importance on the basis of an external empirical measure of severity (Ellis et al. 2008). However, reaching consensus on such issues is not trivial and would require substantial consultation if this method were to be used in a regulatory setting.

When many end points are of interest in evaluating risk posed by exposure to chemical mixtures, the multiple statistical tests that may be performed can greatly inflate rates of type I error (concluding that there is an effect when there is none). Multiple-comparison adjustments, such as the Bonferroni correction, are often too conservative, which leads to reduced power to detect effects of interest. Thus, use of a composite score focuses the inference on an overall effect and eliminates concern about multiple testing and inflated type I error rates (Coffey et al. 2007).

Hass et al. (2007) reported that the effect of the mixture of three antiandrogens on AGD was predicted "fairly accurately" by dose addition but that the effects on NR "were slightly higher than those expected on the basis of dose addition." Metzdorff et al. (2007) reported that the joint effect of the three antiandrogens on reproductive organ weights was dose-additive. The composite score was driven largely by NR (Figure D-2), and its use resulted in evidence of a greater effect (lower composite score) of the mixture than that predicted by additivity. Thus, analysis with the composite score was in agreement with the general conclusions reported for the individual end points.

A limitation of the analysis described here is that constant variance among the chemicals and dose groups was assumed. More general assumptions may be more appropriate, as evidenced by plots of the data (Figure D-3).
A formal test of equal variance was not conducted. Another limitation of the approach is that the correlation among end points was not accounted for. Wu (2005) describes an extension based on the modified double-exponential desirability function that accounts for correlated multiple characteristics that may be useful in the setting described here.

For general use of composite scores, further evaluation, discussion, and acceptance of the shapes of the desirability functions are necessary. The central motivation is to be able to use a composite score to represent the whole set of common adverse outcomes identified to be of interest for a mixture. For the illustration described in this appendix, the androgen-insufficiency syndrome was evaluated with five end points (AGD, NR, and three reproductive organ weights). The analysis described here is for illustration only; for general use, subject-matter experts would need to achieve some level of acceptance and validation that the composite score did indeed represent the "wholeness" of the syndrome.
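The type I error inflation raised in the discussion of multiple testing can be illustrated with a short calculation. As a simplifying assumption that the appendix does not make, suppose the five end points were each tested independently at the 0.05 level; the real end points are correlated, so the actual rate would differ.

$$
P(\text{at least one false positive}) = 1 - (1 - 0.05)^{5} \approx 0.23, \qquad
\text{Bonferroni per-test level} = \frac{0.05}{5} = 0.01.
$$

Under that assumption the family-wise error rate is more than four times the nominal level, and the Bonferroni remedy requires each individual test to meet a much stricter threshold, which is the loss of power that a single composite score avoids.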
FIGURE D-1 Desirability curves for AGD, NR, and organ weights (ventral prostate, seminal vesicle, and LABC). Asterisks represent observed data points. [Panels A-E.]

FIGURE D-2 Profile plots for individual pups (connected line segment) in each dose group of mixture data. [Panels A-F; axes: desirability, end point.]

FIGURE D-3 Average calculated toxicity index (composite desirability score) per litter vs dose of three single chemicals and mixture. [Panels A-D.]

FIGURE D-4 Profile plots from the single chemical dose-response data. [Panels A-U; axes: desirability, end point.]
REFERENCES

Casey, M., C. Gennings, W.H. Carter, V.C. Moser, and J.E. Simmons. 2004. Detecting interaction(s) and assessing the impact of component subsets in a chemical mixture using fixed-ratio mixture ray designs. JABES 9(3):339-361.

Coffey, J.T., C. Gennings, and V.M. Moser. 2007. The simultaneous analysis of discrete and continuous outcomes in a dose-response study: Using desirability functions. Regul. Toxicol. Pharmacol. 48(1):51-58.

Derringer, G. 1994. A balancing act: Optimizing a product's properties. Quality Progress 27(June):51-58.

Derringer, G., and R. Suich. 1980. Simultaneous optimization of several response variables. J. Qual. Technol. 12:214-219.

Ellis, R., C. Gennings, J. Benson, and B. Tibbetts. 2008. Validation of a Morbidity Score in a Study of Botulism Toxin A. The Toxicologist 102(S1):240 [Abstract 1165].

EPA (U.S. Environmental Protection Agency). 2002. A Review of the Reference Dose and Reference Concentration Processes. Final report. EPA/630/P-02/002F. Risk Assessment Forum, U.S. Environmental Protection Agency, Washington, DC. December 2002 [online]. Available: http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=55365 [accessed Oct. 9, 2008].

EPA (U.S. Environmental Protection Agency). 2005. Guidelines for Carcinogen Risk Assessment. EPA/630/P-03/001F. Risk Assessment Forum, U.S. Environmental Protection Agency, Washington, DC. March 2005 [online]. Available: http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=116283 [accessed Oct. 9, 2008].

Faes, C., H. Geys, M. Aerts, G. Molenberghs, and P.J. Catalano. 2004. Modeling combined continuous and ordinal outcomes in a clustered setting. JABES 9(4):515-530.

Geys, H., G. Molenberghs, and L.M. Ryan. 1999. Pseudolikelihood modeling of multivariate outcomes in developmental toxicology. J. Am. Stat. Assoc. 94(447):734-745.

Geys, H., M.M. Regan, P.J. Catalano, and G. Molenberghs. 2001. Two latent variable risk assessment approach for mixed continuous and discrete outcomes from developmental toxicity data. JABES 6(3):340-355.

Harrington, E.C., Jr. 1965. The desirability function. Ind. Qual. Control 21(10):494-498.

Hass, U., M. Scholze, S. Christiansen, M. Dalgaard, A.M. Vinggaard, M. Axelstad, S.B. Metzdorff, and A. Kortenkamp. 2007. Combined exposure to anti-androgens exacerbates disruption of sexual differentiation in the rat. Environ. Health Perspect. 115(S1):122-128.

McDaniel, K.L., and V.C. Moser. 1993. Utility of a neurobehavioral screening battery for differentiating the effects of two pyrethroids, permethrin and cypermethrin. Neurotoxicol. Teratol. 15(2):71-83.

Metzdorff, S.B., M. Dalgaard, S. Christiansen, M. Axelstad, U. Hass, M.K. Kiersgaard, M. Scholze, A. Kortenkamp, and A.M. Vinggaard. 2007. Dysgenesis and histological changes of genitals and perturbations of gene expression in male rats after in utero exposure to antiandrogen mixtures. Toxicol. Sci. 98(1):87-98.

Moser, V.C. 1991. Applications of a neurobehavioral screening battery. Int. J. Toxicol. 10(6):661-669.

Moser, V.C., B.M. Cheek, and R.C. MacPhail. 1995. A multidisciplinary approach to toxicological screening. III. Neurobehavioral toxicity. J. Toxicol. Environ. Health 45(2):173-210.

Moser, V.C., G.C. Becking, V. Cuomo, E. Frantík, B.M. Kulig, R.C. MacPhail, H.A. Tilson, G. Winneke, W.S. Brightwell, M.A. De Salvia, M.W. Gill, G.C. Haggerty, M. Hornychová, J. Lammers, J.J. Larsen, K.L. McDaniel, B.K. Nelson, and G. Ostergaard. 1997. The IPCS collaborative study on neurobehavioral screening methods: V. Results of chemical testing. Neurotoxicology 18(4):969-1056.

Regan, M.M., and P.J. Catalano. 1999. Bivariate dose-response modeling and risk estimation in developmental toxicology. JABES 4(3):217-237.

Sammel, M.D., L.M. Ryan, and J.M. Legler. 1997. Latent variable models for mixed discrete and continuous outcomes. J. Roy. Stat. Soc. B 59(3):667-678.

Shih, M., C. Gennings, V.M. Chinchilli, and W.H. Carter Jr. 2003. Titrating and evaluating multi-drug regimens within subjects. Stat. Med. 22(14):2257-2279.

Wu, F.C. 2005. Optimization of correlated multiple quality characteristics using desirability function. Qual. Eng. 17(1):119-126.