

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 359
Computerized Performance Testing in Neurotoxicology: Why, What, How, and Whereto?

Francesco Gamberale, Anders Iregren, and Anders Kjellberg

WHY MEASURE PERFORMANCE?

Behavioral performance tests have been developed primarily to assess the psychophysiological efficiency of the individual. Most frequently, the tests have been used for diagnostic purposes in clinical contexts or for personnel selection. During the last 15 to 20 years, however, performance tests have been applied with increasing frequency to assess functional changes in the central nervous system (CNS) induced by exposure to unfavorable conditions in the work environment. Ever since the early 1970s, when the use of psychometric techniques made it possible to link deterioration in human performance to the inhalation of solvent vapor (Astrand and Gamberale, 1978), psychometric tests have been widely and successfully used in many countries in the study of solvent toxicity (Anshelm Olson, 1985; Gamberale, 1985; Iregren, 1986b) as well as in the study of the toxicity of numerous other chemical compounds, including anesthetic gases (Biersner, 1972), agricultural chemicals (Rodnitzky et al., 1975), and metals (Roels et al., 1987).

The growing interest in the measurement of performance is most probably due to the sensitivity shown by these methods in unveiling changes in the human organism that otherwise would not be detected. By now, the evidence that these changes are some of the earliest indicators of the occurrence of health effects has become unequivocal. As a consequence, the measurement of performance has come to be regarded by many as a device of major importance for monitoring hazards to health and safety in the work environment. This development appears to agree with the ideas promulgated by the World Health Organization (WHO) that health does not mean only absence of disease but also optimum physical, mental, and social well-being and, moreover, that health means not only freedom from pain and disease but also freedom to maintain and develop one's functional capabilities.

At our institute, the measurement of performance has undoubtedly constituted the main method of studying the effects on the CNS of low-dose exposure to the chemical substances that are frequently found in the work environment. Thus, we have applied performance tests of various kinds in experimental laboratory studies as well as in field studies and in cross-sectional epidemiological investigations. Through the years, we have made use of performance tests in experimental inhalation studies of industrial solvents such as toluene (Gamberale and Hultengren, 1972; Iregren, 1986a), methylchloroform (Gamberale and Hultengren, 1973), styrene (Gamberale and Hultengren, 1974), white spirit (Gamberale et al., 1975a), methylene chloride (Gamberale et al., 1975b), trichloroethylene (Gamberale et al., 1976), xylene and ethylbenzene (Gamberale et al., 1978), methyl isobutyl ketone (Wigaeus-Hjelm et al., 1990), and toluene in combination with p-xylene (Anshelm Olson et al., 1985) or with ethanol (Iregren et al., 1986).
Psychometric tests have been applied directly at the worksite in two investigations of workers in the plastic boat industry exposed to styrene (Gamberale et al., 1976b; Kjellberg et al., 1979), in studies of steelworkers (Anshelm Olson et al., 1981) and of workers in the paint industry (Anshelm Olson, 1982) exposed to solvent mixtures, and in two investigations of nurse anesthetists exposed to anesthetic gases (Gamberale and Svensson, 1974; Kjellberg and Strandberg, 1979). Furthermore, we have used comprehensive batteries of behavioral tests to investigate possible long-term effects of chronic exposure to organic solvents among car and industrial spray painters (Elofsson et al., 1980), workers in a jet motor factory (Knave et al., 1978), and rotogravure printers (Iregren, 1982).

Behavioral tests are also used to an ever-increasing extent to study the effects of work environmental conditions other than exposure to neurotoxic substances. Thus, unfavorable effects on performance of environmental factors such as noise, vibration, cold, heat, electric and magnetic fields, and physical work load have been demonstrated in laboratory experiments as well as in field studies.

Our experience with the use of psychometric tests to study the effects of nonchemical agents in the physical work environment is relatively modest. However, behavioral tests have been successfully applied at our laboratory in experiments on the effect on performance of exposure to noise (Kjellberg and Wide, 1988) and to moderate cold (Enander, 1987). A series of experimental studies on the effects of different climatic conditions on performance is now in progress (Gamberale et al., 1988b). Finally, we have used psychometric tests to investigate the possible effects on workers of acute (Gamberale et al., 1988a) and chronic exposure (Knave et al., 1979) to electric and magnetic fields.

HOW TO MEASURE PERFORMANCE?

A working group within the WHO has recommended (WHO, 1987) a battery of tests to use in the search for neurotoxic effects in working populations. The main criterion applied by the WHO in selecting the tests to be included in the battery was that the tests should have proven their sensitivity in empirical investigations. To facilitate a widespread application of the methods, no tests that required complicated technical equipment for their administration were included in the battery. A further requirement was that the tests should be selected among those commonly used by the clinical practitioner for diagnostic purposes. In practice, these requirements limited the choice to the tests in the Wechsler Adult Intelligence Scale (WAIS battery). Thus, no attention was paid to tests developed especially for use in laboratory experiments and quasi-experimental field studies of the effects of exposure to neurotoxic substances. In our opinion, the tribute paid to the clinical practitioner by selecting among traditional manual or paper-and-pencil tests has had a negative effect on the sensitivity of the WHO battery to detect neurotoxic effects.

Against this background it is difficult to understand why some groups working with the development of computerized tests feel the need to refer to this WHO list of tests as a rationale for their test implementations (Cassito, 1985; Letz and Baker, 1986). It is obvious that such a strategy leads to inadequate utilization of the possibilities offered by computerized testing.

Another problem should be considered when implementing existing traditional tests on computers, namely, that the correlations between the results obtained with the two versions of the tests (i.e., the computerized and the paper-and-pencil tests) often are quite low. This low correspondence may be due to several inevitable differences between the resulting test protocols with regard to stimulus presentation as well as response input. To mention one example, Beaumont (1985) investigated the effects of various response modes on the results in a computerized Digit Span test. He found substantial differences between responses entered via the ordinary keyboard, an external keypad, and a touch-sensitive screen.

Several of the testing systems applied within the area of neurotoxicology use fairly simple psychomotor tasks, and reaction time or response latencies are generally used as the outcome variables. It has been argued, however, that performance on more complex cognitive tasks should be more sensitive to disruption by exposure to toxicants. Still, in the experience at our laboratory, a test of Simple Reaction Time (SRT) has proved to be generally the most sensitive test.

The greater sensitivity demonstrated by the tests of relatively simple mental functions does not necessarily imply that these tests tap the CNS functions most vulnerable to neurotoxic substances. Instead, this greater sensitivity may be due primarily to the higher reliability of these tests compared to tests measuring complex cognitive functions. A substantial contribution to the reliability of the tests of simple mental function stems from the fact that performance parameters in these tests are usually based on a large number of items. These circumstances concerning the sensitivity and reliability of different types of tests should be taken into consideration, especially when analyzing results in terms of possible differential deficits (e.g., Chapman and Chapman, 1978).

Several groups have recently developed computer-based performance evaluation systems for use in neurotoxicology (e.g., Baker et al., 1985; Braconnier, 1985; Cassito, 1985; Eckerman et al., 1985; Iregren et al., 1985; Laursen and Jorgensen, 1985), and some laboratories use similar systems in related fields (e.g., Bittner et al., 1985; Irons and Rose, 1985).
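The remark above that reliability benefits from a large number of items can be illustrated with the Spearman-Brown prediction formula from classical test theory. The chapter does not cite this formula; it is brought in here as an assumed model, and the single-item reliability of 0.10 is a hypothetical value chosen only for illustration.

```python
def spearman_brown(single_item_reliability: float, n_items: int) -> float:
    """Predicted reliability of a score based on n_items parallel items.

    Assumes the classical Spearman-Brown model (parallel items), which is
    an illustration added here, not a method described in the chapter.
    """
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)


if __name__ == "__main__":
    # Hypothetical single-item reliability of 0.10.
    for n in (1, 10, 80):
        print(n, round(spearman_brown(0.10, n), 3))
```

Under these assumptions, a score averaged over many reaction time trials can reach a high reliability even when any single trial is noisy, which is one way to read the authors' caution about interpreting differential deficits.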
Furthermore, there are of course several computer-based systems that have been applied in clinical use, two examples of which are those of Acker (1983) and Beaumont and French (1987).

WHAT ELSE TO MEASURE?

The methods of value in early detection of neurotoxic effects or of the effects on the CNS of exposure to other unfavorable physical environmental agents include, besides performance tests, neurophysiological and neurological testing, as well as questionnaires for the assessment of subjective experience. In most investigations, a systematic collection of the subjects' experience when exposed to different experimental or occupational conditions may constitute an invaluable source of information. Most questionnaires for this type of assessment can easily be computerized with maintained reliability and validity (see, e.g., Carr et al., 1981; Lucas, 1977).

With the Swedish Performance Evaluation System (SPES), it is possible to collect three types of self-report data: (1) symptoms of acute as well as long-term exposure, (2) self-rating of mood, and (3) self-rating of performance. The first two types of data aim at the detection and description of possible environmentally induced changes in the subjects' perceptions of their physical and psychological states. The third type of data is motivational in character and refers to the subjects' motivation, confidence, and effort expended during testing.

PROS AND CONS OF AUTOMATED TESTING

Automated testing in general implies some advantages over traditional paper-and-pencil testing:

• an excellent opportunity for strict standardization of test procedures (e.g., instructions, test protocols, and evaluation variables), thus increasing the possibilities for comparisons across studies;
• the ability to perform detailed measurement and analysis of single responses or response components (this type of microanalysis has greatly enhanced the sensitivity of performance tests); and
• increased precision in the measurement procedure; by reducing the influence of the investigator, the reliability and the validity of the results are increased.

Because an automated test may be administered by a technician or a nurse, there is the possibility of reducing the work load of psychologists, who therefore can make better use of their skills. It has also been suggested that automation of the testing procedure would render the testing situation less threatening.

Fully computerized testing procedures provide some additional possibilities as well:

• Computers are flexible, and one system can be used to administer a variety of tests, as well as to perform other routine tasks in the laboratory or clinic.
• Computers facilitate prompt scoring and evaluation of even very complex tests and questionnaires.
• Computers make it possible to adapt the choice of items according to the performance capacity of the individual.
• Computers offer communication possibilities, making data transfer for statistical computations or other purposes very convenient.
• Computers are transportable and, in the extreme case, even portable, thus making field testing feasible.

Some of the most frequently mentioned critical comments regarding computerized testing are (1) the supposedly poor rapport established between the subject and the machine; (2) the difficulties in testing large groups; (3) the static form of computerized approaches; (4) the restricted range of stimuli that can be presented; and (5) the restriction in the choice of response media.

With regard to the rapport established, investigations pertaining to this problem generally indicate no difficulties (see, e.g., Carr et al., 1981; French and Beaumont, 1987; Lucas, 1977; or Lukin et al., 1985). The "user friendliness" of a system is not dependent upon whether it is computerized or not, but on the careful design of the system (Heal et al., 1973). As pointed out by Beaumont (1982), the single most important requirement for successful design is the predictability of the system. Thus, no action on the part of the subject should result in accidental termination of the test.

Due to the still relatively high purchase costs of even a microcomputer system, it is difficult to test large groups of people simultaneously. On the other hand, computerized tests often provide much more information than traditional tests within a specified time period. Thus, the ability to perform simultaneous testing of large groups is not as important with computerized tests.

The objection concerning the static form of computerized tests is by now invalid. There have been successful attempts at making "tailored" (i.e., adaptive) tests, where the test items administered are contingent upon the performance of the subject (see, e.g., Weiss and Vale, 1987). One of the memory tests available with our system, a version of the Digit Span test, functions in an adaptive way.
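As a rough illustration of what "adaptive" can mean for a Digit Span test, the sketch below lengthens the digit sequence after a correct recall and shortens it after an error. This one-up, one-down rule is an assumption made for the example; the excerpt does not describe the rule actually used in SPES, and the function and parameter names are hypothetical.

```python
import random
from typing import Callable, List


def adaptive_digit_span(recall_ok: Callable[[List[int]], bool],
                        n_trials: int = 12,
                        start_length: int = 3,
                        min_length: int = 2,
                        max_length: int = 12,
                        seed: int = 0) -> List[int]:
    """Return the sequence length presented on each trial.

    Assumed rule (not taken from SPES): one digit longer after a correct
    recall, one digit shorter after an error.
    """
    rng = random.Random(seed)
    length = start_length
    presented = []
    for _ in range(n_trials):
        digits = [rng.randint(0, 9) for _ in range(length)]
        presented.append(length)
        if recall_ok(digits):
            length = min(max_length, length + 1)
        else:
            length = max(min_length, length - 1)
    return presented


if __name__ == "__main__":
    # Simulated subject who recalls up to 6 digits and never more.
    print(adaptive_digit_span(lambda digits: len(digits) <= 6))
```

With the simulated subject, the presented lengths climb from 3 and then oscillate between 6 and 7, so most trials are spent near the subject's own limit rather than on items that are far too easy or too hard.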
To date, most efforts at the implementation of performance tests on computers have used visual stimuli, because the administration of auditory or tactile stimuli, for example, requires the use of rather complicated (probably custom-made) external equipment in addition to the computer. Such additions would naturally make it very difficult to standardize the tests to allow for widespread use.

The few response media available present a similar problem with respect to the development of tests. Because input to a computer is normally made via the keyboard, most testing systems presently use this medium. Attempts to use other means of input usually also imply manual performance, as, for example, with joysticks or touch-sensitive screens. At present, the choice of response mode is probably the most limiting factor in the development of computerized tests because the response medium may easily affect test results in unintended ways. A fast and reliable system for processing speech input would in many instances provide the only acceptable solution.

The restrictions regarding stimulus presentation and response input are, of course, steadily diminishing because continuous technical developments make computers increasingly competent. However, standardized tests using new technical possibilities (e.g., speech input) are still several years off because the development of well-standardized tests is a laborious, long-term project.

In spite of the restrictions mentioned, efforts at developing computerized tests and testing systems are steadily increasing, and this type of assessment is currently in use in a wide variety of settings. Computerization has made complicated psychometric techniques available even to persons lacking training as professional psychologists. Therefore, this trend accentuates ethical demands for control of the construction, distribution, and use of these methods (Matarazzo, 1983). The American Psychological Association (APA, 1986) and the British Psychological Society (Bartram et al., 1987) have published guidelines for this purpose.

The experience gained at our laboratory in using computerized tests has generally been positive, although there are of course difficulties (Iregren et al., 1985). One significant problem, which applies equally to traditional tests, relates to the time and effort needed for successful test development. Furthermore, computers are still technically complicated machines, thus requiring special skills of the psychologists and technicians engaged in this development.

Several recent reviews have treated various aspects of computerized testing, and the reader is referred to those by Bartram and Bayliss (1984), McArthur and Choppin (1984), Space (1981), and Thompson and Wilson (1982).
DEVELOPMENTS IN SWEDEN

Since 1970, the Division of Psychophysiology, National Institute of Occupational Health (NIOH), Solna, Sweden, has been concerned with the development of psychometric methods suitable for the study of adverse effects of environmental stressors, primarily neurotoxic substances. The first test to be standardized for use in environmental research was an SRT test (Lisper and Kjellberg, 1972). At the start, this test was administered with an electronic apparatus consisting of a timer, a tape recorder, and a stimulus/response panel. The test was then implemented on other types of testing equipment, and it has been used in most of our investigations. Special equipment, developed in 1973, was used solely in laboratory experiments. This apparatus comprised a paper-tape-controlled, solenoid-operated stimulus/response panel and a teletype printer, and it was used for the first time in an investigation of the toxicity of white spirit (Gamberale et al., 1975a). Besides the above-mentioned SRT test, tests of Choice Reaction Time (CRT), numerical ability, and memory were implemented on this equipment.

A further step in the development of testing equipment was taken with the introduction of a new stimulus/response panel in 1975. Stimuli were presented on eight rows of 32-LED displays, each capable of showing any alphanumeric character. Responses were entered on a full QWERTY keyboard. This equipment, which made possible the presentation of more complex stimuli as well as the registration of written responses, was used to test different cognitive functions. It was used in several cross-sectional epidemiological investigations on workers exposed to industrial solvents and electromagnetic fields (Elofsson et al., 1980; Iregren, 1982; Knave et al., 1978, 1979). The major disadvantages of this type of equipment were the laborious programming procedure, the fragility of the paper tape, and the time-consuming evaluation of the results. For a review of similar attempts at noncomputerized automated testing, see Denner (1977).

Some of these disadvantages could be overcome when computers became more easily available. However, early computers had other shortcomings with respect to automated testing. They were expensive and difficult to program or handle, and the access via time-sharing terminals made exact timing of response latencies impossible.

With the advent of the microcomputer, new approaches became possible. For the first time, fully automated testing could be performed, with administration of instructions and test items, as well as response registration with precise timing of response latencies and data storing. New demands were made on our performance assessment methods by the acquisition of an exposure chamber, which required a fully automated procedure in the solvent inhalation studies.
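To make the idea of timing a response latency concrete, here is a minimal sketch using Python's monotonic high-resolution counter. It is only an illustration on present-day hardware, not a reconstruction of the equipment described above; `wait_for_response` is a hypothetical callback standing in for whatever blocks until the subject responds, and the accuracy actually achieved depends on the operating system and input device.

```python
import time
from typing import Callable


def timed_response(wait_for_response: Callable[[], None]) -> float:
    """Return the latency in milliseconds between stimulus onset and response."""
    onset_ns = time.perf_counter_ns()   # timestamp taken at stimulus onset
    wait_for_response()                 # blocks until the subject responds
    return (time.perf_counter_ns() - onset_ns) / 1e6


if __name__ == "__main__":
    # Simulated response after roughly 250 ms.
    latency_ms = timed_response(lambda: time.sleep(0.25))
    print(f"{latency_ms:.1f} ms")
```

A monotonic counter is used rather than wall-clock time so that clock adjustments during a session cannot distort the measured interval.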
The equipment used in these experiments consisted of a computer with a black-and-white monitor, a dual disk drive, a modified numerical keyboard, a reaction time panel, and a printer. Three performance tests were used with this equipment. The previously used SRT test and a test of short-term memory were adapted to this computer system, and a new test of CRT was developed.

This system had several shortcomings owing to its technical limitations, e.g., poor picture quality due to low graphic resolution and poor precision in the timing of response latencies. A small working memory and a slow BASIC language were also severely limiting factors with respect to test development. For a review of system requirements for computerized automated testing, see Beaumont (1982).

When a new generation of the same computer was introduced, the performance assessment system underwent further development. The new computer was equipped with high-resolution color graphics, a more flexible working memory, and a BASIC language that facilitated the construction of long sequences of tests. The requirement of a timing accuracy of at least 1 ms was met by using an external clock with program routines in Assembler language. Several of the previously used tests were implemented on this equipment (i.e., SRT, CRT, a memory test, and a test of numerical ability). Furthermore, new tests were developed for use with this computer, e.g., a Complex Reaction Time task using color words as stimuli.

The performance assessment system was further improved in 1984 by using a later version of the computer, which was equipped with greater working memory for graphics and a high-quality color monitor. The number of tests currently available on this system is 14. The SPES has now been transferred to IBM computers to facilitate its use by other research groups.

DESCRIPTION OF THE SWEDISH PERFORMANCE EVALUATION SYSTEM

The SPES consists of a number of semiautomated computerized performance tests and various scales for the subjective evaluation of performance on the tests, of mood, and of different kinds of symptoms. The system is designed to be dynamic and flexible. Thus, it allows the researcher or the practitioner to choose among the tests and the scales, adapting the battery to the specific purpose of the evaluation at hand. The system is also intended to undergo gradual improvement based on analyses of the results of ongoing empirical studies and on future experience with the use of the system.

With few exceptions, the performance tests are nonverbal, i.e., they can be used to assess performance independent of the language of the subjects.
Some of the tests, e.g., the Color Word Vigilance test (SPES3:1), the Color Word Stress test (SPES3:2), and the Verbal Reasoning test (SPES10), can be easily adapted to other languages and will only require translation of the text files used by the programs. These text files, which are easily edited, contain all the verbal communication with the subject (i.e., instructions on how to perform the test as well as the test items). The only test that requires a completely new construction and standardization if used with non-Swedish-speaking subjects is the Vocabulary test (SPES11).

Hardware

Any IBM or IBM-compatible PC, XT, or AT, equipped with an external clock card (SB11 multifunction card, Emulex Corp., 3545 Harbor Blvd, P.O. Box 6725, Costa Mesa, CA 92626), an overlay to the keyboard, an EGA graphics card, a color monitor (for some of the tests), and an optional printer may be used. A hard disk is not necessary to run the test battery. However, because the full system is too large to be contained within a single diskette, a hard disk is recommended.

Software

The programs, which are available in compiled form on diskette, are written in TURBO Pascal. The system includes a master program referring to a number of different test programs, which can be combined in any preferred sequence. This sequence may in turn be repeated any number of times, according to the design of the investigation at hand. The graphic presentations within this system are developed with the aid of a graphic tools package (TURBO PAINT TOOLS, 1986, DATABITEN/P.S. DATAKRAFT).

At present, the system consists of the 14 tests listed in Table 1, in addition to four scales for the measurement of reported mood, symptoms (two scales), and self-rated performance. The tests are Simple, Choice, and Complex Reaction Time (four tests); Search and Memory Test; Symbol Digit; Digit Span; Logical Reasoning; Additions; Finger Tapping (two tests); Vocabulary; Digit Classification; and Digit Addition. A short description of each test may be found in the appendix to this chapter.

Anyone interested in further information about the SPES system should contact Anders Iregren, who is responsible for the system development and the distribution of SPES software.

EMPIRICAL BACKGROUND AND APPLICATIONS

Table 2 lists the investigations that have been performed so far by using SPES tests. These include experimental studies in the laboratory, occupational studies of effects from acute or long-term exposure to various agents, two investigations applying SPES tests in clinics of occupational medicine, and two studies directly aimed at the methodological evaluation of the tests.
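The Software paragraph above describes a master program that combines test programs in any preferred sequence and repeats that sequence as the study design requires. The sketch below shows one way such a run list could be expanded. It is written in Python rather than the TURBO Pascal of the original system, and `build_session` is a hypothetical helper, not part of SPES; the test codes are the SPES labels mentioned in the chapter, used here only as plain strings.

```python
from typing import List, Tuple


def build_session(sequence: List[str], repetitions: int = 1) -> List[Tuple[int, str]]:
    """Expand a chosen test sequence into an ordered run list.

    Each entry is (repetition number, test code). Hypothetical helper for
    illustration; the real master program is not documented in this excerpt.
    """
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    return [(rep, code)
            for rep in range(1, repetitions + 1)
            for code in sequence]


if __name__ == "__main__":
    for rep, code in build_session(["SPES1", "SPES2", "SPES3:1"], repetitions=2):
        print(rep, code)
```

Keeping the sequence as data rather than hard-coding it in the control flow is what lets the same master program serve different study designs.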
Standardization Study

A standardization sample of 100 subjects went through SPES1, 2, 3:1, 4, 5, 6, 7, 10, and 12:1 (Kjellberg and Wisung, 1987). Of these subjects, 38 were university students and 62 were employees of NIOH. For 59 of the latter, testing was repeated four to five months later.

TABLE 1 Tests and Scales in the SPES

SPES Code | Performance Test | No. of Items (+ Practice) | Parameters Extracted | Approximate Time (min)
1 | Simple Reaction Time | 80 + 16 | Mean, SD, decrement | 6
2 | Choice Reaction Time | 112 + 32 | Mean, no. of errors | 9
3:1 | Color Word Vigilance | 192 + 16 | Mean RT, no. of commissions, no. of omissions | 8
3:2 | Color Word Stress | 192 + 16 | Mean RT, no. of commissions, no. of omissions | ?
4 | Search and Memory | 3* (10 + 1) | Mean RT/search level | 10
5 | Symbol Digit | 6 + 4 | Mean RT, no. of errors | ?
6 | Digit Span | Varies | Length of memory span | ?
7 | Additions | 40 (+ 3) | Mean RT, no. of errors | ?
8 | Digit Classification | 240 | Mean RT, no. of errors, no. of lags | ?
9 | Digit Addition | 120 | Mean RT, no. of errors, no. of lags | ?
10 | Verbal Reasoning | 64 | Mean RT, no. of errors | ?
11 | Vocabulary | 45 | No. of correct answers | ?
12:1 | Finger Tapping Speed | 24 (3 + 1) | Mean no. of taps/hand | 1
12:2 | Finger Tapping Endurance | ? | Changes of movement time and resting time over test time | ?

SPES Code | Self-Rating Scale | No. of Items | Parameters Extracted | Approximate Time (min)
30 | Performance | 1/test | Percent of maximum performance | 1
31 | Mood | 12 | Activity score and stress score | 3
32 | Acute symptoms | 17 | No. of symptoms reported | 4
33 | Long-term symptoms | 38 | No. of symptoms reported | 6

NOTE: SD = standard deviation; RT = reaction time. A question mark marks a cell that could not be recovered from the machine-read text; five further time values (4, 7, 8, 8, and 5 min) survive in that text but cannot be matched to individual tests.
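As a small worked example of the first row of Table 1, the sketch below scores a Simple Reaction Time session by discarding the 16 practice trials and computing the mean and SD over the remaining 80. The use of the sample standard deviation is an assumption, the "decrement" parameter is left out because this excerpt does not define it, and the latencies are invented purely for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List

PRACTICE_TRIALS = 16   # Table 1: 80 scored items + 16 practice items


def score_simple_rt(latencies_ms: List[float],
                    n_practice: int = PRACTICE_TRIALS) -> Dict[str, float]:
    """Mean and sample SD of reaction times after dropping practice trials."""
    scored = latencies_ms[n_practice:]
    if len(scored) < 2:
        raise ValueError("need at least two scored trials")
    return {"n": len(scored), "mean": mean(scored), "sd": stdev(scored)}


if __name__ == "__main__":
    # Hypothetical session: 16 slow practice trials, then 80 scored trials.
    session = [400.0] * 16 + [240.0, 260.0] * 40
    print(score_simple_rt(session))
```

Because the practice trials are dropped by position, the function relies on the latencies being stored in presentation order.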

[Figure 4: panel A plots the difference in msec against percentile; panel B plots t-values against percentile.]

FIGURES 4A and 4B. Mean differences between solvent-exposed (N = 292) and nonexposed workers (N = 424) in different parts of the reaction time distribution in SPES1; t-values for the differences are shown in the figure on the right. SOURCE: Data from Soderman et al. (1982).

CLINICAL VALIDATION STUDY

General Results

The test battery proved to be simple to use in clinics, and neither psychologists nor patients had any difficulty in utilizing the tests or the equipment. Furthermore, results indicated that computerized tests predicted the diagnosis slightly better than traditional tests.
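The kind of comparison summarized in Figure 4, group differences at different parts of the reaction time distribution, can be sketched as follows. This is an assumed procedure (percentiles computed per subject, then averaged per group), not necessarily the one used in the cited study, and the t-values of panel B are not computed. All names and data are hypothetical.

```python
from statistics import mean, quantiles
from typing import Dict, List, Sequence


def subject_percentiles(rts_ms: Sequence[float],
                        percentiles: Sequence[int]) -> Dict[int, float]:
    """Selected percentiles (1 to 99) of one subject's reaction times."""
    cuts = quantiles(rts_ms, n=100, method="inclusive")  # 99 cut points
    return {p: cuts[p - 1] for p in percentiles}


def group_difference(exposed: List[Sequence[float]],
                     controls: List[Sequence[float]],
                     percentiles: Sequence[int] = (10, 30, 50, 70, 90)
                     ) -> Dict[int, float]:
    """Mean exposed-minus-control difference (ms) at each percentile."""
    diffs = {}
    for p in percentiles:
        exp_mean = mean(subject_percentiles(s, [p])[p] for s in exposed)
        ctl_mean = mean(subject_percentiles(s, [p])[p] for s in controls)
        diffs[p] = exp_mean - ctl_mean
    return diffs


if __name__ == "__main__":
    # Hypothetical data: one subject per group, exposed tail is stretched.
    control = [list(range(200, 301))]
    exposed = [[200 + 1.5 * k for k in range(101)]]
    print(group_difference(exposed, control, (10, 50, 90)))
```

Looking at several percentiles rather than only the mean shows whether a group difference is spread evenly across the distribution or concentrated in the slowest responses.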

COMPUTERIZED TESTING IN NEUROTOXICOLOGY Descriptive Statistics 385 Mean values for the different diagnostic groups on the various performance measures are presented in Table 9. The p-values for the group differences from a one-way analysis of variance are also given. Discriminatory Power The predictive power of the computerized tests on the diagnosis was tested with a multiple regression analysis. The multiple correla- tion coefficients ranged from 0.54 to 0.81 for the three psychologists involved. Fairly low though they are, these correlations are still slightly higher than those obtained between traditional tests and diagnosis. CLINICAL TRIAL WITH SPESL Table 10 shows performance on the Simple Reaction Time test for the four diagnostic subgroups and the control group. The positive TABLE 9 Mean Values for Various Performance Measures in the Diagnostic Groups and p Values for Group Differences Solvent-Induced Illness Test/Variable No Possibly Yes p Simple RT Mean 333 446 475 0.039 Standard deviation 80 125 139 0.004 Choice RT Mean 958 983 1,092 0.003 Color Word Vigilance Mean 641 705 710 0.003 No. of misses 6.9 12.5 14.4 0.081 No. of alarms 8.1 7.2 8.6 0.836 Symbol Digit Mean 45.7 50.4 55.9 0.142 Estimated RT 37.6 44.1 50.5 0.033 Errors 0.83 0.66 0.79 0.931 Digit Span 50% level 6.1 5.7 5.1 0.002 Reasoning Mean RT 7.9 8.4 7.7 0.723 No. correct 45.7 42.0 41.0 0.510 SOURCE: Data from Iregren et al. (1987).

OCR for page 359
386 F. GAMBERALE, A. IREGREN, AND A. KIELLBERG TABLE 10 Mean and Standard Deviation for Performance on the Simple Reaction Time Test, Group Size and Age for Various Groups in Clinical Try-out of SPES1 Reation Time Age Mean Diagnosed Group Mean (SD) Variability N (years) SD Solvent-~nduced illness 543 122 5 48 12 (206) Possibly solvent-~nduced 388 104 17 51 8 illness (135) Psychiatric illness 315 73 10 44 12 (105) Other diagnoses 268 59 19 46 14 (e.g., low back pain) (49) Control group 242 46 27 39 7 (25) SOURCE: Data from Hagberg and Iregren (1984). predicted value for the SRT on this diagnosis is 56 and the negative predicted value is 79, given a cut-off limit from a 99 confidence inter- val derived from the control group data. POSSIBLE FUTURE DEVELOPMENT OF SPES The increasing technical competence of computers will certainly broaden the range of mental abilities that can be tested. However, due to the laborious procedure of test development, well-standardized and validated new tests are still a few years off. The current tests, which already have provided much useful infor- mation about the neurotoxic effects of many substances, will be ap- plied in closer collaboration with representatives from other disciplines. Thus, we will be able to relate performance data to increasingly pre- cise measurements of the exposure to toxic substances, as well as to more sophisticated physiological and neurochemical effect measures. In the long run, this development will increase our understanding of the biological mechanisms behind the functional changes that we ob- serve and will provide still better validation for the performance measures. However, the immense variety of performance measures in use is probably the single factor that at present has the greatest effect on the rate of growth of knowledge. 
The possibilities of making comparisons across studies performed at separate laboratories and in different countries are of major importance, and initiatives to facilitate the standardization of computerized tests have been taken within the European Economic Community. One significant problem in this process is the slightly different primary uses of various tests and test systems, because the intended use of a test naturally affects the way in which it is implemented. However, the development of standardized test protocols is now in progress, and efforts to accomplish this task have been made at our laboratory as well as elsewhere. This volume is a good example of these efforts.

APPENDIX

Description of Performance Tasks

Simple Reaction Time. SPES1 is a sustained attention task measuring response speed to an easily discriminated but temporally uncertain visual signal. The task is to press a key on the keyboard as quickly as possible when a red square is presented on the display. A total of 96 stimuli are administered during 6 min at intervals varying between 2.5 and 5.0 s. The first minute serves as practice, after which performance capacity is assessed for 5 min.

Choice Reaction Time. SPES2 is a four-choice RT task similar to SPES1 with the addition of response selection requirements. The stimuli consist of crosses displayed one at a time on the screen. One arm of the cross is always shorter, and the task is to indicate on one of four keys, placed in analogy to the arms of the cross, which arm is the shorter. A total of 144 stimuli are presented at the same intervals as in SPES1, and the first two minutes are excluded as practice trials.

Color Word Vigilance. SPES3:1 is a Choice Reaction Time task in which response selection is based on a more complex signal characteristic than in SPES2. It is a task of vigilance type since a response is required only to a minority of the signals. The Swedish word for "red," "yellow," "white," or "blue" (all three-letter words) is presented on the screen. The text can be written in any one of the four colors.
The task is to press a key as rapidly as possible when there is congruency between the meaning of the word and the color of the text. The interval between consecutive stimuli is 2.2 s, and the 16 possible combinations of words and color are randomly distributed within each sequence of 16 stimuli. Thus, the proportion of critical stimuli is 25 percent. A total of 256 items are presented, and the first 16 are regarded as practice trials.

Color Word Stress. SPES3:2 is a version of SPES3 that is constructed to provoke false alarms, and thus primarily measures the ability to inhibit such responses. The stimuli are the same, but the interval between subsequent stimuli is decreased to 1.5 s, and the proportion of critical stimuli has been increased from 25 to 75 percent.

Search and Memory. SPES4 measures the speed of comparing stimuli shown on the screen with a set of stimuli retained in memory. One, two, or three letters are presented on the screen for 1, 2, or 3 s, respectively. The task is to reproduce the letters on the keyboard after their disappearance. Following a successful reproduction, a row of 30 letters is presented. The task is to search this row as fast as possible for the occurrence of any of the critical letters, and each appearance is indicated by pressing a key. There may be anything from 0 to 3 critical letters in each row. Altogether there are 33 items, 11 for each number of search letters. The first trial at each level is regarded as practice.

Symbol Digit. SPES5 is a revised version of a traditional test of perceptual speed. In one row, a key to this coding task is given by the pairing of symbols with the randomly arranged digits 1 to 9. The task is to key in as fast as possible the digits corresponding to the symbols presented in random order in a second row. Each item consists of nine pairs of randomly arranged symbols and digits, and a total of ten items are presented. Performance is evaluated for the last six items of the test.

Digit Span. SPES6 is a traditional test of short-term memory capacity. Series of digits are presented on the screen. The digits are presented one at a time with a 1-s presentation time, and the task is to reproduce the series on the keyboard. Depending on the correctness of the answer, the length of the following series is either increased or decreased. The test starts with a series of three digits and is terminated after six changes from a correct to an incorrect answer.

Additions. SPES7 measures speed of simple mental arithmetic operations.
An addition task comprising three horizontally placed digits is presented on the screen for 1 s. The task is to add the digits as quickly as possible and to indicate the sum on the keyboard. The test includes a total of 43 items.

Digit Classification. SPES8 is a continuous choice reaction time (CRT) task. Digits ranging from 1 to 8 are presented one at a time on the screen. The task is to determine whether the digit presented is odd or even and to respond by pressing one of two appropriately marked keys. As soon as a response is given a new digit appears, and 240 digits are presented in all.

Digit Addition. SPES9 is a version of SPES8 requiring more complex processing of the signals. The digits are presented one at a time on the screen for 1.5 s at intervals of 1.8 s. The task is to add the digit currently presented to the previous digit and determine whether this sum is odd or even. The response is given by pressing one of two appropriately marked keys.

Verbal Reasoning. SPES10 measures the speed and accuracy of verbal reasoning. Sentences of varying syntactic complexity are presented on the screen. Each sentence describes a relation between the letters A and B, and it is followed by a combination of these letters. The task is to indicate with one of two keys whether the sentence gives a correct description of the relation between the letters A and B. There are 32 different items in a random series which is repeated twice.

Vocabulary. SPES11 is a traditional test of verbal understanding. The task is to indicate which of five alternatives is the synonym of a key word. A total of 45 items (15 nouns, 15 verbs, and 15 adjectives) are presented. The words have been selected from a 102-item vocabulary test which was distributed as a paper-and-pencil test to 164 subjects with varying educational background. The selection of words was made with the primary aim of achieving discriminatory power in a low-education group. The words are presented in ascending order of difficulty.

Finger Tapping Speed. SPES12:1 measures the maximum rate of repetitive movement. The task is to tap as rapidly as possible on a key on the keyboard with the index finger. The forearm is kept in a fixed position on the table. Eight 10-s trials, with a forced interval of 15 s, are performed while alternating between the preferred and nonpreferred hand. Four trials are given with each hand, and the first trial with each hand is regarded as a practice trial.

Finger Tapping Endurance. SPES12:2 is a version of SPES12 in which the change in tapping rate over time is assessed. The task is to tap as rapidly as possible with the index finger on a key.
A 1-min trial is performed with the dominant hand, and for each single tap the movement time and the resting time are registered separately. Performance is evaluated with respect to level and to changes over time.

Description of the Self-Rating Scales

Self-Rating of Performance SPES30. Within the system, it is possible to let the subject rate his performance directly after each test. In the standard version, the subject is asked to rate his actual performance in percent of his maximum performance. The question could, however, easily be rephrased.

Self-Rating of Mood SPES31. The scale consists of 12 mood-descriptive adjectives coupled to a six-category response scale. The response categories have verbal labels ranging from "not at all" to "very much." Ratings are given by typing the number of the appropriate response alternative. The questionnaire is based on two more comprehensive Swedish mood adjective check lists (Kjellberg and Bohlin, 1974; Sjoberg et al., 1979), each containing six subscales. Several authors have argued in favor of reducing these six dimensions of mood to two basic dimensions, an Activity or Energy dimension and a Stress or Tension dimension (Kjellberg and Bohlin, 1974; Sjoberg et al., 1979; Thayer, 1978; Watson and Tellegen, 1985). On the basis of previously reported factor analyses, six words were selected for each of the two dimensions. Words in the original questionnaires which have been found to be unfamiliar to, or at least unnatural to use by, nonstudent groups were excluded. A score in each subscale is computed as the mean of the ratings of the six adjectives in the scale.

Acute Symptoms SPES32. This questionnaire contains 17 items regarding symptoms of local irritation as well as symptoms from the CNS. The subject is asked to rate the present intensity of each symptom on a four-point scale.

Long-Term Symptoms SPES33. The questionnaire contains 38 items regarding a wide variety of symptoms, such as vegetative symptoms, concentration deficits, fatigue, tiredness, dizziness, and symptoms of peripheral neuropathy. The subject is asked to rate the frequency of occurrence of each symptom during the last six months on a four-point scale.

REFERENCES

Acker, W. 1983. A computerized approach to psychological screening: The Bexley-Maudsley Automated Psychological Screening and the Bexley-Maudsley Category Sorting Test. Int. J. Man-Machine Stud. 18:361-369.
American Psychological Association. 1986. Guidelines for computer tests and interpretations. Washington, D.C.
Anshelm Olson, B. 1982. Effects of organic solvents on behavioral performance of workers in the paint industry. Neurobehav. Toxicol. Teratol. 4:703-708.
Anshelm Olson, B. 1985. Early detection of industrial solvent toxicity. The role of human performance assessment. Arbete Halsa National Board Occupational Safety Health 21:1-59.
Anshelm Olson, B., F. Gamberale, and B. Gronqvist. 1981. Reaction time changes among steel workers exposed to solvent vapor. A longitudinal study. Int. Arch. Occup. Environ. Health 48:211-218.
Anshelm Olson, B., F. Gamberale, and A. Iregren. 1985. Coexposure to toluene and p-xylene in man: Central nervous functions. Br. J. Ind. Med. 42:117-122.
Astrand, I., and F. Gamberale. 1978. Effects on humans of solvents in the inspiratory air: A method for estimation of uptake. Environ. Res. 15:1-4.
Baker, E. L., R. Letz, and A. Fidler. 1985. A computer-administered neurobehavioral evaluation system for occupational and environmental epidemiology. J. Occup. Med. 27:206-212.
Bartram, D., and R. Bayliss. 1984. Automated testing: Past, present and future. Occup. Psychol. 57:221-237.
Bartram, D., J. G. Beaumont, P. Cornford, P. L. Dann, and S. Wilson. 1987. Recommendations for the design of software for computer based assessment: Summary statement. Bulletin for the British Psychological Society 40:86-87.
Beaumont, J. G. 1982. System requirements for interactive testing. Int. J. Man-Machine Stud. 17:311-320.
Beaumont, J. G. 1985. The effects of microcomputer presentation and response medium on digit span performance. Int. J. Man-Machine Stud. 22:11-18.
Beaumont, J. G., and C. C. French. 1987. A clinical field study of eight automated psychometric procedures: The Leicester/DHSS project. Int. J. Man-Machine Stud. 26:661-682.
Biersner, R. J. 1972. Selective performance effects of nitrous oxide. Human Factors 43:187-194.
Bittner, A. C., M. G. Smith, R. S. Kennedy, C. F. Staley, and M. M. Harbeson. 1985. Automated portable test (APT system). Overview and prospects. Behav. Res. Methods Instrum. 17:217-221.
Braconnier, R. J. 1985. Dementia in human populations exposed to neurotoxic agents: A portable microcomputerized dementia screening battery. Neurobehav. Toxicol. Teratol. 7:379-386.
Carr, A. C., R. J. Ancill, A. Ghosh, and A. Margo. 1981. Direct assessment of depression by microcomputer. A feasibility study. Acta Psychiatr. Scand. 64:415-422.
Cassitto, M. G. 1985. Review on recent developments and improvements of neuropsychological criteria for human neurotoxicity studies. Pp. 20-24 in Neurobehavioral Methods in Occupational and Environmental Health. Copenhagen: WHO.
Chapman, L. J., and J. P. Chapman. 1978. The measurement of differential deficit. J. Psychiatr. Res. 14:301-311.
Denner, S. 1977. Automated psychological testing: A review. Br. J. Soc. Clin. Psychol. 16:175-179.
Eckerman, D. A., J. B. Carrol, D. Foree, C. M. Guillon, M. Lansman, E. R. Long, M. B. Waller, and T. S. Wallsten. 1985. An approach to brief field testing for neurotoxicity. Neurobehav. Toxicol. Teratol. 7:387-393.
Elofsson, S. A., F. Gamberale, T. Hindmarsh, A. Iregren, A. Isaksson, I. Johnsson, B. Knave, E. Lydahl, P. Mindus, H. E. Persson, B. Philipson, M. Steby, G. Struwe, E. B. Soderman, A. Wennberg, and L. Widen. 1980. Exposure to organic solvents: A cross-sectional epidemiologic investigation on occupationally exposed car and industrial spray painters with special reference to the nervous system. Scand. J. Work Environ. Health 6:239-273.
Enander, A. 1987. Effects of moderate cold on performance of psychomotor and cognitive tasks. Ergonomics 30:1431-1445.
French, C. C., and J. G. Beaumont. 1987. The reaction of psychiatric patients to computerized assessment. Br. J. Clin. Psych. 26:267-278.
Gamberale, F. 1985. The use of behavioral performance tests in the assessment of solvent toxicity. Scand. J. Work Environ. Health (Suppl. 1):65-74.
Gamberale, F., and M. Hultengren. 1972. Toluene exposure. II. Psychophysiological functions. Work Environment and Health 9:131-139.
Gamberale, F., and M. Hultengren. 1973. Methyl-chloroform exposure. II. Psychophysiological functions. Work Environment and Health 10:82-92.

Gamberale, F., and M. Hultengren. 1974. Exposure to styrene. II. Psychological functions. Work Environment and Health 11:86-93.
Gamberale, F., and A. Kjellberg. 1983a. Behavioral performance assessment as a biological control of occupational exposure to neurotoxic substances. Pp. 111-121 in R. Gilioli, M. G. Cassitto, and V. Foa, eds. Neurobehavioral Methods in Occupational Health. Oxford: Pergamon Press.
Gamberale, F., and A. Kjellberg. 1983b. Field studies of the acute effects of exposure to solvents. Pp. 117-129 in The Neuropsychological Effects of Solvent Exposure, N. Cherry and A. Waldron, eds. Hampshire, England: The Colt Foundation.
Gamberale, F., and G. Svensson. 1974. The effect of anaesthetic gases on the psychomotor and perceptual functions of anaesthetic nurses. Work Environ. Health 11:108-111.
Gamberale, F., G. Annwall, and M. Hultengren. 1975a. Exposure to white spirit. II. Psychological functions. Scand. J. Work Environ. Health 1:31-39.
Gamberale, F., G. Annwall, and M. Hultengren. 1975b. Exposure to methylene chloride. II. Psychological functions. Scand. J. Work Environ. Health 2:95-103.
Gamberale, F., G. Annwall, and B. Anshelm Olson. 1976a. Exposure to trichloroethylene. III. Psychological functions. Scand. J. Work Environ. Health 4:220-224.
Gamberale, F., G. Annwall, and M. Hultengren. 1978. Exposure to xylene and ethylbenzene. III. Effects on central nervous functions. Scand. J. Work Environ. Health 4:204-211.
Gamberale, F., B. Anshelm Olson, P. Eneroth, T. Lind, and A. Wennberg. 1988a. Acute effects of ELF electromagnetic fields. A field study on linemen working at 400 kV. Solna, Sweden: National Institute of Occupational Health.
Gamberale, F., H. O. Lisper, and B. Anshelm Olson. 1976b. The effect of styrene vapour on the reaction time of workers in the plastic boat industry. Pp. 135-148 in Adverse Effects of Environmental Chemicals and Psychotropic Drugs, M. Horvath, ed. Amsterdam: Elsevier.
Gamberale, F., A. Kjellberg, and S. Razmjou. 1988b. The Effects of Unfavorable Thermal Conditions on Performance. Solna, Sweden: National Institute of Occupational Health.
Greenhouse, S. W., and S. Geisser. 1959. On methods in the analysis of profile data. Psychometrika 24:95-112.
Hagberg, M., and A. Iregren. 1984. Simple reaction time as a diagnostic aid in psycho-organic syndrome induced by organic solvents. Proceedings from the International Conference on Organic Solvent Toxicity, Stockholm, October.
Hedl, J. J., H. F. O'Neil, and D. N. Hansen. 1973. Affective reactions toward computer based intelligence testing. J. Consult. Clin. Psychol. 40:217-222.
Iregren, A. 1982. Effects on psychological test performance of workers exposed to a single solvent (toluene): A comparison with effects of exposure to a mixture of organic solvents. Neurobehav. Toxicol. Teratol. 4:695-701.
Iregren, A. 1986a. Subjective and objective signs of organic solvent toxicity among occupationally exposed workers. An experimental evaluation. Scand. J. Work Environ. Health 12:469-475.
Iregren, A. 1986b. Effects of industrial solvent interactions. Studies of behavioral effects in man. Arbete Halsa National Board Occupational Safety Health 11:1-60.
Iregren, A., F. Gamberale, and A. Kjellberg. 1985. A microcomputer based behavioral testing system. Pp. 75-80 in Neurobehavioral Methods in Occupational and Environmental Health. Copenhagen: WHO.
Iregren, A., T. Akerstedt, B. Anshelm Olson, and F. Gamberale. 1986. Experimental exposure to toluene in combination with ethanol intake. Psychophysiological functions. Scand. J. Work Environ. Health 12:128-136.
Iregren, A., O. Almkvist, M. Klevegard, and U. Aslund. 1987. A clinical validation of six computerized tests for diagnosing solvent caused occupational illness (in Swedish). Arbete Halsa National Board Occupational Safety Health 13:1-37.
Irons, R., and P. Rose. 1985. Naval biodynamics laboratory computerized cognitive testing. Neurobehav. Toxicol. Teratol. 7:395-397.
Kjellberg, A., and O. Bohlin. 1974. Self-reported arousal: Further development of a multifactorial inventory. Scand. J. Psychol. 15:285-292.
Kjellberg, A., and M. Strandberg. 1979. The effects of anaesthetic gases on reaction time of anaesthetic nurses. Report No. 11. Solna, Sweden: National Board of Occupational Safety and Health.
Kjellberg, A., and P. Wide. 1988. Effects of simulated ventilation noise on performance of a grammatical reasoning task. Proceedings of the 5th International Congress on Noise as a Public Health Problem, Stockholm.
Kjellberg, A., and H. Wisung. 1987. Some metrical properties in a computer administered test battery for use in behavioral toxicology. Report No. 1. Solna, Sweden: National Board of Occupational Safety and Health.
Kjellberg, A., B. Wigaeus, J. Engstrom, I. Astrand, and B. Ljungquist. 1979. Long-term effects of exposure to styrene in a polyester plant. Arbete Halsa National Board Occupational Safety Health 18:1-25.
Knave, B., B. Anshelm Olson, S. Elofsson, F. Gamberale, A. Isaksson, P. Mindus, H. E. Persson, G. Struwe, A. Wennberg, and P. Westerholm. 1978. Long term exposure to jet fuel. A cross sectional epidemiologic investigation on occupationally exposed industrial workers with special reference to the nervous system. Scand. J. Work Environ. Health 4:19-45.
Knave, B., F. Gamberale, S. Bergstrom, E. Birke, A. Iregren, B. Kolmodin Hedman, and A. Wennberg. 1979. Long-term exposure to electric fields. A cross-sectional epidemiologic investigation of occupationally exposed workers in high-voltage substations. Scand. J. Work Environ. Health 2:115-125.
Laursen, P., and T. Jorgensen. 1985. Computerized neuropsychological test system. In Neurobehavioral Methods in Occupational and Environmental Health. Copenhagen: WHO.
Letz, R., and E. Baker. 1986. Computer-administered neurobehavioral testing in occupational health. Sem. Occup. Med. 1:197-203.
Lisper, H. O., and A. Kjellberg. 1972. Effects of 24-hour sleep deprivation on rate of decrement in a 10-minute auditory reaction time task. J. Exp. Psychol. 96:287-290.
Lucas, R. W. 1977. A study of patient attitudes to computer interrogation. Int. J. Man-Machine Stud. 9:69-96.
Lukin, M. E., E. Dowd, B. S. Plake, and R. Kraft. 1985. Comparing computerized versus traditional psychological assessment. Computers in Human Behavior 1:49-58.
Mahoney, E. C., P. A. Moore, E. L. Baker, and R. Letz. 1988. Experimental nitrous oxide exposure as a model system for evaluating neurobehavioral tests. Toxicology 49:449-457.
Matarazzo, J. D. 1983. Computerized psychological testing. Science 221:323.
McArthur, D. L., and B. H. Choppin. 1984. Computerized diagnostic testing. J. Educational Measurement 31:391-397.
Rodnitzky, R. L., H. S. Levin, and D. L. Mick. 1975. Occupational exposure to organophosphate pesticides. A neurobehavioral study. Archives of Environmental Health 30:98-103.
Roels, H., R. Lauwerys, J. P. Buchet, P. Genet, M. J. Sarhan, I. Hanotiau, M. deFays, and D. Stanescu. 1987. Epidemiological survey among workers exposed to manganese: Effects on lung, central nervous system and some biological indices. Am. J. Ind. Med. 11:307-327.

Sjoberg, L., E. Svensson, and L. O. Persson. 1979. The measurement of mood. Scand. J. Psychol. 20:1-18.
Soderman, E., A. Kjellberg, B. Anshelm Olson, and A. Iregren. 1982. Standardization of a simple reaction time test for use in behavioral toxicology. Report No. 27. Solna, Sweden: National Board of Occupational Safety and Health.
Space, L. G. 1981. The computer as psychometrician. Behav. Res. Methods Instrum. 13:595-606.
Thayer, R. E. 1978. Toward a psychological theory of multidimensional activation (arousal). Motivation and Emotion 2:1-34.
Thompson, J. A., and S. L. Wilson. 1982. Automated psychological testing. Int. J. Man-Machine Stud. 17:279-289.
Watson, D., and A. Tellegen. 1985. Toward a consensual structure of mood. Psychol. Bull. 98:219-235.
Weiss, D. J., and C. D. Vale. 1987. Adaptive testing. Appl. Psych. 36:249-262.
Wigaeus-Hjelm, E., M. Hagberg, A. Iregren, and A. Lof. 1990. Exposure to methyl isobutyl ketone (MIBK). Toxicokinetics and occurrence of irritative and CNS symptoms in man. International Archives of Occupational and Environmental Health. In press.
World Health Organization. 1987. Prevention of Neurotoxic Illness in Working Populations, B. L. Johnson, ed. New York: John Wiley & Sons.