

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 139
APPENDIX E

A Literature Review of Field Studies and Spatial Analyses for Hotspot Identification of Wildlife–Vehicle Collisions

The literature description for this appendix contains stand-alone references. The literature review of all other appendices is contained in the References section.

I. Wildlife–Vehicle Collision Analysis

Allen, R.E., and McCullough, D.R. 1976. Deer-car accidents in southern Michigan. Journal of Wildlife Management 40(2):317–321.
Bashore, T.L., Tzilkowski, W.M., and E.D. Bellis. 1985. Analysis of deer-vehicle collision sites in Pennsylvania. Journal of Wildlife Management 49(3):769–774.
Bellis, E.D., and H.B. Graves. 1971. Deer mortality on a Pennsylvania interstate highway. Journal of Wildlife Management 35(2):232–237.
Biggs, J., Sherwood, S., Michalak, S., Hansen, L., and C. Bare. 2004. Animal-related vehicle accidents at the Los Alamos National Laboratory, New Mexico. The Southwestern Naturalist 49(3):384–394.
Caryl, F.M. 2003. Ungulate mortality on a forested highway. University of East Anglia, Norwich. M.Sc. dissertation. 42 pp.
Finder, R.A., Roseberry, J.L., and A. Woolf. 1999. Site and landscape conditions at white-tailed deer/vehicle collision locations in Illinois. Landscape and Urban Planning 44:77–85.
Gundersen, H., and H.P. Andreassen. 1998. The risk of moose-collision: a logistic model for moose-train accidents. Wildlife Biology 4(2):103–110.
Gundersen, H., Andreassen, H.P., and T. Storaas. 1998. Spatial and temporal correlates to Norwegian train-moose collisions. Alces 34:385–394.
Hubbard, M.W., Danielson, B.J., and R.A. Schmitz. 2000. Factors influencing the location of deer-vehicle accidents in Iowa. Journal of Wildlife Management 64(3):707–713.
Joyce, T.L., and S.P. Mahoney. 2001. Spatial and temporal distributions of moose-vehicle collisions in Newfoundland. Wildlife Society Bulletin 29(1):281–291.
Kassar, C., and J.A. Bissonette. 2005. Deer-vehicle crash hotspots in Utah: data for effective mitigation. UTCFWRU Project Report No. 2005(1):1–28. Utah Cooperative Fish and Wildlife Research Unit, Utah State University, Logan, Utah.
Malo, J.E., Suarez, F., and A. Diez. 2004. Can we mitigate wildlife–vehicle accidents using predictive models? Journal of Applied Ecology 41:701–710.
Nielsen, C.K., Anderson, R.G., and M.D. Grund. 2003. Landscape influences on deer-vehicle accident areas in an urban environment. Journal of Wildlife Management 67(1):46–51.
Nielsen, S.E., Herrero, S., Boyce, M.S., Mace, R.D., Benn, B., Gibeau, M.L., and S. Jevons. 2004. Modelling the spatial distribution of human-caused grizzly bear mortalities in the Central Rockies ecosystem of Canada. Biological Conservation 120:101–113.
Premo, D.B.P., and E.I. Rogers. 2001. Town of Amherst deer-vehicle accident management plan. White Water Associates, Inc., Amasa, Michigan (www.white-water-associates.com).
Rogers, E. 2004. An ecological landscape study of deer-vehicle collisions in Kent County, Michigan. Report for the Michigan State Police, Office of Highway Safety and Planning. White Water Associates, Inc., Amasa, MI 49903. 56 pp.
Romin, L.A. and J.A. Bissonette. 1996. Temporal and spatial distribution of highway mortality of mule deer on newly constructed roads at Jordanelle Reservoir, Utah. The Great Basin Naturalist 56(1):1–11.
Seiler, A. 2005. Predicting locations of moose-vehicle collisions in Sweden. Journal of Applied Ecology 42:371–382.
Simek, S.L., Jonker, S.A., and Mark J. Endries. 2005. Evaluation of principal roadkill areas for Florida black bear. ICOET 2005.
Singleton, P.H., and J.F. Lehmkuhl. 1999. Assessing wildlife habitat connectivity in the Interstate 90 Snoqualmie Pass Corridor, Washington. ICOWET III.

II. Spatial Analysis Techniques

Boots, B.N. and A. Getis. 1988. Point Pattern Analysis. Sage Publications, Inc. Newbury Park, California. 85 pp.
Burka, J., Nulph, D., and A. Mudd. 1997. Technical approach to developing a spatial crime analysis system with ArcView GIS. INDUS Corporation and U.S. Department of Justice.
Lee, J., and D.W.S. Wong. 2001. Statistical Analysis with ArcView GIS. John Wiley and Sons, Inc., New York, New York. 192 pp.
Levine, N., Kim, K.E., and L.H. Nitz. 1995. Spatial analysis of Honolulu motor vehicle crashes: I. Spatial Patterns. Accident Analysis and Prevention 27(5):663–674.
Levine, N. 1996. Spatial statistics and GIS: software guides to quantify spatial patterns. Journal of the American Planning Association 62(3):381–391.
Levine, N. 1999. Quickguide to CrimeStat. Ned Levine and Associates, Annandale, VA.
Levine, N. 2004. CrimeStat III: Distance analysis. Chapter 5 in: A spatial statistics program for the analysis of crime incident locations. Ned Levine & Associates: Houston, Texas, and the National Institute of Justice, Washington, D.C., USA.

Spooner, P.G., Lunt, I.D., Okabe, A. and S. Shiode. 2004. Spatial analysis of roadside Acacia populations on a road network using the network K-function. Landscape Ecology 19:491–499.

I. Wildlife–Vehicle Collision Analysis

Allen, R.E., and D.R. McCullough. 1976. Deer-car accidents in southern Michigan. Journal of Wildlife Management 40(2):317–321.

Objective: to identify the time, place, and characteristics of traffic and deer that contribute to collisions. It was hoped that such an understanding would suggest measures to reduce collisions
Data layers: 10 counties in S. Michigan; data on DVC from accident reports, 1966-1967
  Variables analyzed for all accidents: date, day of week, time, speed of car, sex of deer, road type
  Added to 1967 data: location within 0.16 km from a landmark; number deer seen at time of accident; fate of deer involved; whether car driven or towed away; extent of injuries
  Traffic volume data from MI Dept of State Highways: average traffic volume for various time intervals (hourly, daily, monthly)
Analyses: 3 areas from highest accident roads chosen for habitat analysis: all accidents plotted on aerial photos and roadside habitat classified as cropland, forest, or unimproved field
Results: most accidents occurred between 1600–0200; 2 peaks, sunrise and 1-2 hours after sunset; traffic volume and DVC correlated for evening and nighttime hours (85% of variation in DVC accounted for by traffic volume)
  Number of accidents and traffic volume highest on weekends; largest number of DVC in fall and early winter
  In 3 sections where habitat determined, accidents and habitat types occurred in similar proportions
  % of accidents increased up to a speed of 80-95 kph, then declined at higher speeds

Bashore, T.L., Tzilkowski, W.M., and E.D. Bellis. 1985. Analysis of deer-vehicle collision sites in Pennsylvania. Journal of Wildlife Management 49(3):769–774.

Objectives: examined roadkill locations plotted on highway maps by PA Game Protectors since 1968. A cursory exam revealed that deer kills tend to be aggregated at specific sites where accidents occur year after year. Analyzed aerial photos and highway and topo maps and conducted field studies to determine which factors characterize concentrations of collisions at particular sites. A model was developed to predict probabilities that a section of highway would be a high kill site and then tested for reliability.
4 PA counties studied, used 2-lane hard-top roads, 51 paired sites (kill and control), data collected from 1 July–30 Oct 1979 and 27 June–1 Oct 1980
Data layers: residences (number/ha); commercial buildings (number/ha); other buildings (number/ha); banks (prop. of terrain elevated more than 1 m above road surface); gullies (more than 1 m below road surface); level (not bank or gully); wooded; non-wooded; barren; distance to woodland; increasing slope; decreasing slope; no slope; angular visibility; in-line visibility; shortest visibility; speed limit; fencing; guardrails
Data obtained by selecting a random point from within each 100 m interval of site length and running a 100 m transect perpendicularly from each side of the road; at sites shorter than 100 m, two points were randomly selected
Analysis: stepwise logistic regression used to test the importance of the variables used in the model; 5 pairs of sites randomly selected for a test of the model's predictive ability
Results: 9 of 19 variables selected for inclusion in model (residences, commercial buildings, other buildings, shortest visibility, in-line visibility, speed limit, distance to woodland, fencing, non-wooded area); 85% of kill locations had a prob. of 0.70 or greater of being classified as kill site; 89% of control had a prob. of 0.30 or less of being classified as kill site
  High correlation between speed limit and in-line visibility; between residences and other buildings; removal of correlated variables did not significantly change model
  9 and 7 variable models performed equally well in predicting kill and non-kill sites; 5 kill locations correctly classified, one control location misclassified by both models
Discussion: DVCs not random in time or space; kills aggregated

Bellis, E.D., and H.B. Graves. 1971. Deer mortality on a Pennsylvania interstate highway. Journal of Wildlife Management 35(2):232–237.

Objective: to present the results of an analysis of data on highway mortality collected from November 1968 through December 1969
Data layers: data collected from an 8.03-mile section of I-80; divided into 212 contiguous sectors of 200 ft length
  Kill data obtained from game protector who filled out researcher-supplied data sheets (date, location by sector number, highway lane, sex, age class); it was understood that many kills were probably not reported
  5 portions of each of the 212 sectors analyzed for physical and vegetation factors that might affect deer mortality: planted ROW on each side of highway; area adjacent to ROW on each side of highway; median strip
  Factors used in the analyses: quality and amount of vegetation; topography; area of ROW; presence of fences or guardrails
  Deer counts obtained from May 68–May 69 by spotlighting from vehicle
Results: 286 reported DVC; 67.9% of sectors had at least one DVC (max of 9 in one sector); roadkills often concentrated in groups of contiguous sectors
  70% of deer seen through spotlighting were grazing (conservative estimate); suggests presence and type of vegetation within sectors accounted for much of variation in numbers killed
  Low correlation between DVC and all measured variables--demonstrated that with our technique we could not account for the variation in numbers of deer killed
  Examined data in a less analytical manner by considering combos of sectors in relation to overall topography
    High mortality: (1) where sections of road lay in troughs formed by elevated median strips with steep banks and steep inclines on ROW; (2) where troughs terminated by reductions in elevation of the median strips allowing deer to easily cross road; (3) both sides of highway and median strip had good grazing and relief relatively flat
    Low mortality: (1) area with low relief, abundant food on ROW and chain link fence; (2) ROW declines sharply to a stream or other lowland area, guardrails present
  High correlation between number killed per month and number seen per month
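
The sector bookkeeping in the Bellis and Graves (1971) entry (an 8.03-mile section divided into 200 ft sectors, with kills recorded by sector number) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names and the example kill distances are hypothetical, and only the section length and sector length come from the entry.

```python
import math
from collections import Counter

FEET_PER_MILE = 5280
SECTION_MILES = 8.03   # study section length reported in the entry
SECTOR_FEET = 200      # sector length reported in the entry


def sector_count(section_miles=SECTION_MILES, sector_feet=SECTOR_FEET):
    """Number of contiguous sectors needed to cover the section."""
    return math.ceil(section_miles * FEET_PER_MILE / sector_feet)


def sector_of(distance_feet, sector_feet=SECTOR_FEET):
    """1-based sector number for a kill located distance_feet from the start."""
    return int(distance_feet // sector_feet) + 1


def summarize(kill_distances_feet):
    """Return kills per sector, share of sectors with a kill, and the per-sector maximum."""
    per_sector = Counter(sector_of(d) for d in kill_distances_feet)
    share_with_kill = len(per_sector) / sector_count()
    worst = max(per_sector.values(), default=0)
    return per_sector, share_with_kill, worst


# Hypothetical kill locations (feet from the start of the section), for illustration only.
example_kills = [50, 120, 199, 200, 4010, 4150, 42000]
per_sector, share_with_kill, worst = summarize(example_kills)
```

With the reported figures, `sector_count()` gives 212, which matches the number of sectors in the entry; the share of sectors with at least one kill and the per-sector maximum are the same kinds of summary the entry reports (67.9% and 9).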

Biggs, J., Sherwood, S., Michalak, S., Hansen, L., and C. Bare. 2004. Animal-related vehicle accidents at the Los Alamos National Laboratory, New Mexico. The Southwestern Naturalist 49(3):384–394.

Objectives: to (1) analyze wildlife–vehicle accident data with respect to time, season, location, and species for accidents occurring on LANL internal and perimeter roads and (2) perform an analysis of site characteristics at accident locations identified as hotspots.
Data layers: ~68 km of primary rd included in study; majority of traffic volume in early morning and late afternoon; WVC data from 1990–1999
  Accident data: from NMDGF, LAPD and LANL security force reports (date, time, location, species, cost of damage, injuries to humans, injuries to animals); accident locations recorded into GIS (sometimes based on approximate description of site)
  Hotspot characterization data: vegetation characteristics (dominant tree, shrub, forbs and grass sp.), posted speed limit, road type (straight, curve, hill), presence of lighting, amount of available light (high, mod, low, none), presence and length of guardrails, height of fencing, slope characteristics, motorist visibility distance
Analyses: Cluster analysis using GIS nearest neighbor index approach used to determine whether accidents were distributed randomly; deer and elk examined separately
  Density analysis using the "simple" type calculation of the GIS program and a search radius of 100 m applied to identify accident hotspots
  Accident site characterization analysis: 15 hotspots selected, with 15 paired control sites; 100 m transect centered on site placed parallel to road on either side, six 15 m transects (at 25, 50, 75 m marks) placed perpendicular to 100 m transects; hotspot characterization data recorded along 15 m transects
  Statistical analyses: χ² used to test for differences in accident counts between seasons
    Exact binomial tests: to determine if differences occurred between the numbers of accidents in different pairs of seasons; also if significant difference occurred among hourly counts of accidents; deer and elk analyzed separately
    Poisson regression: if accident count in given year significant difference from other years; also to test if association between monthly accident counts and monthly snowfall amounts significant; deer and elk analyzed separately
    Logistic regression: to model status of an area as a hotspot or control as a function of measured variables
    Fisher's exact test: if diff in recoded variables between hotspots and controls were statistically significant
    Univariate logistic regression: as first step to identify potential predictors for a larger model; potential predictor variables chosen if (1) absolute value of the Wald statistic > 1, (2) lit search revealed potential importance, or (3) authors thought important
Results: seasonal peaks in DVC (fall) and E(lk)VC (winter, fall); most accidents in late afternoon and evening hours
  Cluster analysis: EVC and DVC did not occur randomly
  Density analysis: identified several areas with higher concentrations of accidents
  Accident site characteristics: when considered 1 at a time, no variable measured was a statistically significant predictor of hotspot or control status; variables chosen as predictors in final model were ln(average number woody stems > 2 m in height), and maximum slope
Discussion: ambiguous relationship between accidents and snowfall might derive from our pooling of snowfall and accident data by month instead of using daily snowfall measurements and accident counts
  Poor results with utility of different variables could be result of small sample size
  Because of small sample size, placed a higher priority on finding a well-fitting model that made sense rather than on finding one that was statistically significant

Caryl, F.M. 2003. Ungulate mortality on a forested highway. University of East Anglia, Norwich. M.Sc. Dissertation (copyrighted) 42 pp.

Objective: to produce a multi-species empirical model of ungulate road mortality using highly accurate spatial data from field and GIS based sources in Kootenay National Park, British Columbia, Canada. It would then be determined if the model could be reproduced using GIS-based variables only, to provide a quick and effective management guide to focus mitigation efforts at high risk locations
Data layers: species included moose, mule deer, white-tailed deer, elk and bighorn sheep
  Ungulate mortality data: date, number, species, sex, age, location from GPS
  Control site data: randomly selected non-kill sites along highway; ratio of control sites to kill sites larger than one desirable due to greater expected variation in control environmental attributes
  Field-based environmental variables: distance to cover (> 1 m tall and continuous), % cover forest; % cover shrub; % cover herb; % cover bare ground; roadside slope; verge slope; adjacent land slope; inline visibility; angular visibility
  GIS-based variables (using ArcView): elevation; distance to hydrology; distance to human use; road sinuosity ratio; change in elevation; habitat importance for deer, moose and elk; barrier
Analysis: Spearman's rho correlations used to screen for multicollinearity, removed one highly correlated biophysical variable before model development; differences between seasons compared using χ² tests
  Model development: logistic regression, stepwise selection process using log likelihood ratio tests and a prob. value of 0.05 for entry and removal of variables to the model; selection process then repeated using only GIS-based variables; χ² used as a goodness-of-fit test of model appropriateness; Wald stats to test the significance of independent variables; direction of predictor influence verified using Mann-Whitney U tests; odds ratios examined to assess contribution that a unit increase in predictor variable made to outcome probability
  Model validation: 5 control and 5 kill sites randomly chosen to validate model's predictive ability; 0.29 chosen as classification cut-off for predicted group memberships based on number of kill sites vs control sites; predicted probabilities classified into 3 groups: low, moderate and high risk of kill
Results: Kill sites highly aggregated; highly significant seasonal differences (high in summer); roadkills positively associated with daily traffic volumes
  3 of 17 environmental variables shown to be reliable predictors: distance to humans, elevation, distance to cover; all had negative coefficients; 67.3% kill sites, 64.3% control sites predicted correctly, giving overall success of 65.2%

  2 of 9 GIS variables shown to be reliable predictors: distance to humans, elevation; both had negative coefficients; 61.5% kill sites, 65.1% control sites predicted correctly, giving overall success of 64%
  Model testing: 3 of 5 control sites, 4 of 5 kill sites (70% of sites) correctly classified using GIS model
  Probability surface with low, moderate and high collision probability created using 2 GIS variables
Discussion: kill sites not located randomly in time or space; kill sites found at lower elevations than control sites and closer to human use areas; odds of a kill decreased by 96% with each additional 1000 m elevation above sea level if all other variables controlled; odds of a kill decreased by 40% with each additional 1 km distance from human disturbance sites; probability surface showed a close agreement with observed roadkill locations, would be useful guide for quick assessment of possible high risk locations

Finder, R.A., Roseberry, J.L., and A. Woolf. 1999. Site and landscape conditions at white-tailed deer/vehicle collision locations in Illinois. Landscape and Urban Planning 44:77–85.

Objective: to determine if high deer/vehicle accident locations could be predicted from remotely sensed land use/land cover patterns
Data layers: 98 counties in IL; 1989–1993; rd segments with high concentrations of DVAs on state marked routes; n=86 locations with 15 reported DVAs analyzed
  Plotted hotspot road segments using TIGER data files and Map and Image Processing System GIS software; road segments adjusted to 1.3 km; random locations on same route of same length for control purposes
  Landcover classification (Landsat TM): crops, forest, grass, water, developed, orchards; topographic physical features from aerial photographs and topographic maps
Analyses: 0.8 km radius buffer zone around each road segment to quantify and compare landscape composition and pattern using FRAGSTATS
  Simple correlation used to investigate relations among variables; highly correlated variables eliminated
  t-tests to determine if variable means differed between hotspots and controls; those indices with |t| values 3.0 selected as predictor variables for logistic regression
  Stepwise logistic regression model selection process to obtain a preliminary equation
  AIC to compare models
  Stepwise selection process repeated using only landscape indices, satellite imagery data only
  5 paired sites used to test models' predictive abilities
Results: variables included in model 1: % distant woody cover, % adjacent gully; natural log of area of recreational land within buffer, natural log of width of corridors crossing road; of 10 samples to test model validity, 5 control and 4 hotspots correctly predicted
  Variables included in model 2: Simpson's diversity index; natural log of woods mean proximity index; of 10 samples to test model validity, 4 control and 2 hotspots correctly predicted
Discussion: study demonstrated that DVA site statistics and RS habitat and highway data can be used to predict DVA locations

Gundersen, H., and H.P. Andreassen. 1998. The risk of moose-collision: a logistic model for moose-train accidents. Wildlife Biology 4(2):103–110.

Objective: to use a logistic model to establish the most risky train departures for Rørosbanen railway which has the highest risk of moose-train collisions per km in Norway (Gundersen et al. 1998). In the model we have included speed of train, type of train, time of day and lunar phase, besides climatic covariables known to be correlated with moose-train collisions.
Data layers: success (1) or failure (0) of train hitting moose; train departures; train route; train predictor variables (average speed, train type, time of day); daily average temperature; snow depth; lunar phase; data recorded from Dec-Mar 1990-1997
Analysis: a logistic model was applied incorporating the above variables; the most parsimonious model chosen using AIC; another model made containing only passenger trains running the whole distance between 2 towns, snow depth, daily average temp, lunar phase, train speed, time of day left station; used data from 1990–1996 to predict the number of train-killed moose for each train for the winter of 1996/1997
Results: most parsimonious model included route, time of day, lunar phase, snow depth, temp; according to AIC, this model is indistinguishable from one including average train speed; best model for the second analysis included all predictor variables. Problems with morning train results in second analysis due to introduction of logging activity in Storholmen in 1996. Second model gave good results for morning train after removing 6 collisions at Storholmen
Discussion: authors introduced a new approach to study game-vehicle accidents by focusing on factors that cause vehicles to collide with game rather than focusing on the factors that cause game to be close to traffic arteries.

Gundersen, H., Andreassen, H.P., and T. Storaas. 1998. Spatial and temporal correlates to Norwegian train-moose collisions. Alces 34:385–394.

Objectives: In this study we reveal how temporal variation, i.e. climatic factors and moose population density, and spatial variation, i.e. landscape pattern and changes in food availability, correlate with moose–train collisions along the railway in Norway which is most burdened by wildlife collisions.
Data layers: train kills (time, location to nearest 100 m)
  daily average temperature, snow depth
  size of moose pop estimate by population model Cersim (based on observations by hunters in previous season)
Analysis: 2 categories of analysis--temporal factors (climatic and pop density) and spatial factors (landscape patterns and food availability)
  Temporal factors: compared the freq. distribution of days w/ certain weather conditions (expected) w/ the freq. distribution of collisions at the various weather conditions (observed) by a goodness-of-fit test. GLMs used to correlate moose pop size and number of collisions
  Spatial factors (regional): analyzed correlation between landscape patterns and number of collisions per 1 km segment
    Landscape patterns: (1) topography--measured as the difference in height from the bottom of the valley to the highest point within 2.5 km to East and West of line and averaged; (2) distance to the nearest side valley--because assumed to channel moose migratory behavior
    Tested and corrected for autocorrelation in 1 km segments
  Spatial factors (local): to explain spatial variation of collisions, compared number of collisions before and after changes in food availability due to logging activity in two areas

    One area had increased food availability while the second had decreased food availability
    Linear model including factors that significantly correlated to yearly variation in collisions (climatic factors and population density) used to obtain estimate of expected number of collisions before and after change in food availability; expected vs observed compared with goodness-of-fit test
Results: Temporal effects: number of collisions associated with both temp and snow depth; combined temp and snow depth into variable (accidental period) which started when snow depth exceeded 30 cm and lasted until temp stabilized above 0 degrees C. Number of days in new variable explained 83% of yearly variation in number of moose collisions; GLM including both accidental period and pop density explained 88% of yearly variation
  Spatial effects: significant negative correlation between number of collisions and distance to nearest side valley; no association between number of collisions and topography; changes in food availability strongly associated to number of collisions
Discussion: moose usually killed in winter on days with lots of snow and low temps; influenced by migratory routes to lower elevations and availability of food; temporal variation due to climatic factors, spatial variation due to migratory routes and food availability.

Hubbard, M.W., Danielson, B.J., and R.A. Schmitz. 2000. Factors influencing the location of deer-vehicle accidents in Iowa. Journal of Wildlife Management 64(3):707–713.

Objective: to examine the influence of highway and landscape variables on the number of DVAs in Iowa
Data layers: number of DVAs, traffic and landcover data obtained for all milepost markers within the state (n = 9,575)
  GIS maps of habitat (Landsat imagery, 1990–1992): collapsed habitat types into cropland, woody cover, grass, artificial, water and miscellaneous
  White-tailed deer harvest numbers for each county from IDNR; DVAs from 1990–97 for state's highways from IDOT; traffic volume estimates linked to all accident sites
  For each DVA included: distance to nearest town or city, distance to nearest city with pop > 2,000, number of bridges, number of lanes of traffic
  Accident location often recorded to nearest 0.10 mile, but more than 33% recorded at milepost, therefore all locations collapsed to nearest milepost
Analysis: dependent variable: number of accidents in each 1.61 km segment from 1990–97
  Randomly selected sample sites (n = 1,284); clipped 2.59 km² landscape section with sample site
  FRAGSTATS used to characterize landscape sections; linked to number of DVAs for segments
  DVAs separated into 2 categories (0-13, > 14 hits/segment) based on natural break and sample size
  Logistic regression: to examine relationship of DVAs to traffic, highway characteristics, human pop centers, and landscape variables; stepwise selection procedure for variable inclusion during model development
  Factor classification tree (FCT) constructed to refine ability to select high DVC areas
    Robust relative to a more standard cluster and discrimination analysis that might be used (Emmons et al. 1999)
    Followed method described by Venables and Ripley (1994) to find the rooted subtree with a minimum AIC
  Evaluated performance of logistic regression model by applying to a randomly selected set of DVA sites not included in model development (n = 245)
Results: 67% of 9,575 mileposts associated with DVA; > 25% DVAs at 3.4% of mileposts (325)
  Significant 6-variable model produced; 4 variables landscape features, 2 highway characteristics
  FCT final classification produced a tree with 57 nodes and a misclassification rate of 0.153%; bridge frequency was best predictor of high DVA sites
  Model validation: correctly classified 63.3% of sites
Discussion: edges not found to be important, however it is possible that the resolution of our data was too coarse to identify all edges used by deer as travel corridors. Bridges always indicate points where major edge-creating landscape features intersect roadways, and therefore may be better predictors of concentrations of DVAs than more broadly defined edge indices

Joyce, T.L., and S.P. Mahoney. 2001. Spatial and temporal distributions of moose-vehicle collisions in Newfoundland. Wildlife Society Bulletin 29(1):281–291.

Objective: "we . . . relate rate & severity of human injury to time of accident, road conditions, road alignment, vehicle speed (via posted speed limits), number of vehicle occupants, and sex/age of moose struck . . . to develop measures to reduce MVCs and severity of injuries"
  MVC reports from conservation officers and RCMP from 1988–1994; accidents generally reported if damage > $1000 CD ($500 CD before 1991).
  Spatial analyses: N = 1690 MVCs on Trans-Canada Highway, mapped/digitized, divided 900 km of TCH into ninety 10 km sections. Categorized each section by
    Annual average MVCs (< 1.75 = low, 1.75-3 = medium, and > 3 MVCs = high)
    Moose density ( 2 = high)
    Traffic volume (low vs. high)
    Results:
      Areas of low or high moose densities experienced greater probabilities of MVCs than areas of moderate moose densities.
      Higher probability of MVCs in areas with high traffic volumes, regardless of moose densities
  MVCs and human injuries analyses: log linear modeling to evaluate effects of the following variables on severity of human injuries (low vs. fatalities)
    Darkness (day vs. dusk/dawn/dark)
    Road condition (wet-slick vs. dry)
    Road alignment (straight or curved)
    Vehicle occupants (driver only vs. driver + passengers)
    Posted speed limits ( 80 km/h)
    Passenger vehicles only (made up 89.5 of all reported collisions and had most serious injuries/fatalities)
    Determined influence of each variable on injury severity through forward model selection from main effects to saturated model; used log-likelihood value (G-sq) of main effects model (included all variables) as baseline against which all other parameters were judged; each 2-way interaction was then added to main effects model and tested. Deviance between baseline value and derived G² stat measured importance of that parameter to model. Excluded parameters with small deviation, determined at 95% confidence level.
    Results:
      No significant relationship between road alignment and accident severity (though 79% of accidents occurred on straight vs. curved roads)
      Model 1 (injury, darkness, speed): Light condition and posted speed limit related to severity, and mutually independent--risk 2.1x greater at night and 2x higher at highway (high) vs. non-highway (low) speeds.
      Model 2 (injury, road condition, occupants): more accidents occur than expected when passengers were in vehicles on dry roads, but not when there were no passengers or wet roads. Risk of severe injury or death 2x higher w/ at least 1 passenger present compared to driver only.
  Also looked at temporal and age/sex influence . . .
    Moose calves more likely to be involved Aug–Oct, yearlings in June/July, and adults in July–Aug.
    More bull moose involved than exp.
    No significant relationship between diurnal patterns and sex or age.
    Injury 6x more likely in collision w/ adult than calf
  Discussion points of interest beyond results:
    "Bashore et al. 1985 found positive relationship b/w speed limit, driver in-line site visibility along road, and number of collisions (see also Poll 1989)."
    "Damas & Smith 1982 estimate night speeds have to be reduced to 60 km/hr or less under low-beam light to sufficiently expand stopping distances and prevent accidents...enforcement key and difficult/enormous . . . most effective measure may be with drivers"
    Lavsund and Sandegren 1991. Moose vehicle collisions in Sweden, a review. Alces 27:118–126.
    "Lavsund and Sandegren (1991) found 3x increase in severity of injury for vehicles moving 70–90 km/hr compared to lesser rates"
    Discussed PR programs, mentioned Terra Nova National Park in Canada--long running program (12 yrs as of 2001) involving using moose silhouettes and posting of number of MVCs/year; have found Newfoundland drivers perceive Terra Nova National Park as area with greatest number and risk of MVCs. See Hardy R.A. 1984. Resource management plan for MVCs, internal Terra Nova NP Parks Canada report.

Kassar, C., and J.A. Bissonette. 2005. Deer-vehicle crash hotspots in Utah: data for effective mitigation. UTCFWRU Project Report No. 2005(1):1–28. Utah Cooperative Fish and Wildlife Research Unit, Utah State University, Logan, Utah.

The data originate from collision reports prepared by law enforcement officers and provided to UDOT by the Utah Department of Public Safety.
[...] core area, the road segment where collisions per mile are most concentrated; and (2) a mitigation zone, buffering segments on each side of the core where appropriate mitigation actions can account for animal movement and behavior and help avoid the "end of the fence" problem.
Summary of results: 24,299 WVC over 11 years; 99.6% had dates and years associated with them; average of 2,202 (2,025–2,577) collisions per year; 12 routes had high DVC rate over entire length (≥ 10/mile); 16 with moderate (5–9.99/mile); 148 with low rates (> 0–4.99); 65 routes with no reported DVC; 7 with data unavailable; 54.6% of all collisions occurred on 10 routes
Collision frequency: 0–21.27 per mile; 1/3 occurred Oct–Dec; 55.7% occurred 1800–2400 hr
Hotspots: found 183 hotspots in Utah; core hotspots average 5.3 miles in length; isolated hotspots were 1 mile (1.6 km) in length; hotspot collisions were concentrated; 57.74% of all collisions occurred within a cumulative (~ 1001 km) range, or 10.5% of total analyzed highway miles (9,500 total km)

Malo, J.E., Suarez, F., and A. Diez. 2004. Can we mitigate wildlife–vehicle accidents using predictive models? Journal of Applied Ecology 41:701–710.

Objective: the present study analyzed a European case and developed models of the environmental variables associated with the occurrence of collisions with animals at two spatial scales (1.0 and 0.1 km). Provided that a few variables underlie the location of animal crossings, it should be possible to predict where accidents may occur and use this information to optimize mitigation efforts. With this aim, we (1) defined road sections with high collision rates using a clustering detection procedure; (2) analyzed the landscape variables of sections with high collision rates in contrast to low collision sections; and (3) used a 0.1 km scale to analyze the points where collisions occur in contrast to those where they do not.
Data layers: official traffic database on WVC for Jan 88–Feb 01; n = 2,067; includes date, location (0.1 km); 63% of WVCs occurred between 1998–2001; 98% involved roe deer, wild boar or red deer
  Definition of high accident rd sections: determined by detecting clusters of WVC locations; contiguity analysis conducted by comparing the spatial pattern of collisions with that expected in a random situation; each km of rd with 3 or more collisions, especially over consecutive km, could be defined as a high collision section
  1:50,000 digital forest cover map (cover types used: riparian forest, other forest, scrub, grassland, crops, rivers and dams, urbanized and unproductive); processed in ArcView 3.2
  Habitat features in high collision sections: analyzed 84 locations--41 high collision, 43 low collision; sampling unit = circular area (radius 1000 m) around reference point; calculated proportion of each habitat type; ecotone length (meters of contact lines between habitat polygons); habitat diversity (Shannon index)
A wildlifevehicle collision is included in the database only Variables associated with collision point: analyzed at 0.1 km if an animal was actually hit, if the estimated vehicle damage ex- scale: sampling points from 18 high collision sections in which ceeded $1,000 and/or if a person was injured. Collisions included WVCs had been recorded in at least 12 hectometer posts; 6 hec- in the database do not account for crashes that occurred as a result tometer posts with highest number WVCs chosen from each of swerving to miss an animal. section; a further 6 taken at random from amongst those with We focus on collisions involving almost exclusively mule deer. We used recorded collisions; 12 control samples w/o WVCs taken at the UDOT vehicle crash database to study DVC patterns and trends random from each section from 19922002 on 248 state routes. We evaluated all routes for Evaluated 13 quantitative and 15 qualitative variables covering frequency of deer kills and identified hotspots (at least 1 collision/ aspects linked to driving, general features of the road environs, mile/year). We considered hotspots to consist of two parts: (1) a features associated with animal movements; measurements

taken for 100 m rd stretch and evaluated 100 m on each side of road
Analysis: analyzed at both regional and local scales; predictive models for the location of sections/points with and without collisions were generated by binary logistic regression; validated with independent data
  2 models fitted for each analysis: 1 complete with all measured variables, 1 reduced version with only most significant explanatory variables
  Variable selection for reduced models using G2 statistic; ensured new model was not significantly more informative than previous one, avoided correlated variables and those w/o predictive capacity
  Significance threshold in variable comparison: P = 0.05; probability threshold for model: P = 0.1
Results landscape scale: 41 high collision rd sections identified; 7.7% of rd network accounting for 70.5% of all records; distributed among secondary and tertiary roads; none along A-2 fenced motorway
  High collision areas had higher cover of non-riparian forest, lower crop cover, lower urbanized areas, and higher habitat diversity than low collision areas
  Simplified model included forest cover, urbanized area and habitat diversity; had same predictive capacity as full model: 87.0% correct classification for all cases, 88.5% for high and 85.7% for low collision sections; successfully predicted 70% of 30 cases used as test data
Results for local scale: low collision areas associated with crossroads, underpasses, guard rails, embankments at least 2 m high with moderate or steep slopes, greater distances from roads to hedgerows and forest stands, and shorter distances from roads to buildings
  Reduced model included presence of crossroads, presence and continuity of guardrails, presence and continuity of embankments and distance to nearest forest stand; correctly predicted 61.2% of cases, 72.7% of collision points, 48.4% of non-collision points; full model results were 74.0%, 79.2% and 68.1% respectively; correctly classified 64.2% of test cases
Discussion: results show it is possible to predict the location of WVCs at 2 scales; results should be considered cautiously; validity could be hindered by assumption of a binomial distribution of errors--bigger issue for local rather than landscape model

Nielsen, C.K., Anderson, R.G., and M.D. Grund. 2003. Landscape influences on deer-vehicle accident areas in an urban environment. Journal of Wildlife Management 67(1): 46–51.

Objective: quantified the effect of landscape factors on DVA in 2 Minneapolis suburbs to provide public officials and wildlife managers with recommendations for managing the landscape to reduce DVA.
Data layers: digitized DVA locations from 1993–2000 using ArcView
  DVA clustering to differentiate DVA areas (≥ 2 DVA) and control areas (0 or 1 DVA); overlaid 0.5 km road segments at midpoint of DVA clusters; buffered road segments for variable selection purpose with a 0.1 km perpendicular distance from edge of each side of road; repeated for control areas (n = 160 total)
  Landscape variables: land cover (grassland/residential, woodland, open water); land use (commercial/industrial, residential, public land); ArcView Patch Analyst used to calculate 60 class and landscape level variables; road curvature (straight or curved); number of buildings in buffer, speed limit; number of lanes; distance from road to nearest forest cover patch; ROW topography based on presence or absence of ditches
Analysis: univariate procedure used to reduce 66 variable set to smaller group; removed variables correlated at r 0.70; left with number of buildings, number of forest cover patches, proportion of forest cover, Shannon's diversity index for further analysis
  Logistic regression analysis to determine which variables best explained difference between DVA areas and control areas; built one global model and 10 a priori models; used AIC and Akaike's weights to rank and select best model; used relative weight of evidence to compare parameter importance; model averaging to incorporate model-selection uncertainty into final unconditional parameter estimates and standard errors
  40 sites retained to validate best-fit model
Results: global model was significant; areas with DVA contained fewer buildings, more patches and higher proportion of forest cover, more public land patches and higher Shannon's diversity index of landscape; Akaike's weights indicated number of buildings and number of public land patches most important variables
  7 models necessary to compile a 95% confidence set; best-fit model correctly classified 77.5% of test sites
Discussion: study unique because assessed landscape factors influencing DVA in an urban environment; pooled data over 7-year period so pop growth or land-use change may have affected data

Nielsen, S.E., Herrero, S., Boyce, M.S., Mace, R.D., Benn, B., Gibeau, M.L., and S. Jevons. 2004. Modelling the spatial distribution of human-caused grizzly bear mortalities in the Central Rockies ecosystem of Canada. Biological Conservation 120:101–113

Objective: "We develop predictive models and maps that describe the distribution of human-caused grizzly bear mortalities . . . Our goal was to understand, through modeling, the relationships among bear mortality locations and landscape-level physiographic and human variables. More specifically interested in (1) examining the spatial density of grizzly bear mortalities; (2) evaluating possible differences in the physiographic attributes of mortality locations . . .; and (3) developing predictive models that estimate the relative probabilities of bear mortality (risk) given multivariable combinations of physiographic variables."
Data layers: mortality info from 1971–2002; included dead bears and translocated bears; location (UTM when possible), accuracy of location (accurate, reasonable, unknown), month, year, sex, age, and cause of mortality; n = 279 accurate and reasonable locations
  GIS (spatial) predictor variables: land cover (Landsat TM 95-98, 5 classes); distance to edge of nearest land cover; greenness index; distance to nearest water feature; distance to nearest linear human use feature; terrain ruggedness index
Analyses: 3 separately scaled moving windows to calculate total density of mortality locations: 520 km2; 900 km2; 1405 km2; secure sites = pixels with 0 mortalities; high mortality zones = pixels with > 31 mortalities ( 1 mortality/year)
  Logistic regression to assess relationship between landscape attributes of mortality locations and categories of demographic status, season, and mortality type
  Random sample of locations generated to contrast with human-caused mortality locations
  Data divided into model training (80%) and model testing (20%) data sets
  Logistic regression used to contrast the location of grizzly bear mortalities with sites used by bears (through telemetry)
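
The Akaike-weight ranking and the 95% confidence set reported in the Nielsen, Anderson, and Grund (2003) summary above can be illustrated with a minimal sketch. It uses the standard definition of Akaike weights (each model's relative likelihood exp(-delta/2), normalized to sum to 1). The function names and the AIC values in the usage note are hypothetical and are not taken from the study.

```python
import math


def akaike_weights(aic_values):
    """Return the Akaike weight of each candidate model.

    Each weight is exp(-delta_i / 2) normalized to sum to 1, where
    delta_i is the model's AIC minus the smallest AIC in the set.
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def confidence_set(aic_values, level=0.95):
    """Return indices of the models, best first, whose cumulative
    Akaike weight first reaches `level`."""
    weights = akaike_weights(aic_values)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += weights[i]
        if cumulative >= level:
            break
    return chosen
```

For example, with hypothetical AIC values `[100.0, 102.0, 110.0]`, `confidence_set` returns the first two models, because their combined weight already exceeds 0.95.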

Results: mortalities concentrated within 3 regions regardless of scale examined
  900 and 1405 km2 scales: mortality densities within moving windows exceeded 31 mortalities for the three sites; at 520 km2 scale: only one site as high mortality zone
  Total area occupied in high mortality zones: 520 km2 = 1.4%, 900 km2 = 3.8%, 1405 km2 = 13.2%
  Total area occupied in secure zone: 520 km2 = 23.9%, 900 km2 = 13.9%, 1405 km2 = 23.9%; 22–32% secure habitat in areas of non-habitat
  Mortality locations positively associated with access, water, and edge features; negatively associated with terrain ruggedness and greenness indices
  Non-harvest mortalities more likely to occur in shrub and grassland habitats and closer to edge features and access than random points
  Mortalities more likely to occur in deciduous forest and shrub habitats, nearer to edge, access, and water than radiotelemetry locations; also sig related to areas of low greenness and minimal terrain ruggedness

Premo, D.B.P., and E.I. Rogers. 2001. Town of Amherst deer-vehicle accident management plan. White Water Associates, Inc., Amasa, Michigan (www.white-water-associates.com)

Objective: This plan's focus is reducing DVAs. The primary measures of concern are the numbers of DVAs and the patterns of their distribution in the Amherst landscape. The DVA Management Plan establishes its initial goal at two spatial scales, whole town and hotspots.
Data layers: DVAs reported to police (n = 3300) and counts of carcasses removed from road (n = 3320); Jan 1991–Dec 2000; entered into GIS; time of day, time of year, location, speed limit, landcover; deer population; management zones
Analyses: density analysis in ArcView used to examine landscape patterns of DVAs. This allowed mapping of DVAs as density contours and identification of DVA hotspots; density calculated by circles of half-mile radius; DVA density = DVA/sq. mi.; when displayed in conjunction with other mapped features, contours could be used to determine the causes of the hotspots as well as examine temporal changes
Results: temporal changes in hotspots before, during and after the concentrated lethal control period

Rogers, E. 2004. An ecological landscape study of deer-vehicle collisions in Kent County, Michigan. Report for the Michigan State Police, Office of Highway Safety and Planning. White Water Associates, Inc., Amasa, MI 49903. 56 pp.

Objective: an analysis of landscape patterns of DVCs in 4 townships of Kent County, Michigan
Data layers: GIS database available; included spatial layers drawn from MiRIS Base Maps and Land Cover Maps; political boundaries, land survey section lines, transportation, watercourses and lakes, major veg cover types, development
DVC locations from Michigan Accident Location Index (MALI) maintained by Michigan State Police, 1992–2000; locations based on police reports; uses system of unique physical reference numbers to spatially record accidents
N = 3127 DVC records coded by township, year, month, time of day
Half-mile grid created in ArcView and superimposed on study area for summarization of landscape data; 1/2 mile chosen because of assumed low precision in DVC location data; grid split into two equally sized groups of cells, 1 group for model development, 1 for validation
Density function in SpatialAnalyst used for visual inspection of DVC patterns; density calculated for each cell by summing number of DVCs found within search radius (1/2 mile) and dividing by the area of the circle
Stepwise logistic regression to identify a subset of parameters to build predictive logit model; final model had 3 parameters: linear feet of highways and roads, linear feet of roadway within 1000 ft of watercourse, number of mapped land use polygons
Analysis: mapping of DVC densities summed across all years; mapping in 3-year blocks; resulted in very little change in hotspots across years, only minor shift in location and density

Romin, L.A., and J.A. Bissonette. 1996. Temporal and spatial distribution of highway mortality of mule deer on newly constructed roads at Jordanelle Reservoir, Utah. The Great Basin Naturalist 56(1): 1–11.

Objectives: (1) to determine whether mule deer roadkills on newly relocated highways would increase, (2) to evaluate the influence of topographic features and vegetation characteristics on the kill pattern
Data collected from 15 Oct 1991–14 October 1993; 47.3 km total on 3 highway segments; road construction completed in 1989
Data layers: deer roadkill data collected at least once per week (date, highway identification, location to nearest 0.10 mile, age class); 4 randomly selected pairs of kill (5 or more kills/mile) and non-kill zones of 0.10-mile road length each; for each pair, established 3 transects perpendicular to road, 100 m apart, extended 100 m beyond ROW fence to evaluate respective road alignment and associated habitat features
  Distribution of kills (nearest 0.01 mile); avg traffic volume and speed for each highway; % vegetative cover; topography proximal to area roads; twice monthly spotlight counts of deer (sex, age class, activity, location to nearest 0.10 mile); deer snow track counts (number of trails, orientation relative to road--parallel vs. perpendicular); observable area from highway every 0.10 mile; ROW width and slope; ROW vegetation; vegetation composition; road type
Analysis: stereoscopic aerial photography used to describe habitat features; transparent grid placed over photos to determine percent cover and topographic features at deer-highway mortality locations beginning at the road and extending 1.2 km distant; identified roadkill and live deer locations, as well as descriptive roadside features to 0.10 mile
Results: 397 deer roadkills during 2 years of study; deer kills averaged < 20 before roads relocated; 19 deer kill zones identified; deer spotlight counts not significantly correlated with kill sites; kill zones had higher mean % cover
Discussion: traffic volume significantly influenced deer mortality; higher kill levels occurred along drainages; ROW topography may funnel deer to the ROW and encourage movement along highway corridor

Seiler, A. 2005. Predicting locations of moose-vehicle collisions in Sweden. Journal of Applied Ecology 42: 371–382.

Objective: to develop MVC prediction models based on data that are readily available for road planning at strategic and project levels (Seiler and Eriksson 1997). This study used accident statistics from before 1999, remotely sensed landscape information, digital topographic data and official road and traffic data to identify the strongest set of environmental and road traffic parameters that can be used to foresee the risk of MVC.
Data layers: Landscape, road and traffic, collisions, moose abundance and harvest
  Landscape data: Swedish Terrain Type Classification maps (TTC) (based on SPOT and Landsat TM satellite images) combined with digital topographic maps at a scale of 1:100,000; 1994–1998, updated with aerial photographs from 1999
    25x25 meter pixel size; 6 major land cover types; densities of landscape features measured as km per km2, number of intersections per km rd; distances between rd and landscape elements measured in meters and log(e) transformed
  Road and traffic data: from digital road databases provided by the SNRA
    Averaged rd density: model area--1.92 km/km2; test area--1.76 km/km2; 75% is privately owned
    National trunk roads: 2,500–20,000 vehicles/day; > 90 kph;
    Tertiary public roads: 80% of rd network, < 1,000 vehicles/day, 90 kph) in model area 71% fenced, in test areas 35% fenced
    Average number of vehicles/day used jointly with its square to adjust for the humpbacked relationship between traffic volume and MVC frequencies observed in the data
  Moose–vehicle collisions: obtained from the SNRA rd acc stats containing all police-reported accidents on public rds between 1972–1999 (type of accident, place, time)
    Accuracy not evaluated, error estimated at 500 m (L. Savberger, pers com).
    N = 2185 for model area; N = 1655 for test area (for 1990–1999)
  Moose abundance and harvest: indices of moose abundance were determined from the average annual game bag per hunting district during the 1990s
    Model area: 21 hunting districts, avg 3.45 shot/1000 ha (1.0–5.1); Test area: 14 hunting districts, avg 4.25 shot/1000 ha (1.6–6.4)
    Moose harvest and MVC correlated strongly at county and national levels over the past 30 years (Seiler 2004)
    No migration between winter and summer ranges
Analysis: 3 logistic regression models were developed to identify parameters that significantly distinguished between observed MVC sites and non-accident control sites
  Model composition: N = 2000 MVC records, N = 2000 randomly distributed non-accident control sites located at least 1 km away from MVC site
    500 m buffer created around each point (to account for estimated error)
  Unpaired t-tests and univariate logistic regression models used to identify among 25 variables those that sig (P < 0.1) differed between accident and control sites (all other analyses used P < 0.05); intercorrelated variables removed, 19 variables left
  3 a priori models: (1) road-traffic (only basic road and traffic parameters); (2) landscape (parameters obtained from RS landscape data and digital maps); (3) combined model
  Stepwise (backward) regression to identify sig parameter combos; sets compared using AIC and Akaike weights; model structure considered adequate if variance inflation factors were close to 1.0
  Model validation: N = 1300 accident sites (1 km road sections) and 1300 non-accident sites (1 km road sections) from new county; 500 meter radius around the center point of each road section; univariate logistic regression analyses to determine model performance in distinguishing accident from non-accident sites
  Counteractive measures: to illustrate and evaluate the predicted effect of different counteractive measures on accident risks, changes in MVC probabilities relative to varying traffic volume and moose abundance modeled with respect to increased forest proximity, reduced vehicle speed and road fencing.
Results: Dominant factors determining MVC risks included traffic volume, vehicle speed and the occurrence of fences
  Model results: model ranking according to AIC weights: (1) traffic (classified correctly 81.2% of all observations), (2) combined (83.6%, but lower ranking because of greater number of variables), (3) landscape (67.5% MVC sites and 62.2% control sites)
  Validation results: combined model gave best results predicting 72.4% of all MVC sites and 79.8% of all control sites; traffic model concordance = 77.9%; landscape model concordance = 62.0%; all results are significant
  Identified 72.7% of all accident sites
  Other parameters were important in distinguishing between accident and control sites within a given road category including amount of and distance to forest cover, density of intersections between forest edges, private roads and the main accident road, moose abundance indexed by harvest statistics
  Together, road traffic and landscape parameters produced an overall concordance in 83.6% of the predicted sites and identified 76.1% of all test road sections correctly
  Speed reduction appeared to be most effective measure to reduce MVC risk at any given traffic volume; modified by fencing, moose abundance and forest proximity
Discussion: spatial distribution of MVC not random; collisions a product of environmental factors quantified from RS landscape info, road traffic data and estimates of animal abund.; parameters used to identify high risk roads (traffic data) different from parameters used to identify high risk road segments (landscape data)

Simek, S.L., Jonker, S.A., and Mark J. Endries. 2005. Evaluation of principal roadkill areas for Florida black bear. ICOET 2005.

Principal roadkill areas (PRA) defined as 3 or more roadkill bear within a distance of 1 mile (1.6 km)
Data from 2001–2003 analyzed using density analysis with Spatial Analyst in ArcGIS
6 core and 2 remnant black bear populations evaluated
Objectives: to establish whether previously identified "chronic" areas were still apparent or had shifted, and whether different criteria and timeframes would impact results and subsequent conservation recommendations using current and previously evaluated roadkill data
Data layers: FWC bear roadkill data and the major roads shapefile (interstates, state highways, county highways, highway access ramps, and major local and forest roads)
Density analysis: raster format with 30 m x 30 m pixel size; creates a 2D raster grid of pixels calculating the total number of points that occurred within the search radius divided by the search area size; pixels within areas meeting principal roadkill definition reclassified to 1 (referred to as CRDA), all others classified as no value; 1-mile buffer created around CRDA dataset (referred to as PRBA); analysis repeated using criteria outlined by Gilbert and Wooding (1996) of 8 roadkill bear/7 miles (they used dataset from 1976–1995)
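
The point-density calculation described above (number of points within a search radius of each pixel, divided by the search area) can be sketched in a few lines. This is only an illustration under stated assumptions: coordinates are planar and in the same units as the cell size and radius, the count is taken around each cell centre, and the function name and parameters are hypothetical. It is not the Spatial Analyst implementation, whose details may differ.

```python
import math


def point_density(points, x_min, y_min, n_cols, n_rows, cell_size, radius):
    """Return a row-major grid of point densities (points per unit area).

    For every cell centre, count the points lying within `radius` of it
    and divide that count by the area of the search circle.
    """
    search_area = math.pi * radius ** 2
    grid = []
    for row in range(n_rows):
        centre_y = y_min + (row + 0.5) * cell_size
        grid_row = []
        for col in range(n_cols):
            centre_x = x_min + (col + 0.5) * cell_size
            count = sum(
                1
                for (px, py) in points
                if math.hypot(px - centre_x, py - centre_y) <= radius
            )
            grid_row.append(count / search_area)
        grid.append(grid_row)
    return grid
```

Cells whose density meets a chosen roadkill criterion could then be reclassified to 1 and all others to no value, as the summary describes. The brute-force loop is fine for a sketch, but a real raster of 30 m pixels would need a spatial index.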

Results: With a few exceptions, most of the PRA identified by both methodologies overlapped; Gilbert methodology encompassed a much larger area which included more roads whereas the current methodology identified more specific principal roadkill road segments; using similar timeframe (1976–1995), two methods again identified very similar PRA but new method identified additional areas; using complete timeframe (1976–2004) PRA identified in all 6 populations, including 2 which had not been previously identified as containing PRA
Discussion: illustrated that changes in locations of PRA can occur when using different methods and timeframes; different results with respect to scale--Gilbert's method gives PRA on a broader scale, new method provides increased specificity on actual locations of hotspots; PRA will change with changes in habitat and land use; preferred method (Gilbert or new) will depend on goals and objectives

Singleton, P.H., and J.F. Lehmkuhl. 1999. Assessing wildlife habitat connectivity in the Interstate 90 Snoqualmie Pass Corridor, Washington. ICOWET III.

Objective: an assessment of wildlife habitat connectivity and barrier effects of I-90 from Snoqualmie Pass to Cle Elum was initiated in January 1998. The assessment consists of 5 components including a GIS analysis of ungulate roadkill distribution
Data on ungulate roadkill locations was collected by WSDOT maintenance personnel from 1990 to 1998. We imported these records on species and location of roadkills into the GIS and used a moving window analysis to determine the number of kills per mile along I-90
Results: 4 roadkill concentration areas were identified based on the analysis of 490 deer and 194 elk kills. Quantitative analysis of landscape characteristics of collision locations has not yet been conducted. However, roadkill distribution appears to be affected by landforms that channel animal movement and by human development and disturbance patterns.

II. Spatial Analysis Techniques

Boots, B.N. and A. Getis. 1988. Point Pattern Analysis. Sage Publications, Inc. Newbury Park, California. 85 pp.

Development of statistical analysis of point patterns originated in plant ecology over 50 yrs ago.
Point pattern map has 2 components:
  Point pattern: has size (# points, n)
  Study area: may be 1 or multidimensional. Roads would be represented as a one-dimensional study area. Two-dimensional study areas are enclosed by a boundary, which determines the shape of the study area. Road study areas do not have a shape necessarily.
If studying the location of points relative to the study area, then examining dispersion of points; if studying locations of points relative to other points, then examining the arrangement of points. In many cases dispersion and arrangement may be highly correlated.
When analyzing pt patterns, usually use method that involves establishing a theoretical pattern that is compared to other patterns that are identified. That theoretical pattern chosen is formally called a homogeneous planar Poisson point process, and these points are generated under two conditions:
  Each location has equal chance of receiving a point (uniformity).
  Points selected do not influence the selection of other locations for points (independence).
These conditions imply the study area is homogeneous w/ no interaction b/w points, and the resulting pattern from that point generation process could be considered to occur by chance in an undifferentiated environment, referred to as "complete spatial randomness" or CSR (cites Diggle 1983).
CSR is idealized standard which other patterns can be compared to--
  Clustered patterns occur when points are significantly more grouped in the study area than they are in CSR.
  Regular patterns occur when points in the study areas are more spread out than they would be in CSR
Opposite of uniformity condition/homogeneous model: heterogeneous models, which imply some locations in study area are more prone to receive a point than other locations, or may be less likely to receive a point.
If independence assumption is relaxed, then there may be interaction among points--i.e., they may attract or repulse each other.
To analyze dispersion or arrangement characteristics, use hypothesis testing procedures, with the null hypothesis always that the pattern is CSR, with the simplest alternative hypothesis being that the pattern is not CSR.
  If null not rejected, no further analysis needed.
  Null (CSR) provides division between clustered and regular patterns.
  If null is rejected, can develop further/formulate new null hypotheses to test other theories.
Spatial autocorrelation is a measure of the correlation among neighboring points in a pattern.
  No spatial autocorrelation means no correlation between neighboring values and would expect CSR
Measures of dispersion/distance methods analyze patterns using stats calculated using characteristics of distances separating individual points in the pattern. Nearest neighbor analysis (NNA): as originally developed, several limitations--inaccuracy in interpretation in some situations and edge effects
  2-D study areas (not roads): defined as distance between point a and the nearest other point in the pattern
    Distances other than those between a point and its closest neighbor are referred to as second, third, or "higher order neighbor distances"
  NNA in 1-D study areas (roads): same concepts, but the line is bounded by its ends, so two ways to deal with these ends (edges)
    If points at ends of line
    If no points at either end of the line
    NN dist for any point not located at an end point is distance to either the preceding or succeeding point encountered on the line; thus nearest neighbor distances are part of the set of all interpoint distances on the line.
    To test, interpoint distances converted to proportions of the sum of the interpoint distances, resulting in scaled values ranked from smallest to largest, with n as the number of interpoint distances. Observed and expected values compared to normally distributed statistic z; if calculated value of z is positive and larger than value of z = 1.96 (alpha 0.05) obtained from tables of normal dist, the null is rejected in favor of hypothesis that indicates regularity in the point pattern.
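
The one-dimensional quantities described above (nearest neighbor distances along a road, and interpoint distances scaled to proportions of their sum and ranked) can be sketched as follows. This is a minimal illustration that assumes collision locations are given as positions along a single route, for example mileposts; the function names are hypothetical, and the z test itself is not reproduced because the summary does not give its formula.

```python
def nearest_neighbor_distances_1d(positions):
    """Nearest neighbor distance of each point on a line.

    Distances are returned in the order of the sorted positions. An
    interior point's nearest neighbor is the closer of the preceding
    and succeeding points; an end point has only one neighbor.
    """
    pts = sorted(positions)
    if len(pts) < 2:
        raise ValueError("need at least two points")
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    distances = []
    for i in range(len(pts)):
        if i == 0:
            distances.append(gaps[0])
        elif i == len(pts) - 1:
            distances.append(gaps[-1])
        else:
            distances.append(min(gaps[i - 1], gaps[i]))
    return distances


def scaled_interpoint_distances(positions):
    """Interpoint distances as proportions of their sum, ranked from
    smallest to largest."""
    pts = sorted(positions)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    total = sum(gaps)
    if total == 0:
        raise ValueError("points must not all share one position")
    return sorted(gap / total for gap in gaps)
```

With hypothetical mileposts `[0.0, 1.0, 3.0, 7.0]`, the nearest neighbor distances are 1, 1, 2, and 4, and the scaled interpoint distances are 1/7, 2/7, and 4/7.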

  Refined NNA (cites Diggle 1979 pg 79) involves comparing the complete distribution function of the observed nearest neighbor distances, F(di), with the distribution function of expected nearest neighbor distances for CSR, P(di).
    Observed nearest neighbor distances obtained by taking nearest neighbor distances and ranking smallest to largest, then determining what proportion F(di <= r) of nearest neighbor distances are less than or equal to some chosen distance r (usually selected to correspond with nearest neighbor distance values).
    Cites Pielou (1969:111-112) for the equation that gives the corresponding proportion P(r) of expected nearest neighbor distances <= r for an unbounded CSR pattern.
    Diggle 1981 suggests P(r) and F(r) can be compared using dr = max |F(r) - P(r)|.
    Because nearest neighbor distances are not mutually independent, Diggle (1981:26) suggests evaluating the significance of dr with a Monte Carlo test procedure: generate a set (usually 99) of CSR patterns, each with the same number of points as the empirical pattern in the study area; calculate dr for each simulated pattern; then examine where the value of dr for the empirical/observed pattern falls within the entire set of 100 values (99 simulated and 1 observed). If dr for the observed pattern is among the 5 largest values of dr, the null of CSR can be rejected (at alpha 0.05). Diggle 1979 suggests that if, for dr, F(r) > P(r), then the pattern is clustered, whereas F(r) < P(r) indicates a regular pattern of points.
  Second order procedures require distance measurements between all combinations of pairs of points. Study of interevent distances, where events are mapped points. Focus is on the variance, or second moment, of interevent distances.
    Advantages over other techniques: more info about the pattern is potentially available; the CSR model available for interevent distances can be used as a basis for statistical significance (second order analysis); statistically defensible boundary correction technique developed for second order studies. Convenient to use to study various distance subdivisions or distance zones.
    Analysis based on a circle with radius d centered on each point; each of the points within the circle is paired with the center point of that circle, and it is this number of pairs that forms the data. As d increases, see increased number of pairs of points in each circle. Analysis of that data depends on expected pairs of points derived similarly to points in a Poisson process (CSR model). Ripley (1981:159-60)
    Cites Haining (1982), Getis (1983, 1984), Ripley 1981 and Diggle 1983 for additional background.
  Measures of arrangement examine locations of points relative to other points in the pattern. Two advantages over measures of dispersion:
    Advantages
      "Density free": to compare arrangement properties of a CSR pattern against the observed pattern, don't need to estimate any values from the observed data.
      Arrangement measures are concerned with the locations of points relative to each other and not relative to the study area (as is the case with dispersion methods).
    Disadvantages:
      Not as rigorous as measures of dispersion, sort of like how non-parametric statistics are usually less powerful than their parametric equivalents.
      Measures of arrangement are insensitive to some differences in some pattern characteristics, so that identical values may be expected for patterns that are different in some way.
      Stats theory less well developed (in 1988), so a greater element of subjectivity enters when interpreting results of analyses of measures of arrangement.
    Reflexive nearest neighbor analysis:
      When two points are the nearest neighbor of each other, they are said to be reflexive (reciprocal) nearest neighbors.
      Test number of reflexive nearest neighbors in the observed pattern compared to expected number of reflexive nearest neighbors in CSR.
      Lack of a test of significance and unanimity in interpreting results . . . common to extend analysis to reflexive nearest neighbors of higher orders; in interpreting number of observed pairs in relation to CSR values, most researchers suggest that higher order values in excess of the CSR expectations indicate a measure of regularity in the arrangement of points, whereas lower empirical values imply grouping.
      Dacey 1969 gives tables of probabilities that a point along a line in a random pattern is the jth neighbor of its own jth nearest neighbor for j <= 6. 1st order prob.: 0.6667; 2nd order prob.: 0.3704; 3rd order prob.: 0.2716; 4th order prob.: 0.2241; 5th order prob.: 0.1952; 6th order prob.: 0.1753. To get "expected," multiply total number of points by the corresponding probability; if observed number of jth pairs is less than expected, then suggests grouping.
      May be that the reflexive nearest neighbor observed = CSR, but when looking at higher order reflexive pairs (2nd, 3rd, etc.) may see tendencies toward grouping.
  Summary: No single optimal method.
    Power of most point pattern techniques (i.e., ability to eliminate a false hypothesis) varies depending on the type of pattern, so some techniques are better than others in detecting clustering whereas others are better at detecting regularity.
    Measures of dispersion better than measures of arrangement, since the latter methods require more subjectivity in the interpretation of their results.
    Measures of dispersion used in combination with arrangement techniques can provide confirmation of results and further insights into the patterns.

Burka, J., Nulph, D., and A. Mudd. 1997. Technical approach to developing a spatial crime analysis system with ArcView GIS. INDUS Corporation and U.S. Department of Justice.
  Discusses methods used to develop and implement an ArcView-based spatial crime analysis system for geographic analysis.
  Sample application functions include
    Pin maps and summaries
    Geocoding
    Change maps that look at trends over time based on two maps of the same area representing incidents at different times, which produces a third map that shows increase or decrease in incidents per polygon between the two time periods.
    Surface-derived hotspots--many ways to do this, but they use ArcView Spatial Analyst to build a surface of incident density for a selected set of incident points, using the kernel function in Spatial Analyst, then reselect out the "peaks" depicting hotspots

    Standard deviation ellipses
    Temporal and spatial trend charts
    Layout generation (maps)

Lee, J., and D.W.S. Wong. 2001. Statistical Analysis with ArcView GIS. John Wiley and Sons, Inc., New York, New York. 192 pp.
  Chapters:
    Attribute Descriptors
    Point Descriptors
    Pattern Detectors
    Line Descriptors
    Pattern Descriptors

Levine, N., Kim, K.E., and L.H. Nitz. 1995. Spatial analysis of Honolulu motor vehicle crashes: I. Spatial Patterns. Accident Analysis and Prevention 27(5):663-674.
  Examines method for geo-referencing crash locations and guides for describing spatial distribution of crash locations, and how types of crashes can be spatially differentiated. Study area was assumed homogeneous planar, not a network (system of roads).
  4 general categories of analyzing spatial variations in auto crashes:
    Different types of environments--rural vs. urban, large cities vs. small cities, state comparisons, national comparisons; tend to use highly aggregated data and large geographical units.
    Examines crashes as function of volume, speed, other variables on roads, road types, intersections; emphasis on functions of the road system, how different road segments or elements create different crash likelihoods. Classic "blackspot" analysis included in this category (cites: Boyle and Wright 1984, Persaud 1987, Maher and Mountain 1988).
    Crashes in particular areas, corridors, neighborhoods; emphasis on analysis units, which are socially and ecologically integrated.
    System-wide spatial variations in crashes (few studies on this) to look at variations across region, examine how crashes in a particular zone or sub-area are part of larger spatial pattern.
  Developed own software to derive different indices of spatial point pattern (Hawaii Pointstat; cites Levine et al. 1994). Takes list of lat/long for each crash location and produces 4 measures of concentration:
    Mean center (mean lat and mean long on list, "center of gravity")
    Standard distance deviation, based on "Great Circle" distance of each point from mean center (cites McDonnell 1979 chap 1; Snyder 1987 pp. 29-33).
    Standard deviational ellipse, which calculates the SD along a transformed axis of maximum concentration and another SD along an axis which is orthogonal to this (cites Ebdon 1985 pp. 135-141). More concise than standard distance deviation circle (above).
    Nearest neighbor index, which measures average distance from each point to the nearest point and then compares this to a distribution that would be expected based on chance (cites Ebdon 1985 pp. 143-150; Cressie 1991 pp. 602-615). Developed by plant ecologists for describing clustering of point patterns (cites Clark and Evans 1954). For each point, distance to every other point calculated and shortest distance selected, then shortest distances are averaged and compared to a nearest neighbor distance which would be expected based on chance (nearest neighbor index). Index of 1.0 is indistinguishable from chance, lower than 1.0 indicates clustering, and > 1.0 indicates dispersion.
  These measures allow description of spatial variation and degree of concentration (spatial autocorrelation).
  Compared SD ellipses for types of crashes (fatal, serious injury, alcohol-related, single-vehicle, head-on, two-vehicle, etc.) to each other as well as to other ellipses for residential population and employment.
  Used to provide insights into how certain relationships have a spatial dimension (e.g., between alcohol and severe injuries; types of impact and injury level); can be used to compare different types of accidents, the same type of accident for 2 different time periods, or same type for two different areas. These do not provide behavioral insights.
  These methods go beyond "blackspot" analysis--blackspot analysis assumes that observation locations are spatially independent; that each observed location has its own random process, whether Poisson distributed or not. Cites Loveday and Jarrett 1992 re: spatial autocorrelation and that you can't treat each observation as independent.
  Limitations to these guides: assume monocentric spatial plane, but cities often have multiple centers and these distort the relationships by assuming a center, but they say that there are no accepted methods for identifying multiple nodes in a spatial plane; most cluster analyses produce biased results since they don't take spatial autocorrelation into account (see Anselin 1995 for developments in this area).

Levine, N. 1996. Spatial statistics and GIS: software guides to quantify spatial patterns. Journal of the American Planning Association 62(3):381-391.
  Reviews the following software guides:
    STAC (Spatial and Temporal Analysis of Crime)
    Hawaii Pointstat
    S-Plus
    Venables and Ripley Spatial Statistics Functions
    SASP: A 2-D Spectral Analysis Package for Analyzing Spatial Data
    SpaceStat: A Program for the Statistical Analysis of Spatial Data
  Variables may be described spatially as either
    Occurring at unique point locations (incidents, buildings, people)
    Aggregated to areas (census tracts, traffic analysis zones, city boundaries)
  Stats describing points or areas fall into 3 general categories
    Measures of spatial distribution, which describe center, dispersion, direction, and shape of the distribution of a variable (cites Hammond and McCullagh 1978; Ebdon 1988), e.g., get latitude/longitude locations geo-coded, then can calculate center of the distribution ("center of gravity" or mean center), dispersion (standard distance variation), direction of the dispersion (standard deviation ellipse)--then can compare to other distributions.
    Measures of spatial autocorrelation describe relationship among different locations for a single variable, indicating degree of concentration or dispersion (cites Cliff and Ord 1981; Haining 1990; Cressie 1991). Indicates whether clustering is greater than can be expected on basis of chance.

    Measures of spatial association between two or more variables describe the correlation or association between variables distributed over space (Anselin 1992b spatial dependence article).
  STAC, DOS-based program designed by Statistical Analysis Center of the Illinois Criminal Justice Information Authority to help police departments identify small concentrations (called "hot spot areas") of crime.
    Two modules--TIME, SPACE. SPACE module does two things: radial search for incidents from a selected point and identification of highest concentrations of incidents within a study area. SPACE needs identification number and x, y location of each point in Euclidean coordinates (plane coordinates, UTMs). User must specify limits of study area (min/max x, y coordinates) as well as search radius, which is a circular area that the program uses to search for points that cluster together. No theoretical basis for choosing particular radii, and different search radii will produce slightly different clusters.
    Produces ellipses to identify areas of clustering. Doesn't have a statistic to objectively group points into unique clusters (i.e., with fixed number of clusters and each point assigned to one and only one cluster).
  Hawaii Pointstat provides summary measures of the spatial distribution of points. Available in DOS and Sun Unix versions, can be obtained from the Internet.
    Takes list of x, y location points, can use weights/intensities for points (i.e., if multiple WVCs occurred at same location).
    Distances between points calculated with 2 different metrics:
      Spherical geometry using "great circle" distances;
      Spherical grid distances, which assume that travel occurs only in horizontal or vertical direction (not diagonally)--used in cases of grid street systems.
    Program produces following outputs: mean center; standard deviation of distance of each point from mean center; standard deviational ellipse (which is 2 standard deviations, one along a transformed axis of maximum concentration and one along an axis 90 degrees to that axis, defining an ellipse); nearest neighbor index; Moran's I (Moran 1948, 1950; Ebdon 1988, Haining 1990).
    Provides summary stats of point spatial distributions and can output distance files for use in other programs. Useful to describe distribution of points and can be used to compare different types of distributions.
  Venables & Ripley's Spatial Statistics Functions in S-Plus: modules written in S-Plus (distributed by StatSci), available in both Unix and Windows systems. Has Ripley's K function utilities. Ripley's K function uses distances between all points and compares the observed number of neighbors within a certain distance to a theoretical number based on a Poisson random process; K function generally considered most comprehensive of the distance measures and can be used for determining the distance scale at which randomness occurs.
  SASP--two-dimensional spectral analysis package for analyzing spatial data--set of utility modules for conducting 2-D spectral analysis using a grid cell organization (Renshaw and Ford 1983; Ford and Renshaw 1984; Renshaw and Ford 1984). 2-D spectral analysis is a technique for detecting patterns in a spatial distribution and is a direct extension of 1-D spectral analysis used in time series analysis.
    Data consist of series of rectangular grid cells imposed over spatial plane with m rows and n columns. The value within each cell represents an estimate of a third variable, which could either be number of discrete points that fall within the cell or a value attributed to the entire cell.
    "Distribution of grid cell structure can be decomposed into trigonometric ("cyclic") components, called a Fourier decomposition," resulting in discrete frequencies (p & q) that are independent of each other and that indicate the contribution of each frequency to the overall pattern. Essentially an ANOVA splitting up the variance into sine/cosine components.
    Central output is periodogram, which is a plot of the sine/cosine components and is expressed as the number of waves down the rows, p, and the number of waves across columns, q, with an origin at p = 0 and q = 0. Two summary indices: R-spectrum is average of periodogram values for semicircular "distance" bands emanating from the origin (p = 0, q = 0) and a width of 1. The Θ-spectrum is an average of the periodogram values for an angular band (i.e., pie slices) from the origin; that is, it is a polar coordinate band that is 10 degrees wide, starting at -5 deg to +5 deg along the x-axis and turning clockwise until 165-175 degrees.
    Also 3-D figure showing a smoothed rearranged periodogram.
    2-D spectral analysis seen as exploratory guide for examining repeating spatial patterns.
  SpaceStat program designed to spatially analyze areal distribution (Anselin 1992a), written in Gauss (matrix language). Can be applied to data collected on individual zones or areas within a larger geographical area.
    Ability to create a spatial weights file, which is a series of weights, assigned to individual observations, indicating their location in relationship to each other. Two forms of weights:
      Binary (contiguity matrix that indicates which zones are adjacent to each other)
      General (distance-based matrix that indicates the relative distance of each zone from the others). Typically defined in terms of inverse distance raised to an integer power (e.g., 1/d, 1/d^2, 1/d^3); the higher the power of the distance factor, the more "local" the effect.
    4 modules:
      First allows data to be input and transformed.
      2nd involves guides for creation of spatial weights input.
      3rd involves exploratory analysis including descriptive stats, correlations, and principal components. Includes a Join-Count statistic for binary variables and several measures of spatial autocorrelation, and descriptive model provides a local indicator of spatial association (LISA) by applying Moran's "I" to individual observations (Anselin 1995).
      4th module has number of regression routines, with ordinary least squares (OLS) and robust method for estimating OLS using a "jackknife" procedure, and provides diagnostics to examine residuals. Includes tests for spatial autocorrelation, gauging whether spatial distribution is affecting either the distribution of the dependent variable or the residual error terms. If no apparent spatial autocorrelation, then OLS is valid procedure. If there is spatial autocorrelation, then model that incorporates spatial location needs to be developed.
        Most regression packages don't incorporate spatial location and implicitly treat space as if it were random (i.e., part of the residual error term). SpaceStat only package that Levine is aware of that explicitly builds location into regression procedure. While one can apply non-spatial statistics to spatial data, the error associated in not considering spatial location is enormous. In effect one is assuming that each observation is independent of all others, which is clearly wrong for spatially affected phenomena.
  Author provides info on accessing all software described in article.

Levine, N. 1999. Quickguide to CrimeStat. Ned Levine and Associates, Annandale, VA.
  Guide to parallel online help menus in the program.
  Eight program tabs, each with lists of routines, options and parameters
    1. Primary file: point file with x-y coordinates.
    2. Secondary file: optional; also point file with x-y coordinates used in comparison with primary file.
    3. Reference file: "used for single and dual variable kernel density estimation." Usually though not always a grid overlaying the study area.
    4. Measurement parameters
      a. Area: define area using units (square miles, square meters, etc.)
      b. Length of street network: total length
      c. Type of measurement--direct (shortest distance between two points) or indirect (distance constrained by grid, called "Manhattan" metric).
    5. Spatial distribution: provides statistics describing overall distance (first order spatial stats). 3 routines for describing spatial distance, and 2 routines for describing spatial autocorrelation (intensity variable needed for the latter two routines, weighting variable can also be used)--details on these routines with descriptions are included.
    6. Distance analysis: provides stats about distances between point locations, useful for identifying degree of clustering of points (second order analysis). Three routines for describing properties of the distances and two routines that output distance matrices.
      a. m-sub:Nearest neighbor analysis
      b. Number of nearest neighbors
      c. **Linear nearest neighbor analysis
      d. **Number of linear nearest neighbors
      e. **Ripley's K statistic
      f. Distance matrices
      g. Within file point-to-point: routine outputs distance between each point in primary file to each point in secondary file (can relate to guardrails, intersections, fencing, etc.)
    7. Hotspot analysis: identifies groups of incidents clustered together. Second order analysis. 3 stats:
      a. Nearest neighbor hierarchical spatial clustering: groups points together on basis of spatial proximity--user defines significance level associated with a threshold, minimum number of points that are required for each cluster, and output size for displaying clusters with ellipses
      b. K-means clustering routine for partitioning all points into K groups, in which K is a number assigned by the user
      c. Local Moran statistics: applies the Moran's I statistic to individual points or zones to assess whether particular points/zones are spatially related to nearby points or zones
    8. Interpolation tab: allows estimates of point density using the kernel density smoothing method.
  Chapter 6 hotspot analysis:
    Pg 164 overview of types of cluster analysis methods
      1. Hierarchical techniques: like inverted tree diagram in which two or more incidents are first grouped on the basis of some criteria (e.g., nearest neighbor). Then these are grouped into second order clusters, which are then grouped into third order clusters, and this process is repeated until either all incidents fall into a single cluster or else the grouping criteria fail.
         Literature cited: Sneath 1957; McQuitty 1960; Sokal and Sneath 1963; King 1967; Sokal and Michener 1958; Ward 1963; Hartigan 1975
      2. Partitioning techniques, or K-means technique, partition the incidents into a specified number of groupings, usually defined by the user. All points are assigned to one (only one) group. Displayed as ellipses.
         Literature cited: Thorndike 1953; MacQueen 1967; Ball and Hall 1970; Beale 1969
      3. Density techniques identify clusters by searching for dense concentrations of incidents (next chapter of book discusses one type of density search algorithm that uses the kernel density method).
         Literature cited: Carmichael et al. 1968; Gitman and Levine 1970; Cattell and Coulter 1966; Wishart 1969
      4. Clumping techniques involve partitioning incidents into groups or clusters but allow overlapping membership.
         Literature cited: Jones and Jackson 1967; Needham 1967; Jardine and Sibson 1968; Cole and Wishart 1970
      5. Miscellaneous techniques: other methods less commonly used, including techniques applied to zones, not incidents. Local Moran (cites Anselin 1995)
      6. Also hybrids of these methods; Block and Green 1994 use a partitioning method with elements of hierarchical grouping
    Optimization criteria: distinguish techniques applied to space.
      1. Definition of cluster: discrete grouping or continuous variable; whether points must belong to a cluster or can be isolated; whether points can belong to multiple clusters.
      2. Choice of variables: whether weighting or intensity values are used to define similarities.
      3. Measurement of similarity and distance: type of geometry used; whether clusters are defined by closeness or not; types of similarity measures used.
      4. Number of clusters: whether there are a fixed or variable number of clusters; whether users can define the number or not.
      5. Scale: whether clusters are defined by small or larger areas; for hierarchical techniques, what level of abstraction is considered optimal.
      6. Initial selection of cluster locations ("seeds"): whether they are mathematically or user defined; specific rules to define initial seeds.
      7. Optimization routines used to adjust initial seeds into final locations: whether distance is being minimized or maximized; specific algorithms used to readjust seed locations.
      8. Visual display of clusters once extracted: whether drawn by hand or by geometrical object (ellipse); proportion of cases represented in visualization.
    No single solution--different techniques will reveal different groupings and patterns among the groups.
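The partitioning (K-means) idea in these notes, a user-chosen number of groups, initial "seeds," and every point assigned to exactly one group, can be sketched with a plain Lloyd-style loop. This is a generic illustration under stated assumptions, not CrimeStat's routine: the function name `kmeans`, the user-supplied seed coordinates, the Euclidean distance, and the sample points are all hypothetical.

```python
import math

def kmeans(points, seeds, max_iter=100):
    """Partition 2-D points into len(seeds) groups.

    Assumptions: seeds are supplied by the user and distance is Euclidean.
    Each point is assigned to exactly one (the nearest) center.
    """
    centers = list(seeds)
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing
        assignment = new_assignment
        # Move each center to the mean of its assigned points.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centers, assignment

# Hypothetical incident locations forming two well-separated groups.
incidents = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(incidents, seeds=[(0, 0), (10, 10)])
```

Different seeds or a different number of groups can change the resulting partition, which is the subjectivity the optimization criteria above point to.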

    Chapter goes on to specifically explain CrimeStat routines and criteria for 3 techniques--hierarchical clustering based on nearest neighbor analysis; partitioning technique based on K-means algorithm; and zonal technique that identifies zones which are different from their nearby environment, whether they are "peaks" or "troughs."
    Discusses some advantages/limitations for some techniques:
      Nearest neighbor hierarchical clustering: identify groups of incidents where groups of incidents are spatially closer than would be expected on basis of chance.
        4 advantages
          1. Can identify small geographical environments where there are concentrated incidents, useful for specific targeting of microclimates where incidents are occurring. Sizes of clusters can be adjusted to fit particular groupings of points.
          2. Can be applied to any entire dataset and need not be applied to smaller geographic areas, easing comparisons between different areas.
          3. Linkages between several small clusters can be seen through second and higher order clusters--i.e., there are different scales (geographical levels) to the clustering of points and hierarchical clustering can identify these levels.
          4. Each level may imply different management strategies.
        Hierarchical clustering limitations
          1. Size of grouping area dependent on sample size, since lower limit of mean random distance is used as criterion--for distributions with many incidents, threshold will be smaller than for distributions with fewer incidents, so no consistent definition of hotspot area.
          2. Arbitrariness due to minimum points rule requiring user to define a meaningful cluster size, so two different users may interpret the size of a hotspot differently; also selection of p-value in the Student's t-distribution can allow variability between users. Almost all other clustering techniques have this property too.
          3. No theory or rationale behind clusters. Same goes for many other clustering techniques that are empirical groupings with no theory behind them; however, if one is looking for a hotspot defined by land use, activities, and targets, the technique provides no insight into why clusters are occurring or why they could be related.
      K-means partitioning clustering: data are grouped into K groups defined by user, after specified number of seed locations are defined by user. Routine tries to find best positioning of K centers and assigns each point to the center that is nearest. Assigns points to one and only one cluster, but all points are assigned to a cluster, thus no hierarchy (second, higher order clusters) in routine. Basically, K-means procedure will divide the data into the number of groups specified by the user.
        Advantages and disadvantages: Choosing too many clusters will lead to defining patterns that don't really exist, whereas choosing too few will lead to poor differentiation among areas that are distinctly different. Given the number of clusters one chooses, the results may or may not relate to actual "hotspots."
      Local Moran statistics: aggregate data by zones, applies Moran's I stat to individual zones, allowing them to be identified as similar or different to their nearby pattern.
        Basic concept: LISA, local indicator of spatial association, indicator of the extent to which the value of an observation is similar or different from its neighboring observations. Requires two conditions: (1) each observation has a variable value that can be assigned to it in addition to its x/y coordinates; (2) the neighborhood needs to be defined--could be adjacent zones or all other zones negatively weighted by the distance from the observation zone.
    Some thoughts on hotspots
      3 advantages to the 3 techniques discussed above
        Identifies areas of high or low concentrations of events;
        Systematically implements algorithms (though human decisions affect how the algorithms run); and
        Lastly, these techniques are visual.
      Disadvantages:
        Choice of parameters in algorithms is subjective; makes this as much an art as science. Greater effect, the smaller the sample size.
        Applies to volume of incidents, not underlying "risk." It is an implicit density measure, but higher density may be a function of a higher population or risk or both.
        One thing to identify a concentration of incidents, but these hotspot methods don't explain why there is a concentration of events there. It could be random, not related to anything inherent about the location.
        Hotspot identification is merely an indication of an underlying problem, but further analyses are required to identify what is contributing to the occurrences in that area.

Levine, N. 2004. CrimeStat III: Distance Analysis. Chapter 5 in: A Spatial Statistics Program for the Analysis of Crime Incident Locations. Ned Levine & Associates: Houston, Texas, and the National Institute of Justice, Washington, D.C., USA.
  First order properties are global and represent dominant pattern of distribution.
  Second order (or local) properties refer to subregional patterns or neighborhood patterns within overall distribution, and tell about particular environments that may concentrate crime incidents.
  NNI (nearest neighbor index):
    Simple to understand, calculate ... for areas, not linear features.
    Basis of many distance statistics, some of which are implemented in CrimeStat.
    Compares distances between nearest points and distances that would be expected on basis of chance, and is an index that is the ratio of two summary measures.
      For each point, distance to closest other point (nearest neighbor) is calculated and averaged over all points.
      Expected nearest neighbor distance if CSR = the mean random distance.
      Mean random distance = d(ran) = 0.5 SQRT[A/N], where A is area of region and N is number of points.
      NNI = d(NN)/d(ran) = ratio of observed nearest neighbor distance to mean random distance
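The index just defined can be sketched in a few lines. This is a minimal illustration, not CrimeStat's implementation: the function name, the brute-force neighbor search, and the sample coordinates are assumptions, and no edge correction is applied. The Z statistic uses the standard error that these notes give with the NNI significance test, SQRT[(4 - pi)A / (4 pi N^2)].

```python
import math

def nearest_neighbor_index(points, area):
    """Return (NNI, Z) for 2-D points in a study region of the given area.

    Assumptions: Euclidean distances, brute-force search, no edge correction.
    """
    n = len(points)
    # Observed mean nearest neighbor distance, d(NN).
    nn = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    d_nn = sum(nn) / n
    # Mean random distance under CSR, d(ran) = 0.5 * sqrt(A / N).
    d_ran = 0.5 * math.sqrt(area / n)
    # Standard error of d(ran) as given in the notes.
    se = math.sqrt((4 - math.pi) * area / (4 * math.pi * n ** 2))
    return d_nn / d_ran, (d_nn - d_ran) / se

# Hypothetical regular 3 x 3 lattice with unit spacing in a region of area 9.
grid = [(x, y) for x in range(3) for y in range(3)]
nni, z = nearest_neighbor_index(grid, area=9.0)
```

For this lattice the observed mean distance is 1 and the mean random distance is 0.5, so the index is 2 (more dispersed than chance); tightly bunched points in a large region give an index well below 1.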

      If observed distance is same as mean random distance, then ratio will be ~1; if observed average distance is smaller than the mean random distance, then the index will be < 1, indicating clustering; if observed average distance is greater than the mean random distance, then index > 1, indicating dispersion and that points are more widely distributed than would be expected based on chance.
    Testing significance of NNI: Z-test to determine if significant difference between observed and expected. Z = [d(NN) - d(ran)]/[SE of d(ran)]
      SE of d(ran) ~= SQRT[(4 - pi)A / (4 pi N^2)], where A is area of region and N is number of points.
      Note: significance test for NNI is not a test for CSR, only a test of whether average nearest neighbor distance is significantly different from chance, i.e., test of first order nearest neighbor randomness. There are also second, third, and so forth order distributions that may or may not be significantly different from CSR. All these are K-order effects.
    Edge effects can bias NNI--a point near border of study area may actually have its nearest neighbor on the other side of the border, but program selects another point within the study area as nearest neighbor of border point, which may exaggerate the nearest neighbor distance. No consensus on how to deal with this (cites Cressie 1991 for options) and "this version" of CrimeStat has no correction for edge effects. However, bias will be significantly smaller given datasets with clustering.
  K-order nearest neighbors: beyond nearest neighbor distances, 2nd nearest neighbor, 3rd nearest, etc. In CrimeStat can specify number of nearest neighbor indices to be calculated.
    Output includes order, starting with 1; mean nearest neighbor distance for each order (m); expected nearest neighbor distance for each order (m); and NNI for each order.
    Kth NNI is ratio of observed Kth nearest neighbor distance to the Kth mean random distance.
    CrimeStat has no test for significance (none has been developed) for Kth NNI since orders aren't independent.
    No restrictions on number of nearest neighbors that can be calculated, but since average distance increases with higher order nearest neighbors, bias from edge effects will increase. Orders no greater than 2.5% of points should be calculated (cites Cressie 1991 pg 613 for example).
  Linear NNI (Lnna): applied to roads, with assumption that indirect distances are used following network or grid.
    Theory: cites Hammond and McCullagh (1978).
    CrimeStat calculates average of indirect distances between each point and its nearest neighbor = Ld(NN).
    Expected linear nearest neighbor distance is Ld(ran) = 0.5[L/(n - 1)], where L is total length of road and n is sample size.
    Linear NNI = LNNI = [Ld(NN)]/[Ld(ran)]
    Theoretical standard error for random linear nearest neighbor distance not known.
    Author of CrimeStat developed approx SD for observed Ld(NN) = SLd(NN) = SQRT[SUM(min(dij) - Ld(NN))^2 / (N - 1)], where min(dij) is nearest neighbor distance for point i and Ld(NN) is average linear nearest neighbor distance.
    SE Ld(NN) = [SLd(NN)]/SQRT[N]
    Approx significance test = t = [Ld(NN) - Ld(ran)]/[SE of Ld(NN)]
    Since empirical standard deviation of Ld(NN) used instead of theoretical value, t-test used rather than Z-test.
    CrimeStat output with Lnna routine:
      1) Sample size
      2) Mean linear nearest neighbor distance in meters, feet, miles
      3) Minimum linear distance between nearest neighbors
      4) Maximum linear distance between nearest neighbors
      5) Mean linear random distance
      6) Linear nearest neighbor index
      7) SD of linear nearest neighbor distance
      8) SE of linear nearest neighbor distance
      9) Significance test of NNI (t-test)
  K-order linear nearest neighbors: beyond nearest neighbor distances, 2nd nearest linear neighbor, 3rd nearest, etc. In CrimeStat can specify number of nearest linear neighbor indices to be calculated.
    Output includes order, starting with 1; mean linear nearest neighbor distance for each order (m); expected linear nearest neighbor distance for each order (m); and linear NNI for each order.
    Kth linear NNI is ratio of observed Kth linear nearest neighbor distance to the Kth linear mean random distance.
    Expected linear nearest neighbor distance is Ld(ran) = 0.5[L/(n - 1)], where L is total length of road and n is sample size, only adjusting for n_k, which occurs as degrees of freedom are lost for each successive order. Index is really the k-order linear nearest neighbor distance relative to the expected linear neighbor distance for the first order--it is not a strict NNI for orders above 1.
    (*These are notes from non-linear NNI; not sure if applicable here, too, but there are no other notes on these issues in linear NNI section.) CrimeStat has no test for significance (none has been developed) for Kth NNI since orders aren't independent.
    (*These are notes from non-linear NNI; not sure if applicable here, too, but there are no other notes on these issues in linear NNI section.) No restrictions on number of nearest neighbors that can be calculated, but since average distance increases with higher order nearest neighbors, bias from edge effects will increase. Orders no greater than 2.5% of points should be calculated (cites Cressie 1991 pg 613 for example).
    Example of interpreting results from higher order analyses--if one parameter shows clustering through fourth order, then tending toward more dispersed than random, then may indicate that there are small clusters of points, but that the clusters themselves are relatively dispersed; the more orders analyzed showing clustering, the more overall clustering across the entire study area.
    Linear k-order nearest neighbor distance different than non-linear (areal). Index slightly biased as denominator (k-order expected linear neighbor distance) is only approximated. Also, index measures distance as if the streets follow a true grid, oriented E/W and N/S, hence may not be realistic for places where streets traverse in diagonal patterns--in these cases, use of indirect distance measurement will produce greater distances than what actually may occur on the street network.
  Ripley's K Statistic (not for linear features--only areas)
    Index of non-randomness for different scale values (cites Ripley 1976, 1981; Bailey and Gattrell 1995; Venables and

Ripley 1997). "Super-order" NN statistic providing test of randomness for every distance from the smallest up to the size of the study area. Sometimes called reduced second moment measure, implying that it is meant to measure second order trends (i.e., local clustering vs. general pattern over region); however, also subject to first order effects so is not a strictly second order measure.
Consider spatially random distribution of N points. Circles of radius ds are drawn around each point, where s is the order of radii from smallest to largest, and the number of other points found within the circle are counted and then summed over all the points (allowing for duplication). The expected number of points within that radius is E(number of points within distance ds) = [N/A]K(ds), where N is sample size, A is total study area, and K(ds) is area of a circle defined by ds. For example, if area defined by particular radius is 1/4 the total study area, and if there is spatially random distribution, on average approximately 1/4 of the cases will fall within any one circle (+/- sampling error). More formally, with CSR, expected points within distance ds is
E(number under CSR) = [N/A] pi ds^2
If average number of points found within a circle of a particular radius placed over each point is greater than found in above equation (expected), then clustering is occurring; if it is less than found in above equation (expected), then dispersion.
K statistic similar to NND because it provides info about average distance between points, but more comprehensive than nearest neighbor distance stats for two reasons:
1) Applies to all orders cumulatively, not just a single order
2) Applies to all distances up to the limit of the study area because the count is conducted over successively increasing radii.
Under unconstrained condition, K is defined as K(ds) = [A/N^2] SUM_i SUM_j I(di,j) where I(di,j) is the number of other points, j, found within distance ds, summed over all points, i. So, circle of radius ds is placed over each point i, then number of other points j is counted. Circle is moved to next point i and process repeated, thus double summation points to the count of all j's for each i, over all i's. When done, radius of circle is increased and process is repeated. Typically radii of circle are increased in small increments so there are 50-100 intervals by which the statistic can be counted. In CrimeStat, 100 intervals (radii) are used based on ds = R/100 where R is the radius of a circle whose area is equal to the study area.
Can graph K(ds) against ds to see if there is clustering at certain distances or dispersion at others, but since this plot is non-linear (K grows roughly with ds^2), transform into square-root function L(ds) = SQRT[K(ds)/pi] - ds. In practice only the L statistic is used even though the name of the statistic is based on the K derivation.
L statistic prone to edge effects, i.e., for points located near the boundary of the study area, the number enumerated by any circle for those points will (all other things equal) be less than for points in the center of the study area because points outside the boundary aren't counted. The greater the distance between points tested (i.e., the greater the radius of the circle placed over each point), the greater the bias, thus a plot of L vs. distance will show decline as distance increases.
Ripley proposed edge adjustments:
"Guard rail" within study area so points outside the guardrail but inside the study area can be counted for center points (an enumerator) inside the guardrail, but cannot have own circle placed upon them (i.e., only a recipient, can only be j's and not i's). Must be done manually; must identify each point as either an enumerator and recipient or recipient only. Can be problematic if study area boundary is not a "regular" shape.
Venables and Ripley 1997: weighting to account for proportion of circle placed over each point within the study area. Thus K(ds) = [A/N^2] SUM_i SUM_j I(di,j) becomes K(ds) = [A/N^2] SUM_i SUM_j Wij^-1 I(di,j) where Wij^-1 is inverse of proportion of circle of radius ds placed over each point which is within the total study area; thus if point is near border, it will get greater weight because smaller proportion of circle placed over it will be within the study area. Again, has to be done manually and can be problematic if study area boundary is not a "regular" shape.
CrimeStat only calculates the unadjusted L and tells users to anticipate the bias by only examining L stat for small distances where bias is smallest (even though one could calculate 100 distance intervals).
Comparison to spatially random distribution: because sampling distribution of L statistic is not known, do 100 random distance simulations; for each simulation the L statistic is calculated for each distance interval; after all simulations have been conducted, highest/lowest L-values are taken for each interval, and this is called an "envelope." By comparing distribution of L to random envelope, one can assess if observed is different from chance.
Note: since no formal test of significance, comparison with envelope gives only approximate confidence about whether distribution differs from chance or not, i.e., one can't say likelihood of obtaining this result by chance is less than, e.g., 5%.

Spooner, P.G., Lunt, I.D., Okabe, A. and S. Shiode. 2004. Spatial analysis of roadside Acacia populations on a road network using the network K-function. Landscape Ecology 19:491-499.

Ripley's K-function (Ripley 1976, 1991) not appropriate for point patterns on road networks since k-function assumes infinite homogeneous environment for calculating Euclidean distances.
Network k-function for univariate analyses and network cross k-function for bivariate analyses more appropriate.
Used these methods to confirm significant clustering of Acacia populations at various scales and spatial patterns.
K-function has been used to study spatial patterns of mapped point data in plant ecology (cites a list).
K-function uses all point-to-point distances, not just nearest neighbor distances.
When k-function used for point patterns constrained by linear road networks, can overdetect clustering patterns, possibly leading to Type 1 errors.
Cites Forman 1999 ICOET article, which says there is a lack of spatial guides to analyze point patterns on road networks.
Credits Okabe and Yamada 2001 (The k-function method on a network and its computational implementation. Geographical Analysis 33:271-290) with developing k-function analysis of point patterns on a network.
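The overdetection point in the notes above can be illustrated with a small sketch. It computes the unadjusted planar K(ds) = [A/N^2] SUM_i SUM_j I(di,j) and L(ds) = SQRT[K(ds)/pi] - ds from the CrimeStat notes for two simulated patterns: points placed uniformly at random along a single straight road, and points placed uniformly over the whole area. The unit-square study area, the single road at y = 0.5, the sample size of 200, and the function names are all assumptions made for illustration; this is not CrimeStat's or SANET's code.

```python
import math
import random

def ripley_k(points, ds, area):
    """Unadjusted planar K(ds) = [A/N^2] * (ordered pairs i != j with d_ij <= ds)."""
    n = len(points)
    count = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= ds:
                count += 1
    return area * count / (n * n)

def l_stat(points, ds, area):
    """L(ds) = SQRT[K(ds)/pi] - ds; near 0 under CSR, > 0 suggests clustering."""
    return math.sqrt(ripley_k(points, ds, area) / math.pi) - ds

rng = random.Random(0)  # fixed seed so the sketch is repeatable
area = 1.0              # hypothetical unit-square study area

# Events uniformly random along one straight road (no clustering along the road).
road_points = [(rng.random(), 0.5) for _ in range(200)]
# Events uniformly random over the whole study area (planar CSR).
plane_points = [(rng.random(), rng.random()) for _ in range(200)]

l_road = l_stat(road_points, 0.1, area)
l_plane = l_stat(plane_points, 0.1, area)
print("L(0.1) on road:", round(l_road, 3))
print("L(0.1) on plane:", round(l_plane, 3))
```

In this setup the road pattern yields a clearly positive L even though the points are random along the road, which is the kind of spurious clustering signal the notes warn about when a planar statistic is applied to network-constrained points.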

Refers to k-function as "reduced second moment measure" to measure two-dimensional distribution pattern on infinite homogeneous plane, where circle of radius t is centered on each point and number of neighboring points within circle is counted. Can vary radius t scale; deviation of observed from expected number of points is plotted against t. Null hypothesis for k-function is complete spatial randomness (CSR), and if observed function deviates from a randomly generated (Poisson) point process, the null is rejected.
Univariate network k-function is a similar process but calculates the shortest path distance from each point to all other points on a finite connected planar network. Assumption of binomial point process based on hypothesis that points p (the set of points assumed on network) are uniformly and independently distributed over finite road network; thus if hypothesis rejected, points are spatially interacting and may form non-uniform patterns.
100 Monte Carlo simulations used to construct confidence "envelope" based on max and min values from an equivalent number of random coordinates for k(t), compared to k-hat(t), or observed. Any values of k-hat(t) that lie outside confidence envelope were considered significant deviation from CSR. If k-hat(t) > k(t), then points p are clustered; if k-hat(t) < k(t), then points p are tending toward regularity. Edge effects are taken into account with distance computations so no need for edge adjustment factor (Okabe & Yamada 2001).
Bivariate network k-function: two different kinds of points, A and B, are analyzed on network, with hypothesis of spatial interaction between different types of points. Statistical test for bivariate analysis similar to univariate network k-function, but present version of SANET used for network cross k-function analyses does not construct a confidence envelope; it can be theoretically obtained from the binomial distribution, approximated by normal distribution for large number of points. To check for statistical significance of observed from CSR, approximate 95% CI constructed using standard deviation of normal distribution, and max/min values of +/-1.65*SD using one-sided tests. If observed > expected and outside CI, then points A and B are significantly "attracted"; if observed < expected and outside CI, then points A and B are significantly repelled.
"Spatial point patterns were analyzed on a road network shapefile using SANET Version 1.0 021125 (Okabe et al. 2002, okabe.t.u-tokyo.ac.jp/okabelab/atsu/sanet/sanet-index.html), an ESRI Arcmap extension." First preprocessed all polylines to make sure they were properly connected to each other. SANET used first to calculate distances between all nodes on road network, then used to assign points to the nearest point on the road network. Network k- and cross k-function analyses were performed by SANET and output data were exported to Excel to aggregate data, calculate confidence intervals (for cross k-function analyses) and produce graphs.
Univariate addresses clustered vs. regular distributions; bivariate addresses whether two types of sets of points are attracted to or repulsed from each other.
Combination of graphical kernel (for visual) and network k-function was helpful, but must be realized that kernel estimations do not compensate for spatial differences in road networks and their effect on point patterns observed.
Final paragraph: possible applications of network k-function include animal movement patterns from survey and traffic mortality; authors envision network k-function becoming standard GIS application on networks.
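As a rough illustration of the univariate network k-function summarized above, the sketch below replaces Euclidean distance with shortest-path distance on a small road graph. It is a minimal sketch under stated assumptions, not SANET's implementation: events are assumed to be already snapped to distinct network nodes, the estimator is taken as k-hat(t) = [L_T/(n(n - 1))] * (ordered pairs of events within network distance t), where L_T is total network length, and the four-node road, the function names, and the event locations are hypothetical.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra shortest-path lengths from source.
    graph maps node -> {neighbor: edge length}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def network_k(graph, total_length, event_nodes, t_values):
    """Assumed estimator: k-hat(t) = L_T / (n(n-1)) * #{ordered pairs within t}.
    event_nodes must be distinct nodes (a simplification of this sketch)."""
    n = len(event_nodes)
    dists = {p: shortest_paths(graph, p) for p in event_nodes}
    result = []
    for t in t_values:
        pairs = sum(
            1
            for p in event_nodes
            for q in event_nodes
            if p != q and dists[p].get(q, float("inf")) <= t
        )
        result.append(total_length * pairs / (n * (n - 1)))
    return result

# Hypothetical straight road A-B-C-D with unit-length segments (L_T = 3).
road = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
k_hat = network_k(road, 3.0, ["A", "B", "D"], [1.0, 2.0, 3.0])
print(k_hat)
```

To mirror the Monte Carlo step in the notes, one would repeat the calculation for many sets of points placed uniformly at random along the network and take the per-distance minimum and maximum as the envelope; that step is omitted here because placing random points along edges requires splitting edges, which this node-only sketch does not do.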