Comparison of Crash Severity Risk Factors at Signalized and Stop-Controlled Intersections in Urban and Rural Areas in Alabama

Understanding the factors that affect crash severity at intersections is essential to develop strategies to alleviate safety deficiencies. This paper identifies and compares the significant factors affecting crash severity at signalized and stop-controlled intersections in urban and rural areas in Alabama using five years of recent crash data. A random forest model was used to rank variable importance and a binary logit model was applied to identify the significant factors at both intersection types in urban and rural areas. Four separate models (urban signalized, urban stop-controlled, rural signalized, and rural stop-controlled) were developed. New variables that were not previously explored were used in this study, such as the roadway type (one-way vs. two-way) and traffic control functioning (yes or no). It was found that one-way roadways were associated with a reduction in crash severity at urban signalized intersections. In all four models, rear-end crashes showed lower severity than side impacts. Head-on crashes, higher speed limits, and curved sections showed higher severity at urban signalized and stop-controlled intersections. At rural stop-controlled intersections, right-turning maneuvers were associated with a severity reduction. Female drivers showed 15% and 45% higher severity likelihood (compared to males) at urban and rural signalized intersections, respectively. Strategies to alleviate crash severity are proposed.


Intersection Crashes in Alabama
Intersections are designed to organize vehicle movements at approaches and to smooth turning maneuvers. Because at-grade intersections have several conflict points, they are critical locations on the roadway network and are prone to increased crash risk. Crashes at intersections are often severe because vehicles cross each other's paths at relatively high speeds. In 2014, 133,175 crashes in the state of Alabama were recorded in the Critical Analysis Reporting Environment (CARE) database maintained by the Center for Advanced Public Safety (CAPS) at the University of Alabama. Of these crashes, 74,714 (about 56%) were recorded as intersection-related. Of these intersection-related crashes, 12,785 (9.6% of all recorded crashes) occurred at signalized intersections, whereas 14,916 (11.2% of all recorded crashes) occurred at stop-controlled intersections. Driver behavior and the inherent geometric and traffic characteristics differ significantly between signalized and stop-controlled intersections; thus, crash severity needs to be analyzed separately for each intersection type.
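As a quick arithmetic check of the 2014 counts quoted above, the following sketch computes each share against both possible denominators (all recorded crashes and intersection-related crashes only). The constant and function names are illustrative and are not fields of the CARE database.

```python
# 2014 Alabama crash counts quoted in the text (CARE database).
TOTAL_CRASHES = 133_175
INTERSECTION_RELATED = 74_714
SIGNALIZED = 12_785
STOP_CONTROLLED = 14_916


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


shares = {
    "intersection_related_of_all": share(INTERSECTION_RELATED, TOTAL_CRASHES),
    "signalized_of_all": share(SIGNALIZED, TOTAL_CRASHES),
    "stop_controlled_of_all": share(STOP_CONTROLLED, TOTAL_CRASHES),
    "signalized_of_intersection": share(SIGNALIZED, INTERSECTION_RELATED),
    "stop_controlled_of_intersection": share(STOP_CONTROLLED, INTERSECTION_RELATED),
}

for name, value in shares.items():
    print(f"{name}: {value}%")
```

The 9.6% and 11.2% figures are reproduced when the denominator is all recorded crashes; relative to intersection-related crashes only, the same counts correspond to roughly 17% and 20%.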
Intersections are located in two main area types: urban and rural. Since the two area types have different geometric and traffic characteristics (for example, urban areas usually carry higher traffic volumes, and speed limits in rural areas are relatively higher than in urban areas), intersections in each area type need to be analyzed separately. Furthermore, the factors that affect crash severity at intersections in urban and rural areas can be different. Understanding the factors that affect crash severity at signalized and stop-controlled intersections in urban and rural areas is essential to develop more focused strategies to alleviate any safety deficiency identified. To the author's knowledge, no study has been conducted to date to compare the significant factors affecting crash severity at signalized and stop-controlled intersections in urban and rural areas. However, a few studies have analyzed crash frequencies and travel delays at both signalized and unsignalized intersections (see for example, Liu 1979).
The main objective of this study is to identify and compare the significant factors that affect crash severity at signalized and stop-controlled intersections in both rural and urban areas. The study uses five years of historical crashes in the state of Alabama, from 2010 to 2014. The factors explored in the study include roadway, geometric, traffic, vehicle, environmental, and driver characteristics, which are deemed essential to investigate and compare crash severity at both intersection types in urban and rural areas. Crash severity includes two levels: severe and non-severe. Severe crashes include fatalities and incapacitating (or serious) injuries. Non-severe crashes include property damage only (PDO), possible injuries, and non-incapacitating injuries. To identify and compare the significant risk factors of crash severity across intersection types, the binary logit (or logistic regression) model is applied. In addition, one of the common variable screening methods, the random forest, is used to rank the importance of the various independent variables explored.

Relevant Studies
Research has been performed to identify the factors that contribute to crash severity at signalized and unsignalized intersections separately. Researchers have adopted many statistical techniques to identify factors that affect crash severity. Among the various models, the binary logit model has been widely used for this purpose. For example, Wang and Abdel-Aty (2008) used this technique to analyze left-turn crash severity and concluded that injury severity was higher when left-turning vehicles collided with opposing through traffic. Driver attributes, vehicular characteristics, geometric design, and environmental factors were the significant variables that affected injury severity for left-turn crashes (Wang and Abdel-Aty 2008). Al-Ghamdi (2002) used the binary logit model to identify the factors that affect crash severity and concluded that location and crash type were the two key significant variables. Examples of other studies that used the binary logit model are Lee and Abdel-Aty (2009) and, most recently, Li et al. (2017).
Different independent variables have been considered when analyzing crash severity. According to Dissanayake and Lu (2002), the driver's use of alcohol or drugs, ejection from the vehicle in the crash, point of impact, existence of a curve or grade at the crash location, and vehicle speed increased the injury severity of young drivers. Savolainen and Mannering (2007) concluded that crashes were less severe under wet pavement conditions and near intersections because of the lower speeds maintained by riders in these situations. Kockelman and Kweon (2002) concluded that passenger cars are safer than pickups in single-vehicle crashes, while concluding the opposite for two-vehicle crashes. Duncan et al. (2002) identified that high speed, driving at night, female drivers, and driving under the influence increased the risk of severity. Nassar et al. (1994) developed severity models for bad weather conditions and concluded that drivers may be more attentive in inclement weather.
Focusing on signalized intersections, Huang et al. (2008) identified the factors that affected driver injury severity and vehicle damage. They concluded that severity was lower for crashes that occurred during peak periods and under good street lighting, and higher for crashes that occurred at night, at T-intersections, for vehicles traveling in the right-most lane, and at intersections equipped with red light running cameras. They also found that heavy vehicle drivers were less likely to be injured than two-wheeler riders because of the better crash resistance of heavy vehicles.
Studies that analyzed severity at unsignalized intersections include Wang and Qin (2002), Devlin et al. (2011), and Haleem and Abdel-Aty (2010). Wang and Qin (2002) studied driver mistakes that occurred at uncontrolled, stop-controlled, and signalized intersections. The authors also identified potential countermeasures and concluded that running stop signs, high speed, driving under the influence of alcohol or drugs, and poor visibility can increase crash severity. Devlin et al. (2011) concluded that running stop signs was the most common driver error that increased injury severity. Haleem and Abdel-Aty (2010) studied the factors that relate to crash severity at unsignalized intersections using traffic volume, number of through lanes, geometric factors, shoulder width, number of left-turn movements, number of left- and right-turn lanes on the major approach, and driver age.
As shown in the literature review above, no study has been conducted to compare the significant predictors of crash severity at signalized and stop-controlled intersections in both urban and rural areas. This study attempts to fill this gap while considering various independent predictors and relatively new variables not previously considered, such as the roadway type (one-way vs. two-way) and traffic control functioning (yes or no). The data and variables considered are described in detail in the next section.

Data and Variable Settings
Five years (2010 to 2014) of intersection-related crashes in the state of Alabama were extracted from the CARE database, which is maintained by CAPS. Next, only crashes occurring at signalized and stop-controlled intersections were retained; other intersection types, such as yield-controlled intersections and unknown intersection types, were excluded. Then, only crashes in urban and rural areas were retained, and observations with an unknown area type were excluded. The remaining crash observations were categorized into four separate datasets, representing crashes at urban signalized, urban stop-controlled, rural signalized, and rural stop-controlled intersections. Close to twenty independent variables were considered. These variables include driver characteristics (driver age, driver gender, and driver condition), roadway characteristics (roadway type, speed limit, roadway condition, and terrain), and environmental characteristics (e.g., weather and lighting conditions). This study makes use of relatively new variables not previously considered in severity studies, such as the roadway type (one-way vs. two-way) and whether the traffic control was functioning or not.
As previously indicated, four separate models (urban signalized, urban stop-controlled, rural signalized, and rural stop-controlled) were developed for comparing the significant factors contributing to severe crashes. Twenty independent variables were considered for inclusion in the two models for signalized and stop-controlled intersections in urban areas. For rural signalized intersections, 18 variables were considered. Finally, for rural stop-controlled intersections, 17 variables were considered. Descriptive statistics of the variables considered at signalized and stop-controlled intersections in urban and rural areas are shown in Tables 1 and 2, respectively. It should be noted that an additional traffic variable considered in the study is the annual average daily traffic (AADT), which is a continuous variable. The AADT at each intersection was calculated by summing the traffic volumes from both directions on the main road. As shown in Tables 1 and 2, the response (dependent) variable in the analysis was crash severity. In the data setting for modeling, each observation represents a unique crash. Crash severity includes two levels: severe and non-severe. Severe crashes include fatalities and incapacitating (or serious) injuries. Non-severe crashes include PDO, possible injuries, and non-incapacitating injuries.
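The data preparation described above (filtering by control and area type, coding the binary response, and splitting into four datasets) can be sketched as follows. This is a minimal illustration, not the study's actual processing: the field names (`area`, `control`, `severity`) and the label strings are hypothetical stand-ins for the corresponding CARE fields, and the sample records are invented.

```python
# Hypothetical severity labels; the two-level coding follows the text.
SEVERE_LEVELS = {"fatal", "incapacitating"}
NON_SEVERE_LEVELS = {"non_incapacitating", "possible", "pdo"}
AREAS = ("urban", "rural")
CONTROLS = ("signalized", "stop")


def code_severity(level):
    """Map a reported severity level to the binary response (1 = severe)."""
    if level in SEVERE_LEVELS:
        return 1
    if level in NON_SEVERE_LEVELS:
        return 0
    raise ValueError(f"unknown severity level: {level!r}")


def build_datasets(crashes):
    """Split crash records into the four area-by-control datasets."""
    datasets = {(area, control): [] for area in AREAS for control in CONTROLS}
    for crash in crashes:
        key = (crash["area"], crash["control"])
        if key not in datasets:
            continue  # e.g. yield-controlled or unknown area type: excluded
        datasets[key].append(dict(crash, severe=code_severity(crash["severity"])))
    return datasets


# Invented records, one crash per row.
sample = [
    {"area": "urban", "control": "signalized", "severity": "pdo"},
    {"area": "urban", "control": "stop", "severity": "fatal"},
    {"area": "rural", "control": "signalized", "severity": "possible"},
    {"area": "rural", "control": "stop", "severity": "incapacitating"},
    {"area": "rural", "control": "yield", "severity": "pdo"},
    {"area": "unknown", "control": "stop", "severity": "pdo"},
]
datasets = build_datasets(sample)
for key, rows in datasets.items():
    print(key, len(rows), [row["severe"] for row in rows])
```

Keeping one record per crash matches the statement that each observation represents a unique crash; the last two sample records are dropped because their control or area type falls outside the four modeled categories.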

Binary Logit (Logistic Regression) Model
The logistic regression (or logit) modeling approach was adopted in this paper. Logistic regression determines the relationship between a categorical dependent variable and one or more independent variables (categorical, continuous, or both). The response variable in logistic regression can be binary or dichotomous (Wang and Abdel-Aty 2008; Al-Ghamdi 2002). Therefore, the response usually takes the values "zero" and "one", representing "non-occurring" and "occurring" events, respectively (Chang and Yeh 2006). If the dependent variable is categorized into more than two levels, then the multinomial logistic regression model can be suggested (Shankar and Mannering 1996; Islam and Mannering 2006). In this study, since fatalities were relatively few at the four different intersection types, fatalities were aggregated with incapacitating injuries to represent severe injuries. The other response level was non-severe injuries. For this reason, the binary logit (logistic regression) model was applied in this study. The binary logistic regression equation takes the following form:

$$\pi(x) = \frac{\exp\big(g(x)\big)}{1 + \exp\big(g(x)\big)} \qquad (1)$$

where $\pi(x)$ = probability of a severe crash given $x$; $g(x)$ = crash severity formula as a function of the independent variables; $x$ = vector of the independent variables; and $\beta$ = vector of coefficients to be estimated.
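To make the logistic form concrete, the sketch below evaluates the probability of a severe crash for a linear severity function. It assumes that g(x) is linear in the coefficients (intercept plus a weighted sum of the variables), which is the usual specification but is stated here as an assumption; the coefficient values and the two indicator variables are hypothetical and are not estimates from this study.

```python
import math


def linear_predictor(intercept, coefficients, x):
    """g(x): intercept plus the weighted sum of the independent variables."""
    return intercept + sum(b * xi for b, xi in zip(coefficients, x))


def severe_probability(g):
    """Logistic transform of g(x), as in Equation 1."""
    return math.exp(g) / (1.0 + math.exp(g))


# Hypothetical coefficients for two indicators: [head-on, rear-end],
# with side impact as the base case (both indicators equal to zero).
intercept = -2.0
coefficients = [0.5, -1.5]

p_side = severe_probability(linear_predictor(intercept, coefficients, [0, 0]))
p_head_on = severe_probability(linear_predictor(intercept, coefficients, [1, 0]))
p_rear_end = severe_probability(linear_predictor(intercept, coefficients, [0, 1]))

print(f"side impact: {p_side:.3f}")
print(f"head-on:     {p_head_on:.3f}")
print(f"rear-end:    {p_rear_end:.3f}")
```

A positive coefficient raises the probability relative to the base case and a negative coefficient lowers it, which is the sign rule used when reading the model tables.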
The likelihood function $l(\beta)$ for the pairs $(x_i, y_i)$ is given by (Yasmin et al. 2013):

$$l(\beta) = \prod_{i=1}^{n} \pi(x_i)^{y_i}\,\big[1 - \pi(x_i)\big]^{1 - y_i} \qquad (2)$$

where $n$ = number of observations (each observation denotes the $i$th observed outcome), $x_i$ = observed value of the explanatory variables for observation $i$, and $y_i$ = response variable (crash severity) for observation $i$.
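In practice it is more convenient to work with the logarithm of the likelihood. Assuming independent crash observations and a binary response $y_i \in \{0, 1\}$ with fitted probability $\pi(x_i)$ of a severe crash, the log-likelihood and the residual deviance for ungrouped binary data take the standard forms below (a textbook result, not a quantity taken from the paper's tables):

```latex
\ln l(\beta) = \sum_{i=1}^{n} \Big[\, y_i \ln \pi(x_i) + (1 - y_i) \ln\big(1 - \pi(x_i)\big) \Big],
\qquad
D = -2 \ln l(\hat{\beta})
```

Maximizing $\ln l(\beta)$ over $\beta$ is therefore the same as minimizing the deviance $D$.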
It should be noted here that the best estimate of β was calculated by maximizing the likelihood function (equivalently, minimizing the deviance). In this study, the R software (R Package 2016) was used to obtain the final maximum likelihood estimates of the variables in the four models.
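The maximum likelihood step can be illustrated with a small Newton-Raphson routine for a model with an intercept and one indicator variable. This is a sketch of the general technique, not the routine used by R, and the toy counts (8 severe out of 40 side-impact crashes, 2 severe out of 40 rear-end crashes) are invented for illustration.

```python
import math


def fit_binary_logit(data, iterations=25):
    """Estimate (b0, b1) by Newton-Raphson on the binary logit log-likelihood.

    data is a list of (x, y) pairs with y in {0, 1}.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the negative Hessian
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += x * (y - p)
            h00 += w
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Invented toy data: x = 1 for rear-end, x = 0 for side impact (base case).
data = [(0, 1)] * 8 + [(0, 0)] * 32 + [(1, 1)] * 2 + [(1, 0)] * 38
b0, b1 = fit_binary_logit(data)
print(f"intercept = {b0:.4f}, rear-end coefficient = {b1:.4f}")
```

With a single indicator variable the estimates have a closed form (the log odds of the base group and the log of the ratio of the two groups' odds), which gives a convenient check on the iteration.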
As shown in Tables 1 and 2, several categorical variables were considered in this study. Each categorical variable includes two or more levels. For each variable, one level was used as the reference (or base case), and the other levels were estimated relative to the reference level when fitting the four binary logit models. The effect of each level of the independent variable on crash severity was measured by comparing the estimated coefficient (β) with the base case (Yasmin et al. 2013). If the estimated value is greater than zero, then this specific level is associated with a higher severe injury likelihood when compared to the base case. In addition to the estimated values, the odds ratio (OR), which is the exponential of the estimated coefficient, was used to interpret the implication of each estimate on crash severity. The OR ranges from zero to positive infinity. If the OR is less than one (< 1), then the level is associated with a lower likelihood of a severe crash than the base case. On the other hand, if the OR is greater than one (> 1), then the level is associated with a higher likelihood of a severe crash than the base case. To assess the goodness-of-fit of the fitted binary logit models, the Akaike information criterion (AIC) and the deviance estimates were used. The lower the AIC and residual deviance values, the better the model. The stepwise regression technique was used to select the variables retained in the models. Stepwise regression combines the backward deletion and forward addition methods when fitting a regression model. A 90% confidence level (or 10% significance level) was used to judge the inclusion of the variables in the final models.
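The interpretation rules above can be written as a few helper functions. This is a minimal sketch: the function names are illustrative, and the coefficient, log-likelihood values, and parameter counts in the example are hypothetical rather than results from the fitted models.

```python
import math


def odds_ratio(beta):
    """Odds ratio of a level relative to its base case."""
    return math.exp(beta)


def percent_change_in_odds(or_value):
    """Percentage change in the odds of a severe crash versus the base case."""
    return 100.0 * (or_value - 1.0)


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


# Hypothetical coefficient for one level of a categorical variable.
beta = -0.25
or_value = odds_ratio(beta)
direction = "lower" if or_value < 1.0 else "higher"
print(f"OR = {or_value:.3f}: {abs(percent_change_in_odds(or_value)):.1f}% "
      f"{direction} odds of a severe crash than the base case")

# Hypothetical comparison of two candidate models; the lower AIC is preferred.
aic_small = aic(log_likelihood=-1250.0, n_parameters=8)
aic_large = aic(log_likelihood=-1248.5, n_parameters=12)
print("preferred model:", "small" if aic_small < aic_large else "large")
```

Strictly, the OR describes a change in odds rather than in probability; the two are close when severe crashes are rare, which is why percentages derived from the OR are commonly read as changes in severity likelihood.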

Random Forest
A random forest model was used in this study to rank the importance of the variables. The random forest technique was run separately for each intersection type. This was performed prior to fitting the binary logit models to identify the crucial variables, and the resulting list of important variables was then compared with the final variables in each of the four binary logit models. Ho (1995) created the first random decision forest algorithm, which was later developed by Breiman (2001) and is considered one of the promising machine learning techniques for screening important variables (see for example, Abdel-Aty and Haleem 2011; Haleem and Gan 2006). In this technique, a number of trees are grown by randomly selecting observations from the original dataset with replacement and then searching over a randomly selected subset of variables at each split until the variable importance is ranked (Haleem and Gan 2006). Several studies have used the random forest to analyze binary response variables (see for example, Harb et al. 2009; Sparks 2009). To rank the variables, the Gini index was used in this study, which tests the homogeneity (or purity) of the nodes and leaves in the resulting random forest model. A higher Gini-based importance value indicates a relatively more important variable, and vice versa.
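The Gini criterion can be illustrated on a single split. The sketch below assumes the common convention in which a variable's importance accumulates the decrease in Gini impurity over the splits that use it; the paper does not state which implementation detail was applied, so this is an assumption, and the severity labels are invented.

```python
def gini_impurity(labels):
    """Gini impurity of a node holding binary labels (1 = severe)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)  # equals 1 - p**2 - (1 - p)**2


def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


# Invented node with 4 severe and 4 non-severe crashes.
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# Split on a weakly informative variable.
weak = gini_decrease(parent, left=[1, 1, 1, 0], right=[1, 0, 0, 0])
# Split on a variable that separates the two classes completely.
perfect = gini_decrease(parent, left=[1, 1, 1, 1], right=[0, 0, 0, 0])

print(f"weak split decrease:    {weak:.3f}")
print(f"perfect split decrease: {perfect:.3f}")
```

A variable whose splits produce purer child nodes yields larger decreases and therefore ranks higher, which is the sense in which the ranking reflects node homogeneity.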

Results and Discussion
The results of the binary logit model for analyzing crash severity at both signalized and stop-controlled intersections in urban and rural areas are shown in Tables 3 and 4, respectively. Table 3 shows the two models developed for urban signalized and stop-controlled intersections, while Table 4 shows the two models developed for rural signalized and stop-controlled intersections. Only the significant variables are shown in the final models in both tables. The goodness-of-fit statistics (i.e., the AIC and the residual and null deviance estimates) are shown for all four models.

Interpretation of the Binary Logit Models at Urban Intersections
As shown in Table 3 for the urban signalized intersections model, severity is less prevalent during weekends compared to weekdays (β = -0.142, OR = 0.867). A possible interpretation is that drivers are usually more attentive and relatively relaxed on weekends (compared to weekdays), especially given that drivers usually go out for enjoyment with their families on weekends. Having other family members in the same vehicle can explain the reduced severity on weekends. The OR is 0.867, implying that the severe injury likelihood is 13.3% lower on weekends compared to weekdays. For the urban stop-controlled intersections model, running a stop sign (β = 0.756, OR = 2.129) is associated with a higher severity likelihood compared to failing to yield at the intersection. These results are consistent with Devlin et al. (2011). Running a stop sign can cause an angle collision with other vehicles, which could increase the likelihood of severe injury for the drivers and accompanying passengers. Rear-end and head-on crash types were significant in both the signalized and stop-controlled intersection models. At signalized intersections, rear-end crashes (β = -1.507, OR = 0.221) showed lower injury severity than side impact crashes. This might be because rear-end crashes usually occur in congested situations, where vehicle speeds are relatively low.
On the other hand, head-on crashes (β = 0.511, OR = 1.668) showed a higher likelihood of severe injuries than side impact crashes. These findings are consistent with the studies by Yasmin et al. (2013) and Al-Ghamdi (2002). Head-on crashes are usually more severe than other crash types due to the higher force of impact. Rear-end and head-on crashes showed similar severity patterns at stop-controlled intersections; however, head-on crashes at stop-controlled intersections were associated with a much higher severity likelihood than head-on crashes at signalized intersections.
At signalized intersections, cloudy weather was associated with a lower severity likelihood (β = -0.17, OR = 0.844) compared to clear weather. This is intuitive since drivers tend to drive more attentively in relatively inclement weather than in clear weather conditions. Concurring with previous studies, female drivers were associated with higher severity (β = 0.143, OR = 1.154) compared to their male counterparts. This may be because female drivers are, on average, physically more vulnerable to severe injury in a crash. This result is consistent with the finding of Yasmin et al. (2013).
At signalized intersections, all driver condition levels (driving under the influence "DUI", depressed/ill, and physical impairment) were associated with an increase in severity likelihood compared to normal driver conditions. These results are consistent with the finding of Devlin et al. (2011). The highest severity increase was for DUI (OR = 2.924), followed by depressed/ill (OR = 2.784), and finally physical impairment (OR = 2.489). This shows the negative impact on severity of driving under the influence, being depressed or ill, or being physically impaired.
Regarding the vehicle maneuver, turning left at signalized intersections (β = 0.546, OR = 1.726) showed a higher severity likelihood than stopping or slowing (base case). On the other hand, turning right was associated with a crash severity reduction at both signalized and stop-controlled intersections. This is anticipated since fewer conflict points exist when turning right than when turning left at an intersection. However, the severity reduction is more noticeable at stop-controlled intersections. These findings are consistent with Wang and Abdel-Aty (2008) and Al-Ghamdi (2002). At both signalized and stop-controlled intersections, speed limits above 45 mph and up to 55 mph (which inherently imply higher vehicle speeds) were associated with a higher severity likelihood compared to speed limits less than 45 mph. Crashes at relatively higher speeds are most often more severe (Devlin et al. 2011). Supporting this finding, at stop-controlled intersections, an increase in traffic volume was associated with a reduction in injury severity likelihood due to the relatively lower speeds at higher traffic volumes. This result is consistent with the finding of Al-Ghamdi (2002).
At signalized intersections, concrete roadways were associated with 77.4% lower severity compared to asphalt roadways (OR = 0.226). This may be due to the increased friction and relatively lower speeds when driving on concrete roads. This was also found in the study by Haleem and Abdel-Aty (2010). Interestingly, a lower crash severity likelihood was found when the traffic signal was not functioning (OR = 0.242) compared to functioning traffic signals. A possible explanation is that drivers are more cautious and attentive when the traffic control stops functioning. In some situations, when the traffic signal is not functioning, police officers direct traffic at the affected signalized intersections, which can also explain the reduction in severity.
Compared to intersections located on straight and level terrain (base case), intersections located on straight-at-hillcrest terrain experienced a higher severity likelihood at both signalized and stop-controlled intersections. However, the odds of severity were higher at stop-controlled intersections (OR = 1.472) than at signalized intersections (OR = 1.209). Curved terrain experienced a higher severity likelihood than straight and level terrain at stop-controlled intersections. Stop-controlled intersections in urban areas experienced higher odds of severity on curved terrain (OR = 2.071) than on straight-at-hillcrest terrain (OR = 1.472). This highlights the adverse safety impact of curved terrain at stop-controlled intersection approaches.
Signalized intersections located on two-lane (per direction) roadways experienced 25.8% less severity (OR = 0.742) compared to intersections located on roadways with three or more lanes per direction, which is consistent with the finding of Al-Ghamdi (2002). A possible interpretation is that intersections located on relatively wider roadways can increase driver confusion, especially when undertaking turning maneuvers. In addition, signalized intersections located on one-way roadways experienced 43.1% less severity likelihood (OR = 0.569) than signalized intersections located on two-way roadways. This can be due to the reduction in conflict points on one-way roadways as opposed to two-way roadways.
Crashes involving motorcycles at signalized intersections were associated with 378% higher severity likelihood (OR = 4.78) compared to crashes involving passenger cars. This finding is consistent with that of Savolainen and Mannering (2007). This can be explained by the fact that motorcyclists have no protection in a crash, unlike occupants inside a vehicle. When comparing the goodness-of-fit of both models, it can be observed that the urban stop-controlled intersections model outperformed the urban signalized intersections model (lower AIC and residual deviance estimates).
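The coefficients, odds ratios, and percentages quoted in this subsection can be cross-checked with a short script. The values below are copied from the text for the urban models; the labels and the tolerance are illustrative choices, and small gaps are expected because the reported coefficients are rounded to three decimals.

```python
import math

# (label, reported beta, reported OR) as quoted in the text.
REPORTED_PAIRS = [
    ("weekend, urban signalized", -0.142, 0.867),
    ("running stop sign, urban stop-controlled", 0.756, 2.129),
    ("rear-end, urban signalized", -1.507, 0.221),
    ("head-on, urban signalized", 0.511, 1.668),
    ("cloudy weather, urban signalized", -0.17, 0.844),
    ("female driver, urban signalized", 0.143, 1.154),
    ("left turn, urban signalized", 0.546, 1.726),
]

# (label, reported OR, reported percentage change) as quoted in the text.
REPORTED_PERCENTAGES = [
    ("weekend", 0.867, -13.3),
    ("concrete roadway", 0.226, -77.4),
    ("two-lane roadway", 0.742, -25.8),
    ("one-way roadway", 0.569, -43.1),
    ("motorcycle", 4.78, 378.0),
]

or_gaps = {label: abs(math.exp(beta) - reported_or)
           for label, beta, reported_or in REPORTED_PAIRS}
pct_gaps = {label: abs(100.0 * (reported_or - 1.0) - pct)
            for label, reported_or, pct in REPORTED_PERCENTAGES}

for label, gap in or_gaps.items():
    print(f"{label}: |exp(beta) - OR| = {gap:.4f}")
for label, gap in pct_gaps.items():
    print(f"{label}: percentage gap = {gap:.2f}")
```

All reported odds ratios agree with the exponential of the corresponding coefficient to within rounding, and each quoted percentage equals 100 × (OR − 1).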

Random Forest Analysis for Variable Screening
Another modeling approach used in this study was the random forest technique for variable screening and ranking. Using the R software, a random forest model was fitted for each of the four intersection types. It was found that the top variables in the ranking list matched the final list of variables in the four models. This supports the significance of the variables retained in each model in affecting severity at the analyzed intersections. A sample random forest output for stop-controlled intersections in rural areas is shown in Figure 1. This figure shows the top seven variables ranked by the Gini index criterion. Among these top seven variables, five were found in the binary logit model. These variables were the natural logarithm of AADT, vehicle type, vehicle maneuver, weather condition, and turn lanes.

Comparison of Crash Severity Risk Factors at Intersections in Urban and Rural Areas
This section compares the common significant variables resulting from the binary logit models at intersections in urban and rural areas. Table 5 shows the common significant variables. As shown in Table 5, there were seven common variables for at least two intersection types. These common variables were manner of crash, vehicle maneuver, posted speed limit, roadway terrain, driver gender, weather condition, and traffic volume. The only variable level that was found significant in all four models was rear-end crashes, which were associated with a severity reduction at all four intersection types. The highest reduction was at rural stop-controlled intersections, followed by urban stop-controlled intersections, then urban signalized intersections, and finally rural signalized intersections. In general, rear-end crashes had a higher reduction at stop-controlled intersections than at signalized intersections. This can be attributed to the fact that signalized intersections are characterized by "stop-and-go" traffic, which might cause more rear-end crashes than at stop-controlled intersections. On the other hand, head-on crashes were associated with higher severity, especially at stop-controlled intersections.
As expected, the right-turning maneuver showed a lower severity likelihood, especially at stop-controlled intersections. Higher speed limit roadways (> 45 mph and ≤ 55 mph) and straight-at-hillcrest terrain near urban stop-controlled intersections experienced higher severity compared to urban signalized intersections. This might be attributed to sight distance problems near stop-controlled intersections.
Female drivers showed 15% (OR = 1.154) and 45% (OR = 1.453) higher severity likelihood (compared to their male counterparts) at urban and rural signalized intersections, respectively. This implies that rural intersections are more hazardous locations for females than urban intersections, possibly due to the relatively higher speeds in rural areas.
Cloudy weather conditions were associated with a lower severity likelihood at both urban signalized and rural stop-controlled intersections. A higher reduction was found at rural stop-controlled intersections (23.7% reduction) compared to urban signalized intersections (15.6% reduction). The increase in traffic volume at urban and rural stop-controlled intersections was associated with a reduction in injury severity likelihood.
Figure 2 shows the impact of female drivers and rear-end crash types on crash severity at both urban and rural signalized intersections. This figure shows the percentage of severe crashes (including both fatalities and severe injuries) at each intersection type. As shown, female drivers and rear-end crashes were associated with higher severity in rural areas compared to urban areas. These results concur with the OR interpretation in Table 5, where the ORs for females and rear-end crashes were higher in rural areas than in urban areas.

Conclusions
This study identified and compared the factors that affect crash severity at signalized and stop-controlled intersections in both urban and rural areas. The study used relatively new variables not previously considered in severity studies, such as the roadway type (one-way vs. two-way) and whether the traffic control was functioning or not, in addition to a comprehensive list of variables related to roadway geometric characteristics, traffic characteristics, vehicle characteristics, driver characteristics, and environmental conditions. Four separate binary logit models were fitted for urban signalized, urban stop-controlled, rural signalized, and rural stop-controlled intersections. Several significant predictors were found in the final models. In addition to the binary logit model, the random forest technique was used to rank the importance of the variables. It was concluded that the binary logit model and random forest technique had several common significant variables.
In all four models, rear-end crashes experienced lower severity than side impact crashes. Head-on crashes, higher speed limits, and curved sections showed higher severity at both urban signalized and stop-controlled intersections. At rural stop-controlled intersections, cloudy weather and the right-turning maneuver were associated with a severity reduction. Female drivers showed 15% and 45% higher severity likelihood (compared to their male counterparts) at urban and rural signalized intersections, respectively. There were seven common variables for at least two intersection types. These common variables were manner of crash, vehicle maneuver, posted speed limit, roadway terrain, driver gender, weather condition, and traffic volume. It was concluded that cloudy weather was associated with a lower severity likelihood at both urban signalized and rural stop-controlled intersections. However, a higher reduction was found at rural stop-controlled intersections compared to urban signalized intersections. Additionally, the increase in traffic volume at urban and rural stop-controlled intersections was associated with a reduction in injury severity likelihood.
Based on these findings, some recommendations can be proposed to alleviate injury severity at intersections. Since the left-turning maneuver showed a higher injury severity likelihood at urban signalized intersections, providing sufficient sight distance and a protected left-turn phase (with no permitted phase) in busy areas can be suggested. Since urban stop-controlled intersections located on curved terrain showed higher severity, providing warning signs upstream of the intersection is suggested to alert approaching drivers. In addition, enforcement and education countermeasures can be suggested. For example, speed enforcement by law enforcement officers in urban areas is essential since higher speed limits were associated with an increased injury severity likelihood. Education programs can target female drivers to inform them of the increased severity likelihood at rural intersections. Education programs can also be proposed to educate drivers on how to react when approaching curved terrain and hillcrests.
Future research can consider the minor-approach traffic volume at urban and rural intersections when investigating crash severity. In this study, the minor-approach AADT was unavailable since the minor approaches were mostly located on local (non-state) roads. Considering both major and minor traffic volumes for intersections might better explain the significance of some of the variables.

Figure 1. Sample Variable Ranking Using Random Forest Technique at Rural Stop-Controlled Intersections

Figure 2. Severity of Female Drivers and Rear-end Crash Type at Urban and Rural Signalized Intersections

Table 1. Descriptive Statistics of Variables at Intersections in Urban Areas

Table 3. Binary Logit Model at Urban Signalized and Stop-Controlled Intersections

Table 5. Common Significant Variables at Intersections in Urban and Rural Areas