Arpit Shah

Determining Accident-prone Road Sections across Time & Space

Updated: Aug 29

I have fond memories of playing the Hot & Cold game as a kid. Perhaps you do, too. Hiding a tiny object somewhere in the house and asking your siblings and friends to find it, within a stipulated time. Upon being asked, one has to give a verbal clue - 'Hot', 'Very Hot', 'Very Cold', 'Not Hot-Not Cold' and other iterations to inform the seeker the status of his/her 'proximity' to the hidden object - hot implying near and cold implying distant. The joy of hiding the object securely or spotting it in quick time was immense. Funnily enough, there were frequent disagreements if the seeker found the (subjective) call of proximity to be inaccurate or misleading!


Spatial proximity in Cricket
Figure 1: The degree of spatial proximity plays an important role in determining the outcome of a sporting incident or in the effectiveness of an application involving magnetism.
Maps link attribute information to positional information
Figure 2: Maps 'link' attribute information to a position giving us a very direct, visual representation of spatial phenomena.
 

We will use a similar Hot and Cold technique, albeit statistical in nature, to understand Vehicle Accident trends in a particular US county spatially and spatiotemporally on a modern mapping platform - Geographic Information System, commonly known as GIS.


Spatiotemporal method of analysis is a very powerful way of making sense of geodata in that it adds a whole new dimension (time & space together) to data visualization. While we'll get to that in due course of this article, you may want to refer to examples of unidimensional workflows from the following video links - spatial & temporal.


For our vehicle accidents case, we will use the Hot Spot Analysis tool & the Space-Time Cube tool. I recommend watching the two videos below to get a clear understanding of the methodology involved.


Video 1: Hot Spot Analysis in GIS


Video 2: Space-Time Cube in GIS

Interesting, isn't it?

 

Applying Location Analytics to Vehicle Accidents Records


I have attempted to explain this topic in detail. You may choose to see the video compilation below if you are keener to see clips of the technology at work than to read about the concept involved. (I'd recommend reading the article first and then viewing the video if you'd like to explore further.)


Video 3: Walkthrough on deploying Location Analytics on Vehicle Accidents Records

Let's begin.


At first, we will load more than 1 lakh (100,000) vehicle accident records from 2010 to 2015 onto the mapping platform. Aside from the positional information, i.e. the exact coordinates of the accidents, several attributes of each accident are available, such as the date and time of the accident, the number of fatalities and injuries if any, whether the driver was under the influence of alcohol or was distracted at the time of the accident, and the weather conditions at the time of the accident.


These attributes appear very standard in nature; one expects such records to be captured at every major urban centre by the local law enforcement agencies.


Figure 3: Vehicle Crashes attribute table

Figure 4: Vehicle Crashes attribute table contd.
 

Because the data we possess contains positional information i.e. the coordinates, we can plot it on a mapping platform - Esri's ArcGIS (Geographic Information System).

Figure 5: Plotting vehicle crashes on GIS

Alongside this accident data, we have another crucial piece of information stored as a separate information layer - the digitized road network of the Area of Interest (AoI).


Again, this is a standard piece of information expected to be available with law enforcement agencies worldwide.

Digitized Road Network
Figure 6: Road Network Layer
 

The next step involves restructuring the data into a Space-Time Cube. In simple words, just as we format plain data into a pivot table in Microsoft Excel to lend a more meaningful structure to information, the GIS software arranges the multiple, complex data points into individual spatiotemporal buckets or Bins using the Space-Time Cube.


Figure 7: Space-Time Cube in Excel

Space Time Cube in Geoprocessing Tools
Figure 8: Space Time Cube in Geoprocessing Tools (ArcGIS Pro)

Each individual bucket of information (Bin) in the space-time cube will aggregate information pertaining to 2 miles of the territory (spatial) across 16 weeks i.e. 4 months of data (temporal).
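To make the idea of a Bin concrete, here is a minimal sketch in plain Python of how crash points could be grouped into space-time buckets. It is not the ArcGIS implementation: it assumes square cells (the tool's output in this article uses hexagons), coordinates already expressed in miles on a flat plane, and a handful of made-up crash records. Only the 2-mile and 16-week sizes come from the article.

```python
from collections import Counter
from datetime import date

BIN_SIZE_MILES = 2.0     # spatial size of a bin, from the article
BIN_SPAN_DAYS = 16 * 7   # temporal span of a bin (16 weeks), from the article


def bin_key(x_miles, y_miles, when, start):
    """Return (column, row, time_step) of the bin that contains one crash."""
    col = int(x_miles // BIN_SIZE_MILES)
    row = int(y_miles // BIN_SIZE_MILES)
    step = (when - start).days // BIN_SPAN_DAYS
    return col, row, step


def build_cube(crashes, start):
    """Count crashes per space-time bin (a sparse stand-in for the cube)."""
    return Counter(bin_key(x, y, when, start) for x, y, when in crashes)


# Hypothetical crashes: (x in miles, y in miles, date of the crash).
crashes = [
    (0.5, 0.7, date(2010, 1, 5)),
    (1.9, 1.2, date(2010, 2, 20)),
    (0.3, 0.1, date(2010, 6, 1)),
    (4.1, 0.2, date(2010, 1, 9)),
]
cube = build_cube(crashes, start=date(2010, 1, 1))
for key, count in sorted(cube.items()):
    print(key, count)
```

The first two crashes land in the same bin because they fall within the same 2-mile cell and the same 16-week step; the third shares the cell but belongs to the next time step.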




















Unlike a pivot table in MS Excel, the space-time cube is not visualized in the software. Instead, an output summary is available for us to review, while the output file is stored in the system and used as an input in the subsequent workflow, as we shall observe later on.

Figure 8: Space Time Cube Output Summary

The single paragraph above gives us a good summary about our geodata and how it has been arranged spatiotemporally in the mapping platform.

 

The space-time cube forms an integral part of our next workflow - the Emerging Hot Spot analysis.

Figure 9: Emerging Hot Spot Analysis Geoprocessing tool in ArcGIS Pro

Emerging Hot Spot Analysis does not represent the density of the accidents; rather, it captures the 'trend' of the accidents in each spatial area over a period of time and categorizes it by its statistical significance, i.e. how unlikely it is that the observed pattern arose by chance.
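A trend over time steps is commonly tested with the Mann-Kendall statistic, which Esri's documentation describes as part of Emerging Hot Spot Analysis. The sketch below is a simplified, stand-alone version for a single bin's time series: it assumes no correction for tied values, and the two example series are invented for illustration.

```python
import math


def mann_kendall_z(series):
    """Return the Mann-Kendall z-score of a time-ordered series (no tie correction)."""
    n = len(series)
    s = 0
    # Compare every earlier value with every later value: +1 if it rose, -1 if it fell.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


# Hypothetical crash counts for one bin across six consecutive time steps.
rising = [3, 4, 6, 7, 9, 12]
noisy = [5, 3, 6, 4, 5, 4]
print(mann_kendall_z(rising), mann_kendall_z(noisy))
```

A z-score beyond roughly ±1.96 indicates a trend that is significant at the 95% level; the steadily rising series clears that bar while the noisy one does not.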


At first, we will deploy Emerging Hot Spot Analysis on just the 'Count' of Vehicle Accidents over a single neighborhood time-step (technical note captured in the image below).










Neighborhood-based Statistics in ArcGIS Pro
Figure 10: Neighborhood-based Statistics. Source: ArcGIS Pro Community
 

The output of the Emerging Hot Spot Analysis is depicted below. Read the map legend on the left to understand the hexagon symbology-

Emerging Hot-Spot Analysis Output ArcGIS Pro
Figure 11: Emerging Hot-Spot Analysis Output

The Emerging Hot Spot output table below indicates to us that there are 2 new hot spots, 17 consecutive hot spots, 59 sporadic hot spots and 13 oscillating hot spots in our study area.

Emerging Hot-Spot Output Table ArcGIS Pro
Figure 12: Emerging Hot-Spot Output Table

To know more about what each hexagon pattern means, you may read the infographic below -

Emerging Hot-Spot Output Patterns explained
Figure 13: Emerging Hot-Spot Output Patterns explained; Source: ArcGIS Pro Community

The pattern most commonly found in our Emerging Hot Spot output is the Sporadic Hot Spot. This means that the spatial bin under observation repeatedly switches between being a statistically significant hot spot and not being one across the time steps.


One can presume that, given the context of our topic, the New, Persistent & Intensifying hot spots are the ones which would capture the immediate attention of law enforcement agencies.

 

Some of you would have noticed that our Emerging Hot Spot Analysis did not factor in the Road Network layer available with us. That is true: the 2 mile spatial distance within each Bin is Euclidean, i.e. based on straight-line computation, and does not measure distance along the existing Road Network. Factoring in the Road Network would lead to a more accurate analysis and improve our interpretation of it.


After all, if I were to ask you how much distance can you travel by car in 45-50 minutes, which representation would be more accurate - The Euclidean one on your left or the Road Network factored one on your right in the depiction below?


Euclidean Distance output in ArcGIS online
Figure 14: Euclidean Distance output in ArcGIS Online

Drive Time output in ArcGIS online
Figure 15: Drive Time output in ArcGIS Online

The depiction on your right is more accurate as it mimics the real-world scenario more closely. One can travel only as far in 45 minutes as the existing Road Network allows.
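The gap between the two measures is easy to reproduce on a toy example. The sketch below compares the straight-line distance between two junctions with the shortest distance along roads, computed with Dijkstra's algorithm. The junction names, coordinates and road list are hypothetical, and distance stands in for the drive time used in the figures.

```python
import heapq
import math


def network_distance(graph, source, target):
    """Shortest road distance between two junctions using Dijkstra's algorithm."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf  # target cannot be reached by road


# Hypothetical junctions (x, y in miles) and two-way road segments.
coords = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}
roads = [("A", "B"), ("B", "C")]

graph = {}
for u, v in roads:
    length = math.dist(coords[u], coords[v])
    graph.setdefault(u, []).append((v, length))
    graph.setdefault(v, []).append((u, length))

straight = math.dist(coords["A"], coords["C"])
by_road = network_distance(graph, "A", "C")
print(straight, by_road)
```

With no direct road from A to C, the road distance is longer than the straight line, which is why a Euclidean buffer overstates how far a driver can actually get.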

 

So how can we analyze Vehicle Accident spots factoring in the Road Network?


Before we proceed, we need to pre-process our geodata as certain anomalies are present. Observe from the image below that some of the accident locations (red dots) do not fall directly on roads; rather, they lie outside the road boundaries. It could be that the location recorded is where the vehicle came to rest after the accident and not where the accident actually occurred. Or it could be a case of mistaken record-taking, faulty GPS calibration etc.

Figure 16: Anomalies in Vehicle Crashes depiction on a Map

Corrective measure for anomaly in Figure 16 - Snap Geoprocessing Tool in ArcGIS Pro
Figure 17: Corrective measure for anomaly in Figure 16 - Snap Geoprocessing Tool

To correct this, we use the Snap tool in GIS wherein we command the mapping platform to link any accident data points within 0.25 miles of the road network to the nearest road.


This leads to a shift of the outlier accident spots to within the road boundary.
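The geometry behind snapping can be sketched in a few lines: project the point onto each road segment, keep the closest projection, and move the point only if that projection lies within the tolerance. This is an illustration rather than the Snap tool itself; it assumes planar coordinates in miles and a single made-up road segment, with only the 0.25-mile tolerance taken from the article.

```python
import math

SNAP_TOLERANCE_MILES = 0.25  # snapping distance, from the article


def nearest_point_on_segment(p, a, b):
    """Closest point to p on the straight segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment: both ends coincide
    # Fraction along the segment of the perpendicular projection, clamped to the ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap(point, segments, tolerance=SNAP_TOLERANCE_MILES):
    """Move point onto the closest road segment if it lies within tolerance."""
    best_point, best_dist = point, math.inf
    for a, b in segments:
        candidate = nearest_point_on_segment(point, a, b)
        dist = math.dist(point, candidate)
        if dist < best_dist:
            best_point, best_dist = candidate, dist
    return best_point if best_dist <= tolerance else point


# Hypothetical road running east for 2 miles, and two recorded crash locations.
road = [((0.0, 0.0), (2.0, 0.0))]
print(snap((1.0, 0.1), road))  # 0.1 miles off the road: snapped onto it
print(snap((1.0, 0.5), road))  # 0.5 miles off the road: left where it is
```

Points beyond the tolerance are deliberately left untouched, which matches the article's later remark that virtually all, rather than all, accident spots end up on the network.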







The revised output (below) corrects the anomaly - now virtually all the accident spots are located within the Road Network...


Figure 18: Corrected Map output after running the Snap tool

Spatial Join Geoprocessing Tool Parameters in ArcGIS Pro
Figure 19: Spatial Join Geoprocessing Tool Parameters

… which therefore allows us to integrate the two layers - Accident Spots and Road Network seamlessly, by using the Spatial Join tool.










Figure 20: Vehicle Crash Geodataset linked to Road Network as depicted in the popup over a road intersection

Now, the Accident geo-data appears to be properly structured and directly linked to the Road Network.


We are ready to do another Hot Spot Analysis now...

 

Or are we?...


'Calculating' Crash Rate per mile per year in ArcGIS Pro
Figure 21: 'Calculating' Crash Rate per mile per year - Parameters in ArcGIS Pro Calculate Field Geoprocessing Tool

Unfortunately, no. Longer roads will have more accidents assigned to them, and the hot spot output will be biased towards longer roads as a result. This isn't correct and will hamper the quality of our interpretation.


To correct for this inherent bias in the geo-data, we will first compute the 'Crash Rate per mile, per year'.














The Crash Rate is now decoupled from the length of the road. The newly computed data column Crash_Rate is added to the extreme right in the attribute table below.
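The normalization itself is simple arithmetic: divide the number of crashes on a road segment by its length and by the number of years covered. The sketch below assumes this plain formula (the exact Calculate Field expression is only visible in Figure 21) and uses two made-up road segments over a 6-year span.

```python
def crash_rate(crash_count, length_miles, years):
    """Crashes per mile per year for one road segment."""
    if length_miles <= 0 or years <= 0:
        raise ValueError("length_miles and years must be positive")
    return crash_count / (length_miles * years)


# Hypothetical segments: a long road with many crashes, a short road with fewer.
long_road = crash_rate(crash_count=60, length_miles=5.0, years=6)
short_road = crash_rate(crash_count=18, length_miles=0.5, years=6)
print(long_road, short_road)
```

Although the long road has more than three times as many crashes, the short road turns out to be the more dangerous one per mile, which is exactly the bias the raw counts would have hidden.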

Crash rate depicted per year basis
Figure 22: Crash rate depicted on a per mile, per year basis
 

Now we are ready to perform the Hot Spot Analysis. This time we will not use the Emerging Hot Spot Analysis Tool; instead, we will use the Hot Spot Analysis (Getis-Ord Gi*) Tool as we want our analysis to capture the spatial relationships within the road network as well.


To explain it simply, we want to assign weights not just based on the recorded location of the vehicle after the accident but also to the entire section of the road where the accident sequence would have played out (driver spotting a person / vehicle on the road ---> hitting the brakes ----> hitting the person / another vehicle ----> vehicle eventually halting).


Figure 23: Getis-Ord Gi* Hot Spot Analysis Geoprocessing tool

Figure 24: Getis-Ord Gi* Hot Spot Analysis Geoprocessing tool parameters

The technical note for this tool reads - "To keep the crash hot spots local, the Impedance Distance Cutoff parameter was set to 360 feet (about the length of a football field), which is the minimum stopping sight distance for a vehicle traveling 45 mph."
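For readers curious about what the Gi* statistic computes, here is a compact sketch of the standard formula with binary weights: each feature's neighbourhood total is compared with what the overall mean would predict, and the result is expressed as a z-score. To keep it short, the sketch assumes all segments lie along one straight road so that distance along the road is just the difference in position; the positions and crash rates are invented, and only the 360-foot cutoff comes from the quoted note.

```python
import math

CUTOFF_FEET = 360.0  # Impedance Distance Cutoff quoted in the article


def gi_star(values, positions, cutoff=CUTOFF_FEET):
    """Getis-Ord Gi* z-scores using binary weights (1 within cutoff, else 0)."""
    n = len(values)
    mean = sum(values) / n
    spread = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    scores = []
    for i in range(n):
        # The star in Gi* means feature i counts as its own neighbour (distance 0).
        weights = [1.0 if abs(positions[i] - positions[j]) <= cutoff else 0.0
                   for j in range(n)]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * w_sum
        denominator = spread * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator if denominator else 0.0)
    return scores


# Hypothetical segment midpoints (feet along one road) and their crash rates.
positions = [0, 300, 600, 900, 1200, 1500, 1800, 2100]
rates = [1, 1, 1, 9, 10, 9, 1, 1]
scores = gi_star(rates, positions)
for position, score in zip(positions, scores):
    print(position, round(score, 2))
```

The segment in the middle of the high-rate cluster gets the largest z-score, while isolated low-rate segments score below zero. A small cutoff keeps each neighbourhood local, which is the point the technical note makes.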


In case you are interested, you may read the detailed concept note here.















Now that we've run the Hot Spot Analysis (Getis-Ord Gi*) Tool, you may see a cross-section of its output below-

Figure 25: Output of Getis-Ord Gi* Hot Spot Analysis Geoprocessing tool

Now, the hot spots are aligned with the road network (do not appear as hexagons as they did previously), allowing for more meaningful interpretation.

 

Next, we'll deep dive further and analyze hot spots for specific variables beginning with only those vehicle accidents which led to fatalities. We will use the same workflow as above, just the geodata is filtered to capture only those accidents which led to fatalities.


The Fatality hot spot output is naturally different from the All Accidents hot spot output.


The GIS platform allows us to compare both the outputs visually. See the All Accidents Hot Spot output (Left) v/s Accidents involving Fatalities Hot Spot output (Right) comparison from the depiction below-

Figure 26: All Accidents Hot Spot output (Left) vs Accidents involving Fatalities Hot Spot output (Right)

This comparison is very illuminating. Some hot spots have emerged at new locations in the image on the right, which law enforcement has to pay close attention to. You would appreciate that running the hot spot analysis on a specific variable (fatality) brought to the fore certain areas of trouble which were diluted in the All Accidents hot spot and hence weren't visible there. Even within the hot spot output of the All Accidents analysis, we are able to narrow down to the sections which are more Fatality-prone. Naturally, several sections are not hot spots at all in the image on the right as there was no statistically significant clustering of fatalities there - perhaps these are less troublesome roads and can be given second priority by law enforcement agencies and policy makers.

 

Similarly, we will use the same workflow to compare All Accident Hot Spots (left) to Accident Hot Spots where the driver was under the influence of Alcohol (right) from the depiction below -


Figure 27: All Accident Hot Spots (left) to Accident Hot Spots where the driver was under the influence of Alcohol (right)

A clear indication of river-side partying?


I hope you can appreciate how powerful spatiotemporal hot spot analysis can be in developing a deep understanding of accident trends. The analytical output can be useful for a wide variety of stakeholders, from law enforcement agencies and policy makers to vehicle manufacturers and the general public.


Do note that the quality of the output is dependent on the quality of the geodata captured.


I cannot emphasize this enough: organizations, especially in India, should lay stress on improving the quantity and quality of the geodata they capture.

 

Our next sequence of analysis is to demonstrate the power of modern map-based analytics where we can micro-analyze the accident geo-datasets at even greater depth and at much faster speeds.


So after computing specific variable-based Hot Spot Analysis, the next question you may ask is - during which hours of the day do the vehicle accidents peak, and how do their hot spots look / compare to the original All Accident hot spots?


Luckily, aside from doing map based analytics, modern GIS platforms are adept at doing chart and table based analytics just as we do on spreadsheet based platforms such as Microsoft Excel.


The GIS has created a line chart for us below. What trends can you observe?

Vehicle Crashes Line Chart - hourly and daily bifurcation in Esri ArcGIS Pro
Figure 28: Vehicle Crashes Line Chart - hourly and daily bifurcation

Do the trends become more evident / visible in the differently color-coded line chart below?

New Color-coded Vehicle Crashes Line Chart - hourly and daily bifurcation
Figure 29: New Color-coded Vehicle Crashes Line Chart - hourly and daily bifurcation

Hours 1500 - 1700 (3 pm - 5 pm) on Weekdays (Monday-Friday) are the peak times for vehicle accidents.
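The same tally can be sketched outside the GIS with Python's standard library: count crashes per hour separately for weekdays and weekends, then keep only the records inside the peak window. The timestamps below are made up, and treating the window as 15:00 up to but not including 17:00 is an assumption about how the "1500 - 1700" range is bounded.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps):
    """Count crashes per (is_weekday, hour) pair."""
    counts = Counter()
    for ts in timestamps:
        counts[(ts.weekday() < 5, ts.hour)] += 1  # Monday is 0, Sunday is 6
    return counts


def in_peak_window(ts, start_hour=15, end_hour=17):
    """True for weekday crashes from start_hour up to, not including, end_hour."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour


# Hypothetical crash timestamps.
timestamps = [
    datetime(2013, 3, 4, 15, 30),  # Monday afternoon
    datetime(2013, 3, 5, 16, 45),  # Tuesday afternoon
    datetime(2013, 3, 6, 9, 10),   # Wednesday morning
    datetime(2013, 3, 9, 16, 0),   # Saturday afternoon
]
counts = hourly_counts(timestamps)
peak = [ts for ts in timestamps if in_peak_window(ts)]
print(len(peak), "of", len(timestamps), "crashes fall in the weekday peak window")
```

The filtered list is what would then be fed into the hot spot workflow in place of the full set of records.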


Using this discovery, we can fine-tune our hot spot analysis for this time filter and re-apply the same workflow. However, this time we will not proceed to do it one-step-at-a-time. Instead, we can replicate it by creating and executing a Geo-Processing Model as depicted below-

Figure 30: Pre-built Geo-Analytics model to compute the Hot Spot (Getis-Ord Gi*) on the accident spots within the peak hours we've identified and only on weekdays.

Figure 31: Geo Processing Tool - Create Day/Time Hot Spot Map

Such models are easy to configure and require minimal coding. As you would gauge, having such geo-analysis models allows us to have faster and error-free re-runs of validated geo-workflows. It saves enormous time and effort - location intelligence at the click of a button!




















The depiction below compares All Accidents Hot Spot (left) to the output of the Accidents Hot Spot during Peak Hours of Weekdays (right) geo-model. The new output gives us fresh insights about vehicle accident patterns and when their probability of occurrence is at its maximum.

Figure 32: All Accidents Hot Spot (left) to the output of the Accidents Hot Spot during Peak Hours of Weekdays (right)
 

Those who aren't yet genuinely impressed by the capability of modern map-based location intelligence platforms are sure to be blown away by the final analysis step I'll demonstrate up next. After all, what good is Location Analytics in 2D when the option to do Spatiotemporal Analysis in 3D is available to us!


That being said, the Spatiotemporal 3D Hot Spot analysis tool is useful only if we have appropriate questions to ask the Geographic Information System (GIS). In the current context, we can run this tool if we wish to know the Trends of Vehicle Accidents during Peak Hours on Weekdays on a Year-on-Year basis. To visualize the output, there are two geo-processing models which we would need to configure-


a) Yearly Hot Spot Maps for each year from 2010 - 2015 (6 years)

Graph to generate 6 different yearly hot spot maps in Esri ArcGIS Pro
Figure 33: Graph to generate 6 different yearly hot spot maps

b) This output (6 yearly hot spot maps) flows as one of the inputs in our next model where we execute the Peak Hours during Week Days geo-processing model -

Peak Hours during Week Days geo-processing model
Figure 34: Depiction of Peak Hours during Week Days geo-processing model

These models may appear complex; however, we have just 'codified' the existing workflows which I took you through earlier in this article. The only genuinely complex part is rendering the output in 3D and luckily, it is the GIS platform which has to do this bit of geoprocessing and not us!


3D Visualization of the Year-on-Year Vehicle Accidents during Peak Hours on Weekdays Model Output

Figure 35: 3D Visualization of the Year-on-Year Vehicle Accidents during Peak Hours on Weekdays Model Output

Struggling to make sense of the red bars in the image above?


Figure 36: 3D spatiotemporal visualization of a road where the yearly vehicle crashes are plotted as a space-time cube

See the image above. At the intersection on Prospect Avenue lies the depiction of a sporadic hot spot - the type most common as per our hot spot analysis. The first year is right at the bottom and is statistically very significant (dark red) i.e. it represents a strong hot spot for vehicle accidents. In the second year, the hot spot vanishes. In the third year, the hot spot is statistically significant, yet weaker (light red) than the first year. In the fourth year, the hot spot vanishes again only to return in full force (dark red) in the fifth year and vanish in the sixth year.
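The reading of a stack like this can be mimicked with a small lookup: each yearly z-score is turned into a label using the usual two-sided normal critical values (1.96 for 95% and 2.58 for 99% confidence). The z-scores below are invented to reproduce the on-off sequence just described, and linking dark red to 99% and light red to 95% is an assumption about the map symbology.

```python
def describe(z_score):
    """Label one yearly z-score using two-sided normal critical values."""
    if z_score >= 2.58:
        return "very significant hot spot"  # 99% confidence, assumed dark red
    if z_score >= 1.96:
        return "significant hot spot"       # 95% confidence, assumed light red
    return "no hot spot"


# Hypothetical yearly z-scores for one intersection, 2010 to 2015.
yearly_z = [3.1, 0.4, 2.1, 0.9, 2.9, 0.2]
labels = [describe(z) for z in yearly_z]
for year, label in zip(range(2010, 2016), labels):
    print(year, label)
```

A sequence that keeps alternating between a hot spot label and "no hot spot" is what the stacked bars show for a sporadic hot spot.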


Hope you can interpret the 3D visualization now and also understand the importance of seeing a trend in a spatiotemporal dimension on a 3D platform.

 

Using the same information that I've mentioned, can you interpret the following? -

Figure 37: Very Highly Statistically Significant Hot Spots at the intersection for all six years (2010-2015)

And this?

Figure 38: Hot Spot with high statistical significance only for the fourth year (2013)

And this?

Figure 39: Begins with statistically very highly significant hot spot (2010) followed by statistically significant hot spots in the next two years. The hot spot vanishes for the next 3 years - 2013 - 2015

Thank you for reading! Hope you found this content to be appealing.

Figure 40: Final Output of 3D based spatiotemporal visualization of a road where the yearly vehicle crashes are plotted

(Much thanks to Esri authors Lauren Scott Griffith & Lixin Huang for developing this concept in their tutorial).

 

ABOUT US


Intelloc Mapping Services | Mapmyops.com is based in Kolkata, India and engages in providing Mapping solutions that can be integrated with Operations Planning, Design and Audit workflows. These include but are not limited to - Drone Services, Subsurface Mapping Services, Location Analytics & App Development, Supply Chain Services & Remote Sensing Services. The services can be rendered pan-India, some even globally, and will aid an organization to meet its stated objectives especially pertaining to Operational Excellence, Cost Reduction, Sustainability and Growth.


Broadly, our area of expertise can be split into two categories - Geographic Mapping and Operations Mapping. The Infographic below highlights our capabilities.

Mapmyops (Intelloc Mapping Services) - Range of Capabilities and Problem Statements that we can help address

Our 'Mapping for Operations'-themed workflow demonstrations can be accessed from the firm's Website / YouTube Channel and an overview can be obtained from this flyer. Happy to address queries and respond to documented requirements. Custom Demonstration, Training & Trials are facilitated only on a paid-basis. Looking forward to being of service.


Regards,
