Buffalo-Niagara Transportation Data-Warehouse Prototype and Real-time Incident Detection

Data Warehouse for the Buffalo-Niagara Region

This research examines how to capture and use large amounts of transportation data specific to the Buffalo-Niagara region.

The overall goal of this study is both to design a transportation data-warehouse prototype for the Buffalo-Niagara region and to demonstrate its usefulness through a specific application.

To achieve this, three objectives were set: (1) outline the structure of a data warehouse for the Buffalo-Niagara region, (2) use the combined data in the prototype warehouse to build a real-time incident detection system that not only detects incidents but also predicts their characteristics, and (3) show the importance of the data warehouse by comparing the results of incident detection strategies that require different combinations of data. To meet these objectives, a prototype data warehouse was first created and then used to build and validate three incident detection strategies: a speed threshold detection system, a binary probit model that uses only speed data, and a binary probit model that uses a combination of speed and volume data.
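As a rough illustration of how a binary probit model turns traffic measurements into an incident decision, the sketch below computes P(incident) as the standard normal CDF of a linear index. It is not the study's fitted model: the function names, the feature choices (a speed drop in mph, a volume drop in vehicles per hour), the coefficient values, and the 0.5 cutoff are all hypothetical and chosen only to show the mechanics.

```python
from statistics import NormalDist


def probit_incident_probability(features, coefficients, intercept):
    """Return P(incident) = Phi(intercept + sum(beta_i * x_i)).

    Phi is the standard normal CDF. All inputs are illustrative;
    a real model would estimate the coefficients from labeled data.
    """
    linear_index = intercept + sum(b * x for b, x in zip(coefficients, features))
    return NormalDist().cdf(linear_index)


def classify(features, coefficients, intercept, cutoff=0.5):
    """Flag an incident when the predicted probability reaches the cutoff."""
    return probit_incident_probability(features, coefficients, intercept) >= cutoff


# Hypothetical speed-only model: one feature, the speed drop in mph.
p_speed_only = probit_incident_probability([12.0], [0.2], -2.0)

# Hypothetical combined model: speed drop (mph) and volume drop (veh/h).
p_combined = probit_incident_probability([12.0, 300.0], [0.2, 0.001], -2.0)
```

The speed-only and combined variants differ only in which features are supplied, which is why the combined model depends on a warehouse that holds both speed and volume data for the same locations and times.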

The prototype showed that a full-scale transportation data warehouse for the Buffalo-Niagara region is feasible and can yield useful results. The speed threshold model, which flagged an incident when speed dropped by 10 mph over 10 minutes, had a 62.5% detection rate, as well as favorable false alarm and classification rates. The more complex binary probit model that used only speed data detected incidents with a success rate of 70.4%, an improvement over the speed threshold model despite worse false alarm and classification rates. It also predicted incident type, number of blocked lanes, and incident severity with 75.9%, 70.4%, and 75.9% accuracy, respectively. The binary probit model that used both speed and volume data had a higher detection rate of 75.5% with similar false alarm and classification rates. It was slightly better at predicting incident type and severity (both with 77.6% accuracy) but slightly worse at predicting the number of blocked lanes (69.4% accuracy).
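The speed threshold rule can be sketched in a few lines. This is a minimal illustration rather than the study's implementation: it assumes speeds arrive as (minute, speed_mph) pairs and interprets the rule as a drop of at least 10 mph relative to the reading taken 10 minutes earlier. The function name, data layout, and sample readings are hypothetical.

```python
def detect_speed_drops(readings, window_min=10, drop_mph=10.0):
    """Return the minutes at which speed fell by at least `drop_mph`
    compared with the reading `window_min` minutes earlier.

    readings: list of (minute, speed_mph) pairs (assumed layout).
    Minutes with no reading exactly `window_min` earlier are skipped.
    """
    speed_at = dict(readings)
    flagged = []
    for minute, speed in readings:
        earlier = speed_at.get(minute - window_min)
        if earlier is not None and earlier - speed >= drop_mph:
            flagged.append(minute)
    return flagged


# Hypothetical detector readings at 5-minute intervals.
sample = [(0, 60.0), (5, 58.0), (10, 49.0), (15, 45.0), (20, 47.0)]
alerts = detect_speed_drops(sample)
```

A rule like this needs only speed data, which makes it a natural baseline against which models that draw on more of the warehouse can be compared.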

Overall, the combined speed-and-volume model was the best of the three strategies for both detecting incidents and predicting their characteristics, which underscores the importance of a transportation data warehouse.