ABSTRACT:
Incident reporting and investigation are components of safety management systems. Timely and accurate identification of risk factors is crucial to effective prevention strategies. However, risk factor identification is often hampered by the size and complexity of incident data and by the need for human involvement in categorizing it. We present a data-mining approach to incident risk factor identification and analysis using data from the Aviation Safety Reporting System, which is part of the Federal Aviation Administration. Our approach attempts to overcome two obstacles: labor-intensive manual identification of risk factors and incomplete data. First, topic-mining techniques convert underused textual data (incident narratives) into model input. Second, data-streaming algorithms incrementally build and test classification models for risk factor identification. We tested three classification algorithms, which achieved overall accuracy rates ranging from 76 percent to 88 percent, demonstrating the potential for effective use of large, unstructured incident data in safety management. Our research presents and demonstrates an approach to automated incident-type identification and contributes to our understanding of how text-mining and data-streaming technologies can improve safety management systems.
Key words and phrases: automated incident identification, data stream, clustering, incident cause evaluation, incident cause identification, safety management systems, text mining, topic mining