Journal of Management Information Systems

Volume 28 Number 4 2012 pp. 11-38

Two Rule-Based Natural Language Strategies for Requirements Discovery and Classification in Open Source Software Development Projects

Vlas, Radu E and Robinson, William N

ABSTRACT: Open source projects do have requirements; they are, however, mostly informal text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming and, for large projects, error-prone. Automated analysis of natural language requirements, even if partial, would be of great benefit. Toward that end, we describe the design and validation of an automated natural language requirements classifier for open source projects. We compare two strategies for recognizing requirements in open forums of software features. Our results suggest that classifying text at the forum post aggregation and sentence aggregation levels may be effective, and that automated classification can reduce the effort required to analyze requirements of open source projects.

Key words and phrases: natural language processing, open source, requirements classification, requirements discovery, software requirements