ABSTRACT: Open source projects do have requirements; they are, however, mostly informal text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming and, for large projects, error-prone. Automated analysis of natural language requirements, even partial, would be of great benefit. Toward that end, we describe the design and validation of an automated natural language requirements classifier for open source projects. We compare two strategies for recognizing requirements in open forums of software features. Our results suggest that classifying text at the forum post aggregation and sentence aggregation levels may be effective, and that such classification can reduce the effort required to analyze requirements of open source projects.
Key words and phrases: natural language processing, open source, requirements classification, requirements discovery, software requirements