ABSTRACT: Data mining techniques provide a popular and powerful tool set to generate various data-driven classification systems. In this paper, we investigate the combined use of self-organizing maps (SOM) and nonsmooth nonconvex optimization techniques in order to produce a working case of a data-driven risk classification system. The optimization approach strengthens the validity of SOM results, and the improved classification system increases both the quality of prediction and the homogeneity within the risk groups. Accurate classification of prostate cancer patients into risk groups is important to assist in the identification of appropriate treatment paths. We start with the existing rules and aim to improve classification accuracy by identifying inconsistencies utilizing self-organizing maps as a data visualization tool. Then, we progress to the study of assigning prostate cancer patients into homogenous groups with the aim to support future clinical treatment decisions. Using the case of prostate cancer patients grouping, we demonstrate strong potential of data-driven risk classification schemes for addressing the risk grouping issues in more general organizational settings.
Key words and phrases: classification, data mining, optimization, prostate cancer, risk grouping