Journal of Management Information Systems

Volume 31, Number 4, 2015, pp. 109-157

Enhancing Predictive Analytics for Anti-Phishing by Exploiting Website Genre Information

Abbasi, Ahmed; Zahedi, Fatemeh “Mariam”; Zeng, Daniel; Chen, Yan; Chen, Hsinchun; and Nunamaker, Jay F.


Phishing websites continue to successfully exploit user vulnerabilities in household and enterprise settings. Existing anti-phishing tools lack the accuracy and generalizability needed to protect Internet users and organizations from the myriad of attacks encountered daily. Consequently, users often disregard these tools’ warnings. In this study, using a design science approach, we propose a novel method for detecting phishing websites. By adopting a genre-theoretic perspective, the proposed genre tree kernel method utilizes fraud cues that are associated with differences in purpose between legitimate and phishing websites, manifested through genre composition and design structure, resulting in enhanced anti-phishing capabilities. To evaluate the genre tree kernel method, a series of experiments was conducted on a testbed encompassing thousands of legitimate and phishing websites. The results revealed that the proposed method provided significantly better detection capabilities than state-of-the-art anti-phishing methods. An additional experiment demonstrated the effectiveness of the genre tree kernel technique in user settings; users utilizing the method were able to better identify and avoid phishing websites, and were consequently less likely to transact with them. Given the extensive monetary and social ramifications associated with phishing, the results have important implications for future anti-phishing strategies. More broadly, the results underscore the importance of considering intention/purpose as a critical dimension for automated credibility assessment: focusing not only on the “what” but also on operationalizing the “why” into salient detection cues.

Key words and phrases: design science, data mining, phishing websites, genre theory, Internet fraud, website genres, credibility assessment, phishing