ABSTRACT:
Opinion spammers exploit consumer trust by posting false or deceptive reviews that can harm both consumers and businesses. These dishonest posts are difficult to detect because of complex interactions among several user characteristics, such as review velocity, volume, and variety. We propose a novel hierarchical supervised-learning approach that increases the likelihood of detecting anomalies by analyzing several user features and then characterizing their collective behavior in a unified manner. Specifically, we model user characteristics and the interactions among them as univariate and multivariate distributions. We then stack these distributions using several supervised-learning techniques, such as logistic regression, support vector machines, and k-nearest neighbors, yielding robust meta-classifiers. We evaluate the methods in detail and derive empirical insights. This approach is of interest to online business platforms because it can help reduce false reviews and increase consumer confidence in the credibility of the platforms' online information. Our study contributes to the literature by incorporating distributional aspects of features into machine-learning techniques, which can improve the performance of fake-reviewer detection on digital platforms.
Key words and phrases: deceptive online reviews, digital platforms, fake reviews, hierarchical supervised learning, information credibility, machine learning, online reviews, review manipulation
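
The two-level idea summarized in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the method evaluated in the study: the data are synthetic, the three features are stand-ins for review velocity, volume, and variety, each feature is modeled with a class-conditional univariate Gaussian, and the meta-classifier is a plain logistic regression trained by gradient descent. The multivariate distributions and the other learners named in the abstract (support vector machines, k-nearest neighbors) are omitted for brevity. All function names are hypothetical.

```python
import math
import random

def fit_gaussian(values):
    """Return (mean, std) of a list of floats; std is floored to stay positive."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(math.sqrt(var), 1e-6)

def log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def fit_level1(X, y):
    """Level 1: one Gaussian per feature and per class (0 = genuine, 1 = spammer)."""
    params = []
    for j in range(len(X[0])):
        params.append({
            label: fit_gaussian([row[j] for row, t in zip(X, y) if t == label])
            for label in (0, 1)
        })
    return params

def level1_scores(params, row):
    """Per-feature log-likelihood ratio: log p(x | spammer) - log p(x | genuine)."""
    return [log_pdf(x, *p[1]) - log_pdf(x, *p[0]) for x, p in zip(row, params)]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta(S, y, lr=0.05, epochs=300):
    """Level 2: logistic regression on level-1 scores, batch gradient descent."""
    w, b, n = [0.0] * len(S[0]), 0.0, len(S)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for s, t in zip(S, y):
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - t
            for j, sj in enumerate(s):
                grad_w[j] += err * sj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(params, w, b, row):
    """Estimated probability that a user with feature vector `row` is a spammer."""
    s = level1_scores(params, row)
    return sigmoid(sum(wi * si for wi, si in zip(w, s)) + b)

# Synthetic toy users (assumed values): [velocity, volume, variety].
rng = random.Random(0)
genuine = [[rng.gauss(1, 1), rng.gauss(5, 2), rng.gauss(3, 1)] for _ in range(200)]
spammers = [[rng.gauss(6, 1), rng.gauss(20, 3), rng.gauss(1, 0.5)] for _ in range(200)]
data = [(row, 0) for row in genuine] + [(row, 1) for row in spammers]
rng.shuffle(data)

# Fit level 1 on one half and the meta-classifier on the other half, so the
# meta-classifier is trained on scores for users the level-1 models never saw.
half = len(data) // 2
X1, y1 = [r for r, _ in data[:half]], [t for _, t in data[:half]]
X2, y2 = [r for r, _ in data[half:]], [t for _, t in data[half:]]

params = fit_level1(X1, y1)
w, b = fit_meta([level1_scores(params, row) for row in X2], y2)

print(predict(params, w, b, [6.0, 20.0, 1.0]))  # spammer-like profile
print(predict(params, w, b, [1.0, 5.0, 3.0]))   # genuine-like profile
```

Splitting the data between the two levels is one simple way to keep the meta-classifier from being trained on scores that the level-1 models have already fit; cross-validated out-of-fold scores would serve the same purpose with less data loss.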