Opinion Spam Detection based on Annotation Extension and Neural Networks

Full Text: PDF (https://ccsenet.org/journal/index.php/cis/article/download/0/0/39043/40879)
DOI: 10.5539/cis.v12n2p87

Yuanchao Liu; Bo Pang

Abstract

Online reviews play an increasingly important role in the purchase decisions of potential customers. Driven by the desire for profit or publicity, spammers may be hired to write fake reviews that promote or demote the reputation of products or services. Consequently, opinion spam detection has attracted attention from both the business and research communities in recent years. However, unlike tasks such as news or blog classification, existing review spam datasets are typically small because human annotation is expensive, which can limit detection performance even when excellent classifiers are available. In this paper, we propose a novel approach that boosts opinion spam detection performance by making full use of an existing small labelled dataset. We first design an annotation extension scheme that uses extra-trees classifiers to train multiple estimators and then iteratively generates reliable labelled samples from unlabelled ones. Subsequently, we examine neural network scenarios on the newly extended dataset to learn distributed representations. Experimental results suggest that the proposed approach generalizes better and performs better than state-of-the-art methods.
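
The annotation extension idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the estimator settings, the reliability criterion, or the stopping rule, so everything below is an assumption made for illustration. Depth-1 randomized trees (a random feature and a random threshold, which is how extremely randomized trees draw split candidates) stand in for full extra-trees classifiers so that the example needs only the standard library. The names `fit_random_stump`, `extend_annotations`, `min_agreement`, and `max_rounds` are hypothetical, as are the labels `truthful` and `deceptive`.

```python
import random
from collections import Counter


def fit_random_stump(X, y, rng):
    """Fit a depth-1 randomized tree: random feature, random threshold,
    majority label on each side (a stand-in for one extra-trees estimator)."""
    feature = rng.randrange(len(X[0]))
    values = [row[feature] for row in X]
    threshold = rng.uniform(min(values), max(values))
    fallback = Counter(y).most_common(1)[0][0]
    left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
    right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
    left_label = Counter(left).most_common(1)[0][0] if left else fallback
    right_label = Counter(right).most_common(1)[0][0] if right else fallback
    return lambda row: left_label if row[feature] <= threshold else right_label


def extend_annotations(X_lab, y_lab, X_unl, n_estimators=200,
                       min_agreement=0.8, max_rounds=5, seed=0):
    """Iteratively move unlabelled samples into the labelled set when the
    committee of estimators agrees on their label (assumed reliability rule)."""
    rng = random.Random(seed)
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        if not pool:
            break
        # Train multiple estimators on the current labelled set.
        committee = [fit_random_stump(X_lab, y_lab, rng)
                     for _ in range(n_estimators)]
        accepted, remaining = [], []
        for row in pool:
            votes = Counter(stump(row) for stump in committee)
            label, count = votes.most_common(1)[0]
            if count / n_estimators >= min_agreement:
                accepted.append((row, label))   # reliable pseudo-label
            else:
                remaining.append(row)           # too uncertain, keep unlabelled
        if not accepted:
            break  # nothing new was labelled, so further rounds cannot help
        for row, label in accepted:
            X_lab.append(row)
            y_lab.append(label)
        pool = remaining
    return X_lab, y_lab, pool
```

Called with a handful of labelled feature vectors and a pool of unlabelled ones, `extend_annotations` returns the enlarged labelled set together with the samples that never reached the agreement threshold. Leaving ambiguous samples unlabelled is a deliberate choice in this sketch: a wrong pseudo-label would be fed back into every later round.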