Using J48 Tree Partitioning for Scalable SVM in Spam Detection


  •  Mohammad-Hossein Nadimi-Shahraki    
  •  Zahra S. Torabi    
  •  Akbar Nabiollahi    

Abstract

The Support Vector Machine (SVM) is a powerful, state-of-the-art machine learning algorithm with strong regularization properties. Regularization governs how well a model generalizes to new data, which makes SVM well suited to spam detection. Although experimental results show that SVM usually outperforms other algorithms, its efficiency decreases as the number of spam features increases. In this paper, a scalable SVM for spam detection is proposed that uses a J48 tree. In the proposed method, the dataset is first partitioned using a J48 tree, and feature selection is then applied to each partition in parallel. The selected features are then used in the training phase of the SVM. The proposed method is evaluated on several benchmark datasets, and the results are compared with those of other algorithms such as SVM and GA-SVM. The experimental results show that the proposed method remains scalable as the number of features increases and achieves higher accuracy than SVM and GA-SVM.
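The three stages named in the abstract (partition, per-partition feature selection in parallel, SVM training on the selected features) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail below is an assumption made for illustration: a single median split stands in for the leaves of a J48 tree, a class-mean difference stands in for the feature-selection criterion, the per-partition selections are merged by union, and a Pegasos-style linear SVM stands in for the SVM trainer. The data is synthetic and all function names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_data(n=200, n_features=8, seed=0):
    """Synthetic data: features 0 and 1 carry the label signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 1.0 * label
        row[1] += 1.0 * label
        X.append(row)
        y.append(label)
    return X, y


def partition_by_stump(X, y, feature):
    """Assumed stand-in for J48 leaves: one split at the median of `feature`."""
    values = sorted(row[feature] for row in X)
    threshold = values[len(values) // 2]
    left, right = ([], []), ([], [])
    for row, label in zip(X, y):
        part = left if row[feature] < threshold else right
        part[0].append(row)
        part[1].append(label)
    return [left, right]


def top_k_features(rows, labels, k):
    """Rank features by |mean(+1) - mean(-1)| inside one partition; return top-k indices."""
    pos = [r for r, t in zip(rows, labels) if t == 1]
    neg = [r for r, t in zip(rows, labels) if t == -1]
    if not pos or not neg:  # single-class partition: nothing to rank
        return []
    scores = []
    for j in range(len(rows[0])):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        scores.append((abs(mean_pos - mean_neg), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # shrink step from the L2 penalty
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def run_pipeline(k=2, seed=0):
    X, y = make_data(seed=seed)
    parts = partition_by_stump(X, y, feature=0)
    # Threads keep the sketch short; CPU-bound work would need processes for real speed-up.
    with ThreadPoolExecutor() as pool:
        per_part = list(pool.map(lambda p: top_k_features(p[0], p[1], k), parts))
    # Assumption: merge per-partition selections by union before training one SVM.
    selected = sorted({j for indices in per_part for j in indices})
    X_sel = [[row[j] for j in selected] for row in X]
    w = train_linear_svm(X_sel, y)
    hits = sum(1 for row, t in zip(X_sel, y)
               if t * sum(a * b for a, b in zip(w, row)) > 0)
    return selected, w, hits / len(y)  # accuracy is measured on the training data


if __name__ == "__main__":
    selected, weights, train_acc = run_pipeline()
    print("selected features:", selected)
    print("training accuracy:", train_acc)
```

Because the SVM is trained only on the union of per-partition selections, its input dimension is bounded by k times the number of partitions rather than by the original feature count, which is one plausible reading of how the approach stays scalable as features grow. The abstract does not state how the selections are combined, so the union step in particular should be treated as a guess.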



This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1913-8989
  • ISSN(Online): 1913-8997
  • Started: 2008
  • Frequency: quarterly
