An Exploration of Impact Factors Influencing Students’ Reading Literacy in Singapore with Machine Learning Approaches

  •  Xin Dong    
  •  Jie Hu    


This study identified the contextual factors that differentiated high- and low-achieving 15-year-old students in reading literacy in Singapore, based on the Programme for International Student Assessment (PISA) 2015. Data for 4,015 students from Singapore were drawn from the public PISA 2015 dataset, comprising 2,646 high-achieving and 1,369 low-achieving students on the PISA reading literacy test. The impact of 49 contextual factors on reading literacy was analyzed at three levels: student level, family level and school level. Support vector machine (SVM), a machine learning approach, was applied to analyze these contextual features. The results indicated that SVM could effectively distinguish the two cohorts of readers, with an accuracy score of 0.78. SVM-based recursive feature elimination (SVM-RFE), another machine learning approach, was then applied to rank the features, which were output in descending order of their significance to the differentiation. Finally, an optimal set of 15 contextual factors, which collectively affected the differentiation of students with high and low levels of reading literacy, was selected by RFE with cross-validation (RFE-CV). The analysis yields implications for further improving students’ reading literacy.
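
The pipeline described above (SVM classification, SVM-RFE ranking, then RFE-CV subset selection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a linear SVM (`LinearSVC`) because the abstract does not state the kernel or hyperparameters, and runs on synthetic data from `make_classification` that only mimics the sample size, class balance and feature count reported here. All variable names are hypothetical, and the scores it prints say nothing about the PISA results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, RFECV
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import LinearSVC

# Synthetic stand-in for the PISA 2015 sample (assumption: NOT the real data).
# 4,015 rows, 49 features, roughly 1,369 "low" (0) vs 2,646 "high" (1) labels.
X, y = make_classification(
    n_samples=4015,
    n_features=49,
    n_informative=15,
    n_redundant=0,
    n_clusters_per_class=1,
    weights=[1369 / 4015, 2646 / 4015],
    random_state=0,
)

# Assumption: a linear SVM; the abstract does not specify kernel or settings.
svm = LinearSVC(dual=False)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Step 1: cross-validated accuracy of the SVM on all 49 features.
baseline_acc = cross_val_score(svm, X, y, cv=cv, scoring="accuracy").mean()

# Step 2: SVM-RFE. Eliminating one feature per round down to a single
# feature yields a full ranking, where rank 1 is the last feature kept.
rfe = RFE(estimator=svm, n_features_to_select=1, step=1).fit(X, y)
ranking = rfe.ranking_
features_by_importance = np.argsort(ranking)  # column indices, best first

# Step 3: RFE-CV. Cross-validation picks the feature count with the best
# mean accuracy; support_ marks the columns in that subset.
rfecv = RFECV(estimator=svm, step=1, cv=cv, scoring="accuracy").fit(X, y)
optimal_n = rfecv.n_features_
selected_columns = np.flatnonzero(rfecv.support_)

print(f"Baseline CV accuracy (all features): {baseline_acc:.3f}")
print(f"Top 5 feature indices by SVM-RFE: {features_by_importance[:5].tolist()}")
print(f"Optimal number of features by RFE-CV: {optimal_n}")
```

With real questionnaire data, the features would typically be standardized inside each cross-validation fold before fitting, and the column indices would be mapped back to the names of the contextual factors.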

This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1923-869X
  • ISSN(Online): 1923-8703
  • Started: 2011
  • Frequency: bimonthly

