Scaled Pearson’s Correlation Coefficient for Evaluating Text Similarity Measures
- Issa Atoum
Abstract
Despite ever-increasing interest in text similarity, the development of adequate text similarity methods is lagging. Some methods perform well on entailment, while others are better at estimating the degree to which two texts are similar. These methods are usually compared using Pearson's correlation; however, Pearson's correlation is sensitive to outliers, which can distort the final correlation coefficient. As a result, Pearson's correlation is inadequate for deciding which text similarity method is better in situations where data items are very similar or unrelated. This paper borrows the scaled Pearson correlation from the finance domain and builds a metric that can evaluate the performance of similarity methods over cross-sectional datasets. Results showed that the new metric is fine-grained across the score range of the benchmark dataset, which makes it a promising alternative to Pearson's correlation. Moreover, extrinsic results from applying the System Usability Scale (SUS) questionnaire to the scaled Pearson correlation revealed that the proposed metric is attracting attention from scholars, which suggests that it could be adopted in academia.
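As an illustration only, the general idea of a scaled (segment-wise) Pearson correlation can be sketched as follows. The sketch assumes that sentence pairs are sorted by their gold similarity score, split into equal-sized segments, and that the per-segment Pearson coefficients are averaged. The function names `pearson` and `scaled_pearson`, the parameter `segment_size`, and the toy scores are hypothetical; the paper's exact formulation may differ.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def scaled_pearson(gold, predicted, segment_size):
    """Average of per-segment Pearson correlations (illustrative assumption).

    Pairs are sorted by gold score and cut into consecutive segments of
    `segment_size` items; a trailing partial segment is dropped, and
    segments where the correlation is undefined are skipped.
    """
    pairs = sorted(zip(gold, predicted), key=lambda p: p[0])
    coefficients = []
    for start in range(0, len(pairs) - segment_size + 1, segment_size):
        segment = pairs[start:start + segment_size]
        r = pearson([g for g, _ in segment], [p for _, p in segment])
        if not math.isnan(r):
            coefficients.append(r)
    if not coefficients:
        return float("nan")
    return sum(coefficients) / len(coefficients)


# Toy scores (made up): the method separates low from high similarity,
# but ranks items in the wrong order inside each range.
gold = [0, 1, 2, 3, 4, 5]
predicted = [2, 1, 0, 5, 4, 3]

print(round(pearson(gold, predicted), 3))   # 0.543
print(scaled_pearson(gold, predicted, 3))   # -1.0
```

In this toy case the overall coefficient is moderately positive, while the segment-wise average exposes that the ordering within each score range is reversed. That is the kind of fine-grained behaviour a single global coefficient can hide.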
- DOI: 10.5539/mas.v13n10p26