An Introduction to Quantitative Text Analysis for Linguistics: Reproducible Research Using R


Hanaa Alqahtani

Abstract

Jerid Francom’s book An Introduction to Quantitative Text Analysis for Linguistics: Reproducible Research Using R is an essential textbook for researchers and students who are exploring quantitative text analysis. Designed with beginners in mind, the book emphasizes reproducible research and offers a structured approach to text analysis through the programming language R. It spans five interconnected parts, beginning with foundational concepts such as the Data-Information-Knowledge-Insight (DIKI) hierarchy, corpus creation, and data curation, and advancing to topics such as tokenization, dimensionality reduction, vector space modeling, and hypothesis testing with the {infer} package. Practical exercises alongside detailed explanations guide readers through the entire process of text analysis, from data acquisition to predictive modeling and statistical designs. The book highlights computational methods including readability measures, sentiment analysis, semantic modeling, and topic modeling, equipping readers to extract meaningful insights from linguistic data. By incorporating Tidyverse tools and additional resources such as GitHub repositories, Francom successfully bridges theoretical understanding with hands-on application. The text prioritizes transparency and reproducibility, and the author advocates meticulous data documentation and open-source methodologies. Although the book is an accessible resource for English-language data, its focus on breadth over depth may challenge readers seeking advanced exploration, and readers without basic programming experience may also find it difficult. Nevertheless, Francom’s pedagogical approach combines clarity with practical guidance, making this book a valuable resource for students, researchers, and professionals who aim to integrate quantitative methods into their linguistic research.



This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1923-869X
  • ISSN(Online): 1923-8703
  • Started: 2011
  • Frequency: bimonthly
