The Application of Text Mining Algorithms In Summarizing Trends in Anti-Epileptic Drug Research


  •  Shatrunjai Singh    
  •  Swagata Karkare    
  •  Sudhir Baswan    
  •  Vijendra Singh    

Abstract

Content summarization is an important area of research in traditional data mining. The volume of studies published on anti-epileptic drugs (AEDs) has increased exponentially over the last two decades, making this literature a natural target for text-mining-based summarization algorithms. In the current study, we use text analytics algorithms to mine and summarize 10,000 PubMed abstracts related to anti-epileptic drugs published within the last 10 years. Term Frequency-Inverse Document Frequency (TF-IDF) based filtering was applied to identify the drugs with the highest frequency of mentions within these abstracts. The US Food and Drug Administration database was scraped and linked to the results to quantify the most frequently mentioned modes of action and to identify the pharmaceutical entities marketing these drugs. A sentiment analysis model was created to score the abstracts for positive or negative sentiment. Finally, a modified Latent Dirichlet Allocation topic model was generated to extract key topics associated with the most frequently mentioned AEDs. The five most frequently mentioned drugs in the analysis were Gabapentin, Levetiracetam, Topiramate, Lamotrigine and Acetazolamide. We further list the key topics associated with these drugs and the overall positive or negative sentiment associated with them. The results of this study provide accurate, data-intensive insights into the progress of anti-epileptic drug research.
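
The TF-IDF filtering step can be illustrated with a minimal sketch. This is not the authors' pipeline: the three toy abstracts, the drug vocabulary, the function names and the smoothed IDF variant used here are all assumptions chosen for illustration, and the study itself worked on 10,000 PubMed abstracts.

```python
import math
import re
from collections import Counter

# Hypothetical toy abstracts (illustration only, not data from the study).
ABSTRACTS = [
    "Levetiracetam reduced seizure frequency in adults with focal epilepsy.",
    "Gabapentin and levetiracetam were compared as add-on therapy.",
    "Topiramate was associated with weight loss in a small cohort.",
]

# Assumed drug vocabulary, taken from the drugs named in the abstract.
DRUGS = {"gabapentin", "levetiracetam", "topiramate",
         "lamotrigine", "acetazolamide"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(docs, vocabulary):
    """Sum per-document TF-IDF weights for each vocabulary term.

    TF is the term count divided by document length. IDF uses a common
    smoothed variant, log((1 + N) / (1 + df)) + 1, which is an assumption
    and may differ from the weighting used in the study.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents mentioning each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens) & vocabulary)

    totals = Counter()
    for tokens in tokenized:
        counts = Counter(tokens)
        for term in vocabulary:
            if counts[term]:
                tf = counts[term] / len(tokens)
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
                totals[term] += tf * idf
    return totals


def rank_drugs(docs, vocabulary):
    """Return (drug, score) pairs sorted from highest to lowest score."""
    return tfidf_scores(docs, vocabulary).most_common()


if __name__ == "__main__":
    for drug, score in rank_drugs(ABSTRACTS, DRUGS):
        print(f"{drug}: {score:.4f}")
```

Restricting the vocabulary to known drug names keeps the ranking focused on AED mentions; drugs that never appear in the corpus simply receive no score.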



This work is licensed under a Creative Commons Attribution 4.0 License.