Analyzing the Effect of Social Media Sentiment on Stock Prices

As consumers increasingly use social media to learn about a company and its products, the sentiment expressed on social media becomes even more important. This study is exploratory in nature; its aim is to determine whether social media sentiment influences the stock price of large global companies. Stock prices and annual revenue were measured for Dow 30 stocks at two points in time six months apart. Social media sentiment is more difficult to measure. In this study, the sentiment ratings from Social Searcher.com were used to indicate whether social media messages by the public are positive, negative, or neutral. While past studies have focused on either Facebook or Twitter messages, this tool aggregates messages from eleven social media platforms globally. In conducting this study, numerous limitations and recommendations for future studies became apparent. The value of this exploratory study lies in the flaws exposed by conducting it, which provide a rich template for future studies on social media sentiment and its effect on a company.


Introduction
The focus of this study is to examine how consumers' social media sentiment, measured using social media posts, affects the price of stock for large global companies. The annual revenue of these companies was also measured and analyzed against social media sentiment. There are two general ways to analyze the effect of social media on stock prices. One is a top-down approach that measures the social media messages put out by the company itself. Amin et al. (2020) examined corporate social media messages and their relationship with stock prices. These messages are controlled by the company in both quantity and quality. The other approach, and the one taken in this study, is a bottom-up approach that examines consumers' sentiment on social media and how that sentiment affects stock prices and annual revenue. Babu et al. (2015) used the Facebook Gross National Happiness Index to relate national sentiment to the stock market, with mixed results depending on whether the market was bearish or bullish at the time. Bollen et al. (2017) examined tweets posted on Twitter and found that changes in public sentiment correlated with changes in the Dow Jones Industrial Average 3-4 days after the postings. Bartov et al. (2017) used tweets to predict a company's quarterly earnings, and Sul et al. (2017) analyzed 2.5 million tweets to show a relationship between sentiment in the tweets and the stock's returns on the next trading day.
This study looks at 11 social media platforms in multiple languages, including Tumblr, Reddit, Flickr, and Dailymotion, as well as VKontakte, a Russian social media site. It seeks to explore whether an aggregate of social media platforms can provide better predictive analysis than a single platform such as Facebook or Twitter, which is what most studies use. Younger Gen Z consumers are gravitating towards other platforms, more recently TikTok, and away from older platforms such as Facebook. Thirty stocks, namely the Dow 30, were used to gather data. Due to the small sample size (n=30), this study must be considered exploratory. If the results prove promising, further study would use the S&P 500 for a larger sample size.

Literature Review
The relationship between economic factors and stock prices has long been studied by researchers and is the traditional approach to understanding market value. In the 1990s, before online investing became more accessible, research was conducted to determine the relationship between investor sentiment and stock prices regardless of quantitative financial data (DeLong et al., 1990). More recently, with the ability of small investors to impact stock prices, non-economic factors such as emotion or sentiment have played an increasingly large role in affecting stock prices. DeLong et al. (1990) identified two types of investors: rational investors who use economic data and irrational investors who use emotion and sentiment. Social media is the thread that connects consumers, and thereby their sentiments about companies, regardless of whether they are investors. Investors can also proactively use social media to manipulate stock prices. GameStop is one example of social media affecting a stock price as a result of postings by investors (Krantz, 2021). As a result of these social media posts, primarily on Reddit, the stock rose 1600% in January 2021. The stock price had nothing to do with the company's financials but rather reflected the passionate postings on social media. Hiremath et al. (2019) studied the sentiment for sports teams based in India and the stock prices for those teams. They found a direct correlation between rising sentiment and rising stock prices, and the reverse held true as well. Not all stocks are found to be subject to sentiment, however. Baker and Wurgler (2007) found that the stocks most likely to be affected by sentiment were those that were more difficult to arbitrage. A more recent study (Padungsaksawasdi, 2020) finds that stocks in different industries react differently to investor sentiment.
Many studies have found that investor sentiment does affect stock market valuation (Brown & Cliff, 2005; Solt & Statman, 1988; Fisher & Statman, 2003; Schmeling, 2009). However, most of these studies are 10-15 years old or older, and in Internet years they may be considered outdated. More recently, Yousra et al. (2020) examined Google Trends to predict stock prices in MENA (Middle East and North Africa) countries. Hiremath, Venkatesh, and Choudhury (2019) studied sentiment in India. Abdelhedi-Zouch and Ghorbel (2016) studied sentiment and its relationship with bank share prices in Islamic countries. Other studies look at Egypt and Tunisia (Kammoun & Zaier, 2017) or the Iraqi stock exchange (AL-Hisnawy & Al-Morshed, 2018). A search of the literature does not find any recent studies that analyze stocks listed on USA exchanges together with consumer sentiment.

Hypotheses
This exploratory study seeks to understand whether there is a relationship between changes in stock prices and the social media sentiment of consumers by testing four hypotheses. Two of the hypotheses focus on changes in stock prices, and two examine annual revenue as it relates to consumers' sentiment. The term consumer sentiment rather than investor sentiment is used since not all consumers posting on social media are investors. Studies show that organizational tweets were a significant predictor of stock prices (Dhar & Bose, 2020), but what about tweets generated by the public? Further research should look at all popular social media posts, not just Twitter or Facebook. Each hypothesis is subdivided into a. social media mentions and b. sentiment ratio. Social media mentions track how often a company is mentioned on social media regardless of whether those mentions are positive, negative, or neutral. The sentiment ratio, on the other hand, measures the ratio of positive to negative posts, excluding the neutral posts.
The first hypothesis relates to stock prices: companies with significant increases in stock prices will have more social media mentions and a higher positive to negative sentiment ratio across the eleven most commonly used social media networks than companies with decreasing or unchanging stock prices.

H1a: There is a statistically significant positive relationship between social media mentions and stock price.
H1b: There is a statistically significant positive relationship between sentiment ratio and stock price.
Hypothesis 2 examines the same two independent variables, a. mentions and b. sentiment ratios, in relation to increases in annual revenue.

H2a: There is a statistically significant positive relationship between social media mentions and revenue.
H2b: There is a statistically significant positive relationship between sentiment ratio and revenue.
Additionally, information about the primary platforms on which users posted was gathered for qualitative analysis and to help point a direction for future research.

Data Collection
The Dow Jones Industrial Average (DJIA) consists of 30 large companies listed on exchanges in the United States. These 30 stocks were measured for price and social media sentiment on Jan. 15, 2021 and then again six months later on July 15, 2021. Changes in the price of the stock as well as changes in consumer sentiment were measured across 11 social media platforms. The stocks were compared to see if there was any relationship between changes in stock price and changes in consumer sentiment. Many studies refer to "investor sentiment"; in this study, however, the sentiment that is measured comes from consumers who are not necessarily investors.
ijbm.ccsenet.org International Journal of Business and Management Vol. 17, No. 9; 2022
Social sentiment for these 30 stocks was measured using Social Searcher.com. The website tracks posts made about a company and rates these posts as either positive, negative, or neutral. The sentiment ratio is a positive to negative ratio that excludes the neutral posts, whereas the percent positive rating is calculated out of the total number of posts, including neutral ones.
The number of mentions for each of the 11 social media networks was also measured. According to the data on Social Searcher.com, Facebook is not the primary platform used globally to post about Dow 30 companies; Twitter ranked first, followed by Reddit and then VKontakte.

Measurements
The independent variables used to measure sentiment for the Dow 30 were mentions, sentiment ratio, percent of positive to total posts, and platforms used. Mentions are the number of times the company name appeared in a post across the eleven networks on a given day. These posts were then categorized by Social Searcher.com into three categories: positive sentiment, negative sentiment, and neutral. To analyze the data collected, the number of mentions for two random days six months apart was used in its raw form and also as an average of the two days. Sentiment was measured on the same two days. Sentiment comprises two measurements: a positive to negative ratio and a percent positive out of the total number of postings. The positive to negative ratio does not include neutral postings, whereas the percent positive rating is the percentage of positive postings out of all postings (positive, negative, and neutral).
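The two sentiment measurements described above can be illustrated with a minimal sketch. The class and function names, and the assumption that mentions equal the sum of the three categories, are illustrative choices made here and are not taken from Social Searcher.com; the counts in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentimentCounts:
    """Post counts for one company on one day (hypothetical container)."""
    positive: int
    negative: int
    neutral: int

    @property
    def mentions(self) -> int:
        # Assumption: every mention falls into exactly one of the three categories.
        return self.positive + self.negative + self.neutral


def sentiment_ratio(counts: SentimentCounts) -> Optional[float]:
    """Positive to negative ratio; neutral posts are ignored."""
    if counts.negative == 0:
        return None  # ratio is undefined when there are no negative posts
    return counts.positive / counts.negative


def percent_positive(counts: SentimentCounts) -> Optional[float]:
    """Positive posts as a percentage of all posts, neutral included."""
    if counts.mentions == 0:
        return None  # no posts at all on that day
    return 100.0 * counts.positive / counts.mentions


# Made-up counts for illustration only, not data from the study.
day = SentimentCounts(positive=30, negative=10, neutral=60)
print(sentiment_ratio(day), percent_positive(day))
```

When most posts are neutral, the two measures can diverge sharply: the ratio depends only on the small non-neutral subset, while the percent positive is diluted by the neutral posts.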
The dependent variables for this study are stock price and annual revenue. Stock prices and annual revenue were noted for Jan. 15, 2021 and July 15, 2021. The average stock price over the six-month period and the percent change in price (±%) were calculated as well. The revenue change from 2020 to 2021 was determined from the annual report filed by each company.
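As a small sketch of the price calculations, the percent change between the two observation dates can be computed as below. Taking the average as the simple mean of the two observed prices is an assumption made for illustration; the study does not state how the six-month average was obtained. The function names and example prices are hypothetical.

```python
def percent_change(start: float, end: float) -> float:
    """Signed percent change from the first observation to the second."""
    if start == 0:
        raise ValueError("starting value must be non-zero")
    return 100.0 * (end - start) / start


def two_point_average(first: float, second: float) -> float:
    """Mean of the two observations (assumed averaging method)."""
    return (first + second) / 2.0


# Made-up prices for illustration only.
jan_price, jul_price = 100.0, 113.0
print(percent_change(jan_price, jul_price))
print(two_point_average(jan_price, jul_price))
```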

Results
The annual revenue for the Dow 30 has a right-skewed distribution: 70% of the Dow 30 companies have an annual revenue of $100 billion or less, 20% have an annual revenue of between $100 and $200 billion, 6.67% have an annual revenue of between $200 and $300 billion, and one outlier company (Walmart) had $573 billion in annual revenue for the year 2021. Changes in stock prices during the six-month period between January 2021 and July 2021 show that 40% of the Dow 30 companies had an increase in stock price of between 3% and 13%. The same number of companies showed a greater than 13% increase in stock price during the six-month time frame. This is summarized in Table 2. Bivariate correlations involving mentions and changes in sentiment ratios and positive sentiment during the six-month period are shown in Table 3. The Pearson correlation coefficients show that the change in positive sentiment is negatively related to the company's annual revenue; this is significant at .003. The percent change in sentiment ratio (positive:negative, not including neutral sentiment) was also significant at .046.

Table 3. Bivariate correlation
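For readers unfamiliar with the statistic, the Pearson coefficient used in the bivariate correlations can be sketched as follows. The function names are illustrative, and the accompanying t statistic (with n - 2 degrees of freedom) is the standard test for a nonzero correlation; the statistical package actually used in the study is not stated, so this is a sketch rather than a reproduction of the analysis.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up values for illustration only.
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

With only 30 observations and a highly skewed variable such as annual revenue, a single extreme company can dominate the coefficient, which is one reason to inspect the scatter before interpreting the significance level.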
Stepwise regression was performed with annual revenue as the dependent variable. The results shown in Table 4 find one independent variable significant at .003: the change in positive sentiment, whose negative slope indicates a negative relationship between a change in positive sentiment and the annual revenue of the company.

Table 4. Stepwise regression for annual revenue
Stepwise regression was also performed with the percent change in stock price as the dependent variable. No significant results were found.
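The general idea of stepwise selection can be sketched as below. This is a simplified, forward-only variant that admits a predictor when its partial F statistic exceeds a fixed threshold; it is not the procedure of any particular statistics package, and the threshold of 4.0, the function names, and the example data are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def _center(v: Sequence[float]) -> List[float]:
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def forward_select(predictors: Dict[str, Sequence[float]],
                   y: Sequence[float],
                   f_to_enter: float = 4.0) -> List[str]:
    """Return predictor names in the order they enter the model."""
    n = len(y)
    resid = _center(y)  # residual of y after fitting the intercept only
    cands = {name: _center(col) for name, col in predictors.items()}
    selected: List[str] = []
    while cands:
        rss = _dot(resid, resid)
        df_resid = n - len(selected) - 2  # intercept + selected + candidate
        if df_resid <= 0:
            break
        best_name, best_f = None, f_to_enter
        for name, col in cands.items():
            ss = _dot(col, col)
            if ss < 1e-12:
                continue  # constant, or collinear with selected predictors
            gain = _dot(col, resid) ** 2 / ss  # drop in residual sum of squares
            if gain < 1e-12:
                continue
            rss_new = rss - gain
            f_stat = float("inf") if rss_new < 1e-12 else gain / (rss_new / df_resid)
            if f_stat > best_f:
                best_name, best_f = name, f_stat
        if best_name is None:
            break  # no remaining candidate passes the entry threshold
        chosen = cands.pop(best_name)
        ss = _dot(chosen, chosen)
        slope = _dot(chosen, resid) / ss
        resid = [r - slope * c for r, c in zip(resid, chosen)]
        # Remove the chosen predictor's component from the remaining candidates.
        cands = {
            name: [a - (_dot(chosen, col) / ss) * c for a, c in zip(col, chosen)]
            for name, col in cands.items()
        }
        selected.append(best_name)
    return selected


# Made-up data: y depends on x1 only; x2 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
noise = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
y = [2 * a + e for a, e in zip(x1, noise)]
print(forward_select({"x1": x1, "x2": x2}, y))
```

A full stepwise procedure would also re-test entered predictors for removal at each step and would typically use p-value thresholds rather than a fixed F value.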
Sentiment and mentions were gathered using Social Searcher.com across eleven social media platforms for the Dow 30. Figure 1 shows a bar graph of the top four social media platforms that contributed the most discussion. All 30 companies were mentioned frequently on Twitter, with Reddit a close second. VKontakte was the third most used social media platform, followed by Dailymotion.

Discussion
The sample used in this study was only 30, which may have been acceptable had the annual revenue data not been so highly skewed. Given such an extreme skew in annual revenue, the results in this exploratory study were underwhelming. These are also unusual times, especially when these data were collected. The Covid-19 pandemic was still affecting consumers in 2021 and was probably affecting consumer sentiment in a negative way. With lockdowns and shutdowns, it is understandable that social media would become a place to vent negative feelings. It will be interesting to repeat this study after the pandemic is considered over and the public regains confidence that the future will be brighter. Increased rates of depression have been noted in the media, and this purported increase in depression has probably introduced negativity into social media that might not otherwise have existed. It is hard to know whether social media sentiment was negatively impacted until consumer confidence is restored and new data are examined.
The most interesting finding was a significant negative relationship between a change in the percent of positive sentiment and annual revenue. H2 stated a positive relationship between the two variables; however, the outcome showed just the opposite. The implication is that as sentiment went down, annual revenue increased, which is difficult to fathom. This may be due to the way sentiment was measured, or to general negativity in social media during the pandemic. The percent of positive sentiment is a measure based on total sentiment, including negative and neutral sentiment. Another measure of sentiment in this study was a sentiment ratio of positive to negative, excluding neutral sentiment. This ratio did not show a significant relationship with annual revenue. In the raw data, most of the "mentions" on social media were graded as neutral, and this was true for all the companies. Figure 2 shows the raw ratings for Apple as an example. The positive to negative ratio excluding the neutral posts was not a helpful metric, and neither was the "mentions" metric. It was expected that larger companies would have more mentions on social media, but this was not found to be the case.
The social media platforms that garnered the most discussion were Twitter, then Reddit, VKontakte, and Dailymotion. Twitter and Reddit were by far the most popular places for discussion about these companies. Further study should break down the social media platforms at a more granular level to see if the sentiment leans more positive or negative depending on which platform is used. Previous studies in the literature have focused mainly on Facebook and Twitter, but this study shows that other platforms could be more enlightening.

Conclusions and Recommendations
In conclusion, none of the hypotheses was supported in a meaningful way. The one finding that showed a negative relationship between social sentiment and annual revenue disappeared when sentiment was measured without the neutral category. Two obvious drawbacks became immediately apparent in conducting this study. The first is how to value the financial health of a company. Annual revenue is a good measurement, but what about stock price? Does a higher stock price imply a healthier company? On the surface it might look like it does, but in real terms stock price does not mean anything unless it is measured against how many shares are outstanding. Apple stock is currently priced at $145, but it has split several times in the past, making the current price hard to compare with that of other companies. The price-to-earnings ratio might be a better measurement. In this study the change as a percentage of stock price was measured; however, no significant differences were found.
The second drawback in this study is the sentiment measurement. This metric is measured using the number of positive or negative words used in a post. AI is used to detect these words and assign a sentiment rating. As AI becomes more sophisticated this rating may become more accurate, but it is questionable whether the sentiment rating is a valid measure in its current form. There are several companies that provide sentiment ratings, each using its own proprietary methods to assign a rating. Without a consensus rating it is difficult to determine how meaningful each rating is. Fidelity has recently added something it calls a "sentiment score" for investors to use in their stock research. This sentiment score is different from that of Social Searcher.com and from those of other tools such as Sprout Social or Hootsuite. There needs to be more research in the academic realm about how to measure social media sentiment. Further study is also warranted on the ability of artificial intelligence to detect the nuances in written posts. Sarcasm is not easily identified by AI.
The sample size in this study was only 30, which is too small to draw any conclusions. It was used because the Dow 30 is a restricted set of thirty stocks. On an exploratory level, this study has opened up a vast number of research opportunities. Aside from measurement issues, the literature on social media sentiment is both outdated and underwhelming. Future research should focus on social media sentiment and its effects on brands and companies, as well as on strategies that companies can use to combat negative messaging. As this research is conducted, the tools for measuring social media sentiment will no doubt improve, making it harder to compare sentiment measured with the more sophisticated tools of the future against sentiment measured with the tools we currently have, which will soon be outdated. Currently, humans are better than machines at identifying negative vs. positive sentiment. This should not deter the researcher; social media sentiment is important to monitor. It helps in identifying problems of consumer perception and, if these are identified early enough, allows a company to address the issues that cause any negative sentiment.
Facebook is no longer the primary source of messaging in social media on a global scale. According to this study, Twitter is far more influential in determining social media sentiment, followed by Reddit and then the Russian social media platform VKontakte. Combining eleven platforms, as Social Searcher.com does, sounded like a good idea for this study. However, adding more platforms might obfuscate true sentiment, and in this case Simpson's Paradox might be at play. More research into the individual platforms would be useful, in tandem with looking at all the platforms in the aggregate. Additionally, the social etiquette and culture on each medium are different. This may result in different types of messages being given more weight than they deserve. Social media etiquette is another area of research that should be explored.
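To make the Simpson's Paradox concern concrete, the following sketch uses entirely made-up post counts for two hypothetical companies on two hypothetical platforms; none of these numbers come from the study. Company A has the higher share of positive posts on each platform taken separately, yet company B appears more positive once the platforms are pooled, because the two companies are discussed on the platforms in very different proportions.

```python
from typing import Dict, Tuple

# (positive posts, total posts) per platform and company; invented for illustration.
counts: Dict[str, Dict[str, Tuple[int, int]]] = {
    "platform_x": {"A": (90, 100), "B": (800, 1000)},
    "platform_y": {"A": (300, 1000), "B": (20, 100)},
}


def platform_share(platform: str, company: str) -> float:
    """Share of positive posts for one company on one platform."""
    positive, total = counts[platform][company]
    return positive / total


def pooled_share(company: str) -> float:
    """Share of positive posts for one company with all platforms combined."""
    positive = sum(per_company[company][0] for per_company in counts.values())
    total = sum(per_company[company][1] for per_company in counts.values())
    return positive / total


for platform in counts:
    print(platform, platform_share(platform, "A"), platform_share(platform, "B"))
print("pooled", pooled_share("A"), pooled_share("B"))
```

An aggregate rating that pools platforms can therefore reverse a pattern that holds on every individual platform, which supports examining platforms separately as well as in the aggregate.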
In conclusion, there is much research to be done in the social media realm in general, and on measuring social media sentiment in particular. We cannot rely on studies that are ten or more years old, and even studies that are five years old are becoming questionable. Yet these older studies are consistently cited in the social media literature without questioning their currency. Furthermore, the last two years have seen a world in turmoil due to a global pandemic as well as major disruptions in the global supply chain. The pandemic disrupted both the social order and the economic wellbeing of many countries. Research from the Boston University School of Public Health shows increasing rates of depression in the last two years (https://www.bu.edu). The pandemic may have also contributed to a greater reliance on social media, making analysis based on past trends useless.
The key here is to monitor and continuously update research in the social media realm. This is already done at the proprietary level so that companies can stay on the cutting edge. It should also be done in the academic realm so that scholars can remain current in social media research.