Modeling - Conclusions

Introduction

This section serves as a fusion of the supervised modeling across the previous sections:

Specifically, this will take the best binary classification models from Naive Bayes, Decision Trees, and Support Vector Machines. These binary classification models were trained and tested on the strict two label political biases (Left and Right). To iterate across the other sections, they ignore Lean Left, Center, and Lean Right political biases. Given that sentiment towards the topic of student loan forgiveness is generally split along political lines, the idea is that correlation can be projected onto Reddit authors, predicted by a collection of their posts.

In all three families of the supervised learning models, the strict two label political bias binary classifiers performed the best. By performing probabilistic weighting with the three models, an overall political bias of a Reddit author can be obtained, complete with a probability. Therefore, a general consensus of their sentiment towards this topic can be quantitatively calculated.

Probabilistic Weighting

Using the binary classifiers for Right and Left political bias from the three models, a conclusive weighting metric was made.

For each Reddit Author:

\(C_L\): count Left classifications
\(C_R\): count Right classifications
\(P_L\): product of Left probabilities, given a Left classification
\(P_R\): product of Right probabilities, given a Right classification
\(W_L = C_L \cdot P_L\): weight of Left classifications
\(W_R = C_R \cdot P_R\): weight of Left classifications

Let Right political bias be a negative value (as to represent negative sentiment), then the following scores and labels can be interpreted:

Sentiment Score (\(S_s\)):

\[S_s = W_L + (-W_R)\]

Sentiment Label (\(S_l\)):

\[S_l = \begin{cases} S_s \leq -1 : \text{Negative} \\ -1 < S_s < 1 : \text{Neutral} \\ S_s \geq 1 : \text{Positive} \end{cases}\]

Using this probabilistic weighting method, the following results were derived:

Author	Left	Right	Sentiment Score	Sentiment Label
Loading ITables v2.3.0 from the internet... (need help?)

Sentiment Gauge Overall

Comparative Gauges

Positive Sentiment

Neutral Sentiment

Negative Sentiment

Reddit Author Content

Author	Content	Sentiment Score	Sentiment Label
Loading ITables v2.3.0 from the internet... (need help?)

Conclusion

By fusing together several different well performing models built to analyze political bias within text, especially text surrounding the topic of Student Loan Forgiveness, overall weighted political bias scores for Reddit Authors were calculated. Given that this topic is generally split upon political party lines, with the Left (or liberal) showing more positive sentiment towards the topic and the Right (or conservative) showing more negative sentiment towards the topic, political bias was used in an attempt to project overall sentiment on the topic for Reddit Authors. Using this weighted political bias score, authors were grouped into the sentiment classes of Positive, Neutral, and Negative.

Given this a complex topic, with arguments from financial equity to blame shifting, there are general undertones in many of the authors’ contents that do appear to be positive or negative. Some of those with positive sentiment share the tones of the benefits of student loan forgiveness and are spiteful towards the Right side of the government for blocking this. Some of the neutral sentiment appears to be more targeted to sharing facts and helping others understand the topic better. The negative sentiment appears to contain authors who don’t want these costs passed onto to tax payers as a whole, have “worked off their loans, so they should too” themes, along with negative sentiment towards student loan forgiveness ultimately being a failed promise, and some that spiteful towards the Left side of the government.

Overall, having several metrics strung together does seem to produce scores and capture sentiment decently, even given such a complex topic where there is rarely a definitive yes or no.