Most Twitter users are very consistent: they either tell the truth almost all the time
or share false information almost all the time.
The users who spread false information
are fewer in number, but they post many more tweets, sometimes over 16,000.
People also tend to retweet messages without checking whether they are true or false. Tweets that reference articles usually agree with those articles, even if the articles themselves are misleading.
Features that were much less useful included average word length, the shortest word, hashtags, and various account statistics.
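The text features above are simple to compute. Here is a minimal sketch of how such features could be extracted with plain Python; the function name and regexes are illustrative assumptions, not the project's actual code:

```python
import re

def text_features(tweet: str) -> dict:
    """Compute simple per-tweet text features like those discussed above.

    This is an illustrative sketch; the project's real feature set and
    tokenization rules may differ.
    """
    words = re.findall(r"[A-Za-z']+", tweet)       # crude word tokenizer
    hashtags = re.findall(r"#\w+", tweet)          # hashtag mentions
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "shortest_word_len": min((len(w) for w in words), default=0),
        "n_hashtags": len(hashtags),
    }

feats = text_features("Breaking: the #election results are #fake news")
print(feats)  # e.g. {'avg_word_len': ..., 'shortest_word_len': 3, 'n_hashtags': 2}
```

Features like these would typically be assembled into a dataframe column-by-column before model training, which is also where account statistics would be joined in.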
For news articles, we used a Kaggle fake-vs-real news dataset. During data exploration we saw that fake news and true news behave very differently.
To compare the two systems, we ran roughly 200 randomly selected tweets from the NYC mayoral election through both our fine-tuned BERT model and a standardized ChatGPT prompt, collecting each model's predicted label and confidence score. On average, ChatGPT rated the tweets about 20% less true than BERT did. BERT remained the stronger strict classifier: its predictions were consistent and usually high in confidence, reflecting that it was trained directly on labeled truth data using only the tweet text. As the previous plots showed, BERT produced very confident, mostly binary predictions, whereas ChatGPT's scores were more moderate and varied.
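The two summary statistics in this comparison, the average truth-score gap and how "binary" each model's scores are, reduce to simple arithmetic over the two score lists. A minimal sketch, with made-up scores standing in for the real 200-tweet run:

```python
def mean_gap(bert_probs, gpt_probs):
    """Average difference in probability-of-true between the two models.

    A positive result means ChatGPT rates tweets less true than BERT on
    average, as reported above.
    """
    assert len(bert_probs) == len(gpt_probs)
    gaps = [b - g for b, g in zip(bert_probs, gpt_probs)]
    return sum(gaps) / len(gaps)

def mean_polarity(probs):
    """Average distance of scores from 0.5: near 0.5 means moderate scores,
    near the maximum of 0.5 means confident, binary-looking scores."""
    return sum(abs(p - 0.5) for p in probs) / len(probs)

# Hypothetical confidence scores for illustration only.
bert = [0.95, 0.97, 0.05, 0.92]
gpt  = [0.70, 0.80, 0.20, 0.65]

print(mean_gap(bert, gpt))       # positive: ChatGPT scores lower on truth
print(mean_polarity(bert))       # high: confident, near-binary predictions
print(mean_polarity(gpt))        # lower: more moderate scores
```

On real data, the same two numbers summarize the plots described above: BERT's polarity sits near 0.5 while ChatGPT's is noticeably lower.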
When testing our XGBoost model against ChatGPT, we took a random sample of 200 data points from the original dataset and compared the results. We have to admit that this accuracy does tend to drop once we leave the sphere of data involved in the 2016 election. When given more current data, the model is skewed, labeling tweets as Fake news much more often than Real.
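The skew toward Fake labels is easy to quantify: sample the newer tweets, run them through the model, and measure the fraction labeled Fake. A minimal sketch of that check, with a hypothetical prediction pool standing in for actual model output:

```python
import random

def fake_fraction(predictions):
    """Fraction of predictions labeled 'Fake'. Values well above the
    training-set base rate indicate the skew described above."""
    return sum(p == "Fake" for p in predictions) / len(predictions)

random.seed(0)  # reproducible sample, mirroring the 200-point subset
# Hypothetical predictions on newer, post-2016 tweets (illustration only).
pool = ["Fake"] * 150 + ["Real"] * 50
sample = random.sample(pool, 100)
print(fake_fraction(sample))
```

In practice this fraction would be compared against the Fake rate in the labeled 2016-era data to show how far the model drifts on current tweets.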