CTA MoodMeter (beta)
What is this?
Since our goal is to make your ride better, it only makes sense that we come up with ways to measure our success. What we have here is an initial attempt to gauge the overall “mood” of the CTA, both on a daily basis and in real time. It’s informative, kinda fun, and something we hope that others will find useful as well.
How it Works
We use a form of computational linguistics called “sentiment analysis” to automatically classify all user messages and related tweets into three categories: Negative, Neutral, or Positive. Each message is then given a numerical score based on its category: -1 for negative, 0 for neutral, and 1 for positive. The “mood” score for a given day is the average of the scores of all the messages from that day. These daily averages are what is plotted above, with 1 being the most positive and -1 being the most negative.
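To make the arithmetic concrete, here is a minimal sketch of the daily averaging step. The function name `daily_mood`, the label strings, and the sample dates and messages are all made up for illustration; this is not our production code, just the same calculation in miniature.

```python
from collections import defaultdict
from datetime import date

# Score assigned to each category, as described above.
SCORES = {"negative": -1, "neutral": 0, "positive": 1}


def daily_mood(messages):
    """Average the category scores per day.

    `messages` is an iterable of (day, label) pairs, where `label`
    is one of the keys in SCORES. Returns {day: mean score}.
    """
    totals = defaultdict(lambda: [0, 0])  # day -> [sum of scores, message count]
    for day, label in messages:
        totals[day][0] += SCORES[label]
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


# Hypothetical, already-classified messages (illustrative data only).
sample = [
    (date(2012, 5, 1), "negative"),
    (date(2012, 5, 1), "negative"),
    (date(2012, 5, 1), "positive"),
    (date(2012, 5, 1), "neutral"),
    (date(2012, 5, 2), "positive"),
    (date(2012, 5, 2), "neutral"),
]

mood = daily_mood(sample)
print(mood[date(2012, 5, 1)])  # (-1 - 1 + 1 + 0) / 4 = -0.25
print(mood[date(2012, 5, 2)])  # (1 + 0) / 2 = 0.5
```

Note that a day with only a handful of messages can swing all the way to -1 or 1, which is one reason sample size matters when reading the chart.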
Many factors affect the collective mood of the CTA: weather, day of the week, the media, delays, etc. Measuring it is a complicated and somewhat subjective endeavor that will take us some time to perfect and fully understand. Further, there are two primary limitations to keep in mind when interpreting our data...
The first consideration is sample size and quality, which can introduce bias into our results. Not everyone posts messages or tweets, and those who do don’t post every day. If everyone did, we would have a perfectly representative sample. Further, some folks might be more inclined to post when they are mad (or happy). Also, the mood of folks who contribute to social media may not reflect the mood of folks who don’t know what Twitter is, etc.
The second major limitation of our system is the actual classification of the messages themselves. We are using a machine to do this, and it’s not always good at it. Currently we use a “Bayes classifier”, which looks at each of the words in a message somewhat independently. As a result, it can sometimes miss the true intent of a word or phrase as it exists in the context of the rest of the sentence. Sarcasm is also difficult for our system to pick up on.
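To show why word independence trips things up, here is a toy sketch of a Bayes classifier. It assumes a plain "naive Bayes" model with add-one smoothing and whitespace tokenization, which may differ from what we actually run; the class name `NaiveBayes` and the five training messages are invented for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy naive Bayes text classifier (illustrative assumption only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of messages
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            words = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: how common this category is overall.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            # Each word contributes on its own, ignoring its neighbors.
            for word in words:
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training messages, for illustration only.
training = [
    ("the train was late again", "negative"),
    ("horrible delay on the red line", "negative"),
    ("great ride this morning", "positive"),
    ("love the new train cars", "positive"),
    ("the bus arrives at noon", "neutral"),
]

clf = NaiveBayes()
clf.train(training)

print(clf.classify("great ride today"))        # positive
print(clf.classify("oh great another delay"))  # positive, though a human reads it as sarcasm
```

In the second message, the word "great" pulls the score toward positive about as hard as "delay" pulls it toward negative, and nothing in the model notices that the two words together change the meaning.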
We are working on solutions to account for all of these and more.