How Text Analytics is Changing Marketing Research

By: Grey A. Geppert
What do kids these days look for in a college?
Ask five people and you’ll get at least six answers. There are plenty of polls, studies and expert opinions out there, all pointing to very different conclusions, which makes the question a vexing one that has plagued higher ed administrators for years. UMSL’s digital marketing team recently took a fresh approach to this marketing mystery. Instead of relying on the traditional route of surveys and interviews with students, they used an innovative method to gather tens of thousands of unsolicited opinions and complaints from all around the country, without asking a single question.
At Swizzle, we are specialists in text analytics. When many people hear the term text analytics, they think of social listening tools or sentiment analysis, but it is a broad umbrella term, like biology or neuroscience. Text analytics is actually an ever-growing group of techniques and algorithms that measure and count text strings objectively. The field isn’t new, and it has been used to do things like analyze what famous literary works have in common and predict psychosis in at-risk youth. What Swizzle’s own text analysis system, JETS, does is apply these techniques to marketing research, giving researchers the distinct advantage of analyzing qualitative data, like Yelp reviews or transcripts from the customer service department, in an objective manner, free of unconscious bias.

Network Visualization of a Topic Model

For the UMSL project, we gathered the “raw” opinions and complaints of students, aggregating thousands of lines of content on “choosing the right college.” In book form, all this data would have been the size of a large J.K. Rowling novel, and a lot less fun to read. Luckily, with text analytics, you don’t have to read it all to know what it says. Using a technique called topic modeling, we organized the data into topics and ranked them by importance. The result was a clear-cut list of student priorities, from most important to least.
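For readers curious what topic modeling looks like in practice, here is a minimal sketch. It is not how JETS works internally, which this article does not describe. It assumes one common approach, latent Dirichlet allocation (LDA) fitted with a toy collapsed Gibbs sampler, and it uses a handful of invented comments rather than the UMSL data. The function name `lda_gibbs`, its parameters, and the choice to rank topics by the share of words assigned to them are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # words per topic, per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total words per topic
    assignments = []

    # Start by assigning every word to a random topic.
    for d, doc in enumerate(docs):
        z = []
        for w in doc:
            k = rng.randrange(num_topics)
            z.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z)

    # Repeatedly resample each word's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, topic_total


# Invented example comments (NOT the UMSL data), already stripped of stop words.
comments = [
    "nursing program clinical hours job placement",
    "engineering program internship job salary",
    "tuition scholarship loans budget cost",
    "cheap tuition scholarship cost housing",
    "program reputation job career internship",
    "loans budget cost tuition aid",
]
docs = [c.split() for c in comments]

topic_word, topic_total = lda_gibbs(docs, num_topics=2)

# Assumption: rank topics by how many words were assigned to each one.
ranked = sorted(range(len(topic_total)), key=lambda k: topic_total[k], reverse=True)
for k in ranked:
    top_words = [w for w, n in topic_word[k].most_common(4) if n > 0]
    print(f"topic {k}: {topic_total[k]} words, top terms: {top_words}")
```

On real data you would normally reach for an established library rather than a hand-rolled sampler, and you would clean the text first (lowercasing, removing stop words, and so on).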
In our system, a word or topic’s “importance” is a summary of several different statistical scores. These scores evaluate things like a word’s weight, its prevalence in the data, or its connection strength, meaning how often the word appears connected to certain other words. For example, the words Taylor and Swift always have a very high connection strength. In the data for UMSL, we saw that words related to major programs were most important, followed by keywords related to budget, with terms on social experience, like dating and partying, trailing by a wide margin. With this in mind, we dug into the data accordingly, reading what our system indicated were the most representative comments in the data set.
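To make “connection strength” a little more concrete, here is a rough sketch of one way such a score could be computed. The article does not say which statistic JETS uses, so this swaps in pointwise mutual information (PMI) over adjacent word pairs, a standard measure of how much more often two words appear together than chance would predict. The function name `pmi_scores`, the `min_count` cutoff, and the sample sentences are all invented for illustration.

```python
import math
from collections import Counter


def pmi_scores(sentences, min_count=2):
    """Return PMI (in bits) for adjacent word pairs seen at least min_count times."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        pair_counts.update(zip(tokens, tokens[1:]))  # adjacent pairs only

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (a, b), n in pair_counts.items():
        if n < min_count:
            continue  # PMI is unreliable for pairs seen only once
        p_pair = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


# Invented sentences for illustration only.
sentences = [
    "taylor swift tour tickets".split(),
    "i saw taylor swift live".split(),
    "the tour was great".split(),
    "the tickets were great".split(),
]

scores = pmi_scores(sentences)
for pair, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(pair, round(score, 2))
```

The `min_count` cutoff is there because raw PMI tends to reward pairs that appear only once. A production system would likely combine a score like this with the other signals mentioned above, such as prevalence.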
Interestingly, what we found is that students don’t really choose a college. They choose a job or a very general career path, then judge all colleges based on their ability to get them there while staying within the student’s budget. In other words, without really knowing it, today’s high school students are extremely pragmatic and choose their college based on ROI. Things like private school versus public school didn’t matter, only the reputation of the school’s program.
To validate this further, we compared the results from real students against blog posts aimed at helping them make their decision. In the blogger data, keywords and topics related to social experience and nice campuses were much more prevalent. Based on this, we theorized that the persistent notion that students go to college to party is actually an outdated idea that continues to be spread by a generation that has already gone to and graduated from college, whereas real student data indicates modern American students would probably describe going to college specifically to party as “lame.”
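A comparison like this can be approximated with very little code. The sketch below is an assumption about the general idea, not a description of the JETS pipeline. It measures what share of all words in each data set is taken up by a few keywords, so that two collections of different sizes can be compared fairly. The helper `keyword_share`, the keyword list, and both tiny data sets are made up and do not reproduce the project’s results.

```python
from collections import Counter


def keyword_share(docs, keywords):
    """Share of all tokens in docs taken up by each keyword (0.0 to 1.0)."""
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    return {k: counts[k] / total for k in keywords}


# Made-up token lists for illustration only.
student_docs = [
    "program job cost program internship".split(),
    "cost tuition program job".split(),
]
blogger_docs = [
    "party campus dorm party".split(),
    "campus program party food".split(),
]

keywords = ["program", "cost", "party", "campus"]
student_share = keyword_share(student_docs, keywords)
blogger_share = keyword_share(blogger_docs, keywords)

for k in keywords:
    print(f"{k:8s} students: {student_share[k]:.3f}  bloggers: {blogger_share[k]:.3f}")
```

Using shares rather than raw counts matters here, because the student and blogger collections are unlikely to be the same size.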
Today, “The UMSL Project,” as we affectionately refer to it, is the first case we talk about when providing real-life examples of what our technology can do. It’s a perfect case of how gut feelings and the way we perceive the world can be wrongly influenced by our individual viewpoints. Industry experts, through no fault of their own, were spreading incorrect information colored by what the situation had been when they were younger and by the studies they read on millennials, which were also authored by a past generation observing from the outside in. Meanwhile, the answers were all on the internet, cropping up naturally, written by their target market in the hundreds of thousands. Today, with text analytics, we can finally leverage all of that information and get the truth straight from the mouths of our target audience. All we have to do is shut up and listen.