In the first two posts of our series on sentiment analysis, we discussed how to use sentiment analysis to get value from service and how to gain a competitive advantage. Now let’s spend some time on how sentiment analysis works at a functional level. For simplicity’s sake, we will assume that a model sensitive to banking terminology has already been constructed.
Sentiment analysis is composed of three high-level activities [i]:
- Construct the Corpus
- Pre-process the data
- Extract Knowledge
The corpus is simply the body of information you are looking to analyze. It can draw on both internal sources (e.g. emails, call center notes) and external sources (e.g. investor conference calls, social media, banking journals). The first step involves identifying the sources of the information you want to analyze and feeding them into the model.
Pre-processing organizes the information so that it can be analyzed. This involves creating a term-document matrix (TDM), a table that records how often each term appears in each document. The tabular format facilitates the analysis of the words and phrases in the corpus. It enables the connection of terms of interest such as “home equity loan” or “mobile application” with sentiment. Pre-processing also determines which areas have the most activity through a scoring mechanism that evaluates the frequency of terms.
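To make the idea of a term-document matrix concrete, here is a minimal sketch in Python. It is an illustration only, not the method a production model would use: the three sample documents, the `ngrams` helper, and the choice to count phrases of up to three words are all assumptions made for this example. Counting multi-word phrases is what lets a term such as “home equity loan” appear as a single row.

```python
from collections import Counter

# Hypothetical mini-corpus; a real one would come from emails, call notes, etc.
documents = [
    "I love the new mobile application",
    "The mobile application keeps crashing",
    "Great rate on my home equity loan",
]

def ngrams(text, max_n=3):
    """Yield every phrase of 1 to max_n consecutive words, lowercased."""
    words = text.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

# One Counter per document: phrase -> number of occurrences in that document.
doc_counts = [Counter(ngrams(doc)) for doc in documents]

# Rows are terms, columns are documents (in the order of `documents`).
vocabulary = sorted(set().union(*doc_counts))
tdm = {term: [counts[term] for counts in doc_counts] for term in vocabulary}

# Simple frequency score: total occurrences of each term across the corpus.
totals = {term: sum(row) for term, row in tdm.items()}

print(tdm["mobile application"])   # counts per document
print(totals["home equity loan"])  # corpus-wide frequency
```

The `totals` dictionary plays the role of the frequency-based scoring mentioned above: terms with the highest totals are the areas with the most activity. A real pipeline would typically also remove stop words and normalize word forms before counting.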
The last step in the process is knowledge extraction. A number of analytical functions are performed in this phase, depending on what you are trying to determine. The main function used in sentiment analysis is association. What we are after is the connection of feelings (e.g. good or bad) with the operational aspects of the bank. This is where a well-constructed model comes in. The model should be able to pick up context. For example, a loan closing is different from closing a checking account. The terms need to be linked together to ensure that we are getting the right results. Closing a checking account may signal the loss of a customer, whereas closing a loan could mean that a new customer has been added. Making the right associations can yield insight into potential changes at your bank and how you stack up against the competition.
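The association step can be sketched roughly as follows. This is a deliberately crude illustration under stated assumptions: the `POSITIVE` and `NEGATIVE` word lists, the `CONTEXT_RULES` table, and both function names are hypothetical and invented for this example. A well-constructed model would learn these associations rather than rely on hand-written lists and substring matching.

```python
import re

# Hypothetical sentiment lexicon; real models use far richer resources.
POSITIVE = {"great", "love", "easy", "happy"}
NEGATIVE = {"crashing", "slow", "frustrated", "terrible"}

# Hypothetical context rules: the same word ("closing") means different
# things depending on the term it is linked with.
CONTEXT_RULES = {
    ("closing", "checking account"): "possible customer loss",
    ("closing", "loan"): "possible new customer",
}

def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def associate(documents, terms):
    """Sum the sentiment of every document that mentions each term."""
    scores = {term: 0 for term in terms}
    for doc in documents:
        lowered = doc.lower()
        score = sentiment_score(doc)
        for term in terms:
            if term in lowered:  # simple substring match, for illustration
                scores[term] += score
    return scores

def interpret_closing(text):
    """Return the label of the first context rule whose words all appear."""
    lowered = text.lower()
    for (action, subject), label in CONTEXT_RULES.items():
        if action in lowered and subject in lowered:
            return label
    return None
```

For example, `associate(docs, ["mobile application", "home equity loan"])` returns one net sentiment score per term, and `interpret_closing("I am closing my checking account")` returns the customer-loss label, while a sentence about closing on a loan returns the new-customer label. The point of the sketch is the linkage: sentiment is attached to an operational term, and an ambiguous word is resolved by the term it appears with.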
[i] Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications [electronic resource] / Gary Miner … [et al.], page 79