Text Analytics: Telling Stories that Drive Insurers’ Progress
To find romance, millions of people globally use online dating sites that offer real-time, geolocation-based matching services. Using algorithms, the sites learn about their users in much the same way that Amazon, Netflix and Pandora do when they recommend products, movies or songs based on a user’s preferences. The challenge with dating sites is understanding which self-reported data is “real” and finding that truly perfect match, given that many people tend to exaggerate their personal statistics (age, body type, height and education).
Rather than letting a match-making site’s own algorithm work its magic, one PhD candidate decided to take another approach. Chris McKinlay’s OkCupid profile wasn’t generating the results he had hoped for. After approaching it several different ways without much success, he turned to a clustering algorithm called K-Modes, which he ran using the Python programming language. His K-Modes approach helped him find clusters of women that fell into seven statistically identifiable groups. He then wrote a new program to visit the pages of his top-rated matches.
After using text mining to learn what interested his prospective matches the most, he rewrote his bio around those interests in two different profiles geared toward two different ‘clusters’ of women. Long story short, after refining his written profile based on the data he amassed from the female profiles, he tested the effectiveness of his “matching” skills via 88 first dates, which eventually resulted in a successful match. (Read his full story on Wired.com.)
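For readers curious about what a K-Modes clustering step looks like in practice, here is a minimal sketch in plain Python. It is not McKinlay’s code: the survey answers, the function names and the simple seeding rule are all assumptions made for illustration. The idea is that each record is a set of categorical answers, the distance between two records is the number of answers on which they differ, and each cluster is summarized by its most common answer to each question (its “mode”).

```python
from collections import Counter


def mismatches(a, b):
    """Count the attributes on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k, max_iter=20):
    """Group categorical records into at most k clusters (illustrative only)."""
    # Assumption: seed with the first k distinct records. Production
    # implementations use more careful initialisation.
    modes = []
    for record in records:
        if record not in modes:
            modes.append(record)
        if len(modes) == k:
            break

    labels = None
    for _ in range(max_iter):
        # Assign each record to the closest mode (ties go to the lower index).
        new_labels = [
            min(range(len(modes)), key=lambda j: mismatches(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop
        labels = new_labels

        # Recompute each mode as the most common value per attribute.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )
    return labels, modes


# Hypothetical answers to three multiple-choice questions.
answers = [
    ("yes", "no", "often"),
    ("no", "yes", "never"),
    ("yes", "no", "sometimes"),
    ("no", "yes", "never"),
    ("yes", "no", "often"),
    ("no", "yes", "sometimes"),
]

labels, modes = k_modes(answers, k=2)
print(labels)
print(modes)
```

With real profile data the number of clusters and the starting points would need tuning; the sketch only shows the assign-then-update loop at the heart of the method.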
Structured vs Unstructured Data
Analytics has tremendous potential for achieving many successful outcomes, especially in the insurance industry. In recent years, we’ve heard a lot about big data, the overabundance of it, and the importance of getting the right information to be useful.
Data comes in all shapes and sizes. Structured data refers to information with a high degree of organization. Quite simply, it’s data that fits in columns and rows, including sales figures, loss ratios, income, age – the hard numbers.
There is also a tremendous amount of information that does not fit nicely into columns and cells. Referred to as unstructured data, it essentially equates to everything else – website content, images, satellite images, customer service notes, social network exchanges, sensor data, emails, news releases, voice recordings, to name a few examples. And that’s only scratching the surface.
While such information can’t exactly fit into a spreadsheet, it has lots to tell us. Tapping into more unstructured data, especially text, mining it and using the resulting insights for decision-making is becoming more commonplace. It offers insurers the ability to improve efficiencies, steer product development, understand customers’ needs and improve the customer experience.
In practical terms, text analytics – like that used in the OkCupid story – converts text data into insights and patterns. It first surfaced in the mid-1980s and was a highly labor-intensive activity. As most information (common estimates say over 80%) is currently stored as text, text mining is believed to have high commercial potential. What’s different today is that tools such as Python make it possible to build a variety of models without an army of data scientists or, if you have data scientists on staff, to put them to better use.
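To make “converting text into patterns” concrete, here is a small sketch using only Python’s standard library. The sample notes, the stopword list and the function name are invented for illustration; a real project would use a fuller stopword list and likely a dedicated text-processing library. The sketch lowercases each note, splits it into words, drops common filler words and counts what remains.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was",
             "on", "at", "from", "with", "by", "no"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stopword terms across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z']+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical free-text notes.
notes = [
    "Insured reported water in the basement.",
    "Water entered from a burst pipe in the kitchen.",
    "Roof damaged by hail; no water inside.",
]

print(top_terms(notes, n=1))
```

Even a simple count like this turns free text into something that fits in rows and columns, which is the starting point for most of the models described here.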
A Good Read for Insurers
Insurers collect an overabundance of text-based data, often in many different languages: applications, emails, social media, claims adjuster notes, property risk engineering field notes, medical records, police statements, surveys, and more. It certainly piles up. Now, insurers like XL Catlin are looking at ways to use this overlooked data to spot trends, identify potential problems, or identify new business opportunities.
We see tremendous potential on many business fronts:
Claims: Claims is one area where text mining has already proven helpful. By systematically searching thousands of claims notes, text mining can pick up patterns. The phrase “water damage” might appear in 20% of the notes, but further analysis might show that water damage actually accounts for only 10% of the claims, for example because a note can mention the phrase without water being the cause of the loss. We can delve further to pinpoint what caused the loss – a burst pipe, faulty washer, overflowing bathtub. With this insight, we can determine whether we need to revise our pricing guidelines, develop a set of mitigation efforts to prevent water damage or do something else – yet to be discovered.
Underwriting: Text analytics provides valuable insights into our clients and offers our underwriters a way to look deeper into an account than just the application. It not only provides the hard facts about clients but helps underwriters gain a “feeling” for the company’s future. Consider the value for a D&O underwriter evaluating a company’s financial health. Is the company’s management positive or wary? Is there a new management team? What is it concentrating on? Today, we can text mine or analyze years of historical records such as the company’s annual reports going back 10 years or more, investor calls and even media mentions.
Product Development: Text mining can also help find patterns offering insight into what insurance products or risk management services clients want or need.
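The claims example above can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the sample notes, the keyword-to-cause map and the function names are assumptions for illustration, not a description of any insurer’s actual system. It first measures how often the phrase “water damage” appears, then looks inside those notes for keywords that hint at the underlying cause.

```python
from collections import Counter

# Assumption: a hand-built keyword-to-cause map. A real one would be
# developed with claims experts and tested against labelled claims.
CAUSE_KEYWORDS = {
    "burst pipe": "burst pipe",
    "washer": "faulty washer",
    "washing machine": "faulty washer",
    "bathtub": "overflowing bathtub",
}


def cause_of(note):
    """Return the first matching cause for a note, or 'unclassified'."""
    text = note.lower()
    for keyword, cause in CAUSE_KEYWORDS.items():
        if keyword in text:
            return cause
    return "unclassified"


def summarize(notes, phrase="water damage"):
    """Share of notes mentioning the phrase, plus a cause breakdown."""
    hits = [n for n in notes if phrase in n.lower()]
    share = len(hits) / len(notes) if notes else 0.0
    causes = Counter(cause_of(n) for n in hits)
    return share, causes


# Hypothetical adjuster notes.
claim_notes = [
    "Water damage in kitchen after burst pipe.",
    "Vehicle collision, no injuries.",
    "Water damage to flooring; washer hose failed.",
    "No water damage found; roof claim only.",
    "Theft of laptop from office.",
]

share, causes = summarize(claim_notes)
print(f"{share:.0%} of notes mention the phrase")
print(causes)
```

Note that the fourth note mentions the phrase only to rule it out, so it lands in the “unclassified” bucket. This is one reason a raw phrase count can overstate how many claims truly involve water damage, and why the follow-up analysis matters.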
Only the Beginning
And this is only the beginning. As computer scientist Clifford Stoll has said, “Data is not information. Information is not knowledge. Knowledge is not understanding. Understanding is not wisdom.” The ultimate goal is to get meaningful insights into the hands and minds of the people who will use that insight to improve their decision-making. Text analytics will help insurers like ours better serve their clients with new products and memorable experiences, preserve profitability with wiser claims processes and tools, and give underwriters the best possible information to make the best possible decisions. In other words, achieve many mutually beneficial outcomes.