Unstructured Data Analysis – A treasure not fully exploited

Intense competition and ease of access to information have made it imperative for financial institutions such as banks to listen to and understand the specific needs of their customers. Traditionally, banks have leveraged structured data sources to calibrate brand or product performance. However, with technology enabling new media vehicles, an estimated 80% of customer insights now reside in unstructured data such as consumer complaints, emails, presentations, Internet blogs and, more recently, activity on social media.

Given these constant changes in technology and the growing need for social network analysis, organizations will start to take unstructured and external data sources more seriously when planning their overall analytics platform, so that they gain a broader perspective. Once companies mature in their use of analytics, adding unstructured data sources becomes a logical step towards broader business visibility.

Why don’t we see companies that provide solutions for analyzing unstructured data creating buzz like the Facebooks and Twitters of the world? Is it because of inefficient leadership, as Autonomy was alleged to have had? Is the world not ready for such products? Or do companies fail to unleash their full potential because of inefficient data analysis strategies, or for some other reason that is hard to pin down? I will leave most of that for the corporate strategy gurus to work out, but I would like to shed some light on issues pertaining to tool selection, adoption and integration.

Unstructured data analysis tools draw content from a wide variety of sources. Most of the time, these include:

  • Consumer complaints and emails
  • Presentations on sites like SlideShare
  • Regular web sites
  • Blogs and public blog hosts such as Blogger, WordPress, and Tumblr
  • Video and audio sharing sites like YouTube and DailyMotion
  • Public micro-blogging services, mainly Twitter
  • Public social networking venues such as Facebook, LinkedIn, and Google+

In fact, most tool vendors will claim they monitor millions of sources.
But monitoring millions of sources may not be enough for you. Consider the following:

  • Monitoring more sources does not always mean better reach. Mass hosting providers like WordPress.com host millions of blogs. So do Blogger and others. Many vendors count each of these blogs as a separate source, and that bumps up their number.
  • Also, if the tool does not monitor the one key source that’s really critical to your business, the millions of other data streams are of little use to you. This is especially important when you want to monitor country-specific or language-specific sources.
  • Sometimes it is also important to monitor protected sources (such as your intranet discussion forum) or a subscription-only journal alongside public sources. In such cases, it is important to find out whether the tool vendor can accommodate them.
  • Even for a specific source, vendors will offer varying degrees of support, as they might have specific agreements or restrictions on how often they can crawl it. This can meaningfully affect your results, especially if you want to monitor in real time.
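To make the first two points concrete, here is a minimal Python sketch of how you might sanity-check a vendor's source list. It assumes the vendor can export the URLs it monitors; the `MASS_HOSTS` list, the function names and the sample URLs are all hypothetical and only there for illustration. The idea is to collapse blogs on mass hosting providers into a single platform and to check whether the sources you cannot do without are covered at all.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical list of mass hosting domains; extend it for your own market.
MASS_HOSTS = ("wordpress.com", "blogspot.com", "tumblr.com")


def platform_of(url: str) -> str:
    """Collapse a URL to its hosting platform, or to its own hostname."""
    host = (urlparse(url).hostname or "").lower()
    for mass_host in MASS_HOSTS:
        if host == mass_host or host.endswith("." + mass_host):
            return mass_host
    return host


def summarize_sources(vendor_urls, must_have_hosts):
    """Return (claimed count, per-platform counts, missing critical hosts)."""
    platforms = Counter(platform_of(u) for u in vendor_urls)
    missing = sorted(h for h in must_have_hosts if h not in platforms)
    return len(vendor_urls), platforms, missing


# Made-up sample data standing in for a vendor's exported source list.
vendor_urls = [
    "https://alice.wordpress.com/",
    "https://bob.wordpress.com/",
    "https://carol.blogspot.com/",
    "https://www.example-news.com/banking",
]
claimed, platforms, missing = summarize_sources(
    vendor_urls, {"forum.example-bank.in"}
)
print(f"Claimed sources: {claimed}, distinct platforms: {len(platforms)}")
print(f"Critical sources not covered: {missing}")
```

A large gap between the claimed count and the number of distinct platforms suggests the headline number is inflated by hosted blogs, and a non-empty list of missing critical sources is a reason to go back to the vendor before signing anything.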

What should you look for?

Adoption of unstructured data within existing BI frameworks is limited at best. Businesses still struggle with how to get the most out of analytics, and adding more information that is complicated increases the probability of project failure. In addition, many organizations forget about context when they look at information. For example, many companies identify the metrics they want to track but are unable to quantify the benefits of monitoring those specific metrics (and how they fit within the broader picture). The same issue exists for companies that want to tackle unstructured data analysis. The first step is to identify the context, that is, the questions that need to be answered.

Once the context is defined, businesses should consider the traditional technical requirements as well as interactivity. The following questions can help identify a proper starting point:

  • What business pain is being faced? This also includes the goal of using unstructured data. In addition, by looking at the types of information needed and the reasons behind them, companies can see whether simple off-the-shelf solutions exist or whether a lot of customization will be required.
  • Who will be interacting with the solution, and what is their level of expertise with analytics?
  • What solutions already exist in-house?
  • What types of information will be integrated?

Enterprise data is messy, inconsistent, and spread out across multiple internal systems and applications. However, data analysis with the right infrastructure and tools in place can yield insights and informed business decisions, and companies can monetize those insights. As data-driven benefits grow, so do our demands about what more data can tell us and what other types we can mine. I hope to see more adoption of such tools for making every customer interaction a “wow” experience!

By Madhavi Natukula

CustomerXPs offers real-time, intelligent products that empower banks with instant insights enabling influenced outcomes of deeper customer engagement and fraud-free transactions
