Unified data repository & analytics for enhanced risk assessment
By partnering with Accion to develop a unified data repository, one client enhanced its risk analysis and claims settlement process with a market-ready analytics platform.
The customer is a leading CPA-directed program that provides liability insurance for the accounting profession in 40+ US cities. The client signs over 1,000 policyholders each month. The company used an RDBMS-based environment to store and manage data pertaining to its insurance business, including customers, policies, claims and payments. It also had archives of unstructured data in the form of PDF documents, customer service notes, claim notes, emails, customer service calls and claimant interviews.
Together, the two sources generated data at a rate of 20 GB per week, which accumulated to an archive of over 8 TB of historical data over a 10-year period. Since the client had no capability to process such a volume of data, it simply archived this data and conducted analysis only on recently generated data.
The business problem was a complex database schema, with analytic algorithms spread across different stored procedures, which made it difficult for the company to maintain its data model. Meanwhile, structured and unstructured claims data kept accumulating without any processing or analytics capability to drive business outcomes.
To tackle this challenge, the client engaged Accion as a partner to implement a unified data repository and improve business outcomes.
Hadoop for big data analytics
Accion analyzed the client’s computing and business environment. Based on the analysis and an objective technology selection process, Hadoop was chosen as the base framework. The framework served as a unified repository for storing structured and unstructured data from diverse sources, including SQL services, documents, etc. HBase was selected for structuring metadata from these sources in key-value form, while Hive was used for querying and processing large datasets residing in the distributed storage. A customized Java application was developed to extract data from existing RDBMS sources and load it into the unified data repository.
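To make the key-value structuring step concrete, here is a minimal sketch of how one relational row might be flattened into HBase-style cells before loading. It uses only plain Java collections and does not call the HBase client. The class name, the `table#primaryKey` row-key layout, the `meta` column family and the sample columns are all illustrative assumptions, not the client's actual schema or Accion's ETL code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: flatten one relational row into HBase-style key-value cells. */
final class RowToKeyValueSketch {

    /**
     * Returns cells keyed as "rowKey/family:qualifier".
     * The key layout is a hypothetical choice made for this illustration.
     */
    static Map<String, String> toCells(String table, String primaryKey,
                                       String family, Map<String, String> columns) {
        String rowKey = table + "#" + primaryKey;
        Map<String, String> cells = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : columns.entrySet()) {
            if (column.getValue() == null) {
                continue; // a sparse key-value store simply omits missing values
            }
            cells.put(rowKey + "/" + family + ":" + column.getKey(), column.getValue());
        }
        return cells;
    }

    public static void main(String[] args) {
        // Hypothetical claim row as it might come out of a JDBC result set.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("policy_id", "P-1001");
        row.put("status", "OPEN");
        row.put("adjuster", null);

        toCells("claims", "C-42", "meta", row)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because each column becomes its own cell, a new attribute in a source table needs no schema migration in the repository, which is one plausible reason a key-value layout suits data arriving from diverse sources.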
For unstructured data, Accion initiated a text mining process to quantify information from unstructured text fields such as claim notes, emails and claimant interviews. Based on this process, the team established an algorithm to find patterns, such as keyword occurrences or checkboxes in these claim documents, and use them to predict future claims or losses. The project used OpenCV, an open source computer vision library, for pattern recognition: capturing, analyzing and manipulating visual data from these sources.
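The keyword-occurrence part of that process can be illustrated with a small sketch. The source does not describe the actual algorithm, so this shows only the simplest form of the idea: counting whole-word, case-insensitive keyword hits in a claim note so that free text becomes numeric features. The class name, the keyword list and the sample note are hypothetical, and no prediction step is included.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: turn a free-text claim note into keyword-occurrence counts. */
final class ClaimNoteKeywordSketch {

    /** Counts whole-word, case-insensitive occurrences of each keyword. */
    static Map<String, Integer> countKeywords(String note, List<String> keywords) {
        String text = note.toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyword : keywords) {
            // \b anchors keep "audit" from matching inside "audited".
            Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(keyword.toLowerCase(Locale.ROOT)) + "\\b");
            Matcher matcher = pattern.matcher(text);
            int hits = 0;
            while (matcher.find()) {
                hits++;
            }
            counts.put(keyword, hits);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical claim note and keyword list.
        String note = "Lawsuit threatened; second lawsuit filed after the audit.";
        List<String> keywords = Arrays.asList("lawsuit", "audit", "fraud");
        System.out.println(countKeywords(note, keywords));
    }
}
```

Counts like these could then be joined with structured claim fields as inputs to whatever predictive model is chosen; that modelling step is outside the scope of this sketch.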
Previously, analysts had access only to limited datasets, as the client had no capability to process archived or historical data. Now, with the Hadoop framework and distributed processing engine in place, the client can analyze massive amounts of contemporary as well as historical data. They can also extract patterns and trends, and predict high-value claims that were not readily identifiable in the past.
Accion proposed a unified big data repository combining structured and unstructured data over Hadoop. We also proposed a custom-designed Java ETL tool for data extraction with advanced text search and analysis features. Our solution was designed to leverage the OpenCV library for pattern recognition and analysis in scanned documents, and to support ad-hoc queries through a distributed processing engine using Hive and Impala. Implementing this simplified NoSQL-based data model avoided a complex database schema, making it easy to customize reports and analytic algorithms. The project made the entire dataset available for analysis through batch jobs.
To ensure all the important insights were readily available for the users as actionable data, Accion designed an interactive analytical visualization for the BFSI company.
The partnership with Accion provided the client with an analytics platform to derive actionable insights from their data. This was accomplished by tapping into structured and unstructured data that had previously been archived and left underutilized. The result was significant improvements in their operational efficiency as well as their risk and claims management processes.