"PrimerInsurance, a MediLife brand, faces challenges with data inaccuracies, schema inconsistencies, and a lack of stakeholder trust in its data systems. What measures are necessary to address and resolve these issues?"
MediLife, a global leader in insurance, financial services, and employee benefits, is one of the world's largest and most respected insurance companies. With operations in over 40 countries and approximately 100 million customers, MediLife provides a wide range of products, including life, accident, and health insurance, annuities, and retirement and savings products.
PrimerInsurance is a subsidiary of MediLife. It was acquired by, and is now controlled by, the insurance giant, and it has embraced data-driven decision-making to enhance its operations, from underwriting and risk assessment to customer service and product development. To report accurate and up-to-date data to stakeholders, the data from the PrimerInsurance and MediLife systems must speak with one voice.
Unfortunately, this has become a major challenge and bottleneck for the insurance giant and its subsidiary. The journey that started as a way to enhance operational efficiency and decision-making is now causing a lot of friction between stakeholders. The recurring problems include:
- Columns missing across files of the same kind: a column present in one customer file may be absent from another customer file
- Incorrect or inconsistent column headers
- Inconsistent values in the same column across customer files. For instance, the education column has the value “tertiary” in one file but “terto” in another
- Data duplication, among other issues
These issues eroded trust in the data systems, rendering them effectively useless.
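The four issues listed above can be made concrete with a small sketch. It is only an illustration, not the pipeline's actual implementation: the names `HEADER_ALIASES`, `VALUE_ALIASES`, `EXPECTED_COLUMNS`, and `harmonize`, the column `customer_id`, and the sample rows are all hypothetical. Only the `education` column and the values "tertiary" and "terto" come from the example above.

```python
# Hypothetical mappings; real ones would come from profiling the actual files.
HEADER_ALIASES = {"edu": "education", "cust_id": "customer_id"}
VALUE_ALIASES = {"education": {"terto": "tertiary"}}
EXPECTED_COLUMNS = ["customer_id", "education"]


def harmonize(rows):
    """Return rows with canonical headers, a fixed schema, and no duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # 1. Map inconsistent headers onto canonical names.
        renamed = {}
        for key, value in row.items():
            name = key.strip().lower()
            renamed[HEADER_ALIASES.get(name, name)] = value

        # 2. Enforce one schema: missing columns become None,
        #    columns outside EXPECTED_COLUMNS are dropped.
        record = {col: renamed.get(col) for col in EXPECTED_COLUMNS}

        # 3. Normalize inconsistent values such as "terto".
        for col, aliases in VALUE_ALIASES.items():
            value = record.get(col)
            if isinstance(value, str):
                value = value.strip().lower()
                record[col] = aliases.get(value, value)

        # 4. Drop exact duplicates, keeping the first occurrence.
        fingerprint = tuple(record[col] for col in EXPECTED_COLUMNS)
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(record)
    return cleaned


# Two hypothetical customer files with different headers and values.
file_a = [{"customer_id": "1", "education": "tertiary"}]
file_b = [{"Cust_ID": "1", "Edu": "terto"}, {"cust_id": "2"}]
result = harmonize(file_a + file_b)
```

In this sketch the second file's first row collapses into the first file's row once its headers and values are normalized, and the row with no education column is kept with an explicit `None` rather than silently changing the schema.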
The head of Data Practices has decided to solve this problem once and for all by designing a single idempotent batch-processing pipeline (described in Architecture below) to harmonize the data, ensure data quality, and report business data as stakeholders need it.
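Idempotent here means that re-running the pipeline on the same batch leaves the output in the same state as running it once. One common way to get this property, shown below as a minimal sketch, is to overwrite the output partition for a batch instead of appending to it. The in-memory `store` dictionary, the function `run_batch`, the `customer_id` key, and the batch identifier are assumptions made for illustration; the source does not specify how the pipeline stores its output.

```python
import copy


def run_batch(store, batch_id, records):
    """Load one batch so that re-running it leaves the store unchanged."""
    # Deduplicate within the batch on the business key (last record wins).
    by_key = {rec["customer_id"]: rec for rec in records}

    # Overwrite the whole partition for this batch instead of appending,
    # and sort by key so the stored order does not depend on input order.
    store[batch_id] = [by_key[key] for key in sorted(by_key)]
    return store


# Hypothetical batch containing one duplicated customer.
batch = [
    {"customer_id": "2", "education": "secondary"},
    {"customer_id": "1", "education": "tertiary"},
    {"customer_id": "1", "education": "tertiary"},
]

store = {}
run_batch(store, "2024-01-01", batch)
after_first_run = copy.deepcopy(store)

# Running the same batch again must not change the stored result.
run_batch(store, "2024-01-01", batch)
after_second_run = copy.deepcopy(store)
```

An append-based load would double the rows on the second run; the overwrite keeps retries and backfills safe, at the cost of rewriting the whole partition each time.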
