By FintechOS · September 28, 2021
8 minute read

Stop dredging the data lake: empower your data without a cloud API


Key Takeaways:

  • Migrating your data to the cloud is a priority for incumbent insurers
  • Yet, it’s a complex, expensive and time-consuming process
  • Our Evolutive Data Core offers an alternative to costly cloud migration

According to a Trianz study, 70% of insurers “realize the urgency and importance of cloud in their analytics, business apps, and infrastructure”. Yet, moving everything to the cloud is far from straightforward.

Legend has it that the concept of insurance was first enacted in response to the damage inflicted by the Great Fire of London in 1666. In 1710, the Sun Fire Office insurance company was founded; it still does business today as Royal & Sun Alliance.

With the insurance industry having been around for 300 years, it’s no surprise that incumbent insurers have accumulated a huge amount of customer and risk-related data. Not to mention, a history of legacy systems where that information is archived.

Getting these legacy systems to work together in order to use all of that data efficiently is becoming a major priority for incumbents. Modern tools, such as artificial intelligence (AI), chatbots, robo-advisors, and marketing automation all require access to this information to function. This is a particular problem given that 41% of customers would switch providers due to a lack of digital capability, according to PwC, making digital tools vital to maintaining your customer base.

Yet, it’s a hugely complex endeavor, and challenger insurers without legacy systems and data hold a major competitive advantage over their larger contemporaries. How can insurance incumbents migrate their data to cloud repositories, where an API layer can distribute it to the tools they need to compete?

Dredging the data lake

Current wisdom suggests that the solution to the problem is a data lake. This is a single repository where all your data can be stored. Transfer all the content from your various legacy systems into one place and it can be utilized by the tools you need.

This can be an on-premises server or a public or private cloud-based solution. However, the majority of insurtech tools on the market will only really be useful when working with a cloud-based data repository.

The global data lake market size was valued at USD 7.6 billion in 2019, and is expected to grow at a compound annual growth rate (CAGR) of 20.6% from 2020 to 2027. Meanwhile, an Aberdeen survey saw organizations that implemented a data lake outperforming similar companies by 9% in organic revenue growth. Is this really the only way forward, though?

The effective utilization of the information in a data lake is reliant on the format. Few insurtech tools can take full advantage of data while it’s unsorted and siloed.

Say you need to record all your customers’ names in order to properly cross-reference the different insurance products they subscribe to. Yet one system lists the customer’s first, middle, and last names in a single field called ‘name’; another has separate ‘first name’, ‘middle name(s)’, and ‘family name’ fields; and a third records the customer’s name as ‘identity’. Using those three data sources requires some coding to tell your tools which field means what.

Not to mention, one system might store an Excel spreadsheet, another an SQL database, and the last a format made by a company that no longer exists. To truly activate your stockpile of data, you need to turn your data lake into a ‘data warehouse’.
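The field-mapping problem above can be sketched in a few lines. This is a minimal illustration, not production code: the three records, field names, and canonical schema are all hypothetical examples of the mismatch described.

```python
# Hypothetical records from three legacy systems, each with its own schema.
system_a = {"name": "Jane Q Doe"}                     # one free-text field
system_b = {"first name": "Jane", "middle name(s)": "Q", "family name": "Doe"}
system_c = {"identity": "Jane Q Doe"}                 # different key, same data

def normalize(record: dict) -> dict:
    """Map any of the three schemas onto one canonical shape."""
    if "first name" in record:                        # system B: already split
        return {
            "first_name": record["first name"],
            "middle_names": record.get("middle name(s)", ""),
            "last_name": record["family name"],
        }
    # Systems A and C store the full name in one field under different keys.
    full = record.get("name") or record.get("identity")
    first, *middle, last = full.split()
    return {"first_name": first,
            "middle_names": " ".join(middle),
            "last_name": last}

# All three sources now yield the same canonical record.
assert normalize(system_a) == normalize(system_b) == normalize(system_c)
```

Multiply this by every field in every system, add the file-format conversions, and the scale of the data-warehousing effort becomes clear.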

Your data scientists will need to take the raw data that’s stored in your data lake and convert it all into the correct format. Creating order out of the chaos is somewhat like building a hydro-electric dam: if you can get all the water to flow in the right direction, the power generated is tremendous. The same is true of AI tools fed with well-ordered data.

Yet, it’s a huge undertaking.

Going through your existing data and filing it into a carefully organized database may be satisfying, but it could take many years of work and a great deal of resources before you can fully activate your data.

Splashing in data puddles

When you begin your gradual data migration to your data lake, things will be relatively straightforward at first. Alex Gorelik describes this starting point as a “data puddle”: you store only the essential data your insurtech tools need to operate, while the rest remains in your legacy systems.

Once you’re up and running with your new tools, you can gradually migrate your other legacy data into the cloud. One way to spare your data from becoming too ‘swampy’ and messy for your tools to read is to migrate each legacy system’s data into a separate data puddle. This will allow your insurtech tools to work on your data in the cloud without you having to organize it.

This way, however, you will still have to teach your tools to work with the siloed data, which takes time and resources and slows the responsiveness of the tools. It’s a temporary solution.

Not to mention, siloed data can’t be cross-referenced the way a data lake’s information can. With a fully-operational data lake, your life insurance customer who decides to take out motor insurance with you will find all their information already filled out in your onboarding process from their existing account.

As such, you’ll eventually need to connect and format the data across your data puddles. After a great deal of effort, your data puddles will merge into a data lake, which you can then organize into a data warehouse, maintained by a team of data scientists.

Even once your data is organized, however, how can you know that the way you’ve done it will be future-proof? What if the next insurtech development requires all your information to be reorganized again?

Your data lake could rapidly become a mess that your insurtech tools can’t cope with. Keeping your data organized could become the work of an entire team of data librarians, constantly formatting and reformatting data as technology requires. Still, that won’t be the end of their work.

Cloud data lakes are a huge benefit to your firm, but to really make your data work for you, you need the right tools to harness it, particularly when you expand your data lake even further and open it up to the most chaotic data of all: the internet.

How a data lake becomes a data ocean

To really empower your insurtech tools with great data, you need to open them up to the biggest data source there is. As they used to say, the internet is the ‘information superhighway’, and there is an enormous amount of unsorted data there to be utilized.

If you insure a building, you will want to know how far it is from the nearest river, how often that river floods, and what flood defenses are in place in the area. All of that information is online.

Likewise, your health and life insurance product underwriting can consider air pollution and sun intensity in your customers’ local areas. All this data can be used by AI algorithms to inform your decisions.
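To make the flood-risk example concrete: assuming you already have coordinates for the insured building and for a point on the nearest river (both values below are illustrative, not real underwriting data), the distance feature is a short calculation.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # Earth's mean radius ~6371 km

# Hypothetical coordinates: an insured building and a nearby river point.
building = (51.5074, -0.1278)   # central London
river_point = (51.5081, -0.1190)

# A 'distance to river' feature an underwriting model could consume.
distance = haversine_km(*building, *river_point)
```

Distance is only one input; frequency of flooding and local defenses would be further features pulled from public sources and fed to the same model.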

All this is relatively simple to do with the right tools, but the right tools need the right data, and in this case the “right” data means “as much as possible”. Even a data lake containing all your data isn’t as effective as the vast ocean of data online.

The internet’s data, however, can never be formatted, so how do you deal with so much unsorted information?

Is it time to let go of data lakes?

After all, you’re only rushing to move your data to the cloud so that an AI tool can make use of it.

If you could find a tool that could process data from your legacy systems, the cloud, and the entire internet at once, without slowing your customers’ experience, then you could migrate your data at your own pace.

If you could use a tool that would aggregate your data from all your sources into one data core, then your insurtech tools could work from there, instead of relying on an expensive and time-consuming data migration. Our Evolutive Data Core can do just that.

Where your systems have API layers, the Evolutive Data Core can pull them together. For older systems, we use extract, transform, and load (ETL) processes for bulk uploads. For legacy software, we can even use robotic process automation (RPA) to scrape data that can’t otherwise be accessed. The Evolutive Data Core will even mirror any changes to your legacy systems, so there’s no need to retire them.
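The ETL path described for older systems can be illustrated with a generic sketch. Nothing here is the Evolutive Data Core’s actual API: the CSV columns, table name, and SQLite stand-in for the central store are all hypothetical.

```python
import csv
import io
import sqlite3

# Extract: a hypothetical CSV export from a legacy policy system.
legacy_export = io.StringIO(
    "policy_id,holder,premium_gbp\n"
    "P001,Jane Doe,120.50\n"
    "P002,John Roe,89.99\n"
)
rows = list(csv.DictReader(legacy_export))

# Transform: coerce types into a canonical schema.
records = [(r["policy_id"], r["holder"], float(r["premium_gbp"])) for r in rows]

# Load: bulk-insert into a central store (SQLite stands in for the data core).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE policies (policy_id TEXT PRIMARY KEY, "
           "holder TEXT, premium REAL)")
db.executemany("INSERT INTO policies VALUES (?, ?, ?)", records)

# Once loaded, tools can query the unified store directly.
total = db.execute("SELECT SUM(premium) FROM policies").fetchone()[0]
```

The point of aggregating rather than migrating is that this pipeline can run repeatedly against the legacy system, keeping the central store in sync without retiring the source.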

On the other hand, if you do choose to migrate your data to a data lake in the future, our Evolutive Data Core can follow your information wherever you send it, so there’s no interruption in service when you migrate.

To find out more about how our Evolutive Data Core could save you millions on a cloud migration, book a demo.
