For enterprises looking to wrest the most value from their data, especially in real-time, the “data lakehouse” concept is starting to catch on.
The idea behind the data lakehouse is to merge together the best of what data lakes and data warehouses have to offer, says Gartner analyst Adam Ronthal.
Data warehouses, for their part, enable companies to store large amounts of structured data with well-defined schemas. They are designed to support a large number of simultaneous queries and to deliver the results quickly to many simultaneous users.
Data lakes, on the other hand, enable companies to collect raw, unstructured data in many formats for data analysts to hunt through. These vast pools of data have grown in prominence of late thanks to the flexibility they provide enterprises to store vast streams of data without first having to define the purpose of doing so.
The market for these two types of big data repositories is “converging in the middle, at the lakehouse concept,” Ronthal says, with established data warehouse vendors adding the ability to manage unstructured data, and data lake vendors adding structure to their offerings.