Mind Your Data - Achieve Business Value

Data Lake + Data Warehouse + Analytics

It’s an open, scalable framework and platform for data wrangling, machine learning, data science, and business analytics that works for everyone!

The Problems: Solved by the DataLakeHouse

Where's Your Data - Data Lake Storage

What's the correct configuration for your data lake storage (whether AWS S3 or Wasabi)? How many folders do you need, and what is the security protocol for all of your analytics? DataLakeHouse provides the framework for your implementation.
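As a rough illustration of the folder and security questions above, here is a minimal Python sketch of one way to lay out object prefixes and read permissions in a data lake bucket. The zone names (`raw`, `staged`, `curated`), the role names, and both helper functions are hypothetical and chosen only for illustration; they are not taken from the DataLakeHouse framework itself.

```python
# Hypothetical zone layout and role permissions; a real deployment
# would define its own zones and enforce access in the storage provider.
ZONE_READERS = {
    "raw": {"data_engineer"},
    "staged": {"data_engineer", "data_scientist"},
    "curated": {"data_engineer", "data_scientist", "analyst"},
}


def object_prefix(zone: str, source: str, dataset: str, load_date: str) -> str:
    """Build a folder-style prefix such as raw/crm/accounts/load_date=2020-12-01/."""
    if zone not in ZONE_READERS:
        raise ValueError(f"unknown zone: {zone}")
    return f"{zone}/{source}/{dataset}/load_date={load_date}/"


def can_read(role: str, zone: str) -> bool:
    """Return True if the (hypothetical) role may read objects in the zone."""
    return role in ZONE_READERS.get(zone, set())


if __name__ == "__main__":
    print(object_prefix("raw", "crm", "accounts", "2020-12-01"))
    print(can_read("analyst", "curated"))
    print(can_read("analyst", "raw"))
```

Keeping the zone as the first path segment makes it straightforward to attach one access policy per zone, which is one common way to answer the "how many folders" and "what security" questions together.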

What Design: Star Schema vs. Data Vault

The decision can be a challenge. DataLakeHouse takes the guesswork out of it for you.
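To make the star schema side of that choice concrete, the sketch below builds a tiny star schema (one fact table, one dimension table) in an in-memory SQLite database and runs a typical analytic query against it. The table names, columns, and sample rows are invented for illustration and do not come from DataLakeHouse; a Data Vault model would instead split the same data into hubs, links, and satellites, which is not shown here.

```python
import sqlite3


def sales_by_region():
    """Create a toy star schema and return total sales per customer region."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension table: descriptive attributes of each customer.
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            name TEXT,
            region TEXT
        );
        -- Fact table: one row per sale, keyed to the dimension.
        CREATE TABLE fact_sales (
            sale_id INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL
        );
        INSERT INTO dim_customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
        INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
        """
    )
    rows = conn.execute(
        """
        SELECT d.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.region
        ORDER BY d.region
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for region, total in sales_by_region():
        print(region, total)
```

The single join from fact to dimension is what makes star schemas convenient for reporting tools; the trade-off against Data Vault is mainly in how easily the model absorbs new sources and history.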

How to Integrate the Data?

DataLakeHouse provides a clear path for using scalable toolsets to move and transform data.

Data Visualization & Business Value

A cross-industry, ready-to-go solution set to jump-start your journey with immediate business value. Get usable solutions from real-world subject matter experts.

Support: Enterprise and Community

Gain confidence in the platform and framework through continued support from the community and enterprise partners around the world.

Supported by a Community of Developers

Join Us!

Some DataLakeHouse Deployments

"DataLakeHouse has given us direction, and made the choice for us to deliver enterprise grade analytics with a scalable pipeline."
Fitness and Lifestyle Company, NC
"Going to the cloud gave us too many options and not enough direction. The linked components and open source nature is what our executives needed."
Health Clinic Group, TX
"We still use our legacy systems and DataLakeHouse framework fits perfectly with our Essbase implementation and our existing Data Warehouse."
Transportation Group, TX

What is a DataLakeHouse?

  • Problems It Solves
  • Value it Brings
  • Architecture
  • Roadmap
  • Support

DataLakeHouse takes the guesswork out of your end-to-end path from data flow to business value. Focused on providing a platform that enables business analytics quickly and with confidence, DataLakeHouse is a stack of tools built to work together or separately, with a best-practices data integration framework.

Most organizations spin their wheels on vendor selection, architecture trial and error, and a lack of best practices applied from the start. Ultimately, an attempt to bring analytics, a data lake storage solution, and/or a data warehouse to the organization often results in budget overruns and a footprint that looks nothing like the original intent.

Combine these challenges with a first-time move to the cloud or an attempt at a hybrid solution, plus the need to understand the security ramifications, and companies are often paralyzed by indecision or forced to move forward slowly or haphazardly.

DataLakeHouse solves these problems and more by providing a framework that guides not just IT but also business users and data scientists on a path to achieving repeatable business value.

Consider a solution that fits into any architecture on any cloud vendor or on-premises footprint. Now envision that the tooling, pre-built data integration, pre-built organization, and pre-built analytics are available to use immediately or ready to be tailored to your organization's business. That's the DataLakeHouse elevator pitch.

No matter the size of your organization or of your data, DataLakeHouse is best practices applied at scale, with community, enterprise, and partner support to enable your data-driven stability and success.

DataLakeHouse takes a best-practices architecture for the Big Data value chain and applies it as an end-to-end solution for any organization that collects and consumes data for data-driven initiatives. It works by providing a straightforward implementation path, and it proves itself by delivering the key components as pre-built solutions for ingest, transformation, and analytics as part of the platform stack.

The mainly cloud-centric solution enables cloud deployment through Terraform, delivers the newest standards in data transformation, and provides a pre-structured data repository for real-time OLAP analytics on all major cloud-based data warehouses, and more.

The DataLakeHouse project has a growing number of contributors in its open source community who continue to improve the solution.

The roadmap outlook is always a work in progress.

Currently, DataLakeHouse supports on-premises architectures with Kubernetes (K8s) and Docker deployments, as well as Google Cloud Platform (GCP) and Amazon Web Services (AWS).

The initial support for architecture is broken into the two key areas of the DataLakeHouse concept, Front Lake and Back Lake:

Front Lake:

  • Looker Integration

Back Lake:

  • Snowflake Integration
  • Alibaba Integration
  • Wasabi Integration

We know that no system can be fully considered or implemented without support and training.

Online training continues to be built around the DataLakeHouse framework architecture and its individual components. It covers summary lessons ranging from introduction and implementation to executive understanding.

Online self-guided training is scheduled for open access in December 2020.

On-site and scheduled virtual training with one of our instructors can be arranged with a standard two weeks' notice. Please contact us to schedule training.

Data Flows Like Water Analogies

Front Lake?

  • Business Focused
  • Data Analysts
  • Machine Learning
  • Subject Matter Expertise (SMEs)
  • Data Visualization & Reporting

Back Lake?

  • Data Lake Storage
  • Data Virtualization
  • Data Warehousing / Marts
  • Networking and Infrastructure
  • Cloud Projects Management
  • Spark / MR - Parallel Processing

Data River/Stream?

  • Data Integration (ETL / E-LT)
  • Kafka and Streaming Ingestions
  • Data Movement Ingress/Egress
  • Data Lifecycle
  • Shared Data / Data Marketplace