Building High-Performance Data Lake Using Apache Hudi and Alluxio

Written by bin-fan | Published 2021/08/27
Tech Story Tags: alluxio | apache-hudi | data-lake | analytics | hdfs | presto | alibaba | data-ingestion

TLDR: T3Go is China’s first platform for smart travel based on the Internet of Vehicles. Trevor Zhang and Vino Yang describe the evolution of their data lake architecture, built on cloud-native and open-source technologies including Alibaba OSS, Apache Hudi, and Alluxio. The data lake stores petabytes of data and supports hundreds of pipelines and tens of thousands of tasks daily. The architecture lets the team store data as-is, without first structuring it, and run different types of analytics to guide better decisions.


Written by bin-fan | VP of Open Source and Founding Member @Alluxio
Published by HackerNoon on 2021/08/27