
Performance Benchmark: Apache Spark on DataProc Vs. Google BigQuery

by Raghavendra Pratap Singh, June 30th, 2020 (5 min read)

Too Long; Didn't Read

This research was undertaken to provide interactive business intelligence reports and visualisations for thousands of end users, which requires a system that can analyse billions of data points in real time. The solution took the following main characteristics of the desired system into consideration: analysing and classifying expected user queries and their frequency; developing various pre-aggregations and projections to reduce data churn while serving various classes of user queries; serving up to 60 concurrent queries to the platform users with a combination of aggregated datasets; and developing a state-of-the-art ‘Query Rewrite Algorithm’ to serve the user queries using a combination ...
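
The idea of serving queries from pre-aggregations can be illustrated with a small sketch. This is not the author's ‘Query Rewrite Algorithm’, whose details are not given here; it is a toy version under stated assumptions. The sample rows, the dimension names (`country`, `device`, `day`), and the helper names `pre_aggregate`, `choose_aggregate` and `answer` are all hypothetical. The sketch builds a few aggregates ahead of time and routes each group-by query to the smallest aggregate that still retains every dimension the query needs.

```python
from collections import defaultdict

# Hypothetical raw fact rows: (country, device, day, clicks).
RAW = [
    ("US", "mobile", "2020-06-01", 3),
    ("US", "desktop", "2020-06-01", 5),
    ("IN", "mobile", "2020-06-01", 2),
    ("IN", "mobile", "2020-06-02", 4),
]
DIMS = ("country", "device", "day")


def pre_aggregate(rows, keep):
    """Sum clicks grouped by the dimensions named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


# Pre-aggregations built ahead of time, keyed by the dimensions they retain.
# The full-dimension aggregate acts as a fallback that can answer any query.
AGGREGATES = {
    keep: pre_aggregate(RAW, keep)
    for keep in [("country",), ("country", "device"), DIMS]
}


def choose_aggregate(query_dims):
    """Pick the smallest pre-aggregation that retains every queried dimension."""
    candidates = [k for k in AGGREGATES if set(query_dims) <= set(k)]
    return min(candidates, key=lambda k: len(AGGREGATES[k]))


def answer(query_dims):
    """Rewrite a group-by query against the chosen aggregate and roll it up."""
    source = choose_aggregate(query_dims)
    idx = [source.index(d) for d in query_dims]
    result = defaultdict(int)
    for key, clicks in AGGREGATES[source].items():
        result[tuple(key[i] for i in idx)] += clicks
    return source, dict(result)
```

For example, `answer(("device",))` is served from the `("country", "device")` aggregate rather than from the raw rows, so fewer rows are scanned. A real system would also weigh query frequency and storage cost when deciding which aggregates to build.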


@Raghavendra_Singh

Raghavendra works for Sigmoid.
