
The Essential Architectures For Every Data Scientist and Big Data Engineer

by Sharmistha Chatterjee, August 11th, 2020

Too Long; Didn't Read

Feature Store has become an important component for organizations developing predictive services in any industry domain. This blog covers the essential Feature Store architectures for data scientists and big data engineers, and highlights the features supported by different Feature Store frameworks, which are primarily developed by leading industry giants. A Feature Store provides a platform for sharing features among ML models across different datasets. It provides a horizontally scalable, multi-tenant architecture for multiple models, with suitable scaling and monitoring. It also provides options to define hierarchical partitioning schemas so that models can be trained per partition and deployed as a single logical model.


Comprehensive List of Feature Store Architectures for Data Scientists and Big Data Professionals

Introduction & Motivation - Why Feature Store

Feature Store has become an important component for organizations developing predictive services in any industry domain. Some of the earlier challenges in deploying ML solutions at scale include:

Individual teams developing and maintaining customized systems with little or no coordination.
No collaborative system for sharing features among similar ML models (models from a similar domain, or models addressing the same business use cases or customer domains).
Increased cognitive burden on teams, with little scope for scaling.
Limited integration with big-data ecosystems.
Limited scope for model retraining, comparison, governance, and traceability, which limits an agile development life-cycle.
Difficulty tracking and retraining models that exhibit seasonality.

To overcome the above limitations, architects, data scientists, and big data and analytics professionals felt the need to come together under one roof, with one unified framework that facilitates easier collaboration and sharing of data, results, and reports.

Departments, teams, and organizations shared some similar notions about Feature Engineering:

Feature Engineering is expensive, and its cost is amortized over time and across models.
The cost grows non-linearly, even exponentially, as the number of features increases.
The number of triggers/alerts raised by adding or removing a feature is high.
Dependencies are most often not documented or tracked, so implicit and explicit dependencies accumulate over time.

With these opinions shared, it became easier to come together and create a unified framework called Feature Store. Its aim is to speed up the ML model deployment life-cycle while producing proper documentation, the required version analysis, and model performance records, saving time and effort.

In this blog, we highlight the features supported by different Feature Store frameworks, which are primarily developed by leading industry giants.

Advantages of Feature Store

Ability to re-use and discover features between teams across the organization.
Governance of features through capabilities such as access control and versioning.
Ability to precompute and automatically backfill features, including online computation and offline aggregation.
A collaborative environment for data scientists and big data engineers.
Savings in effort and cost from sharing not only features but also related artifacts, documents, and marketing insights of models developed from these features.
Enable consistency between training and serving.
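
To make these advantages concrete, here is a minimal sketch of an in-memory feature registry in Python. It is not the API of any framework discussed in this blog: the names `ToyFeatureStore`, `FeatureDefinition`, `register`, `discover`, `compute`, and `backfill`, as well as the sample feature and team name, are hypothetical and chosen only for illustration. The sketch shows versioned registration, discovery by other teams, offline backfill, and a single code path shared by training and serving.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A raw record is a plain dict, e.g. {"amount": 20.0, "items": 4.0}.
Record = Dict[str, float]


@dataclass(frozen=True)
class FeatureDefinition:
    """One registered, versioned feature (hypothetical structure)."""
    name: str
    version: int
    owner: str
    transform: Callable[[Record], float]


@dataclass
class ToyFeatureStore:
    """In-memory stand-in for a feature store, for illustration only."""
    _registry: Dict[Tuple[str, int], FeatureDefinition] = field(default_factory=dict)

    def register(self, definition: FeatureDefinition) -> None:
        # Versions are immutable: a changed transform needs a new version number.
        key = (definition.name, definition.version)
        if key in self._registry:
            raise ValueError(f"{definition.name} v{definition.version} already exists")
        self._registry[key] = definition

    def discover(self) -> List[Tuple[str, int, str]]:
        # Lets other teams find existing features instead of rebuilding them.
        return sorted((d.name, d.version, d.owner) for d in self._registry.values())

    def compute(self, name: str, version: int, record: Record) -> float:
        # The same transform is used offline and online, which keeps
        # training and serving consistent.
        return self._registry[(name, version)].transform(record)

    def backfill(self, name: str, version: int, history: List[Record]) -> List[float]:
        # Offline path: recompute the feature over historical records.
        return [self.compute(name, version, r) for r in history]


store = ToyFeatureStore()
store.register(
    FeatureDefinition(
        name="amount_per_item",
        version=1,
        owner="pricing-team",  # hypothetical team name
        transform=lambda r: r["amount"] / r["items"],
    )
)

history = [{"amount": 20.0, "items": 4.0}, {"amount": 9.0, "items": 3.0}]
training_values = store.backfill("amount_per_item", 1, history)  # offline
serving_value = store.compute("amount_per_item", 1, history[0])  # online
```

Because both `backfill` and `compute` resolve the same `(name, version)` entry, a model trained on the backfilled values sees the same feature logic at serving time. A production feature store would add persistent offline and online storage, access control, and monitoring on top of this idea.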


