See the engineering behind real-time personalization at Tripadvisor's massive (and rapidly growing) scale

Tripadvisor tries to assess what kind of traveler you are as soon as you engage with the site, then offers you increasingly relevant information on every click, within a matter of milliseconds. This personalization is powered by advanced ML models acting on data stored in ScyllaDB running on AWS.

In this article, Dean Poulin (Tripadvisor Data Engineering Lead on the AI Service and Products team) provides a look at how they power this personalization. Dean shares a taste of the technical challenges involved in delivering real-time personalization at Tripadvisor's massive (and rapidly growing) scale.

It is based on an AWS re:Invent talk.

Pre-Trip Orientation

In Dean's words ...

Let's start with a quick snapshot of who Tripadvisor is and the scale at which we operate. Founded in 2000, Tripadvisor has become a global leader in travel and hospitality, helping many millions of travelers plan their trips. Tripadvisor generates $1.8 billion in revenue and is listed on the NASDAQ. Today, we have a talented team of 2,800 employees driving innovation, and our platform serves 400 million unique visitors per month, a number that keeps growing.

On any given day, our system handles more than 2 billion requests from 25 to 50 million users. Every click you make on Tripadvisor is processed in real time. Behind that, we use machine learning models to deliver personalized recommendations, backed by ScyllaDB running on AWS. This allows us to deliver millisecond latency at a scale that few organizations reach. At peak traffic, we hit around
425K operations per second on ScyllaDB, with P99 latencies for reads and writes around 1-3 milliseconds.

I'll be sharing how Tripadvisor harnesses the power of ScyllaDB, AWS, and real-time machine learning to deliver personalized recommendations for every user. We'll explore how we help travelers discover everything they need to plan their perfect trip: whether that means uncovering hidden gems, must-see attractions, unforgettable experiences, or the best places to stay and dine. This [article] is about the engineering behind that: how we deliver seamless, relevant content to users in real time, helping them find what they are looking for as quickly as possible.

Personalized Trip Planning

Imagine you're planning a trip. As soon as you land on the Tripadvisor homepage, Tripadvisor already knows whether you're a foodie, an adventurer, or a beach lover, and you see spot-on recommendations that seem tailored to your own interests. How does that happen within milliseconds?

As you browse around Tripadvisor, we start to personalize what you see using Machine Learning models that calculate scores based on your current and prior browsing activity. We recommend hotels and experiences that we think will interest you. We sort hotels based on your personal preferences. We recommend points of interest to visit.

Tripadvisor's Model Serving Architecture

Tripadvisor runs on hundreds of independently scalable microservices in Kubernetes on-prem and in Amazon EKS. Our ML Model Serving Platform is exposed through one of these microservices.

This gateway service abstracts over 100 ML Models from the Client Services, which lets us run A/B tests to find the best models using our experimentation platform.
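The talk does not describe how the gateway assigns traffic to competing models, so the following is only a minimal sketch of one common approach: deterministic, hash-based bucketing, so that a given visitor always sees the same model variant within an experiment. The function name, the experiment name, and the variant names are all hypothetical and are not taken from Tripadvisor's platform.

```python
import hashlib
from typing import Mapping


def assign_variant(visitor_id: str, experiment: str,
                   variants: Mapping[str, float]) -> str:
    """Map a visitor to a model variant by weighted hash bucketing.

    `variants` maps a (hypothetical) model name to its traffic weight.
    The same visitor and experiment always yield the same variant.
    """
    total = sum(variants.values())
    if total <= 0:
        raise ValueError("variant weights must sum to a positive number")

    # Hash the experiment and visitor together so that bucketing is
    # independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).digest()
    # Scale the first 8 bytes of the hash to a point in [0, 1].
    point = int.from_bytes(digest[:8], "big") / 2**64

    cumulative = 0.0
    for name, weight in variants.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    # Guard against floating-point rounding at the upper edge.
    return list(variants)[-1]


if __name__ == "__main__":
    weights = {"model_a": 0.5, "model_b": 0.5}  # hypothetical variants
    print(assign_variant("visitor-123", "hotel_ranking", weights))
```

Because the assignment depends only on the visitor and experiment identifiers, the gateway in this sketch needs no stored state to keep a visitor's experience consistent across requests.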
The ML Models are primarily developed by our Data Scientists and Machine Learning Engineers using Jupyter Notebooks on Kubeflow. They are managed and trained using ML Flow, and we deploy them on Seldon Core in Kubernetes. Our Custom Feature Store provides features to our ML Models, enabling them to make accurate predictions.

The Custom Feature Store

The Feature Store primarily serves User Features and Static Features. Static Features are stored in Redis because they do not change very often. We run data pipelines daily to load data from our offline data warehouse into our Feature Store as Static Features.

User Features are served in real time through a platform called Visitor Platform. We execute dynamic CQL queries against ScyllaDB, and we do not need a caching layer because ScyllaDB is so fast.

Our Feature Store serves up to 5 million Static Features per second and half a million User Features per second.

What's an ML Feature?

Features are input variables to the ML Models that are used to make a prediction. There are Static Features and User Features.

Some examples of Static Features are awards that a restaurant has won or amenities offered by a hotel (like free Wi-Fi, pet friendly, or fitness center).

User Features are collected in real time as users browse around the site. We store them in ScyllaDB so we can get lightning-fast queries. Some examples of User Features are the hotels viewed over the last 30 minutes, restaurants viewed over the last 24 hours, or reviews submitted over the last 30 days.

The Technologies Powering Visitor Platform

ScyllaDB is at the core of Visitor Platform. We use Java-based Spring Boot microservices to expose the platform to our clients. This is deployed on AWS ECS Fargate.
We run Apache Spark on Kubernetes for our daily data retention jobs and our offline-to-online jobs. We use those jobs to load data from our offline data warehouse into ScyllaDB so that it is available on the live site. We also use Amazon Kinesis for processing streaming user tracking events.

The Visitor Platform Data Flow

The following graphic shows how data flows through our platform in four stages: produce, ingest, organize, and activate.

Data is produced by our website and our mobile apps. Some of that data includes our Cross-Device User Identity Graph, Behavior Tracking events (like page views and clicks), and streaming events that go through Kinesis. Audience segmentation also gets loaded into our platform.

Visitor Platform's microservices are used to ingest and organize this data. The data in ScyllaDB is stored in two keyspaces:

The Visitor Core keyspace, which contains the Visitor Identity Graph
The Visitor Metric keyspace, which contains Facts and Metrics (the things that people did as they browsed the site)

We use daily ETL processes to maintain and clean up the data in the platform. We produce Data Products, stamped daily, in our offline data warehouse, where they are available for other integrations and other data pipelines to use in their processing.

Here's a look at Visitor Platform by the numbers:

Why Two Databases?

Our online database is focused on the real-time, live website traffic. ScyllaDB fills this role by providing very low latencies and high throughput. We use short-term TTLs to prevent the data in the online database from growing indefinitely, and our data retention jobs ensure that we only keep user activity data for real visitors.
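To illustrate what a short-term TTL means for reads, here is a minimal in-memory sketch. It is not Tripadvisor's code and it does not talk to ScyllaDB; the class name, method names, and the 60-second value are assumptions made for illustration. In CQL itself, a per-write TTL is set with `INSERT ... USING TTL <seconds>`, after which the row stops being returned once the TTL has elapsed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class TtlStore:
    """In-memory stand-in for a table whose rows expire after a TTL.

    The clock is passed in explicitly (`now`, in seconds) so the
    behavior is easy to follow and to test.
    """
    _rows: Dict[str, Tuple[Any, float]] = field(default_factory=dict)

    def put(self, key: str, value: Any, ttl_seconds: float, now: float) -> None:
        # Store the value together with its absolute expiry time.
        self._rows[key] = (value, now + ttl_seconds)

    def get(self, key: str, now: float) -> Optional[Any]:
        row = self._rows.get(key)
        if row is None:
            return None
        value, expires_at = row
        if now >= expires_at:
            # Expired rows are invisible to reads; drop them lazily.
            del self._rows[key]
            return None
        return value


if __name__ == "__main__":
    store = TtlStore()
    store.put("visitor-123:page_view", {"page": "hotel"}, ttl_seconds=60, now=0)
    print(store.get("visitor-123:page_view", now=30))  # still visible
    print(store.get("visitor-123:page_view", now=61))  # expired
```

The point of the sketch is that expiry is attached to each write, so old activity ages out without a separate cleanup pass; the separate data retention jobs described above handle the cases a fixed TTL cannot express.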
Our offline data warehouse retains historical data used for reporting, creating other data products, and training our ML Models. We do not want large-scale offline data processes impacting the performance of our live site, so we have two separate databases used for two different purposes.

Visitor Platform Microservices

We use 5 microservices for Visitor Platform:

Visitor Core manages the cross-device user identity graph based on cookies and device IDs.
Visitor Metric is our query engine, and it gives us the ability to expose facts and metrics for specific visitors. We use a domain-specific language called visitor query language, or VQL. An example VQL query can show the latest commerce click facts over the last three hours.
Visitor Publisher and Visitor Saver handle the write path, writing data into the platform. Besides saving data in ScyllaDB, we also stream data to the offline data warehouse. That is done with Amazon Kinesis.
Visitor Composite simplifies publishing data in batch processing jobs. It abstracts Visitor Saver and Visitor Core to identify visitors and publish facts and metrics in a single API call.

Roundtrip Microservice Latency

This graph illustrates how our microservice latencies remain stable over time.

The average latency is only 2.5 milliseconds, and our P999 is under 12.5 milliseconds. This is impressive performance, especially given that we handle 1 billion requests per day.

Our microservice clients have strict latency requirements. 95% of the calls must complete in 12 milliseconds or less. If they go over that, we get paged and have to find out what is impacting the latencies.

ScyllaDB Latency

Here's a snapshot of ScyllaDB's performance over three days.
At peak, ScyllaDB is handling 340,000 operations per second (including writes, reads, and deletes) and the CPU is hovering at just 21%.

ScyllaDB delivers microsecond writes and millisecond reads for us. This level of fast performance is exactly why we chose ScyllaDB.

Partitioning Data into ScyllaDB

This image shows how we partition data into ScyllaDB.

The Visitor Metric keyspace has two tables: Fact and Raw Metrics. The primary key on the Fact table is Visitor GUID, Fact Type, and Created At Date. The composite partition key is the Visitor GUID and Fact Type. The clustering key is Created At Date, which allows us to sort data within partitions by date. The attributes column contains a JSON object representing the event that occurred. Some example Facts are Search Terms, Page Views, and Bookings.

We use ScyllaDB's Leveled Compaction Strategy because:

It is optimized for range queries.
It handles high cardinality very well.
It is better suited to read-heavy workloads, and we have about 2-3X more reads than writes.

Why ScyllaDB?

Our solution was originally built using Cassandra on-prem. But as the scale increased, so did the operational burden. It required dedicated operations support for us to manage the database upgrades, backups, etc. Our solution also requires very low latencies for core components. Our User Identity Management system must identify the user within 30 milliseconds, and for the best personalization, we require our Event Tracking platform to respond in 40 milliseconds. It is critical that our solution does not block rendering the page, so our SLAs are very strict.

We ran a Proof of Concept with ScyllaDB and found the performance to be better than Cassandra, with a reduced operational burden.
ScyllaDB gave us a very fast live-serving database with very low latencies.

We wanted a fully managed option, so we migrated from Cassandra to ScyllaDB Cloud, following a dual-write strategy. That allowed us to migrate with zero downtime while handling 40,000 operations or requests per second. Later, we migrated from ScyllaDB Cloud to ScyllaDB's "Bring Your Own Account" model, where the ScyllaDB team deploys the ScyllaDB database into your own AWS account. This gave us improved performance as well as better data privacy.

This diagram shows what ScyllaDB's BYOA deployment looks like.

In the center of the diagram, you can see a 6-node ScyllaDB cluster running on EC2. There are also two additional EC2 instances:

ScyllaDB Monitor gives us Grafana dashboards as well as Prometheus metrics.
ScyllaDB Manager takes care of infrastructure automation, like triggering backups and repairs.

With this deployment, ScyllaDB can be co-located very close to our microservices to give us even lower latencies as well as higher throughput and performance.

Wrapping up, I hope you now have a better understanding of our architecture, the technologies that power the platform, and how ScyllaDB plays a critical role in allowing us to handle Tripadvisor's extremely high scale.

Cynthia Dunlop

Cynthia is Senior Director of Content Strategy at ScyllaDB. She has been writing about software development and quality engineering for more than 20 years.