CloudSearch is a managed service provided by AWS. We used AWS Lambda and Step Functions to get the data from Firebase and move it to CloudSearch
----------------------------------------------------------------------------------------------------------------------------------------------

![](https://hackernoon.com/hn-images/1*7DlpJVoAAeYYu-DPunMGGw.png)

Components that were used to achieve search on Firebase data

Firebase does not provide an inherent way to search the data that you store. You either use ElasticSearch with their open source project Flashlight or depend on full-text comparison. In this post we talk about how to simplify that using [AWS](https://hackernoon.com/tagged/aws) CloudSearch.

For those who are not familiar with Firebase: it is a “realtime” NoSQL database, currently part of [Google](https://hackernoon.com/tagged/google), and is pretty good for prototyping apps. It has good SDKs for Android, iOS and Web. The entire DB is treated like a JSON document and can be easily manipulated from the web console. It provides automated backups in its highest tier and has built-in auth with security rules.

Amazon CloudSearch is a managed service provided by AWS that makes it simple and cost-effective to set up, manage, and scale a search solution for all applications, supporting 34 languages along with features like autocomplete, highlighting, calculated fields and scoring-based sort.
For those dealing with geo locations, CloudSearch provides support for that as well.

Listen to the full story at odiocast.com

[**Making your Firebase data searchable with little help from AWS, by tech\_in\_startups**](https://www.odiocast.com/tech_in_startups/-Kl3ifGJv_D8Z_L2AveD)
_In this episode of TechNStartups we talk about how we implemented search for our app using AWS Cloud Search, AWS…_

We at [@odiocast](https://twitter.com/@odiocast) needed to make our content searchable within the app. We found [AWS CloudSearch](https://aws.amazon.com/cloudsearch/) to be a perfect fit. I did realize I would be charged for network egress from Google Cloud/Firebase (whatever you want to call it), but that charge is negligible compared to the cost of running an ES instance, along with monitoring, auto scaling and upkeep. For a fraction of the cost and some small scripts I was able to make our content searchable.

#### Setting up CloudSearch

Head over to the AWS Console and look for CloudSearch. To get started you will need to create a CloudSearch domain. The setup wizard can analyze a data file to find the fields which need to be indexed; you can also configure them manually. I prefer to do it manually. After you have completed the setup wizard, it takes some time to prepare the index.

![](https://hackernoon.com/hn-images/1*sB6TGVU1PIDH-bzcEucOzQ.png)

Grab a coffee or come back to the console after about 5–15 minutes. You can use Amazon CloudSearch to index and search both structured data and plain text. Amazon CloudSearch features:

* Full text search with language-specific text processing
* Boolean search
* Prefix searches
* Range searches
* Term boosting
* Faceting
* Highlighting
* Autocomplete suggestions

You can construct both simple and compound queries.
I would strongly suggest you go through the [resources](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-compound-queries.html).

#### Take it for a spin with ZERO code

With CloudSearch you can upload the documents which need to be searched from the dashboard and start searching them from the test search tab in the left navigation bar.

![](https://hackernoon.com/hn-images/1*YmbonfZR-H4msohzg11MlQ.png)

Setting up the CloudSearch domain and configuring indexes was the easy part. Now let’s look at how to get the data from Google’s Firebase to AWS CloudSearch.

#### Sending data to CloudSearch

In our case the content is dynamically generated, so we need to keep updating the service with any new data. To keep things simple, I broke the task into the following steps:

* Calculate the duration for which data needs to be fetched
* Fetch user data and stories
* Merge and format the data if required
* Check if we need to add data to CloudSearch
* Add data to CloudSearch
* Cleanup

This can be done in several ways, but I decided to go with **AWS Step Functions.**

> AWS Step Functions makes it easy to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function lets you scale and change applications quickly. Step Functions is a reliable way to coordinate components and step through the functions of your application.

If we imagine the above subtasks to be states and treat the whole thing as a finite state machine, we get something like this:

![](https://hackernoon.com/hn-images/1*4cwrRfKkqtK8ISNSYT4ELg.png)

I created 5 Lambda functions and deployed them using [Apex](http://apex.run). I wired these lambdas into a state machine using the state machine language (Amazon States Language).
In my case I need to fetch both the user data and the stories data, which I do in **parallel**, and then pass both to the formatting function. AWS provides a graphical interface to view the state machine during and after execution, highlighting every step.

![](https://hackernoon.com/hn-images/1*ilDyeSGvd8zH9wpGq0lYog.png)

The image is fairly self-explanatory: all the green boxes have executed successfully. The one in blue is currently executing. “Check if new data found” is a **choice state** which decides whether to proceed or not.

In reality the lambdas execute a lot faster, and it takes a second or two for the UI to update. The spec file of a state machine is a simple JSON file; it’s quite readable and self-explanatory.

Our State Machine spec which we currently use

The reason I added _calculateInterval_ was to keep some flexibility in syncing old data. To keep this running at a fixed interval I use AWS CloudWatch: I configure a rule with a fixed schedule that targets the state machine we just created, passing the required input interval.

![](https://hackernoon.com/hn-images/1*arlxzrY2FituQAf9SJBpQg.png)

AWS CloudWatch Console

Finally, with some Android code, we added search and autocomplete to our Android and iOS apps.

![](https://hackernoon.com/hn-images/1*2hislFTLVirCRG5SV83Qvg.png)

If you have any doubts with respect to AWS CloudSearch, Firebase, Step Functions/AWS Lambda you can reach me at [email@example.com](mailto:firstname.lastname@example.org).

#### TL;DR

In case you are wondering: if lambdas are stateless, how do Step Functions work? Well, behind the scenes AWS stores the output of the lambda function in some storage/cache and passes it to the next lambda function based on the state machine definition.
If you ever run a parallel task like I am doing here, you’ll find the result is a JSON array with one element per branch, each holding that branch’s output. So if you were to run 10 lambdas in parallel, you would end up with a JSON array of size 10.

#### Pricing

Do note that Step Functions gives you the **first 4,000 state transitions free each month** and charges a nominal fee of **$0.025 per 1,000 state transitions** thereafter (**$0.000025 per state transition**). Apart from this, you will be charged the cost incurred by your lambda functions. As far as CloudSearch goes, pricing is instance based and depends on the number of hours consumed, starting as low as **$0.059 per hour** for a _search.m1.small_ instance. In addition, there is a cost for network egress from Firebase, which is **$1 per GB.**

Before building this solution I did realise I could have gone the other way and written Firebase Cloud Functions to send data from Firebase to AWS CloudSearch.

There are many ways in which you can provide search functionality in your apps. You might even be tempted to use something like Algolia, Apache Solr, ElasticSearch or even the oxygen library. The decision of building versus using managed services should be properly weighed. If search is just one among several features in your app, you should not spend days and hours building, maintaining and monitoring it. Only if it is fundamental to your service and none of the existing tools support your need, or the estimated cost of managed services is too high, should you build your own.
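
To put a number on the "estimated cost for managed services", here is a minimal Python sketch that combines the prices quoted in the Pricing section. The function name `estimate_monthly_cost` and the workload figures (a sync every 15 minutes, 8 state transitions per run, 0.5 GB of egress) are assumptions made up for illustration, not our actual numbers. It also assumes a 30-day month with one _search.m1.small_ instance running around the clock, and it leaves Lambda charges out.

```python
# Prices as quoted in the Pricing section; verify against current
# AWS and Firebase pricing pages before relying on them.
STEP_FN_PRICE_PER_TRANSITION = 0.000025  # USD per state transition
STEP_FN_FREE_TRANSITIONS = 4_000         # free transitions per month
CLOUDSEARCH_PRICE_PER_HOUR = 0.059       # USD, search.m1.small
FIREBASE_EGRESS_PRICE_PER_GB = 1.0       # USD per GB


def estimate_monthly_cost(executions_per_day, transitions_per_execution,
                          egress_gb_per_month, days=30):
    """Return a rough monthly cost breakdown in USD (Lambda charges excluded)."""
    transitions = executions_per_day * transitions_per_execution * days
    # Only transitions beyond the free tier are billed.
    billable = max(0, transitions - STEP_FN_FREE_TRANSITIONS)
    costs = {
        "step_functions": billable * STEP_FN_PRICE_PER_TRANSITION,
        # Assumes a single instance running 24 hours a day.
        "cloudsearch": CLOUDSEARCH_PRICE_PER_HOUR * 24 * days,
        "firebase_egress": egress_gb_per_month * FIREBASE_EGRESS_PRICE_PER_GB,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    # Hypothetical workload: 96 runs/day, 8 transitions per run, 0.5 GB egress.
    for item, usd in estimate_monthly_cost(96, 8, 0.5).items():
        print(f"{item}: ${usd:.2f}")
```

Under these assumed numbers the total comes to roughly $43 a month, almost all of it the CloudSearch instance, so the Step Functions and egress charges barely affect the build-versus-buy decision.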