With growing interest in neural networks and deep learning, individuals and companies are claiming ever-increasing adoption rates of artificial intelligence into their daily workflows and product offerings. Coupled with breakneck speeds in AI research, the new wave of popularity shows a lot of promise for solving some of the harder problems out there.

That said, I feel that this field suffers from a gulf between appreciating these developments and subsequently deploying them to solve "real-world" tasks. A number of frameworks, tutorials and guides have popped up to democratize machine learning, but the steps that they prescribe often don't align with the fuzzier problems that need to be solved.

This post is a collection of questions (with some (maybe even incorrect) answers) that are worth thinking about when applying machine learning in production.

## Garbage in, garbage out

*Do I have a reliable source of data? Where do I obtain my dataset?*

When starting out, most tutorials include well-defined datasets. Whether it be MNIST, the Wikipedia corpus or any of the great options from the UCI Machine Learning Repository, these datasets are often not representative of the problem that you wish to solve.

For your specific use case, an appropriate dataset might not even exist, and building one could take much longer than you expect. For example, at Semantics3, we tackle a number of ecommerce-specific problems ranging from product categorization to product matching to search relevance. For each of these problems, we had to look within and spend considerable effort to generate high-fidelity product datasets.

In many cases, even if you possess the required data, significant (and expensive) manual labor might be required to categorize, annotate and label your data for training.

## Transforming data to input

*What pre-processing steps are required? How do I normalize my data before using it with my algorithms?*

This is another step, often independent of the actual models, that is glossed over in most tutorials. Such omissions appear even more glaring when exploring deep neural networks, where transforming the data into usable "input" is crucial. While there exist some standard techniques for images, like cropping, scaling, zero-centering and whitening, the decision on how much normalization each task needs is still left to the individual.

The field gets even messier when working with text. Is capitalization important? Should I use a tokenizer? What about word embeddings? How big should my vocabulary and dimensionality be? Should I use pre-trained vectors, start from scratch, or layer them? There is no right answer applicable across all situations, but keeping abreast of the available options is often half the battle. A recent post from the creator of spaCy details an interesting strategy to standardize deep learning for text.
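To ground the image techniques just mentioned, here is a minimal sketch of scaling and zero-centering with numpy. The [0, 1] scaling and the idea of reusing a training-set mean at prediction time are illustrative choices, not prescriptions:

```python
import numpy as np

def preprocess_images(batch, mean_image=None):
    """Scale a uint8 image batch (N, H, W, C) to [0, 1] and zero-center it.

    `mean_image` should be computed once on the training set and reused
    at prediction time, so that training and production inputs match.
    """
    batch = batch.astype(np.float32) / 255.0
    if mean_image is None:
        mean_image = batch.mean(axis=0)
    return batch - mean_image, mean_image
```

Cropping and whitening would slot in around this step; how far to go is the per-task judgment call described above.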
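For text, a minimal sketch of one possible pipeline using the Keras preprocessing utilities (Keras comes up again below). The lowercasing, the 20,000-word vocabulary cap and the sequence length of 50 are hypothetical settings; each one hard-codes an answer to one of the questions above:

```python
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

titles = ["Apple iPhone 7 32GB Black", "LEGO Star Wars Millennium Falcon"]

tokenizer = Tokenizer(num_words=20000, lower=True)  # cap vocabulary; fold case
tokenizer.fit_on_texts(titles)
sequences = tokenizer.texts_to_sequences(titles)    # words -> integer ids
inputs = pad_sequences(sequences, maxlen=50)        # fixed-length matrix for the model
```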
## Now, let's begin?

*Which language/framework do I use? Python, R, Java, C++? Caffe, Torch, Theano, Tensorflow, DL4J?*

This might be the question with the most opinionated answers. I am including this section here only for completeness and would gladly point you to the various other resources available for making this decision. While each person might have different criteria for evaluation, mine has simply been ease of customization, prototyping and testing. On that front, I prefer to start with scikit-learn where possible and use Keras for my deep learning projects.

Further questions, like "Which technique should I use? Should I use deep or shallow models? What about CNNs/RNNs/LSTMs?", follow close behind. Again, there are a number of resources to help make these decisions, and this is perhaps the most discussed aspect when people talk about "using" machine learning.

## Training models

*How do I train my models? Should I buy GPUs, custom hardware, or ec2 (spot?) instances? Can I parallelize them for speed?*

With ever-rising model complexity and increasing demands on processing power, this is an unavoidable question when moving to production. A billion-parameter network might promise great performance with its terabyte-sized dataset, but most people cannot afford to wait for weeks while the training is still in progress.

Even with simpler models, the infrastructure and tooling required for the build-up, training, collation and tear-down of tasks across instances can be quite daunting. Spending some time on planning your infrastructure setup, and standardizing and defining workflows early on, can save valuable time with each additional model that you build (one small form of such standardization is sketched near the end of this post).

## No system is an island

*Do I need to make batched or real-time predictions? Embedded models or interfaces? RPC or REST?*

Your 99%-validation-accuracy model is not of much use unless it interfaces with the rest of your production system. The decision here is at least partially driven by your use case. A model performing a simple task might perform satisfactorily with its weights packaged directly into your application, while more complicated models might require communication with centralized heavy-lifting servers. In our case, most of our production systems perform tasks offline in batches, while a minority serve real-time predictions via JSON-RPC over HTTP (a minimal endpoint in this spirit is also sketched below).

Knowing the answer to these questions might also restrict the types of architectures that you should consider when building your models. Building a complex model, only to later learn that it cannot be deployed within your mobile app, is a disaster that can be easily avoided.

## Monitoring performance

*How do I keep track of my predictions? Do I log my results to a database? What about online learning?*

After building, training and deploying your models to production, the task is still not complete unless you have monitoring systems in place. A crucial component of ensuring the success of your models is being able to measure and quantify their performance. A number of questions are worth answering in this area: How does my model affect the overall system performance? Which numbers do I measure? Does the model correctly handle all possible inputs and scenarios?

Having used Postgres in the past, I favor using it for monitoring my models. Periodically saving production statistics (data samples, predicted results, outlier specifics) has proven invaluable in performing analytics (and error postmortems) over deployments.

Another important aspect to consider is the online-learning requirement of your model. Should your model learn new features on the fly? When hoverboards become a reality, should the product-categorizer put it in *Vehicles* or *Toys*, or leave it *Uncategorized*? Again, these are important questions worth debating when building your system.
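Before wrapping up, three sketches to ground the last few sections. First, training workflows: one cheap form of the standardization argued for above is a shared set of Keras callbacks, so that every long-running job checkpoints its best weights and stops early by default, which matters when the clock is ticking on rented GPU instances. The model and data here are toy placeholders:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint, EarlyStopping

# Toy stand-ins; in practice these come from your dataset pipeline.
x_train = np.random.rand(1000, 20)
y_train = (x_train.sum(axis=1) > 10).astype(int)

model = Sequential([Dense(32, activation="relu", input_shape=(20,)),
                    Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

# The reusable part: every run saves its best weights and ends itself
# once validation loss stops improving.
callbacks = [
    ModelCheckpoint("weights.{epoch:02d}-{val_loss:.3f}.h5", save_best_only=True),
    EarlyStopping(monitor="val_loss", patience=3),
]
model.fit(x_train, y_train, validation_split=0.2, epochs=50, callbacks=callbacks)
```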
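Next, serving: a minimal real-time prediction endpoint. Flask is used here purely as a stand-in (we happen to serve JSON-RPC over HTTP; any transport slots in similarly), and the pickled categorizer path is hypothetical:

```python
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

# Hypothetical: a scikit-learn pipeline trained and pickled elsewhere.
with open("categorizer.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                    # e.g. {"title": "LEGO Star Wars set"}
    category = model.predict([payload["title"]])[0]
    return jsonify({"category": category})

if __name__ == "__main__":
    app.run(port=8000)
```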
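Finally, monitoring: one possible shape for the Postgres logging described above, assuming psycopg2, a hypothetical DSN and a hypothetical `predictions` table:

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=model_monitoring")  # hypothetical DSN

def log_prediction(model_version, sample, prediction, confidence):
    """Persist one production prediction for later analytics and postmortems."""
    with conn, conn.cursor() as cur:  # commits on success, rolls back on error
        cur.execute(
            "INSERT INTO predictions (model_version, input, output, confidence, logged_at)"
            " VALUES (%s, %s, %s, %s, now())",
            (model_version, json.dumps(sample), prediction, confidence),
        )

log_prediction("categorizer-v3", {"title": "hoverboard"}, "Uncategorized", 0.41)
```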
## Wrapping it up

*There is more to it than just the secret sauce.*

This post poses more questions than it answers, but that was sort of the point, really. With many advances in new techniques and cells and layers and network architectures, it is easier than ever to miss the forest for the trees. Greater discussion about end-to-end deployments is required among practitioners to take this field forward and truly democratize machine learning for the masses.

*Originally published at engineering.semantics3.com.*