Creating a smart algorithm is still out of reach for many entrepreneurs and small businesses, who might lack the resources to launch a successful Artificial Intelligence program.
While (very) large corporations benefit from the vast amount of data available to them, as well as a combination of business, technological and regulatory expertise, most SMEs have no such luck.
This should not, however, discourage the brave and bold who seek to embark on a data-science journey. And though there are no shortcuts, there are clear steps to take in order to start an A.I project, which I’ve laid out below in the hope that they’ll clear up some of the mysticism surrounding the creation of an “intelligent” algorithm.
Please note that this is not meant to be a technical guide, and that definitions will be kept to a minimum for brevity's sake.
Before even thinking about A.I, some key questions must be answered by the company’s key executives: do they want to disrupt their market by creating a different type of value proposition? Do they seek to be “best in class”? Maybe their aim is to stay level in a competitive market? Or even to catch up to the current leader?
These questions have to be answered BEFORE any sort of A.I project is set in motion. Operational teams will otherwise be left to aimlessly dig through data, looking for a story to tell. Given the nature of most industries, though, they would be chasing a moving target, rewriting history as the data comes in. Yes, data is fun. Yes, it’s interesting. But it serves no purpose on its own. Starting with it instead of clear goals creates only solutions in search of a problem.
Hint: A good strategy starts with disgruntled customers in mind, not technologies.
The decisions, of course, do not stop at the formulation of a strategy; strategy only answers the question “Who am I?”. “What am I doing?” is an altogether different question, and needs to be answered more operationally through the identification of use cases.
Let’s say a company’s long-term strategy is to be the most “trusted” player in its industry. It could enact this strategy by ensuring that all customer queries are answered without errors, that they are answered quickly, that all callers are greeted by a human instead of a robot… hundreds of such ideas could be (should be) systematically identified, evaluated, clustered, prioritised, and discussed in workshops or through small consulting engagements (woo!).
Some will go straight to Sales, Marketing, Accounting, HR… to potentially be dealt with using a traditional operational/statistical approach; and some will be deemed fit for a solution involving A.I, in one of its many forms. I advise looking at three specific issues to start with:
Hint: It is more productive for companies to look at A.I through the lens of business capabilities rather than technologies.
The data question emerges immediately after a Machine Learning project (the term “Machine Learning” is much more accurate than “A.I” at this point of the conversation) is deemed operationally sound and technologically feasible. Or, should I say, the data questionS…
What type of data is needed?
There technically aren’t many different types of data: numerical (historically the easiest to use, thanks to its tabular nature), text, images, videos, sound... Just about anything that can be recorded can be treated as data. Ultimately, computers are agnostic about the type of input you give them; it’s all just numbers to them.
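To make the “it’s all just numbers” point concrete, here is a minimal sketch (the toy values and variable names are mine, not a real pipeline):

```python
import numpy as np

# Text becomes numbers: each character has an integer code.
text = "trusted"
encoded = [ord(c) for c in text]  # [116, 114, 117, 115, 116, 101, 100]

# An image is already numbers: a grid of pixel intensities.
image = np.random.randint(0, 256, size=(28, 28))  # a fake 28x28 grayscale image

# Tabular data is numbers too: one row per observation, one column per feature.
table = np.array([[34.0, 1200.5], [51.0, 980.0]])  # e.g. [age, monthly_spend]

print(encoded)
print(image.shape, table.shape)
```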
How much data is needed?
There is no specific number of data points that can be prescribed, as it varies wildly from project to project; but a start-up which has just launched and has no more than 300 clients probably does not naturally have the resources to launch a ML project. Furthermore, note that data can consist of either a lot of data points (“Big N”), a lot of details for each data point (“Big D”), or both. Both is best.
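A quick illustration of “Big N” versus “Big D”, using made-up array shapes:

```python
import numpy as np

# "Big N": many data points, few details each — e.g. 100,000 rows, 3 columns.
big_n = np.zeros((100_000, 3))

# "Big D": few data points, many details each — e.g. 300 rows, 500 columns.
big_d = np.zeros((300, 500))

# Both at once is best: many rows AND many columns.
both = np.zeros((100_000, 500))

for name, data in [("Big N", big_n), ("Big D", big_d), ("Both", both)]:
    n, d = data.shape
    print(f"{name}: N={n} data points, D={d} details per point")
```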
Do we have that data?
Data is either “available” or “to collect” (yes, I’m over-simplifying, sue me). Collection can either be done internally, which can be incredibly time-consuming (we’re talking months and major restructuring), or through external sources (predicting umbrella demand, for example, would use weather data freely available to all).
Hint: Unique data, rather than cutting-edge modeling, is what creates a valuable A.I solution.
Any self-respecting project requires continuous risk assessment in order to address issues before they arise (PMO 101). This should start early in the project, and be done thoroughly and continuously throughout its entire lifetime. Below are 10 questions to ask to get started:
Hint: Risk assessments are best performed by outside players.
Once all the matters above have been settled (if not, go back to step 1), it’s time to pick the technology best suited to address the previously identified issue.
Doing this so late in the process might seem odd, yet it makes perfect sense once one realises that the technologies below are highly adaptable, and that starting with just one in mind would only shrink the horizon of possibilities. In any case, anyone with some experience in the matter will have thought of the right tool to use during the previous steps of the project.
Below is a shallow look at the 7 main categories of machine learning. Concrete examples will be given in a later article, but in the meantime they can be found all around the web:
Supervised Machine Learning
Linear algorithms: linear algorithms model a prediction as a function of independent variables. They are mostly used for finding relationships between variables and for forecasting, and work well for predictions in a stable environment. → More details: Linear Regression, Logistic Regression, SVM, Ridge/Lasso…
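A minimal sketch of the idea with scikit-learn, on made-up data (nothing here comes from a real project):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: predict monthly sales from advertising spend.
ad_spend = np.array([[10], [20], [30], [40], [50]])   # independent variable
sales = np.array([105, 198, 310, 395, 502])           # dependent variable

model = LinearRegression().fit(ad_spend, sales)

# In a stable environment, the learned line extrapolates reasonably well.
print(model.coef_, model.intercept_)   # roughly sales = 10 * spend + a constant
print(model.predict([[60]]))           # forecast for an unseen level of spend
```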
Ensemble methods: ensemble learning is a system that makes predictions based on a number of different models. By combining individual models, the ensemble model tends to be more flexible (less bias) and less data-sensitive (less variance). → More details: Random Forest, Gradient Boosting, AdaBoost…
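A sketch of the same principle with a random forest, again on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data stands in for a real business dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A random forest averages many decision trees: each tree is noisy on its own,
# but the ensemble is more flexible (less bias) and more stable (less variance).
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```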
Probabilistic classification: a probabilistic classifier is able to predict, given an observation, a probability distribution over a set of classes, rather than only outputting the most likely class the observation belongs to. → More details: Naive Bayes, Bayesian Networks, MLE…
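A small illustration with Naive Bayes; the point to notice is predict_proba returning a distribution rather than a single label:

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic data again — the shapes matter, the values do not.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

clf = GaussianNB().fit(X, y)

# The key difference: not just a label, but a probability for each class.
print(clf.predict(X[:1]))         # most likely class, e.g. [1]
print(clf.predict_proba(X[:1]))   # e.g. [[0.08, 0.92]] — a distribution over classes
```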
Deep Learning: for supervised learning tasks, deep learning methods make time-consuming feature engineering irrelevant by translating the data into compact intermediate representations (think learned categories), increasing precision. They are great for classifying, processing and making predictions based on non-numerical data. → More details: CNN, RNN, MLP, LSTM…
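As a sketch only, here is a tiny neural network in Keras on fabricated data; the architecture and every number in it are arbitrary choices, not a recommendation:

```python
import numpy as np
from tensorflow import keras

# Fabricated data: 1,000 samples of 64 raw features (in real life: pixels, word counts...).
X = np.random.rand(1000, 64)
y = (X.sum(axis=1) > 32).astype(int)  # a made-up binary label

# The hidden layers learn compact intermediate representations on their own —
# this is what replaces manual feature engineering.
model = keras.Sequential([
    keras.layers.Input(shape=(64,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```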
Unsupervised Machine Learning
Clustering: “clustering” is the process of grouping similar data points together. The goal of this unsupervised machine learning technique is to find similarities between data points and automatically group similar ones together. → More details: K-means, DBSCAN, Hierarchical Clustering…
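A minimal clustering sketch (the customer numbers are invented):

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up customer data: [annual spend, visits per month].
customers = np.array([[200, 1], [220, 2], [5000, 20],
                      [5200, 22], [240, 1], [4900, 18]])

# No labels are given: k-means groups similar customers together on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # e.g. [0 0 1 1 0 1] — low spenders vs. heavy users
```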
Dimensionality reduction: dimensionality reduction (aka dimension reduction) is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It also makes very pretty graphs. → More details: PCA, t-SNE…
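A short PCA sketch on synthetic data that contains only a few true underlying signals:

```python
import numpy as np
from sklearn.decomposition import PCA

# 200 observations described by 50 correlated variables...
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = base @ rng.normal(size=(3, 50))  # ...but really only 3 underlying signals.

# PCA recovers a small set of "principal" variables that keep most of the information.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)                      # (200, 2) — ready for a very pretty graph
print(pca.explained_variance_ratio_)   # share of information each component keeps
```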
Deep learning: for unsupervised learning tasks, deep learning methods allow the categorisation of unlabeled data. Any more details would substantially lengthen this article (Soz). → More details: Autoencoders, GANs… (Reinforcement Learning is often mentioned in the same breath, though it is really a paradigm of its own.)
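For the curious, a bare-bones autoencoder sketch in Keras (arbitrary sizes, fabricated data), just to show the shape of the idea:

```python
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 64)  # unlabeled data

# An autoencoder learns to compress (encode) then reconstruct (decode) its input;
# the 8-dimensional bottleneck becomes a learned representation of unlabeled data.
autoencoder = keras.Sequential([
    keras.layers.Input(shape=(64,)),
    keras.layers.Dense(8, activation="relu"),      # encoder / bottleneck
    keras.layers.Dense(64, activation="sigmoid"),  # decoder
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)  # note: the target is X itself
```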
Hint: All models are wrong, but some are useful.
As previously mentioned, few companies can afford to fully develop their own algorithms and deploy them at a large scale unaided. The unlucky 99% face a complicated trade-off: invest a large amount with low chances of success, or invest a smaller amount with a higher chance of success but a very sticky relationship with a provider/partner.
There are no easy answers. Just hard choices. Below are a few options:
Build
Buy
Partner
Hint: BBP choices depend highly on both the executive strategy and the industry to which a company belongs.
A recently built algorithm cannot be released into the wild untested. Indeed, most machine learning systems operate as “black boxes” (again, a useful over-simplification), and something that may have worked on one set of training data may not work as well (if at all) on another set, for a variety of reasons. Thankfully, a handful of tools have been devised to ensure that models work as intended:
All the above matter insofar as they allow a team to fill in a pre-prepared scorecard. External teams are also useful here, as they will not be biased by the costs already sunk into the project.
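As an illustration of one standard check of this kind (a sketch on synthetic data, not a substitute for the full scorecard), cross-validation tests whether a model holds up beyond one lucky split of its training data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model = RandomForestClassifier(random_state=0)

# Five-fold cross-validation: train on 4/5 of the data, test on the held-out 1/5,
# and rotate. Stable scores across folds suggest the model isn't just memorising
# one particular slice of the training data.
scores = cross_val_score(model, X, y, cv=5)
print(scores, scores.mean())
```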
Hint: Beware of the “rage-to-conclude”.
The deployment phase of an A.I project is probably the most crucial of all, given the resources invested. Indeed, the transition from prototype to production system can be expected to be expensive and time-consuming, but it should at least not run into major issues if the risk analysis was done properly (sadly, that is rarely the case).
There aren’t many ways to ensure that a deployment goes smoothly beyond an efficient project management team. The two main techniques are as follows:
Furthermore, and though I would encourage all first-time projects to move slowly and avoid breaking things, keeping an eye out for scaling potential can often be key to future successes.
Hint: Do not be afraid of redesigning workflows around the new solution.
Time to “Render to Caesar the things that are Caesar’s”, and share with the world the success of the Machine Learning algorithm you’ve built. If the endeavor wasn’t a success, it’s worth discussing too, if only as a learning opportunity for the entire organisation.
There are a few key groups which ought to be made aware of both successes and failures, for a wide variety of reasons:
Hint: Under-hyping tends to be more productive than over-hyping.
I have bad news. And good news.
The bad news is that the work never ends. The good news is that the work never ends.
Indeed, letting an algorithm run unchecked for a long period of time once it’s been put into production would be foolish. The world changes, as do people and their habits, and the data they generate changes with them. Within months, it is likely that the initial data used to train an algorithm is no longer relevant.
The challenge here is that it might be very hard to notice at first that the algorithm is no longer representative of the world and needs to be reworked, using the expertise gained while building version 1.0. Talk about a Sisyphean task.
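One common way to spot this early is sketched below (the feature, threshold and numbers are all made up): compare the live distribution of an input against its training distribution with a two-sample statistical test.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # what the model learned on
live_feature = rng.normal(loc=0.4, scale=1.0, size=5000)      # what production now sees

# A two-sample Kolmogorov-Smirnov test flags when a feature's live distribution
# has drifted away from its training distribution.
stat, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:
    print(f"Drift detected (KS statistic={stat:.3f}) — time to retrain.")
```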
Hint: The world changes a lot faster than you think. And an algorithm cannot see that.
The very nature of a data-driven project is that it cannot be generic. This means spending both time and money to create something unique enough to cross the moat created by an early adopter. But because A.I capabilities tend to grow at an exponential rate within large companies, that moat just keeps getting wider. This is the challenge many companies now face, and why they need a solid process to get started: their margin for error is quickly shrinking.
As such, I have a few final recommendations:
Good luck out there.
This article was originally written for The Pourquoi Pas, a blog providing in-depth analyses of today’s technological challenges.