If you think that the field of Big Data, Machine Learning, Artificial Intelligence and Data Science is driven only by faster computers, think again.

A recent post on LinkedIn made the following three assertions (paraphrased):

(1) The increased and renewed attention on Big Data, Machine Learning and Artificial Intelligence is a rebranding of mathematics and advanced statistics.
(2) Related technology now consists of useful and valuable software libraries that are, however, not new, and have simply been repurposed for business use-cases.
(3) The main big change in the field has been the increased affordability of massive computational power for processing data.

Well, these assertions are partially true; and, even if they were entirely true, they’re also not equally important. I would argue that, in this wide and interconnected field of Big Data, Machine Learning, Artificial Intelligence, and Data Science (for which we lack an all-encompassing name), increasing adoption is driven not so much by the technological trends behind assertion (3), but by ongoing changes related to assertions (1) and (2).

Despite that, arguments for BD/ML/AI/DS and how it’s changing the world are mostly based on a fascination with the technological trends behind (3), such as cloud computing, GPGPU, and affordable multi-core CPUs. However, the door-openers and drivers of the field are:

the increasing accessibility of related know-how (related to, but not accurately captured by assertion (1)) to understand the basics well enough to create value,
the increasing availability of related software tools (related to, but not accurately captured by assertion (2)) to build prototypes that chart a direction toward solving problems,
the increasing affordability of more-than-good-enough computational capacity (not accurately captured by assertion (3)), allowing pretty much everyone to get started.

Here’s why.
Assertion (1): it’s just a rebranding, anyway

“The increased and renewed attention on Big Data, Machine Learning and Artificial Intelligence is a rebranding of mathematics and advanced statistics.”

Assertion (1) is actually partially correct, and mostly so for the field of Machine Learning. But is what is going on really a cynical “rebranding” of math and statistics? Or is it an expanded accessibility to both old and new ideas and knowledge? It’s the latter.

A cursory look at the table of contents of Christopher Bishop’s classic textbook “Pattern Recognition and Machine Learning” showcases that much (not all) of what we call Machine Learning is “fancy interpolation”, i.e. “fancy regression” (and for those living life on the edge, also extrapolation), and regression has typically been a topic of statistics.

On the other hand, assertion (1) is also irrelevant. Even if it were true (it isn’t entirely true), it would matter little that it’s a rebranding. The real question is: does the trend make it easier for people to grasp the basics? The answer is yes, and that’s what really matters, rebranding or not.

What I find most interesting about this assertion is how much of the advanced concepts, mathematics, and statistics has been made easier to grok, even as it has become less relevant for the average Machine Learning practitioner in business, because it has also been abstracted away (see the discussion of assertion (2) below). This is thanks to the fact that there are now many more sources of learning than used to be available in the past. Moreover, these sources are not limited to books as dense as PRML or ESLII; learning about BD/ML/AI/DS is now more granular than ever before. There are numerous online courses ranging from the math behind the algorithms to their simple application using software tools to attractive explanations of concepts on video. And if you still like books, there are so, so many of them and with a much larger variety of style and content, e.g. on the topic of Machine Learning.

Regardless of your learning preferences, you are bound to find sources to help kickstart your learning journey. And you definitely don’t need to attend meat-space university lectures, let alone attend courses towards a Masters-level degree, to enter the field.

In other words: knowledge about these topics is more accessible than ever before, for cheaper, and in many more as well as more easily digestible formats. And this is also helped by the fact that this alleged “rebranding”, to the extent that it’s true, is mixing things together, instead of keeping them in academically-separated silos of knowledge. Meaning: you might start reading a book on Big Data in business (here’s one of my favorites) and realize that it has a high-level overview of relevant Machine Learning algorithms. Or, you might read a popular science book on Machine Learning and be exposed to subtopics very relevant for Artificial Intelligence, such as reinforcement learning. The field is increasingly networked, and jumping from idea to idea has never been easier.

Narrowly-focused specialists will always be necessary to move separate subfields of ideas forward. Yet, jacks-of-all-trades are increasingly needed to combine ideas from all the fields, consult the specialists, and create value together.

“Rebranding” or not, this more networked and now highly accessible knowledge counts as a win in my book, and in the book of everyone else who wants to get things done and make things happen with BD/ML/AI/DS and, in doing so, to create customer value instead of geeking out about technology.

Assertion (2): it’s simply about software tools being repurposed

“Related technology now consists of useful and valuable software libraries that are, however, not new, and have simply been repurposed for business use-cases.”

I’m not entirely sure about the total accuracy of assertion (2) either, i.e.
that it’s simply about the “repurposing” of software tools. Of course, software tools for Machine Learning have been around for a long time, e.g. considering the hype surrounding Artificial Neural Networks in the 1980s. Software for Big Data has indeed exploded in recent years, together with the rise of, e.g., social networks and e-commerce applications that generate really, really big data that is then analyzed for both good and evil. And, regarding Artificial Intelligence, software so far seems to be specialized for specific types of applications, and has been so for quite some time.

I cannot contribute experiences of my own in Big Data or AI beyond those of an explorer or dabbler, but I can look at assertion (2) through the eyes of someone who has been solving real-world engineering problems with Machine Learning for a while now. In the following short story, just count the number of different software tools employed in the pursuit of value; are we talking about a “repurposing” of software tools or about a toolkit of ever-growing size at ever-increasing speeds of change and improvement? I believe it’s the latter.

How I got into Machine Learning (the short version)

As a development engineer at ABB Turbo Systems, I developed complex mechanical components (e.g., radial turbines, shafts, casings, threaded connectors) and investigated the feasibility of technological ideas and new concepts using FEA/CFD simulations.

In 2009 I played around with ANNs in JavaNNS, and eventually turned to very basic polynomial response surface models implemented in Mathcad to model the behavior of a mechanical system for the first time. I also used Taguchi matrices to design my experiments.

In 2010 I started using Weka and many of its included algorithms (such as decision trees, support vector machines, Kriging, and Principal Component Analysis) routinely in a loose application of CRISP-DM for iterative and incremental (“agile”?) development of mechanical components. I also used my own Python implementations of Latin Hypercube Sampling and Particle Swarm Optimization, as well as libraries like ecspy. From a usage perspective, all these tools were clunky, but they did the job well.

[Diagram: Because CRISP-DM is too “small”.]

In late 2011 I wanted to use scikit-learn, but it had just dropped support for neural networks and delegated them to the then still nascent PyBrain library, which proved to be too slow, similarly to bpnn.py.

In 2012, eventually, libFANN became my workhorse for ANNs. It was fast but didn’t include a way to avoid overfitting automatically by early stopping. So, I had to implement methods to do so in Python from a paper. I also had to write my own helper scripts for parametric studies on network architectures and training hyperparameters, for cross-validation, and for creating ensembles of neural networks. And, of course, I also had to do my own juggling of data on different nodes of an HPC cluster to achieve all of the above within a sensible time frame.

In 2015 I returned to Weka in order to help a technology development team at Hilti figure out whether they had already exhausted the optimization potential of a new product component (they hadn’t).

In 2016 I continued applying the mindset of experimental design, ML/DS and iterative approaches like CRISP-DM and my own (see diagram above) to support teams in problem-solving.

And, in 2017, I built prototypes of a new two-sided data-driven business model that required me to dabble in Natural Language Processing using NLTK, spaCy, and VADER for sentiment analysis, all done using Jupyter notebooks.

Yes, software tools are being “repurposed” all the time, in the broadest sense of the word; after all, should we be reinventing the wheel with every new problem? Or should BD/ML/AI/DS as a field of practice also strive to become more akin to the messy world of Javascript, with its countless hipstery whatever.js libraries and dependency hell? I doubt it.
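As an aside on the 2012 entry in the story above: the kind of early stopping that had to be added by hand can be sketched in a few lines of plain Python. This is a minimal, generic patience-based scheme, not the method from the paper mentioned in the story and not part of libFANN; the function name `train_with_early_stopping` and its parameters are hypothetical and exist only for illustration.

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[], None],
    validation_loss: Callable[[], float],
    max_epochs: int = 1000,
    patience: int = 10,
) -> Tuple[int, float]:
    """Train until the validation loss stops improving.

    Hypothetical helper: returns (best_epoch, best_loss), epochs counted from 1.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        loss = validation_loss()
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
            # A real implementation would snapshot the network weights here.
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return best_epoch, best_loss


# Toy usage with a scripted validation curve (made-up numbers) whose
# minimum is reached at the fourth epoch.
curve = iter([0.9, 0.6, 0.5, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9])
epochs_run = []
best_epoch, best_loss = train_with_early_stopping(
    train_one_epoch=lambda: epochs_run.append(1),  # stand-in for real training
    validation_loss=lambda: next(curve),
    max_epochs=10,
    patience=3,
)
print(best_epoch, best_loss, len(epochs_run))
```

The two callbacks keep the loop independent of any particular neural-network library: whatever trains the network for one epoch and whatever evaluates it on held-out data can be plugged in unchanged.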
Nowadays, much like sources of learning, software tools for BD/ML/AI/DS are also increasingly interconnected, polished, and well documented. The trend of asking questions on forums such as Quora and StackExchange also makes it more likely that toy examples and key use-cases are documented and can be found somewhere online. This makes software use and solution reuse easier and overall cheaper.

You don’t need to be an expert in statistical learning or optimization algorithms anymore in order to get started in the field. Understandably, that annoys the experts who see newbies making grave errors and oversimplifications on their journey of learning. But that’s inevitable; look for example at how the field of engineering simulation with FEA and CFD has become commoditized. Nothing new here.

Contrary to the past, where acquiring knowledge in BD/ML/AI/DS was more compartmentalized, large-batched, and dead-serious, the present looks more user-friendly and results-oriented. You can get started with a cursory understanding of the concepts, then start playing with software tools to learn more, then learn from what you achieved; rinse and repeat, akin to Lean’s LAMDA/PDSA, or Lean Startup’s Build-Measure-Learn cycle.

Assertion (3): it’s all about faster, cheaper, abundant computers

“The main big change in the field has been the increased affordability of massive computational power for processing data.”

As explained above, nowadays pretty much anyone can get into BD/ML/AI/DS by reading the basic concepts online, and then using toy examples and self-hosted software installations (e.g., Apache Spark), before scaling up these skills to solve real-world problems. However, to first set foot on this journey, a) knowledge must be accessible, b) toy examples and tutorials must be plentiful, and c) software tools must be relatively easy to use, i.e. not represent a barrier to getting experience with actual problem-solving.
Since (a), (b) and (c) are already there, of course, you also need a computer. But do you really need “massive computational power for processing data”? Hardly, if ever.

For most people, the increased computational power for data processing isn’t what’s driving the beginning or the continuation of their BD/ML/AI/DS journey. When did you last hear someone say “if only I had the money for a quad-GTX1080 rig or EC2 instances, I would be able to start doing BD/ML/AI/DS”?

Probably never, except from procrastinators (akin to those thinking that “if only I had the latest Nike shoes and an Apple Watch, I would be motivated to start running often so that I can eventually lose weight”) or from those who mainly want a GTX1080 to max-out Skyrim ENBs and so rationalize the purchase decision through dreams of using GPGPU to train self-driving cars. 🙄

In 99.999% of the cases, your current laptop will do. Yes, even if it only features a 10+ years-old single-core Centrino, 512 MB of RAM, and a 16 GB hard drive.

If you really feel the need for multiple cores, here’s your computational-power barrier-to-entry, cost-wise: US$ 8.99 for a quad-core Orange Pi Zero, plus US$ 4.00 for shipping, plus another US$ 5.00 for a 16 GB MicroSD, plus the time it takes to install and configure Armbian. Less than 20 US dollars and 2 hours of total effort, and you can jump into scikit-learn and Jupyter. No excuses.

Sure, such a minimum-viable setup wouldn’t provide you with massive computational capacity. However, it does enable more and more people to enter the field of BD/ML/AI/DS at a ludicrously low cost, and eventually graduate to problem-defining and problem-solving situations requiring truly “massive” computational capacity.

Therefore, no; the main big change in the field has not been the increased affordability of massive computational power for processing data.
However, a major driver of change has been (and continues to be) the massive decrease in cost for the absolute bare-minimum tech specs you can get in order to enter the field. A set of tech specs, by the way, which is way above the actual minimum set you need to actually get started.

Conclusions

(1) The increased and renewed attention on Big Data, Machine Learning and Artificial Intelligence is thanks to the “arrival” of the world at a point where the field of BD/ML/AI/DS can provide solutions to actual problems, combined with a massively increased accessibility of the related know-how to an ever-increasing population that is growing aware of the field itself.
(2) Related technology has always consisted of both new and repurposed software libraries and tools that are increasingly well-documented, easy to install, easy to use, and easy to find examples for, combined with the accessibility benefits of conclusion (1) to grok well enough what’s going on in the background.
(3) The increased affordability of massive computational power for processing data has not been the main driver of BD/ML/AI/DS; rather, an important driver of the field continues to be the massively increasing affordability of way-more-than-good-enough computational capacity for entering the field and eventually graduating to real-world problems delivering customer value at scale.

Disclaimer: This article was authored in my personal capacity. The opinions and views expressed in this article, as well as the context surrounding them, are my own and do not reflect the views of or bear any relation to my employer’s business.

Related articles

The incredible story of Deft (an unexpectedly viral hit)
Go beyond technological navel-gazing and focus on customer value
Why product-focused IoT innovation is the comfort zone

You can also read other articles on similar business challenges, and follow me on LinkedIn.

Too Long; Didn't Read

Faster computers aren’t the key driver behind Machine Learning adoption
