If you think that the field of Big Data, Machine Learning, Artificial Intelligence and Data Science is driven only by faster computers, think again.
A recent post on LinkedIn made the following three assertions (paraphrased): (1) the increased and renewed attention on Big Data, Machine Learning and Artificial Intelligence is a rebranding of mathematics and advanced statistics; (2) the related technology consists of useful and valuable software libraries that are not new, but have simply been repurposed for business use-cases; (3) the main big change in the field has been the increased affordability of massive computational power for processing data.
Well, these assertions are only partially true; and even where they do hold, they are not equally important. I would argue that, in this wide and interconnected field of Big Data, Machine Learning, Artificial Intelligence, and Data Science (for which we lack an all-encompassing name), increasing adoption is driven not so much by the technological trends behind assertion (3) as by ongoing changes related to assertions (1) and (2).
Despite that, arguments for BD/ML/AI/DS and how it’s changing the world are mostly based on a fascination with the technological trends behind (3), such as cloud computing, GPGPU, and affordable multi-core CPUs.
However, the door-openers and drivers of the field are: a) knowledge that is more accessible than ever, b) software tools that are increasingly usable and interconnected, and c) a bare-minimum cost of entry that keeps dropping.
Here’s why.
“The increased and renewed attention on Big Data, Machine Learning and Artificial Intelligence is a rebranding of mathematics and advanced statistics.”
Assertion (1) is actually partially correct, and mostly so for the field of Machine Learning — but is what is going on really a cynical “rebranding” of math and statistics? Or is it an expanded accessibility to both old and new ideas and knowledge? It’s the latter.
A cursory look at the table of contents of Christopher Bishop’s classic textbook “Pattern Recognition and Machine Learning” (PRML) shows that much (though not all) of what we call Machine Learning is “fancy interpolation”, i.e. “fancy regression” (and, for those living life on the edge, extrapolation), and regression has traditionally been a topic of statistics. On the other hand, assertion (1) is also largely irrelevant: even to the extent that it is true, it matters little whether this is a rebranding. The real question is whether the trend makes it easier for people to grasp the basics. The answer is yes, and that is what really matters, rebranding or not.
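To make the “fancy regression” point concrete, here is a minimal sketch of my own (not taken from the book), assuming only that NumPy and scikit-learn are installed: a kernel ridge regressor, i.e. plain ridge regression dressed up in a kernel, fit to noisy one-dimensional toy data.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Toy data: a noisy sine wave, the classic regression demo.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 6.0, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0.0, 0.1, 40)

# Kernel ridge regression: ridge regression in a kernel-induced feature
# space, i.e. "fancy regression".
model = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.5)
model.fit(X, y)

# Interpolate within the data range and, living on the edge, extrapolate a bit.
X_new = np.linspace(-1.0, 7.0, 9).reshape(-1, 1)
print(model.predict(X_new))
```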
What I find most interesting about this assertion is how much of the advanced concepts, mathematics, and statistics has been made easier to grok, even as it has become less relevant for the average Machine Learning practitioner in business because it has been abstracted away (see the discussion of assertion (2) below).
This is thanks to the fact that there are now many more sources of learning than there used to be. Moreover, these sources are not limited to books as dense as PRML or ESLII (“The Elements of Statistical Learning”): learning about BD/ML/AI/DS is now more granular than ever before. There are numerous online courses, ranging from the math behind the algorithms, to their straightforward application with software tools, to engaging video explanations of the concepts. And if you still prefer books, there are very many of them, in a much larger variety of style and content, e.g. on the topic of Machine Learning.
Regardless of your learning preferences, you are bound to find sources to kickstart your learning journey. And you definitely don’t need to attend meat-space university lectures, let alone pursue a Masters-level degree, to enter the field.
In other words: knowledge about these topics is more accessible than ever before, at lower cost, and in more (and more easily digestible) formats. This is also helped by the fact that the alleged “rebranding”, to the extent that it’s real, mixes things together instead of keeping them in academically separated silos of knowledge.
Meaning: you might start reading a book on Big Data in business (here’s one of my favorites) and realize that it contains a high-level overview of relevant Machine Learning algorithms. Or you might read a popular-science book on Machine Learning and be exposed to subtopics highly relevant for Artificial Intelligence, such as reinforcement learning. The field is increasingly networked, and jumping from idea to idea has never been easier. Narrowly focused specialists will always be needed to move the separate subfields forward. Yet jacks-of-all-trades are increasingly needed to combine ideas from all of these fields, consult the specialists, and create value together.
“Rebranding” or not, this more networked and now highly accessible knowledge counts as a win in my book, and in the book of everyone who wants to get things done with BD/ML/AI/DS and, in doing so, create customer value instead of geeking out about technology.
“Related technology now consists of useful and valuable software libraries that are, however, not new, and have simply been repurposed for business use-cases.”
I’m not entirely convinced of assertion (2) either, i.e. that this is simply a “repurposing” of software tools. Of course, software tools for Machine Learning have been around for a long time; consider the hype surrounding Artificial Neural Networks in the 1980s. Software for Big Data has indeed exploded in recent years, together with the rise of, e.g., social networks and e-commerce applications that generate really, really big data which is then analyzed for both good and evil. And in Artificial Intelligence, software so far seems to be specialized for specific types of applications, and has been for quite some time.
I cannot contribute my own experiences in Big Data or AI beyond those of an explorer and dabbler, but I can look at assertion (2) through the eyes of someone who has been solving real-world engineering problems with Machine Learning for a while now.
In the following short story, just count the number of different software tools employed in the pursuit of value; are we talking about a “repurposing” of software tools or about a toolkit of ever-growing size at ever-increasing speeds of change and improvement? I believe it’s the latter.
As a development engineer at ABB Turbo Systems, I used FEA/CFD simulations to develop complex mechanical components (e.g., radial turbines, shafts, casings, threaded connectors) and to investigate the feasibility of technological ideas and new concepts.
In 2009 I played around with ANNs in JavaNNS, and eventually turned to very basic polynomial response surface models implemented in Mathcad to model the behavior of a mechanical system for the first time. I also used Taguchi matrices to design my experiments.
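For illustration only, here is roughly what such a basic polynomial response surface boils down to, sketched in Python/NumPy rather than Mathcad, and with made-up numbers instead of real simulation results: fit a low-order polynomial to a handful of design points and use it as a cheap surrogate for the expensive simulation.

```python
import numpy as np

# Hypothetical design points (e.g. one geometric parameter) and a made-up
# response (e.g. peak stress from a handful of FEA runs).
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([1.20, 0.92, 0.71, 0.83, 1.10, 1.65, 2.58])

# Fit a quadratic response surface by least squares.
coefficients = np.polyfit(x, y, deg=2)
surrogate = np.poly1d(coefficients)

# Evaluate the cheap surrogate instead of re-running the simulation.
print(surrogate(1.75))
```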
In 2010 I started using Weka and many of its included algorithms (such as decision trees, support vector machines, Kriging, and Principal Component Analysis) routinely in a loose application of CRISP-DM for iterative and incremental (“agile”?) development of mechanical components. I also used my own Python implementations of Latin Hypercube Sampling and Particle Swarm Optimization, as well as libraries like ecspy. From a usage perspective, all these tools were clunky, but they did the job well.
[Diagram: my own iterative development approach, because CRISP-DM is too “small”.]
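To give a flavor of what my home-grown Python implementations looked like, here is a minimal Latin Hypercube Sampling sketch, reconstructed in plain NumPy rather than taken from the original code: each dimension is split into one stratum per sample, and the strata are shuffled independently per dimension.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, seed=None):
    """Draw a Latin Hypercube sample in the unit hypercube [0, 1]^n_dims."""
    rng = np.random.default_rng(seed)
    # One equal-width stratum per sample along each dimension, with a random
    # point inside each stratum...
    points = (np.arange(n_samples) + rng.uniform(size=(n_dims, n_samples))) / n_samples
    # ...and the strata shuffled independently per dimension.
    for d in range(n_dims):
        rng.shuffle(points[d])
    return points.T  # shape: (n_samples, n_dims)

print(latin_hypercube(5, 2, seed=42))
```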
In late 2011 I wanted to use scikit-learn, but it had just dropped support for neural networks and delegated them to the then still nascent PyBrain library, which proved to be too slow, similarly to bpnn.py.
In 2012, eventually, libFANN became my workhorse for ANNs. It was fast, but it didn’t include a way to avoid overfitting automatically via early stopping, so I had to implement methods from a paper in Python myself. I also had to write my own helper scripts for parametric studies on network architectures and training hyperparameters, for cross-validation, and for creating ensembles of neural networks. And, of course, I had to do my own juggling of data across the nodes of an HPC cluster to achieve all of the above within a sensible time frame.
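The early-stopping logic I had to bolt on was essentially the following. This is a simplified, library-agnostic sketch, not the actual FANN wrapper; `train_one_epoch` and `validation_error` are stand-in callables for whatever your ANN library provides. Keep the weights from the epoch with the lowest validation error and stop once that error has not improved for a while.

```python
import copy

def train_with_early_stopping(net, train_one_epoch, validation_error,
                              max_epochs=1000, patience=20):
    """Generic early stopping: `train_one_epoch(net)` and `validation_error(net)`
    wrap whatever the underlying ANN library actually provides."""
    best_error = float("inf")
    best_net = copy.deepcopy(net)
    epochs_since_improvement = 0

    for _ in range(max_epochs):
        train_one_epoch(net)
        error = validation_error(net)

        if error < best_error:
            best_error = error
            best_net = copy.deepcopy(net)   # remember the best weights so far
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                break                       # validation error has stalled: stop

    return best_net, best_error
```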
In 2015 I returned to Weka in order to help a technology development team at Hilti figure out whether they had already exhausted the optimization potential of a new product component (they hadn’t). In 2016 I continued applying the mindset of experimental design, ML/DS and iterative approaches like CRISP-DM and my own (see diagram above) to support teams in problem-solving. And, in 2017, I built prototypes of a new two-sided data-driven business model that required me to dabble in Natural Language Processing using NLTK, spaCy, and VADER for sentiment analysis — all done using Jupyter notebooks.
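To illustrate how low the barrier to such dabbling has become, here is a minimal sentiment-analysis sketch using NLTK’s bundled VADER analyzer; it is a generic toy example, not the actual prototype code.

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# The VADER lexicon only needs to be downloaded once.
nltk.download("vader_lexicon")

analyzer = SentimentIntensityAnalyzer()
reviews = [
    "This tool is fantastic and worth every cent.",
    "The battery died after two days. Very disappointing.",
]
for text in reviews:
    scores = analyzer.polarity_scores(text)   # neg / neu / pos / compound
    print(f"{scores['compound']:+.2f}  {text}")
```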
Yes, software tools are being “repurposed” all the time, in the broadest sense of the word; after all, should we reinvent the wheel with every new problem? Or should BD/ML/AI/DS as a field of practice instead strive to become more akin to the messy world of JavaScript, with its countless hipstery whatever.js libraries and its dependency hell? I doubt it.
Nowadays, much like the sources of learning, software tools for BD/ML/AI/DS are increasingly interconnected, polished, and well documented. The habit of asking questions on forums such as Quora and Stack Exchange also makes it likely that toy examples and key use-cases are documented and can be found somewhere online. This makes software use and solution reuse easier and cheaper overall.
You no longer need to be an expert in statistical learning or optimization algorithms in order to get started in the field. Understandably, that annoys the experts, who see newbies making grave errors and oversimplifications on their learning journey. But that’s inevitable; look, for example, at how the field of engineering simulation with FEA and CFD has become commoditized. Nothing new here.
In contrast to the past, when acquiring knowledge in BD/ML/AI/DS was more compartmentalized, large-batched, and dead-serious, the present looks more user-friendly and results-oriented. You can get started with a cursory understanding of the concepts, start playing with software tools to learn more, then learn from what you achieved; rinse and repeat, akin to Lean’s LAMDA/PDSA or Lean Startup’s Build-Measure-Learn cycle.
“The main big change in the field has been the increased affordability of massive computational power for processing data.”
As explained above, nowadays pretty much anyone can get into BD/ML/AI/DS by reading about the basic concepts online and then working through toy examples on self-hosted software installations (e.g., Apache Spark), before scaling these skills up to real-world problems. However, to set foot on this journey in the first place, a) knowledge must be accessible, b) toy examples and tutorials must be plentiful, and c) software tools must be relatively easy to use, i.e. they must not be a barrier to gaining experience with actual problem-solving.
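For example, a self-hosted Apache Spark toy example fits on any laptop. A minimal local-mode sketch, assuming only that PySpark has been installed (e.g. via pip), might look like this:

```python
from pyspark.sql import SparkSession

# Run Spark entirely on the local machine, using all available cores.
spark = SparkSession.builder.master("local[*]").appName("toy-example").getOrCreate()

lines = spark.sparkContext.parallelize([
    "big data machine learning",
    "machine learning artificial intelligence",
    "big data data science",
])

# The canonical word count, as a first taste of the map/reduce style.
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

print(sorted(counts.collect()))
spark.stop()
```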
Now that (a), (b), and (c) are in place, the only thing you still need, of course, is a computer. But do you really need “massive computational power for processing data”? Hardly, if ever.
For most people, increased computational power for data processing isn’t what drives the beginning or the continuation of their BD/ML/AI/DS journey. When did you last hear someone say “if only I had the money for a quad-GTX1080 rig or EC2 instances, I would be able to start doing BD/ML/AI/DS”?
Probably never, except from procrastinators (akin to those thinking that “if only I had the latest Nike shoes and an Apple Watch, I would be motivated to start running often so that I can eventually lose weight”) or from those who mainly want a GTX1080 to max-out Skyrim ENBs and so rationalize the purchase decision through dreams of using GPGPU to train self-driving cars. 🙄
In 99.999% of cases, your current laptop will do. Yes, even if it only features a single-core Centrino that is over ten years old, 512 MB of RAM, and a 16 GB hard drive. If you really feel the need for multiple cores, here’s your computational-power barrier to entry, cost-wise: US$ 8.99 for a quad-core Orange Pi Zero, plus US$ 4.00 for shipping, plus another US$ 5.00 for a 16 GB MicroSD card, plus the time it takes to install and configure armbian. Less than 20 US dollars and about 2 hours of total effort, and you can jump into scikit-learn and Jupyter. No excuses.
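Even on such a bare-bones machine, a first scikit-learn experiment takes only a few lines; something like the following classic Iris toy example runs in seconds:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# The classic Iris dataset: 150 rows and four features, i.e. trivially small.
X, y = load_iris(return_X_y=True)

# A shallow decision tree, evaluated with 5-fold cross-validation.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f}")
```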
Sure — such a minimum-viable setup wouldn’t provide you with massive computational capacity. However, it does enable more and more people to enter the field of BD/ML/AI/DS at a ludicrously low cost, and eventually graduate to problem-defining and problem-solving situations truly requiring “massive” computational capacity.
Therefore, no: the main big change in the field has not been the increased affordability of massive computational power for processing data. A major driver of change, however, has been (and continues to be) the massive decrease in the cost of the absolute bare-minimum tech specs you can buy in order to enter the field; specs which, by the way, are well above the actual minimum you need to get started.
Disclaimer: This article was authored in my personal capacity. The opinions and views expressed in this article, as well as the context surrounding them, are my own and do not reflect the views of or bear any relation to my employer’s business.
You can also read other articles on similar business challenges, and follow me on LinkedIn.