News in AI and machine learning

For more AI news and analysis, sign up to my newsletter here.

Reporting from 19th January 2017 through March 28th 2017. I’m Nathan Benaich. Welcome to issue #18 of my AI newsletter! I will synthesise a narrative that analyses and links important happenings, data, research and startup activity from the AI world. Grab your hot beverage of choice ☕ and enjoy the read! A few quick points before we start:

1. Personal announcement: after almost 4 years of investing and building Playfair Capital, I’ve left to capitalise on exciting opportunities in AI. If you’re looking to invest, research, build, or buy AI-driven companies, do hit reply and drop me a line.
2. I gave a talk on How to start an AI company at the Oxford AI Society, which includes frameworks to find technology-problem fit, product design tips for an AI-driven software model, and financing advice.
3. My top picks: a) Five AI startup predictions by Bradford Cross (tl;dr death to bots and ML as a service, long vertical applications, but remember to tread with caution), and b) So your company wants to do AI by Eder Santana.

Referred by a friend? Sign up here. Help share by giving it a tweet :)

*Technology news, trends and opinions*

🚗 Department of Driverless Cars

a. Large incumbents

In a massive deal this quarter, Intel agreed to purchase Mobileye for $15.3bn. The 18-year old NYSE-listed Israeli company holds a portfolio of camera-based computer vision, sensor fusion, mapping and driving policy products for advanced driver assistance features such as pedestrian, vehicle and sign detection, as well as relationships with Tier 1 OEMs. Intel takes the view that “whoever has the best data can develop the best AI”. Intel already has a strong position in silicon (in-house + Nervana and Movidius), memory and communications. Mobileye therefore adds the “brains” component to accelerate “rack scale” end-to-end autonomous solutions that customers are asking for. This includes the 16m installed EyeQ3 chips (those Tesla used) that run tools like Mobileye’s Road Experience Management (REM) product for mapping and localisation (going HD in 2018), as well as next generation chips for Level 3–4 automation. The Mobileye-Intel combination is set to compete head to head with NVIDIA (more details on the positioning in this official presentation).

In an effort to distribute the challenge of building an automated safety and awareness processing stack, Udacity, which now has a permit to test AVs in California, and leading Chinese ride-sharing startup DiDi announced a $100k self-driving car challenge. The dataset includes Velodyne LiDAR point clouds, radar objects and camera image frames (here).

Waymo, the self-driving car spin-out from Google X, is proving far more effective on Californian roads than its competitors. Data shows that Waymo logged 30x more autonomous miles in 2016 than others and only required human intervention 0.2 times per thousand miles for safety reasons. Waymo also made the news for pursuing a federal civil lawsuit against Uber. It focuses on Anthony Levandowski, who was a key engineer at Waymo and co-founder of Otto, a 7-month old driverless software company for trucks that Uber purchased for $700m. Waymo claims that Levandowski stole 14,000 confidential files from Waymo servers that describe key self-driving software and LiDAR IP before selling Otto to Uber. It has subsequently transpired that Levandowski consulted to Uber on self-driving technology six months before he started Otto. Putting Uber further into the spotlight, one of its AVs was involved in a serious crash in Arizona, where the company’s cars retreated after their testing ban from SF streets.

Baidu has spent the better part of a $2.9bn R&D budget on AI over the last 2.5 years, an endeavor run by 1,300 researchers. The company was also a victim of an attempted cyber attack directed against its autonomous driving IP. High stakes at play here.

Tesla announced it was close to releasing version 2.0 of Autopilot (AP2) based on NVIDIA Drive PX2 and in-house software (vs. the 1.0 Mobileye system). While the speed limit for Autosteer has been upped, only 1 out of 8 camera sensors on the new hardware stack is being used. As such, a US law firm is seeking to sue Tesla for selling AP2 to customers before it’s ready. On the other hand, an Ohio car insurance provider is offering reduced premiums for Tesla owners with Autopilot.

Ford, in contrast to BMW/VW/Mercedes-Benz, is said to consider removing all driving controls from their self-driving cars that are set to debut in 2021. They don’t buy that resting drivers can react sufficiently fast to intervene when needed, thus meaning Ford would skip from Level 3 to 5 autonomy.

NVIDIA announced DRIVE PX platform collaborations with Bosch, the world’s largest automotive supplier, and PACCAR, a leading global truck manufacturer.

b. The startups

AutoX, a US self-driving startup authorised to test on public Californian roads, released an impressive video demo of a 2017 Lincoln MKZ navigating autonomously in day, night, light rain and cloudy darkness using only front-mounted cameras. The company’s stated go-to-market model is to provide the self-driving OS to OEMs instead of selling aftermarket or operating their own service. The team draws roots from Princeton’s vision group.

Oxbotica, the Oxford spinout led by Paul Newman and Ingmar Posner that has quietly built impressive mobile autonomy software, was featured in the FT. This team tightly couples fundamental research at the University’s Oxford Robotics Institute with real-world applications for self-driving. In 3 years, it’s accomplished significant feats without venture financing, releasing an autonomous control system, Selenium. #LongUKAI! A prime CMU/Uber-style acquisition target here…

Comma.ai announced the Panda, a circuit board that extracts granular driving data from a vehicle and can issue accelerator and brake commands to the car. The only way to get your hands on one is by accumulating sufficient points on the company’s dashcam video recording app, Chffr. The (updated) longer term goal is to aggregate worldwide driving data, presumably as a pseudo-Mobileye REM product.

Slightly more information emerged about Drive.ai, another startup that can test on public roads in California. The company retrofits a roof-mounted rig equipped with nine HD cameras, two radars and six Velodyne Puck LiDAR sensors and uses sensor fusion with deep learning to translate inputs to driving instructions. Current limitations include altering the vehicle path on the fly to compensate for obstructions that suddenly appear. The company is also said to focus on logistics in dense geographic areas as opposed to transporting people.

The big boys

Apple joined the Partnership on AI to Benefit People and Society, appointing Siri co-founder/CTO Tom Gruber to the board along with representatives from DeepMind, Amazon, Microsoft, Facebook and IBM. Apple is also building out its engineering and AI research footprint in Seattle, following its acquisition of Turi last year. This includes a $1m endowed professorship in ML at the University of Washington. The company also released a new app, Clips, for native iOS video editing empowered by computer vision, NLP and AR tools. Furthermore, the new iOS 10.3 update includes a consent for Apple to read user iCloud data (following differential privacy manipulation) to improve predictive features in Siri.

Google made lots of announcements at Cloud NEXT 2017, including the acquisition of data science community Kaggle and the GA release of a) Cloud ML Engine for training and deploying proprietary models to the cloud, and b) Cloud Vision API. There were also releases to help data scientists visually explore and prepare data (Cloud Dataprep) as well as integrate data from BigQuery and Commercial Datasets, and the fully-managed data processing pipeline for batch and streamed data (Dataflow). This shows that ML infrastructure is indeed still a nascent space where opportunities exist for specialised startups. Separately, YouTube announced that it had reached 1 billion machine-generated video captions for their audio content. More on video understanding later!

MIT Tech Review ran a piece on Goldman Sachs’ efforts to breathe automation into their business. Starting with replacing 4 currency traders with 1 software engineer, the firm has mapped the 146 steps required to take a company public to identify many that are “begging to be automated”. On their side, JP Morgan has made significant investments to develop an internal cloud infrastructure and environment to build and run machine learning applications. This includes their Contract Intelligence software, which interprets commercial loan agreements. The product cuts down on the 360k human hours required to analyse 12k contracts a year.

Facebook “today cannot exist without AI”, says Joaquin Candela (head of applied ML group) in this Backchannel piece on the group’s genesis and its impact on Facebook, Instagram and Messenger over the last two years.

Hardware

British chip maker ARM, which was acquired by SoftBank for $32bn last year, announced their DynamIQ technology. It applies to ARM’s Cortex-A CPUs and enables custom configurations of large and small CPUs in a single cluster. It also provides a shared memory subsystem, faster data transfer with accelerators and power savings that collectively focus on delivering performance and efficiency for running AI applications at the edge. ARM recently passed the 100 billion chips sold milestone since 1991.

Intel announced a first product with their 3D XPoint memory technology that is positioned to replace hard drives or SSDs by providing greater density and performance.

NVIDIA keeps expanding the universe of cloud providers offering their Pascal architecture-based Tesla GPUs. They’ve just added Tencent Cloud, followed by a collaboration with Microsoft to develop a new hyperscale GPU accelerator powered by 8x Tesla P100 GPUs for AI cloud computing.

AI research in production

NYT ran a profile on various efforts to automatically create music. The piece includes samples from Jukedeck and DeepMind. Jukedeck’s CEO also discusses their progress on the BBC podcast that also features Geoff Hinton.

Innovation in AI, whether it occurs in the real world or research lab, builds upon the shoulders of published research. There are two fundamental flaws in the implementation of research: papers a) seldom contribute much time to solving and openly discussing engineering problems, and b) are fraught with a lack of rigor and reproducibility. These are important problems that we must work to correct as a community. DistillPub, a new open-source publication for machine learning, can help here. It provides new data visualization opportunities, transparency over methods and cash prizes for clearly communicated work.

Big ideas!

It’s clear that talent is a bottleneck in software engineering and even more in AI. In order to deliver on promises for AI, we need to drive more talent from diverse backgrounds into the field and do so by sustaining the institutions that educate future generations.

Turns out that the $100m investment of Braintree founder Bryan Johnson in Kernel to create a neural interface between humans and machines isn’t as sound as he hoped. The project was apparently “too complex, too speculative, and too far from becoming a medical reality”. Note: make sure sci-fi projects are grounded in scientific reality. Meanwhile, Elon Musk finally announced the launch of the Neuralink project!

Ben Medlock argues that the missing link between current AI systems and true AGI is an embodied system for the AI agent. He points to AI systems as only replicating one of the many layers of human cognition, where the others are the biological substrates and complexity of eukaryotic systems.

NYT runs a piece on Santiago Ramon y Cajal, a 20th century Spanish neuroscientist who published fundamental work on how information flowed through the neurons and synapses in the brain. Equipped with a microscope, he painstakingly sketched these neural structures and quite incredibly built up his reasoning from there.

Last issue we talked about a new frontier in training AI agents: complex simulation environments. At Google NEXT 2017, Improbable founder Herman Narula presented a quick talk on their SpatialOS distributed computation infrastructure for simulation that you can watch here.

Policy and governance

Researchers in Cambridge published a sharply critical piece on the collaboration between Google DeepMind and the UK’s National Health Service (NHS). Based on analysis of information reported last year by the New Scientist, the authors argue that the breadth of patient data shared between parties was far greater than originally announced and concerns more than patients under direct care for acute kidney disorder. They claim that plans for a consolidated and canonical data infrastructure for the NHS are beyond the original stated remit of the collaboration (see DeepMind Blockchain project). More importantly, the authors state that minimal consultation was had with public bodies governing data privacy, health research and medical device regulation. The NHS and DeepMind responded saying that this paper misrepresents the use of data and makes both factual mistakes and analytical errors. While the tone of this piece is also unfairly harsh, it highlights the careful balance that needs to be struck between sufficiently complying with incumbent regulatory frameworks and streamlining these procedures to catalyse necessary upgrades to core NHS services.

The list of signatories to the Asilomar AI Principles run by the Future of Life Institute continues to grow. Videos from this year’s conference, at which questions of ethics, values and longer-term goals are discussed, can be found here.

Following Bill Gates earlier this year, French Socialist Party candidate Benoit Hamon suggested a corporation tax on economic value generated as a result of AI (“robot tax”) that will go to fund universal basic income. Pro case: it’s an effective way to prevent further wealth disparity between the rich who can afford robots to work for them and the less well off who can’t. Con case: a robot tax stifles innovation, and automation isn’t the only factor that affects the incentive to participate in labor markets (e.g. education, safety nets, trade) and thus shouldn’t be targeted in isolation.

Next frontiers for AI

Video understanding: Developing systems that understand the contents of video in real-time remains a complex, unsolved problem. This is largely because current static image ML tools don’t go much beyond object recognition, semantic segmentation (labelling each pixel) and captioning. Facebook, which has users consume over 100 million hours of video a day, has set its sights on this problem because “video understanding is going to be ridiculously impactful”. Google launched a Kaggle competition using the YouTube-8M dataset, but that only focuses on predicting video labels from 4716 classes (e.g. “electric guitar”, “cuisine” and “talent show”). Meanwhile, a startup in Berlin called TwentyBN is attacking video understanding from a unique angle. First, they build a dataset of crowd-acting videos that depict short segments of objects interacting with one another (e.g. placing/pushing/dropping an object onto/on/off a table). Next, they train networks to accurately predict these correct action labels to learn common sense about the 3D world in which objects interact that can be transferred to new problems.

Learning to learn: Several research groups have shown that machine learning can be used to improve how learning systems learn (termed “learning to learn”). Jeff Dean of Google Brain stated that this “automated machine learning” is the most promising avenue his team are working on.

Data efficiency: WIRED features a piece on a few researchers and companies working on data efficient means of handling uncertainty. This is key in the real world where there are only a few examples of driving accidents as a proportion of regular driving footage. AI systems must reason on this uncertainty to make the best (interpretable) decisions.

Hardware for computation: Graphcore, the British semiconductor startup developing novel silicon optimised for intelligent applications, released beautiful teaser visualisations of networks at work on their hardware. Watch this space as the company unveils aspects of its core technology this quarter!

Healthcare

Arterys received 510(k) clearance from the FDA to market its deep learning solution for automated ventricle segmentation on cardiac MRI images. This is allegedly the first regulated implementation of cloud-based deep learning in the clinical setting and adds to a CE Mark received in December 2016.

MedyMatch announced a collaboration with IBM Watson Health to integrate and market its deep learning-based non-contrast CT system to help assess patients suspected of head trauma or stroke and rule out brain bleeds. The company is conducting a clinical trial and working towards PMA Class III regulation with the FDA.

Eleven Two Capital outline opportunities for data-driven health technology. I do agree that there’s huge value to be created in diagnostics (imaging and physiological sensor), therapeutic discovery and development (see this piece by NVIDIA), treatment & care monitoring, as well as clinical & administrative workflow optimisation. Recent examples include Grail and Freenome (liquid biopsies).

Researchers at the University of Toronto Scarborough (commercialised via Structura Bio) have demonstrated they’re able to reconstruct the 3D structure of protein molecules from tens of thousands of low-resolution 2D electron cryomicroscopy images. Existing methods require days to weeks, as much as 500,000 CPU hours and prior understanding of the target structure; the new methods overcome these bottlenecks to speed up drug discovery. Paper here.

People tracker

Andrew Ng, Chief Scientist at Baidu and the original lead of Google Brain, announced his departure from the Chinese search giant. Andrew remains a driving motivational and educational force behind the adoption of AI in companies and by students (e.g. via his Coursera ML lessons) worldwide. Wang Haifeng steps up to lead AI at Baidu.

Zoubin Ghahramani, Professor of Information Engineering at the University of Cambridge, steps up to Chief Scientist at Uber in connection with the acquisition of Geometric Intelligence. Zoubin is a world-leader in probabilistic modelling and machine learning, focused on decision making under uncertainty and learning efficiently from limited data. Zoubin will move to the West Coast.

Murray Shanahan, Professor of Cognitive Robotics at Imperial College London, took up an appointment at DeepMind as a Senior Research Scientist. He moves to part time at Imperial. His early work focused on symbolic reasoning, cognitive robotics and increasingly on unifying symbolic reasoning with reinforcement learning.

Ian Goodfellow, formerly part of OpenAI’s founding team, has moved back to Google Brain.

Clement Farabet, who co-founded Madbits (acq. Twitter) and was then tech lead for Twitter’s Cortex AI team, has left to join NVIDIA as head of AI infrastructure.

*Research*

Pixel recursive super resolution, Google Brain. Lots of recent work has tackled the problem of taking a low resolution photograph and mapping it to a high resolution version (“super resolution”). However, these approaches tend to work poorly when considering a low-quality high magnification image where there are multiple reasonable high-quality mappings. Here, the authors train a probabilistic pixel-by-pixel CNN on pairs of low and high quality highly-magnified images. The model can be sampled to produce multiple plausible high resolution images that fool naive human observers.

What uncertainties do we need in Bayesian deep learning for computer vision?, University of Cambridge. The world around us is full of inherent uncertainty, which makes understanding the present and reasoning about the future a challenge. In this work, the authors present a framework for learning models to account for uncertainty a) inherent in environmental observations and b) in the learned model as it applies to computer vision tasks. Their approach unifies modelling of both uncertainties to achieve new state-of-the-art results on segmentation and depth regression benchmarks for street level and home interior images.

One shot imitation learning, OpenAI and UC Berkeley. This paper considers the problem of a robot efficiently a) learning a task by watching just one demonstration (start and finish) and b) generalising to new conditions and tasks unseen in training data, also with just one demonstration. This is interesting because while it is possible to use behavioural cloning (supervised learning) and inverse reinforcement learning (reward function that explains the behaviour), these methods don’t allow a robot to accelerate its learning to imitate new skills. Reinforcement learning, on the other hand, requires many examples of trial and error.

Overcoming catastrophic forgetting in neural networks, DeepMind and UCL. Today’s neural networks are very effective at learning a task with supervision. However, in order for a network trained for task #1 to perform well on a new task #2, it must be retrained with data for task #2. In doing so, it loses its ability to solve task #1, a major limitation towards general intelligence that is termed “catastrophic forgetting”. This work proposes an approach to overcome this problem by slowing the updating of weights in a neural network that were key to its ability to solve task #1 while it’s learning task #2. This selective decreasing of weight plasticity protects prior knowledge and enables continual learning in challenging reinforcement learning scenarios of Atari 2600 games. Blog post here from DeepMind.

Generative temporal models with memory, DeepMind. In order to model temporal and sequential data (e.g. language, time series, video streams), it is important to learn long temporal dependencies inherent in the data. Using Long Short-Term Memory (LSTM) RNNs to store and protect information over the longer term, however, doesn’t scale with large capacity storage. The authors present a generative temporal model where computation is separate from memory, which can store early information from a sequence and efficiently reuse the information in the future.

Neural Episodic Control, DeepMind. Deep reinforcement learning methods exhibit very slow learning rates. For example, state of the art agents require >200 hours of gameplay to perform as well as a human with 2 hours of experience. Here, the authors introduce Neural Episodic Control as a method to dramatically improve the learning rate and discovery of highly successful strategies in Atari 2600 environments. This is accomplished by writing all the agent’s experiences to memory and updating its memory faster than the rest of the deep neural network. Separately, researchers at Carnegie Mellon University published an approach to storing 2D memory images in order to solve long-term navigation in 3D mazes using deep RL (Neural Map: Structured memory for deep reinforcement learning).

*Resources*

Researchers at Facebook and others released PyTorch, a python package that offers a GPU-ready Tensor library to replace numpy and a framework to build neural networks using dynamic instead of static graphs, which handle variable workloads better. Stephen Merity discusses why that’s useful here.

Have a basic knowledge of ML and keen to learn best practices from Google? Here’s a 43 rule playbook by Martin Zinkevich, Research Scientist at Google.

Listen to Ian Goodfellow and Richard Mallah’s highlights for AI in 2016 on the Future of Life Institute podcast.

Best practices for training deep learning networks: a high level infographic!

*Financings and exits*

135 deals (64% US and 25% EU) totalling $1.26bn (43% US and 4% EU).

Big rounds

DataRobot, which automates laborious steps in data preparation, predictive model design, training and evaluation, raised a $55m first tranche of a Series C round led by NEA. This marks significant interest in ML-specific infrastructure software to bring tools that otherwise only really exist in AI-first companies like Facebook and Google to the masses.

Chorus.ai, a software product to record, analyse and enhance sales call effectiveness, raised a $16m Series A led by Redpoint Ventures and Emergence Capital. The goal is to rigorously discover opportunities on a live call and upskill sales staff.

Uptake Technologies, which offers data science software for predictive applications in industry, raised $70m of a planned $125m Series C led by RevolutionGrowth.

Early rounds

DeepScale, which published a significantly compressed yet powerful model for computer vision tasks (SqueezeNet) that can be applied to edge devices for self-driving, raised a $3m seed round from Bessemer, Greylock and Auto Tech Ventures.

BluHaptics, which develops software that uses video fusion to enable humans to assist robots in automating high-value tasks where there is low margin for error, raised a $1.36m Series A round led by Seattle Angel. The team draws its experience from the University of Washington and work with the Navy to clean up weapons and munitions from the seafloor.

Y-Combinator graduated (W17: day 1 and day 2) a few ML-based companies including: 1) AlemHealth (telemedicine for machine-driven analysis of CT scans in emerging markets), 2) Vinsight (crop yield optimisation using satellite and weather data), 3) Clover Intelligence (voice-based analytics for sales calls aimed at tracking and improving performance), 4) Quiki (using NLP to transform previous customer support interactions into website FAQs) and 5) lvl5 (3D point clouds from street level images for autonomous vehicle localisation and navigation).

Entrepreneur First graduated (#EF7) many ML-driven companies including: 1) Transformative (predicting onset of sudden cardiac arrest), 2) Observe (optimising fish feeding in aquafarms using underwater video footage), 3) Optimal (controlling greenhouse environments to optimise farming yields).

18 acquisitions (of which 5 are still in progress), including:

Mobileye, acquired by Intel for $15.3bn as discussed earlier.

Neokami, a German startup offering data security solutions, was acquired by Relayr to beef up the security of their IoT implementation platform.

Kaggle, the largest community of data scientists online who compete on contributed data problems in a tournament style, was acquired by Alphabet. This move buys Google mindshare amongst data scientists and provides a new channel to distribute cloud infrastructure services to a burgeoning market.

Dextro, which developed video search and discovery solutions applied to live streaming, was acquired by TASER International to form a computer vision group working on police crime video data.

MightyTV, which offered a video content recommendation service, was acquired by Spotify.

Anything else catch your eye? Do you have feedback on the content/structure of this newsletter? Just hit reply!