Machine learning is expensive - it requires huge computing capacity. For example, a single training run of OpenAI's famous GPT-3 model would cost over $4.5M, consume an amount of energy equivalent to the yearly consumption of 126 Danish homes, and create a carbon footprint equivalent to driving 700,000 kilometers by car.
Photonic processors are one of the most promising answers to this huge demand for computing capacity and the extreme carbon footprint, because they are fast (light is, after all, the fastest thing in the universe) and very energy efficient.
Some companies already have production-ready photonic processors on the market. One of my favorites is LightOn, because their solution is relatively simple, and the way it works feels like magic (you will see why).
LightOn’s OPU (Optical Processing Unit) can do only one thing, but it does that one thing fast and efficiently: it multiplies a vector by a giant random matrix. The hardware converts the input vector into a light pattern, a special waveguide performs the random matrix multiplication in an analog way, and finally a camera converts the result back into a vector. The random matrix is fixed, so the multiplication hardware (the waveguide) is a passive element. But how can a fixed random matrix multiplier help machine learning?
Multiplication by a fixed giant random matrix sounds meaningless, but don’t trust your intuition here: it is very useful. It is a convenient way to do dimensionality reduction, called random projection. Random projection has a nice property: by the Johnson–Lindenstrauss lemma, it nearly preserves distances. So if you have two vectors in a high-dimensional space, the distance between their projections will be nearly the same as their original distance.
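To see the distance-preserving property in action, here is a minimal numpy sketch where an ordinary pseudo-random matrix stands in for the OPU (the dimensions and variable names are mine, chosen to match the game-state example below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two points in a high-dimensional space (e.g. flattened game frames).
d_in, d_out = 33_600, 32
x1, x2 = rng.normal(size=d_in), rng.normal(size=d_in)

# A fixed random matrix plays the role of the OPU's waveguide.
# Scaling by 1/sqrt(d_out) keeps the expected squared distance unchanged.
R = rng.normal(size=(d_out, d_in)) / np.sqrt(d_out)

p1, p2 = R @ x1, R @ x2

print("original distance :", np.linalg.norm(x1 - x2))
print("projected distance:", np.linalg.norm(p1 - p2))
# The two numbers agree up to a modest relative error, as the
# Johnson–Lindenstrauss lemma predicts.
```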
This distance-preserving property is very useful in reinforcement learning. In LightOn’s example, the OPU converts the current game state into a 32-dimensional vector. Here the game state is a screen snapshot represented by a 33,600-dimensional vector (210x160 pixels), so the OPU projects from this 33,600-dimensional space down to a 32-dimensional space. The learning method is simple Q-learning: the algorithm calculates the Q-value and stores it for the (projected) state, and if the Q-value of a state is unknown, it averages the Q-values of the 9 nearest neighbors. Because of the Johnson–Lindenstrauss lemma, the distances between projected vectors are more or less the same as between the original state vectors, so the K nearest neighbors of a projected vector come from nearby original vectors. In this case, the OPU reduces the dimensions to create an easier problem (a 32-dimensional vector is much easier to handle than a 33,600-dimensional one) that needs less storage, less computing capacity, or a smaller neural network. The same method can be used in recommender systems or anywhere else dimensionality reduction helps, and the OPU does it fast even on very high-dimensional vectors. You can find the code here.
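Here is a rough, self-contained sketch of the idea (not LightOn’s actual code): the OPU is mocked with a numpy random matrix, and the helper names project, q_values, and update are my own.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, D_STATE, D_PROJ, K = 4, 33_600, 32, 9

# Fixed random projection standing in for the OPU.
R = rng.normal(size=(D_PROJ, D_STATE)) / np.sqrt(D_PROJ)

memory_states = []    # projected states we have already seen
memory_qvalues = []   # their learned Q-value vectors


def project(state):
    """Reduce a raw 33,600-dimensional frame to 32 dimensions."""
    return R @ state


def q_values(p_state):
    """Look up Q-values; for unseen states, average the K nearest neighbors."""
    if not memory_states:
        return np.zeros(N_ACTIONS)
    dists = np.linalg.norm(np.stack(memory_states) - p_state, axis=1)
    if dists.min() < 1e-6:                    # state already stored
        return memory_qvalues[int(dists.argmin())]
    nearest = dists.argsort()[:K]             # K nearest projected neighbors
    return np.mean([memory_qvalues[i] for i in nearest], axis=0)


def update(p_state, action, reward, p_next, alpha=0.1, gamma=0.99):
    """Tabular Q-learning update, keyed by the projected state."""
    q = q_values(p_state).copy()
    target = reward + gamma * q_values(p_next).max()
    q[action] += alpha * (target - q[action])
    if memory_states:                         # overwrite if the state is known
        dists = np.linalg.norm(np.stack(memory_states) - p_state, axis=1)
        if dists.min() < 1e-6:
            memory_qvalues[int(dists.argmin())] = q
            return
    memory_states.append(p_state)
    memory_qvalues.append(q)


# Usage per environment step (frame, next_frame, reward are hypothetical):
#   p, p_next = project(frame.ravel()), project(next_frame.ravel())
#   a = int(q_values(p).argmax())             # greedy action
#   update(p, a, reward, p_next)
```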
Another use case is the opposite: the OPU projects from a low-dimensional space to a high-dimensional one. In LightOn’s transfer learning example, the OPU replaces the linear layers of a neural network. A pretrained CNN (VGG, ResNet, etc.) is used for feature extraction, but the classification is not done by a neural network: instead of the dense layers, the OPU projects the feature vector into a high-dimensional space where the samples are linearly separable. So, thanks to the OPU, a simple linear classifier can be used instead of a neural network.
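To make this concrete, here is a small sketch of the pattern with scikit-learn. It is not LightOn’s code: a toy 2-D dataset stands in for the extracted CNN features, a numpy random matrix stands in for the OPU, and I add an element-wise squaring as an assumed stand-in for the camera measuring light intensity, which is what makes the projected features non-linear and hence linearly separable.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy stand-in for the extracted CNN features: two concentric circles,
# which no linear classifier can separate in the original 2-D space.
X, y = make_circles(n_samples=2000, noise=0.05, factor=0.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

D_IN, D_OUT = X.shape[1], 512          # project *up*: 2 -> 512 dimensions
R = rng.normal(size=(D_OUT, D_IN))


def project(A):
    # Random projection plus an element-wise nonlinearity (my stand-in
    # for the camera measuring light intensity).
    return np.abs(A @ R.T) ** 2


print("linear classifier on raw features      :",
      LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te))
print("linear classifier on projected features:",
      LogisticRegression(max_iter=2000).fit(project(X_tr), y_tr)
                                       .score(project(X_te), y_te))
```

The interesting part is the two accuracy numbers: the linear classifier is no better than chance on the raw features, but works well on the randomly projected ones.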
The two examples above used the OPU to map the original problem to an easier one that needs less computing capacity and storage, but can it also make the neural network training process itself easier? The answer is yes. There is a method called Direct Feedback Alignment (DFA), in which the error is propagated through fixed random feedback connections directly from the output layer to each hidden layer. This method has two advantages: the hidden layers can be trained in parallel, and the error is propagated through a fixed random matrix, which is exactly the kind of operation the OPU does efficiently. Using random matrices instead of backpropagation sounds like magic, but it works.
Here is a nice YouTube video about DFA.
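And here is a minimal numpy sketch of DFA on a toy regression task, written as my own illustration rather than LightOn’s implementation; the multiplications by the fixed random feedback matrices B1 and B2 are the part an OPU could take over.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(x) with a 2-hidden-layer tanh network.
X = rng.uniform(-np.pi, np.pi, size=(512, 1))
Y = np.sin(X)

D_IN, D_H1, D_H2, D_OUT = 1, 64, 64, 1
W1 = rng.normal(scale=0.5, size=(D_H1, D_IN))
W2 = rng.normal(scale=0.1, size=(D_H2, D_H1))
W3 = rng.normal(scale=0.1, size=(D_OUT, D_H2))

# Fixed random feedback matrices: they replace the transposed forward
# weights that backpropagation would use, and they never change.
B1 = rng.normal(scale=0.5, size=(D_OUT, D_H1))
B2 = rng.normal(scale=0.5, size=(D_OUT, D_H2))

lr = 0.05
for step in range(3001):
    # Forward pass.
    h1 = np.tanh(X @ W1.T)
    h2 = np.tanh(h1 @ W2.T)
    y_hat = h2 @ W3.T

    e = y_hat - Y                        # output error (gradient of MSE)

    # DFA: the output error reaches every hidden layer directly through
    # a fixed random matrix (the multiplication an OPU would accelerate).
    d1 = (e @ B1) * (1.0 - h1 ** 2)      # tanh'(a) = 1 - tanh(a)^2
    d2 = (e @ B2) * (1.0 - h2 ** 2)

    W1 -= lr * d1.T @ X / len(X)
    W2 -= lr * d2.T @ h1 / len(X)
    W3 -= lr * e.T @ h2 / len(X)         # output layer uses the true gradient

    if step % 1000 == 0:
        # The MSE should drop steadily even though no error is
        # backpropagated layer by layer.
        print(f"step {step:4d}  MSE {np.mean(e ** 2):.4f}")
```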
As you can see, a relatively simple piece of photonic hardware can help in multiple ways to reduce the huge computing capacity needs of machine learning, and the OPU is only one of the photonic processors on the market. There are other solutions with programmable matrices, and even quantum photonic solutions. It is worth keeping an eye on this field, because machine learning needs more and more computing capacity, and photonic processors are one of the most promising ways to provide it.