As we ponder what 2019 will bring, I’d like to share some ideas I’ve been thinking about. I’m giving them freely to everyone, so if people want to use them, build on them, or copy them in any way, great! I’m making this article “no rights reserved”. The reason I’m giving these ideas away is that I’d love to see them come to fruition in a thoughtful way, but I don’t have the capital or the talent to do these things by myself. If someone would like to pursue these apart from me or with me, please do. They say that success is 1% inspiration and 99% perspiration. If that’s true, here’s that 1%. But even then, few if any of these ideas are new; they might just be put together in a new way. Please feel free to let me know if you think any of these are good ideas, either in a public or private response or comment.
This is a future where a thousand or ten thousand people sell their time, their diet, their personal genetic code, and the effects on their bodies for science. And not in as crazy a way as you might think. Nutritional research firms will get the full DNA readouts of participants who agree to live on the firm’s closed campus for anywhere from two weeks to three months. The participants will make a good wage, but they will have to stay on campus and conform to a strict diet in order to earn it. That diet will focus on certain foods, like nuts or fruit or a certain meat, or whatever the researchers want to study, and every week or two it could change. At the beginning of the study and at various points throughout it, the participants will tell the researchers how they are doing and respond to detailed surveys. They will also be given blood tests and health evaluations like a treadmill stress test. Participants who leave early will only be paid a prorated rate cut in half, to disincentivize dropping out. People can’t bring any food or supplements onto the campus, but they are free to work out at an on-campus gym, communicate with the outside world through technology, and generally do what they want, except that they must stick to the prescribed diet.
Over time, the researchers will work out the ideal foods for each individual based solely on their genes. They can do this by using machine learning and data science to find which foods lead to positive or negative outcomes for people with specific genetic variants.
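To make this concrete, here is a minimal sketch of what the analysis might look like, assuming the study produces a table of genetic markers, the diet each participant was assigned, and a measured outcome such as a blood marker. The column names, the synthetic data, and the choice of a gradient-boosted model are all hypothetical, just to illustrate the idea:

```python
# Illustrative sketch only: synthetic data and hypothetical column names.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 10_000  # participant diet-periods

# Hypothetical genotype markers (0/1/2 copies of a variant) and assigned diets.
snps = pd.DataFrame(rng.integers(0, 3, size=(n, 3)),
                    columns=["SNP_A", "SNP_B", "SNP_C"])
diet = pd.Series(rng.choice(["nuts", "fruit", "meat"], size=n), name="diet")

# Pretend outcome (e.g. a blood marker): responds to diet differently per genotype.
outcome = (0.5 * snps["SNP_A"] * (diet == "nuts")
           - 0.4 * snps["SNP_B"] * (diet == "meat")
           + rng.normal(0, 0.3, n))

X = pd.concat([snps, pd.get_dummies(diet)], axis=1)
model = GradientBoostingRegressor().fit(X, outcome)

# In this toy, a higher predicted outcome counts as "better":
# the ideal diet for a genotype is whichever diet the model scores highest.
person = {"SNP_A": 2, "SNP_B": 0, "SNP_C": 1}
candidates = pd.DataFrame([{**person, "fruit": d == "fruit",
                            "meat": d == "meat", "nuts": d == "nuts"}
                           for d in ["nuts", "fruit", "meat"]])[X.columns]
print(dict(zip(["nuts", "fruit", "meat"], model.predict(candidates))))
```

The real study would need far more markers, repeated measurements per participant, and careful statistics, but the shape of the pipeline (genotype plus diet in, predicted outcome out, pick the best-scoring diet) would look something like this.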
Who would pay for the research to get such information? Health insurance companies of the future could pay for this sort of information and provide it to their insured population, all in the hopes of making that population healthier. Beyond this, at-risk or sick people may get “free” food sent to them by their health insurance company to encourage them to eat the perfect diet. Some discount health insurance plans may force customers to eat their prescribed diet or face insurance premium increases.
This research could also be launched with a Kickstarter campaign by a team of researchers. The Kickstarter might need ten million backers at $10 apiece to do a 10,000-person, 3-month study of many foods. That assumes research of this scale would cost $100,000,000, but it might cost more. Perhaps one million backers at $100 per person would be more realistic pricing. Whether it’s $10 or $100, that sum gets the backer a login to a system that compares the genome they’ve already obtained from an outside service to the dataset the research creates, and tells them the ideal foods and nutrition for their body. Newcomers who didn’t back the Kickstarter campaign could pay a certain amount to see their ideal diets as well.
This might not be the most utopian vision, because I love chocolate and wouldn’t want to give it up. But it will be great when it trickles down so that virtually everyone knows whether gluten and dairy really should be cut out of their diets. This type of research could also tell us the ideal drugs for different people to take based on their genes.
A wafer-scale computing chip looks like this:
Image from Wikipedia
The entire wafer is a huge interconnected chip. According to Wikipedia, this method of trying to create a more powerful chip has failed in the past. But the article also says that there’s a firm called Cerebras Systems that’s trying to resurrect this method to produce a chip for machine learning. I think it’s about time.
The reason this huge chip doesn’t work well right now is that defects are bound to happen some of the time. A tiny contaminant in the air of a chip-producing facility makes this inevitable, and no facility is perfect. Ten defects on a wafer holding 100 chips might ruin only 10 of them, leaving you 90 good chips, which can be an acceptable cost of producing cutting-edge microchips. But on a single-chip wafer like the one pictured above, just one defect would render the entire chip worthless unless the chip is somehow mended or modified, and that mending is very costly and hard to do, if it’s even possible.
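A standard first-order way to see the problem is the classic Poisson yield model, where yield falls off exponentially with die area. The defect density and areas below are made-up numbers, just to show the shape of the curve:

```python
# First-order Poisson yield model: Y = exp(-A * D0).
# A = die area in cm^2, D0 = defect density in defects/cm^2.
# The defect density and areas below are illustrative, not real process data.
import math

D0 = 0.1                 # hypothetical defects per cm^2
small_die_area = 1.0     # cm^2, a conventional chip
wafer_scale_area = 460.0 # cm^2, on the order of a full 300 mm wafer's usable area

yield_small = math.exp(-small_die_area * D0)
yield_wafer = math.exp(-wafer_scale_area * D0)

print(f"small die yield:   {yield_small:.1%}")   # ~90%: most dies are good
print(f"wafer-scale yield: {yield_wafer:.2e}")   # effectively zero without redundancy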
Here’s an example of a defect in the Extreme Ultraviolet (EUV) lithography process that chipmakers are frantically trying to get working well enough to make chips with:
Image from Wikipedia. According to Wikipedia, this shows an “EUV stochastic failure. A random missing contact hole is a stochastic defect that can occur in EUV lithography.”
I believe one solution to this problem is a redundant artificial intelligence (AI) chip structured like a mesh of neurons in the human brain. Right now it would be two-dimensional, but as three-dimensional chip design is perfected, the chip could become layers stacked on top of each other so that it more accurately replicates the 3D structure of the human brain.
Redundancy will serve two purposes, both of which increase computing performance by increasing the number of neural compute units over time: it will enable wafer-size chips with many more transistors per wafer, and it will enable EUV at even smaller levels than the 3 nanometer (nm) node now planned by Taiwan Semiconductor Manufacturing Company (TSMC) and Samsung. Even if error rates hit 25% per neural-equivalent unit, the chip could still work because almost every neuron connects to at least four other neurons. If one or even two of the four neurons connected to a particular neuron are defective and effectively dead, that neuron’s processing power and information can still be useful, because it can send its signals to the remaining two or three neighbors next to it. Connections to even more neurons, either diagonally or further away, would be an interesting experiment to see how the performance of this theoretical chip changes. Together, these two factors mean many more transistors packed onto a single chip, and the smaller process node in particular means the chip can run on less electricity than a larger node would allow.
Here’s the design of the wafer-scale neuromorphic chip:
I created this in Paint. I’ve reserved no rights to it, so please feel free to use it and build upon it.
Hopefully you can see from this potential layout that the chip is resilient: even with a few “dead” neural processing units here and there, the system can still work perfectly fine. Each processing unit can strengthen or weaken its connections to the surrounding units; that’s how the network gets better at whatever helps it complete its goal. And each connection is actually two connections: one carries output from a neuron to its neighbor, which receives it as input, and the other runs in the opposite direction, so the receiver neuron becomes the giver neuron.
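To sanity-check the resilience claim, here is a toy simulation of a 2D mesh with 4-neighbor connections, where a fraction of units are randomly “dead”. The grid size is arbitrary, and the 25% defect rate is just the worst-case figure mentioned earlier:

```python
# Toy resilience check for a 2D mesh of neural compute units.
# Each unit connects to its 4 nearest neighbors; a defective unit is "dead".
# We ask: of the live units, how many still have at least one live neighbor?
import numpy as np

rng = np.random.default_rng(42)
size = 200          # 200 x 200 grid of units (illustrative)
defect_rate = 0.25  # the 25% worst case discussed above

alive = rng.random((size, size)) >= defect_rate

# Count live 4-neighbors for every cell (edges padded with dead cells).
padded = np.pad(alive, 1, constant_values=False)
live_neighbors = (padded[:-2, 1:-1].astype(int) + padded[2:, 1:-1]
                  + padded[1:-1, :-2] + padded[1:-1, 2:])

connected = alive & (live_neighbors > 0)
print(f"live units:                {alive.mean():.1%}")
print(f"live units with a neighbor: {connected.sum() / alive.sum():.1%}")
```

Even with a quarter of the units dead, only about 0.25⁴ (roughly 0.4%) of the survivors end up with no live neighbor at all, which is the kind of loss the rest of the mesh can route around.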
Each neural processor is exactly the same, and the connections are exactly the same. In this way the neural processor becomes what the transistor is to Moore’s Law: a simple component that is easy to scale as the manufacturing process improves and transistors shrink. In this new chip, the neural processor is the simple unit, replicated millions or even billions of times across the chip. Under Moore’s Law, the transistors on a chip double every two years. With the new method, the processors on a chip could double every two years, maybe even faster.
I recently looked into how fast supercomputers are increasing in power. They are more than tripling in power every two years (about a 3.08x factor every two years, from my calculations based on data from https://www.top500.org/). That’s a lot faster than Moore’s Law’s 2x factor, and an exciting prospect for simulating the brain and its neurons so that we can get closer to mimicking the human brain.
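For anyone curious how that kind of number is computed, it’s just a compounded growth rate. The performance values below are placeholders, chosen only so the output lands near the ~3.08x figure above; substitute real Rmax numbers from the TOP500 lists to get the actual rate:

```python
# How to turn two TOP500 data points into a "factor every two years".
# The perf numbers below are placeholders -- substitute real Rmax values
# (e.g. in petaflops) from https://www.top500.org/ lists of your choice.
perf_start, year_start = 1.0, 2008    # placeholder performance and year
perf_end, year_end = 280.0, 2018      # placeholder performance and year

years = year_end - year_start
two_year_factor = (perf_end / perf_start) ** (2 / years)
print(f"growth factor per two years: {two_year_factor:.2f}x")
```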
As a side note, if the human brain can be perfectly mimicked (perhaps along with a rudimentary robot body and nervous system, although this is hypothetical because it could create ethical as well as existential problems for the human race), then we should see the emergence of human-level intelligence, although if it’s a perfect mimic it should take 15 to 20 years of learning to reach adult-level intelligence.
With this new chip, a processor is no longer a complicated tangle of transistors. Instead, we can develop one tiny neural processing unit with a little bit of compute, a little bit of memory, and connections all around it, and that unit can be the single building block for a new chip.
The benefits of this type of chip are twofold. The first is that it will be extremely cheap to produce: the design cost lies only in the one tiny compute unit that gets replicated across the chip, and manufacturing errors will be acceptable, unlike with previous chips. And because Friston’s free energy principle (more on this principle a little later in the article) will be built into each compute unit, no software will need to be written for this particular chip, which is another significant development cost savings.
The second benefit of this chip is that designers can focus all of their attention on improving the performance of a single neural compute unit, and this will increase the performance of the entire system.
A company creating a system like this can do it in a fabless way, meaning it doesn’t need the technology to create the chips itself. It can contract the likes of TSMC to produce the chip at 7 nanometers. TSMC is already doing this for other firms and is currently ahead of Intel, in that its process is smaller than Intel’s. TSMC already has plans for a 5 nm process and was recently approved to build a plant for 3 nm production at a cost of 19.5 billion dollars. My point is that TSMC, Samsung, and Intel are the major forces driving transistors to get smaller, and I don’t see TSMC slowing down in the near future. As transistors get smaller we can pack more compute units onto a single chip, getting closer to the hundred trillion or possibly even one quadrillion synaptic connections that the human brain has. The closer we get to this, the closer we get to true artificial intelligence that can think like a human.
These giant neuromorphic chips could get spectacularly more powerful over the next few years as scientists and engineers singularly focus on improving the performance of the single compute and memory unit with its connections, and TSMC and others drive the future of smaller computing forward.
The EUV lithography process that will soon help (or is already helping) TSMC at the 7 nm and 5 nm nodes may produce random manufacturing defects, as shown two images above. This neuromorphic chip, with its error resilience, could be the key to making EUV lithography cost effective if a high error rate lingers and makes conventional chips less cost effective than most companies are willing to bear; these neural chips work even with the manufacturing defects. TSMC’s current, commercially available 7 nm process does not incorporate EUV lithography as far as I know, but the company is trying to incorporate partial EUV lithography into an improved 7 nm process that will be available in the future. Hopefully that goes well. It seems it will, since TSMC is already looking to commit 19.5 billion dollars to a factory for a much smaller process node that I assume will use a similar or even more advanced process.
The neural processor itself could be a unique mix of transistors that process information, store memory, and perhaps even provide long-term storage that retains information when the chip powers off. One reason I think a chip like this doesn’t exist yet is that we don’t know how to make the neurons self-organize, or how to give them goals. Maybe we do now. I recently read about Karl Friston’s free energy principle in an article in Wired.
The principle is poorly named, in my opinion. It’s partly a poor name because in the theory organisms work to minimize free energy. Maybe the expensive energy principle would be a more suitable name. Also, “free energy” machines are the scourge of engineers and physicists all over the world: they see these perpetual motion machines as quack science. The principle has nothing to do with that sort of free energy.
Perhaps the principle should instead be named the Certainty Principle. At the risk of oversimplifying it, the principle describes a process that organizes and runs all organic beings and systems by working to counter the second law of thermodynamics, which essentially says that entropy, disorder, always increases. Entropy means that things break down, and organisms must counter that in order to survive. The cells in an organism counter it within the body they inhabit, as do cell nuclei, individual organs, entire societies, and to some extent the governments that those societies institute. Everything from the most basic biological structures, to small and large societal structures, to the psychology of the mind can be explained using Friston’s theory.
Here’s one of the most interesting parts of the Wired article:
After a while it became clear that, even in the toy environment of the game, the reward-maximizing agent was “demonstrably less robust”; the free energy agent had learned its environment better. “It outperformed the reinforcement-learning agent because it was exploring,” Moran says. In another simulation that pitted the free-energy-minimizing agent against real human players, the story was similar. The Fristonian agent started slowly, actively exploring options — epistemically foraging, Friston would say — before quickly attaining humanlike performance.
The principle itself has a crazy amount of math that I may never understand, but if that math can be formalized in the compute unit of a single neuron on a chip, we could solve artificial intelligence.
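For reference, here is the textbook form of the variational free energy the principle says systems minimize (this is the standard formulation, not anything specific to the chip idea), where o is an observation, s is the hidden state of the world, and q(s) is the system’s internal belief about that state:

```latex
F(o, q) = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
        = \underbrace{D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big]}_{\text{inaccuracy of the internal model}}
          \;\underbrace{-\,\ln p(o)}_{\text{surprise}}
```

Minimizing F pushes the internal belief toward the true state of the world while making observations less surprising, which is exactly the certainty-seeking behavior described above.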
As you can see from the Wired article quote, part of the free energy principle encourages exploration. Part of this has to do with something that makes organisms want to know what the future will hold, or, for lower-level organisms, to get what is hoped for. If I go to swat a fly and miss, I can either give up on my prediction that I can hit the fly, or I can modify my strategy and hit the fly (eventually). Either way I am gaining more certainty about the future by testing what I think will happen and then modifying my behavior afterwards to make what I think will happen, happen.
The article explained that an AI created using at least part of the concept of the free energy principle was better than a traditionally trained AI in playing the 3D first-person shooter computer game, Doom. That tells me we might be on the right track toward developing an AI that can truly learn many new things, not just a well-defined set of new things.
In the way shown in the diagram above, an AI could take in input from a computer screen and learn to move a mouse, reducing its uncertainty about the mouse while exploring all of the functions of a computer. An easier start might be to drop the AI into a 3D world where it can only type words and use the arrow keys to move, with minimal other interactions available. It could learn to interact with the world and other people in this way.
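Here is a cartoon of that kind of agent in code. It is not Friston’s actual math, just the flavor of it: the agent keeps beliefs about two actions, scores each action by its expected payoff plus how much uncertainty trying it would resolve, and updates its beliefs from what it observes. The environment and the uncertainty bonus are entirely made up for illustration:

```python
# A toy "explore to reduce uncertainty" loop -- a cartoon of the idea, not
# Friston's actual formulation. Two actions with unknown success rates; the
# agent keeps Beta beliefs over each, and scores actions by expected success
# (pragmatic value) plus how much it still has to learn (epistemic value).
import numpy as np

rng = np.random.default_rng(1)
true_rates = {"left": 0.3, "right": 0.7}       # hidden from the agent
beliefs = {a: [1.0, 1.0] for a in true_rates}  # Beta(alpha, beta) per action

def beta_variance(alpha, beta):
    return alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

for step in range(200):
    # Score = expected success + uncertainty bonus (drives exploration early on).
    scores = {a: ab[0] / (ab[0] + ab[1]) + 2.0 * beta_variance(*ab)
              for a, ab in beliefs.items()}
    action = max(scores, key=scores.get)

    observed_success = rng.random() < true_rates[action]
    beliefs[action][0 if observed_success else 1] += 1  # Bayesian update

for a, (alpha, beta) in beliefs.items():
    print(f"{a}: believed success rate ~ {alpha / (alpha + beta):.2f}")
```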
The neurons themselves would have to be built to operate according to the free energy principle, so that the whole wafer-sized chip works that way.
My thought is that all of this means we will soon have Artificial General Intelligence, or AGI.
Many are afraid of AGI, but why aren’t we equally afraid of gene editing, which could potentially kill humanity through modifications to a common pathogen such as the flu, a cold virus, or a bacterium? The answer is that most people don’t want to kill the whole world, especially the smart ones who don’t want to kill themselves, and there are plenty of safeguards, and people, working to stop this from happening and keep things safe. Many of the concepts in this paragraph come from a three-hour interview with Ray Kurzweil.
AGI will be good because we will be able to ask it questions. I like this one: “What is the biggest question I should be asking, the one that could affect humanity the most?” Some people think it’s climate change, but maybe the AGI will realize that we should be more worried about a solar flare that could wipe out most electronics, or an asteroid it notices barreling towards us, or something totally unknown to us humans.
The biggest question for the AGI might be, “What questions should we be asking that we aren’t asking?” It could reply with the question we should be asking, and then propose an answer. That would be nice.
The baseline knowledge an AGI needs to answer and ask questions would come from reading as much of the books, scholarly articles, and online content as it can get its “hands” on. It may be able to read and understand all of this information quite quickly, all while connecting related concepts within its neural architecture to form a sort of knowledge base, and perhaps even a wisdom base.
From that information it can develop a rough idea of an answer and then test its assumptions. Perhaps more interesting will be when it finds contradictory research within human articles, or finds that humans are making assumptions they haven’t properly tested. I have a feeling that the number of items, or whole areas, in which the AGI finds us making wrong assumptions, or at least not properly testing things before assuming them, will be mind-bogglingly high. That doesn’t mean we should abandon these issues out of some sort of despair; it simply means we need a good way to prioritize which issues to look into first, and second, and so on, and then try to correct our understanding of the world and our way of life.
It will be like peeling layers of an onion: we correct something that on the surface seems like a horrible problem, only for the AGI to show us an even deeper assumption that forces a correction to our initial correction, because at first we didn’t understand how important the deeper issue was. Even if an AGI gives us a really deep issue as the most important problem for humanity, we probably won’t solve it if we don’t even understand what the AGI is talking about. That’s the point at which we’ll have to work backwards and solve the simpler issues we can understand before we can get to the deeper ones we don’t. All of this, of course, implies an AGI that is smarter than the average human, perhaps much smarter.
If we ask the AGI whether one flu vaccine’s benefits outweigh its costs for the general population compared to another, it could create a near-perfect physics simulation of each vaccine, test them on a sample of 10,000 virtual humans to see the side effects as well as the efficacy, and then tell us the probability that one vaccine is better than the other, or perhaps that we should use both or neither. We don’t have to trust its answers, or the questions it suggests to us, but they can be a starting point for further research. Over time, as we see the AGI continually give us correct answers and useful questions, we may start to trust its outputs, though hopefully always with a wary eye in case the AGI is somehow trying to hurt or trick us, or simply makes a mistake.
The answers we get from the AGI might come as probabilities, not absolute yeses or noes.
What would you ask an AGI?