Why I Left Red Hat

by Daniel Jeffries, October 15th, 2019

Everybody remembers their first time.

My first time with Linux was transcendent.

It’s hard to imagine a cold command line as sexy but it was to me. Inside that tiny blinking cube of light was sheer power, just waiting to be unleashed. Linux was something magical, something radical, something that could change the world.

And like anything revolutionary it was a little dangerous and mysterious too. It threatened to topple the old orders of power.

Up until that time every major company on Earth developed their software behind closed doors. The only people who ever saw their source code were the army of coders the company hired to crank it out. Microsoft was the king of proprietary software. The WinTel dynasty of Windows and Intel ruled the desktop with 95% of the market. It was a true monopoly, like John D. Rockefeller’s Standard Oil in the 1800s, which dominated 91% of oil production in the United States.

Old power structures don’t die easily. They fought back. SCO fought Linux with lawsuits. Microsoft unleashed an all-out assault on it in the media. Steve Ballmer called Linux “communism” and “cancer.”

I joined Red Hat in 2010 when the company was still “the world’s biggest startup” with only a thousand plus employees. I’d run my own consulting company for a decade building early Linux web farms and SaaS companies but I was burned out, working all night and all day, living in data centers, eating burgers and fries endlessly and pounding energy drinks.

It may seem crazy now but there was a time when nobody knew if open source was a flash in the pan.

Companies feared it. Legal departments banned it. I spent my first few years at the Hat trying to convince skeptical, curmudgeonly Unix engineers that Linux could stand against the titans of Solaris and AIX and that it wouldn’t crash and cost them their jobs.

A decade later there’s no question about which dev model won.

Even Ballmer knows he was dead wrong. Visionary Microsoft CEO Satya Nadella morphed the company from a “Windows everywhere” monoculture to playing nice in the sandbox with everybody. They became a game and cloud company that loves Linux. They even bought GitHub, the nerve center of distributed, open source development.

Today, open source has eaten the world.

Every major technology starts there, whether that's cloud, AI, mobile or containers. If you’re young and just getting started in tech, you never lived without it. It’s like a tree or a river, just a default part of your reality that was always there.

But as Linux changed the world, I changed too.

Red Hat grew from half a billion dollars in revenue to three billion. Every time I turned around there were strange new faces at the company BBQ who I didn’t know as it mushroomed to 13,000 employees.

And in a moment of terrifying clarity I realized the company didn’t need a guy like me anymore.

I’m a risk taker. I color outside the lines. I’m the kind of person you need when you’re trying to get something started and there’s no blueprint.

Paul Graham wrote that the kind of person you want at a startup is "a hacker who will stay up till 4:00 AM rather than go to bed leaving code with a bug in it; a PR person who will cold-call New York Times reporters on their cell phones; a graphic designer who feels physical pain when something is two millimeters out of place.”

And that’s exactly who you don’t need when you’ve got a big company that’s got it mostly figured out. You need folks who won’t start a jazz solo instead of playing the sheet music.

Some of you might think that’s worse, as if a company declines when it needs a different kind of person to run it. But it’s not worse. It’s just different. Different people are needed for different times. The river is always changing and we’re different and so is the river.

And so it was time to go.

But go where? Do what?

I stopped and asked myself one question:

What will change the world?

And there’s really only one answer, only one technology that’s wonderful, powerful, dangerous and sexy and will remake the very nature of reality.

Artificial intelligence

Maybe you thought I’d say cryptocurrency?

Don’t worry. I haven’t lost my passion for crypto and I’ll keep writing on it and working on it as it continues to change the way money and power work in the world.

But when it comes to AI, no technology holds more promise and peril in the coming decades. There’s not a single business, industry, person or country that won’t be touched by its power and influence.

As a friend of mine at Red Hat once said “there are only two types of jobs in the future, ones assisted by artificial intelligence and ones done by artificial intelligence.”

If I wanted to change the world that’s where I had to go next.

But this time is different. I’m not just a passenger in the boat, going along for the ride on the Internet revolution or the open source uprising. I’ve seen how technology can start with a twinkle in an engineer’s eye and go badly wrong. Too often tech promises to free us all and then one day we wake up and we’ve created a surveillance economy instead of an “information just wants to be free” utopia.

This time I plan to do something about it. I won’t just ride the winds of fate and hope we luck into a fortunate future.

I’m going to help make the future I want to see. As Sarah Connor once said, “There’s no fate but what we make for ourselves.”

But where to start?

I knew I needed a multi-pronged plan of attack. It wasn’t just enough to go work in AI. I needed to think bigger and weave together a thousand threads to help change the world.

I’ve decided to attack those problems on two very different fronts: practical AI ethics and AI infrastructure.

Both are part of the same tapestry but let’s start with ethics and move to infrastructure.

The Age of Intelligent Machines

AI is capable of doing amazing good in the world.

Too often, Hollywood fantasies of Terminators and super AIs taking over everything make people terrified of phantoms and blind them to all the good AI can do.

Imagine you had a little app on your phone that could detect skin cancer.

You point it at the spot and it tells you whether you should call your doctor. Then you call the doctor and send the results of the scan to the triage nurse on the other side of the phone. Now she can make better decisions about who gets to see the doctor instead of just scheduling everyone on a first come, first served basis.

Today you could be waiting for an appointment for a month or two. Meanwhile you have a serious problem that needs looking at fast but you’re in line after the old lady who just likes talking to doctors and the hypochondriac all because they called in first.

In the future, you won’t wait. You’ll simply send that little cancer app’s report over to the triage nurse and she’ll put you first in line because she knows you have a real problem.

That’s just the tip of the iceberg. AI will change the way we do everything. We’ve already seen self-driving cars and AI beating the pants off the world’s best Go player. Alibaba’s “City Brain” spots car wrecks in 20 seconds and calls ambulances, all while helping to route traffic better.

But for all the good AI is capable of it’s also got a dark side. Those same cameras in China that can spot car wrecks can also track dissidents in authoritarian regimes.

It’s not just Big Brother that we need to worry about either. Those are just the flashy problems. Humans are great at seeing big, flashy threats and missing the more important ones right in front of our noses. While we’re focused on fake problems, there are real ones facing us right now.

We already have algorithms deciding if people should go to jail or get bail. Tomorrow they'll help decide who gets a job, who gets a loan, gets into school, or gets insurance. That doesn’t have to be a bad thing but when it goes wrong it can go horribly wrong.

Today there’s very little transparency in machine learning. Cathy O’Neil, author of Weapons of Math Destruction, talked about the case of Tim Clifford, a New York City public school teacher who had taught for twenty years and won multiple awards. He got rated a 6 out of 100 one year and a 96 out of 100 the next, even though he didn’t change a single thing in his teaching style.

O’Neil’s book started with a story of another teacher, Sarah Wysocki, who got fired because of her poor scores from an algorithm called IMPACT.

Too often algorithms are black boxes or proprietary. They’re developed behind closed doors and there’s no transparency or way to audit their decisions. It’s a lot like the way we developed software before open source swept the world.

The problem with black boxes and proprietary AI systems is we don’t know how they’re making decisions. You just have to accept how it works. The algorithm could deliver world class, incredible results or total garbage that looks plausible.

A friend of mine looked to an AI SaaS startup to help them hire great people. At first the SaaS company's demos looked great. But after a bit my friend started to sense something odd about the candidates it was picking. They asked the AI company to show them what kinds of features the computer had picked out as good characteristics for new hires. What was one of the key characteristics the machine lasered in on?

If the candidate was named Jerry he would make a fantastic marketing person.

They didn’t buy the software.

How many other people did?

All it takes is a company to develop a slick looking demo and a flashy website and sell it to a school administrator who doesn’t know AI from a broom handle and then we have no idea if it’s making good assessments or assessments from the insane asylum in One Flew Over the Cuckoo’s Nest.

As a society we can’t accept that. That’s why we need a framework for practical AI ethics and that’s my first major post-Red Hat project, a practical AI Ethics program.

“Practical” is the key word here.

Most companies do ethics totally wrong.

Here’s the broken template that every single company seems to follow. They form a committee with great fanfare. They sing Kumbaya, put out a report on how AI should be “inclusive and fair.” Inclusive and fair sounds great but it means absolutely nothing. It’s a platitude. It’s no surprise that after a year nothing changes and the group gets disbanded.

I call this approach “AI Ethics Theater.”

Nothing gets done but everyone feels like they did something.

We need a better approach because in the future when people don’t get hired, or they get fired, or they go to jail because of an algorithm people are going to get mad and they’re going to ask hard questions. You better have the answers if you built the system. And the answer better be a lot better than “it just works that way” or “we don’t know why the machine did that.”

So how do we do ethics right?

To start with I know I can’t do it by myself. I need lots of great minds working on the issue. So I’m starting by forming the Practical AI Ethics Foundation. Right now it’s a foundation in the loosest sense of the word. It’s a grass roots project that will grow. I’m willing it into reality because it needs to exist and I want to bring together as many great people as I can to focus on this problem in a real way instead of pumping out platitudes.

There are three pillars to Practical AI Ethics:

A process for discovering and implementing ethics in code
Auditable AI
Explainable AI

Let’s start with the first because nobody seems to get it right.

How would a real ethics process work?

I’ve given the AI Ethics Foundation a head start by designing a blueprint with the help of the 2bAhead think-tank in Berlin and with the good folks at Pachyderm where I’m the Chief Technical Evangelist.

Let’s pretend your company is creating an algorithm that's handing out loans. That means it's actively “discriminating” against people who can’t pay it back. That’s all right. Companies don’t need to go bankrupt lending to people who will never pay them back.

But there might be a problem with the historical lending pattern of the company. Maybe they didn’t give loans to many women. Now the company decides they want to discover more women who can pay back those loans. This might be for a few reasons.

The first is they might decide it’s just their company's values.

The second reason is much simpler. There’s money in it. Historically women may have been underserved and that means the loan company is leaving money on the table. Ethics can align with profit incentives if it’s done right.

But how do you translate that value to something the machine can understand?

If a deep learning system is just studying historical data it will just echo the past. That means you have to think about the problem in a new way.

Maybe you create a synthetic data set with a generative adversarial network (GAN), or you buy a second data set. Or perhaps you create a rule-based system that gives a weighted score that you combine with the black box AI’s score to form a final decision on the loan application.
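To make that last option concrete, here is a minimal sketch of blending a transparent rule score with a black box score. Everything in it is a hypothetical of mine: the applicant fields, the 40% rule weight and the 0.6 cutoff are illustrative assumptions, not anyone's real lending policy.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    # Hypothetical fields, chosen only for illustration.
    years_of_steady_income: int
    on_time_payment_rate: float  # between 0.0 and 1.0


def rule_score(applicant: Applicant) -> float:
    """Hand-written, auditable rules that return a score between 0 and 1."""
    income_part = min(applicant.years_of_steady_income, 5) / 5
    return 0.5 * income_part + 0.5 * applicant.on_time_payment_rate


def final_decision(model_score: float, applicant: Applicant,
                   rule_weight: float = 0.4, threshold: float = 0.6) -> bool:
    """Blend the black box score with the rule score, then apply a cutoff."""
    blended = (1 - rule_weight) * model_score + rule_weight * rule_score(applicant)
    return blended >= threshold


# A strong applicant the black box scored low (0.4) still gets approved.
print(final_decision(0.4, Applicant(5, 1.0)))
```

The point of the design is that the rule half is something a human can read, argue about and change, while the model half stays whatever the training data made it.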

Now you have the potential of creating a system that more accurately reflects your values as an organization.

But we can’t stop there. These systems are not perfect. Not even close. They’re flawed like everything else in life. They make mistakes and they make different kinds of mistakes than humans do. Sometimes they make outlandish errors that a human would never make, like identifying the name “Jerry” as a good sign of someone you should hire. Other times they make subtle or unforeseen errors.

That’s because they’re making decisions in an infinitely complex environment called real life, where we can’t process all the variables. In the early days of AI we could brute force our way through all the possible decisions. Deep Blue beat Kasparov with raw compute power. But more complex decisions are just too variable to know all the possibilities.

The game of Go has more possibilities than atoms in the known universe and there’s no way AlphaGo could search them all. It sometimes had to make do searching random branches on a tree of decisions with Monte Carlo tree search and sometimes there still wasn’t a good answer.

Machines and people make predictions based on incomplete information in a chaotic system. They process as many of the possibilities as they can before they’re overwhelmed with too many variables.

In other words, they make guesses.

Those are good guesses but they’re still guesses.

Sometimes those guesses are wrong and lead to mistakes even when we’re really good at making predictions. A ball player knows how to predict which way a fly ball is going and time his jump to catch it but he doesn’t get it right every time and he can’t no matter how good he gets or how much he practices.

And when you step outside of the field of sports into a super complex environment like driving a car on real streets with rain and dust and other cars and street signs covered over or broken you can’t see every problem coming before it happens. Inevitably, you get a problem that nobody saw coming.

Sometimes it’s a big PR nightmare, like when Google's visual recognition systems started identifying people of color as gorillas. Other times it’s a super subtle problem that might not show up easily without a lot of time going by, like women not getting loans who were really qualified to pay that money back.

In both cases, you need a program in place to deal with it fast. You need an emergency AI response team. That means you need to know who’s in charge, who’s going to talk to the public, deal with it on social media and how you’re going to fix it.

Maybe you need to take that system offline for a period of time, or roll it back to an earlier version, or put in a temporary rule to stop it from going off the rails until you can fix the bigger problem.

That’s what Google did. They triaged the problem by no longer allowing the system to label anything as a gorilla. Believe it or not, that’s actually a good emergency response but it’s just the first step. But that’s where they stopped. Instead, they needed to then go back and actually fix the problem.

To do that they needed to figure out a better way to train the model and they needed to follow that up with coders who could write unit tests to make sure the problem doesn’t come back. We spend a lot of time in AI just judging everything by accuracy scores but that’s not enough. Right now data science needs to evolve to take on the best ideas we’ve used in IT for decades: snapshots and rollbacks to known good states, logging, forensic analysis, smoke tests and unit tests.
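Here is a rough sketch of what a temporary blocking rule plus a regression test might look like together. The `fake_model` function is a stand-in I made up with canned answers, and the blocked label list is hypothetical. A real system would wrap an actual model the same way.

```python
import unittest
from typing import List

# Labels blocked as an emergency response. Hypothetical list.
BLOCKED_LABELS = {"gorilla"}


def fake_model(image_id: str) -> List[str]:
    """Stand-in for a real vision model; returns canned labels."""
    canned = {"photo_1": ["person", "gorilla"], "photo_2": ["dog"]}
    return canned.get(image_id, [])


def safe_labels(image_id: str) -> List[str]:
    """Apply the emergency rule on top of whatever the model predicts."""
    return [label for label in fake_model(image_id) if label not in BLOCKED_LABELS]


class TestBlockedLabels(unittest.TestCase):
    def test_blocked_label_never_returned(self):
        # Regression test: no blocked label may ever reach the user.
        for image_id in ("photo_1", "photo_2", "unknown"):
            self.assertTrue(BLOCKED_LABELS.isdisjoint(safe_labels(image_id)))
```

The rule buys you time. The test keeps the fix from silently disappearing the next time somebody retrains or refactors.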

All that leads us to the other two pillars in the program, auditable AI and explainable AI.

These systems should continually log their decisions to a log aggregation system, a database or an immutable blockchain. That’s where you get to put that AI Ethics committee or a QA team for AI, in charge of models and data integrity, to work. After that a random sampling of those decisions needs to get audited on an ongoing basis. That’s known as the “human in the loop” solution.
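As a sketch of the idea, here is a tiny in-memory decision log where each entry stores the hash of the one before it, so edits after the fact show up, plus a random sampler for the human auditors. The class and method names are my own invention for illustration. A production system would sit on a real log store or database.

```python
import hashlib
import json
import random


class DecisionLog:
    """Append-only log where each entry is chained to the previous hash."""

    def __init__(self):
        self.entries = []

    def append(self, decision: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        payload = json.dumps(decision, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"decision": decision, "prev": prev_hash, "hash": entry_hash})

    def verify(self) -> bool:
        """Recompute every hash; any edited entry breaks the chain."""
        prev_hash = ""
        for entry in self.entries:
            payload = json.dumps(entry["decision"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

    def sample_for_audit(self, k: int, seed=None) -> list:
        """Pick up to k random entries for a human reviewer."""
        rng = random.Random(seed)
        return rng.sample(self.entries, min(k, len(self.entries)))
```

You would call `append` every time the model makes a decision, run `verify` on a schedule, and hand the output of `sample_for_audit` to whoever is doing the review that week.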

Let people look for potential problems with our own specialized intelligence and built in pattern matching ability.

You can also automate monitoring those decisions with other AIs and simple pattern matching systems. Over the next decade I expect automated AI monitoring and auditing to become its own distinct category of essential enterprise software. With a human in the loop and automated monitoring that gives you a two pronged approach to spotting problems before they happen.
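A simple automated monitor does not need to be an AI at all. Below is a minimal sketch that compares approval rates across groups and raises a flag when the gap gets too wide. The group labels and the 0.2 gap threshold are assumptions I picked for illustration. Choosing the right metric and threshold is exactly the kind of decision an ethics process has to make.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of approved decisions for each group.

    `decisions` is an iterable of (group, was_approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def flag_gap(decisions, max_gap=0.2):
    """Return (flagged, rates); flagged is True when the spread exceeds max_gap."""
    rates = approval_rates(decisions)
    if not rates:
        return False, rates
    gap = max(rates.values()) - min(rates.values())
    return gap > max_gap, rates
```

A flag here is not proof of a problem. It is a prompt for a human to go look at the logged decisions behind the numbers.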

The third pillar is explainable AI. That’s a bigger problem because we don’t have perfect answers to it yet. Right now we have machines that can drive cars and hand out loans but they can’t tell us why they made the decisions they made.

Explainable AI is still a hotbed of academic, government and corporate research and that’s why we want the Practical AI Ethics Foundation to bring together people working on how to get AIs to tell us what they’re doing.

Handling all this data, decision making, and AIs monitoring AIs in infinite regress comes down to something basic. We need the tools, processes and software to make it happen.

And that brings us to our second Foundation: the AI Infrastructure Foundation.

Roads? Where We’re Going We Don’t Need Roads

There’s an exciting side to AI. Intense, multimillion dollar research over many years that leads to a billion dollar algorithmic breakthrough that keeps self-driving cars from crashing or detects lung cancer better than ever is the glamorous part of intelligent systems.

But 95% of the work in machine learning isn’t glamorous at all.

It’s hard work.

As Kenny Daniel, co-founder of Algorithmia, said "Tensorflow is open-source, but scaling it is not."

It’s long days experimenting with ideas. It’s about crunching picture file sizes down as far as they will go without losing the features a model can key in on. It’s waiting for systems to train over hours or days before you know if you got a good answer. And then doing it again. And again.

When you’re starting out, it’s easy to imagine that machine learning is effortless. Just grab a few open source tools, read some papers on arXiv, hire a few data scientists fresh out of school and you’re off and running. But the truth is that machine learning in production, at scale, is hard and getting harder every day.

To do the fun parts of AI, you need to do the hard work of managing all the pieces in the pipeline. For that you need tools and infrastructure. You don’t build a city on quicksand. You need bedrock and a strong foundation.

Most of the breakthroughs in AI have come out of mega-tech companies like Google with their in-house research lab DeepMind. They’ve got incredibly fine tuned, hyper-advanced infrastructures and researchers can throw virtually any problem at the engineers manning those infrastructures. They’re not using off the shelf, AI at scale in a box software. If the key software pieces a researcher needs don’t exist, Google’s engineers can just code it up in house.

But not everyone has an army of in-house programmers and a webscale infrastructure they can modify on the fly. As AI trickles down into enterprises, companies struggle to cobble together a coherent stack of tools to keep their AI/ML pipelines running like a well oiled machine. An explosion of open source projects, data lakes that have grown into data oceans, and a confusing array of complex infrastructure pieces only make it tougher.

And that’s what my work with the AI Infrastructure Foundation is all about, to gather together all the people, organizations, projects and companies that are working on building out the roads and plumbing of the AI driven future.

The biggest problems in that stack are data and data management, versioning and change control. Data is the new oil and the new gold. And it’s super easy for data to get out of control.

A data science team doing facial recognition on outdoor cameras may start with a dataset of a hundred terabytes but it won’t stay that way for long. Today’s smart devices are packed with sensors and telemetry systems that send info back in a constant stream.

That data needs to be sorted, shifted, labeled, transformed, scaled and crunched. In short, it changes and changes constantly. Keeping track of all those changes quickly becomes a nightmare.

Data science teams can’t manage all that data and keep track of all those little changes on their own. They need to work hand in hand with IT Ops and software engineering from start to finish. Too many organizations started their data science efforts as a separate process from regular engineering tasks and it doesn’t work.

The security is missing, the change control is missing. If you don’t know who had their hands on a weights file before it goes to production, how do you know it wasn’t changed or tampered with, or whether anyone unauthorized touched it?
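One small piece of that puzzle is easy to sketch: record a checksum of the weights file when it is approved and check it again before deployment. The function names below are mine and the approach is only an assumption about how a team might start. A checksum tells you that a file changed, not who changed it, so it complements access controls and signed releases rather than replacing them.

```python
import hashlib


def sha256_of_file(path: str) -> str:
    """Hash a file in chunks so large weights files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_untampered(path: str, expected_digest: str) -> bool:
    """Compare the file's current hash with the one recorded at approval time."""
    return sha256_of_file(path) == expected_digest
```

Store the approved digest somewhere the training pipeline cannot write to, and refuse to ship any model whose file no longer matches it.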

As data grows out of control and machine learning models and tools proliferate it’s essential to treat machine learning as another variation on traditional software engineering. That’s where Kubernetes, containers and OpenShift come into the ML workflow, along with DevOps programming paradigms, open source frameworks like Kubeflow and tools like Pachyderm that act as “Git for data.”

When your code, your files, and your models are all changing at the same time then keeping track of what changed and the relationship between them all gets harder and harder.

Pachyderm sprang from the founders’ experience with machine learning at different startups and companies, working on everything from bioinformatics to risk analysis and anti-money laundering, algorithmic trading, and video and image recognition.

As data scientists at those companies built their models on the backs of incredibly complex and fragile multi-stage pipelines, the slightest change in the data brought the whole system crashing down. It often took hours to run and when it failed, engineers burned more hours trying to trace the problem, only to find that the data path had moved or the name of a directory had changed.

Pachyderm solved that problem by keeping a perfect history of what, where and when data changes over time.
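To show the general idea of versioned data, here is a toy content-addressed store. To be clear, this is not Pachyderm's implementation or its API. It is a small sketch of my own that assumes the whole data set fits in a dictionary, just to show how snapshots make "what changed and when" an easy question to answer.

```python
import hashlib
import time


class TinyDataRepo:
    """Toy content-addressed store: every commit snapshots file hashes."""

    def __init__(self):
        self.blobs = {}    # content hash -> bytes
        self.commits = []  # ordered list of snapshots

    def commit(self, files: dict, message: str) -> int:
        """Snapshot a {path: bytes} mapping and return the commit id."""
        snapshot = {}
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            self.blobs[digest] = content
            snapshot[path] = digest
        self.commits.append({"files": snapshot, "message": message, "time": time.time()})
        return len(self.commits) - 1

    def diff(self, old: int, new: int) -> dict:
        """List paths added, removed or changed between two commits."""
        before = self.commits[old]["files"]
        after = self.commits[new]["files"]
        return {
            "added": sorted(set(after) - set(before)),
            "removed": sorted(set(before) - set(after)),
            "changed": sorted(p for p in before.keys() & after.keys() if before[p] != after[p]),
        }

    def checkout(self, commit_id: int) -> dict:
        """Rebuild the {path: bytes} mapping exactly as it was at a commit."""
        files = self.commits[commit_id]["files"]
        return {path: self.blobs[digest] for path, digest in files.items()}
```

With something like this in place, the moved directory that used to cost hours of detective work shows up in a diff as one path removed and another added, and rolling back is just a checkout of the last good commit.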

Every data science team will struggle with this in the coming months and years. Every data science team needs to find a way to get better control over their ever changing landscape before it breaks their own complex architectures or causes the kinds of errors they can’t chase down easily.

Version control and knowing where data came from and where it’s going, aka “data provenance,” is critical for ethics too, which brings us full circle to where I started here.

When our loan program starts denying loans to people named “Joe” and “Sally” we need to track down the problem and roll it back to an earlier snapshot of that model, or fire up a new training session, or quickly put in a rule to make it stop denying every Joe and Sally in the world a chance to buy that shiny new hybrid sedan to drive their new baby Jimmy to soccer practice.

More and more solutions like Pachyderm will pop up in the next few years. We’re already seeing tools like TensorFlow grow into TensorFlow Extended, based on Google’s own struggles with managing ever more complex machine learning pipelines.

All of these tools are evolving towards one thing:

A canonical stack.

Without a strong foundation we can’t build a strong house.

The AI Infrastructure Foundation will help me and you get our hands around this evolving stack and help drive it to deliver the tools we need to build the houses of tomorrow. And it will make sure we’re all working together on it instead of at cross purposes.

Too many organizations try to develop everything in house. Not-invented-here syndrome leads to lots of overlaps. Better to adopt the mantra of the Kubeflow team: don’t try to invent everything yourself or “recreate other services,” but embrace a collection of the best tools and bring them all together.

Lots of teams built their own AI Studio because they had no choice but these home grown systems aren’t going to cut it over the long haul. Over the next few years, we’ll see an industry standard machine learning stack solidify.

Many different organizations will contribute a piece of the puzzle. Who those key players are is still very much in flux, but it will happen faster than anyone expects as organizations laser in on the best open source software, in house and in the cloud and eventually in the fog too.

Be ready to adapt.

As Bruce Lee said, “Be like water. You put water in the cup, it becomes the cup. You put water in the bottle, it becomes the bottle.”

Make sure your organization doesn’t hold on to those home grown systems while the rest of the world blows by them.

Change with the circumstances and the times. Be flexible and willing to adopt new tools as they mature.

Like Bruce Lee, you must continually learn new Kung Fu to adapt to a changing world.

The Wheel of Time

Ironically, now that I’ve left Red Hat, I have more chances to work with them than I ever did from inside their walls.

As soon as I exited the building, I reached out to everyone there and I found an untapped reservoir of passion still burning with the good people who decided to stay. I found that CTO Chris Wright and I share a passion for AI ethics and making sure this technology doesn’t go off the rails and do more damage than good.

And as long-time infrastructure guys we naturally approach it from that angle too. I’m hopeful they become my first partner on the pathways of tomorrow.

I’m old enough to have experienced the Internet revolution first hand. But I was too young to make a difference. I was just a passenger along for the ride.

The early days of the net were a heady time. It was a time of unprecedented hope and potential. We were changing the world! We could taste the future where people could work from home, buy anything they wanted online, connect with people just like themselves all over the planet, or get an education without ever going to a university.

All of that happened.

But we missed the dark side.

We thought that if we made information free that the world would magically open up and transform into a new kind of massively connected utopia.

That vision looks hopelessly naive now.

We didn’t realize that someone had to pay for all those services somehow and so we unwittingly ushered in a surveillance economy and a total information awareness age.

When we broke down the walls that filtered content, we opened the doors to a torrent of crazy. The old gatekeepers were imperfect. They filtered as much good content as they did bad but they managed to keep the fringes of society where they belonged, at the fringes.

Now every single nut with a crazy idea can find other nuts out there with ease and they can come together and amplify their voice. We don’t know where to turn or who to trust anymore in the chaotic din.

If you put a thousand people in a room and twenty of them start shouting at each other, eventually you’re so exhausted that you start to believe that’s what everyone thinks. But it’s not true. They’re just the people shouting the loudest.

Now, I’m older and a little wiser. I can see the dark linings in the silver cloud. I see the promise and the potential of AI, but I see the peril too. But this time I’m lucky enough to be in a position to do something about it.

AI will do bad things. If we don’t take action now we might just wake up in a world that’s infinitely worse than anything the Internet visited upon us, a world of surveillance on steroids and clandestine machines running every aspect of our lives. We could find ourselves getting hired or fired by a black box and never know why.

But it doesn’t have to be that way.

The future is not set.

We're the architects of tomorrow. There’s no fate but what we make.

AI will do bad things but it will do amazing things too. And if we can get ahead of the problems, if we can see them coming, then we can make sure it does more good than evil.

I’m ready to do my part to change the world but I can’t do it alone. I need all the help I can get.

I need your help.

Join me.

And maybe this time we really can build a better tomorrow.

######################################################

Come join the communities:

AI Infrastructure Foundation

Practical AI Ethics Foundation

######################################################

I am nothing but grateful for my time at Red Hat and
I still have many friends there and always will now and in the future.
It's one of the greatest companies in the world and I was lucky to
spend a big chunk of my life there. Now I get to work with them in a
new way as a partner. Come and see my talk on Practical AI Ethics at
the #OpenShift Commons meet in San Francisco. Get registered here.