David Smooke

@DavidSmooke

Behind the Scenes of a Database as a Service Provider

Founder Interview

Disclosure: Manifold, the marketplace for independent developer services, has previously sponsored Hacker Noon. Use code HACKERNOON2018 to get $10 off any service.

Let’s talk numbers. What is the scale and progress of JawsDB?

It’s humbling, to be honest. Roughly 3 years ago, JawsDB had 5 customers at the end of its first month. I was ecstatic. Over the moon! Something I had put together as a labor of love and a project for learning new technologies was being used out there in the real world? There’s no feeling like it.

A lot has happened since then, but today, JawsDB has over 20,000 customers and is on track to add another 1,400 this month.

How and why did you choose the name Jaws? Are there any easter eggs in your product/service/communications that pay homage to the movie?

Well, it’s not officially an acronym, but when I was trying to come up with a name, I began with the most trivial facts about both the product and the developer. A short version of how it went is: “Ok, so, my name is John. I’m leveraging Amazon Web Services. John.. AWS… J… AWS… dear God…”

The word naturally conjures the image of a shark in people’s minds, thanks to Spielberg, and sharks have both fascinated and terrified me from a young age. I’m by no means alone in this, so I assumed that the thoughts and emotions evoked by sharks would help sell the service as smart, sleek, and scary-powerful :)

How/why/when did you start JawsDB?

I started the project in 2014. At the time I was working as a Microsoft and Oracle database architect for a telecommunications company. It was lots of SQL all the time, which obviously influenced the direction I would take the code.

The job was great, the people were great, and the pay was great, but I got restless. Reading about newer languages and technologies gave me a sort of digital wanderlust and so I started using my free time to create web applications in Ruby and Python, using different frameworks and SQL databases than I was using at work.

JawsDB started as an attempt to get more familiar with Amazon Web Services and platforms like Manifold and Heroku. The more I dove in, the wider my eyes got as I saw what incredible power we have at our fingertips.

From there, the project became driven by two goals:

  • 1: To make cloud SQL databases even more accessible to the average developer, and
  • 2: To prove to myself that even in our day and age of complex software, one person could still create a worthwhile product and take it to market.

JawsDB offers MySQL, Postgres, and MariaDB. Which are your customers using more? And what are the key factors that make them choose one over the other? How do you see their market share changing moving forward?

MySQL databases are Jaws’ most popular by a comfortable margin. Factors influencing this are numerous, but I was just surprised to see there was still such a huge need for SQL databases among the web development community. I foolishly assumed most developers on these newer platforms would be using only the newest technologies.

Speaking from my own experience in tech, it’s easy to assume certain languages or technologies are dead because they are no longer written about as frequently, or perhaps because there are several scathing blog posts out there entitled “Why I left [some technology] and never looked back!” I empathize with that. I started my side projects in part because I didn’t want to get left behind.

What this whole company has taught me, though, is that solutions that work will last a lot longer than people think, even if there are newer and arguably better options. An overwhelmingly large set of existing applications out there run on MySQL and they need homes! What’s more, so many of us learned web development on SQL databases, whether in bootcamps, computer science classes, or web tutorials. These database engines are fast and robust, with vast supporting documentation, blog posts, Stack Overflow answers, etc. That makes them very attractive options to this day.

Why did you decide to partner with Manifold?

When a colleague told me about Manifold, I was very interested to hear that the add-on marketplace was decoupled from the web server. They offered developers a chance to integrate helpful add-ons like JawsDB into their project no matter where they hosted the application itself. This expansion of the potential client pool was enticing, but after speaking with the team, their professionalism and vision really convinced me to adapt the service to their platform. I felt confident that if a team of people could succeed in this space, it was them.

I’m encouraged to see the diversity of all the different applications utilizing JawsDB through Manifold. The platform offers developers true freedom from walled gardens while still providing a toolset for quickly adding real power to an application.

At a young age, what excited you about the internet?

To be perfectly honest, I was most excited about the ability to chat with girls from my school via Instant Message. That evolved to incredulity that I could download all the songs I ever wanted via Napster, Kazaa, Limewire, etc. Then it was elation at Wikipedia aggregating all the sources I would need for a research paper… Basically the internet provided all that was necessary to satiate a selfish teenager.

Grasping the world-changing potential that the internet provides took time, and do any of us still really grasp it?

It’s a little too easy to take even something revolutionary for granted when it has become so ubiquitous and second-nature. Louis CK had a great bit about people nowadays getting frustrated when their phones take a half-second longer than “instantaneous” to perform a task. “Give it a second!” he cries with exasperation. “It’s sending a message to space!”

It’s healthy for all of us to take a moment and be thankful that we live in the age of the internet. What a gift!

I see you got your master’s degree in computer science. How did that help shape your career? And would you recommend it to other junior developers who are looking to transition to starting their own company?

This is a tough one. When I signed up for my undergraduate courses (also CS), I had already developed an interest in web development. The program helped teach me the discipline needed to study efficiently and absorb new material effectively, but college isn’t marketed as the place to learn discipline. I’ve come to believe this can be learned via many different avenues.

The undergraduate diploma itself helped me get my foot in the door at my first programming jobs and I do not want to downplay that. I was fresh out of school and I needed to stand on my own two feet. The graduate degree similarly helped me move up within the company I worked for, but I honestly can’t say that I would categorize either degree as a prerequisite for success in starting a company or even in web development in general.

I’ve known wonderful coders who have come out of bootcamps and self-teaching as well as computer science programs. There are also mediocre coders who come from the same. The difference seems to be a combination of innate ability, conscientiousness, and drive.

My advice to any dev looking to start their own company would be to first summon the courage necessary to take their project to the end.

It’s easy to get hung up on a lack of formal education or a lack of a truly original idea when trying to start a business. A courageous and driven person doesn’t need a diploma to learn the skills necessary to build an application. Neither do they need to have the next big idea in order to make their venture successful. For perspective, taking even 1% of a billion-dollar market is still 10 million dollars.

How disruptive do you think the blockchain will be in terms of database market share?

The short answer is that I’m not smart enough to answer that question with any degree of confidence. To me, the most exciting potential of blockchain is eliminating the need to place excessive trust in central authorities and middle-men. There are all kinds of database systems that would benefit from this. The how or the when or the magnitude of disruption in the database realm is a mystery to me, but it is something I’m very excited to witness.

What upcoming database innovations are you excited about? How do you see the technology of databases evolving (say 5–10 years)?

Within the realm of SQL databases, I have to say that I’m very excited about the application of Machine Learning to configuration tuning. A SQL database can be used for a million different things and these specific use-cases will tax the underlying server hardware and database algorithms in different ways.

For this reason, these databases usually ship with a default server configuration designed to work relatively well with a good percentage of these use-cases. However, an administrator familiar with both the database engine itself and the application being hosted on it is usually required to tweak these configuration settings to get it all to work ‘just right’.

Machine learning algorithms are being developed to observe the normal day-to-day operation of a unique database hosting a unique application and to make decisions about how these configuration options should be tweaked to achieve performance gains. They can also recommend better indexing, table design, etc.

Without training in database administration, it’s very difficult to tell if “slowness” in a database is something that requires a hardware upgrade or could be fixed by better queries, better indexing, better configuration tuning, etc. These advances in automatic tuning could eliminate this ambiguity, reduce the need for database support-staff, and keep databases running smoother for longer on cheaper hardware. Brilliant!
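To make the idea of software-driven tuning concrete, here is a deliberately tiny sketch. It is not how any real tuner works and it is not machine learning: it is a plain hill-climbing search over one setting. The latency function, the cache setting, and every number in it are hypothetical stand-ins for measurements a real tool would take from a live database.

```python
def simulated_latency_ms(cache_mb: int) -> float:
    """Hypothetical workload: latency falls as the cache grows toward the
    working set, then rises once the cache starts crowding out other memory."""
    working_set_mb = 512  # assumed size of the frequently read data
    miss_penalty = max(working_set_mb - cache_mb, 0) * 0.05
    memory_pressure = max(cache_mb - 768, 0) * 0.02
    return 5.0 + miss_penalty + memory_pressure


def hill_climb(measure, start: int, step: int, low: int, high: int) -> int:
    """Move the setting one step at a time for as long as the measurement improves."""
    current = start
    best = measure(current)
    while True:
        # Look one step down and one step up, staying inside the allowed range.
        neighbours = [c for c in (current - step, current + step) if low <= c <= high]
        candidate = min(neighbours, key=measure)
        if measure(candidate) >= best:
            return current  # no neighbour is strictly better, so stop here
        current, best = candidate, measure(candidate)


tuned = hill_climb(simulated_latency_ms, start=128, step=64, low=64, high=2048)
print(tuned, simulated_latency_ms(tuned))
```

With these made-up numbers the search stops at a 512 MB cache. A real tuner would have to cope with noisy measurements, many interacting settings, and workloads that change over time, which is where the learning algorithms come in.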

Given how dependent we are on Facebook & Google, and given how those companies make money off our information, do you think everyone should own a personal database of their network?

As an early adopter of both Google services and Facebook, I have to say that some of their recent business practices have me a little spooked. The recent Cambridge Analytica scandal only scratches the surface of what sort of data is being stored and for what purposes.

In my master’s program, I heard lectures and saw demonstrations of how the concepts from behavioral psychology that make advertisements so effective are now being sharpened by the large datasets of these platforms and social networks. The result is an industry capable of creating highly effective and targeted messaging. One lecturer referred to it as ‘weaponized persuasion.’

From the pages you ‘like’ to the words you type into a cloud document, all of this information can be aggregated and analyzed by experts and learning algorithms to paint a figurative picture of you. What’s more, overlapping patterns of behavior among populations can be utilized in crafting powerful advertisements and persuasive messages for everything from breakfast cereal to political candidates and issues. In short, it goes beyond knowing about the current you, and extends into the realm of how to shape the future you using knowledge about your demographic’s biases, preferences, etc. There is a very good reason these free-to-use services are worth many billions of dollars.

At the risk of sounding paranoid, I’ll say that owning a personal storage solution, even if it is as simple as a thumb drive, is a smart idea when storing sensitive data that you do not want used for identification or for use by behavioral psychologists.

My hope is that technologies like the blockchain will eventually give us much greater control over our own data in terms of who can see it and how it is used.

You recently allowed users to view their server’s stored backups and restore them there or to a separate JawsDB instance. Could you describe how you built this?

Traditional SQL backup solutions involve regularly saving a text file which contains a script for re-populating a database with its contents from that point in time. In less-sophisticated environments, this can cause the performance of the database to degrade while this potentially-large file is created by reading from the database. There is also storage overhead incurred by the party taking and storing the backup file.
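As a small illustration of that text-style backup, the sketch below uses Python’s built-in sqlite3 module as a stand-in (with MySQL the usual tool for this job is mysqldump). The table and the sample rows are invented for the example; the point is only that the backup is a SQL script which, when replayed, rebuilds the data.

```python
import sqlite3


def dump_to_script(conn: sqlite3.Connection) -> str:
    """Return a SQL script that recreates every table and row in conn."""
    return "\n".join(conn.iterdump())


def restore_from_script(script: str) -> sqlite3.Connection:
    """Build a fresh in-memory database by replaying a dump script."""
    restored = sqlite3.connect(":memory:")
    restored.executescript(script)
    return restored


# A hypothetical source database with two sample rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Grace",)])
source.commit()

backup_script = dump_to_script(source)   # the "text file" backup
copy = restore_from_script(backup_script)
rows = copy.execute("SELECT name FROM customers ORDER BY id").fetchall()
print(rows)  # [('Ada',), ('Grace',)]
```

Producing the script means reading every row out of the source database, which is where the performance and storage costs mentioned above come from.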

I built a few different variations of this sort of backup solution for JawsDB but eventually decided to leverage AWS’ own backup system which saves a binary snapshot for the entire server at a point in time.

These files are akin to VMware snapshots in that they don’t just store the information within the MySQL database, but everything from the database to the underlying OS. Because of the binary nature of these files, they could be created much faster than the text backups traditionally used. In addition, when a server is created with redundant hot-backups, the snapshots can be taken from them with no service degradation on the master instance.

The downside to these files is that they are in a proprietary format and cannot be downloaded locally. This is obviously a big problem for people who need access to their backups in case of emergencies. What JawsDB ended up implementing is a solution where users could see their stored snapshots on their dashboard, along with metadata about the snapshot. Using the unique identifier of the snapshot and the credentials of the source server, customers can restore a snapshot to the same JawsDB server instance or any JawsDB instance of compatible type (equal or lesser storage capacity).

This not only allows users to restore a broken database, but also lets them temporarily spin up a copy from another point in time to grab data, or even use a JawsDB database as a template from which multiple copies can be created. All of this requires no downloading of backups (which traditionally can be many gigabytes in size) and no extra burden on JawsDB or its customers to store these files.

Could you share a screenshot of what your internal JawsDB account looks like?

Why not? I’m a big believer in dogfooding as a way to see yourself and your product through the customer’s lens.

[Some sensitive information has been removed :)]

To someone tasked with managing a new database from scratch, what advice would you give them? How should they prioritize / not prioritize their objectives?

With SQL databases, “measure twice, cut once” is a good mantra to hold.

The mathematical foundation behind SQL databases causes them to gain speed advantages from their rigid structures. The problem with this is that these rigid structures can be difficult to change later on, especially when the data contained within grows. A good initial design will save you many headaches later on. That being said, don’t let this paralyze you! Database schema changes, while annoying, are not the end of the world.

Second, while there are a lot of ORM libraries out there that abstract away the SQL language from development, it’s still very useful to do some reading in your database’s documentation on Indexes, such as how they are defined, and how they work. Indexes are perhaps the most important element of good database design after the schema design itself. They can be the difference between a query that takes 23 minutes to finish versus one that takes 23 milliseconds and pulls the same data.
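To see what an index changes, here is a minimal sketch using Python’s built-in sqlite3 module as a stand-in for a larger SQL server. The orders table, its columns, and the index name are made up for the example. It asks the database for its query plan before and after adding an index on the column used in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_email TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_email, total) VALUES (?, ?)",
    [(f"user{i}@example.com", float(i)) for i in range(1000)],
)


def query_plan(connection, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


lookup = "SELECT total FROM orders WHERE customer_email = ?"
args = ("user500@example.com",)

plan_before = query_plan(conn, lookup, args)  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_email ON orders (customer_email)")
plan_after = query_plan(conn, lookup, args)   # now a search using the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should describe a scan of the whole table and the second a search that uses idx_orders_email. On a thousand rows the difference is invisible; on hundreds of millions it is the minutes-versus-milliseconds gap described above.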

If you could change one thing about how databases work for everyone, what would it be?

Tuning the database server is currently still a very specialized and iterative process. This makes it difficult for small teams to scale effectively during early growth hurdles. As I mentioned before about machine learning, I’m very hopeful that isolating the individual server configuration values and optimizing them will eventually be done in software. This will open up growth potential to small teams and individuals who have a great product or idea but do not yet have the capital to hire the specialized personnel needed to overcome these challenges.

Follow @jawsDB and use JawsDB in Manifold.
