The importance of Primary Keys in MySQL Note: I have used MySQL and it’s storage engine InnoDB interchangeably. A couple of days ago, someone at work asked me whether inserts are faster in a table when there’s a present in the table. My immediate answer was that they were faster. I had some idea about this from experience and previous reading but didn’t remember the reason why it was so. primary key For a business user unaware about InnoDB internals or InnoDB’s existence, it might seem perfectly intuitive to assume that inserts will be faster without a because a is an additional constraint on the table, i.e., MySQL has to check for uniqueness, nullability. When it comes to InnoDB, that is not the case. Also, a table without a will occupy more space than a table with a . See how. primary key primary key primary key primary key Background MySQL has various for preventing multiple threads accessing the same resource at the same time. There is a called which locks the data dictionary, i.e., the . Ovais Tariq has written a very on this. mutexes mutex dict_sys information_schema insightful blogpost When a is not defined on a table, MySQL generates for every row that is inserted in that table. This, because InnoDB is an index-organized storage engine, it mandates a to store the contents of the table. If you don’t define a primary, MySQL will do this itself by generating a which is hidden. It is referred by the name in the documentation. This only has to exist if a table doesn’t have a . primary key row-ids primary key 6 byte auto increment primary key DB_ROW_ID primary key Now, when this is generated on every new insert, MySQL acquires mutex — and it becomes a system-wide contention point because what it is acquiring a mutex on is the data dictionary — which might be in use by a couple of other DDL and DML operations. This would mean that insert performance on a table might be affected by another operation taking place on another table. This contention would otherwise not exist if a is defined on the table. row-id dict_sys primary key To check how a similar insert workload behaves on a table with a defined and another table without a defined , I performed a simple insert performance test on my . I have attached the scripts to perform these tests. primary key primary key mysqlslap local Note: Doing this with _rand()_ is probably not ideal. — is a like tool for MySQL that can be used for performance and load testing of individual tables — Read more about it . mysqlslap sysbench here Result The insert performance in this case was surprisingly not very different in both the cases — checked for 10 million records. This was with the concurrency of 4 threads With , 1284.848 seconds primary key Without , 1282.350 seconds primary key In an isolated environment and with only 4 thread, there were not many mutexes on , hence, there was no significant difference between the insert performance. Although, with the same table configuration, these two tables occupy significantly different amount of space on disk — which comes from the fact that MySQL had to create for every record that was being inserted in the table without a defined . dict_sys information_schema row ids primary key With , 900 MB primary key Without , 1.1 GB primary key I conducted five more tests with increasing concurrency and a consistent load of 100K records. See the difference in insert performance here. Time taken is in seconds With increasing threads, time taken to insert into tables without increases quite a lot because of the contention point added because of mutex. primary key dict_sys Always use a ! primary key From about this P.S. — Jeremy Cole’s blog Given how much other code within InnoDB is protected by dict_sys->mutex I think it’s fair to say any tables with an implicit clustered key (ROW_ID) could expect to experience random insert stalls during operations like dropping (unrelated) tables. Parallel insertion into multiple tables with implicit keys could be performance-constrained, as it will be serialized on both the shared mutex and cache contention for the shared counter variable. References Ovais Tariq’s Original Blogpost about Scalability Issues — strongly recommended book for MySQL power users Chapter 4 from High Performance MySQL Tuning Primary Keys in InnoDB — not immediately related to this, but interesting nevertheless Wasting InnoDB Memory MySQL Server Team’s Blog about InnoDB’s Intrinsic Tables Community Discussion about using mysqlslap Importance of Primary Keys in MySQL Performance — not immediately relevant UUID vs incremental ID insert performance in MySQL — must read Jeremy Cole’s blog on How InnoDB behaves without a Primary Key — this needs a deeper dive Lack of Primary Key May Effectively Stop Slave Another blogpost about Replication Issues when Primary Keys are not present InnoDB Row Format Specification Art Annunciazione by Leornado da Vinci