Illustration composed from MariaLetta/free-gophers-pack, original gopher by Renee French.
A transaction is a sequence of database operations grouped into a single unit. All of its operations are either committed or rolled back together. The simplest example is a balance transfer.
For a transfer between two accounts, Alice and Bob, we need to subtract from Alice's balance and add to Bob's balance in one action. The SQL code for this would be something like this:
BEGIN;
UPDATE users SET balance = balance - 10 WHERE name = 'Alice';
UPDATE users SET balance = balance + 10 WHERE name = 'Bob';
COMMIT;
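The same transfer written in Go with the jackc/pgx driver (introduced below) could look roughly like this sketch; connString and the imports are assumed, as in the rest of the snippets in this article:
ctx := context.Background()
conn, err := pgx.Connect(ctx, connString)
if err != nil {
    fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
    os.Exit(1)
}
defer conn.Close(ctx)
tx, err := conn.Begin(ctx)
if err != nil {
    panic(err)
}
// Rollback is safe to call even after a successful Commit;
// it then only returns a "tx is closed" error, which we ignore.
defer tx.Rollback(ctx)
if _, err := tx.Exec(ctx, "UPDATE users SET balance = balance - 10 WHERE name = 'Alice'"); err != nil {
    panic(err)
}
if _, err := tx.Exec(ctx, "UPDATE users SET balance = balance + 10 WHERE name = 'Bob'"); err != nil {
    panic(err)
}
if err := tx.Commit(ctx); err != nil {
    fmt.Printf("Failed to commit: %v\n", err)
}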
I won't go deep into the theory of how Postgres handles all this internally, but will rather focus on Go examples.
Go has two main non-ORM libraries for working with Postgres: lib/pq and jackc/pgx. Pgx is preferable, and I am going to use it in the examples. Although lib/pq supports database/sql, it is no longer actively maintained and has some issues, such as panicking when an error occurs instead of returning it. Let's take a look at a slightly more complex example:
Assume that we have a users table; each user has a name, balance, and group_id. Seed the table with 5 users, each with a balance of 100, split into 3 groups.
CREATE TABLE users
(
id serial,
name text,
balance integer,
group_id integer,
PRIMARY KEY (id)
);
INSERT INTO users (name, balance, group_id)
VALUES ('Bob', 100, 1),
('Alice', 100, 1),
('Eve', 100, 2),
('Mallory', 100, 2),
('Trent', 100, 3);
We need to read data into our program, do something with it, and then write it back, all in one ACID transaction. If someone else tries to update the same data concurrently, the transaction will behave differently depending on its isolation level.
In theory there are 4 isolation levels: Read uncommitted, Read committed, Repeatable read, and Serializable. Postgres supports only 3 of them, because Read uncommitted behaves exactly like Read committed, which is the default isolation level in Postgres. There are also 4 undesirable phenomena that the different isolation levels are meant to prevent: dirty read, nonrepeatable read, phantom read, and serialization anomaly.
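A side note on setting the level: the snippets below issue a raw SET TRANSACTION ISOLATION LEVEL statement built from a string, which keeps the examples uniform. Pgx also exposes typed transaction options for this; a minimal sketch, assuming an open conn and ctx as in the other snippets:
// Begin a transaction with an explicit isolation level instead of
// issuing SET TRANSACTION ISOLATION LEVEL by hand.
tx, err := conn.BeginTx(ctx, pgx.TxOptions{IsoLevel: pgx.RepeatableRead})
if err != nil {
    panic(err)
}
// ... queries and updates ...
if err := tx.Commit(ctx); err != nil {
    fmt.Printf("Failed to commit: %v\n", err)
}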
A dirty read is basically reading uncommitted changes made by other transactions. In Postgres all transactions are protected from dirty reads: it is not possible to read changes that have not yet been committed. The default isolation level is Read committed, which in Postgres is equivalent to Read uncommitted; in other databases Read uncommitted may allow dirty reads.
First, we need to prepare two separate connections to the same database in order to send queries from both of them concurrently:
ctx := context.Background()
conn1, err := pgx.Connect(ctx, connString)
if err != nil {
fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
os.Exit(1)
}
defer conn1.Close(ctx)
conn2, err := pgx.Connect(ctx, connString)
if err != nil {
fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
os.Exit(1)
}
defer conn2.Close(ctx)
Attempt at a dirty read: if a dirty read were possible, the reads on steps 2 and 3 would return the same value. But since the change is made inside the transaction, it is not visible outside of it until the commit.
tx, err := conn1.Begin(ctx)
if err != nil {
panic(err)
}
tx.Exec(ctx, "SET TRANSACTION ISOLATION LEVEL "+isolationLevel)
_, err = tx.Exec(ctx, "UPDATE users SET balance = 256 WHERE name='Bob'")
if err != nil {
fmt.Printf("Failed to update Bob balance in tx: %v\n", err)
}
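// Step 2: read Bob's balance from inside the same transaction.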
var balance int
row := tx.QueryRow(ctx, "SELECT balance FROM users WHERE name='Bob'")
row.Scan(&balance)
fmt.Printf("Bob balance from main transaction after update: %d\n", balance)
row = conn2.QueryRow(ctx, "SELECT balance FROM users WHERE name='Bob'")
row.Scan(&balance)
fmt.Printf("Bob balance from concurrent transaction: %d\n", balance)
if err := tx.Commit(ctx); err != nil {
fmt.Printf("Failed to commit: %v\n", err)
}
For both isolation levels the results are the same:
Dirty read
Isolation level - READ UNCOMMITTED
Bob balance from main transaction after update: 256
Bob balance from concurrent transaction: 100
Final table state:
1 | Bob | 256 | 1
2 | Alice | 100 | 1
3 | Eve | 100 | 2
4 | Mallory | 100 | 2
5 | Trent | 100 | 3
Isolation level - READ COMMITTED
Bob balance from main transaction after update: 256
Bob balance from concurrent transaction: 100
Final table state:
1 | Bob | 256 | 1
2 | Alice | 100 | 1
3 | Eve | 100 | 2
4 | Mallory | 100 | 2
5 | Trent | 100 | 3
A transaction reads some values from rows, and those values could be changed by concurrent transactions before the transaction ends. If such concurrent changes are simply ignored, that is a nonrepeatable read. To prevent it, under the Repeatable read isolation level Postgres works with a snapshot taken at the start of the transaction and aborts the transaction if it tries to modify a row that has been changed concurrently since then. Test with the Read committed and Repeatable read isolation levels: a transaction with Read committed simply ignores the concurrent change and overwrites it. With Repeatable read, Postgres detects the concurrent change on step 3 and aborts the transaction.
tx, err := conn1.Begin(ctx)
if err != nil {
panic(err)
}
tx.Exec(ctx, "SET TRANSACTION ISOLATION LEVEL "+isolationLevel)
row := tx.QueryRow(ctx, "SELECT balance FROM users WHERE name='Bob'")
var balance int
row.Scan(&balance)
fmt.Printf("Bob balance at the beginning of transaction: %d\n", balance)
fmt.Printf("Updating Bob balance to 1000 from connection 2\n")
_, err = conn2.Exec(ctx, "UPDATE users SET balance = 1000 WHERE name='Bob'")
if err != nil {
fmt.Printf("Failed to update Bob balance from conn2 %e", err)
}
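// Step 3: update the same row inside the transaction; under Repeatable read this fails.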
_, err = tx.Exec(ctx, "UPDATE users SET balance = $1 WHERE name='Bob'", balance+10)
if err != nil {
fmt.Printf("Failed to update Bob balance in tx: %v\n", err)
}
if err := tx.Commit(ctx); err != nil {
fmt.Printf("Failed to commit: %v\n", err)
}
The results are different; in the second case the transaction failed:
Nonrepeatable read
Isolation level - READ COMMITTED
Bob balance at the beginning of transaction: 100
Updating Bob balance to 1000 from connection 2
Final table state:
1 | Bob | 110 | 1
2 | Alice | 100 | 1
3 | Eve | 100 | 2
4 | Mallory | 100 | 2
5 | Trent | 100 | 3
Isolation level - REPEATABLE READ
Bob balance at the beginning of transaction: 100
Updating Bob balance to 1000 from connection 2
Failed to update Bob balance in tx: ERROR: could not serialize access due to concurrent update (SQLSTATE 40001)
Failed to commit: commit unexpectedly resulted in rollback
Final table state:
1 | Bob | 1000 | 1
2 | Alice | 100 | 1
3 | Eve | 100 | 2
4 | Mallory | 100 | 2
5 | Trent | 100 | 3
A phantom read is similar to a nonrepeatable read, but it concerns the set of rows selected within the transaction. If concurrent changes alter which rows match the query, that is a phantom read. In Postgres the Repeatable read level already prevents it. Test with the Read committed and Repeatable read isolation levels: a transaction with Read committed will read different sets of rows on steps 1 and 3. With Repeatable read, Postgres keeps the snapshot taken at the beginning of the transaction, so reads 1 and 3 return the same set of rows, isolated from concurrent changes.
tx, err := conn1.Begin(ctx)
if err != nil {
panic(err)
}
tx.Exec(ctx, "SET TRANSACTION ISOLATION LEVEL "+isolationLevel)
var users []User
var user User
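// Step 1: select the users of group 2 inside the transaction.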
rows, _ := tx.Query(ctx, "SELECT name, balance FROM users WHERE group_id = 2")
for rows.Next() {
var user User
rows.Scan(&user.Name, &user.Balance)
users = append(users, user)
}
fmt.Printf("Users in group 2 at the beginning of transaction:\n%v\n", users)
fmt.Printf("Cuncurrent transaction moves Bob to group 2\n")
conn2.Exec(ctx, "UPDATE users SET group_id = 2 WHERE name='Bob'")
users = []User{}
rows, _ = tx.Query(ctx, "SELECT name, balance FROM users WHERE group_id = 2")
for rows.Next() {
rows.Scan(&user.Name, &user.Balance)
users = append(users, user)
}
fmt.Printf("Users in group 2 after cuncurrent transaction:\n%v\n", users)
fmt.Printf("Update selected users balances by +15\n")
for _, user := range users {
_, err = tx.Exec(ctx, "UPDATE users SET balance = $1 WHERE name=$2", user.Balance+15, user.Name)
if err != nil {
fmt.Printf("Failed to update in tx: %v\n", err)
}
}
if err := tx.Commit(ctx); err != nil {
fmt.Printf("Failed to commit: %v\n", err)
}
The results are different: based on the second select, different users are affected by the update:
Phantom read
Isolation level - READ COMMITTED
Users in group 2 at the beginning of transaction:
[{Eve 100} {Mallory 100}]
Concurrent transaction moves Bob to group 2
Users in group 2 after concurrent transaction:
[{Eve 100} {Mallory 100} {Bob 100}]
Update selected users balances by +15
Final table state:
1 | Bob | 115 | 2
2 | Alice | 100 | 1
3 | Eve | 115 | 2
4 | Mallory | 115 | 2
5 | Trent | 100 | 3
Isolation level - REPEATABLE READ
Users in group 2 at the beginning of transaction:
[{Eve 100} {Mallory 100}]
Concurrent transaction moves Bob to group 2
Users in group 2 after concurrent transaction:
[{Eve 100} {Mallory 100}]
Update selected users balances by +15
Final table state:
1 | Bob | 100 | 2
2 | Alice | 100 | 1
3 | Eve | 115 | 2
4 | Mallory | 115 | 2
5 | Trent | 100 | 3
Let's assume that we have several concurrent transactions in progress, each doing some reading and writing on the same table. If the final table state depends on the order in which these transactions run and commit, that is a serialization anomaly. In this case the results can be affected by race conditions. The Serializable isolation level helps prevent this type of issue, though I have to say that even at this level some rare corner cases can still exhibit the phenomenon. Test with the Repeatable read and Serializable isolation levels: with Repeatable read, both transactions commit without errors. With Serializable, the second transaction will not be committed. These two transactions work with the same data, and the order of their commits would affect the result, which could lead to unpredictable outcomes; Postgres refuses to commit the second transaction to prevent this uncertainty.
tx, err := conn1.Begin(ctx)
if err != nil {
panic(err)
}
tx.Exec(ctx, "SET TRANSACTION ISOLATION LEVEL "+isolationLevel)
tx2, err := conn2.Begin(ctx)
if err != nil {
panic(err)
}
tx2.Exec(ctx, "SET TRANSACTION ISOLATION LEVEL "+isolationLevel)
var sum int
row := tx.QueryRow(ctx, "SELECT SUM(balance) FROM users WHERE group_id = 2")
row.Scan(&sum)
tx2.Exec(ctx, "UPDATE users SET group_id = 2 WHERE name='Bob'")
if err != nil {
fmt.Printf("Error in tx2: %v\n", err)
}
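// tx adds the previously read sum to the balance of every user it sees in group 2.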
rows, _ := tx.Query(ctx, "SELECT name, balance FROM users WHERE group_id = 2")
type User struct {
Name string
Balance int
}
var users []User
for rows.Next() {
var user User
rows.Scan(&user.Name, &user.Balance)
users = append(users, user)
}
for _, user := range users {
_, err = tx.Exec(ctx, "UPDATE users SET balance = $1 WHERE name=$2", user.Balance+sum, user.Name)
if err != nil {
fmt.Printf("Failed to update in tx: %v\n", err)
}
}
if err := tx.Commit(ctx); err != nil {
fmt.Printf("Failed to commit tx: %v\n", err)
}
if err := tx2.Commit(ctx); err != nil {
fmt.Printf("Failed to commit tx2: %v\n", err)
}
In the second case the transaction failed with a "could not serialize access due to read/write dependencies among transactions" error:
Serialization anomaly
Isolation level - REPEATABLE READ
Final table state:
1 | Bob | 100 | 2
2 | Alice | 100 | 1
3 | Eve | 300 | 2
4 | Mallory | 300 | 2
5 | Trent | 100 | 3
Isolation level - SERIALIZABLE
Failed to commit tx2: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
Final table state:
1 | Bob | 100 | 1
2 | Alice | 100 | 1
3 | Eve | 300 | 2
4 | Mallory | 300 | 2
5 | Trent | 100 | 3
When you have multiple connections and concurrent access to a Postgres database, choose the isolation level carefully. Higher isolation levels provide more safety but reduce performance. You should also check that the transaction has been committed successfully and retry it if necessary.
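Under Repeatable read and Serializable, a failed transaction usually reports SQLSTATE 40001 (serialization_failure) and can simply be retried. A minimal retry sketch, assuming a hypothetical runTx helper that runs one whole attempt (begin, reads and writes, commit) and returns the first error it hits; it needs the errors and pgconn packages:
for attempt := 1; attempt <= 3; attempt++ {
    err := runTx(ctx, conn1) // runTx is hypothetical: begin, read/modify/write, commit
    if err == nil {
        break
    }
    var pgErr *pgconn.PgError
    if errors.As(err, &pgErr) && pgErr.Code == "40001" { // serialization_failure
        fmt.Printf("Serialization failure, retrying (attempt %d)\n", attempt)
        continue // the whole transaction must be rerun from the start
    }
    fmt.Printf("Transaction failed: %v\n", err)
    break
}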
All examples from the article:
https://github.com/kochetkov-av/go-postgresql-transaction-isolation