Last week, I wrote A Beginner’s Guide to Deadlocks in Amazon RDS. This week, I’d like to lay out my 10 years of experience about how to avoid deadlocks altogether. Often times, this will be out of the hands of operations people, but you can still move for dev changes based on issues in production. The more knowledgeable you are about deadlocks in general, the more they will lean on you as a resource with wisdom, not a totalitarian barking rules.
As I lay out my 5 tips to avoid deadlocks in Amazon RDS, I want to re-emphasize that RDS is not actually capable of creating deadlocks — it merely reports them from the underlying database engine. While all of my 5 tips are applicable to all six supported database engines in RDS (Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server), your specific database may provide additional recommendations on deadlock avoidance.
Tip #1: Don’t Use Shared Locks in Transactions with Writes
Databases have two different kind of locks, exclusive and shared. Exclusive locks can only be granted to a single thread, while shared locks can be granted to multiple threads at once. If multiple threads request the same exclusive lock, it will only be granted to one of them at a time, and the other threads will block until it becomes available.
While there are multiple kinds of locks, they always apply to the same kinds of resources — typically rows in a table. A row can be locked with exclusive and shared locks both, but never at the same time. If a row is locked in share mode, and an exclusive lock is requested, no more share mode locks will be granted until after the exclusive lock is granted and released.
The biggest advantage to using shared locks, and the reason why devs prefer them, is because they increase concurrency while ensuring consistency. For example, if a developer wanted to get both a row from the `users` table and the related information from the `payments` table, it would be reasonable to get a shared lock on the user’s record before reading the payments, just to avoid conflicting information if a user happens to be making a payment right at that moment. In this example, a shared lock would allow multiple readers to increase concurrency.
The problem with shared locks is if the locked row is updated in the same transaction as the shared lock. An update requires an exclusive lock, which must be requested separately even if a shared lock is held. Now, because the shared lock is selected to increase concurrency, the chances are high that another thread also has the shared lock and is going to make a similar update. Since both threads have the same shared lock and want the same exclusive lock, deadlock occurs.
Here’s a simple example with MySQL syntax. I have two threads, A and B, doing the exact scenario laid out above.
A => BEGIN; A => SELECT * FROM users WHERE id = 7 FOR SHARE; B => BEGIN; A => SELECT * FROM users WHERE id = 7 FOR SHARE;
At this point, all locks have been granted, and both threads (A and B) continue execution of further statements.
A => SELECT * FROM payments WHERE user_id = 7; B => SELECT * FROM payments WHERE user_id = 7;
Then, both threads attempt to update the user level based on the payment information gathered.
A => UPDATE users SET level = ‘professional’ WHERE id = 7; B => UPDATE users SET level = ‘professional’ WHERE id = 7;
When the query from thread A is run, A becomes blocked, because B holds a shared lock on that row. If A was the only thread to hold the shared lock, the database engine would be smart enough to upgrade the shared lock to an exclusive lock. After A is blocked, B attempts the same query. It blocks for the same reason, and deadlock is in full force.
To avoid this, my tip is to only use shared locks in read-only transactions. Shared locks have their utility - concurrency with consistency. They aren’t all bad, they just have a precarious tendency to create deadlocks when combined with updates, deletes, and inserts.
Tip #2: Keep Transactions Short
This tip has less to do with keeping bad code out of your system, and more to do with the simple statistical fact that shorter periods of locking decrease the chances of a deadlock.
Let’s say, for a moment, that you have 100 transactions per second, and that each transaction lasts 10 seconds. At any given time, you will have around 1,000 transactions open with potential lock interference. Given a limited set of locks for your active data (not all your data is in use at once), you could calculate the statistical odds that any of these queries could request the same locks. Not all lock contention will yield a deadlock, but contention is a required condition for deadlock.
Compare this to the case where you have 100 transactions per second (the same as before), but this time, each transaction only lasts 10 milliseconds. At any given time, you will have only 1 transaction open at a time, which guarantees no lock contention at all. Since lock contention is a required condition for deadlock, deadlock will be all but impossible in this case, even if the code is riddled with race conditions.
In the above paragraphs I am specifically using the word “transactions,” not the word “queries.” Having faster queries will only help insomuch as they increase the speed of the associated transaction. A transaction is the lifetime of a lock (in most of the RDS databases), so it’s the transactions that should be fast, not just the queries. Obviously, if you have a query that takes 10 seconds, you should address that to make the transaction faster.
To keep transactions fast, only start the transaction when all the computation is done, and only the queries remain. After a transaction is open, there should be no encoding data, creating prepared statements, or calculations — the primary focus of code should be to finish the queries and commit the transaction.
Up next, I'll share the remaining three tips for deadlock avoidance. Before reading them, do you know if you have deadlocks already? Most deadlocks occur silently in the night, and developers never find out about them. They manifest themselves as a single line in a log file that gets ignored. Most monitoring tools don't alert on deadlocks out-of-the-box.
If you're unsure about if you're having deadlocks, and want to get notified when you do, try out Blue Matador. We automatically notify you about hundreds of issues, including deadlocks. You can get a free 14 day trial of our alert automation, no credit card required.