PostgreSQL has a very handy mechanism known as advisory locks. We at Tensor use them in many places in the system, but few people understand in detail how exactly they work and what problems you can run into if you mishandle them.
Read more about locks: for a general broadening of horizons, I recommend Egor Rogov's series of articles about the locking mechanics in PostgreSQL.
Benefits of advisory locks
The fundamental difference between this mechanism and "regular" table-, page-, or record-level locks is a set of key features.
Locking an arbitrary identifier
"Ordinary" locks in PG are always tied to a specific database object (table, record, data page) and the process serving the connection.Advisory locks are also to the process, but instead of a real object, an abstract identifier that can be specified as (bigint) ilikak (integer, integer)
By the way, tying each lock to a process means that by "killing" the process via pg_terminate_backend(pid), or by correctly closing the connection on the client side, you can get rid of all the locks it was holding.
CAS check for lock acquisition
CAS is Compare-and-Set: the check that the lock can be acquired and the acquisition itself happen as a single atomic operation, so nobody can knowingly "wedge in" between them.
That is, if you first run a verification query against pg_locks, look at the result, and only then decide whether to take the lock, nothing guarantees that nobody manages to occupy the object you need between these two operations. But if you use pg_try_advisory_lock, you either get the lock immediately or the function simply returns FALSE.
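A minimal sketch of the difference (the lock key 12345 is an arbitrary example value):

```sql
-- Racy: checking and locking as two separate steps
SELECT 1 FROM pg_locks
WHERE locktype = 'advisory' AND objid = 12345;
-- ...another session may grab the lock right here...
SELECT pg_advisory_lock(12345);

-- Atomic: check and acquisition in a single call
SELECT pg_try_advisory_lock(12345); -- TRUE if acquired, FALSE otherwise
```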
Failed acquisition without exceptions or waiting
"Normal" interlocks exist in the model "If you asked for a lockout, you wait.If you don’t want to wait (
lock_timeout ) – here’s an exception" This approach gets in the way a lot within a transaction, because then you either have to implement the block
BEGIN-EXCEPTION-END for processing, or rollback (
ROLLBACK ) transaction.
The only way to avoid this behavior is to use the SELECT ... SKIP LOCKED construct, available since version 9.5. Unfortunately, with this method the cases "there was nothing to lock at all" and "there was, but it is already locked" become indistinguishable.
Advisory locks acquired via the try-functions, on the other hand, simply return TRUE or FALSE.
Do not confuse pg_advisory_lock and pg_try_advisory_lock: the first function will wait until it obtains the lock, while the second will simply return FALSE when the lock cannot be acquired "right now".
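To illustrate with two sessions (the lock key 1 is an arbitrary example):

```sql
-- Session A
SELECT pg_try_advisory_lock(1); -- t: the lock is acquired

-- Session B
SELECT pg_try_advisory_lock(1); -- f: returns immediately
SELECT pg_advisory_lock(1);     -- hangs until session A releases the lock
```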
Locks inside and outside a transaction
As I mentioned above, object locks are "tied" to a process and exist only within the current transaction running in it. You can't even take one outside a transaction:
LOCK TABLE tbl;
-- ERROR: LOCK TABLE can only be used in transaction blocks
Accordingly, when a transaction completes, all the locks taken in it are released. Advisory locks, in contrast, were originally designed with the ability to hold a lock outside a transaction as well:
SELECT pg_advisory_lock(1);
SELECT * FROM pg_locks WHERE pid = pg_backend_pid() AND locktype = 'advisory';
-[ RECORD 1 ]------+--------------
locktype           | advisory
database           | 263911484
relation           |
page               |
tuple              |
virtualxid         |
transactionid      |
classid            | 0 <-- argument int4 #1 or upper half of int8
objid              | 1 <-- argument int4 #2 or lower half of int8
objsubid           | 1
virtualtransaction | 416/475768
pid                | 29264
mode               | ExclusiveLock
granted            | t
fastpath           | f
But since version 9.1 there are xact versions of the advisory functions, which implement the behavior of "regular" locks: they are automatically released at the end of the transaction that took them.
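The contrast between the two flavors, in a short sketch (key 1 is an arbitrary example):

```sql
BEGIN;
SELECT pg_try_advisory_xact_lock(1); -- t
COMMIT; -- the xact lock is released automatically, no unlock call needed

SELECT pg_advisory_lock(1);   -- session-level: survives COMMIT
-- ...and must be released explicitly:
SELECT pg_advisory_unlock(1);
```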
Examples of use in VLSI
Actually, just like any other locks, advisory locks are used to ensure exclusive processing of some resource. Usually such a resource is either a whole table or a specific table record that you do not want to "hard-lock" for some reason.
If the need to process some data in the database is triggered by an external event, but parallel processing is redundant or may lead to a race condition, it is reasonable to arrange things so that only one process handles it at a time.
To do this, let's try to take a lock with the table OID as the first parameter and the ID of the specific handler type as the second:
SELECT pg_try_advisory_lock(
  'processed_table'::regclass::oid
, -1 -- worker type key
);
If we got FALSE back, someone else is already holding "this" lock; this particular process has nothing to do, and the best thing for it is to just quietly exit. This is how, for example, the dynamic calculation of the cost of goods in stock works, isolated for each individual client schema.
Parallel queue processing
Now the "reverse" problem – we want to tasks in some queue table. as fast as possible, multithreaded, fault-tolerant and from different business-logics (well, we lack capacity of one) – for example, as our operator on e-reporting transmissions. To government agencies or OFD service
Since the "processing" BPs are different servers, no mutex can be "hung up". Allocating some special task-distributing coordinator process is not safe; if it dies, everything will stop. So, it turns out that the most effective way to distribute tasks is directly on the database level. Dmitry Koterov and then creatively refined.
In this case we take a lock on the table OID and the PK of a particular record:
SELECT
  *
FROM
  queue_table
WHERE
  pg_try_advisory_lock('queue_table'::regclass::oid, pk_id)
ORDER BY
  pk_id
LIMIT 1;
That is, the process gets the first record from the table not yet locked by its competitors.
However, if the PK consists not of a single (integer) but of an (integer, integer) pair (as in the same cost calculation, for example), you can take the lock directly on this pair – a collision with a "competitor" is then hardly possible.
Important! Don't forget to periodically and properly maintain your queue table!
Exclusive document processing
We use this everywhere in our document-management solutions. After all, in a distributed web system the same document can be opened for viewing by several users simultaneously, but only one of them may process it (change its state, etc.) at any given moment.
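A sketch of this pattern, assuming a hypothetical documents table (the names 'documents' and the id 100 are example values, not our actual schema):

```sql
-- User 1 opens the document for editing
SELECT pg_try_advisory_lock('documents'::regclass::oid, 100); -- t: may edit
-- ...change the document state...
SELECT pg_advisory_unlock('documents'::regclass::oid, 100);

-- User 2, trying the same while user 1 holds the lock,
-- gets FALSE and can show the document as "busy" instead of waiting
SELECT pg_try_advisory_lock('documents'::regclass::oid, 100); -- f
```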
Problems? Where would we be without them! Almost all of them boil down to one thing: failing to unlock what you locked.
Taking the same advisory lock multiple times
RTFM, as they say: multiple lock requests stack, so if one resource has been locked three times, it must be unlocked three times before it becomes available to other sessions.
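This stacking is easy to demonstrate within a single session (key 42 is an arbitrary example):

```sql
SELECT pg_advisory_lock(42);
SELECT pg_advisory_lock(42);   -- same session: succeeds, now held twice
SELECT pg_advisory_unlock(42); -- t, but the lock is still held once
SELECT pg_advisory_unlock(42); -- t, now it is actually released
SELECT pg_advisory_unlock(42); -- f + WARNING: the lock is no longer held
```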
Taking too many locks at once
There are thousands of them! Again, read the manual: both advisory and regular locks are stored in a shared memory pool whose size is defined by the configuration parameters max_locks_per_transaction and max_connections. It is important that this memory be sufficient, since otherwise the server will not be able to grant any locks at all. Thus, the number of advisory locks a server can grant is usually limited to tens or hundreds of thousands, depending on the server configuration.
In general, if you find yourself wanting to take a few thousand locks at once (even if you release all of them correctly afterwards), think hard about what will happen when the server runs out of lock memory.
Leaks when filtering records
Now let's take the previous query and add a harmless-looking condition like an ID parity check: AND pk_id % 2 = 0. Both conditions will be checked for each record! As a result, pg_try_advisory_lock is executed, the lock is taken, and only then the record is filtered out by the parity check.
Or an example from the manual:
SELECT pg_advisory_lock(id) FROM foo WHERE id = 12345; -- ok
SELECT pg_advisory_lock(id) FROM foo WHERE id > 12345 LIMIT 100; -- dangerous!
That's it – the lock remains and we don't even know about it. This is cured by writing queries correctly; in the worst case, by closing the connection, which releases all of its locks.
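The manual's own fix for the dangerous variant is to force the LIMIT to be applied before the locking function runs, by pushing the row selection into a subquery:

```sql
SELECT pg_advisory_lock(q.id)
FROM (
  SELECT id FROM foo WHERE id > 12345 LIMIT 100
) q; -- ok: the 100 rows are chosen first, then locked
```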
Oops, I mixed them up!
A classic of the genre... You mix up pg_try_advisory_lock and pg_advisory_lock and wonder why the call takes so long – because the non-try version waits until the lock is released. Or you mix up pg_try_advisory_lock and pg_try_advisory_xact_lock and wonder where the lock has gone – it "expired" together with the transaction. And the transaction consisted of that single query, because it wasn't declared "explicitly" anywhere, yeah.
Working with pgbouncer
This is a separate source of pain for many: for the sake of performance, work with the database goes through pgbouncer in transaction mode.
This means that two of your adjacent transactions running over the same connection "to the database" (which actually goes through pgbouncer) may end up running in different "physical" connections on the database side. And each of those has its own locks...
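You can observe this yourself: in transaction pooling mode each statement outside an explicit transaction is its own transaction and may be routed to a different server backend (the pids below are made-up example values):

```sql
-- Run over a pgbouncer connection in transaction mode:
SELECT pg_backend_pid(); -- e.g. 29264
SELECT pg_backend_pid(); -- may return a different pid!
-- A session-level advisory lock taken by the first statement
-- belongs to that first backend, not to "your" client connection.
```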
There are few options here:
- either switch to a direct connection to the database,
- or design the algorithm so that all locks live only within a transaction (the xact versions).
That’s all for now.