Postgres queue tech is a thing of beauty, but far from mainstream. Its relative obscurity is partly attributable to the cargo cult of “scale”. The scalability cult has decreed that there are several queue technologies with greater “scalability” than Postgres, and for that reason alone, Postgres isn’t suitably scalable for anyone’s queueing needs. The cult of scalability would rather we build applications that scale beyond our wildest dreams than ones that solve real problems beyond our wildest dreams. Postgres’ operational simplicity be damned; scale first, operate later.
But some intrepid technologists, such as those at webapp.io, have risked excommunication: their product relies on Postgres queues for core functionality. Companies like webapp.io are an exception to the norm, recognizing that sometimes other principles outweigh “scalability”. When the cult of scalability fractures, the fractures are usually small, but they congeal around new principles like operational simplicity, maintainability, understandability, and familiarity. Often they congeal around novel principles like reusing old tech in new ways, or using Postgres for queues. You, too, should dare to risk excommunication from the cult of scalability.
What is Postgres queue tech?
Postgres queue tech consists of two parts: announcing and listening for new jobs (pub/sub) and mutual exclusion (row locks). Both have been available out of the box since Postgres 9.5, released in 2016.
By combining NOTIFY and LISTEN, Postgres makes adding pub/sub to any application trivial. In addition to pub/sub, Postgres also provides one-job-per-worker semantics with FOR UPDATE SKIP LOCKED. Queries with this suffix acquire row locks on matching records and ignore any records for which locks are already held. Applied to job records, this feature enables simple queue processing queries, e.g. SELECT * FROM jobs ORDER BY created_at FOR UPDATE SKIP LOCKED LIMIT 1.
Combined, these two features form the basis for resource-efficient queue processing. Importantly, SKIP LOCKED provides an “inconsistent” view of one’s data. That inconsistency is exactly what’s needed from a queue: jobs already being processed (i.e. row-locked) are invisible to other workers, offering distributed mutual exclusion. These locks pave the way for both periodic batch processing and real-time job processing by NOTIFYing LISTENers of new jobs.
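To make the one-job-per-worker semantics concrete, here is a toy in-memory model in Python. It is only a sketch of the behavior described above, not a Postgres client: every name in it (Job, JobTable, claim_next, complete, abandon) is invented for illustration, and a threading lock stands in for the row lock a transaction would hold.

```python
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    id: int
    created_at: int  # logical timestamp; smaller means older
    payload: str
    # Stands in for the row lock a transaction would hold in Postgres.
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


class JobTable:
    """Toy stand-in for a `jobs` table (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self._jobs: List[Job] = []
        self._guard = threading.Lock()  # protects the list itself

    def insert(self, job: Job) -> None:
        with self._guard:
            self._jobs.append(job)

    def claim_next(self) -> Optional[Job]:
        """Model of: SELECT ... ORDER BY created_at FOR UPDATE SKIP LOCKED LIMIT 1."""
        with self._guard:
            for job in sorted(self._jobs, key=lambda j: j.created_at):
                # Non-blocking acquire: a held lock means another worker owns
                # this row, so skip it instead of waiting for it.
                if job.lock.acquire(blocking=False):
                    return job
        return None  # every remaining job is locked, or the table is empty

    def complete(self, job: Job) -> None:
        """Model of deleting the row and committing, which releases the lock."""
        with self._guard:
            self._jobs.remove(job)
        job.lock.release()

    def abandon(self, job: Job) -> None:
        """Model of a rollback or crashed worker: the job becomes visible again."""
        job.lock.release()
```

Two workers calling claim_next() one after the other receive different jobs, and a third call returns None while both are still locked. That is the “inconsistent” view at work: locked jobs simply do not exist as far as other workers are concerned, and an abandoned job reappears without any extra bookkeeping.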
Despite these Postgres features having many users, there are relatively few public advocates for combining them as a queue backend. For example, this Hacker News comment stated that using Postgres this way is “hacky”, and the commenter received no pushback. I found the comment to be a load of BS and straw man arguments. This thinking seems to be “the prevailing wisdom” of the industry: if you want to talk about queue technology in public, it had better not be a relational database. This industry of cargo cults has little appetite for pushing back on whatever wisdom is “prevailing”. I hope to disabuse anyone of the notion that Postgres is an inferior queue technology.
We’ll use “background jobs” as the pretext for this discussion, since adding background job processing to applications is a common decision made by developers, and one that may have far-reaching implications for system maintenance burden. We can think of “background jobs” as any type of potentially long-running task, such as “generate a report and email it to a customer” or “process an image and convert it to several other formats”. These types of use cases usually necessitate queues.
The background job landscape
Like all technology choices, choosing how to process long-running tasks is a choice with many tradeoffs. In the past decade, the tech industry has seemingly come to a consensus that there are a few good tools for queuing long-running tasks for processing:
- Redis: a beautiful in-memory data store and “message broker”^ that is the backend for many popular background job libraries
- Apache Kafka: a distributed “event streaming platform” maintained by the Apache Foundation
- RabbitMQ: allegedly the most commonly deployed “message broker”^
- Amazon SQS: an Amazon SaaS product for highly scalable queues
My apologies if I’ve excluded your favorite(s); this is not intended to be exhaustive.
^ “Message broker” simply means that a queue system does other fancy stuff on top of being a queue, but for our discussion, let’s consider message brokers to be queues. There are several words and phrases that I consider effectively synonymous with “queue” and “queue processing”: “message broker(ing)”, “stream processing”, “streaming data”, etc. I’m aware that these mean specific things that are not exactly “queue” or “queue processing”.
I think it’s important to stop here and discuss Redis’ significance in the world of “background jobs”. If you browse the background jobs GitHub topic, the top five most popular libraries are all backed by Redis.
There’s a reason for this: because Redis stores data in memory, both its insertion and retrieval speeds are phenomenal. It also has a pub/sub API built in and a native set data structure which, when combined, make for an excellent queue. Redis scales. For many developers, that scalability has made it the default choice, and defaults are profoundly powerful.
But before choosing Redis because it scales well, consider this quote from Ben Johnson’s I’m All-In on Server-Side SQLite. It’s specifically talking about database scalability, but the point holds for scaling all types of infrastructure, like queues:
“When we think about new database architectures, we’re hypnotized by scaling limits. If it can’t handle petabytes, or at least terabytes, it’s not in the conversation. But most applications will never see a terabyte of data, even if they’re successful. We’re using jackhammers to drive finish nails.”
As an industry, we’ve become absolutely obsessed with “scale”, seemingly at the expense of all else, like simplicity, ease of maintenance, and reducing developer cognitive load. We all want to believe that we’re building the next thing that will demand Google, Facebook, or Uber scale, but in reality, we’re almost always not. Our technology choices should reflect that fact. We’re more likely building for relatively small scale, and should be optimizing our choices around a completely different set of factors that have more to do with team composition than technological superiority.
When we’re starting projects and businesses, we should be optimizing for everything but scale at the outset. Of course, we don’t want to back ourselves into a corner with technology choices, but we also don’t want to build Kubernetes clusters to serve marketing sites for products that are likely to fail for every reason but the fact that they don’t scale well. We should be thinking about what technologies we know well, what’s good enough, and what’s the least toilsome solution that meets user needs and our team’s skill sets. Be comfortable with choosing “good enough” over “the best”; sometimes “the best” is simply a harder path to inevitable failure. List in your head every product that failed because it couldn’t scale. There’s a much longer list of products that failed long before they needed to.
What hasn’t been said yet is that Postgres actually does scale well. But Postgres is general-purpose software, and it’s not going to be “the best” at scaling for queue use cases. It’s going to perform pretty well for that use case, just like it performs pretty well doing everything that it does.
If you’re here and feel like you’ve seen enough of what I have to say, feel free to abandon this page and scroll through Dan McKinley’s Choose Boring Technology slide deck. I’m confident that whether you finish this post or Dan’s slide deck, you’ll make similar choices when it comes to your next queue technology choice. After all, Dan’s “Choose Boring Technology” talk was the inspiration for this post’s title.
The first question to ask when making technology choices is: what technologies am I currently using and understand well?
The answer to this question informs the “cost” of choosing technologies for your software stack. Technologies already in use are, presumably, low-cost, assuming they’re well understood.
There’s a good chance that you’re already using a relational database, and if that relational database is Postgres, you should consider it for queues before any other software. If you’re not using Postgres, you should consider whatever is the most boring technology to you before considering anything else.
Technologies not (yet) in use are more expensive.
In other words, boring technology is relative to what’s already in use. Applications that are oriented around message-passing, like notification systems, may consider RabbitMQ boring technology. Caching applications may consider Redis boring technology. Applications with a significant amount of relational data may consider Postgres boring technology. The maximally boring choice is likely the right one for you and your team.
If you’re not already using Redis, Kafka, RabbitMQ, or SQS, adopting any one of them only for background jobs is expensive. You’re adding a new system dependency to every development, test, and production environment, likely for the rest of the application’s life. A new set of skills is now required of every future developer, DBA, and/or SRE hire on the team. Now they need to know these new systems’ intricate failure modes and configuration knobs. Job candidates need to be comfortable that learning this new technology is a worthwhile time investment. DBAs/SREs need to know how to recover from operational failure, diagnose problems, and monitor performance. There’s a lot to know, and there’s a lot that no one on the team realizes they need to know. These systems’ unknown unknowns are a risk, especially if these systems are a default choice for you and you haven’t put a lot of thought into why they’re your default choice.
This is not all to say that the most boring technology is a panacea, Postgres included. What one gives up for familiarity, known failure modes, and amortized “cost” may be paid for in performance, or some other important principle. After all, pushing and popping from a Postgres queue is significantly slower than Redis. Using Postgres for queues may mean that instead of having a single relational database on a single server, applications now require an “application” database and a “queue” database. It may even mean an entirely separate database server for background jobs, so background jobs are independently scalable. It may mean databases need to be VACUUMed more frequently, incurring a performance hit in the process. There are many implications that one should consider before adopting Postgres for queues, and they should be weighed against team and application needs so that an informed decision can be made. Postgres shouldn’t be a default choice. Similarly, neither should Redis, Kafka, RabbitMQ, SQS, or any other distributed queue. Choosing boring technology should be one’s default choice.
Technology choices are tradeoffs all the way down. I found that Dagster had a reasonable approach to adopting Postgres for their queues. When in doubt, consider the following an axiom:
If and only if boring technology is provably unable to meet demands should alternatives be considered.
Build with escape hatches
Earlier I mentioned “not getting backed into a corner”. With respect to background jobs, that means application code for processing jobs should be queue-agnostic.
One day’s cutting-edge tech is another day’s boring tech. As applications grow and success is achieved, new technologies tend to get bolted on to applications out of necessity. It’s common to add memcached or Redis as caching layers (but do consider Postgres unlogged tables first!). That means these technologies become “boring” over time, reducing their cost and changing the calculus for using them as queues.
Building with escape hatches is all about abstraction. Earlier I mentioned the top five most popular background job libraries on GitHub. Except for Hangfire, none of these libraries provide an escape hatch to queue technologies other than Redis. That means switching queues requires rewriting application code, because there’s no significant abstraction in front of the underlying queue.
It shouldn’t be that way. Queue tech should be abstracted away so users can choose the right queue for the job. I’m not a Hangfire (or C#) user, but Hangfire seems to have gotten the abstraction right.
It was with the preceding philosophy of choosing boring tech and building with escape hatches that I built Neoq (https://github.com/acaloiaro/neoq). Neoq queues can be in-memory, Postgres, or Redis (contributions for your preferred boring tech welcome!). Users can switch between queues without changing any application code: simply initialize it with a different queue backend. Neoq is more abstraction than it is concrete implementation. While both the in-memory and Postgres implementations are first-party, the Redis implementation is asynq. It’s more about providing escape hatches than locking developers into a particular underlying queue technology.
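As a rough illustration of what “queue-agnostic” application code can look like, here is a minimal Python sketch. It is not Neoq’s or Hangfire’s API; the Queue interface, MemoryQueue backend, and request_report function are all hypothetical names chosen for this example. The point is only that application code talks to an interface and the backend is chosen at initialization time.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Callable, Deque, Dict

Payload = Dict[str, str]
Handler = Callable[[Payload], None]


class Queue(ABC):
    """Hypothetical backend-neutral interface; not the API of any real library."""

    @abstractmethod
    def enqueue(self, payload: Payload) -> None:
        """Add one job to the queue."""

    @abstractmethod
    def process(self, handler: Handler) -> int:
        """Run handler on each pending job; return how many were handled."""


class MemoryQueue(Queue):
    """In-process backend. A Postgres or Redis backend would subclass Queue too."""

    def __init__(self) -> None:
        self._pending: Deque[Payload] = deque()

    def enqueue(self, payload: Payload) -> None:
        self._pending.append(payload)

    def process(self, handler: Handler) -> int:
        handled = 0
        while self._pending:
            handler(self._pending.popleft())  # oldest job first
            handled += 1
        return handled


def request_report(queue: Queue, customer: str) -> None:
    """Application code: depends on the Queue interface, not on a backend."""
    queue.enqueue({"task": "email_report", "customer": customer})
```

Because request_report only knows about Queue, moving from the in-memory backend to a database-backed one would change the line that constructs the queue and nothing else. That is the escape hatch.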
I’d like to see more neoq-like libraries for languages other than Go. I think the lack of software libraries with escape hatches is what backs a lot of developers into a corner, forcing them to start simple projects with a Redis dependency long before Redis is warranted. Redis is excellent, but it’s not always the right queue, or the right amount of complexity, for the job. The same goes for Kafka, RabbitMQ, and SQS.
Choose Postgres queue tech
I hope this post encourages others to risk excommunication from the cult of scale the next time they’re choosing queue technology. There are so many important principles that are not “scale” to consider when choosing technologies. Make boring technology your default choice, and choose Postgres if it bores you.