30K Messages Per Second Queue | Scaling Postgres 295

Episode 295 | December 17, 2023 | 00:15:32

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss a 30K messages per second queue built on Postgres, a zero downtime logical replication upgrade, the release of the Supavisor 1.0 connection pooler, and PostGIS Day.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/295-30k-messages-per-second-queue/

 


Episode Transcript

[00:00:00] Hey, do you have a queue system for your application? Do you use RabbitMQ, or maybe Kafka? Well, would you consider using Postgres? In this episode we discuss a high performance Postgres queue. But I hope you, your friends, family and coworkers continue to do well.

Our first piece of content is "Yeeting over 30,000 messages per second on Postgres with Tembo MQ". This is from tembo.io, and they are talking about the performance of their queue system that runs on Postgres as an extension called PGMQ. They just released version 1.0, and these are some of the highlights. Number one, the API is stable now. All the PGMQ objects now live in a pgmq schema. The compiling requirements have been relaxed. They added an unlogged queue type to give you even more performance - but, as they say, at the expense of durability. And of course, updated docs. Now, I don't know if the numbers they're reporting here are for logged or unlogged queues. My assumption is that they were logged, because they don't explicitly say they were unlogged. But they've got a number of metrics that they're testing and showing here. This is a queue system that we've discussed in previous episodes of Scaling Postgres, so you can look at some of those previous episodes, or the blog posts on their website, to learn more about it. But this is basically a performance perspective on the 1.0 release.

In the first example they're looking at here, they're using batches of 100 items that need to be queued, and each message is super small, only 22 bytes - the equivalent of saying username equals Elvis, for example. So really, really short. With their queuing system they could write to the queue at 150,000 messages per second and read from the queue at 30,000 messages per second. The instance size is 16 vCPUs with 30GB of memory, and they did all of these reads and writes using 20 concurrent writers and ten concurrent readers, or consumers, of the batches. Now, they made different changes to help achieve this performance. They're vacuuming very aggressively, and they set shared_buffers to 60% of system memory, which is kind of crazy considering the normal recommendation is 25%. So this was, I think, on their cloud platform offering - this is an open source extension, and we'll look at the GitHub in a second, but I think this was for their optimized all-in-one solution.

[00:02:34] The next thing they looked at is increasing the message size, because 22 bytes is hardly anything for a message in a queue. So they decided to test a 1,000 byte, or one kilobyte, message size, because they said that's generally what is recommended for Apache Kafka. With that, they were still able to sustain 30,000 messages per second on reads, but it dropped down to 30,000 messages per second on writes. A lot of the reason for that is that they also dropped the batch size from 100 down to ten because of the larger message size, but they could still read 100 off of the queue at that larger message size.
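If you haven't seen PGMQ before, here is a minimal sketch of what this kind of batched produce-and-consume workload looks like through its SQL API. The queue name, payloads and visibility timeout are made up for illustration, so check the PGMQ documentation for the exact function signatures in the version you install.

```sql
-- Install the extension and create a queue
-- (in 1.0, all PGMQ objects live in the pgmq schema).
CREATE EXTENSION IF NOT EXISTS pgmq;
SELECT pgmq.create('orders');

-- Producer: send a batch of small JSON messages in a single call.
SELECT pgmq.send_batch(
  'orders',
  ARRAY['{"username": "elvis"}'::jsonb,
        '{"username": "priscilla"}'::jsonb]
);

-- Consumer: read up to 100 messages, hiding them from other
-- consumers for 30 seconds (the visibility timeout).
SELECT * FROM pgmq.read('orders', 30, 100);

-- Delete a message once it has been processed,
-- using the msg_id returned by pgmq.read().
SELECT pgmq.delete('orders', 1::bigint);
```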
The next thing they looked at is: okay, what if you don't want to do batching and you just want to send a single message? Some single event has happened, you need to throw it in the queue and do something with it. That's what this looks like: a batch size of one, so it's just one message at a time. They had 50 writers, or producers as they also call them, that were able to write 10,000 messages per second into the queue, and 100 consumers, or readers, reading one message at a time got up to 7,600 messages per second out of the queue. I still think that's pretty good. The next thing they looked at is a smaller instance with just one virtual CPU and 4GB of memory. They reduced the number of writers and readers and the batch size, but it was still doing thousands of messages per second.

[00:04:04] So again, I believe these numbers are from their Tembo Cloud product, where they basically wrap everything up for you, but to my knowledge it is available as an extension that you can install in your own Postgres installation. Here's the link to the GitHub, and the license, as you can see here, is basically the PostgreSQL license, so that should be fine for most people already using Postgres. They offer it as a Docker container, or you can build it from source. So if this is of interest to you, definitely check out this piece of content.

[00:04:36] Next piece of content: "Zero downtime Postgres upgrades". This is from knock.app, and they're describing their upgrade from 11.9 to 15.3 with zero downtime using logical replication. This is a super long post with a lot of detail. Some of it is redundant, but it is really great content, and if you're thinking about doing a logical replication upgrade, I would definitely review this blog post, because they have a lot of recommendations on how to handle it. I'll hit some key points here. Basically, they used logical replication to do the upgrade. They brought up an entirely new server - they're using Aurora in this case, but of course you could do the same with plain Postgres. They then got the old database in sync with the new database using publishers and subscribers, and at a given point in time they transitioned to using the new instance. They had different techniques depending on the tables they wanted to migrate. For their small tables, they set up a publication and they set up the subscription on the subscriber side, but they added one table at a time to the publication to minimize the amount of reads and writes going on. So they alter the publication they established to add the new table to it, and then on the new database you alter the subscription to refresh it, so it can see that table and start synchronizing it. That's how they did all of their small tables; a sketch of the commands involved follows below.
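This is core Postgres publication and subscription DDL, so a rough sketch of that table-at-a-time approach looks something like the following. The publication, subscription, table and connection details are hypothetical placeholders.

```sql
-- On the old (source) database: start with an empty publication,
-- then add tables to it one at a time.
CREATE PUBLICATION upgrade_pub;
ALTER PUBLICATION upgrade_pub ADD TABLE small_table_1;

-- On the new (target) database: create the subscription once,
-- then refresh it after each table is added so its initial copy starts.
CREATE SUBSCRIPTION upgrade_sub
  CONNECTION 'host=old-db dbname=app user=replicator'
  PUBLICATION upgrade_pub;

ALTER SUBSCRIPTION upgrade_sub REFRESH PUBLICATION;
```

Repeating the ALTER PUBLICATION ... ADD TABLE plus ALTER SUBSCRIPTION ... REFRESH PUBLICATION pair for each table keeps the initial data copies small and staggered, rather than syncing everything at once.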
For the large tables that were primarily append-only - so they were large, but only recent data was changing - this is the technique they used. They still used the same technique as for the small tables, except they did not do the initial copy of the data, so they were just synchronizing the new data. Then they took a snapshot of the old database, and data was replicated to the new database in the background by logical replication even while new data was being inserted on the old database. So this definitely required a bit of juggling. They also had cases of large tables with many updates across most of the rows, so they weren't able to use this technique there, but I think they just got through it by monitoring the replication to make sure it didn't negatively impact the system.

Now, they did note that if you're on Postgres 15 or greater and doing a logical replication upgrade, you can use the new row filters in logical replication. I've actually used this for some of the logical replication upgrades I've done. Where you have a multi-terabyte table, you can create multiple publishers and subscribers that each bring over only a portion of that table at a time, using a WHERE clause to bring over a quarter of it, or half of it, or whatever you choose - roughly as sketched below.
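For example, on Postgres 15 or later a large table can be split across several row-filtered publications, each with its own subscription, so the initial copies run as separate parallel streams. The names, the id ranges and the connection string here are made up; note that row filters on tables receiving updates or deletes can only reference replica identity columns, typically the primary key.

```sql
-- On the source: two publications, each carrying part of one big table.
CREATE PUBLICATION big_table_pub_1
  FOR TABLE big_table WHERE (id <  500000000);
CREATE PUBLICATION big_table_pub_2
  FOR TABLE big_table WHERE (id >= 500000000);

-- On the target: one subscription per publication, giving two
-- independent copy and apply streams for the same table.
CREATE SUBSCRIPTION big_table_sub_1
  CONNECTION 'host=old-db dbname=app user=replicator'
  PUBLICATION big_table_pub_1;
CREATE SUBSCRIPTION big_table_sub_2
  CONNECTION 'host=old-db dbname=app user=replicator'
  PUBLICATION big_table_pub_2;
```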
And there's a technique you can use on earlier versions to bring over really large tables that wasn't mentioned in this blog post: setting up logical replication using a snapshot, which we covered in a previous episode of Scaling Postgres. Basically, you take a snapshot as if you were doing a backup, you set up your logical replication using that snapshot, then you take a backup using pg_dump, COPY or some other technique, restore it on the subscriber database, and enable replication at that point. I actually had to use that technique for a multi-terabyte database to get it replicating quickly, and I basically ran ten parallel streams for a large multi-terabyte table to replicate it much faster than one publisher-subscriber stream could. So that's another technique you could use. They also talked about monitoring the replication status to make sure things are okay, how you can abort the replication of a particular table if you need to and re-add it to restart it, and their process for actually doing the switchover in their application. So this was a great post with a lot of detail that I definitely suggest you check out if you're looking to do a logical replication upgrade.

[00:08:39] Next piece of content: "Supavisor 1.0: a scalable connection pooler for Postgres". This is from supabase.com, and of course they're talking about Supavisor version 1.0. The highlights of this version are query load balancing, named prepared statement support and query cancellation. The first one is load balancing. Basically, read requests can be sent to any Postgres instance in the cluster - and what they mean by cluster is presumably the primary plus however many replicas you have. It says, quote, it randomly distributes these read requests across the entire Postgres cluster, and it targets write operations to the primary automatically by probing the read replicas until it hits the primary, which is what you can do with the libpq client. Now, looking at this, the concern I have is that I know a lot of applications do a write and then immediately do a read for some things, so I'm always wondering: is that read going to hit a replica before the data has arrived? For that reason I normally see applications dedicate certain reads to the replicas and certain reads to the primary. It looks like that's the question they're trying to address with read-after-writes here, and they say, quote, it's easy to guarantee read-after-writes from the same client by wrapping the read and write in a transaction. So it sounds like if you're going to read something right after you've written it, you have to rely on putting both in a transaction to make sure the data is there, as opposed to hitting a replica that hasn't received it yet. But there's a future plan to support setting the server to the primary to make sure those reads go to the primary. Then, of course, they have the named prepared statement support that we've seen in other connection poolers, so that's great. And query cancellation apparently wasn't working before, which I guess is a thing now. So if you're interested in trying Supavisor, definitely check out this blog post.

Next piece of content: "PostGIS Day 2023 Summary". This is from crunchydata.com. This was held a couple of weeks ago, and the post lists all the different presentations and the things that were discussed at the PostGIS Day that Crunchy Data put on. So if you're interested in that, definitely check out this blog post.

[00:10:55] Next piece of content: "Updates on trademark actions against the PostgreSQL community". This is from postgresql.org, and apparently the organizations that were having the trademark dispute have, quote, reached an amicable resolution. So that's great. Fundación PostgreSQL has surrendered all trademarks and entered into a trademark license agreement with the PostgreSQL Community Association.

[00:11:19] Next piece of content: "A sneak peek into the State of PostgreSQL 2023". This is from timescale.com. They ran their State of Postgres survey between August 1 and September 15 of this year. The full survey is being prepared for release later, but this is an early preview, and they address a few questions, such as the primary geographical location of the survey respondents, how long you've been using PostgreSQL, and your current profession or job status - a lot of different types of engineers at the top here, followed by management, and then finally database administrators. Then: have you ever contributed to PostgreSQL, rate your experience with it, and a question on AI in terms of your thoughts about it and why you would use it with PostgreSQL. The number one reason is not surprising: basically, keeping your vector data and your relational data in the same place is why people would use Postgres for AI work.

[00:12:18] And this is also interesting: the top extensions people are using. PostGIS is number one - I would have thought it would have been pg_stat_statements. Then there's TimescaleDB, which I guess is because the respondents are predominantly people who interact with Timescale. UUID support being there is unsurprising. pgcrypto - it seems more people than I thought are using pgcrypto; I haven't used it yet. pg_trgm I expected to be a little higher, but then we've got the postgres_fdw foreign data wrapper, which makes sense, pg_repack, pg_cron and Citus. So you can check out this blog post if you want to find out more about the preview.

[00:12:56] Next piece of content: there was another episode of Postgres FM last week. This one was on constraints, and they talked about the six different types of constraints in Postgres: check constraints, not-null constraints, unique constraints, primary key constraints, foreign key constraints and, lastly, exclusion constraints. So if you want to learn all about those, you can listen to the episode here or check out the YouTube video down here.

Next piece of content: "Understand table statistics using pg_stat_all_tables". This is from stormatics.tech. They show the layout of the pg_stat_all_tables view, but also some different questions you can answer with it, such as how to identify the tables with the highest frequency of sequential scans in the database - maybe you want to add some indexes - and how to identify unused or infrequently accessed tables in PostgreSQL, which maybe you don't need anymore.

[00:13:50] Next is how to check the write activity of tables in Postgres, and then how to determine the number of live and dead tuples in a table and check their vacuum status. This is usually the number one reason I use this view: to check how often a table is being vacuumed. Is it enough? Is it perhaps too much? So I tend to look at it for that purpose, but check this out if you want to learn more.
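As a rough illustration of the kinds of questions you can answer with this view, here are two example queries against pg_stat_all_tables. The schema filter and LIMITs are arbitrary and just for illustration.

```sql
-- Tables with the most sequential scans: candidates for new indexes.
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY seq_scan DESC
LIMIT 10;

-- Dead tuples and vacuum activity: is autovacuum keeping up?
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, autovacuum_count
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY n_dead_tup DESC
LIMIT 10;
```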
[00:14:12] The last piece of content is "Specialized CPU instructions". This is from momjian.us. We previously discussed a Timescale blog post about using SIMD, or single instruction, multiple data, CPU instructions to accelerate, I think, some parallel compression. Apparently Postgres 16 uses SIMD intrinsic functions as well, on the x86-64 and ARM CPU platforms, and the operations include doing copies, setting to a constant, testing for equality, and testing for less than a value. He says most of these are currently used at the byte level, but some may also process four-byte integers, doing different finding and checking functions. So this is interesting - I hadn't heard about this before. If you want to learn more, you can check out this blog post.

[00:15:07] I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can sign up to receive weekly notifications of these episodes, as well as get the links for all the content discussed, a transcript of the show, and the podcast version. Thanks, and I will see you next week.
