Episode Transcript
[00:00:00] In this episode of Scaling Postgres, we talk about squeezing your system, 1 million connections, indexing LIKE, and pgvector HNSW performance. I'm Creston Jameson, and this is Scaling Postgres episode 278.
[00:00:25] All right, I hope you, your friends, family and coworkers continue to do well. Well, I'm planning to make some changes to the Scaling Postgres episodes as they come out. I'm thinking about potentially doing interviews with people in the community, or people who have written some of this content, to have a more in-depth discussion, as well as reducing the number of posts that I cover each week, maybe to no more than ten. I'll still post all the links for the content I found, but have a smaller set to focus on. But I would really like to get your feedback. So if you have suggestions on how I can improve the show, please put them in the comments or reply to the email that gets sent out with the Scaling Postgres email list; I'd really like to hear from you.

But for this week, the first piece of content is "Squeeze the hell out of the system you have". This is from blog.danslimmon.com, and he's talking about a situation that I've heard more than once, where their database was struggling to keep up with the load and the CPU was riding between 60 and 80%. At least once it spiked to 100%, causing a brief outage. So, not good. And they'd been kicking the can down the road, just increasing the database size again and again, and now they're at that point of okay, what do we do next? So they considered looking into sharding, and they considered splitting up the monolith into multiple services, so microservices. But as the title of the post indicates, they decided to squeeze the hell out of their system, because the reality is that those two options increase complexity by a lot. And he advocates here, quote, "always squeeze first". They put two engineers focused on performance enhancements for about three months. They said, quote, there was "no silver bullet", but it looks like they just went a piece at a time, tuning queries and statements as well as tuning a lot of Postgres settings. And two more engineers redirected queries to a replica DB so the primary wasn't burdened with all the read traffic. And this reduced the maximum CPU usage from 90% to 30%. So I love the message this sends. A lot of times, increasing the size of the database gives you sub-100% increases in performance, and maybe with a configuration setting change you can double or triple some performance. But changing how you use the database, what indexes are there, how you're querying it, how you've designed the schema, and what inserts and updates you're doing, can make a 10,000x difference, because multiple times I've found an index that sped up a query 10,000-fold. And speaking of this specifically, I've had client engagements where the CPU was hitting 80 to 90%, and just by going through the top queries of the database and optimizing them, we got it down to maybe an average of 25%. So usually there's a better way to optimize your system. Now, I don't normally look at Hacker News, but this was posted on Hacker News and I found two comments particularly interesting. One person reports essentially the same thing, where they were at 2,000 requests per second, and apparently most of the developers and the PM thought, okay, that's all we can do. He started working on it and got it to over 3,000 RPS, then 4,000 RPS, 10,000 RPS, 40,000 RPS, and then, after changing a whole bunch of other things, he says he got it to 2 million and, a month later, 40 million requests per second.
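As a rough idea of what "going through the top queries" can look like in practice, here's a minimal sketch using the pg_stat_statements extension. This is my own illustration, not something from the post, and it assumes pg_stat_statements has been added to shared_preload_libraries; the column names are the ones used in recent Postgres versions, and the table in the last statement is just a placeholder.

    -- make the extension's view available in the current database
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

    -- find the statements consuming the most total execution time
    SELECT query,
           calls,
           round(total_exec_time::numeric, 1) AS total_ms,
           round(mean_exec_time::numeric, 2)  AS mean_ms
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10;

    -- then look at the plan of a suspect query to see whether an index would help
    -- (substitute one of the queries found above; my_table is a placeholder)
    EXPLAIN ANALYZE
    SELECT count(*) FROM my_table WHERE created_at > now() - interval '1 day';

Sorting by mean_exec_time instead of total_exec_time shows the individually slowest queries rather than the biggest aggregate consumers, and both views are useful when you're trying to bring CPU down.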
So this is anecdotal, but it's definitely possible to drastically increase your performance by changing how you're using the database, and not just getting bigger hardware or partitioning or sharding or things of that nature. The other quote I found interesting is, quote, "loads of over-engineering decisions would be avoided if devs understood how to read explain analyze". Now, I don't necessarily know about that, but I do know that learning how to read EXPLAIN ANALYZE helps you get these thousandfold improvements in performance. But definitely an interesting blog post; I suggest you check it out.

Next piece of content: "Supavisor: Scaling Postgres to 1 million connections". This is from supabase.com, and this is another connection pooler, like PgBouncer and like the also relatively new PgCat. But this, as they say here, is a quote "scalable, cloud-native Postgres connection pooler. It's been developed with multi-tenancy in mind, handling millions of connections without significant overhead or latency." So this is a test where they basically put 1 million connections through Supavisor and connected it to 400 back-end database connections. At first they started with 250,000 connections, and in terms of database performance they could get it up to 20,000 queries per second. And this shows the connections increasing, and it scales in pretty much a linear fashion. Then they increased it to 500,000 connections, again going through the same 400 connections to the database, again at 20,000 queries per second; it just increased the number of client connections, and this also seemed to scale in a linear fashion. And they also showed the load being spread out across the different cores, and as you can see, all of them are being utilized. Of course, this is one of the main disadvantages of PgBouncer: it's single-threaded, so if you wanted a dedicated instance for PgBouncer with this many CPU cores, you'd basically have to run this many PgBouncer services, and it takes more engineering to get it running across multiple cores, as they show here. Then they up it to a million connections, and how they do that is they just add another Supavisor node and send another 500,000 connections through to the Supavisor node that's maintaining the database connections to the Postgres server. So that node handles 500,000 connections itself, but also relays another 500,000; that just happens to be how it's architected. They say this setup successfully handled over a million simultaneous connections to the Supavisor instances, which then had 400 connections to the database, while still achieving 20,000 queries per second, with each connection executing a select query once every 50 seconds. And what they said here is that in a multi-tenant setup, or when using read replicas, load is more evenly spread, because each Supavisor node connects to a given database, but other Supavisor nodes can relay connections to it if they receive the connection from the client. So that's an interesting design perspective that I think is different from PgCat. Now, there was an impact on query duration comparing Supavisor versus PgBouncer: they show that for their tests, queries through Supavisor had an additional two-millisecond delay. So that's something to take into account. This has a lot of advantages, but that was one disadvantage that they did mention. But I love seeing these new connection poolers, particularly ones that allow you to use all the cores of the system. So if you're interested in that, check out this blog post.
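If you want to sanity-check what a pooler is doing on a system like this, one simple thing you can run against the database itself, whichever pooler you use, is a count of server-side connections. This is just a generic sketch of mine, not something from the Supabase post.

    -- run this against Postgres directly, not through the pooler
    SELECT state, count(*)
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
    GROUP BY state
    ORDER BY count(*) DESC;
    -- with transaction-mode pooling, thousands of client connections should
    -- collapse down to a small, fixed number of backends here
    -- (around 400 in the test described above)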
[00:07:10] Next piece of content: "Indexing LIKE in PostgreSQL and Oracle". This is from cybertec-postgresql.com. Normally, if you're going to query a text column and you have a B-tree index on it, using LIKE in a query is not going to use that index. So how do you do that? And they actually take it a step further in terms of showing what different collations can do. If you use the C collation, which is not something I've typically done, the index can be used for prefix searches, because those are really the only kind of LIKE searches an index can support. You can't search for something in the middle of a string the way a B-tree index is laid out; it's essentially like a normal book index, where you look for the first letter, then the second letter, the third letter, et cetera. But with a prefix search, it actually uses the index. The C collation has a disadvantage, though, when you try to order the actual data they have in here: it orders it in unexpected ways. Then they took a look at ICU collations, and those actually ordered things better, but they're not going to use an index when you do a LIKE search. So to actually use an index for a prefix LIKE search, you need to add the text_pattern_ops or varchar_pattern_ops operator class so that an index can be used, because what this does is use character-wise comparison operators, so they match character by character to find the correct entry in the index. Now, they compare to Oracle, and even though Oracle has some advantages, it has some big disadvantages that they list down here, and I'll let you read the post to understand them. But he's basically saying that with the Oracle options, you either get a correct result for ORDER BY and a wrong result for LIKE using one particular collation, or a wrong result for ORDER BY and a correct result for LIKE using a different collation. And he wasn't able to find a way to have an index support a query that gets you both, like you can in Postgres. But if you want to learn more, check out this blog post.
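Just to make the operator class idea concrete, here's a minimal sketch of what the post is describing; the table and column names are made up for illustration.

    -- a plain B-tree index generally can't be used for LIKE
    -- unless the column's collation is C
    CREATE INDEX users_email_idx ON users (email);

    -- an index built with text_pattern_ops supports prefix searches
    -- regardless of the database or column collation
    CREATE INDEX users_email_pattern_idx ON users (email text_pattern_ops);

    -- this prefix search can now use the pattern-ops index
    SELECT * FROM users WHERE email LIKE 'anna%';

    -- but ORDER BY still relies on the regular collation, so you may want
    -- to keep the plain index as well if you sort by this column
    SELECT * FROM users ORDER BY email;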
[00:09:06] Next piece of content: "An early look at HNSW performance with pgvector". This is from jkatz05.com. They're talking about pgvector, which is an extension that allows you to store vector data in Postgres, and that's used for a lot of artificial intelligence or machine learning systems. Storing this vector data allows you to do nearest-neighbor searches to find comparisons or associations between different things. The current version of pgvector includes the ability to index it using IVFFlat. However, in the 0.5 release that's coming up, they've added support for HNSW as well, which stands for hierarchical navigable small worlds. And the benefit of this is that, in exchange for a longer build time and potentially larger indexes, you still get high performance, maybe not as high as IVFFlat, but with much more accurate recall, which he says, quote, is "the percent of relevant results returned". So if you want more relevant results, that would be a high recall. Now, he goes into a lot of depth about the tests that he did, and he emphasized that this is one point in time, from one commit of pgvector and one version of all the different software used to put this together, so take these results with a grain of salt, but it gives an indication of how this new index could be beneficial. He is using an approximate nearest neighbor search benchmark for Python, and he's run that and shown results similar to this. The y-axis is queries per second, so how fast it can run the query, and recall is on the x-axis, so how relevant the search results were. And you can see the pgvector HNSW results, the dark blue, seem very accurate, or have super high recall. It may not be as fast as the existing IVFFlat index, but it's a little hard to discern which point matches up with which point for each index. I'm assuming these are different steps of the benchmark, but I'm not quite sure what each dot means. But HNSW gives you very relevant results in the search and is still pretty quick compared to previous versions, in red here. And he used different data sets in this test; the further results show basically the same thing: greater speed for IVFFlat, but not as high a recall. So if you're using pgvector for doing AI-type activities, maybe you want to check out the new version of pgvector coming out to see if you want to try this new HNSW implementation.
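If you want to experiment with it yourself once 0.5.0 is out, creating an HNSW index looks roughly like this. The table, dimensions, and parameter values here are placeholders based on my reading of the pgvector documentation, not from the post, so double-check them against the release.

    CREATE EXTENSION IF NOT EXISTS vector;

    CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

    -- HNSW index; m and ef_construction trade build time and index size for recall
    CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)
      WITH (m = 16, ef_construction = 64);

    -- approximate nearest-neighbor search by Euclidean distance
    SELECT id FROM items ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 5;

For an IVFFlat comparison you'd swap the index type to ivfflat with a lists parameter instead of m and ef_construction.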
[00:11:39] Next piece of content: there was another episode of Postgres FM last week. This one was on sharding, and they discussed, quote, "what it means, why and when it's needed, and the available options right now". So you can click to listen to the episode here or watch the YouTube video down here.
[00:11:55] Next piece of content: "Securing your PostgreSQL DB with roles and privileges". This is from ourlopez.com, and there's a rainstorm in the background, sorry if you're hearing that, but I can't really stop the weather. So this post starts at a very basic level and builds up, explaining to someone who's new to roles in Postgres how to set up a role-based system. Previously they were just using the postgres user and changing its password, and that's what everyone had access to, so essentially no role infrastructure was set up. Here they explain what roles are, what privileges are, and how they decided to architect things to reach their SOC 2 compliance. So if you want to see how they set up their permissions in Postgres to achieve SOC 2 compliance, definitely check out this blog post; a rough sketch of the kinds of commands involved follows below. Next piece of content: "Fun with PostgreSQL puzzles: recursive functions with animations". This is from crunchydata.com, and this is actually day 19 of the 2022 Advent of Code challenge. So if you want to learn how to do that, check out this blog post.
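Going back to the roles and privileges post for a second, here's the rough sketch I mentioned of the pattern these setups usually build toward: a no-login group role that owns the privileges, and individual login roles that inherit from it. The role and schema names here are made up, not taken from the post.

    -- a group role that holds privileges but can't log in
    CREATE ROLE app_readonly NOLOGIN;
    GRANT USAGE ON SCHEMA public TO app_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;

    -- make sure tables created in the future are covered too
    ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO app_readonly;

    -- individual people or services get login roles that inherit from the group,
    -- instead of everyone sharing the postgres superuser
    CREATE ROLE alice LOGIN PASSWORD 'change-me' IN ROLE app_readonly;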
[00:12:59] Next piece of content: "How we reduced downtime by 100x: supplementing pgBackRest with EBS snapshots". This is from timescale.com, and here they're talking about using EBS snapshots along with pgBackRest for some of their backups and, potentially, restores. So where does the 100x come from? What they were doing previously is that, in the event of an upgrade, they would do a full pgBackRest backup after that upgrade so that, going forward, they could use pgBackRest to do a restore. But that meant a lot of downtime, because apparently they didn't bring the system up until that full backup was done. What they started doing instead is taking an EBS snapshot, which happens very quickly. So they take the EBS snapshot, bring the system up, and then start the pgBackRest backup, which completes later. They're still using pgBackRest, and they said they don't plan to get rid of it; they're just adding snapshots in certain areas to give them additional benefits. And I've actually worked with a client who chose to use EBS snapshots as their main backup solution. Now, you can't do point-in-time recovery with that alone, or it would require some additional work, but they've been happy with the snapshot solution. So if you want to learn more about how and why they've done this, you can check out this blog post.
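The Timescale post is about their specific cloud setup, but as a general illustration of how an external volume snapshot can be coordinated with Postgres, the low-level backup API looks roughly like this. These are the Postgres 15+ function names (older versions use pg_start_backup and pg_stop_backup), and this is a generic sketch of mine, not the exact procedure from the post.

    -- mark the start of a low-level base backup (run in one session and keep it open)
    SELECT pg_backup_start('ebs-snapshot');

    -- ... take the EBS snapshot of the data volume outside the database here ...

    -- mark the end of the backup; keep the returned backup label with the snapshot
    SELECT * FROM pg_backup_stop();

Roughly speaking, it's the continuous WAL archiving, which pgBackRest already handles for them, that then makes point-in-time recovery on top of a snapshot possible.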
[00:14:15] Next piece of content: "How an AWS Aurora feature cut our DB costs by 90%". This is from graphite.dev, and they're talking about how they had a very high Aurora bill, not necessarily because of the amount of data they store, but because of their very high I/O costs; they have a lot of I/O happening in their application. But Aurora made some changes that adjusted how I/O-intensive applications are priced, offering up to 40% cost savings, and for this particular application their cost savings were 90%. So if you use Aurora and it's mostly I/O-heavy usage, maybe you want to check this out as well. Next piece of content: episode six highlights of the Path To Citus Con podcast with Chelsea Dole and Floor Drees. This is a blog post for the episode they did talking about, quote, "you're probably already using Postgres: what you need to know", so this is mostly for developers. If you're interested in that, you can check out their podcast. And the last piece of content: "Balancing act: achieving high availability with HAProxy for PostgreSQL". This is from highgo.ca. This blog post does what the title says: it shows you how to set up HAProxy with a primary and two replicas. So if you're interested in that, you can check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.