Stable Plans? | Scaling Postgres 407

Episode 407 March 08, 2026 00:18:48
Scaling Postgres

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss the possibility of having stable plans in Postgres 19, settings to kill disconnected or crashed clients, ON CONFLICT DO SELECT, and all about Patroni.

To get the show notes as well as get notified of new episodes, visit: 
https://www.scalingpostgres.com/episodes/407-stable-plans/

Want to learn more about Postgres performance? Join my FREE training called Postgres Performance Demystified here: https://www.scalingpostgres.com/courses/postgres-performance-demystified/


Episode Transcript

[00:00:00] So Postgres has never had query hints. These are directives you give the planner to guide it toward the execution path you want it to use. I know Oracle has this, and I'm sure other database systems do, but Postgres never has. Now, there is an extension called pg_hint_plan that tries to give you the ability to use these hints, but stock Postgres has never included it; the developers have always thought it best to let the system make decisions based upon the data it has available. But some of that may be changing slightly, given some development work that's been going on. [00:00:40] But I hope you, your friends, family and coworkers continue to do well. Our first piece of content is "pg_plan_advice: plan stability and user planner control for PostgreSQL". This is from rhaas.blogspot.com. Now, I did cover this briefly via a blog post from pganalyze that was talking about it, but this is actually Robert Haas, who has been developing or coordinating the work on this. So he gave a blog post about it, and he calls it, quote, a "very ambitious patch set" proposed for PostgreSQL 19, and only time will tell whether it will end up there. He's working on three contrib modules: pg_plan_advice, pg_collect_advice and pg_stash_advice. [00:01:23] Now, what these are focusing on is Postgres plan stability, because if data changes in your database and statistics get recomputed, you can end up with a poorer plan. Or, if the amount of statistics you're collecting is insufficient, you could get a poorer plan than the one you're currently running. And once you're at scale, these plan changes can have an outsized impact on your database system. So plan stability can be very important, and products like RDS Aurora have the ability to lock plans in place. [00:02:04] But this is not only about plan stability; it also lets users control how the planner works.
[00:02:10] So he's first looking at pg_plan_advice here, which basically just generates a plan advice string. [00:02:17] So he loads pg_plan_advice and runs an EXPLAIN that includes plan advice. And as part of the EXPLAIN output, it produces this plan advice string describing how to execute the query. [00:02:32] Now, once you have this plan advice string, you can reuse it to make sure the same plan is used again. However, you can also include just the relevant parts of it. So he focuses on the hash join here: he starts a transaction and sets the plan advice to a hash join, to say, whatever you do, make sure you do a hash join. He runs the query and of course it looks exactly the same as the one above, other than reporting that the supplied plan advice is to use a hash join. But you can also change that, and in this example he uses a merge join plan. [00:03:12] So now when this is run, you can see the EXPLAIN plan is different: you're getting a merge join where before you were getting a hash join. [00:03:19] And you can see the generated plan advice changed based upon the merge join plan advice that was submitted. So you might think this is just like enabling or disabling hash joins or merge joins, but this gives much more fine-grained control over how you want to handle a particular query. [00:03:40] He also says you can automatically apply a plan to a certain query every time it occurs, and you do that using pg_stash_advice. So here he creates the pg_stash_advice extension. He runs an EXPLAIN with the VERBOSE output because he wants the query identifier. [00:04:01] Then he creates an advice stash with a given name and adds the advice to that stash, passing in the query ID. [00:04:10] So when this query is used again, the stored plan advice (do the merge join plan) is applied, once you set the stash name. Now, I hope the external API for this changes a bit, because the multiple steps do seem a little confusing.
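Based on his walkthrough, and with the caveat that this patch set is still under development so every name here may change, the flow might look roughly like this sketch (the table names, advice string, and setting names are illustrative, not from the patch itself):

```sql
-- Illustrative sketch of an in-development PostgreSQL 19 patch set;
-- module, setting, and option names may change before (or if) it ships.
LOAD 'pg_plan_advice';

-- EXPLAIN can emit a plan advice string describing the chosen plan
EXPLAIN (PLAN_ADVICE)
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;

-- Reuse just the relevant fragment to pin a join strategy
BEGIN;
SET LOCAL plan_advice = 'HASH_JOIN(o c)';   -- hypothetical setting name
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
COMMIT;

-- pg_stash_advice then ties advice to a query identifier so it is
-- applied automatically whenever that query is planned:
CREATE EXTENSION pg_stash_advice;
EXPLAIN (VERBOSE)                           -- shows the query identifier
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- ...create a named stash, add the advice for that query ID, then:
SET pg_stash_advice.stash_name = 'my_stash';
```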
But the goal is, as he says here, quote, "every time the system plans a query with that query identifier, in a session where the stash name is set to my_stash, it will use the query advice." And he says for production use you'd probably want to load pg_stash_advice in shared_preload_libraries and set the stash name setting to whatever stash you've created. So basically this gives you system-wide control to apply certain plan configurations to individual queries, as opposed to having to make changes in the application. [00:05:04] Now, he says this patch set is mostly trying to develop different tools that other people can then build upon. And he readily acknowledges this is definitely 1.0 technology and has limitations, but he definitely needs reviewers and testers to see if they can get this into version 19. So if something like this interests you, maybe you'd like to check it out. [00:05:28] Next piece of content: "Mostly dead, slightly alive: killing zombie sessions". This is from mydbanotebook.org, and she's talking about the issue where an application process crashes, or the network between the application and the database fails, and you're left with database sessions that keep on running, utilizing resources. [00:05:51] And how can you address this? [00:05:54] I'm actually going to mention the first one she lists here, client_connection_check_interval, because we had a post on this a couple of weeks ago that was very good, about how it improved the stability of a system by being able to stop Postgres backends from continuing to process queries when the client has crashed. It basically allows a backend to poll, while processing, to confirm the client is still there, and if it's not, it can go ahead and abort the statement it's running. And in addition to this, there are also the TCP keepalive settings that help you detect network breaks. And she says both of these together are really critical for keeping your database healthy.
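As a rough illustration of the kind of settings she's describing (the values here are placeholders, not her recommendations; tune them for your environment):

```ini
# postgresql.conf -- illustrative values only
client_connection_check_interval = 10s  # poll for a vanished client mid-query (0 = off, the default)
tcp_keepalives_idle = 60        # seconds of inactivity before the first keepalive probe
tcp_keepalives_interval = 10    # seconds between unanswered probes
tcp_keepalives_count = 6        # unanswered probes before the connection is considered dead
```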
[00:06:36] So client_connection_check_interval polls the client, and if the client is no longer available, the backend stops running its query. [00:06:44] Whereas the TCP keepalives catch the cases where you have network issues, because if you don't have TCP keepalives but you are using client_connection_check_interval, the poll it does won't kill anything if the network is down. [00:06:59] And she gives some suggested settings for the TCP keepalives and client_connection_check_interval for a general OLTP service, as well as some PgBouncer configurations if you're using that. So check this out if you're interested. [00:07:15] Next piece of content: "INSERT ... ON CONFLICT DO SELECT: a new feature in PostgreSQL 19". This is from cybertec-postgresql.com, and he's talking about a new feature that looks like it might get into 19, and that is INSERT ... ON CONFLICT DO SELECT. Now, we have had ON CONFLICT DO NOTHING, so if there is a conflict with an insert, don't do anything. And we have ON CONFLICT DO UPDATE, setting columns to particular values, which is great too. Well, now we have SELECT. So what is this for? He does go on a little side quest first, asking: now that we have MERGE, why do we need this? And he says INSERT ... ON CONFLICT is different from MERGE, because MERGE can have a race condition when there is concurrent data modification going on, [00:08:08] and INSERT ... ON CONFLICT DO UPDATE handles this. So he shows an example with a table: he starts a transaction and inserts a value, then in another session tries to run a MERGE statement that inserts a value, or updates it if it would otherwise conflict. Initially this hangs because of the lock held by the open transaction, but once you commit, it ends with a unique constraint error. Whereas you don't see this when using INSERT ... ON CONFLICT DO UPDATE.
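The shape of the experiment he describes can be sketched like this (the table and values are my own minimal stand-ins, not his exact example):

```sql
CREATE TABLE t (id int PRIMARY KEY);

-- Session 1:
BEGIN;
INSERT INTO t VALUES (1);   -- holds the new row's index entry until commit

-- Session 2, concurrently:
MERGE INTO t USING (VALUES (1)) AS s(id) ON t.id = s.id
  WHEN MATCHED THEN DO NOTHING
  WHEN NOT MATCHED THEN INSERT (id) VALUES (s.id);
-- ...blocks until session 1 commits, then fails:
-- ERROR: duplicate key value violates unique constraint "t_pkey"

-- The same pattern with ON CONFLICT waits and then takes the UPDATE path
-- instead of erroring:
INSERT INTO t VALUES (1) ON CONFLICT (id) DO UPDATE SET id = excluded.id;
```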
Now, he doesn't say anything more about it, and I was unfamiliar with this aspect of MERGE, so I wonder: is this SQL-standard behavior, or is it possible the MERGE implementation will change so it can avoid these kinds of failures? He doesn't really say in his blog post. [00:08:58] But what does ON CONFLICT DO SELECT do? It basically enables you to return data. Normally when you're doing an insert, you can return columns that have been inserted. For example, if you have a serial column or a generated column that creates IDs automatically, you can return that ID so you don't have to do a second select after the insert. [00:09:22] Well, the problem with an insert using ON CONFLICT is that you can't really return the data, whereas now you can: say ON CONFLICT DO SELECT and return the ID. It also looks like the SELECT can support locking behavior such as FOR UPDATE, FOR NO KEY UPDATE, FOR KEY SHARE, etc. So if you want to learn more, definitely check out this blog post. Next piece of content: "How Patroni Brings High Availability to Postgres". This is from pgedge.com, and this was a very interesting blog post about Patroni. So if you want to understand what makes Patroni different and why it's the ideal choice if you're doing high availability on Postgres, check it out. Unless you're considering using Kubernetes, in which case you might want to use something like CloudNativePG. [00:10:12] But if you're not moving to Kubernetes yet, Patroni, according to him, is the best option for high availability Postgres. [00:10:21] So he goes into a lot of explanation about the background. Again, I highly encourage you to read this; it's a great blog post, and he gives you the general structure. Basically, Postgres was never built as a cluster-aware database. And yes, in terms of terminology, a Postgres "cluster" means multiple databases in one instance.
[00:10:44] However, normally when you think about a cluster, you think of multiple machines working in coordination, and Postgres does not have that concept. A primary and its replicas are still relatively independent servers; they just happen to talk to each other. They're not viewed as one all-encompassing cluster. However, Patroni helps make that a reality once you add in some additional components. So here's an example Patroni setup: you have your primary and your two replicas, although those designations change as the system runs based upon what's healthy, and you have Patroni running for each instance. Patroni stores all of its data in a distributed consensus layer, which can be one of a number of different services, such as etcd, Consul or ZooKeeper. This is basically the central data store where the state of the cluster is kept, so Patroni can consult this common storage mechanism to understand the current state of the cluster, and this helps it handle quorum. The main point of Patroni is orchestration: because it knows the state of the cluster, it can choose to start and stop Postgres instances. So when a system becomes unhealthy or loses its connections with the others, Patroni can choose to terminate that service. [00:12:06] It can promote replicas to primaries, start up new replicas, demote primary nodes, track log sequence numbers, and also handle replication slots. [00:12:17] And he goes on to discuss how a primary node is identified and maintained. He also talks about how Patroni uses pg_rewind and pg_basebackup to catch replicas back up to the primary if there's a promotion event, or to bring up new replicas if it has to shut one down. And some of the post talks about fencing: basically, if you have an event that puts the cluster at risk, like one machine having issues, Patroni can fence that machine off and bring up a new one in its place.
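To make the moving parts concrete, a minimal Patroni configuration tying one node to the consensus store might look something like this (hostnames and values are invented for illustration; see the Patroni documentation for the full set of settings):

```yaml
# patroni.yml -- illustrative sketch of one node's configuration
scope: demo-cluster          # name shared by all members of the cluster
name: node1                  # this member's identity

etcd3:
  hosts: 10.0.0.10:2379      # the distributed consensus layer (etcd here)

restapi:
  listen: 0.0.0.0:8008       # health/role endpoints a proxy can probe
  connect_address: 10.0.0.21:8008

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.21:5432
  data_dir: /var/lib/postgresql/data
  use_pg_rewind: true        # lets a demoted primary rejoin as a replica
```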
[00:12:47] Then he talks about routing, which is essentially a tool like HAProxy. This is your interface to the cluster: your application just talks to the proxy, and based upon what you need to do, it knows to route your traffic to the writer or to one of the readers. [00:13:05] And you don't have to use HAProxy; you can use other tools like an F5 load balancer or an Amazon Elastic Load Balancer. And then he covers Kubernetes, of course. But I thought this was a great article explaining Patroni and essentially why he thinks it's the best. [00:13:24] Next piece of content: "PostgreSQL 19: pg_dumpall in binary format". This is from dbi-services.com. So historically, when you do a pg_dumpall, it's always been done in text mode. [00:13:37] Meanwhile, with pg_dump you can dump in text mode, as a tar file, as a directory, or in the custom binary format, so it's up to you how you want to dump it. [00:13:49] Well, these same options appear to be headed to pg_dumpall as well, so that's great. [00:13:54] Next piece of content: "PostgreSQL statistics: why queries run slow". This is from boringsql.com, and he's talking about the statistics system in Postgres: where those statistics are stored, how they get updated, and how the costs calculated from those statistics determine which plan is going to be chosen. So if you're interested in learning more about that, you can definitely check out this blog post. [00:14:20] Next piece of content: "PG Semantic Cache in Production: Tags, Eviction, Monitoring and Python Integration". This is from pgedge.com, and we talked about a previous blog post last week introducing PG Semantic Cache, which is basically a way to cache your responses from an LLM, all within Postgres, to minimize your computing costs. Well, now he's showing a production example, talking about organizing with tags, planning eviction strategies, as well as monitoring the system. So if you're interested in that, you can check out this blog post.
Next piece of content: "pgvector at home". This is from TheConcensus.dev. So he says a lot of people are using the pgvector extension, or one of the other vector extensions, to store embeddings in the database. But really, he says, we've had vectors in Postgres for decades: there's the point data type built in, but it's limited to two dimensions, not the thousands of dimensions we're accustomed to. However, Postgres ships with a built-in extension called cube that allows up to 100 dimensions. [00:15:31] And he says you could probably increase this if you build Postgres from source and change the hard-coded limit. [00:15:38] In addition, with cube you can calculate the Euclidean distance between points, and you can even index cube columns to speed up similarity search. And he says there are some techniques for working with these LLM embedding models when you only have 100 dimensions. [00:15:56] He goes through the process of setting this up to do similarity search, so if you're interested in that, you can check it out, although I'd still probably go with the pgvector extension or one of the other ones available. He acknowledges that his example gives you hundred-dimension similarity search in Postgres, but it is still slower than pgvector. But if you want to learn more, check this out. Next piece of content: "pgvector: a guide for DBAs, Part 2: Indexes", which has recently been updated. So if you do want to go with pgvector, you can learn all about the different indexes that are available. He additionally focused on the disk approximate nearest neighbor (DiskANN) index in pgvectorscale, and it's pretty impressive how small this type of index can be: 21 megabytes compared to almost 200 megabytes for HNSW and IVFFlat. But check this out if you want to learn more. [00:16:53] And the last piece of content: "Video conferencing with Postgres". This is from planetscale.com. So apparently there's a product called SpacetimeDB.
I'm not sure if that's Postgres-related, but they said they had the world's first video call over a database, and they invited others to give it a try. [00:17:11] So someone at PlanetScale decided to do this and make a video call over the database. He is using a WebSocket server, pgRelay, in the middle, and a database as the data source. So the browser captures a camera frame, encodes it as a JPEG, and sends it as a WebSocket message. [00:17:33] Then pgRelay validates your active call and inserts it into a video frames table. The pgRelay on the receiving side is running a logical replication consumer on the database, so it sees the row that was just inserted, checks the recipient ID column, and forwards the raw JPEG bytes over the WebSocket to the recipient, and the recipient's browser creates a blob URL from the JPEG and renders it. And the audio works the same way: PCM samples go into an audio frames table and come out on the other side via replication. [00:18:09] So it's interesting that they're using logical replication for this, but with this setup they got about 15 frames per second of video, so that's crazy. But he's not actually recommending this; he says basically to use WebRTC for this purpose. [00:18:28] But if you want to learn more, definitely check it out. I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can find links to all the content mentioned, as well as sign up to receive weekly notifications of each episode. There you can also find an audio version of the show as well as a full transcript. Thanks, and I'll see you next week.

Other Episodes

Episode 35

October 22, 2018 00:17:07

Terabyte Scaling, Postgres 11, Connections, Backup | Scaling Postgres 35

In this episode of Scaling Postgres, we review articles covering terabyte scaling, PostgreSQL 11, managing connections and backup of terabyte databases. To get the...


Episode 331

September 01, 2024 00:17:31

Is pg_dump A Backup Tool? | Scaling Postgres 331

In this episode of Scaling Postgres, we discuss whether pg_dump is a backup tool, the pgMonitor extension, Postgres malware, and application uses for the...


Episode 373

July 06, 2025 00:14:22

Planetscale Postgres! | Scaling Postgres 373

In this episode of Scaling Postgres, we discuss the release of Planetscale Postgres and the advantages for bare metal, how often queries are optimal,...
