Ottertune Is Dead! | Scaling Postgres 321

[00:00:00] It's probably going to be a pretty quick episode this week because it seems like there's a lot of video content, because there were two big conferences that happened recently that people were blogging about, but there weren't a lot of in depth blog posts and it's harder to summarize video content, of course, but I hope you, your friends, family and coworkers continue to do well. Our first piece of content is actually a tweet announcing that Otter tune is dead, so this seemed to come out of nowhere, at least for me. They said our service has shut down and we let everyone go today, which happened on June 14. And if you click on the image it takes you to the Ottertune site. And basically this is all that exists right now with it. So I never used the service, but I think what it did is tried to set all the different postgres parameters to optimize your database as much as possible. So this is unfortunate. If you actually used autotune, I'd be interested in your thoughts in the comments down below. [00:00:59] Next piece of content schema changes and the postgres lock queue. This is from Zata IO and they're talking about how, you know when you're doing selects, insert updates, deletes, it all requires some level of locking. And when you're doing basic selects, it basically needs a access share lock, which means it can share that lock with many other processes that are running. And these are typically for selects. Now one thing that will block it is access exclusive locks. And typically you need this on the table when you're changing the schema. So for example, adding a new column requires an access exclusive lock. Now it happens very fast, but it still requires that lock. So the problem that you can encounter is if you have selects hitting your database and you tried to do an access exclusive lock, but you have some long running queries ahead of it, like a long select query that's preventing the access exclusive lock to get a lock on it. What happens is that access exclusive lock waits and then things queue up behind it. So other select queries start queuing up behind it, other inserts start queuing up behind it, and then suddenly you have a queue of read and writes to your table that are not happening, which can cause unfortunate problems and eventually some downtime potentially. And the best way to avoid this is to use a lock timeout. And what it basically means is if your session is trying to acquire a lock, it must do so within the timeout you set or it kills that statement that's being run so if they have an example here where they have a very long select that's running in another session they set a lock timeout to 1 second 1000 milliseconds. They then try to add that column again to the table and in a third session they go ahead and do a select from this table just to simulate it being backed up. But because this add column statement cannot acquire a lock within the lock timeout 1 second, it eventually cancels itself and then the third session continues. Doing selects so this is the safest way to apply schema changes to your database is using a lock timeout. But of course if you use it and you are unable to acquire lock, it does get timed out. So you'll need some sort of process to retry it. Like if it's some sort of migration, you may need to run it multiple times so it can actually apply itself. But this post is a great reminder of that process that you have to be aware of when you're making schema changes to your database. So you can check out this blog post if you want to learn more. [00:03:30] Next piece of content reduce vacuum by using on conflict directive this is from Stborden dot WordPress.com and he's talking about a case when you're doing an insert and you run into some sort of constraint violation. Typically you would see a unique constraint violation because you're trying to insert a record, but it already exists. Now when you do that, it creates a dead tuple or a dead row that vacuum must now clean up. So if you have a ton of these happening, and you will typically see it in the logs. I don't see an example of the logging in the blog post, but you will see a bunch of these constraint violations in the logs, and those create dead rows that then have to be vacuumed up. It also uses the precious txids or your transaction ids. But one way you can eliminate both of these problems is to use onconflict do nothing. So an example here, he's doing the same insert statement, but it says if there's some sort of conflict, do nothing. So that will not create a dead row, and it also will not expend a transaction id if it's unable to successfully insert the data and you can see it still returns zero. So nothing was inserted, but it doesn't create the dead row, it doesn't expend the transaction id. But if you want to learn more about that, you can check out this blog post. Next piece of content there was another episode of Postgres FM last week. This one was on PG vector scale. So this is the extension that timescale recently created that adds two new features when working with vectors in postgres. The first is a disk approximate nearest neighbor search that's designed to be used on local SSD or NVMe drives, and also a statistical binary quantitization method. And Nikolai talked about this with Matt and John from timescale, the actual developers of this extension. They think the LLM's greatest achievement is probably semantic search as opposed to rag application or retrieval augmented generation. So they argue, searching by meaning is the most important aspect of it, not necessarily being able to acquire more information to augment this response that's being generated, they talked about why they created pgvector scale as a separate extension versus just building it into PG vector, and it was what I guessed last week. The reason that they did it is because they're using rust for PG vector scale, whereas the PG vector extension is written in c. Of course, they covered the topic again of using semanticsearch in conjunction with an exact search method to try and get the best of both worlds. Basically a hybrid search technique. And they mentioned a lot of customers they're working with are implementing this type of search, a hybrid method of search. But if you want to learn more, definitely encourage you to listen to the episode or check out the YouTube video down here for the next piece of content. I usually don't cover reviews of conferences, however, PGConf Dot De V was pretty big as well as posette, so I do have a few reviews that I'm covering. This first one PostgreSQL development conference 2024 review is from Andreas Sherbond, LA and this is pretty comprehensive, talking about the conference and some of the numerous presentations. So it was a pretty good overview of all the different things that were covered talking about things coming in Postgres 17 there's really a lot of content in this blog post, so if you want to get a sense of what PGconf dot De V was like, definitely check out this blog post. Now, with regard to that extension ecosystem, there was another summary of the talk that was [email protected] extension ecosystem summit 2024 and he goes into depth about all the different things that were discussed, so you can check that out if you're interested. And for the next piece of content, just did a mini review of Posette 2024 and some of the things he liked about it. The next post is actually a YouTube channel for 2024 posette, and they have all of the content already listed, so if you're interested in some video content, there's definitely plenty here to check out and also related to that virus.org posted the presentation that he did at posit 2024 how to work with other people, so he had a brief description as well as the direct video to his talk. Next piece of content the timekeepers PG, cron and pgtimetable this is from notsoboring SQL.com dot and I've mentioned in the past, I typically don't use any kind of scheduling within postgres itself. I usually have a dedicated scheduling server for my application to run all maintenance tasks for the application, including jobs that the database needs, or I just use Cron on the instance itself. But if you're using a hosted database, clearly you need some other way to run scheduling jobs, and Pgcron is one way to do it as well as pgTimetable, and this post compares and contrasts both of them. Basically, Pgcron is delivered as an extension and is typically available on hosted services like RDS, for example, whereas PGTimetable appears more comprehensive in its feature set and is not an extension. So it actually is a standalone binary or available as a docker image. But if you're interested in checking out either of these, you can check out this blog post, the last piece of content, the surprising logic of the postgres workmem setting and how to tune it. This is from pganalyzed.com and he talks about the memory post we mentioned last week on scaling postgres by Sean Thomas that talks about configuring memory, and he focuses his piece of content on work memory specifically and talks about of course, the risk of setting your work memory too high is running out of memory in postgres because like we mentioned last week, usually take the number of connections times the amount of work memory, times, say an average of five nodes for sessions running queries. And you can quickly see how the amount of memory gets increased as you're using it and you risk running out of memory. But of course, one way to avoid that is to set your over commit memory to two to try to avoid that, as well as probably configuring your over commit ratio as well. He went into some more examples how that amount of memory can be used for each node in a postgres query, as well as talking about how parallel operations can also increase the amount of work memory that's being used, as well as talking about how hash operations can use multiples of work memory as well. And in the end, he did cover of post by Christoph Pettis a number of years ago on some guidance to set work memory. Basically, set it conservatively. Make sure that you're logging the temp files and run the system, and as long as you don't see any temp files being created, you're probably good. But if you start seeing them creep up, you may want to think about increasing your work memory so that all your queries use memory as opposed to having to go to disk. But check out his piece of content if you want to learn more. [00:10:37] I hope you enjoyed this episode. Be sure to check out scalingpostgras.com, where you can find links to all the content I mentioned, as well as an audio version of the show and a full transcript. You can also sign up to receive weekly notifications of the episodes as well. Thanks, and I'll see you next week.

Show Notes

Episode Transcript

Other Episodes

Episode 221

psql Shortcuts, Debugging Deadlocks, Find & Stop Queries, Using SCRAM | Scaling Postgres 221

Episode 83

Postgres 12, Window Frames, Index Types, Synchronous Replication | Scaling Postgres 83

Episode 236

Postgres 15 RC2, Shared Buffers, Secure Connections, PGSQL Phriday | Scaling Postgres 236