More Postgres 17 Highlights | Scaling Postgres 336

Episode 336 October 06, 2024 00:12:54
More Postgres 17 Highlights | Scaling Postgres 336
Scaling Postgres
More Postgres 17 Highlights | Scaling Postgres 336

Oct 06 2024 | 00:12:54

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss more about the features released with Postgres 17, an example of performance improvements, things to watch out for, and the start of Postgres 18.

To get the show notes as well as get notified of new episodes, visit: 
https://www.scalingpostgres.com/episodes/336-more-postgres-17-highlights/

Want to learn more about Postgres performance? Join my FREE training called Postgres Performance Demystified here: https://www.scalingpostgres.com/courses/postgres-performance-demystified/

View Full Transcript

Episode Transcript

[00:00:00] Even though I covered the Postgres 17 released last week, there's still a ton of blog posts being released this past week about the release of Postgres 17. So you're going to get more information about that this week, but I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is what's so great about Postgres 17? This is from cybertech postgresql.com dot and he starts off with this heading saying why are there no spectacular new features in Postgres 17? And I don't necessarily think he's saying this, but I think other people may question where are these features? I actually think there are a number of them that I really liked, like the logical replication and the new collation provider that not enough people seem to be talking about. But I would have to agree there's no major one or two features that defines this release, but it's a lot of small, very great improvements. And he makes the point. You know, PostgreSQL is already feature complete and given the current state of the project, it becomes harder to contribute and move features forward. But the great new features he listed is the unobtrusive performance improvements. So basically you don't have to do anything. Postgres 17 will just execute your queries faster. And that's definitely awesome. Particularly the processing of endless with b tree indexes being faster. That's an awesome addition. He also mentions fast startup plans in a query with union all and limit. And of course the multiple improvements to vacuum as well as analyze as well just makes it use less resources on your server as it runs. He does mention the new built in collation provider, which again I think is huge. And he makes a point that the disadvantage is it really doesn't help with linguistic sorts. But he says, you know, maybe in the future there may be a release that supports that as well. So that way you could do upgrades without having to worry about re indexing linguistic sorted indexes as well. The logical decoding of failover slots, again, that's huge. And even being able to upgrade with those slots in place, the addition of serialize to explain output as well as incremental backups, which is a huge improvement if you have a large database. So those are some of the main features that Lawrence liked. Next piece of content is coming up in Postgres, PostgreSQL 17 and trends and innovations to watch. This is from PostgreSQL dot Fastware.com and the first thing in this blog postee covers the evolution of all the different features in previous versions of postgres, and then lists some new features that have landed as well as potential features in the future. So in terms of the version 17 features that have landed, he mentions the incremental backups, vacuum improvements, faster reads by using streaming APIs. This may be related to some of the async IO improvements coming, I'm not quite sure, improved performance of subsystems on top of the SLRU cache, and that these cache sizes are configurable. He mentions the sped up in queries and improvements of wall writing performance. He mentioned the logical replication failovers that they persist during upgrades, support for identity columns and exclusion constraints on partition tables, some other efficiencies that the planner is doing with nulls. So a lot of this was repeated with what was covered last week in terms of the new features, but there's also some additions that I didn't catch there. But you can also just check the readme of postgres as well. Oh, and this is one I did mention, is that there's a new system view PGwait events. You can actually see some of those wait events, and the blog post closes out talking about potential features coming in Postgres 18 that are being worked on. Next piece of content there was another episode of Postgres FM last week, this one on Friday. They covered postgres 17. They mentioned a lot of similar features, but Nikolai also mentioned some UUID functions that do things like allow you to extract a timestamp once uUId seven is available. He mentioned another feature that was important to him was getting transaction timeout. So this is not just a statement timeout, but a whole transactions timeout. They also mentioned that with some of the SLRU cache parameters that you can change that. Some of them also scale with shared buffers. So again, as long as you keep your shared buffers configured, they should scale some of these cache parameters as well. So that's great. And they talked a little bit more about the system views as well, particularly the wait event view. But if you want to learn more, definitely listen to the episode or watch the YouTube video down here. Next piece of content is actually a YouTube video, and its SQL in clauses are miles faster in Postgres 17. This is from Hussein's channel, and he does a little bit of a visual representation of how the inquiries used to work in version 16. Basically, they would internally do an index scan for each item in the end list, whereas postgres 17 tries to search more than one value at the same time to reduce the number of index scans, and then he does an example in postgres 16 and an example in postgres 17. And I think he got close to a tenfold improvement with the example that he was using. So if you want to see some of the end clause improvements in actions, you can check out his video next piece of content postgres planner quirks, incremental sort and when it goes wrong this is from pganalyze.com and with postgres 17 here with all the new features that are available, there's definitely a group of people that are eager to upgrade as soon as you can. But then there's also the group of people that want to be more conservative about it. And actually in postgres FM they kind of talked about this, it's good to upgrade, but you don't want to upgrade too quickly if you have a critical production setup. Well, this blog post talks about where some of these improvements in recent versions have actually caused performance regressions. So on the whole, maybe they improve performance, but there are some cases where they can make it worse. So in this video and this blog post, they show an example of where someone upgraded from version eleven to 15, and this particular query went from running in a few seconds to ten minutes. And what basically happened, it was trying to do incremental sorting and was choosing to use an index only in the order by, because it only had to find 50 records as opposed to using an index to actually narrow the amount of data to look through from the where clause. And he shows a simpler example here, showing the same problem and the issue. It's really quite difficult for any statistics to give the planner information about this. So in certain cases when you're ordering and limiting, you could run into a potential issue. And if you're seeing a performance regression like this, the only alternative that exists now is to turn incremental sort off. But Lucas said that this is an issue that they are discussing in the mailing list to see what kind of improvements or changes they can make to fix this type of performance regression. But for a particular session, you can set enable incremental sort to off. And then he also mentions a version 16 feature that may result in regressions with regard to presorted aggregates. So the performance difference wasn't as severe, maybe a two fold worsening of performance. But again, you can run into a performance regression with this. And the solution with regard to this one is to turn off enable presorted aggregate. So that's another potential solution to resolve some of these regressions. But if you want to learn more, definitely check out this blog post. Well, this has all been about 17, but version 18 is being worked on right now. So we've got two blog posts about that. The first one is waiting for PostgreSQL 18 add temporal, primary key and unique constraints. This is from depeche.com and this was actually held back from postgres 17 due to some issues, but now it's been added back into 18. So again, temporal tables allow you to use the same id in an identity column as long as a second correlated column. In this case valid range, does not overlap. So you can see here the identity column id. Both have three, but that is valid in a temporal table as long as the ranges do not overlap. So for an individual record, it allows you to track its changes over time through multiple rows. So check this out if you want to learn more about that next piece of content. Postgresql 18 verify tar format and compressed backups so the Pgverify backup command today can only verify plain formats. Well, now you can do tar formats and compressed backups as well in the development version of Postgres 18. [00:09:08] Next piece of content when postgres indexing went wrong this is from blog Bemi IO and they had an incident in their production setup. They needed to add indexes. They were creating these indexes concurrently, and apparently it was a partition table, so they were actually adding the indexes to all the child tables. But once these indexes were starting to be used, they started getting really poor performance. [00:09:35] And basically, as the heading says here, their concurrent indexing failed silently. But when you run a create index concurrently interactively, it should return an error to you if there was a problem. So my assumption is that they had this in some type of job that they didn't check the logs on. So basically some of those indexes failed. In the case of here, there was some deadlock detected. Therefore the index was not completed. And this particular index on, I guess the 2019 partition of the users table was in an invalid state and therefore none of the queries were using it. So you can see, as you describe the table, you can see this particular index is invalid. So they recommend, you know, closely monitoring the index creation. And if you're doing it interactively, it should tell you if there was a problem with creating the index. If you're doing it in a job, because it can take a while. If you have really large tables, check the logs and make sure that those indexes were actually created and they list some ways it could potentially fail you could get deadlocks. You could run out of disk space, although you've got bigger problems with your database if you run into disk space or potential constraint violations. And this could happen if you're trying to create a unique index, for example, and they give a query where you can find all of your invalid indexes. So this would not be a bad thing to run on your database periodically just to make sure you don't have any invalid indexes. And how do you fix an invalid index? Well, basically you would generally drop the CC new index because that's what the concurrent index new adds to the index. You could drop that index concurrently and then add it back concurrently. They also mentioned you could also re index concurrently as well. So if you want to learn more about this, definitely check out this blog post next piece of content are you qualified to use null in SQL? This is actually from Agent M GitHub IO and this is a test of sorts, testing your knowledge of nulls. So if you have select null and true, what will it return? Will it return null or false? And there's a number of different questions and levels. So this is level one, this is level two, and this is yes, level three. And once you fill everything out, you can click this to show you your score. So I thought this was really cool and it definitely highlights some of my frustration with nulls. But you can check this out if you're interested. And the last piece of content is PostgresQl Hacking Workshop October 2024 this is the postgres hacking workshop that Robert Haas is coordinating on a monthly basis. The topic for the upcoming one is from Thomas Monroe's 2024, PGconf, Devtalk streaming IO and vectored IO. So if you want to participate in this, you can click the link here to potentially sign up to join the session. [00:12:30] I hope you enjoyed this episode. Be sure to check out scalingpostgres.com where you can find links to all the content discussed, as well as sign up to receive weekly notifications of each episode. There you can also find an audio version of the show, as well as a full transcript. Thanks, and I'll see you next week.

Other Episodes

Episode 127

August 17, 2020 00:14:37
Episode Cover

Recursive Queries, Lateral Joins, Pagination, JSONB Arrays | Scaling Postgres 127

In this episode of Scaling Postgres, we discuss recursive queries, using lateral joins, paging through a data sets efficiently and working with JSONB arrays....

Listen

Episode 207

March 20, 2022 00:17:54
Episode Cover

Autovacuum Configuration, Duplicate Key Violations, Pipelining, Tuning max_wal_size | Scaling Postgres 207

In this episode of Scaling Postgres, we discuss how best to configure autovacuum, the dangers of duplicate key violations, how to set up pipelining,...

Listen

Episode 178

August 16, 2021 00:15:11
Episode Cover

Postgres Releases, Performance Discussion, Index Types, Index Power Use | Scaling Postgres 178

In this episode of Scaling Postgres, we discuss new releases for Postgres & pgbouncer, have a performance discussion, describe the different index types and...

Listen