New Releases & 1,000 Times Faster Query | Scaling Postgres 316

Episode 316 May 19, 2024 00:14:34
New Releases & 1,000 Times Faster Query | Scaling Postgres 316
Scaling Postgres
New Releases & 1,000 Times Faster Query | Scaling Postgres 316

May 19 2024 | 00:14:34

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss new Postgres releases, optimizing a query to be 1,000 times faster, custom vs. generic plans and the pgtt extension.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/316-new-releases-1000-times-faster-query/

 

View Full Transcript

Episode Transcript

[00:00:00] This week we get new postgres releases. We also have a security vulnerability that some of these releases address and we also talk about a performance optimization that resulted in a 1000 x win. But I hope you, your friends, family and coworkers continue to do well. Our first piece of content is postgresql 16.315 point 714.1, 213.15 and 12.19 are released. So this does address one security issue as well as fixes up over 55 bugs. They also give a end of life notice for postgres twelve. So basically shortly after 17 gets released, postgres twelve will stop receiving fixes on November 14, 2024. Now it doesn't look like there is anything related to having to do re indexing with the these releases, so that's good. But it does have a security vulnerability related to people being able to see statistics collection when you're using create statistics. So if you're not using create statistics, you shouldn't be vulnerable to this. But if you do, there is a possibility of one user being able to see data that they're not normally supposed to because the views that allow you to see what statistics are being collected is open for any user. Now this vulnerability was actually reported by Lucas at PG analyze. So he was doing some work with statistics and noticed this issue and reported it as a security issue. So he did a post on his five minutes of postgres, postgres Cve 2024 4317 and how to fix the system views. And basically what he said is that it is insufficient to merely upgrade to the latest patch release that doesnt actually fit fix the security vulnerability because you can still log in as a user who shouldn't be able to access a particular table but still be able to see the data in the system views. What you actually have to do is run a separate command to rebuild those system views because those system views get created on database initialization so that it also means it needs to be applied for every database. But he covers that in more detail in his blog post and of course in his YouTube show here. So definitely encourage you to check it out if you want to apply the patches in their entirety. But even on the original post here, the link to the CVE right here takes you to the detailed instructions on how to recreate those system views. [00:02:32] Next piece of content Bruce Momjian released the first draft of the postgres 17 release notes are available, so Bruce Momgeum posted these so you can review this to get a more comprehensive look of everything that should be coming in postgres 17 this fall. [00:02:47] Next piece of content, making a postgres query 1000 times faster. This is from mattermost.com and they have a scenario where they have an elasticsearch database and basically it needs to take the raw data from postgres and build its indexes. And they had someone who was coming online with 100 million posts. So they needed to build the elastisearch index from scratch. But it already been running 18 hours and it wasn't even halfway finished. So they started looking into what the problem might be and what they discovered. The query that pulls the posts to then build the index looks like this. And what they discovered was the query kept getting slower and slower as it was executed, because they didn't just pull all 10 million rows at once, they were applying a limit to it for each query against it. And each batch that they were doing was getting slower and slower. So when I heard that, I started thinking, sounds like some sort of a paging issue, like a limit with an offset, but they're not doing an offset here. And then I saw this or query and I said, okay, this is probably where it is. And he eventually gets to running the explain plan, which is listed here. I'm sorry, it's very truncated here. And you can see this particular query was taking 24 seconds to run. And whenever I look at an explain plan, I say, okay, this is the execution time. Where is it? In which node is it taking forever? And looking at it, it is this node right here because the time is 24 seconds. So pulling 10,000 rows, it says it's using an index scan, but it's actually doing a filter and removing over 40 million rows. And look how many buffers this this head as well. That's where all the amount of work is taken. So the issue was definitely in that or. But he actually tried breaking down the query. He tried getting rid of the join, which of course didn't help because the problem was in the OR. But he said once he narrowed down to this, okay, yes, this was the problem. When he broke that up and just looked at the first part of it created at is greater than whatever parameter was sent, it ran in 33 milliseconds. That's definitely a lot better. Then when he tried the second part of the query where created at is equal to a date and the post is created at a certain value, it ran in way less than a millisecond. He did eventually narrow it down to the area that we were looking at in the explain plan. And the reason it kept getting slower and slower is that the amount of rows that had to be discarded kept on growing and growing based upon how the query was written. He basically looked on stacked overflow and discovered row constructor comparisons. So this part of the where clause can actually be written in this manner here where you have parens the two columns created at an id. Compare that against the two parameters surrounded by parens, and that relatively simple change resulted in a much faster query running in 34 milliseconds. And this is the query syntax that's typically used when you're trying to do things like key set pagination, which is pretty much what they're doing. They're paging through a result set, passing on blocks of rows for an index to be built. He also talks a little bit about trying to deal with MySQL as well, but that's pretty much it. So if you're interested in learning more about this, definitely check out this blog post. [00:06:10] And if you're new to scaling postgres, I wanted to offer you a free mini course that I call the PostgresQl performance starter kit. This covers some basics of PSQL, talks about how to use PG stat statements to be able to track the statements running against your system, as well as talk about how to read and use explain. So if you want to start learning more about performance, this is a great way to start. And I'll have a link in the description below on how you can get the mini course next piece of content custom versus generic plan this was the episode last week from Postgres FM, and Nikolai and Michael discussed custom versus generic plans in terms of prepared statements. And basically a prepared statement prepares a statement beforehand, doing a number of the steps up to the execution phase so that when you want to run the same statement again and all you're changing is a parameter like changing the id of data you're looking for is ten versus 20 versus 50. You can easily swap out the parameters and just run that query again, making it very efficient. It doesn't have to go through all the parsing steps and eventually some of the planning steps as well. So basically, how prepared statements work is the first five executions use a custom plan. So it develops a plan for each query that's run against it and the average cost of those is estimated. Then it builds a more generic plan and that cost is compared to the cost of making the custom plans and generally tries to use the generic planning going forward as long as the costs are not that much higher than a custom plan to try and minimize replanning, basically, and basically they talk all about that in this episode, as well as the importance of using prepared statements to avoid lock manager contention, particularly when dealing with a lot of table joins or partition tables. So you can listen to the video here or check out the YouTube version down here. Next piece of content activating cached feature flags in instantly with notify triggers this is from brandier.org and he had a post we talked about last week on scaling postgres where he was talking about using the listen and notify features of postgres, basically a messaging or a pub sub system, and how he likes to construct it. And here he shows how they've put it in action at I believe he works at Crunchy bridge. So basically he's talking about their feature flag system, and apparently it's pretty robust. They can have feature flags that are fully on, that are random only, that are on by a particular token. And these feature flags must be assessed on every API request. So they don't want to query the database every time for this. So they actually do a form of caching for these feature flags and by default it was resynchronizing itself with the database every 30 seconds. And 30 seconds isn't that long. But the problem is if they're working on something like you turn on a feature, then you go to the web browser to test it. Well it's not working, what's wrong? And they forget to wait 30 seconds before that flag would potentially be enabled. And they say oh well, it is working. So he wanted to devise a system where when a feature flag is changed, it quickly recaches or refreshes those feature flags. So he used the listen notify system to do it basically sets up triggers on tables so that when a change occurs it does send the notify for the application to do a refresh of its cached data. He talks all about how it works, how they set up their listener, and how to change caching flags. So if you're interested in learning more, definitely check out this blog post. [00:09:41] Next piece of content use of PGTT extension in self managed versus cloud products this is from stboarden dot WordPress.com and I haven't heard of the PGTT extension, but apparently does global temporary tables similar to the way Oracle does it. So Postgres has temporary tables, but the frustration he's talking about is that those only exist per session. So if your session disconnects, your temporary table is gone. Or if you log out, that session stops and that table no longer exists. You can do unlogged tables and those persist across sessions. But the global temporary tables in Oracle have an interesting trait about them. They are considered temporary tables, but they do persist across sessions and then furthermore, the only data that is visible in them is to a particular session. So if one user logs into the database and inserts some data into this global temporary table, another using logging in won't be able to see it, but they can add their own data to this table that only they can see. So I thought that was a pretty interesting feature and the PGTT extension allows it. And he says this is available in some of the Google Cloud products, but it looks like he had some issues using it. But I mostly thought this extension was interesting and maybe you have a use case for trying them out. So check out this blog post if you want to learn more. [00:11:07] Next piece of content when to split patches for Postgresql this is from Peter dot ezentraut.org dot this blog post talks about if you're submitting patches to postgres, how should you break them down? Should you send one big patch? Should you batch them up or separate them out in some way? Well, he goes into depth about the best ways to do that. I'll focus on the bottom here because I think this is a good summary. Step one is that each patch in a series is self contained. So basically it's atomic. You can apply that first patch and everything still works, all the tests still pass, it doesn't need another patch and it does some additional new functionality. Basically it can can stand alone. Then you could apply the next patch that does the same thing. It's self contained, does some additional features, but it can stand alone. And he says, second, if in doubt, start with a single patch. And lastly, if we're going to break it up, be sure to communicate the structure and the expectations to people who are going to be reviewing it. So if you want to learn more about this, you can check out this blog post next piece of content time partitioning and custom time intervals in postgres with PGPartman. This is from crunchydata.com and it looks like these are some newish features that have been added to PGPartman to support all sorts of time intervals, including weekly partitioning and a quarterly partitioning. So if you use Pgpartman or want to start using it for partition tables, definitely check out this blog post. Next piece of content PG xn language poll result this is from justeory.com and PG Xn is the postgres extension framework that they're rewriting and they were wondering which language to use and the choices were rust or go, and based upon the voting, rust became the clear winner. So it looks like they're going to be writed and in rust, the last piece of content rag with PostgresQL. This is from PG IO and a Rag is a retrieval augmented generation system, and I like their sentence here. Read on to see how you can build your own rag using PostgreSQL, PG vector, Olama, and less than 200 lines of go code. So in terms of my basic understanding, how I think about a rag system is it's basically a search system. You're asking for some information, but what it does is it augments the retrieval. It generates a more human like response based upon the data that it finds using a model to do so. So this is similar to other rag posts that we've seen, but he sets up the data with postgres and pg vector, and they're loading in four paragraphs from a Sherlock Holmes story, and they're setting up the Olama tool to use the llama three model. I've got their 200 lines of go code and have a pretty simple command line interface to it where you just run the code, pass it in a query, and you can ask it questions like what did the gamekeeper see? And the model looks at the source data and generates a more human response. So if you want to learn more, definitely check out this blog post. I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can find links for all the content mentioned, as well as an audio version and a text transcript of the show there. You can also sign up to receive weekly notifications of each episode. Thanks, and I'll see you next week.

Other Episodes

Episode 134

October 04, 2020 00:14:10
Episode Cover

Community Acquisition, Space Saving Terabytes, WAL Archiving, Vacuum Analyze Tips | Scaling Postgres 134

In this episode of Scaling Postgres, we discuss a company acquisition within the Postgres community, how to save terabytes of space, setting up WAL...

Listen

Episode 217

May 29, 2022 00:16:33
Episode Cover

Schema Change Mistakes, Canceling Statements, pg_rman, Pedantry Removal | Scaling Postgres 217

In this episode of Scaling Postgres, we discuss mistakes you can make when doing schema changes, how best to cancel statements, looking into pg_rman,...

Listen

Episode 250

January 30, 2023 00:14:03
Episode Cover

Performance Issue, Survive Without Superuser, Reserved Connections, Partition Management | Scaling Postgres 250

In this episode of Scaling Postgres, we discuss resolving a performance issue, how PG16 allows you to survive without a superuser, reserving connections and...

Listen