Episode Transcript
[00:00:00] I don't know about you, but I've been waiting a long time to have an alternative storage system for Postgres, particularly one that uses an undo log as opposed to storing all of the row versions in the heap itself. That way, in theory, if you had a very update-heavy workload, you could use that storage engine and not have to deal with so much vacuum. And right now the best candidate for that would be OrioleDB.
[00:00:28] But they keep adding all these other optimizations, and I keep wondering when they're going to release a storage system that can exist as an extension. I know that various patches have to get into Postgres for that to work, but still. We'll look at a blog post where they're continuing to add enhancements to OrioleDB itself. Also, be sure to stay till the end for my consulting corner, where I talk about the nightmare of working with some application frameworks. But I hope you, your friends, family and coworkers continue to do well. Our first piece of content is Ordered insertion optimization in OrioleDB.
[00:01:07] This is from orioledb.com, and they're looking to optimize how entries get into B-tree leaf pages, because classically it's been an exclusive page lock and then serially adding new entries to the page. But in a newer version of OrioleDB, which is still in beta, they're offering a batch insertion method that they say can double the throughput.
[00:01:32] So they have an example animation down here where you have an index leaf page and several queued-up entries that need to go in, and each one has to wait its turn while the page is locked for the insertion that's happening. The new way, with batch insertion, is that one insertion takes the lock and applies all of the queued entries at once, so they all happen under just one lock, which is awesome. And they said this particularly helps if you have, say, some sort of append-only workload or very sequential insertions where things are being inserted into the same page; that's where you'll see the most benefit from this optimization.
[00:02:16] So they discuss a little bit about how it's designed, and then they jump into some benchmarks. The blue or purple line is base Postgres 17, and you can see that beta12, the current release of OrioleDB, actually does pretty badly at low client counts, but it does a little better as your client counts increase. The development version, which adds this change, shows quite a performance gain, almost 2x compared to Postgres 17, or maybe even greater than that. So this is awesome. I frankly wish this type of change would be going into Postgres so we could see some of those performance differences there. On the one hand this is cool, but on the other hand I lament the fact that it's in a separate product that's still in a beta phase. Will it ever eventually get into Postgres? I don't know. But check this out if you want to learn more. Next piece of content: PostgreSQL 17.6, 16.10, 15.14, 14.19 and 13.22, as well as 18 Beta 3, are released. So we are coming up on Postgres 13's end of life on November 13, 2025. That's when it will stop receiving fixes, so if you're on that version you should definitely upgrade. This set of releases also covers three security issues.
[00:03:40] One is related to being able to view data in optimizer statistics, which can basically expose sampled data from a view, partition, or child table. But there are also two pretty severe ones, with a base score of 8.8 each, where pg_dump can execute arbitrary code, or a superuser on an origin server can insert arbitrary code that a psql client will execute. So basically, if you use third-party databases, meaning you get databases from other sources, you definitely want to have these patches in your system before doing something like that going forward.
[00:04:15] And of course these releases also include a number of bug fixes, 55 in all.
[00:04:21] One thing to note: if you are using a BRIN index with a numeric minmax-multi operator class (numeric_minmax_multi_ops), they do recommend reindexing those after you upgrade. So it's not all BRIN indexes, just the ones with this specific operator class. But check this out if you want to learn more. Next piece of content: Postgres Logging for Performance Optimization. This is from crunchydata.com, and I thought this was a great logging post. I don't know if it was particularly for performance optimization, but they definitely list a lot of things that will get logged to help you improve your performance. They go through almost every logging setting and give recommendations on how you should set each one. They even recommend two extensions if you want to capture more logging information. One was the pgAudit extension, which helps audit in detail what kinds of changes are taking place in your database system.
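Circling back to that BRIN note for a second, here's a rough sketch of how you might find the affected indexes and rebuild them. This is my own illustration, not a query from the release notes, and the index name in the REINDEX is hypothetical; since minmax_multi_ops is a non-default operator class for BRIN, it shows up in the index definition text:

```sql
-- Find BRIN indexes declared with a minmax-multi operator class.
-- Non-default opclasses appear in the indexdef text.
SELECT schemaname, indexname, indexdef
FROM pg_indexes
WHERE indexdef LIKE '%USING brin%'
  AND indexdef LIKE '%minmax_multi_ops%';

-- Then rebuild each one after upgrading (index name hypothetical):
REINDEX INDEX CONCURRENTLY public.my_brin_idx;
```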
[00:05:19] They set pgaudit.log to capture only DDL, so schema changes basically.
[00:05:25] And they also recommend the other extension, auto_explain, so that you can get explain plans in your logs.
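As a sketch of what enabling those two might look like (the threshold value here is my own illustrative choice, not a recommendation from the post), note that both extensions need to be in shared_preload_libraries first:

```sql
-- Assumes shared_preload_libraries already includes 'pgaudit, auto_explain'
CREATE EXTENSION IF NOT EXISTS pgaudit;

ALTER SYSTEM SET pgaudit.log = 'ddl';                   -- audit schema changes only
ALTER SYSTEM SET auto_explain.log_min_duration = '1s';  -- log plans for queries over 1s
SELECT pg_reload_conf();                                -- apply without a restart
```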
[00:05:33] And in addition to all the detail listed here, they include a full summary at the bottom of all the recommended log settings for your database. So definitely check this out if you're interested. Next piece of content: The database has a new user: LLMs, and they need a different database. This is from tigerdata.com, and they're basically talking about how LLMs really love context to understand what you're asking about. Frankly, they're like most humans: if you take a general person, or someone who knows SQL, and ask them to get a particular question answered from a database system, they need all sorts of additional information to understand what these tables are doing and what kind of data they're storing. And they may need context from outside the database to understand how to generate particular queries. Well, LLMs are no different, and they said LLMs really need that context to understand.
[00:06:28] You know, this is an orders table, but what kind of orders is it for? They say here: is it a purchase order? Is it a customer order? Or is it even an experimental table that someone created and forgot to drop five years ago?
[00:06:43] So all of this context would be important to understand how to use that table in a query or not.
[00:06:50] Now, as I was reading this, immediately I'm thinking, okay, comments. You can add comments to different objects within the database. Would that be sufficient, or would you need to do more? Well, they actually propose creating a separate semantic database that lives outside of the actual database, and they have a link to the repo. So they have a component called a semantic catalog, as well as a separate component they're calling an evaluation harness. And they said when you take all of this information and use it to build context about the actual database you're writing SQL queries against, they saw SQL generation accuracy increase by about 27%, so a general improvement from 58% to 86%.
[00:07:41] So definitely interesting. But they're proposing storing very detailed information in this semantic database. In the example here, they have a restaurant table: a description of the table and what it stores, then each column and its purpose, how you would do a particular SQL example against it, and some additional facts to consider when using this table in queries. So a lot of detail really needs to be added, and it definitely needs a human to review the information being generated to make sure it's accurate and makes sense.
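For contrast, the plain in-database comments I was wondering about would look something like this. The restaurant table is from their example, but the column name and the comment text here are my own illustration:

```sql
-- Comments live in the pg_description catalog and show up in \d+ output
COMMENT ON TABLE restaurant IS
  'One row per physical restaurant location; source of truth for the ordering app.';
COMMENT ON COLUMN restaurant.name IS
  'Public-facing restaurant name; not guaranteed unique.';
```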
[00:08:18] But if you want to learn more about this, definitely check it out. Next piece of content: Fun and weirdness with SSDs. This is from vondra.me, and he is doing some performance testing where the results he's getting are quite different from, say, magnetic drives. Some of the assumptions he had made are that random I/O can get close to sequential I/O, that sequential I/O is still faster than random I/O, and that the direction of a sequential scan, forward or backward, doesn't matter. I think all of these are fairly reasonable assumptions. But once he started testing on five different drive types, he saw something he didn't expect.
[00:08:58] So sequential scans had much higher throughput on almost all the drives until you got to really large page sizes, at which point the random scans performed about as well, or even better.
[00:09:10] But what was also strange is that sequential scans in reverse performed much worse than a forward sequential scan, and sometimes even worse than a random scan, which was quite odd to him.
[00:09:22] But if you want to understand more about this, you can definitely check out this blog post.
[00:09:27] Next piece of content: PostgreSQL's incredible trip to the top with developers. This is from enterprisedb.com, and this is a brief post about the Stack Overflow Developer Survey that was released recently. Postgres is still at the top, and seemingly growing, with regard to what database is currently in use, although I am noticing that SQLite is rising up pretty well here too. It's also the top in terms of what database developers want to use, as well as most admired, although everything seems to be dropping off a little bit in 2025. They have a link to the survey here if you want to check it out. There was another episode of Postgres FM last week. This one was on self-driving Postgres.
[00:10:12] So this is based upon the Postgres.AI blog post we discussed, I think probably about two weeks ago, about self-driving Postgres: basically setting up Postgres so it runs itself and optimizes itself, and what is needed to make that happen and put it into place.
[00:10:29] So this episode talks all about that. It's a little bit on the longer side for them. It's almost an hour in length, but if you want to learn more you can definitely listen to the episode here or watch the YouTube video down here.
[00:10:41] Next piece of content: Exploration: migration to CNPG. This is from cybertec-postgresql.com, and CNPG is CloudNativePG, so this is running Postgres on Kubernetes.
[00:10:55] This particular blog post talks about migrating to it, and it offers three ways to do so: pg_basebackup, so you can back up a database and restore it onto CloudNativePG; an import, which is basically a pg_dump and a pg_restore; or logical replication, which is probably what I would use today. But this particular blog post covers the first two methods, so they talk about the pg_basebackup method as well as what they call the import method, the pg_dump and restore. But if you want to learn more, definitely check out this blog post. And the last piece of content: Hacking workshop for September 2025. This is from rhaas.blogspot.com, and next month Robert Haas will be hosting two to three discussions of David Rowley's talk, Writing Fast C Code for Modern CPUs, and applying it to Postgres. This particular talk was given at PGConf 2025, and if you're interested in attending, you can sign up using this form.
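Going back to the CNPG migration methods for a moment, the logical replication route I mentioned would look roughly like this sketch. The publication name and connection string are hypothetical, and you'd copy the schema over first with pg_dump --schema-only:

```sql
-- On the source database:
CREATE PUBLICATION cnpg_migration FOR ALL TABLES;

-- On the CloudNativePG cluster, once the schema exists there:
CREATE SUBSCRIPTION cnpg_migration
  CONNECTION 'host=old-primary dbname=app user=repl_user password=secret'
  PUBLICATION cnpg_migration;
```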
[00:11:53] And now it's time for my consulting corner. I've had a bear of a time today working with an application framework and Postgres, and maybe some of it is a little AWS-related. We have a database that we're wanting to migrate into RDS and use RDS Proxy with it. Now, what's a little different about this database is that it actually sets the time zone to something other than UTC, which I haven't seen with a lot of Postgres databases. Normally it's set to UTC, but this one is set to a different time zone. So we got the app connected to it through RDS Proxy and talking to the database. But if you know anything about RDS Proxy, there are a number of conditions that can cause session pinning, which basically locks a client connection to a particular database connection; that's not great for a proxy when you're trying to do transaction pooling. And what was happening is that the application was setting the session to UTC. It was looking at the database and saying, okay, the time zone is not UTC, so I will set it to UTC.
[00:13:05] And then we found out that as long as it detects a UTC time zone, it won't send that session-level SET command, and then we should avoid the session pinning. So I went to the user the application was connecting as and set its time zone to UTC, because I didn't necessarily want to change the database yet; when the application connects with that user, it would get UTC. I tested it in psql and everything worked as expected. But the application connected and still detected the non-UTC time zone, so I was a little baffled by this. Well, the app actually uses psycopg, and apparently it reads the startup parameters for the connection to determine what the time zone is. So it doesn't even look at the session of the user connecting; it bases its decision on the server's reported configuration. So my set-the-time-zone-at-the-user-level workaround wouldn't work at all. What we ultimately did is say, okay, let's go ahead and make this server UTC. We think we can safely do that because the application was setting that anyway, and there were a few other things we checked just to make sure we were good. It definitely caused a bit of frustration that we had to change the whole database's configuration because the psycopg library chose not to read the actual session state. I don't know if that'll help anyone in the future, but that's basically what we ran into.
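In SQL terms, the sequence looked roughly like this. The role name is hypothetical, and on RDS the server-level change actually goes through the DB parameter group rather than ALTER SYSTEM:

```sql
-- What I tried first: a role-level override, applied at session start.
-- The app still saw the server's reported time zone, so this didn't help.
ALTER ROLE app_user SET timezone = 'UTC';

-- What ultimately worked: changing the server-wide default
-- (on RDS, set this in the DB parameter group instead).
ALTER SYSTEM SET timezone = 'UTC';
SELECT pg_reload_conf();
```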
[00:14:34] I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can find links to all the content mentioned, as well as sign up to receive weekly notifications of each episode. There you can also find an audio version of the show, as well as a full transcript. Thanks, and I'll see you next week.