The Dynamic Duo? | Scaling Postgres 419

[00:00:00] On my own projects as well as on client projects, I've never had to use foreign tables. I haven't used materialized views in my projects either. They've been discussed with client work, but no one's ever opted to really utilize them. But for the first blog post this week we discuss an interesting use case where you use both of them together. [00:00:24] So we'll talk about that, but I hope you, your friends, family and co workers continue to do well. Our first piece of content is Foreign Tables and Materialized Views, a dynamic duo, and we discussed a post about wall log shipping that he had a few weeks back where he was saying if you need access to relatively real time data but you don't want to deal with query cancellations, maybe you should use log shipping as opposed to physical streaming replication. But this blog post is talking about using foreign data tables and materialized views for accessing production data from other data sources, and particularly for running analytics across heterogeneous data sources. [00:01:11] So some of those data sources you may have Legacy systems in MySQL you may have other PostgreSQL systems you need to pull data from you, you may have flat files sitting in object storage or even CSVs. And today you may have parquet tables and all sorts of other file formats as well. So foreign data tables give you a unified SQL interface. But the downside of that is the performance cannot be that great because maybe it's not pushing down things effectively to the database system. And as he says here, in the case of a CSV data source, it's not even indexed. But this is where you pair it with materialized views. [00:01:52] So for example, you create a foreign table to say, an external MySQL server. Then you build a materialized view of that foreign table, and once you've done that, now you can place indexes on it so you can query it highly efficiently. Now, yes, it does mean you're pulling the data over to build the materialized view. But the advantage of this is you can only include the columns you need to do whatever analytics you you're running. So as a quote here now we've turned a slow remote data set into a locally optimized analytical structure. Now, this data is not real time, but then of course to update it, you just need to refresh that materialized view concurrently. So ideally you don't block any existing selects that are happening against it. So he says combining form data wrappers and materialized view is a great match for poorly indexed remote systems because you can create the materialized view and create indexes on it. It's Also a great use case for high latency data sources. [00:02:52] So if it's going to take a lot of time to pull that data over for each query, having that localized and index can really help. Or if you have flat files and large data, maybe you can just extract what you need into a materialized view and then query against that. So I thought this was a great idea. Definitely encourage you to check it out. Next Piece of Content Long running transactions, Job queues and the cascade that Wreaks Havoc this is from Stormatics tech and he's talking about a migration step that caused a database system to I guess almost go down. Basically it held an open transaction snapshot for hours doing an initial data copy and they had a job queue that was beginning to slow down. [00:03:38] So the queue got up to 7 million rows deep, the primary was pinned at 100% CPU across 24 cores and the customer support was receiving complaints about delayed transactions and they said the migration plan worked in staging but when they got to production with the load that it had, particularly with this job queue that set off this cascade. So they had a very large table that was append only sitting at 11 terabytes holding transaction records. But it was the job queue that ultimately I think caused the issue, or I should say it was the rather long copy doing the initial data sync to do the migration in conjunction with the job queue that caused the issue because the database had a snapshot as it was doing this copy. So the autovacuum could not remove dead rows because of the that current copy snapshot and at around 4 million dead tuples in the active queue table selection queries at the 1 second mark and the queue started filling up faster than it could drain. Then the database hit 100% CPU. They said they also ran into a multi transaction SLRU bottleneck because so many transactions were trying to access the same row and they actually had to adjust the queue selection logic to be in the index scans at randomized positions. [00:04:59] So basically yet another indication of be careful with job queue processing and long transactions because they can really get you in trouble. But if you want to learn more, check out this blog post. Next Piece of content why Postgres lacks transparent data encryption this is from pgedge.com and he says, you know Oracle has transparent Data encryption or TDE, SQL Server has it, MySQL has it, MariaDB has it, but Postgres doesn't have it. What's the deal? The answer is it's complicated, but people can't settle on what the project requirements could be and Every time anyone steps up with a proposed solution, typically it's the scope of the work required that knocks those proposals down. But some of the reasons why people are interested in it is because it's a compliance check mark that some institutions require. [00:05:52] It's more convenient to manage encryption at the database level, but even a lot of people say we don't need it, just use Lux, which is basically disk encryption. He did mention there are multiple forks that are available. There is the PGTDE extension which is open source. Right now it still requires percona server for PostgreSQL. We will see at some point if this is no longer required and you can put it in stock community Postgres. There's also enterprise DB offerings for their Postgres advanced server. There's also CyberTech offers PostgreSQL Enterprise Edition as well as Fujitsu offers an enterprise Postgres product as well. And those do include TDE. But the core question it comes down to is what threats do you want defend against? [00:06:41] And he goes over a long history of what people are interested in and then people disagreeing and then talking about the rather large scope of work required to implement a soup to nuts TDE system. And he says it is possible to have encryption at rest today without adopting a proprietary fork. That basically means using file system overlay encryption with Lux. Or maybe your file system itself has native encryption like ZFS or ceph. [00:07:11] Or if you're on a cloud provider you could use their encryption that's built into their system like AWS's EBS encryption. Personally I'm using this for my production systems and I also use file based disk based system encryption for all my other systems. But if you want to go so far as to encrypt things even within a backup. So you've taken a backup and you want to store that elsewhere. You could store it on S3 encrypted volumes as well. But I take it a step further with certain secrets and I actually encrypt that data before it even gets to the database. So it's an application responsibility to encrypt that data before sending it to the database. And when it retrieves that data from the database, it decrypts it back. Now that's definitely not transparent, but this is the solution I have implemented to work around the fact that it doesn't exist in Postgres right now. But if you want to learn more, definitely check out this blog post. [00:08:07] Next piece of content Postgres as an execution environment for AI failure modes, hooks and the Orbit Framework. This is from vibhoorkumar.WordPress.com and this is from a presentation I think he gave at PGCON DEV 2026 and he's talking about a scenario where people are running a lot of workloads in the database, which sounds a little counterintuitive to me. Like he's talking about a transaction call where within the database itself you're calling an LLM prompt. So now within your transaction you've got this 30 second external dependency waiting to return before you can actually finish that transaction. Personally I'm used to this happening at the application layer and not at the database layer. And you would have a transaction perhaps to retrieve this data or set the fact that a job is in a certain state, then the application would call the LLM, get that data back, and then it would update the status in a separate transaction. So of course if you're doing this you're going to run into a lot of performance problems. He also mentions retry storms that can happen and basically the way to help mitigate that is just being, well, jittered. If you, you do have to do retries. Then he talks about a rollback problem where you're doing within a transaction. Again, the call to the LLM, which I personally wouldn't do, I would do this in the application and then simply insert the data into the database. So this blog post basically takes from the perspective if you're doing a lot of work in the database, but personally I do a lot of this work in the application and merely use the database and as the transactional data store. But if you are putting more of this work in the database, definitely check out this blog post to get some guidance on how to do it better. Next piece of content pgvector0.8.2 and the trouble with parallel HNSW this is from thebill.com and there is a new CVE for pgvector. So basically if you have something below 0.8.2, you should definitely upgrade it because it does leave you vulnerable if you are running parallel HNSW index builds. And then he goes over the detail about this CVE if you want to learn more. [00:10:25] Next piece of content how to hack logical replication in PostgreSQL insights from contributors this is from PostgreSQL and this is basically an advertisement about they're looking for basically more people to help improve logical replication in Postgres and they give an overview of the system and basically educating you on how it works and what are the requirements to start working in that area of the code base and best practices to follow while you're developing different improvements. So if you're interested in that, definitely check out this blog post. [00:11:01] Next piece of content Graph queries across billions of rows of scattered data with Postgres and Apache Age. [00:11:07] This is from Snowflake.com and this blog post is focused on the extension Apache age or age that allows you to use graph queries to access external data. So basically you can stay within Postgres and run graph queries against data, and these queries return standard PostgreSQL result sets. So that means you could wrap a cypher query in a cte, and she has a couple of examples here showing how you could use this. So check this out if you're interested. Next piece of content PostgreSQL locking behaviors that trip people up this is from Dev 2 by Shinya Kato. [00:11:46] The first one he talks about is when you have an access exclusive request that gets queued, subsequent queries get blocked in a chain. So that means if you have a lot of selects or other activity running on the system and and then you need to add a column through an alter table, this is now waiting to run because you have these other transactions running ahead of it. Well, once this comes, every select statement that comes after it is forced to wait behind this one because this needs its opportunity to run. So this causes a long wait queue. So this is why when you run migrations you definitely want to use a lock timeout so that this alter table gets cancelled if it has to wait too long to acquire that access exclusive lock. [00:12:35] Next example invisible deadlock caused by foreign key constraints. And this is because it needs to lock the parent table when there's a foreign key present. And if you have one session that updates one row, another session updates another row, well then the first session could potentially be locking the reference row session two, and session two could be locking the reference row session one. And now you've got a deadlock. So basically opposite orders cause deadlocks, so that's what you definitely want to try to avoid. Next is a deadlock between two inserts caused by a unique constraint duplicate check. So again, if you're trying to insert the same values in multiple sessions in a different order, you can run into deadlocks. [00:13:18] Next is only the transaction ID wraparoundprevention vacuum refuses to yield on conflict. So if your vacuum hits the freeze age max to prevent wraparound vacuum, that vacuum won't be canceled by certain activities, like for example an ultra table to create a partition, it would normally be canceled by a regular vacuum, but not one that is to prevent wraparound. And the last one is Vacuum's hidden access exclusive phase. [00:13:47] So the last part of the phase where it returns the disk space to the operating system, this requires an access exclusive lock. Now usually it just waits to do that on the primary, but on the replica it can cause more problems and potentially cancel a query. But check this out if you want to learn more. [00:14:07] Next piece of content what else is in there? This is from thebuild.com and he's talking about a situation where there was a CVE related to Refint, which is a vintage referential integrity workaround before Postgres had foreign keys. [00:14:23] So it's still in the contrib tree. So basically this blog post is a call to everyone taking accounting of what extensions you're running, because these extensions could have vulnerabilities, so you just want to make sure that you're actively using whatever extensions are installed and remove ones that you're not. So definitely a good practice to do. [00:14:46] And the last piece of content toast where PostgreSQL hides big values. This is from boringsql.com and this blog post is all about Toast and how and when it gets triggered to be stored. In Toast he talks about the different toasting strategies, whether things get compressed, does it get stored out of line based upon the data type, etc. And goes into a lot of depth with regard to how to things are stored and managed. And Toast in terms of talking about different encryption as well. So if you're interested in that, definitely check out this blog post. I hope you enjoyed this episode. Be sure to check out scalingpostgres.com where you can find the links to all the content mentioned, as well as sign up to receive weekly notifications of each episode there. You can also find an audio version of the show as well as a full transcript. Thanks. I'll see you next week.

Show Notes

Episode Transcript

Other Episodes

Episode 374

Migrating Millions Of Databases | Scaling Postgres 374

Episode 326

Faster Index Creation | Scaling Postgres 326

Episode 206

100x Optimizations, Auditing, Document Schema Designs, Checksums | Scaling Postgres 206