PG14 Index Bug, View Permissions, Logical Replication Conflicts, AlloyDB Under the Hood | Scaling Postgres 218

Episode 218 | June 05, 2022 | 00:17:03

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss a severe bug with index creation in Postgres 14, view permission options, dealing with logical replication conflicts, and looking under the hood of AlloyDB.

To get the show notes as well as get notified of new episodes, visit:

https://www.scalingpostgres.com/episodes/214-pg14-index-bug-view-permissions-logical-replication-conflicts-alloydb-under-the-hood/

Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about a bug in Postgres 14 index creation, view permissions, logical replication conflicts, and AlloyDB under the hood. I'm Creston Jamison and this is Scaling Postgres episode 218.

[00:00:24] All right. I hope you, your friends, family and coworkers continue to do well. Our first piece of content is Five minutes of Postgres episode 20: an important bug in Postgres 14 with REINDEX and CREATE INDEX CONCURRENTLY, and using the amcheck extension. This is from pganalyze.com, and this is a very important notice. Lukas uses a notification that was on Twitter and some other sites to give insight into this bug. Basically, there's a chance for index corruption if you've done a CREATE INDEX CONCURRENTLY or a REINDEX CONCURRENTLY in version 14. He has a link to the Git commit here, where they mention that when doing index creation or a reindex with the CONCURRENTLY option, it can, quote, miss heap tuples that were HOT updated and HOT pruned during the index creation. Someone said that they identified this bug on an active database on which they were doing a reindex or create index operation. So unfortunately, you are at risk of experiencing this as well, and they are actively working on a fix. Now, what you can do to see if your indexes are corrupted is use the amcheck extension, and there's a specific function you can run called bt_index_check to check your indexes for consistency. That's the information we seem to know about for now, and presumably they are looking to accelerate releasing a new version with this patch included. But if you want to learn more, definitely check out the most recent episode of Five minutes of Postgres.

The next piece of content is view permissions and row level security in PostgreSQL. This is from cybertec-postgresql.com, and they're talking about a new feature in Postgres 15: security_invoker views. Basically, views typically run with the permissions of the person who created them. That way you can create a view, give it access to multiple tables, and then grant a role or a user the ability to use that view without them having permissions on the base tables. He also says, as a quick security note, that if you're doing this you may want to add a security_barrier when you're creating the view, so definitely check out the Postgres documentation on that if you want to learn more. But this works a bit differently than row level security; it's possible to bypass row level security if you're using views. So he has an example where a user creates a table, inserts two values into it, and then enables row level security and adds a policy where a user can essentially only see the row that user owns, since their name was inserted into the RLS user column. Then that user creates a view and grants permissions on it to a new user. Now, when this new user queries through the view, he can see both rows, so essentially the view allows him to bypass the row level security. But Postgres 15 has security_invoker views, which apply the permissions of the person who is invoking the view as opposed to the person who created it. So here he altered the view and set security_invoker to on, and when Joe goes to query through the view, he can only see his own row.
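To make that a little more concrete, here is a minimal sketch of the Postgres 15 syntax involved. This is not taken from the blog post itself; the table, policy, view and role names are all made up for illustration.

    -- hypothetical example: row level security plus a security_invoker view (Postgres 15)
    CREATE ROLE joe LOGIN;
    CREATE TABLE docs (id int, rls_user text, body text);
    ALTER TABLE docs ENABLE ROW LEVEL SECURITY;
    CREATE POLICY docs_owner ON docs
        USING (rls_user = current_user);

    -- a view normally runs with the permissions of its creator,
    -- so granting it out can let callers bypass the policy above
    CREATE VIEW docs_view WITH (security_barrier) AS
        SELECT * FROM docs;
    GRANT SELECT ON docs_view TO joe;

    -- Postgres 15: make the view run with the caller's permissions instead,
    -- so the row level security policy applies when joe queries it
    ALTER VIEW docs_view SET (security_invoker = on);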
So if you're using row level security, having security_invoker set on your views seems to be important, but there may be other use cases where you want to use it as well. If you want to learn more about this, definitely check out this blog post.

[00:04:07] The next piece of content is how to handle logical replication conflicts in PostgreSQL. This is from postgresql.fastware.com, and they're talking about a feature that enables you to identify a replication conflict and then essentially skip it. Now, that has a certain amount of danger to it, because you may end up missing a row or including a row you shouldn't, and then can you still replay future actions that need to be applied on that replica? But this blog post goes through the process of how you can do that. Basically, they have a publisher database, and when an update happens it gets written to the WAL, the WAL sender sends it to the subscriber's apply worker, and if there's some sort of inconsistency it can produce an error. In Postgres 15 there's a new subscription option, disable_on_error. Essentially it disables logical replication altogether for that subscriber if an error happens, and then you can decide how to deal with it, as opposed to it retrying the same thing again and again. They also mention that Postgres 15 has a new pg_stat_subscription_stats view where you can look at more information, and it gives additional context with regard to an error. In the example, they created a table on the publisher, inserted a value, and created the publication. Then on the subscriber node they created the table there and created the subscription with disable_on_error set to true, so it's going to disable itself if anything goes wrong. Now, on the publisher they begin a transaction, insert one value, and commit it. Then they run a second transaction, and in this one they are actually inserting a value that already exists in the subscriber's table, because it was already inserted on the publisher and got sent to the subscriber. So essentially they're going to be sending a duplicate value. Now, when they created the table on the subscriber, they made the ID unique, so they're going to run into a conflict on the subscriber when they try this, but not on the publisher, because the table there doesn't have that unique constraint. So on the publisher you see the values 5, 1, 2, 3, 4, 5, 6, 7, 8 and 9, so all of the additional inserts are done without error. When you look on the subscriber, they check the pg_stat_subscription_stats view and they can see the name of their subscription, mysub, and you see the apply error count, so they couldn't apply a change; that's an error. Then they also query the pg_subscription table and they see that the subscription has been disabled, subenabled is false, and that it is set to disable on error. And when they select from the table, they can only see the original 5 that was inserted during the initial sync and the row from the first transaction, but nothing from the second transaction. So they had a conflict, and this is the error: duplicate key value violates unique constraint on the table, and it says the ID 5 already exists, which we knew about. It says it was processing remote data for a replication origin with a particular identifier, during an insert for the replication target relation, finished at a particular LSN, and that logical replication subscription mysub has been disabled due to an error.
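As a rough sketch of the pieces being used here, assuming a publication named mypub already exists on the publisher, the subscriber-side commands might look something like this; the connection string and object names are placeholders rather than the ones from the article.

    -- create the subscription so it disables itself on an apply error (Postgres 15)
    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (disable_on_error = true);

    -- check whether any apply or sync errors have been counted
    SELECT subname, apply_error_count, sync_error_count
    FROM pg_stat_subscription_stats;

    -- check whether the subscription has been disabled
    SELECT subname, subenabled, subdisableonerr
    FROM pg_subscription;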
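The resolution they walk through next leans on pg_replication_origin_advance. A hedged sketch of that kind of skip is below; the origin name and LSN are placeholders, and as discussed in a moment, picking the wrong LSN can silently lose changes.

    -- on the subscriber: find the replication origin that belongs to the subscription
    SELECT roname FROM pg_replication_origin;

    -- advance the origin past the failing transaction's LSN (placeholder values)
    SELECT pg_replication_origin_advance('pg_16395', '0/1566D10'::pg_lsn);

    -- then re-enable the subscription so apply can continue
    ALTER SUBSCRIPTION mysub ENABLE;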
Now, how you can resolve this, and again, you need to be very careful about doing this: on the subscriber side, they did a pg_replication_origin_advance, passing in the replication origin and a particular LSN. Now, they happened to just increment it by one from what was reported; I don't know if you can always just do that. They may have used, as they mention here, pg_waldump to identify the particular LSN; I'm not sure if they did that or just assumed they could increment it by one. But upon doing that, it skipped over the transaction that had the problem, and it was able to replicate 9, which was a third transaction that was done on the publisher but not yet sent to the subscriber. So basically, they are showing that the subscription was able to skip over that whole problem transaction but still apply a subsequent transaction on the subscriber. And a very important thing to look at here is what can go wrong if you specify the wrong parameters. That's the danger of skipping transactions: getting the LSN right so you advance to the correct position. And I believe, from what they're saying here, they used pg_waldump to identify which transaction to skip. So this is a feature to be able to identify errors and understand exactly where they're coming from, and it shows an example of how to skip over them. But again, there are risks associated with skipping, and you may want to resolve it some other way. Maybe you remove the unique constraint on the subscriber, or maybe you delete the row that's there and correct it manually in order to get replication up and running. It's pretty much up to you. But if you want to learn more about this feature, definitely check out this blog post.

[00:09:11] The next piece of content is AlloyDB for PostgreSQL under the hood: intelligent, database-aware storage. This is from cloud.google.com, and they're talking about AlloyDB. This is the new hosted Postgres service from Google, and one thing they've done a lot of work on and put a particular emphasis on is the disaggregation of compute and storage. Now, they use this word in about twelve different places in this blog post; I would call it a separation of compute and storage, frankly. But basically, they get some advantages by separating the compute nodes from the storage nodes, and the storage system is kind of where all their secret sauce seems to be located. You still have a primary instance and you still have replicas, but of course the primary is the only one receiving writes, and it goes through a WAL log process to store that log immediately. Then they have log processing servers that take those logs and apply them to the block storage for the primary and the replicas across the different zones to read. And they talk about a bunch of different benefits of doing this. Now, one big advantage of this technique, and they go through a lot of detail on how it is constructed, is that because you separate the compute units from the storage units, you can now have as many replicas as you want without having to duplicate all the storage. So essentially what they're saying is you have all of your data replicated across three different zones, and maybe there's some duplication within a zone, I'm not sure, but all the data is replicated that way for the safety of the data, and you can have as many replicas as you want reading from that data within each zone.
Whereas in a normal Postgres replication setup, each replica has its own independent storage. So if you want to have five replicas, you're going to have five copies of your data, and that can get pretty expensive if you have a terabyte-scale database. Whereas with this, you can essentially store fewer copies but have as many compute nodes serving as replicas as you want. So that is definitely an advantage of this type of infrastructure, and it would be interesting to see Postgres eventually support something like this in an open source way in the future. Related to this, there's been another post called AlloyDB for PostgreSQL under the hood: columnar engine. In this one, they're talking about the ability for AlloyDB to look at the queries being sent and use artificial intelligence to dynamically assign data to a column format, keeping some data in columnar form so it can answer aggregate queries very quickly. They're talking about acceleration of up to a 100x improvement for certain queries; the example here is doing a sum, and aggregations like sums or averages or things of that nature can get a big boost from a column-oriented format of the data. In other examples they saw a 19x improvement, an 8x improvement, and a 2.6x improvement. So it shows different advantages of the column format compared to just storing all your data in rows, and apparently it does this dynamically based on the queries it has seen. So if you want to learn more about that, definitely check out this blog post as well.

[00:12:28] The next piece of content is using a virtual environment with PL/Python3 in PostgreSQL. So Python enables you to set up virtual environments for different versions, or potentially for different projects that you're working on. The thing about Postgres, of course, is that it can run Python, and they say there's a way to actually set up a particular virtual environment for your Postgres installation to determine which Python version it's going to be running, and you can do a server-wide configuration for it. Basically, you go into the systemd configuration and append a source command to set up the Python environment before running the actual executable for the Postgres postmaster. By doing that, you can set up a particular Python version for your Postgres installation. Then down below, there is a way in a development environment to set a per-session configuration. You use a Python utility called virtualenv, and they work through the process of setting that up, creating a particular extension within your Postgres database, and activating a particular Python virtual environment with a function. So if you're interested in setting up virtual environments for Postgres in terms of your Python environments, definitely check out this blog post.

The next piece of content is new ORDER BY and LIMIT OFFSET pushdown features increase agility in the MySQL foreign data wrapper. The concept of pushdown is that when you have a foreign data wrapper, Postgres is talking to another database, maybe another Postgres database, and you're using a foreign data wrapper to do it; in this example, you're using MySQL. When you use a pushdown feature, you're able to ask that remote database to do parts of the query, in terms of ORDER BYs or offsets or joins, have that database do the work, and just give you the results of that query.
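As a loose illustration of what pushdown looks like from the Postgres side, assuming mysql_fdw is installed and a foreign table has already been mapped, you can use EXPLAIN VERBOSE to see what gets sent to the remote server; the table and column names here are hypothetical.

    -- with the new pushdown support, the ORDER BY and LIMIT should appear
    -- in the remote query sent to MySQL instead of being done locally
    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT id, total
    FROM remote_orders      -- a foreign table created through mysql_fdw
    ORDER BY total DESC
    LIMIT 10;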
Now, they're talking about the new features that were added: an ORDER BY pushdown, a LIMIT OFFSET pushdown, and the combination of both being supported as a pushdown. And they looked at the performance impact of this. Prior to implementing this pushdown functionality, a particular query would return in 48 seconds, or 48,000 milliseconds, whereas with the pushdown functionality it ran in 31 milliseconds. So that's a huge performance benefit. If you're using Postgres with a MySQL foreign data wrapper, you definitely want to upgrade to get some of these performance benefits from these pushdown capabilities. But if you want to learn more, definitely check out this blog post. And these last two posts were from enterprisedb.com.

[00:15:04] The next piece of content is effective PostgreSQL cluster configuration and management using PGO version 5.1. This is from crunchydata.com, and they're talking about a few enhancements that have been added to their Postgres operator for Kubernetes. The first one is rolling database restarts. You used to have to trigger these manually, but now it happens automatically: if you make a configuration change that requires a restart of the database for that change to take effect, apparently the system will now do that restart for you.

[00:15:33] Second is pod disruption budgets. Basically, you can specify how many replicas you want to keep around and available to maintain the availability of your cluster if there's any kind of incident with a pod going down. And third is support for manual switchover and failover. So if you want to fail over, or have a particular Kubernetes instance be the primary, you should be able to designate that now. But if you want to learn more, definitely check out this blog post.

[00:16:02] Next piece of content: the PostgreSQL Person of the Week is Virginie Jordan. If you're interested in learning more about Virginie and her contributions to Postgres, definitely check out this blog post.

[00:16:13] And for the last piece of content, we did have another episode of the Rubber Duck Dev Show this past Wednesday evening. This one was on reviewing the 2022 Rails community survey. This was recently published, and we go through the survey and give our reactions to the different responses people made, as well as our perceptions of what that means for the Rails community. So if you're interested in that content, we welcome you to check out our show.

[00:16:38] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.

Other Episodes

Episode 95

January 06, 2020 00:11:51

max_locks_per_transaction, Logical Replication, Backup & Recovery, Pgbouncer | Scaling Postgres 95

In this episode of Scaling Postgres, we discuss max_locks_per_transaction, logical replication, managing backup & recovery and pgbouncer connection pooling. To get the show notes...


Episode 27

August 27, 2018 00:18:04

Multiple Databases, Slow Queries, Sorting Performance | Scaling Postgres 27

In this episode of Scaling Postgres, we review articles covering scaling with multiple databases, detecting slow queries, and sorting performance. To get the show...


Episode 114

May 18, 2020 00:20:28

Data Directory, Vacuum Features, Binary Performance, Network Latency | Scaling Postgres 114

In this episode of Scaling Postgres, we discuss the Postgres data directory, new vacuum features, binary storage performance and the impact of network latency....
