Episode Transcript
[00:00:00] In this episode of Scaling Postgres, we talk about practical partitioning, understanding pg_repack, explaining buffers, and multi-row constraints. I'm Creston Jameson and this is Scaling Postgres, episode 171.
[00:00:24] Alright, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is PostgreSQL 14 Beta 2 Released. This is from postgresql.org, and they've made the announcement of Postgres 14 Beta 2 being released. They mention some of the changes since Beta 1, predominantly bug fixes, but if you're interested in trying this version to see if it works for your use case, you can go ahead and check that out.
[00:00:53] The next piece of content is the website for Postgres Vision. This was an online conference that happened this past week. You can go to postgresvision.com and click Watch Now to view the online talks that were presented. Now, related to that, there's another post called Practical Partitioning in Production with Postgres, which took place at Postgres Vision 2021. This is from vyruss.org. It's a very simple post, but it has a link to a YouTube video of the talk that was given at Postgres Vision. This was a very good talk explaining partitioning and the things to keep in mind when you're considering moving to it. As a reminder, partitioning is where you take a very large table and break it up. Maybe you want to partition a large table by date, by time, by status, or by a particular customer ID. Basically, that makes the tables more manageable: vacuum can more easily process each partition, and the individual indexes and tables are smaller, so scans only need to look at a small portion of the data at a time. It's a really great presentation, including at the tail end, where he describes a real-world scenario: you have a 20 terabyte table, and how can you convert that table to a partitioning scheme without significant downtime? He describes a process for doing it. Now, there are two hacks in his description that make me nervous, but it appears they should work. So if you have a particularly large table and you want to find a way to move to a partitioning scheme with little or no downtime, definitely check out this presentation from Jimmy Angelakos.
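To make the idea concrete, here's a minimal sketch of declarative range partitioning by date; the table and column names are invented for the example and aren't from the talk:

    CREATE TABLE measurements (
        created_at timestamptz NOT NULL,
        device_id  bigint NOT NULL,
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    -- One child table per month; inserted rows route to the matching partition automatically.
    CREATE TABLE measurements_2021_06 PARTITION OF measurements
        FOR VALUES FROM ('2021-06-01') TO ('2021-07-01');
    CREATE TABLE measurements_2021_07 PARTITION OF measurements
        FOR VALUES FROM ('2021-07-01') TO ('2021-08-01');

    -- An index defined on the parent is created on each partition individually,
    -- so every partition's index stays small.
    CREATE INDEX ON measurements (created_at);

Queries that filter on created_at can then skip irrelevant partitions entirely through partition pruning, and vacuum works on each partition separately.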
[00:02:37] The next piece of content is Understanding pg_repack: What Can Go Wrong and How to Avoid It. This is from percona.com. They're talking about a utility called pg_repack, which allows you to compact your indexes or tables. Although, since Postgres 12 and 13 implemented concurrent reindexing, I've basically moved to using REINDEX CONCURRENTLY for indexes and would only use pg_repack for tables, particularly since with version 14 coming up it even handles reindexing partitioned tables concurrently. But I would say this is probably one of the best tools to use if you want to compact your tables due to bloat issues, because it does it in an online manner, unlike VACUUM FULL or CLUSTER, which lock the whole table. So it does this without taking the table offline. But there are some things it has to do in terms of locking that can cause issues, so you want to be very careful how you run pg_repack, and he emphasizes that in this post. It goes through the process of what pg_repack does. I'm not going to describe the whole process here, but it talks about different areas where you could run into problems. Most notably, there are about three places where an ACCESS EXCLUSIVE lock happens, which means it must get full access to the table with nothing else accessing it, including SELECT statements. So it's critical to run pg_repack when the system is in a lower-traffic state: either shut down workers if you can, or do it in an off-hours period. Basically, it needs multiple opportunities to get these ACCESS EXCLUSIVE locks on the table you're trying to repack. The first one happens when creating an initial empty template of the objects it needs to compact, the next one when switching over, and then another during cleanup. The other thing to keep in mind is that it also cancels statements in order to acquire this lock, and there is a timeout associated with that; but by default, if it doesn't successfully cancel those statements, it then moves on to terminating them. He says specifically that you're going to want to use --no-kill-backend to avoid that termination step. Instead, it will just skip repacking that table and give a warning message similar to "WARNING: timed out, do not cancel conflicting backends" followed by "INFO: Skipping repack <table name> due to timeout". DDL is also prohibited from running while pg_repack is running, so trying to add a column or add an index won't succeed because of the locks it obtains during the repack operation. But this is a great post, and if you want to learn more about pg_repack, definitely check out this one.
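As a rough sketch of what that looks like in practice (the host, database and table names are placeholders), a single-table run that skips rather than terminates conflicting backends might look like this:

    pg_repack --host localhost --dbname mydb \
              --table public.orders \
              --wait-timeout 60 \
              --no-kill-backend

If pg_repack can't acquire its lock within the timeout, it emits the warning above, skips that table and moves on, instead of terminating the sessions holding conflicting locks.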
[00:05:19] The next piece of content is Explaining the unexplainable, part 6: buffers. This is from depesz.com, and he's talking about how, when you're doing an EXPLAIN or EXPLAIN ANALYZE, you can turn on BUFFERS to get information about shared buffers and disk access for the query. So for example, here he's doing an EXPLAIN (ANALYZE, BUFFERS), and you can see that the plan output includes buffers information. For the shared memory area, the hit in the cache, or shared buffers, was 480 blocks, which is the number of pages, and what was actually read from disk was 17,662. You even hit the buffers at the planning stage, so you can see 19 blocks were hit while doing the planning. Now, this is a simple representation of what BUFFERS shows, but it can show a whole variety: it covers shared hit, read, dirtied and written; local hit, read, dirtied and written; and temp read and written, and he goes through each one. Basically, a hit means the information was found in shared buffers. A read means it had to go to disk to read it. Written means that, as part of that statement, it had to write to disk, usually to free some space in shared buffers, he mentions. And finally, dirtied can happen in certain cases when there are new pages in a table or an index: a page gets marked as dirty and must eventually be written to disk. In terms of what shared, local and temp mean: shared is basically the normal database objects like tables, indexes, materialized views and things of that nature; temporary objects that are part of your session are considered local, so it's basically local to the session you're running in; and temp is for temporary data used in sorting or hashing and things of that nature. So this is a very comprehensive post explaining all of this in detail and giving you some examples, and I highly encourage you to check it out.
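For reference, here's a minimal sketch of the kind of command he's describing, with a placeholder query; the buffer numbers shown echo the ones read out above:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM my_table WHERE created_at > now() - interval '1 day';

    -- The plan output then includes lines such as:
    --   Buffers: shared hit=480 read=17662
    --   Planning:
    --     Buffers: shared hit=19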
The next piece of content is PostgreSQL: Constraints over multiple rows. This is from cybertec-postgresql.com, and he's talking about when you want to apply some sort of constraint over multiple rows, not just one. It's pretty easy to add a constraint that applies to a single row of data you're trying to insert or update, but what about when you want a constraint across multiple rows? Now, this isn't an object you can create in Postgres, but he explains how you would implement something that satisfies this type of constraint. For example, he has a scenario where planes are owned by multiple owners, so he has an owners table that designates the percentage ownership of each plane. Basically, the total ownership should not exceed 100%. You may have one row where someone owns 100%, or two rows where one person owns 50% and another person owns 50%. So essentially you have a 100% constraint across multiple rows. And if two users are updating ownership for a particular plane at the same time, you're going to run into conflicts, because it's possible to update the system such that a plane ends up with more than 100% ownership, and he shows that in this diagram here. One scenario is to lock the table in exclusive mode. That's really hard to live with, because now you've got a really heavy lock and not much concurrency on that table, since any time anybody wants to insert into it, they have to lock the whole thing. So that's not a great solution, and he describes that here. The most lightweight way to do it is with serializable transaction isolation, which essentially allows only one of these updates to succeed when they conflict. For example, user one starts its owner assignment with BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE, does its inserts, and runs a SELECT to validate that the resulting total is still 100%. That's good, but it hasn't committed yet. Meanwhile, user two starts another transaction, also at the serializable isolation level, and does its inserts. Then the first transaction commits and succeeds, but the second hasn't committed yet, and at the point when it's ready to commit, it receives an error that it could not serialize access. So if you need concurrent access to information like this, the serializable isolation level can help you with that. Now, he says someone proposed: could you do SELECT FOR UPDATE, which basically locks a single row for update? But that's the problem he mentions here: it's a single row, so it doesn't handle existing plus future rows that may appear due to inserts, whereas this technique does. So it's an interesting use case of putting constraints on multiple rows, and if you're interested in that, definitely check out this blog post.
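Here's a simplified sketch of that two-session interaction; the owners table and column names are placeholders rather than the exact schema from the post:

    -- Session 1
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO owners (plane_id, owner_name, fraction) VALUES (1, 'alice', 0.5);
    SELECT sum(fraction) FROM owners WHERE plane_id = 1;  -- check it's still <= 1.0
    COMMIT;

    -- Session 2, running concurrently and committing after session 1
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO owners (plane_id, owner_name, fraction) VALUES (1, 'bob', 0.6);
    SELECT sum(fraction) FROM owners WHERE plane_id = 1;
    COMMIT;
    -- ERROR:  could not serialize access due to read/write dependencies among transactions

The failed transaction can simply be retried; on retry, its sum check sees the committed rows, and the application can reject the assignment that would exceed 100%.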
[00:10:03] The next piece of content is From MD5 to SCRAM-SHA-256 in PostgreSQL. This is also from cybertec-postgresql.com, and they're talking about moving from MD5 to SCRAM authentication and the reasoning behind it. Basically, there are more vulnerabilities in MD5 cryptography compared to SCRAM-SHA-256, and SCRAM is a better way of handling passwords. He has a great list of steps for doing the upgrade. Number one, upgrade your client software, because you're going to want all your clients to be able to support SCRAM as opposed to just MD5. So that's the first step: make sure you have up-to-date versions of the client software that talks to your Postgres database. Second, change the password_encryption parameter in your Postgres configuration. This is the default password encryption used when passwords are set, so from now on, newly set passwords will use SCRAM and not MD5. Third, you need to reset all the passwords. There's a psql backslash command you can use for that, or you can do an ALTER USER and set the password that way as well. And fourth, change the authentication method that clients connect with in pg_hba.conf. So if you have password authentication, it will usually say md5 in pg_hba.conf, and you would change that to scram-sha-256. This is a great post explaining how to move from MD5 authentication to SCRAM, so if you're interested in that, check out this blog post.
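As a hedged sketch of steps two through four on the server side (the role name and the pg_hba.conf lines are just examples):

    -- 2. Make SCRAM the default for newly set passwords:
    ALTER SYSTEM SET password_encryption = 'scram-sha-256';
    SELECT pg_reload_conf();

    -- 3. Re-set each password so a SCRAM verifier gets stored
    --    (psql's \password meta-command does the same thing, but hashes
    --    the password client-side instead of sending it in cleartext):
    ALTER ROLE app_user WITH PASSWORD 'use-a-real-password-here';

    -- 4. In pg_hba.conf, change lines like
    --      host  all  all  10.0.0.0/8  md5
    --    to
    --      host  all  all  10.0.0.0/8  scram-sha-256
    --    and reload the server.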
[00:11:35] The next piece of content is A tale of making company-wide standard psqlrc. This is from depesz.com, and it explains the psqlrc file and how you can configure defaults for working with the psql client. They had a use case where they had thousands of Postgres servers at this company, and when they logged into a particular server, they wanted it to output a variety of information to confirm that they were indeed logging into the correct server. Now, they had an existing function in the dba schema called smi, and it would output things such as the project, the environment, the cluster, the particular role you're connecting as, et cetera. So for example, when you logged in, it would say something like: this is the magic project, in the production environment, for cluster one, connecting as the backup DB role, along with the process ID. Basically, that was sufficient to output what they were seeing here. But then he asked, hmm, can I connect to the postgres database? That resulted in an error, and he had to rework it. Then he asked, can I connect through PgBouncer? He ran into another issue, so he had to refactor it again. Then he asked, can I connect to the pgbouncer database that PgBouncer itself runs? And he ran into more problems. So basically, this post is an iteration on the psqlrc file and all the modifications he had to make to get it working. If you're interested in all sorts of ways you can configure your psqlrc file, definitely check out this post.
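As a rough illustration of the kind of guard such a psqlrc ends up needing (this is not the actual file from the post, and it doesn't cover the PgBouncer cases he ran into), you can check whether the dba.smi() function exists before calling it:

    -- Only call the info function where it actually exists, so that
    -- connecting to databases without the dba schema doesn't error out:
    SELECT EXISTS (
             SELECT 1
               FROM pg_proc p
               JOIN pg_namespace n ON n.oid = p.pronamespace
              WHERE n.nspname = 'dba' AND p.proname = 'smi'
           ) AS has_smi \gset
    \if :has_smi
      SELECT * FROM dba.smi();
    \endif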
[00:13:14] The next piece of content is Zero downtime schema migrations in Postgres using views. This is from fabianlindfors.se, and he's describing an interesting way of abstracting table changes away from your applications or end users, because normally there's a long process for doing things like renaming a column without disrupting access to the table or its information. He has a unique way of doing it by combining schemas and views. Basically, the schemas are versioned, and each schema has its own views for accessing particular tables. So for example, he has a users table and a schema A, which is just a versioned schema, and within schema A he creates a view to access the users table. Everything is accessed through these views. Then, when you connect as a user or an application, you set your search path to the A schema, and when you do a SELECT from users, you get the information in the table as reflected through the view. Now, how do you update a column in that table? For example, he did an ALTER TABLE users and renamed the column from id to user_id. The view handles that: a user connecting through the A schema essentially keeps working, because the view now knows to use the user_id column instead of id. So you don't need to change anything for the A users. Then what you do is create another versioned schema, he called it schema B, with a new view using the new definition. As he mentions, older clients can still access the old column name through schema A, while new clients connect through schema B. Now, I wouldn't necessarily implement this because it seems to be a fair amount of work, and I'm sure there are use cases where it would not work out as conveniently. These kinds of changes are rare enough that I'm not sure the overhead of managing all of these schemas and views makes sense, but it's definitely an interesting thought experiment and technique for hiding backend table changes. So check this blog post out if you're interested.
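Here's a simplified sketch of the pattern with invented object names; it's not the exact code from the post:

    CREATE TABLE public.users (id bigint, name text);

    -- Version "a": clients never touch the table directly, only the view.
    CREATE SCHEMA a;
    CREATE VIEW a.users AS SELECT id, name FROM public.users;

    -- An application pins itself to a version via the search path:
    SET search_path TO a;
    SELECT id, name FROM users;

    -- Rename the underlying column. The existing view keeps exposing it as "id",
    -- because the view references the column itself, not its name:
    ALTER TABLE public.users RENAME COLUMN id TO user_id;

    -- Version "b" exposes the new name for new clients:
    CREATE SCHEMA b;
    CREATE VIEW b.users AS SELECT user_id, name FROM public.users;
    SET search_path TO b;
    SELECT user_id, name FROM users;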
The next piece of content is Using GeoDjango and PostGIS in Django. This is from pganalyze.com, and they're talking about using these tools with PostGIS and Postgres to work with spatial data. So if you're interested in that type of content, you can definitely check out this blog post.

[00:15:40] The next piece of content is that the PostgreSQL Person of the Week is Federico Campoli. So if you're interested in learning more about Federico and his contributions to Postgres, definitely check out this blog post.
[00:15:52] And the last piece of content is not exclusively related to Postgres, but it is the first episode of a new show I started called The Rubber Duck Dev Show, and the first episode is all about testing. Now, this is a longer format. It's not as highly edited as Scaling Postgres is, so this is about an hour in length. But if you're a developer and are interested in some more long form content that talks about testing, maybe you'd like to check that out.
[00:16:20] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.