Episode Transcript
[00:00:00] In this episode of Scaling Postgres, we talk about scaling suggestions, slot safety, SCRAM channel binding, and effective cache size. I'm Creston Jameson and this is Scaling Postgres episode 125.
[00:00:23] All right, I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is Some SQL Tricks of an Application DBA. This is from hakibenita.com, and he's giving some scaling suggestions, essentially, for different ways to use the database. Now, the comic that is embedded here is particularly great because you have a couple of developers trying to figure out the best way to optimize a query, and then a magical DBA appears and the query time drops from seconds down to a small fraction of a second, and they're amazed at the magic that was done. But moving on to the actual content of the post, it goes over a series of suggestions for working with the database to make things more performant or more efficient, and I'll just mention some of the highlights here. First, update only what needs updating. So don't, for example, blindly run an update function against the whole table; be sure to use a WHERE clause to reduce the surface area that you're updating, and in PostgreSQL's case this also limits the amount of bloat you're creating in the table. Next is disable constraints and indexes during bulk loads. Very important. Next is use UNLOGGED tables for intermediate data. Unlogged tables don't get written to the WAL, so they are much faster; the problem is there's no crash safety, so if you have a crash in the middle of a load, the data is essentially lost, because recovery is totally unaware of that table's contents and is not going to try to rescue it. The next one is implement complete processes using WITH and RETURNING. So basically, use CTEs to build a stepwise process, feeding the results of a previous query into the next one. He addresses the issue of identifying duplicates and resolving them with a CTE, although he does mention some caution when executing DML in a CTE, so you need to be aware of that. But I've noticed a lot of programmers like using CTEs; they're easier to reason about compared to, say, subqueries.
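To make a couple of those concrete, here is a minimal SQL sketch of the unlogged table and WITH/RETURNING ideas, using hypothetical tables and columns of my own rather than the examples from the post:

-- Unlogged table for intermediate data: skips the WAL, so it's fast but not crash-safe.
CREATE UNLOGGED TABLE staging_orders (LIKE orders INCLUDING ALL);

-- Stepwise process with a CTE plus RETURNING: rank duplicate emails,
-- delete everything but the first copy of each, and report what was removed.
WITH dups AS (
    SELECT id, row_number() OVER (PARTITION BY email ORDER BY id) AS rn
    FROM users
)
DELETE FROM users
USING dups
WHERE users.id = dups.id
  AND dups.rn > 1
RETURNING users.id, users.email;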
The next section is avoid indexes on columns with low selectivity. So for a boolean column where you only have two values, it usually doesn't make sense to index it, because so many values are the same that an index scan isn't efficient. However, the next section covers partial indexes, and partial indexes are great to use on booleans: if you're only interested in one of the two values, you can create a single partial index, or if you're interested in both, you can create two. The other place partial indexes can help, of course, is when you have, say, four or five statuses; sometimes it gives you better performance to create four or five partial indexes rather than one index covering all of those statuses. The next is always load sorted data. Basically, when you're loading data in, do it in a sorted fashion so you have high correlation between what's on disk and what's in the table and its indexes. The next section is index columns with high correlation using BRIN. BRIN is the block range index, and it indexes ranges of blocks. One disadvantage of BRIN is that it's not as fast at pulling out single values compared to a B-tree index, though it is still relatively fast, but it gives you dramatic space savings. He had an example where I think the BRIN index was 46 times smaller than the equivalent B-tree index, and being that much smaller it's also easier to cache; the trade-off is that pulling out a single value is not as efficient. But BRIN indexes are good for things that have high correlation, for example sequential IDs or datetimes. The next section is make indexes invisible, which basically relies on transactional DDL statements. And the last suggestion is don't schedule long-running processes at round hours, because nearly everybody schedules something at a round hour and you get spikes in database usage at those times, so offset yours a bit. If you want to review these suggestions in detail, go ahead and check out this post from hakibenita.com.
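As a quick sketch of those index ideas, again with hypothetical table and index names of my own, a partial index, a BRIN index on a well-correlated timestamp column, and the transactional-DDL trick for testing a plan without an index might look like this:

-- Partial index on a low-selectivity boolean: only index the rows you actually query for.
CREATE INDEX orders_unshipped_idx ON orders (created_at) WHERE shipped = false;

-- BRIN index: far smaller than a B-tree, and a good fit when column order closely
-- matches physical row order (e.g. an append-only created_at column).
CREATE INDEX events_created_at_brin ON events USING brin (created_at);

-- "Invisible" index via transactional DDL: drop it, check the plan, then roll back.
BEGIN;
DROP INDEX events_created_at_brin;
EXPLAIN SELECT * FROM events WHERE created_at >= now() - interval '1 day';
ROLLBACK;  -- the drop is undone, so the index is still there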
[00:04:36] The next post is related to the previous one and it's Scaling relational SQL databases. Now, this is not Postgres specific; he's considering MySQL, MariaDB, or PostgreSQL, but so much of it applies to Postgres. It's basically a hit list of things to consider to improve your database performance. He talks about, step one, updating your database: get on a more recent version to get the best performance. Then there are ways you can scale vertically, ways you can potentially leverage an application cache, and he goes on and on with different areas you can examine to get better performance out of your database. Now, one that didn't seem to really apply to Postgres, although you do need to be aware of it when you're making changes to objects in the database, is make ALTER TABLE work. I don't know if this is more of a problem with MySQL, but I haven't really seen this issue with Postgres. There are definitely issues, particularly with earlier versions, when you're adding a column and want to make it NOT NULL or set a default; there are certain things you need to take care of when you're doing that, less so in more recent versions. But I've never had an issue where altering a table wasn't fast. So again, if you want to check this out as a reference, this blog post can be found on stribny.name.
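For what it's worth, the classic workaround on those earlier Postgres versions (before adding a column with a default stopped rewriting the table) was to add the column loosely and tighten it afterwards; a rough sketch with a hypothetical table:

ALTER TABLE accounts ADD COLUMN plan text;                 -- fast: no table rewrite
UPDATE accounts SET plan = 'free' WHERE plan IS NULL;      -- backfill (batch this on big tables)
ALTER TABLE accounts ALTER COLUMN plan SET DEFAULT 'free';
ALTER TABLE accounts ALTER COLUMN plan SET NOT NULL;       -- validates with a full scan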
[00:05:55] The next post is PostgreSQL 13: Don't let slots kill your primary. This is referring to having a replication slot on your primary database for a replica, say. Now, if that replica goes down, that slot is going to retain the WAL on the primary and not let it be removed until the replica comes back. So essentially you can run out of disk space on your primary database and bring the database down if you don't have alerting set up for your disk space. But with version 13, there's a new setting called max_slot_wal_keep_size. It basically defines how much WAL to keep around if the replica goes down or stops connecting to the primary. At some point the slot gets invalidated, which means the WAL can then be removed and the primary can continue functioning. So it's a trade-off: your primary stays up and running and doesn't run out of disk space, but the disadvantage is that this will break replication to that replica. Now, by default max_slot_wal_keep_size is -1, so this isn't active, but you can set it to a particular amount of WAL to keep around to enable this feature. He says there are also two new columns added to pg_replication_slots that you can monitor: wal_status, which gives an indication of whether the slot's WAL is reserved, unreserved, or lost, and safe_wal_size. This is a really great feature coming in 13 that I'm definitely looking forward to, and I'll probably start using replication slots more because of it; a lot of times I haven't used replication slots because of the disk-filling risk, but this seems to mitigate that issue. So if you're interested in learning more, check out this post from 2ndQuadrant.com.
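In practice, on Postgres 13 that might look something like the following, where the 10GB cap is just an arbitrary example value:

-- Cap how much WAL an inactive replication slot may retain.
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
SELECT pg_reload_conf();

-- Monitor slot health with the new columns in pg_replication_slots.
SELECT slot_name, active, wal_status, safe_wal_size
FROM pg_replication_slots;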
[00:07:49] The next post is Waiting for PostgreSQL 14: Rename wal_keep_segments to wal_keep_size. This is from Depesz.com. Basically, in line with the previous setting, where you define in terms of size how much WAL to keep around, they're renaming the parameter that specifies how many segments to keep into one that specifies the size of WAL to keep. So it looks like they're moving to a size parameter to retain WAL versus just a count. If you're interested in this change, you can review the patch notes linked in the post.
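In other words, instead of counting 16 MB segments you state the retention as a size directly, roughly like this:

-- Old style: wal_keep_segments = 40 meant roughly 640 MB of WAL (40 x 16 MB).
-- New style: express it as a size.
ALTER SYSTEM SET wal_keep_size = '640MB';
SELECT pg_reload_conf();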
[00:08:29] The next post is How to securely authenticate with SCRAM in Postgres 13. Now, we've had a few posts in recent episodes of Scaling Postgres about SCRAM authentication in PostgreSQL. This one addresses 13 because in 13 the client supports channel binding, which helps the client actually validate that the server is who it says it is. This post also does a really good job of describing how to set up SCRAM authentication with channel binding in Postgres 13. It goes over in detail all the steps you would need to set this up and get SCRAM working on your Postgres instance. So if you're interested in doing that, check out this post from Citusdata.com.
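The rough shape of that setup, with placeholder role names, addresses, and hostnames of my own rather than the ones in the post, is to have the server hash passwords with SCRAM, require it over SSL in pg_hba.conf, and then have the client insist on channel binding:

-- Have the server store and verify passwords using SCRAM-SHA-256.
ALTER SYSTEM SET password_encryption = 'scram-sha-256';
SELECT pg_reload_conf();

-- Re-set the password so a SCRAM verifier (not an MD5 hash) is stored for the role.
ALTER ROLE app_user WITH PASSWORD 'change-me';

-- Then in pg_hba.conf require SSL plus SCRAM, for example:
--   hostssl  all  all  10.0.0.0/24  scram-sha-256
-- And from a version 13 client, require channel binding in the connection string:
--   psql "host=db.example.com user=app_user sslmode=require channel_binding=require"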
[00:09:14] The next post is Effective_cache_size: a practical example. This is from Cybertec-postgresql.com and it's talking about the parameter effective_cache_size, which essentially gives the optimizer information it uses to cost out different queries; primarily, it helps determine whether index scans should be used or not. As an example, they created a random table with about a million rows and an ordered table with about a million rows, set effective_cache_size very low at 1 MB, turned off bitmap scans, and then ran a query that got a particular cost estimate. Then they ran it with effective_cache_size set to 1000 GB, a lot higher than 1 MB, and the cost estimate was smaller. So with larger cache sizes the optimizer assumes that more of the index will be in memory, and it projects a lower cost compared to smaller cache sizes. Now, both of these runs were done using the random table they created, because with the ordered table the cost estimate makes no difference no matter the size of effective_cache_size. That's because of the correlation between the data on disk and the index itself, so it already believes it's going to be efficient to pull out that data. In terms of recommendations for effective_cache_size, they recommend about 70% of memory, although I've seen estimates between 50 and 75%. So basically you want to pick something in that range and then adjust to see what gives you the best performance for your queries. If you want to learn more about this setting, check out this post from Cybertec-postgresql.com.
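A minimal way to try that kind of experiment yourself, with a made-up test table rather than their exact setup, looks roughly like this:

-- A table whose ids are in random physical order (poor correlation with the index).
CREATE TABLE t_random AS
  SELECT id FROM generate_series(1, 1000000) AS id ORDER BY random();
CREATE INDEX ON t_random (id);
ANALYZE t_random;

SET enable_bitmapscan = off;

SET effective_cache_size = '1MB';
EXPLAIN SELECT * FROM t_random WHERE id BETWEEN 1000 AND 5000;   -- higher estimated cost

SET effective_cache_size = '1000GB';
EXPLAIN SELECT * FROM t_random WHERE id BETWEEN 1000 AND 5000;   -- lower estimated cost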
[00:10:59] The next post is Migrating to PostgreSQL version 13: incompatibilities you should be aware of, and it covers some incompatibilities when moving from 12 to 13, I believe. For example, some of the ones they mention: SIMILAR TO ... ESCAPE NULL and SUBSTRING of it now return NULL; jsonb_to_tsvector now properly checks its string parameter; and another relates to the ltree extension. A lot of these don't seem to be very commonly used features, but they are definitely issues to be aware of, and they run through about twelve different changes. So before you upgrade to 13, you should check out this blog post from Percona.com to see if you're using any of these features.
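As one illustration, my understanding of the jsonb_to_tsvector change is that the filter argument is now validated more strictly, so a call like the following (my own example, not one from the post) keeps working, while invalid filter values are rejected:

-- The third argument is a jsonb array of filters such as "string", "numeric",
-- "boolean", "key", or "all".
SELECT jsonb_to_tsvector('english', '{"title": "scaling postgres"}', '["string"]');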
[00:11:44] The next post is Webinar: Being Committed, a review of transaction control statements, two out of three, follow up. This is from 2ndQuadrant.com, and it's about transactions as they relate to replication: for example, how transactions get replicated to replicas, and how settings like synchronous_commit affect when data appears on the replica as well as performance. Basically, the more durable you make your communication with a replica, say with synchronous replication, the higher the latency you introduce and the lower the throughput. If you want to go faster, with higher throughput and lower latency, you're going to sacrifice some durability, for example by using asynchronous replication and not waiting for the data to be up to date on the replica. They also address a little bit of vacuum and freezing. So if you're interested in this webinar, you can just click the link here to go ahead and register for it.
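The main dial being described there is essentially synchronous_commit; as a rough sketch of the durability-versus-speed trade-off (my own examples, not the webinar's):

-- Most durable with a synchronous standby configured: wait until the standby
-- has applied the commit, at the cost of extra commit latency.
SET synchronous_commit = 'remote_apply';

-- Wait only for the local WAL flush, not for any standby.
SET synchronous_commit = 'local';

-- Fastest and least durable: don't even wait for the local WAL flush.
SET synchronous_commit = 'off';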
[00:12:42] The next post is SQL style guide. This is from Sqlstyle.guide and it's a published style guide for SQL code. So if you're interested in adopting a style, this is one interpretation of it. I know this is very subjective, but I saw this post and thought it was interesting. Now, related to it, Bruce Momjian's blog at Momjian.us mentioned pgFormatter, which basically takes your SQL code and formats it in a more readable way, and he has had some good luck with it. So if you're looking for a tool to help you format your SQL files, maybe check out pgFormatter. The next post is Connection Pooling in Pgpool-II, and this is from B-peng.blogspot.com. It explains a little bit about what connection pooling is and how Pgpool-II does it. If you want to learn more about Pgpool-II and how to set it up, I found this to be a really great post explaining how to do it compared to some others I've seen. So if you're interested in that, check out this blog post.
[00:13:48] The next post is Announcing pgBackRest for Azure: fast, reliable Postgres backups. It describes how to set up pgBackRest and connect it to Azure to store your backups. So if you're interested in that, check out this post from Crunchydata.com.
[00:14:05] The next post is Backtraces in PostgreSQL, and this is from Amit Khandekar at Amitdkhan-pg.blogspot.com. He's basically describing how you can get backtraces out of PostgreSQL to diagnose errors that you receive. So check this post out if you're interested in that.
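The relevant developer setting here, as I understand it, is backtrace_functions in Postgres 13; a minimal sketch, with an arbitrary internal C function name picked purely for illustration:

-- Log a backtrace whenever an error is raised inside the named internal C function
-- (a superuser-only developer option).
ALTER SYSTEM SET backtrace_functions = 'numeric_in';
SELECT pg_reload_conf();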
[00:14:23] And the last post is that the PostgreSQL Person of the Week is Gilberto Castillo. So if you're interested in learning more about Gilberto and his contributions to PostgreSQL, definitely check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.