PGCon, Tips, PostGIS, Strings | Scaling Postgres 67

[00:00:00] In this episode of Scaling Postgres, we talk about Pgcon Tips, postgres and strings. I'm Kristen Jameson and this is scaling postgres episode 67. [00:00:20] All right, I hope you're having a great week. So our first set of content is actually a lot of videos have been posted to the Pgcon YouTube channel. So I believe this is Pgcon that happened in Ottawa, Canada, and it appears there have been 33 or so videos that have been uploaded so far. So definitely a lot of content this week. I haven't had the opportunity to look at any of these yet, but I definitely will be doing that over the next number of weeks. So definitely plenty of content. If you're interested in postgres with this channel, the next set of content is even more videos. So there was apparently an event called Percona live in Austin, Texas. So they have posted a number of videos. [00:01:10] Some of the more recent ones aren't necessarily PostgreSQL related, but a fair number of the ones in and around this event have been posted. I have watched Zheep, the Next Generation storage engine for Postgres because I'm super interested in that because I think that has huge performance benefits, potentially with the pluggable storage architecture and using Zheep kind of using an undo kind of the way Oracle does it. So I'm definitely keeping track of that. They say that they plan to release it around PostgreSQL 13, or that's the objective for version one and then a version two with additional enhancements following on, but definitely interested in keeping track of that. But some more videos for you to check out. [00:01:55] The next post is actually a third presentation that has a YouTube video with it. So this Was Put On by EDB Postgres by Bruce Momgian, and it's titled Exploring Postgres Tips and Tricks. So we cover six different areas and has 15 different points of reference. So basically hardware things to take into account there, the internals of how PostgreSQL works, some different things to do in terms of monitoring, some configuration considerations, how to use some particular features of SQL, and of course, talking a little bit about clients and of course psql. So this is the presentation and he basically just has references that go back as far as 2009, it appears, talking about the different issues. And then the video with the presentation occurs here. It's only about 35 or 37 minutes of content. So if you kind of want a little bit of a deep dive into some tips and tricks with Postgres in terms of its operation, definitely an interesting video to check out. The next post is parallel PostGIS and pgSQL twelve. Now this post is a follow on to one that was mentioned last week, and he's talking about the automagical parallelization of many common spatial queries. And in this post he actually does some performance checks. So he, as he states here, acquired a 16 core machine on AWS, an M Xlrge, and installed the development snapshots of PostgreSQL and PostGIS. So versions twelve and three, he configured the workers to use 16. In all cases, normally you would put the gather a little bit lower, but I guess he definitely wanted to get all of the cores being used for particular queries he was testing. And he shows some of the performance for the different summarizations of two different tables with a number of different polygons and points. And you can see that it definitely improves in performance as it goes down, but it definitely trails off pretty quickly. Like, for example, he has this representation of scan time versus the number of workers used. And when he was using one worker, it took 318 going down to four. Essentially the four workers drops it by a third, but then when he goes from four to eight, it drops it by less than half. So you definitely have diminishing returns here. So it's not just throwing cores on, it will magically make everything super, super fast in parallel, at least with these development versions that he's using, maybe there's more optimization that needs to be done. And then looking at Join performance, he saw about the same thing. You do have some quick benefits, some quick gains going from say, one to four and even a little bit down to six, but then once you hit eight, it's pretty much diminishing returns. You don't get double the performance for doubling the workers, clearly, because it goes from 5 seconds down to 4 seconds. So you do get a benefit, but it's not very much. And of course his conclusions are there's a limit to how much advantage adding workers to a plan will gain you. And also the limit feels intuitively lower than I expected, giving the CPU intensity of the workloads. So if you're interested in using postgas and want to look at some performance benchmarks, definitely a blog post to check out. [00:05:18] The next post is Beautifulthings Strings. And this is from second quadrant postgresql.com. And this is a post all about using strings with PostgreSQL. So they talk off talking about the different character types, character varying, essentially varicare character as well as text, and how most people usually just use text, unless you're wanting to actually limit how much you want a text string to contain. And then they go into pattern matching using like and I like as well as similar to and showing you all the different ways you can manipulate text than going into regular expressions, as well as using specific functions for working with strings such as substring, regex, replace and some others. So if you want to learn a bit more about how PostgreSQL uses strings, definitely a blog post to check out. [00:06:12] The next post is the contributors team. And this is from the Robert Haas blog and in it he's talking about quote, recently the PostgreSQL project spun up a contributors team whose mission is to ensure that the PostgreSQL contributors list is up to date and fair. So this is basically information about the management of the PostgreSQL project and how they're wanting to get contributors list more up to date and fair, representing everybody who helps contribute to PostgreSQL. So if you want to get insight into how this new contributors list is going to be set up and working, as well as insight to how the PostgreSQL project is managed, definitely a blog post I encourage you to check out. [00:06:55] The next post is Indexes in PostgreSQL nine Brin. So this is from Haber.com and again, I believe the original post, as we've seen some of these from Haber, is from the Postgrespro Ru site, which is in Russian. And we've talked about numerous different posts on Indexes and this one talks about the Brin index. So as we've discussed before, this is great if you have basically the physical layout of the data on the disk closely matches how you want to query it. So if it's by primary key and you query by primary key, a Brin index could get you a lot of storage savings, how it's constructed. So if you're wanting to get super in depth with exactly how the Britain index works and how it's set up and used, definitely a very comprehensive blog post to check out. [00:07:46] The next post is PostgreSQL incremental backup and point in time recovery. And this is from the PG IO blog and they're talking about how to get an incremental backup and point in time recovery set up. So if you're still just doing a logical backup with pgdump, this blog post basically walks you through how to set up the wall archiving and doing a backup and a point in time recovery. Now, this is a test text based version, but I did do a tutorial a little while ago that talks about PostgreSQL backup and point in time recovery through a video tutorial. So if you wanted to check out this tutorial, I'll include a link in the description. It's basically in the tutorial section of Scaling Postgres. [00:08:31] The next post is how to optimize PostgreSQL logical replication. And this is from the several nines.com blog. So basically they go over how logical replication works. It basically logically replicates one table to another table to another database system, and they talk about some of the different configuration parameters you can adjust and how to tune them. So it's a pretty short post, but if you want to get logical replication set up, definitely a blog post to check out. [00:08:58] The last post is power. Four changes in power. Archivist So this is from Arjuju GitHub IO and this post is the next in the series of Power Four beta posts and he's talking about changes done in the Power archivist. So if you use Power or if you want to potentially consider using it, definitely a blog post to check out to see what's coming in version four. [00:09:25] That does it for this episode. Of scaling postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingposgres.com, where you can sign up to receive weekly notifications of each episode. Or you could subscribe via YouTube or itunes. Thanks.

Show Notes

Episode Transcript

Other Episodes

Episode 274

Subquery Performance, Avoid Update Locking, Column Changes, Outage Workshop | Scaling Postgres 274

Episode 88

Partitioning, Logical Replication Upgrade, Columnar Compression, HAProxy Connections | Scaling Postgres 88

Episode 68

Prepared Plans, Materialization, Recursive CTEs, Safe Migrations | Scaling Postgres 68