Postgres 13 Beta 2, System Stats Extension, Tuned, Join Strategy

[00:00:00] In this episode of Scaling Postgres, we talk about postgres 13 beta two system stats extension. Tuned and joined strategy. I'm Kristen Jameson. And this is scaling postgres episode 120. [00:00:22] All right, I hope you, your family and co workers continue to do well in these times. Our first piece of content is PostgreSQL 13. Beta Two is released. So this is leading up to the release of Postgres 13 in the fall, and it mentions some of the changes since Beta One predominantly bug fixes, but a couple of different changes. So if you're interested, go ahead and check that out. Now, to go along with this, there has been a recent YouTube video posted on the San Francisco Bay Area PostgreSQL Users Group YouTube channel called A First Look at PostgreSQL version 13. And this talk was given by Christophe Pettis, and it's a good summarization of all the new features in Postgres 13, particularly the ones that are of interest to him. Now, the video is about 30 minutes in length, but it's really the first 2022 minutes of presentation and then questions at the end. So it's a pretty short overview that gives you all the different changes coming. So if you're interested, check out this YouTube video. The next piece of content is System Stats Extension for PostgreSQL. And this is from Pgsnake bloodspot.com, which is Dave's postgres blog. And it's actually an extension for Postgres that actually looks at various operating system and hardware statistics. Basically, they've given you SQL access via these new system tables to be able to query what the state of the hardware in the state of the OS system is doing. So they present operating system info, CPU info, usage statistics for CPU, memory info, as well as i, O and Disk info, system load averages, process info, network info, and memory by process info. So this is a great addition. I'm definitely going to be checking out this system extension to see if I could potentially use that to help with monitoring. So if you're interested, check out this extension as well as his blog post about it. [00:02:22] The next piece of content is tuned PG and U. This is from Hunleyd GitHub IO, and he's talking about doing tuning to your OS that PostgreSQL is running on. And he advocates using this tool called Tuned that kind of helps with dynamically making changes to your system. And I believe it's written and maintained by Red Hat. And he advocates using Tuned over something like making changes or editingctl.com instead. And he goes through the process of taking some existing tuning configurations and adjusting it to come up with a PostgreSQL tune configuration. So this is not only beneficial to see if you want to use tune, but to look at his OS configuration changes that he does for some of the systems. And of course, he said he uses benchmarks to also determine what these changes should be. So definitely something to take into account. But if you're interested, definitely check out this blog post. [00:03:21] The next piece of content is Join Strategies and Performance in PostgreSQL. Now, this post goes over joins in the three different join types that are typically used with a Btree index. So they talk about the nested loop join strategy, the hash join strategy, and the merge join strategy, and it talks about how each of these are used. The algorithm used, the different types of indexes that can help in certain situations and whether it's a good strategy. So for example, the nested loop join is a good strategy to use if the outer table is small. A hash join using a hash table is advantageous if it can fit into work mem. And then a merge join can be useful if both of the tables are large. But they do mention that if it's not an equality but something different than equality, then pretty much things just drop back to a nested loop join. Now what one thing they mention here is a way to be sure you get the best strategy is to make some of the changes they suggest here. And number one, of course, is having your statistics be up to date. So either analyzing frequently or making sure your default statistics target is set appropriately at the database level and at the table level if you need to. Also, you could potentially look to increase your work mem if that makes sense for you to get the cheaper hash joins, as well as communicating to PostgreSQL about your hardware and resources, tuning things like your random page costs, effective cache size, and effective I O concurrency. So that way the planner may make better decisions on which types of plan to use. And he also mentions that quote you can speed up nested loop and merge joins with index only scans, so adding more columns to an index may be beneficial. And with versions eleven and above, you can just include columns onto that index without actually having it in the index. So if you want to learn more about these strategies and how they could potentially help your performance, definitely check out this blog post. [00:05:19] The next piece of content is actually a PDF and it's talking about JSON unification with regard to postgres. So what they're talking about is that around 2014, JSON B was introduced in postgres that attracted some NoSQL users. You can tell the rapid rise of postgres here in terms of popularity from DB engines and they're arguing as a result of this, JSON has been added to the actual SQL standard. But the issue being that we have a JSON data type and a JSON B data type, but the standard is going for one common standard. So this is talking about unifying the JSON and the JSON b and how we could potentially do that. Now, this is a substantial piece of content and it's over 89 slides, but it goes into the history of JSON. The different ways you can query it and also how to approach this unification process. So it's very dense in terms of content and material, but if you're interested in learning more about the JSON, the JSON being potentially this unification, that will have to be done with postgres to be in alignment with the SQL standard, definitely check out this piece of content. [00:06:35] Next piece of content is PostgreSQL lTree versus with Recursive and they're talking about they had done a previous post talking about lTree and hierarchical data and querying it. They said well what about performance and what about with Recursive? So this blog post actually looks at doing something similar in order to optimize to get really fast queries. He basically does the same thing still using lTree but using a materialized view that can be refreshed and therefore you can index that to get better performance. And after some of these changes he got relatively low execution times less than a millisecond for doing some of these queries. So if you have hierarchical data and you potentially want to look into lTree or with Recursive and getting better performance by leveraging materialized views, maybe you want to check out this blog post from CyberTech Postgresql.com. [00:07:32] The next piece of content is Advanced Active Record using subqueries in Rails. Now. This is from Pginalyze.com. Now with Rails they're talking about Ruby on Rails which is an application framework that uses Ruby and ActiveRecord is the Orm that Ruby on Rails uses. So it's an object relational mapper. So as such it has its own syntax but sub queries aren't the easiest thing to figure out in it. But this post goes over some different techniques that you can use to do it. So for example, this is an example where you have a sub query in a where clause you can actually have query that you can then insert into the SQL. And again, this is the active record. Ruby syntax. So this is the equivalent to doing something like this query here with a subquery as a part of the where clause and he goes over different ways you can do things such as where not exist subqueries select subqueries as well as from sub queries and having subqueries. So if you have a need to use subqueries in your Ruby on Rails project, maybe you want to check this blog post to see how best to do that. [00:08:44] The next post is composite primary keys PostgreSQL and Django. This is from Crunchydata.com and they were doing a database design where they ideally wanted to have a composite primary key but when trying to set it up using Django's orm they had some difficulties and basically it was a lot of difficulties but they were able to work through and get something working to their satisfaction. So if you are interested in learning how they were able to rectify the situation with the Django RM, definitely check out this piece of content. [00:09:18] The next piece of content is Types of Indexes in PostgreSQL. This is from Higo CA, and it goes over all the different indexes of which there are many of PostgreSQL, starting with of course the Btree index. Talking about hash, gist, spgist, gen, brin, multicolumn indexes, unique indexes, expression indexes, partial indexes index only scans as well as covering indexes. So if you want to learn all about indexes on PostgreSQL, definitely check out this blog post. [00:09:50] The next two pieces of content are in relation to Oracle and Migrating to PostgreSQL. The first one is Oracle to PostgreSQL Reasons to Migrate, and it covers some of those here such as foreign data wrappers versus federation, all the different languages available to write procedures and functions, and compared to PL, SQL and Oracle as well as others. So if you're interested in this type of content, check out this blog post. And this is from a second quadrant.com. [00:10:17] The second one is set up Aura to PG for Oracle to postgres migration. So this is an open source utility called Aura Two PG that helps you convert your data from Oracle to postgres. And this is from Crunchydata.com. So if you're interested in doing that, check out this blog post. [00:10:36] The next piece of content is the PostgreSQL Person of the Week is Simon Riggs. So if you're interested in learning more about Simon and his contributions to PostgreSQL, definitely check out this blog post. [00:10:48] The next piece of content is how to monitor PostgreSQL Twelve performance with Omnidb part One this is from Second Quarter, and they're talking about using their monitoring tool Omnidb to monitor postgres. So if you want to learn more about Omnidb and what it does in terms of monitoring, check out this blog post. [00:11:08] Next piece of content is authenticating PG Pool Two with LDAP. This is from Higo CA and LDAP authentication is a new feature to my understanding with a new version of PG Pool Two. So this shows how to get that working with the newer versions of PostgreSQL and PG Pool Two. [00:11:28] And the last piece of content is how to use Neural Network Machine learning model with two EDA PostgreSQL and Orange Part Seven. So part Seven in a Post talking about network machine learning with PostgreSQL. So if you're interested in that, check out this post from secondquadron.com. [00:11:47] That does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or bye Bye was.

Show Notes

Episode Transcript

Other Episodes

Episode 6

Terabyte Scale, Permissions, Fast Column Adds, pgBouncer | Scaling Postgres 6

Episode 179

Foreign Data Wrapper, Timescale Compression, Fuzzy Search, Query Execution | Scaling Postgres 179

Episode 367

How Often Does That Query Run? | Scaling Postgres 367

Postgres 13 Beta 2, System Stats Extension, Tuned, Join Strategy | Scaling Postgres 120