Showing Plans, Atomic, Configuration, Migrations | Scaling Postgres 107

Episode 107 March 30, 2020 00:10:48
Showing Plans, Atomic, Configuration, Migrations | Scaling Postgres 107
Scaling Postgres
Showing Plans, Atomic, Configuration, Migrations | Scaling Postgres 107

Mar 30 2020 | 00:10:48

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss how to show live query plans, the importance of atomic operations for scaling out, configuration options and DBaaS migrations.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/107-showing-plans-atomic-configuration-migrations/

View Full Transcript

Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about showing plans, atomic configuration and migrations. I'm creston, Jameson. And this is scaling postgres episode 117. [00:00:20] All right, well, with the craziness still going on, I hope you, your family and coworkers are safe. And as always, I hope you're having a great week. Our first piece of content is PG show Plans. Watching execution plans in PostgreSQL live. This is from Cyberdeck Postgresql.com and they're talking about a new extension that I believe they've released called PG Show Plans. And it allows you to see active query plans. Now, they mentioned there will be some overhead to this, but the performance overhead of PG Show Plans will be discussed in a future blog post. But this tells you how to go ahead and get started using the extension. So you have to clone the repo. And it is from CyberTech Postgresql.com. So it shows you how to set an environmental variable, do a make and install. Basically, they give you all the commands on how to get it set up. And then you need to add it to your shared preload library. So it will require a restart. But once you get that set up, you can look at query plans. So they did an example here where they had to set up a long running query with a PG sleep. And then they queried the view. Essentially. PG. Show plans. So you see the actual query you're running. Here the select all from PG show plans. That's what this record that's being returned. But then the next one is this, selecting all from PG Stats. So you can see the exact query that's being used for this. So in the long running query, you can get what that plan is for that query using this extension. So this is really great. I could see a lot of use cases where this could be beneficial. The only creston is what's the performance hit like that they're going to cover in a next blog post. What would be interesting is because adding it to the shared preload libraries requires a restart, I wonder if there's a way to activate it or inactivate it that would say just require reload, but a very interesting new extension. So if you want to learn more, go ahead and check out this blog post from Cybertechn Postgresql.com. The next article is Optimizing SQL step one, explain in PostgreSQL part one. And this is from Higo CA. And this is a basic post explaining how Explain works. So if you're using the extension from the previous post, this can tell you how to interpret the results. So they start from a very basic start explaining how the query processing works, running through each of the stages. And it shows you a basic Explain command and then goes over and explains the different output and what it means. They also cover things like you need to be sure that your table is analyzed to get the most accurate reporting when doing an explain. And also that Explain Analyze actually runs the query for you. But what's also interesting at the last part of this blog post is that they're building a query from a given plan output so that they're looking at the plane and then reverse engineering what that query is and telling you how you go through and do it. So I've never seen this used before, but it is an interesting technique to really kind of get to understand explain output. So if you're interested in learning more about this, definitely check out this blog post. [00:03:39] The next piece of content also from Hago CA is Atomic Commit and Atomic Visibility for PostgreSQL Explained. So this is talking about the future where we're wanting to move towards a scale out scenario where there are multiple PostgreSQL instances that essentially are acting as a common database and you're using foreign data wrappers and storing part of the data on each of those PostgreSQL instances. Well, in that scenario, you're going to want to have an atomic commit. So basically you have a commit that occurs across multiple foreign servers, essentially. So they go through the process and explaining why that's important, where you're going to have to do a prepare transaction in each of the they're calling participants, but they could be the foreign servers. They're going to return a result and then they're going to go ahead and commit or roll back terms of the decision resulting in the result of that transaction. So essentially a two phase commit and how you need that to happen at an atomic level. All or nothing. Then he talked about atomic visibility, meaning when you do a query, you want atomic visibility across the different foreign servers and how this is a feature that's going to be important. And again, they graphically demonstrate you have one DB node here, a second here, and then depending on where different things happen, you could have a commit occur at a different timeline. And you want this visibility to be atomic. So you need to know whether something has been committed or not across the whole cluster of machines and then they go into a little bit about global snapshots. So again, this is kind of like a forward looking and planning for this feature. So if you're interested in that, definitely check out this blog post. [00:05:20] The next piece of content is what's wrong with Postgres? And this is from Mydbainotebook.org and this is a follow on post to the one that was mentioned last time, where is Postgres running? And this one basically how he interprets determining what's wrong with Postgres is you look in the log file and he says on Debian and Friends. So like Ubuntu, it's quite easy. You just run this command to find out where the log file is. It gives the output of PGLS clusters, gives you exactly where the log file is. However, for Red Hat and Friends, it's a little bit harder and it goes through some different techniques you can use to determine where the log file is. And he even has this flowchart helping you to determine it. So it definitely seems to be a lot easier to do this with a debian. So if you're interested in learning more, check out this blog post. [00:06:08] The next Piece of content is actually a YouTube video. This was posted on the Enterprise DB YouTube channel and it's how to use Postgresql.com to configure and tune the PostgreSQL server. So it covers Postgresql.com as well as a great many different configuration options that are available. And then finally it also goes into actual tuning recommendation as well, and some different tuning tools you can use. Now the tuning recommendations start at around the 38 minutes mark, but prior to that they go through these other topics in this webinar. So if you're interested in that, go ahead and check out this YouTube video. [00:06:46] The Next Piece of content is a tale about migrating a PostgreSQL database between different database as a service providers. And this is from Procona.com. So apparently a customer wanted to migrate from one database as a service provider to another, and because these are essentially walled gardens or platforms, they couldn't do a direct replication. So what they did is they created an intermediate database instance. So basically they had the source database as a service database. They replicated to an intermediate server, basically using the archived wall files to replay what it was here, and then they used that to actually finally load it into the target database as a service provider. So it's an interesting discussion of the process they use to do that. So if you have a potential need to do that, definitely a blog post you may want to check out on Percona.com. [00:07:43] The next piece of content is embedded SQL in C for PostgreSQL with ECPG. So you can within the C language embed SQL and they give an example of a sample embedded SQL program here. So this is bypassing libpq and using this embedded SQL technique, and it uses the ECPG capabilities within Postgres. So as it says down here, the preprocessor is called ECPG and part of the PostgreSQL core distribution, and I assume this means embedded C for postgres of sorts. And then in the conclusion they talk about some of the disadvantages of this technique versus the advantages. So if you have a desire to potentially do this, definitely a blog post to check out. The next piece of content is logical replication between PostgreSQL and MongoDB. And it looks like they're using a change data capture method and a MongoDB decoder plugin they developed to be able to stream data from PostgreSQL servers into MongoDB. So if you have this particular use case, you may want to check out this blog post from Higo CA. [00:08:56] Next post is developing PostgreSQL for Windows part three. So this is part three of the post about developing for PostgreSQL on Windows. And this is from secondquadrant.com. [00:09:09] The next post is PostgreSQL GSS API authentication with Kerberos. Part Two PostgreSQL Configuration. Now, last week we presented part one, which described just setting up Kerberos on Ubuntu. The second one actually sets up the PostgreSQL configuration. So they're talking about setting up the key file, doing the PostgreSQL client authentication, the username maps and user authentication. So if you have a desire to use Kerberos with PostgreSQL, definitely a blog post to check out. [00:09:43] The next piece of content is that the PostgreSQL Person of the Week is Dave Kramer. So if you're interested in learning more about his work and contributions to PostgreSQL, definitely check out this blog post. [00:09:57] And the last piece of content is Tile serving with dynamic geometry. And this is from Crunchydata.com. And again, this topic covers the PG Tile serve. They're a microservice for interfacing with Postgres. And this one talks about building dynamic geometry such as the series of Hexagons here and how you can do it using this tool. So if you're interested in this content, definitely a blog post to check out. [00:10:24] That does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode. Or you could subscribe via YouTube or itunes. Thanks.

Other Episodes

Episode 65

May 26, 2019 00:14:08
Episode Cover

Multi Column, Multi DB, Dos & Don'ts, RUM Indexes | Scaling Postgres 65

In this episode of Scaling Postgres, we discuss multi-column indexes, using multiple databases, Postgres dos & don'ts as well as using RUM indexes. To...

Listen

Episode 174

July 19, 2021 00:15:54
Episode Cover

Lateral Joins, Generating JSON, dbcritic, Fetch First Rows | Scaling Postgres 174

In this episode of Scaling Postgres, we discuss understanding lateral joins, generating JSON with Postgres, the dbcritic utility, and fetch first rows vs. limit....

Listen

Episode 279

August 27, 2023 00:15:59
Episode Cover

TPS Benchmark, Partition-wise Join & Aggregate, Partitioning Billions, Posgres 16 Features | Scaling Postgres 279

  In this episode of Scaling Postgres, we discuss how pgbouncer can impact a TPS benchmark, partition-wise join & aggregate performance, partitioning a table with...

Listen