Stuck Transaction IDs, pgBouncer Auth, UUIDs, Monitoring | Scaling Postgres 46

Episode 46 | January 13, 2019 | 00:11:41

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we review articles covering stuck transaction IDs, pgBouncer authentication, sequential UUIDs, and monitoring.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/46-stuck-transaction-ids-pgbouncer-authentication-uuids-monitoring/

Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about stuck transaction IDs, PgBouncer authentication, sequential UUIDs and monitoring. I'm Creston Jamison, and this is Scaling Postgres episode 46.

[00:00:22] All right, our first article is "Zombies! Dealing with a case of stuck transaction IDs", and this is from richyen.com. Basically, the story is that everything is seemingly working well with a PostgreSQL database, but then you start seeing a warning and a hint. The warning is "oldest xmin is far in the past" and the hint is "Close open transactions soon to avoid wraparound problems." And he says, okay, this is strange, there are no open transactions, and looking in pg_stat_activity you don't see anything that leads you to what's going on. Then he proceeds to describe the panic you go through: you start looking everywhere, at autovacuum_freeze_max_age, relfrozenxid, prepared transactions, and find nothing. You try vacuuming the whole database: no difference. Even restarting the database doesn't help. Now, the origin of this problem lies in replication slots. Someone had set up a slot but wasn't using it. Typically that makes your WAL files grow extremely large, and here he says, quote, we've seen this problem before, where the pg_xlog directory with many WAL files would bring down the service as it exceeded the size of the partition it's on. But in this particular case the partition was sufficiently sized, so the problem didn't bubble up there, or at least wasn't seen there. Basically, over time the XIDs continue to advance and, quote, the stray replication slot is waiting for a subscriber to send all its stashed-away data to, leading to the warning and hint seen above. So the conclusion, he says, is, quote, clean up after yourself. Basically, if you're not using a slot, you need to delete it, or you're usually going to run into disk space problems. But as shown here, it can actually cause wraparound problems, which can require you to shut down your whole database to do a vacuum to resolve the situation. So definitely a good public service message to keep in mind.
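
For reference, here's a minimal sketch of the cleanup being described: checking which slots are retaining WAL, how far transaction ID age has advanced, and dropping a stray slot (the slot name here is a placeholder):

```sql
-- List replication slots and how much WAL each one is retaining
-- (pg_current_wal_lsn() and pg_wal_lsn_diff() are the PostgreSQL 10+
-- names; older releases use the pg_xlog_* equivalents).
SELECT slot_name, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

-- Check how close each database is to transaction ID wraparound.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- Drop a slot that no subscriber is ever going to consume.
SELECT pg_drop_replication_slot('unused_slot');
```
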
[00:02:27] The next post is "pgbouncer authentication made easy", and this is from cybertec-postgresql.com. It's talking about PgBouncer, which is a connection pooler for Postgres. Basically, once you get up to more than 200 or close to 300 connections on Postgres, you want to switch over to some sort of connection pooler, and PgBouncer is the one I tend to go for. There's also pgpool, which does a lot more stuff, but this post talks about PgBouncer and how to set it up. They cover basically three ways. The simplest way is just creating a userlist.txt file in which you place the users that are going to be connecting to your database along with their passwords. Now, how PgBouncer works is that you actually connect to PgBouncer as if it were a PostgreSQL instance, so it needs the same username and password in order to accept the connection and pass it on to the database behind it. Once you have this user list file in place, you configure PgBouncer by setting auth_type to md5 and pointing auth_file at the userlist.txt file you've set up. That's the simplest way to get PgBouncer up and running, and for a lot of applications that's all you need.

[00:03:53] Now, if you start having a lot more users, they go into a second technique where a client still authenticates against PgBouncer, but PgBouncer then queries the pg_shadow table in PostgreSQL to get the current password for that user, to determine whether to allow the connection. They go through the process here. Basically, you create a new role that can query pg_shadow, in this case a pgbouncer role or pgbouncer user, set up a password for it, and create a dedicated function that queries pg_shadow, declared with SECURITY DEFINER. Then you revoke access to this function from public, but grant it to that individual user. Now you can configure PgBouncer: again you use the md5 auth_type, and you can still have an auth_file, but the important parts are auth_user, which is the user to authenticate with, and auth_query, which calls the function to get the username and password that determine whether to allow connections to the database through PgBouncer. Then they talk about an advanced authentication method using the pg_hba.conf format. In this scenario, PgBouncer is set up with a file in pg_hba.conf format that accepts connections from two different systems at specific IP addresses, for an app user, with the md5 method. So it still has an auth_file with the user list, but it also sets an auth_hba_file and sets the auth_type to hba. That's another way to handle authentication with PgBouncer. So if you're using PgBouncer, or you're realizing you're going to start using it, definitely a blog post to check out.
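
A minimal sketch of the auth_query pattern they describe might look like this (the role name, password, and function name are illustrative, not necessarily the exact ones from the post):

```sql
-- Dedicated role that PgBouncer uses to look up passwords.
CREATE ROLE pgbouncer LOGIN PASSWORD 'changeme';

-- SECURITY DEFINER lets the function read pg_shadow, which the
-- pgbouncer role could not query on its own.
CREATE OR REPLACE FUNCTION pgbouncer_lookup(p_user text)
RETURNS TABLE (usename name, passwd text)
LANGUAGE sql SECURITY DEFINER
AS $$
    SELECT usename, passwd FROM pg_shadow WHERE usename = p_user;
$$;

-- Lock the function down so only the lookup role can run it.
REVOKE ALL ON FUNCTION pgbouncer_lookup(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION pgbouncer_lookup(text) TO pgbouncer;

-- pgbouncer.ini would then point at it with something like:
--   auth_type  = md5
--   auth_user  = pgbouncer
--   auth_query = SELECT usename, passwd FROM pgbouncer_lookup($1)
```
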
[00:05:48] The next post is "Sequential UUID Generators on SSD", and this is from the 2ndquadrant.com blog. This is a follow-on from a previous post we covered a few weeks ago about sequential UUID benchmarks, where they showed that random UUIDs can cause performance problems on inserts, particularly if you have indexes on them, because each new ID lands at a random place in the index. That can slow down transaction throughput with regard to writing, and it actually increases the size of the WAL files. Now, in his previous benchmark test he was using magnetic storage, not SSDs, and someone made the point that SSDs are better at random I/O, so what would the benchmarks look like with those? So they went ahead and did this post. Again, they took a small, a medium and a large data set size and compared three generators: random, so entirely random UUIDs; time-based, where a certain part of the UUID is sequential with regard to time; and sequence-based, where the first part of the UUID is sequential and the remaining portion is random. Looking at small and medium, there's not too much of a difference, but when it gets to the large data set size you do start to see the impact of the entirely random UUIDs on performance, in terms of transactions processed per second. It's not as overt as it was with magnetic storage, but you get about a 50% performance hit once you increase the data size, even with SSDs and their better random I/O. The other thing that still happens is a lot of full-page image writes to the WAL, so your WAL size still increases dramatically compared to making the initial part of the UUID sequential with regard to time or a general sequence. They also tested a much more powerful server with an Intel Optane SSD and got similar results, although some of the sequence-based generators were not as good. So in general, even with SSDs, sequential UUIDs still allow higher throughput and definitely reduce the amount of WAL produced. Definitely something to keep in mind if you're using UUIDs and want to use them in a sequential manner.
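
The post benchmarks purpose-built sequential UUID generators, but as a rough plain-SQL illustration of the idea, here's a sketch that prefixes a UUID with a sequence value so consecutive inserts cluster in the index (the function and sequence names are made up, and the sketch ignores sequence overflow past 2^32):

```sql
-- Sequence that supplies the ordered prefix.
CREATE SEQUENCE uuid_prefix_seq;

-- The first 8 hex digits come from the sequence and the remaining 24
-- are random, so new values land near each other in a btree index
-- instead of on random pages.
CREATE OR REPLACE FUNCTION seq_uuid() RETURNS uuid
LANGUAGE sql AS $$
    SELECT (lpad(to_hex(nextval('uuid_prefix_seq')), 8, '0') ||
            substr(md5(random()::text), 1, 24))::uuid;
$$;

SELECT seq_uuid();  -- e.g. 00000001-8a3f-...
```
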
[00:08:22] The next post is "Webinar: pglogical and Postgres-BDR Update". This is a general update on 2ndQuadrant's BDR product, their bi-directional replication, which is basically their master-to-master PostgreSQL offering. I haven't had a chance to actually review this webinar yet, but if pglogical or master-to-master replication with BDR is of interest to you, definitely a webinar to check out and sign up for.

[00:08:51] The next post is "Monitoring Postgres with Prometheus". This is from dataful.blogspot.com, and basically it's a post about monitoring Postgres with Prometheus, based on a talk he gave in Lisbon at PGConf.EU 2018. I went and got the slides, and the link to the PDF of them is right here. Basically, it uses Prometheus and Grafana, a graphing tool, to build a Postgres dashboard. He talks about the different tools he's using: Prometheus, Alertmanager, node_exporter and postgres_exporter, which work with Prometheus to grab certain statistics; mtail, to actually tail some of the logs, I believe; and the what-you-see-is-what-you-get dashboard, Grafana. He talks about the data flow of what's needed: you want to look at the different statistical views, the logs, and some other tables such as pg_stat_statements, and he covers which tools capture each of them. So postgres_exporter grabs the statistical views and the pg_stat_statements-type tables, mtail does the logs, and you feed those into Prometheus. You can also use node_exporter to grab operating-system-related information, or general system metrics such as CPU and things of that nature. Those go into Prometheus, and then you send them into Grafana. Prometheus also allows alerting by email, PagerDuty, IRC or Slack. So, definitely an interesting post to check out about how he set up his monitoring system using these tools. He also goes into some of the different settings for rates and errors and other things he's added. If you're interested in building a monitoring solution using these tools, definitely a blog post to check out.

[00:10:48] The last post is just something of general interest. If you're interested in history, particularly PostgreSQL's history, there's a post that has been put up called "Looking Back at Postgres" by Joseph Hellerstein. It's basically a look back at the history of the Postgres project and how it got started, from the mid-80s to the mid-90s. So if that's something of interest to you, definitely a blog post to check out.

That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.

Other Episodes

Episode 4

March 19, 2018 00:12:27

50TB Upgrade, Vacuum, DB Security | Scaling Postgres 4

In this episode of Scaling Postgres, we review articles covering developers loving PostgreSQL, a 50TB database upgrade, information on vacuums, patches for parallel PostGIS...

Episode 304

February 25, 2024 00:14:09

Overhead of pg_stat_statements | Scaling Postgres 304

In this episode of Scaling Postgres, we discuss new Postgres releases, performance comparisons between PgBouncer, PgCat and Supavisor, a new extension pg_analytics, and new...

Episode 217

May 29, 2022 00:16:33

Schema Change Mistakes, Canceling Statements, pg_rman, Pedantry Removal | Scaling Postgres 217

In this episode of Scaling Postgres, we discuss mistakes you can make when doing schema changes, how best to cancel statements, looking into pg_rman,...
