Date Time Bins, Search Every Field, JSON_TABLE, Picking Primary

[00:00:00] In this episode of Scaling Postgres, we talk about datetime bins, search every field, JSON table and picking primary. I'm Kristen Jameson, and this is scaling postgres episode 210. [00:00:22] All right, I hope you, your friends, family, coworkers continue to do well. Our first piece of content is PostgreSQL mapping timestamps, date bin. This is from CyberTech Postgresql.com, and they're talking about the function date bin. And basically it lets you place a given date into a particular bin. So this is more sophisticated than just the Date trunk function where you're truncating it, but you can actually put dates into specific bins. And so the first parameter is the bin size. So it can be something that Date trunk can't handle, like they're doing 30 minutes here. You could do 15 minutes, five minutes, whatever binning you want. Then you specify in the second parameter the value that you want to bin. And then the third is the starting point of it. So if you're choosing an even time here, it'll bin every 30 minutes at that exact time. But in the other example here, if you're binning every 20 minutes, you can actually have an entirely different start time. So if it's 1449, it'll go into 1431, or if it's 1454, it'll go into 1451 just based upon what the bin size is, as well as where it starts. So this is an interesting function. If you want to learn more, you can definitely check out this blog post. [00:01:45] The next piece of Content DBA Tips and Tricks Search in every field in a table. This is from Depesc.com and he's talking about you can quite easily just search each individual column for a particular value. So that's easy to do. But what if you had a lot of columns and you want to search for the same value in every field? Well, there's actually a row data type, so you could actually do in his example, he's selecting from PG user. You could do select PG User from PGUSER and it gives you one column being returned with all the fields in it for each row. Now, to make it easier, he did alias it to U. So he's selecting U from PGUSER U and he's just limiting it to one to return one record. But once you have this, you can then cast it to a text value and do that same search. And it's going to look through the text of that row to see, hey, can I find, for example, 22 somewhere in here? Now, of course, the thing to keep in mind, this is not going to use an index, but this could be a helpful way if you want to look through any column of a table. If you want to learn more, definitely check out this blog post. [00:02:55] The next post, also from Depsc.com, is waiting for PostgreSQL 15. JSON table. So this is a new patch that was added as part of the SQL's JSON improvements that are coming and it basically lets you convert JSON into a table that looks like PostgreSQL that you can now query and you have different ways that you can render out the table. You can handle nested JSON as well. So a pretty interesting function and if you want to learn more about it, definitely check out this blog post. [00:03:28] The next piece of content is actually not a new post, but I've recently done some work where I was questioning how I was picking the primary database in a primary replica setup or a primary standby setup and for a long time I was using things like virtual IPS or elastic IPS or even trying using DNS and switching the DNS names. But that gets into an issue with TTL times. So basically it was the infrastructure that was determining what postgres database was the primary one. Now, a lot of this is just due to history because there was no capability in earlier versions of postgres to determine which one was the primary, at least before version ten. So these were the solutions that were used. But even though, as a version ten, you could designate a client to talk to more than one database at a time, and it would pick the connection based upon just a read, write or any type of database, I've never really used it for determining which database was the primary one or how the client actually determined which one to go to. I've always used a single host in the client configuration, but I recently moved to using multiple hosts in the client configuration and it made things a lot easier. So basically in the client connection string you can include more than one host and then with the option target session adders, you can determine whether it's going to connect to any host or a read write host. Now that's as of version 13, that's what it does. Version 14 added four additional options read only, primary, standby, and prefer standby. So you can use this capability as a version ten, but the enhancements in 14 essentially make it even better. So actually on a recent project I actually moved to using the client, determining if the server is primary or not, instead of having the network infrastructure abstract that away. So basically the client is aware of every single host running postgres and I'm using a private DNS to tell what hosts exists and the port numbers. But then it's the client configuration in the target session adders and the client capabilities that are determining which one is the primary. So if something fails over, it knows it needs to look at the primary. And I found this really easy and smooth to set up, so I wanted to mention this as another option you should potentially consider in terms of allowing your client to pick who the primary is as opposed to explicitly telling it. Now this capability of all these different options exist in the client of postgres 14. So you don't even need to use Postgres 14 server, although it does give some added benefit in terms of a little bit of reduced traffic in terms of assessing if a particular server is a primary or not. But as early as version ten you could connect to any database or read write database, so that may be sufficient for your use. Case there were two posts that discussed this in detail. One back in March of 2021 talking about this new feature by Cybertechfossgresql.com and then another post by PostgreSQL Fastware.com on May 27 of 2021 that talked about these features. Now, again, this made Failover much easier because the client can just follow appropriately. The only thing to keep in mind is that you do need to essentially restart a session because that's where it determines what it's connecting to. So for example, if you have a client that's set to only connect to standbys and you promote one of the standbys and some of the clients are talking to it, unless its connection is restarted, it'll still connect to the primary until that connection is reestablished. But there are a lot of possibilities in using this and I wanted to bring it up to indicate that this could be a viable solution for clients to determine who the primary is as opposed to explicitly telling it via the network infrastructure. But if you want to learn more, definitely check out these pieces of content. [00:07:28] The next piece of content five minutes of postgres. Episode Twelve the basics of tuning vacuum and auto vacuum. This is from Pganalyze.com, and this essentially goes over the post that was discussed last week on Scaling Postgres, and that was postgres to the EDB blog talking about vacuum and auto vacuum. And this was a great post that covers all of that configuration. So if you want another viewpoint on that content, definitely check out this post. [00:07:55] The Next Piece of Content on Shape Files in PostGIS this is from Endpointdev.com and they're talking about shape files and using them within PostGIS and the process you have to go through to get that set up. So if you're interested in that, you can check out this blog post. [00:08:11] The next piece of content how to enable SSL for postgres connections. This describes exactly how to do that from setting up certificates. Maybe you need to set up your own private CA and your own certificates to do this, but he goes through some commands that you can use how you configure PostgreSQL, as well as the changes you need to make to the Pghba comp file as well as updating PG bouncer as well, so it can speak securely as well. So check out this blog post if you want to learn more. [00:08:41] Next Piece of Content DLE three one PG backrest time Zones for CLI DLE community this is from Postgres AI and they're talking about a release of their new open source software called the Database Lab Engine, which basically does thin cloning of databases for testing or development purposes. So if you're interested in that, you can definitely check out this new release. The next piece of content, the PostgreSQL Person of the week is Tyrus Kluba. If you're interested in Tyrus and his contributions to Postgres, definitely check out this blog post and the last piece of content. We did have another episode of the Rubber Duck Dev show this past Wednesday. This one was on working with outside teams, so a more project management type episode. So if you're interested in that type of content, definitely check out our show. [00:09:27] That does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show notes be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or itunes. Thanks.

Show Notes

Episode Transcript

Other Episodes

Episode 60

Pluggable Storage, Developer Gotchas, Cursors, PG12 | Scaling Postgres 60

Episode 254

PG Edge Cache, Postgres & OpenAI, citext to Collations, Compression Options | Scaling Postgres 254

Episode 222

Postgres 15 Beta 2, Concerning Locks, Vacuum Tuning, Transaction Anomalies | Scaling Postgres 222

Date Time Bins, Search Every Field, JSON_TABLE, Picking Primary | Scaling Postgres 210