Will Postgres Pull A Redis? | Scaling Postgres 309

Episode 309 | March 31, 2024 | 00:15:12

Hosted By

Creston Jamison
Show Notes

In this episode of Scaling Postgres, we discuss whether Postgres will pull a Redis, remember Simon Riggs, and cover the built-in collation provider and C.UTF-8 locale in PG 17, as well as database health checks.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/309-will-postgres-pull-a-redis/



Episode Transcript

[00:00:00] So did you hear about what Redis did last week? Basically, they are no longer open source moving forward. Do you think there's a risk of Postgres doing the same? Well, we'll get into that in this episode of Scaling Postgres, but I hope you, your friends, family and coworkers continue to do well. And the first piece of content is "Will PostgreSQL ever change its license?" This is from jkatz05.com. And given what happened this week with regard to Redis, in that it is essentially no longer open source moving forward, he chose to write a post about whether this will ever happen to PostgreSQL. And he says the PostgreSQL website has an FAQ entry, "Will PostgreSQL ever be released under a different license?", which basically says the PostgreSQL Global Development Group remains committed to making Postgres available as free and open source software in perpetuity. But Jonathan did mention that, for full disclosure, he is on the PostgreSQL core team and he did help write this paragraph. And this blog post goes into more depth, but it's basically talking about what people have said historically about Postgres being open source for nearly 30 years, and that it should continue to be so moving forward. Although I would argue a lot of the other open source projects that have recently gone closed source, or become less free, were also saying they would continue to be open source in the future. [00:01:29] But to my mind, what makes PostgreSQL a little bit more stable than some of the other projects like Redis and Terraform is that a lot of these projects that have become less free have mostly been controlled by one company, and therefore it's quite easy to suddenly change the licensing. Whereas, at least at this point in time, there are so many companies contributing to PostgreSQL that I find that much less likely.
But even if some organization took over the majority of the Postgres core team or the Global Development Group and decided to change the licensing, I think the greater community would step up and perhaps fork Postgres from that licensing change point and just develop it as something else. For example, with Terraform, when they announced their licensing changes, a group of developers and organizations got together and said, okay, we're forking Terraform, and from this point it'll be open source. That fork is called OpenTofu, and now that project is under the Linux Foundation, so it seems to have a very good foundation for moving forward, independent of whatever Terraform decides to do. If something similar happened to Postgres, I imagine a similar process would follow. Maybe it wouldn't be called Postgres, but essentially all the bones of it would live on in an open source manner. Also, I keep waiting to see if there are going to be legal ramifications for some of these licensing changes, because in the case of Redis, I saw a reference to approximately 50% of recent code changes being made by people outside of the Redis organization, and some of them work for large corporations like Red Hat. So they've been contributing code to the open source project, and now that the licensing has changed, I saw that at least one contributor wanted their code removed from the project. So do they or do they not have the right to do that? I find this whole discussion super interesting, and you can definitely check out his post, then check out what's going on with Redis, and let me know if you think Postgres is ever going to go down this path as well.

Next piece of content: "Remembering Simon Riggs". This is from postgresql.org, and I'm just going to read the first brief paragraph here. "The PostgreSQL Core Team is deeply saddened by the loss of our longtime friend and colleague, Simon Riggs, on 26 March 2024."
Simon was responsible for many of the enterprise features we find in PostgreSQL today, including point-in-time recovery, hot standby, and synchronous replication. He was the founder of 2ndQuadrant, which employed many PostgreSQL developers, later becoming part of EDB, where he worked as a Postgres Fellow until his retirement. Now, I will add, I never got the opportunity to meet Simon Riggs. I've seen some of his presentations and talks, but knowing that he was the founder of 2ndQuadrant, I will say I definitely miss 2ndQuadrant's blog, because up to the point where they were acquired by EDB, it was one of my top resources for learning things; they put out a lot of great educational content. EDB doesn't publish quite as much as 2ndQuadrant did, but you can check out this blog post, as well as one of the last presentations he gave.

[00:04:51] Next piece of content: "Waiting for Postgres 17: The new built-in C.UTF-8 locale". This is from pganalyze.com, and this is Lukas's Five Minutes of Postgres. He covers an enhancement where they've added a new collation provider in Postgres 17. By default, Postgres uses the glibc provider for collations, so collation behavior is dependent on the operating system, or you can use an ICU collation instead. But in Postgres 17, a patch has been committed that creates a built-in collation provider. This could be great for avoiding sorting changes when doing upgrades, because presumably this provider won't change much over time. And the first locale they're introducing is a new C.UTF-8 locale. Much like the post we discussed last week about binary sorting, this does binary sorting, but the functions for capitalization and upper- or lower-casing text strings work appropriately on non-ASCII text. So that's great.
Lukas also goes into some of the performance differences, where he shows about a twofold performance improvement for something he tested with bytewise sorting as opposed to linguistic sorting. So it was actually a little bit more than I anticipated. And he actually does advocate moving to the C.UTF-8 locale so that you can avoid collation issues when you upgrade, but also for the better performance. And if you need to sort something linguistically, you can do that on a column-by-column basis. So if you have a name or description that you definitely want sorted linguistically, you can go ahead and set a collation for that individual column, and then of course any indexes built upon it will be collated the same way as well. And it's funny, I actually just gave that recommendation to a client recently, because they're getting ready to transition the platform that hosts their database. We were talking about collation changes, and I recommended that it probably made sense to stick with the C collation and, if you need linguistic sorting, to use the collation appropriate for your language. But definitely check out this blog post if you want to learn more.

[00:07:12] Next piece of content: there was another episode of Postgres FM last week. This one was on health checks, and basically they're thinking of these as checkups for your database. They talk a little bit about what could be included and how often you should do them. These aren't necessarily monitoring: monitoring is something you do continuously, but a checkup is something you do on a more periodic basis. I think they were talking in the range of months or a year. I know I do certain checkups on a weekly basis for certain clients, some monthly, some quarterly. Things at a weekly resolution are simply checking: are the backups working? Are they the appropriate size?
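As a rough sketch of how the new built-in locale and a per-column linguistic collation can be combined, based on the Postgres 17 syntax for the committed patch (the database, table and column names here are hypothetical):

```sql
-- Postgres 17: create a database using the new built-in collation provider
CREATE DATABASE appdb
  TEMPLATE template0
  LOCALE_PROVIDER builtin
  BUILTIN_LOCALE 'C.UTF-8';

-- Keep fast bytewise sorting by default, but sort one column linguistically.
-- Indexes built on the column inherit its collation.
CREATE TABLE users (
  id   bigint PRIMARY KEY,
  name text COLLATE "en-x-icu"   -- ICU collation for linguistic sorting
);
CREATE INDEX ON users (name);

SELECT name FROM users ORDER BY name;  -- uses the column's ICU collation
```

The `"en-x-icu"` collation is only available if the server was built with ICU support; any ICU locale collation would work the same way here.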
Checking the logs to see whether anything unexpected happened over the last week, et cetera. But check out this episode if you want to learn more.

[00:08:04] Next piece of content: "Best pg_basebackup compression settings for version 15 and above". This is from kmoppel.github.io, and he's comparing different compression settings for pg_basebackup. He looked at gzip, LZ4 and zstd, and pretty much gzip was the worst. Zstd was great in terms of how much it can compress, but LZ4 won on speed. [00:08:31] I know for a while now I've pretty much just been using zstd for this, but check out this blog post if you want to learn more.

[00:08:40] Next piece of content: "Recovering deleted data from PostgreSQL tables". This is from cybertec-postgresql.com, and he covers a few different options for doing this. The first option is to not have to recover it at all, meaning if you're going to delete data, don't just delete it, actually use a transaction. So you do BEGIN, do the DELETE, and if you suddenly notice you deleted too many rows, just do a ROLLBACK. Although one could argue that the rollback is the way of recovering the data, this is definitely the easiest thing to do. So whenever you're doing a delete, wrap it in a transaction, because if it affects more rows than you expect, you can easily roll it back. [00:09:24] Option number two: if you have backups, you make a note of the time, but you'll have to do a point-in-time recovery up to that point to get back the data that you deleted. [00:09:35] Option number three is a separate tool called pg_dirtyread that allows you to read the deleted rows from a table as long as vacuum has not been run. This is an extension you can install; you just do CREATE EXTENSION pg_dirtyread, and then you can read the tuples marked for deletion from the table. But again, you need to make sure vacuum hasn't run.
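Options one and three can be sketched roughly like this (the `orders` table and its columns are hypothetical, and pg_dirtyread has to be installed on the server first):

```sql
-- Option 1: wrap the delete in a transaction so it can be rolled back
BEGIN;
DELETE FROM orders WHERE created_at < '2020-01-01';
-- if the reported row count is larger than expected:
ROLLBACK;  -- otherwise: COMMIT;

-- Option 3: read tuples marked for deletion (only works until vacuum runs)
CREATE EXTENSION pg_dirtyread;
SELECT * FROM pg_dirtyread('orders')
    AS t(id bigint, created_at timestamptz);
```

pg_dirtyread takes the table name and returns records, so you supply a column definition list matching the columns you want back.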
And option number four, he says, is for when you're totally desperate: you can use full page writes. Basically, this is a process of going to the WAL files, getting the full page writes, grabbing those and copying them into a particular database table, which is insane to me. Definitely do this on a copy of the database, and then use pg_dirtyread to retrieve them. And option five is calling the Postgres consultant that you have on a support contract.

Next piece of content: "Recovering from data loss despite not having a backup: a Postgres true story". This was presented at SCaLE 21x on March 25, and this is from Jimmy Angelakos. You can get the slides for the talk here, or you can watch the YouTube video; I would advocate the video. It's a very fascinating story where, unfortunately, someone had one database, no replica, the backups were from months ago and didn't work, and the hard drive crashed. It actually turned out to be the controller card, so the data was okay. A data recovery group retrieved all the files, but none of the directory structure. Jimmy goes through the process he used to take those files and put them back into the proper directory structure, using his knowledge of Postgres internals and how it writes to disk, as well as even fabricating some commit data, to be able to get all the data to appear accurately and ultimately start up the database system. So, a really fascinating presentation that I definitely encourage you to check out.

Next piece of content: "PostgreSQL 17: Convert a physical replica to a logical replica using pg_createsubscriber". This is from dbi-services.com, and this is super interesting: if your logical replication is having trouble getting in sync, you can actually just create a physical replica of your database and then convert it to a logical replica. So that's a super interesting feature. You can definitely check this out if you want to learn more.
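A rough sketch of what running pg_createsubscriber against a stopped physical standby might look like (the paths, hostname and database name are made up; check the Postgres 17 documentation for the full set of options):

```shell
# Stop the physical standby, then run pg_createsubscriber
# against its data directory to turn it into a logical replica
pg_ctl stop -D /var/lib/postgresql/17/standby

pg_createsubscriber \
  --pgdata /var/lib/postgresql/17/standby \
  --publisher-server 'host=primary.example.com port=5432 dbname=appdb' \
  --database appdb
```

The tool creates the publication, subscription and replication slot for you, then adjusts the standby so it starts up as a logical subscriber instead of a physical replica.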
[00:12:05] Next piece of content: "Checking your privileges". This is from thebuild.com, and this is talking about function privileges. So, for example, say you are the owner of the database, you create a function, and you want to grant permission for a high-privilege user to run it, but you want to revoke the EXECUTE permission for a low-privilege user. But unfortunately, both the high-privilege and the low-privilege user can still run the function, and you're probably thinking, what the heck, I revoked it, why is it still working? The reason is that there's no such thing as a negative privilege, and EXECUTE on a function is by default granted to PUBLIC. So really what you must do is revoke the PUBLIC access to that function, and then only those roles that have been explicitly granted EXECUTE can use it. So here he revokes EXECUTE on the function f that they created from PUBLIC; that revokes it, and now the low-privilege user can no longer execute that function. [00:13:08] And then of course the thing you need to keep in mind is that this only impacts existing functions. If you want future functions to have their PUBLIC privileges revoked, you actually have to alter the default privileges to make that happen. So check out this blog post if you want to learn more about that.

Next piece of content: "CloudNativePG Recipe 4: Connecting to your PostgreSQL cluster with pgAdmin4". This is from gabrielebartolini.it, and he goes through setting up pgAdmin access to your Kubernetes cluster using CloudNativePG. And he has a fifth recipe about this: "CloudNativePG Recipe 5: How to migrate your PostgreSQL database in Kubernetes with zero downtime from anywhere". He basically shows how you can do this using logical replication to migrate from other providers into CloudNativePG, and specifically he shows how to do it from RDS, as well as the configuration needed in CloudNativePG.
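Going back to the function privilege discussion above, the fix he describes can be sketched like this (the function f comes from the post; the role name is hypothetical):

```sql
-- Revoking from one role changes nothing while PUBLIC still has EXECUTE
REVOKE EXECUTE ON FUNCTION f() FROM PUBLIC;
GRANT  EXECUTE ON FUNCTION f() TO high_privilege_user;

-- For functions created in the future, change the default privileges
ALTER DEFAULT PRIVILEGES REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;
```

Note that ALTER DEFAULT PRIVILEGES only applies to objects later created by the role that runs it (optionally scoped with IN SCHEMA), so existing functions still need the explicit REVOKE.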
So if you're interested in that, you can check out this blog post. And the last piece of content: "Major developments in Postgres extension discovery and distribution". This is from ardentperf.com, and this is another report on the extension ecosystem work that David Wheeler is doing as part of Tembo, where they want to create a new extension ecosystem for Postgres. They're actually bringing together people who use extensions as well as people who develop them, and he's advocating joining some of the Zoom calls they're having to discuss what this new ecosystem should be. So you can check out this blog post and learn more about that. [00:14:48] I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can find links to all the content mentioned in this episode, as well as sign up for the email list so you get weekly notifications of each episode. Thanks, and I will see you next week.
