Default Security, Max Connections, Progress Reporting, Migration | Scaling Postgres 70

Episode 70 July 01, 2019 00:12:57
Default Security, Max Connections, Progress Reporting, Migration | Scaling Postgres 70
Scaling Postgres
Default Security, Max Connections, Progress Reporting, Migration | Scaling Postgres 70

Jul 01 2019 | 00:12:57

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss the default security, max_connections and pooling, progress reporting and migrating to Postgres.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/70-default-security-max-connections-progress-reporting-migration/

View Full Transcript

Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about default, security, max connections, progress reporting and migration. I'm Kristen Jameson and this is Scaling Postgres, episode 70. [00:00:20] Alright, I hope you're having a great week. Our first piece of content this week is PostgreSQL Deep Dive, PostgreSQL defaults and Impact on Security part one. [00:00:32] There's also a second post that is part two we'll be covering as well. So this is indeed a deep dive. It is a very long post talking about impact on the defaults and their impact on security. So basically talking about users groups and roles and kind of what they have by default, what's the impact of the public schema. And also in discussion about the situation identified by this CVE that was posted in 2018 and how you can basically protect yourself from it. So he talks about basically there is the command to create users, groups and roles, but really they're all the same thing. A user or a group is basically an alias for a role. So when you create a role, you're creating a user or a group. Generally users have the connect privilege. All right, they mentioned here actually consider that an attribute whereas groups generally do not, but essentially they're all considered roles. Now, what's great about this post is the level of education about how roles in the public schema and the search path in terms of what commands to execute are. So these are two very important posts and they are very long. So if you're not able to read it now, I would definitely bookmark these and come back to them because they're really, really good in terms of from an educational perspective. So he's talking about roles have basically four types of relevant properties. They have attributes which is a capability for a role. So example login, super, user, et cetera, membership, which is it's a member of another role, privileges. So can it do an unselect and insert an update, delete, et cetera. And then settings so it's a custom value for a configuration variable bound to a role. Now, as a part of this post, they also go over an extension that was created called the Crunchy check access. And basically with roles acting as users and groups and again the public schema and privileges or default privileges, it can become really confusing what rights someone has. And basically they've developed this extension to be able to give you an insight into what rights a given user or a role or a group has. So in this example post, they created a basic database called Deep dive and then they created a group. Again, these are still just roles, but they created a group using no inherit some super users, created some different roles, created group add users to a role or make it a part of a role, and then used the du command to get the list of roles and kind of what they're attributes and what they're members of. So again, this really goes through and gives you an education on how you can create these different things and then what the security ramifications are in terms of what privileges certain people have. And then they use the check access extension to be able to see what they have. And some of the output appears down here. Now, I'm not going to go over everything this post because it is super long and it will take an immense amount of time to review, but I strongly suggest you check it out and look at it for edification purposes. Now the second part, once you have that basic knowledge down, goes into CVE 2018 1058. And first they mentioned some concepts about schema where he says basically there's a schema called the PG catalog which includes the built in objects provided by PostgreSQL and then new databases have a schema called public. And there's a concept of the search path where quote, the PostgreSQL searches the system catalog schema, the PG catalog first and then it looks in the user schema and then the public schema where the user is essentially the session username. And with regard to the security issue, you can create a function that has the same name but accepts slightly different inputs. So it can be text. But for example, this one you created here can be varcar. So you could actually create an object in the public schema that is named similar to an existing built in function that's in the PG catalog, but then you can overwrite that functionality to escalate your privileges. So they show a basic example here where someone created a function called lower which gives you lowercase. But they used their care as the input and they do a harmless they select so and so was here. Now another user running this will actually, the search path will eventually look in the public schema and potentially run that and then they go over the full example over what's possible in terms of escalating privileges with this. Now then they lead on into the fix, which basically one of the main things is to revoke create on schema public from public. So prevent people from creating objects in the public schema essentially. But they say you may also want to also revoke execute on all routines in schema public from public as well as altering the default privileges for that as well. So again, this is a super good set of posts to help you get an education on roles and permissions and groups and things of that nature. So I highly suggest you check out these two posts. The next post is The Challenges of setting max connections and why you should use a connection pooler. So here they're talking about max connections and of course the default in PostgreSQL is set to 100. So in terms of determining how large this can be for a database or what's the best practice, he says, quote, Talk to any PostgreSQL expert out there and they'll give you a range, a few hundred or some will flat out say not more than 500 and definitely no more than 1000. And then he talks about where do these numbers come from. So he actually wanted to do a test. So he used a G three eight x large EC two instance and set up PG bench with a number of clients to run against it. And he set the number of concurrent connections to run from 100 up to 5000 and he set max connections to 12,000. So, quite a bit on this database, and here you can see the graph of transactions per second. So essentially how many transactions, how high it can go concurrently, and then how the latency increases. But seems like most of the action happens in the beginning part. So we zoomed in on it here. So as he says, the optimal performance was when there were between 305 hundred concurrent connections and after 700 connections the performance dropped off precipitously. So again, very close to that tribal knowledge quote, a few hundred, no more than 500, definitely no more than 1000. So that's pretty much what this is saying here too. And then he goes over a section called what if we need more connections? And there comes in connection pooling. So using PG bouncer or PG pool? And he did some tests following similar to Olivaro Hernandez's concurrent connection test, but he also used the connect flag because he wanted to simulate the cost of building up and building down connections. And that's what this chart looked like now when he added in the connection pooler. So this is without a connection pooler, sorry I can't highlight it. This is without a connection pooler. And with a connection pooler you definitely see increased performance. So basically max connections, it looks like a few hundred, probably not more than 500, definitely not more than 1000 is the way to go. And if you need more connections than that, then fall to start using a connection pooler such as PG Pool or PG Bouncer. But if you're interested in this type of content, definitely a blog post to check out. The next post is postgres twelve. Highlight more progress reporting. And this is from Pakir XYZ and he's talking about how we have progress reporting for vacuum, but now they've added more progress reporting in postgres twelve for cluster activities as well as vacuum full activities. So rewriting essentially the whole table. And that one's called PG Stat progress cluster. And then also for reindex and Create index operations. And for that, there's a view called PG Stat Progress create Index. So definitely some great new features being added to twelve in order to monitor these two potentially long processes. And if you want some more details, definitely blog post to check out. The next post is actually a presentation from Bruce Momgian from EDB Postgres talking about major features postgres twelve. So this goes over partitioning improvements, btree improvements. Most common value statistics where you define statistics across columns inlining many CTE queries as opposed to materializing them prepared plan control, which we talked about in a previous episode of Scaling Postgres just In Time compilation being on by default the new Checksum control feature and then reindex concurrently, which again, my opinion is huge feature. Great to have. So if you want some more detail with regard to Lee's, definitely check out his presentation. [00:09:37] There's another presentation that Bruce Mongian also posted, which is the democratization of databases. So basically he did some analogies between government and their structure and the software governance structure of Postgres and why it is advantageous. So it's an interesting read if you want to check it out. The next post is PostgreSQL Administrator account with no login recover your Postgres role. And this is from Luca Ferrari at fluca 1978 GitHub IO. And he says, what happens if you lose the ability to log in with your Postgres account? What are you going to do? So when you try to do it, it says Postgres is not permitted to log in. So you can actually stop the service and then start it in single user mode. So always remember that you have this mode to be able to start postgres in, and with that you can execute operations such as Alter Role, Postgres with Login and then Control D to stop out of it. And then you can start the Postgres service normally and be able to connect as Postgres. So, very simple post, but just something to keep in mind if you ever run into such a problem as this. [00:10:45] The next post is Managing multiple PostgreSQL instances on Ubuntu or Debian. So this talks about some of the different cluster commands that are available on Ubuntu for being able to manage multiple clusters. So for example, there's the commands PGLS clusters to list out the clusters that are available. [00:11:05] There's the cluster control using PG underscore CTL. Cluster control. And you can also create clusters drop clusters. So this goes over how you can manage multiple clusters of PostgreSQL on Ubuntu or Debian platforms. So if you're interested in that, definitely a blog post to check out. [00:11:26] The next post is a webinar migration to PostgreSQL follow up. So this is a webinar given by second quadrant and they talk about migrating to PostgreSQL. And if you go to the registration page, they talk about why would you want to do this, how to plan your migration, how to migrate your business logic, converting data types, migrating your SQL in terms of function procedures and triggers, typical challenges encountered as well as post migration tests. So if you're interested in migrating to PostgreSQL, definitely a blog post to check out. [00:12:00] The last post is version 1.6 of PG watch two PostgreSQL monitoring tool released. So this is a tool by Cybertechn Postgresql.com that allows you to do monitoring and it now has Prometheus support. So you can send this monitoring to Prometheus, as well as Petroni support for being able to determine from my interpretation what is the primary or database to be able to follow. So if you're interested in using PG watch, they have a new version that you can check out. [00:12:33] That does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show Notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or itunes. Thanks. [00:12:54] You our.

Other Episodes

Episode 293

December 04, 2023 00:15:18
Episode Cover

Better Plans With Statistics | Scaling Postgres 293

In this episode of Scaling Postgres, we discuss how create statistics can help you get better plans, issues with subtransactions, all about extensions and...

Listen

Episode 86

October 20, 2019 00:15:34
Episode Cover

Prewarming, Nondeterministic Collations, Generated Column Performance, Foreign Keys | Scaling Postgres 86

In this episode of Scaling Postgres, we discuss prewarming your cache, working with nondeterministic collations, generated column performance and foreign keys with partitions. To...

Listen

Episode 48

January 27, 2019 00:14:41
Episode Cover

Microsoft Acquires Citus Data, Split WAL, Maintenance Work Mem | Scaling Postgres 48

In this episode of Scaling Postgres, we review articles covering Microsoft acquiring Citus Data, split WAL files, maintenance_work_mem and logical replication. To get the...

Listen