Tablespaces, Streaming Replication, More Postgres 14, Security

[00:00:00] In this episode of Scaling Postgres, we talk about table spaces, streaming replication, more Postgres 14 and security. I'm Kristen Jameson, and this is Scaling. Postgres episode 159. [00:00:17] One all right, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is when to use table spaces in PostgreSQL. This is from CyberTech postgresql.com. They're talking about when you should use table spaces, and they compared it to Oracle, where Oracle basically has its own file system it uses to store the data, whereas Postgres relies on whatever operating system it's using to store its data. And they have a terminology in Oracle versus Postgres chart. Here where they say an Oracle table space is similar to a operating system file system where an Oracle data file is equivalent to a logical physical volume in a file system. A segment in Oracle's equivalent to the data files that are part of a table and an extent in Oracle's equivalent to a segment or a data file in Postgres. Now, I don't necessarily 100% agree with this chart because there doesn't list where the table space exists. So the table space is logical representation of where you're storing the data that is connected to a physical location. So, for example, you logically create a table space, but you specify a physical location for it. So when you create a new table, you specify that table space which is actually stored in that location. So Postgres normally just uses a data directory and stores everything in there. You can direct, say, logs to a different location using SIM links or other methods like that. And you can determine to a certain extent maybe some temporary space, but mainly everything is in the data directory and table spaces allow you to store things outside of that data directory. And when you create objects, you need to specify the table space. Now, they do say that when you specify table to go in a particular table space, when you create indexes, you need to specify that table space as well. It doesn't automatically go with the table, but if you put a whole database on a particular table space, it will stay within that table space. Now, the blog talks a little bit about backups and how you need to take table spaces into account and then ask the questions, when should you create a table space? Now, here are some reasons and they've specified. A lot of these reasons don't necessarily apply anymore. They applied more in a time when we were using a lot of physical drives, a lot of magnetic storage using SSDs and these virtual file systems in a cloud environment. A lot of these reasons don't really exist anymore. And I've only used table spaces when we're running out of data where the primary data directory is located, and there wasn't an easy way to expand it. So what we typically did is brought up a new table space and then created large indexes on that table space and then deleted them on the primary table space or the default data directory to free up some space to bias time to then address it. But it is a bit of a hassle doing backups when you have table spaces, so I definitely wouldn't recommend it. And that's what this post advocates as well, is that you probably shouldn't be reaching to use table spaces a lot and they make a quote here. Since almost everybody uses virtualization these days for their postgres installations, table spaces are becoming an increasingly irrelevant PostgreSQL feature. So whereas they mentioned that in Oracle, it's more of a consideration, these table spaces in postgres, it's really only certain use cases that they iterate here and mainly the only way I've used them is you're running out of space on the main data directory and cannot easily move or expand that data directory. But if you want to learn more, definitely check out this blog post. [00:04:05] The next piece of content is setting up streaming replication in PostgreSQL 13 and streaming replication internals. This is from Migops.com and they're talking about basically physical or streaming replication and they talk a little bit about the internals, basically the wall files that are being generated. You stream that to a standby server to be able to create a replica of the primary database and then they go into all the different commands that are used to get this set up and to get two systems up and running and replicating data from the primary to a replica. So if you're interested in that, you can check out this blog post. [00:04:44] The next piece of content waiting for PostgreSQL 14. Add PG Database Owner default Role so this is a new default role that's been added and the intent is to add it to things like template One. And template One is a database that all databases are created from. So it's a template for you to create new ones. And if you apply this to template one and in his example here, he created a security definer function to be able to query the PGSTAT activity table and then applied it to template one so that when a new database is created with a particular owner they can run this function. Get running queries against the database they've created because they are the owner. So it's a way to give more permissions to users at a more granular level. So if you're interested in learning more about this new feature, you can check it [email protected]. Now there are two other postgres related to new PostgreSQL 14 features. The next one is Add Date bin function. So this is similar to the date trunk function where you can take a date and truncate it down to a particular resolution. So for example, there's microseconds milliseconds, days, months, decades, et cetera. Well now you can bend it to a particular time and then offset it. So it's not always truncated to a particular time, so it's bend at an off time interval. So if you're interested in that, maybe you want to check out this blog post. And the third post is adding the Add Uni string function. So this allows you to decode a string with Unicode escape sequences. So if you're interested in that, you can check out this one. [00:06:26] The next piece of content is is Postgres secure? This is from Crunchydata.com and the answer to the question, of course, as they say here, quote the short answer is yes, but it's all based upon your implementation and how you set up postgres. And they have a number of resources that they've linked to here to give you guidance on how to set up postgres in a secure fashion. And the first thing they cover is the Common Criteria for Information Technology Security Standard, and that there is a protection profile for database management systems that you can apply for deployment of postgres. They also talk about work done with the center for Internet Security and the United States Defense Information Security Agency to develop more guidelines for deployment, configuration and administration. And they include that here, along with the Security technical implementation guides. And they also talk about postgres SQL injections and encryption requirements. So if you're interested in making a more secure postgres installation, definitely check out this blog post. The next piece of content is, again more webinars that have been posted to the EDB YouTube channel. So if you're interested in video content, you can check that here. [00:07:36] The next piece of content is preventing FreeBSD to kill PostgreSQL, also known as out of memory killer Prevention. FreeBSD is different from Linux and we have covered articles talking about the out of memory killer in Linux and there's specific documentation on postgres about how to do that. But this covers FreeBSD and a number of different functions like Protect. It has to be able to do that and configure it. So if you run postgres on FreeBSD, maybe you want to check out this blog post. [00:08:08] The next piece of content is logging of PG pool two on Kubernetes. So if you run PG pool two and you want to do it on Kubernetes and interested in the logging, definitely check out this blog post from Bping blogspot.com. [00:08:22] The next piece of content dumping a byte array with psql this is from cleverelephant CA and he's talking about using binary data in postgres, specifically the byte array to store binary data. And he was using that to do some debugging and creating a pipeline within postgres to be able to take Raster images that are stored in binary in postgres but be able to examine them. So if you're interested in learning more about the pipeline he generated here in terms of working with byte array data, definitely check out this blog post. [00:08:56] And the last piece of content is the PostgreSQL Person of the Week is Jan Karaman's. So if you're interested in learning more about Jan and his contributions to postgres. Definitely check out this blog post that does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show Notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode. Or you can subscribe via YouTube itunes. Thanks.

Show Notes

Episode Transcript

Other Episodes

Episode 128

Statistics Speed Up, Synchronous Commits, TLS Updates, Avoiding Cursors | Scaling Postgres 128

Episode 298

10-100 Times Higher Latency With Distributed Postgres | Scaling Postgres 298

Episode 68

Prepared Plans, Materialization, Recursive CTEs, Safe Migrations | Scaling Postgres 68

Tablespaces, Streaming Replication, More Postgres 14, Security | Scaling Postgres 159