PgBouncer SCRAM, Art of PostgreSQL, Distributed Time-Series, Window Functions | Scaling Postgres 79

Episode 79 | September 02, 2019 | 00:10:06

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss PgBouncer SCRAM support, The Art of PostgreSQL, making Postgres a distributed time-series DB and window functions.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/79-pgbouncer-scram-art-of-postgresql-distributed-time-series-window-functions/

Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about PgBouncer, The Art of PostgreSQL, distributed time series and window functions. I'm Creston Jamison, and this is Scaling Postgres, episode 79.

[00:00:22] Alright, I hope you're having a great week. Our first piece of content is that PgBouncer 1.11.0 has been released, and the major new feature is support for SCRAM authentication. So this is pretty big. It only supported MD5, of course, and some others, but finally it has SCRAM authentication. And there's a changelog here that talks about adding SCRAM authentication; handling auth_type = password when the stored password is MD5, like a PostgreSQL server would; adding an option, log_stats, to disable printing stats to the log; adding the time zone to log timestamps; and putting the PID into brackets in the log prefix. It also lists a number of fixes, even one for PostgreSQL 12. So definitely a great set of enhancements, particularly the SCRAM authentication that is now offered for PgBouncer. So that's great.

[00:01:17] The next piece of content is that a book, The Art of PostgreSQL, has been released, with the subtitle "Turn Thousands of Lines of Code into Simple Queries". Now, I'm not too familiar with this book, but it is another piece of educational content. So if this seems interesting to you in terms of learning more about PostgreSQL, check out the website, and it looks like you can download a free chapter to see if it's something that you'd like.

[00:01:43] The next post is "Building a distributed time-series database on PostgreSQL", and this is from the Timescale.com blog. So they make Timescale, which is a database extension for PostgreSQL focused on time-series data, and they're actually introducing a distributed version. It enables you to distribute the data across multiple database systems. So essentially you had scale-up; well, now you can scale out to multiple systems. And it looks like this is something in private beta right now. So this is an announcement about kind of why they're doing it. They're not necessarily considering sharding, but more chunking, because they operate at the level of what they call chunks, and that allows them to do some different things than sharding typically would, as they say here. So you can have more chunks of data being managed in a scale-up scenario; elasticity, being able to move them around; partitioning flexibility in terms of changing the chunk sizes or partitioning dimensions; and deleting the chunks older than a threshold. So maybe they're kind of like partitions, but not really; they're their own thing. And they show some benchmarks here: with a single node, the insert performance was 2.7 million metrics processed per second, and with eight nodes and one access node that grants access to, essentially, the data nodes, they get up to 12.8 million metrics per second. So if you work with a lot of time-series data and you're looking to scale further than a single instance, then maybe this is something you'll want to take a look at. It's a pretty long post that goes into some of their rationale for structuring it the way they did and how they feel it's different from traditional sharding. So definitely a blog post to check out.

[00:03:32] The next post is actually a webinar, "Webinar: Business Intelligence with Window Functions in PostgreSQL: Follow Up". So you can register for the webinar right here, or look for the recording right here, and it basically covers window functions.
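As a quick refresher, and not something taken from the webinar itself, here is a minimal window-function sketch; the orders table and its columns are made up for illustration:

    -- Per-customer running total and ranking, without collapsing rows
    -- the way a GROUP BY aggregate would.
    SELECT
        customer_id,
        order_date,
        amount,
        SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total,
        RANK()      OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_rank
    FROM orders;

The PARTITION BY clause restarts each calculation per customer, and the ORDER BY inside OVER controls how the running total accumulates.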
I haven't had the opportunity to watch the whole thing yet due to other time constraints this week, but I'm always interested in learning more about window functions, because they're the kind of thing that I don't use on a daily basis, and anything that keeps me fresh on different capabilities for analyzing data, I like to keep up on. So if you'd like to do that, definitely a webinar to check out.

[00:04:10] The next post is Postgres Table Partitioning, and this is from Enterprisedb.com. It describes the partitioning feature of PostgreSQL, talking about range partitioning, list partitioning and the newly added hash partitioning, and how you would use them in different scenarios. So range is typically for dates; list is where you have a list of known values that you can specify; and hash is being able to use a hash function on a wide variety of data to break it into equal buckets. And they go through showing you how, with declarative partitioning, you can create the base table and then create each partition, along with inserting data. They talk about the default partition and then the performance benefits of partition exclusion when doing queries. And it gives some advice on when to use partitioned tables. So if you're looking into partitioning, definitely a blog post to check out.
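To make the declarative-partitioning part concrete, here is a minimal sketch of range partitioning; the measurements table is a made-up example rather than anything taken from the EDB post:

    -- Parent table partitioned by a timestamp range.
    CREATE TABLE measurements (
        device_id   int         NOT NULL,
        recorded_at timestamptz NOT NULL,
        reading     numeric
    ) PARTITION BY RANGE (recorded_at);

    -- One partition per month, plus a default partition to catch out-of-range rows.
    CREATE TABLE measurements_2019_08 PARTITION OF measurements
        FOR VALUES FROM ('2019-08-01') TO ('2019-09-01');
    CREATE TABLE measurements_2019_09 PARTITION OF measurements
        FOR VALUES FROM ('2019-09-01') TO ('2019-10-01');
    CREATE TABLE measurements_default PARTITION OF measurements DEFAULT;

    -- Rows are routed to the right partition automatically, and queries filtering
    -- on recorded_at can skip irrelevant partitions entirely (partition pruning).
    INSERT INTO measurements VALUES (1, '2019-09-02 10:00+00', 42.0);
    SELECT * FROM measurements WHERE recorded_at >= '2019-09-01';

List and hash partitioning follow the same pattern, just with PARTITION BY LIST (...) and FOR VALUES IN (...), or PARTITION BY HASH (...) and FOR VALUES WITH (MODULUS ..., REMAINDER ...).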
The next post, again from the Enterprisedb.com blog, is Pgpool versus PgBouncer. This takes a very rational examination of each of these products. PgBouncer is marketed as a lightweight connection pooler, and that's pretty much what it does. And they say here, quote, "PgBouncer executes pooling correctly out of the box, whereas Pgpool-II requires fine-tuning of certain parameters for ideal performance and functionality." So Pgpool-II has a lot more features compared to PgBouncer, but it requires more configuration. So if you're looking for this kind of evaluation, or potentially want to switch from one connection pooling solution to another, definitely a blog post to check out.

The next post is Tips for Postgres from a Postgres Insider. These are a set of nine tips, and they're not all specifically PostgreSQL related, but they can be items that are tangential to PostgreSQL; in other words, something that will impact the database but is not necessarily in the database. The first thing mentioned here is smart tools for assessing hardware. So that's not necessarily a PostgreSQL configuration, but if you have direct access to the hardware, it's something you can use to monitor the health of your storage. As they say here, there's pg_test_fsync to determine which fsync method is the fastest for your operating system and hardware. They talk about the backend flowchart that actually communicates how data flows through PostgreSQL, if you want some extra knowledge in that area. And they go over some other ones, such as command-line control of GUC, or grand unified configuration, levels, where you can set configuration at different levels; the setting of work_mem and how to do that optimally; transactional DDL, which is one of Postgres's more interesting capabilities, as they mention here; the advent of virtual columns in PostgreSQL 12; and, of course, time zones. So if you want to get more in depth into some of these tips, definitely check out this blog post.

[00:07:04] The next piece of content is actually a YouTube video called What's New in Postgres 12? This is from a webinar, and they basically outline some of the new upcoming features for 12, including the partitioning improvements, B-tree improvements for indexes, most-common-value statistics, the fact that it now inlines many CTE queries by default, prepared plan control, just-in-time compilation, checksum control and REINDEX CONCURRENTLY. So if you want to learn more about the Postgres 12 features that are coming, definitely a webinar to check out.

[00:07:38] The next post is pgBackRest to back up PostgreSQL, and this is exactly what it says. It explains the pgBackRest solution, how to get it installed and how to configure it for doing backups of your system. So if you're interested in moving to pgBackRest or evaluating it, definitely a blog post to check out.

[00:07:57] The next post is about computing working days in PL/pgSQL. This is a relatively simple thing to do, basically calculating the number of working days in a particular month, and he actually designed a custom function in PL/pgSQL to do it. So if you're wanting to get more experience with writing PL/pgSQL functions, here's a blog post you can check out. And with regard to that, the next post is why you need plpgsql_check if you write procedures in PL/pgSQL. What this does is run some validity checks on the code that you're writing that the built-in solution does not have, and it even has a profiler as well as some performance warnings. So it looks like a great tool for your PL/pgSQL development. Definitely check out this blog post if you tend to do a lot of that, because maybe you'd like to use this tool to make your job easier.

[00:08:55] The last post is Waiting for PostGIS 3: GEOS 3.8. So basically what they're talking about here is that PostGIS version 3 is coming, and a lot of what it does is link to specialized libraries, as they say here, to handle particular problems. One of the ones they use is GEOS, for computational geometry, and there were some issues. They're talking about some functions that are backed by GEOS, including the ones listed here, and they've had some problems in terms of the library sometimes throwing errors. But they're working to make improvements with PostGIS 3, and apparently you'll really see the benefits with GEOS 3.8 in terms of it being more reliable. So if you use PostGIS, definitely something to look forward to with the upcoming PostGIS 3.

[00:09:42] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or iTunes. Thanks.

Other Episodes

Episode 49

February 04, 2019 00:14:20

Over One Billion, No Downtime, SQL Life, Throughput | Scaling Postgres 49

In this episode of Scaling Postgres, we review articles covering over one billion records per month, schema changes with no downtime, the life of...

Episode 292

November 26, 2023 00:13:13

Companion Databases? | Scaling Postgres 292

In this episode of Scaling Postgres, we discuss using companion databases, multi-tenancy database design, whether SQL is good, and different transaction isolation levels. To...

Episode 17

June 18, 2018 00:19:40

Real Time Analytics, Index Decrease Performance, work_mem | Scaling Postgres 17

In this episode of Scaling Postgres, we review articles covering using Postgres for real-time analytics, how indexes can decrease performance and how to configure...
