Postgres Scheduling Opportunities | Scaling Postgres 289

Episode 289 November 05, 2023 00:15:19
Postgres Scheduling Opportunities | Scaling Postgres 289
Scaling Postgres
Postgres Scheduling Opportunities | Scaling Postgres 289

Nov 05 2023 | 00:15:19

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss new options for Postgres scheduling tools, proper ways to kill processes and explain usage.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/289-postgres-scheduling-opportunities/

 

View Full Transcript

Episode Transcript

[00:00:00] Do you know how there are all of these scheduling tools you can use with postgres, like PG Cron or a PG timetable? I think there's a few others. I've pretty much always ignored them because my perspective is I have Cron. What else do I need? But the first article today makes me wonder. I might want to check them out a bit more. But I hope you, your friends, family, and coworkers continue to do well. And before we get started, I do want to thank everyone that met with me about my postgres course, and I will have more details coming in the next few weeks about that. But the first piece of content is making PostgreSQL tick new features in Pgkron. This is from Citastata.com, and this is a scheduling tool that runs as an extension in postgres. Now. Like I mentioned, I've always just used cron. Or you could also use System D timers to run any scheduled jobs. And even if you have a hosted database, you could still designate a particular application server to be your Cron server or your scheduling server and have that just connect to postgres and do certain scheduled tasks. So we've never really seen a need to use some of these scheduling tools, but I believe PG Cron is available in almost all of the hosted postgres services. But what I found interesting about this is that in version one five, you can schedule a job every few seconds. So as opposed to using the standard Cron syntax, where the shortest resolution you can do is a minute, this actually enables you to run jobs every one to 59 seconds. So it's more of an interval time period as opposed to running a job at an exact time. And he said that this was one of the most popular requests. But the reason it has not been done is, one, Cron can't do this. Currently, it only does a resolution a minute, and two, if something went wrong with PG Cron, it could cause a lot of problems if it's going wrong every few seconds. But given the maturity of PG Cron, they decided to go ahead and implement this feature. Now, what I find interesting is I could imagine you firing off some different things like they mentioned here, like update aggregations for real time insights. Like maybe you're checking something and you trigger an update of a summary table, or update of a particular counter value, detect particular anomalies, poll external sources. I'm not so sure I would have my database be doing this, but I would see that as more of an application function or implement more sophisticated scheduling workflows. So this is an extension, and this doesn't explain how to install it, but basically you install it as an extension, and it should be available if you're using hosted postgres and many platforms. But to actually run it, you run this function Cron schedule, give it a name, the interval you want it to run, say, 10 seconds and then the function you want to run. And it does log everything running. So you want to make sure that that table doesn't get full if you're running something every second. For example. Now, he did show an example of how this could be used where he hand built a job queue executor. So this is not in PG Cron, this is just something he built using standard functions and procedures in PostgreSQL. First he created a job queue table, then he created a job errors table to track any job failures. He created a schedule function so you can put one command to be scheduled. This basically just inserts that command into the job queue. He created a procedure that runs the jobs, where it basically gets the job from a queue, runs the command that was sent as a part of the queue, does some error handling, and it also supports doing multiple attempts, so it can track the number of attempts for a job, so it can try multiple times. And then to start one of these executors, you just run the procedure that this code is describing above and he calls them job runner one, two, three and four. Because you can run jobs in parallel with PG Quran and you're basically just calling that procedure four different times and you're say running it every 5 seconds. So you have your own job queue here. And with this set up, you can schedule basically one time jobs to do certain activities. So here's an example where they populated a random table with some random values. So this is super interesting. Now, whenever I've ever had the need to run something on a per second basis like that, I've normally set up an application process or an application daemon that's constantly running in a loop with however many seconds I want to delay to check for more work that's available. Now, I'm probably not going to move away from that for the majority of the short term job processing that I'm doing, because there's a lot to be gained from doing this within the application context with the language I use, ruby, for example, or it could be elixir. But I could see this really interesting from the perspective of updating totals all within the database. So not having to have an application server involved with keeping certain aggregations up to date. But I thought this was really interesting and it's starting me think maybe a little bit differently on how I might be able to use this for different tasks I need to do. So let me know in the comments what you think of this new feature. Do you use any of these scheduling tools that are extensions in postgres? And if you do, how do you use them? The next piece of content is error. Current transaction is aborted in PostgreSQL. This is from cipher postgresql.com. They're talking about the message and commands ignored until end of transaction block which is exactly what the blog post title is. And this kind of gives you a clue about what's going on. So the first thing this blog post talks about is basically transactions in that they are atomic, they either happen or they don't happen. They're all or none, essentially. And whenever you run a command, like at a psql prompt, that's going to run within its own transaction, unless of course you start with Begin. Now you're starting a transaction and every subsequent statement is within that transaction until you either commit it or you do a rollback statement. But how can you get this error? So in this example, he starts a transaction Begin. He does just a simple select, then he does a statement that causes an error. So he divides an integer by zero, which gives an error division by zero. Now when he continues to run additional statements, he's going to get this error. Current transaction is aborted, commands ignored until end of the transaction block. So basically until this transaction is concluded, every additional command is going to output this error message on the screen, in this case and also in the PostgreSQL logs. So basically you can no longer commit this transaction. So you can see here where he tries to commit it actually does a rollback instead of a commit. So basically that's what this error message is. And he says if you really need to save part of your work within a transaction, then you need to look into save points, which is a way to save your way part way and you're actually doing subtransactions and they have a post on that here that they link to and you can talk about it. But personally I like the atomic nature of transactions and I've never really had to do save points like this, but that feature is available if you're interested. Next piece of content also from cybertechsql.com is kill nine explain for PostgreSQL. Now before we get into this, I would almost never kill any PostgreSQL process using OS tools to do it. I would use the functions PG cancel backend to cancel and run and query and PG terminate backend to actually terminate that backend and run those functions, of course from the say psql prompt. However, if you need to use OS commands for some reason, here's some things to take into account. So you can run the kill command for a given process. And he's showing here the main postgres process and then all the different subprocesses of it, like the logger, the checkpoint or background writer, wall writer, auto vacuum launcher, and then the logical replication launcher. Now in this example, he ran the kill command for the process ID for the auto vacuum launcher. And that's really not a problem, it just simply stops that process and starts a new one. So you can see here this process ID is gone, the one ending at 80 is no longer present. But you have this new process ID here and it just restarted the auto vacuum launcher. There's no problem. Now, the reason this is no problem is because what's actually happening is you're sending a SIG term signal that the process can actually capture and close itself gracefully. And that allows everything to remain consistent, particularly when you're talking about shared memory. But if you use kill nine and this is the danger of it, what it does is it just immediately kills the process and there's no opportunity to shut down gracefully. Well, now you have no idea of the state of the shared memory because maybe this process has made some changes or alterations. So basically what the system does is has to do a restart essentially and do a crash recovery. So in this example here, he shows all the different process IDs and then he does a kill nine of a postgres backend process here where a user is logged in to a postgres back end that is currently idle. And as a result, it is able to keep the main process running and the logger. But every other process checkpoint or background writer, wall writer, auto vacuum launcher, logical replication launcher, all of those processes, in addition, any other users that were connected to a postgres process all get restarted. So they all get new process IDs, as you can see here. So that's why you want to be very careful, especially with the nine kill, because essentially you're killing a process that can leave the shared memory in an inconsistent state. So the system has no choice but to do a restart and do crash recovery at that point. But if you want to learn more, definitely check out this blog post. Next piece of content understand explain plans in PostgreSQL this is from Stormatics Tech. This is a blog post that covers how to read explain plans. Now, I personally found this a little bit difficult to read some of the examples here. I don't know if some of the formatting was off, but he does talk about explain and analyze and buffers. Yeah, like in this example here, I found this explain plan really hard to read because of course this is a continuation of a line here. So it just makes it a little bit difficult. So it goes through all of these different queries and different ways to look at the explain plan. But I also found it interesting here where he said where to look for potential bottlenecks and that could be high costs, bad estimated row counts, meaning that the number of rows it expects is different from the number of rows returned. High buffer operations where there's a large number of pages that were hit or read filters that remove a lot of rows. There are sequential scans on large tables, there's external disk merge operations or a lot of temp reads and writes. But I know personally when I look at explain plan, I basically prefer to run analyze of course and just identify what Node is taking the longest and figure out, okay, how can I make that node faster? Do I need an index? Do I need to change how the query is being run, et cetera. And these are some of the areas I'm going to be focusing on my course as well. But if you're interested in that, you can definitely check out this blog post. Next piece of content There was another episode of Postgres FN last week. This one was on under indexing, so the previous episode was on over indexing. This is on under indexing, which is arguably the biggest problem because most people with performance problems, they have an insufficient amount of indexing or inappropriate indexes on the tables to get the performance that they're looking for. So that's what they're covering this week. You can listen to the episode here or watch the YouTube video down here. Next piece of content is actually a YouTube video, and it's how we're building AI search engines using LLM embeddings. This is from the Think Nimble YouTube channel, and they're talking about building AI embeddings and doing searches on your own data. So not relying on chat GPT or an API, basically getting your own data, creating your own embeddings and querying your own data for that purpose. In this example, he is using PostgreSQL as well as the Pgvector extension along with Django. So I would say a lot of the content is python focused, so there wasn't too much Postgres specific in here. But if you're interested in AI, definitely encourage you to check out this video. Next piece of content. PostgreSQL IO visibility, we hack PostgreSQL internals and PGSTAT IO. This is from Andyatkinson.com, and he was talking about a weekly hackathon, for lack of a better term, on PostgreSQL Internals. That was done. And here's the link to the thread announcement. I believe it's already concluded. So it was just a very short week of hacking on Postgres Internals, and what Andrew chose to work on was PGSTAT IO. So he talks a little bit about how data is stored in Postgres in terms of eight kilobyte pages and then goes into PGSTAT IO, which is the new system view that enables you to get IO stats out of Postgres. And he discussed who created it. Melanie Plageman was the lead on it, but also Lucas Fiddle from PG Analyze assisted as well. Talks a little bit about how Postgres writes data, how there's different backend types, so there are client backends, there's backend writers, there's the checkpointer for flushing memory contents to disk. So if you're wanting to learn more about the PGST IO extension, you can definitely check out his article on it here. Next piece of content SQL identity Columns this is from Peter Eisentraut.org, and this is a very short post, but it's talking about Auto incrementing columns, which are identity columns or in Postgres, they're serial or big serial columns. But the syntax for these auto Incrementing. Columns was created in the SQL 2003 standard, so 20 years ago, and there's these different implementations and they have pretty much the same syntax for using them. And this was implemented in Postgres in Postgres Ten, which was about six years ago. But what I found interesting is that these database systems still haven't implemented this SQL standard, so I'm super surprised that things like MySQL and MariaDB and Microsoft SQL Server haven't implemented it yet, so I find that pretty crazy. But if you want to learn more, definitely check out this blog post next piece of content. Unlock your arsenal's. GDB debugging essentials with PostgreSQL. This is from Techbuddies IO and this post uses the GNU debugger to learn more about PostgreSQL internals. So if you want to learn about how postgres works by tracing the code, definitely check out this blog post next piece of content. Performance implications of medium sized values and toast in postgres and how to mitigate them this is from Pganalyze.com. This is this week's five minutes of postgres, but he's actually covering a post that's a couple of years old by Hacky Benita, where he saw something odd in that small text with, say, a length of two, three, four or five words. Something like that has little impact querying other columns in the table. Large text like a full blog post or a full document still had minimal impact with querying other columns in the table, but a medium sized text like a description or a summary where it's under the threshold of what is placed in the toast table actually resulted in worse performance when trying to query other columns. The reason being is because all of that medium text information was placed within the main heap table itself, it wasn't toasted, essentially, so that made the table relatively large compared to a small amount of text or text that was all placed in the toast. And they show a lot of examples of this in the blog post and Lucas's video, as well as some ways to mitigate it, like maybe using a toast Tuple target or even creating separate tables for the text that you're working with. But check out this blog post if you want to learn more. I hope you enjoyed this episode. Be sure to check out Scalingposgres.com, where you can get links to all the content mentioned, the podcast audio, as well as a transcript of the show. Thanks and I will see you next week.

Other Episodes

Episode 283

September 24, 2023 00:16:00
Episode Cover

222 Times Faster Analytical Queries With Columnar Storage | Scaling Postgres 283

In this episode of Scaling Postgres, we discuss how to get 222 times faster analytical queries with columnar storage, a Postgres 16 feature review,...

Listen

Episode 258

March 27, 2023 00:08:41
Episode Cover

Logical Replication, Database Antipatterns, max_wal_size Setting, Delete vs. Truncate | Scaling Postgres 258

  In this episode of Scaling Postgres, we discuss use cases for logical replication, database anti-patterns, how to set max_wal_size and the difference between delete...

Listen

Episode 43

December 17, 2018 00:09:49
Episode Cover

Logical Replication, Prepared Transactions, Ansible, Partitions | Scaling Postgres 43

In this episode of Scaling Postgres, we review articles covering logical replication, prepared transactions, Ansible Postgres deployment Ansible and tidy partitions. To get the...

Listen