Four Million TPS | Scaling Postgres 324

Episode 324 | July 14, 2024 | 00:16:29

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss experiments to achieve four million transactions per second, the importance of extended statistics, parallelism in Postgres, and an introduction to window functions.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/324-four-million-tps/

Want to learn more about Postgres performance? Join my FREE training called Postgres Performance Demystified here: https://www.scalingpostgres.com/courses/postgres-performance-demystified/


Episode Transcript

[00:00:00] So for the first piece of content this week, when I look at the title, I can't stop thinking about Mike Myers when he played Doctor Evil in the Austin Powers movie where he said "one million dollars." I look at the title and I think 4 million TPS. [00:00:20] But anyway, I hope you, your friends, family and coworkers continue to do well.

Our first piece of content is 4 million TPS, or transactions per second. This is the most recent episode of Postgres FM, and Nikolay, who is the founder of Postgres AI, which uses AI tools to do benchmarking, I think only for Postgres, had a little side project where he wanted to see how fast he could get a single node of Postgres to be, and he wanted it to do some actual work. So this was using pgbench on a single node, against a table with, I think, 10 million rows in it, and he was doing a primary key lookup, so just a single select query. He asked, okay, how many of those transactions can I get to happen per second? And he was using a cloud instance with 360 virtual CPUs, so nothing small, but not the largest. He talked a little bit about this in previous episodes, but in this one he goes into a bit more detail.

He was able to get up to 1 million TPS without too many issues. But to go from 1 million transactions per second to 2.5 million required not using pg_stat_statements. Now, before you think, wait a minute, I can get 2.5 times the performance by getting rid of pg_stat_statements? Not necessarily. Generally, pg_stat_statements is recognized to have maybe a 5% impact on performance, maybe a little bit more. Why it was having such a huge impact here is that when you run a single query where all you're changing is the ID you're searching on, it only appears as one entry in pg_stat_statements. So basically that one entry has to be updated many, many times, and you run into lock contention when you're doing that. Now, some Postgres developers said they had some ways to potentially mitigate that, or figure out a different way to do it, maybe dropping down to sampling when the updates are happening too frequently. But in this synthetic use case, it did have quite an impact on the transactions per second that were possible. So turning off pg_stat_statements in this specific case did result in him achieving 2.5 million transactions per second.

Then, to go from 2.5 million to 3 million, he was able to achieve that by using prepared statements, so basically eliminating the planning time for the query and only paying the execution time. Oh, and the thing I should mention: the clients he's testing against are local to the machine, meaning all of these transactions are being sent from clients that exist on the same 360 virtual CPU server, so they weren't external clients. And what got him from 3 million TPS to 3.75 million TPS, so he didn't quite hit the 4 million like his episode title states, was using Unix domain sockets as opposed to TCP/IP connections. Eliminating that network overhead got him to 3.75 million, or rounded up to 4 million, transactions per second with a single node. I thought this was a pretty interesting episode. If you want to learn more you can listen to it here or watch the YouTube video down here.
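As a rough illustration of the prepared statement step, pgbench's select-only prepared mode (pgbench -S -M prepared) is effectively doing the kind of thing sketched below in SQL. The table and statement names here are just the standard pgbench_accounts example schema, not anything specific from the episode:

    -- Parse and plan the lookup once per session.
    PREPARE account_lookup (int) AS
        SELECT abalance
        FROM pgbench_accounts
        WHERE aid = $1;

    -- Each execution reuses the stored plan, so only execution time is paid.
    EXECUTE account_lookup(42);
    EXECUTE account_lookup(90210);

Connecting over a Unix domain socket is then just a matter of pointing the client at the local socket directory (for pgbench, omitting -h or setting it to the socket path) rather than a TCP host.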
Next piece of content: why PostgreSQL prefers merge join to hash join. This is from danolivo.substack.com, and Andrei mentions here that there are some users, who I guess predominantly use REST or ORM libraries with Postgres, that disable the merge join option entirely, which I am super surprised somebody would do. I would never disable a whole join operation in a production system. But apparently users are reporting that sometimes the merge join is chosen when it's a poorer option than other types of joins, like a hash join. And he says, quote, it turned out that the real reason, or at least one, but quite frequent one, lies in the typical challenge the optimizer faces with a multi-clause join. So he's saying that in the case where you have table a and table b and you are joining them on multiple columns, in this example joining the x columns from both tables as well as the y and z columns, that type of multi-clause join is where you can see poor performance.

Now, he did a synthetic example with some data here. He apologizes that it looks kind of strange, but in order to see the behavior he had to do it this way. From this data layout, when you run the query joining the a table to the b table on the x, y and z columns, you end up getting a merge join and it runs in 843 milliseconds. However, if you turn off merge joins in the Postgres configuration and run the query again, it runs in 154 milliseconds, so five times faster. So merge join is the problem, yes? Not necessarily. As is frequently the case, it's choosing this bad plan because of a lack of information that the planner, or optimizer, has. If you give it more information in the form of extended statistics about the columns, basically multivariate statistics, it allows the optimizer to choose the best plan. So here he creates statistics, just for ndistinct, on all three columns together for the a table and for the b table. He turns merge joins back on, runs the query again, and now it does not choose the merge join, it chooses the hash join, and it's even faster than when merge joins were turned off. So it's an even more optimized version of the one that was initially five times faster; it's now twice as fast as that one. On the whole, once the planner is aware of more statistics relevant to the columns it's working with, it ended up finding the ten times faster path.

And if you're wondering, well, what's the harm of turning off merge join? In another example here, he applied indexes to each of the tables for each of the three columns and did a query with a group by, and it turned out the merge join was now two times faster, at 44 milliseconds compared to a 75 millisecond hash join for that query. So what I take from this is that it's not ideal to just turn off certain joins, but to think about how we can give the planner more information so it can choose the optimal path, and of course extended statistics is a way to do that.

Now, it's interesting, he also mentioned a tool that he has released, which is an extension. I don't know if it's a new one, but it's called pg_index_stats. What it does is auto generate, or manually generate, extended statistics based upon the definitions of indexes. The theory is that columns with indexes are highly important, so you probably want to collect some extended statistics about them. It auto generates these for you, or you can also create them manually. So if you're running into issues like he's describing here, maybe you'd like to check out this extension to see if it could help you generate additional statistics to optimize your queries.
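As a minimal sketch of the extended statistics step, assuming tables a and b joined on columns x, y and z as described above (the statistics names below are made up for illustration):

    -- Multivariate ndistinct statistics let the planner estimate how many
    -- distinct (x, y, z) combinations exist, rather than treating the
    -- columns as independent.
    CREATE STATISTICS a_xyz_ndistinct (ndistinct) ON x, y, z FROM a;
    CREATE STATISTICS b_xyz_ndistinct (ndistinct) ON x, y, z FROM b;

    -- Extended statistics are only collected when the tables are analyzed.
    ANALYZE a;
    ANALYZE b;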
Next piece of content: parallel queries in Postgres. This is from crunchydata.com, and this post talks all about parallelism in Postgres. Generally, a query can break work up across two or more workers to get the job done faster, and then ultimately the results are gathered and combined at the end. The main way to think about this is a parallel sequential scan: you spawn multiple workers that scan a table in parallel in order to produce an answer faster. This can be really important if you don't have many indexes or you're working with data that can be hard to index, like certain queries or aggregations on JSONB columns, or different conditional or distinct statements. Basically you leverage more CPU workers to get done faster en masse. And you can do more than just parallel sequential scans: there are parallel index scans, parallel aggregates, and even parallel joins in Postgres.

There are configuration options where you can adjust how parallelism works and at what point it gets triggered, because there is a cost to, quote unquote, going parallel: you have to spawn multiple workers, you're using more resources on the box, memory such as work_mem has to be dedicated for each parallel process, and the work has to be divided amongst those workers and then combined at the end. They talk about the configuration, and you have settings like max_worker_processes, which controls how many worker processes you have; they say the general rule of thumb is to make it 25% of the total vCPU count or greater, up to potentially however many vCPUs you have on the machine. There's max_parallel_workers, which can be as high as max_worker_processes, and max_parallel_workers_per_gather, which is the maximum number of workers a particular query can use. You'll probably set these high in more of a data warehouse, data mart or data lake environment, and not as aggressively for an OLTP-type database load.

They actually did a test here on a 32 core server where they set max_worker_processes equal to the number of virtual CPUs, set max_parallel_workers to 32 as well, and max_parallel_workers_per_gather to 32 as well, so one query can use all the worker processes, with max_parallel_maintenance_workers at 16. They applied a particular workload to see how long it would take to process a 384 gigabyte dataset. With one process it took 290 seconds to go through that data, all the way down to 88 seconds at ten parallel workers, and from that point it just flatlined. This may not be a Postgres limitation; there may be some other bottleneck in the server itself, maybe on the disk or memory or something unrelated to Postgres. But if you want to learn more about that, you can check out this blog post.
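As a rough sketch of the settings being discussed, here are the parameters with the kinds of values used in their 32 vCPU test; treat the numbers as illustrative rather than recommendations:

    -- Total background worker processes the server may start (restart required).
    ALTER SYSTEM SET max_worker_processes = 32;
    -- How many of those may be used for parallel query across the server.
    ALTER SYSTEM SET max_parallel_workers = 32;
    -- Maximum workers a single query (per Gather node) may use.
    ALTER SYSTEM SET max_parallel_workers_per_gather = 32;
    -- Parallelism for maintenance commands such as CREATE INDEX and VACUUM.
    ALTER SYSTEM SET max_parallel_maintenance_workers = 16;
    -- Reload picks up the parallel worker settings; max_worker_processes
    -- itself only changes after a server restart.
    SELECT pg_reload_conf();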
[00:10:49] Next piece of content: a gentle introduction to window functions in PostgreSQL. This is from notsoboringsql.com, and this post is talking about window functions, which compute functions or aggregates within windows of your data. So if you're doing a sum, you don't do it over the whole query result, you do it over particular windows of the results. By doing that you can do things like moving averages, running totals or rankings. This blog post is a gentle introduction to that: it shows a basic window query doing a rolling average, then a moving average, and then functions like row_number, rank and dense_rank, as well as lag, lead, first_value and last_value. So if you want to learn about window functions, you can definitely check out this blog post.
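To give a flavor of what those window functions look like, here's a small sketch against a hypothetical sales table with region, sale_date and amount columns; the post's own examples will differ:

    -- Running total and 3-row moving average per region, plus a rank of
    -- each sale's amount within its region.
    SELECT region,
           sale_date,
           amount,
           sum(amount) OVER w AS running_total,
           avg(amount) OVER (w ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg,
           rank() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
    FROM sales
    WINDOW w AS (PARTITION BY region ORDER BY sale_date);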
[00:11:42] Next piece of content is actually a presentation in a PDF called Check Your Privileges. This is from Christophe Pettus at thebuild.com, and it talks all about roles, how to grant them rights to access objects in the system, et cetera. What I found super interesting, and I really don't see page numbers, is that lower down he has a cookbook section, and these are great recipes for things you might be trying to achieve. You want to have an application-driven OLTP setup? Well, here's a recipe on how to do that, with all the commands needed. You want an analyst role? Use these commands. You want a read-only user? Do these commands, or a read-only user the easy way, or a DML user, or a general DB role, et cetera. So this is a really comprehensive presentation, and if you want to learn more, I definitely encourage you to check it out.

Next piece of content: say hello to the Talking Postgres podcast. This is from citusdata.com, and this is just a general announcement that the podcast that used to be called Path To Citus Con has now been relabeled Talking Postgres, and it is a monthly podcast about Postgres. So if you want to learn more you can check out this blog post.

[00:12:56] Next piece of content: mentoring program updates. This is from rhaas.blogspot.com, and we talked a little bit about this a few weeks ago in a previous episode of Scaling Postgres. This is where nine committers had offered to mentor people who were interested in learning more about how to hack on and make commits to Postgres. Well, it actually expanded to about 14 slots, and they got 34 people who applied. That still left about 20 people who didn't get a mentor, so what they decided to do is spin up a Discord server so that anyone wanting to learn how to hack on Postgres can join and do that. He has a link here on how to join if you're interested. The other thing they did is put together a survey about an event they're calling the PostgreSQL Hacking Workshop. So if this is of interest to you, you may want to check out the survey to potentially join at some point in the future. But if you want to learn more, you can check out this blog post.

Next piece of content: CloudNativePG recipe 9, inspecting the network and the storage in a CloudNativePG cluster. This is from gabrielebartolini.it, and it talks about how networking and storage are managed when you're using the CloudNativePG operator with Kubernetes. Basically, Kubernetes controls all of the networking, and you have service endpoints defined by the operator: primarily a read-write endpoint, which points to whatever the primary database is, a read-only endpoint, and a general read endpoint to talk to any instance. He also talks about disks and how those are configured by Kubernetes using the operator as well. So feel free to check that out if you want to learn more. Also related to that, dbi-services.com had a blog post on CloudNativePG covering storage. Daniel had previously done six blog posts about CloudNativePG, and this is his most recent one, all about storage. So if you're interested in that, you can check this one out.

[00:15:00] And the last piece of content: using AI to generate data structures for PostgreSQL. This is from cybertec-postgresql.com, and he shows that you can ask an AI like OpenAI to develop a schema for you. In this case he's saying, hey, create a data model for PostgreSQL, including indexes, to store addresses, and it output this table as well as these sets of indexes. And he says, you know, this is not too bad: it has an auto incrementing primary key, it has the appropriate columns, and it has a good set of indexes to be able to answer different questions. But this one is a little different because he's using a local LLM: he's using Ollama to run a Llama 3 model to do it, and he even developed a Python program and then used that to develop a more comprehensive schema for a basic product-type database. So if you're looking to use AI to generate a schema for you, maybe you'd like to check out this blog post as well.

[00:16:02] I hope you enjoyed this episode. Be sure to check out scalingpostgres.com, where you can find links to all the content mentioned, as well as sign up to receive weekly notifications of each episode. There you can also find an audio version of the show, as well as a full transcript. Thanks, and I'll see you next week.

Other Episodes

Episode 212

April 24, 2022 00:21:23

Slow Queries, 4TB Upgrade, Postgres on ZFS, Storage Stampede | Scaling Postgres 212

In this episode of Scaling Postgres, we discuss dealing with slow queries, upgrading a 4TB database, running Postgres on ZFS and handling a storage...

Episode 328

August 11, 2024 00:16:10

Implement Get Or Create | Scaling Postgres 328

In this episode of Scaling Postgres, we discuss different get or create implementations, a new pgBouncer version, alter default privileges, and six degrees of...

Episode 300

January 28, 2024 00:18:02

Scaling Postgres Celebration | Scaling Postgres 300

In this episode of Scaling Postgres, we discuss how you can build a GPT in 500 lines of SQL code, how to optimize extension...
