Parallelism, JIT Compiling, Indexes, Administration | Scaling Postgres 30

Episode 30 September 17, 2018 00:09:45
Parallelism, JIT Compiling, Indexes, Administration | Scaling Postgres 30
Scaling Postgres
Parallelism, JIT Compiling, Indexes, Administration | Scaling Postgres 30

Sep 17 2018 | 00:09:45

/

Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we review articles covering parallellism in Postgres 11, JIT compiling, indexes in depth and administration.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/30-parallelism-jit-compiling-indexes-administration/

View Full Transcript

Episode Transcript

[00:00:00] In this episode of Scaling Postgres we talk about parallelism in Postgres Eleven JIT compiling indexes and postgres administration. I'm Kristen Jamison, and this is Scaling postgres episode 30. [00:00:22] Alright, I hope everyone had a great week. Our first piece of content is Parallelism in PostgreSQL Eleven. This is from Thomas Monroe and it was presented at Postgres Open in San Francisco in 2018. So this is a series of slides from a talk he gave and sometimes it's hard to discern what's going on, but this was a really good presentation. I hope that the video is posted up on YouTube eventually. So he goes into a history of parallelism with Postgres and he has a few slides I'm just going to pull out here I thought were interesting. And this is just part of the history portion of his talk. And here he's showing this quote which appears to be from Herb Sutter that says the free lunch is over. So basically around 2004 is when we stopped getting dramatic increases in CPU frequencies. The single thread performance started flattening out. And basically at that point the way to get more processing power was putting more cores in each CPU. So basically this is the point at which doing parallel operations started to become more important. And then he shows this other slide here called the Parallel Gold Rush and a lot of different database platforms and how they started offering parallel query execution. And you can see Postgres has been pretty late to the party in 2016. Now you may think that's not great, but the reality is that if you have a lot of users connecting to the database, like a typical OLTP transactional load, each of those processes for each connection can be serviced by different CPUs or different cores of a machine. So it would only be used in analytical processing where you would want to paralyze complex queries. So in terms of the postgres sweet spot OLTP, these kinds of things aren't needed as much. But when you're talking about analytical processing it becomes very important and which is why in previous episodes of Scaling Postgres we've seen a number of vendors that forked Postgres and developed analytical databases based upon its core. But this is a very technical presentation that goes into a lot of the background of parallelism and how it works. So if you're interested in that, this is definitely a post to check out. I will show one of the later slides are some things that prevent or limit parallelism. And the first one here is CTE. But to get around it you can rewrite it as a subselect. And he mentioned some other things like full outer joins, no foreign data wrappers currently support parallelism cursors max rows queries that writer lock rows functions not marked parallel safe and serializable transaction isolation. So even with postgres Eleven keeps adding parallelism improvements. There's still more that needs to be done. But again, if you're interested in the parallelism features of postgres eleven and what's coming in the future. Definitely presentation to check out the next post is parallel PostGIS and pgSQL eleven. This is from the cleverelifent CA blog by Paul Ramsay. So he's evaluating the parallelism of postgres and basically he says from 9.6 to ten there was a possibility of getting parallel plans and he has seen some minor improvements in PostgreSQL eleven by modifying the cost for certain functions. But basically there's still more work to be done. So if you use PostGIS and you're interested in the parallel capabilities of it, this is definitely a blog post to check out. [00:04:06] The next post is PostgreSQL eleven and Just In Time compilation of queries. And this is from the Citusdata.com blog. So basically this post checks some benchmarks on how the new JIT compiler in PostgreSQL eleven can lead to some query improvements. So he used the TPC benchmark H in terms of report query one, which he shows a representation of it. Here he set up a database and ran some tests against it for PostgreSQL 9.610 and eleven with JIT compilation enabled and we can see the increases in performance for the different versions. And he did mention that he did turn off parallel query execution so that that wouldn't be a factor. And basically for this query, PostgreSQL eleven is 30% faster processing this query, which is quite significant. And he goes over into the tooling and the setup he used to use these benchmarks. So definitely a pretty thorough blog post. Now in terms of the release notes for PostgreSQL eleven and specifically for the Just in Time compilation, in order to enable JIT, it looks like you have to compile it from source using the with LLVM flag. So it depends on how you install postgres on whether this will be enabled. And if you do it from source you can use this flag to make sure it's enabled. I haven't tested specifically, but my assumption it would be off normally. So in terms of this blog post you can reference how he set up his tests in order to do it. So again, this is another case where this is probably best for sophisticated analytical queries like this. So if you think the new JIT compilation feature could potentially improve your performance, definitely a blog post to check out. [00:05:54] The next post is actually a webinar, and this is webinar database security and PostgreSQL follow up. And this is from the second quadrant, PostgreSQL blog. So they had a webinar about database security and PostgreSQL. And there's a link right here, it says available here and it will take you and you have to register, but then at that point you can immediately watch the video. [00:06:17] It's about 45 minutes long. About the first half is a presentation, the latter half is questions. So basically this presentation covers defense in depth for your PostgreSQL instance. So what kind of controls can you put in place at the network layer at the host layer at the process layer on the box. What can you do about the file system? How can you secure access to the data within the database? For example, using grant revoke permissions, it talked about row level security and how to implement that, as well as general application security, because an application that needs to access the data in the database basically goes straight through all those layers. So I did find it to be a good presentation. So if you're looking to increase the security of your database, definitely a webinar that you should watch. [00:07:09] The next post is Indexes and Postgres The Long Story or Crocodiles Going to the Dentist by Louise Granjonic. So this again is a slideshow presentation and it covers the different index types of postgres. What's significant about this is this is probably the most technical presentation on how indexes work in postgres I've seen. So there are a few slides that cover general advice for a developer. But if you're really wanting to know how postgres indexes work at a granular level, this is probably the best presentation I've seen on it. So if you're interested in that, definitely a piece of content to check out. [00:07:51] The next post is Setting Up Streaming Replication in PostgreSQL, and this is from the Procona Database Performance Blog, and in it they basically give you the steps on how to set up replication. It's a relatively short post, but they go through all the details. [00:08:08] If you want a video format of this, I have done a tutorial showing how to set up PostgreSQL streaming replication, so if you want a little bit different format, you're welcome to check out this piece of content as well. [00:08:22] The next post is actually a PDF presentation and it's called Mastering PostgreSQL Administration by Bruce Momgen. So I have seen a reference to this presentation before, but it's a very comprehensive presentation going into all the different aspects of doing PostgreSQL administration. It's 112 slides worth. If you do PostgreSQL administration and you haven't seen this presentation yet, definitely a link to check out. [00:08:52] The last post is announcing Timescale DB 10, the first enterprise ready time series database to support full SQL and scale. So I've mentioned Timescale DB in previous episodes of Scaling Postgres, but I thought this announcement that it had reached 1.0 is notable. So if you're looking for a Timescale series database that can work as an extension for PostgreSQL, definitely a product and a blog post to check out, that does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show Notes. Be sure to head over to Scalingpostgres.com where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or itunes. Thanks.

Other Episodes

Episode 35

October 22, 2018 00:17:07
Episode Cover

Terabyte Scaling, Postgres 11, Connections, Backup | Scaling Postgres 35

In this episode of Scaling Postgres, we review articles covering terabyte scaling, PostgreSQL 11, managing connections and backup of terabyte databases. To get the...

Listen

Episode 229

August 22, 2022 00:11:49
Episode Cover

Postgres Playground, PG14 Internals, DB Corruption, Anti-Join | Scaling Postgres 229

In this episode of Scaling Postgres, we discuss a new Postgres playground, a book about PG14 internals, how to corrupt your database and using...

Listen

Episode 277

August 13, 2023 00:19:04
Episode Cover

Postgres Releases, PostgreSQL Survey, Partitioning vs. Sharding, Bulk Loading | Scaling Postgres 277

  In this episode of Scaling Postgres, we discuss new Postgres releases, taking the 2023 State of PostgreSQL survey, partitioning vs. sharding and the fastest...

Listen