Episode Transcript
[00:00:00] In this episode of Scaling Postgres we talk about new CVEs postgres programming, JSON B and advisory locks. I'm Kristen Jamison and this is scaling postgres episode 76.
[00:00:20] Alright, I hope you're having a great week. Our first piece of content is actually new PostgreSQL releases on the Postgresql.org website. So they have new releases including beta three of twelve. Now most of these appear to be due to four security issues that were identified, but also 40 bugs were addressed. Now two of these security issues involve SQL Server itself. One related to security definer functions which we've talked about in a previous episode of Scaling Postgres. And if you're using these you should definitely consult the documentation on the best way to use them without causing a potential problem. And then the last one is memory disclosure and cross type comparison for hashed subplan. And this one says for an attack to become possible, a super user would need to create unusual operators. The other two involve Windows installers and of course you also get a host of bug fixes as well. So go ahead and update your instance when it's best for your environment.
[00:01:26] The next post is postgres is the coolest database. Reason number five it cannot be bought out. And this is from secondquadrant.com and they've had various different reasons coming up why Postgres is the coolest database. This is reason number five and they have links to some of the other ones that they've been talking about. But this was kind of interesting talking about the structure of the PostgreSQL community and how it develops and why that kind of makes it pretty resilient and that it can't be bought out. For example, they use the example of MySQL being bought out by Oracle and how that's not really possible because there's really no one company behind it. And it says it assigns copyrights to the PGDG which is the PostgreSQL Global Development Group which is a global community that cuts across countries, enterprises, people and cultures. Some other interesting things is something as important as the core team they say, which is currently comprised of five people. It does not have more than two members from the same company and community conferences don't allow more than 50% representation from the same company on talk selection committees. So basically this just gives more evidence that it is truly an open source project that cuts across a variety of organizations and individuals. And here are the links to some of the other reasons. So interesting blog post to check out.
[00:02:51] The next post is overview of server side programming in PostgreSQL. So this goes over all the different ways that you can do server side programming. The first area they address are SQL functions, so you can create a function and make sure the language is SQL and you can basically create user defined functions that do some sort of purpose like here's updating items and inserting into an audit table for example. Now they make a note here, the function body can only contain SQL statements, however, that means there are no flow control statements if while variables and the like. Now you can do that with plpgsql which is essentially procedural level postgres SQL.
[00:03:38] So because the language is SQL, you can't really use these capabilities. But we'll see that in the post further down. The next thing they talk about is C functions. So you can develop C functions, as they say, to pretty much do anything. So you can create a function and then define essentially where your entry point is talking about. You have a shared library here and the entry point being a sum. So you can call that function within some C code and then they go into the plpgsql functions. And again the language specified here is plpgsql and this does allow loops and conditionals and variables and things of that nature. So it is more of a programming environment. And this is included by default in PostgreSQL. And then there are other core procedural languages, so you can use Python Perl as well as TCL, but you may need to install some additional packages in order to get that to work because it's not really included by default. For example, they were saying on debian you may have to do install this package in order to get the Python working. But then they say there's also some non core procedural languages, so there's some open source projects that have support for things like Java, Lua, R, et cetera. And they have a link to the list here. And then they're talking about aggregate functions and how you can essentially create your own. And again they're using the SQL language to be able to do that. So for example, this does a median and they do select medium from a students table. And of course you also have user defined types, so you can basically create your own types. And you remember how one of the CVEs was about creating some sort of custom operator. There's a risk associated with that. Well here they're literally creating an operator. The next area they cover are triggers and how you can set up a trigger to be before after an insert update, delete a row of a table on the truncate of a table, or instead of inserting update, delete a row of a view. So triggers perform some action when another action takes place. So their example here is after an update on the items table, basically you're going to insert into an audit table and then they have event triggers which are similar to triggers because as they say quote while triggers respond to DML events on a single table, event triggers can respond to DDL events on a particular database. Then you can define rules which essentially lets you rewrite queries as they say here. So for example, create a rule that on an insert into items do instead nothing. So basically don't allow it. Then they talk about stored procedures, which is new with Eleven, which is essentially like the standard functions before, but these procedures allow transaction control so you can do commits and rollbacks within them, whereas with a Plpg SQL straight functions you can't do that. It's all part of one transaction, but with procedures they enable you to have this transaction control. Then they cover what they classify as a few other exotic things such as foreign data wrappers, connecting to another database, another postgres instance, or MySQL Oracle, et cetera. How it's possible to create your own index types with upcoming PostgreSQL being able to create your own table access methods. They talked about logical replication plugins used with things like logical decoding procedure, language handler as well as extensions. So of course this allows a whole host of functionality to extend what postgres can do. So it's definitely an interesting blog post. I encourage you to check it out if you're more interested in more options when it comes to server side programming in PostgreSQL.
[00:07:27] The next post is comparing postgres JSON B with NoSQL and this is from Couchbase.com. So of course they're talking about NoSQL in the context of CouchDB, which is a NoSQL database. So in terms of comparing them, there's probably going to be a little bit of bias. But whenever they talk about JSON B and different ways to use it in PostgreSQL, I always find that content interesting.
[00:07:54] So I like just looking through it just to kind of understand more how people are potentially data modeling and using JSON B in the context of PostgreSQL or even NoSQL databases. Now of course they do talk about some disadvantages. Like they're saying the JSON B syntax is not necessarily friendly, but again, with a JSON path that's going to be coming, I find that syntax much more appealing. And I included the link here to the JSON path feature that's coming in PostgreSQL Twelve where you can do like syntax like equipment rings, get the first item of the ring array and get the name. So I find this syntax much easier to understand compared to some of the other syntax that JSON b uses with our Curtain set of operators. So if you're interested in more content as it relates to JSON b, definitely a post to check out.
[00:08:50] The next post is postgres advisory locks with Async IO. Now, I believe Async IO and Async PG are related to Python here, but they are talking about advisory locks and how they're using it as a solution to have only one process interact with a database on a particular connection. And you can use here's the syntax that they use in terms of you just create an advisory lock with a given name. So there's not a lot of PostgreSQL specific content in this article, meaning code, so a lot of the code is Python and using these particular tools. But if you want to know how someone is using an advisory lock in their use case. This is a pretty short post to check out and see how they're using it. The next post is doubling the sorting speed of postgres network types with abbreviated keys. Now, this is a post that talks about how they developed a patch for postgres to be able to speed up inet and CIDR types in postgres. So it's more of a technical discussion. It doesn't give you guidance on how to improve the sorting speed of PostgreSQL as it is, but it's what this individual did in terms of developing the patch, in terms of achieving almost doubling of speed with inet and CDR data types. So if you're interested in the content behind this patch and a lot of detail into how it was developed and the rationale with regard to it, definitely a blog post to check out.
[00:10:25] Next post is automatically updating materialized views. Now, this is a short post, but it's basically communicating how to do it. Now, I don't see a lot of code with regard to the materialized views because how they're doing it is through triggers. So they don't actually have the code of the trigger, but they do explain some of the concepts of how you would need to develop triggers to be able to update a materialized view incrementally. So again, this is a very brief post, but it gives some ideas conceptually of how to automatically update materialized views. And this is from Pgsqlpgpool blogspot.com.
[00:11:04] The next post is PostgreSQL Graph Search Practices 10 billion scale graph with millisecond response. Now, this is from Alibaba Cloud and they're talking about using graph search using a CTE syntax that you can easily implement a graph search in terms of in depth search, shortest path, point, edge property and so on. So this is a pretty long post that includes a lot of code in detail about how to do this. So if you have a data set that can be represented in a graph such as this, this might be a blog post you want to check out to see how you could potentially use CTEs and recursive CTEs to be able to achieve the similar type of response time in your application.
[00:11:54] The last post is waiting for PostGIS three separate Raster extension. So again, upcoming in PostGIS three, they've broken out something that was included by default previously, but from what they're saying here, breaking out the Raster functionality allows packagers to more easily build strip down just the basics postgres without also building the Raster dependencies, which include the somewhat heavy GDAL library. So just something to keep in mind when you upgrade PostGIS to version three that this functionality is a separate extension. Now that does it. For this episode of Scaling Postgres, you can get links to all the content mentioned in the show Notes. Be sure to head over to Scalingpostgres.com where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or itunes. Thanks.