Custom Data Types, Row Level Triggers, Dynamic pl/pgsql Columns, citext Usage | Scaling Postgres 162

Scaling Postgres

Apr 25 2021 | 00:10:47


Hosted By

Creston Jamison

Show Notes

In this episode of Scaling Postgres, we discuss implementing row level triggers, how to return dynamic columns using pl/pgsql, and when to use citext.

To get the show notes as well as get notified of new episodes, visit: 

https://www.scalingpostgres.com/episodes/162-custom-data-types-row-level-triggers-dynamic-plpgsql-columns-citext-usage/


Episode Transcript

[00:00:00] In this episode of Scaling Postgres, we talk about custom data types, row level triggers, dynamic pl/pgsql columns, and citext usage. I'm Creston Jamison, and this is Scaling Postgres, episode 162.

[00:00:24] I hope you, your friends, family and coworkers continue to do well. Our first piece of content is "Creating Custom Postgres Data Types in Rails". This is from pganalyze.com, and they are talking about custom data types in Postgres, but the second part of the post also covers how you can use them with the Ruby on Rails framework. What they're describing is a custom data type where, for example, you could create a composite dimensions type that is itself built from three integers and then use that data type in a table.

They cover two kinds of custom types. The first is a domain type, where you put a certain restriction or constraint on the data. One example they give is a currency type that can't be negative; the example they actually build is a string that cannot contain an exclamation point, which they simply call "string without bang". They show some examples of using it and how Postgres will give you an error if you try to insert exclamation points into that column. You can also nest these domain types: they add an additional check so that an email must contain an @ symbol, and that works as well.

Then they talk about composite data types, where a data type contains more than one base data type. Here they use dimensions with three different integers for the depth, the width and the height. You generally insert into such a column by wrapping the values in parentheses, and there is a way in SQL to pull a single field back out of that data type.

The next part of the post covers how you can implement this in Rails. The first thing they do is a migration to apply the DDL: creating the new type, creating the new domain, and then creating the table using these new types. After that, the database is all set up and ready to use. However, Rails doesn't really know how to interpret the new types and treats pretty much anything it doesn't know as a string, so if you try to return the dimensions, they just come back as a string. Rails, or rather the Active Record interface, has an attributes API where you can define these new types. So in Ruby you create a class that defines the different entities the type contains along with a few functions to use it, and you create an Active Record value type that can interpret this dimensions type. Once you have this in place, and have defined the attribute on your Active Record model, you can use this new composite data type as you would any other data type. So if you want to learn more about composite data types and domain data types in Postgres, especially if you're using Ruby on Rails, definitely check out this blog post from pganalyze.com.
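As a rough sketch of what these two kinds of types can look like in plain SQL (the type, table and column names here are illustrative, not taken from the post):

-- A domain type: text with a constraint attached.
CREATE DOMAIN string_without_bang AS text
    CHECK (VALUE NOT LIKE '%!%');

-- Domains can nest: an email is a string_without_bang that must contain '@'.
CREATE DOMAIN email AS string_without_bang
    CHECK (VALUE LIKE '%@%');

-- A composite type built from three integers.
CREATE TYPE dimensions AS (depth int, width int, height int);

CREATE TABLE packages (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    contact email,
    size dimensions
);

-- Composite values are written with parentheses, and a single field
-- is pulled back out by wrapping the column itself in parentheses.
INSERT INTO packages (contact, size)
    VALUES ('me@example.com', '(10,20,30)');
SELECT (size).width FROM packages;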
[00:03:20] The next piece of content is "What to return from a PostgreSQL row level trigger?". This is from cybertec-postgresql.com. They explain that when you create a trigger in Postgres, generally the first thing you do is define the function containing the actions you want the trigger to perform, and then you define the actual trigger that determines when that function gets fired.

They then cover the different trigger types. There are event triggers, and there are triggers that get fired by data modifications, which is what they focus on here. Among those, statement level triggers fire once per statement, but what they're discussing here are row level triggers, which fire whenever a row gets changed. Triggers can run either before, after, or instead of a data modification. They also talk a little about the return value of a trigger: for statement level triggers it's generally null, and for row level triggers it depends on how they're being used. One thing to keep in mind with row level triggers is that you have NEW and OLD variables available in the trigger function. NEW is set on inserts and updates, so it holds the new data, and OLD is set on updates and deletes, so it represents the old data in the row that has been deleted or updated.

They then create a basic row level trigger. They have a table with a text value in it, plus a history table to track modifications to it. They define a function that uses a conditional on the trigger operation: if it's a delete, insert the data into the history table one way, given which of the OLD or NEW variables is present; otherwise, insert the data another way. Then they define the trigger so that on insert or update or delete, for each row, it executes this function. They note that a lot of audit triggers use a technique similar to this, but you can make it more efficient by removing the conditional entirely, because COALESCE can pick between NEW and OLD: whichever one is not null is the one that gets used, both for the row inserted into the history table and for the trigger's return value. Essentially, you can significantly simplify the audit function this way compared to the version with the conditional. So this is a definitely interesting use of COALESCE for audit triggers, and if you're interested in learning more, you can check out this blog post.
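A minimal sketch of that COALESCE technique (the table and column names here are illustrative, not the post's exact example):

CREATE TABLE data (id bigint PRIMARY KEY, value text);

CREATE TABLE data_history (
    operation text,
    changed_at timestamptz NOT NULL DEFAULT current_timestamp,
    id bigint,
    value text
);

CREATE FUNCTION log_data() RETURNS trigger
    LANGUAGE plpgsql AS
$$DECLARE
    r data%ROWTYPE;
BEGIN
    -- OLD is NULL on INSERT and NEW is NULL on DELETE (from PostgreSQL 11 on;
    -- older versions raise an error if the unassigned record is referenced),
    -- so COALESCE picks whichever row exists with no conditional on TG_OP.
    r := coalesce(NEW, OLD);
    INSERT INTO data_history (operation, id, value)
        VALUES (TG_OP, r.id, r.value);
    RETURN r;
END;$$;

CREATE TRIGGER log_data_trig
    BEFORE INSERT OR UPDATE OR DELETE ON data
    FOR EACH ROW EXECUTE FUNCTION log_data();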
[00:05:51] The next piece of content is "Getting value from dynamic column in pl/pgsql triggers". This is from depesz.com. He's talking about passing a row into a function and being able to choose which column the function returns. As an example, he first creates a basic function where you pass in a row and it always returns the same, statically defined column from that table. He tests it by returning all the data and then using the function to return the payload column again, and this works as expected.

Next he asks: how can you do this given a dynamic column name that's passed in, so the caller determines which column of data gets returned from the function? For this he develops another version that uses EXECUTE with format() to build the query, interpolating the p_row parameter being passed in and reading the result into a variable that gets returned. As you can see, this works and gives you a dynamically chosen column based on the parameters passed in. He also tried doing the same thing using hstore to do the lookup, and that implementation gives the same answer, as do JSON and JSONB based implementations. And for completeness, he even tried an entirely different language, writing a version in Perl.

He then ran some performance checks to see which approach is the most efficient. Of course, the fastest is the static version. But the next fastest was the dynamic hstore implementation, which I was a little surprised by. Then come the JSON and Perl versions, which are pretty much equivalent, with JSONB close enough that the difference may be within the range of statistical variation. By far the slowest was the plain pl/pgsql version doing EXECUTE on a format() string. So if you have a need to do this, you may be interested in checking out this blog post to find out more.
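A sketch of the two approaches (the function, table and parameter names here are illustrative, not depesz's exact code):

CREATE TABLE test (id bigint, payload text, created timestamptz);

-- Static version: always returns the same column.
CREATE FUNCTION get_static(p_row test) RETURNS text
    LANGUAGE plpgsql AS
$$BEGIN
    RETURN p_row.payload;
END;$$;

-- Dynamic version: the column to return is chosen at call time.
-- %I quotes the column name as an identifier, guarding against injection.
CREATE FUNCTION get_dynamic(p_row test, p_column text) RETURNS text
    LANGUAGE plpgsql AS
$$DECLARE
    v_result text;
BEGIN
    EXECUTE format('SELECT ($1).%I::text', p_column)
        INTO v_result
        USING p_row;
    RETURN v_result;
END;$$;

-- The hstore variant that benchmarked faster is roughly:
--   RETURN hstore(p_row) -> p_column;  (requires CREATE EXTENSION hstore)

-- Usage: SELECT get_dynamic(t, 'payload') FROM test t;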
[00:07:49] The next piece of content is actually a YouTube video, "Handling case insensitive strings", from the Explaining PostgreSQL series on the cybertec-postgresql.com YouTube channel. It's a relatively short six minute video, essentially covering the citext extension and data type that you can use for case insensitive strings. As opposed to applying the lower() function to a column when querying to make sure you ignore case, and then having to worry about expression indexes (because otherwise an index won't be used when you apply a function to the column), citext gets around all of that (see the short sketch after the transcript). Now, there may be some downsides and considerations you need to take into account for your application framework, but it's definitely an interesting video that shows how you could start using citext if you're interested.

[00:08:38] The next piece of content is "Hash Index Internals". This is from highgo.ca, and they're talking about hash indexes. These do have a number of constraints that we've discussed in a previous episode of Scaling Postgres: they don't allow unique constraints, index-only scans, or covering indexes, and B-trees may have size advantages in certain instances. But there may be some use cases for them, and this post goes into detail about how they work on the implementation side. So if you're interested in learning more about them, you can check out this post from highgo.ca.

The next piece of content is "Calling Stored Procedure from Java, Part 2". This is also from highgo.ca, and they discuss how you can use stored procedures with not only multiple IN parameters but also INOUT parameters. So if you're interested in learning more about this, you can check out this post.

[00:09:31] The next piece of content is "Visibility with query cache". This is from pgsqlpgpool.blogspot.com, where they discuss how Pgpool can do caching of queries. But there are some potential drawbacks depending on which transaction isolation level you are running with: Read Committed gives you the expected results, but Repeatable Read, and potentially even Serializable, can give unexpected results when using this feature. So if you use this feature, or you're interested in using Pgpool, you may want to check out this post to learn more about these potential gotchas.

[00:10:11] And the last piece of content: the PostgreSQL Person of the Week is Pavlo Golub. So if you're interested in learning more about Pavlo and his contributions to Postgres, definitely check out this blog post.

That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
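As a footnote to the citext segment above, a minimal sketch of the difference it makes (the table and column names are illustrative):

-- With plain text, case-insensitive lookups need lower() plus an
-- expression index so that an index can still be used:
CREATE TABLE users (id bigint PRIMARY KEY, email text);
CREATE INDEX ON users (lower(email));
SELECT * FROM users WHERE lower(email) = lower('Alice@Example.com');

-- With citext, comparisons are case-insensitive by default and a
-- plain index on the column works:
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE users2 (id bigint PRIMARY KEY, email citext);
CREATE INDEX ON users2 (email);
SELECT * FROM users2 WHERE email = 'Alice@Example.com';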

Other Episodes

Episode 9

April 23, 2018 00:11:01

High Volume Processing, pg_wal Issues, Restore with pg_receivewal | Scaling Postgres 9

In this episode of Scaling Postgres, we review articles covering high volume processing, indexing, pg_wal issues, Postgres data types and restoring when using pg_receivewal....


Episode 143

December 07, 2020 00:18:42

Microsoft SQL Server Compatible, Time Series Performance, Feature Casualties, BLOB Cleanup | Scaling Postgres 143

In this episode of Scaling Postgres, we discuss Microsoft SQL server compatibility, the performance of time series DBs, DB feature casualties and how to...


Episode 50

February 11, 2019 00:14:23

Outage from TXIDs, Breaking Scale, fsync Gate, pg_stat_statements | Scaling Postgres 50

In this episode of Scaling Postgres, we review articles covering an outage from running out of TXIDs, breaking scale, that PG fsync issue and...
