Yugo Nagata: Transition Tables in Imcremental View Maintenance

Introduction

In a previous post, I explained that our implementation of Incremental View Maintenance (IVM) on PostgreSQL were using AFTER trigger and transition tables. In this article describes how we use transition tables in the IVM implementation.

Transition Tables

In order to calculate changes to be applied to views, changes that occurred on the updated tables are extracted using AFTER triggers and transition tables.

The transition table is a features of AFTER trigger, which allows trigger functions to refer to the changes that occurred on the table like normal tables. There are two transition tables; one contains “tuples deleted from the table” and another contains “tuples inserted into the table”.

When you create a trigger using CREATE TRIGGER command, you can specify names of transition tables by REFERENCING clause. In the following example, old_table and new_table are used as transition table names.

CREATE TRIGGER update_trigger
AFTER UPDATE ON tbl
REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE FUNCTION update_trigger_function();

If you define the function using PL/pgSQL, these transition tables can be referenced in SQL by your specified names like normal tables although they do not exist in the catalogs. Instead, tuples of transition tables are stored in tuplestore, which is in memory or temporary files. Such kind of table is called ephemeral named relation.

When you write a trigger function in C language instead of PL/pgSQL, you can use Server Programming Interface (SPI) to run SQL commands. (You can find an example in the documentation.) In order to use transition tables in trigger functions written in C, you have to call SPI_register_relation or SPI_register_trigger_data to make ephemeral named relation(s) available.

Transition Tables in IVM

In our IVM implementation, when a base table is modified, a trigger function is called for maintaining a materialized view. In this function, view delta tables that contain the changes that will occur on the view are calculated as explained in before.

The trigger function is written in C, but we don’t use SPI for this calculation because we don’t run SQL command directly. Instead, we use a view definition query stored as an internal structure (namely, Query Tree). For example, suppose we have a materialized view defined as bellow:

SELECT x,y,z FROM R,S where R.i = S.i;

This SQL query is stored as the view definition after transformed to a query tree. Then, suppose that table R is modified. Using the stored query tree, we can get the query to calculate the view delta tables by replacing R with transition tables (old_table_R and new_table_R) as bellow:

-- tuples to be deleted from the view 
SELECT x,y,z FROM old_table_R,S where old_table_R.i = S.i;
-- tuples to be inserted into the view
SELECT x,y,z FROM new_table_R,S where new_table_R.i = S.i;

Specifically, the query tree is rewritten so that the range table entry of R is replaced with it of a transition table, and run the rewritten query tree. The results are stored into tuplestores. Then, these tuplestores are transferred to ephemeral named relations, and applied to the view by SQL executed via SPI.

Summary

In this article, I explained what is transition table and how it is used in our IVM implementation. In the next post, I will explain a more complicated situation, that is, where multiple tables are modified simultaneously.

PostgreSQL