Unpivot in Teradata and Snowflake
by Pedro Meléndez, on May 25, 2022 8:35:20 AM
The ability to pivot has been key to moving couches and manipulating dataframes for more than 30 years. But not all pivots and unpivots are created equal. Let's take a closer look at the UNPIVOT function in Teradata and in Snowflake.
The UNPIVOT function has the purpose of transforming columns into rows when querying information from a table and is present in multiple SQL languages. Teradata implements this functionality in the TD_UNPIVOT function.
However, Teradata users wanting to migrate to other platforms have noted that complications arise when implementing queries that use the TD_UNPIVOT function in another platform. Let’s see how this works in Snowflake.
Snowflake has its own UNPIVOT function, which has the same expected behavior as the Teradata function of transforming columns into rows. However, when compared with TD_UNPIVOT some implementation differences become visible:
- TD_UNPIVOT allows a user to unpivot column groups into multiple value columns at the same time (column groups are the column lists wrapped with ‘’ inside the COLUMN_LIST clause of the function). On the other hand, Snowflake UNPIVOT receives a single column list and unpivots it into a single value column. A “fix” for this is using multiple UNPIVOT calls in the same query, but this will result in many duplicated rows.
- TD_UNPIVOT allows the user to define aliases for the names of the columns or column groups getting unpivot. If no aliases are specified, then the function will automatically generate them by joining the names of all columns in the column groups with an underscore. Snowflake’s UNPIVOT function automatically uses the name of each column in uppercase as values.
These differences make it impossible for a direct translation of TD_UNPIVOT to Snowflake UNPIVOT. So… are we doomed? Is it impossible to build a functionally equivalent query in Snowflake? Fortunately, the answer to this question is no. It is possible. Look at the following example migrated using SnowConvert for Teradata:
CREATE TABLE unpivotTable ( myKey INTEGER NOT NULL PRIMARY KEY, firstSemesterIncome DECIMAL(10,2), secondSemesterIncome DECIMAL(10,2), firstSemesterExpenses DECIMAL(10,2), secondSemesterExpenses DECIMAL(10,2) ); INSERT INTO unpivotTable VALUES (2020, 15440, 25430.57, 10322.15, 12355.36); INSERT INTO unpivotTable VALUES (2018, 18325.25, 25220.65, 15560.45, 15680.33); INSERT INTO unpivotTable VALUES (2019, 23855.75, 34220.22, 14582.55, 24122); SELECT * FROM unpivotTable; SELECT * FROM TD_UNPIVOT( ON unpivotTable USING VALUE_COLUMNS('Income', 'Expenses') UNPIVOT_COLUMN('Semester') COLUMN_LIST('firstSemesterIncome, firstSemesterExpenses', 'secondSemesterIncome, secondSemesterExpenses') COLUMN_ALIAS_LIST('First', 'Second') )X ORDER BY mykey;
CREATE TABLE PUBLIC.unpivotTable ( myKey INTEGER NOT NULL PRIMARY KEY, firstSemesterIncome DECIMAL(10,2), secondSemesterIncome DECIMAL(10,2), firstSemesterExpenses DECIMAL(10,2), secondSemesterExpenses DECIMAL(10,2) ); INSERT INTO PUBLIC.unpivotTable VALUES (2020, 15440, 25430.57, 10322.15, 12355.36); INSERT INTO PUBLIC.unpivotTable VALUES (2018, 18325.25, 25220.65, 15560.45, 15680.33); INSERT INTO PUBLIC.unpivotTable VALUES (2019, 23855.75, 34220.22, 14582.55, 24122); SELECT * FROM PUBLIC.unpivotTable; SELECT * FROM (SELECT myKey, TRIM(GET_IGNORE_CASE(OBJECT_CONSTRUCT('FIRSTSEMESTERINCOME', 'First', 'FIRSTSEMESTEREXPENSES', 'First', 'SECONDSEMESTERINCOME', 'Second', 'SECONDSEMESTEREXPENSES', 'Second'), Semester), '"') AS Semester, Income, Expenses FROM unpivotTable UNPIVOT(Income FOR Semester IN (firstSemesterIncome, secondSemesterIncome)) UNPIVOT(Expenses FOR Semester1 IN (firstSemesterExpenses, secondSemesterExpenses)) WHERE Semester = 'FIRSTSEMESTERINCOME' AND Semester1 = 'FIRSTSEMESTEREXPENSES' OR Semester = 'SECONDSEMESTERINCOME' AND Semester1 = 'SECONDSEMESTEREXPENSES') ORDER BY mykey;
As shown above, it is possible to mitigate the inconsistencies that hinder the transformation of TD_UNPIVOT to a functionally equivalent SELECT query. Let’s look at each part of the query and how it works:
- The columns list of the query contains all the columns that will result from the TD_UNPIVOT function. This list also includes any columns that are not unpivoted (known in Teradata as copy columns), all value columns, and the unpivot column resulting from the unpivot process.
- The unpivot column has a particular structure to correctly apply the aliases to the names of the columns. In Snowflake, an OBJECT_CONSTRUCT is used to create an object that contains key-value pairs of column names and aliases. If no alias was specified, the same criteria that Teradata uses is applied to automatically generate names. A GET_IGNORE_CASE is then used to map the column names in the unpivot column to their aliases.
- In the FROM clause, an UNPIVOT call is added for each value column being generated, The first value column will have the name of the unpivot column as the generated column name. All subsequent UNPIVOT calls get their column name autogenerated to avoid an error that could arise from duplicated names.
- Lastly, a WHERE clause is added to filter the duplicated rows when more than one UNPIVOT call was used. The criteria used is based on the column groups from the original TD_UNPIVOT. Duplicated rows combine columns from different groups, so the WHERE clause is used to take only the rows where all unpivot values are from one column group.
Note that this transformation requires column information from the original table to build the column list. If the source table is not included in the conversion, the conversion tool will not be able to build the column list as it does not have access to the original. When utilizing the power of a tool like SnowConvert, it’s always recommended to include as much of the source DDL available to allow for the tool to properly build, analyze, and convert all references and dependent objects. If you do not have the source DDL, SnowConvert will output an error code (MSCEWI2061) informing you that it does not have enough information to perform the conversion. This kind of error/warning reporting is the kind of thing that accelerates any migration by providing informed errors and warnings. If you can track down the source DDL, SnowConvert will automate this transformation.
So if you can’t PIVOT a couch up some stairs, at least you can now UNPIVOT effectively from Teradata to Snowflake.