Convert Teradata SQL & Stored Procedures to Snowflake

by Brandon Carver, on Apr 28, 2020 6:39:45 AM

spock_imageLet’s talk about SQL and sProcs. And no, we are not talking sequels about Spock (though that does sound compelling, it's been done), but something just as exciting: translating SQL and Stored Procedures (sProcs) between two data warehouses (aka the hard problem in a database migration). Recently, Mobilize.Net and Snowflake announced a partnership to provide script translation tools for converting from a legacy, on-prem warehouse to Snowflake. We’ve been translating code since the days of Windows 95, but what’s there to translate with SQL? Isn’t SQL the same everywhere you go? And what does this have to do with Spock? (Nothing. Again, it’s sProc.)


All SQL is not ANSI SQL

You would think SQL was the same everywhere, but alas, no. That would be way too logical. SQL has ANSI standards that most data warehouses adhere to, but everyone’s architecture is a bit different. That bit of difference creates huge problems when translating from one architecture to the next. When transforming from any data warehouse to another, both systems are going to have “things” that appear in one that don’t appear in the other. For example, most Teradata SQL programmers I know start creating a set table the same way:

CREATE SET TABLE DATABASE.TABLE
,NO FALLBACK
,NO BEFORE JOURNAL
,NO AFTER JOURNAL
,CHECKSUM = DEFAULT
,DEFAULT MERGEBLOCKRATIO (

That’s quite the mouthful for creating a table with no duplicate values (no one ever accused SQL of lacking verbosity). Could we just write ```CREATE TABLE DATABASE.TABLE (```? Sure, if you're confident in leaving those Teradata specific instructions to the default values for that session (and if you're sure what those default values are). There's plenty of reason to be specific though. So what’s actually happening here? And why doesn't if work in Snowflake? It’s those differences in the architecture and those extensions of the ANSI SQL. The ```NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO``` statements are essentially telling Teradata how to store the table or defining different attributes of the table (i.e. not to use large quantities of space to create backups and save copies in order to recover the table). Snowflake optimizes the architecture and has an ever present recovery system in place that doesn’t need to be defined in the DDL.

If that verbosity wasn't enough, the table types aren't always directly translatable from one to the next. Snowflake does not support Teradata's SET TABLE functionality directly. Instead, you are forced to create a MULTISET TABLE. This is not a huge problem given that MULTISET is the default CREATE TABLE in both Teradata (when in ANSI transaction mode) and Snowflake, but these differences in architecture create some obstacles that would otherwise not exist if ANSI SQL was followed by all. The same table as above would be converted to ```CREATE TABLE DATABASE.SCHEMA.TABLE (```in Snowflake, but note that this will be a MULTISET table. It this all seems overwhelming, fear not! We have the definitive guide on converting Teradata tables to Snowflake at your disposal.

Are Stored Procedures ANSI?

But what about those sProcs? Just as with the previously mentioned tables, this is the same story: different architecture. Stored Procedures written for a legacy system like Oracle or Teradata are written in SQL, though procedural functionality can also be written in a proprietary scripting language like BTEQ for Teradata. We could ask if all of that procedural SQL is the same, but we already know the answer to that. (They're not.) The transformations can be more complex, but the idea is pretty much the same. 

Here’s an example of a simple stored procedure:

REPLACE PROCEDURE DATABASE.FOR_STATEMENT()
BEGIN
FOR fUsgClass AS cUsgClass CURSOR FOR
     SELECT
           
COL0

          ,TRIM(COL1) AS COL1ALIAS
          ,TRIM(COL2)
          ,COL3
     FROM DATABASE.Some_Other_Table
DO
     SET lvSeqNo = lvSeqNo + 1;
     SET UPDATE1 = fUsgClass.COL1ALIAS + 1;
     SET UPDATE2 = fUsgClass.COL3 + 123;
END FOR;
END;

We have some basic procedural commands like FOR and DO in this procedure along with a CURSOR statement. We’re using those to update the values in our columns. It’s relatively simple as far as sProcs go, and depending on what you are trying to do, quite useful. However, will this procedure work in Snowflake? You guessed it. Nope. How will we ever overcome this tragedy?

Snowflake currently uses JavaScript as its primary scripting language for stored procedures. So if you tried to import this directly from Teradata into Snowflake, it would fail. Not just because of a few differences in syntax, but currently the ability to create a cursor statement like the above requires calling JavaScript into the Snowflake SQL. The above stored procedure requires us to recreate the cursor functionality along with the ability to return values into a temporary table. That's not as simple as it sounds though.

Luckily, that's pretty much our secret sauce: recreating functional equivalent code across source and target code languages. A stripped down version of a functionally equivalent output to the above stored procedure in JavaScript might look like this:

CREATE OR REPLACE PROCEDURE DATABASE.PUBLIC.FOR_STATEMENT()
RETURNS STRING
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS
$$
{
//insert Mobilize.Net JavaScript helper classes here

for(var CUSGCLASS = new CURSOR(
`SELECT

COL0,
TRIM(COL1) AS COL1ALIAS,
TRIM(COL2),
COL3
FROM DATABASE.PUBLIC.Some_Other_Table`,[]).OPEN();
CUSGCLASS.NEXT();) {

var FUSGCLASS = CUSGCLASS.CURRENT;
lvSeqNo = lvSeqNo + 1;
hola = FUSGCLASS.COL1ALIAS + 1;
HOLA2 = FUSGCLASS.COL3 + 123;
}
CUSGCLASS.CLOSE();
}
      $$;

The entire code with all of the helper classes and variables created in a code migration are not populated above simply because it would make this blog twice as long as it already is. If you want to see the complete output code, check out our demo code samples available with our documentation for SnowConvert for Teradata (Mobilize.Net's code migration product for Teradata to Snowflake migrations).

Overcoming ANSI in a Migration

The same action is accomplished, and our sProc works just as well in Snowflake as it ever did in Teradata. Except that it’s in JavaScript now. This conversion from SQL to JavaScript may be pretty intimidating up front for anyone considering a move to Snowflake from a legacy data platform like Teradata. But it doesn't have to be. The procedural language may be different, but so is the SQL. ANSI SQL exists, but it isn't exclusively used by any major platform right now.

There will always be extensions to the code when migrating from one data warehouse to another. That shouldn't prevent you from taking advantage of all that a new platform has to offer. The cost of moving shouldn't stop anyone from moving to a platform that offers the best solution for their business needs. You just need the right movers to make sure everything works on the other side of the migration. Luckily, Mobilize.Net has that experience. We can set you up for success if you're looking to make the move to Snowflake. Until then, live long and prosper.

Topics:MVCSnowflakesql conversionTeradata

Comments

Subscribe to Mobilize.Net Blog

Download Free Trial