HowTo: SQL type casting

Eleni Markou

While in an ideal world each column in a database table has a suitably chosen data type, in this non-ideal place we live in, having stored dates in a wrong format, is a problem that the majority of those who wrote SQL has faced. This is a common consequence when different applications use the same data in different ways than the one you originally defined.

Handling columns with wrong data types can become cumbersome easily, as it does not allow you to perform comparisons or logical operations the way you may wish. So the need of casting columns into different type than the one currently defined arises.

Luckily for us, the majority of the top used databases have implemented helpful functions that one can use to perform this type of tasks easily. Unfortunately, these functions we are talking about, are not exactly the same across all database systems.

Having created a neat cheatsheet of the majority of the cases one may face, can save a lot of time and effort from searching in long documentation pages.

In this post, we are going to present the more common castings one may wish to perform along with the SQL code to do so, when using Amazon Redshift, PostgreSQL, Google BigQuery or SQL Server.

In addition, we are going to investigate what happens in the case of fields that represent currency and what is the most appropriate way to store them in different databases.

Microsoft SQL Server

If you use Microsoft SQL Server and want to convert a value from one data type to another the two available functions that will help you are cast and convert which both provide similar functionality.

String to Date and Timestamp

The CONVERT() function takes as input the current valid expression and the target data type and performs the appropriate conversion.

Convert syntax: CONVERT(target_data_type(length),expression,style)

Regarding the supported representation styles, MS SQL Server offers a quite large variety as shown in the following table.

Date as inputFunction Call
02/08/17CONVERT(varchar,date,101)
17.02.08CONVERT(varchar,date,102)
08/02/17CONVERT(varchar,date,103)
08.02.17CONVERT(varchar,date,104)
08-02-17CONVERT(varchar,date,105)
08 02 17CONVERT(varchar,date,106)
February 08, 17CONVERT(varchar,date,107)
10:59:00CONVERT(varchar,date,108)
10:59:00AM(or PM)CONVERT(varchar,date,109)
02-08-2017CONVERT(varchar,date,110)
17/02/08CONVERT(varchar,date,111)
170208CONVERT(varchar,date,12)
20170208CONVERT(varchar,date,112)
08 02 2017 10:59:00 (in 24h)CONVERT(varchar,date,113)

String to Numeric

Regarding the conversion of a string into a numeric data type such as int and float, cast is the suitable function. Its syntax is similar to convert, taking as input the column name or the literal along with the target data type.

Cast syntax: CAST ( expression AS data_type )

Number as inputFunction Call
5CAST(number AS int)
2.3CAST(number AS float)

Fortunately, MS SQL Server is clever enough to raise an error when a malformed input data is given, such as “Conversion failed when converting date and/or time from character string.

Numeric to Currency

Although MS SQL server does support a special data type for currency, called money, the usual advice is to prefer the decimal data type since money is a residue of an older version of SQL Server where the bytes that a decimal number occupied were fixed to 17.

At that time, money data type reserved only 8 bytes which were obviously sufficient for currency subdivisions.

Since MS SQL Server 2005 things have changed and the user can specify the bytes that each field of decimal type occupies thus making money deprecated.

In case one wants to also include the currency sign, the format function is suitable. The following examples combine everything we have mentioned so far regarding currency.

DECLARE @Price Decimal(21,6) =  1500.50
SELECT FORMAT(@Price,'c','en-US') AS 'CURRENCY IN US Culture'
SELECT FORMAT(@Price,'c','ar-EG') AS 'CURRENCY IN LE Egypt Culture'
SELECT FORMAT(@Price,'c','ms-MY') AS 'CURRENCY IN Malaysia Culture'

Postgresql

String to Date and Timestamp

The TO_DATE function in PostgreSQL is used to converting strings into dates. Its syntax is TO_DATE(text, text) and the return type is a date.

In contrast with MS SQL Server which has strictly specified date formats, in Redshift, any format constructed using the patterns of the table found in the corresponding documentation can be correctly interpreted.

When using the TO_DATE() one has to pay attention as even if an invalid date is passed, it will convert it into a nominally valid date without raising any error.

Below we have intentionally declared wrong date format and nevertheless, PostgreSQL returns a valid, yet meaningless, date.

Input textFunction usedOutput date
2017-05-03to_date(input_text, ‘DD-MM-YYYY’)0008-11-06

This can be overcome by using explicit casting with the :: annotation but in this case, this won’t work with as a wide range of date formats as TO_DATE() .

Date as inputFunction Call
2017/02/08TO_DATE(date, ‘YYYY/MM/DD’)
08/02/2017TO_DATE(date, ‘DD/MM/YYYY’)
08-02-2017TO_DATE(date, ‘DD-MM-YYYY’’)
02082017TO_DATE(date’, ‘MMDDYY’)
February 08, 2017TO_DATE(date’, ‘Month DD, YYYY’)

The TO_TIMESTAMP function converts a string value into proper timestamp with timezone following the specified format. This function takes as input the initial string and the desired target format. In order to construct format strings, you can use the template patterns for formatting date and time values which can be found here.

Timestamp syntax: to_timestamp(text, text)

Datetime as inputFunction Call
02/08/2017 10:59:00TO_TIMESTAMP(datetime, ‘MM/DD/YYYY HH24:MI:SS’) or TO_TIMESTAMP(datetime, ‘MM/DD/YYYY HH12:MI:SS’) if in 12h format
2017/08/02 10:59:00TO_TIMESTAMP(datetime, ‘YYYY/MM/DD HH24:MI:SS’) or TO_TIMESTAMP(datetime, ‘YYYY/MM/DD HH12:MI:SS’) if in 12h format
08-02-2017 10:59:00TO_TIMESTAMP(datetime, ‘DD-MM-YYYY HH24:MI:SS’) or TO_TIMESTAMP(datetime, ‘DD-MM-YYYY HH12:MI:SS’) if in 12h format
10:15:20TO_TIMESTAMP(datetime, ‘HH24:MI:SS’)::TIME or TO_TIMESTAMP(datetime, ‘HH12:MI:SS’)::TIME if in 12h format

String to Numeric

In PostgreSQL strings can be converted to integer or double using CAST()function or the ::annotation. Both are equivalent and you can choose whichever you like.

Number as inputFunction call
5CAST(number AS int) or alternatively number::int
23CAST(number AS double precision) or alternatively number::double precision

If a column contains money data you should keep in mind that floating point numbers should not be used to handle money due to the potential for rounding errors.

A money value can be cast to numeric without loss of precision. Conversion to other types could potentially lose precision, and must also be done in two stages:

Number as inputFunction call
13.4Number::money::numeric::float8, cast (price as money)

In case of malformed input, PostgreSQL will not perform any conversion and instead will raise an appropriate error.

Amazon Redshift

Amazon Redshift comes with a batch of handy functions that help you in performing run-time conversions between compatible data types.

String to Date

When in comes to date conversions, the most suitable one is the TO_DATE(). This function takes two arguments: the first is a string that represents some date and the second one the format in which we want to represent this specific date.

Since Amazon Redshift is based on PostgreSQL 8.0.2, what is previously mentioned about the TO_DATE()function still holds.

In short, one has to pay attention not providing the TO_DATE() function malformed date format as this won’t raise any error and instead will perform a meaningless type conversion.

Regarding the supported input date format, whatever synthesized from the documented patterns found here, is a valid format and thus interpretable from Redshift without further implications.

Some indicative examples follow:

To_date syntax: TO_DATE (string, format)

Date as inputFunction Call
2017/02/08TO_DATE(date, ‘YYYY/Mon/DD’)
08/02/2017TO_DATE(date, ‘DD/Mon/YYYY’)
08-02-2017TO_DATE(date, ‘DD-Mon-YYYY’’)
02082017TO_DATE(date’, ‘MonDDYY’)
08 Feb 2017TO_DATE(date, ‘DD Mon YYYY’)
18:45:09TO_DATE(date, ‘HH24:MI:SS’)

String to Numeric

Regarding conversions from strings to other data types cast and convert function or the equivalent form of a cast which makes use of the :: annotation,  should serve your needs well.

The cast function takes as input an expression that evaluates to one or more values, such as a column name or a literal and returns the data types specified by the type argument.

Cast syntax: CAST ( expression AS type ) or equivalently expression :: type

The convert function takes as input one of the supported data types, as the target data type and an expression that evaluates to one or more values, exactly as the cast function does. The return type is the data type specified by the previously selected type argument.

Convert syntax: CONVERT ( type, expression )

Number as inputFunction call
2CAST(number AS integer)

or alternatively

number::integer

or alternatively

convert(integer, number)

2.3CAST(number AS decimal(1,1))

It is important to specify correctly the precision you want.

For example, for input 2.33 you should choose:

CAST(number AS decimal(1,2))

 

In case of malformed input, Redshift will not perform any conversion and instead will raise an appropriate error.

Numeric to Currency

Unfortunately, Redshift does not support any currency data type. One way to format a float as currency is using the to_char function as shown below:

to_char(price, 'FM$999,999,999,990D00')

Google BigQuery

As with other databases, in Big Query one can perform explicit conversion using appropriate functions such as cast.

String to Date and Timestamp

When handling dates declared as strings in BQ there are two options to point out. The first option involves the use of timestamp and parse_date functions in order to convert the initial string into a date format.

The relevant syntax that can be used is presented in the following table.

Date as inputFunction Call
2017/02/08TIMESTAMP(PARSE_DATE(‘%Y/%m/%d’,date))
08/02/2017TIMESTAMP(PARSE_DATE(‘%d/%m/%Y’,date))
08-02-2017TIMESTAMP(PARSE_DATE(‘%d-%m-%Y’,date))
02082017TIMESTAMP(PARSE_DATE(‘%d%m%Y’,date))

Yet, you have to keep in mind that BQ performs literal coercion in cases when the actual literal type is different from the type expected by the function in question.

This means that strings formatted as valid dates are implicitly coerced to dates and get treated as such when given as input to relevant functions.

String to Numeric

Cast function performs as in every other database mentioned in this post using the following syntax: CAST(expr AS typename).

Alternatively to cast one can also use the safe_cast which returns a null value in case when BQ is unable to perform the actual cast while cast would raise an error. Apart from this, safe_cast is identical to cast.

Number as inputFunction call
5CAST(number AS int64)
2.3CAST(number AS float64)

For BigQuery’s Legacy SQL you can alternatively use the function INTEGER(number)and FLOAT(number).

Outro

When working with data types, casting is a useful feat, but it is important to remember something rather obvious. Having a type converted or cast could be easy to be forgotten in subsequent queries. So take extra care when converting your types.

 

 

Images

  • Blog Post image made with Pablo by Buffer
  • Source Image: Unsplash

Get our latest posts in your email to stay on top of everyone.