Monitoring Amazon Redshift Query Performance

So far we have looked at how the knowledge of the data that a data analyst carries can help with the periodic maintenance of an Amazon Redshift Cluster. Knowing the nature of the data we work with can help us maximize the potential of our cluster by using tools like the Column Compression Encoding of a table and the Vacuuming process.

When we talk about maximizing the potential of a cluster, we usually look at two main metrics. The first is its capacity, i.e. the amount of data we can load into it. The second is the time it takes for our Amazon Redshift Cluster to answer our queries.

So, no matter how many tools we have for optimizing our cluster, if we are not aware of its performance, and more specifically the query execution time, we cannot use the knowledge of our data together with the provided tools for optimization. For this reason, monitoring the Query Performance of our cluster should be an important part of our maintenance routine.

In this chapter, we discuss how we can monitor the Query Performance on our Amazon Redshift instance.

Monitoring the Query Performance using the AWS Console

The easiest way to check how your queries perform is by using the AWS Console. From the cluster list, you can select the cluster for which you would like to see how your queries perform. There, by clicking on the Queries tab, you get a list of all the queries executed on this specific cluster.

For each query, you can quickly check the time it takes for its completion and at which state it currently is. Also, you can monitor the CPU Utilization and the Network throughput during the execution of each query.

The AWS Console gives you a bird’s-eye view of your queries and their performance, and it is good for pointing out problematic queries. Amazon Redshift also offers access to much more information, stored in a number of system tables and views, together with some special commands. All of these can help you debug, optimize, and better understand the behavior and performance of your queries.
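For example, beyond the console view, the STL_QUERY system table records each executed query together with its start and end times. The following is a minimal sketch for surfacing the slowest queries; the one-day window and the limit of ten rows are arbitrary choices that you can adjust to your needs:

```sql
-- Slowest queries of the last day (a sketch; the 1-day window
-- and the LIMIT are arbitrary choices, not recommendations).
select query,
       trim(querytxt) as sql_text,
       datediff(seconds, starttime, endtime) as duration_s
from stl_query
where starttime > dateadd(day, -1, current_timestamp)
order by duration_s desc
limit 10;
```

Queries that consistently appear at the top of such a list are good candidates for further investigation with the system tables discussed in this chapter.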

STL_ALERT_EVENT_LOG

The STL_ALERT_EVENT_LOG table logs an alert every time the query optimizer identifies an issue with a query. You can use these alerts as indicators of how to optimize your queries.

When an alert shows up in this table, its solution column suggests a possible fix for the problem, e.g. running the ANALYZE command to update stale table statistics, or vacuuming the table.
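For instance, you can aggregate the logged alerts to see which issues come up most often on your cluster. This is a minimal sketch using the event and solution columns of STL_ALERT_EVENT_LOG:

```sql
-- Most frequent optimizer alerts and their suggested solutions
-- (the LIMIT of 10 is an arbitrary choice).
select trim(event) as event,
       trim(solution) as solution,
       count(*) as occurrences
from stl_alert_event_log
group by event, solution
order by occurrences desc
limit 10;
```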

SVV_TABLE_INFO

The next important system table that holds information related to the performance of all queries and your cluster is SVV_TABLE_INFO. That table contains summary information about your tables.

After you have identified a query that is not performing as desired, using information from the AWS Console and the STL_ALERT_EVENT_LOG, you can consult this table for hints on how the tables that participate in a query might affect its performance. To be more precise, this is a view that utilizes data from multiple other tables to provide its information.

This view contains information that might help an analyst identify what is causing the deterioration of a query’s performance, as it contains information linked to Compression Encoding, Distribution Keys, Sort Styles, Data Distribution Skew, and overall table statistics. For example:

  • The empty column contains the number of blocks eligible to be freed by a vacuum command. IMPORTANT – this is a deprecated column that will be removed in future releases.
  • The unsorted column contains the percentage of rows that are currently unsorted. A high value might explain the current performance of a query and indicate that a full vacuum or a VACUUM SORT ONLY would help.
  • Sortkey1_enc contains the compression encoding of the first column in the sort key.
  • Skew_sortkey1 is the ratio of the size of the largest non-sort key column to the size of the first column of the sort key. Have a look at this if you are worried about the effectiveness of your sort keys.
  • Skew_rows is the ratio of the number of rows in the slice with the most rows to the number in the slice with the fewest, and is usually used to evaluate the effectiveness of your current distribution strategy.
  • Max_varchar contains the size of the largest column with a VARCHAR data type. Columns of this data type, while compressing quite well, might not fit in memory and thus trigger the creation of a temporary table. That is something that can affect your queries’ performance.
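Putting a few of these columns together, a query along the following lines can shortlist tables that may need maintenance. The thresholds used here are arbitrary assumptions for illustration, not recommendations:

```sql
-- Tables with many unsorted rows or noticeable row skew
-- (the thresholds of 10% unsorted and 4x skew are arbitrary).
select "table",
       unsorted,
       skew_rows,
       skew_sortkey1,
       max_varchar
from svv_table_info
where unsorted > 10 or skew_rows > 4
order by unsorted desc;
```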

Amazon Redshift offers a wealth of information for monitoring query performance. There are both visual tools and raw data that you may query on your Redshift instance. A combined usage of all the different information sources related to query performance can help you identify performance issues early, figure out what causes them, and, together with the input from an analyst, improve them significantly.

Monitoring Disk Space

Another factor you should monitor closely is the cluster’s free disk space. It affects the performance of your queries, and you can manage it through both Vacuuming and the proper selection of Compression Encodings for your columns.

Temp tables are often created when you execute queries, and if your cluster is full then these tables cannot be created, so you might start noticing failing queries.

To monitor your current disk space usage, you can query the STV_PARTITIONS table. It contains information related to disk speed and disk utilization. For example, the following query prints, for each of the cluster’s disks, its capacity, the amount used, the percentage currently in use, the host the disk belongs to, and its owner.

select owner, host, diskno, used, capacity,
       (used - tossed) / capacity::numeric * 100 as pctused
from stv_partitions
order by owner;

Examining the results can help us quickly see whether data is evenly distributed across the disks of our cluster and how heavily each disk is used. If the usage percentage is high, we can Vacuum our tables or delete some unnecessary tables that we might have. If utilization is uneven, we might want to reconsider the distribution strategy that we follow.

Useful Resources

Your starting point regarding the Monitoring of your Query Performance should be the AWS Console. It offers an excellent view of all your queries and some vital statistics that can help you quickly identify any issues.

Amazon also provides some auxiliary tools that use the information stored in the system tables of Amazon Redshift to offer more detailed monitoring. You can check the monitoring solution that uses Amazon CloudWatch and AWS Lambda to perform more detailed cluster monitoring.
