All data in Snowflake is stored in database tables, logically structured as collections of columns and rows.
Snowflake relies on two principal concepts to physically organize table data for the best performance:
- Micro-Partitions
- Data-Clustering
Micro-Partitions – Snowflake stores table data in a columnar format, divided into micro-partitions: contiguous units of storage that each hold roughly 50 MB to 500 MB of uncompressed data. Unlike partitions in traditional databases, micro-partitions are not static and are not defined or maintained by users; Snowflake manages them automatically and dynamically.
Depending on its size, a table in Snowflake can consist of thousands or even millions of micro-partitions.
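To make the idea concrete, here is a minimal Python sketch of a toy model, not Snowflake's actual storage engine. It splits rows into fixed-size chunks, stores each chunk column by column, and records the minimum and maximum of every column so that a query can skip chunks that cannot contain a value. The names `MicroPartition`, `build_partitions`, and `partitions_to_scan` are hypothetical, and the partition size is counted in rows rather than megabytes purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    columns: dict  # column name -> list of values (columnar layout)
    min_max: dict  # column name -> (min, max) of the values in this partition


def build_partitions(rows, rows_per_partition):
    """Split rows into contiguous chunks and store each chunk by column."""
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        min_max = {name: (min(vals), max(vals)) for name, vals in columns.items()}
        partitions.append(MicroPartition(columns, min_max))
    return partitions


def partitions_to_scan(partitions, column, value):
    """Return indexes of partitions whose value range could contain `value`."""
    return [
        index
        for index, part in enumerate(partitions)
        if part.min_max[column][0] <= value <= part.min_max[column][1]
    ]


# Twelve hypothetical order rows, four rows per toy partition.
rows = [{"order_id": i, "amount": i * 10} for i in range(1, 13)]
parts = build_partitions(rows, rows_per_partition=4)
hits = partitions_to_scan(parts, "order_id", 6)
print(len(parts), hits)
```

In this toy run, only one of the three partitions has to be read to find `order_id = 6`; the other two are skipped based on their recorded ranges alone.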
Data-Clustering – Data clustering is similar to the sort-key concept found in many massively parallel processing (MPP) databases.
Snowflake clusters tables automatically as data is loaded. This natural clustering is sufficient in most cases and performs well even on large tables.
However, a user who wants to control clustering manually can define a clustering key on the table, and Snowflake then uses this key to cluster the table's data. This is typically useful only for very large tables.
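A small Python sketch can illustrate why natural clustering often works well. It assumes, as a simplification, that rows land in partitions in the order they are loaded, so a column that follows load order (a hypothetical `load_day`) ends up with narrow, non-overlapping ranges, while an unrelated column (a hypothetical `customer_id`) is scattered across every partition. The helper names are illustrative only.

```python
def ranges_per_partition(values, rows_per_partition):
    """Return the (min, max) of each contiguous chunk of values."""
    ranges = []
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        ranges.append((min(chunk), max(chunk)))
    return ranges


def overlapping(ranges, value):
    """Count the partitions whose range could contain `value`."""
    return sum(1 for low, high in ranges if low <= value <= high)


# Sixteen rows loaded over four days, four rows per day.
load_day = [day for day in range(1, 5) for _ in range(4)]
# Customer ids arrive in no particular order relative to load time.
customer_id = [(i * 7) % 16 for i in range(16)]

day_ranges = ranges_per_partition(load_day, 4)
customer_ranges = ranges_per_partition(customer_id, 4)

print(overlapping(day_ranges, 3))       # partitions that may hold load_day = 3
print(overlapping(customer_ranges, 8))  # partitions that may hold customer_id = 8
```

In this toy data, a filter on `load_day` touches a single partition, while a filter on `customer_id` has to consider all four.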
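As a rough illustration of what a clustering key buys, the Python sketch below compares how many toy partitions a lookup must consider before and after the rows are reordered by the key column. It is a simplified model under stated assumptions: partitions are fixed-size row chunks, and sorting stands in for Snowflake's clustering, which works differently internally. The table name `orders` and column `customer_id` in the SQL string are hypothetical; the `ALTER TABLE ... CLUSTER BY (...)` form itself is the Snowflake syntax for defining a clustering key.

```python
# Hypothetical statement a user might run to define the key in Snowflake.
CLUSTER_SQL = "ALTER TABLE orders CLUSTER BY (customer_id)"


def partitions_considered(values, rows_per_partition, target):
    """Count fixed-size chunks whose (min, max) range could contain target."""
    count = 0
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        if min(chunk) <= target <= max(chunk):
            count += 1
    return count


# Sixteen customer ids in arbitrary load order.
customer_ids = [(i * 7) % 16 for i in range(16)]

# Without a clustering key: the ids are scattered across partitions.
before = partitions_considered(customer_ids, 4, target=8)

# With rows ordered by the key: each partition covers a narrow id range.
after = partitions_considered(sorted(customer_ids), 4, target=8)

print(CLUSTER_SQL)
print(before, after)
```

With only four partitions the saving is trivial, which mirrors the point above: the benefit of a clustering key becomes meaningful only when a table has a very large number of micro-partitions.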
DataTerrain, with years of experience and reliable experts, is ready to assist. We have served more than 200 customers in the US and over 70 customers worldwide. We offer flexible working hours and do not require long-term binding contracts.