Overview of the Snowflake Analytic Data Warehouse Architecture
The Snowflake analytic data warehouse architecture is a hybrid of traditional shared-disk and shared-nothing database architectures.
Like shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. Like shared-nothing architectures, however, Snowflake processes queries using MPP (massively parallel processing) compute clusters in which each node stores a portion of the entire data set locally.
This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture.
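The hybrid idea can be illustrated with a toy model. The sketch below is not Snowflake code: CENTRAL_STORE, ComputeNode, and run_sum_query are hypothetical names invented for illustration. It assumes a dictionary standing in for the central repository and a few in-process "nodes" that each copy only their assigned partitions before computing a partial result.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical central repository: partition id -> rows (stands in for shared storage).
CENTRAL_STORE = {
    0: [3, 5, 8],
    1: [13, 21],
    2: [34, 55, 89],
}


class ComputeNode:
    """One node of a toy MPP cluster; it caches only the partitions assigned to it."""

    def __init__(self, partition_ids):
        self.partition_ids = partition_ids
        self.local_cache = {}

    def load(self):
        # Any node can read any partition from the shared store (shared-disk trait).
        for pid in self.partition_ids:
            self.local_cache[pid] = list(CENTRAL_STORE[pid])

    def partial_sum(self):
        # Each node computes only over its local portion (shared-nothing trait).
        return sum(sum(rows) for rows in self.local_cache.values())


def run_sum_query(nodes):
    """Fan a SUM query out to every node and combine the partial results."""
    for node in nodes:
        node.load()
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        return sum(pool.map(ComputeNode.partial_sum, nodes))


nodes = [ComputeNode([0]), ComputeNode([1, 2])]
total = run_sum_query(nodes)
```

Changing how partitions are assigned to nodes changes only how the work is divided, not the result, because every node reads from the same central store.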
Snowflake’s unique architecture consists of three key layers:
• Database Storage
• Query Processing
• Cloud Services
The database storage layer holds all data loaded into Snowflake. When data is loaded, Snowflake reorganizes it into its internal optimized, compressed, columnar format and stores the result in cloud storage.
Snowflake manages all aspects of how this data is stored: organization, file size, structure, compression, metadata, statistics, and other aspects of data storage. The data objects stored by Snowflake are not directly visible or accessible to customers; they can be reached only through SQL query operations run in Snowflake.
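To make "columnar, compressed, with metadata and statistics" more concrete, here is a generic sketch of those three ideas. Snowflake's internal format is not described in this article, so nothing here should be read as its actual layout: the functions to_columns, run_length_encode, and column_stats are hypothetical, and run-length encoding is used only as one simple example of column compression.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def column_stats(values):
    """Per-column statistics of the kind an engine might keep as metadata."""
    return {"min": min(values), "max": max(values), "count": len(values)}


rows = [
    {"region": "EU", "amount": 10},
    {"region": "EU", "amount": 25},
    {"region": "US", "amount": 7},
]
columns = to_columns(rows)
encoded_region = run_length_encode(columns["region"])
amount_stats = column_stats(columns["amount"])
```

Storing values column by column places similar values next to each other, which is what makes simple schemes such as run-length encoding effective, and per-column statistics let a query skip data that cannot match a filter.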
Query execution is performed in the query processing layer. Snowflake processes queries using “virtual warehouses”. Each virtual warehouse is an MPP compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider.
Each virtual warehouse is an independent compute cluster that does not share compute resources with other virtual warehouses. As a result, each virtual warehouse has no impact on the performance of other virtual warehouses.
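The isolation property can be shown with a small model. This is a simplified sketch, not Snowflake's scheduler: the VirtualWarehouse class, the warehouse names, and the rule of one query per node are all assumptions made for illustration.

```python
class VirtualWarehouse:
    """Toy stand-in for an independent compute cluster (hypothetical model)."""

    def __init__(self, name, node_count):
        self.name = name
        self.node_count = node_count  # nodes owned exclusively by this warehouse
        self.running = []             # queries currently occupying nodes

    def submit(self, query):
        # A query waits only when this warehouse's own nodes are all busy.
        if len(self.running) >= self.node_count:
            return "queued"
        self.running.append(query)
        return "running"

    def free_nodes(self):
        return self.node_count - len(self.running)


etl = VirtualWarehouse("etl_wh", node_count=2)
reporting = VirtualWarehouse("reporting_wh", node_count=2)

# Saturate the ETL warehouse; its third query has to wait.
etl_results = [etl.submit(f"load_{i}") for i in range(3)]

# The reporting warehouse shares no nodes with etl_wh, so it is unaffected.
reporting_result = reporting.submit("daily_report")
```

Because each warehouse tracks only its own nodes, a backlog in one never reduces the capacity available to another.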
The cloud services layer is a collection of services that coordinate activities across Snowflake. These services tie together all of the different components of Snowflake in order to process user requests, from login to query dispatch.
The services in this layer include:
• Infrastructure management
• Metadata management
• Query parsing and optimization
• Access control
The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider.
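The path of a request "from login to query dispatch" can be sketched as a short pipeline. Everything in this example is hypothetical (the USERS, GRANTS, and TABLES dictionaries, the function names, and the tiny parser that accepts only simple SELECT statements). It shows only the order in which such services might be consulted, not how Snowflake implements them.

```python
# Hypothetical in-memory state standing in for the services' metadata.
USERS = {"alice": "s3cret"}
GRANTS = {("alice", "sales"): {"SELECT"}}
TABLES = {"sales", "orders"}


def authenticate(user, password):
    """Login check against the toy user store."""
    return USERS.get(user) == password


def parse(sql):
    """Very small parser: accepts only 'SELECT ... FROM <table>'."""
    tokens = sql.strip().rstrip(";").split()
    if len(tokens) < 4 or tokens[0].upper() != "SELECT" or tokens[-2].upper() != "FROM":
        raise ValueError("unsupported statement")
    return {"action": "SELECT", "table": tokens[-1].lower()}


def authorize(user, plan):
    """Access control: is the action granted to this user on this table?"""
    return plan["action"] in GRANTS.get((user, plan["table"]), set())


def handle_request(user, password, sql, warehouse):
    """Walk one request from login to dispatch."""
    if not authenticate(user, password):
        return "login failed"
    plan = parse(sql)                    # query parsing
    if plan["table"] not in TABLES:      # metadata lookup
        return "unknown table"
    if not authorize(user, plan):        # access control
        return "access denied"
    return f"dispatched {plan['action']} on {plan['table']} to {warehouse}"


result = handle_request("alice", "s3cret", "SELECT amount FROM sales;", "reporting_wh")
```

Only a request that passes every check is handed to a virtual warehouse, so the compute layer never receives unauthenticated or unauthorized work in this model.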