Blog Excerpt

The new normal has seen enterprise leaders push the envelope on their BI (Business Intelligence) implementations to deliver on the dual mandate of growing revenue and optimizing costs. As data volumes and availability continue to grow, the emphasis is on democratizing data and enhancing collaboration among field teams. Actionable, advanced data visualizations and interactive business dashboards have made advanced analytics accessible at all levels of an organization. BI tools continue to innovate in allowing end-user organizations to customize for their specific needs. Organizations, big and small, are looking for cloud-based simplification of data security and discovery while expanding their use of collaborative BI and AI (Artificial Intelligence).

Snowflake BI and Analytics for improved collaboration

The Snowflake platform provides a multi-cluster shared data architecture that powers the Data Cloud in a fully managed environment. It allows enterprises to handle their most critical workloads with the vast computing power and resources available on the cloud. DataTerrain SMEs (Subject Matter Experts) have observed that Snowflake's major differentiator is its ability to drive collaboration. Analysts across organizations have also highlighted the simplicity of discovery, acquisition, and monetization of data, data services, and applications across clouds using Snowflake. Some of the key […]
Category: Snowflake
HOW TO TRACK CHANGES USING TABLE STREAMS IN SNOWFLAKE
A stream is a Snowflake object that provides change data capture (CDC) capabilities for tracking changes to a table. It records the changes made to the table, including inserts, updates, and deletes, as well as metadata about each change.

Types of Streams:

A standard table stream tracks all DML changes to the source table: inserts, updates, and deletes (including table truncates). An append-only table stream tracks row inserts only; update and delete operations (including table truncates) are not recorded. Because an append-only stream returns only the appended rows, it can be a better choice than a standard stream for extract, load, transform (ELT) and similar scenarios that depend exclusively on row inserts. Append-only table streams can be combined with Snowflake tasks for continuous ELT workflows that process recently changed table rows.

Stale Streams:

A stream becomes stale if its offset falls earlier than the table's data retention period. To avoid staleness, consume data from the stream regularly: consuming stream data advances the stream's offset, which prevents the stream from becoming stale. ENABLE […]
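As an illustration, here is a minimal sketch of creating and consuming both stream types; the source table "orders", the history table, and the column names are hypothetical:

CREATE OR REPLACE STREAM orders_stream ON TABLE orders;                     -- standard stream: inserts, updates, deletes
CREATE OR REPLACE STREAM orders_append ON TABLE orders APPEND_ONLY = TRUE;  -- append-only stream: inserts only

-- Consuming the stream in a DML statement advances its offset,
-- which is what keeps the stream from going stale:
INSERT INTO orders_history (order_id, amount)
SELECT order_id, amount
FROM orders_stream
WHERE METADATA$ACTION = 'INSERT';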
An overview of the Snowflake Analytic Data Warehouse
Snowflake is an analytic data warehouse provided as Software-as-a-Service (SaaS). It is not a packaged software offering that can be installed by a user; Snowflake manages all aspects of software installation and updates. Snowflake runs completely on cloud infrastructure and cannot be run on private cloud infrastructures (on-premises or hosted). All components of Snowflake's service (other than an optional command line client) run in a public cloud infrastructure. It uses virtual compute instances for its compute needs and a storage service for persistent storage of data.

Snowflake provides a data warehouse that is faster, easier to use, and far more flexible than traditional data warehouse offerings. Snowflake's data warehouse is not built on an existing database or "big data" software platform such as Hadoop. It uses a new SQL database engine with a unique architecture designed for the cloud. To the user, Snowflake has many similarities to other enterprise data warehouses, but it also has additional functionality and unique capabilities.

Snowflake's data warehouse is a true SaaS offering. More specifically:
● There is no hardware (virtual or physical) for you to select, install, configure, or manage.
● There is no software for you to install, configure, or manage.
● On-going maintenance, management, and tuning is […]
Overview of the architecture of the Snowflake Analytic Data Warehouse
Snowflake's analytic data warehouse architecture is a hybrid of traditional shared-disk and shared-nothing database architectures. Similar to shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters, where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture with the performance and scale-out benefits of a shared-nothing architecture.

Snowflake's unique architecture consists of three key layers:
● Database Storage
● Query Processing
● Cloud Services

Database Storage
When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format and stores it in cloud storage. Snowflake manages all aspects of how this data is stored: the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake. The data objects stored by Snowflake are not directly visible or accessible by customers; they are accessible only through SQL query operations run using Snowflake.

Query Processing
Query execution is performed in the processing layer. Snowflake processes queries using "virtual warehouses". Each virtual warehouse […]
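As a brief illustration, a virtual warehouse is created and used with plain SQL; the name and settings below are illustrative, not prescriptive:

CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE = 'XSMALL'  -- smallest compute cluster size
  AUTO_SUSPEND = 60          -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE;        -- resume automatically when a query arrives

USE WAREHOUSE reporting_wh;  -- queries in this session now run on this warehouse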
Overview of the Time Travel feature in Snowflake
Snowflake Time Travel enables us to perform the following tasks:
● Access data that has been changed or deleted at any point within the retention period.
● Restore tables, schemas, and databases that have been dropped.
● Create clones of tables, schemas, and databases at or before specific points in the past.
These tasks can be performed only within the retention period; after it expires, the data is moved into Snowflake Fail-safe.

Time Travel SQL Extensions:
To access or clone historical data, Snowflake Time Travel provides the AT and BEFORE clauses, which can be specified in SELECT statements and CREATE … CLONE commands. To pinpoint the exact historical data you wish to access, these clauses take one of the following parameters:
● TIMESTAMP
● OFFSET (time difference in seconds from the present time)
● STATEMENT (identifier for a statement, e.g. query ID)

Restoring Objects:
A dropped table, schema, or database that has not yet been purged from the system (i.e. not yet moved into Snowflake Fail-safe) can be restored using the UNDROP command. Calling UNDROP restores the object to its most recent state before it was dropped.

Enabling and Disabling Time Travel:
Time Travel is automatically enabled in Snowflake with a 1-day data retention period by default […]
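A minimal sketch of these extensions against a hypothetical table "orders" (the timestamp is illustrative and the query ID is a placeholder):

-- Query the table as it was one hour ago (offset in seconds):
SELECT * FROM orders AT (OFFSET => -3600);

-- Query the table's state just before a specific statement ran:
SELECT * FROM orders BEFORE (STATEMENT => '<query_id>');

-- Clone the table as of a point in the past:
CREATE TABLE orders_restored CLONE orders
  AT (TIMESTAMP => '2020-08-01 00:00:00'::TIMESTAMP_LTZ);

-- Recover a dropped table within the retention period:
UNDROP TABLE orders;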
HOW TO SHARE DATA IN SNOWFLAKE
Snowflake enables data sharing through named Snowflake objects called shares, which let us share tables, secure views, and secure UDFs in our Snowflake database (the data provider) with other Snowflake accounts (data consumers). If a data consumer does not have a Snowflake account, we can create a reader account through which we can share data without requiring the consumer to become a Snowflake customer.

All database objects shared between accounts are read-only, so data consumers cannot make any changes to the data. In secure data sharing, data is never actually copied or transferred between accounts, which means shared data does not take up any storage in a consumer account. The only charges to consumers are for the compute resources used to query the shared data. In the case of a reader account, the provider is charged for the compute resources used by users in the reader account.

Data shares and reader accounts must be located in the same region of the same cloud provider (AWS, Azure, or GCP). Data consumers cannot re-share the shared data with other accounts. In reader accounts, data consumers are allowed to create new users. To start sharing data we need […]
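A minimal sketch of both sides of a share; the database, schema, table, share, and account names are all hypothetical:

-- Provider side: create a share, grant objects to it, and add a consumer account:
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = consumer_account;

-- Consumer side: create a read-only database from the share:
CREATE DATABASE shared_sales FROM SHARE provider_account.sales_share;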
Snowflake Insights Webcast: Leave Your Legacy Data Problems Behind
Leave Your Legacy Data Problems Behind With Snowflake's Modern Cloud Data Platform

Webcast: Wednesday, Aug 12, 10 AM PT
Presenter: Eric Pouliot

Snowflake delivers a serverless, modern cloud data platform, optimized for data engineering, data lakes, data warehouses, data applications, data science, and data exchange. Built from the ground up in the cloud, Snowflake provides limitless possibilities for businesses of all sizes.

This Precisio presentation will highlight:
● High Level Overview
● Key Benefits
● Basic Pricing
● Getting Started

Together, DataTerrain and Precisio provide enterprise-class, cloud-ready data engineering, business intelligence, analytics, and data science solutions.
Overview of Continuous and Automated Data Loading in Snowflake Using Snowpipe
Snowpipe is Snowflake's continuous data ingestion service. Snowpipe loads data from files as soon as they are available in a staging area, based on the COPY statement defined in a referenced pipe. The COPY statement identifies the source location of the data files (e.g. an AWS S3 bucket) and the target table into which Snowpipe loads the data.

The following mechanisms are available for detecting and loading the staged files:
● Automating continuous data loading using cloud messaging
● Calling Snowpipe REST endpoints to load data

Automating Continuous Data Loading Using Cloud Messaging: Automated data loads use event notifications (e.g. AWS SQS) for cloud storage to inform Snowpipe of the arrival of new data files to load. Snowpipe then copies the files into a queue, from which they are loaded into the target table in a continuous, serverless fashion based on parameters defined in the pipe.

Calling Snowpipe REST Endpoints to Load Data: Your client application calls a public REST endpoint with a list of data filenames and a referenced pipe name. If new data files matching the list are discovered in the stage, they are queued for loading. Snowflake-provided compute resources load data from the queue into […]
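A minimal sketch of an auto-ingest pipe, assuming a hypothetical external stage "s3_stage" over an S3 bucket and a hypothetical target table "orders":

CREATE OR REPLACE PIPE orders_pipe
  AUTO_INGEST = TRUE          -- load on cloud storage event notifications (e.g. AWS SQS)
AS
COPY INTO orders
FROM @s3_stage
FILE_FORMAT = (TYPE = 'CSV');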