At Snowday 2023, Snowflake unveiled a series of groundbreaking updates within the Snowflake ecosystem. From enhanced data governance to cutting-edge AI integration, these innovations are set to transform how businesses manage and leverage their data. This post explores the key features introduced at Snowday 2023 and how they will impact the future of the Snowflake ecosystem.
Christian Kleinerman, Senior Vice President of Product at Snowflake, took center stage to unveil a myriad of innovative features in the Snowflake ecosystem. Let’s delve into each segment:
Data Foundation: Elevating the core of the platform
The Data Foundation segment stood out for its implications in everyday platform use, introducing interface enhancements and features designed to simplify user interactions.
Unistore, for mixed OLTP and OLAP Workloads
Unistore, first introduced as a new workload at the SUMMIT, now moves toward a Public Preview of HYBRID TABLES, scheduled for late 2023. This feature combines OLTP operations with traditional OLAP usage on the same platform, enabling real-time analytics. For instance, an e-commerce company could use Unistore to record user interactions as they happen and feed them directly into analytics, improving the personalization of the user experience.
Furthermore, Unistore can streamline operations by supporting fast single-point operations directly in Snowflake, which reduces the number of separate databases and pipelines that need to be maintained. This, in turn, accelerates the development of new features and use cases.
More Data Lake versatility with Iceberg Tables, managed or unmanaged
Another highly anticipated feature is the integration with the Apache ICEBERG TABLES standard, featured on TechRadar by Devoteam. It is entering Public Preview with two integration options: UNMANAGED and FULLY MANAGED ICEBERG TABLES. This integration provides additional versatility to all workloads within the platform, whether in Data Warehousing, Data Lakes, or Data Lakehouses.
Performance Improvements everywhere for traditional workloads
Focusing on corporate DATA LAKES, Snowflake announced the General Availability of features for handling semi-structured data and introduced Dynamic File Processing with Snowpark in Python and Scala. The improvements extend to traditional DATA WAREHOUSES with innovations like automatic clustering cost estimation, materialized view refresh enhancements, and support for INSERT statements for the query acceleration service, all in private preview.
These enhancements, which often go unnoticed by everyday users, contribute significantly to the SNOWFLAKE PERFORMANCE INDEX. This index, incorporated into the platform’s web interface since August 2022, indicates a 15% improvement in platform performance compared to a year ago.
Innovations in the Snowflake Ecosystem: Data governance with Snowflake Horizon
The highlight of this block centers on the Snowflake Ecosystem’s Data Governance framework, known as SNOWFLAKE HORIZON. This comprehensive solution addresses key areas like Compliance, Security, Privacy, Interoperability, and Access. With a focus on providing full control over data across all platforms and regions within the Snowflake Ecosystem, Horizon ensures that businesses remain compliant with both local and international regulations.
Many announcements were made regarding the Security of the platform:
- ENHANCED NETWORK SECURITY (Public Preview): Improvements in the management of blacklists and whitelists for IP access.
- IMPROVED AUTHENTICATION (Public Preview coming soon): Enhancements in authentication mechanisms.
- DATABASE ROLES (General Availability): Introduction of new roles at the database level for expanded Role-Based Access Control (RBAC) capabilities.
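To make the allow and block list idea from the first item concrete, here is a minimal, platform-independent sketch in Python. It is not Snowflake's implementation or API. The function name, the CIDR ranges, and the rule that a blocked entry wins over an allowed one are all assumptions chosen for illustration.

```python
from ipaddress import ip_address, ip_network


def is_allowed(client_ip, allowed, blocked):
    """Return True if client_ip falls in an allowed range and in no blocked range.

    Assumption for this sketch: a match in the blocked list always wins.
    """
    addr = ip_address(client_ip)
    in_allowed = any(addr in ip_network(cidr) for cidr in allowed)
    in_blocked = any(addr in ip_network(cidr) for cidr in blocked)
    return in_allowed and not in_blocked


# Hypothetical example ranges, not taken from any real account.
ALLOWED = ["192.168.1.0/24"]
BLOCKED = ["192.168.1.99/32"]

for ip in ("192.168.1.10", "192.168.1.99", "10.0.0.1"):
    print(ip, is_allowed(ip, ALLOWED, BLOCKED))
```

Keeping the two lists separate makes it possible to open a whole range and still carve out individual addresses, which is the kind of management these improvements target.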
However, what stood out for me in terms of Security Governance was the announcement of CIS: SNOWFLAKE FOUNDATION BENCHMARK. This serves as a catalog of recommendations and best practices, ensuring a consistent and robust security policy. Coupled with the upcoming TRUST CENTER (Private Preview coming soon), and a new section in the graphical interface, Snowflake is bolstering its security and compliance features.
Privacy also takes center stage with the announcement (still in development) of DIFFERENTIAL PRIVACY POLICIES. These policies aim to add layers of “noise” to data, making it less identifiable as granularity increases, contributing to improved data privacy.
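The "noise" idea can be illustrated with the classic Laplace mechanism on a count query. This is a generic textbook sketch, not Snowflake's policy syntax or implementation. The function names, the epsilon value, and the sample groups are assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, epsilon, rng):
    """Return a count with Laplace noise calibrated to the privacy budget epsilon.

    A count has sensitivity 1: adding or removing one row changes it by at most 1.
    """
    sensitivity = 1.0
    return len(rows) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)  # seeded only to make the demo repeatable
small_group = ["u1", "u2", "u3"]                  # hypothetical fine-grained slice
large_group = [f"u{i}" for i in range(10_000)]    # hypothetical coarse aggregate

print("small group:", noisy_count(small_group, 0.5, rng))
print("large group:", noisy_count(large_group, 0.5, rng))
```

The same amount of noise barely affects the large aggregate but swamps the three-row slice, which is why results become less identifiable as queries get more granular.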
In terms of Interoperability, features already mentioned in other sections of the keynote focus on the platform’s compatibility with new cataloging standards, such as external catalogs and Iceberg Tables catalogs, and on access to Iceberg tables from REST APIs for Snowpark.
The final pillar of Horizon, Access, was addressed through features like AUTO-CLASSIFICATION (still in development) and CUSTOM CLASSIFIERS (Private Preview), both leveraging the capabilities of LLMs and AI to assign custom classifications to objects automatically.
SNOWFLAKE COPILOT (in Private Preview) caught my attention for its potential to bring natural language interaction, akin to ChatGPT, through an LLM. This promises to facilitate building SQL statements from natural language requests, ushering in a new era of interaction within Snowflake’s graphical environment.
Cost Management: Enhancing financial control
The introduction of the COST MANAGEMENT INTERFACE empowers administrators with tools for enhanced financial control. Notable features include COST INSIGHTS, offering real-world examples for optimization, and the BUDGETING view (in Public Preview on AWS), providing a comprehensive overview of budgeted and actual spending.
Visibility comes through charts and dashboards showing workload metrics per warehouse, which are useful for evaluating scaling or considering increasing capacity through multi-clustering. Cost-per-query charts help identify queries that may need reengineering.
For Optimization, the COST INSIGHTS section surfaces real cases from our own account where a situation that could be improved has been detected. It explains the best practice applicable to that specific case and provides a guide on how to apply it. It is extremely interesting and a lifesaver for many account administrators.
As a final point in this section, they revisited the BUDGETING view (in Public Preview on AWS), where users can preview and compare budgeted and actual spending over time. This budget tracking can be automated with email or message notifications when certain non-compliance thresholds are exceeded. Budgets themselves can be configured individually for resources (by Database, Schema, Table, or Warehouse).
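The budget tracking described above boils down to comparing actual spend against a per-resource limit and raising an alert past a threshold. The following is a minimal sketch of that logic in plain Python. It does not use any Snowflake API, and the `Budget` class, the 80% default threshold, and the resource names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    resource: str            # hypothetical warehouse, database, schema, or table name
    monthly_limit: float     # budgeted credits for the period
    notify_at: float = 0.8   # assumed threshold: alert at 80% of the limit


def check_budgets(budgets, actual_spend):
    """Return one alert message per budget whose spend crossed its threshold."""
    alerts = []
    for budget in budgets:
        spent = actual_spend.get(budget.resource, 0.0)
        ratio = spent / budget.monthly_limit
        if ratio >= budget.notify_at:
            alerts.append(
                f"{budget.resource}: {spent:.0f} of "
                f"{budget.monthly_limit:.0f} credits used ({ratio:.0%})"
            )
    return alerts


budgets = [Budget("ANALYTICS_WH", 100.0), Budget("SALES_DB", 200.0)]
spend = {"ANALYTICS_WH": 85.0, "SALES_DB": 50.0}

for message in check_budgets(budgets, spend):
    print(message)  # in practice this would go out by email or chat message
```

Only the resource that crossed its threshold produces a message, so each resource can carry its own limit and sensitivity.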
From this point on, we delve into the trending topic of Artificial Intelligence, Large Language Models, and Machine Learning. We all know this trend is here to stay and has already become a paradigm shift that will impact all aspects of our lives.
Snowflake wasn’t going to be left behind, reacting months ago with the acquisition of Neeva and swiftly incorporating the expertise of its professionals into the core of the Data Cloud.
Enhancing machine learning in the Snowflake Ecosystem
Snowflake’s foray into Machine Learning (ML) within the Snowflake Ecosystem is marked by the upcoming General Availability of the ML MODELING API. This introduction brings key features such as FEATURE ENGINEERING, TRAINING, the SNOWFLAKE MODEL REGISTRY, and the SNOWFLAKE FEATURE STORE (currently in Private Preview), which aim to redefine ML capabilities and expand the potential of data-driven decision-making across the Snowflake Ecosystem.
In order to ease the transition for Data Analysts and Data Scientists familiar with conducting sampling and training within Notebooks, Snowflake will introduce SNOWFLAKE NOTEBOOKS (currently in Private Preview). Developed using Streamlit, this new interface will allow development in cells that support Python code, Streamlit, SQL, and Markdown—providing a seamless experience for data professionals working across different stages of the ML workflow.
For more advanced ML applications that require additional functionalities or power beyond the API, Snowflake proposes using SNOWPARK CONTAINER SERVICES (soon to be in Public Preview). This service allows entire applications developed in Snowpark to be published in containers, similar to Kubernetes, and run within the Snowflake Ecosystem, leveraging optimized capabilities for flexible and efficient computing.
SNOWFLAKE CORTEX: Elevating AI accessibility
Snowflake’s total commitment to bringing the adoption of Artificial Intelligence to its customers’ workloads (AI for Everyone) has materialized in the CORTEX engine. It comprises a set of services fully managed by Snowflake (Serverless) that provide quick and easy access to cutting-edge LLM and AI models in the industry, helping democratize their use.
SPECIALIZED FUNCTIONS (In Private Preview): These functions aim to incorporate Translation, Sentiment Analysis, Summarization, and Extraction of Answers capabilities into queries and applications.
GENERAL FUNCTIONS, which build on leading LLMs such as LLAMA2, were also introduced. Well-known functions like LLM Inference, Vector as Native Data Type for Llama2, Complete, Txt2SQL, EMBED_TEXT, Vector_L2_Distance, etc., are part of this offering.
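To show what a function like Vector_L2_Distance computes conceptually, here is a small pure-Python sketch of Euclidean (L2) distance used for nearest-neighbor lookup over embeddings. It is not the Cortex function itself. The helper names and the three-dimensional toy vectors are assumptions, since real embeddings would come from a model and have many more dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, documents):
    """Return the key of the document whose embedding is closest to the query."""
    return min(documents, key=lambda name: l2_distance(query, documents[name]))


# Toy, hand-written "embeddings" used only for illustration.
documents = {
    "returns policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.1],
    "gift cards": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

print(nearest(query, documents))
```

A smaller distance means the texts are closer in embedding space, which is what makes a native vector type and a distance function useful together for semantic search.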
The demonstration of these features showcased Snowflake’s substantial effort in making CORTEX a differentiating element compared to its competitors.
Scale with Applications: Native integration evolution
As a follow-up to the announcements made at the SUMMIT regarding the integration of native applications in Snowflake, they provided a few more updates:
As part of the application development lifecycle, they announced DATABASE CHANGE MANAGEMENT (in Private Preview). This feature will allow the execution of scripts directly against our account from a Git repository (thanks to the integration with GitHub announced at the SUMMIT). It even supports DML statements, and it enables us to manage CI/CD for development and the underlying Snowflake data model with incremental change analysis, etc.
Another feature related to this application integration announced at the event was the NATIVE APP FRAMEWORK (in Private Preview, soon to be in General Availability on AWS). It’s an environment for deploying and consuming native applications within the Data Cloud itself. To accelerate its development, they announced a $100 million investment in startups to assist in the development of these Native Apps.
Future-Proofing your business with the Snowflake Ecosystem’s innovations
As we eagerly anticipate further announcements at the year-end partner event, Snowflake has undeniably delivered robust features. These developments solidify Snowflake’s position as a trailblazer in the cloud data platform space. The platform’s unwavering focus on performance, governance, AI integration, and financial control positions Snowflake as a comprehensive solution for tech experts and organizations navigating the evolving landscape of data management.
Snowflake’s continuous innovation is transforming the data landscape, and we are committed to keeping you informed at every step of this transformative journey.