Why you should pair your powerful data warehouse with a collaborative data workspace
Quick level set so all readers are on the same page:
❄️Snowflake is a powerful and fast SQL data warehouse.
⬡ Hex is a collaborative data workspace for doing analytics and data science with Python and SQL.
There are many reasons they work well together, but in the spirit of being hex-y, we'll keep it to just six here.
Hex supports a vast selection of integrations and data sources, but we're especially proud to be a Snowflake Elite Technology Partner. So why do Snowflake and Hex work so well together?
The shape of water molecules causes nearly all ice crystals to form a six-sided structure, so although some snowflakes have 6 points rather than true sides, they're still pretty much hexagons. Hex loves hexagons (and all things that come in sixes). Do you really need 5 more reasons?
Many Python-based data science tools lack true SQL support, which means they also lack robust database configuration options. This presents not only user experience problems, but legitimate security concerns— there are often credentials floating around in environment variables or even worse, being copied and pasted into local notebooks and accidentally pushed to git. Either way, they're being handled regularly and by many people— this is generally Not Good.
Hex provides a secure connection process that, once configured by an administrator, allows users to easily query data without needing to handle usernames, passwords, or private keys. A Snowflake connection can be configured at the workspace level, making it available to all users (or specific groups of users), or at the project level where only the members of one project have access.
There's nothing worse than troubleshooting authentication issues across packages and tools. Configure your Snowflake connection once in Hex, and never worry about it again.
Even if you're confident that credentials are securely managed in your Python notebook, you don't want to write SQL in block quotes inside of a notebook cell, without syntax and error highlighting or any of the features you expect from a SQL IDE.
But users of notebooks don't tend to have a choice. If you want to get your data from a DB into your notebook, you're typically forced to blunder through this process. This leads to two problems.
First, SQL just takes longer to write without error highlighting, autocompletion, and proper formatting. It's less fun, you make more mistakes, and queries are harder to debug.
Second, to minimize writing SQL (see point #1), folks sometimes just run a SELECT * FROM all_the_tables
at the top of the notebook and then work with an in-memory version of the data. This, of course, is also slow— your local Python environment, Mac M1 chip and all, can't hope to compete with Snowflakes' enormous and optimized infrastructure.
This is what makes Hex such a perfect partner to Snowflake. Hex's data workspace includes true first-class support for both SQL and Python. You don't need to use a Python connector to query data from Snowflake; Just add a new SQL cell and write queries against your Snowflake data warehouse with full auto-complete, a built-in schema browser, smart formatting, and caching options. It's a night-and-day comparison.
This means you can easily and seamlessly make queries to Snowflake at any point in your project, swapping between Python and SQL wherever it makes sense. Analysts don't even need to switch between Hex and the Snowflake IDE to prototype queries; with the Hex schema browser, they can do all their work in one place.
Which leads to the next point...
Hex not only embraces SQL as a first-class citizen, it provides connection points between SQL and Python cells in a Hex project. You can parameterize SQL queries with Python variables from the rest of a project, allowing both for dynamic user input in queries and extremely complex chained queries— Or even queries that span multiple data sources.
Marketing team provide a .csv of the latest event attendees again rather than ETLing the data? No problem. Just load the file into a Hex project, read it as a dataframe, and use that as input to parameterize a Snowflake query:
You can even use complex jinja templating to build deeply complex queries with for loops, conditional execution, and more. Again, this lets data practitioners leverage the power and speed of Snowflake's distributed computing architecture, while maintaining the flexibility of a Python environment.
Hex powers complex analytical workloads, but it also serves data up to end users as curated and interactive data apps. These users expect low query latency and app-like responsiveness. Snowflake's performance and scalability means data scientists can run queries on enormous datasets behind the scenes, while still exposing a performant and usable interface to that data via a Hex app.
Hex also includes powerful caching so you don't accidentally re-run slow or costly queries. You can choose whether to always cache a query for end users, cache a query on a set schedule, or let users always run a fresh query.
Hex comes with built-in support for Snowflake’s Snowpark for Python API, which lets users write familiar Python code that’s executed on Snowflake, taking advantage of the massive scalability and distributed compute of the cloud data warehouse.
Under the hood, Snowpark operates with DataFrames just like those used by pandas or other Hex cells, with one major difference: Snowpark DataFrames are executed lazily, so data isn’t pulled into Hex until a user runs an evaluation on it. This means that for many workflows, Snowpark DataFrames are more efficient than Pandas DataFrames, and can help avoid memory limits common in traditional notebook workflows.
Snowpark can be an involved process to configure in other environments (see point #2). With Hex, it literally takes just two clicks. You can even get a Snowpark dataframe as the returned result from any SQL cell— and pass them into native Hex tables and charts to take full advantage of Snowpark's pushdown compute.
Want to learn more about Hex's Snowpark support? Check out this hands-on lab that guides you through a real machine learning project using Snowpark and Hex.