How to Query JSON Data in Snowflake

Published: December 13, 2024

down-chevron

Nikesh Vora

Technical Product Manager @ Coefficient

Desktop Hero Image Mobile Hero Image

Working with JSON data in Snowflake requires more than just basic SQL knowledge. Snowflake provides powerful, built-in functions to parse, extract, and manipulate JSON data.

This guide will explain the essential concepts and best practices for querying JSON data in Snowflake, helping you unlock the full potential of your Snowflake Data Cloud.

Querying JSON Data in Snowflake

Snowflake offers a range of built-in functions specifically designed for working with JSON data. These functions allow you to parse, extract, and manipulate JSON elements with ease.

Some of the most commonly used JSON functions in Snowflake include:

  • PARSE_JSON: Parses a JSON string and returns a JSON object. Its useful for converting text data into a usable JSON format.
  • GET_PATH: Extracts a value from a JSON object using a specified path. This is handy for pulling out specific fields from a complex JSON structure.
  • FLATTEN: Flattens nested JSON data into a tabular format. It is instrumental when you need to convert hierarchical data into a flat table for analysis.
  • ARRAY_SIZE: Returns the number of elements in a JSON array. This function helps when you need to understand the size and structure of your JSON arrays.
  • OBJECT_KEYS: Returns an array containing the keys of a JSON object. It allows you to iterate over or inspect the keys within a JSON object easi

Extract Values from JSON Objects

One of the most common tasks when working with JSON data is extracting specific values from JSON objects. Snowflake’s get_path function simplifies this process. Consider the following example:

SELECT

    get_path(json_data, ‘name’) AS name,

    get_path(json_data, ‘age’)::NUMBER AS age,

    get_path(json_data, ‘address’, ‘city’) AS city

FROM my_table;

This query uses  get_path to extract the name, age, and city values from the JSON object stored in the json_data column. The ::NUMBER cast is used to convert the age value to a numeric data type.

Flatten-Nested JSON Structures

JSON data often contains nested structures, which can be challenging to query directly. Snowflake’s flatten function helps overcome this challenge by transforming nested JSON into a tabular format.

Here’s an example:

SELECT

    f.value:first_name::STRING AS first_name,

    f.value:last_name::STRING AS last_name,

    f.value:age::NUMBER AS age

FROM my_table,

LATERAL FLATTEN(input => json_data:employees) f;

In this query, we use LATERAL FLATTEN in combination with the flatten function to unnest the employees array within the JSON data. Each element of the array is treated as a separate row, allowing us to access the first_name, last_name, and age values easily.

Handling JSON Arrays

JSON arrays are another common structure found in JSON data. Snowflake provides functions like array_size and array_contains to work with JSON arrays effectively. Consider the following example:

SELECT

    json_data:name::STRING AS name,

    array_size(json_data:hobbies) AS num_hobbies,

    array_contains(json_data:hobbies, ‘reading’) AS likes_reading

FROM my_table;

In this query, we extract the name value from the JSON object and use array_size to count the number of elements in the hobbies array. We also use array_contains to check if the hobbies array contains the value ‘reading’.

Integrating Coefficient with Snowflake

To further streamline your JSON data querying process in Snowflake, consider using Coefficient.

Coefficient is a data connector for spreadsheets. It allows users to import Snowflake data to Google Sheets or Excel for quick ad hoc analyses.  

Here’s a step-by-step guide on how you can query JSON data in Snowflake using Coefficient:

Step 1: Install Coefficient

Before getting started, install Coefficient for your spreadsheet. It only takes a few seconds, and you only have to do it once.

get started connected snowflake with spreadsheets

Step 2: Connect to Snowflake

Open Coefficient and click “Import from…”  

connect snowflake to spreadsheets

Choose Snowflake from the list of available data sources.

Enter your account credentials (Snowflake Account Name, Database Name, Username, Password, and Warehouse Name).

Step 2: Import Your Data

After connecting, Click “Import from” > “Snowflake” > “Custom SQL Query”.

Add your Snowflake query in the blue text area of the import preview window or use AI to write your query.

snowflake query data in spreadsheets

For JSON data, your query would include SQL functions to parse the JSON fields.

Here is an example SQL query to parse JSON data in Snowflake:

SELECT

    my_column:jsonField1 AS parsed_jsonField1,

    my_column:jsonField2 AS parsed_jsonField2

FROM my_table

WHERE my_column:jsonField3 = ‘some_value’;

Review the import preview and click “Import” to fetch the results into your spreadsheet.

snowflake data pull into google sheets

Step 4: Set up Auto-Refresh

Set up scheduled imports and snapshots to keep your data up-to-date automatically

Choose an update frequency (hourly, daily, or weekly) and specify a time for the data to refresh automatically.

auto refresh snowflake data in spreadsheets

Advanced JSON Querying Techniques

Snowflake offers several advanced techniques for querying JSON data, allowing you to handle complex scenarios efficiently.

Using JSON_EXTRACT for Complex Paths

When dealing with deeply nested JSON structures, the JSON_EXTRACT function provides a more flexible way to extract values. It allows you to specify a path using dot notation or bracket notation. Here’s an example:

SELECT

    JSON_EXTRACT(json_data, ‘$.employee.personal_info.first_name’) AS first_name,

    JSON_EXTRACT(json_data, ‘$.employee.work_info.department’) AS department

FROM my_table;

In this query, we use JSON_EXTRACT with the $ symbol to represent the root of the JSON object, followed by the path to the desired values.

Querying JSON with SQL/JSON Functions

Snowflake supports SQL/JSON functions, which provide a standardized way to query JSON data. These functions offer additional flexibility and can be particularly useful for complex queries. Here’s an example:

SELECT

    JSON_QUERY(json_data, ‘$.employee.personal_info’) AS personal_info,

    JSON_VALUE(json_data, ‘$.employee.work_info.department’) AS department

FROM my_table;

Optimizing JSON Queries in Snowflake

To optimize JSON queries in Snowflake, follow these best practices:

  1. Cast JSON Values to Appropriate Data Types:
    • Cast numeric values to NUMBER, INTEGER, or DECIMAL
    • Use VARCHAR or STRING for string values
    • Cast boolean values to BOOLEAN
  1. Leverage partitioning and clustering:
    • Partition large JSON tables using PARTITION BY on frequently queried columns
    • Use CLUSTER BY to co-locate similar data within partitions
    • Choose partitioning and clustering keys based on common query patterns and filters
  1. Use materialized views:
    • Create materialized views for frequently queried JSON data with complex transformations
    • Define materialized views using CREATE MATERIALIZED VIEW statement
    • Refresh materialized views periodically or on-demand
  1. Avoid unnecessary data scanning:
    • Use filters and predicates on indexed columns or partitioning keys
    • Apply WHERE clause to filter out unwanted JSON records early
    • Avoid LIKE or REGEX predicates on large datasets
    • Minimize CROSS JOIN or CARTESIAN JOIN operations on large JSON datasets

Unlocking Snowflake with Coefficient

By mastering Snowflake’s JSON querying capabilities, you can unlock valuable insights from your semi-structured data. To tap into Snowflake’s full potential, consider using Coefficient.

With Coefficient, you can give your business secure, easy access to live data from Snowflake.

Get started with Coefficient for free today to try it for yourself.