LogoLogo
HomeDocumentation
  • 🐳Using Whaly Guides
  • Core concepts
    • 📚Getting started
      • Data stack architecture
      • Consumers vs Builders
      • Data layers in Whaly
      • License Mapping
    • 🪄Data modeling
      • Understanding data models
      • Designing data models
      • Common modeling patterns
        • Event schema
      • Maintaining data models
      • Data models best practices
    • 🖌️Explorations
      • Understanding Explorations
      • Designing Explorations
      • Maintaining Explorations
      • Mistakes to avoid
  • Training
    • 👁️For viewers
    • 👩‍💻For editors
    • 🧙For builders
      • Setting up the training material
      • Creating a chart
      • Using and editing explorations
      • Filtering a dashboard
      • Creating explorations and models
  • Inspiration
    • 🗒️Use cases
      • Billing / Invoicing
      • Customer success
      • Fundraising
      • Marketing
      • Partnerships
      • Product
      • Sales
      • Strategy
    • 💬Communication
    • 💡Tips
  • Recipes
    • 🤝Customer care
      • How to build a 360° customer dashboard
    • 🏦Finance
      • Modeling your recurring revenue
        • SQL for simplified MRR calculation
        • SQL for advanced MRR calculation
    • 📣Marketing
      • Track your entire Marketing Funnel
      • Calculate your Customer Acquisition Cost
      • Create a partner dashboard
    • 💼Sales
      • Analyze the impact of your Sales velocity on your closing rate
      • Create a sales performance dashboard
      • Build a target oriented sales dashboard
  • Misc
    • 🧐SQL Fanout
    • 📦Backup your data using BigQuery
    • ☁️Embedding reports in Salesforce
    • 👨‍💻Useful SQL operations
      • Flattening categories
Powered by GitBook
On this page
  • Input
  • Output
  • Example query
  1. Misc
  2. Useful SQL operations

Flattening categories

Input

When dealing with categories it is common to encounter data that may be under such format :

id
toppings

1

salad;tomatoes;onions

2

cheese;bacon

3

null

Output

Ideally, the structure we would like to have to enable data analysis on such data format would be the following :

id
toppings

1

salad

1

tomatoes

1

onions

2

cheese

2

bacon

Example query

Let's see an example in order to achieve this result :

-- this is our fake database
WITH
  database AS (
    SELECT
      1 AS id,
      "salad;tomatoes;onions" AS toppings
    UNION ALL
    SELECT
      2 AS id,
      "cheese;bacon" AS toppings
    UNION ALL
    SELECT
      3 AS id,
      null AS toppings
  )
  
-- this is the output of the model
-- 1. we start by selecting our two columns "id" and "toppings"

-- 2. then we split the toppings using the ; delimiter character
--    this will generatate an array
--    https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#split

-- 3. Then we unnest the array, and flatten it using unnest and cross join
--    this will give us the expected output format
--    https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#flattening_arrays
SELECT
  id,
  toppings
FROM
  database
  CROSS JOIN UNNEST(SPLIT(toppings, ";")) as toppings
with
  database as (
    select
      1 as id,
      'salad;tomatoes;onions' as toppings
    union all
    select
      2 as id,
      'cheese;bacon' as toppings
    union all
    select
      3 as id,
      null as toppings
  )
select d.id, f.value::string as toppings
from database d,
lateral flatten
(input=>split(d.toppings, ';')) f
PreviousUseful SQL operations

Last updated 2 years ago

👨‍💻