Designing data models

How to model data

Before creating a model, sit down with the people who will consume the explorations built on it, so you understand which questions they want to answer. These interview sessions help you identify:

  • The schema of your model

  • The granularity of your model data

  • The raw data you will use as input for your model

Once you know what your model output should be, you can build it with one of Whaly's two built-in tools: Flow models or SQL models.

Flow models

You can think of flow models as a building-block interface designed to reshape data. They are useful for data-savvy users who need to build models without relying on a programming language such as SQL.

Flow models are easy to get started with, auditable, and easy to modify. They are a great choice for low- to medium-complexity models.

SQL models

SQL models let you go beyond what flows can do and give you the full power of a proven data programming language. Whaly SQL models use the SQL dialect of your data warehouse.

SQL models are a great choice when building complex models.

Whether you use Flow models or SQL models, the following tools are available:

  • Joining or combining data from different data sources

  • Transforming values: renaming columns, filtering data, cleaning values, bucketing values, ...

  • Filtering out data, either an entire row or an entire column

  • Aggregating or breaking down data in order to change the grain

Combining the tools above will give you the ability to easily reshape raw data into a new model that makes sense for your business and your users.
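As a rough illustration of how these tools combine in a SQL model, the sketch below joins two raw tables, filters rows, and aggregates to a coarser grain. It runs the query with Python's built-in `sqlite3` module only so the example is self-contained. In Whaly the query would be written in your warehouse's SQL dialect, and the table names, column names, and values here (`orders`, `customers`, `revenue_by_country`, and so on) are hypothetical.

```python
import sqlite3

# Hypothetical raw tables; names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL, status TEXT);
CREATE TABLE customers (customer_id INTEGER, country TEXT);
INSERT INTO orders VALUES
  (1, 10, 50.0, 'paid'),
  (2, 10, 30.0, 'paid'),
  (3, 11, 20.0, 'cancelled'),
  (4, 12, 70.0, 'paid');
INSERT INTO customers VALUES (10, 'FR'), (11, 'FR'), (12, 'US');
""")

# One row per order goes in; one row per country comes out.
MODEL_SQL = """
SELECT
  c.country     AS country,
  COUNT(*)      AS paid_orders,
  SUM(o.amount) AS revenue
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id  -- join two sources
WHERE o.status = 'paid'                               -- filter out rows
GROUP BY c.country                                    -- change the grain
ORDER BY c.country
"""

revenue_by_country = conn.execute(MODEL_SQL).fetchall()
print(revenue_by_country)  # [('FR', 2, 80.0), ('US', 1, 70.0)]
```

The input has one row per order, while the output has one row per country: this is what changing the grain means in practice.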

Common modeling operations

When creating data models, we often perform the following operations:

  • Cleaning, for example removing test values or duplicates.

  • Normalizing data, for example renaming values to apply a consistent naming convention across datasets, or making sure data share the same units.

  • Combining data, for example merging similar data into a single new table.

  • Calculating complex values, to simplify the most complex operations or apply your business rules.

  • Creating links between your data model and the rest of the data you will analyze.
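The operations above can be sketched in a single query. This is a minimal example under stated assumptions, not a prescribed Whaly pattern: the tables `web_sales` and `store_sales`, their columns, and the "large order" threshold of 20 are all hypothetical, and SQLite (via Python's `sqlite3`) stands in for your warehouse's SQL dialect.

```python
import sqlite3

# Hypothetical sources: web amounts are stored in cents, store amounts in units.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE web_sales (email TEXT, amount_cents INTEGER);
CREATE TABLE store_sales (email TEXT, amount REAL);
INSERT INTO web_sales VALUES
  ('ana@example.com', 1250),
  ('ana@example.com', 1250),
  ('test@example.com', 100);
INSERT INTO store_sales VALUES ('Bob@Example.com', 40.0);
""")

MODEL_SQL = """
WITH combined AS (
  -- Normalizing: lower-case emails, convert cents to units
  SELECT LOWER(email) AS email, amount_cents / 100.0 AS amount, 'web' AS source
  FROM web_sales
  UNION ALL  -- Combining: merge similar data into one table
  SELECT LOWER(email) AS email, amount AS amount, 'store' AS source
  FROM store_sales
)
SELECT DISTINCT  -- Cleaning: drop exact duplicate rows
  email,
  amount,
  source,
  -- Calculating: a made-up business rule
  CASE WHEN amount >= 20 THEN 'large' ELSE 'small' END AS order_size
FROM combined
WHERE email NOT LIKE 'test@%'  -- Cleaning: remove test values
ORDER BY email
"""

clean_sales = conn.execute(MODEL_SQL).fetchall()
print(clean_sales)
# [('ana@example.com', 12.5, 'web', 'small'), ('bob@example.com', 40.0, 'store', 'large')]
```

`SELECT DISTINCT` only removes rows that are identical in every column. With real data you would more likely deduplicate on a stable identifier such as an order ID.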
