Friday, October 29, 2021

Activity schema: data modeling using a single table
Cedric Dussud
Cedric will present a new data modeling approach called the activity schema. It can answer any data question using a single time-series table (only 11 columns and no JSON). Instead of facts and dimensions, data is modeled as a customer doing an activity over time. This approach works for any business data used for BI.

This approach has three fundamental benefits over dimensional modeling:

  1. Single modeling layer. All aggregations, metrics, materialized views for BI, etc., are built directly from the single activity schema table. This means the only dependency is the raw source data.
  2. No more foreign key joins. Queries use relationships in time to relate activities together. This means that any data in the warehouse can be directly combined with any other data, without having to create foreign keys between them.
  3. Open source analyses. The activity schema prescribes one fixed table structure, so the data is structurally the same no matter who models it. This allows analyses or queries to be shared directly between companies.
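
To make the second point concrete, here is a minimal Python sketch of relating two activities through time rather than through a foreign key: for each "visited_site" row, it looks up the same customer's first "placed_order" row that happens afterwards. The column names (customer, ts, activity), the sample rows, and the helper names first_after and conversion_rate are assumptions made for this illustration; they are not taken from the activity schema specification, and a real warehouse would express this in SQL.

```python
from datetime import datetime

# Hypothetical activity stream: one row per customer activity.
# Column names are illustrative, not the official activity schema columns.
ACTIVITIES = [
    {"customer": "a@example.com", "ts": datetime(2021, 10, 1, 9, 0), "activity": "visited_site"},
    {"customer": "a@example.com", "ts": datetime(2021, 10, 1, 9, 30), "activity": "placed_order"},
    {"customer": "b@example.com", "ts": datetime(2021, 10, 2, 14, 0), "activity": "visited_site"},
    {"customer": "a@example.com", "ts": datetime(2021, 10, 3, 8, 0), "activity": "visited_site"},
    {"customer": "a@example.com", "ts": datetime(2021, 10, 5, 12, 0), "activity": "placed_order"},
]


def first_after(rows, primary, secondary):
    """For each `primary` activity, find the same customer's first
    `secondary` activity that occurs strictly later in time."""
    result = []
    for p in rows:
        if p["activity"] != primary:
            continue
        # The only link between the two activities is the customer and the timestamp.
        later = [
            s["ts"]
            for s in rows
            if s["activity"] == secondary
            and s["customer"] == p["customer"]
            and s["ts"] > p["ts"]
        ]
        result.append({
            "customer": p["customer"],
            "primary_ts": p["ts"],
            "secondary_ts": min(later) if later else None,
        })
    return result


def conversion_rate(rows, primary, secondary):
    """Share of `primary` activities followed by at least one `secondary` activity."""
    pairs = first_after(rows, primary, secondary)
    if not pairs:
        return 0.0
    converted = sum(1 for r in pairs if r["secondary_ts"] is not None)
    return converted / len(pairs)


if __name__ == "__main__":
    for row in first_after(ACTIVITIES, "visited_site", "placed_order"):
        print(row)
    print(conversion_rate(ACTIVITIES, "visited_site", "placed_order"))
```

Because the lookup depends only on the customer and the ordering of timestamps, the same function works for any pair of activities in the table, which is the property the second benefit describes.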