Required SQL knowledge: WITH clause, MIN and MAX functions, INNER JOIN, BigQuery array manipulation.
To model our revenue, we will write a query that executes the following steps:
Retrieve all subscriptions
Calculate the oldest and most recent subscription dates
Generate a list of all the months between our oldest and most recent dates
Join our months with our subscriptions
In the end, our modeled dataset will look like this:
Retrieve all subscriptions
Let's start by querying all of our subscriptions (you may need to update the subscriptions table in the from clause):
-- all subscriptions
with subscriptions as (
  select
    subscription_id,
    customer_id,
    monthly_amount,
    start_date,
    end_date,
  from ${TABLES["Subscriptions"]["subscriptions"]}
)
select * from subscriptions
Run this query: you should see the contents of your subscriptions table as the result.
Calculate the oldest and most recent subscription dates
Then we will need to calculate our oldest start date and most recent end date. Let's do it using the min and max functions:
...
-- min and max subscriptions dates
date_limits as (
  select
    min(start_date) as min_date,
    max(end_date) as max_date
  from subscriptions
)
select * from date_limits
Run this query: you should see the min and max subscription dates.
Generate a list of all the months between our oldest and most recent dates
Now we will generate the list of months between our min_date and our max_date:
...
-- array of month between min and max subscriptions dates
months_array as (
  select
    GENERATE_DATE_ARRAY(
      CAST(min_date as DATE),
      CAST(max_date as DATE),
      INTERVAL 1 month
    ) as arr
  from date_limits
),
-- list of month between min and max subscriptions dates
months as (
  select CAST(month as TIMESTAMP) as month
  from months_array, UNNEST(months_array.arr) as month
)
select * from months
Run this query: you should see the list of months between our min_date and max_date.
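One detail worth checking on your own data: GENERATE_DATE_ARRAY steps forward from the start date itself, so the generated months fall on the same day of the month as min_date. The sketch below is a standalone illustration with made-up literal dates standing in for min_date and max_date; the DATE_TRUNC variant is an assumption about what you may want, not part of the query above.

```sql
-- Hypothetical dates: 2023-01-15 plays the role of min_date,
-- 2023-04-10 the role of max_date.
select 'as written' as variant, d as month_start
from unnest(
  generate_date_array(date '2023-01-15', date '2023-04-10', interval 1 month)
) as d
union all
-- Assumed variant: truncate the start date to the first day of its month.
select 'truncated to month start' as variant, d as month_start
from unnest(
  generate_date_array(
    date_trunc(date '2023-01-15', month),
    date '2023-04-10',
    interval 1 month
  )
) as d
order by variant, month_start
```

If your subscriptions always start on the first day of a month, both variants give the same list. If they do not, the truncated variant may be the safer choice, since the join in the next step compares each month against start_date and end_date.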
Join our months with our subscriptions
Let's write the final step, which will use an inner join between our subscriptions and our months tables in order to generate one line per active subscription month.
The entire SQL query should now be as follows:
-- all subscriptions
with subscriptions as (
  select
    subscription_id,
    customer_id,
    monthly_amount,
    start_date,
    end_date,
  from ${TABLES["Subscriptions"]["subscriptions"]}
),
-- min and max subscriptions dates
date_limits as (
  select
    min(start_date) as min_date,
    max(end_date) as max_date
  from subscriptions
),
-- array of month between min and max subscriptions dates
months_array as (
  select
    GENERATE_DATE_ARRAY(
      CAST(min_date as DATE),
      CAST(max_date as DATE),
      INTERVAL 1 month
    ) as arr
  from date_limits
),
-- list of month between min and max subscriptions dates
months as (
  select CAST(month as TIMESTAMP) as month
  from months_array, UNNEST(months_array.arr) as month
),
-- months joined with active subscriptions months
subscriptions_months as (
  select
    months.month,
    subscriptions.customer_id,
    subscriptions.monthly_amount as mrr,
    subscriptions.start_date,
    subscriptions.end_date
  from subscriptions
  inner join months
    on months.month >= subscriptions.start_date
    and months.month < subscriptions.end_date
)
select * from subscriptions_months
Run the query: we now have one line per active subscription month 🎉
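To see the join condition at work without touching your own tables, here is a minimal sketch that replaces the subscriptions table with two made-up rows. The ids, customers, amounts and dates are all hypothetical, and the month list is built from literal dates rather than from date_limits.

```sql
-- Hypothetical inline sample standing in for the subscriptions table.
with subscriptions as (
  select * from unnest([
    struct(1 as subscription_id, 'c1' as customer_id, 50 as monthly_amount,
           timestamp '2023-01-01' as start_date,
           timestamp '2023-04-01' as end_date),
    struct(2, 'c2', 20, timestamp '2023-02-01', timestamp '2023-03-01')
  ])
),
-- Months from January to April 2023, cast to TIMESTAMP as in the main query.
months as (
  select cast(month as timestamp) as month
  from unnest(
    generate_date_array(date '2023-01-01', date '2023-04-01', interval 1 month)
  ) as month
)
-- Same join condition as subscriptions_months: start inclusive, end exclusive.
select
  months.month,
  subscriptions.customer_id,
  subscriptions.monthly_amount as mrr
from subscriptions
inner join months
  on months.month >= subscriptions.start_date
  and months.month < subscriptions.end_date
order by customer_id, month
```

With this sample, customer c1 should come out with three rows (January to March) and c2 with one (February). The end date is exclusive, so the month in which a subscription ends is not counted.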
Save the query (you can use a combination of month and customer_id as primary key).
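Before relying on month and customer_id as the primary key, it can be worth checking that the pair is actually unique in your data. The sketch below runs the check on a made-up sample in which one customer has two subscriptions active in the same month; modeled_rows is a hypothetical stand-in for the saved dataset.

```sql
-- Hypothetical sample of modeled rows; the last two share month and customer_id.
with modeled_rows as (
  select * from unnest([
    struct(timestamp '2023-01-01' as month, 'c1' as customer_id, 50 as mrr),
    struct(timestamp '2023-02-01', 'c1', 50),
    struct(timestamp '2023-02-01', 'c1', 20)
  ])
)
-- Any row returned here is a (month, customer_id) pair that appears more than once.
select month, customer_id, count(*) as row_count
from modeled_rows
group by month, customer_id
having count(*) > 1
```

An empty result on your real dataset suggests the key is safe. If duplicates do show up, one option is to also select subscription_id in subscriptions_months and include it in the key.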
Build the exploration and the dashboard
By creating an exploration on top of this dataset, we will be able to build the following dashboard: