Ask or search…
K
Links
📦

Backup your data using BigQuery

Objective: in this guide we will learn how to configure BigQuery in order to create daily data backups. This might prove useful if you want to keep an history of your tools data, like your CRM or finance software.
Prerequisite: in oder to follow this guide, you will need to have an admin access to the google.

Backup a table using BigQuery :

1. Enable the BigQuery data transfer API

The first thing we will need to do is enable the BigQuery data transfer API. This allows to schedule queries on a regular basis.
In order to activate the service:
There might be some propagation delay before this service is seen as activated for Google BigQuery

2. Create a new dataset in BigQuery

Now we will need to create a new dataset in BigQuery:
  1. 1.
    Open the BigQuery Console
  2. 2.
    Create a new dataset in your desired project
    1. 1.
      Give it a name, for example if we want to backup our jaffle_shop dataset, we will name it jaffle_shop_backup
    2. 2.
      Set the region to the same one as your initial table
    3. 3.
      Click on create
  3. 3.
    Create a table in the new dataset
    1. 1.
      Give it a name, for example if we want to backup our customer table, we will name it customer_history
    2. 2.
      Click on create

3. Write the backup query

Let's write the query that we will use to backup our table:
  • In the BigQuery console, open the table you want to backup
  • Click on "Query"
  • You should now see a Query looking like the following
SELECT FROM "whaly-temp-migration-guide.jaffle_shop.customer"LIMIT 1000
  • Edit it in order to remove the limit, select every columns in our table and add a new column named snapshot_ts that will be populated with the backup timestamp. The final query should look like
SELECT *, @run_time as snapshot_ts FROM "whaly-temp-migration-guide.jaffle_shop.customer"

4. Schedule the backup

Now we will add our query to the scheduler:
  • Click on Schedule -> Create a new scheduled query
  • Give it a name
  • Set the schedule according to your needs
  • Set the dataset and table to the ones created during step 2
  • Set Append to table under Destination table write preference: this will tell BigQuery to create a full backup of the table each day
  • Click on save 💾

5. Verify the query execution

Now that our query is scheduled, we should check that it's running correctly:
That's it
🎉