Google BigQuery as a Source

Google BigQuery is a cloud-based data warehouse and analytics platform provided by Google Cloud. It uses a columnar storage format, which allows for fast query performance and efficient data compression, making it well suited to large-scale analytical workloads.

Authorize the Connection to Google BigQuery

To connect Google BigQuery as a data source, first authorize the connection to your Google BigQuery account.

Create a Google BigQuery Data Source

  1. On the Sources page, click on the Create Source button and select the connector from the list.
  2. From the drop-down menu, choose your account.
    Didn't find your account?

    Click on Add new Account at the bottom of the drop-down and follow the on-screen prompts. You can also go to the Authorizers tab and click on Add New Service.

  3. Name your data source and select your
    1. Google BigQuery account
    2. Project
    3. Dataset
    4. Table
    5. Columns
  4. On the Configuration tab, decide whether you want to Use Read Session Mode. (For the trade-offs between the two options, see the Limitations section.)
    1. Yes: An auto-generated SQL statement will be prepared for you. You can use a WHERE clause to limit the data.
    2. No: Fill in your custom SQL statement.
  5. Configure your snapshotting preferences. Choose your sync frequency or the exact synchronization time under Show advanced settings.
    DATADDO TIP

    If you need to load historical data, please refer to the Data Backfilling article.

  6. Preview your data by clicking on the Test Data button in the top right corner. You can adjust the date range for a more specific time frame.
  7. Click on Save and congratulations, your new data source is ready!
WARNING

Dataddo relies on your query statements to accurately retrieve the right data. By choosing specific statements, you guide Dataddo's actions on your data.

Every command in your query statement directly affects your database. Take special care with commands like DELETE or UPDATE, which modify your data and are therefore far more impactful than a simple SELECT.

SQL Statement Examples

  1. Extract all data from the clients table:
    SELECT * FROM clients
    
  2. Extract the name, email, and company columns from the clients table:
    SELECT name, email, company FROM clients
    
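  3. Extract a specific range of rows (skip the first 240,000 rows and return the next 560,000, that is, up to row 800,000) from all columns of the clients table. BigQuery expects the LIMIT count OFFSET skip form rather than the comma-separated form used by some other databases:
    SELECT * FROM clients
    LIMIT 560000 OFFSET 240000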
    
  4. Extract all columns for clients in a specific country from the clients table:
    SELECT * FROM clients WHERE countryCode = 'USA'
    
  5. Extract all columns from the clients table for rows where the job title contains a specific position or the company is Dataddo. Pattern matching with % wildcards requires LIKE rather than =:
    SELECT * FROM clients WHERE jobTitle LIKE '%Manager%' OR company = 'Dataddo'
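
Row-range queries such as example 3 only return a stable set of rows when the result is sorted, because BigQuery does not guarantee row order without an ORDER BY clause. The following is a minimal sketch, assuming the clients table has a unique column to sort by; the column name client_id is hypothetical and not part of the examples above.

```sql
-- Skip the first 1,000 rows and return the next 500, in a repeatable order.
-- client_id is a hypothetical unique column; replace it with a column
-- that exists in your table.
SELECT *
FROM clients
ORDER BY client_id
LIMIT 500 OFFSET 1000
```

Sorting by a unique column means repeated runs skip and return the same rows, as long as the underlying data has not changed.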
    


Limitations

Read Session Mode Limitations

If you decide to use an auto-generated SQL statement, READ SESSION mode will be applied. This has the following advantages and disadvantages.

Advantages:
  • Speed: Up to 5 times faster data extraction, because large amounts of data are read in a highly efficient manner.
  • Limit data: Specify a WHERE clause to limit data to a specific date range or to extract records with specific values in a column.

Disadvantages:
  • No complex queries: Since you can only specify a WHERE clause, more complex operations (e.g. joins, aggregations, and subqueries) won't be possible.
  • Limitation with views: Because a view is essentially a saved SQL query (that is, it generates its data on the fly), it can't be read in READ SESSION mode.

Please keep these trade-offs in mind when deciding between using an auto-generated or a custom SQL query.
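
As a rough illustration of the trade-off, the sketch below contrasts a filter-only query, which is the kind of restriction a WHERE clause can express, with a query that joins and aggregates and therefore needs a custom SQL statement. The orders table, its client_id and amount columns, and the id column on clients are hypothetical names used only for this example. How the WHERE condition is entered in the Dataddo interface is not shown here.

```sql
-- Filter only: a restriction that a WHERE clause alone can express.
SELECT *
FROM clients
WHERE countryCode = 'USA';

-- Join plus aggregation: requires a custom SQL statement.
-- orders, client_id, amount, and clients.id are hypothetical names.
SELECT c.company, SUM(o.amount) AS total_amount
FROM clients AS c
JOIN orders AS o ON o.client_id = c.id
GROUP BY c.company;
```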

Troubleshooting

Context Deadline Exceeded Error

ERROR CODE

rpc error: code = DeadlineExceeded desc = context deadline exceeded

This issue may be caused by extracting data over an extended timeframe. Use WHERE or LIMIT clauses in your SQL query to manage the size and scope of the data extraction.

  1. Use WHERE to specify the date range.
    SELECT * FROM your_table
    WHERE date_column BETWEEN '202X-01-01' AND '202X-01-31';
    
  2. Use LIMIT to specify the maximum number of records to return.
    SELECT * FROM your_table
    LIMIT 1000;
    
  3. Combine WHERE and LIMIT for more precise control. The query in this example will return at most 1,000 records where the date is after January 1, 202X. Without an ORDER BY clause, which matching records are returned is not guaranteed.
    SELECT * FROM your_table
    WHERE date_column > '202X-01-01'
    LIMIT 1000;
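
For a source that syncs on a schedule, a rolling date window can keep each extraction small without editing the dates by hand. The following is a minimal sketch, assuming date_column is of type DATE; your_table and date_column are the same placeholders used in the examples above, and the 30-day window is an arbitrary choice.

```sql
-- Return only records from the last 30 days, relative to the current date.
-- Assumes date_column has the DATE type.
SELECT *
FROM your_table
WHERE date_column >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY);
```

Because BigQuery stores data by column, listing only the columns you need instead of using SELECT * can further reduce the amount of data read.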
    

Related Articles

Now that you have successfully created a data source, see how you can connect your data to a dashboarding app or a data storage.

Sending Data to Dashboarding Apps

Sending Data to Data Storages
