Google BigQuery as a Source
Google BigQuery is a cloud-based data warehouse and analytics platform provided by Google Cloud. It uses a columnar storage format, which allows for fast query performance and efficient data compression, making it well-suited for handling large-scale analytical workloads on vast amounts of data.
Authorize the Connection to Google BigQuery
To connect Google BigQuery as a data source, first authorize the connection to your Google BigQuery account.
Create a Google BigQuery Data Source
- On the Sources page, click on the Create Source button and select the connector from the list.
- From the drop-down menu, choose your account.
Didn't find your account? Click on Add new Account at the bottom of the drop-down and follow the on-screen prompts. You can also go to the Authorizers tab and click on Add New Service.
- Name your data source and select your
- Google BigQuery account
- Project
- Dataset
- Table
- Columns
- On the Configuration tab, decide if you want to Use Read Session Mode. (For more information on the difference, see the section on Limitations.)
- Yes: An auto-generated SQL statement will be prepared for you. You can use a WHERE clause to limit the data.
- No: Fill in your custom SQL statement.
- Configure your snapshotting preferences. Choose your sync frequency or the exact synchronization time under Show advanced settings.
DATADDO TIP
If you need to load historical data, please refer to the Data Backfilling article.
- Preview your data by clicking on the Test Data button in the top right corner. You can adjust the date range for a more specific time frame.
- Click on Save and congratulations, your new data source is ready!
Dataddo relies on your query statements to accurately retrieve the right data. By choosing specific statements, you guide Dataddo's actions on your data.
Every command in your query statement directly affects your database. Take special care with commands such as DELETE or UPDATE, which change data, whereas a simple SELECT only reads it.
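As a rough illustration of the caution above, the following Python sketch checks whether a statement looks read-only before it is pasted into the custom SQL field. The helper name `looks_read_only` and the keyword list are assumptions made for this example, not part of Dataddo or BigQuery, and the check is deliberately simple: it can reject a valid query that merely mentions one of the keywords inside a string literal.

```python
import re

# Hypothetical keyword list for this sketch; extend it as needed.
WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "MERGE",
                  "DROP", "TRUNCATE", "ALTER", "CREATE")


def looks_read_only(sql: str) -> bool:
    """Return True if the statement starts with SELECT or WITH and
    contains none of the write keywords as standalone words."""
    text = sql.strip().rstrip(";").strip()
    # The statement must begin with a read keyword.
    if not re.match(r"(SELECT|WITH)\b", text, flags=re.IGNORECASE):
        return False
    # Reject the statement if any write keyword appears as a whole word.
    pattern = r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is None
```

A check like this is only a safety net for catching obvious mistakes; reviewing the statement by hand is still advisable.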
SQL Statement Examples
- Extract all data from the clients table:
SELECT * FROM clients
- Extract the name, email, and company columns from the clients table:
SELECT name, email, company FROM clients
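- Extract a specific range of rows (skip the first 240,000 rows and return the rows up to row 800,000) from all columns of the clients table. BigQuery uses the form LIMIT count OFFSET skip_rows rather than the comma form:
SELECT * FROM clients LIMIT 560000 OFFSET 240000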
- Extract all columns for a specific country from the clients table:
SELECT * FROM clients WHERE countryCode = 'USA'
- Extract all columns from the clients table where the job title contains a specific position or the company is Dataddo:
SELECT * FROM clients WHERE jobTitle LIKE '%Manager%' OR company = 'Dataddo'
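When extracting a range of rows, the arithmetic for the LIMIT and OFFSET values is easy to get wrong. The Python sketch below shows one way to derive them from a 1-based, inclusive row range. The helper `limit_offset` is hypothetical, and the `id` column used for ordering is an assumption; without an ORDER BY clause, BigQuery does not guarantee which rows fall into the range.

```python
def limit_offset(first_row: int, last_row: int) -> str:
    """Return a 'LIMIT count OFFSET skip' clause covering rows
    first_row..last_row (1-based, inclusive)."""
    if first_row < 1 or last_row < first_row:
        raise ValueError("expected 1 <= first_row <= last_row")
    count = last_row - first_row + 1  # number of rows to return
    skip = first_row - 1              # rows to skip before the range
    return f"LIMIT {count} OFFSET {skip}"


# Rows 240,001 through 800,000, ordered by an assumed `id` column.
query = f"SELECT * FROM clients ORDER BY id {limit_offset(240001, 800000)}"
```

Here `query` ends with `LIMIT 560000 OFFSET 240000`.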
Limitations
Read Session Mode Limitations
If you decide to use an auto-generated SQL statement, READ SESSION mode will be applied. This has the following advantages and disadvantages.
Advantages | Disadvantages |
---|---|
Speed: Up to 5 times faster data extraction as large amounts of data are read in a highly efficient manner. | No complex queries: Since you can only specify a WHERE clause, more complex operations (e.g. joins, aggregations, and subqueries) won't be possible. |
Limit data: Specify a WHERE clause to limit data to a specific date range or extract records with specific values in a column. | Limitation with views: Because a view is essentially a saved SQL query (that is, it generates its data on the fly), it can't be read in READ SESSION mode. |
Please keep these trade-offs in mind when deciding between using an auto-generated or a custom SQL query.
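The trade-offs in the table can be summarized as a simple decision rule. The Python sketch below encodes only what the table states; the names `ExtractionNeeds` and `suggested_mode` are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractionNeeds:
    """Hypothetical description of what an extraction requires."""
    needs_joins: bool = False
    needs_aggregations: bool = False
    needs_subqueries: bool = False
    source_is_view: bool = False


def suggested_mode(needs: ExtractionNeeds) -> str:
    """Suggest 'read session' unless a listed limitation applies."""
    blocked = (
        needs.needs_joins
        or needs.needs_aggregations
        or needs.needs_subqueries
        or needs.source_is_view
    )
    return "custom SQL" if blocked else "read session"
```

For example, a plain table filtered only by a WHERE clause maps to "read session", while a source that is a view maps to "custom SQL".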
Troubleshooting
Context Deadline Exceeded Error
ERROR CODE
rpc error: code = DeadlineExceeded desc = context deadline exceeded
This issue may be caused by extracting data over an extended timeframe. Use WHERE or LIMIT clauses in your SQL query to manage the size and scope of the data extraction.
- Use WHERE to specify the date range:
SELECT * FROM your_table WHERE date_column BETWEEN '202X-01-01' AND '202X-01-31';
- Use LIMIT to specify the maximum number of records to return:
SELECT * FROM your_table LIMIT 1000;
- Combine WHERE and LIMIT for more precise control. The query in this example returns at most 1000 records where the date is after January 1, 202X:
SELECT * FROM your_table WHERE date_column > '202X-01-01' LIMIT 1000;
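The WHERE and LIMIT patterns above can also be assembled programmatically, which helps when the date range changes from run to run. The following Python sketch builds such a statement as a string. The function `build_extraction_query` is hypothetical, and it inserts the table and column names directly into the text, so it should only be used with names you trust.

```python
from datetime import date
from typing import Optional


def build_extraction_query(table: str, date_column: str,
                           start: date, end: date,
                           limit: Optional[int] = None) -> str:
    """Build a SELECT restricted to an inclusive date range,
    optionally capped at `limit` records."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    if limit is not None and limit <= 0:
        raise ValueError("limit must be a positive integer")
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} BETWEEN '{start.isoformat()}' "
        f"AND '{end.isoformat()}'"
    )
    if limit is not None:
        sql += f" LIMIT {limit}"
    return sql


# Example with a concrete month and a cap of 1000 records.
january = build_extraction_query(
    "your_table", "date_column", date(2024, 1, 1), date(2024, 1, 31), 1000
)
```

Narrowing the date range in this way keeps each extraction small, which is the remedy the troubleshooting steps suggest for the deadline error.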
Related Articles
Now that you have successfully created a data source, see how you can connect your data to a dashboarding app or a data storage.
Sending Data to Dashboarding Apps
Sending Data to Data Storages
Other Resources