GA4 BigQuery SQL: Querying Wildcard Tables

Updated: 2023-09-13

In this section of our GA4 BigQuery SQL Course, we’ll learn about the technique of querying Wildcard Tables.

We’ll start by exploring the dataset structure, followed by a detailed example that demonstrates querying for a single day, multiple days using an inefficient method, and finally, we’ll introduce the concept of Wildcard Tables and showcase a simple example query that streamlines data retrieval across days.

The dataset we’ll be working with is bigquery-public-data.ga4_obfuscated_sample_ecommerce, which simulates an e-commerce dataset. Each day of data is represented as a separate table, named events_YYYYMMDD.

Querying for a Single Day

Let’s begin by querying data for a single day, say, ‘20210115’:

-- bigquery-public-data.ga4_obfuscated_sample_ecommerce
SELECT *
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210115`;

This query retrieves all data for 15th of January.

Querying for Multiple Days (Inefficient Method - UNION ALL)

If we want to query data for multiple days, the naive approach involves using a UNION ALL to combine individual tables. For instance, to retrieve data for the 15th and 16th of January:

-- bigquery-public-data.ga4_obfuscated_sample_ecommerce
SELECT *
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210115`
UNION ALL
SELECT *
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210116`;

While this method works, it becomes cumbersome as the number of days increases.

Introducing Wildcard Tables

Now, let’s simplify this process using Wildcard Tables. We’ll employ the asterisk (*) as a wildcard character to query data across multiple days seamlessly.

-- bigquery-public-data.ga4_obfuscated_sample_ecommerce
SELECT *
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20210115' AND '20210116';

In this query:

  • We use the asterisk (*) to represent any sequence of characters in the table name.
  • _TABLE_SUFFIX is a special pseudo-column that captures the date portion of the table name.
  • BETWEEN allows us to filter tables within the specified date range.

This single query retrieves data from tables events_20210115 and events_20210116, making it an efficient and scalable way to work with daily data.

Conclusion

Wildcard Tables in BigQuery offer a dynamic and efficient approach to querying data across multiple tables with a common naming pattern. This technique simplifies data retrieval and analysis, especially when dealing with large datasets structured by day.

Wilcard Tables BigQuery Docs

As we progress in our GA4 BigQuery SQL Course, we’ll explore more advanced use cases and dive deeper into the world of data analysis with GA4. Stay tuned and continue mastering the power of GA4 BigQuery SQL!