← Back

2023-11-07

GA4 BigQuery: Nested Data & SQL Tips

Data AnalysisBigQueryTechnical Tutorial

GA4 BigQuery Export Schema Explained

Once GA4 data is exported to BigQuery, the first challenge is understanding how to clean and analyze the raw digital footprint. This article breaks down the GA4 schema into two main pillars: "Events" and "Users."

GA4 exports data in daily batches. Based on practical experience, GA4 data processing requires a buffer of about 2 days. This is because some user interactions might have a delay before appearing in GA4. Therefore, for accurate analysis, it is recommended to filter data up to D-2.

Event Data Structure

  • event_date | STRING | Date of the event (YYYYMMDD)
  • event_timestamp | INTEGER | Time of the event (UTC in microseconds)
  • event_name | STRING | Name of the event
  • event_params: Event parameters in STRUCT format

BigQuery uses nested records to store multiple {key: value} pairs for a single event. For example, keywords like page_location are stored with their corresponding string or numeric values.

User Data Structure

  • user_id: Unique ID (collected via GTM)
  • user_pseudo_id: Anonymous ID (Client ID)
  • is_active_user: Boolean flag for activity

Summary

Understanding the RECORD and STRUCT types in the GA4 BigQuery export is the foundation for building custom marketing attribution models or deep-dive user behavior analysis.