2023-11-07
GA4 BigQuery: Nested Data & SQL Tips
GA4 BigQuery Export Schema Explained
Once GA4 data is exported to BigQuery, the first challenge is understanding how to clean and analyze the raw digital footprint. This article breaks down the GA4 schema into two main pillars: "Events" and "Users."
GA4 exports data in daily batches. Based on practical experience, GA4 data processing requires a buffer of about 2 days. This is because some user interactions might have a delay before appearing in GA4. Therefore, for accurate analysis, it is recommended to filter data up to D-2.
Event Data Structure
event_date| STRING | Date of the event (YYYYMMDD)event_timestamp| INTEGER | Time of the event (UTC in microseconds)event_name| STRING | Name of the eventevent_params: Event parameters in STRUCT format
BigQuery uses nested records to store multiple {key: value} pairs for a single event. For example, keywords like page_location are stored with their corresponding string or numeric values.
User Data Structure
user_id: Unique ID (collected via GTM)user_pseudo_id: Anonymous ID (Client ID)is_active_user: Boolean flag for activity
Summary
Understanding the RECORD and STRUCT types in the GA4 BigQuery export is the foundation for building custom marketing attribution models or deep-dive user behavior analysis.