Replicate Default Channel Grouping in BigQuery
On October 10, 2024, Google added the Session Default and Session Custom Channel Group data to the BigQuery GA4 events export in the following fields:
- session_traffic_source_last_click.cross_channel_campaign.default_channel_group
- session_traffic_source_last_click.cross_channel_campaign.primary_channel_group
If you never created a Custom Channel Group in GA4, then these two fields contain the same data. Both fields are populated for all events.
In Google Analytics, the Default Channel Grouping categorizes traffic sources into broader, more manageable groups such as Organic Search, Paid Search, Social Media, and Direct Traffic.
This article explains how to recreate the Session and User Default Channel Group to closely match the data reported in the Google Analytics 4 (GA4) user interface and API/Looker Studio.
The goal of the procedure is to replicate the Default Channel Group from the raw medium, source, and campaign data for dates before October 10, 2024.
I show how to create a user-defined function that converts event medium, source, and campaign into the Default Channel Group based on Google’s own published definition.
I validated the code against the newly available Default Channel Group field, and its output closely matches that field. You can modify the code to create your own customized groups. The function can be applied to user-, session-, and event-scoped data.
The Process
The Channel Group is typically defined for a session or a user; Google Analytics calls this the scope. Defining the scope of your Channel Group is the first step in the process.
The second step is identifying the attribution data (medium, source, and campaign) for that scope and then converting it to the Channel Group of your choice.
Determine the Scope of the Channel Group
The Google Analytics UI and Looker Studio offer two levels of scope for the Default Channel Group: user-scoped and session-scoped.
- User-scoped, aka the First Click attribution, appears under Acquisition -> User Acquisition reports in the GA user interface. In Looker Studio, this variable is called the First user default channel group.
- Session-scoped, aka Last Click attribution, appears in the GA UI under Acquisition -> Traffic Acquisition. In Looker Studio, this field is called the Session default channel group.
The event-scoped raw data is also available in BigQuery. This pageview-level data has been populated since the beginning of the BigQuery exports and can be considered the third, lowest scope level.
Session-scoped data is not available before November 1, 2023, so for any analysis of earlier dates you need to reconstruct the session-scoped attribution information from the pageview events.
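One way to picture that reconstruction is the minimal Python sketch below. It is only an illustration of the logic, not BigQuery code: the event dictionaries are a hypothetical flattened form of the export (with medium, source, and campaign already unnested), the session key is the user_pseudo_id and ga_session_id pair, and taking the first attributed page view of a session as the session's attribution is an assumption you may want to change.

```python
from typing import Dict, List, Tuple

SessionKey = Tuple[str, int]


def rebuild_session_attribution(events: List[dict]) -> Dict[SessionKey, dict]:
    """Return medium/source/campaign of the first attributed page_view per session."""
    result: Dict[SessionKey, dict] = {}
    # Process events in time order so the earliest attributed page view wins.
    for event in sorted(events, key=lambda e: e["event_timestamp"]):
        if event["event_name"] != "page_view":
            continue
        key = (event["user_pseudo_id"], event["ga_session_id"])
        if key in result:
            continue  # an earlier attributed page view already set this session
        if event.get("source") or event.get("medium"):
            result[key] = {
                "medium": event.get("medium"),
                "source": event.get("source"),
                "campaign": event.get("campaign"),
            }
    return result


# Hypothetical sample: two page views in one session, only one is attributed.
events = [
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_timestamp": 2,
     "event_name": "page_view", "medium": None, "source": None, "campaign": None},
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_timestamp": 1,
     "event_name": "page_view", "medium": "organic", "source": "google", "campaign": None},
]
print(rebuild_session_attribution(events))
```

Sessions with no attributed page view are simply absent from the result, so the caller decides how to label them.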
If your business regularly has more than one source of pageviews within the same session, the analysis of event-level attribution variables (medium, source, campaign) may produce additional insights.
How to Find Medium, Source, and Campaign for Your Scope
Getting the right medium, source, and campaign is likely the most difficult part of the whole process. Historically, the BigQuery export schema only included user-scoped and event-scoped attribution information. The historical event-level information is still stored in the event_params field, which needs to be unnested before use.
Starting in November 2023, session-scoped attribution appeared in the session_start event, but it had to be applied to all other events in the session. In July 2024, Google added the proper session-scoped attribution information in a separate field. Below is the summary of attribution data availability in BigQuery by scope and date.
Field Name | Events | Scope | Populated |
---|---|---|---|
traffic_source.medium, traffic_source.source, traffic_source.name | All | User | All dates |
event_params -> value.string_value where event_params.key in ('medium', 'source', 'campaign') | page_view | Event | All dates |
collected_traffic_source.manual_medium, collected_traffic_source.manual_source, collected_traffic_source.manual_campaign_name | page_view | Event | Starting June 4, 2023 |
event_params -> value.string_value where event_params.key in ('medium', 'source', 'campaign') | session_start | Session | Starting November 1, 2023 |
collected_traffic_source.manual_medium, collected_traffic_source.manual_source, collected_traffic_source.manual_campaign_name | session_start | Session | Starting November 1, 2023 |
session_traffic_source_last_click.manual_campaign.medium, session_traffic_source_last_click.manual_campaign.source, session_traffic_source_last_click.manual_campaign.campaign_name | All | Session | Starting July 16, 2024 |
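For session-scoped reporting, the table boils down to a date-based choice of where to read attribution from. The Python sketch below expresses that choice; the cut-over dates come from the table, while the function name and the strategy labels it returns are hypothetical names introduced only for illustration.

```python
from datetime import date

# Cut-over dates taken from the availability table.
SESSION_LAST_CLICK_START = date(2024, 7, 16)
SESSION_START_EVENT_START = date(2023, 11, 1)


def session_attribution_strategy(event_date: date) -> str:
    """Return a (hypothetical) label for where session attribution should come from."""
    if event_date >= SESSION_LAST_CLICK_START:
        # Dedicated session field, populated on all events.
        return "session_traffic_source_last_click"
    if event_date >= SESSION_START_EVENT_START:
        # Attribution sits on session_start and must be spread to the session.
        return "session_start_event"
    # Earlier dates: roll page_view attribution up to the session.
    return "page_view_rollup"


print(session_attribution_strategy(date(2024, 1, 15)))
```

A query that spans several of these periods would need to combine the strategies, for example by processing each date range separately.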
Convert Medium, Source, and Campaign into Default Channel Group
A user-defined function is a great way to roll up traffic sources into a Channel Group. The code below will create a function under the Routines section of your schema/dataset, and you will be able to use it just like any other function in your code.
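To make the rule logic easier to follow, here is a minimal Python sketch of the kind of ordered matching such a function performs. It is not the BigQuery function itself: it covers only a subset of Google's channels, the source lists are short illustrative stand-ins for Google's much longer published lists, and treating missing medium and source as direct traffic is an assumption.

```python
import re
from typing import Optional

# Illustrative, deliberately short source lists (assumption: real use would
# substitute Google's full published lists of search and social sites).
SEARCH_SOURCES = {"google", "bing", "yahoo", "duckduckgo"}
SOCIAL_SOURCES = {"facebook", "instagram", "linkedin", "twitter", "t.co"}
EMAIL_VALUES = {"email", "e-mail", "e_mail", "e mail"}

# Paid-medium pattern as given in Google's channel definitions.
PAID_MEDIUM = re.compile(r"^(.*cp.*|ppc|retargeting|paid.*)$")


def default_channel_group(medium: Optional[str], source: Optional[str],
                          campaign: Optional[str]) -> str:
    """Map medium/source/campaign to a channel name (simplified rule subset)."""
    # Assumption: missing values are treated like direct traffic.
    m = (medium or "(none)").strip().lower()
    s = (source or "(direct)").strip().lower()
    c = (campaign or "").strip().lower()
    paid = bool(PAID_MEDIUM.match(m))

    # Rule order matters: the first matching rule wins.
    if s == "(direct)" and m in {"(none)", "(not set)"}:
        return "Direct"
    if "cross-network" in c:
        return "Cross-network"
    if s in SEARCH_SOURCES and paid:
        return "Paid Search"
    if s in SOCIAL_SOURCES and paid:
        return "Paid Social"
    if m in {"display", "banner", "expandable", "interstitial", "cpm"}:
        return "Display"
    if paid:
        return "Paid Other"
    if s in SOCIAL_SOURCES or m in {"social", "social-network", "social-media", "sm"}:
        return "Organic Social"
    if s in SEARCH_SOURCES or m == "organic":
        return "Organic Search"
    if m in {"referral", "app", "link"}:
        return "Referral"
    if s in EMAIL_VALUES or m in EMAIL_VALUES:
        return "Email"
    if m == "affiliate":
        return "Affiliates"
    return "Unassigned"


print(default_channel_group("cpc", "google", "brand"))
```

The same first-match-wins ordering is what a SQL CASE expression gives you, which is why the paid rules are checked before their organic counterparts.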
Examples of Using Default Channel Grouping Function
Below are two examples of how to create a monthly channel group report in BigQuery using two scopes (user, session) by summarizing three metrics – unique users, sessions, and pageviews.
Please note that there are differences in the total counts of all three metrics, which can be explained by how the scopes are applied to events. For example, the same user may visit your site more than once during the same month through different channels. In that case, the user is counted more than once in the session-scoped report (counted as a unique user for each channel), and they are counted once in the user-scoped report.
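The toy Python example below illustrates this effect with made-up data: one user arrives twice through different channels, and the per-channel unique-user totals differ depending on which scope is used for grouping. The field names and values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical month of sessions: user u1 visits twice via different channels.
sessions = [
    {"user": "u1", "first_user_channel": "Organic Search", "session_channel": "Organic Search"},
    {"user": "u1", "first_user_channel": "Organic Search", "session_channel": "Email"},
    {"user": "u2", "first_user_channel": "Direct", "session_channel": "Direct"},
]


def unique_users_by(rows: List[dict], channel_key: str) -> Dict[str, int]:
    """Count distinct users per channel, grouping by the given channel field."""
    users = defaultdict(set)
    for row in rows:
        users[row[channel_key]].add(row["user"])
    return {channel: len(ids) for channel, ids in users.items()}


user_scoped = unique_users_by(sessions, "first_user_channel")
session_scoped = unique_users_by(sessions, "session_channel")

# u1 is counted once under the user scope but under two channels in the session scope.
print(sum(user_scoped.values()), sum(session_scoped.values()))
```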
User-Scoped Attribution
The code below creates a monthly user acquisition summary by User Default Channel Group. It runs against the raw BigQuery events export; you need to specify your timezone, your GA4 export schema, and the schema where you saved your user-defined conversion function.
I bolded where the user-defined default channel grouping function is called.
Session-Scoped Attribution
The code below creates a monthly traffic acquisition summary by the Session Default Channel Group using the session last click variable available for all events starting in July 2024. To run this summary for an earlier period, you have to modify the code to summarize the session data first.
The code runs against the raw BigQuery events export; you need to specify your timezone, your GA4 export schema, and the schema where you saved your user-defined conversion function.
Attribution Considerations in BigQuery
Before you embark on a traffic attribution project in BigQuery, I would like to warn you, based on my experience, about several data quirks that can make your life a nightmare. These quirks affect different sites differently, and they may change how you define your KPIs (sessions, pageviews, channels).
Lack of historical session-scoped attribution data
Before July 2024, the attribution data in the session_start event did not appear in all events in the session, so it was difficult to calculate the total number of pageviews attributable to the session channel. While the first pageview of the session typically has the attribution fields populated, the rest would usually have null values.
To report session-scoped traffic metrics, such as pageviews, unique pageviews, engagement, etc., you would need to calculate attribution information separately and join it with session summaries.
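As a rough illustration of that join, the Python sketch below counts page views per session channel by looking each event's session up in a separately computed session-to-channel mapping. The flattened event dictionaries and the mapping are hypothetical inputs, and labeling sessions without attribution as Direct is an assumption.

```python
from collections import Counter
from typing import Dict, List, Tuple

SessionKey = Tuple[str, int]


def pageviews_by_session_channel(events: List[dict],
                                 session_channel: Dict[SessionKey, str]) -> Dict[str, int]:
    """Count page_view events per channel of the session they belong to."""
    counts: Counter = Counter()
    for event in events:
        if event["event_name"] != "page_view":
            continue
        key = (event["user_pseudo_id"], event["ga_session_id"])
        # Assumption: sessions with no attribution are reported as Direct.
        counts[session_channel.get(key, "Direct")] += 1
    return dict(counts)


# Hypothetical inputs: session (u1, 100) was attributed; (u2, 200) was not.
session_channel = {("u1", 100): "Organic Search"}
events = [
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_name": "session_start"},
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_name": "page_view"},
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_name": "page_view"},
    {"user_pseudo_id": "u2", "ga_session_id": 200, "event_name": "page_view"},
]
print(pageviews_by_session_channel(events, session_channel))
```

Every page view inherits its session's channel, including the later page views whose own attribution fields are null.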
More than one channel in a session
Note: event-scoped attribution means pageview-scoped attribution since the page_view event has the attribution information populated more or less consistently.
Summarizing event-scoped traffic sources to the session level can be challenging because different page views may have different sources. For example, it is common for page views in the same session to come from both Organic Google Search and Paid Search.
On top of being difficult to summarize, these sessions are interesting in their own right, and looking into their attribution can yield immediate actionable insights.
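A simple way to find these sessions is to collect the distinct source and medium pairs seen on each session's page views and keep the sessions with more than one. The Python sketch below shows the idea on hypothetical flattened event dictionaries; it is an illustration of the logic rather than a BigQuery query.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SessionKey = Tuple[str, int]


def multi_source_sessions(events: List[dict]) -> Dict[SessionKey, List[Tuple[str, str]]]:
    """Return sessions whose page views carry more than one source/medium pair."""
    seen = defaultdict(set)
    for event in events:
        if event["event_name"] == "page_view" and event.get("source"):
            key = (event["user_pseudo_id"], event["ga_session_id"])
            # Missing medium is stored as an empty string so pairs stay sortable.
            seen[key].add((event["source"], event.get("medium") or ""))
    return {key: sorted(pairs) for key, pairs in seen.items() if len(pairs) > 1}


# Hypothetical session that mixes organic and paid Google traffic.
events = [
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_name": "page_view",
     "source": "google", "medium": "organic"},
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_name": "page_view",
     "source": "google", "medium": "cpc"},
    {"user_pseudo_id": "u2", "ga_session_id": 200, "event_name": "page_view",
     "source": "bing", "medium": "organic"},
]
print(multi_source_sessions(events))
```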
Customized Channel Groups
While Custom Channel Groups are available in the GA4 user interface, they generally cannot be applied historically. Creating customized groups in BigQuery may be the answer you are looking for. This is how grouping traffic using a custom function can help you with attribution analytics in BigQuery:
- Simplified reporting: It condenses numerous individual sources into meaningful categories for easier analysis.
- Customization: Unlike the GA interface, BigQuery allows you to create more customized channel groups tailored to your business needs.
- Historical data analysis: You can apply your channel grouping rules consistently to historical data, which is not possible with Custom Channel Groups created through the GA interface.
- Data accuracy: Overcome limitations in the GA interface by directly working with raw data in BigQuery.
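As an illustration of the customization point, the Python sketch below layers business-specific overrides on top of a default channel value: campaign-name patterns are checked first, and the default group is used when none match. The rule list, group names, and function name are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical business rules: (campaign-name pattern, custom group).
CUSTOM_RULES = [
    (re.compile(r"newsletter"), "Newsletter"),
    (re.compile(r"partner"), "Partners"),
]


def custom_channel_group(default_group: str, campaign: Optional[str]) -> str:
    """Return a custom group when a campaign rule matches, else the default group."""
    name = (campaign or "").lower()
    for pattern, group in CUSTOM_RULES:
        if pattern.search(name):
            return group
    return default_group


print(custom_channel_group("Email", "Spring_Newsletter_2023"))
```

Because the rules are plain data, the same list can be applied to any date range, which is what makes historical regrouping possible.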
Conclusion
To replicate GA4’s default channel grouping in BigQuery, follow these key steps and considerations:
Attribution Data Scopes
- User-scope: Available in traffic_source fields (all dates)
- Session-scope: Complete data in session_traffic_source_last_click (starting July 16, 2024)
- Event-scope: Available in event_params for page_view events (all dates)
Timing and Data Sources for Session-Scoped Default Groups
- From October 10, 2024: Use the built-in default_channel_group field
- Before October 10, 2024: Create custom channel grouping using available attribution data
Implementation Steps
- Determine your analysis scope (user or session)
- Select appropriate attribution data fields based on the time period
- Implement the user-defined channel grouping function
- Validate results against GA4 interface
Key Considerations
- When using event-level data, watch for sessions with multiple traffic sources
- Handle historical session data carefully (pre-July 2024)
- Regular validation ensures the accuracy of channel grouping logic
This approach enables both historical analysis and custom channel definitions while maintaining consistency with GA4’s channel groups.
by Tanya Zyabkina
Tanya Zyabkina has over 15 years of experience leading analytics functions for multiple Fortune 500 companies in retail, telecom, and higher education. She works as the Director of Marketing Performance Analytics for The Ohio State University. Go Bucks!