
Google Trends is a powerful keyword trend research website that provides insights into top search queries and their evolution across various countries and regions over time.
In March 2022, Google announced the availability of a public Google Trends international dataset in BigQuery. Following the US public dataset released in 2021, the international dataset covers approximately 50 additional countries worldwide [1].
Furthermore, Google listed several important notes about the international dataset [1]:
- the dataset is anonymized, indexed, normalized and aggregated before publication,
- new sets of top terms and top rising queries are generated daily and inserted into a new table partition,
- each top terms/top rising terms set expires after **30 days**,
- every term within a set is enriched with a historical backfill over a rolling five-year period.
With this said, BigQuery users can now directly interact with the top 25 international Google Trends insights by running only SQL SELECT statements.
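As a minimal sketch of such a SELECT statement, the Python snippet below only assembles the query text. The table and column names are taken from the dataset schema described in this article, while the helper name `build_top_terms_query` and the `@refresh_date`/`@country_name` parameter names are hypothetical choices for illustration.

```python
# Table and column names come from the schema described in this article;
# the helper name and the @-style parameter names are hypothetical.
TOP_TERMS_TABLE = "bigquery-public-data.google_trends.international_top_terms"


def build_top_terms_query(table: str = TOP_TERMS_TABLE) -> str:
    """Return a BigQuery Standard SQL statement that lists the top terms
    for one country on one refresh date, ordered by rank."""
    return (
        "SELECT DISTINCT term, rank\n"
        f"FROM `{table}`\n"
        "WHERE refresh_date = @refresh_date  -- partition column\n"
        "  AND country_name = @country_name\n"
        "ORDER BY rank"
    )


query = build_top_terms_query()
# Example values matching the dashboard filters used later in this article.
query_parameters = {"refresh_date": "2022-06-21", "country_name": "Austria"}
print(query)
```

The statement and its parameters could then be passed to any BigQuery client. DISTINCT is used here because, according to the schema, the rank of a term is repeated across all historical weeks and all regions of a country.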
In addition, Data Analysts can now present the trending insights in visual form by wrapping these queries and plotting the visualizations in any Business Intelligence tool installed on top of BigQuery.
For example, a public Looker dashboard, [Top 25 trending international Google Search Terms](https://datasignals.looker.com/embed/dashboards/14?Region+1=&Country+Name=&Region+2=&Region+3=&theme=GoogleWhite), is already available. This dashboard allows end-users to query the insights per country and region(s).
However, one drawback is that this dashboard has no date filtering option, so end-users are limited to inspecting the daily trending insights. This also implies that the provided dashboard is not helpful for keyword research over longer time ranges, as end-users still need to switch to the Google Trends website to observe the evolution patterns and changes in the trending terms.
So, to overcome these shortcomings, we decided to explore the public BigQuery international Google Trends dataset and provide a different visualization aspect of the trending data insights, using, of course, a Looker instance. 🙂
Explaining the international Google Trends dataset
First, we need to explain the dataset schema. Google listed that the schema of the top 25 terms/top 25 rising terms contains the following attributes [2]:
- term – STRING – the human-readable identifier for a term, i.e. "Acme Inc",
- country_name – STRING – stores the full-text name of the country,
- country_code – STRING – stores the ISO 3166 Alpha-2 country code used to identify the country,
- region_name – STRING – stores the full-text name of the region or state in the country,
- region_code – STRING – stores the ISO 3166-2 country subdivision code used to identify the region or state in the country,
- week – DATE – the first day of the week for the current row’s position in the time series for the combination of terms, country, region, and score,
- refresh_date – DATE – the date when the new set of terms, country, region, and score combination was added; this column also serves as the partition key,
- score – INT – index from 0–100 that denotes how popular this term was for a country’s region during the current date, relative to the other dates in the same time series for this term (260 weeks = 52 weeks * 5 years),
- rank – INT – numeric representation of where the term falls in comparison to the other top terms for the day across the globe (e.g. rank of 1–25); the rank value shows the same rank across all historical data and all regions of a country,
- percent_gain – INT – percentage gain (rate) at which term rose compared to the previous date period; available only for top 25 rising terms.
To provide deeper analytical insights, we need to fully understand the following metrics:
- The score metric presents the search interest of the term relative to its maximum for the selected time and location [3]. A value of 100 marks the peak, i.e. the point in the specified time range where the term was trending the most. The score of one term will change as we select different date ranges and countries/regions, which means that the score’s index is normalized again according to the selected filters. We need to use the week attribute to present the score evolution of one term per country and region (interest over time).
- The rank metric presents the rank of the top terms, and it is updated in the {refresh_date, country, region, term, rank} set. Each day we get a new set of the top 25 trending terms, and we can observe the trending terms by selecting different dates, countries and regions.
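To make the re-normalization of the score more tangible, here is a small sketch. It is not Google's actual procedure: the raw interest values are made up, and the function `renormalize_scores` is a hypothetical helper that simply rescales a series so that its peak inside the selected window equals 100.

```python
def renormalize_scores(weekly_interest: dict) -> dict:
    """Rescale interest values so the peak of the selected window equals 100."""
    peak = max(weekly_interest.values(), default=0)
    if peak == 0:
        # Nothing to scale: an empty window or a window without any interest.
        return {week: 0 for week in weekly_interest}
    return {week: round(100 * value / peak) for week, value in weekly_interest.items()}


# Hypothetical raw interest per week (first day of the week as key).
raw = {"2022-05-29": 40, "2022-06-05": 80, "2022-06-12": 20, "2022-06-19": 8}

full_window = renormalize_scores(raw)
# Selecting only the last two weeks changes the peak, and with it every score.
last_two_weeks = renormalize_scores({w: raw[w] for w in ("2022-06-12", "2022-06-19")})

print(full_window)     # {'2022-05-29': 50, '2022-06-05': 100, '2022-06-12': 25, '2022-06-19': 10}
print(last_two_weeks)  # {'2022-06-12': 100, '2022-06-19': 40}
```

The same week (2022-06-12) scores 25 in the longer window and 100 in the shorter one, which illustrates why scores from different filter selections should not be compared directly.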
When this is clear, we can start with our hands-on part. 🙂
Exploring the trending insights with Looker
The development methodology is divided into a 2×2 matrix diagram (2 parts x 2 steps):
![An overview of the development methodology [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1hCR8f7fIzKjwMOTgkj24zw.png)
As seen from the image above, Part 1 is intended for Looker developers, while Part 2 is intended more for the end-users who can create insights from existing data models.
So, let’s start with the implementation to help both sides generate ideas for creating business value from the international Google Trends dataset. 🙂
Part 1: Data modelling in Looker’s semantic layer
The tasks of this part are split into two steps:
Step 1: LookML modelling: Create views on top of the datasets for the top 25 terms/top 25 rising terms
To implement this step, we can use [derived_tables](https://docs.looker.com/reference/view-params/derived_table) in a view file and create two new views:
#1: international_top_terms.view – holds a derived table on top of the bigquery-public-data.google_trends.international_top_terms dataset.
The view is enriched with the following dimensions/measures and functionalities:
- The country_group dimension is created to compare the top terms of the DACH countries to the top terms of other countries.
- The top_term_refresh_date is cast to the EDT zone, as our BigQuery is in the UTC zone: DATE(DATETIME(TIMESTAMP(${TABLE}.refresh_date), "America/New_York"))
- The links to the Google search and Google Trends websites are embedded within the term dimension. In this way, the end-users can easily search for selected terms from Looker and check their interest over time directly on Google Trends.
- Measures score and rank are calculated as average values, as the aim is to show these numbers per country group/country/region/week granularity.
#2: international_top_rising_terms.view – holds a derived table on top of the bigquery-public-data.google_trends.international_top_rising_terms dataset.
The view is enriched with similar dimensions/measures as the previous view.
The only additional measure added to this view is percent_gain_measure, which shows the rate at which a term rose compared to the previous date period.
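The refresh-date conversion listed for the first view can be traced step by step in plain Python. This is only a sketch of how we read the SQL expression, under the assumption that TIMESTAMP() interprets the date as midnight UTC. It also uses a fixed EDT offset (UTC-4), whereas the "America/New_York" zone in BigQuery additionally switches to EST in winter. The function name `refresh_date_in_edt` is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone

# Fixed summer-time offset used for this sketch only (assumption, see lead-in).
EDT = timezone(timedelta(hours=-4), name="EDT")


def refresh_date_in_edt(refresh_date: date) -> date:
    """Mimic DATE(DATETIME(TIMESTAMP(refresh_date), "America/New_York"))."""
    # TIMESTAMP(refresh_date): midnight UTC on the refresh date.
    utc_midnight = datetime.combine(refresh_date, time.min, tzinfo=timezone.utc)
    # DATETIME(..., zone): the same instant expressed as local wall-clock time.
    local = utc_midnight.astimezone(EDT)
    # DATE(...): keep only the calendar date.
    return local.date()


print(refresh_date_in_edt(date(2022, 6, 21)))  # 2022-06-20
```

Under this reading, midnight UTC falls on the evening of the previous day in New York, so the converted date is one calendar day earlier than the stored refresh_date.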
After this step is concluded, we can create a data model to join these two views into one Looker explore.
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Step 2: LookML modelling: Create a data model from the views created in Step 1
Amazing. When this part is finalized and the data model is created, end-users can explore the insights and visualize them in Looker’s presentation layer.
Part 2: Data visualization in Looker’s presentation layer
Let’s dive directly into the second part, where we combine the presentation outcomes of its two steps.
In other words, after creating a list of the research questions, we created a new Looker dashboard named 🌏 International Google Trends Dataset: Country, Region and Interest over Time analysis. The PDF of the dashboard can be found HERE.
The following global filter values were applied to get the outcomes presented in the above-attached PDF file:
- Refresh Date = 2022/06/21
- Country Name = Austria
- Top Term = Heidi Klum
It is important to add that not all global filters are applied to each tile in the presented dashboard. This is because we created only one dashboard to provide combined end-user insights for the research questions.
Step 1: Analytical queries: Creating a set of the research questions & Step 2: Looker: Visualize the insights to answer the research questions
The following set of questions and visual "answers" cover the data visualization part:
Q1: Which top terms/top rising terms have the highest average rank, and what is their average score for selected refresh_date, country, and region?
Answer in visual form:
![An overview of the rank and score measures for top terms/top rising terms per selected date and country [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1RqJbQwsH8hnAgmGbCME8lg.png)
A quick check of whether our outcomes match those on the Google Trends website for the date 2022/06/21 and the country Austria:
![Comparing the Looker data insights against the Google Trend insights per selected date and country [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/11yFTfBgXGGWMr-v_xec-RQ.png)
As you can see from the image above, the ranks of the search terms differ slightly, but the list of terms matches in both sources.
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Q2: Which top terms are trending in DACH Countries compared to other countries?
Answer in visual form:
![An overview of the trending terms per DACH Countries compared to Other Countries per selected date [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1q5zUkoPY0GHl3DPjLO1KCw.png)
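The grouping behind this comparison can be sketched in a few lines of Python. This is an illustration of the idea, not the LookML used in the views: DACH is taken as Germany, Austria and Switzerland (DE, AT, CH), the group labels and function names are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean

DACH_CODES = {"DE", "AT", "CH"}  # Germany, Austria, Switzerland


def country_group(country_code: str) -> str:
    """Assign a country to the DACH group or to all other countries."""
    return "DACH" if country_code.upper() in DACH_CODES else "Other Countries"


def average_score_by_group(rows: list) -> dict:
    """Average the score per (country group, term), skipping missing scores."""
    scores = defaultdict(list)
    for row in rows:
        if row["score"] is not None:
            key = (country_group(row["country_code"]), row["term"])
            scores[key].append(row["score"])
    return {key: mean(values) for key, values in scores.items()}


# Hypothetical rows using the column names from the dataset schema.
rows = [
    {"country_code": "AT", "term": "Heidi Klum", "score": 60},
    {"country_code": "DE", "term": "Heidi Klum", "score": 40},
    {"country_code": "CH", "term": "Heidi Klum", "score": None},
    {"country_code": "US", "term": "Heidi Klum", "score": 10},
]

averages = average_score_by_group(rows)
print(averages[("DACH", "Heidi Klum")])             # 50
print(averages[("Other Countries", "Heidi Klum")])  # 10
```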
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Q3: Which top terms are trending in a specific region, and how does their average score differ from region to region?
Answer in visual form:
![An overview of the trending terms per region and selected date [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1ioj7jf206UUVgMmjER5OJA.png)
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Q4: How did interest over time evolve for the selected top term?
As mentioned beforehand, the selected term was Heidi Klum.
Answer in visual form:
![An overview of the selected trending term per interest over time [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1QhHTsQFo0czbPOiVb0CYbQ.png)
And we can do a quick check with the same filters directly on Google Trends:
![Interest over time for the selected trending term on the Google Trends website [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1YLzzvzrPnz4I-BanR4USA.png)
When comparing both graphs, it is visible that the trending patterns match, but the score value differs.
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Q5: How did interest over time evolve in the past three years for the selected top term (year-over-year analysis for catching the seasonal patterns)?
Again, the selected term was Heidi Klum.
Answer in visual form:
![An overview of the selected trending term in the past three years [Image by author]](https://towardsdatascience.com/wp-content/uploads/2022/06/1zpulVkRJjkrJgbq0F6rrPg.png)
From the above image, it can be noticed that the search interest for the selected term is higher in the first part of the year.
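A year-over-year view like this one boils down to pivoting the weekly scores so that the same week of different years ends up side by side. The sketch below shows one possible way to do it. Aligning the years by ISO week number is our assumption for the illustration, the function name `year_over_year` is hypothetical, and the scores are made up.

```python
from collections import defaultdict
from datetime import date


def year_over_year(weekly_scores: dict) -> dict:
    """Pivot {week_start: score} into {iso_week: {iso_year: score}}."""
    pivot = defaultdict(dict)
    for week_start, score in weekly_scores.items():
        iso_year, iso_week, _ = week_start.isocalendar()
        pivot[iso_week][iso_year] = score
    # Sort by week number so the result reads like a calendar.
    return {week: dict(years) for week, years in sorted(pivot.items())}


# Hypothetical weekly scores for one term, keyed by the week column.
weekly_scores = {
    date(2021, 6, 20): 35,
    date(2022, 6, 19): 60,
    date(2022, 6, 12): 45,
}

pivoted = year_over_year(weekly_scores)
print(pivoted[24])  # {2021: 35, 2022: 60}
```

Each inner dictionary holds one value per year, which corresponds to one line per year in a chart with the week number on the x-axis.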
— – – – – – – – – – – – – – – – – – – – – – – – – – – –
Other functionalities/insights that we provided to our end-users are:
#1: External links embedded within top terms/top rising terms to the Google search and Google Trends websites.

#2: Ability to identify top terms of interest with the rank = 1 in the past several weeks.
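The external links from the first functionality can be derived from the term itself. The following sketch shows one way to build them. The URL patterns for Google search and Google Trends are assumptions based on how these public sites are commonly addressed, and `term_links` is a hypothetical helper, not the LookML link definition used in the views.

```python
from urllib.parse import urlencode


def term_links(term: str, country_code: str) -> dict:
    """Build Google search and Google Trends URLs for a trending term."""
    return {
        # Assumed URL patterns; adjust them if the target sites change.
        "google_search": "https://www.google.com/search?" + urlencode({"q": term}),
        "google_trends": "https://trends.google.com/trends/explore?"
        + urlencode({"q": term, "geo": country_code}),
    }


links = term_links("Heidi Klum", "AT")
print(links["google_search"])  # https://www.google.com/search?q=Heidi+Klum
print(links["google_trends"])  # https://trends.google.com/trends/explore?q=Heidi+Klum&geo=AT
```

Using urlencode keeps terms with spaces or special characters safe to embed in a URL.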

With this part, we concluded our development and visualization phase.
Conclusion
Finally, we can share our development summary, the business value created with the presented analysis and our thoughts on the used dataset.
The development summary:
- First, we provided a development tutorial for creating a data model in Looker on top of the international Google Trends dataset.
- Second, we presented the trending insights with the newly created Looker dashboard, enabling our end-users to inspect the top historical/recent trending terms at different dimensional granularities.
The business value of the presented analysis:
- The presented analysis is used by content creators for blogging/post creation purposes, as it shows the daily trending top search terms per country group, country, and region. With the implemented date filters, content creators gain insights into the trending terms for a whole week/month, which helps them write more informative, better-targeted texts.
- In addition, the Looker alerts are implemented for specific conditions. An example: we created a list of the trending terms per specific country/region to alert different groups of users about these terms. This approach avoids manual observation of the trending insights, and users are informed about trending terms based on their predefined interests.
Final thoughts:
- In our opinion, the Google Trends public dataset is helpful for general blog writing purposes, where you can use the top 25 trending terms. However, if you need to inspect the trending terms to conduct keyword research for your ads on a specific topic (e.g. 3D printing), this should still be done via the Google Trends website.
- The BigQuery Google Trends dataset should be taken with "a pinch of salt", as Google lists the following disclaimer for it [2]:
Disclaimer: These datasets are provided "as is" without any warranties or representations of any kind. You are responsible for determining the suitability of this data for your purposes.
To conclude: we look forward to seeing the future extension of the public international Google Trends dataset in BigQuery, and possibly obtaining insights into related queries and topics for the trending search terms.
Feel free to share your thoughts on the presented analysis. 🙂
References:
[1] Google Cloud Blog, accessed: June 13th 2022, https://cloud.google.com/blog/products/data-analytics/international-google-trends-datasets-in-bigquery
[2] Google Cloud Public Google Trends International dataset, accessed: June 16th 2022, https://console.cloud.google.com/marketplace/product/bigquery-public-datasets/google-trends-intl?project=triple-silo-282319
[3] S. Rogers, What is Google Trends data – and what does it mean (2016), Medium post in Google News Lab