Member-only story
How to break down your query costs in BigQuery
In our analytics journey, the warehouse costs can grow exponentially as more data sources are ingested and users are onboarded.
And with the decreasing cost of cloud storage, it’s not uncommon for businesses with relatively small data to expect most of their BigQuery costs to come from the queries against the data.
When it comes to trying to cut this cost down, the most important first step is to identify which jobs have what impact on the actual cost.
The different sources of costs often follow the Pareto distribution. Without a clear label on how much cost is coming from where, it’s easy to get stuck on solving problems that won’t yield much impact.
This is where BigQuery’s information schema can help.
BigQuery’s information schema has a jobs view which provides a breakdown of different queries, which account ran them when and their cost.
select * from your_project.`your_region`.INFORMATION_SCHEMA.JOBS;
The key columns we care about in this context are:
- query: The actual SQL ran.
- total_bytes_billed: How much billable cost the query consumed, which will proportionally reflect how much money it would cost.
- creation_time: When the query was created. This the partitioning column of the table.
- user_email: Who ran the query.
- destination_table (record): Where the results of a job are stored.
First, we can see what the most expensive queries run between any 2 dates are, and what % of total cost each query took up:
-- see the most expensive queries
with total_cost_billed as (
-- for % calculation
select
sum(total_bytes_billed) as total_in_date_range
from
your_project.`your_region`.INFORMATION_SCHEMA.JOBS
where
creation_time between `START_DATE` and `END_DATE`
)
select
query,
user_email,
creation_time,
total_bytes_billed,
100*total_bytes_billed/total_in_date_range as percentage_of_total
from
your_project.`your_region`.INFORMATION_SCHEMA.JOBS
cross join
total_cost_billed
where
creation_time between `START_DATE` and `END_DATE`
order by total_bytes_billed desc;