Query Google Sheets with DuckDB
For those who rely on Google Sheets as their go-to tool for data storage and collaboration, there often comes a time when you need to take your data analysis to the next level. Google Sheets is fantastic for managing and sharing data, but when it comes to complex and powerful data queries, you may find it lacking.
Enter DuckDB, a high-performance analytical database engine that’s gaining traction among data professionals for its speed and ease of use.
In this blog, we’ll show you how to bridge the gap between your Google Sheets data and advanced querying capabilities, using DuckDB as your secret weapon. Whether you’re a data analyst, a business professional, or a curious tech enthusiast, you’re about to discover how to extract valuable insights from your Google Sheets data like never before.
Let’s dive in!
Accessing Google Sheets Data with DuckDB
DuckDB’s built-in features for reading CSV files make it a seamless choice for extracting data from Google Sheets. But how do we make the connection between Google Sheets and DuckDB? The process is simpler than you might think.
Here’s a step-by-step guide to query Google Sheets Data with DuckDB
- Make sure the Sheets you want to query is publicly accessible. This can be achieved by clicking “Share” button in the upright corner of Google Sheets interface and select “Anyone with the link” can view the Sheet.
- Copy the
Google Sheets ID
andgid
from the URL, example:docs.google.com/spreadsheets/d/
+{Google Sheets ID}
+/edit#gid=
+{gid}
. - Query it directly using Google Sheets’s Export URL:
Loading editor...
SELECT * FROM read_csv_auto('https://docs.google.com/spreadsheets/export?format=csv&id=1Q6-rPG7F_ETHEKNG03YWBdk9CYb3QnSyYIKGVP2KC74&gid=0')
- That’s it!
Create Macro to query Google Sheets Data with DuckDB
Now to make things simpler, we can create a TABLE MACRO to query the data.
By leveraging MACRO
, we don’t have to remember Google Sheet’s Export URL everytime we want to query data. We just need to pass the Google Sheets ID
and gid
as the parameters.
Loading editor...
CREATE OR REPLACE MACRO read_gsheet(id, gid) AS TABLE FROM read_csv_auto( 'https://docs.google.com/spreadsheets/export?format=csv&id=' || id || '&gid=' || gid );
Now, after creating the TABLE MACRO
, we can query the same data with just this:
Loading editor...
SELECT * FROM read_gsheet( '1Q6-rPG7F_ETHEKNG03YWBdk9CYb3QnSyYIKGVP2KC74', 0 )
Here, we’re querying data about Boven-Digoel Concentration Camp from a Google Sheets.
Transform the data
Remember that we can also transform the Google Sheets data:
Loading editor...
WITH unpivot_alias AS ( UNPIVOT read_gsheet( '1Q6-rPG7F_ETHEKNG03YWBdk9CYb3QnSyYIKGVP2KC74', 540675803 ) gsheet ON COLUMNS ( * EXCLUDE (jenis_penyakit) ) INTO NAME tahun VALUE jumlah ) SELECT * REPLACE(REPLACE(tahun, 'tahun_', '')::INT AS tahun) FROM unpivot_alias
In conclusion, DuckDB offers a game-changing solution for those seeking to unlock the full potential of their Google Sheets data. With the ability to seamlessly bridge the gap between Google Sheets and advanced data querying, you can now transform your raw data into actionable insights with ease.
By following the steps outlined in this post, you can streamline your data access, create macros for efficient queries, and even perform data transformations with unparalleled flexibility.
Stay tuned for more informative recipes in the Ducklicious DuckDB Cookbook.