DuckDB UNNEST: Preserving Rows with NULL and Empty Array
As data enthusiasts working with DuckDB, you might have encountered a common challenge when dealing with arrays—How to preserve rows with NULL
and empty array values during UNNEST
operations.
In this guide, we’ll explore this issue and provide you with multiple solutions to ensure you don’t lose valuable data.
The Scenario
Let’s consider a scenario where you have a table named base
with two columns: id
and arr
. This table contains various rows, some with NULL
values in the arr
column and others with empty array []
. Your goal is to perform an UNNEST
operation on this table, while keeping the NULL
and empty array rows intact.
Loading editor...
WITH base(id, arr) AS ( VALUES (1, NULL), (2, [1,2,3]), (3, []) ) -- Missing NULL and [] rows SELECT id, UNNEST(arr) AS val FROM base
In the above query, we want to ensure that rows with NULL
and []
are not omitted from the result set. Currently, they are missing from the output. We will fix this.
Solution I: Using UNION ALL
One way is to use UNION ALL
to combine the results of two separate queries. This approach allows you to preserve all rows while UNNEST
-ing the non-empty arrays.
Loading editor...
WITH base(id, arr) AS ( VALUES (1, NULL), (2, [1,2,3]), (3, []) ) -- Keeping NULL and [] rows SELECT id, UNNEST(arr) FROM base WHERE arr IS NOT NULL UNION ALL SELECT id, NULL FROM base WHERE arr IS NULL OR arr = []
Solution II: LEFT JOIN LATERAL with UNNEST
Another way to preserve the rows with NULL
and empty array values is by using a LEFT JOIN LATERAL
with UNNEST
. This approach ensures that all rows from the base table are retained, and UNNEST
-ed values are appropriately associated with their corresponding rows.
Loading editor...
WITH base(id, arr) AS ( VALUES (1, NULL), (2, [1,2,3]), (3, []) ) -- Keeping NULL and [] rows SELECT id, arr.val FROM base LEFT JOIN LATERAL UNNEST(arr) AS arr(val) ON TRUE
Solution III: Using a CASE Statement with UNNEST
In this approach, we utilize a CASE
statement within the UNNEST
operation to handle both NULL
and empty array values. If the arr
column contains an empty array or is NULL
, we replace it with an array containing a single NULL
element. This ensures that all rows are preserved during the UNNEST
operation, regardless of their original value in the arr
column.
Loading editor...
WITH base(id, arr) AS ( VALUES (1, NULL), (2, [1,2,3]), (3, []) ) -- Keeping NULL and [] rows SELECT id, UNNEST( CASE WHEN arr = [] THEN [null] WHEN arr IS NULL THEN [null] ELSE arr END ) AS val FROM base
Solution IV: Leveraging IF and IFNULL for UNNEST
Loading editor...
WITH base(id, arr) AS ( VALUES (1, NULL), (2, [1,2,3]), (3, []) ) -- Keeping NULL and [] rows SELECT id, UNNEST(IF(IFNULL(arr, []) != [], arr, [null])) AS val FROM base
In this solution, we use the IF
and IFNULL
functions to conditionally determine whether to UNNEST
the original arr
value or replace it with an array containing a single NULL
element.
Here’s a breakdown of the logic:
IFNULL(arr, [])
: This part checks if thearr
column isNULL
. If it’sNULL
, it returns an empty array[]
.IF(IFNULL(arr, []) != [], arr, [NULL])
: This part checks whether the array is empty or not. If it’s not empty, itUNNEST
s the original arrayarr
, preserving its elements. If it’s empty, it replaces it with an array containing a singleNULL
element.
By employing these approaches, you can maintain data integrity by preserving rows with NULL
and empty arrays during UNNEST
operations, ensuring that no valuable information is lost.
An empty array or an empty list []
and an array or list containing a single NULL
element [NULL]
are different concepts and have distinct meanings:
Empty Array/List
[]
: An empty array or list represents a collection with no elements. It is essentially a container with zero items. In SQL, an empty array signifies that there are no valid elements in the array.Array/List with NULL Element
[NULL]
: Represents an array or list that contains one element, and that element isNULL
. In SQL, this indicates the presence of a single element, which is explicitly set to aNULL
value.
Try out those approaches in your DuckDB projects and stay tuned for more informative recipes in the Ducklicious DuckDB Cookbook.