
Apache Iceberg is an open-source table format that provides database-like functionalities such as ACID transactions, schema evolution, and time travel for large analytic datasets stored in data lakes. Iceberg Spec v3 introduces significant advancements, including binary deletion vectors for faster deletes and updates, richer data types like variant for semi-structured data, nanosecond-precision timestamps, and built-in row lineage for enhanced data governance. 

Starburst Enterprise and Starburst Galaxy support Iceberg v3, enabling users to benefit from improved query performance, new use cases, and robust data tracking capabilities. 


A short history of the Apache Iceberg table format

Apache Iceberg is a high-performance, open table format for large analytic datasets. Importantly, Apache Iceberg is not a storage system or a query engine itself; instead, it’s a layer that sits on top of your existing data lake storage, whether that is Amazon S3, Azure Data Lake Storage, or HDFS. This layered approach brings database-like functionality to the data files associated with a table.

Any data professional who has worked with data lakes knows the journey from a collection of raw files to a reliable, queryable table has been a long one. Several years ago, Apache Iceberg emerged to address the limitations of file-based data lakes by introducing a metadata-driven approach that brought much-needed ACID transactions, schema evolution, and time travel to the open data ecosystem.

From Iceberg v1 to v2: Laying the Foundation for Updates 

The initial Iceberg v1 spec was a revolution in its own right. It broke the tight coupling between data files and directory structures, a common problem with formats like Hive.

Iceberg Spec v2 addressed the challenge of efficient row-level updates and deletes for real-time analytics and CDC by introducing positional delete files. While this enabled efficient merge-on-read operations, it also introduced a potential drawback in the form of query engine slowdowns due to reconciling numerous small delete files.

The Leap to v3: Speed, Richness, and Data Governance 

The Iceberg Spec v3 is built on the solid foundation created by v1 and v2. It represents a strategic move to push Iceberg further into the realm of high-performance, complex data workloads. This includes the following changes. 

Deletion vectors for blazing-fast deletes

Iceberg v3 moves beyond positional delete files by introducing binary deletion vectors. This new, more compact format simplifies the process of applying deletes. It is designed to significantly improve read performance, particularly in scenarios with high-frequency updates and a large number of deletes. This is a game-changer for high-throughput CDC pipelines.

Richer data types for complex data  

Iceberg v3 supports more complex data through the new variant data type, which is especially valuable for semi-structured data such as JSON. The new geometry and geography data types bring first-class support for geospatial analytics, and v3 also adds nanosecond-precision timestamps and an unknown type, making Iceberg a more flexible and expressive format.

Built-in Row Lineage for auditing and data governance 

Iceberg v3 adds support for row-level lineage. It tracks a row’s history by embedding metadata such as a stable row ID and the sequence number of the row’s last modification. This makes it much easier to implement robust CDC pipelines and provides a clear audit trail for compliance and debugging.

Default Values for enhanced schema evolution

In Iceberg v3, schema evolution is now even more seamless, allowing users to define default values for new columns.

Starburst delivers support for Iceberg Spec v3

Starburst remains committed to developing a leading Apache Iceberg connector for its Query Engine. Starburst’s investment in this area is a core part of its mission to enable fast, reliable, and open analytics on data lakes.

We’re proud to announce that Starburst Enterprise (Release 476-e) and Starburst Galaxy now support Iceberg Spec v3. This isn’t just a simple update; it’s a major milestone that directly translates into real-world benefits for our users. These benefits include:

Faster Queries, Fewer Headaches

The support for binary deletion vectors means that our Trino query engine can process updates and deletes with significantly improved efficiency. For use cases like change data capture (CDC) and real-time analytics, this translates to faster query execution and lower compute costs.

Unlocking New Use Cases

With support for variant data types, you can now analyze semi-structured data directly in your data lake without complex ETL processes. Support has been added for nanosecond-precision timestamps, which is critical for time-series analytics and other applications where high-resolution temporal data is essential. Support for defining default values for new columns of a table enables more seamless schema evolution scenarios.

Enhanced Data Governance and Trust

The native support for row-level lineage in v3 allows you to track the history of individual rows. This is a powerful feature for auditing, compliance, and building trust in your data products.

Our investment in the latest Apache Iceberg features not only keeps us current with the industry but also positions us as leaders.


How to use Iceberg v3 in Starburst

How does it work in practice? In this section, we’ll show you step by step how to use Iceberg v3 with the Starburst Iceberg connector. Iceberg introduced format version 3 (v3) to support new features, including deletion vectors, default column values, new data types, and row lineage. Starburst gives you two ways to control which format version is used:

  • Table-level setting: using the format_version table property
  • Catalog-level setting: using the iceberg.format-version catalog configuration property
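For example, to make v3 the default for every new table in a catalog, set the catalog configuration property in that catalog’s properties file. A minimal sketch, assuming a hypothetical catalog file named example.properties:

connector.name=iceberg
iceberg.format-version=3

With this in place, tables created without an explicit format_version table property use v3.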

Creating a new v3 table

If you want to create a new table directly in v3 format, simply specify the table property in the WITH clause:

CREATE TABLE v3 (id int) 
WITH (format_version = 3);

This ensures that the new table is initialized with the v3 metadata layout.

Upgrading an existing table to v3

If you already have tables in v1 or v2 format, you can upgrade them to v3 using the ALTER TABLE statement:

ALTER TABLE t1 SET PROPERTIES format_version = 3;

⚠️ Important: Before upgrading, make sure the table is not being accessed by engines or applications that don’t support v3 yet. Once upgraded, downgrading to v1 or v2 is not supported. For this reason, all readers must be compatible with v3. Note that the default version remains v2.
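To confirm which version a table is using, one quick check is to inspect its table properties. For example, assuming your release reports format_version in the SHOW CREATE TABLE output:

SHOW CREATE TABLE t1;

The WITH clause in the output should now list format_version = 3.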

Deletion vector

One of the key innovations introduced in Iceberg v3 is the use of deletion vectors. As was the case with v2, entire data files are not rewritten when rows are deleted or updated. Instead of writing a new delete file aligned to each affected data file, Iceberg v3 keeps track of deleted rows separately and can create a v3 delete file that spans many data files. This makes row-level modifications much more efficient than v2 position deletes by dramatically reducing the number of these small delete files.

When rows are deleted (or updated), Iceberg automatically generates a puffin file, which contains the deletion vector metadata. These puffin files indicate which rows in the existing Parquet files are no longer valid, eliminating the need to write a delete file for each affected data file.

Here’s a simple example:

DELETE FROM region WHERE name = 'AFRICA';
DELETE FROM region WHERE name = 'ASIA';
DELETE FROM region WHERE name = 'EUROPE';

After running these commands, you can query the special $files system table to inspect which files were affected:

SELECT file_path, file_format FROM "region$files";

Assuming three rows were deleted, the output might look like this:

file_path                    file_format
s3://…/data/data.parquet     PARQUET
s3://…/data/deletes.puffin   PUFFIN

As shown above, the Parquet file contains the original data, while the Puffin file holds the deletion vector information. With v2, a separate positional delete file would have been written for each of the three DELETE statements.

Image depicting how Iceberg v3 deletion vectors operate.

This design allows queries to efficiently skip deleted rows without having to read many small v2 delete files, improving both performance and storage efficiency:

  • Lower cloud storage costs (fewer files are created and written).
  • Faster updates and deletes on large tables.

Default column value

Iceberg (via the connector) now supports default values for columns in CREATE TABLE statements. This feature makes it easier to handle missing values during inserts by automatically filling in predefined defaults.

You can specify a default value for a column using the DEFAULT keyword. For example, the following statement creates a table where the data column defaults to ‘bob’ if no value is provided. Note that only literal values of primitive types are supported. Complex types such as arrays, maps, structs, and variants, as well as other expressions like CURRENT_USER, are not allowed.

CREATE TABLE default_column (
    id int,
    data varchar DEFAULT 'bob'
);

Now, when you insert rows, you can either provide a value for data explicitly or omit it and let Iceberg assign the default:

INSERT INTO default_column VALUES (1, 'alice');
INSERT INTO default_column (id) VALUES (2);

Querying the table shows the expected behavior:

SELECT * FROM default_column;

id  data
1   alice
2   bob

As you can see, the second row automatically filled in ‘bob’ for the data column, since no value was provided.

You can add a new column with a default value like this:

ALTER TABLE default_column 
ADD COLUMN birthday DATE DEFAULT DATE '2025-06-02';
INSERT INTO default_column (id, data) VALUES (3, 'charlie');
SELECT * FROM default_column;

id  data     birthday
1   alice    NULL
2   bob      NULL
3   charlie  2025-06-02

This feature helps simplify schema design and reduces the need for application-side logic to handle missing values.

Variant type

Notably, with v3 the Iceberg connector now supports JSON data by mapping Trino’s JSON type to Iceberg’s VARIANT type. This feature allows you to store and query semi-structured data in Iceberg tables while maintaining compatibility with Trino’s SQL syntax.

For example, when you create a table with a JSON column in Trino, the connector transparently maps it to a VARIANT column in Iceberg:

CREATE TABLE variant (data JSON);
INSERT INTO variant 
  VALUES JSON '{"company": "Starburst"}', JSON '{"engine": "Trino"}';

Querying the table returns the JSON values as expected:

data
{"company":"Starburst"}
{"engine":"Trino"}
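Because the column surfaces as Trino’s JSON type, standard Trino JSON functions continue to work against it. As a small sketch using the table above, you could extract a single field with json_extract_scalar; rows without that field simply return NULL:

SELECT json_extract_scalar(data, '$.company') AS company FROM variant;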

How it’s stored internally

While the query results look like regular JSON, the underlying Parquet files reveal that Iceberg represents the VARIANT type using two fields:

  • metadata – contains structural information, such as field names
  • value – contains the actual encoded data

A simplified example of the internal Parquet structure looks like this:

[
  {
    "data": {
      "metadata": "\u0001\u0001\u0000\u0007company",
      "value": "\u0002\u0001\u0000\u0000\n%Starburst"
    }
  },
  {
    "data": {
      "metadata": "\u0001\u0001\u0000\u0006engine",
      "value": "\u0002\u0001\u0000\u0000\u0006\u0015Trino"
    }
  }
]

This encoding enables Iceberg to efficiently store and query semi-structured JSON data while maintaining flexible schema evolution.

Nanosecond timestamp type

Iceberg v3 introduces support for nanosecond precision timestamps, which the Trino connector maps directly to Trino’s TIMESTAMP(9) and TIMESTAMP(9) WITH TIME ZONE types. This ensures that applications can capture events with extremely high precision — down to the nanosecond — which is valuable in use cases such as financial transactions, scientific measurements, or system event logging.

For example, you can define a table with a TIMESTAMP(9) column and insert a nanosecond-precision value:

CREATE TABLE timestamp_nanos (data TIMESTAMP(9));
INSERT INTO timestamp_nanos VALUES TIMESTAMP '2025-08-21 12:34:56.123456789';
SELECT * FROM timestamp_nanos;

Query results preserve the full nanosecond precision:

data
2025-08-21 12:34:56.123456789

These nanosecond-precision timestamps are also fully compatible with Iceberg’s partition transforms. You can continue to use familiar expressions such as:

  • year(ts)
  • month(ts)
  • day(ts)
  • hour(ts)
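For illustration, here is a minimal sketch of a day-partitioned table over a TIMESTAMP(9) column, using the connector’s standard partitioning table property (the table and column names are made up):

CREATE TABLE events (
    event_time TIMESTAMP(9),
    payload VARCHAR
)
WITH (partitioning = ARRAY['day(event_time)']);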

This makes it easy to organize fine-grained event data without losing precision at the storage level.

Row lineage

Iceberg v3 introduces row lineage, a feature that can be used to build incremental materialized views in the future. The connector adds new metadata columns that make it easier to track the lifecycle of rows over time. These columns are automatically available and require no schema changes:

  • $row_id: An implicit, unique row identifier that is automatically assigned when a row is created.
  • $last_updated_sequence_number: A sequence number that indicates the last time a row was updated.

This feature allows you to reason about row versioning and lineage directly from queries.

For example, you can inspect these metadata columns with a simple query:

SELECT "$row_id", "$last_updated_sequence_number", name 
FROM region;

Result:

$row_id  $last_updated_sequence_number  name
0        1                              AFRICA
1        1                              AMERICA
2        1                              ASIA
3        1                              EUROPE
4        1                              MIDDLE EAST

Updating Rows

When a row is updated, the $row_id remains stable, ensuring consistent row identity. However, the $last_updated_sequence_number increments, indicating a new version of that row.

For instance, updating the ASIA row:

UPDATE region 
SET name = 'Asia' 
WHERE name = 'ASIA';

SELECT "$row_id", "$last_updated_sequence_number", name 
FROM region;

Result:

$row_id  $last_updated_sequence_number  name
0        1                              AFRICA
1        1                              AMERICA
2        2                              Asia
3        1                              EUROPE
4        1                              MIDDLE EAST

Image denoting row lineage in Iceberg v3.


This capability is beneficial for:

  • Auditing row-level changes
  • Debugging data pipelines
  • Building slowly-changing dimension (SCD) tables
  • Tracking row versions across snapshots


Why Iceberg v3 matters

Iceberg v3 represents a significant leap forward in table format capabilities, delivering features that enhance performance, expand data type support, and strengthen data governance. 

With the introduction of binary deletion vectors, richer data types like variant and nanosecond precision timestamps, and built-in row lineage, Iceberg continues to solidify its position as a leading open table format for modern data lakes. 

Starburst’s immediate and comprehensive support for Iceberg v3 means users can leverage these advancements today, accelerating their analytics, enabling new use cases, and building even greater trust in their data. 

We strongly encourage you to explore these new features and experience the power of Iceberg v3 with Starburst.
