Starburst integration with Amazon S3 Tables
Yuya Ebihara
Software Engineer
Starburst
Lester Martin
Developer Advocate
Starburst


More deployment options
Starburst, the data platform for analytics, apps, and AI, has once again joined forces with Amazon Web Services (AWS). This time, we are thrilled to announce that Starburst Galaxy now seamlessly supports Amazon S3 Tables.
This is great news for several reasons. For Starburst users, it offers even more optionality. Starburst believes in choice, and this move brings yet more choice to your data architecture. For AWS, it expands access beyond native AWS services.
Excited? Read more. We’ll show you why this matters, what it means for Iceberg and Starburst, and how to use this new feature in Starburst Galaxy.
Amazon S3 Tables + Apache Iceberg + Starburst
Amazon S3 Tables offer a REST catalog endpoint that allows Starburst to integrate with this new storage type. This integration not only allows you to query and modify Amazon S3 Tables using Apache Iceberg but also federates access to this data alongside data from any other data source. With this, Amazon S3 Tables join the more than 50 other data sources accessible via Starburst. Once accessed, this data can be used to power workflows that support analytics, data applications, or AI/ML use cases.
What are Amazon S3 Tables?
Amazon S3 Tables were introduced by AWS at the end of 2024. They operate as a new bucket type for Amazon S3. You can think of them as a managed hosting offering for Apache Iceberg tables. The AWS documentation, Working with Amazon S3 Tables and table buckets, provides more information, as well as instructions on how to create this new object storage bucket type. Additionally, Amazon S3 Tables automatically handle table maintenance activities, including table compaction.
Starburst and Iceberg in production
Starburst and Iceberg have a long history together. Apache Iceberg is the foundation of the Starburst Icehouse architecture, and the table format of choice for our compute engine. In production, Starburst clusters use the Iceberg connector to access Iceberg tables. This includes storing Iceberg metadata and data files on S3 buckets and integrating with a variety of metastores, including Iceberg REST catalogs.
What makes Amazon S3 Tables different? Amazon S3 Tables feature their own mechanisms for controlling catalog access and addressing security. This can impact how organizations implement data governance, enforce data access policies, and integrate with existing security frameworks.
Let’s look at this in more detail.
The importance of Iceberg table maintenance
Iceberg requires table maintenance activities such as compaction, snapshot expiration, and orphaned file removal. Starburst has already automated this effort with new features that assist with data maintenance. What sets Amazon S3 Tables apart is that these maintenance tasks are handled automatically by AWS.
This means users can access Amazon S3 Tables just like any other Iceberg table. This includes executing federated queries against them with any other configured data source from our extensive list of connectors. Starburst’s integration with Iceberg, coupled with a REST catalog interface from AWS, ensures a seamless fit between these two technologies.
How to connect Starburst to Amazon S3 Tables
Want to get hands-on? One of the best things about this new integration is that you can try it out for yourself.
The documentation for Starburst Enterprise’s Iceberg connector already includes instructions showing you how to configure Amazon S3 Tables. Additionally, a joint article between AWS and Starburst details how to build a managed Apache Iceberg data lake using Starburst and Amazon S3 Tables.
Let’s examine how to integrate Amazon S3 Tables using Starburst Galaxy. These instructions are also included in our Starburst Galaxy documentation.
Note: This is a public preview feature. Contact Starburst support with questions or feedback.
Prerequisites
To complete this configuration, you need access to Starburst Galaxy. Check out our free trial if you are not already set up. You will also need an existing Amazon S3 table bucket. The AWS tutorial, Getting started with S3 Tables, provides detailed instructions if needed.
Step 1 – Prepare Amazon S3 bucket
Go to the Amazon S3 page in your AWS Console and select Table buckets. As shown in the screenshot below, collect the region, account ID, and bucket name.
Each S3 table bucket has a unique “table bucket ARN” that follows this convention.
arn:aws:s3tables:{REGION}:{ACCOUNT_ID}:bucket/{S3_BUCKET_NAME}
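Construct this string based on your specific values. For example, with the hypothetical values us-east-1, 123456789012, and my-table-bucket, the ARN would be arn:aws:s3tables:us-east-1:123456789012:bucket/my-table-bucket. You will need this in the next step.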
Step 2 – Set the Amazon S3 Tables catalog
After logging into Starburst Galaxy, navigate to Data and then Catalogs from the menu on the left. On the page that opens, click Create catalog.
Click on the Amazon S3 Tables option.
As detailed in the Starburst documentation, define the catalog Name and description, and complete the Amazon S3 Tables configuration. Use the string created in the prior step for the Table bucket ARN value. Click Test connection. A confirmation message appears along with a new Connect catalog button. Click it to continue.
Refer to the documentation for the values on this new configuration screen, then click Set permissions & add to cluster.
Step 3 – Define default schema
S3 Tables do not come with a default schema. In Starburst Galaxy, navigate to the Query editor. Assuming you named your new catalog s3tables, create a schema named example with this SQL statement.
CREATE SCHEMA s3tables.example;
Note: In Starburst, a schema corresponds to a namespace in Amazon S3 Tables.
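To confirm that the schema from this step exists, a quick check along the following lines may help. This is a minimal sketch that assumes the catalog is named s3tables, as above, and uses standard Starburst SQL statements. The IF NOT EXISTS clause is an optional addition that makes the statement safe to re-run.

```sql
-- Assumes the catalog is named s3tables, as in the step above.
-- IF NOT EXISTS avoids an error if the schema was already created.
CREATE SCHEMA IF NOT EXISTS s3tables.example;

-- List the schemas (namespaces in Amazon S3 Tables) visible in the catalog.
-- The result should include the example schema.
SHOW SCHEMAS FROM s3tables;
```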
Step 4 – Create and read a new table
The schema can now be populated with tables. Using the CREATE TABLE AS SELECT (CTAS) approach, you can create a new table from any table in any of the catalogs configured in your cluster. For this scenario, you can use the Starburst Galaxy sample dataset.
CREATE TABLE s3tables.example.account AS SELECT * FROM sample.burstbank.account;
Now, verify the data was inserted into the new table.
SELECT * FROM s3tables.example.account;
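As a further check, the row counts of the source and the copy can be compared in a single statement. The sketch below uses only the catalog, schema, and table names from this walkthrough. Because it reads from two catalogs at once, it also serves as a small example of a federated query.

```sql
-- Compare row counts between the sample source table and the
-- new table stored in Amazon S3 Tables. Both values should match
-- immediately after the CTAS statement completes.
SELECT
  (SELECT count(*) FROM sample.burstbank.account) AS source_rows,
  (SELECT count(*) FROM s3tables.example.account) AS copied_rows;
```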
Note: The S3 Tables integration supports Iceberg features like time travel queries, schema evolution, and more. However, some maintenance procedures, such as expire_snapshots, are not supported. Instead, these are handled automatically by AWS for your Amazon S3 Tables.
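The time travel and schema evolution features mentioned in the note could be exercised roughly as follows. This is a sketch, not a verified recipe for S3 Tables: it uses the syntax of the Iceberg connector and assumes that the $snapshots metadata table is available for tables in this catalog. The snapshot ID, the timestamp, and the column name note are placeholders to replace with your own values.

```sql
-- List the snapshots recorded for the table (Iceberg metadata table).
SELECT snapshot_id, committed_at, operation
FROM s3tables.example."account$snapshots"
ORDER BY committed_at;

-- Time travel by snapshot ID. Replace the placeholder with a
-- snapshot_id value returned by the query above.
SELECT count(*)
FROM s3tables.example.account FOR VERSION AS OF 1234567890123456789;

-- Time travel by timestamp. The placeholder must be at or after the
-- time the first snapshot was committed.
SELECT count(*)
FROM s3tables.example.account
FOR TIMESTAMP AS OF TIMESTAMP '2025-01-01 00:00:00 UTC';

-- Schema evolution: add a hypothetical column named note.
ALTER TABLE s3tables.example.account ADD COLUMN note varchar;
```

Earlier snapshots remain queryable only until they are expired, and with Amazon S3 Tables that expiration is managed by AWS rather than by the expire_snapshots procedure.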
Starburst and Amazon S3 Tables: The perfect match
Starburst and Amazon S3 Tables are a natural fit. Compatibility with Starburst Enterprise and Starburst Galaxy builds on our long-standing focus on Apache Iceberg and benefits from the table maintenance tasks that AWS automates. Starburst is an AWS Data and Analytics and Financial Services Competency Partner and is available via AWS Marketplace.
Take the next step with Starburst
Would you like to automatically maintain Iceberg tables that aren’t persisted with Amazon S3 Tables? This includes other tables backed by normal S3 buckets. Starburst Galaxy offers automated data maintenance across all storage types.
These maintenance jobs help boost performance and reduce storage usage for Apache Iceberg tables. Supported tasks include data file compaction, statistics collection, and cleanup of outdated snapshots and orphaned files.