
AI workflows require more than raw compute power. They need a data foundation that is accessible, governed, and built to scale across your organization. Starburst Galaxy delivers this using a data lakehouse architecture that combines Apache Iceberg with the performance of Trino. Together, they provide the single, reliable foundation needed to move from analytics to production AI in the cloud.
Over the past three months, we have introduced several key enhancements to Starburst Galaxy, including:
- File ingestion supporting AWS S3
- Iceberg Data Maintenance
- SQL Jobs and Automated Materialized View Refresh
- Performance Improvements
- AI Features
Together, these updates strengthen Galaxy as the easiest way to run analytics and AI-ready workloads on Iceberg. By unifying ingestion, maintenance, performance, and AI features, Starburst Galaxy ensures that your data is always accessible, governed, and optimized for both analytics and AI.
Iceberg helps organizations deliver AI
Galaxy already leverages Iceberg to create a single foundation for all your data. Recently, we’ve expanded its capabilities to make it even easier to build governed, high-performance AI workflows using Apache Iceberg.
These improvements are not just incremental upgrades. They push Starburst further as the best foundation for modern analytics and AI.
Let’s look at them one by one.
New support for Iceberg v3
Iceberg v3 is the latest iteration of Apache Iceberg, designed to improve both performance and flexibility for large-scale analytic workloads. It includes a number of enhancements, specifically:
- Binary deletion vectors that speed up row-level deletes and updates by avoiding the overhead of many small delete files.
- New data types, including variant, that expand support for semi-structured and time-series workloads.
- Row-level lineage to bring clearer auditability and governance.
These improvements are available immediately, giving teams better performance, broader use cases, and stronger trust in their data.
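To make the deletion vector idea concrete, here is a minimal sketch in Python. It is a toy model, not Iceberg's on-disk format: it assumes one bitmap per data file, uses a plain integer as that bitmap, and every name in it (`mark_deleted`, `read_live_rows`, the sample rows) is hypothetical. The point it illustrates is that a delete sets one bit instead of writing another small delete file, and readers apply the bitmap at scan time.

```python
# Toy model only: a Python int stands in for the per-file deletion bitmap.
# Real Iceberg v3 deletion vectors use a compressed bitmap format.

def mark_deleted(vector: int, position: int) -> int:
    """Return a new bitmap with the bit for row `position` set."""
    return vector | (1 << position)


def is_deleted(vector: int, position: int) -> bool:
    """Check whether the row at `position` has been deleted."""
    return (vector >> position) & 1 == 1


def read_live_rows(rows: list[str], vector: int) -> list[str]:
    """Apply the deletion vector at read time; the data file is never rewritten."""
    return [row for pos, row in enumerate(rows) if not is_deleted(vector, pos)]


data_file = ["alice", "bob", "carol", "dave"]  # hypothetical rows in one data file
dv = 0
dv = mark_deleted(dv, 1)  # delete "bob"
dv = mark_deleted(dv, 3)  # delete "dave"

live = read_live_rows(data_file, dv)
print(live)  # ['alice', 'carol']
```

Both deletes land in the same bitmap, so the number of delete artifacts per data file stays at one no matter how many rows are removed.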
Announcing Iceberg branching
Starburst Galaxy now supports Iceberg branching, bringing Git-style version control directly into your data workflows.
Iceberg branching allows teams to experiment with transformations, run large backfill jobs, or test what-if scenarios without touching production data. It is also a powerful way to enable safe collaboration, with multiple teams working in parallel while maintaining data integrity.
With Starburst Galaxy, branching is built into the same environment you already use for analytics and AI, alongside access controls and governance features. This makes it simple to isolate changes, audit them, and then publish confidently, giving organizations both speed and trust as they scale their Iceberg workloads in the cloud.
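The isolate-then-publish flow described above can be sketched with a small in-memory model. This is an illustration under stated assumptions, not Galaxy's or Iceberg's API: `ToyTable`, its methods, and the branch name `backfill` are all hypothetical. It shows the core idea that a branch is a named pointer to a snapshot, so writes to a branch never change what readers of `main` see until the branch is published.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for a table with snapshots and branch refs."""

    snapshots: dict[int, tuple[str, ...]] = field(default_factory=lambda: {0: ()})
    refs: dict[str, int] = field(default_factory=lambda: {"main": 0})

    def create_branch(self, name: str, from_ref: str = "main") -> None:
        # A branch is only a named pointer to an existing snapshot; no data is copied.
        self.refs[name] = self.refs[from_ref]

    def append(self, ref: str, rows: list[str]) -> None:
        # Each write creates a new immutable snapshot and moves only this ref.
        new_id = max(self.snapshots) + 1
        self.snapshots[new_id] = self.snapshots[self.refs[ref]] + tuple(rows)
        self.refs[ref] = new_id

    def read(self, ref: str = "main") -> tuple[str, ...]:
        return self.snapshots[self.refs[ref]]

    def fast_forward(self, target: str, source: str) -> None:
        # Publish by moving the target pointer to the source branch's snapshot.
        # This toy skips the ancestry check a real system would perform.
        self.refs[target] = self.refs[source]


table = ToyTable()
table.append("main", ["order-1"])
table.create_branch("backfill")
table.append("backfill", ["order-0"])

main_before_publish = table.read("main")  # the backfill is not visible yet
table.fast_forward("main", "backfill")
main_after_publish = table.read("main")   # the backfill is now published
```

Because snapshots are immutable in this model, auditing a change amounts to comparing the snapshot a branch points at with the one `main` points at before publishing.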
Galaxy supports file ingest using AWS S3
Ingestion has always been central to Starburst Galaxy, whether streaming data in real time from Kafka or hydrating Iceberg tables through batch processes.
With this release, Galaxy now extends ingestion to AWS S3 files, allowing data engineers to move raw files directly into governed Iceberg tables without custom code or external orchestration.
This makes building and maintaining a live lakehouse dramatically easier. Files are continuously ingested as they land in S3, automatically optimized as Iceberg Live Tables, and immediately queryable for analytics or AI. The result is faster time to insight, less pipeline overhead, and one more reason to trust Galaxy as the single foundation for all your data.
Want to know more? Explore why ingestion matters in this step-by-step blog post.
Iceberg data maintenance is now easier than ever
Iceberg is the backbone of modern lakehouses, but regular data maintenance is still vital. As tables scale and evolve, they accumulate snapshots, metadata, and small files that can sometimes slow performance and inflate storage costs.
To combat this, regular maintenance helps to keep Iceberg running at its best.
With this release, Galaxy makes Iceberg maintenance easier than ever. You can now schedule maintenance tasks across a single table, multiple tables, or even entire schemas and catalogs, all using a no-code interface. Built-in retries, execution history, and error notifications ensure jobs run smoothly, while new automation removes the operational burden of manual upkeep.
The result is a data lakehouse that stays optimized, governed, and query-ready without constant intervention, so data teams can focus on building insights instead of babysitting infrastructure.
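One of the maintenance tasks mentioned above, dealing with small files, can be pictured with a short Python sketch. It is not Galaxy's implementation: the function `plan_compaction`, the 128 MB target, and the file sizes are assumptions chosen for illustration. It plans a compaction by greedily grouping undersized files into batches that each fit within the target size.

```python
def plan_compaction(file_sizes_mb: list[int], target_mb: int = 128) -> list[list[int]]:
    """Greedily group files smaller than the target into rewrite batches."""
    # Largest first, so big files claim space before small ones fill the gaps.
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)
    batches: list[list[int]] = []
    for size in small:
        for batch in batches:
            if sum(batch) + size <= target_mb:
                batch.append(size)
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([size])
    # A batch of one file gains nothing from rewriting, so skip it.
    return [b for b in batches if len(b) > 1]


sizes = [4, 8, 120, 16, 256, 64, 32]  # hypothetical data file sizes in MB
plan = plan_compaction(sizes)
print(plan)  # [[120, 8], [64, 32, 16, 4]]
```

In this example six undersized files would be rewritten as two, while the 256 MB file is left alone because it already exceeds the target.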
Support for SQL jobs and automated materialized views
Operational efficiency in a lakehouse comes down to repeatable, reliable tasks. With this release, Galaxy makes those tasks simpler and fully automated.
The SQL jobs feature is now generally available to all users, giving you the ability to schedule single SQL statements on a recurring basis. Jobs can be configured through the UI or API, tracked with execution history, and monitored with built-in failure notifications. Backend improvements also deliver greater reliability at scale, with multi-statement support planned for later this year.
Materialized view refresh automation is also generally available. From the Jobs interface, users can schedule and monitor refreshes, ensuring views stay accurate as underlying data changes. Whether configured through SQL or the UI, this feature keeps analytics performant and trustworthy without manual intervention.
How Starburst Galaxy helps improve performance
Starburst Galaxy is built to power both analytics and AI, and every update strengthens that foundation. This release delivers optimizations that speed up queries using large amounts of intermediate data, improve efficiency in resource-constrained environments, and enhance dynamic filtering.
Whether running complex analytical workloads or preparing data for AI pipelines, Galaxy ensures that your Iceberg data lakehouse performs reliably and cost-effectively as demands grow.
Let’s explore two areas of increased performance.
Starburst ODBC Driver
Starburst Galaxy now supports the V3 ODBC driver, delivering faster performance and stronger compatibility for BI workloads. The new driver is a direct replacement for version 2 and is available today for Windows users. If you connect to Galaxy through tools like Tableau or Power BI, upgrading to V3 ensures a smoother, more efficient experience.
AI-Powered Auto-Tagging
AI-Powered Auto-Tagging is now GA in Galaxy. This built-in capability is a major part of our AI story: it uses LLMs to automatically identify and tag columns containing PII, regulatory data, or anything else you define.
Starburst Galaxy drives AI workloads
AI initiatives succeed when data is both accessible and governed. Galaxy builds on Iceberg to give teams a consistent way to ingest, maintain, and share data while preserving context and control. By aligning collaboration with governance, it helps data engineers ensure that the same datasets support analytics and AI reliably, without sacrificing trust or compliance.
Let’s explore how Starburst Galaxy helps improve AI workloads.
Starburst AI Workflows
Starburst AI Workflows are a new suite of capabilities designed to accelerate enterprise AI adoption. They help move AI strategy from experimentation to production by making governed, proprietary data instantly usable for a variety of use cases. These features are fully integrated with Starburst Enterprise itself, so AI workloads run alongside your analytics. Starburst AI Workflows are designed to address the key challenges enterprises face when scaling AI initiatives, particularly in terms of access, usability, and control.
Highlights of this feature are listed below.
Starburst AI Search
The Starburst AI Search feature transforms structured, semi-structured, and unstructured data into vector embeddings stored in Iceberg tables, allowing AI agents to access and leverage the data.
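For readers new to vector search, the following Python sketch shows the general pattern of embedding text and retrieving the closest match. It is a simplified illustration, not how Starburst AI Search is implemented: the `embed` function is a hypothetical bag-of-words stand-in for a real embedding model, and the in-memory `index` list stands in for embeddings stored in a table.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


documents = [  # hypothetical documents
    "refund policy for damaged items",
    "shipping times for international orders",
    "how to reset your account password",
]
# Embed once at ingest time, then reuse the stored vectors for every query.
index = [(doc, embed(doc)) for doc in documents]


def search(query: str, k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


top = search("reset password")
print(top)  # ['how to reset your account password']
```

A production system would replace `embed` with a model that captures meaning rather than exact word overlap, but the store-then-rank flow is the same.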
Starburst AI SQL Functions
Starburst AI SQL Functions enable analysts to apply generative AI directly within SQL queries, making it easy to analyze and transform unstructured text using functions like classification, sentiment analysis, and translation.
Starburst AI Model Access Management
The Starburst AI Model Access Management feature provides a framework for managing access to proprietary and third-party AI models within a robust data governance framework.
Read a complete list of supported AI models here.
Why Starburst Galaxy is the easiest way to use Apache Iceberg
Collectively, these updates strengthen Starburst Galaxy’s position as the most complete environment for building and running Iceberg-based data lakehouses in the cloud.
By unifying ingestion, maintenance, performance, and AI features under one platform, Galaxy reduces operational friction while preserving governance and trust. The result is a data foundation that not only scales for analytics but also prepares organizations to deliver AI workloads with confidence and efficiency.



