Source-Available Core · Multi-Cloud · SOC 2 Ready

Your Data.
Your Rules.
Your Island.

The sovereign lakehouse platform with versioned storage, SQL analytics, enterprise security, and AI-native integration. Runs on your infrastructure.

Data Island platform architecture — layered view showing Gatekeeper security foundation, Core data platform, Studio workshop, and Agents AI tier
The future of AI + Data

Versioned lakehouse + agentic AI workbench.

One platform, two surfaces. SQL-first storage for engineers and BI, natural-language reasoning for everyone else — sharing the same data, the same RBAC, the same audit trail.

Data Island Core: versioned lakehouse
Command center with SQL Editor, Tables, Ingestion, and Data Quality.

Lighthouse: agentic AI workbench
Chat workspace that turns a natural-language question into a chart, SQL, and an explanation.

Four Layers. One Platform.

Data Island is a complete data platform built from the ground up with security, governance, and AI at its core.

Lighthouse (Available)
AI workbench layer: natural-language chat, auto-discovered catalogs, a per-table quality copilot, stewardship workflows, and multi-LLM routing.
NL→SQL · Quality Copilot · Stewardship · Multi-LLM

Studio (Coming Soon)
Developer & operations environment: build, schedule, and operate data flows.
Connectors · Notebook · Pipes · Scheduler

Core (Available)
Versioned lakehouse engine: immutable storage, SQL analytics, multi-cloud, RBAC, and full API.
Versioned Storage · SQL Analytics · OData · MCP

Gatekeeper (Available)
Zero-trust identity & authorization: OAuth2/OIDC, RS256 JWT, SCIM 2.0, encrypted storage, compliance-ready audit.
OAuth2/OIDC · SCIM 2.0 · RS256 JWT · Fernet Encryption · SOC 2 / DORA

Everything You Need to Own Your Data

A complete data platform that replaces a dozen tools. Versioned, governed, and ready for AI.

Versioned Storage

Every write creates an immutable snapshot. No data is ever silently overwritten. Point-in-time queries let you see data as it existed at any moment.
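
To make the snapshot model concrete, here is a minimal in-memory sketch. It is an illustration only, not Data Island's storage engine or SDK: the `Snapshot` and `VersionedTable` names, the `write` and `as_of` methods, and the in-memory list are all assumptions made for this example.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of a table (hypothetical structure)."""
    version: int
    committed_at: datetime
    rows: tuple[dict[str, Any], ...]


@dataclass
class VersionedTable:
    """Append-only list of snapshots; earlier snapshots are never replaced."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def write(self, rows: list[dict[str, Any]],
              committed_at: datetime | None = None) -> Snapshot:
        """Record a new snapshot instead of overwriting the previous one."""
        committed_at = committed_at or datetime.now(timezone.utc)
        if self.snapshots and committed_at < self.snapshots[-1].committed_at:
            raise ValueError("commit times must not go backwards")
        # Rows are shallow-copied so later changes by the caller do not leak in.
        snap = Snapshot(len(self.snapshots) + 1, committed_at,
                        tuple(dict(r) for r in rows))
        self.snapshots.append(snap)
        return snap

    def as_of(self, moment: datetime) -> Snapshot:
        """Return the latest snapshot committed at or before `moment`."""
        times = [s.committed_at for s in self.snapshots]
        idx = bisect_right(times, moment)
        if idx == 0:
            raise LookupError("no snapshot exists at or before that time")
        return self.snapshots[idx - 1]
```

Writing twice and then calling `as_of` with a timestamp between the two commits returns the first snapshot, which is the behaviour a point-in-time query relies on.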

Multi-Cloud Freedom

Run on AWS S3, Azure Blob, Google Cloud Storage, MinIO, or local disk. Switch backends with a configuration change — no migration project.
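
As a rough sketch of what switching backends through configuration could look like, the snippet below resolves a backend from a single storage URI. The environment variable name `DATA_ISLAND_STORAGE_URI`, the URI schemes, and the default path are hypothetical and chosen only for this example. The page does not document the actual setting.

```python
import os
from urllib.parse import urlsplit

# Hypothetical variable name; the real setting is not documented on this page.
ENV_VAR = "DATA_ISLAND_STORAGE_URI"

# Hypothetical scheme-to-backend mapping covering the five listed backends.
BACKENDS = {
    "s3": "AWS S3",
    "az": "Azure Blob",
    "gs": "Google Cloud Storage",
    "minio": "MinIO",
    "file": "local disk",
}


def resolve_backend(environ=None):
    """Return (backend name, bucket or host, path) parsed from the storage URI."""
    environ = os.environ if environ is None else environ
    uri = environ.get(ENV_VAR, "file:///var/lib/data-island")
    parts = urlsplit(uri)
    if parts.scheme not in BACKENDS:
        raise ValueError(f"unsupported storage scheme: {parts.scheme!r}")
    return BACKENDS[parts.scheme], parts.netloc, parts.path
```

Under these assumptions, moving from local disk to S3 means changing one value, for example from `file:///var/lib/data-island` to `s3://my-bucket/lake`.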

SQL-First Analytics

Standard SQL queries with automatic engine selection. DuckDB for fast OLAP, Spark SQL for heavy workloads. Zero configuration required.

Enterprise Security

Role-based access control with row-level and column-level security. Token-based auth with configurable permission tiers. Gatekeeper handles identity.

AI-Native Integration

Built-in MCP (Model Context Protocol) server lets Claude Desktop and other AI assistants query tables, explore schemas, and execute SQL through natural conversation.

BI Tool Integration

OData 4.0 endpoint for Power BI, Excel, and Tableau. Analysts configure a URL and token — no drivers, no plugins, no ETL pipelines.

Up and Running in Minutes

Three steps from zero to querying your data.

1. Connect

Point Data Island at your storage backend: AWS S3, Azure Blob, Google Cloud Storage, MinIO, or local disk. One environment variable.

2. Ingest

Write data via the REST API or Python SDK. Every write is automatically versioned, deduplicated, and audit-logged.

3. Analyze

Query with SQL, connect BI tools via OData, or use AI assistants through MCP. The query engine is selected automatically based on data size.
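
Size-based engine selection can be pictured with a small sketch. The page does not state the real cutoff or the signals the platform uses, so the 50 GiB threshold, the `estimated_scan_bytes` input, and the `select_engine` function are assumptions for illustration.

```python
from enum import Enum


class Engine(Enum):
    DUCKDB = "duckdb"        # fast OLAP on smaller data
    SPARK_SQL = "spark_sql"  # heavy workloads


# Hypothetical cutoff; the platform's actual threshold is not documented here.
SPARK_THRESHOLD_BYTES = 50 * 1024**3


def select_engine(estimated_scan_bytes: int,
                  threshold: int = SPARK_THRESHOLD_BYTES) -> Engine:
    """Pick an engine from the estimated amount of data a query will scan."""
    if estimated_scan_bytes < 0:
        raise ValueError("estimated_scan_bytes must be non-negative")
    if estimated_scan_bytes >= threshold:
        return Engine.SPARK_SQL
    return Engine.DUCKDB
```

In this sketch a query scanning a few hundred megabytes goes to DuckDB, while one scanning more than the threshold goes to Spark SQL, and the caller never names an engine.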

Built for Every Data Role

One platform, six workflows: from engineering to compliance to AI-assisted analytics.

Data Engineers

Build & Ingest

Write with the Python SDK, query with SQL. The platform handles schema evolution, dedup, and storage optimization — no Spark cluster to manage.

Python SDK · Schema Evolution · Streaming Writes

Compliance Officers

Audit & Governance

Immutable audit trails with SHA-256 hash chains, row- and column-level access control, and 7-year log retention. Built for DORA, SOC 2, and GDPR.
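
The idea behind a SHA-256 hash chain is that each audit entry's hash covers the previous entry's hash, so editing any past entry invalidates every later one. The sketch below shows the general technique. The record layout, the all-zero genesis value, and the function names are assumptions, not Data Island's audit log format.

```python
import hashlib
import json

# Assumed starting value for the first entry's "previous hash".
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS_HASH
    record = {"event": event, "prev_hash": prev, "hash": entry_hash(prev, event)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = GENESIS_HASH
    for record in log:
        if record["prev_hash"] != prev:
            return False
        if record["hash"] != entry_hash(prev, record["event"]):
            return False
        prev = record["hash"]
    return True
```

After appending a few events, `verify_chain` returns `True`. Changing a field in an earlier event without recomputing the hashes makes it return `False`.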

DORA · SOC 2 · GDPR · 7-yr Retention

BI Analysts

Analyze & Visualize

Connect Power BI or Excel directly via OData. Live dashboards against production data — no waiting for engineers to build ETL pipelines.

OData 4.0 · Power BI · Excel · Tableau

AI / ML Teams

Reason & Automate

Connect Claude Desktop or Cursor via MCP. Query tables, explore schemas, and profile columns through natural-language conversation.

MCP Server · Claude Desktop · Cursor

Platform Builders

Scale & Integrate

Multi-org isolation, cross-org data sharing, and open-format mirroring for interoperability with the Spark, Databricks, and dbt ecosystems.

Multi-Org · Open Parquet · Spark Mirror · dbt

Decision Makers

Control & Sovereignty

Own the data layer end-to-end. No phone-home, no per-query egress, no vendor lock-in — sovereign infrastructure with a transparent license.

EU-Hosted · No Telemetry · Source-Available

5 Storage Backends
128+ REST API Endpoints
24 MCP Tools
16 Data Quality Checks

Ready to Take Control of Your Data?

Start with the free Community edition or talk to our team about Enterprise deployment.