Sluice

Clean data flows through.

A config-driven ETL toolkit that validates your data before it reaches its destination — not after.

Data quality is the hidden blocker for both migrations and AI adoption.

Sluice is a data migration and data quality tool. You describe the entire migration as a YAML file: where the data comes from, the quality rules it has to pass, and how each field maps to the target. Sluice validates the source, transforms it, and loads only the clean records; the bad rows go to a rejection report so you can fix the source.

Config-driven

Pipelines defined in YAML, no code required for standard migrations.

Source & target agnostic

Built-in adapters for MSSQL, PostgreSQL, CSV, XLSX, and REST. ERP connectors (IFS, Business Central, BlueCherry) are available as paid add-ons.

Data quality first

Validate before you load. Rejection CSVs and DQ summary reports built in.

AI data readiness

Use sluice validate as a pre-AI quality gate — know your data is fit for Copilot, Power BI, or any LLM tool before it causes damage.

About twenty lines of YAML migrate a CSV of customers: they validate emails, normalise field casing, and write a clean CSV, with bad rows sent to a rejection report.

pipeline:
  name: customers-quickstart
  client: demo
  version: "1.0"
  entity: Customer
source:
  adapter: csv
  file: ./data/customers.csv
dq:
  rules:
    - field: email
      checks:
        - { type: notNull, severity: critical }
        - { type: email, severity: warning }
transform:
  fields:
    - { from: name, to: Name, type: string, cleanse: trim|titleCase }
    - { from: email, to: Email, type: string, cleanse: trim|lowercase }
    - { from: country, to: Country, type: string, default: GB }
target:
  adapter: csv
  output: ./output/customers-clean.csv

Run it:

sluice run customers.pipeline.yaml

That’s the whole product. Read the Quickstart for the ten-minute walkthrough, or jump straight into the Pipeline YAML Schema.
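To make the validate-then-load idea concrete, here is a minimal Python sketch of what a pipeline like the quickstart above does to each row. It is an illustration only, not Sluice's implementation: the function names, the email regex, and the severity handling (critical failures reject the row, warnings are reported but the row still loads) are all assumptions made for this example.

```python
import re

# Hypothetical stand-in for the `email` check; the real rule may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# (field, check name, predicate, severity), mirroring the quickstart dq rules.
RULES = [
    ("email", "notNull", lambda v: v is not None and v.strip() != "", "critical"),
    ("email", "email", lambda v: bool(v) and EMAIL_RE.match(v.strip()) is not None, "warning"),
]


def validate(row):
    """Return (field, check, severity) for every check the row fails."""
    return [(f, name, sev) for f, name, pred, sev in RULES if not pred(row.get(f))]


def transform(row):
    """Map source fields to target fields, applying the cleanse steps."""
    return {
        "Name": (row.get("name") or "").strip().title(),  # trim | titleCase
        "Email": row["email"].strip().lower(),             # trim | lowercase
        "Country": row.get("country") or "GB",             # default: GB
    }


def run_pipeline(rows):
    """Validate every row first; only rows without critical failures are loaded."""
    clean, rejected, warnings = [], [], []
    for row in rows:
        failures = validate(row)
        # Assumption: any critical failure sends the row to the rejection report.
        if any(sev == "critical" for _, _, sev in failures):
            rejected.append({"row": row, "failures": failures})
            continue
        # Assumption: warning-level failures are reported but do not block the row.
        warnings.extend(failures)
        clean.append(transform(row))
    return clean, rejected, warnings


rows = [
    {"name": "  ada lovelace ", "email": " Ada@Example.COM ", "country": ""},
    {"name": "bob", "email": None, "country": "FR"},
    {"name": "eve", "email": "not-an-email", "country": "US"},
]
clean, rejected, warnings = run_pipeline(rows)
```

In this sketch the row with a missing email never reaches the target, while the malformed address is flagged and still loaded. Whether warning-level failures behave that way in Sluice is something to confirm in the Pipeline YAML Schema.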


Built and maintained by Caracal Lynx Limited, an IT and data consultancy specialising in data migrations and data quality for organisations adopting AI tools. Source-available under the Elastic License 2.0.