Open Source Series: What is dlt?

Why dlt Hub Is a Pythonic Alternative to Traditional ELT Tools

Modern, Pythonic ELT Built for the Data Engineer in Code

If you're a Python developer who enjoys clean, flexible workflows directly in code, dlt (short for data load tool) is a minimalist but powerful open-source ELT library from dltHub. It makes data ingestion feel like writing native Python scripts, which makes it a natural fit for embedding pipelines in modern cloud-native apps, internal data platforms, or notebooks.

Unlike UI-heavy tools, dlt is a code-first framework built for developers and data engineers who want to stay close to Python while abstracting away the hard parts of ELT: schema evolution, incremental loading, and deployment.

What is dlt?

dlt is an open-source Python library for building and running ELT pipelines. It’s designed for teams that want flexibility, composability, and native integration with modern tools like dbt, DuckDB, and BigQuery.

Instead of YAML configs or UIs, you define your pipelines using Python functions enriched by dlt's decorators and helpers.

dlt handles:

  • Incremental loading
  • Schema evolution
  • Credential management
  • Destination writes

If you want full programmatic control without writing orchestration and ingestion from scratch, dlt hits a unique sweet spot.
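
For example, incremental loading takes only a little declarative code. The sketch below is a simplified illustration, not a real integration: the endpoint, the updated_at field, and the since parameter are assumptions. The idea is that dlt tracks the highest updated_at value it has seen in pipeline state and merges only new or changed rows on each run.

import dlt
import requests

@dlt.resource(primary_key="id", write_disposition="merge")
def issues(updated_at=dlt.sources.incremental("updated_at", initial_value="1970-01-01T00:00:00Z")):
    # dlt stores the last seen "updated_at" value in pipeline state,
    # so each run only requests records changed since the previous run.
    # The URL and "since" parameter are placeholders for this sketch.
    response = requests.get(
        "https://api.example.com/issues",
        params={"since": updated_at.last_value},
    )
    yield from response.json()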

Core Concepts

To understand dlt, it helps to get familiar with its building blocks:

  • Pipelines as Python Code: resources are functions decorated with @dlt.resource that yield data, and a pipeline runs them.
  • Auto Schema Inference: dlt infers the schema from the data and handles evolution and versioning automatically.
  • State and Incrementality: built-in state management to support incremental loads.
  • Destinations: write to Snowflake, BigQuery, DuckDB, Redshift, Postgres, and more.
  • Modular Pipelines: compose reusable, testable pipelines like Python modules.
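
As a quick sketch of that last point, related resources can be grouped into a reusable source with the @dlt.source decorator and run together. The resource names and sample rows below are placeholders, not a real GitHub integration:

import dlt

@dlt.resource
def repos():
    # any iterable of dicts works; the schema is inferred from the data
    yield {"id": 1, "name": "example-repo"}

@dlt.resource
def contributors():
    yield {"id": 42, "login": "octocat"}

@dlt.source
def github_source():
    # a source bundles related resources so they can be reused and tested as a unit
    return repos, contributors

pipeline = dlt.pipeline(pipeline_name="modular_demo", destination="duckdb", dataset_name="github")
pipeline.run(github_source())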

A Simple Example

Here’s a minimal dlt pipeline that pulls GitHub issues and loads them into Snowflake.

import dlt
import requests

@dlt.resource(write_disposition="append")
def github_issues():
    # "org/repo" is a placeholder; point this at a real repository
    response = requests.get("https://api.github.com/repos/org/repo/issues")
    yield from response.json()

pipeline = dlt.pipeline(
    pipeline_name="github_pipeline",   # namespaces the pipeline's state and load packages
    destination="snowflake",           # credentials are resolved from dlt config, not code
    dataset_name="github_data",        # the dataset/schema created at the destination
)

pipeline.run(github_issues())

No YAML files, no complex config—the schema is inferred, the state is managed, and the data is loaded.
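
While developing, the same resource can be pointed at a local DuckDB file so nothing needs to be provisioned to try it out; Snowflake credentials for the real run live in dlt's configuration (for example .dlt/secrets.toml or environment variables) rather than in code. A minimal local variant, reusing the github_issues resource above:

# Swap the destination for DuckDB and inspect what was loaded.
# No credentials are required for a local DuckDB file.
dev_pipeline = dlt.pipeline(
    pipeline_name="github_pipeline_dev",
    destination="duckdb",
    dataset_name="github_data",
)

load_info = dev_pipeline.run(github_issues())
print(load_info)  # summary of the load packages and tables written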

Supported Integrations

dlt is source-agnostic and ships with adapters for common data sources:

  • Sources: API-based sources such as Shopify, Stripe, Google Analytics, GitHub, and Salesforce
  • Databases: extraction from databases like Postgres and SQLite
  • Destinations: Snowflake, BigQuery, DuckDB, Redshift, and Postgres
  • Transformations: optional dbt integration for running models after the load
  • Flexibility: designed for full-code data pipelines and easy embedding in apps and scripts
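
For the dbt integration, dlt provides a helper that runs a dbt package against the dataset the pipeline just loaded. The sketch below is based on the dbt runner described in dlt's documentation; the project path is a placeholder and the exact options may vary by version:

import dlt

pipeline = dlt.pipeline(
    pipeline_name="github_pipeline",
    destination="snowflake",
    dataset_name="github_data",
)

# "./dbt_project" is a placeholder path to a local dbt project
dbt = dlt.dbt.package(pipeline, "./dbt_project")
models = dbt.run_all()  # runs the dbt models against the freshly loaded dataset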

Who is dlt For?

dlt is best suited for:

  • Python-native teams who want to stay in code, not GUIs
  • Developers building internal tools or platforms
  • Data engineers needing clean abstractions for incremental data loads
  • Fast-moving startups and prototyping teams

Pros and Considerations

Why choose dlt?

Pros:

  • Native Python experience
  • Fast to prototype, easy to deploy
  • Auto schema evolution and state management
  • Composable, modular pipeline design
  • Great for embedding in apps and custom platforms

Considerations:

  • No GUI; code-only interface
  • Smaller ecosystem compared to Airbyte or Meltano
  • You manage orchestration unless dlt is integrated into another tool
  • Primarily maintained by a small but growing community
  • No UI-driven pipeline configuration or scheduling out of the box

Community and Evolution

While dlt is newer than tools like Airbyte or Meltano, its community is growing rapidly.

  • Active GitHub repo with frequent updates
  • Solid documentation with code-first examples
  • Expanding integrations and open ecosystem

Conclusion

dlt is the “developer’s ELT”: lean, composable, and great for embedding or scripting data ingestion. While it doesn’t have the vast native connector ecosystem of tools like Airbyte or Meltano, it makes it easy to build repeatable, API-based connections in Python. dlt excels in use cases where full-code control and flexibility are the priority. If you're already building pipelines in Python, taking a closer look at dlt may save you time.

In future posts, we'll compare dlt directly to Airbyte and Meltano, and later we'll show how to productionize it.

Until then, you can try it out at dltHub.com or browse the code on GitHub.
