Open Source Series: What is dlt?

Why dlt Hub Is a Pythonic Alternative to Traditional ELT Tools

Modern, Pythonic ELT Built for the Data Engineer in Code

If you're a Python developer who enjoys clean, flexible workflows directly in code, dlt (short for data load tool) is a minimalist but powerful open-source ELT library from dltHub. It makes data ingestion feel like writing native Python scripts, which makes it a natural fit for embedding pipelines in modern cloud-native apps, internal data platforms, or notebooks.

Unlike UI-heavy tools, dlt is a code-first framework built for developers and data engineers who want to stay close to Python while abstracting away the hard parts of ELT: schema evolution, incremental loading, and deployment.

What is dlt?

dlt is an open-source Python library for building and running ELT pipelines. It’s designed for teams that want flexibility, composability, and native integration with modern tools like dbt, DuckDB, and BigQuery.

Instead of YAML configs or UIs, you define your pipelines using Python functions enriched by dlt's decorators and helpers.

dlt handles:

  • Incremental loading
  • Schema evolution
  • Credential management
  • Destination writes

If you want full programmatic control without writing orchestration and ingestion from scratch, dlt hits a unique sweet spot.
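
For example, incremental loading takes only a little declarative code. The sketch below is a simplified illustration, not a real integration: the endpoint, the updated_at field, and the since parameter are assumptions. The idea is that dlt tracks the highest updated_at value it has seen in pipeline state and merges only new or changed rows on each run.

import dlt
import requests

@dlt.resource(primary_key="id", write_disposition="merge")
def issues(updated_at=dlt.sources.incremental("updated_at", initial_value="1970-01-01T00:00:00Z")):
    # dlt stores the last seen "updated_at" value in pipeline state,
    # so each run only requests records changed since the previous run.
    # The URL and "since" parameter are placeholders for this sketch.
    response = requests.get(
        "https://api.example.com/issues",
        params={"since": updated_at.last_value},
    )
    yield from response.json()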

Core Concepts

To understand dlt, it helps to get familiar with its building blocks:

  • Pipelines as Python Code: resources are functions decorated with @dlt.resource that yield data, and a pipeline runs them.
  • Auto Schema Inference: dlt infers the schema from the data and handles evolution and versioning automatically.
  • State and Incrementality: built-in state management to support incremental loads.
  • Destinations: write to Snowflake, BigQuery, DuckDB, Redshift, Postgres, and more.
  • Modular Pipelines: compose reusable, testable pipelines like Python modules.
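
As a quick sketch of that last point, related resources can be grouped into a reusable source with the @dlt.source decorator and run together. The resource names and sample rows below are placeholders, not a real GitHub integration:

import dlt

@dlt.resource
def repos():
    # any iterable of dicts works; the schema is inferred from the data
    yield {"id": 1, "name": "example-repo"}

@dlt.resource
def contributors():
    yield {"id": 42, "login": "octocat"}

@dlt.source
def github_source():
    # a source bundles related resources so they can be reused and tested as a unit
    return repos, contributors

pipeline = dlt.pipeline(pipeline_name="modular_demo", destination="duckdb", dataset_name="github")
pipeline.run(github_source())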

A Simple Example

Here’s a minimal dlt pipeline that pulls GitHub issues and loads them into Snowflake.

import dlt
import requests

@dlt.resource(write_disposition="append")
def github_issues():
    # "org/repo" is a placeholder; point this at a real repository
    response = requests.get("https://api.github.com/repos/org/repo/issues")
    yield from response.json()

pipeline = dlt.pipeline(
    pipeline_name="github_pipeline",   # namespaces the pipeline's state and load packages
    destination="snowflake",           # credentials are resolved from dlt config, not code
    dataset_name="github_data",        # the dataset/schema created at the destination
)

pipeline.run(github_issues())

No YAML files, no complex config—the schema is inferred, the state is managed, and the data is loaded.
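
While developing, the same resource can be pointed at a local DuckDB file so nothing needs to be provisioned to try it out; Snowflake credentials for the real run live in dlt's configuration (for example .dlt/secrets.toml or environment variables) rather than in code. A minimal local variant, reusing the github_issues resource above:

# Swap the destination for DuckDB and inspect what was loaded.
# No credentials are required for a local DuckDB file.
dev_pipeline = dlt.pipeline(
    pipeline_name="github_pipeline_dev",
    destination="duckdb",
    dataset_name="github_data",
)

load_info = dev_pipeline.run(github_issues())
print(load_info)  # summary of the load packages and tables written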

Supported Integrations

dlt is source-agnostic and ships with adapters for common data sources:

  • Sources: API-based sources such as Shopify, Stripe, Google Analytics, GitHub, and Salesforce
  • Databases: extraction from databases like Postgres and SQLite
  • Destinations: Snowflake, BigQuery, DuckDB, Redshift, and Postgres
  • Transformations: optional dbt integration for running models after the load
  • Flexibility: designed for full-code data pipelines and easy embedding in apps and scripts
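
For the dbt integration, dlt provides a helper that runs a dbt package against the dataset the pipeline just loaded. The sketch below is based on the dbt runner described in dlt's documentation; the project path is a placeholder and the exact options may vary by version:

import dlt

pipeline = dlt.pipeline(
    pipeline_name="github_pipeline",
    destination="snowflake",
    dataset_name="github_data",
)

# "./dbt_project" is a placeholder path to a local dbt project
dbt = dlt.dbt.package(pipeline, "./dbt_project")
models = dbt.run_all()  # runs the dbt models against the freshly loaded dataset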

Who is dlt For?

dlt is best suited for:

  • Python-native teams who want to stay in code, not GUIs
  • Developers building internal tools or platforms
  • Data engineers needing clean abstractions for incremental data loads
  • Fast-moving startups and prototyping teams

Pros and Considerations

Why choose dlt?

Pros:

  • Native Python experience
  • Fast to prototype, easy to deploy
  • Auto schema evolution and state management
  • Composable, modular pipeline design
  • Great for embedding in apps and custom platforms

Considerations:

  • No GUI; code-only interface
  • Smaller ecosystem compared to Airbyte or Meltano
  • You manage orchestration unless dlt is integrated into another tool
  • Primarily maintained by a small but growing community
  • No UI-driven pipeline configuration or scheduling out of the box

Community and Evolution

While dlt is newer than tools like Airbyte or Meltano, its community is growing rapidly.

  • Active GitHub repo with frequent updates
  • Solid documentation with code-first examples
  • Expanding integrations and open ecosystem

Conclusion

dlt is the “developer’s ELT”: lean, composable, and great for embedding or scripting data ingestion. While it doesn’t have the vast native connector ecosystem of tools like Airbyte or Meltano, it makes it easy to build repeatable, API-based connections in Python. dlt excels in use cases where full-code control and flexibility are the priority. If you're already building pipelines in Python, taking a closer look at dlt may save you time.

In future posts, we'll compare dlt directly to Airbyte and Meltano, and later we'll show how to productionize it.

Until then, you can try it out at dltHub.com or browse the code on GitHub.
