Apache Gobblin

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Download

Latest News

Aug 2023 Apache Gobblin 0.17.0 released.
Feb 2022 Apache Gobblin 0.16.0 released.
Jan 2021 Apache Gobblin is now a Top Level Project.
Dec 2020 Apache Gobblin 0.15.0-incubating released.
Mar 2020 Apache Gobblin has an improved High Level Consumer.
Dec 2018 Apache Gobblin 0.14.0-incubating released.
Sep 2018 Apache Gobblin 0.13.0-incubating released.

Execution Modes

Standalone

Runs as a standalone application on a single machine. Also supports embedded mode.

MapReduce Mode

Runs as a MapReduce application on multiple Hadoop versions. Also supports Azkaban for launching MapReduce jobs.

Cluster / Yarn

Runs as a standalone cluster with primary and worker nodes. This mode supports high availability and can also run on bare metal.

Cloud

Runs as an elastic cluster on a public cloud. This mode supports high availability.


Copyright © 2021 The Apache Software Foundation
Apache, Apache Gobblin, the Apache feather and the Gobblin logo are trademarks of The Apache Software Foundation
