Oozie, Workflow Engine for Apache Hadoop

Oozie v3 is a server-based Bundle Engine that provides a higher-level Oozie abstraction for batching a set of coordinator applications. The user can start, stop, suspend, resume, or rerun a set of coordinator jobs at the bundle level, which gives better and easier operational control.

Oozie v2 is a server-based Coordinator Engine specialized in running workflows based on time and data triggers. It can continuously run workflows based on time (e.g. run a workflow every hour) and on data availability (e.g. wait for the input data to exist before running the workflow).
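
The interplay of time and data triggers can be illustrated with a small sketch. This is a toy model in Python, not Oozie's implementation or API: the function names nominal_times and ready_runs, the input_exists callback, and the dates are all hypothetical, chosen only to show that a run is scheduled at a fixed frequency but becomes eligible only once its input data exists.

```python
from datetime import datetime, timedelta


def nominal_times(start, end, frequency):
    """Yield each scheduled run time in the half-open interval [start, end)."""
    current = start
    while current < end:
        yield current
        current += frequency


def ready_runs(start, end, frequency, input_exists):
    """Return the scheduled times whose input data is already available.

    input_exists is a caller-supplied predicate (an assumption of this
    sketch) that reports whether the data for a given run time exists.
    """
    return [t for t in nominal_times(start, end, frequency) if input_exists(t)]


# Hypothetical scenario: hourly runs over three hours, with input data
# present for the first two hours only.
start = datetime(2024, 1, 1, 0, 0)
end = start + timedelta(hours=3)
available = {start, start + timedelta(hours=1)}

runs = ready_runs(start, end, timedelta(hours=1), lambda t: t in available)
print([t.strftime("%H:%M") for t in runs])
```

In this model the third hourly run is scheduled but held back, because its input has not arrived yet.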

Oozie v1 is a server-based Workflow Engine specialized in running workflow jobs with actions that execute Hadoop Map/Reduce and Pig jobs.

Distribution Contents

The Oozie distribution consists of a single 'tar.gz' file containing:

  • Readme, license, notice, and release log files.
  • Oozie server: oozie-server directory.
  • Scripts: bin/ directory, client and server scripts.
  • Binaries: lib/ directory, client JAR files.
  • Configuration: conf/ server configuration directory.
  • Archives:
    • oozie-client-*.tar.gz : Client tools.
    • oozie.war : Oozie WAR file.
    • docs.zip : Documentation.
    • oozie-examples-*.tar.gz : Examples.
    • oozie-sharelib-*.tar.gz : Share libraries (with Streaming, Pig JARs).

Quick Start

Enough reading already? Follow the steps in Oozie Quick Start to get Oozie up and running.

Licensing Information

Oozie is distributed under Apache License 2.0.

For details on the licenses of the dependent components, refer to the Dependencies Report, Licenses section.

Oozie bundles an embedded Apache Tomcat 6.x.

Some of the components in the dependencies report don't mention their license in the published POM. They are:

Oozie uses a modified version of the Apache Doxia core and twiki plugins to generate Oozie documentation.

Engineering Documentation

Oozie User Authentication Documentation