The shell action runs a Shell command.
The workflow job will wait until the Shell command completes before continuing to the next action.
To run the Shell job, you have to configure the shell action with the job-tracker, name-node and Shell exec elements as well as the necessary arguments and configuration.
A shell action can be configured to create or delete HDFS directories before starting the Shell job.
Shell launcher configuration can be specified with a file, using the job-xml element, and inline, using the configuration element.
Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.
Note that Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration.
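As a minimal sketch of how the two configuration sources combine, the fragment below assumes a hypothetical settings file named shell-site.xml in the workflow application directory, a hypothetical script named script.sh, and workflow parameters jobTracker, nameNode and queueName. If shell-site.xml also defined mapred.job.queue.name, the inline value would take precedence.

```xml
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Hypothetical file holding default launcher settings -->
    <job-xml>shell-site.xml</job-xml>
    <configuration>
        <!-- Inline value; overrides the same property in shell-site.xml -->
        <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
        </property>
    </configuration>
    <!-- Hypothetical script shipped with the workflow application -->
    <exec>script.sh</exec>
    <file>script.sh#script.sh</file>
</shell>
```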
As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Shell job. Refer to the Adding Files and Archives for the Job section (WorkflowFunctionalSpec#FilesAchives) for more information about this feature.
The output (STDOUT) of the Shell job can be made available to the workflow job after the Shell job ends. This information can be used from within decision nodes. If the output of the Shell job is made available to the workflow job, the shell command must meet the following requirements: the output must be in Java Properties file format, and it must not exceed 2KB.
Syntax:
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.3">
    ...
    <action name="[NODE-NAME]">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>[JOB-TRACKER]</job-tracker>
            <name-node>[NAME-NODE]</name-node>
            <prepare>
                <delete path="[PATH]"/>
                ...
                <mkdir path="[PATH]"/>
                ...
            </prepare>
            <job-xml>[SHELL SETTINGS FILE]</job-xml>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
            <exec>[SHELL-COMMAND]</exec>
            <argument>[ARG-VALUE]</argument>
            ...
            <argument>[ARG-VALUE]</argument>
            <env-var>[VAR1=VALUE1]</env-var>
            ...
            <env-var>[VARN=VALUEN]</env-var>
            <file>[FILE-PATH]</file>
            ...
            <archive>[FILE-PATH]</archive>
            ...
            <capture-output/>
        </shell>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...
</workflow-app>
The prepare element, if present, indicates a list of paths to delete or create before starting the job. Specified paths must start with hdfs://HOST:PORT.
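A prepare block could look like the following sketch. It assumes that the workflow parameter nameNode holds a value of the form hdfs://HOST:PORT, and the directory names are hypothetical.

```xml
<prepare>
    <!-- Remove output left over from a previous run -->
    <delete path="${nameNode}/user/${wf:user()}/shell-output"/>
    <!-- Create a fresh working directory before the job starts -->
    <mkdir path="${nameNode}/user/${wf:user()}/shell-work"/>
</prepare>
```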
The job-xml element, if present, specifies a file containing configuration for the Shell job.
The configuration element, if present, contains configuration properties that are passed to the Shell job.
The exec element must contain the path of the Shell command to execute. The arguments of the Shell command can then be specified using one or more argument elements.
The argument element, if present, contains an argument to be passed to the Shell command.
The env-var element, if present, contains an environment variable to be passed to the Shell command. Each env-var element should contain only one pair of environment variable and value. If the pair references a variable such as $PATH, it should follow the Unix convention, for example PATH=$PATH:mypath. Don't use ${PATH}, which will be substituted by Oozie's EL evaluator.
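The env-var convention can be sketched as follows. The script name, the variable APP_MODE and the directory /opt/mytools/bin are hypothetical and only serve as illustration.

```xml
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>script.sh</exec>
    <!-- One variable=value pair per env-var element -->
    <env-var>APP_MODE=batch</env-var>
    <!-- Unix-style $PATH is not touched by Oozie's EL evaluator, unlike ${PATH} -->
    <env-var>PATH=$PATH:/opt/mytools/bin</env-var>
    <file>script.sh#script.sh</file>
</shell>
```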
A shell action creates a Hadoop configuration. The Hadoop configuration is made available as a local file to the Shell application in its running directory. The exact file path is exposed to the spawned shell through the environment variable OOZIE_ACTION_CONF_XML. The Shell application can read this environment variable to obtain the path of the action configuration XML file.
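For orientation, the file referenced by OOZIE_ACTION_CONF_XML can be expected to follow the usual Hadoop configuration file layout. The sketch below assumes that layout. The single property shown is the queue property used in this document's workflow examples, and the actual contents depend on the action.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- Each setting appears as one name/value property -->
    <property>
        <name>mapred.job.queue.name</name>
        <value>default</value>
    </property>
    <!-- ...further properties of the action configuration... -->
</configuration>
```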
If the capture-output element is present, it instructs Oozie to capture the STDOUT of the shell command execution. The Shell command output must be in Java Properties file format and must not exceed 2KB. From within the workflow definition, the output of a Shell action node is accessible via the String action:output(String node, String key) function (refer to section '4.2.6 Action EL Functions').
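The following sketch shows one way captured output could drive a decision node. It assumes that the workflow EL function wf:actionData(String node), which returns a node's captured output as a map, is available in the Oozie version in use. The script check.sh, the key status and the node names check-status and process are hypothetical.

```xml
<action name="shell1">
    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>check.sh</exec>
        <file>check.sh#check.sh</file>
        <!-- check.sh is assumed to print a properties line such as: status=ready -->
        <capture-output/>
    </shell>
    <ok to="check-status"/>
    <error to="fail"/>
</action>

<decision name="check-status">
    <switch>
        <!-- Branch on the value the script printed for the key 'status' -->
        <case to="process">${wf:actionData('shell1')['status'] eq 'ready'}</case>
        <default to="end"/>
    </switch>
</decision>
```

Because the captured output is limited to 2KB, it is best suited to a few short key=value pairs rather than bulk data.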
All the above elements can be parameterized (templatized) using EL expressions.
Example:
How to run any shell script, Perl script or CPP executable
<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC}</exec>
            <argument>A</argument>
            <argument>B</argument>
            <file>${EXEC}#${EXEC}</file> <!--Copy the executable to compute node's current working directory -->
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>
The corresponding job properties file used to submit the Oozie job could be as follows:
oozie.wf.application.path=hdfs://localhost:9000/user/kamrul/workflows/script
#The executable is expected to be in the workflow directory.
#Shell script to run
EXEC=script.sh
#CPP executable. The executable should be binary compatible with the compute node OS.
#EXEC=hello
#Perl script
#EXEC=script.pl
jobTracker=localhost:9001
nameNode=hdfs://localhost:9000
queueName=default
How to run any Java program bundled in a jar
<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>java</exec>
            <argument>-classpath</argument>
            <argument>./${EXEC}:$CLASSPATH</argument>
            <argument>Hello</argument>
            <file>${EXEC}#${EXEC}</file> <!--Copy the jar to compute node current working directory -->
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>
The corresponding job properties file used to submit the Oozie job could be as follows:
oozie.wf.application.path=hdfs://localhost:9000/user/kamrul/workflows/script
#The Hello.jar file is expected to be in the workflow directory.
EXEC=Hello.jar
jobTracker=localhost:9001
nameNode=hdfs://localhost:9000
queueName=default
The Shell action's stdout and stderr output are redirected to the STDOUT of the Oozie Launcher map-reduce job task that runs the shell command.
From the Oozie web-console, using the 'Console URL' link in the Shell action pop-up, it is possible to navigate to the Oozie Launcher map-reduce job task logs via the Hadoop job-tracker web-console.
Although the Shell action can execute any shell command, there are some limitations.
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:shell="uri:oozie:shell-action:0.1"
           elementFormDefault="qualified"
           targetNamespace="uri:oozie:shell-action:0.1">

    <xs:element name="shell" type="shell:ACTION"/>

    <xs:complexType name="ACTION">
        <xs:sequence>
            <xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="prepare" type="shell:PREPARE" minOccurs="0" maxOccurs="1"/>
            <xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="1"/>
            <xs:element name="configuration" type="shell:CONFIGURATION" minOccurs="0" maxOccurs="1"/>
            <xs:element name="exec" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="argument" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="env-var" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="capture-output" type="shell:FLAG" minOccurs="0" maxOccurs="1"/>
        </xs:sequence>
    </xs:complexType>

    <xs:complexType name="FLAG"/>

    <xs:complexType name="CONFIGURATION">
        <xs:sequence>
            <xs:element name="property" minOccurs="1" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="name" minOccurs="1" maxOccurs="1" type="xs:string"/>
                        <xs:element name="value" minOccurs="1" maxOccurs="1" type="xs:string"/>
                        <xs:element name="description" minOccurs="0" maxOccurs="1" type="xs:string"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>

    <xs:complexType name="PREPARE">
        <xs:sequence>
            <xs:element name="delete" type="shell:DELETE" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="mkdir" type="shell:MKDIR" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>

    <xs:complexType name="DELETE">
        <xs:attribute name="path" type="xs:string" use="required"/>
    </xs:complexType>

    <xs:complexType name="MKDIR">
        <xs:attribute name="path" type="xs:string" use="required"/>
    </xs:complexType>

</xs:schema>