::Go back to Oozie Documentation Index::

Action Authentication

Background

A secure cluster requires that actions be authenticated (typically via Kerberos). However, because of the way Oozie runs actions, Kerberos credentials are not easily made available to the actions it launches. For many action types this is not a problem, because they are self-contained (beyond core Hadoop components). For example, a Pig action typically talks only to MapReduce and HDFS. However, some actions need to talk to external services (e.g. HCatalog, HBase Region Server, Hive Server 2), and in these cases the actions require some extra configuration in Oozie to authenticate. To be clear, this extra configuration is required only if an action will be talking to these types of external services; running a typical MapReduce, Pig, Hive, etc. action does not require any of this.

For these situations, Oozie has to use its own Kerberos credentials to obtain "delegation tokens" (think of them as cookies) on behalf of the user from the service in question. The details of how this works are beyond the scope of this documentation, but basically, Oozie needs some extra configuration in the workflow so that it can obtain these delegation tokens.

Oozie Server Configuration

The code that obtains delegation tokens is pluggable, so support for a different service can be added by subclassing org.apache.oozie.action.hadoop.Credentials to retrieve a delegation token from that service and add it to the Configuration.

Out of the box, Oozie already comes with support for some credential types (see Built-in Credentials Implementations). The credential classes that Oozie should load are specified by the following property in oozie-site.xml. In each entry, the left-hand side of the equals sign is the name of the credential type, while the right-hand side is the class that implements it.

   <property>
      <name>oozie.credentials.credentialclasses</name>
      <value>
         hcat=org.apache.oozie.action.hadoop.HCatCredentials,
         hbase=org.apache.oozie.action.hadoop.HbaseCredentials,
         hive2=org.apache.oozie.action.hadoop.Hive2Credentials
      </value>
   </property>
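
As a rough sketch of how a custom implementation might be registered alongside the built-in ones, the entry below adds one more credential type to the same property. The type name mysvc and the class com.example.oozie.MyServiceCredentials are assumptions made for illustration and do not ship with Oozie; such a class would need to subclass org.apache.oozie.action.hadoop.Credentials and be available on the Oozie server's classpath.

   <!-- "mysvc" and com.example.oozie.MyServiceCredentials are hypothetical -->
   <property>
      <name>oozie.credentials.credentialclasses</name>
      <value>
         hcat=org.apache.oozie.action.hadoop.HCatCredentials,
         mysvc=com.example.oozie.MyServiceCredentials
      </value>
   </property>

A workflow would then refer to this credential with type='mysvc', in the same way that the built-in types are referenced by their type names.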

Workflow Changes

The user should add a credentials section to the top of their workflow that contains one or more credential sections. Each credential section contains a name for the credential, the type of the credential, and any configuration properties that this type of credential needs in order to obtain a delegation token. The credentials section is available in workflow schema version 0.3 and later.

For example, the following workflow is configured to obtain an HCatalog delegation token, which is given to a Pig action so that the Pig action can talk to a secure HCatalog:

   <workflow-app xmlns='uri:oozie:workflow:0.4' name='pig-wf'>
      <credentials>
         <credential name='my-hcat-creds' type='hcat'>
            <property>
               <name>hcat.metastore.uri</name>
               <value>HCAT_URI</value>
            </property>
            <property>
               <name>hcat.metastore.principal</name>
               <value>HCAT_PRINCIPAL</value>
            </property>
         </credential>
      </credentials>
      ...
      <action name='pig' cred='my-hcat-creds'>
         <pig>
            <job-tracker>JT</job-tracker>
            <name-node>NN</name-node>
            <configuration>
               <property>
                  <name>TESTING</name>
                  <value>${start}</value>
               </property>
            </configuration>
         </pig>
      </action>
      ...
   </workflow-app>

The type of the credential is "hcat", which is the type name we gave for the HCatCredentials class in oozie-site.xml. We gave the credential a name, "my-hcat-creds", which can be whatever you want; we then specify cred='my-hcat-creds' in the Pig action so that Oozie will include this credential with the action. You can include multiple credentials with an action by specifying a comma-separated list of credential names. Finally, HCatCredentials requires two properties (the metastore URI and principal), which we also specified.
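
To illustrate the comma-separated form, here is a minimal sketch of a workflow whose Pig action is given both an HCatalog and an HBase credential. The credential name my-hbase-creds is an arbitrary choice, the placeholder values follow the example above, and the hbase credential carries no properties on the assumption that the hbase-site.xml on the Oozie server supplies what is needed.

   <workflow-app xmlns='uri:oozie:workflow:0.4' name='pig-wf'>
      <credentials>
         <credential name='my-hcat-creds' type='hcat'>
            <property>
               <name>hcat.metastore.uri</name>
               <value>HCAT_URI</value>
            </property>
            <property>
               <name>hcat.metastore.principal</name>
               <value>HCAT_PRINCIPAL</value>
            </property>
         </credential>
         <!-- No properties: assumes hbase-site.xml on the Oozie server is sufficient -->
         <credential name='my-hbase-creds' type='hbase'/>
      </credentials>
      ...
      <!-- Both credentials are attached via a comma-separated list -->
      <action name='pig' cred='my-hcat-creds,my-hbase-creds'>
         ...
      </action>
      ...
   </workflow-app>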

Built-in Credentials Implementations

Oozie currently comes with the following Credentials implementations:

  1. HCatalog and Hive Metastore: org.apache.oozie.action.hadoop.HCatCredentials
  2. HBase: org.apache.oozie.action.hadoop.HbaseCredentials
  3. Hive Server 2: org.apache.oozie.action.hadoop.Hive2Credentials

HCatCredentials requires these two properties:

  1. hcat.metastore.principal
  2. hcat.metastore.uri

Note: The HCatalog Metastore and Hive Metastore are one and the same, so the "hcat" credential type can also be used to talk to a secure Hive Metastore; the property names still start with "hcat.".

HbaseCredentials does not require any additional properties, since the hbase-site.xml on the Oozie server provides the information necessary to obtain the delegation token; properties can still be overridden in the credential section if desired.

Hive2Credentials requires these two properties:

  1. hive2.server.principal
  2. hive2.jdbc.url
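
As a sketch of how these two properties might appear in a workflow, the fragment below declares a hive2 credential and attaches it to an action. The credential name my-hive2-creds, the workflow and action names, and the placeholder values HIVE2_PRINCIPAL and HIVE2_JDBC_URL are assumptions made for illustration; the body of the action is elided.

   <workflow-app xmlns='uri:oozie:workflow:0.4' name='hive2-wf'>
      <credentials>
         <credential name='my-hive2-creds' type='hive2'>
            <property>
               <name>hive2.server.principal</name>
               <value>HIVE2_PRINCIPAL</value>
            </property>
            <property>
               <name>hive2.jdbc.url</name>
               <value>HIVE2_JDBC_URL</value>
            </property>
         </credential>
      </credentials>
      ...
      <action name='hive2' cred='my-hive2-creds'>
         ...
      </action>
      ...
   </workflow-app>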
