For user authentication in Hadoop, CloverETL can use the Kerberos authentication protocol.
To use Kerberos, you have to set up your Java installation, your project, and your HDFS connection. For more information, see Kerberos requirements and settings.
Note that the following instructions apply to the Tomcat application server and Unix-like systems.
There are several ways of setting up Java for Kerberos.
With the first two options (configuration via system properties and configuration via a configuration file), you must modify both setenv.sh in CloverETL Server and CloverETLDesigner.ini in CloverETL Designer. Additionally, add the parameters in CloverETL Designer to the → → → pane.
Configuration via system properties
Set the Java system property java.security.krb5.realm to the name of your Kerberos realm, e.g.

-Djava.security.krb5.realm=EXAMPLE.COM

Set the Java system property java.security.krb5.kdc to the hostname of your Kerberos key distribution center (KDC), e.g.

-Djava.security.krb5.kdc=kerberos.example.com
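For example, on CloverETL Server running on Tomcat, the two options can be appended to CATALINA_OPTS in setenv.sh. This is a minimal sketch assuming the example realm and KDC above:

    CATALINA_OPTS="$CATALINA_OPTS -Djava.security.krb5.realm=EXAMPLE.COM"
    CATALINA_OPTS="$CATALINA_OPTS -Djava.security.krb5.kdc=kerberos.example.com"
    export CATALINA_OPTS

In CloverETLDesigner.ini, each -D option goes on its own line after -vmargs (the usual Eclipse-style .ini convention).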
Configuration via config file
Set the Java system property java.security.krb5.conf to point to the location of your Kerberos configuration file, e.g.

-Djava.security.krb5.conf="/path/to/krb5.conf"
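For reference, a minimal krb5.conf consistent with the realm and KDC used in the examples above could look like this (a sketch, not a complete configuration; your Kerberos administrator's file may contain more sections):

    [libdefaults]
        default_realm = EXAMPLE.COM

    [realms]
        EXAMPLE.COM = {
            kdc = kerberos.example.com
        }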
Configuration via config file in Java installation directory
Put the krb5.conf file into the lib/security directory of your Java runtime ($JAVA_HOME/jre/lib/security for a JDK 8 installation), e.g. /opt/jdk1.8.0_144/jre/lib/security/krb5.conf.
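For example, assuming the JDK path shown above:

    cp /path/to/krb5.conf /opt/jdk1.8.0_144/jre/lib/security/krb5.conf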
Additionally, put the .keytab file into the project, e.g. conn/clover.keytab.
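You can verify that the keytab contains the expected principal with the standard MIT Kerberos klist utility:

    klist -kt conn/clover.keytab

The listed principal should match the username used in the connections below, e.g. clover/clover@EXAMPLE.COM.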
Note: Kerberos authentication requires the …
HDFS and MapReduce Connection
Use the Kerberos principal as the username of the connection, e.g. clover/clover@EXAMPLE.COM.
Set the following parameters in the Hadoop Parameters pane:
cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab
hadoop.security.authentication=Kerberos
yarn.resourcemanager.principal=yarn/_HOST@EXAMPLE.COM
Example 34.1. Properties needed to connect to a Hadoop High Availability (HA) cluster in Hadoop connection
mapreduce.app-submission.cross-platform\=true
yarn.application.classpath\=\:$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*\:
yarn.app.mapreduce.am.resource.mb\=512
mapreduce.map.memory.mb\=512
mapreduce.reduce.memory.mb\=512
mapreduce.framework.name\=yarn
yarn.log.aggregation-enable\=true
mapreduce.jobhistory.address\=example.com\:port
yarn.resourcemanager.ha.enabled\=true
yarn.resourcemanager.ha.rm-ids\=rm1,rm2
yarn.resourcemanager.hostname.rm1\=example.com
yarn.resourcemanager.hostname.rm2\=example.com
yarn.resourcemanager.scheduler.address.rm1\=example.com\:port
yarn.resourcemanager.scheduler.address.rm2\=example.com\:port
fs.permissions.umask-mode\=000
fs.defaultFS\=hdfs\://nameservice1
fs.default.name\=hdfs\://nameservice1
dfs.nameservices\=nameservice1
dfs.ha.namenodes.nameservice1\=namenode1,namenode2
dfs.namenode.rpc-address.nameservice1.namenode1\=example.com\:port
dfs.namenode.rpc-address.nameservice1.namenode2\=example.com\:port
dfs.client.failover.proxy.provider.nameservice1\=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

type=HADOOP
host=nameservice1
username=clover/clover@EXAMPLE.COM
hostMapred=Not needed for YARN
Tip: If you encounter the error "No common protection layer between client and server", set the hadoop.rpc.protection parameter to match your Hadoop cluster configuration.
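For example, if your cluster's core-site.xml sets the protection level to privacy (an assumption; the valid values are authentication, integrity and privacy), add the following to the Hadoop Parameters pane:

    hadoop.rpc.protection=privacy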
Hive Connection
Add ;principal=hive/_HOST@EXAMPLE.COM to the JDBC URL, e.g.

jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM
Use the Kerberos principal as the username, e.g. clover/clover@EXAMPLE.COM, and set

cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab

in Advanced JDBC properties.
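Outside of CloverETL, you can verify the same principal, keytab and URL with the standard Kerberos and Hive command-line tools (a sketch; it assumes beeline is installed and the keytab is readable):

    kinit -kt conn/clover.keytab clover/clover@EXAMPLE.COM
    beeline -u "jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM"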