Enabling Kerberos on Hadoop (based on CDH 6.3)

Author: 清蒸三文鱼_ | Published 2024-04-06 18:37

Steps

System environment: CentOS 7.9
User: root

I. Setting up the Kerberos environment

1. Set up the Kerberos server

1. Install

yum install krb5-server krb5-libs krb5-auth-dialog krb5-workstation openldap-clients -y

Installing these packages creates three files: /etc/krb5.conf, /var/kerberos/krb5kdc/kadm5.acl, and /var/kerberos/krb5kdc/kdc.conf

2. Edit the configuration and change the realm to HADOOP.COM (adjust to your needs)

  • /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
# default_realm = EXAMPLE.COM
default_realm = HADOOP.COM
# Recommended to remove this setting; it can cause client authentication failures
# default_ccache_name = KEYRING:persistent:%{uid}

[realms]
# EXAMPLE.COM = {
#  kdc = kerberos.example.com
#  admin_server = kerberos.example.com
# }
HADOOP.COM = {
  # hostname of the KDC service; configure it in /etc/hosts beforehand
  kdc = kdc-server-host1
  admin_server = kdc-server-host1
}

[domain_realm]
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM
.example.com = HADOOP.COM
example.com = HADOOP.COM
  • /var/kerberos/krb5kdc/kadm5.acl
*/admin@HADOOP.COM     *
  • /var/kerberos/krb5kdc/kdc.conf
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 HADOOP.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }
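The kdc and admin_server hostname used in krb5.conf above (kdc-server-host1) must resolve on every node. A sketch of the /etc/hosts entry (the IP address is illustrative):

```
# /etc/hosts on every node in the cluster
192.168.1.10   kdc-server-host1
```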

3. Initialize Kerberos

  • Create the Kerberos database
kdb5_util create -r HADOOP.COM -s

You will be prompted for a password; enter it twice.

  • Create the Kerberos admin principal
Run kadmin.local and enter addprinc admin/admin@HADOOP.COM

You will be prompted for a password; enter it twice.

  • Enable and start the kdc and kadmin services
systemctl enable krb5kdc
systemctl enable kadmin
systemctl start krb5kdc
systemctl start kadmin

2. Kerberos client access

Install the Kerberos client on every machine that runs Hadoop or related microservices, and copy /etc/krb5.conf from the Kerberos server to the same path on each machine.

  • Install the client
# RHEL
yum install krb5-workstation krb5-libs
# SUSE
zypper install krb5-client
# Ubuntu, Debian
apt-get install krb5-user
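Copying krb5.conf to every client can be scripted. A minimal sketch (hostnames are illustrative; DRY_RUN=1 only prints the commands, unset it to actually copy, which assumes root ssh access):

```shell
# Sketch: push the KDC's /etc/krb5.conf to every client node.
DRY_RUN=1
CLIENTS="hadoop-node1 hadoop-node2 hadoop-node3"
: > /tmp/krb5-dist.log
for h in $CLIENTS; do
  cmd="scp /etc/krb5.conf root@${h}:/etc/krb5.conf"
  # log the command; run it only when DRY_RUN is unset
  echo "$cmd" | tee -a /tmp/krb5-dist.log
  if [ -z "$DRY_RUN" ]; then $cmd; fi
done
```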

II. Hadoop configuration

1. HDFS configuration

Roles involved: DataNode, NameNode, SecondaryNameNode

1) Generating and initializing Kerberos credentials

On the Kerberos server machine (as root), generate the HDFS principals; every HDFS node needs them for internal component communication. The example below assumes 1 NameNode and 3 DataNodes.

  • Principal names
hdfs/hadoop-node1@HADOOP.COM
hdfs/hadoop-node2@HADOOP.COM
hdfs/hadoop-node3@HADOOP.COM

HTTP/hadoop-node1@HADOOP.COM
HTTP/hadoop-node2@HADOOP.COM
HTTP/hadoop-node3@HADOOP.COM
  • Run the commands below to create principals for every machine; each prompts for the password twice
kadmin.local: 
add_principal hdfs/hadoop-node1@HADOOP.COM
add_principal hdfs/hadoop-node2@HADOOP.COM
....

list_principals shows the existing principals
  • Export the credentials as hdfs.keytab files, then upload each keytab to the matching HDFS machine at the same path
kadmin.local: 
ktadd -k /tmp/hadoop-node1/hdfs.keytab hdfs/hadoop-node1@HADOOP.COM
ktadd -k /tmp/hadoop-node2/hdfs.keytab hdfs/hadoop-node2@HADOOP.COM
.....
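For more than a few hosts it is easier to generate the kadmin.local commands with a loop. A sketch (hostnames illustrative; -randkey skips the password prompts used above, so drop it if you want password-protected principals):

```shell
# Sketch: generate kadmin.local commands for a list of HDFS hosts.
HOSTS="hadoop-node1 hadoop-node2 hadoop-node3"
REALM="HADOOP.COM"
OUT=/tmp/kadmin-commands.txt
: > "$OUT"
for h in $HOSTS; do
  {
    echo "add_principal -randkey hdfs/${h}@${REALM}"
    echo "add_principal -randkey HTTP/${h}@${REALM}"
    # one keytab per host, holding both of that host's principals
    echo "ktadd -k /tmp/${h}/hdfs.keytab hdfs/${h}@${REALM} HTTP/${h}@${REALM}"
  } >> "$OUT"
done
cat "$OUT"   # paste these lines into kadmin.local
```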

2) core-site.xml

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>

<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<property>
  <name>java.security.krb5.conf</name>
  <value>/path/to/krb5.conf</value>
</property>

3) hdfs-site.xml

  • NameNode
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HADOOP.COM</value>
</property>

<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/path/to/hdfs.keytab</value>
</property>
  • DataNode
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>hdfs/_HOST@HADOOP.COM</value>
</property>

<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/path/to/hdfs.keytab</value>
</property>
  • Secondary Namenode
<property>
  <name>dfs.secondary.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HADOOP.COM</value>
</property>

<property>
  <name>dfs.secondary.namenode.keytab.file</name>
  <value>/path/to/hdfs.keytab</value>
</property>

After Kerberos is enabled, the DataNode's data-transfer ports must be privileged ports (below 1024). If the DataNode fails to start, try adding the following to hdfs-site.xml on the DataNode nodes to configure its data and HTTP ports:


<property>
  <name>dfs.datanode.address</name>
  <value>datanode-host1:1004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>datanode-host1:1006</value>
</property>
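As an alternative to privileged ports, Hadoop 2.6 and later can secure the DataNode with SASL data-transfer protection instead. A hedged sketch of that configuration (not from the original article; it requires the HTTP endpoint to run over HTTPS):

```xml
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```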

Kerberos for the HDFS web console (optional)

core-site.xml
<property>
    <name>hadoop.http.filter.initializers</name>
    <value>org.apache.hadoop.security.AuthenticationFilterInitializer</value>
</property>
<property>
    <name>hadoop.http.authentication.type</name>
    <value>kerberos</value>
</property>

<property>
    <name>hadoop.http.authentication.signature.secret.file</name>
    <value>/opt/http-auth-signature-secret</value>
</property>
<property>
    <name>hadoop.http.authentication.cookie.domain</name>
    <value></value>
</property>
<property>
    <name>hadoop.http.authentication.kerberos.keytab</name>
    <value>/path/to/hdfs.keytab</value>
</property>
<property>
    <name>hadoop.http.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@HADOOP.COM</value>
</property>
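The file named by hadoop.http.authentication.signature.secret.file must exist on every node serving a Hadoop web UI. One way to generate it (the path here is /tmp for illustration; the config above uses /opt/http-auth-signature-secret):

```shell
# Sketch: create the shared HTTP-auth signature secret (32 random bytes, hex-encoded).
SECRET_FILE=/tmp/http-auth-signature-secret
dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n' > "$SECRET_FILE"
chmod 600 "$SECRET_FILE"
# copy the same file to every node that serves a Hadoop web UI
```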
hdfs-site.xml
  <property>
    <name>dfs.web.authentication.kerberos.keytab</name>
    <value>/path/to/hdfs.keytab</value>
  </property>
  <property>
    <name>dfs.web.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@HADOOP.COM</value>
  </property>

With HTTP Kerberos authentication enabled, the HDFS web UIs can no longer be opened directly: they must be accessed by hostname. Accessing by IP returns a 401 or 403 error.

  • curl
    curl --negotiate -u : http://hdfs-node1:9870
  • Firefox

1. Firefox is used because its configuration is simple. In about:config, adjust two settings: set network.negotiate-auth.trusted-uris=hdfs-node1,hdfs-node2 (the target hostnames) and network.negotiate-auth.using-native-gsslib=true.
2. Install the Windows build of MIT Kerberos, edit its krb5.ini (you can copy /etc/krb5.conf from the Linux KDC), then use Get Ticket with your principal and password.
3. Visit http://hdfs-node1:9870

Restart all HDFS nodes, then pick a machine with the Kerberos client installed and verify:

kdestroy
kinit -kt /tmp/hadoop-node1/hdfs.keytab hdfs/hadoop-node1@HADOOP.COM
hadoop fs -ls /

If the command prints the directory listing, everything is working.

2. YARN configuration

Roles involved: NodeManager, ResourceManager, JobHistory Server
YARN likewise needs Kerberos keytab files generated in advance; see the HDFS section.

yarn/yarn-node1@HADOOP.COM
yarn/yarn-node2@HADOOP.COM

core-site.xml

  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>

hdfs-site.xml

<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HADOOP.COM</value>
</property>

<property>
  <name>dfs.namenode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@HADOOP.COM</value>
</property>

<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>hdfs/_HOST@HADOOP.COM</value>
</property>

yarn-site.xml

  • ResourceManager
<!-- resource manager -->
  <property>
    <name>yarn.resourcemanager.principal</name>
    <value>yarn/_HOST@HADOOP.COM</value>
  </property>
  
  <property>
    <name>yarn.resourcemanager.webapp.spnego-principal</name>
    <value>HTTP/_HOST@HADOOP.COM</value>
  </property>
  
  <property>
    <name>yarn.resourcemanager.keytab</name>
    <value>/path/to/yarn.keytab</value>
  </property>
  
  <property>
    <name>yarn.resourcemanager.webapp.spnego-keytab-file</name>
    <value>/path/to/yarn.keytab</value>
  </property>
  • NodeManager
  <!-- nodemanager -->
  <property>
    <name>yarn.nodemanager.principal</name>
    <value>yarn/_HOST@HADOOP.COM</value>
  </property>
  
  <property>
    <name>yarn.nodemanager.webapp.spnego-principal</name>
    <value>HTTP/_HOST@HADOOP.COM</value>
  </property>
  
  <property>
    <name>yarn.nodemanager.keytab</name>
    <value>/path/to/yarn.keytab</value>
  </property>
  
  <property>
    <name>yarn.nodemanager.webapp.spnego-keytab-file</name>
    <value>/path/to/yarn.keytab</value>
  </property>

Kerberos for the YARN web console (optional)

Prerequisite: Kerberos authentication is already enabled on the HDFS web console

  • core-site.xml
  <property>
    <name>hadoop.http.authentication.type</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.http.authentication.signature.secret.file</name>
    <value>/opt/http-auth-signature-secret</value>
  </property>
  <property>
    <name>hadoop.http.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@HADOOP.COM</value>
  </property>
  <property>
    <name>hadoop.http.authentication.cookie.domain</name>
    <value></value>
  </property>
  <property>
    <name>hadoop.http.authentication.kerberos.keytab</name>
    <value>/path/to/yarn.keytab</value>
  </property>

With YARN Kerberos authentication enabled, the YARN web UIs can no longer be opened directly: they must be accessed by hostname. Accessing by IP returns a 401 or 403 error.



  • curl
    curl --negotiate -u : http://yarn-node1:8088/cluster
  • Firefox

1. Firefox is used because its configuration is simple. In about:config, adjust two settings: set network.negotiate-auth.trusted-uris=yarn-node1,yarn-node2 (the target hostnames) and network.negotiate-auth.using-native-gsslib=true.
2. Install the Windows build of MIT Kerberos, edit its krb5.ini (you can copy /etc/krb5.conf from the Linux KDC), then use Get Ticket with your principal and password.
3. Visit http://yarn-node1:8088

III. Kerberos integration for applications

  1. Flink script changes
# obtain a ticket
kdestroy
kinit -kt /tmp/hdfs.keytab hdfs@HADOOP.COM
# submit the job
export KRB5_CONFIG=/tmp/krb5.conf
$FLINK_HOME/bin/flink run -d \
-yD security.kerberos.krb5-conf.path=/tmp/krb5.conf \
-yD security.kerberos.login.keytab=/tmp/hdfs.keytab \
-yD security.kerberos.login.principal=hdfs@HADOOP.COM \
....
  2. HDFS client code (Java)
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.AnnotatedSecurityInfo;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.UserGroupInformation;

public static void main(String[] args) throws Exception {
    String localDir = "/tmp/";
    String principal = "hdfs/hdfs@HADOOP.COM";
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://xxx:8020");
    conf.set("hadoop.security.authentication", "kerberos");
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");

    // Point the JVM at the client krb5.conf before logging in
    System.setProperty("java.security.krb5.conf", localDir + "krb5.conf");
    UserGroupInformation.setConfiguration(conf);
    SecurityUtil.setSecurityInfoProviders(new AnnotatedSecurityInfo());
    // Log in from the keytab and use the returned UGI for all HDFS calls
    UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
            principal, localDir + "hdfs.keytab");
    FileSystem fs = ugi.doAs(
            (PrivilegedExceptionAction<FileSystem>) () -> FileSystem.get(conf));
}

IV. Common errors

Requested user hdfs is not whitelisted and has id 981,which is below the minimum allowed 1000

Edit container-executor.cfg on the NodeManager nodes and set min.user.id=0 (or raise the hdfs user's UID to 1000 or higher)

Requested user hdfs is banned

Edit container-executor.cfg on the NodeManager nodes and remove the hdfs user from the banned users setting, e.g. banned.users=root,bin
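The two container-executor.cfg fixes above combine into something like the fragment below (values are illustrative; note that min.user.id=0 weakens a safety check meant to keep system accounts from running containers, so prefer raising the hdfs user's UID where possible):

```
# container-executor.cfg on each NodeManager (illustrative values)
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=root,bin
min.user.id=0
allowed.system.users=hdfs
```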

A Flink job fails after running for a while with: Unable to set the Hadoop login user, Checksum failed
Shutting YarnJobClusterEntrypoint down with application status FAILED. Diagnostics org.apache.flink.runtime.security.modules.SecurityModule$Security
hdfs@HADOOP.COM from keytab /yarn/nm/usercache/hdfs/appcache/application_1712457844733_0001/container_1712457844733_0001_03_000001/krb5.keytab javax
Caused by: javax.security.auth.login.LoginException: Checksum failed
Fix:

The cluster's Kerberos credentials have likely expired or been changed; regenerate the credentials in CDH (Cloudera Manager) and restart the cluster.


Source: https://www.haomeiwen.com/subject/ptaatjtx.html