Analysis of a cluster exit caused by NameNode Full GC, and restarting J…

By Shao Hongxiao | Published 2021-09-02 11:02

[Figure: metadata flow diagram]

Problem 1

While AsyncLogger asynchronously flushes edits to the JournalNodes, an IOException can be raised. The timeout decision behind that exception is based on reading the clock (Time.monotonicNow()), which keeps advancing during a stop-the-world Full GC. If a Full GC happens right before the current time is read, the measured wait exceeds the timeout, a TimeoutException is thrown, and terminate() makes the NameNode exit immediately.

  • Call chain
    JournalSet.flushAndSync() -> mapJournalsAndReportErrors() -> flushAndSync()
    -> QuorumOutputStream.flushAndSync() ->
    loggers.waitForWriteQuorum(qcall, writeTimeoutMs, "sendEdits") ->
  q.waitFor(
          loggers.size(), // either all respond: 5
          majority,       // or we get a majority of successes: 3
          majority,       // or we get a majority of failures: 3
          timeoutMs, operationName); // timeoutMs = 20 s

-> timeoutMs is 20 s: the JournalNode requests must complete within 20 s, otherwise the call times out.
QuorumCall.waitFor():

public synchronized void waitFor(
      int minResponses, int minSuccesses, int maxExceptions,
      int millis, String operationName)
      throws InterruptedException, TimeoutException {
    // millis = 20 s: the method must return within 20 s, by which point
    // either a majority of JournalNodes has succeeded or the write has failed
    long st = Time.monotonicNow();
    long nextLogTime = st + (long)(millis * WAIT_PROGRESS_INFO_THRESHOLD);
    // compute the absolute deadline
    long et = st + millis;
    while (true) { // keep polling until one of the exit conditions holds
      checkAssertionErrors();
      // Idea: every JournalNode write, whether it succeeds or fails,
      // should come back with some response.
      //
      // Case 1: successes + failures >= 5 -- all 5 JournalNodes have responded
      // (countResponses() counts the responses received so far)
      if (minResponses > 0 && countResponses() >= minResponses) return;
      // Case 2: successes >= 3 -- a majority of JournalNodes succeeded, return immediately
      if (minSuccesses > 0 && countSuccesses() >= minSuccesses) return;
      // Case 3: failures >= 3 -- a majority of JournalNodes failed, also return immediately
      if (maxExceptions >= 0 && countExceptions() > maxExceptions) return;
      long now = Time.monotonicNow();
      if (now > nextLogTime) {
        long waited = now - st;
        // log progress
        String msg = String.format(
            "Waited %s ms (timeout=%s ms) for a response for %s",
            waited, millis, operationName);
        if (!successes.isEmpty()) {
          msg += ". Succeeded so far: [" + Joiner.on(",").join(successes.keySet()) + "]";
        }
        if (!exceptions.isEmpty()) {
          msg += ". Exceptions so far: [" + getExceptionMapString() + "]";
        }
        if (successes.isEmpty() && exceptions.isEmpty()) {
          msg += ". No responses yet.";
        }
        if (waited > millis * WAIT_PROGRESS_WARN_THRESHOLD) {
          // we have been waiting past the warn threshold
          QuorumJournalManager.LOG.warn(msg);
        } else {
          // otherwise just log at info level
          QuorumJournalManager.LOG.info(msg);
        }
        // schedule the next progress log, e.g. 16:00:07 + 1 s = 16:00:08
        nextLogTime = now + WAIT_PROGRESS_INTERVAL_MILLIS;
      }
      long rem = et - now;
      // key line: past the deadline -- time out
      if (rem <= 0) {
        throw new TimeoutException();
      }
      rem = Math.min(rem, nextLogTime - now);
      rem = Math.max(rem, 1);
      wait(rem);
    }
  }
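The failure mode can be reproduced in a minimal sketch (hypothetical class and method names, not Hadoop code): a loop that computes the remaining wait as `deadline - now` charges the full length of any stop-the-world pause against the timeout, because the pause lands between two clock reads.

```java
import java.util.concurrent.TimeoutException;

public class QuorumWaitSketch {
    // Simplified version of the QuorumCall.waitFor() deadline check:
    // rem = deadline - now; if a long STW pause happens before 'now'
    // is read, rem goes negative and the call times out.
    static void waitWithDeadline(long millis, long pauseMillis)
            throws TimeoutException, InterruptedException {
        long st = System.nanoTime() / 1_000_000;
        long et = st + millis;      // absolute deadline
        Thread.sleep(pauseMillis);  // stands in for a Full GC pause
        long now = System.nanoTime() / 1_000_000;
        long rem = et - now;
        if (rem <= 0) {
            throw new TimeoutException(
                "waited " + (now - st) + " ms > " + millis + " ms");
        }
    }

    public static void main(String[] args) throws Exception {
        waitWithDeadline(200, 10); // pause shorter than timeout: returns normally
        boolean timedOut = false;
        try {
            waitWithDeadline(50, 100); // "GC pause" longer than timeout
        } catch (TimeoutException e) {
            timedOut = true;
        }
        System.out.println(timedOut); // prints true
    }
}
```

This is exactly why a 20+ second Full GC in the NameNode is indistinguishable, from this code's point of view, from 20 seconds of unresponsive JournalNodes.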

JournalSet.mapJournalsAndReportErrors:
 for (JournalAndStream jas : journals) {
      try {
        // e.g. a FileJournalManager-backed stream
        closure.apply(jas);
      } catch (Throwable t) {
        if (jas.isRequired()) {
          final String msg = "Error: " + status + " failed for required journal ("
            + jas + ")";
          LOG.fatal(msg, t);
          // If we fail on *any* of the required journals, then we must not
          // continue on any of the other journals. Abort them to ensure that
          // retry behavior doesn't allow them to keep going in any way.
          abortAllJournals();
          // the current policy is to shutdown the NN on errors to shared edits
          // dir. There are many code paths to shared edits failures - syncs,
          // roll of edits etc. All of them go through this common function 
          // where the isRequired() check is made. Applying exit policy here 
          // to catch all code paths.
          terminate(1, msg);
        } else {
          LOG.error("Error: " + status + " failed for (journal " + jas + ")", t);
          badJAS.add(jas);          
        }
      }
    }

Summary

  • From the source-code analysis
    1. The key lines are `if (rem <= 0) { throw new TimeoutException(); }` together with the preceding `long now = Time.monotonicNow(); long rem = et - now;`. If a Full GC occurs inside the while loop before `now` is assigned and the total wait exceeds 20 s, a TimeoutException is thrown here. It propagates up to JournalSet.mapJournalsAndReportErrors, which catches it and calls terminate, so the NameNode exits immediately.
  • Fixes
    1. Patch the source so the timeout check measures actual execution time, excluding time spent in GC.
    2. (Recommended) Tune the JVM instead: increase the heap, switch to the G1 collector, and set a target maximum stop-the-world pause in ms via the startup flags
    -Xmx4096M -Xms4096M -XX:+UseG1GC -XX:MaxGCPauseMillis=100
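The first fix can be sketched with the standard `GarbageCollectorMXBean` API, which exposes the cumulative collection time per collector. This is only an illustrative sketch of the idea (the class and variable names here are hypothetical, not part of any actual Hadoop patch):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcAwareTimer {
    // Sum of accumulated collection time (ms) across all collectors.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcTimeMillis();
        long start = System.nanoTime() / 1_000_000;
        // ... the quorum wait would happen here ...
        long elapsed = System.nanoTime() / 1_000_000 - start;
        long gcDuring = totalGcTimeMillis() - gcBefore;
        // Charge only non-GC time against the 20 s timeout.
        long effectiveWait = elapsed - gcDuring;
        System.out.println("elapsed=" + elapsed + "ms gc=" + gcDuring + "ms effective=" + effectiveWait + "ms");
    }
}
```

The JVM-tuning route is recommended over patching because it attacks the root cause (the long pause itself) rather than masking it, and it survives Hadoop upgrades.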

Problem 2

Node 30 was restarted; 6 minutes later, node 31; 6 minutes after that, node 32.

2022-01-12 13:21:29,779 WARN  client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal 10.32.40.30:8485 failed to write txns 2887442538-2887442538. Will try to write to this JN again after the next log roll.
java.io.EOFException: End of File Exception between local host is: "shyt-hadoop-4031.xxx.com.cn/10.32.40.31"; destination host is: "shyt-hadoop-4030.xxx.com.cn":8485; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.journal(Unknown Source)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1119)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1014)
2022-01-12 13:21:29,796 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [10.32.40.32:8485, 10.32.40.30:8485, 10.32.40.31:8485], stream=QuorumOutputStream starting at txid 2887441291))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
10.32.40.31:8485: null [success]
2 exceptions thrown:
10.32.40.30:8485: End of File Exception between local host is: "shyt-hadoop-4031.xxx.com.cn/10.32.40.31"; destination host is: "shyt-hadoop-4030.xxx.com.cn":8485; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
10.32.40.32:8485: /export/hadoop/hdfs/journal/xxxdata/current/last-promised-epoch.tmp (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
        at org.apache.hadoop.hdfs.util.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:58)
        at org.apache.hadoop.hdfs.util.PersistentLongFile.writeFile(PersistentLongFile.java:78)
        at org.apache.hadoop.hdfs.util.PersistentLongFile.set(PersistentLongFile.java:64)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.updateLastPromisedEpoch(Journal.java:326)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:434)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:159)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)

        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:707)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2721)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
2022-01-12 13:21:29,796 WARN  client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 2887441291
2022-01-12 13:21:29,814 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1

Analysis

Of the 3 JournalNodes, only one responded successfully:
  • node 30 was being restarted, so failing to connect to it is expected
  • node 31 connected successfully
  • node 32 failed with an exception (its last-promised-epoch.tmp file was missing)
When the NameNode pushes edit-log entries to the JournalNodes, a majority must succeed, i.e. at least 2 of the 3 nodes must return success; otherwise the NameNode calls terminate and exits.
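The quorum arithmetic behind the log line "Got too many exceptions to achieve quorum size 2/3" is simple majority voting. A minimal sketch (hypothetical helper class, not Hadoop code):

```java
public class QuorumCheck {
    // Majority required for n JournalNodes: floor(n/2) + 1.
    static int majority(int n) {
        return n / 2 + 1;
    }

    // The write quorum is lost as soon as the number of failures
    // makes a majority of successes impossible.
    static boolean quorumLost(int n, int failures) {
        return n - failures < majority(n);
    }

    public static void main(String[] args) {
        System.out.println(majority(3));      // prints 2
        // 1 success, 2 exceptions out of 3, as in the log above:
        System.out.println(quorumLost(3, 2)); // prints true
    }
}
```

With node 30 down for its restart, there was zero tolerance left: the unrelated failure on node 32 was enough to drop below the 2-of-3 majority and trigger the exit.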

Solution

Re-initialize the JournalNode directory on node 32:
1. sudo su - hdfs && mkdir -p /export/hadoop/hdfs/journal/xxxdata/current/
2. Copy /export/hadoop/hdfs/journal/xxxdata/current/VERSION from node 31 to node 32.
3. Restart the JournalNode service on node 32; problem solved.

Original article: https://www.haomeiwen.com/subject/cfmqwltx.html