Massive number of "cannot find name for group ID" errors found in the NameNode log and MapReduce cluster performance is impacted

Article ID: 294528

Updated On:

Products

Services Suite

Issue/Introduction

Multiple users have been repeatedly submitting MapReduce jobs to the cluster, and cluster performance has slowed down as a result. All jobs complete successfully.

The NameNode log contains a massive number of errors similar to the following:

org.apache.hadoop.security.UserGroupInformation: No groups available for user test_user
2014-06-05 15:07:38,527 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user vsetbct1
org.apache.hadoop.util.Shell$ExitCodeException: id: cannot find name for group ID 100
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
    at org.apache.hadoop.util.Shell.run(Shell.java:129)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:411)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:394)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
    at org.apache.hadoop.security.Groups.getGroups(Groups.java:89)
    at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1287)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:51)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4696)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:4663)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3512)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3491)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:671)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:476)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:45867)

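This problem occurs because the user who submitted the jobs has a group ID that cannot be resolved to a group name by the Hadoop cluster. As the stack trace shows, the NameNode permission check looks up the user's groups by running the id command through ShellBasedUnixGroupsMapping, and that command fails for the unknown group ID. Because the lookup keeps being repeated and the failure is logged each time, cluster performance slows down.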
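To confirm which group IDs are unresolved for a given user, a check along the following lines may help. This is a minimal sketch, not part of Hadoop: the function names unresolved_gids and gids_for_user are hypothetical, and it assumes Python 3 is available on a Unix host where the lookup fails.

```python
import grp
import os
import pwd


def unresolved_gids(gids, lookup=grp.getgrgid):
    """Return the GIDs in `gids` that have no group name on this host."""
    missing = []
    for gid in gids:
        try:
            lookup(gid)
        except KeyError:
            # grp.getgrgid raises KeyError when no group entry exists for
            # the GID, the same condition behind the id error
            # "cannot find name for group ID <gid>".
            missing.append(gid)
    return missing


def gids_for_user(user):
    """Primary and supplementary GIDs of `user` as seen by this host."""
    primary = pwd.getpwnam(user).pw_gid
    return sorted(set(os.getgrouplist(user, primary)))
```

For the user in the log above, calling unresolved_gids(gids_for_user("vsetbct1")) on the affected host would be expected to list the group IDs that still need an entry. The lookup parameter exists only so the check can be exercised without touching the real group database.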


Resolution

Workaround

On the host where the job was submitted, add an entry to /etc/group that maps the unresolved group ID (100 in the example above) to a group name and lists that user as a member.
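The shape of the /etc/group entry can be sketched as follows. This is an illustration only: the helper names group_entry and gid_defined and the group name hadoopusers are hypothetical, and the sketch only builds and inspects text, it does not modify /etc/group. Each line of that file has four colon-separated fields: group name, password placeholder, group ID, and a comma-separated member list.

```python
def group_entry(name, gid, members=()):
    """Build one /etc/group line: name:password:GID:comma-separated members."""
    return f"{name}:x:{gid}:{','.join(members)}"


def gid_defined(group_file_text, gid):
    """True if some line in the given /etc/group text already uses `gid`."""
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        # The third field of a well-formed entry is the numeric group ID.
        if len(fields) >= 3 and fields[2] == str(gid):
            return True
    return False


# Hypothetical group name; GID and user are taken from the log excerpt.
print(group_entry("hadoopusers", 100, ["vsetbct1"]))
```

Checking gid_defined against the current contents of /etc/group before adding a line avoids creating a second entry for a group ID that is already defined.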