Multiple users have been submitting MapReduce jobs to the cluster, and cluster performance has become slow as a result. All jobs completed successfully.
There is a massive number of errors similar to the following in the NameNode log:
org.apache.hadoop.security.UserGroupInformation: No groups available for user test_user
2014-06-05 15:07:38,527 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user vsetbct1
org.apache.hadoop.util.Shell$ExitCodeException: id: cannot find name for group ID 100
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
    at org.apache.hadoop.util.Shell.run(Shell.java:129)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:411)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:394)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
    at org.apache.hadoop.security.Groups.getGroups(Groups.java:89)
    at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1287)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:51)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4696)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:4663)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3512)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3491)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:671)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:476)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:45867)
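This problem occurs because the user who submitted the jobs has a group ID that cannot be resolved to a group name on the Hadoop cluster. As the stack trace shows, the permission check performs a group lookup that runs the id command, which fails and logs a warning. Because this check is repeated and fails each time for the jobs this user submits, cluster performance slows down.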
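To confirm this cause on a given host, it may help to list which of the user's group IDs have no group name there. The following Python sketch is only an illustration and is not part of Hadoop: the function names unresolved_gids and gids_for_user are hypothetical, and it assumes a Unix host where the user account exists locally or through the configured name service.

```python
import grp
import os
import pwd


def unresolved_gids(gids, lookup=grp.getgrgid):
    """Return the group IDs in `gids` that have no group name on this host."""
    missing = []
    for gid in gids:
        try:
            lookup(gid)
        except KeyError:
            # grp.getgrgid raises KeyError when no group entry exists,
            # the same condition `id` reports as "cannot find name for group ID".
            missing.append(gid)
    return missing


def gids_for_user(username):
    """Collect the primary and supplementary group IDs of a user."""
    primary_gid = pwd.getpwnam(username).pw_gid
    return os.getgrouplist(username, primary_gid)
```

Calling unresolved_gids(gids_for_user("test_user")) on the affected host should return the group IDs that need an entry. An empty list suggests the groups already resolve on that host and the problem lies elsewhere.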
Add an entry for that user's group to /etc/group on the host where the job was submitted, so that the unresolved group ID (100 in the log above) maps to a group name.
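Each line in /etc/group has the form group_name:password:GID:member_list. As a rough illustration of what the added entry looks like and how to check for it, the Python sketch below scans text in that format for a given group ID. The group name hadoopusers and the helper group_name_for_gid are hypothetical, chosen only for this example; the group ID 100 and the user test_user come from the log above.

```python
def group_name_for_gid(group_file_text, gid):
    """Return the group name mapped to `gid` in /etc/group-style text, or None."""
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        # Fields: group_name, password, GID, comma-separated member list.
        if len(fields) >= 3 and fields[2] == str(gid):
            return fields[0]
    return None


# Hypothetical file contents after the fix; "hadoopusers" is an assumed name.
SAMPLE_GROUP_FILE = """root:x:0:
hadoopusers:x:100:test_user
"""

print(group_name_for_gid(SAMPLE_GROUP_FILE, 100))
```

On a live host, reading /etc/group and passing its contents to the helper performs the same check against the real file. The command getent group 100 answers the same question through the system's name service configuration.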