How to Change Hadoop Daemon log4j.properties

Article ID: 295087


Products

Services Suite

Issue/Introduction

By default, Pivotal HD (PHD) service daemons use various log4j file appenders. Some of these appenders give the user no control over how much log data a daemon generates, leaving the sysadmin to manage and prune the resulting files by hand.

This article explains how to configure log4j.properties for each PHD core component, so that sysadmins can understand and control PHD daemon log management.


Environment


Resolution

Refer to the log4j Javadoc for each of the following appenders; sample configuration parameters are shown for each:
 

DailyRollingFileAppender (DRFA)
#
# Daily Rolling File Appender
# Rollover at midnight
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
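Note that DRFA rolls the file on the DatePattern alone; it has no size cap or MaxBackupIndex, so dated files accumulate until removed by hand. As a sketch, the rollover granularity can be tightened (for example to hourly) by extending the DatePattern. Shown here against a scratch copy of the file:

```shell
# Sketch: switch DRFA from daily to hourly rollover by editing a scratch
# copy of log4j.properties (real path: /etc/gphd/hadoop/conf/log4j.properties).
cat > log4j.properties <<'EOF'
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
EOF
# Tighten the rollover from daily to hourly:
sed -i 's|^log4j.appender.DRFA.DatePattern=.*|log4j.appender.DRFA.DatePattern=.yyyy-MM-dd-HH|' log4j.properties
grep DatePattern log4j.properties
```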
RollingFileAppender (RFA)
#
# Rolling File Appender - cap space usage at 256mb.
#
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
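Unlike DRFA, RFA bounds total disk usage: at most MaxBackupIndex rotated files plus the active file are kept. With the values above, the worst case per log can be computed directly:

```shell
# Worst-case disk usage for one RFA-managed log with the settings above:
# the active file plus MaxBackupIndex rotated copies, each up to MaxFileSize.
maxfilesize_mb=256
maxbackupindex=20
total_mb=$(( maxfilesize_mb * (maxbackupindex + 1) ))
echo "worst case: ${total_mb} MB"    # prints "worst case: 5376 MB" (~5.25 GB)
```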
FileAppender (FA)
#
# File Appender
#
log4j.appender.FA=org.apache.log4j.FileAppender
log4j.appender.FA.File=${hive.log.dir}/${hive.log.file}
log4j.appender.FA.layout=org.apache.log4j.PatternLayout
log4j.appender.FA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
ConsoleAppender
#
# console appender options
#
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.appender.console.encoding=UTF-8


The table below lists the root-logger environment variable for each service and where to override it:
 

Variable                             Services                                                    Where to Override
HADOOP_ROOT_LOGGER=INFO,RFA          Namenode, Journalnode, ZKFC, Datanode, Secondary Namenode   /etc/gphd/hadoop/conf/hadoop-env.sh
HADOOP_MAPRED_ROOT_LOGGER=INFO,RFA   MapReduce History Server                                    /etc/gphd/hadoop/conf/mapred-env.sh
YARN_ROOT_LOGGER=INFO,RFA            Resourcemanager, Nodemanager                                /etc/gphd/hadoop/conf/yarn-env.sh
ZOO_LOG4J_PROP=INFO,ROLLINGFILE      Zookeeper                                                   /etc/gphd/zookeeper/conf/java.env
HBASE_ROOT_LOGGER=INFO,RFA           HBase Master, HBase Regionserver                            /etc/gphd/hbase/conf/hbase-env.sh


Where is the log directory defined?

All services define a $<SERVICE>_LOG_DIR variable in /etc/default/<service>. For example, the DataNode service sets "HADOOP_LOG_DIR=/var/log/gphd/hadoop-hdfs", so all DataNode logs are found under "/var/log/gphd/hadoop-hdfs".

[gpadmin@hdw1 ~]$ cat /etc/default/hadoop-hdfs-datanode  | egrep ^export
export HADOOP_PID_DIR=/var/run/gphd/hadoop-hdfs
export HADOOP_LOG_DIR=/var/log/gphd/hadoop-hdfs
export HADOOP_NAMENODE_USER=hdfs
export HADOOP_SECONDARYNAMENODE_USER=hdfs
export HADOOP_DATANODE_USER=hdfs
export HADOOP_IDENT_STRING=hdfs
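To see every service's log directory at once, the /etc/default files can be grepped. A minimal sketch, demonstrated against a scratch directory so it runs anywhere; on a real PHD node, point DEFAULTS_DIR at /etc/default instead:

```shell
# Sketch: list every *_LOG_DIR setting defined under /etc/default.
# A scratch directory stands in for /etc/default here.
DEFAULTS_DIR=$(mktemp -d)
echo 'export HADOOP_LOG_DIR=/var/log/gphd/hadoop-hdfs' > "$DEFAULTS_DIR/hadoop-hdfs-datanode"
grep -h '_LOG_DIR=' "$DEFAULTS_DIR"/*
```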

Namenode, Journalnode, ZKFC, DataNode, and Secondary Namenode daemons

These Daemons source their log4j settings from the following location: /etc/gphd/hadoop/conf/log4j.properties.


The HADOOP_ROOT_LOGGER environment variable controls the default logger. It is set in "/usr/lib/gphd/hadoop/sbin/hadoop-daemon.sh", which makes RollingFileAppender the root logger by default, and can be overridden in /etc/gphd/hadoop/conf/hadoop-env.sh:

export HADOOP_ROOT_LOGGER=INFO,RFA 
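The same variable can also carry a different log level, for example to quiet the HDFS daemons cluster-wide. A sketch against a scratch copy of hadoop-env.sh (the WARN level is illustrative):

```shell
# Sketch: persist a quieter root logger in a scratch copy of hadoop-env.sh
# (real path: /etc/gphd/hadoop/conf/hadoop-env.sh). WARN is illustrative.
HADOOP_ENV=hadoop-env.sh
echo 'export HADOOP_ROOT_LOGGER=WARN,RFA' >> "$HADOOP_ENV"
grep HADOOP_ROOT_LOGGER "$HADOOP_ENV"
```

The daemon must be restarted for the change to take effect.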

Audit logging

Audit logging uses the DRFAS appender, selected by the "hadoop.security.logger" setting, which /etc/gphd/hadoop/conf/hadoop-env.sh configures through the HADOOP_NAMENODE_OPTS, HADOOP_DATANODE_OPTS, and HADOOP_SECONDARYNAMENODE_OPTS environment variables.
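As a sketch, hadoop-env.sh typically routes the security/audit logger through a Java system property inside these _OPTS variables; the exact stock line varies by release:

```shell
# Sketch (config fragment): select the DRFAS appender for the Namenode's
# security/audit log via HADOOP_NAMENODE_OPTS in hadoop-env.sh.
# The exact stock line varies by release.
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,DRFAS ${HADOOP_NAMENODE_OPTS}"
```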


MapReduce History Server

The MapReduce History Server sources its log4j settings from /etc/gphd/hadoop/conf/log4j.properties.


The HADOOP_MAPRED_ROOT_LOGGER environment variable controls the default logger. It is set in "/usr/lib/gphd/hadoop-mapreduce/sbin/mr-jobhistory-daemon.sh", which makes RollingFileAppender the MapReduce History Server logger by default, and can be overridden in /etc/gphd/hadoop/conf/mapred-env.sh:

export HADOOP_MAPRED_ROOT_LOGGER=INFO,RFA 

Resource Manager, Nodemanager

These Daemons source their log4j settings from /etc/gphd/hadoop/conf/log4j.properties.


The YARN_ROOT_LOGGER environment variable controls the default logger. It is set in "/usr/lib/gphd/hadoop-yarn/sbin/yarn-daemon.sh", which makes RollingFileAppender the default, and can be overridden in /etc/gphd/hadoop/conf/yarn-env.sh:

export YARN_ROOT_LOGGER=INFO,RFA

Zookeeper

Zookeeper sources log4j settings from /etc/gphd/zookeeper/conf/log4j.properties.

The ZOO_LOG4J_PROP environment variable controls the default logger. It is set in "/usr/bin/zookeeper-server", which makes RollingFileAppender the default, and can be overridden by exporting the value in /etc/gphd/zookeeper/conf/java.env:

export ZOO_LOG4J_PROP=INFO,ROLLINGFILE
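ZooKeeper's ROLLINGFILE appender also accepts size limits in its log4j.properties. A sketch against a scratch copy of the file (the 64MB and 10 values are illustrative, not PHD defaults):

```shell
# Sketch: cap the ZooKeeper ROLLINGFILE appender in a scratch copy of
# log4j.properties (real path: /etc/gphd/zookeeper/conf/log4j.properties).
# The 64MB / 10 values are illustrative, not PHD defaults.
cat >> zk-log4j.properties <<'EOF'
log4j.appender.ROLLINGFILE.MaxFileSize=64MB
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
EOF
grep ROLLINGFILE zk-log4j.properties
```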

Hbase Master, Hbase Regionserver

These Daemons source their log4j settings from /etc/gphd/hbase/conf/log4j.properties.


The HBASE_ROOT_LOGGER environment variable controls the default logger. It is set in "/usr/lib/gphd/hbase/bin/hbase-daemon.sh", which makes RollingFileAppender the default, and can be overridden in /etc/gphd/hbase/conf/hbase-env.sh:

export HBASE_ROOT_LOGGER=INFO,RFA

HIVE

Hive sources log4j settings from /etc/gphd/hive/conf/hive-log4j.properties. In PHD, all Hive daemons take their hive.root.logger setting from this file.

hive.root.logger=WARN,DRFA
hive.log.dir=/tmp/${user.name}
hive.log.file=hive.log
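Because hive.log.dir defaults to /tmp/${user.name}, Hive logs can disappear on reboot or tmp cleanup. A sketch that relocates them by editing a scratch copy of hive-log4j.properties (the /var/log/gphd/hive target directory is illustrative):

```shell
# Sketch: move Hive daemon logs out of /tmp by editing a scratch copy of
# hive-log4j.properties (real path: /etc/gphd/hive/conf/hive-log4j.properties).
# The /var/log/gphd/hive target directory is illustrative.
printf 'hive.root.logger=WARN,DRFA\nhive.log.dir=/tmp/${user.name}\nhive.log.file=hive.log\n' > hive-log4j.properties
sed -i 's|^hive.log.dir=.*|hive.log.dir=/var/log/gphd/hive|' hive-log4j.properties
grep '^hive.log.dir' hive-log4j.properties
```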

"/etc/init.d/hive-server" starts the Hive server and sets its log file name to "hive-server.log". The daemon uses the default hive.root.logger defined in hive-log4j.properties. This log file is truncated each time the Hive server daemon restarts.

NAME="hive-server"
LOG_FILE="/var/log/gphd/hive/${NAME}.log"

"/etc/init.d/hive-metastore" starts the Hive metastore and sets its log file name to "hive-metastore.log". It uses the default hive.root.logger defined in hive-log4j.properties. This log file is truncated each time the Hive metastore daemon restarts.

NAME="hive-metastore"
LOG_FILE="/var/log/gphd/hive/${NAME}.log"
Both the hive-server and hive-metastore daemons also log their data to "hive.log", as defined in hive-log4j.properties. The consolidated hive.log is rotated according to hive.root.logger, which is set to DRFA in hive-log4j.properties.


Hive Query History Log

The history log file location is governed by "hive.querylog.location" in hive-site.xml. By default, this parameter is set to "/<hdfs-site.xml hadoop.tmp.dir>/${user.name}/".
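To keep query-history files with the other Hive logs rather than under hadoop.tmp.dir, the property can be set explicitly in hive-site.xml. A sketch appending the fragment to a scratch copy (the target path is illustrative):

```shell
# Sketch: pin hive.querylog.location in a scratch copy of hive-site.xml
# (real path: /etc/gphd/hive/conf/hive-site.xml). The value is illustrative.
cat >> hive-site.xml <<'EOF'
<property>
  <name>hive.querylog.location</name>
  <value>/var/log/gphd/hive/querylog</value>
</property>
EOF
grep -A1 'hive.querylog.location' hive-site.xml
```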