Ambari sends critical YARN alerts that are followed by OK alerts a few minutes later.
The Ambari notification email reports the alert with messages such as the following:
Alert Summary: <ClusterName> - OK[0], Warning[0], Critical[1], Unknown[0]
Services Reporting Alerts
http://AmbariServer:8080/#/main/dashboard/metrics
CRITICAL [YARN]
YARN CRITICAL App Timeline Web UI
Connection failed to http://AppTimelineServer:8188
1. This issue will be fixed in a new release of Ambari 2.2.x.
2. Increase the connection timeout of the App Timeline Web UI alert definition using the method below:
Note: These steps need to be run from the Ambari server. 'hdp24a', in the example below, is the cluster name. Substitute it with your own cluster name.
[root@admin ~]# curl -H "X-Requested-By: ambari" -X GET -u admin:admin http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions
{
  "href" : "http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions",
  "items" : [
    :
    :
    {
      "href" : "http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions/74",
      "AlertDefinition" : {
        "cluster_name" : "hdp24a",
        "id" : 74,      <<<!!! Note this id of App Timeline Web UI.
        "label" : "App Timeline Web UI",
        "name" : "yarn_app_timeline_server_webui"
      }
    },
    :
    :
  ]
}
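On a cluster with many alert definitions, scanning the listing by eye for the id is error-prone. The following is a minimal sketch of a filter that prints the id for a given definition name. The function name find_alert_id is hypothetical (it is not part of Ambari), and the sketch assumes the listing is pretty-printed with one field per line and "id" appearing before "name", as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari): read an alert_definitions listing
# on stdin and print the id of the definition whose "name" equals $1.
# Assumes one JSON field per line, with "id" listed before "name".
find_alert_id() {
    awk -v want="$1" '
        /"id"[ \t]*:/ {
            id = $0
            sub(/^.*"id"[ \t]*:[ \t]*/, "", id)   # drop everything up to the value
            sub(/[^0-9].*$/, "", id)              # keep only the leading digits
        }
        /"name"[ \t]*:/ {
            if (index($0, "\"" want "\"") > 0) { print id; exit }
        }
    '
}

# Usage sketch (same listing request as above, piped into the filter):
#   curl -H "X-Requested-By: ambari" -X GET -u admin:admin \
#     http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions \
#     | find_alert_id yarn_app_timeline_server_webui
```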
[root@admin ~]# curl -H "X-Requested-By: ambari" -X GET -u admin:admin http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions/74 >alert.json
[root@admin ~]# vi alert.json
"href" : "http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions/74",   <<!! Remove this line.
:
:
"default_port" : 0.0,   <<!! Remove this line.
"connection_timeout" : 25.0   <<!! Change to 25 from default 5.
:
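If a non-interactive edit is preferred over vi, the same three changes can be sketched with sed. The function name patch_alert_json is hypothetical, and the sketch assumes each of the three fields sits on its own line in the saved file. Deleting a line that happens to be the last field of its JSON object would leave a trailing comma on the previous line, so the result should be inspected before it is uploaded.

```shell
#!/bin/sh
# Hypothetical helper: apply the three edits from the vi step to the file
# named in $1 and write the patched JSON to stdout.
#   1. delete the "href" line
#   2. delete the "default_port" line
#   3. set "connection_timeout" to 25.0
patch_alert_json() {
    sed -e '/"href"[[:space:]]*:/d' \
        -e '/"default_port"[[:space:]]*:/d' \
        -e 's/"connection_timeout"[[:space:]]*:[[:space:]]*[0-9.]*/"connection_timeout" : 25.0/' \
        "$1"
}

# Usage sketch:
#   patch_alert_json alert.json > alert.tmp && mv alert.tmp alert.json
```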
[root@admin ~]# curl -X PUT -d @alert.json -i -u admin:admin -H 'X-Requested-By: ambari' http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions/74
HTTP/1.1 100 Continue

HTTP/1.1 200 OK
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
User: admin
Set-Cookie: AMBARISESSIONID=1g1rebkc8aziuciu8vi0jwgk;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 0
Server: Jetty(8.1.17.v20150415)
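The 200 OK response carries no body, so it can be worth fetching the definition again to confirm that the new value was stored. A minimal sketch follows; the function name get_connection_timeout is hypothetical, and it assumes the response contains a single "connection_timeout" field on its own line.

```shell
#!/bin/sh
# Hypothetical helper: read an alert definition (JSON) on stdin and print
# the numeric value of its "connection_timeout" field.
get_connection_timeout() {
    sed -n 's/.*"connection_timeout"[[:space:]]*:[[:space:]]*\([0-9.]*\).*/\1/p'
}

# Usage sketch (same GET request as in the earlier step, piped into the filter):
#   curl -s -H "X-Requested-By: ambari" -X GET -u admin:admin \
#     http://localhost:8080/api/v1/clusters/hdp24a/alert_definitions/74 \
#     | get_connection_timeout
```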
e. Restart the Ambari server.
[root@admin ~]# ambari-server restart
Using python /usr/bin/python
Restarting ambari-server
Using python /usr/bin/python
Stopping ambari-server
Ambari Server stopped
Using python /usr/bin/python
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.