Description:
This article expands on a Tuesday Tip posted to the Community Site.
Solution:
Operating System Administration and APM Administration use similar approaches. For APM Administration, it comes down to making a choice:
Will I do the MINIMUM needed and then "pay as you go" when things go wrong?
Will I have a REACTIVE or PROACTIVE APM environment?
This is summarized in the following table:
Philosophy | Admin Activities | System Stability | Use of Tool | Time Spent |
Minimum / Do Nothing | Only if absolutely necessary. | System runs until it breaks. Not optimized. | Generate metrics. | Only when fixing an APM system. |
Reactive | Fix broken components. Add agents, dashboards, etc. without considering the implications. | Visible issues are fixed, but deeper ones may not be addressed. | Generate metrics; minimum reporting. | Only when fixing an APM system or resolving upgrade issues. Stuck in a "Reactive Hell" state because lessons learned during outages are not applied. |
Proactive | Frequent, to keep systems optimized and current. | Good, because both visible and deeper issues are addressed and the architecture and configuration are optimized. | Generate metrics; full use of reporting; event correlation; application optimization; capacity planning. Takes full advantage of the solution. | Time spent on proactive tasks is offset by reduced outage time. |
References:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Introduction_To_System_Administration/ch-philosophy.html -- Red Hat Document on System Administration Philosophy
http://kagan.mactane.org/essays/sysadmin.php -- Another Overview on OS System Administration