This document describes the functions performed by seosd, the most important of the Access Control Unix daemons, and explains how to identify and troubleshoot problems related to it. Although the information applies to any flavor of Unix running Access Control, the operating system commands shown are valid only on the Solaris platform.
- What can cause the seosd daemon to stop working properly?
There are a variety of reasons. Some are external, such as an underlying problem with the OS that causes seosd to hang (waiting on other processes on a heavily utilized system, I/O problems, etc.), which delays or entirely prevents seosd from processing authorization requests. Others are AC related, such as inconsistencies in the Access Control database.
- How does a problem with seosd manifest itself?
One manifestation is that users are suddenly or unexpectedly denied access to resources, even though rules in the Access Control database allow that access and the same users have had it before.
- How can we detect and troubleshoot problems with the seosd daemon?
Here's what can be done:
- Scan the Unix syslogs for messages related to seosd. Timeouts are often logged; they relate to the periodic check in which the watchdog daemon (seoswd) contacts seosd and expects an answer. If seosd does not respond, seoswd attempts to restart it, which can take place with no loss of service. At other times, however, seosd may neither respond to the watchdog nor be restarted by it, in which case we need to inspect AC more generally and see whether things are working, i.e., can new rules be created, are the rules in place being honored, are events being audited, etc.
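The syslog scan described above can be sketched as a small shell function. This is only an illustration: the function name `scan_seos_log` is hypothetical, and the default path assumes the usual Solaris syslog file, /var/adm/messages; other platforms keep their logs elsewhere.

```shell
#!/bin/sh
# Print syslog lines that mention seosd or seoswd.
# scan_seos_log is a hypothetical helper name, not an AC utility.
scan_seos_log() {
    # Assumption: /var/adm/messages is the syslog file (the usual Solaris
    # location). Pass a different path as the first argument if needed.
    logfile="${1:-/var/adm/messages}"
    # The string "seos" matches both daemon names; -i ignores case.
    grep -i 'seos' "$logfile"
}
```

Calling `scan_seos_log` with no argument reads the default file. The timeout messages logged by the watchdog are the lines of most interest.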
- If any of the above shows an anomaly, the best course of action is to see whether seosd has dumped core. This depends on the underlying operating system being configured to generate core files. If it is, the core file is placed in the AC directory.
If a core file has not been generated, one can be forced by whichever means the underlying OS provides. For instance, on Solaris 10 the command 'gcore -g <PID>' (where PID is the process ID of seosd) forces a seosd core dump, and 'pstack corefile' then produces the stack trace, where corefile follows whichever naming convention the OS has for core files. On Solaris 10, the following command makes sure that each core file name is suffixed with the name of the binary that generated it and its PID: '#coreadm -e global -g /opt/CA/AccessControl/core.%f.%p'. This output would then be sent to CA for analysis in order to determine what was preventing seosd from doing its job.
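As a rough sketch, the Solaris 10 commands above can be assembled for a given seosd PID as follows. The function name `core_commands` is hypothetical, and it only prints the commands so they can be reviewed before being run as root. It assumes the global core pattern from the text, under which '%f' expands to the executable name and '%p' to the PID.

```shell
#!/bin/sh
# Print (do not execute) the Solaris 10 commands from the text for one PID.
# core_commands is a hypothetical helper name, not an AC or Solaris utility.
core_commands() {
    pid="$1"
    # Assumption: AC is installed in the directory used in the text.
    acdir="${2:-/opt/CA/AccessControl}"
    # 1. Name global core files after the binary (%f) and the PID (%p).
    printf '%s\n' "coreadm -e global -g $acdir/core.%f.%p"
    # 2. Force a core dump of seosd into the global core location.
    printf '%s\n' "gcore -g $pid"
    # 3. Extract the stack trace from the resulting core file.
    printf '%s\n' "pstack $acdir/core.seosd.$pid"
}
```

For example, `core_commands 1234` prints the three commands for a seosd process whose PID is 1234.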
- Database-related problems cannot be ruled out when there is a problem with seosd, so the integrity of the database should be checked periodically. When Access Control restarts, it performs a fast check, but this may not be sufficient to detect and/or correct all possible problems with the database, in which case the 'dbmgr' utility can be used. For example, if we suspect a problem with the integrity of the database, the command '#dbmgr -u -build all' rebuilds the database indices (invoke it from the 'seosdb' directory after having stopped AC).
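The index rebuild could be wrapped in a small guard like the one below. This is a sketch under assumptions: the function name `rebuild_indices` is hypothetical, and the `pgrep` check is an added safeguard that treats a running seosd process as a sign that AC has not been stopped. Only the 'dbmgr -u -build all' invocation comes from the text.

```shell
#!/bin/sh
# Rebuild the AC database indices, refusing to run while seosd is up.
# rebuild_indices is a hypothetical helper name; the pgrep guard is an
# added assumption, not part of the documented procedure.
rebuild_indices() {
    dbdir="$1"   # path to the 'seosdb' directory
    if [ ! -d "$dbdir" ]; then
        echo "not a directory: $dbdir" >&2
        return 1
    fi
    if pgrep -x seosd >/dev/null 2>&1; then
        echo "seosd is still running; stop AC first" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dbdir" && dbmgr -u -build all )
}
```

The function takes the path to the 'seosdb' directory as its only argument, so that dbmgr runs from that directory as the text requires.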
- The 'check' command can be used from within the selang command line to see what seosd thinks the access level to a particular resource should be. For example, say we have a user 'frank' who ordinarily should have access to a particular file 'myfile', but we have noticed that all of a sudden 'frank' can no longer access it:
AC> check FILE /tmp/myfile uid(frank) access(r)
Which should return:
AC>Access to FILE /tmp/myfile GRANTED
Stage: Default record universal access check
/tmp/myfile (FILE ) R, X, Chdir
The example above helps to identify whether seosd is returning the level of access defined in the seosdb database. In other words, if all is well, the command returns the access level, granted or denied, that appears in the ACL for the specific file (in this case myfile). If it does not, we know that there is a problem.
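When the check has to be repeated for several users or files, the verdict can be pulled out of saved 'check' output with a short filter. This is a minimal sketch: the function name `check_verdict` is hypothetical, and it assumes that a refusal is reported with the word DENIED in the same position where the example above shows GRANTED.

```shell
#!/bin/sh
# Read saved selang 'check' output on stdin and print the verdict.
# check_verdict is a hypothetical helper name. Assumption: a denial is
# reported with the word DENIED, mirroring the GRANTED line in the text.
check_verdict() {
    awk '/GRANTED/ { print "GRANTED"; found = 1; exit }
         /DENIED/  { print "DENIED";  found = 1; exit }
         END       { if (!found) print "UNKNOWN" }'
}
```

For example, after saving the output of the check command to a file, `check_verdict < check_output.txt` prints a single word that can be compared with the access the ACL is expected to give.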
- A seosd trace can also help to detect and troubleshoot problems such as the one already mentioned, where access to resources is denied even though rules in the database permit it. A good way to run a trace is to stop AC first and then change the seos.ini token 'trace_to' from its default of 'file,stop' to just 'file', so that a seosd trace is generated automatically when AC is restarted. In addition, the debug levels can be changed in the seos.ini file so that a seos_debug file is generated when AC starts. In release 12.5 of AC, this is achieved by setting the 'debug_level' token in the seos.ini file to 'low'. In release 8.0SP1, the following tokens need to be inserted in the 'seosd' section of the seos.ini file: 'debug_level = 90' and 'debug_zone = 8'.
Once the seosd trace is running, we can perform simple actions, such as accessing resources for which a rule exists, and see whether we get the expected results. These actions are captured in the seosd.trace and seos_debug files, which can help CA Support to diagnose the problem and find the root cause.
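Before restarting AC, it can be worth confirming that the trace and debug tokens are set as intended. The sketch below lists the relevant lines of a seos.ini file. The function name `show_trace_tokens` is hypothetical, the path to seos.ini is passed in rather than assumed, and a POSIX awk is assumed (on Solaris, nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# List the trace and debug tokens currently set in a seos.ini file.
# show_trace_tokens is a hypothetical helper name, not an AC utility.
show_trace_tokens() {
    # $1 is the path to seos.ini. Only lines that begin with one of the
    # three token names (after optional whitespace) are printed.
    awk '/^[ \t]*(trace_to|debug_level|debug_zone)[ \t]*=/' "$1"
}
```

Running it against the edited file should show 'trace_to' set to 'file', together with whichever debug tokens apply to the installed release.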