When SSL is activated on the DUAS node, launching a Uproc as a user other than the node owner fails.
After performing an implementation with the following characteristics:
We can submit jobs as root and as the $U admin, but all other jobs stay in status 'Pending' indefinitely.
Please see the excerpts of the universe.log that show the errors when submitting jobs:
If we disable SSL, we can submit as any user (SUID is properly enabled on uxdqmlan). If we re-enable SSL, the same behaviour resumes. If we use uxrights to change the $U admin, the new $U admin user can now submit jobs, but the old one cannot.
The SERVERCERT has been configured to use the CN of the primary server $U is installed on as well as a SAN for the secondary server $U is installed on.
The jobs remain in status Pending for the duration of the Launch Window and abort at the end of the Launch Window. See the job log:
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
BATCH_INIT error: unable to connect to IO [UNIMVE]/[X]/[local].
Environment not found (codproc/verproc) - Job status set to ABORTED
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
error: unable to connect to IO server [UNIMVE]/[X]/[local]
BATCH_END error: unable to connect to IO [UNIMVE]/[X]/[local].
universe.log:
| 2023-05-30 09:05:14 |ERROR|X|INI|pid=10287.140481834661696| owls_ssl_client_startup | Initialize ssl context in error: Error loading trusted CA certificate file '/AUTOMIC/DUAS/UNIMVE_NODE/data/security/root.cer', error:8000000D:sys
| 2023-05-30 09:05:14 |ERROR|X|INI|pid=10287.140481834661696| o_connect_auth | k_connect_auth_timeout returns error [200]
| 2023-05-30 09:05:14 |ERROR|X|INI|pid=10287.140481834661696| u_io_callsrv_connect_r | Error connecting to target IO server: Errno syserror 9: Bad file descriptor (error closing ssl socket)
| 2023-05-30 09:05:44 |ERROR|X|INI|pid=10287.140481834661696| owls_ssl_client_startup | Initialize ssl context in error: Error loading trusted CA certificate file '/AUTOMIC/DUAS/UNIMVE_NODE/data/security/root.cer', error:10080002:BIO
| 2023-05-30 09:05:44 |ERROR|X|INI|pid=10287.140481834661696| o_connect_auth | k_connect_auth_timeout returns error [200]
| 2023-05-30 09:05:44 |ERROR|X|INI|pid=10287.140481834661696| u_io_callsrv_connect_r | Error connecting to target IO server: Errno syserror 9: Bad file descriptor (error closing ssl socket)
| 2023-05-30 09:06:14 |ERROR|X|INI|pid=10287.140481834661696| owls_ssl_client_startup | Initialize ssl context in error: Error loading trusted CA certificate file '/AUTOMIC/DUAS/UNIMVE_NODE/data/security/root.cer', error:05880002:x50
| 2023-05-30 09:06:14 |ERROR|X|INI|pid=10287.140481834661696| o_connect_auth | k_connect_auth_timeout returns error [200]
| 2023-05-30 09:06:14 |ERROR|X|INI|pid=10287.140481834661696| u_io_callsrv_connect_r | Error connecting to target IO server: Errno syserror 9: Bad file descriptor (error closing ssl socket)
| 2023-05-30 09:06:44 |ERROR|X|INI|pid=10287.140481834661696| owls_ssl_client_startup | Initialize ssl context in error: Error loading trusted CA certificate file '/AUTOMIC/DUAS/UNIMVE_NODE/data/security/root.cer', error:8000000D:sys
| 2023-05-30 09:06:44 |ERROR|X|INI|pid=10287.140481834661696| o_connect_auth | k_connect_auth_timeout returns error [200]
| 2023-05-30 09:06:44 |ERROR|X|INI|pid=10287.140481834661696| u_io_callsrv_connect_r | Error connecting to target IO server: Errno syserror 9: Bad file descriptor (error closing ssl socket)
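Every failed attempt in the excerpt above names the same trusted CA file. As a rough sketch, the affected certificate path(s) can be pulled out of a universe.log with standard tools. The function name extract_ca_paths is hypothetical, and the pattern assumes the message wording matches the excerpt above:

```shell
#!/bin/sh
# Hypothetical helper (not part of DUAS): list the distinct CA certificate
# paths named in "Error loading trusted CA certificate file" log messages.
# Reads the files given as arguments, or standard input when none are given.
extract_ca_paths() {
    # Keep only lines carrying the message and print the quoted path.
    sed -n "s/.*Error loading trusted CA certificate file '\([^']*\)'.*/\1/p" "$@" | sort -u
}
```

For example, `extract_ca_paths universe.log` (run from the directory holding the log) prints one line per distinct certificate path. For the excerpt above that is the single file under data/security.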
Release : 7.00.11 / 6.10.101
Linux/Unix
Workaround:
Run the following command as root or as the DUAS admin user:
chmod 755 <DUAS>/data/security
Also make sure the submitting user is a member of the DUAS group.
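Before or after applying the workaround, it can help to confirm whether the security directory is actually open to users outside the owner and group. The sketch below is illustrative only. It assumes the failure comes from non-admin users being unable to traverse the directory, which the workaround suggests but the article does not state outright, and the function name dir_traversable_by_others is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not part of DUAS): report whether "others" hold the
# search (execute) permission on a directory, judged from its mode string.
# Prints "ok" when they do, "blocked" otherwise.
dir_traversable_by_others() {
    # ls -ld prints the mode first, e.g. drwxr-x---; keep those 10 characters
    # so a trailing ACL or SELinux marker is ignored.
    mode=$(ls -ld "$1" | cut -c1-10)
    case $mode in
        # The 10th character is the others-execute bit: x, or t when the
        # sticky bit is set together with execute.
        d????????x|d????????t) echo "ok" ;;
        *) echo "blocked" ;;
    esac
}
```

Using the path from the log excerpt, `dir_traversable_by_others /AUTOMIC/DUAS/UNIMVE_NODE/data/security` would print `blocked` for a mode such as 750 and `ok` once the directory is set to 755.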
Solution:
This is a defect that will be fixed in a future version.
Until the fix is available, running chmod 755 <DUAS>/data/security resolves the issue.