What is Event log - Unable to encode string to UTF8, reason U_ILLEGAL_CHAR_FOUND

Article ID: 168547

Products

ProxySG Software - SGOS

Issue/Introduction

The event log entry "Unable to encode string to UTF8, reason U_ILLEGAL_CHAR_FOUND" occurs because of an issue with the character set of the user's password for basic authentication. The password is likely not valid UTF-8 and may be encoded in another charset. If you know which charset the client will use, you can use the authenticate.charset() action to tell the ProxySG appliance how to decode it.

Resolution

authenticate.charset( )
----------------------------
Specifies the character encoding used by HTTP Basic authentication. The HTTP Basic authentication protocol sends the username and password to the proxy or origin server as an array of bytes. The character encoding is chosen arbitrarily by the client, and within the HTTP protocol there is no way for the client to tell the upstream device which encoding it used. If the username or password contains non-ASCII characters, the ProxySG appliance needs to know what this character encoding is. Because the proxy cannot determine this from the HTTP request, it must be specified in policy, using the authenticate.charset property. The default value is ascii.
 
If the HTTP Basic credentials are not encoded as specified by the authenticate.charset property, the HTTP request is terminated by an invalid_credentials exception. Therefore, if authenticate.charset is set to its default value of ascii and the username or password contains non-ASCII characters, the request is terminated.
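
To illustrate why the receiver cannot decode such credentials on its own, the following Python sketch builds a Basic Authorization header from a password containing a non-ASCII character and then decodes it under different assumed charsets. The function names, the sample credentials, and the choice of cp1252 as the client's encoding are assumptions made for illustration; this mimics the general mechanism and is not the ProxySG implementation.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str, charset: str) -> str:
    """Build a Basic Authorization header value, encoding the credentials with `charset`."""
    raw = f"{username}:{password}".encode(charset)
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str, charset: str) -> Tuple[str, str]:
    """Decode a Basic header value using the charset the receiver assumes."""
    raw = base64.b64decode(header_value.split(" ", 1)[1])
    username, _, password = raw.decode(charset).partition(":")
    return username, password


# Hypothetical client that encodes its credentials as cp1252 (an assumption).
# "\u00e4" is the letter a with diaeresis, a single byte 0xE4 in cp1252.
header = basic_auth_header("jdoe", "p\u00e4ssword", "cp1252")

for assumed in ("cp1252", "utf-8", "ascii"):
    try:
        print(assumed, "->", decode_basic_auth(header, assumed))
    except UnicodeDecodeError as err:
        # The bytes are not valid in the assumed charset, so decoding fails.
        print(assumed, "-> cannot decode:", err.reason)
```

Only the receiver that assumes the same charset as the client recovers the original password, which is the role authenticate.charset plays in policy.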
 
To resolve this issue, use the policy action authenticate.charset() to allow non-ASCII credentials with the HTTP Basic authentication protocol. An alternative to configuring this property is to use a different client-side authentication protocol, such as IWA or forms-based authentication.
 
Syntax:
authenticate.charset(charset)
where:
charset - A MIME charset name. Any of the standard charset names for encoding types commonly supported by Web browsers may be used.
One list of standard charset names is available at http://www.iana.org/assignments/character-sets.
 
If your environment uses Microsoft Windows, you can run the "chcp" command at the Windows command prompt to find your active code page. Once you know the code page number n, you can use "windows-n" as the charset name. The default value is ascii.
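
As a small illustration of the naming rule above, this Python sketch turns the text printed by "chcp" into a "windows-n" charset name. The function name is hypothetical, and the sample string assumes the English-language output format "Active code page: 936"; localized Windows versions word the message differently, so the sketch only looks for the trailing number. It does not check whether the appliance accepts the resulting name.

```python
import re


def charset_from_chcp(chcp_output: str) -> str:
    """Derive a "windows-n" charset name from the text printed by chcp."""
    # Take the last run of digits, tolerating trailing punctuation.
    match = re.search(r"(\d+)\D*$", chcp_output.strip())
    if match is None:
        raise ValueError("no code page number found in chcp output")
    return f"windows-{match.group(1)}"


# Sample output string (assumed English wording).
print(charset_from_chcp("Active code page: 936"))
```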
 
Layer and Transaction Notes
• Use in <Proxy> and <Admin> layers.
• Applies to HTTP proxy transactions.


Example
Set the authentication character encoding to "windows-936", which is the Simplified Chinese (GBK) code page used by Microsoft Windows.
<Proxy>
    authenticate(myrealm) authenticate.charset(windows-936)