What algorithm does the Web Agent use to encode and decode URLs?
If the URL contains any of the following characters, the Web Agent encodes the URL:
' ' (space), '&', '+', '?', '%', or '$'.
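The trigger check above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the agent's actual code; the names TRIGGER_CHARS and needs_sm_encoding are hypothetical.

```python
# Characters that, per the list above, cause the agent to encode the URL.
# TRIGGER_CHARS and needs_sm_encoding are illustrative names, not agent APIs.
TRIGGER_CHARS = frozenset(" &+?%$")


def needs_sm_encoding(url: str) -> bool:
    """Return True if the URL contains any character from the trigger list."""
    return any(ch in TRIGGER_CHARS for ch in url)
```

For example, a URL with a query string such as '?P1=A+B' triggers encoding, while a plain 'http://server.domain.com/resource' does not.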
First, the URL is prepended with '-SM-' if LegacyEncoding=NO, or with '$SM$' if LegacyEncoding=YES.
Next, the following rules are applied in order:
' ' is replaced with '%20'
'&' is replaced with '%26'
'+' is replaced with '%2b'
'?' is replaced with '%3f'
'@' is replaced with '%40'
'"' is replaced with '"' (no changes/encoding)
'=' is replaced with '%3d'
'%' is replaced with '$%' (LegacyEncoding=YES) or '-%' (LegacyEncoding=NO)
Handling of '-' and '$':
'-' is used as delimiter for framework agents (LegacyEncoding=NO):
'-' is replaced with '--'
'--' is replaced with '----'
'---' is replaced with '------'
'$' is replaced with '%24'
=====================================================================
'$' is used as delimiter for traditional agents (LegacyEncoding=YES):
'$' is replaced with '$$'
'$$' is replaced with '$$$$'
'$$$' is replaced with '$$$$$$'
'-' is replaced with '-' (no changes/encoding)
When decoding, the agent reverses this logic: it removes one '-' or '$' for every one it added during encoding.
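The encoding rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the agent's implementation. It makes a single pass over the characters rather than applying the replacements one after another, because sequential replacement would re-escape the '%' signs introduced by earlier rules. The set PERCENT_ENCODED is an assumption: it holds the characters named in the rules plus ':', '/' and '.', which also appear encoded in this article's examples. The function name sm_encode is hypothetical, and only ASCII input is considered.

```python
# Assumed set of characters that get percent-encoded: those named in the
# rules above plus ':', '/' and '.', which the article's examples also encode.
PERCENT_ENCODED = frozenset(" &+?@=:/.")


def sm_encode(url: str, legacy_encoding: bool = False) -> str:
    """Sketch of SM-encoding for an ASCII URL (hypothetical helper)."""
    # '$' is the delimiter when LegacyEncoding=YES, '-' when LegacyEncoding=NO.
    delim = "$" if legacy_encoding else "-"
    out = [delim + "SM" + delim]  # prefix: '$SM$' or '-SM-'
    for ch in url:
        if ch == delim:
            out.append(delim * 2)  # double every literal delimiter
        elif ch == "%":
            out.append(delim + "%")  # mark a literal '%' with the delimiter
        elif ch in PERCENT_ENCODED or (ch == "$" and not legacy_encoding):
            out.append("%{:02x}".format(ord(ch)))  # lowercase hex, e.g. '%2f'
        else:
            out.append(ch)  # everything else, including '"', is unchanged
    return "".join(out)
```

With LegacyEncoding=NO, this sketch reproduces the framework example given later in this article, apart from the letter case of the scheme. With LegacyEncoding=YES it encodes '=' as '%3d' per the rule list, whereas the traditional-agent example string leaves '=' unencoded; both forms decode to the same URL.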
Scenario with traditional agents (LegacyEncoding=YES):
The URL being encoded is:
http://server.domain.com/resource?P1=A+B&P2=Space%20Here
SM-Encoded, it becomes:
$SM$http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
How the Web Agent decodes a URL:
=================
If the URL starts with '$SM$', strip that prefix and scan the rest of the string from the beginning. If the current character is '$', skip it and return the next character unchanged. If the current character is '%', read the next two characters and return the URL-decoded value. Otherwise, return the current character. The algorithm will *not* URL-decode a sequence such as '$%20', because the '$' case consumes the '%' before the '%' case can see it.
So, if the URL being decoded is:
$SM$http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
First, strip off the leading $SM$:
http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
Then scan the string until we find a '$' or a '%':
http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
At this point, we see a %. So, we urldecode the % and the next two characters and then continue:
http:%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
Again, we see a %. Repeat:
http:/%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here
Repeat for each remaining '%' sequence:
http://server.domain.com/resource?P1=A+B&P2=Space$%20Here
Now we see a '$' character, which means we return the *next* character (here the '%') unchanged and continue scanning.
http://server.domain.com/resource?P1=A+B&P2=Space%20Here
And now we've reached the end of the string. This is the SM-Decoded value.
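The scan walked through above can be sketched in Python as follows. This is an illustrative sketch, not the agent's code; the function name sm_decode is hypothetical. The '-SM-' branch is included on the assumption, stated for the framework scenario in this article, that the same scan applies with '-' as the delimiter. Returning unprefixed input unchanged and passing through a malformed '%' sequence are choices made for this sketch only.

```python
import string

HEX_DIGITS = frozenset(string.hexdigits)


def sm_decode(value: str) -> str:
    """Sketch of SM-decoding by a single left-to-right scan (hypothetical helper)."""
    # Pick the delimiter from the prefix: '$SM$' or '-SM-'.
    for delim in ("$", "-"):
        prefix = delim + "SM" + delim
        if value.startswith(prefix):
            break
    else:
        return value  # no SM prefix: returned unchanged in this sketch

    out = []
    i = len(prefix)
    while i < len(value):
        ch = value[i]
        if ch == delim and i + 1 < len(value):
            # Delimiter: drop it and keep the next character verbatim,
            # so '$%20' yields the literal text '%20'.
            out.append(value[i + 1])
            i += 2
        elif (ch == "%" and i + 2 < len(value)
              and all(c in HEX_DIGITS for c in value[i + 1:i + 3])):
            # '%' followed by two hex digits: URL-decode the triplet.
            out.append(chr(int(value[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(ch)  # ordinary character
            i += 1
    return "".join(out)
```

Applied to the encoded string from the walk-through, this returns http://server.domain.com/resource?P1=A+B&P2=Space%20Here, with the final '%20' left as literal text rather than decoded to a space.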
Scenario with Framework agents (LegacyEncoding=NO):
http://server.domain.com/protected/HeaderDumper.asp?1%202&3+4?5%6$7@8"9=10-11--12---13
becomes
-SM-HTTP%3a%2f%2fserver%2edomain%2ecom%2fprotected%2fHeaderDumper%2easp%3f1-%202%263%2b4%3f5-%6%247%408"9%3d10--11----12------13
This SM-encoded value is decoded the same way as in the LegacyEncoding=YES example above, except that '-' is removed instead of '$'.
===============================================================================================
Are you running a Traditional Web Agent or a Framework Agent?
Framework Agents are installed on the following web servers:
IIS 6.0
Apache 2.0
Apache 2.0-based servers: IBM HTTP Server, Covalent ERS 2.x, HP Apache, and Oracle 10.x HTTP server
Sun Java System Web Server 6.0 and 6.1
Note: The Sun Java System Web server was formerly called the Sun ONE Web server or the iPlanet Web server.
Traditional Web Agents are installed on the following web servers:
IIS 5.0
Apache 1.x
Apache 1.x-based servers: IBM HTTP Server, Covalent Fast Start 2.x, and Oracle 9.x server
Domino
If the SSO environment contains a mix of traditional and framework agents, set LegacyEncoding=YES on all agents. The two encoding modes cannot be mixed, even if all agents are framework agents.
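As a rough illustration of this rule, the following Python sketch checks a hand-maintained inventory of agents for inconsistent settings. Everything about the inventory format is an assumption made for the example: the tuple layout (name, kind, LegacyEncoding value), the kind labels "traditional" and "framework", and the function name legacy_encoding_problems are all hypothetical, and nothing here reads real agent configuration.

```python
from typing import List, Tuple

# Each entry: (agent name, "traditional" or "framework", "YES" or "NO").
# This inventory format is hypothetical, used only for the illustration.
Agent = Tuple[str, str, str]


def legacy_encoding_problems(agents: List[Agent]) -> List[str]:
    """Return human-readable problems with the LegacyEncoding settings."""
    problems = []
    modes = {legacy.upper() for _, _, legacy in agents}
    if len(modes) > 1:
        # The two encoding modes cannot be mixed in one SSO environment.
        problems.append("agents use different LegacyEncoding values")
    kinds = {kind for _, kind, _ in agents}
    if {"traditional", "framework"} <= kinds:
        # Mixed agent types: every agent should have LegacyEncoding=YES.
        for name, _, legacy in agents:
            if legacy.upper() != "YES":
                problems.append(name + ": LegacyEncoding should be YES")
    return problems
```

An empty result means the inventory satisfies both conditions described above.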