Web server not sending the "Cache-Control: must-revalidate" HTTP header causes the proxy not to refresh its cache
Article ID: 406684
Products
ProxySG Software - SGOS, ISG Proxy
Issue/Introduction
In some cases the web server does not send a "Cache-Control: must-revalidate" HTTP header. As a result, the proxy does not refresh its cache in a timely manner and serves clients old content (objects).
Environment
The issue can be worked around by bypassing the proxy cache for the specific domain with SWG policies ("cache(no)" and "bypass_cache(yes)"). The cache is then completely bypassed, and every client request triggers an upstream request from the proxy to the web server to fetch the requested (fresh) object.
When a client transaction (GET for an object) matches a policy rule that sets "bypass_cache(yes)", "the cache will not be queried for the content and the web-server's response will also not be cached."
When a client transaction (GET for an object) matches a policy rule that sets "cache(no)", "any object previously cached shall be deleted, and that any future acquisition of such object will not be cached".
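The workaround described above could be expressed in CPL roughly as follows. This is a minimal sketch and is not taken from the original article: "example.com" is a placeholder domain, and the layer placement (bypass_cache() in a <Proxy> layer, cache() in a <Cache> layer) is an assumption that should be verified against the CPL reference for the SGOS version in use.

```
<Proxy>
    ; Assumption: bypass_cache() is set in a <Proxy> layer.
    ; Do not query the cache for this domain and do not cache the response.
    url.domain=example.com bypass_cache(yes)

<Cache>
    ; Assumption: cache() is set in a <Cache> layer.
    ; Delete previously cached objects for this domain and do not cache new ones.
    url.domain=example.com cache(no)
```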
However, the requirement here is to leave the caching service enabled.
Cause
The HTTP setting "Cache expired objects" is configured in the CLI with "# (config) http [no] cache expired" (refer to the "# (config) http" chapter).
When enabled (the default), it "retains cached objects older than the explicit expiration". The "explicit expiration" is the expiration time that the OCS (web server) specifies in its HTTP "Cache-Control" header, or never sends at all.
The settings can be displayed with the "#show http" command. Example output:
SWG#(config)show http
Supported protocol version: HTTP version preserved
Caching options:
Cache authenticated data: enabled
Cache expired objects: enabled
Cache personal pages: disabled
Strip From Headers: disabled
Byte range support: enabled
Force NTLM on proxy IE: disabled
Rewrite redirects for XP: disabled
Revalidate "pragma: no-cache": disabled
WWW redirect if host not found: enabled
Force explicit expirations:
Never refresh before: disabled
Never serve after: enabled
Add headers:
"Front-end-https": disabled
"Via": disabled
"X-forwarded-for": disabled
"Client-ip": disabled
Parsing options:
HTML meta tag "Cache-Control": enabled
HTML meta tag "Expires": enabled
HTML meta tag "Pragma: no-cache": enabled
Persistent connections:
Client connections: enabled
Server connections: enabled
Clientless request:
Global limit: 200
Pipeline limit: 5
Server limit: 50
Pipeline:
Client requests: disabled
Client redirects: disabled
Prefetch requests: disabled
Prefetch redirects: disabled
Substitute simple Get for:
Get "if-modified-since": disabled
Get "pragma: no-cache": disabled
HTTP 1.1 Conditional get: disabled
Internet Explorer reload: disabled
Proprietary header extensions:
Blue Coat extensions: disabled
FTP proxy:
Url path is: absolute from root
Configuration/access log uploads: will use PASV
Persistent connection timeouts:
Server: 900
Client: 360
Receive timeouts:
Server: 180
Client: 120
Refresh: 90
tolerant-request-parsing: disabled
location-header-rewrite: disabled
exception-on-network-error: enabled
allow-upstream-407: disabled
detect-server-close-while-idle: disabled
Client connection limit: 10000
The "http cache expired" setting above works alongside the following settings:
# (config) http [no] strict-expiration refresh (Forces compliance with explicit expiration by never refreshing objects before their explicit expiration. The no parameter clears the setting.) Disabled by default.
# (config) http [no] strict-expiration serve (Forces compliance with explicit expiration by never serving objects after their explicit expiration. The no parameter clears the setting.) Enabled by default.
Resolution
Use the time-to-live (TTL) "ttl()" cache layer property to set how long objects downloaded from specific hosts stay in the cache, according to business or web application requirements.
Refer to the "ttl()" chapter of the SGOS admin guide and to article ID 166456, "How to manually set cache time (ttl) for specific URL?".
A practical example for "example.com":
Install a CPL policy that expires its cached objects after 10 seconds:
<Cache>
    ; Set the specified cached objects to expire after 10 seconds.
    url=https://example.com/images/smile.ico ttl(10) ; a specific URL object
    url=//www.example.com/dyn_images ttl(10) ; a specific URL object
    url.domain=example.com ttl(10) ; all objects downloaded from a specific domain
Each time a client requests the object through the proxy, the proxy checks the object's TTL and, if it has expired, fetches a new copy.
TTL-related statistics can be viewed at the SGOS advanced URL "https://edgeSWG:8082/CE/Info/https/", for example "https://edgeSWG:8082/CE/Info/https/example.com".