How to detect applications that return multiple transfer-encoding headers
search cancel

How to detect applications that return multiple transfer-encoding headers

book

Article ID: 297408

calendar_today

Updated On:

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Users of applications running on Tanzu Application Service (TAS) for VMs may experience unexpected 502 errors. Upon looking into /var/vcap/sys/log/gorouter/gorouter.stdout.log, you will see "too many transfer encodings" in the error message. This happens after upgrading to TAS for VMs  2.7.30, 2.8.24, 2.9.18 or 2.10.10, or Isolation Segment 2.7.29, 2.8.23, 2.9.17, 2.10.9.

For example:

"error":"net/http: HTTP/1.x transport connection broken: too many transfer encodings: 
[\"chunked\" \"chunked\"] 
 

This knowledge article is going to discuss how operation teams and developer teams can prepare for this change and detect applications that are incorrectly returning transfer-encoding headers so they can be prepared to upgrade to newer versions of TAS.

For background into the issue and to better understand the details of this problem, please see the knowledge article Applications on TAS for VMs get 502 chunked response error.

Resolution

This particular issue is difficult to debug because, in older versions of Tanzu Application Service, the Gorouter and Go components would silently ignore multiple transfer-encoding headers. Thus there are no log messages you can search for to detect this problem.

To ensure that customers are able to detect problem applications prior to upgrading to a version of TAS that does not permit multiple transfer-encoding headers, VMware Tanzu Engineering has published specific patch versions of Tanzu Application Service and Isolation Segment Tiles called the mitigation patch
 

The below tables show which Tile versions introduced this breaking change for Tanzu Application Server and Tanzu Application Isolation Segment.

  • First Version with Transfer Encoding Breaking Change
    • Apps that have not been updated will be impacted after upgrading to one of these Patch versions
  • Mitigation Patch
    • Operators can use this patch version that allows them to detect which apps are impacted by the transfer encoding chunked header before upgrading to a later release.  Apps will not be impacted after upgrading to this release.
  • Transfer Encoding Breaking Change Reintroduced
    • The mitigation patch is removed in this release and all future releases of the tile.  Apps that have not been updated will be impacted after upgrading to these releases


Tanzu Application Tile Versions

Major Version First Version with  Transfer Encoding Breaking Change Mitigation Patch Transfer Encoding Breaking Change Reintroduced
2.7 2.7.30 2.7.31 2.7.32+
2.8 2.8.24 2.8.25 2.8.26+
2.9 2.9.18 2.9.19 2.9.20+
2.10 2.10.10 2.10.11 2.10.12+
2.11 2.11.0 Not Applicable All version of 2.11 include the Transfer Encoding breaking change

Tanzu Application Isolation Segment Tile Versions

Major Version First Version with  Transfer Encoding Breaking Change Mitigation Patch Transfer Encoding Breaking Change Reintroduced
2.7 2.7.29 2.7.30 2.7.31+
2.8 2.8.23 2.8.24 2.8.25+
2.9 2.9.17 2.9.18 2.9.19+
2.10 2.10.9 2.10.10 2.10.11+
2.11 2.11.0 Not Applicable All version of 2.11 include the Transfer Encoding breaking change




To determine if customers will be impacted by this change in Go & Gorouter, all customers should do the following:

  1. Upgrade to one of the specific mitigation patch versions of TAS listed above
  2. Use the methods described below to locate all applications that are returning multiple transfer-encoding headers or unsupported transfer-encodings
  3. Work with application developers to remove the extra or incorrect transfer-encoding headers
  4. Validate that no applications are returning multiple transfer-encoding headers. Then and only then you are ready to upgrade beyond the maintenance versions listed above.

For example, if you are on the 2.7 LTS branch of TAS, you should first upgrade to TAS 2.7.31, then validate your applications are correct, and after doing that, you may safely upgrade to 2.7.32+.

Operations Teams

As an operations team, you want a solution to monitor all applications running on your platform so that you can catch any application returning multiple transfer-encoding headers and then notify your application development teams. Mitigation patch versions 2.7.31, 2.8.25, 2.9.19, and 2.10.11 of TAS and versions 2.7.30, 2.8.24, 2.9.18, 2.10.10 of Isolation Segment all have the following functionality built in to accomplish this.

  1. You may monitor the Gorouter logs. These versions of TAS will output log entries when multiple transfer-encoding headers or an invalid transfer-encoding header (like gzip or identity) are returned. The application log message includes request information as well as the org, space, and application name so that you can easily notify the correct application development team. The route service log entries include the route URL as well as the x-cf-forwarded-url header which shows you the request that triggered the problem.

    If you are using a service like Splunk or ELK, you can search for the phrases received-deprecated-response-from-app and received-deprecated-response-from-route-service. The former will indicate applications that return multiple transfer-encoding headers and the latter will return route services that return multiple transfer-encoding headers.

    router/c99bebc8-2612-45cb-987d-af3b2c17f5e9: stdout | {"log_level":3,"timestamp":"2021-02-10T03:18:00.340162741Z","message":"received-deprecated-response-from-app","source":"vcap.gorouter","data":{"route-endpoint":{"ApplicationId":"ad561f0c-####-443a-9969-0cbc9afd34aa","Addr":"10.###.##.23:61046","Tags":{"app_id":"ad561f0c-####-443a-9969-0cbc9afd34aa","app_name":"route-service-example","component":"route-emitter","instance_id":"0","organization_id":"c968e6b8-d8bf-48f4-8e4f-272bf3295052","organization_name":"#####","process_id":"ad561f0c-####-443a-9969-0cbc9afd34aa","process_instance_id":"ba9f9703-####-4b73-5504-f29e","process_type":"web","source_id":"ad561f0c-####-443a-9969-0cbc9afd34aa","space_id":"98764b84-ab78-4244-a92e-eee94b392abc","space_name":"dev"},"RouteServiceUrl":""},"host":"route-service-example-#########-#########-cr.cfapps-44.slot-59.###.#######.com"}} router/c99bebc8-####-45cb-987d-af3b2c17f5e9: stdout | {"log_level":3,"timestamp":"2021-02-11T14:44:51.251480691Z","message":"received-deprecated-response-from-route-service","source":"vcap.gorouter","data":{"route-service-url":{"Scheme":"https","Opaque":"","User":null,"Host":"route-service-example-#######-########-cr.cfapps-44.slot-59.pez.vmware.com","Path":"","RawPath":"","ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""},"x-cf-forwarded-url":"https://spring-music.cfapps-##.slot-##.###.#####.com/incorrect"}}
  2. If as an operations team you are redirecting all application logs to an aggregation service like Splunk or ELK or you are directly sending Gorouter access logs to syslog, you can also search for x_cf_routererror where the value is too_many_transfer_encodings. Every access log entry written to the application log stream by Gorouter will have the x_cf_routererror field and any request which would have failed due to the restrictions Gorouter is imposing on transfer-encoding headers will have an error message in this field.

    2021-02-11T09:40:45.53-0500 [RTR/0] OUT route-service-example-anxious-wildebeest-cr.cfapps-44.slot-59.pez.vmware.com - [2021-02-11T14:40:45.457165501Z] "GET / HTTP/1.1" 200 0 4669 "-" "curl/7.64.1" "10.213.54.5:43688" "10.213.54.23:61046" x_forwarded_for:"10.###.##.253, 10.###.##.5, 10.###.##.19, 10.###.##.5" x_forwarded_proto:"https" vcap_request_id:"0be1fee9-462b-####-5d4f-3d2f6c2d91f4" response_time:0.080853 gorouter_time:0.000789 app_id:"ad561f0c-####-443a-9969-0cbc9afd34aa" app_index:"0" x_cf_routererror:"too_many_transfer_encodings" x_b3_traceid:"a509eb58605e4327" x_b3_spanid:"a509eb58605e4327" x_b3_parentspanid:"-" b3:"a509eb58605e4327-a509eb58605e4327"

    If you are not sending application logs to an aggregations service or syslog, then you can capture the Gorouter access logs directly from the VM. They reside under /var/vap/sys/log/gorouter or you can obtain them with bosh logs or through Ops Manager's log retrieval mechanism. 

    Once you have the logs, you can run grep 'x_cf_routererror:"too_many_transfer_encodings"' access.log. This will return the log lines for requests to impacted applications. Alternatively, you can use the top-logs tool to extract aggregate information from your access logs, including the number of requests that are flagged as impacted.

  3. You can monitor the metric deprecated_responses. Gorouter will emit this metric which is a counter that will indicate how many instances of this transfer-encoding issue have been observed by Gorouter.

    Here's an example of what the metric looks like observed with the cf nozzle cf cli plugin.

    $ cf nozzle -no-filter | grep deprecated
    origin:"gorouter" eventType:CounterEvent timestamp:1612881548929244098 deployment:"cf" job:"router" index:"173a2629-####-####-988c-afb5b046e899" ip:"10.#.#.#" tags:<key:"source_id" value:"gorouter" > counterEvent:<name:"deprecated_responses" delta:0 total:1 >

    If you are sending TAS foundation metrics to a metrics monitoring platform like Prometheus/Grafana, you may want to add a chart to visualize this metric. To ensure that your foundation is not going to be impacted when you upgrade, you will want this metric to consistently display a zero value. Zero means that there are no requests which have been impacted by the stricter requirements in Go 1.15.

Development Teams

As a development team, you can also watch for this problem. Once your Operations Team has upgraded your foundation to one of these TAS versions 2.7.31, 2.8.25, 2.9.19, and 2.10.11, and one of these Isolation Segment 2.7.30, 2.8.24, 2.9.18, 2.10.10 versions then you will be able to use the following methods to determine if your applications will be impacted.

  1. Watch the log stream for your application. This can be done with cf logs or by using a log drain and routing logs to a log aggregation service (syslog, Splunk, ELK, etc..).

    Search your application logs for lines where the x_cf_routererror field has a value of  too_many_transfer_encodings. Every access log entry written to the application log stream by Gorouter will have the x_cf_routererror field and any request that would have failed due to the new restrictions Go 1.15 is imposing on transfer-encoding headers will have an error message in this field.

    2021-02-11T09:40:45.53-0500 [RTR/0] OUT route-service-example-anxious-wildebeest-cr.cfapps-44.slot-59.pez.vmware.com - [2021-02-11T14:40:45.457165501Z] "GET / HTTP/1.1" 200 0 4669 "-" "curl/7.64.1" "10.###.##.#:43688" "10.###.##.23:61046" x_forwarded_for:"10.166.51.253, 10.213.54.5, 10.213.54.19, 10.###.##.5" x_forwarded_proto:"https" vcap_request_id:"0be1fee9-462b-####-5d4f-3d2f6c2d91f4" response_time:0.080853 gorouter_time:0.000789 app_id:"ad561f0c-####-443a-9969-0cbc9afd34aa" app_index:"0" x_cf_routererror:"too_many_transfer_encodings" x_b3_traceid:"a509eb58605e4327" x_b3_spanid:"a509eb58605e4327" x_b3_parentspanid:"-" b3:"a509eb58605e4327-a509eb58605e4327"
  2. When making requests to your applications, if a request is impacted Gorouter will insert the header Cf-Deprecated-Response. If you are seeing this header in response headers from requests to your application, then those requests are impacted by this change.

    For example:

    > GET /albums HTTP/1.1
    > Host: spring-music.cfapps-44.slot-##.###.#####.com
    > User-Agent: curl/7.64.1
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Cf-Deprecated-Response: too many transfer encodings: ["chunked" "chunked"]
    < Content-Type: application/json;charset=UTF-8
    < Date: Wed, 10 Feb 2021 03:25:03 GMT
    < X-Vcap-Request-Id: 19a46950-####-####-####-9c557f995bf2
    < Transfer-Encoding: chunked

Additional Information

VMware Tanzu Engineering has made the following changes to ensure that customers are easily able to find & fix problem applications running on TAS versions 2.7.31, 2.8.25, 2.9.19, and 2.10.11.
  1. Go 1.15 has been forked and adjusted so that it does not error when multiple transfer-encoding headers are present so long as all of the transfer-encoding headers present are set to "chunked".
  2. The Go 1.15 fork will add a header Cf-Deprecated-Response when multiple transfer-encoding headers are present and all set to "chunked".
  3. Gorouter has been compiled with the forked version of Go 1.15.
  4. Gorouter has been modified to not fail when there are multiple transfer-encoding headers.
  5. When multiple transfer-encoding headers are present Gorouter emits a log. Either received-deprecated-response-from-app or received-deprecated-response-from-route-service. These logs show up in gorouter.stdout.log.
  6. Gorouter emits a metric deprecated_responses
  7. Gorouter sets the header X-Cf-Routererror: too_many_transfer_encodings. This header shows up in access logs and app logs.
  8. Go 1.15 has also removed support for the transfer-encoding value of identity. This is still supported with Go 1.14, but neither Go 1.15 or the fork of Go 1.15 include support for using a value of identity. You will continue to see an error and response code of 502 if you have applications that return a transfer-coding of identity.
The functionality above which has been added to TAS 2.7.31, 2.8.25, 2.9.19, and 2.10.11 will only be present in these versions of TAS. Starting with TAS 2.7.32, 2.8.26, 2.9.20, and 2.10.12, these features will be removed. This is because it is not feasible to continue maintaining the fork of Go which is required to implement these features.

All customers must fix applications that are returning multiple or invalid transfer-encoding headers. This is a requirement before TAS can be upgraded beyond versions 2.7.31, 2.8.25, 2.9.19, and 2.10.11.