Virtual services are down due to "No Network/subnet matched server IP" after upgrading Avi, AKO, and the workload cluster.

Article ID: 408940


Updated On:

Products

VMware Avi Load Balancer

Issue/Introduction

1> Routes are not synced to the Avi Controller.

2> With DEBUG logging enabled, the AKO logs show that the VRF cache is not being updated; the global VRF is not found in the cache:

cat ako-afterreboot.txt | grep rest | grep vrf
2025-08-26T09:19:07.246Z    INFO    rest/dequeue_nodes.go:99    key: admin/global, msg: processing vrf object
2025-08-26T09:19:07.246Z    INFO    rest/avi_obj_vrf.go:123    vrf cache object NOT found for vrf name: global
2025-08-26T09:19:07.246Z    WARN    rest/dequeue_nodes.go:190    key: admin/global, vrf global not found in cache, exiting 

 

 

Environment

Previous AKO version: 1.7.6
Upgraded AKO version: 1.10.3
AKO Service Type: ClusterIP

Cause

The primary problem was a data type mismatch within the Avi Kubernetes Operator (AKO). Specifically, the local_as field in the BGP profile was defined as a 32-bit signed integer (int32).

A BGP Autonomous System (AS) number can range from 1 to 4,294,967,295 (a 32-bit unsigned integer). The error message "cannot unmarshal number 4200050067 into Go struct field ... of type int32" shows that the configured AS number, 4200050067, exceeds 2,147,483,647, the maximum value of a signed 32-bit integer.

When AKO tried to read this large value from the Avi Controller, the parsing failed due to this integer overflow. This failure caused the entire VRF context cache to become corrupted or fail to update, as indicated by the log "vrf cache object NOT found for vrf name: global".
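The overflow described above can be reproduced outside AKO with Go's standard encoding/json package. The following is a minimal sketch, not AKO's actual code: the struct signedProfile and the helpers decodeSigned and isTypeError are hypothetical names chosen for illustration, and only the local_as JSON key and the value 4200050067 come from the logs in this article.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// signedProfile is a hypothetical stand-in for a BGP profile whose
// local_as field is declared as a signed 32-bit integer.
type signedProfile struct {
	LocalAS int32 `json:"local_as"`
}

// decodeSigned parses a JSON BGP profile into signedProfile.
func decodeSigned(data []byte) (signedProfile, error) {
	var p signedProfile
	err := json.Unmarshal(data, &p)
	return p, err
}

// isTypeError reports whether err is the type error that encoding/json
// returns when a JSON value does not fit the target Go type.
func isTypeError(err error) bool {
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr)
}

func main() {
	// 4200050067 is larger than the int32 maximum of 2147483647.
	_, err := decodeSigned([]byte(`{"local_as": 4200050067}`))
	fmt.Println("type error:", isTypeError(err))
	fmt.Println(err)

	// A 2-byte AS number fits in int32 and decodes without error.
	p, err := decodeSigned([]byte(`{"local_as": 65001}`))
	fmt.Println(p.LocalAS, err)
}
```

The first call fails with an error of the same form as the one in the AKO logs, while the second succeeds, which suggests that only AS numbers above 2,147,483,647 would trigger the problem.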

Resolution

The AKO logs show that the VRF and static routes were not synced because the VRFContext object was missing from the AKO cache:

cat ako-afterreboot.txt | grep rest | grep vrf
2025-08-26T09:19:07.246Z    INFO    rest/dequeue_nodes.go:99    key: admin/global, msg: processing vrf object
2025-08-26T09:19:07.246Z    INFO    rest/avi_obj_vrf.go:123    vrf cache object NOT found for vrf name: global
2025-08-26T09:19:07.246Z    WARN    rest/dequeue_nodes.go:190    key: admin/global, vrf global not found in cache, exiting 
 

On further review, we found that during the VRF cache update there was an integer conversion issue, as evident from the error below:

 
2025-08-26T09:19:00.882Z WARN #####    Failed to unmarshal data, err: json: cannot unmarshal number 4200050067 into Go struct field BgpProfile.bgp_profile.local_as of type int32
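When triaging a larger log file, it may help to pull the offending field and value out of such lines automatically. The sketch below assumes the error text has the same shape as the line above; the function findInt32Overflow and the regular expression are illustrative and are not part of AKO.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strconv"
)

// overflowRe matches the encoding/json error text quoted in this article and
// captures the number (group 1) and the struct field (group 2).
var overflowRe = regexp.MustCompile(
	`cannot unmarshal number (\d+) into Go struct field (\S+) of type int32`)

// findInt32Overflow extracts the field name and value from a log line that
// reports an int32 unmarshal failure. ok is false if the line does not match.
func findInt32Overflow(line string) (field string, value uint64, ok bool) {
	m := overflowRe.FindStringSubmatch(line)
	if m == nil {
		return "", 0, false
	}
	v, err := strconv.ParseUint(m[1], 10, 64)
	if err != nil {
		return "", 0, false
	}
	return m[2], v, true
}

func main() {
	line := "2025-08-26T09:19:00.882Z WARN #####    Failed to unmarshal data, " +
		"err: json: cannot unmarshal number 4200050067 into Go struct field " +
		"BgpProfile.bgp_profile.local_as of type int32"

	if field, value, ok := findInt32Overflow(line); ok {
		fmt.Println("field:", field)
		fmt.Println("value:", value)
		fmt.Println("exceeds int32 max:", value > math.MaxInt32)
	}
}
```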

 

Resolution

The type of the local_as field in the BGP profile of the VRFContext has been changed from int32 to uint32 to prevent parsing errors.
This fix is included in AKO 1.12.3 and later releases.
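The effect of the type change can be illustrated with a small standalone program. This is a sketch under the assumption that the field is decoded with Go's encoding/json; unsignedProfile and decodeUnsigned are hypothetical names and do not reflect AKO's actual source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unsignedProfile is a hypothetical stand-in for a BGP profile whose
// local_as field is declared as an unsigned 32-bit integer.
type unsignedProfile struct {
	LocalAS uint32 `json:"local_as"`
}

// decodeUnsigned parses a JSON BGP profile into unsignedProfile.
func decodeUnsigned(data []byte) (unsignedProfile, error) {
	var p unsignedProfile
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	// The AS number from the logs now decodes without error.
	p, err := decodeUnsigned([]byte(`{"local_as": 4200050067}`))
	fmt.Println(p.LocalAS, err)

	// The largest 4-byte AS number also fits in uint32.
	p, err = decodeUnsigned([]byte(`{"local_as": 4294967295}`))
	fmt.Println(p.LocalAS, err)

	// Anything above the uint32 range is still rejected.
	_, err = decodeUnsigned([]byte(`{"local_as": 4294967296}`))
	fmt.Println(err != nil)
}
```

With uint32, the whole 4-byte AS number range up to 4,294,967,295 is representable, so the value from the logs no longer causes a decode failure.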

Workaround

As a temporary workaround, the following steps can be used:

  1. In the AKO ConfigMap, change disableStaticRouteSync from false to true.

  2. Delete the AKO pod.

  3. Change disableStaticRouteSync back to false.

  4. Delete the AKO pod again.

After performing these steps, the VRFContext syncs correctly and all routes are visible.