Multiple serialized bridge configurations and their impact on Sourcefire application monitoring when a sensing circuit/interface goes down

Article ID: 167826

Products

XOS

Issue/Introduction

This article describes the application monitoring behavior when a sensing interface is down.
XOS 9.6.X/SF 4.10
 
Issue: If one of the sensing circuits tied to an IPS bridge in a serialized configuration goes down (interface administratively down or cable unplugged), the symptoms below are seen in single-bridge and multiple-bridge scenarios where "link-state-resistant" is not defined:
 
Single Bridge
  • XVNIM reports an event such as the one below:
Mar 5 11:02:37 ids_1 kernel: XVNIM info: netdev_event: Network device inside is DOWN !
  • No traffic passes through the Sourcefire VAP, but the application still shows "Active" until a "reload vap-group <vapgroupname>" is done
  • "pmtool checkdestatus" on the SF APM still returns "0" (successful), and "pmtool status | grep -i running" run from the VAP still shows the DEs as running


Multiple Bridges
 
  • No traffic flows after a reload all or reload vap-group is done
  • After a reload all or reload vap-group, the Sourcefire application shows "UP" until the sensing interface is plugged in or administratively brought up. If the circuit is then configured with "link-state-resistant", XVNIM detects the change and the Sourcefire application shows "Active".
  • In serialized L2-L3 configurations, this behavior causes an outage: after a reload vap-group, nothing passes through the other bridges that are UP while one bridge is down
  • When the sensing interface is plugged in or administratively brought up (no link-state-resistant defined), XVNIM sees the interface going UP and prints a message such as the one below; the application is marked "Active" and traffic starts flowing automatically through the other bridges that are UP
Mar 6 14:36:41 ids_1 kernel: XVNIM info: netdev_event: Network device inside UP         !cable plugged or link-state-resistant defined on the circuit
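The netdev_event messages quoted above can be tracked mechanically when reviewing a log. The following is a minimal Python sketch, assuming the message format matches the sample lines in this article; the regular expression and the function name last_device_states are illustrative and not part of XOS or Sourcefire.

```python
import re

# Pattern assumed from the sample lines in this article. The optional "is"
# covers both "inside is DOWN" and "inside UP" as they appear above.
NETDEV_EVENT = re.compile(
    r"XVNIM info: netdev_event: Network device (?P<dev>\S+)\s+"
    r"(?:is\s+)?(?P<state>GOING_DOWN|DOWN|UP)\b"
)


def last_device_states(lines):
    """Return {device: last reported state} from XVNIM netdev_event lines."""
    states = {}
    for line in lines:
        match = NETDEV_EVENT.search(line)
        if match:
            # Later lines overwrite earlier ones, so the final value is the
            # most recent state seen for that device.
            states[match.group("dev")] = match.group("state")
    return states


sample = [
    "Mar 5 11:02:37 ids_1 kernel: XVNIM info: netdev_event: Network device inside is DOWN !",
    "Mar 6 14:36:41 ids_1 kernel: XVNIM info: netdev_event: Network device inside UP         !cable plugged",
]
print(last_device_states(sample))
```

A device whose last reported state is DOWN before a planned reload is the condition that leads to the multiple-bridge symptoms listed above.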
 

Cause

(Single bridge)
If a sensing interface on a Sourcefire VAP (with no "link-state-resistant" defined on the circuit) is unplugged or administratively taken down, no traffic passes because the "reader" is taken down, as seen in the "dmesg | grep debug" output below:
 
 
XVNIM debug: [2991]: Removing /proc/xvnim/readers/2991...
XVNIM debug: [2991]: Removed proc entry.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: [2991]: Destroyed reader.
XVNIM debug: [2991]: Closed XVNIM device node.
 
 
 
/var/log/messages (similar messages may be seen for each additional bridge that exists on the system):

Mar 14 10:21:28 ids_1 kernel: XVNIM error: Reader wanted to add unknown or removed device 'inside2' (int=0 in irqs=0 irqs off=0)
Mar 14 10:21:28 ids_1 snort[3099]: FATAL ERROR: Can't start DAQ (-1) - Error adding devices! (22)!
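The two /var/log/messages entries above form a recognizable signature: an XVNIM reader error naming the device, followed by a snort DAQ start failure. The Python sketch below scans log lines for that pair. It assumes the wording matches the samples in this article; the patterns and the function name find_reader_failures are hypothetical helpers, not a vendor tool.

```python
import re

# Both patterns are taken from the sample messages in this article and are
# assumptions about the general format, not a documented log specification.
READER_ERROR = re.compile(
    r"XVNIM error: Reader wanted to add unknown or removed device '(?P<dev>[^']+)'"
)
DAQ_FATAL = re.compile(r"snort\[\d+\]: FATAL ERROR: Can't start DAQ")


def find_reader_failures(lines):
    """Return (sorted device names with reader errors, whether snort's DAQ failed)."""
    devices = set()
    daq_failed = False
    for line in lines:
        match = READER_ERROR.search(line)
        if match:
            devices.add(match.group("dev"))
        elif DAQ_FATAL.search(line):
            daq_failed = True
    return sorted(devices), daq_failed


sample = [
    "Mar 14 10:21:28 ids_1 kernel: XVNIM error: Reader wanted to add unknown or removed device 'inside2' (int=0 in irqs=0 irqs off=0)",
    "Mar 14 10:21:28 ids_1 snort[3099]: FATAL ERROR: Can't start DAQ (-1) - Error adding devices! (22)!",
]
devices, daq_failed = find_reader_failures(sample)
print(devices, daq_failed)
```

The device names returned point to the sensing interfaces that need to be brought up, or protected with link-state-resistant, before the application can become active again.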
 
--------------------------
 
Multiple bridges (A and B) defined and no "link-state-resistant" set on the circuit
 User-added image

Please refer to the above diagram.
If one of the sensing circuits is down on bridge "ser", traffic continues on bridge "ser2" (the application state remains "ACTIVE") until a "reload vap" or "reload all" is done. After the reload, the application status stays "UP" and traffic does not flow through bridge 'B' until the condition on bridge "ser" is remediated.
 
+++++++++++++++++
Regarding the Sourcefire behavior in the multiple-bridge scenario when an interface is unplugged or administratively down:
Q1. Why does the application stop passing traffic from the other bridge(s) with active sensing circuits upon a reload all or reload VAP in this scenario?
A1. Without link-state-resistant, vapcfgd brings the network device down if the physical interface is down. The XVNIM driver detects the change and calls cbs_update_vnd_config to deactivate the VND circuit. No traffic should be forwarded on it afterwards. Here are the dmesg logs from when the physical interface was bounced:
XVNIM info: netdev_event: Network device inside2 is GOING_DOWN 
XVNIM info: netdev_event: Network device inside2 is DOWN 
XVNIM debug: new_vnd = ffff810204a92d80, old_vnd = ffff810204a92d80 
XVNIM debug: cbs_update_vnd_config reported change=0x2 for dev inside2 (identifier=3) 
XVNIM debug: Downed VND inside2 (1032). 
XVNIM info: netdev_event: Network device inside2 is UP 
XVNIM debug: new_vnd = ffff81021646c780, old_vnd = 0000000000000000 
XVNIM debug: cbs_update_vnd_config reported change=0x1 for dev inside2 (identifier=4) 
XVNIM debug: Brought VND inside2 (1032) back up. 
XVNIM info: vnd_info_print: dev inside2 circuit id=1032 vlan=fff dev=ffff8102035b8000 internal=0 last_change=0x1
XVNIM info: Other bridge members: ser2
XVNIM info: NPM ports: 1 / 5 
With link-state-resistant configured, vapcfgd shields the XVNIM driver from physical link status changes.
After a VAP is reloaded, the network device of a sensing interface needs to be UP so that the XVNIM driver can initialize readers for it. Otherwise, the XVNIM driver closes the device node and the application cannot become active. With multiple bridges, the application cannot forward traffic if any one of the bridges has no reader. Here are the relevant logs:
Mar 14 10:20:28 ids_1 kernel: XVNIM info: Allocated 5275648 bytes of memory in a ring consisting of 161 regions. 
Mar 14 10:20:28 ids_1 last message repeated 7 times 
Mar 14 10:20:28 ids_1 snort[2976]: Acquiring network traffic from "inside2:ser2" 
XVNIM error: Reader wanted to add unknown or removed device 'inside2' (int=0 in irqs=0 irqs off=0) 
XVNIM debug: [2991]: Removing /proc/xvnim/readers/2991... 
XVNIM debug: [2991]: Removed proc entry. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks.
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: Freed 5275648 bytes of memory in a ring consisting of 161 chunks. 
XVNIM debug: [2991]: Destroyed reader. 
XVNIM debug: [2991]: Closed XVNIM device node. 
Mar 14 10:21:28 ids_1 kernel: XVNIM error: Reader wanted to add unknown or removed device 'inside2' (int=0 in irqs=0 irqs off=0) 
Mar 14 10:21:28 ids_1 snort[3099]: FATAL ERROR: Can't start DAQ (-1) - Error adding devices! (22)! 
Once the network device is UP and the XVNIM driver is able to finish initialization for it, the application is declared ACTIVE automatically. Any link state change afterwards does not affect the associated XVNIM readers or the application status.
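The reasoning in A1 can be summarized as a simplified model: after a reload, the application becomes ACTIVE only if every sensing circuit's network device is up, and link-state-resistant keeps the device up regardless of the physical link. The Python sketch below encodes only that rule as described in this article. It is not XOS or XVNIM code, and the names SensingCircuit, netdev_up, and state_after_reload are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SensingCircuit:
    name: str
    link_up: bool
    link_state_resistant: bool = False

    def netdev_up(self):
        # Assumption from A1: with link-state-resistant, vapcfgd hides the
        # physical link state, so the network device stays up for XVNIM.
        return self.link_up or self.link_state_resistant


def state_after_reload(circuits):
    """Return (application state, circuits that prevent reader initialization)."""
    blocked = [c.name for c in circuits if not c.netdev_up()]
    # One circuit without a reader keeps the whole application out of
    # ACTIVE, which is why traffic on the other bridges also stops.
    return ("UP", blocked) if blocked else ("ACTIVE", [])


without_lsr = [SensingCircuit("inside", True), SensingCircuit("inside2", False)]
with_lsr = [SensingCircuit("inside", True), SensingCircuit("inside2", False, True)]
print(state_after_reload(without_lsr))
print(state_after_reload(with_lsr))
```

Run against the intended circuit states before a planned reload, the same check lists the circuits that would leave the application in the "UP" state.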

Q2. Why does the app_status check behave differently during normal operations versus after a reload of a vap-group when a link is down?
A2. An SF internal enhancement request was submitted by the SF support engineer working case 51522 and is targeted for a future 5.x release. There is no ETA for the fix at this time.

Resolution

Workaround


  1. As documented, enable link-state-resistant on the sensing circuits if you need to take an interface down temporarily, so that traffic does not stop for the other sensing circuits if a reload VAP or reload all is performed.
  2. If a sensing circuit is no longer used, make sure to remove it from the bridge and the configuration.
