More often than not, network devices managed by InCharge are not connected continuously due to various reasons. For example:
As a result, the managed domain is broken into network partitions. In some scenarios, a failure in the network can break connectivity between the management station and all the devices in a partition. Partition support in the InCharge application adds diagnosis of the Partition Down failure, which occurs when the management station loses its connection to ALL the devices in a partition.
A partition is defined as a group of managed computer systems in the InCharge topology that are interconnected; that is, a managed computer system can reach any other managed computer system in the same partition by traversing layer 2 and/or layer 3 connectivity relationships.
New class called Partition in ICIM.
New root cause: Partition Down
ConsistsOf/MemberOf relationships between Partition and ICIM_UnitaryComputerSystem (Switch, Router, Host, etc.)
New computed relationshipset ConnectedSystems in IPNetwork
- show computer systems in a network, similar to ConnectedSystems in a VLAN.
Configuration:
- Enable/disable partitioning through sm_tpmgr and discovery.conf
- Set partition DisplayName through partition.conf file
Note that InCharge does not report failures on ANY unmanaged entity, and this affects how partitions behave. If two managed devices are connected via two unmanaged interfaces and those interfaces go down, InCharge will not send any notifications. Users could be misled into thinking that, because the devices are connected via unmanaged interfaces, the partition program would create two separate partitions. However, only unmanaged devices break partition connectivity; unmanaged interfaces do not.
Internal specification:
Introduce class Partition which inherits from ICIM_Group.
Incorporate Partition Down failure to current ICIM model.
Add the following operations to ICIM_UnitaryComputerSystem. This should follow the conventions in ICIM and Object Factory.
makePartition(PartitionKey)
- create partition with specified PartitionKey and create ConsistsOf/MemberOf relationships between the partition and the computer system
getPartition()
- get the partition that this computer system is a member of.
getNeighboringSystems()
- get all the computer systems through all connectivity relations, including IPNetwork, BridgedVia, Peer, etc. This is needed to enable access to all computer systems in the neighborhood. This operation needs to be refined in ICIM_RelayDevice. Note that the NeighboringSystems relation for a host returns only the neighboring routers.
Add the following operations to ICIM_ObjectFactory.
makePartition(PartitionKey)
- create partition with specified PartitionKey
findPartition(PartitionKey)
- find partition of given PartitionKey
findPartitionOfComputerSystem(nameOrAddr)
- find partition that ConsistsOf the computer system related to given name or IP address.
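The three factory operations can be pictured with a small in-memory sketch. This is only an illustration in Python, not the ICIM implementation: the classes Partition and ObjectFactory, the helpers register_address and add_member, and the dictionary-based lookups are all assumptions made for this example. Only the operation names come from the list above.

```python
class Partition:
    """Hypothetical stand-in for the Partition class."""

    def __init__(self, key):
        self.key = key
        self.members = set()  # names of member computer systems


class ObjectFactory:
    """Hypothetical in-memory factory; not the real ICIM_ObjectFactory."""

    def __init__(self):
        self._partitions = {}    # PartitionKey -> Partition
        self._partition_of = {}  # system name -> Partition
        self._name_of_addr = {}  # IP address -> system name

    def register_address(self, system_name, ip_addr):
        # Assumed helper: records which system hosts a given address.
        self._name_of_addr[ip_addr] = system_name

    def make_partition(self, key):
        # Create the partition, or return the existing one for this key.
        if key not in self._partitions:
            self._partitions[key] = Partition(key)
        return self._partitions[key]

    def find_partition(self, key):
        # Returns None when no partition has this key.
        return self._partitions.get(key)

    def add_member(self, key, system_name):
        # Assumed helper: models the ConsistsOf/MemberOf relationship.
        partition = self.make_partition(key)
        partition.members.add(system_name)
        self._partition_of[system_name] = partition
        return partition

    def find_partition_of_computer_system(self, name_or_addr):
        # Accept either a system name or one of its IP addresses.
        system_name = self._name_of_addr.get(name_or_addr, name_or_addr)
        return self._partition_of.get(system_name)
```

In this sketch a lookup by name and a lookup by address resolve to the same Partition object, which is the behaviour the nameOrAddr parameter suggests.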
Add ConnectedSystems relation and implementation to IPNetwork class
- easier topology traversal
Add the following attribute to ICF_TopologyManager.mdl.
PartitionEnabled
- This attribute will be used by sm_tpmgr and discovery.conf to enable/disable partitioning
Partitioning will be added to post processing in discovery.
Partitioning algorithm. This is a brief description.
- Get all computer systems
- For each system, get its partition, or create a partition if it does not belong to any
- Get all neighboring systems. Put them in the same partition if they do not belong to any. If they belong to a different partition, merge the two partitions. Some optimizations have been discussed.
- If a system has no neighbor, do not create a Partition for this system.
- If a system is unsupported or unmanaged, do not include it in any partition.
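The steps above can be sketched with a union-find (disjoint-set) structure, which is one common way to do the "merge the two partitions" step. This is a minimal Python illustration, not the InCharge implementation. The function build_partitions, its arguments, and the representation of connectivity as system-to-system pairs are assumptions. It also assumes that a system whose only neighbors are unmanaged counts as having no neighbor.

```python
from collections import defaultdict


def build_partitions(systems, links, managed):
    """Group managed systems into partitions.

    systems: iterable of system names
    links:   iterable of (a, b) system-to-system connectivity pairs
    managed: set of names that are managed and supported
    """
    # Only links between two managed systems count, so an unmanaged
    # device in the path breaks connectivity.
    neighbors = defaultdict(set)
    for a, b in links:
        if a in managed and b in managed:
            neighbors[a].add(b)
            neighbors[b].add(a)

    parent = {}  # union-find forest: system -> representative

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two partitions

    for system in systems:
        # Skip unmanaged systems and systems with no neighbor.
        if system not in managed or not neighbors[system]:
            continue
        parent.setdefault(system, system)
        for other in neighbors[system]:
            parent.setdefault(other, other)
            union(system, other)

    groups = defaultdict(set)
    for system in parent:
        groups[find(system)].add(system)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    systems = ["r1", "r2", "r3", "u1", "h1", "lonely"]
    links = [("r1", "r2"), ("r2", "u1"), ("u1", "r3"), ("r3", "h1")]
    managed = {"r1", "r2", "r3", "h1", "lonely"}
    print(build_partitions(systems, links, managed))
```

In the usage example, the unmanaged device u1 sits between r2 and r3, so the managed systems fall into two partitions, and the isolated system gets none.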
Set partition DisplayName according to partition.conf
The format can be as follows:
ipAddr1 server in NY # partition that consists of the system
# that hosts ipAddr1
router1 routers in Boston # partition that consists of router1
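One way to read a file in this format is sketched below in Python. It assumes, from the sample lines only, that the first whitespace-separated token is a system name or IP address, that the rest of the line is the partition DisplayName, and that "#" starts a comment running to the end of the line. The function name parse_partition_conf is hypothetical.

```python
def parse_partition_conf(text):
    """Return a dict mapping a system name or address to a DisplayName."""
    display_names = {}
    for raw_line in text.splitlines():
        # Drop the comment part, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        fields = line.split(None, 1)
        if len(fields) < 2:
            continue  # no DisplayName given; ignored in this sketch
        name_or_addr, display_name = fields
        display_names[name_or_addr] = display_name.strip()
    return display_names


if __name__ == "__main__":
    sample = (
        "ipAddr1 server in NY # partition that consists of the system\n"
        "# that hosts ipAddr1\n"
        "router1 routers in Boston # partition that consists of router1\n"
    )
    for key, name in parse_partition_conf(sample).items():
        print(key, "->", name)
```

Each key could then be resolved to its partition with findPartitionOfComputerSystem(nameOrAddr) and the DisplayName set on the result.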