3 Configuration Control Module

The basis for managing any network is knowing its configuration. Thus, this module is in some sense the 'core' of the application. This chapter details the manner in which the application will learn and maintain up-to-date information about the network configuration.

3.1 Node / Component Discovery

Upon startup, every process component shall execute an algorithm to determine the number and types of nodes within its sphere of control, as well as the number and types of communication links that interconnect those nodes. Key information about each such component, such as capacity, shall be learned as well. For some components, and for some characteristics of others, discovering the information automatically may not be possible at the time of startup. It shall be possible for the details to be completed via manual intervention or via a pre-set configuration information file. Note that some important characteristics of many nodes in a network have to do with how that node routes traffic from itself to different destinations. That is, expected route 'costs' must be known. This information is generally known only to route control agents, and will thus need to be learned from them.

Upon startup:

  1. Determine the number and types of nodes
  2. Determine the number and types of communication links
  3. Get key information for each component, such as capacity

Information that cannot be discovered automatically is supplied via manual intervention or a pre-set configuration file.

Expected route 'costs' are obtained from route control agents.
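The way discovered information might be completed from a configuration file and from route control agents can be sketched as follows. This is a minimal illustration, not a prescribed design: the Component record, its field names, and the complete_configuration function are all hypothetical, and the rule that discovered values take precedence over pre-set ones is an assumption this specification does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """A node or link in the sphere of control (hypothetical record layout)."""
    name: str
    kind: str                            # e.g. "router", "host", "link"
    capacity: Optional[float] = None     # None means "not yet learned"
    route_costs: Optional[dict] = None   # destination -> expected cost


def complete_configuration(discovered, preset, agent_costs):
    """Fill gaps left by automatic discovery.

    discovered:  dict name -> Component found by the startup algorithm
    preset:      dict name -> dict of fields from a configuration file
                 or manual entry
    agent_costs: dict name -> {destination: cost} reported by route
                 control agents
    """
    for name, fields in preset.items():
        comp = discovered.get(name)
        if comp is None:
            # The component was not discovered at all; take it from the file.
            discovered[name] = Component(name=name, **fields)
            continue
        for key, value in fields.items():
            # Assumption: only fill characteristics that discovery left unknown.
            if getattr(comp, key) is None:
                setattr(comp, key, value)
    for name, costs in agent_costs.items():
        if name in discovered:
            discovered[name].route_costs = dict(costs)
    return discovered
```

For instance, a router discovered without a capacity would receive the capacity listed in the file, while a link absent from discovery would be created entirely from its file entry.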

3.2 Configuration Change Notification

Over time, of course, the network configuration will change. Components will be added or removed intentionally. Components will become unavailable at times. To the greatest extent possible, a process component will be able to receive automatic notification from other nodes in its sphere of control concerning these changes. In some cases, the process component must be able to automatically detect these changes, as in failure of a node to respond for some specified length of time. In other cases, the information will have to be supplied via manual intervention. In all cases, if the configuration change is relevant only to displays or reports produced locally to the process component, the appropriate display / report database will simply be updated. If the configuration change is to a part of the network for which displays or reports are generated upstream from the process component, the change notification will be forwarded to all such upstream process components. As an option, unexpected changes of configuration may cause an alarm notice to be produced locally and/or at specified upstream locations. This function is further described in the next section.

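A process component learns of configuration changes in one of three ways:

  1. Automatic notification from other nodes in its sphere of control
  2. Automatic detection by the process component itself (for example, a node fails to respond for a specified length of time)
  3. Manual intervention

If the change is relevant only to locally produced displays or reports, the appropriate display/report database will be updated.

If the change affects displays or reports generated upstream, the change notification will be forwarded to all such upstream process components.

Optionally, unexpected changes may cause an alarm notice locally and/or at specified upstream locations.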
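The routing of a change notification described in this section can be sketched as follows. The class and field names (ConfigChange, ProcessComponent, local_scope, upstream_for, and so on) are hypothetical, and plain lists stand in for the display/report database, the upstream message channel, and the alarm mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigChange:
    component: str
    description: str   # e.g. "added", "removed", "unavailable"
    expected: bool     # False for an unplanned change


@dataclass
class ProcessComponent:
    name: str
    local_scope: set                                   # components reported on locally
    upstream_for: dict = field(default_factory=dict)   # component -> upstream process names
    alarm_on_unexpected: bool = False                  # the optional alarm behaviour
    report_db: list = field(default_factory=list)      # stand-in for the display/report database
    outbox: list = field(default_factory=list)         # (destination, change) pairs to send
    alarms: list = field(default_factory=list)

    def handle_change(self, change):
        # Update local displays/reports when the change concerns them.
        if change.component in self.local_scope:
            self.report_db.append(change)
        # Forward to every upstream process that reports on this component.
        for upstream in self.upstream_for.get(change.component, []):
            self.outbox.append((upstream, change))
        # Optionally raise an alarm for unexpected changes.
        if self.alarm_on_unexpected and not change.expected:
            self.alarms.append(change)
```

A change may be both recorded locally and forwarded; the sketch treats the two conditions independently rather than as exclusive cases, which is one possible reading of the requirement.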

3.3 Alarm Management

Under some circumstances specific real-time events need to be dealt with in real time. The component process that detects such events will generate an alarm. Some unexpected changes of configuration may be deemed important enough to cause an alarm. In some cases, even a planned configuration change (that is, one made intentionally by operations personnel) might generate an alarm. If a key node is going to be taken down, it may be important to notify other nodes so they may adjust their routing tables, etc. Operations personnel may or may not know about all such dependencies. Other alarms have to do with exceeding certain threshold values. Each process component, at a minimum, will monitor its own use of CPU and memory resources in its resident processor, and generate an alarm when preset thresholds for those resources are exceeded. Similarly, it will monitor its utilization of communication links for performing its network management functions, to ensure it does not consume more than a preset percentage of that resource. It may also be desirable to generate alarms when utilization of a link exceeds specified thresholds, or when error rates for a link reach a certain level. When a processing node (host, router, etc.) becomes unresponsive (excessive delay in responding) for a certain length of time, this may also be cause for alarm.
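The two kinds of detection mentioned above, threshold crossings and unresponsive nodes, might be sketched as follows. The function and class names are hypothetical, resource samples are assumed to be plain numbers keyed by resource name, and how the samples are actually collected is outside the scope of this sketch.

```python
import time


def check_thresholds(samples, thresholds):
    """Return the names of monitored resources whose sample exceeds its threshold."""
    return sorted(
        name for name, value in samples.items()
        if name in thresholds and value > thresholds[name]
    )


class ResponseWatch:
    """Track when each node last responded (hypothetical helper)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record_response(self, node, now=None):
        # A caller may pass `now` explicitly; otherwise a monotonic clock is used.
        self.last_seen[node] = time.monotonic() if now is None else now

    def unresponsive(self, now=None):
        """Return nodes that have been silent for longer than timeout_s."""
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

Each name returned by either check would be a candidate for alarm generation, subject to whether an alarm has been defined and enabled for it.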

There shall be a mechanism to specify and to enable an alarm. The specification identifies the component and resource being monitored, the threshold value or other event that causes the alarm, and the action to be taken (handling) when the alarm occurs. Specifying and enabling will normally be two separate steps, since it may be desirable to define an alarm condition without always having alarm detection enabled.

For an alarm that has been defined and enabled, it may be desirable to disable the alarm temporarily. It may also be desirable, at times, to edit or delete the alarm specification.

Alarm definition, enabling, editing, disabling, and deletion shall be accomplished by a privileged user, both for the sphere of control for the local processor and for downstream processes. Some cases of alarm definition and enabling may be accomplished automatically at process startup time.
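The alarm lifecycle described above (define, enable, disable, edit, delete, all restricted to a privileged user) might be sketched as follows. AlarmSpec, AlarmRegistry, and the boolean privileged argument are hypothetical; a real system would authenticate users rather than accept a flag, and this specification does not prescribe how.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AlarmSpec:
    component: str       # component being monitored
    resource: str        # resource on that component
    threshold: float     # value that triggers the alarm
    action: str          # handling, e.g. "record", "notify", "react"
    enabled: bool = False


class AlarmRegistry:
    def __init__(self):
        self._specs = {}

    @staticmethod
    def _check(privileged):
        if not privileged:
            raise PermissionError("alarm administration requires a privileged user")

    def define(self, alarm_id, spec, *, privileged):
        self._check(privileged)
        if alarm_id in self._specs:
            raise ValueError(f"alarm {alarm_id!r} is already defined")
        # Definition and enabling are separate steps: new alarms start disabled.
        self._specs[alarm_id] = replace(spec, enabled=False)

    def enable(self, alarm_id, *, privileged):
        self._check(privileged)
        self._specs[alarm_id] = replace(self._specs[alarm_id], enabled=True)

    def disable(self, alarm_id, *, privileged):
        self._check(privileged)
        self._specs[alarm_id] = replace(self._specs[alarm_id], enabled=False)

    def edit(self, alarm_id, *, privileged, **changes):
        self._check(privileged)
        self._specs[alarm_id] = replace(self._specs[alarm_id], **changes)

    def delete(self, alarm_id, *, privileged):
        self._check(privileged)
        del self._specs[alarm_id]

    def get(self, alarm_id):
        return self._specs[alarm_id]

    def enabled_ids(self):
        return sorted(i for i, s in self._specs.items() if s.enabled)
```

Disabling keeps the specification in place, so a temporarily disabled alarm can be re-enabled without being redefined.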

There are three levels of action to be taken when an alarm occurs. In some cases the responsible process component will simply record the fact that the alarm has occurred (increment a counter) and ignore it. Note that in some cases incrementing such a counter to exceed a preset value may generate yet another alarm.

The second level of alarm handling is to notify an operator and/or an upstream process component that the alarm has occurred. This may involve changing a display or textual report, or sending a message upstream. Note that the upstream process receiving this message may combine it with other information to detect a specific event and generate yet another alarm.

The third level of alarm handling is to notify and react. In addition to performing the notify function precisely as for level two handling, a specific action must be performed. This means executing a procedure that does more than generate an informative message. There are two types of action that might be taken. Some preset actions will be provided. For a simple point-to-point link, for instance, one preset action might be to reset (disconnect and reestablish connection) the link. As the development and deployment of the Network Management Application evolves over time, a set of preset actions will be clearly defined.

An alternative to doing preset actions is to execute a specific user-supplied procedure or event handler. In this case the application process will simply execute the user-provided function. This option will provide a valuable means of developing and testing proposed preset actions, as well as handling very specialized circumstances.
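The three handling levels can be sketched as a single dispatch function. Everything named here (Level, handle_alarm, reset_link, the state dictionary and its "count" and "limit" keys) is hypothetical; in particular, reset_link only returns a description of what a real preset action would do, and representing an alarm as a dictionary is an assumption.

```python
from enum import IntEnum


class Level(IntEnum):
    RECORD = 1   # count the alarm and otherwise ignore it
    NOTIFY = 2   # tell an operator and/or an upstream process
    REACT = 3    # notify, then run a preset or user-supplied action


def reset_link(alarm):
    """Stand-in for a preset action; a real one would disconnect and reconnect."""
    return f"reset {alarm['component']}"


def handle_alarm(alarm, level, state, notify=None, action=None):
    """Handle one alarm occurrence at the given level.

    state:  dict holding the occurrence counter and an optional "limit"
    notify: callable that receives the alarm (levels 2 and 3)
    action: preset or user-supplied callable (level 3 only)
    """
    if level == Level.RECORD:
        state["count"] = state.get("count", 0) + 1
        limit = state.get("limit")
        # Exceeding the preset count may itself warrant another alarm.
        return {"escalate": limit is not None and state["count"] > limit}
    notify(alarm)
    result = action(alarm) if level == Level.REACT else None
    return {"escalate": False, "result": result}
```

Because a preset action and a user-supplied handler share the same call signature in this sketch, a user-supplied procedure that proves useful could later be promoted to a preset action without changing the dispatch logic.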

An alarm may occur when

  1. CPU or memory resource use exceeds threshold values
  2. Utilization of a communication link exceeds specified thresholds, or error rates for a link reach a certain level
  3. A processing node (host, router, etc.) becomes unresponsive for a certain length of time

The mechanism to specify or enable an alarm should identify

  1. The component and resource being monitored
  2. The threshold value or other event that causes the alarm
  3. The action to be taken (handling) when the alarm occurs

For a defined and enabled alarm, it may be desirable to

  1. disable the alarm temporarily
  2. edit or delete the alarm specification

A privileged user can perform

  1. alarm definition
  2. alarm enabling
  3. alarm editing
  4. alarm disabling
  5. alarm deletion

This applies both to the sphere of control of the local processor and to downstream processes.

Three levels of action when an alarm occurs:

Level 1. Record: the responsible process component increments a counter and otherwise ignores the alarm; a counter exceeding a preset value may generate another alarm

Level 2. Notify: notify an operator and/or an upstream process, by changing a display or textual report or by sending a message upstream

Level 3. Notify and react: notify as in level 2, then execute a preset action or a specific user-supplied procedure or event handler