Send Queue % Used Monitor

  • ID:  Microsoft.SystemCenter.HealthService.CollectionRule.Performance.SendQueuePercentUsedMonitor
  • Description:  This monitor measures the Health Service Management Groups\Send Queue % Used counter for the Health service.
  • Target:  Agent
  • Enabled:  Yes

Operational States

Name State Description
Under Lower Threshold Success  
Between Thresholds Warning  
Over Upper Threshold Error  

Overridable Parameters

Parameter Name Default Value Description Override
Frequency 180  
Number of Samples 3  
Warning Threshold 90 Warning threshold value
Error Threshold 95 Error threshold value
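
The three operational states and the two threshold parameters above can be sketched as a simple classification. This is only an illustration, not the monitor's actual implementation: the function name `classify_send_queue` is hypothetical, the handling of values exactly equal to a threshold is an assumption, and the Frequency and Number of Samples parameters are ignored. The thresholds are passed as arguments because they are overridable (the knowledgebase summary for this monitor lists 50 % and 60 %).

```python
def classify_send_queue(percent_used, warning_threshold=90.0, error_threshold=95.0):
    """Map a Send Queue % Used reading to one of the monitor's state names.

    Assumption: "Over Upper Threshold" means strictly greater than the error
    threshold, and "Under Lower Threshold" means strictly less than the
    warning threshold. The source does not specify boundary behavior.
    """
    if percent_used > error_threshold:
        return "Error"      # Over Upper Threshold
    if percent_used >= warning_threshold:
        return "Warning"    # Between Thresholds
    return "Success"        # Under Lower Threshold


# Example readings, using the default values from the parameter table.
for reading in (42.0, 92.5, 97.0):
    print(reading, classify_send_queue(reading))
```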

Alert Details

Monitor State Message Priority Severity Auto Resolution
Over Upper Threshold (Error) {0}: The Health Service send queue on this system is filling up High Critical Yes

Run As Profiles

Name
Default

Monitor Knowledgebase

Summary

This monitor measures the Health Service Management Groups\Send Queue % Used and generates the following states:

  • Warning:  Send Queue % Used threshold of 50 %
  • Critical:  Send Queue % Used threshold of 60 %

Causes

This can be caused by a low-bandwidth or high-latency connection from this System Center Management Health Service to its parent Management Server. It can also be caused by rules that collect more data than the parent Management Server can process, especially when the parent Management Server has many agents reporting to it that send large amounts of data.

Resolutions

Check with your network administrators whether the network connection from the System Center Management Health Service to the parent Management Servers is saturated. If so, you may need to upgrade your networks to accommodate the traffic.

If you cannot upgrade your network (e.g. if your System Center Management Health Service or Gateway Server is at a remote branch office), you can disable unnecessary collection rules. The following rule types can be disabled; the impact of disabling each is listed with it:

Performance Collection

  • Rule Purpose:  Collects performance data to the Operational Database, the Data Warehouse, or both.
  • Impact when disabled:  When a performance collection rule is disabled, any view that shows that performance data will no longer have data to display. If the rule was collecting data to the Data Warehouse, reports that depend on that performance data will no longer render any data.

Event Collection

  • Rule Purpose:  Collects event data for diagnosis. In some cases, an event may not be helpful to alert on, but is helpful for either forensic troubleshooting or near-real-time troubleshooting.
  • Impact when disabled:  When an event collection rule is disabled, any view that shows that event data will no longer have data to display. If the rule was collecting data to the Data Warehouse, reports that depend on that event will no longer render any data.

Lastly, if you still need that data, you can try to reduce the amount of data sent over the network by using optimized performance counter collection rules and event consolidation collection rules. The following summarizes their benefit and explains how the data is summarized.

Optimized Performance Collection Rule

  • Benefit:  Only sends the performance data sample if it deviates from the last sample by more than a configured percentage. E.g., if the last sample was 42 and the rule is configured with a tolerance of 10%, the next sample is sent only if it differs from 42 by more than 4.2 (that is, it is greater than 46.2 or less than 37.8).
  • How data is summarized:  Because only performance data that exceeds the configured tolerance is sent to the Operational Database or Data Warehouse, the data will be less precise. The larger your tolerance, the lower the precision.
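
The tolerance check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: the function name `filter_samples` is hypothetical, and each new sample is assumed to be compared against the last sample that was actually sent.

```python
def filter_samples(samples, tolerance=0.10):
    """Return only the samples that would be sent under a relative tolerance.

    Assumption: a sample is sent when it differs from the last *sent* sample
    by more than `tolerance` (a fraction, e.g. 0.10 for 10%) of that value.
    The first sample is always sent.
    """
    sent = []
    last_sent = None
    for value in samples:
        if last_sent is None or abs(value - last_sent) > abs(last_sent) * tolerance:
            sent.append(value)
            last_sent = value
    return sent


# With a last sample of 42 and a 10% tolerance, 44 is dropped (within 4.2),
# while 46.3 is sent and becomes the new reference value.
collected = [42, 44, 46.3, 45, 37.8, 41, 50]
print(filter_samples(collected))
```

A larger `tolerance` drops more samples, which is the loss of precision mentioned above.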

Consolidated Event Collection Rule

This type of event collection rule sends an event only when one of the parameters it is configured to compare differs from the last event; events that match are consolidated. E.g., you can configure a consolidated collection rule to consolidate events where the following are identical:

  • Event Source

  • Event ID

  • Source Computer

  • Description

You can then configure a timeframe over which to consolidate these events (e.g. 10 minutes). If the above criteria match for any event within that 10-minute window, only 1 event is sent up, with its Repeat Count property incremented. If this event occurs frequently on a single agent, at most 144 events are sent up in a 24-hour period (one per 10-minute window), which may be significantly fewer than the number of events actually logged to the event log.

You need to be aware of which event parameters and properties you consolidate on. For example, if you consolidate on the Description and the Event Description is typically unique (e.g. it contains a username), you will still get many events sent up. In that case, you would instead want to consolidate on the Event Parameter that represents the username field.
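
The consolidation arithmetic above can be checked with a small sketch. It rests on several assumptions that are not from the product: the function name `consolidate` is hypothetical, windows are fixed and aligned to time zero, and `repeat_count` here is simply the number of matching events in the window.

```python
WINDOW_SECONDS = 10 * 60  # 10-minute consolidation window


def consolidate(events, window_seconds=WINDOW_SECONDS):
    """Collapse matching events within each window into one record.

    `events` is an iterable of tuples:
    (timestamp_seconds, source, event_id, computer, description).
    Events match when all four non-timestamp fields are identical.
    """
    counts = {}
    for ts, source, event_id, computer, description in events:
        window = int(ts // window_seconds)
        key = (window, source, event_id, computer, description)
        counts[key] = counts.get(key, 0) + 1
    return [
        {
            "window_start": window * window_seconds,
            "source": source,
            "event_id": event_id,
            "computer": computer,
            "description": description,
            "repeat_count": count,
        }
        for (window, source, event_id, computer, description), count in counts.items()
    ]


# One identical event every 30 seconds for 24 hours on a single agent.
raw = [(t, "ExampleSource", 1000, "agent01", "Example text") for t in range(0, 86400, 30)]
sent = consolidate(raw)
print(len(raw), "events logged ->", len(sent), "events sent")
```

If the description contained a value that changes per event, each event would form its own key and nothing would be consolidated.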

Also, having a very large consolidation window has two effects:

  • Events appear later in the Event View or Reports (since the data is held until the end of the consolidation window)

  • Slightly higher resource utilization on the agent. With a low number of consolidation rules, this may be negligible. With a large number of these rule types, compounded by long consolidation windows, the resource utilization will increase correspondingly.

See the product help or navigate to the Authoring space in the console to create the type of rules mentioned above.

External References
This monitor does not contain any external references.
