|Knowledge Base Article #:||60001|
|Applicable Versions:||N-central 6.x, 7.x, 8.x, 9.x|
|Date Created/Updated:||April 27, 2011|
Sometimes a customer contacts you about an outage that was never detected in your SolarWinds N-central console. This is very rarely an issue with the product; most of the time it can be attributed to the way your Services and Notifications are configured. This article covers some basic strategies to consider when setting up Notifications and monitoring for your customers.
Since the setup phase of SolarWinds N-central is so critical, it is best to work with one of our Solution Architects to ensure that you are on a good path for setting up Notifications and monitoring for your customers. They have a daily Question and Answer period where you can find out more. Check out http://www.n-able.com/qa for more information.
We can break this process down into 3 common points of failure:
- The device is not configured to detect the type of failure that has occurred.
- The device is not properly associated with a Notification Trigger.
- The notification was sent but you did not receive it.
In order to diagnose where your issue may lie, follow these steps:
Step 1: What failed?
Partners frequently contact us to report that a customer's server 'went down' without a notification being received. The first thing to identify is what actually failed. Did DNS stop responding? Did Exchange stop relaying mail? Was there a power outage? Did a critical application stop responding?
Step 2: What are you monitoring on this device?
Once you have identified what failed in Step 1, look at the services you are monitoring on this Device and decide whether one of these Services *should* have gone into a Failed state.
For example, let us say that you are monitoring a Database Server that hosts a critical database system. If you are only monitoring this server with the Connectivity, Disk, CPU, and Memory Services, the database could conceivably go offline without you ever being notified. In this scenario, you would want to add a Windows Service monitor to ensure that the applicable database services are running.
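The coverage question in this step can be sketched as a simple lookup. The following is a minimal illustration only: the `DETECTED_BY` mapping and the failure names are hypothetical and are not an official N-central list; the service names are the ones used in the example above.

```python
# Hypothetical mapping from a failure type to the services that could detect it.
# Illustrative only; build your own list from the services you actually deploy.
DETECTED_BY = {
    "host unreachable": {"Connectivity"},
    "disk full": {"Disk"},
    "database stopped": {"Windows Service"},
}


def coverage_gap(failure, monitored):
    """Return the services that could have caught `failure` but are not monitored.

    An empty set means at least one monitored service covers the failure.
    """
    detectors = DETECTED_BY.get(failure, set())
    if detectors & set(monitored):
        return set()
    return detectors


monitored = ["Connectivity", "Disk", "CPU", "Memory"]
print(coverage_gap("database stopped", monitored))  # {'Windows Service'}
```

An empty result for the failure you identified in Step 1 means a monitored service should have caught it, and you can continue with the next step.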
Step 3: How are you monitoring the service that failed or should have failed?
It is also important to check which device is doing the monitoring for the service you have configured. If the outage compromised the monitoring Probe, you may not receive a failure notification. Instead, the services will simply go stale once no data has been submitted for longer than the Time to Stale threshold.
This type of issue frequently occurs with power outages. Suppose you are monitoring several servers with the Connectivity Service, and the Probe doing the monitoring is at the same location as the servers. When the power goes out, the Probe is affected too, so it cannot relay the information to the N-central server to trigger the failure.
To avoid this, it is always a good idea to monitor at least one device (such as a firewall or switch) from a monitoring device that is not in the same physical location. The ideal choice is the N-central Server, provided that the monitored device has a public IP.
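As a rough way to audit this, you could list each device alongside the location of whatever monitors it and check that every site has at least one off-site monitor. This is only a sketch: the `MonitoredDevice` record and the site names are hypothetical and do not come from an N-central export.

```python
from dataclasses import dataclass


@dataclass
class MonitoredDevice:
    name: str
    site: str          # physical location of the device
    monitor_site: str  # physical location of the probe or server monitoring it


def has_offsite_monitor(devices, site):
    """True if at least one device at `site` is monitored from a different location."""
    return any(d.site == site and d.monitor_site != site for d in devices)


devices = [
    MonitoredDevice("db01", site="branch", monitor_site="branch"),
    MonitoredDevice("fw01", site="branch", monitor_site="n-central"),
]
print(has_offsite_monitor(devices, "branch"))  # True
```

A site for which this returns False would go stale rather than failed during a site-wide power outage.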
Step 4: Did the Service go into a Failed state?
Now that you have confirmed that there was a service on this device that should have failed, let's check whether it actually did.
- Browse to the specified device in the N-central UI
- Click on the Status (Services) tab
- Click on the Service that you believe should have failed.
- Click on the Reports tab
- Select the Detailed Status report from the drop-down menu and specify a time period that covers when the failure should have occurred.
- Check the report to see whether the service went into a Failed state.
- If it did, take note of how long the Service remained in the Failed state (we will need this below) and move on to the next step.
If it did not, we advise contacting N-able Technical Support for further assistance.
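If you copy the report rows out by hand, the length of the failure can be worked out as follows. This is a sketch under assumptions: the `(timestamp, state)` sample format is hypothetical, not the actual layout of the Detailed Status report, and a failure is measured from the first Failed sample to the first sample that is no longer Failed.

```python
from datetime import datetime, timedelta


def longest_failed_period(samples):
    """Return the longest continuous Failed span in time-ordered (timestamp, state) samples."""
    longest = timedelta(0)
    start = None  # timestamp of the first sample in the current Failed run
    for ts, state in samples:
        if state == "Failed":
            if start is None:
                start = ts
            longest = max(longest, ts - start)
        else:
            if start is not None:
                # The run ends at the first sample that is no longer Failed.
                longest = max(longest, ts - start)
            start = None
    return longest


samples = [
    (datetime(2011, 4, 27, 10, 0), "Normal"),
    (datetime(2011, 4, 27, 10, 5), "Failed"),
    (datetime(2011, 4, 27, 10, 10), "Failed"),
    (datetime(2011, 4, 27, 10, 15), "Normal"),
    (datetime(2011, 4, 27, 10, 20), "Failed"),
    (datetime(2011, 4, 27, 10, 25), "Normal"),
]
print(longest_failed_period(samples))  # 0:10:00
```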
Step 5: Was a Notification Sent?
Just because you did not receive the email does not mean it was not sent. You can see what notifications have been sent by running the Notifications Sent report. It is available from the SO Level.
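If N-central reports that a notification was sent, check your SMTP settings in the Administration Console and ask whether you are receiving any notifications from N-central at all. If other notifications arrive normally, the next step is to check your Exchange relay server for filtering. If the report shows that no notification was sent, continue with the steps below.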
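One way to test the mail path independently of N-central is to send a message through the same relay yourself. The sketch below uses Python's standard `smtplib` and `email` modules; the function names, addresses and relay host are placeholders, and it assumes an unauthenticated relay on port 25, which may not match your SMTP settings.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Build a plain-text test message for the notification mail path."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "N-central notification path test"
    msg.set_content("If you can read this, the relay accepted and delivered the message.")
    return msg


def send_test_message(msg, host, port=25):
    """Send `msg` through the relay at `host`.

    Assumption: no authentication or TLS is required; adjust to your environment.
    """
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)


msg = build_test_message("ncentral@example.com", "you@example.com")
# send_test_message(msg, "relay.example.com")  # placeholder host; uncomment to send
```

If the test message is accepted by the relay but never reaches your inbox, filtering on the mail server is the more likely cause.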
Step 6: Is a Notification Profile Configured for this Device?
So far we have determined that a Service on the device went into a Failed state but you were not notified. Now we must determine whether this device is associated with any of the notification profiles that you have configured.
- Browse to the specified device in the N-central UI
- Click on the Associations (N-central 7.0+) tab
- In the Notifications section, check to see if any Notifications are associated with this device. If yes, take note of them.
If there are no notifications associated with this device, then this is why you did not receive a notification. For more information, see Notifications.
Step 7: Are you a recipient of the Notification Profile(s)?
Now that we have found a notification associated with the device, let's check to see whether or not you are specified as a recipient for this Notification Profile.
- From the Customer level where the affected device is located, click Configuration > Monitoring > Notifications
- Click on the Notification Profile that you found in the previous Step.
- Verify that you are one of the Default Recipients or one of the Selected Recipients
- While you are here, also check the Delay. In Step 4 we noted how long the service remained continuously in a Failed state; if the Delay is longer than that, no notification was due and the Notification Profile is working as designed.
If necessary, add yourself as a recipient and this should resolve your issue going forward. Otherwise, move on.
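The two checks in this step, recipient membership and Delay versus failure length, can be sketched as one small function. The function name and its arguments are hypothetical; the values would come from what you read off the Notification Profile and the Detailed Status report.

```python
from datetime import timedelta


def why_no_notification(failed_for, delay, recipients, me):
    """Explain why a profile did not notify `me`, or say that it should have."""
    if me not in recipients:
        return "not a recipient"
    if delay > failed_for:
        # The failure cleared before the Delay elapsed: working as designed.
        return "failure shorter than Delay"
    return "profile should have notified; check triggers"


print(why_no_notification(
    failed_for=timedelta(minutes=10),
    delay=timedelta(minutes=15),
    recipients={"tech@example.com"},
    me="tech@example.com",
))  # failure shorter than Delay
```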
Step 8: Is the affected Service associated with the Notification Profile that we have found?
Take the Notification Profile(s) that you found in the previous step and check whether you have configured a trigger for the Service whose failures you expected to be notified about.
- Assuming you are already in the Notification Profile from the Previous Step, click on the Triggers tab.
- Under the Active Triggers, verify that there is a Trigger for the specified State (e.g. Failed), Service, Folder and Device.
If the Service, State and Folder/Device are not listed, you'll want to add a trigger that will suit your needs for this application. For more information, see Notifications.
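The trigger check can be thought of as a match on three things: State, Service, and the Device or one of its Folders. The sketch below models that idea only; the `Trigger` record and its fields are hypothetical and are not N-central's internal representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    state: str
    service: str
    targets: frozenset  # names of the devices or folders the trigger applies to


def matching_triggers(triggers, state, service, device, folders=()):
    """Return the triggers that cover this state and service for the device or its folders."""
    scope = {device, *folders}
    return [
        t for t in triggers
        if t.state == state and t.service == service and t.targets & scope
    ]


triggers = [
    Trigger("Failed", "Connectivity", frozenset({"Servers"})),
    Trigger("Failed", "Windows Service", frozenset({"web01"})),
]
hits = matching_triggers(triggers, "Failed", "Windows Service", "db01", folders=["Servers"])
print(len(hits))  # 0
```

In this example `db01` sits in the `Servers` folder, which only has a Connectivity trigger, so a failed Windows Service on it would not produce a notification until a matching trigger is added.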