Let's Talk IT Monitoring
Data Center Monitoring Overview
Monitoring is a complicated topic, further muddled by a lack of cohesion in what we all mean when we say "monitoring," especially in the IT industry. To some, it is a needed (sometimes branded and expensive) service, and to others, it is unnecessary and risky.
Because every environment has its own unique needs and challenges, it's impossible to say whether it's always beneficial, or conversely, always problematic. As with most things in life, it's far more nuanced than that.
Our goal is to equip people with the knowledge they need to make informed decisions on what's best for their environment. So we're going to break down the elements of monitoring and explore the different options.
- Device Alerts, Notifications, & "Call Home"
- Third Party Monitoring Software & Automated Services
- Branded Monitoring Package VS Monitoring Services
Flow Chart Infographic
- The Network
- Remote Access
- Monitoring: A Two Way Street of Communication
Monitoring Options Compared
Monitoring: The Breakdown
If we zoom out for just a minute from the IT space and think about the definition of monitoring in its broadest sense, it involves observing and tracking something, often with some kind of device or system to collect and analyze data. Sounds simple enough, right? But when we start talking about how monitoring works in the IT space, it can get a little confusing, so let's break it down.
Device Alerts, Notifications, & "Call Home"
All monitoring starts with the devices themselves. Most equipment across manufacturers comes with the built-in capability to send notifications about various device information including diagnostics, issues, failures, and other system events. These notifications are often called "Call Home" alerts.
The same basic interface (IPMI, or Intelligent Platform Management Interface) is widely used across server manufacturers, but each OEM can customize how Call Home alerts look and sound and what they gather. For instance, a NetApp storage alert may include logs, while a Dell alert is more likely to have only basic, broad details.
How do Call Home alerts work?
Sensors Be Sensing
Regardless of manufacturer or type of equipment, devices are always equipped with sensors. They can gather all sorts of details about the functional operation of equipment. For instance, here are just a few examples of what information sensors could be gathering:
- Temperature
- Speed the hard drive is spinning
- How long the hard drive has been spinning
- How much traffic is on the port
- How consistently the device has been working
- PCI slot of a hard drive
- Amount of power available to the device
- And much more
Alerts turn these sensor readings into information someone can act on. For instance, you might receive an alert indicating that your power supply has experienced a failure due to loss of an external power source or due to internal component failure. This device information can be routed to whoever is in charge of doing something about it (internal staff or support provider) or collected by a third-party monitoring program or service. For many companies, utilizing Call Home alerts is exactly the right amount of monitoring.
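To make that a little more concrete, here's a rough sketch of how sensor readings could become alerts that get routed to a person. Everything in it is hypothetical: the sensor names, the limits in MAX_LIMITS, the build_alerts function, and the recipient address are made up for illustration and don't reflect any manufacturer's actual Call Home format.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str   # hypothetical sensor name, e.g. "temperature_c"
    value: float  # the value the sensor reported


@dataclass
class Alert:
    device: str
    message: str
    recipient: str  # who the alert is routed to


# Hypothetical upper limits; real devices define their own thresholds.
MAX_LIMITS = {"temperature_c": 75.0, "fan_rpm": 12000.0}


def build_alerts(device: str, readings: list[Reading], recipient: str) -> list[Alert]:
    """Create one alert for each reading that exceeds its limit."""
    alerts = []
    for reading in readings:
        limit = MAX_LIMITS.get(reading.sensor)
        if limit is not None and reading.value > limit:
            message = f"{reading.sensor} is {reading.value}, above limit {limit}"
            alerts.append(Alert(device, message, recipient))
    return alerts


readings = [Reading("temperature_c", 82.0), Reading("fan_rpm", 9000.0)]
alerts = build_alerts("storage-01", readings, "support@example.com")
for alert in alerts:
    print(f"To {alert.recipient}: [{alert.device}] {alert.message}")
```

Notice that the recipient is just a parameter. Whether the alert goes to internal staff or a support provider, the device does the same work, and nothing has to reach into the device to get it.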
Gathering logs is another example of using the information collected by sensors. Logs are snapshots of the information the sensors are collecting right now, which is why they are often used for system maintenance or troubleshooting. Monitoring, in its most basic form (whether an automated service or a human engineer), is sort of like consistently getting logs and analyzing patterns.
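If "consistently getting logs and analyzing patterns" sounds abstract, here's a minimal sketch of the idea. It assumes you already have a series of readings pulled from log snapshots taken over time; the rising_trend function and the sample temperatures are invented for illustration, and real monitoring tools use far more sophisticated checks.

```python
def rising_trend(samples: list[float], min_increase: float) -> bool:
    """Return True if every snapshot is higher than the one before it
    and the total rise is at least min_increase."""
    if len(samples) < 2:
        return False
    steadily_rising = all(later > earlier for earlier, later in zip(samples, samples[1:]))
    return steadily_rising and (samples[-1] - samples[0]) >= min_increase


# Hypothetical temperatures (in Celsius) taken from four log snapshots.
snapshots = [40.0, 44.0, 49.0, 55.0]
if rising_trend(snapshots, min_increase=10.0):
    print("Temperature is climbing steadily; worth a closer look.")
```

A single log tells you what's happening right now. It's the comparison across several snapshots that turns logs into monitoring.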
Access: A One-Way Street or a Two-Way Street
When Call Home alerts or notifications are sent from the device (whether the alert stays internal or is forwarded to your provider), the data only flows in one direction, and no external access to your system or network is required. Think of it like the difference between a one-way and a two-way street. On a one-way street, traffic only flows in one direction.
However, with continual monitoring (especially with automated services or applications), data is flowing in both directions. Access to the network is necessary for the application, software, or even a human (if that's the route you go).
Third Party Monitoring Software or Automated Service
Automated monitoring (such as ParkView and others) requires continual access to your network and systems because it proactively collects the notifications and alerts.
It's important to be aware that because of the continual network and systems access required to run these programs, it's also possible for other information to be gathered (besides what the sensors be sensing), such as performance, network traffic, device fingerprinting, and more.
Here's another way to think about automated monitoring: it's like giving a robot the keys to your mailbox and full access to everything inside. Regardless of the bells and whistles the automated monitoring service may tout, it's essentially like hiring a (pretty expensive) robot to open your mail, read it, and decide what to do with it.
While it might be helpful in some cases, it can also be expensive and possibly cause problems. The robot may throw away an important piece of mail (a critical alert) or keep a bunch of junk mail (prioritizing unnecessary notifications) that has to be waded through.
And once you've given the robot a set of keys, you have to trust that they will keep those keys safe and that they will not collect other information that they stumble on while going through your mail. If you get a ton of mail, having that help could be beneficial, but there could also be some significant downsides. Besides being expensive, errors can happen, and providing network and system access can create vulnerabilities, including unwanted information gathering and network risks.
In contrast, utilizing Call Home alerts is like forwarding only the relevant mail to the person who needs it (whether that's the support provider or internal staff). No access to the mailbox is needed.
Branded Package Vs À La Carte Services
Monitoring services come in various shapes and sizes. Much like Kleenex is a brand of tissue, there are branded monitoring services (think SolarWinds or ParkView). All Kleenexes are tissues, but not all tissues are Kleenex brand.
Some monitoring services are branded packages that use specific software with full access to the customer's network (think SolarWinds, ParkView, and others). Other services might not have a name or dedicated software, but still require the same access to be effective.
Some customers prefer an approach that gives them more control over the network access they grant to vendors and opt for a hybrid approach that provides access to a human engineer only when needed (rather than the constant access needed by automated services).
Key Takeaways
- All monitoring starts with device-level diagnostics and alerts (and for some companies, this is sufficient).
- This technology can be leveraged so that built-in notifications (Call Home) are sent or collected.
- Any monitoring service that goes beyond receiving Call Home alerts will require some form of access to your system and network.
- Automated services will require continual access to your system.
If you're ready to look at your options, skip ahead to our monitoring options compared table.
Monitoring Flow Chart
Let's follow the monitoring path in our flowchart. All monitoring starts with the information collected by device sensors and the corresponding notifications. What happens with those alerts is where things can differ a little.
Unless you are self-supporting and using internal resources, the end result will also be similar. Your support provider will take whatever action is needed. The timeline for resolution, if there is an issue, still depends on your SLA and parts stocking strategy.
Monitoring, Access, & The Network
An important conversation surrounding monitoring involves access and network security. Many companies have strict policies about granting access into their network. And it's important to understand that any monitoring service that goes beyond forwarded device alerts will require at least some network access.
Access & The Network
All monitoring must come through the network. Networks are like highways that allow traffic to flow through the system. Because cybersecurity is such a big deal right now, most company network departments are pretty diligent about keeping an eye on what goes in and out of the system on the network "roads."
Monitoring is like adding an additional on-ramp and off-ramp to the highway. Automated services that are continually monitoring need access to this "road" that is the network.
Remote Access & What That Can Look Like IRL
Another part of the network and monitoring conversation is remote access. Every company has a unique set of challenges, policies, and priorities to consider when determining the appropriate access for vendors and support providers. For some, the biggest priority is to be as hands-off as possible and avoid involvement in any part of device maintenance or diagnostics. And if that's the case, they may find that it's worth providing the continuous remote access required by automated monitoring.
For others, network security and data privacy may take precedence, and strict policies may limit (or even prohibit) the possibility of any remote access. Working within stricter network policies and security measures (e.g., firewalls) can sometimes be a challenge for automated monitoring services.
It's critical to understand the access requirements for the monitoring services you want to implement. To get an idea of what this could look like IRL, let's walk through a pretend scenario with differing levels of remote access and customer involvement.
Quick disclaimer: these scenarios are for demonstration purposes and aren't meant to reflect any specific provider's system or process but rather to help give an idea of a general workflow and the sort of access that would be needed. Most (but not all, you know who we mean) support providers can work with customers to create a solution that will work for them.
Let's say you have a 3PAR device with a failed power supply.
Automated Monitoring = Continual Remote Access + Connection to the Vendor's System + Less Customer Involvement
With an automated monitoring service, the 3PAR alert is collected by software and sent to the provider's system, where a service ticket is generated. Remember that two-way street we talked about earlier? Because of the continual remote access you were required to grant and the connection to the vendor's system, information flows in both directions. Once the service ticket is generated in their system, you'll be contacted, and service will be scheduled depending on your SLA.
Custom Monitoring Scenario 1 = Consistent Remote Access + Less Customer Involvement + Human Engineer
The same 3PAR device alert is automatically forwarded to your service provider (if you've set it up to do so). The human engineer will log in when they receive the alert and begin troubleshooting (remote access capabilities must be set up in advance to make this possible without customer involvement). If necessary, the human engineer will open a service ticket. Just like with the automated service, you'll be contacted, and service will be scheduled according to your SLA.
This option still requires some level of remote access for the human engineer to log in to the system. However, the access need not be continual. The same functionality can be attained with a human engineer as with an automated service, and because of the human engineer's critical thinking capabilities, unnecessary tickets can be prevented, small issues can be easily resolved, and troubleshooting can begin more quickly.
Custom Monitoring Scenario 2 = Limited (or no) Remote Access + More Customer Involvement + Human Engineer
If you want to keep things more in-house and limit outside access to your network as much as possible, you've still got options.
The same 3PAR alert is forwarded to your service provider. Your service provider will open a service ticket, and you will be contacted to retrieve logs. Depending on the situation, you could choose to give your provider (human engineer) temporary access (for instance, via a VPN). After reviewing the information, service will be scheduled according to your SLA.
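Here's one way to lay the three scenarios side by side. This is only a sketch for comparison: the Access levels, the WORKFLOWS table, and the customer_tasks helper are made-up names, and the steps are paraphrased from the pretend scenarios above rather than taken from any real provider's process.

```python
from enum import Enum


class Access(Enum):
    CONTINUAL = "continual remote access"
    ON_DEMAND = "remote access set up in advance, used when needed"
    LIMITED = "limited or no remote access"


# Each step is (who acts, what they do), paraphrased from the scenarios.
WORKFLOWS = {
    Access.CONTINUAL: [
        ("software", "collects the alert"),
        ("software", "generates a ticket in the provider's system"),
        ("provider", "contacts the customer"),
        ("provider", "schedules service per the SLA"),
    ],
    Access.ON_DEMAND: [
        ("device", "forwards the alert to the provider"),
        ("engineer", "logs in and troubleshoots"),
        ("engineer", "opens a ticket if necessary"),
        ("provider", "contacts the customer"),
        ("provider", "schedules service per the SLA"),
    ],
    Access.LIMITED: [
        ("device", "forwards the alert to the provider"),
        ("provider", "opens a ticket"),
        ("customer", "retrieves logs"),
        ("engineer", "reviews the information"),
        ("provider", "schedules service per the SLA"),
    ],
}


def customer_tasks(access: Access) -> list[str]:
    """List the steps the customer has to carry out themselves."""
    return [action for actor, action in WORKFLOWS[access] if actor == "customer"]


for access in Access:
    tasks = customer_tasks(access) or ["nothing hands-on"]
    print(f"{access.value}: customer does {', '.join(tasks)}")
```

Every path ends the same way, with service scheduled according to your SLA. What changes is how much access you hand over and how much of the legwork lands on your side.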
Key Takeaways: Monitoring, Network Access & Security
Whatever your company's specific regulations, priorities, staff resources, and security level needs, there is a way to implement a form of monitoring that fits your circumstances.
- Monitoring (beyond device-level alerts) requires access to your network.
- Sometimes, monitoring services' access requirements conflict with the company's network access and security policies.
- There are options that don't involve continual access to the system and network.
- There is a trade-off between access and involvement. Less access = more customer involvement.
What kind of monitoring is right for you?
We firmly believe that's for you to decide. None of the options below reflects any particular brand or service provider. Instead, they offer a zoomed-out view of the possibilities. Don't forget that "Call Home" refers to the device-level alerting system built into most equipment.
Monitoring Options Compared
Call Home + Internal Staff
- Device-level alerts (Call Home) sent to internal staff
- Internal staff review alerts & decide what to do
- Internal staff fixes issues
Pros:
Cost-effective - leveraging internal resources & budget already allocated
Secure - no outside access needed
No unnecessary tickets opened - human with expertise reviews alerts
Cons:
Resources may not be available
May take more time if internal staff lack the expertise to resolve issues
Call Home + Internal Staff + Support Provider
- Device-level alerts (Call Home) sent to internal staff
- Internal staff review alerts & notify support provider
- Service ticket opened if necessary
- Provider fixes issues
Pros:
Cost-effective - leveraging combo of internal resources and outside help when needed
Secure - no outside monitoring access needed
No unnecessary tickets opened - human with expertise reviews alerts
Cons:
May take more time since more eyes/hands are on it
Call Home + Support Provider
- Device-level alerts (Call Home) forwarded to provider
- Provider reviews alerts
- Service ticket opened if necessary
- Provider fixes issues
Pros:
Cost-effective - getting the most out of the support contract you already have
Secure - no outside monitoring access needed
No unnecessary tickets opened - human with expertise reviews alerts
No internal resources needed for operation
Cons:
May take more time since a human is reviewing alerts (vs automated opening of tickets)
Third Party Software or Device
- Device-level alerts (Call Home) collected and reviewed by automated system
- Service ticket automatically opened
- Provider fixes issues
Pros:
Faster opening of service tickets - automated collection and opening of service tickets
No internal resources needed for operation
Cons:
Most expensive option
Access required for monitoring to operate
Other data may be collected without knowledge or consent.
It's important to mention that there are many companies and situations where the answer to the question of what kind of monitoring is right is "none." That's a completely valid take. Some companies have internal staff resources whose job it is to monitor performance, alerts, and more, so an outside piece of software or vendor is not only unnecessary but a waste of money and a potential security risk.
It's important to note that every environment is unique, so adjustments to these categories are possible (unless your provider gives you no choice). We've created these buckets to clarify monitoring options, but, IRL, it could look different.
Like most things, the right monitoring option for you depends on your top priorities. If security and budget are your biggest concerns, you'll likely want to go with an option that restricts vendor access, as these typically also cost less. On the other hand, if you have a high volume of devices and don't mind a higher price tag or the security risks, you might want to opt for the all-in-one automated monitoring service.
Let M Global Help
Want help figuring out if you need monitoring and what sort of solution is right for you? We can help! Let's talk through your needs, and instead of telling you what you have to do, we'll work together to create a solution that matches your needs, resources, comfort levels, and policies.
We're all about turning challenges into successes and obstacles into an opportunity for growth. We love creating solutions for our clients, no matter how difficult the challenge.
We want you to consider us an extension of your team, a trusted resource and an advisor. Fill out the form or give us a call at 855-304-4600 to find out more.
Let's Chat.