Once you have established a list of SQL Server instances in your environment, you will want to ask some critical questions:
- How healthy is this instance?
- When was the last time it was backed up?
- Does it have sufficient CPU, memory, and storage to meet its SLA?
- What kind of workloads run on this instance?
- What applications and users use this instance?
- When is the workload at its busiest?
- Is there a failover strategy in place?
- Is this a mission-critical instance?
- Does it need to be available 24/7?
- What kind of performance challenges does this instance have?
These may seem like obvious questions, but if you are monitoring your SQL Server workloads for the first time, you will be surprised, and possibly a little horrified, to see how many of them have fundamental issues.
Set Performance Goals
Think about what you want to achieve with your SQL Server monitoring efforts and prioritize these goals. By framing your activities in terms of key performance indicators, you make it easier to articulate the value of the effort and of any investment required. The following list will get you started.
What are your availability stats? Remember that, from the user's perspective, the service is unavailable whenever they cannot access it. This may be caused by a complete outage or by a performance bottleneck that effectively makes the service unavailable. Do you have an Always On configuration in place? If so, do you know its status?
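To put availability stats into concrete terms, the short sketch below (plain Python, with hypothetical figures) converts minutes of downtime into an availability percentage. Remember to count periods of "effective unavailability" caused by severe slowness as downtime too.

```python
def availability_pct(downtime_minutes: float, period_minutes: float) -> float:
    """Availability as a percentage of a reporting period."""
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes

# A 30-day month has 43,200 minutes; about 43 minutes of downtime
# (outage or unusable slowness) equates to a "three nines" month.
month = 30 * 24 * 60
print(round(availability_pct(43, month), 3))  # → 99.9
```

Working the arithmetic the other way tells you how little slack each extra "nine" leaves you for diagnosis and recovery.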
From the point at which an issue is reported, how quickly can you isolate the source, diagnose the symptoms, and respond to those impacted?
How quickly can you resolve the symptom and restore normal operation? The “sticky plaster” solution is an important start, but it should not be the end of the matter. Have you explored the root cause of the issue? Can you be confident that you will not see a recurrence?
Cost of Ownership
Cost of ownership is a critical factor when deciding where your SQL Server instances should reside. It is important to gauge the up-front investment costs associated with infrastructure and licensing, the ongoing maintenance costs, and any consumption-based costs associated with a cloud-based workload.
If you are trying to decide how much your instance will cost in the cloud, the critical metrics include CPU usage, read and write activity, and storage. You will need to measure these over an extended period to establish your workload boundaries and ensure that you are resourced for the full spectrum of workload you expect for a particular instance.
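To make "workload boundaries" concrete, here is a minimal sketch (plain Python; the sample figures are hypothetical) that reduces a metric series gathered over an extended period to an average, a high percentile, and a peak, which are the numbers you would size against:

```python
from math import ceil

def workload_boundaries(samples, pct=95):
    """Summarize a metric series (e.g. hourly CPU%) collected over time.

    Returns (average, nearest-rank percentile, peak): size for the
    percentile to cover sustained load, and know the peak for bursts.
    """
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, ceil(pct / 100 * len(ordered)) - 1)
    return sum(ordered) / len(ordered), ordered[idx], ordered[-1]

# Hypothetical hourly CPU% samples from a monitoring window
cpu = [12, 13, 14, 14, 15, 15, 16, 16, 18, 18,
       20, 20, 22, 22, 25, 30, 35, 40, 80, 95]
avg, p95, peak = workload_boundaries(cpu)  # avg = 27.0, p95 = 80, peak = 95
```

The gap between the average and the peak is exactly why a long measurement window matters: sizing to the average starves the bursts, while sizing to the peak wastes money for most of the month.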
Getting familiar with the particular characteristics of the workloads running across your SQL Server instances will put you in a much better position to keep everything running as it should and to meet both the current and future needs of your business.
Study Performance Over Time with SQL Server Monitoring
Databases are fluid systems. Very few have steady, repetitive, and predictable workloads. It is far more common to see wide variations over time, fluctuating with the number of users, automated jobs, transaction volumes, the volume of data, and so on.
A sales database will get busy at the end of the month. It will also see spikes in activity around seasonal events or during marketing promotions.
Classifying your performance based on small snapshots is not a good policy. The more history you can gather and analyze, the more insight you gain into the variations and boundaries of each workload's characteristics.
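One simple way accumulated history pays off is baselining: a reading is only "abnormal" relative to the variation the workload has shown before. A minimal sketch, using hypothetical batch-duration figures:

```python
from statistics import mean, stdev

def flag_anomalies(history, latest, z_threshold=3.0):
    """Compare the latest reading against a baseline built from
    accumulated history. The longer the history, the better the
    baseline captures the workload's natural variation."""
    baseline, spread = mean(history), stdev(history)
    z = (latest - baseline) / spread if spread else 0.0
    return z, abs(z) > z_threshold

# Hypothetical nightly batch durations (minutes) from recent history
history = [41, 44, 39, 42, 45, 40, 43, 44, 41, 42]
z, anomalous = flag_anomalies(history, 61)  # anomalous is True
```

With only a day or two of snapshots, that same 61-minute run might look unremarkable; it is the accumulated history that makes it stand out.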
You will need to judge just how much history it is practical to keep and whether you have the resources to process it. There are cost and performance implications to consider.
Get the Complete Picture
Every database is a complex system with many moving parts. Many configuration settings can affect its performance.
The design and architecture of the database itself will influence outcomes. Code efficiency, too, can make or break performance. Configuration options determine how the SQL Server instance consumes the resources available to it.
Consider the following scenario: an instance is slowing to a halt, and the DBA spots a spike in CXPACKET or CXCONSUMER waits. The knee-jerk reaction is to disable parallelism. The waits disappear and the immediate bottleneck is relieved, but now the entire instance is running slower, and the DBA is reluctant to re-enable parallelism. Further investigation would have revealed that a particular query was running slowly, and that the cause was a missing index.
There are many influencing factors when it comes to database performance. Knowing the normal fluctuations of your particular instance through SQL Server monitoring helps you identify when behavior is trending out of control and anticipate problems before they occur. It also helps you distinguish between natural growing pains, caused by increased workload or seasonal spikes, that may simply require more resources, and deeper performance issues that require code tuning, index optimization, or configuration changes. Monitoring many different metrics in parallel helps you accurately identify the root cause and avoid an expensive misdiagnosis that can cause a repeat, or even an escalation, of the same issue.
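One practical way to use parallel metrics in diagnosis is to score each metric's latest reading against its own history, so the metrics that moved together stand out. A minimal sketch, assuming you already collect per-metric series (all names and figures below are hypothetical):

```python
from statistics import mean, stdev

def rank_metric_deviations(metrics):
    """Given {name: series} of parallel metric histories, score how far
    each metric's latest reading sits from its own baseline, so the
    metrics that moved together stand out during diagnosis."""
    scores = {}
    for name, series in metrics.items():
        *history, latest = series
        spread = stdev(history)
        scores[name] = abs(latest - mean(history)) / spread if spread else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical series: each list is a history with the latest reading last
metrics = {
    "cxpacket_wait_ms": [120, 130, 125, 118, 900],
    "query_logical_reads": [5e4, 5.2e4, 4.9e4, 5.1e4, 9.8e5],
    "cpu_percent": [55, 60, 57, 58, 59],
}
ranked = rank_metric_deviations(metrics)
```

In these hypothetical figures, the wait spike and the logical-read spike move together while CPU stays flat, which points toward a plan or index problem rather than parallelism itself, echoing the scenario above.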