Exchange 2013: Basic Monitoring

If you’ve been reading my other posts then you can probably tell that I’ve had Exchange on my mind lately.  I’ve just finished my second migration from a legacy Exchange environment to Exchange 2013/O365 Hybrid.  Anyone who’s been through one of these migrations knows how intense they can be, but they are also really fun and exciting!  We also get to use a lot of PowerShell to help with the migration process, which makes it a good opportunity to sharpen our scripting skills.

One of the first things I do after coming into a new Exchange environment (especially after they are on 2013, but this would apply to other versions as well) is to get some basic systems monitoring in place.  Now I realize that there are dozens of ways to monitor systems these days, but they typically give you WAY too much information to start with.  All I’m looking for is a quick one-page report that gives me the basic info on the status of the Exchange environment.  The more robust monitoring systems like SCOM, SolarWinds, etc. still have their place for a much more in-depth look at server health, but this script is good enough for most day-to-day things.

The Script

Each morning when I come into the office I need to be able to answer four basic questions about the Exchange environment.  If I can answer these basic questions then I know that the Exchange environment is in pretty good shape.

  1. What are the states of the databases?
  2. Is there any mail stuck in a queue?
  3. Do I have any disks that are filling up?
  4. Are the IIS logs clearing out?

What are the states of the databases?

This one is probably the most important stat to keep an eye on.  The code is really straightforward (Get-MailboxDatabaseCopyStatus), and while it doesn’t give you all the information about a database, it does give you exactly what you need for a quick view.

The table below shows typical output from this cmdlet.  When I receive my status email at 6:30 every morning this is the first thing I look at.  I chose each of these stats for a specific reason so let’s go through each column and I’ll let you know why I think they are important.

Name        Status   CopyQueueLength  ReplayQueueLength  LastInspectedLogTime  ContentIndexState
DB01\EXC01  Mounted  0                0                                        Healthy
DB03\EXC01  Mounted  0                0                                        Healthy
DB02\EXC01  Healthy  0                1                  8/27/2015 6:30        Healthy
DB04\EXC01  Healthy  0                0                  8/27/2015 6:29        Healthy
DB02\EXC02  Mounted  0                0                                        Healthy
DB04\EXC02  Mounted  0                0                                        Healthy
DB01\EXC02  Healthy  0                0                  8/27/2015 6:29        Healthy
DB03\EXC02  Healthy  0                0                  8/27/2015 6:29        Healthy
DB01\EXC03  Healthy  0                23110              8/27/2015 6:29        Healthy
DB02\EXC03  Healthy  0                31923              8/27/2015 6:30        Healthy
DB03\EXC03  Healthy  0                15901              8/27/2015 6:29        Healthy
DB04\EXC03  Healthy  0                56188              8/27/2015 6:29        Healthy
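
Here is a minimal sketch of how a table like this could be pulled together.  It assumes you are running in the Exchange Management Shell (or a session where the Exchange cmdlets have been imported).  Get-DatabaseCopyReport is a hypothetical helper name I made up for this example, and the server names are placeholders.

```powershell
# Hypothetical helper: collects copy status from each mailbox server and
# keeps only the six columns discussed in this post.
function Get-DatabaseCopyReport {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$Server
    )

    foreach ($name in $Server) {
        # -Server limits the results to the copies hosted on that server.
        Get-MailboxDatabaseCopyStatus -Server $name |
            Select-Object Name, Status, CopyQueueLength, ReplayQueueLength,
                          LastInspectedLogTime, ContentIndexState
    }
}

# Example usage (placeholder server names):
# Get-DatabaseCopyReport -Server 'EXC01', 'EXC02', 'EXC03' | Format-Table -AutoSize
```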

Name

I really hope this is obvious 🙂

Status

Most of us know where our databases are mounted, and when they aren’t where we left them we should start asking questions.  From the example above you can see that DB01 and DB03 are mounted on EXC01 and DB02 and DB04 are mounted on EXC02.  It also tells us that all of the other copies of the databases are in a healthy state.  So all is right in the world.  Replication in Exchange 2013 is really good.  Databases can move from one server to another so smoothly that usually no one notices it.  And unless you are monitoring your Exchange environment you won’t either!  So if you notice that a database has been mounted on a different Exchange server then something’s up and you need to investigate.  Here are some other status codes that you could see.  If you want a full list and their descriptions then check out this TechNet article.

Failed                  Seeding
SeedingSource           Suspended
Healthy                 ServiceDown
Initializing            Resynchronizing
Mounted                 Dismounted
Mounting                Dismounting
DisconnectedAndHealthy  DisconnectedAndResynchronizing
FailedAndSuspended      SinglePageRestore
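
To catch a database that has quietly moved, one option is to compare where each copy is mounted against where you expect it to be.  This is only a sketch: Find-MovedDatabase and the expected-server table are hypothetical names for illustration, and it assumes copy names follow the Database\Server pattern shown in the table.

```powershell
# Hypothetical helper: reports databases that are mounted somewhere other
# than the server you expect. Copy names look like "DB01\EXC01".
function Find-MovedDatabase {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$CopyStatus,       # output of Get-MailboxDatabaseCopyStatus
        [Parameter(Mandatory = $true)]
        [hashtable]$ExpectedServer   # database name -> expected active server
    )

    foreach ($copy in $CopyStatus) {
        # Only the active (mounted) copy tells us where the database lives.
        if ("$($copy.Status)" -ne 'Mounted') { continue }

        # Split "DB01\EXC01" into the database and server parts.
        $database, $server = "$($copy.Name)" -split '\\', 2

        if ($ExpectedServer.ContainsKey($database) -and
            $ExpectedServer[$database] -ne $server) {
            [pscustomobject]@{
                Database = $database
                Expected = $ExpectedServer[$database]
                Actual   = $server
            }
        }
    }
}

# Example usage (placeholder names):
# $expected = @{ DB01 = 'EXC01'; DB03 = 'EXC01'; DB02 = 'EXC02'; DB04 = 'EXC02' }
# Find-MovedDatabase -CopyStatus (Get-MailboxDatabaseCopyStatus -Server 'EXC01') -ExpectedServer $expected
```

An empty result means every mounted database is where you left it.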

Copy Queue Length

The copy queue length represents the number of logs that are waiting to be copied to a mailbox server that holds a passive copy of the database.  Typically we want to see this number at or very near 0.  This means that Exchange replication is up-to-date.  Anything higher means there are most likely some issues that need to be addressed.  If a copy queue length is WAY out then there are ways to fix the copies, like reseeding the database, but I’ll have to cover that in another post.

Replay Queue Length

The replay queue length goes hand in hand with copy queue length.  It is the number of logs that have been copied to the mailbox server, but have not been “replayed” into the passive copy of the database.  Just like the copy queue length, we typically want to see this number at or near 0, except for one special case.  If you are using lag copies of databases then we want to see a much higher number than zero.  In the example above the database copies on EXC03 are all lag copies that run 7 days behind the active copy.  You can read some more information on managing mailbox database copies in this TechNet article.
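
The two queue lengths can be checked together.  The sketch below flags any copy whose copy queue or replay queue is above a threshold, but skips the replay queue check on servers that hold lag copies.  Find-ReplicationBacklog is a hypothetical helper, and the threshold of 10 is an assumed starting point, not an Exchange recommendation.

```powershell
# Hypothetical helper: returns the copies that look like they are falling behind.
function Find-ReplicationBacklog {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$CopyStatus,        # output of Get-MailboxDatabaseCopyStatus
        [int]$Threshold = 10,         # assumed alert level, tune for your environment
        [string[]]$LagServer = @()    # servers that host lagged copies
    )

    foreach ($copy in $CopyStatus) {
        # Copy names look like "DB01\EXC03"; the server is the second part.
        $server = ("$($copy.Name)" -split '\\', 2)[1]
        $isLagCopy = $LagServer -contains $server

        $copyBehind = $copy.CopyQueueLength -gt $Threshold

        # A large replay queue is expected on a lagged copy, so only
        # check it on the other copies.
        $replayBehind = (-not $isLagCopy) -and ($copy.ReplayQueueLength -gt $Threshold)

        if ($copyBehind -or $replayBehind) { $copy }
    }
}

# Example usage (placeholder names):
# Find-ReplicationBacklog -CopyStatus $copies -LagServer 'EXC03'
```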

Last Inspected Log Time

This displays the date and time of the last log file that was inspected by the LogInspector.  Notice that some fields are blank?  That’s normal since this only applies to passive copies of the databases.

Content Index State

This shows you the current state of the index for all of the mailboxes in a given mailbox database.  These indexes become really important when users try to search for emails (in online mode).  So while this stat isn’t crucial to mail delivery, it is important for user experience.

Is there any mail stuck in a queue?

A quick glance at the mail queues is usually all that is necessary to make sure that your Exchange environment is in good shape.  As long as the queue lengths are at or near “0” then move on with the rest of your day.  However, if your queues are stuck…that’s another story altogether.  Again, this post is really meant to let you know what is going on and not how to fix a given issue.  If you want to read more on Exchange mail queues check out this article.

This is what typical output looks like for this cmdlet (Get-Queue).  Let’s go through each one of these columns.

Identity          DeliveryType           Status  MessageCount  Velocity  RiskLevel  OutboundIPPool  NextHopDomain
EXC01\1           SmtpDeliveryToMailbox  Active  0             0         Normal     0               db01
EXC01\2           SmtpDeliveryToMailbox  Active  2             0         Normal     0               db02
EXC01\Submission  Undefined              Ready   1             0         Normal     0               Submission
EXC01\Shadow\3    ShadowRedundancy       Ready   0             0         Normal     0               Exc02
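
A minimal sketch of gathering these columns from several servers follows.  It assumes the Exchange cmdlets are loaded; Get-QueueReport is a hypothetical helper name and the server names are placeholders.

```powershell
# Hypothetical helper: collects the transport queues from each server and
# keeps only the columns discussed in this post.
function Get-QueueReport {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$Server
    )

    foreach ($name in $Server) {
        # -Server asks the transport service on that server for its queues.
        Get-Queue -Server $name |
            Select-Object Identity, DeliveryType, Status, MessageCount,
                          Velocity, RiskLevel, OutboundIPPool, NextHopDomain
    }
}

# Example usage (placeholder server names):
# Get-QueueReport -Server 'EXC01', 'EXC02' | Format-Table -AutoSize
```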

Identity

OK so I won’t give you a hard time on this one.  It’s not quite as cut and dried as “Name”, but close.  This tells us which queue we are looking at and it is displayed as ServerName\QueueName.

DeliveryType

This tells us how the transport service will transmit the message to the next hop.  It could be headed directly to a mailbox or to another Exchange server.

Status

There are only 5 values that can be here.

  • Active – This means the queue is actively transmitting messages
  • Connecting – The queue is in the process of connecting to the next hop
  • Ready – The queue recently transmitted messages, but it is now empty (you can see this one in the example above)
  • Retry – The last connection failed and it is retrying the connection (NOT GOOD!)
  • Suspended – This won’t happen unless you tell it to.  Only an administrator can suspend a queue.
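
Since Retry and Suspended are the statuses that deserve a second look, a small filter can pull out just those queues, along with any queue holding more messages than you are comfortable with.  This is a sketch under assumptions: Find-ProblemQueue is a hypothetical helper and the default of 50 messages is an arbitrary number I picked for illustration.

```powershell
# Hypothetical helper: returns queues that are retrying, suspended,
# or holding more messages than the allowed maximum.
function Find-ProblemQueue {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$Queue,              # output of Get-Queue
        [int]$MaxMessageCount = 50     # assumed limit, tune for your environment
    )

    $Queue | Where-Object {
        ("$($_.Status)" -in 'Retry', 'Suspended') -or
        ($_.MessageCount -gt $MaxMessageCount)
    }
}

# Example usage:
# Find-ProblemQueue -Queue (Get-Queue -Server 'EXC01')
```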

Message Count

This tells you how many messages are currently in the queue.

Velocity

This property represents the rate at which the queue is draining.  To get the velocity, Exchange subtracts the IncomingRate from the OutgoingRate.  Zero is a really good number here.  It means that messages are leaving the queue as fast as they are coming into the queue.  If the velocity is greater than 0 then messages are leaving faster than they are coming into the queue.  Here is a potentially bad one…if the velocity is less than 0 then messages are entering the queue faster than they are leaving.  This means you potentially have work to do!
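
The arithmetic is simple enough to show directly.  The sketch below repeats the subtraction described above and turns the sign into a plain-English trend.  Get-QueueTrend and the trend labels are hypothetical names for illustration; Exchange itself only reports the number.

```powershell
# Hypothetical helper: velocity = OutgoingRate - IncomingRate,
# plus a label describing which way the queue is heading.
function Get-QueueTrend {
    param(
        [Parameter(Mandatory = $true)]
        [double]$IncomingRate,
        [Parameter(Mandatory = $true)]
        [double]$OutgoingRate
    )

    $velocity = $OutgoingRate - $IncomingRate

    $trend = if ($velocity -gt 0) {
        'Draining'   # messages leave faster than they arrive
    } elseif ($velocity -lt 0) {
        'Growing'    # messages arrive faster than they leave
    } else {
        'Steady'     # messages leave as fast as they arrive
    }

    [pscustomobject]@{
        Velocity = $velocity
        Trend    = $trend
    }
}

# Example usage:
# Get-QueueTrend -IncomingRate 5 -OutgoingRate 3
```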

Next Hop Domain

Don’t overthink this one.  Just know that all this tells us is where the message is going after it leaves this queue.  Sometimes this is another Exchange server, another queue, or even another domain.

Do I have any disks that are filling up?

In a perfect world we have unlimited storage and people can keep all the email their hearts desire, but that’s a perfect world.  In the real world storage is always at a premium, so we need to keep an eye on the drives on the Exchange servers to make sure we aren’t filling them up.  We especially need to keep an eye on the system drives.  They have a tendency to be the smallest drives, and if you didn’t move the location of your IIS logs then you WILL fill up your system drive.

This is a pretty straightforward cmdlet.  We are getting the volumes for each of the Exchange servers and how much capacity and free space they have.  The only little bit of trickery is that we substitute the byte count with a GB count and rename the column.  Pretty slick.
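
Here is one way this could look.  It is a sketch, not necessarily what the full script does: I am assuming the Win32_Volume WMI class queried through Get-CimInstance (which needs WinRM reachable on the target servers), and Get-VolumeReport plus the CapacityGB and FreeSpaceGB column names are hypothetical.

```powershell
# Hypothetical helper: lists the fixed volumes on each server with
# capacity and free space converted from bytes to GB.
function Get-VolumeReport {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$Server
    )

    foreach ($name in $Server) {
        # DriveType 3 is a local fixed disk in the Win32_Volume class.
        Get-CimInstance -ClassName Win32_Volume -ComputerName $name -Filter 'DriveType = 3' |
            Select-Object @{ Name = 'Server';      Expression = { $name } },
                          Name,
                          @{ Name = 'CapacityGB';  Expression = { [math]::Round($_.Capacity / 1GB, 2) } },
                          @{ Name = 'FreeSpaceGB'; Expression = { [math]::Round($_.FreeSpace / 1GB, 2) } }
    }
}

# Example usage (placeholder server names):
# Get-VolumeReport -Server 'EXC01', 'EXC02' | Format-Table -AutoSize
```

The hashtables passed to Select-Object are calculated properties: Name sets the new column heading and Expression computes its value, which is how the byte counts become GB.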

Are the IIS logs clearing out?

This is the very last thing that I check and it is a really brief check.  I have a process that will delete all but the last 7 days of IIS log files and I just want to make sure that those files are getting removed like they should.  Typically this will be a very long list of file names, so as long as there is some data here that’s usually good enough.
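
As a sketch of a slightly stricter check, the function below lists only the log files that are older than the retention window, so an empty result means the cleanup is doing its job.  This is my assumption of a useful check rather than what the full script does.  Get-StaleIisLog is a hypothetical helper, and the default path is the standard IIS log location, which you will need to change if you moved your logs.

```powershell
# Hypothetical helper: finds IIS log files older than the retention window.
function Get-StaleIisLog {
    param(
        [string]$Path = 'C:\inetpub\logs\LogFiles',   # IIS default; adjust if moved
        [int]$Days = 7                                # retention window in days
    )

    $cutoff = (Get-Date).AddDays(-$Days)

    # IIS writes one sub-folder per site, so search recursively.
    Get-ChildItem -Path $Path -Filter '*.log' -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Select-Object FullName, LastWriteTime
}

# Example usage against a remote server's admin share (placeholder name):
# Get-StaleIisLog -Path '\\EXC01\c$\inetpub\logs\LogFiles' -Days 7
```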

Wrap Up

That’s really about all there is to this 2013 Exchange Monitoring script.  We’ve answered those four basic questions so we can put Exchange on the back burner for the rest of the day.  There are a few more enhancements like formatting all this information with HTML and sending it out through email, but you can check that out in the full script download.  This is by no means a full monitoring solution and it is only meant to give a quick glimpse into what is going on in your Exchange environment.
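
To give a rough idea of the HTML and email step, here is a minimal sketch using the built-in ConvertTo-Html and Send-MailMessage cmdlets.  It is not the code from the full script: ConvertTo-ReportSection is a hypothetical helper, and every address and server name in the usage example is a placeholder.

```powershell
# Hypothetical helper: turns a set of objects into an HTML table with a heading.
function ConvertTo-ReportSection {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Title,
        [Parameter(Mandatory = $true)]
        [object[]]$Data
    )

    # -Fragment emits only the table markup (no <html>/<body> wrapper),
    # so several sections can be joined into one message body.
    ($Data | ConvertTo-Html -Fragment -PreContent "<h2>$Title</h2>") -join "`n"
}

# Example usage (placeholder addresses and server names):
# $body = (ConvertTo-ReportSection -Title 'Database Copies' -Data $copies) +
#         (ConvertTo-ReportSection -Title 'Mail Queues' -Data $queues)
# Send-MailMessage -SmtpServer 'smtp.example.com' -From 'exchange-report@example.com' `
#     -To 'admin@example.com' -Subject 'Exchange daily status' -Body $body -BodyAsHtml
```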

There are a few things you will need to edit in the full script to make it work for you.  I’ll be updating this script later so that you will have less to change.

  1. Edit the connection string for YOUR Exchange servers
  2. Change the names to include YOUR Exchange servers
  3. Edit the file locations to something that works for where you plan on running the script
  4. Edit the SMTP information including the SMTPServer, the FromAddress, and the ToAddress.
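
One way to keep those four edits in a single place is a settings block at the top of the script.  The sketch below is an assumption about layout, not how the full script is organized: the $Settings table, its key names, and all of the values are hypothetical placeholders.  The commented lines show the usual remote PowerShell pattern for connecting to an on-premises Exchange server.

```powershell
# Hypothetical settings block: every value here is a placeholder to replace.
$Settings = @{
    ConnectionUri = 'http://exc01.example.com/PowerShell/'   # 1. connection string
    Servers       = @('EXC01', 'EXC02', 'EXC03')             # 2. your Exchange servers
    ReportPath    = 'C:\Scripts\ExchangeReport'              # 3. file location
    SmtpServer    = 'smtp.example.com'                       # 4. SMTP information
    FromAddress   = 'exchange-report@example.com'
    ToAddress     = 'admin@example.com'
}

# Example usage: open a remote Exchange session with the settings above.
# $session = New-PSSession -ConfigurationName Microsoft.Exchange `
#     -ConnectionUri $Settings.ConnectionUri -Authentication Kerberos
# Import-PSSession $session -DisableNameChecking | Out-Null
```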

Please leave me a comment if you have any trouble running the script in your environment.

As always, your comments and questions are welcome.  If you like what you’ve read please leave a comment or review and make sure you sign up for our newsletter so you know when the latest blog posts arrive.

Thanks,

Matt