Don’t we all like reports? Management certainly does. Can vROps produce some reports for you? Sure, and pretty neatly too! Read on.
The requirements given to me were to report on:
- Undersized virtual machines and oversized virtual machines combined into one to produce a to-be-rightsized virtual machines list
- VMware Tools out of date and VMware Tools not running combined into one list
- VMs by low disk space
- Hot hosts (meaning those running greater than 50% CPU and memory, consistently)
- Datastores running out of space
- Scheduled daily emailed report
Since the report needs to fulfill multiple requirements, I needed multiple Views (‘V’ capitalized to indicate a name). I used a mix of some pre-canned Views that I tweaked and some that I came up with. So Views are the first thing you need to prepare.
To-Be-Rightsized Virtual Machines:
There’s a pre-canned View for something similar that I tweaked to suit my needs.
- I gave it an intuitive name (nothing spectacular, but you could end up with lots of Views, so it helps)
- For Presentation, I chose ‘list’ and opted for 100 items per page given the size of my environment
- In Subjects, I chose Virtual Machine since this is a View for VMs. Your choice here determines which attributes vROps makes available for you to choose from. Note that ‘attributes’ here simply means metrics, so don’t get confused.
- In Data, I chose a number of metrics and changed their names to something that makes sense to a not-so-technical person
- vCPUs Provisioned came from CPU | Provisioned (vCPU(s))
- vCPUs Recommended came from CPU | Recommended Size (vCPU(s))
- vCPUs Reclaimable came from CPU | Reclaimable Capacity (vCPU(s)). I sorted this in descending order so that the most oversized machines show up at the top.
- Memory Provisioned came from Memory | Guest Configured Memory. I chose GB as the unit for readability.
- Memory Recommendation came from Memory | Recommended Size. I chose GB as the unit for readability.
- Memory Reclaimable came from Memory | Reclaimable Capacity. I chose GB as the unit for readability and, again, descending order so that the most oversized machines show up at the top.
- I chose to show data for the last 24 hours.
- Leave Transformation at the value of ‘last’. Doing so displays the values in the report as you’d expect. Other values such as ‘sum’ display a total of vCPUs against all machines (not what you need). I honestly don’t fully understand the purpose of this Transformation ‘thing’.
- Select preview source is a great thing to have in here. It lets you quickly see a preview of what your report will look like; just make sure you run it against a small set of objects so the preview comes back quickly.
- Note that columns you see – the vCPUs Provisioned, Recommended etc can be moved around to further tailor the View.
- In the Filter tab, I added a number of filters to make the report clearer. Left unfiltered, the report displays minor resource gains too: RAM gains of as little as 10 MB show up, which is great but doesn’t do much for the readability of the report. So:
- I set the report to display machines from which at least 1 GB of RAM or at least 1 vCPU could be reclaimed, using a separate criteria set for each (see the note below)
- I set the report to not display machines whose names contain RDS or PVS
- Note: This is the only way I could get the criteria sets to work together. When I combined the CPU and RAM attributes in one set, it wouldn’t display any machine, which is why I separated them out. This resulted in a slightly higher number of entries but it made for a more granular report. If someone can come up with a better filter set, let me know, I’m all ears!
- The Summary tab is pretty handy – lets you sum up how much you can reclaim. I chose to display the sum of vCPUs and RAM reclaimable.
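To make the filter and summary logic above concrete, here is a minimal Python sketch of how I read it. It assumes that two separate criteria sets behave like an OR while two conditions in one set behave like an AND, which is my interpretation of the behaviour described in the note, not something taken from the vROps documentation. The VM names, the numbers, and the field names (`vcpu_reclaimable`, `mem_reclaimable_gb`) are all made up for illustration; none of this calls vROps.

```python
# Hypothetical sample data: names, numbers and field names are invented.
vms = [
    {"name": "app01", "vcpu_reclaimable": 2, "mem_reclaimable_gb": 0.25},
    {"name": "db01", "vcpu_reclaimable": 0, "mem_reclaimable_gb": 4.0},
    {"name": "web01", "vcpu_reclaimable": 0, "mem_reclaimable_gb": 0.5},
    {"name": "RDS-host01", "vcpu_reclaimable": 4, "mem_reclaimable_gb": 8.0},
    {"name": "PVS-target01", "vcpu_reclaimable": 1, "mem_reclaimable_gb": 2.0},
]

EXCLUDED_NAME_PARTS = ("RDS", "PVS")


def is_excluded(vm):
    """True if the VM name contains one of the excluded substrings."""
    return any(part in vm["name"] for part in EXCLUDED_NAME_PARTS)


def matches_separate_sets(vm):
    """Assumption: separate criteria sets are OR'ed together."""
    if is_excluded(vm):
        return False
    return vm["vcpu_reclaimable"] >= 1 or vm["mem_reclaimable_gb"] >= 1


def matches_combined_set(vm):
    """Assumption: conditions inside a single criteria set are AND'ed."""
    if is_excluded(vm):
        return False
    return vm["vcpu_reclaimable"] >= 1 and vm["mem_reclaimable_gb"] >= 1


# Sort by reclaimable vCPUs, descending, as in the View.
report = sorted(
    (vm for vm in vms if matches_separate_sets(vm)),
    key=lambda vm: vm["vcpu_reclaimable"],
    reverse=True,
)
combined = [vm for vm in vms if matches_combined_set(vm)]

# Summary row: totals of what could be reclaimed across the listed VMs.
summary = {
    "vcpus": sum(vm["vcpu_reclaimable"] for vm in report),
    "mem_gb": sum(vm["mem_reclaimable_gb"] for vm in report),
}

print([vm["name"] for vm in report])
print(summary)
```

With this sample data the combined (AND) version matches nothing, because no non-excluded machine has both a whole vCPU and a whole GB to give back, while the separated (OR) version lists the machines that qualify on either count.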
We know you can’t do without VMware Tools, hence the need to keep an eye on which machines don’t have them installed or have them out of date (wide latitude is permitted, I know).
There’s a pre-baked View for each of these two separately; I came up with one that combines them. No rocket science here.
- Tools Status came from Summary | Guest Operating System | Tools Version Status
- Tools Running? came from Summary | Guest Operating System | Tools Running Status
VMs with low disk space
VMs with low disk space were those with less than 11% of their disk capacity remaining:
- Disk remaining came from Disk Space | Capacity Remaining (%)
I used a single filter here:
- Disk Space | Capacity Remaining (%) is less than 11%
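As a quick illustration of that single filter, here is a minimal Python sketch. The VM names and percentages are invented, `disk_remaining_pct` is a hypothetical stand-in for Disk Space | Capacity Remaining (%), and the ascending sort is my own addition so the worst offenders come first.

```python
LOW_DISK_THRESHOLD_PCT = 11  # "less than 11%" from the filter above

# Hypothetical sample data for illustration only.
vms = [
    {"name": "file01", "disk_remaining_pct": 4.5},
    {"name": "app02", "disk_remaining_pct": 10.9},
    {"name": "db02", "disk_remaining_pct": 11.0},
    {"name": "web02", "disk_remaining_pct": 62.0},
]

# Keep VMs strictly below the threshold, least free space first.
low_disk = sorted(
    (vm for vm in vms if vm["disk_remaining_pct"] < LOW_DISK_THRESHOLD_PCT),
    key=lambda vm: vm["disk_remaining_pct"],
)
low_disk_names = [vm["name"] for vm in low_disk]

print(low_disk_names)
```

Note that a VM sitting at exactly 11% is not listed, since the filter is "less than" rather than "at most".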
Hot hosts were deemed to be ones that had been running at over 50% CPU and RAM utilization over the past 24 hours. I came up with the following View:
- CPU Usage – hot hosts came from CPU | Capacity Usage (%). Sorted in descending order.
- RAM Usage – hot hosts came from RAM | Capacity Usage (%). Also set to descending order but, once again, the View sorts by one metric (attribute) only.
- Read Latency came from Storage | Read Latency
- Write Latency came from Storage | Write Latency
- Read IOPS came from Storage | Reads per second
- Write IOPS came from Storage | Writes per second
I used the following filters:
- Display hosts with CPU utilization at least 50%
- Display hosts with RAM utilization at least 50%
I chose not to summarize anything here.
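Here is a minimal Python sketch of the hot-host selection, assuming both filters must hold at the same time (my reading of the "CPU and memory" requirement). The host names and figures are invented, and `cpu_pct` and `ram_pct` are hypothetical stand-ins for the two Capacity Usage (%) metrics. It looks at a single value per host, so it does not capture the "consistently" part of the requirement.

```python
HOT_THRESHOLD_PCT = 50  # "at least 50%" from the filters above

# Hypothetical sample data for illustration only.
hosts = [
    {"name": "esx02", "cpu_pct": 55.0, "ram_pct": 40.0},
    {"name": "esx03", "cpu_pct": 50.0, "ram_pct": 50.0},
    {"name": "esx01", "cpu_pct": 72.0, "ram_pct": 81.0},
    {"name": "esx04", "cpu_pct": 30.0, "ram_pct": 90.0},
]


def is_hot(host):
    """Assumption: a host is hot only if CPU AND RAM are both at or above 50%."""
    return (
        host["cpu_pct"] >= HOT_THRESHOLD_PCT
        and host["ram_pct"] >= HOT_THRESHOLD_PCT
    )


# The View sorts by one metric only, so sort by CPU usage, descending.
hot_hosts = sorted(
    (h for h in hosts if is_hot(h)),
    key=lambda h: h["cpu_pct"],
    reverse=True,
)
hot_host_names = [h["name"] for h in hot_hosts]

print(hot_host_names)
```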
Datastores running out of space
I customized the pre-baked View (actually cloned the original one and customized the clone):
- Total came from Capacity | Total Space (TB)
- Usage came from Capacity | Used Space (TB). I chose to sort in descending order.
- Overcommit was handy too: I could see which datastores were being hit the hardest given the thin provisioning of VMDKs.
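For anyone wondering what the overcommit figure tells you, here is a small Python sketch. It assumes overcommit means provisioned space divided by total capacity, which is the usual thin-provisioning definition but is my assumption rather than something the View spells out. The datastore names, sizes, and field names are all made up.

```python
# Hypothetical sample data for illustration only (sizes in TB).
datastores = [
    {"name": "ds-gold-01", "total_tb": 10.0, "used_tb": 9.0, "provisioned_tb": 15.0},
    {"name": "ds-silver-01", "total_tb": 8.0, "used_tb": 4.0, "provisioned_tb": 6.0},
]

rows = []
for ds in datastores:
    rows.append(
        {
            "name": ds["name"],
            # Share of the datastore actually consumed on disk.
            "used_pct": 100 * ds["used_tb"] / ds["total_tb"],
            # Assumption: overcommit = provisioned / total. Above 1.0 means
            # more space has been promised to thin VMDKs than physically exists.
            "overcommit": ds["provisioned_tb"] / ds["total_tb"],
        }
    )

# Fullest datastores first, mirroring the descending sort on Usage.
rows.sort(key=lambda r: r["used_pct"], reverse=True)

for r in rows:
    print(r["name"], r["used_pct"], r["overcommit"])
```

A datastore that is both nearly full and heavily overcommitted is the one to worry about first, since its thin disks still have room to grow on paper but not on disk.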
Next, I added all these Views to a new report that I created, called Daily Report.
Finally, I created a schedule so that this report runs only Monday through Friday and only against a custom group that I created. The thing is, in large environments reports can take a long time to run and can drive up the IOPS numbers on the array. I ran the report against my entire environment and it took about 3.5 hours to finish (some backups were running during this time too). My point is: scope the report to the objects you actually need it for.
I’ll post on custom groups in a few days; they are very useful when it comes to grouping certain areas of your infrastructure. You can then apply supermetrics against these groups and get some pretty good stats out of them.