Why Virtualize Splunk?

By | March 20, 2019

Why should you virtualize Splunk?

I get asked that question all the time.  Let us first take a look at a typical Splunk installation on bare metal servers.

What information do we even need to collect to size a Splunk deployment?

We need to start with the following questions:

  1. How much data are you going to ingest into your Splunk environment?
  2. How many days of data do you want to be stored in the hot/warm data tier?
  3. How many days of data do you want to be stored in the cold data tier?
  4. How many days of data do you want to be stored in the frozen data tier?
  5. How many concurrent users will be pulling reports?
  6. What are the Splunk modules and versions you are deploying?
  7. What search factor are you looking to use?
  8. What replication factor are you looking to use?

There are many more questions to ask, but these are enough to start the sizing process.  How do I use this data?  Take a trip to https://splunk-sizing.appspot.com/ and enter the information you gathered above.  Once all the data has been entered, the site shows the node count and the required storage for each node.  In this context, a node is a physical server.
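To make the arithmetic behind those questions concrete, here is a minimal sketch of a storage estimate.  It is not the sizing site's actual formula.  It assumes the commonly quoted rule of thumb that compressed raw data takes about 15% of the ingested volume and index files about 35%, with the replication factor multiplying the raw data copies and the search factor multiplying the index copies.  The names SizingInputs and estimate_storage_gb are made up for this example, and the frozen tier is left out because frozen data is either deleted or archived outside the indexers.

```python
from dataclasses import dataclass


@dataclass
class SizingInputs:
    """Answers to the sizing questions (hypothetical structure for illustration)."""
    daily_ingest_gb: float        # question 1: raw data ingested per day
    hot_warm_days: int            # question 2: retention in the hot/warm tier
    cold_days: int                # question 3: retention in the cold tier
    replication_factor: int = 1   # question 8: copies of raw data in the cluster
    search_factor: int = 1        # question 7: searchable copies (with index files)
    rawdata_ratio: float = 0.15   # ASSUMPTION: compressed raw data vs. ingested size
    index_ratio: float = 0.35     # ASSUMPTION: index files vs. ingested size


def estimate_storage_gb(inputs: SizingInputs) -> dict:
    """Rough cluster-wide disk estimate per tier, in GB."""
    if not 1 <= inputs.search_factor <= inputs.replication_factor:
        raise ValueError("search factor must be between 1 and the replication factor")

    # Disk consumed across the cluster for each day of retained data.
    per_day_gb = inputs.daily_ingest_gb * (
        inputs.rawdata_ratio * inputs.replication_factor
        + inputs.index_ratio * inputs.search_factor
    )
    hot_warm_gb = per_day_gb * inputs.hot_warm_days
    cold_gb = per_day_gb * inputs.cold_days
    return {
        "per_day_gb": per_day_gb,
        "hot_warm_gb": hot_warm_gb,
        "cold_gb": cold_gb,
        "total_gb": hot_warm_gb + cold_gb,
    }


if __name__ == "__main__":
    example = SizingInputs(
        daily_ingest_gb=100,
        hot_warm_days=30,
        cold_days=60,
        replication_factor=3,
        search_factor=2,
    )
    for tier, size in estimate_storage_gb(example).items():
        print(f"{tier}: {size:.0f} GB")
```

With 100 GB per day, a replication factor of 3, and a search factor of 2, these assumptions work out to roughly 115 GB of disk per day of retention across the cluster.  Treat the result as a sanity check against the sizing site, not a replacement for it.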

In a future blog, we will take a deeper look into the Splunk architecture.

At this point, I am sure you have already started to pick out reasons to virtualize.

Physical hardware can fail.  Depending on the Splunk setup, a single physical server outage could take down the entire infrastructure.  If clustering has been enabled, the outage may instead cause a loss in performance and possibly limited access to the complete data set.

Here are the reasons to virtualize:

  • Ability to fail a server to a new hypervisor host
  • Shared storage
  • Redundant networking between virtual servers
  • SSD and HDD storage arrays
  • Pooled resources


High availability is usually given as the key reason to virtualize Splunk.  If you take replication factors and search factors into account, however, virtualization is not required for that purpose.  Splunk built high availability into the application itself.
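As a rough illustration of how the replication factor and search factor provide that availability, the sketch below computes how many indexer failures a cluster can absorb.  It assumes the usual reading of these settings: the data survives as long as one copy remains, so the cluster tolerates the replication factor minus one failures without data loss, and the search factor minus one failures while still keeping a copy that is immediately searchable.  The function name tolerated_peer_failures is hypothetical.

```python
def tolerated_peer_failures(replication_factor: int, search_factor: int) -> dict:
    """How many indexer (peer) failures a cluster can absorb, under the stated assumptions."""
    if not 1 <= search_factor <= replication_factor:
        raise ValueError("search factor must be between 1 and the replication factor")
    return {
        # At least one copy of the raw data survives this many failures.
        "without_data_loss": replication_factor - 1,
        # At least one already-searchable copy survives this many failures.
        "immediately_searchable": search_factor - 1,
    }


if __name__ == "__main__":
    print(tolerated_peer_failures(replication_factor=3, search_factor=2))
```

In this example, a replication factor of 3 and a search factor of 2 would ride out one failure with no interruption to searching, and a second failure without losing data, although the surviving copy may first need its index files rebuilt before it can be searched.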

Join me for my next blog post, on why to run Splunk on Nutanix.

