Why Virtualize Splunk?

By | March 20, 2019

Why should you virtualize Splunk?

I get asked that question all the time.  So let us first look at a typical Splunk installation on bare metal servers.

What information do we even need to collect to size a Splunk environment?

We need to start with the following questions:

  1. How much data are you going to ingest into your Splunk environment?
  2. How many days of data do you want to be stored in the hot/warm data tier?
  3. How many days of data do you want to be stored in the cold data tier?
  4. How many days of data do you want to be held in the frozen data tier?
  5. How many concurrent users will be pulling reports?
  6. What are the Splunk modules and versions you are deploying?
  7. What search factor are you looking to use?
  8. Finally, what replication factor are you looking to use?

Many more questions could be asked, but these are enough to start the sizing process.  How do I use this data?  First, we need to take a trip to https://splunk-sizing.appspot.com/.  On this website, enter the information you gathered above.  Once all the data has been entered, the site will show the node count and required storage for each type of node.  In this context, I refer to a node as a physical server.
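To make the inputs above more concrete, here is a rough sketch of the kind of storage arithmetic a sizing tool performs.  It is not the calculator's actual code.  The class and function names are hypothetical, and the two compression ratios (rawdata at about 15% of ingest, index files at about 35%) are commonly quoted rules of thumb that I am assuming here; real ratios vary with the data.  Rawdata is multiplied by the replication factor and index files by the search factor.

```python
from dataclasses import dataclass


@dataclass
class SizingInputs:
    """Hypothetical container for the answers to the sizing questions."""
    daily_ingest_gb: float        # question 1
    hot_warm_days: int            # question 2
    cold_days: int                # question 3
    replication_factor: int = 1   # question 8
    search_factor: int = 1        # question 7
    rawdata_ratio: float = 0.15   # ASSUMPTION: compressed rawdata vs. ingest
    index_ratio: float = 0.35     # ASSUMPTION: index files vs. ingest


def tier_storage_gb(inputs: SizingInputs, days: int) -> float:
    """Cluster-wide disk needed to keep `days` of data in one tier."""
    per_day = inputs.daily_ingest_gb * (
        inputs.rawdata_ratio * inputs.replication_factor
        + inputs.index_ratio * inputs.search_factor
    )
    return round(per_day * days, 1)


def estimate(inputs: SizingInputs) -> dict:
    """Return estimated storage per tier and in total, in GB."""
    if inputs.search_factor > inputs.replication_factor:
        raise ValueError("search factor cannot exceed replication factor")
    hot_warm = tier_storage_gb(inputs, inputs.hot_warm_days)
    cold = tier_storage_gb(inputs, inputs.cold_days)
    return {
        "hot_warm_gb": hot_warm,
        "cold_gb": cold,
        "total_gb": round(hot_warm + cold, 1),
    }


if __name__ == "__main__":
    # Example: 100 GB/day, 30 days hot/warm, 60 days cold, RF 3, SF 2.
    print(estimate(SizingInputs(100, 30, 60, replication_factor=3, search_factor=2)))
```

The frozen tier is left out of the sketch because Splunk either deletes frozen data or archives it, so it does not count toward the indexers' searchable storage.  Divide the totals by the number of indexers to get a per-node figure.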

In a future blog, we will look deeper into Splunk architecture.

At this point, I am sure you have already started to pick out reasons to virtualize.

Physical hardware can fail.  Depending on the Splunk setup, a single physical server outage could take down the entire infrastructure.  If clustering has been enabled, the outage instead means a loss in performance and possibly limited access to the complete data set.

Here are the reasons to virtualize:

  • Ability to fail a server over to a new hypervisor host
  • Shared storage
  • Redundant networking between virtual servers
  • SSD and HDD storage arrays
  • Pooled resources

Conclusion

High availability is usually given as the key reason for virtualizing Splunk.  However, virtualization is not required for high availability once you consider replication factors and search factors.  Splunk took high availability and built it into the application.
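As a small illustration of how replication factor and search factor provide availability on their own, the sketch below computes how many indexer failures a single-site cluster can absorb.  The function name is hypothetical.  It uses the general rule that a cluster keeps at least one copy of its data after losing up to (replication factor - 1) peers, and at least one immediately searchable copy after losing up to (search factor - 1) peers.

```python
def failure_tolerance(replication_factor: int, search_factor: int) -> dict:
    """Peer failures a single-site indexer cluster can absorb (rule of thumb)."""
    if not 1 <= search_factor <= replication_factor:
        raise ValueError("need 1 <= search factor <= replication factor")
    return {
        # At least one copy of every bucket still exists somewhere.
        "peers_lost_without_data_loss": replication_factor - 1,
        # At least one copy is still searchable without being rebuilt first.
        "peers_lost_with_searchable_copy": search_factor - 1,
    }


if __name__ == "__main__":
    print(failure_tolerance(replication_factor=3, search_factor=2))
```

With a replication factor of 3 and a search factor of 2, for example, the cluster can lose two peers without losing data and one peer without losing its searchable copies.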

Join me for my next blog post: why run Splunk on Nutanix.