Category Archives: Data Analytics

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver… Read More »

What is Elasticsearch?

Elasticsearch is an open-source, RESTful, distributed search and analytics engine built on Apache Lucene. Since its release in 2010, Elasticsearch has become one of the most popular search engines and is commonly used for log analytics, full-text search, security intelligence, business analytics, and operational intelligence. Why do you need Elasticsearch?… Read More »

Splunk 101

Welcome back to my Splunk series. Let’s continue our journey with Splunk. What data can Splunk ingest? Let us take a look at Splunk from 1,000 feet. One thing Splunk strives for is the ability to ingest any data. Splunk software collects and indexes data… Read More »

What is Splunk?

I have decided to start writing again. I am going to start with the three main big data platforms and add more as time allows. As you can see by the title, we are going to dig into Splunk today. What is Splunk? Splunk is a software technology that is used for monitoring, searching, analyzing… Read More »

Why Virtualize Splunk?

Why should you virtualize Splunk? I get asked that question all the time. Let us first take a look at a typical Splunk installation on bare-metal servers. What information do we need to collect to size a Splunk deployment? We need to start with the following questions: How much data are you going in… Read More »