Splunk

From ICO wiki

Splunk[2] is a powerful log database that can be used to analyze any sort of log data through its easy-to-use search engine. Security logs, syslog, web server logs and Windows logs are just the beginning: one of the great features of Splunk is that you can feed pretty much any log into it and start searching. Splunk is not open source. It is a commercial product, but it has a Free option that allows up to 500 MB of data to be added to the system per day; beyond that volume, the licensing costs start to add up. Installing Splunk under Ubuntu is easy enough that you can fire up an instance just to do ad hoc analysis of static log files.

Why Splunk?

Splunk's features page sums it up: collect and index all log files, with very flexible data input choices. Machine learning is a good example of a use case. https://www.splunk.com/en_us/products/splunk-light/features.html

Open Source Splunk Alternative

If you are interested in a purely open source log search engine, take a look at Graylog2.

About Splunk Free

Splunk Free is the totally free version of Splunk. The Free license lets you index up to 500 MB per day and will never expire. The 500 MB limit refers to the amount of new data you can add (we call this indexing) per day. But you can keep adding data every day, storing as much as you want. For example, you could add 500 MB of data per day and eventually have 10 TB of data in Splunk Enterprise.

If you need more than 500 MB/day, you'll need to purchase an Enterprise license. See How Splunk licensing works for more information about licensing. Splunk Free regulates your license usage by tracking license violations. If you go over 500 MB/day more than 3 times in a 30 day period, Splunk Free continues to index your data, but disables search functionality until you are back down to 3 or fewer warnings in the 30 day period.
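The enforcement rule above comes down to simple counting. The following is a minimal sketch of that arithmetic, not Splunk's actual license tracker: the daily volumes are made-up sample values, assumed to all fall inside one 30 day period, and the variable names are hypothetical.

```shell
#!/bin/sh
# Made-up daily indexing volumes in MB, one value per day (sample data only).
daily_mb="120 480 650 300 510 90 700 499 505 200"
limit_mb=500      # Splunk Free daily indexing limit
max_warnings=3    # search is disabled once the warning count exceeds this

# Count the days whose volume went over the daily limit.
warnings=0
for mb in $daily_mb; do
  if [ "$mb" -gt "$limit_mb" ]; then
    warnings=$((warnings + 1))
  fi
done

# More than 3 warnings in the period means indexing continues but search stops.
if [ "$warnings" -gt "$max_warnings" ]; then
  status="search disabled"
else
  status="search available"
fi
echo "warnings=$warnings status=$status"
```

With the sample values, four days exceed 500 MB, so the sketch reports that search would be disabled.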

Is Splunk Free for you?

Splunk Free is designed for personal, ad hoc search and visualization of IT data. You can use Splunk Free for ongoing indexing of small volumes (<500 MB/day) of data. Additionally, you can use it for short-term bulk-loading and analysis of larger data sets: Splunk Free lets you bulk-load much larger data sets up to 3 times within a 30 day period. This can be useful for forensic review of large data sets.

What is included with Splunk Free?

Splunk Free is a single-user product. All Splunk Enterprise features are supported, with the following exceptions:

  • Distributed search configurations (including search head clustering) are not available.
  • Forwarding in TCP/HTTP formats is not available. This means you can forward data to other Splunk platform instances, but not to non-Splunk software.
  • Deployment management capabilities are not available.
  • Alerting (monitoring) is not available.
  • Indexer clustering is not available.
  • Report acceleration summaries are not available.
  • While a Splunk Free instance can be used as a forwarder (to a Splunk Enterprise indexer) it cannot be the client of a deployment server.
  • There is no authentication or user and role management when using Splunk Free. This means:
      • There is no login. The command line or browser can access and control all aspects of Splunk Free with no user/password prompt.
      • All accesses are treated as equivalent to the admin user. There is only one role (admin), and it is not configurable. You cannot add more roles or create user accounts.
      • Searches are run against all public indexes, 'index=*'.
      • Restrictions on search, such as user quotas, maximum per-search time ranges, and search filters, are not supported.
      • The capability system is disabled. All available capabilities are enabled for all users accessing Splunk Free.

Ways you can configure Splunk software

Splunk software maintains its configuration information in a set of configuration files. You can configure Splunk by using any (or all!) of these methods:

  • Use Splunk Web.
  • Use Splunk's Command Line Interface (CLI) commands.
  • Edit Splunk's configuration files directly.
  • Use App setup screens that use the Splunk REST API to update configurations.

All of these methods change the contents of the underlying configuration files. You may find different methods handy in different situations.

Use Splunk Web

You can perform most common configuration tasks in Splunk Web. Splunk Web runs by default on port 8000 of the host on which it is installed:

  • If you're running Splunk on your local machine, the URL to access Splunk Web is http://localhost:8000.
  • If you're running Splunk on a remote machine, the URL to access Splunk Web is http://<hostname>:8000, where <hostname> is the name of the machine Splunk is running on.

Administration menus can be found under Settings in the Splunk Web menu bar.

Edit configuration files

Most of Splunk's configuration information is stored in .conf files. These files are located under your Splunk installation directory (usually referred to in the documentation as $SPLUNK_HOME) under etc/system. In most cases you can copy these files to the local directory there ($SPLUNK_HOME/etc/system/local) and change the copies with your preferred text editor.

Use Splunk CLI

Many configuration options are available via the CLI. These options are documented in the CLI chapter of the Splunk documentation. You can also get a CLI help reference with the help command while Splunk is running:

./splunk help

Feed Splunk Data and Search!

Once data is in the system, you can search it. Splunk is flexible about inputs: it can load individual files for one-off analysis, monitor known log files, or listen on a port much like a syslog server. With a TCP input, for example, you could use netcat to pipe a file over the network into the Splunk server, or have a syslog server forward some of its logs to the Splunk instance. The second approach leaves your existing syslog infrastructure intact for archival purposes while giving you the Splunk instance for easy analysis.

From here on, the setup depends on your network and requirements, so think about how you are going to use Splunk, feed it some data and start searching. You might look for configuration issues, errors, utilization trends or security events. For some easy testing, just grab a web server log file or another log and feed it in directly with the file or directory input option.
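The netcat approach mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the host name, port and log path are hypothetical placeholders, and a TCP input is assumed to be listening on that port already. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical values: replace with your Splunk host, the TCP port of an
# input you have already configured, and the log file you want to send.
splunk_host="splunk.example.com"
splunk_port=9995
log_file="/var/log/apache2/access.log"

# netcat reads the file on stdin and writes it to the TCP connection.
cmd="nc $splunk_host $splunk_port"

# Dry run by default: only print the command. Set RUN=1 to send the file.
if [ "${RUN:-0}" = "1" ]; then
  $cmd < "$log_file"
else
  echo "would run: $cmd < $log_file"
fi
```

Some netcat variants keep the connection open after the end of the file, so check the options of the version you have installed.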

Configure the universal forwarder

Before a forwarder can forward data, it must have a configuration. A configuration:

  • Tells the forwarder what data to send.
  • Tells it where to send the data.

Because the universal forwarder does not have Splunk Web, you must give the forwarder a configuration either during the installation (on Windows systems only) or later, as a separate step. To perform post-installation configuration, you can:

  • Use the CLI. The CLI lets you do nearly all configuration in a small number of steps, but does not give you full access to the feature set of the forwarder.
  • Create or modify configuration files on the forwarder directly.
  • Use a deployment server. The deployment server can ease distribution of configurations, but does not make a forwarder forward data by itself. You must use the deployment server to deliver configurations to the forwarders so that they collect the data you want and send it to the place you want.

About configuring the universal forwarder with configuration files

Configuration files are text files that the universal forwarder reads when it starts up or when you reload a configuration. Forwarders must read configuration files to know where to get and send data. These files give you full access to the forwarder feature set, but editing configuration files can be difficult or mistake-prone at times. Key configuration files are:

  • inputs.conf controls how the forwarder collects data.
  • outputs.conf controls how the forwarder sends data to an indexer or other forwarder.
  • server.conf for connection and performance tuning.
  • deploymentclient.conf for connecting to a deployment server.

You make changes to configuration files by editing them with a text editor. You can use any editor that you want as long as it can write files in ASCII/UTF-8 format. The forwarder works with configurations for forwarding data in outputs.conf in $SPLUNK_HOME/etc/system/local/. See Configure forwarding with outputs.conf.

The universal forwarder has a SplunkUniversalForwarder app, which includes preconfigured settings that let the forwarder run in a streamlined mode. Do not edit any configuration files within that app unless you receive specific instructions.

Best practices for deploying configuration updates across universal forwarders

You can use the following methods to deploy configuration updates across your set of universal forwarders:

  • Edit or copy the configuration files for each universal forwarder manually. (This is only useful for small deployments.)
  • Use the Splunk deployment server to push configured apps to your set of universal forwarders.
  • Use your own deployment tools (Puppet or Chef on *nix, or System Center Configuration Manager on Windows) to push configuration changes.

Configure the universal forwarder from the CLI

The CLI lets you configure most forwarding parameters without having to edit configuration files. It does not give you full access to all forwarding parameters, so you must edit configuration files in those cases. When you make configuration changes with the CLI, the universal forwarder writes the configuration files. This prevents typos and other mistakes that can occur when you edit configuration files directly. The forwarder writes configurations for forwarding data to outputs.conf in $SPLUNK_HOME/etc/system/local/.

Examples for using the CLI to configure a universal forwarder

Following are example procedures on how to configure a universal forwarder to connect to a receiving indexer.

Configure the universal forwarder to connect to a receiving indexer

From a shell or command prompt on the forwarder, run the command:

  • ./splunk add forward-server <host name or ip address>:<listening port>

For example, to connect to the receiving indexer with the hostname idx1.mycompany.com, which listens on port 9997 for forwarders, type in:

  • ./splunk add forward-server idx1.mycompany.com:9997

Configure the universal forwarder to connect to a deployment server

From a shell or command prompt on the forwarder, run the command:

  • ./splunk set deploy-poll <host name or ip address>:<management port>

For example, if you want to connect to the deployment server with the hostname ds1.mycompany.com on the default management port of 8089, type in:

  • ./splunk set deploy-poll ds1.mycompany.com:8089
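The same connection can also be expressed by hand in deploymentclient.conf, one of the key configuration files listed earlier. The sketch below is a hand-written equivalent, not a capture of what the CLI produces. It writes the stanza into a scratch directory that stands in for $SPLUNK_HOME/etc/system/local (an assumption, so that it can be tried without touching a real installation).

```shell
#!/bin/sh
# Scratch directory standing in for $SPLUNK_HOME/etc/system/local (assumption).
conf_dir=$(mktemp -d)

# Point the deployment client at the deployment server's management port.
cat > "$conf_dir/deploymentclient.conf" <<'EOF'
[deployment-client]

[target-broker:deploymentServer]
targetUri = ds1.mycompany.com:8089
EOF

# Show the resulting file for review.
cat "$conf_dir/deploymentclient.conf"
```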

Configure a data input on the forwarder

  1. Determine what data you want to collect.
  2. From a shell or command prompt on the forwarder, run the command that enables that data input. For example, to monitor the /var/log directory on the host with the universal forwarder installed, type in:

./splunk add monitor /var/log

  3. The forwarder asks you to authenticate and begins monitoring the specified directory immediately after you log in.
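A monitor input like this one can also be declared directly in inputs.conf. The following is a minimal sketch under the assumption that a scratch file is acceptable for trying it out; which inputs.conf the CLI itself writes to is not shown here, and index and sourcetype are left at their defaults.

```shell
#!/bin/sh
# Scratch directory used instead of a real Splunk configuration directory (assumption).
conf_dir=$(mktemp -d)

# A monitor stanza names the path to watch; three slashes = monitor:// + /var/log.
cat > "$conf_dir/inputs.conf" <<'EOF'
[monitor:///var/log]
disabled = false
EOF

# Show the resulting file for review.
cat "$conf_dir/inputs.conf"
```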

Restart the universal forwarder

Some configuration changes might require that you restart the forwarder.

To restart the universal forwarder, use the same CLI restart command that you use to restart a full Splunk Enterprise instance:

On Windows: Go to %SPLUNK_HOME%\bin and run this command:

  • splunk restart

On *nix systems: From a shell prompt on the host, go to $SPLUNK_HOME/bin, and run this command:

  • ./splunk restart

Configure forwarding with outputs.conf

The outputs.conf file defines how forwarders send data to receivers. You can specify some output configurations at installation time (Windows universal forwarders only) or with the CLI, but most advanced configuration settings require that you edit outputs.conf. The topics that describe various forwarding topologies, such as load balancing and intermediate forwarding, provide detailed examples of configuring outputs.conf to support those topologies.

Although outputs.conf is a required file for configuring forwarders, it addresses only the outputs from the forwarder, that is, where you want the forwarder to send the data it collects. To specify the data that you want the forwarder to collect, you must separately configure the inputs, as you would for any Splunk instance.

Edit outputs.conf to configure forwarding

This procedure details the steps you must take to edit the default outputs.conf which is in $SPLUNK_HOME/etc/system/local.

  1. On the host that forwards the data that you want to collect, open a shell, command prompt, or PowerShell window.
  2. Go to the configuration directory for the forwarder.

Unix

  • cd $SPLUNK_HOME/etc/system/local

Windows

  • cd %SPLUNK_HOME%\etc\system\local

  3. Open outputs.conf for editing with a text editor.

Unix

  • vi outputs.conf

Windows

  • notepad outputs.conf

  4. Edit outputs.conf. Add at least one forwarding target group or a single receiving host.
  5. Save the outputs.conf file and close it.
  6. Restart the universal forwarder to complete your changes.

Unix

  • cd $SPLUNK_HOME/bin
  • ./splunk restart

Windows

  • cd %SPLUNK_HOME%\bin
  • .\splunk restart
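For the editing step in the procedure above, a minimal outputs.conf with one target group might look like the following sketch. The group name my_indexers is a hypothetical choice, the receiving host reuses idx1.mycompany.com:9997 from the CLI example, and the file is written to a scratch directory standing in for $SPLUNK_HOME/etc/system/local rather than to a real installation.

```shell
#!/bin/sh
# Scratch directory standing in for $SPLUNK_HOME/etc/system/local (assumption).
conf_dir=$(mktemp -d)

# One target group (hypothetical name: my_indexers) with a single receiver.
cat > "$conf_dir/outputs.conf" <<'EOF'
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1.mycompany.com:9997
EOF

# Show the resulting file for review.
cat "$conf_dir/outputs.conf"
```

Listing more than one host:port pair in the server setting, separated by commas, is how a target group with several receivers is expressed.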

TL;DR

Install Splunk

  1. Run the dpkg command to install Splunk Light into the default directory.

dpkg -i splunk_package_name.deb

You cannot install the DEB package into another directory.

  2. Start Splunk.

./splunk start --accept-license

Configure the universal forwarder to connect to a deployment server

  1. From a shell or command prompt on the forwarder, run the command:

./splunk set deploy-poll <host name or ip address>:<management port>

  2. Configure a data input on the forwarder.

./splunk add monitor /var/log

  3. Restart the universal forwarder.

./splunk restart

Configure your inputs

  1. Edit inputs.conf

Example:

# The following configuration directs Splunk to listen on TCP port 9995 for raw data from 10.1.1.10.
# All data is assigned the host "webhead-1", the sourcetype "access_common" and
# the source "//10.1.1.10/var/log/apache/access.log".

[tcp://10.1.1.10:9995]
host = webhead-1
sourcetype = access_common
source = //10.1.1.10/var/log/apache/access.log

More examples and information:

  • http://docs.splunk.com/Documentation/Splunk/6.5.1/admin/Inputsconf
  • https://docs.splunk.com/Documentation/Splunk/6.5.1/Data/Monitorfilesanddirectorieswithinputs.conf

Splunk Web

  1. Log in to Splunk Web.

The Splunk Web interface is at http://localhost:8000

  2. Enter credentials.

username: admin
password: changeme

References