SPLUNK! Goes the Syslog Server…

The use of a “syslog” server is important in today’s data center. Most network and SAN switches, along with Unix and Linux servers, are capable of sending logging information to a syslog server. The obvious reason for a syslog server is to centralize all of your logs, which lets you troubleshoot issues more efficiently. Most syslog servers allow you to do a timeline-based analysis of log data so that you have an enterprise-wide view of all activity and can see how different devices interact.

A less obvious reason for a syslog server is security. The theory is that an attacker will attempt to elevate to root privileges and then try to delete or alter logs to hide evidence of the attack. If all log information is relayed to a syslog server, the hope is that a copy of this data is secured for forensic study, if needed.

I have tried a few different “free” and non-free syslog servers. I didn’t do extensive research into all available syslog servers, but I have to say that I like Splunk the best. It starts with a free version that caps the amount of data you can collect, which may be fine for smaller shops. There is also a paid version that allows for more data collection. The fully “free” syslog server that came closest was the combination of syslogd and phplogcon on a Linux server. I also tried Kiwi Syslog, which has both a “free” version and a paid version, but it only installs on winders. Most of the syslog servers I tried are great. There were a few capabilities I felt made Splunk a nice syslog server:

  • Act as a standard syslog server.
  • The ability to “scrape” directories.
  • Monitor Windows logs.
  • Allow for upload of log data.
  • Provide timeline analysis.

Acting as a standard syslog server is really a no-brainer. All of the packages that I tested worked fine in this respect. You add a forwarding entry that points at the syslog server in the *nix /etc/syslog.conf file, and the logs that entry selects are sent automatically.
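To make the forwarding step a little more concrete, here is a minimal Python sketch of what ends up on the wire when a classic syslogd relays a message: a priority value (facility times 8, plus severity), a timestamp, the host name, and the message text, sent as a UDP datagram to port 514. The function names, the host name "esx01" and the server name "loghost" are assumptions made up for illustration, not anything Splunk or syslogd provides.

```python
import socket
from datetime import datetime

# Facility and severity codes from the BSD syslog protocol (RFC 3164).
FACILITY = {"kern": 0, "user": 1, "daemon": 3, "auth": 4, "local0": 16}
SEVERITY = {"emerg": 0, "crit": 2, "err": 3, "warning": 4,
            "notice": 5, "info": 6, "debug": 7}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_message(facility, severity, host, tag, text, when=None):
    """Format one BSD-style syslog line, e.g. '<34>Mar  5 14:07:09 host tag: text'."""
    pri = FACILITY[facility] * 8 + SEVERITY[severity]
    when = when or datetime.now()
    # The day of month is padded with a space, not a zero.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {text}"


def send(message, server, port=514):
    """Send a single syslog line to the server over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (server, port))


# Hypothetical usage; "loghost" stands in for your own syslog server:
# send(build_message("auth", "crit", "esx01", "sshd", "test message"), "loghost")
#
# With a classic syslogd, the matching forwarding entry in /etc/syslog.conf
# would look like:   *.*    @loghost
```

UDP is the traditional transport, so nothing confirms delivery. A quick way to check the whole path is to send a test line and search for it on the syslog server.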

When dealing with collecting logs on an ESX server, the standard syslog.conf settings may not cut it. The HA logs reside in a different location and should be “scraped”. In this context, “scraping” is the process of reading all of the text files in a specified directory and compiling them into the syslog database.
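The idea behind scraping can be sketched in a few lines of Python. This is not how Splunk implements it, just an illustration of the concept: walk a directory, read every file as text, and hand each line on with enough context (file name and line number) to index it. The function name and the pattern parameter are assumptions for the example.

```python
from pathlib import Path


def scrape_directory(directory, pattern="*"):
    """Yield (file name, line number, text) for every non-empty line found."""
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue  # skip subdirectories
        # errors="replace" keeps a stray binary byte from aborting the scrape.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                text = line.rstrip("\r\n")
                if text:
                    yield path.name, number, text


# Hypothetical usage; the directory is a placeholder for wherever the logs live:
# for name, number, text in scrape_directory("/path/to/logs", "*.log"):
#     print(name, number, text)
```

A real collector would also remember how far it had read in each file so that a second pass only picks up new lines. This sketch rereads everything each time.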

Monitoring Windows logs is also a key ingredient in the datacenter stew. If you are going to do centralized collection of logs, collect everything. Splunk uses WMI to gather this information.

The ability to upload log data manually is also a nice option. I was recently troubleshooting an issue with VMware Consolidated Backup and I was able to manually upload all of the related VCB logs right into a Splunk server VM. I exported the Windows system and application logs to .csv files and copied them to a directory on the Splunk server. I also copied the VCB logs and ESX logs to the same directory. After a few minutes, the data was assimilated into the database and ready for analysis. I was able to look at a specific point in time and look at errors across the entire environment. I could see errors in the VCB logs and relate them to errors in the Windows system and application logs. I was also able to track all of the ESX and VM logs for the time period.
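That "specific point in time" view boils down to merging events from every source into one timeline and keeping the ones that fall close to the moment you care about. Here is a rough Python sketch of the idea, assuming each log line has already been parsed into a timestamp, a source name and the message text. The Event type, the function name and the five-minute default window are all made up for illustration.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    when: datetime   # parsed timestamp of the log entry
    source: str      # where it came from, e.g. "vcb" or "windows-system"
    text: str        # the log message itself


def events_near(events, moment, window=timedelta(minutes=5)):
    """Return events from all sources within `window` of `moment`, oldest first."""
    nearby = [event for event in events if abs(event.when - moment) <= window]
    return sorted(nearby, key=lambda event: event.when)


# Hypothetical usage with already-parsed events from several logs:
# for event in events_near(all_events, datetime(2009, 3, 5, 2, 15)):
#     print(event.when, event.source, event.text)
```

Sorting by timestamp is what lets an error in one log line up next to the related error in another, provided the clocks on the machines involved are reasonably in sync.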

The Splunk server offers WAY more than the logging functions described here. It is also a great tool for compliance, change control, security, server management, etc. It has install packages for winders, Linux, Solaris (x86, x64 AND SPARC), Mac OS X, FreeBSD and AIX.

As you can see, the Splunk server is very useful for capturing all kinds of logs for security and troubleshooting purposes. In part two, I will dig deeper into setting up a Splunk server and configuring *nix, ESX, ESXi and winders machines to send their logs. As with the VCB Proven Practice Guide, there will be a companion doc on the VI:OPS site.