Running Real-Time Searches: To Search or Not to Search
By David Allen | Senior Splunk Consultant
One of the characteristics of modern life seems to be that we are moving at an ever-increasing rate, regardless of turbulence or obstacles. We have come to expect things to be faster than they were even a week ago. We expect computers to run faster and cars to go faster as we desire more speed and instant gratification. As a result, one would generally embrace the idea that getting data faster by running real-time searches in Splunk would also be a good thing. Well, not so fast. Let’s slow down a bit and think this through.
In this blog, we will discuss the pros and cons of running real-time searches in Splunk and what a best practice search scenario should look like.
To start off, we need to discuss the type of alert response system that you will be using. Is your response system completely automated, or are humans responding? In almost all cases there will be some form of human decision making and reaction time involved. This is important because once you add in human delays, you get a more realistic idea of how long it takes for analysts to even start working on your alerts. Often, due to these delays, it is not necessary to put a strain on your Splunk infrastructure to get your tickets a few seconds faster.
For instance, consider the amount of time it will take for one of your experienced analysts to receive, mentally process, and react to various alerts. Generally, if the alert occurs during regular work hours, this reaction time may be on the order of 5-10 minutes. If you are not staffed for 24/7 analyst support, there are after-hours delays, as well as delays when the analysts take breaks periodically throughout the workday. Then there are the normal weekend delays, which could be an hour or two. You get the idea.
In many cases there may be substantial delays in reacting to an alert, and if the alerts came in a few minutes later the overall reaction time would not change significantly. Weighing this against the significant impact on Splunk infrastructure performance, in almost all cases it would be a much better approach to use indexed real-time alerts.
Now let’s dig into the impact on the Splunk infrastructure of running real-time searches. Real-time searches run continuously in the background and, as a result, each one consumes one core on the Splunk search head and one core on EACH of your Splunk indexers for as long as the search is running. As your cores get consumed by more and more concurrent real-time searches, overall Splunk infrastructure performance comes crashing down.
For example, if your search head and indexers have 12 cores each and you have 10 continuously running real-time searches, this leaves 2 cores on each machine for all the remaining work that the enterprise must do. So, if you have a dashboard with 10 panels and each panel takes one core to run, only 2 panels can run at a time, and performance will drop to roughly one-fifth of what it could be if all cores were available to run the dashboard.
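The arithmetic in this example can be sketched in a few lines of Python. This is a simplified model, not a Splunk API: it assumes every real-time search and every dashboard panel holds exactly one core, and that panels run in sequential batches as cores free up. The function names are illustrative only.

```python
import math


def free_cores(total_cores: int, realtime_searches: int) -> int:
    """Cores left on one machine after each real-time search holds one core."""
    return max(total_cores - realtime_searches, 0)


def dashboard_batches(panels: int, cores: int) -> int:
    """Sequential batches needed to run all panels at one core per panel."""
    if cores <= 0:
        raise ValueError("no cores available to run panel searches")
    return math.ceil(panels / cores)


total = 12      # cores per search head / indexer (from the example above)
realtime = 10   # continuously running real-time searches
panels = 10     # dashboard panels, one core each

available = free_cores(total, realtime)
print("free cores:", available)
print("batches with real-time searches running:", dashboard_batches(panels, available))
print("batches with all cores free:", dashboard_batches(panels, total))
```

Under these assumptions the dashboard needs five batches instead of one, which is where the one-fifth figure comes from.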
Now compare this to a regular scheduled search that runs every 5 minutes and takes only 10 seconds to complete. This search also consumes one core on the search head and on each indexer, but for only 10 seconds out of every 300. The results are the same, but this search consumes roughly 3% of the processing power of the real-time search.
Hopefully, by now you are starting to see that real-time searches are not to be used carelessly. They should be run by Splunk professionals, and only for short-lived, ad hoc use cases.
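As a quick check of that figure, here is a minimal Python sketch of the comparison. It assumes a real-time search holds its core 100% of the time, while a scheduled search holds a core only while it is running; the function name is illustrative.

```python
def duty_cycle(run_seconds: float, interval_seconds: float) -> float:
    """Fraction of each scheduling interval during which a search holds a core."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    # A search that runs longer than its interval cannot exceed 100%.
    return min(run_seconds / interval_seconds, 1.0)


scheduled = duty_cycle(run_seconds=10, interval_seconds=5 * 60)
print(f"scheduled search core usage: {scheduled:.1%} of a real-time search")
```

Ten seconds out of every 300 works out to about 3.3%, matching the rough 3% quoted above.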
Let’s look at a couple of ways to protect your Splunk infrastructure from users hogging precious system resources. The following capabilities control much of the real-time searching behavior:
rtsearch – This setting enables the user to run real-time searches.
schedule_rtsearch – This setting enables the user to schedule real-time searches.
Remove Real-Time Search Capability for Certain Users
By default, rtsearch and schedule_rtsearch are enabled for the power role and are inherited by other roles. So, at a minimum, you can disable these settings on the power role so that users with this role do not have access to any real-time searching capability. Be sure to also disable these settings for any future roles that you create.
The easiest way to do this is through the GUI. Go to Settings, then under the Users and Authentication section select Roles.
Then, on the Roles screen, select Edit for the power role and then select Capabilities.
From the capabilities screen, deselect the rtsearch and schedule_rtsearch settings and Save the updates.
Remember that the admin role inherits the capabilities of the power role, so if you want your admins to have real-time searching capabilities you will need to turn these settings on specifically for the admin role.
Remove Real-Time Search Capability from ALL Users
The easiest way to do this would be to use the GUI as described above and disable the rtsearch and schedule_rtsearch settings, but that would not prevent anyone from easily re-enabling those settings later using the GUI.
A better way would be to disable the rtsearch and schedule_rtsearch settings using the CLI.
Remember that all settings in the $SPLUNK_HOME/etc/system/local folder are system-wide settings and have a higher precedence than that same setting in any other folder. So, to turn off real-time searching for the entire system, you will need to disable the rtsearch and schedule_rtsearch settings in the $SPLUNK_HOME/etc/system/local folder. Here is how to do that.
Using the CLI, go to the $SPLUNK_HOME/etc/system/local folder on the search head or your all-in-one box, open the authorize.conf file, add the two settings to the default stanza as shown, and restart Splunk.
[default]
rtsearch = disabled
schedule_rtsearch = disabled
Indexed Real-Time Search
If you decide that you do not need up-to-the-second accuracy, you can get close to real-time searching speed by running your real-time searches after the events are indexed, which will greatly improve indexing performance. Indexed real-time searches run like historical searches but continually update with new events as those events appear on disk, so the results look just like those of a real-time search.
To select indexed real-time searching, change the indexed_realtime_use_by_default setting in limits.conf to true as shown below and restart Splunk.
[realtime]
indexed_realtime_use_by_default=true
Disable RT Search Panel in Time Picker
For those who want to keep real-time or indexed real-time search capability but would like to hide the real-time panel in the time picker presets so that casual users do not select the real-time presets, you may disable the show_realtime setting in times.conf as shown below.
[settings]
show_realtime=false
For those who would like to disable individual real-time time picker settings, you can do that by disabling the respective stanzas in times.conf.
[settings]
[real_time_last30s]
disabled = 1
[real_time_last1m]
disabled = 1
[real_time_last5m]
disabled = 1
[real_time_last30m]
disabled = 1
[real_time_last1h]
disabled = 1
[real_time_all]
disabled = 1
To find out how many real-time searches are actually running in your environment, use the following search. It shows who is running which searches and how long they have been running.
| rest /services/search/jobs | search eventSorting=realtime | table label, author, dispatchState, eai:acl.owner, isRealTimeSearch, performance.dispatch.stream.local.duration_secs, runDuration, searchProviders, splunk_server, title
In conclusion, real-time searches are very powerful and beneficial when they are run for short periods of time by the right people who need to monitor streaming data before it is indexed. But for most use cases, considering the impact on the infrastructure, true real-time searches should be replaced with indexed real-time searches, which add very little latency and have virtually no impact on the infrastructure. Learn how to further optimize your Splunk searches or get help from one of our Splunk experts.