How to Enable Splunk Boot-Start Using Systemd
By: Jon Walthour | Senior Splunk Consultant, Team Lead
Splunk first made systemd the default for boot-start back in version 7.2.2, because systemd had become the default system initialization and service manager for most major Linux distros. Versions 7.3 through 8.1.0 switched back to SysV init because of shortcomings in how Splunk was using systemd for service startup and shutdown: the startup and shutdown actions prompted for root credentials, which broke many automated processes out in the wild. It wasn’t until version 8.1.1 that an option was added to the “enable boot-start” command to install “Polkit” rules, which grant a non-root user like “splunk” enough centralized system control to start and stop the Splunk systemd service. Starting with version 8.1.1, the preferred method for setting up boot-start for Splunk Enterprise is via systemd.
What are the advantages of using systemd, you might ask? Plenty. First, systemd starts services in parallel, so more gets done concurrently during system boot-up. Additionally, it provides a standard framework for expressing dependencies between processes. This means, in the case of the Splunk systemd initialization, Splunk’s startup can be made dependent on network services starting successfully. The configuration of systemd is standardized in plain-text unit files and does not require the creation of custom scripts.
Systemd also offers enhancements specifically to Splunk in that it provides a way to monitor and manage the splunkd service independent of Splunk itself. It provides tools for debugging and troubleshooting boot-time and service-related issues with Splunk, again independent of the Splunk software itself. Most importantly, systemd allows for the use of Linux control groups (cgroups), which form the backbone of the workload management features in Splunk Enterprise.
Splunk Enable Boot-Start Overview
- Install Polkit (if not already installed).
- Edit the splunk.service file.
- Add a systemd service for disabling THP.
- Enable the whole thing and reboot.
- Check your work.
Below are the steps to enable Splunk to start at system boot under systemd as well as other recommended operating system configurations for Splunk:
1. Install Polkit (if not already installed).
sudo su -
yum -y update
yum -y install polkit
sudo /opt/splunk/bin/splunk enable boot-start -systemd-managed 1 -systemd-unit-file-name splunk -create-polkit-rules 1 -user splunk -group splunk
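Whether the caution described in the note that follows will appear can be estimated before running the command. The sketch below is only an illustration: the function names are hypothetical, and it assumes that "polkit version > 105" refers to the number after "0." in the output of pkaction --version (for example 112 for 0.112).

```shell
# Sketch: predict whether the "systemd < 237 and polkit > 105" caution applies.
# All function names here are hypothetical helpers, not part of Splunk.

# Succeeds (exit 0) when the combination from the caution message is present.
# Arguments: systemd version (e.g. 219), polkit version number (e.g. 112).
needs_custom_polkit_rule() {
    [ "$1" -lt 237 ] && [ "$2" -gt 105 ]
}

# Read 'systemctl --version' output on stdin, print the version number.
# The first line looks like "systemd 219".
parse_systemd_version() {
    awk 'NR == 1 { print $2 }'
}

# Read 'pkaction --version' output on stdin, print the number after "0.".
# The line looks like "pkaction version 0.112".
parse_polkit_version() {
    awk 'NR == 1 { sub(/^0\./, "", $3); print $3 }'
}

# Possible usage on the Splunk host:
#   sd=$(systemctl --version | parse_systemd_version)
#   pk=$(pkaction --version | parse_polkit_version)
#   needs_custom_polkit_rule "$sd" "$pk" && echo "expect the polkit caution"
```

If the check succeeds, plan on creating the two files described in the note.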
NOTE: If you get the message “CAUTION: The system has systemd version < 237 and polkit version > 105. With this combination, polkit rule created for this user will enable this user to manage all systemd services. Are you sure you want to continue [y/n]?”, select “y,” then create the following two files and run the chmod command that follows them:
vi /etc/polkit-1/rules.d/10-Splunkd.rules
polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units" &&
        subject.user == "splunk") {
        try {
            polkit.spawn(["/usr/local/bin/polkit_splunk", ""+subject.pid]);
            return polkit.Result.YES;
        } catch (error) {
            return polkit.Result.AUTH_ADMIN;
        }
    }
});
vi /usr/local/bin/polkit_splunk
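#!/bin/bash -x
# Called by the polkit rule with the PID of the requesting process.
# Exits 0 only for "systemctl start|stop|restart" on the Splunk unit.
COMM=($(ps --no-headers -o cmd -p "$1"))
if [[ "${COMM[1]}" == "start" ]] ||
   [[ "${COMM[1]}" == "stop" ]] ||
   [[ "${COMM[1]}" == "restart" ]]; then
    # The unit name must match the -systemd-unit-file-name value used with
    # enable boot-start ("splunk" in this article; Splunk's default is "Splunkd").
    if [[ "${COMM[2]}" == "splunk" ]] ||
       [[ "${COMM[2]}" == "splunk.service" ]]; then
        exit 0
    fi
fi
exit 1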
chmod 755 /usr/local/bin/polkit_splunk
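The decision the helper script makes can be reasoned about without involving polkit at all. The sketch below mirrors its two checks (the action word, then the unit name) against a command line passed as a string. The function name and the UNIT variable are hypothetical, and UNIT must be set to whatever unit name the helper script actually compares against.

```shell
# Sketch: mirror the checks in /usr/local/bin/polkit_splunk for a given
# command line, without calling polkit or ps.
# UNIT is an assumption; set it to the unit name used in the helper script.
UNIT="splunk"

# Argument: a full command line such as "systemctl restart splunk".
# Succeeds (exit 0) when the helper's checks would allow it.
polkit_would_allow() {
    # Split the command line into words: $1=systemctl, $2=action, $3=unit.
    set -- $1
    case "$2" in
        start|stop|restart) ;;
        *) return 1 ;;
    esac
    [ "$3" = "$UNIT" ] || [ "$3" = "$UNIT.service" ]
}

# Possible usage:
#   polkit_would_allow "systemctl restart splunk" && echo "allowed"
```

Anything other than start, stop, or restart on the named unit falls through to the admin-authentication branch of the polkit rule.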
2. Edit the splunk.service file and make the following adjustments:
vi /etc/systemd/system/splunk.service
File created in /etc/systemd/system/splunk.service:
#This unit file replaces the traditional start-up script for systemd
#configurations, and is used when enabling boot-start for Splunk on
#systemd-based Linux distributions.
[Unit]
Description=Systemd service file for Splunk, generated by 'splunk enable boot-start'
After=network.target
[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=600
LimitCORE=0
LimitDATA=infinity
LimitNICE=0
LimitFSIZE=infinity
LimitSIGPENDING=385952
LimitMEMLOCK=65536
LimitRSS=infinity
LimitMSGQUEUE=819200
LimitRTPRIO=0
LimitSTACK=infinity
LimitCPU=infinity
LimitAS=infinity
LimitLOCKS=infinity
LimitNOFILE=1024000
LimitNPROC=512000
TasksMax=infinity
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
User=splunk
Group=splunk
Delegate=true
CPUShares=1024
MemoryLimit=<value>
PermissionsStartOnly=true
ExecStartPost=/bin/bash -c "chown -R splunk:splunk /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R splunk:splunk /sys/fs/cgroup/memory/system.slice/%n"
[Install]
WantedBy=multi-user.target
Change or check the following settings in the splunk.service file:
- Change TimeoutStopSec to 600
- Add all Limit____ lines and TasksMax
- Check User and Group
- Set MemoryLimit to the total system memory available, in bytes
- Check that both “ExecStartPost” chown commands use the right user:group
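The MemoryLimit value can be computed rather than typed by hand. This is a minimal sketch, assuming "total system memory" means the MemTotal figure that /proc/meminfo reports in kB. The function name is hypothetical, and the optional file argument exists only so the function can be tried against a sample file.

```shell
# Sketch: print total system memory in bytes, for use as MemoryLimit.
# Assumes MemTotal in /proc/meminfo (reported in kB) is the figure wanted.
# Optional argument: path to a meminfo-style file (defaults to /proc/meminfo).
mem_total_bytes() {
    # %.0f avoids integer overflow in awk implementations with 32-bit %d.
    awk '/^MemTotal:/ { printf "%.0f\n", $2 * 1024 }' "${1:-/proc/meminfo}"
}

# Possible usage:
#   echo "MemoryLimit=$(mem_total_bytes)"
```

Paste the printed line over the MemoryLimit=<value> placeholder in splunk.service.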
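3. Add a systemd service for disabling THP. Create /etc/systemd/system/disable-thp.service (the file name must match the disable-thp unit that is started and enabled in step 4) with the following contents: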
[Unit]
Description=Disable Transparent Huge Pages (THP)
[Service]
Type=simple
ExecStart=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"
[Install]
WantedBy=multi-user.target
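Once the unit has run, the kernel's THP files show the value in effect inside square brackets, for example "always madvise [never]". The sketch below extracts that bracketed value; the function name is hypothetical.

```shell
# Sketch: read a THP sysfs line on stdin and print the active setting,
# which the kernel marks with square brackets.
thp_active_value() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Possible usage on the Splunk host:
#   thp_active_value < /sys/kernel/mm/transparent_hugepage/enabled
#   thp_active_value < /sys/kernel/mm/transparent_hugepage/defrag
```

Both files should report never once the disable-thp service has run.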
4. Enable the whole thing and reboot:
sudo systemctl daemon-reload
sudo systemctl start disable-thp
sudo systemctl enable disable-thp
sudo systemctl start splunk
sudo systemctl enable splunk
sudo shutdown -r now
5. Check your work to ensure that (a) Splunk was started under systemd and (b) transparent huge pages are disabled and ulimits are set according to the values defined in the systemd unit file.
$ ps -ef|grep splunkd
splunk 3848 1 3 12:10 ? 00:00:04 splunkd --under-systemd --systemd-delegate=yes -p 8089 _internal_launch_under_systemd
splunk 4731 3848 0 12:10 ? 00:00:00 [splunkd pid=3848] splunkd --under-systemd --systemd-delegate=yes -p 8089 _internal_launch_under_systemd [process-runner]
splunk 4931 4731 0 12:10 ? 00:00:00 /opt/splunk/bin/splunkd instrument-resource-usage -p 8089 --with-kvstore
splunk 5504 5442 0 12:12 pts/0 00:00:00 grep --color=auto splunkd
$ splunk status
splunkd is running (PID: 3848).
splunk helpers are running (PIDs: 4731 4749 4853 4931).
$ grep ulimit /opt/splunk/var/log/splunk/splunkd.log
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: virtual address space size: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: data segment size: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: resident memory size: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: stack size: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: core file size: 0 bytes
06-01-2021 12:05:17.513 +0000 WARN ulimit - Core file generation disabled.
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: data file size: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: open files: 1024000 files
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: user processes: 512000 processes
06-01-2021 12:05:17.513 +0000 INFO ulimit - Limit: cpu time: unlimited
06-01-2021 12:05:17.513 +0000 INFO ulimit - Linux transparent hugepage support, enabled="never" defrag="never"
06-01-2021 12:05:17.513 +0000 INFO ulimit - Linux vm.overcommit setting, value="0"
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: virtual address space size: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: data segment size: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: resident memory size: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: stack size: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: core file size: 0 bytes
06-01-2021 12:10:55.997 +0000 WARN ulimit - Core file generation disabled.
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: data file size: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: open files: 1024000 files
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: user processes: 512000 processes
06-01-2021 12:10:55.997 +0000 INFO ulimit - Limit: cpu time: unlimited
06-01-2021 12:10:55.997 +0000 INFO ulimit - Linux transparent hugepage support, enabled="never" defrag="never"
06-01-2021 12:10:55.997 +0000 INFO ulimit - Linux vm.overcommit setting, value="0"
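Comparing the log against the unit file by eye gets tedious across many hosts. As a rough sketch, the two hypothetical helpers below pull the most recent "open files" value out of splunkd.log lines and the LimitNOFILE value out of unit-file text so the two can be compared. They assume the log line format shown above.

```shell
# Sketch: compare the open-files limit Splunk logged with the unit file.
# Both function names are hypothetical helpers.

# Read splunkd.log lines on stdin; print the most recent "open files" value.
logged_open_files() {
    awk '/ulimit - Limit: open files:/ { value = $(NF - 1) }
         END { if (value != "") print value }'
}

# Read unit-file text on stdin; print the LimitNOFILE value.
unit_nofile() {
    sed -n 's/^LimitNOFILE=//p'
}

# Possible usage on the Splunk host:
#   a=$(logged_open_files < /opt/splunk/var/log/splunk/splunkd.log)
#   b=$(unit_nofile < /etc/systemd/system/splunk.service)
#   [ "$a" = "$b" ] && echo "open files limit matches" || echo "mismatch: $a vs $b"
```

The same pattern extends to the other limits, such as "user processes" against LimitNPROC.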