Regex v. Rex Commands in Splunk SPL
by Alex Trejo, Splunk Consultant
A regular expression is used to capture a pattern of characters in text. This can be very useful when either filtering data or extracting new fields in Splunk. Splunk provides two SPL commands for regular expressions: ‘regex’ and ‘rex’. Both are regular expression commands, but they are used in different ways. When used together, they let us make the most of regular expressions with just the search bar in Splunk.
Regex
regex (<field>=<regular expression> | <field>!=<regular expression> | <regular expression>)
The regex command is primarily used for filtering data. It can keep either the events that match the regular expression or the events that do not match it. The regex command can be applied to a specified field; if no field is specified, it is applied to _raw.
Rex
rex [field=<field>] (<regular expression> | mode=sed <sed expression>)
The rex command can be used for search-time field extractions and string replacement. The rex command can be applied to a specified field; if no field is specified, it is applied to _raw.
Here we have a set of customer transactions with no predefined field extractions. We will be responsible for pulling in customer IDs, names, transactions, IPs, and statuses.
To begin, we will use the rex command to extract the fields we are looking for. Since we do not have any predefined fields to work from, we will run the rex command against the default _raw field, which means we do not need to specify the field option in the command.
To extract the CustomerId field we will use the following rex command:
rex "CustomerId(?<CustomerId>\w+_\d+)"
We will follow the same format to extract the transaction, IP, and STATUS fields.
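To make the extraction step concrete outside Splunk, here is a minimal Python sketch of the same idea. The sample event and its key=value layout are assumptions made for illustration, since the article's data is not reproduced here, and the patterns are adapted to that assumed layout. Note that Python spells named groups as (?P<name>...), while rex accepts (?<name>...).

```python
import re

# Hypothetical raw event; the key=value layout is an assumption for illustration.
raw = "CustomerId=alex_1001 transaction=purchase IP=192.168.1.10 STATUS=complete"

# One pattern per field, mirroring one rex command per field.
# Python uses (?P<name>...) for named groups; Splunk's rex accepts (?<name>...).
patterns = [
    r"CustomerId=(?P<CustomerId>\w+_\d+)",
    r"transaction=(?P<transaction>\w+)",
    r"IP=(?P<IP>\d+\.\d+\.\d+\.\d+)",
    r"STATUS=(?P<STATUS>\w+)",
]


def extract_fields(event, patterns):
    """Apply each pattern in turn, like chaining several rex commands."""
    fields = {}
    for pattern in patterns:
        match = re.search(pattern, event)  # unanchored search, as in rex
        if match:
            fields.update(match.groupdict())
    return fields


fields = extract_fields(raw, patterns)
print(fields)
```

Each named capture group becomes a field name, which is the same role the group name plays in the rex command.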
Next, we will extract the customers’ names. Since the CustomerId field already contains the customers’ names, we will run the rex command against that field instead of the default _raw.
To extract the CustomerName field we will use the following rex command:
rex field=CustomerId "(?<CustomerName>\w+)_\d+"
We will follow the same format for the CustomerNum field.
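The effect of running the pattern against a single field value can be sketched in Python as follows. The CustomerId value is hypothetical, and the CustomerNum pattern is an assumed counterpart of the CustomerName pattern (the capture group moved to the digits), not one taken from the article.

```python
import re

customer_id = "alex_1001"  # hypothetical CustomerId value

# Counterpart of: rex field=CustomerId "(?<CustomerName>\w+)_\d+"
name_match = re.search(r"(?P<CustomerName>\w+)_\d+", customer_id)

# Assumed pattern for CustomerNum: same shape, capture group on the digits.
num_match = re.search(r"\w+_(?P<CustomerNum>\d+)", customer_id)

customer_name = name_match.group("CustomerName")
customer_num = num_match.group("CustomerNum")
print(customer_name, customer_num)
```

Because \w+ is greedy and also matches underscores, a name that itself contains an underscore would be captured whole, up to the last underscore that is followed by digits.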
Next, we will display the fields in a table for better visibility.
Now, we will use the regex command to filter out IP addresses whose first octet has 3 digits. Since we are targeting the IP field here, we have to specify that field in the command to override the default _raw field.
regex IP!="\d{3}\.\d+\.\d+\.\d+"
If we wanted to filter for the IPs that began with 3 digits, we would simply remove the ‘!’ before the ‘=’ in the command.
regex IP="\d{3}\.\d+\.\d+\.\d+"
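The difference between the two forms can be sketched in Python with an unanchored search over a list of hypothetical IP values. The list is made up for illustration, and the list comprehensions only approximate how the regex command keeps or drops events.

```python
import re

# Hypothetical values of the IP field, one per event.
ips = ["192.168.1.10", "10.0.0.5", "172.16.4.20", "8.8.8.8"]

# Same pattern as in the regex command; it is unanchored, so search() is used.
pattern = re.compile(r"\d{3}\.\d+\.\d+\.\d+")

# Like IP=<pattern>: keep only the values that match.
matching = [ip for ip in ips if pattern.search(ip)]

# Like IP!=<pattern>: keep only the values that do not match.
non_matching = [ip for ip in ips if not pattern.search(ip)]

print(matching)
print(non_matching)
```

For a well-formed four-octet address, the pattern can only match when it starts in the first octet, which is why it singles out addresses with a three-digit first octet even without a ^ anchor.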
Another interesting feature of the rex command that we have not touched on yet is its ability to replace strings. With the mode=sed option, the command replaces strings within a field using a sed-style expression. Note that this changes only the values returned by the search, not the indexed data. Here we will use the command to mask the beginning of IP addresses that begin with 3 digits.
rex field=IP mode=sed "s/^\d{3}\./x./"
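A sed-style substitution maps closely onto Python's re.sub, so the masking can be sketched offline as below. The IP values are hypothetical. The sketch contrasts an anchored, single substitution with an unanchored, global one (the effect of the sed g flag), because the two produce different results on addresses that contain more than one three-digit octet.

```python
import re

# Hypothetical values of the IP field.
ips = ["192.168.1.10", "10.0.0.5", "172.16.254.1"]

# Anchored, single substitution: only a leading three-digit octet is masked.
masked_leading = [re.sub(r"^\d{3}\.", "x.", ip, count=1) for ip in ips]

# Unanchored, global substitution (like the sed g flag):
# every three-digit octet that is followed by a dot is masked.
masked_all = [re.sub(r"\d{3}\.", "x.", ip) for ip in ips]

print(masked_leading)
print(masked_all)
```

If only the start of the address should be masked, the anchored form is the safer choice; the global form also rewrites three-digit octets in the middle of the address.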
There you have it: a short comparison of Regex vs. Rex in Splunk SPL. I hope this was helpful.