A Quick Way to Find Substrings in Strings
By Jon Walthour, Senior Technical Architect
Back when I was an Oracle database administrator, one function I often used was INSTR(). In Oracle SQL, INSTR() looks for a substring inside a larger string and returns the position of that substring. It also let me search for the second, third or fourth occurrence of that substring. There are several techniques you can use in SPL to find a substring in a larger string, including match() and like(). I sometimes find these functions insufficient when, say, I want to know whether a string contains multiple occurrences of a substring. A quick solution I often use is to leverage Splunk’s multivalue functions.
The first is split(), which takes a string and a delimiter and creates a multivalue field from it. The second is mvcount(), which takes a multivalue field and counts the number of values in it. Taken together, if you split a string by the substring you are looking for, the count of values in the resulting multivalue field is 2 if the substring was found once and 1 if it wasn’t found. Likewise, if the substring is found twice, the result of mvcount() is 3.
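As a minimal sketch of the idea above, the search below builds one test event with makeresults and splits it on a substring. The sample string, the substring " and ", and the field names text, parts, n and found are all made up for illustration; in a real search you would use your own field in place of text.

```spl
| makeresults
| eval text="one and two and three"
| eval parts=split(text, " and ")
| eval n=mvcount(parts)
| eval found=if(n > 1, "yes", "no")
| table text parts n found
```

For this sample, the substring occurs twice, so n should come back as 3 and found as "yes". A string without the substring is not split at all, so n would be 1.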

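This is where using split() and mvcount() together can work better than match() or like(). What if I wanted to find the portion of the string after the second match? With this technique, that’s easy: split the string on the substring, take the values from index 2 onward with mvindex(), and rejoin them with mvjoin().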
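One way to pull out everything after the second match might look like the sketch below. The sample string and the field names text, parts and after_second are hypothetical. mvindex() uses zero-based indexes, so index 2 is the first piece after the second match, and -1 means the last value.

```spl
| makeresults
| eval text="one and two and three and four"
| eval parts=split(text, " and ")
| eval after_second=if(mvcount(parts) > 2, mvjoin(mvindex(parts, 2, -1), " and "), null())
| table text after_second
```

Rejoining with mvjoin() puts back any later occurrences of the substring, so for this sample after_second should be "three and four". If you only want the piece between the second and third match, mvindex(parts, 2) on its own is enough.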

To get a count of how many times a substring occurs in a string, you’d just subtract 1 from the mvcount() of the split() result.
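A short sketch of that count, again using an invented sample string and the hypothetical field names text and occurrences:

```spl
| makeresults
| eval text="one and two and three"
| eval occurrences=mvcount(split(text, " and ")) - 1
| table text occurrences
```

Here the split produces three pieces, so occurrences should be 2. The same expression also works in a filter, for example `| where mvcount(split(text, " and ")) - 1 >= 2` to keep only events where the substring appears at least twice.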

Note that this method of using split() and mvcount() works as fast as or faster than match(), and definitely faster than like().
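So, all together, this method comes down to splitting the string on the substring once and then deriving what you need from the resulting multivalue field: whether it matched, how many times, and what follows a given match.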
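Put together, the pieces could be combined in a single search along these lines. This is a sketch under assumptions, not a definitive implementation: the sample string and every field name (text, parts, occurrences, found, after_second) are placeholders for illustration.

```spl
| makeresults
| eval text="one and two and three"
| eval parts=split(text, " and ")
| eval occurrences=mvcount(parts) - 1
| eval found=if(occurrences > 0, "yes", "no")
| eval after_second=if(occurrences >= 2, mvjoin(mvindex(parts, 2, -1), " and "), null())
| table text found occurrences after_second
```

Splitting once into parts and reusing that field keeps the later eval statements short, and the if() guard leaves after_second empty when there is no second match.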

Finally, to show how all of the INSTR() features can be realized with this technique, you can also return the position of the matched substring: the position of the nth match is one more than the length of everything that comes before it.
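A possible way to compute that position is sketched below. The sample string, the occurrence number n and the field names parts and position are hypothetical. The search rejoins the first n pieces, measures their length with len(), and adds 1. Returning 0 when there is no nth match is my own choice here, made to mirror what INSTR() does.

```spl
| makeresults
| eval text="one and two and three", n=2
| eval parts=split(text, " and ")
| eval position=if(mvcount(parts) > n, len(mvjoin(mvindex(parts, 0, n - 1), " and ")) + 1, 0)
| table text n position
```

For this sample, the text before the second match is "one and two", which is 11 characters long, so position should be 12. With n=1 it should be 4.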

I have endeavored to demonstrate how you can quickly and easily find substrings in strings in the context of “eval” and other evaluation functions, using a method that offers more flexibility than the functions traditionally used for this, match() and like(), and performs as fast as or faster than they do.