Escaping special characters in Elasticsearch
Yes, those characters will need to be escaped within content you want to search in a query_string query. To do that (assuming you are using PyLucene), you should be able to use QueryParserBase.escape(String).
Barring that, you could always adapt the QueryParserBase.escape source code to your needs:
public static String escape(String s) {
  StringBuilder sb = new StringBuilder();
  for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    // These characters are part of the query syntax and must be escaped
    if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')'
        || c == ':' || c == '^' || c == '[' || c == ']' || c == '\"'
        || c == '{' || c == '}' || c == '~' || c == '*' || c == '?'
        || c == '|' || c == '&' || c == '/') {
      sb.append('\\');
    }
    sb.append(c);
  }
  return sb.toString();
}
I adapted this code I found there:
escapeRules = {'+': r'\+',
               '-': r'\-',
               '&': r'\&',
               '|': r'\|',
               '!': r'\!',
               '(': r'\(',
               ')': r'\)',
               '{': r'\{',
               '}': r'\}',
               '[': r'\[',
               ']': r'\]',
               '^': r'\^',
               '~': r'\~',
               '*': r'\*',
               '?': r'\?',
               ':': r'\:',
               '"': r'\"',
               '\\': r'\\',
               '/': r'\/',
               # '<' and '>' cannot be escaped in a query_string query,
               # so they are replaced with a space instead
               '>': r' ',
               '<': r' '}

def escapedSeq(term):
    """Yield each character of term, or its escaped version if it is special."""
    for char in term:
        if char in escapeRules:
            yield escapeRules[char]
        else:
            yield char

def escapeESArg(term):
    """Apply escaping to the passed-in query terms, escaping special characters like ':'."""
    # Backslashes are escaped through escapeRules, in the same single pass
    # as every other special character
    return "".join(escapedSeq(term))
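As a rough sketch of how an escaped term might then be used, the snippet below applies a similar mapping with str.translate and places the result in a query_string request body. The names LUCENE_ESCAPES, escape_term and build_query, and the field name "title", are made up for illustration; the body is only built as a dict here and is not sent to a cluster.

```python
import json

# Characters escaped with a backslash, mirroring the escapeRules mapping above.
_RESERVED = '+-&|!(){}[]^~*?:"\\/'

# Translation table: reserved characters get a backslash prefix,
# '<' and '>' are replaced by a space (as in the mapping above).
LUCENE_ESCAPES = str.maketrans(
    {**{ch: '\\' + ch for ch in _RESERVED}, '<': ' ', '>': ' '}
)


def escape_term(term):
    """Return term with query_string special characters escaped."""
    return term.translate(LUCENE_ESCAPES)


def build_query(term, field="title"):
    """Build a query_string request body (the field name is hypothetical)."""
    return {
        "query": {
            "query_string": {
                "query": escape_term(term),
                "default_field": field,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("foo:bar (1+1)"), indent=2))
```

Using a translation table keeps the escaping to a single pass over the string, so a backslash in the input is never escaped twice.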
To answer the question directly, below is a cleaner Python solution using re.sub:
import re

KIBANA_SPECIAL = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'.split(' ')
re.sub('([{}])'.format('\\'.join(KIBANA_SPECIAL)), r'\\\1', val)
However, a better solution is to properly encode the problematic characters before they are sent to Elasticsearch:
import six.moves.urllib as urllib

urllib.parse.quote_plus(val)
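For reference, here is a minimal Python 3 sketch of the same idea using only the standard library (urllib.parse rather than six). The helper name encode_for_url and the sample value are illustrative. Note that quote_plus percent-encodes the value for transport in a URL; it does not add the backslash escapes used by the query syntax.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_for_url(val):
    """Percent-encode val so it can be placed in a URL query parameter."""
    return quote_plus(val)


encoded = encode_for_url('foo:bar (1+1)')
print(encoded)  # foo%3Abar+%281%2B1%29

# Decoding round-trips to the original value.
assert unquote_plus(encoded) == 'foo:bar (1+1)'
```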