
Escaping special characters in elasticsearch


Yes, those characters will need to be escaped in any content you want to search for with a query_string query. To do that (assuming you are using PyLucene), you should be able to use QueryParserBase.escape(String).

Barring that, you could always adapt the QueryParserBase.escape source code to your needs:

public static String escape(String s) {
  StringBuilder sb = new StringBuilder();
  for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    // These characters are part of the query syntax and must be escaped
    if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')' || c == ':'
      || c == '^' || c == '[' || c == ']' || c == '\"' || c == '{' || c == '}' || c == '~'
      || c == '*' || c == '?' || c == '|' || c == '&' || c == '/') {
      sb.append('\\');
    }
    sb.append(c);
  }
  return sb.toString();
}


I adapted this code I found there:

escapeRules = {'+': r'\+',
               '-': r'\-',
               '&': r'\&',
               '|': r'\|',
               '!': r'\!',
               '(': r'\(',
               ')': r'\)',
               '{': r'\{',
               '}': r'\}',
               '[': r'\[',
               ']': r'\]',
               '^': r'\^',
               '~': r'\~',
               '*': r'\*',
               '?': r'\?',
               ':': r'\:',
               '"': r'\"',
               '\\': r'\\',
               '/': r'\/',
               '>': r' ',
               '<': r' '}

def escapedSeq(term):
    """ Yield the next string based on the
        next character (either this char
        or its escaped version) """
    for char in term:
        if char in escapeRules:
            yield escapeRules[char]
        else:
            yield char

def escapeESArg(term):
    """ Apply escaping to the passed in query terms,
        escaping special characters like : , etc. """
    # escapeRules already maps \ to \\, so no separate pre-pass is needed;
    # replacing \ first and then applying the rules would escape it twice.
    return "".join(escapedSeq(term))
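To illustrate how the escaped term might then be used, here is a minimal sketch that builds a query_string request body. The names escape_term and build_query and the default field "title" are hypothetical and not from the code above; the character set simply mirrors the rules above, with < and > replaced by a space.

```python
# Characters that get a backslash prefix (same set as the rules above).
SPECIAL = '+-&|!(){}[]^~*?:"\\/'

# Translation table: special char -> backslash + char.
_TABLE = str.maketrans({c: "\\" + c for c in SPECIAL})
# '<' and '>' are replaced by a space instead of being escaped.
_TABLE.update({ord("<"): " ", ord(">"): " "})


def escape_term(term):
    """Return term with query_string special characters escaped."""
    return term.translate(_TABLE)


def build_query(term, field="title"):
    """Build a query_string request body for a user-supplied term.

    The field name is an assumption made for illustration only.
    """
    return {
        "query": {
            "query_string": {
                "query": escape_term(term),
                "default_field": field,
            }
        }
    }
```

The resulting dict can be serialized with json.dumps and sent as the body of a search request.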


To answer the question directly, below is a cleaner Python solution using re.sub:

import re

KIBANA_SPECIAL = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'.split(' ')
re.sub('([{}])'.format('\\'.join(KIBANA_SPECIAL)), r'\\\1', val)
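Since val is not defined in the snippet above, the same idea can be wrapped in a function. This is only a sketch: escape_query is a hypothetical name, and the character class is built with re.escape rather than by joining on a backslash, which should match the same characters.

```python
import re

KIBANA_SPECIAL = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'.split(' ')

# Character class matching any one of the special characters.
_PATTERN = re.compile('[{}]'.format(''.join(re.escape(c) for c in KIBANA_SPECIAL)))


def escape_query(val):
    """Prefix every special character in val with a backslash."""
    return _PATTERN.sub(lambda m: '\\' + m.group(0), val)
```

Note that this list, like the original, does not include the forward slash, so add it if your queries may contain regular-expression delimiters.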

However, a better solution is to properly encode the problematic characters before they are sent to Elasticsearch:

import six.moves.urllib as urllib

urllib.parse.quote_plus(val)
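On Python 3 the same call is available from the standard library without six. The sketch below assumes the query is sent through the q parameter of a URI search; search_path and the index name are hypothetical. Keep in mind that quote_plus only percent-encodes the value for transport in a URL; it does not add query_string backslash escapes, so the two steps are complementary.

```python
from urllib.parse import quote_plus


def search_path(index, query):
    """Return a URI-search path with the query percent-encoded."""
    return "/{}/_search?q={}".format(index, quote_plus(query))
```

For example, search_path("logs", "user:a b") percent-encodes the colon and turns the space into a plus sign.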