Kibana using regex doesn't work as expected
Tested on version 6.2.4
Added the below index with mapping as shown below
PUT test{ "mappings": { "_doc": { "properties": { "message": { "type": "text" }, "message2": { "type": "keyword" } } } }}
Added 2 documents to the index as below
PUT test/_doc/1?refresh{ "message": "hellothere", "message2":"2018-10-20 23:10:21,068 GMT*8*PEGA0001*8087*1000*8ce767fc2b32*NA*NA*HKVZWM7PHSLMGR3ZXP4OEKEBG3DFFS30K*Test.User*Case-CAS-FS-Work-Svc*Solution:01.03.01*00cb8b6febb234d359369e54a60a865f*Y*3*HKVZWM7PHSLMGR3ZXP4OEKEBG3DFFS30K*35*http-apr-8080-exec-26*STANDARD*com.pega.pegarules.session.internal.engineinterface.service.HttpAPI*192.168.99.100|192.168.99.1*Activity=Pega-UI-CommandPalette.pzGetPaletteOptions*Rule-Obj-Activity:pzGetPaletteOptions*PEGA-UI-COMMANDPALETTE PZGETPALETTEOPTIONS #20161123T194957.445 GMT Step: 2 Circum: 0*NA*****pxRDBIOElapsed=0.03;pxRDBIOCount=4;pxRunStreamCount=811;pxTotalReqCPU=2.81;pxRunModelCount=270;pxOutputBytes=584,268;pxRunWhenCount=1,904;pxDeclarativePageLoadElapsed=6.84;pxRulesExecuted=3,471;pxOtherCount=314;pxDBInputBytes=3,553,909;pxTotalReqTime=8.09;pxActivityCount=967;pxAlertCount=1;pxOtherFromCacheCount=66;pxInteractions=1;pxLegacyRuleAPIUsedCount=1;pxRuleCount=254;pxInputBytes=101;pxRuleIOElapsed=0.09;pxRulesUsed=4,262;pxDeclarativePageLoadCount=6;pxRuleFromCacheCount=254;pxOtherIOElapsed=0.99;pxTrackedPropertyChangesCount=106;pxOtherIOCount=255;*NA*NA*NA*NA*NA*pyActivity=Pega-UI-CommandPalette.pzGetPaletteOptions;primaryPageClass=Data-Portal-DesignerStudio;*HTTP interaction has exceeded the elapsed time alert threshold of 1000 ms: 8088 ms.*" } PUT test/_doc/2?refresh{ "message": "2018-10-20 23:10:21,068 GMT*8*PEGA0001*8087*1000*8ce767fc2b32*NA*NA*HKVZWM7PHSLMGR3ZXP4OEKEBG3DFFS30K*Test.User*Case-CAS-FS-Work-Svc*Solution:01.03.01*00cb8b6febb234d359369e54a60a865f*Y*3*HKVZWM7PHSLMGR3ZXP4OEKEBG3DFFS30K*35*http-apr-8080-exec-26*STANDARD*com.pega.pegarules.session.internal.engineinterface.service.HttpAPI*192.168.99.100|192.168.99.1*Activity=Pega-UI-CommandPalette.pzGetPaletteOptions*Rule-Obj-Activity:pzGetPaletteOptions*PEGA-UI-COMMANDPALETTE PZGETPALETTEOPTIONS #20161123T194957.445 GMT Step: 2 Circum: 0*NA*****pxRDBIOElapsed=0.03;pxRDBIOCount=4;pxRunStreamCount=811;pxTotalReqCPU=2.81;pxRunModelCount=270;pxOutputBytes=584,268;pxRunWhenCount=1,904;pxDeclarativePageLoadElapsed=6.84;pxRulesExecuted=3,471;pxOtherCount=314;pxDBInputBytes=3,553,909;pxTotalReqTime=8.09;pxActivityCount=967;pxAlertCount=1;pxOtherFromCacheCount=66;pxInteractions=1;pxLegacyRuleAPIUsedCount=1;pxRuleCount=254;pxInputBytes=101;pxRuleIOElapsed=0.09;pxRulesUsed=4,262;pxDeclarativePageLoadCount=6;pxRuleFromCacheCount=254;pxOtherIOElapsed=0.99;pxTrackedPropertyChangesCount=106;pxOtherIOCount=255;*NA*NA*NA*NA*NA*pyActivity=Pega-UI-CommandPalette.pzGetPaletteOptions;primaryPageClass=Data-Portal-DesignerStudio;*HTTP interaction has exceeded the elapsed time alert threshold of 1000 ms: 8088 ms.*", "message2":"hellothere" }
Search for message2: /192.168.99.[0-9]{3}/
results in 0 results
Search for message: /192.168.99.[0-9]{3}/
results in doc#2
Search for message2: /.*192.168.99.[0-9]{3}.*/
results in doc#1
Search for message: /pegarules.session/
results in 0 results.
But search for message: /.*pegarules.session.*/
results in doc#1because inverted index has "token": "com.pega.pegarules.session.internal.engineinterface.service.httpapi"
Search for message2: /.*pegarules.session.*/
results in doc#1`
So, the message filed (type text
) is tokenized and regex search for wild card token patterns is returning results.
Where as, the message2 field (type keyword
) is not analyzed and is put in to the inverted index as is. Regex search for a pattern like 192.168.99.[0-9]{3}
is not returning anything unless we add greedy quantifier (.*)
The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators so it may not work and match results like regular regex.