How to prevent a JSON parser from crashing when there are illegal characters in the JSON?


It is not a bug in the parser. The parser verifies that the trailing characters before the null terminator are whitespace, and it returns an error code when an error occurs. But if there is no null terminator, it may cause a segmentation fault, similar to strlen().

Newer versions of RapidJSON provide a kParseStopWhenDoneFlag. When it is enabled, the parser stops after a complete JSON value and does not read the trailing characters. For example:

    Document d;
    // The characters after the closing brace stand in for junk left in the buffer.
    const char* s = "{\"messageType\" : \"Test1\", \"from\" : \"F2D0B5C6-9875-46B5-8D4F\"}����1";
    d.Parse<kParseStopWhenDoneFlag>(s);
    assert(!d.HasParseError());

With this flag, the parser stops after reading the closing } and does not report an error.

It is not yet documented in the guide. Please refer to the discussion in https://github.com/miloyip/rapidjson/pull/83


I think you should consider rolling your own pre-processing function that goes through every character in the JSON string, searches for characters that are not part of your legal set, and either removes them or replaces them with whitespace. Then pass the repaired string on to RapidJSON.
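A pre-processing pass like that might look roughly like the sketch below. ReplaceIllegalChars is a hypothetical name. The "legal set" here is only an assumption: printable ASCII plus tab, line feed, and carriage return. Adjust it to whatever your data really allows.

```cpp
#include <string>

// Hypothetical helper. Assumption: the legal set is printable ASCII
// (0x20-0x7E) plus tab, line feed, and carriage return. Every other byte
// is replaced with a space so the length of the string does not change.
std::string ReplaceIllegalChars(const std::string& input) {
    std::string output = input;
    for (char& c : output) {
        const unsigned char u = static_cast<unsigned char>(c);
        const bool printable = (u >= 0x20 && u <= 0x7E);
        const bool allowedControl = (c == '\t' || c == '\n' || c == '\r');
        if (!printable && !allowedControl) {
            c = ' ';
        }
    }
    return output;
}
```

Note that with this particular legal set, non-ASCII bytes inside string values are blanked too, so legitimate UTF-8 text would be damaged. Printable junk after the closing brace is left in place and would presumably still be rejected by the parser.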

It's probably better to detect the comms problem in the first place, since the JSON may then be incomplete and/or incorrect, and to throw away and retry the entire session rather than 'patching up' the data as you want to here. Patching solves your short-term problem (the program crashing) but could easily generate data inconsistencies and other more subtle, difficult-to-diagnose problems.

Also, if you are seeing mostly bad data at the end of a string like this, I think you should check carefully that your issue is actually with the comms. The case you give here looks more like a string buffer that was not terminated correctly and has additional junk (uninitialized memory) after the end of the string. Perhaps you expected C++ to clear (set to zero) an allocated buffer?
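If the buffer theory is right, the fix is to terminate the buffer yourself after each read instead of relying on its initial contents. A minimal sketch follows. FakeReceive is a hypothetical stand-in for whatever receive call you actually use, and ReceiveTerminated is likewise an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a receive call: copies up to `capacity` bytes of
// `payload` into `dest` and returns the byte count. It never writes a '\0'.
std::size_t FakeReceive(char* dest, std::size_t capacity,
                        const char* payload, std::size_t payloadLength) {
    const std::size_t n = std::min(capacity, payloadLength);
    std::memcpy(dest, payload, n);
    return n;
}

// Receives into a buffer with one spare byte and terminates it explicitly
// at the number of bytes actually received.
std::vector<char> ReceiveTerminated(const char* payload,
                                    std::size_t payloadLength,
                                    std::size_t capacity) {
    std::vector<char> buffer(capacity + 1);  // std::vector zero-fills its elements
    const std::size_t n =
        FakeReceive(buffer.data(), capacity, payload, payloadLength);
    buffer[n] = '\0';      // explicit terminator, independent of initial contents
    buffer.resize(n + 1);  // keep only the payload and its terminator
    return buffer;
}
```

A plain `new char[n]` or a local `char buf[n]` is not zeroed for you, which is why the explicit terminator matters even when the buffer happens to look clean in testing.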


File a bug report. A JSON parser should accept whatever input you throw at it and return an appropriate error message. If it crashes, that sounds like a vulnerability that could allow your application to be attacked by hackers. Probably best to find a different parser.

JSON data should never be modified by the receiver to make it work. It should be taken as it is, and if it isn't acceptable data, then it should be refused. If there are "communication errors" that are due to bugs in your code, fix the bug. If they are due to server bugs, complain to whoever wrote the server code. If they are genuine transmission errors, how do you know that you don't have changes that keep the JSON valid, like a payment amount changed from $100 to $900?