Use JSON Input step to process uneven data


What I have done is use a JSON Input step with the path $.address[*] to read the full map of each element into a jsonRow field, e.g.:

{"address":[
    {"AddressId":"1_1","Street":"A Street"},
    {"AddressId":"1_101","Street":"Another Street"},
    {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
    {"AddressId":"1_102","Locality":"New York"}
]}

This results in 4 jsonRows, one for each element, e.g. jsonRow = {"AddressId":"1_101","Street":"Another Street"}. Then, in a JavaScript step, I map my values like this:

var AddressId = getFromMap('AddressId', jsonRow);
var Street = getFromMap('Street', jsonRow);
var Locality = getFromMap('Locality', jsonRow);

In a second script tab I inserted minified JSON parse code from https://github.com/douglascrockford/JSON-js and the getFromMap function:

function getFromMap(key, jsonRow) {
  var map;
  try {
    map = JSON.parse(jsonRow);
  } catch (e) {
    // Send the row to the step's error handling and skip it.
    var message = "Unparsable JSON: " + jsonRow + " Desc: " + e.message;
    var nr_errors = 1;
    var field = "jsonRow";
    var errcode = "JSON_PARSE";
    _step_.putError(getInputRowMeta(), row, nr_errors, message, field, errcode);
    trans_Status = SKIP_TRANSFORMATION;
    return null;
  }
  // A key that is absent from this element becomes a null field value.
  if (map[key] == undefined) {
    return null;
  }
  trans_Status = CONTINUE_TRANSFORMATION;
  return map[key];
}
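To try the lookup logic outside Pentaho, here is a minimal sketch in plain JavaScript. It assumes a runtime with a native JSON.parse (so the JSON-js code is not needed) and simply returns null where the real step would report an error. The name getFromMapPlain is hypothetical and not part of the original script.

```javascript
// Hypothetical stand-alone variant of getFromMap: no Kettle objects involved.
function getFromMapPlain(key, jsonRow) {
  var map;
  try {
    map = JSON.parse(jsonRow);
  } catch (e) {
    // In the real step the row would be routed to error handling instead.
    return null;
  }
  // Missing keys (and a null parse result) become null instead of undefined.
  if (map === null || map[key] === undefined) {
    return null;
  }
  return map[key];
}

var jsonRow = '{"AddressId":"1_101","Street":"Another Street"}';
console.log(getFromMapPlain('Street', jsonRow));   // Another Street
console.log(getFromMapPlain('Locality', jsonRow)); // null
```

The point is only that a key missing from one element comes back as null, which is what lets every row keep the same three fields.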


You can solve this by changing the JSONPath and splitting the work across two JSON Input steps. The following website explains a lot about JSONPath: http://goessner.net/articles/JsonPath/

$..AddressId

This path does in fact return all the AddressId values in the address array. BUT Pentaho uses grid rows for input and output (here 4 rows x 3 columns), so it can't handle a missing value when one path returns all the Streets (3 rows) and another returns all the Localities (2 rows). There are simply no null values in the array itself to fill the gaps, just as you can't drive out of your garage with 3 wheels on your car instead of the usual 4.
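The mismatch can be illustrated in plain JavaScript without any JSONPath library. Collecting one field across the whole array, which is roughly what a recursive path such as $..Street does on this flat sample, gives lists of different lengths. The helper name collect is hypothetical, and this is only an emulation, not Pentaho's JSONPath engine.

```javascript
var doc = {"address": [
  {"AddressId": "1_1", "Street": "A Street"},
  {"AddressId": "1_101", "Street": "Another Street"},
  {"AddressId": "1_102", "Street": "One more street", "Locality": "Buenos Aires"},
  {"AddressId": "1_102", "Locality": "New York"}
]};

// Hypothetical helper: gather the values of one key, skipping elements that lack it.
function collect(key) {
  return doc.address
    .filter(function (a) { return Object.prototype.hasOwnProperty.call(a, key); })
    .map(function (a) { return a[key]; });
}

var ids = collect('AddressId');
var streets = collect('Street');
var localities = collect('Locality');
console.log(ids.length, streets.length, localities.length); // 4 3 2
```

With 4, 3 and 2 values there is no way to tell which Street or Locality belongs to which AddressId, so the three lists cannot be laid side by side as rows.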

I guess your script returns null values (shown as X below) like this:

A S X
A S X
A S L
A X L

The scripting step can be avoided altogether by changing the Fields path of the first JSON Input step to:

$.address[*]

This retrieves all 4 address lines. Then create a second JSON Input step that reads from the new source field, which contains one address line per row, to retrieve the address details of each line:

$.AddressId
$.Street
$.Locality

This yields null values on the four address lines wherever an address detail is not available in that line.
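The two-step idea can be sketched in plain JavaScript as follows. This is only an emulation under the assumption that each step behaves as described above: the first stage stands in for $.address[*], the second for the three per-line paths. The variable names are hypothetical.

```javascript
var source = '{"address":[' +
  '{"AddressId":"1_1","Street":"A Street"},' +
  '{"AddressId":"1_101","Street":"Another Street"},' +
  '{"AddressId":"1_102","Street":"One more street","Locality":"Buenos Aires"},' +
  '{"AddressId":"1_102","Locality":"New York"}]}';

// Stage 1 (stands in for $.address[*]): one JSON string per address line.
var addressLines = JSON.parse(source).address.map(function (element) {
  return JSON.stringify(element);
});

// Stage 2 (stands in for $.AddressId, $.Street, $.Locality on each line):
// every line yields exactly three fields, with null where a key is absent.
var fields = ['AddressId', 'Street', 'Locality'];
var rows = addressLines.map(function (line) {
  var obj = JSON.parse(line);
  return fields.map(function (f) {
    return obj[f] === undefined ? null : obj[f];
  });
});

console.log(rows.length);             // 4
console.log(JSON.stringify(rows[3])); // ["1_102",null,"New York"]
```

Because the fields are resolved per address line rather than across the whole document, every row keeps its three columns and the gaps show up as nulls in the right place.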