How do I create padded batches in Tensorflow for tf.train.SequenceExample data using the DataSet API?


padded_shapes has to be a tuple of shapes, one per component of each dataset element. In your case you should pass

dataset = dataset.padded_batch(4, padded_shapes=([vectorSize],[None]))

or, to pad every component to the longest element in each batch, try

dataset = dataset.padded_batch(4, padded_shapes=([None],[None]))

Check this code for more details. I had to debug this method to figure out why it wasn't working for me.
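For the SequenceExample case from the question, here is a minimal end-to-end sketch. It assumes TF 2.x with eager execution; the feature names "vector" and "tokens", the helper make_example, and VECTOR_SIZE (standing in for vectorSize) are made up for illustration and are not from the original post.

```python
import tensorflow as tf

VECTOR_SIZE = 3  # hypothetical stand-in for the question's vectorSize


def make_example(vector, tokens):
    """Serialize one SequenceExample (hypothetical layout):
    a fixed-size context vector plus a variable-length token list."""
    example = tf.train.SequenceExample()
    example.context.feature["vector"].float_list.value.extend(vector)
    token_list = example.feature_lists.feature_list["tokens"]
    for token in tokens:
        token_list.feature.add().int64_list.value.append(token)
    return example.SerializeToString()


def parse(serialized):
    """Parse one serialized SequenceExample into a (vector, tokens) tuple."""
    context, sequence = tf.io.parse_single_sequence_example(
        serialized,
        context_features={
            "vector": tf.io.FixedLenFeature([VECTOR_SIZE], tf.float32),
        },
        sequence_features={
            "tokens": tf.io.FixedLenSequenceFeature([], tf.int64),
        },
    )
    return context["vector"], sequence["tokens"]


serialized = [
    make_example([0.1, 0.2, 0.3], [1, 2]),
    make_example([0.4, 0.5, 0.6], [3, 4, 5, 6]),
]
dataset = tf.data.Dataset.from_tensor_slices(serialized).map(parse)

# One shape per tuple component: the vector is fixed-size,
# the token list is padded to the longest sequence in the batch.
dataset = dataset.padded_batch(2, padded_shapes=([VECTOR_SIZE], [None]))

vectors, tokens = next(iter(dataset))
print(vectors.shape)   # (2, 3)
print(tokens.numpy())  # the shorter token list is zero-padded to length 4
```

The padding value defaults to 0 for numeric types; padded_batch also takes a padding_values argument if 0 is a valid token in your data.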


If each element of your Dataset is a tuple, you can specify the padded shape of each component separately.

For example, I have a (same_sized_images, labels) dataset where the labels differ in size but all have the same rank.

def process_label(resized_img, label):
    # Perform some tensor transformations
    # ......
    return resized_img, label

dataset = dataset.map(process_label)
dataset = dataset.padded_batch(batch_size,
                               padded_shapes=([None, None, 3],
                                              [None, None]))  # my label has rank 2
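To see what this does to the labels, here is a small runnable sketch with synthetic data. It assumes TF 2.4+ (for output_signature) and eager execution; the generator and the 4x4 image size are invented for the demo.

```python
import numpy as np
import tensorflow as tf


def generate():
    # Same-sized images, rank-2 labels with different sizes (synthetic data).
    yield np.zeros((4, 4, 3), np.float32), np.ones((1, 2), np.int32)
    yield np.zeros((4, 4, 3), np.float32), np.ones((3, 1), np.int32)


dataset = tf.data.Dataset.from_generator(
    generate,
    output_signature=(
        tf.TensorSpec(shape=(4, 4, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(None, None), dtype=tf.int32),
    ),
)

# Each None dimension is padded to the largest size found in the batch.
dataset = dataset.padded_batch(
    2, padded_shapes=([None, None, 3], [None, None])
)

images, labels = next(iter(dataset))
print(images.shape)  # (2, 4, 4, 3)
print(labels.shape)  # (2, 3, 2)
```

Each label dimension is padded independently, so a (1, 2) label and a (3, 1) label end up as (3, 2) in the same batch.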


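You can also read the shapes off the dataset itself and pass them as padded_shapes. In TF 1.x that is the output_shapes property; in TF 2.x the property is gone and tf.compat.v1.data.get_output_shapes(dataset) returns the same structure: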

padded_shapes = dataset.output_shapes
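A short sketch of that idea for TF 2.x, assuming eager execution. The toy dataset here is invented for the demo; unknown dimensions show up as None in the reported shapes, which is exactly what padded_batch needs.

```python
import tensorflow as tf

# Toy dataset: a scalar n and a vector of length n (so the length varies).
dataset = tf.data.Dataset.range(1, 4).map(lambda n: (n, tf.fill([n], n)))

# TF 2.x replacement for the TF 1.x dataset.output_shapes property.
shapes = tf.compat.v1.data.get_output_shapes(dataset)
print(shapes)  # a tuple of TensorShape objects; the vector's length is unknown

batched = dataset.padded_batch(3, padded_shapes=shapes)
scalars, vectors = next(iter(batched))
print(vectors.numpy())
```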