
Embedding lookup table doesn't mask padding value


@Alessandro Suglia I think this feature would be useful; unfortunately TF does not support it right now. One workaround that gets the same result, but is slower, is to do the lookup twice, like below:

    import tensorflow as tf

    lookup_result = tf.nn.embedding_lookup(emb, index)
    # 0/1 mask over the vocabulary: 0 for the padding row (id 0), 1 elsewhere.
    masked_emb = tf.concat(0, [tf.zeros([1, 1]),
                               tf.ones([emb.get_shape()[0] - 1, 1])])
    # Look the mask up with the same indices and zero out padding embeddings.
    mask_lookup_result = tf.nn.embedding_lookup(masked_emb, index)
    lookup_result = tf.mul(lookup_result, mask_lookup_result)
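Note that masked_emb has shape [vocab_size, 1], so mask_lookup_result has a trailing dimension of 1 and broadcasts across the embedding dimension during the multiply; this assumes the padding token has id 0.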


It seems that in an RNN model we don't need to mask the padding values as long as we mask the loss (the loss is the same whether or not we mask the input padding; I verified this by running a test).
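For what it's worth, here is a minimal sketch of masking the loss instead of the inputs; logits, targets, and lengths are hypothetical tensors, and it assumes a newer TF (1.x) where tf.sequence_mask is available:

    import tensorflow as tf

    # Hypothetical tensors (assumes TF 1.x):
    # logits: [batch, time, vocab], targets: [batch, time], lengths: [batch]
    mask = tf.sequence_mask(lengths, maxlen=tf.shape(targets)[1],
                            dtype=tf.float32)
    losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=targets,
                                                            logits=logits)
    # Zero the loss at padded steps, then average over the real steps only.
    loss = tf.reduce_sum(losses * mask) / tf.reduce_sum(mask)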

Of course, zeroing the padding embeddings may speed up the computation (multiplications by zero) when the sequence_length parameter of tf.nn.dynamic_rnn is not passed.
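For example (a sketch assuming TF 1.x, with hypothetical inputs and lengths tensors):

    import tensorflow as tf

    # inputs: [batch, time, emb_dim], lengths: [batch]  (hypothetical)
    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
    # With sequence_length set, dynamic_rnn copies the state through and
    # zeroes the outputs past each example's true length, so the padding
    # steps do not affect the final state.
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                             sequence_length=lengths,
                                             dtype=tf.float32)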

Finally, if the model lets positions in the sequence interact with each other (e.g. a CNN, where convolution mixes the padding embeddings into neighboring positions), zeroing the padding embeddings is necessary.