How to convert predicted sequence back to text in keras?
You can directly use the inverse function, tokenizer.sequences_to_texts.
text = tokenizer.sequences_to_texts(<list-of-integer-equivalent-encodings>)
I have tested the above and it works as expected.
P.S.: Take extra care that the argument is the list of integer encodings, not the one-hot ones.
Here is a solution I found:
reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))
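As a minimal illustration of what this reversal does: word_index is just a dict mapping each word to its integer id, so reversing every (word, id) pair yields an id-to-word lookup table. The values below are hand-built for the example, not produced by an actual Tokenizer:

```python
# Hand-built stand-in for tokenizer.word_index (word -> integer id)
word_index = {'these': 1, 'are': 2, 'two': 3}

# Swap keys and values: integer id -> word
reverse_word_map = dict(map(reversed, word_index.items()))
print(reverse_word_map)   # {1: 'these', 2: 'are', 3: 'two'}

# Decode a predicted sequence of integer ids back to words
decoded = [reverse_word_map.get(i) for i in [3, 1, 2]]
print(decoded)            # ['two', 'these', 'are']
```

Using .get (rather than indexing) means an id that is missing from the vocabulary decodes to None instead of raising a KeyError.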
I had to solve the same problem, so here is how I ended up doing it (inspired by @Ben Usemans' reversed dictionary).
# Importing library
from keras.preprocessing.text import Tokenizer

# My texts
texts = ['These are two crazy sentences', 'that I want to convert back and forth']

# Creating a tokenizer
tokenizer = Tokenizer(lower=True)

# Building word indices
tokenizer.fit_on_texts(texts)

# Tokenizing sentences
sentences = tokenizer.texts_to_sequences(texts)

>sentences
>[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11, 12, 13]]

# Creating a reverse dictionary
reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))

# Function takes a tokenized sentence and returns the words
def sequence_to_text(list_of_indices):
    # Looking up words in the dictionary
    words = [reverse_word_map.get(letter) for letter in list_of_indices]
    return words

# Creating texts
my_texts = list(map(sequence_to_text, sentences))

>my_texts
>[['these', 'are', 'two', 'crazy', 'sentences'], ['that', 'i', 'want', 'to', 'convert', 'back', 'and', 'forth']]