Convert python sequence with multiple datatypes to tensor
In TensorFlow, a single tensor can't hold more than one data type.
Quoting the documentation:
It is not possible to have a tf.Tensor with more than one data type. It is possible, however, to serialize arbitrary data structures as strings and store those in tf.Tensors.
Hence, a workaround is to create a tensor with data type tf.string and, when needed, cast each field to the desired data type.
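As a minimal sketch of that workaround (the field values and variable names below are made up for illustration), each field is stored as a string and converted back with tf.strings.to_number at the point of use:

```python
import tensorflow as tf

# Hypothetical mixed-type record: every field is stored as a string,
# so the tensor has the single dtype tf.string.
row = tf.constant(["42", "3.14", "hello"])

# Cast individual fields back to the type needed when they are used.
age = tf.strings.to_number(row[0], out_type=tf.int32)
ratio = tf.strings.to_number(row[1], out_type=tf.float32)
name = row[2]  # remains a tf.string scalar

print(row.dtype, age.dtype, ratio.dtype, name.dtype)
```

This keeps everything in one tensor at the cost of parsing strings on every access, so it is usually only worth it when the fields really have to travel together.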
You want a tensor for each of your features (columns). Only if it's a multi-dimensional feature (like an image, a video, list of strings, vector) would you have more dimensions in the tensor and even then they would all have the same datatype.
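tf.data.Dataset.from_tensor_slices() will accept your input as a dictionary of lists (the key is the name of the feature, the value is a list of the values in that feature), or as a list of lists. I can't remember whether it accepts Pandas dataframes directly, but if it doesn't, you can easily convert one to a dictionary of lists with df.to_dict("list"). Note that the default df.to_dict() returns nested dictionaries keyed by row index, which is not the dictionary-of-lists shape described here.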
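A small sketch of the dictionary-of-lists form, with invented feature names and values, could look like this. Each key becomes its own tensor with its own dtype:

```python
import tensorflow as tf

# Hypothetical features: one list per column, all lists the same length.
features = {
    "age": [25, 32, 47],
    "income": [40000.0, 52000.5, 61000.0],
    "city": ["Oslo", "Lima", "Pune"],
}

dataset = tf.data.Dataset.from_tensor_slices(features)

# Every element is a dict with one scalar tensor per feature.
for example in dataset.take(1):
    print({name: value.numpy() for name, value in example.items()})

# Each feature keeps its own data type.
print(dataset.element_spec)
```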
However, you can't pass None values. You will have to substitute some value for them before converting to a tensor. Classic approaches are the median value, zero, the most common value, a "missing"/"unknown" value for strings or categories, or imputation.
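For illustration, here is one way to replace None before building tensors, using plain Python only. The helper fill_missing and the sample data are hypothetical, and median / "unknown" are just two of the options listed above:

```python
from statistics import median


def fill_missing(values, fill_value):
    """Return a copy of values with every None replaced by fill_value."""
    return [fill_value if v is None else v for v in values]


# Hypothetical columns containing missing entries.
ages = [25, None, 47, 31]
cities = ["Oslo", None, "Pune", "Lima"]

# Numeric column: use the median of the known values.
age_median = median(v for v in ages if v is not None)

clean = {
    "age": fill_missing(ages, age_median),
    # String column: use an explicit "unknown" category.
    "city": fill_missing(cities, "unknown"),
}
print(clean)
```

The resulting dictionary of lists contains no None values and has the shape that tf.data.Dataset.from_tensor_slices() expects.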