What is the difference between Dataset.from_tensors and Dataset.from_tensor_slices?


from_tensors combines the input and returns a dataset with a single element:

>>> t = tf.constant([[1, 2], [3, 4]])
>>> ds = tf.data.Dataset.from_tensors(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>]

from_tensor_slices creates a dataset with a separate element for each row of the input tensor:

>>> t = tf.constant([[1, 2], [3, 4]])
>>> ds = tf.data.Dataset.from_tensor_slices(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 4], dtype=int32)>]
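As a rough mental model, the two constructors can be imitated with plain Python lists standing in for tensors. The helper names `from_tensors_model` and `from_tensor_slices_model` are made up for this illustration and are not part of TensorFlow; this is only a sketch of the behaviour shown above, not how tf.data is implemented.

```python
def from_tensors_model(value):
    """Model of Dataset.from_tensors: the whole input becomes one element."""
    return [value]


def from_tensor_slices_model(value):
    """Model of Dataset.from_tensor_slices: one element per entry along axis 0."""
    return [row for row in value]


t = [[1, 2], [3, 4]]  # stands in for tf.constant([[1, 2], [3, 4]])

whole = from_tensors_model(t)
rows = from_tensor_slices_model(t)

print(len(whole), whole)  # 1 [[[1, 2], [3, 4]]]
print(len(rows), rows)    # 2 [[1, 2], [3, 4]]
```

The first "dataset" has a single element (the full 2x2 matrix), while the second has one element per row.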


1) The main difference between the two is that the nested elements passed to from_tensor_slices must all have the same size in their 0th dimension:

import tensorflow as tf  # TF 1.x API (tf.random_uniform)
# Raises ValueError: Dimensions 10 and 9 are not compatible
dataset1 = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([10, 4]), tf.random_uniform([9])))
# OK: from_tensors does not slice, so it places no constraint on the first dimensions
dataset2 = tf.data.Dataset.from_tensors(
    (tf.random_uniform([10, 4]), tf.random_uniform([10])))
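The reason for the constraint is easier to see in a plain-Python sketch: slicing a tuple of components pairs up the i-th entry of each component, which only works if every component has the same number of entries. The helper `slice_tuple_model` below is hypothetical (lists stand in for tensors) and only imitates the behaviour described above.

```python
def slice_tuple_model(components):
    """Hypothetical model: slice every component of a tuple along axis 0."""
    lengths = {len(c) for c in components}
    if len(lengths) != 1:
        raise ValueError(f"Leading dimensions differ: {sorted(lengths)}")
    # Pair the i-th entry of each component into one dataset element.
    return list(zip(*components))


features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # shape (3, 2)
labels = [0, 1, 0]                                # shape (3,)

pairs = slice_tuple_model((features, labels))
print(len(pairs))  # 3
print(pairs[0])    # ([0.1, 0.2], 0)

try:
    slice_tuple_model((features, [0, 1]))  # 3 rows vs 2 labels
except ValueError as err:
    print("rejected:", err)
```

Wrapping the whole tuple as a single element, as from_tensors does, never needs to pair entries, so no such check applies.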

2) The second difference shows up when the input to a tf.data.Dataset is a list. For example:

dataset1 = tf.data.Dataset.from_tensor_slices(
    [tf.random_uniform([2, 3]), tf.random_uniform([2, 3])])
dataset2 = tf.data.Dataset.from_tensors(
    [tf.random_uniform([2, 3]), tf.random_uniform([2, 3])])
print(dataset1)  # shapes: (2, 3)
print(dataset2)  # shapes: (2, 2, 3)

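In the above, both functions treat the list as one stacked tensor of shape (2, 2, 3). from_tensors returns it as a single 3D element, while from_tensor_slices slices it along the first dimension and yields two elements of shape (2, 3). The from_tensors behavior can be handy if you have different image channels coming from different sources and want to combine them into one RGB image tensor.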
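The shapes in the comments above can be checked with a small plain-Python sketch, where nested lists stand in for tensors. The helper `shape_of` is made up for this illustration and assumes regular, non-empty nested lists.

```python
def shape_of(nested):
    """Shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


a = [[1, 2, 3], [4, 5, 6]]     # stands in for a tensor of shape (2, 3)
b = [[7, 8, 9], [10, 11, 12]]  # another tensor of shape (2, 3)
stacked = [a, b]               # the list is treated as one (2, 2, 3) tensor

from_tensors_elements = [stacked]            # the whole stack is one element
from_tensor_slices_elements = list(stacked)  # one element per entry on axis 0

print([shape_of(e) for e in from_tensors_elements])        # [(2, 2, 3)]
print([shape_of(e) for e in from_tensor_slices_elements])  # [(2, 3), (2, 3)]
```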

3) As mentioned in the previous answer, from_tensors turns the whole input into a single dataset element:

import tensorflow as tf
tf.enable_eager_execution()
dataset1 = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([4, 2]), tf.random_uniform([4])))
dataset2 = tf.data.Dataset.from_tensors(
    (tf.random_uniform([4, 2]), tf.random_uniform([4])))
for i, item in enumerate(dataset1):
    print('element: ' + str(i + 1), item[0], item[1])
print(30*'-')
for i, item in enumerate(dataset2):
    print('element: ' + str(i + 1), item[0], item[1])

output:

element: 1 tf.Tensor(... shapes: ((2,), ()))
element: 2 tf.Tensor(... shapes: ((2,), ()))
element: 3 tf.Tensor(... shapes: ((2,), ()))
element: 4 tf.Tensor(... shapes: ((2,), ()))
------------------------------
element: 1 tf.Tensor(... shapes: ((4, 2), (4,)))


Try this:

import tensorflow as tf  # 1.13.1
tf.enable_eager_execution()
t1 = tf.constant([[11, 22], [33, 44], [55, 66]])
print("\n=========     from_tensors     ===========")
ds = tf.data.Dataset.from_tensors(t1)
print(ds.output_types, end=' : ')
print(ds.output_shapes)
for e in ds:
    print(e)
print("\n=========   from_tensor_slices    ===========")
ds = tf.data.Dataset.from_tensor_slices(t1)
print(ds.output_types, end=' : ')
print(ds.output_shapes)
for e in ds:
    print(e)

output:

=========     from_tensors     ===========
<dtype: 'int32'> : (3, 2)
tf.Tensor(
[[11 22]
 [33 44]
 [55 66]], shape=(3, 2), dtype=int32)

=========   from_tensor_slices    ===========
<dtype: 'int32'> : (2,)
tf.Tensor([11 22], shape=(2,), dtype=int32)
tf.Tensor([33 44], shape=(2,), dtype=int32)
tf.Tensor([55 66], shape=(2,), dtype=int32)

The output is largely self-explanatory: from_tensor_slices() takes what would be the single element produced by from_tensors() and slices it along its first dimension. You can also try it with a 3D tensor:

t1 = tf.constant([[[11, 22], [33, 44], [55, 66]],
                  [[110, 220], [330, 440], [550, 660]]])