TypeError: object of type 'numpy.int64' has no len()

TypeError: object of type 'numpy.int64' has no len()


I think the issue is that after using random_split, index is now a torch.Tensor rather than an int. I found that adding a quick type check to __getitem__ and then using .item() on the tensor works for me:

def __getitem__(self, index):
    if type(index) == torch.Tensor:
        index = index.item()
    x = torch.tensor(self.x_data.iloc[index].values, dtype=torch.float)
    y = torch.tensor(self.y_data.iloc[index], dtype=torch.float)
    return (x, y)

Source: https://discuss.pytorch.org/t/issues-with-torch-utils-data-random-split/22298/8


Reference:
https://github.com/pytorch/pytorch/issues/9211

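Just add .tolist() to the indices line in random_split, so that the indices passed to Subset are plain Python ints rather than tensors: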

def random_split(dataset, lengths):
    """
    Randomly split a dataset into non-overlapping new datasets of given lengths.

    Arguments:
        dataset (Dataset): Dataset to be split
        lengths (sequence): lengths of splits to be produced
    """
    if sum(lengths) != len(dataset):
        raise ValueError("Sum of input lengths does not equal the length of the input dataset!")

    indices = randperm(sum(lengths)).tolist()
    return [Subset(dataset, indices[offset - length:offset]) for offset, length in zip(_accumulate(lengths), lengths)]
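
To see how the offset slicing in that function carves up the permutation, here is a rough pure-Python sketch of the same index logic. It is not the PyTorch implementation: split_indices is a hypothetical name, random.shuffle stands in for randperm(...).tolist(), and itertools.accumulate stands in for _accumulate.

```python
import random
from itertools import accumulate


def split_indices(total, lengths, seed=None):
    """Return one list of plain-int indices per requested split length."""
    if sum(lengths) != total:
        raise ValueError("Sum of input lengths does not equal the total length!")
    rng = random.Random(seed)
    indices = list(range(total))
    rng.shuffle(indices)  # stand-in for randperm(total).tolist()
    # accumulate(lengths) yields the end offset of each split, so
    # indices[offset - length:offset] is exactly that split's slice.
    return [indices[offset - length:offset]
            for offset, length in zip(accumulate(lengths), lengths)]


train_idx, val_idx = split_indices(10, [7, 3], seed=0)
```

Every element of train_idx and val_idx is an ordinary int here, which is the property the .tolist() call is meant to give the indices handed to Subset.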


Why not simply try:

self.len = len(self.x_data)

len works fine on a pandas DataFrame without converting it to an array or tensor first.