Data Augmentation in PyTorch

python image-processing dataset pytorch data-augmentation

I assume you are asking whether these data augmentation transforms (e.g. RandomHorizontalFlip) actually increase the size of the dataset, or whether they are applied to each item in the dataset one by one without adding to its size.

Running the following simple code snippet shows that the latter is true. If you have a dataset of 8 images and create a PyTorch dataset object for it, then when you iterate through the dataset the transformations are called on each data point and the transformed data point is returned. So with random flipping, for example, some of the data points are returned as original and some are returned flipped (e.g. 4 flipped and 4 original). In other words, one pass through the dataset items gives you 8 data points, some flipped and some not. This is at odds with the conventional understanding of augmenting a dataset, where in this case the augmented dataset would have 16 data points.

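import torch
from torch.utils.data import Dataset
from torchvision import transforms


class experimental_dataset(Dataset):
    def __init__(self, data, transform):
        self.data = data
        self.transform = transform

    def __len__(self):
        # The number of samples is the size of the first dimension
        return self.data.shape[0]

    def __getitem__(self, idx):
        item = self.data[idx]
        # The transform runs every time an item is fetched
        item = self.transform(item)
        return item


transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

# Dummy dataset: 8 single-channel 2x2 "images"
x = torch.rand(8, 1, 2, 2)
print(x)

dataset = experimental_dataset(x, transform)

for item in dataset:
    print(item)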

Results (the small floating-point differences come from converting to a PIL image and back, which quantizes the values to 8 bits):

Original dummy dataset:

tensor([[[[0.1872, 0.5518],
          [0.5733, 0.6593]]],

        [[[0.6570, 0.6487],
          [0.4415, 0.5883]]],

        [[[0.5682, 0.3294],
          [0.9346, 0.1243]]],

        [[[0.1829, 0.5607],
          [0.3661, 0.6277]]],

        [[[0.1201, 0.1574],
          [0.4224, 0.6146]]],

        [[[0.9301, 0.3369],
          [0.9210, 0.9616]]],

        [[[0.8567, 0.2297],
          [0.1789, 0.8954]]],

        [[[0.0068, 0.8932],
          [0.9971, 0.3548]]]])

Transformed dataset:

tensor([[[0.1843, 0.5490],
         [0.5725, 0.6588]]])
tensor([[[0.6549, 0.6471],
         [0.4392, 0.5882]]])
tensor([[[0.5647, 0.3255],
         [0.9333, 0.1216]]])
tensor([[[0.5569, 0.1804],
         [0.6275, 0.3647]]])
tensor([[[0.1569, 0.1176],
         [0.6118, 0.4196]]])
tensor([[[0.9294, 0.3333],
         [0.9176, 0.9608]]])
tensor([[[0.8549, 0.2275],
         [0.1765, 0.8941]]])
tensor([[[0.8902, 0.0039],
         [0.3529, 0.9961]]])


The transform operations are applied to your original images each time a batch is generated. Your dataset itself is left unchanged; only the images in the batch are copied and transformed, at every iteration.

The confusion may come from the fact that often, like in your example, transforms are used both for data preparation (resizing/cropping to expected dimensions, normalizing values, etc.) and for data augmentation (randomizing the resizing/cropping, randomly flipping the images, etc.).

What your data_transforms['train'] does is:

Take a random crop of the provided image (random size and aspect ratio) and resize it to obtain a (224, 224) patch
Apply or not a random horizontal flip to this patch, with a 50/50 chance
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and standard deviation values you provided

What your data_transforms['val'] does is:

Resize your image to (256, 256)
Center crop the resized image to obtain a (224, 224) patch
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and standard deviation values you provided

(i.e. the random resizing/cropping for the training data is replaced by a fixed operation for the validation one, to have reliable validation results)

If you don't want your training images to be horizontally flipped with a 50/50 chance, just remove the transforms.RandomHorizontalFlip() line.

Similarly, if you want your images to always be center-cropped, replace transforms.RandomResizedCrop by transforms.Resize and transforms.CenterCrop, as done for data_transforms['val'].


Yes, the dataset size does not change after the transformations. Every image is passed through the transformation and returned, so the size remains the same.

If you wish to use the original dataset together with the transformed one, concatenate them,

e.g. increased_dataset = torch.utils.data.ConcatDataset([transformed_dataset, original])
