
Meaning of parameters in torch.nn.conv2d


Here is what you will find in the docs:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

Parameters

  • in_channels (int) – Number of channels in the input image
  • out_channels (int) – Number of channels produced by the convolution
  • kernel_size (int or tuple) – Size of the convolving kernel
  • stride (int or tuple, optional) – Stride of the convolution. (Default: 1)
  • padding (int or tuple, optional) – Zero-padding added to both sides of the input. (Default: 0)
  • padding_mode (string, optional) – Padding scheme. (Default: 'zeros')
  • dilation (int or tuple, optional) – Spacing between kernel elements. (Default: 1)
  • groups (int, optional) – Number of blocked connections from input to output channels. (Default: 1)
  • bias (bool, optional) – If True, adds a learnable bias to the output. (Default: True)
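
As a quick sanity check (my sketch, not part of the docs), the output-size formula from the Conv2d documentation, H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1)/stride + 1), can be compared against an actual layer:

import torch
import torch.nn as nn

# Output-size formula from the Conv2d docs
def conv_out(size, kernel, stride, padding, dilation):
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

c = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1, dilation=2)
out = c(torch.randn(1, 3, 28, 28))
print(out.shape)                 # torch.Size([1, 8, 13, 13])
print(conv_out(28, 3, 2, 1, 2))  # 13 -- matches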

And this URL has a helpful visualization of the process.

So the in_channels at the beginning is 3 for images with 3 channels (color images). For black-and-white images it should be 1, and some satellite images may have 4.
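
For illustration (this sketch is mine, not from the original answer), in_channels has to match the channel dimension of the tensors you feed in:

import torch
import torch.nn as nn

rgb = torch.randn(1, 3, 28, 28)   # color image: 3 channels
gray = torch.randn(1, 1, 28, 28)  # black-and-white image: 1 channel

c = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
print(c(rgb).shape)  # torch.Size([1, 8, 26, 26])
# c(gray) would raise a RuntimeError, since the layer expects 3 input channels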

The out_channels is the number of channels the convolution will produce, so this is the number of filters.

Let's create an example to "prove" that.

import torch
import torch.nn as nn

c = nn.Conv2d(1, 3, stride=1, kernel_size=(4, 5))
print(c.weight.shape)
print(c.weight)

Out

torch.Size([3, 1, 4, 5])
Parameter containing:
tensor([[[[ 0.1571,  0.0723,  0.0900,  0.1573,  0.0537],
          [-0.1213,  0.0579,  0.0009, -0.1750,  0.1616],
          [-0.0427,  0.1968,  0.1861, -0.1787, -0.2035],
          [-0.0796,  0.1741, -0.2231,  0.2020, -0.1762]]],

        [[[ 0.1811,  0.0660,  0.1653,  0.0605,  0.0417],
          [ 0.1885, -0.0440, -0.1638,  0.1429, -0.0606],
          [-0.1395, -0.1202,  0.0498,  0.0432, -0.1132],
          [-0.2073,  0.1480, -0.1296, -0.1661, -0.0633]]],

        [[[ 0.0435, -0.2017,  0.0676, -0.0711, -0.1972],
          [ 0.0968, -0.1157,  0.1012,  0.0863, -0.1844],
          [-0.2080, -0.1355, -0.1842, -0.0017, -0.2123],
          [-0.1495, -0.2196,  0.1811,  0.1672, -0.1817]]]], requires_grad=True)

If we alter the number of out_channels,

c = nn.Conv2d(1, 5, stride=1, kernel_size=(4, 5))
print(c.weight.shape)  # torch.Size([5, 1, 4, 5])

we will get 5 filters, each of size 4x5, since that is our kernel size. If we set 2 input channels instead (some images have only 2 channels),

c = nn.Conv2d(2, 5, stride=1, kernel_size=(4, 5))
print(c.weight.shape)  # torch.Size([5, 2, 4, 5])

each of our filters will have 2 channels.
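
To see why each filter carries one 4x5 slab per input channel, here is a small check (my addition, not from the original answer): a single output value is the windowed product summed over all input channels, plus the bias:

import torch
import torch.nn as nn

c = nn.Conv2d(2, 5, stride=1, kernel_size=(4, 5))
x = torch.randn(1, 2, 10, 10)
out = c(x)

# Top-left output value of filter 0, computed by hand: multiply the
# 2-channel 4x5 input window by the 2-channel 4x5 filter and sum.
manual = (x[0, :, :4, :5] * c.weight[0]).sum() + c.bias[0]
print(torch.allclose(out[0, 0, 0, 0], manual))  # True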

I think they took their terminology from this book, and since the book doesn't call these filters, they don't use that term either.

So you are right: filters are what the conv layer is learning, and the number of filters is the number of out channels. They are set randomly at the start.
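
A tiny demonstration of that random start (my sketch, assuming PyTorch's default initialization): two freshly constructed layers get different weights:

import torch
import torch.nn as nn

torch.manual_seed(0)
a = nn.Conv2d(1, 3, kernel_size=(4, 5))
b = nn.Conv2d(1, 3, kernel_size=(4, 5))
print(torch.equal(a.weight, b.weight))  # False -- each layer draws fresh random values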

The number of activations is calculated based on bs (batch size) and the image dimensions:

bs = 16
x = torch.randn(bs, 3, 28, 28)
c = nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2)
out = c(x)
print(out.nelement())  # 125440 -- number of activations
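
To unpack that number (my arithmetic, consistent with the code above): padding=2 with a 5x5 kernel and stride 1 keeps the spatial size at 28x28, so the activation count is simply the product of the output dimensions:

print(out.shape)          # torch.Size([16, 10, 28, 28])
print(16 * 10 * 28 * 28)  # 125440 = bs * out_channels * H_out * W_out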


Checking the docs at https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d: here you have 3 in_channels and 10 out_channels, so these 10 out_channels are the filters @thefifthjack005 mentioned, also known as features.