I recently spent some time translating some of my numpy image processing experiments to Elixir Nx.
I noticed a few unexpected missing features, especially regarding convolution. Maybe it is partly due to my misunderstanding, so I would love to be corrected. If the features below are indeed missing, I would like to help implement them.
- convolution of low-dimensional tensors:
Nx.conv/2 works only for 4+ dimensional tensors. I understand that Nx builds on the concept of {batch, channel, width, height} dimensions, and from this point of view a convolution across batches might not make sense. But from the more general mathematical perspective of discrete convolution, I would expect it to also work on any combination of 1D, 2D and 3D data and kernels.
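For example, a plain 1D convolution currently requires manually lifting the signal and kernel into the expected layout. A minimal sketch of the workaround (the reshape/squeeze round-trip is my own, not an official API):

```elixir
# Lift a 1D signal and kernel to 4D so Nx.conv accepts them,
# then squeeze the singleton axes away again.
signal = Nx.tensor([1.0, 2.0, 3.0, 4.0]) |> Nx.reshape({1, 1, 1, 4})
kernel = Nx.tensor([1.0, 1.0]) |> Nx.reshape({1, 1, 1, 2})

result =
  Nx.conv(signal, kernel)  # valid padding by default
  |> Nx.squeeze()          # back to a 1D tensor of shape {3}

# => [3.0, 5.0, 7.0]
```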
- cyclic padding:
Nx.conv/2 supports only zero padding, Nx.pad/3 supports only constant padding, and Nx.reflect/2 is a separate function for reflection padding. From the perspective of signal processing and discrete Fourier transforms, cyclic padding ([1,2,3] => [2,3, 1,2,3 ,1,2]) would be the most natural, since discrete convolution with cyclic padding corresponds to multiplication of the discrete Fourier spectra. But Nx does not support cyclic padding at all. The simplest improvement would be to add Nx.cycle/2 as an alternative to Nx.reflect/2, or to allow more options on Nx.pad/3, like numpy.pad does.
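In the meantime, cyclic padding can be emulated with Nx.slice/3 and Nx.concatenate/2. A sketch for the 1D case (CyclicPad and pad1d are hypothetical names, not part of Nx):

```elixir
defmodule CyclicPad do
  # Hypothetical helper: pad a 1D tensor with `p` elements on each
  # side, wrapping around cyclically, e.g. [1, 2, 3] with p = 2
  # becomes [2, 3, 1, 2, 3, 1, 2].
  def pad1d(t, p) do
    {n} = Nx.shape(t)
    left = Nx.slice(t, [n - p], [p])  # last p elements
    right = Nx.slice(t, [0], [p])     # first p elements
    Nx.concatenate([left, t, right])
  end
end

CyclicPad.pad1d(Nx.tensor([1, 2, 3]), 2)
# => [2, 3, 1, 2, 3, 1, 2]
```

A built-in mode option on Nx.pad/3 (as in numpy.pad with mode "wrap") could subsume such ad-hoc helpers.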
- gradients with constant inputs:
Nx.Defn.grad/2 does not work with Nx.conv(image, kernel) if the image is constant. I know that there is Axon for building neural networks, but at least for small experiments and tutorials it would be nice to quickly calculate the gradient only with respect to the kernel.
image = Nx.tensor([[0, 0, 0], [0, 1, 0], [0, 0, 0]]) |> Nx.new_axis(0) |> Nx.new_axis(0)
kernel = Nx.tensor([[1, 1, 1], [1, 1, 1], [1, 1, 1]]) |> Nx.new_axis(0) |> Nx.new_axis(0)
# this does not work
Nx.Defn.grad({kernel}, fn {k} ->
  Nx.conv(image, k)
  |> Nx.sum()
end)
# but this does
Nx.Defn.grad({kernel, image}, fn {k, i} ->
  Nx.conv(i, k)
  |> Nx.sum()
end)
# and this does
Nx.Defn.grad({image}, fn {i} ->
  Nx.conv(i, kernel)
  |> Nx.sum()
end)
As far as I understand from the error message, the issue with grad/2 is that conv/2 calls Nx.pad internally (even for :same with 0-size padding), and this call to pad cannot be serialized as required by grad if image is not part of the computation graph. On the other hand, applying pad manually to image inside grad does work, even if image is constant:
# this works:
Nx.Defn.grad({kernel}, fn {k} ->
  Nx.pad(image, 0, [{1, 1, 1}, {1, 1, 1}, {0, 0, 0}, {0, 0, 0}])
  |> Nx.add(k)
  |> Nx.product()
end)
I have composed a Livebook with some more detailed examples:
https://github.com/laszlokorte/nx-improvements-livebook/blob/main/ideas.livemd
I also noticed that there is https://github.com/elixir-nx/nx_signal, which may also provide conv implementations (both FFT-based and time-domain-based), but it is not quite clear how official/maintained nx_signal is. It would make Nx more approachable if either Nx's own conv implementation were more feature-complete/consistent, or if it were clearly stated somewhere to use NxSignal.conv instead of Nx.conv.