How to compute the gradient of an image based on a certain class?


Class scores only

If you only have access to the class scores for any image you submit, there's not much fancy you can do to truly compute a gradient.

If what is returned can be seen as a relative score for each category, it is a vector v that is the result of some function f acting on a vector A that contains all the information on the image*. The true gradient of the function is given by the matrix D(A), which depends on A, such that D(A)*B = (f(A + epsilon*B) - f(A))/epsilon in the limit of small epsilon for any B. You could approximate this numerically using some small value for epsilon and a number of test vectors B (one unit vector for each element of A should be enough), but this requires one extra evaluation of f per element of A, so it is likely to be needlessly expensive.
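The finite-difference approximation described above can be sketched in NumPy as follows. This is only an illustration under assumptions: `numerical_jacobian` and `toy_scores` are hypothetical names, and the toy linear scorer stands in for whatever black-box classifier you actually query.

```python
import numpy as np

def numerical_jacobian(f, A, epsilon=1e-6):
    """Approximate D(A) column by column with forward differences.

    f is a hypothetical black-box scorer mapping a flat image vector to a
    vector of class scores. This costs one extra call to f per element of A.
    """
    A = np.asarray(A, dtype=float)
    f0 = np.asarray(f(A), dtype=float)
    D = np.empty((f0.size, A.size))
    for i in range(A.size):
        B = np.zeros_like(A)
        B[i] = 1.0  # unit test vector for element i
        D[:, i] = (np.asarray(f(A + epsilon * B), dtype=float) - f0) / epsilon
    return D

# Toy stand-in for the classifier (an assumption, not a real model).
W = np.array([[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]])
toy_scores = lambda a: W @ a

D = numerical_jacobian(toy_scores, np.array([0.1, 0.2, 0.3]))
```

For this linear toy scorer the result should simply recover W up to rounding. For a real image with hundreds of thousands of pixels, the loop makes clear why this gets expensive.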

What you are trying to do is maximize the difficulty the algorithm has in recognizing the image. That is, for a given algorithm f you want to maximize some appropriate measure of how poorly the algorithm recognizes each of your images A. There is a plethora of methods for this. I'm not too familiar with them, but a talk I saw recently had some interesting material on this (https://wsc.project.cwi.nl/woudschoten-conferences/2016-woudschoten-conference/PRtalk1.pdf, see page 24 and onwards). Computing the whole gradient is usually way too expensive if you have high-dimensional input. Instead you just modify a randomly chosen coordinate and take many (many) small, cheap steps, each more or less in the right direction, rather than going for somehow optimal, large but expensive steps.
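One way to read the "many small, cheap steps" idea is a random coordinate search, sketched below. This is my own minimal interpretation, not necessarily the method from the talk. The names `random_coordinate_attack` and `toy_scores` are hypothetical, and pixel values are assumed to lie in [0, 1].

```python
import numpy as np

def random_coordinate_attack(f, A, target, step=0.05, n_steps=2000, seed=0):
    """Lower the score of class `target` by nudging one random pixel at a time.

    f is a hypothetical black-box scorer. Each step costs a single call to f.
    """
    rng = np.random.default_rng(seed)
    A = np.array(A, dtype=float)  # work on a copy
    best = f(A)[target]
    for _ in range(n_steps):
        i = rng.integers(A.size)                # random coordinate
        delta = step * rng.choice([-1.0, 1.0])  # random direction
        old = A[i]
        A[i] = np.clip(old + delta, 0.0, 1.0)
        score = f(A)[target]
        if score < best:
            best = score  # the step helped, keep it
        else:
            A[i] = old    # the step did not help, revert it
    return A, best

# Toy softmax scorer standing in for the real classifier (an assumption).
def toy_scores(a):
    W = np.array([[1.0, -1.0, 0.5, 0.0], [-0.5, 1.0, 0.0, 1.0]])
    z = W @ a
    e = np.exp(z - z.max())
    return e / e.sum()

A0 = np.full(4, 0.5)
A_adv, score = random_coordinate_attack(toy_scores, A0, target=0)
```

Each iteration only ever accepts a change that lowers the target score, so the score never gets worse, at the price of needing many iterations.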

Model available and suitable

If you know the model in full and it is possible to write it explicitly as v = f(A), then you can compute the gradient of the function f. This would be the case if the algorithm you're trying to beat is a linear regression, possibly with multiple layers. The form of the gradient should be easier for you to figure out than for me to write it down here.

With this gradient available and fairly cheap to evaluate for different images A, you can proceed with, for example, a steepest descent (or ascent) approach to make the image less recognizable to the algorithm.
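As an illustration, suppose (purely as an assumption) that the model is a single linear layer followed by a softmax, v = softmax(W A + b). Then the gradient of the score for class c with respect to A is v_c * (W_c - sum_k v_k W_k), and steepest descent on that score could look roughly like this. The function names, the step size and the [0, 1] pixel range are all hypothetical choices.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def class_score_gradient(W, b, A, c):
    """Gradient of the class-c softmax score with respect to the image vector A."""
    v = softmax(W @ A + b)
    return v[c] * (W[c] - v @ W)

def steepest_descent(W, b, A, c, lr=0.1, n_steps=50):
    """Repeatedly step against the gradient to lower the class-c score."""
    A = np.array(A, dtype=float)
    for _ in range(n_steps):
        A = A - lr * class_score_gradient(W, b, A, c)
        A = np.clip(A, 0.0, 1.0)  # keep pixel values in the assumed valid range
    return A

# Toy model parameters (assumptions for the sake of the example).
W = np.array([[1.0, -0.5, 0.3], [-0.2, 0.8, -0.1]])
b = np.zeros(2)
A0 = np.array([0.5, 0.5, 0.5])

A_adv = steepest_descent(W, b, A0, c=0)
```

For a deeper model the gradient would come from the chain rule through each layer, but the descent loop stays the same.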

Important note

It's probably best not to forget that your approach should not render the image illegible to humans too; that would make it all rather pointless.

* I think for the purpose of this discussion it's best to see the image as a vector.