InfogainLoss layer

numpy neural-network protocol-buffers deep-learning caffe

1. Is there any tutorial/example on the usage of the InfogainLoss layer?
A nice example can be found here: using InfogainLoss to tackle class imbalance.

2. Should the input to this layer, the class probabilities, be the output of a Softmax layer?
Historically, the answer used to be YES, according to Yair's answer. The old implementation of "InfogainLoss" required its input to be the output of a "Softmax" layer, or of any other layer that guarantees input values in the range [0..1].

The OP noticed that using "InfogainLoss" on top of "Softmax" layer can lead to numerical instability. His pull request, combining these two layers into a single one (much like "SoftmaxWithLoss" layer), was accepted and merged into the official Caffe repositories on 14/04/2017. The mathematics of this combined layer are given here.

The "look and feel" of the upgraded layer is exactly like that of the old one, except that one no longer needs to explicitly pass the input through a "Softmax" layer.
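To make the stability point concrete, here is a minimal plain-Python sketch (no Caffe needed) of what a combined softmax-plus-infogain loss computes: for each sample with label l, the loss is -sum_k H[l][k] * log(softmax(scores)[k]). The function names are hypothetical, and averaging over the batch size is an assumption (the real layer's normalization is configurable). The log-sum-exp trick stands in for whatever the actual implementation does internally.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax of one list of raw scores."""
    m = max(scores)
    # Subtracting the max keeps exp() from overflowing on large scores.
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def infogain_loss(scores, labels, H):
    """Batch mean of -sum_k H[label][k] * log(softmax(scores)[k])."""
    total = 0.0
    for row, label in zip(scores, labels):
        log_p = log_softmax(row)
        total -= sum(h * lp for h, lp in zip(H[label], log_p))
    return total / len(labels)


# With H equal to the identity, this reduces to ordinary softmax cross-entropy.
H = [[1.0, 0.0], [0.0, 1.0]]
scores = [[2.0, 0.0], [1000.0, 0.0]]  # a naive exp(1000.0) would overflow
labels = [0, 0]
loss = infogain_loss(scores, labels, H)
print(loss)
```

Computing the softmax first and taking the log afterwards would overflow on the second row, which is the kind of instability that merging the two layers avoids.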

3. How can I convert a numpy.array into a binaryproto file?

In Python:

import numpy as np
import caffe

# L is the number of classes; H is the L-by-L infogain matrix (identity here)
H = np.eye(L, dtype='f4')
blob = caffe.io.array_to_blobproto(H.reshape((1, 1, L, L)))
with open('infogainH.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())

Now you can add the InfogainLoss layer to the model prototxt, with H as a parameter:

layer {
  bottom: "topOfPrevLayer"
  bottom: "label"
  top: "infoGainLoss"
  name: "infoGainLoss"
  type: "InfogainLoss"
  infogain_loss_param {
    source: "infogainH.binaryproto"
  }
}

4. How to load H as part of a DATA layer

Quoting Evan Shelhamer's post:

There's no way at present to make data layers load input at different rates. Every forward pass all data layers will advance. However, the constant H input could be done by making an input lmdb / leveldb / hdf5 file that is only H since the data layer will loop and keep loading the same H. This obviously wastes disk IO.


The layer is summing up

-log(p_i)

and so the p_i's need to be in (0, 1] to make sense as a loss function (otherwise higher confidence scores will produce a higher loss). See the curve below for the values of log(p).

[Figure: plot of log(p)]

I don't think they have to sum up to 1, but passing them through a Softmax layer will achieve both properties.
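A small plain-Python sketch of the argument above: a softmax maps arbitrary scores into (0, 1] with a sum of 1, and on that range -log(p) shrinks as the confidence p grows. The `softmax` helper is written here for illustration and is not Caffe's code.

```python
import math


def softmax(scores):
    """Map raw scores to probabilities in (0, 1] that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


raw = [3.0, 1.0, -2.0]  # raw scores: not in (0, 1], one is even negative
probs = softmax(raw)

# -log(p) is only a sensible loss on (0, 1]: it is non-negative there
# and decreases as the predicted probability increases.
losses = [-math.log(p) for p in probs]
for p, loss in zip(probs, losses):
    print(p, loss)
```

For a value above 1, such as p = 2, -log(p) is negative, and for p <= 0 the logarithm is undefined, which is why unnormalized scores cannot be fed to the loss directly.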

Since I had to search through many websites to piece together the complete code, I thought I would share my implementation:

Python layer for computing the H-matrix with weights for each class:

import numpy as np
import caffe


class ComputeH(caffe.Layer):
    """Builds a diagonal H that weights each class inversely to its frequency in the batch."""

    def __init__(self, p_object, *args, **kwargs):
        super(ComputeH, self).__init__(p_object, *args, **kwargs)
        self.n_classes = -1

    def setup(self, bottom, top):
        if len(bottom) != 1:
            raise Exception("Need (only) one input to compute H matrix.")
        # param_str comes from the prototxt, e.g. '{"n_classes": 7}'
        params = eval(self.param_str)
        if 'n_classes' in params:
            self.n_classes = int(params['n_classes'])
        else:
            raise Exception('The number of classes (n_classes) must be specified.')

    def reshape(self, bottom, top):
        top[0].reshape(1, 1, self.n_classes, self.n_classes)

    def forward(self, bottom, top):
        # Count how often each class label occurs in the current batch
        classes, cls_num = np.unique(bottom[0].data, return_counts=True)
        if np.size(classes) != self.n_classes or self.n_classes == -1:
            raise Exception("Invalid number of classes")
        cls_num = cls_num.astype(float)
        # Rare classes get large weights; the weights are normalized to sum to 1
        cls_num = cls_num.max() / cls_num
        weights = cls_num / np.sum(cls_num)
        top[0].data[...] = np.diag(weights)

    def backward(self, top, propagate_down, bottom):
        # H is not learned, so there is nothing to backpropagate
        pass

and the relevant part from the train_val.prototxt:

layer {
  name: "computeH"
  bottom: "label"
  top: "H"
  type: "Python"
  python_param {
    module: "digits_python_layers"
    layer: "ComputeH"
    param_str: '{"n_classes": 7}'
  }
  exclude { stage: "deploy" }
}
layer {
  name: "loss"
  type: "InfogainLoss"
  bottom: "score"
  bottom: "label"
  bottom: "H"
  top: "loss"
  infogain_loss_param {
    axis: 1  # compute loss and probability along axis
  }
  loss_param {
    normalization: 0
  }
  exclude { stage: "deploy" }
}
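To sanity-check the weights outside of Caffe, the arithmetic in forward() can be reproduced in plain Python. This is only a sketch: `class_weight_matrix` is a hypothetical helper name, and it assumes the labels are the integers 0 to n_classes - 1.

```python
from collections import Counter


def class_weight_matrix(labels, n_classes):
    """Diagonal H with weights inversely proportional to class frequency."""
    counts = Counter(labels)
    if len(counts) != n_classes:
        # Mirrors the layer: every class must occur in the batch.
        raise ValueError("Invalid number of classes")
    classes = sorted(counts)
    largest = max(counts.values())
    inverse = [largest / counts[c] for c in classes]
    total = sum(inverse)
    weights = [v / total for v in inverse]  # normalized to sum to 1
    return [[weights[i] if i == j else 0.0 for j in range(n_classes)]
            for i in range(n_classes)]


# Class 0 occurs three times, class 1 once, so class 1 gets the larger weight.
H = class_weight_matrix([0, 0, 0, 1], n_classes=2)
print(H)
```

Like the layer itself, this raises an error when a batch does not contain every class, so small batches on heavily imbalanced data may trigger it.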
