
How to improve accuracy of Tensorflow camera demo on iOS for retrained graph


Since you are not using the YOLO detector, the MAINTAIN_ASPECT flag is set to false. Hence the image in the Android app is not cropped but scaled. However, the code snippet you provided doesn't show where the flag is actually initialised, so confirm that its value really is false in your app.
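To see why this matters, here is a minimal sketch of the difference between the two behaviours. The names (`ComputeScale`, `Scale`) are illustrative, not the actual demo API:

```cpp
#include <algorithm>

// Hypothetical sketch of how the frame transformation differs depending on
// whether the aspect ratio is maintained (crop) or not (stretch).
struct Scale { float x; float y; };

Scale ComputeScale(int srcW, int srcH, int dstW, int dstH, bool maintainAspect) {
    const float sx = static_cast<float>(dstW) / srcW;
    const float sy = static_cast<float>(dstH) / srcH;
    if (maintainAspect) {
        // Scale uniformly by the larger factor and crop the overflow,
        // so the aspect ratio is preserved.
        const float s = std::max(sx, sy);
        return {s, s};
    }
    // Stretch each axis independently: nothing is cropped, but a 640x480
    // frame squeezed into 224x224 distorts the image the model sees.
    return {sx, sy};
}
```

With MAINTAIN_ASPECT false, a 640x480 camera frame gets different x and y scale factors, which can hurt accuracy if the model was trained on undistorted images.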

I know this isn't a complete solution but hope this helps you in debugging the issue.


TensorFlow object detection ships with default, standard configurations; the relevant settings are listed below.

Important things to check against your own ML model:

-> model_file_name - set this to match your .pb file name.

-> model_uses_memory_mapping - optional; enable it to reduce overall memory usage.

-> labels_file_name - this varies based on your label file name.

-> input_layer_name/output_layer_name - make sure you use the input/output layer names you used during graph (.pb) file creation.

snippet:

// If you have your own model, modify this to the file name, and make sure
// you've added the file to your app resources too.
static NSString* model_file_name = @"graph"; //@"tensorflow_inception_graph";
static NSString* model_file_type = @"pb";
// This controls whether we'll be loading a plain GraphDef proto, or a
// file created by the convert_graphdef_memmapped_format utility that wraps a
// GraphDef and parameter file that can be mapped into memory from file to
// reduce overall memory usage.
const bool model_uses_memory_mapping = true;
// If you have your own model, point this to the labels file.
static NSString* labels_file_name = @"labels"; //@"imagenet_comp_graph_label_strings";
static NSString* labels_file_type = @"txt";
// These dimensions need to match those the model was trained with.
const int wanted_input_width = 224;
const int wanted_input_height = 224;
const int wanted_input_channels = 3;
const float input_mean = 117.0f;
const float input_std = 1.0f;
const std::string input_layer_name = "input";
const std::string output_layer_name = "final_result";

For custom-image TensorFlow detection, you can use the working snippet below:

-> For this process you just need to pass a UIImage's CGImage object.

NSString* RunInferenceOnImageResult(CGImageRef image) {
    tensorflow::SessionOptions options;

    tensorflow::Session* session_pointer = nullptr;
    tensorflow::Status session_status = tensorflow::NewSession(options, &session_pointer);
    if (!session_status.ok()) {
        std::string status_string = session_status.ToString();
        return [NSString stringWithFormat: @"Session create failed - %s",
                status_string.c_str()];
    }
    std::unique_ptr<tensorflow::Session> session(session_pointer);
    LOG(INFO) << "Session created.";

    tensorflow::GraphDef tensorflow_graph;
    LOG(INFO) << "Graph created.";

    NSString* network_path = FilePathForResourceNames(@"tensorflow_inception_graph", @"pb");
    PortableReadFileToProto([network_path UTF8String], &tensorflow_graph);

    LOG(INFO) << "Creating session.";
    tensorflow::Status s = session->Create(tensorflow_graph);
    if (!s.ok()) {
        LOG(ERROR) << "Could not create TensorFlow Graph: " << s;
        return @"";
    }

    // Read the label list.
    NSString* labels_path = FilePathForResourceNames(@"imagenet_comp_graph_label_strings", @"txt");
    std::vector<std::string> label_strings;
    std::ifstream t;
    t.open([labels_path UTF8String]);
    std::string line;
    while (t) {
        std::getline(t, line);
        label_strings.push_back(line);
    }
    t.close();

    // Read the image (previously the Grace Hopper sample; now the CGImage passed in).
    //NSString* image_path = FilePathForResourceNames(@"grace_hopper", @"jpg");
    int image_width;
    int image_height;
    int image_channels;
//    std::vector<tensorflow::uint8> image_data = LoadImageFromFile(
//        [image_path UTF8String], &image_width, &image_height, &image_channels);
    std::vector<tensorflow::uint8> image_data =
        LoadImageFromImage(image, &image_width, &image_height, &image_channels);

    const int wanted_width = 224;
    const int wanted_height = 224;
    const int wanted_channels = 3;
    const float input_mean = 117.0f;
    const float input_std = 1.0f;
    assert(image_channels >= wanted_channels);

    tensorflow::Tensor image_tensor(
        tensorflow::DT_FLOAT,
        tensorflow::TensorShape({1, wanted_height, wanted_width, wanted_channels}));
    auto image_tensor_mapped = image_tensor.tensor<float, 4>();
    tensorflow::uint8* in = image_data.data();
    // tensorflow::uint8* in_end = (in + (image_height * image_width * image_channels));
    float* out = image_tensor_mapped.data();
    for (int y = 0; y < wanted_height; ++y) {
        const int in_y = (y * image_height) / wanted_height;
        tensorflow::uint8* in_row = in + (in_y * image_width * image_channels);
        float* out_row = out + (y * wanted_width * wanted_channels);
        for (int x = 0; x < wanted_width; ++x) {
            const int in_x = (x * image_width) / wanted_width;
            tensorflow::uint8* in_pixel = in_row + (in_x * image_channels);
            float* out_pixel = out_row + (x * wanted_channels);
            for (int c = 0; c < wanted_channels; ++c) {
                out_pixel[c] = (in_pixel[c] - input_mean) / input_std;
            }
        }
    }

    NSString* result;
//    result = [NSString stringWithFormat: @"%@ - %lu, %s - %dx%d", result,
//              label_strings.size(), label_strings[0].c_str(), image_width, image_height];

    std::string input_layer = "input";
    std::string output_layer = "output";
    std::vector<tensorflow::Tensor> outputs;
    tensorflow::Status run_status = session->Run({{input_layer, image_tensor}},
                                                 {output_layer}, {}, &outputs);
    if (!run_status.ok()) {
        LOG(ERROR) << "Running model failed: " << run_status;
        tensorflow::LogAllRegisteredKernels();
        result = @"Error running model";
        return result;
    }
    tensorflow::string status_string = run_status.ToString();
    result = [NSString stringWithFormat: @"Status :%s\n", status_string.c_str()];

    tensorflow::Tensor* output = &outputs[0];
    const int kNumResults = 5;
    const float kThreshold = 0.1f;
    std::vector<std::pair<float, int> > top_results;
    GetTopN(output->flat<float>(), kNumResults, kThreshold, &top_results);

    std::stringstream ss;
    ss.precision(3);
    for (const auto& result : top_results) {
        const float confidence = result.first;
        const int index = result.second;
        ss << index << " " << confidence << "  ";
        // Write out the result as a string.
        if (index < label_strings.size()) {
            // Just for safety: theoretically the index stays under 1000 unless
            // numerical issues lead to a wrong prediction.
            ss << label_strings[index];
        } else {
            ss << "Prediction: " << index;
        }
        ss << "\n";
    }

    LOG(INFO) << "Predictions: " << ss.str();
    tensorflow::string predictions = ss.str();
    result = [NSString stringWithFormat: @"%@ - %s", result, predictions.c_str()];
    return result;
}

Scaling an image to a custom width and height - C++ code snippet:

std::vector<uint8> LoadImageFromImage(CGImageRef image,
                                      int* out_width, int* out_height,
                                      int* out_channels) {
    const int width = (int)CGImageGetWidth(image);
    const int height = (int)CGImageGetHeight(image);
    const int channels = 4;

    CGColorSpaceRef color_space = CGColorSpaceCreateDeviceRGB();
    const int bytes_per_row = (width * channels);
    const int bytes_in_image = (bytes_per_row * height);
    std::vector<uint8> result(bytes_in_image);
    const int bits_per_component = 8;
    CGContextRef context = CGBitmapContextCreate(result.data(), width, height,
                                                 bits_per_component, bytes_per_row, color_space,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(color_space);
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), image);
    CGContextRelease(context);
    // Note: this releases the CGImage passed in by the caller. Remove this
    // line if the caller still owns (and will release) the image itself.
    CFRelease(image);

    *out_width = width;
    *out_height = height;
    *out_channels = channels;
    return result;
}

The function above loads the image data at your image's dimensions. For accurate TensorFlow classification with this model, the input width and height should both be 224 pixels.

Call the LoadImageFromImage function above from RunInferenceOnImageResult, passing the CGImage reference along with the pointers that receive the actual width, height, and channel count.


Please change this code:

// If you have your own model, modify this to the file name, and make sure
// you've added the file to your app resources too.
static NSString* model_file_name = @"tensorflow_inception_graph";
static NSString* model_file_type = @"pb";
// This controls whether we'll be loading a plain GraphDef proto, or a
// file created by the convert_graphdef_memmapped_format utility that wraps a
// GraphDef and parameter file that can be mapped into memory from file to
// reduce overall memory usage.
const bool model_uses_memory_mapping = false;
// If you have your own model, point this to the labels file.
static NSString* labels_file_name = @"imagenet_comp_graph_label_strings";
static NSString* labels_file_type = @"txt";
// These dimensions need to match those the model was trained with.
const int wanted_input_width = 299;
const int wanted_input_height = 299;
const int wanted_input_channels = 3;
const float input_mean = 128.0f;
const float input_std = 1.0f;
const std::string input_layer_name = "Mul";
const std::string output_layer_name = "final_result";

The line to change here is: const float input_std = 1.0f;
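To make the effect of these two constants concrete, here is a small sketch of the normalisation every camera byte goes through (`NormalizeInput` is an illustrative name; the demo inlines this arithmetic in its pixel loop):

```cpp
// Camera bytes are 0..255 and each is fed to the graph as
// (pixel - mean) / std, so mean and std must match what the
// model was trained with.
float NormalizeInput(float pixel, float mean, float stddev) {
    return (pixel - mean) / stddev;
}
// With mean 128 and std 1.0 the inputs span roughly [-128, 127];
// with mean 128 and std 128 they span roughly [-1, 1]. Which range
// is correct depends on how your graph was retrained.
```

If the range the graph receives doesn't match the range it saw during training, predictions can be systematically wrong even though the app runs without errors.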