MulticoreWare

Optimizing & Enhancing the Performance of an Image Processing Algorithm

November 30, 2022

This case study describes our role in building an optimized pipeline for a client's Chroma Correction algorithm and in planning future enhancements to it.

The Client

The client is a leading global developer of semiconductor solutions. They were building the world’s smallest image sensor for smartphone cameras and image signal processors (ISPs), along with the corresponding software pipeline.

The Project

The client had a complex image-processing pipeline as part of its RGB sensor and camera ISP module. The goal of the project was to speed up the Chroma Correction module of this pipeline by a factor of roughly 10x.

Challenges

  • The starting point was a very naïve implementation of the algorithm
  • Substantial dependency on third-party libraries such as OpenCV
  • Data bandwidth issues had to be managed carefully across modules

Typical Software Optimization Workflow

A typical Software Optimization workflow can be split into the following phases:

Phase 1: Modify, compile, and build the application on the target platform, ideally with all compiler optimizations disabled. The goal is to establish that the software is correct before any tuning begins.

Phase 2: Profiling. The goal is to find the areas of code where the application spends most of its run time.
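As a rough illustration of the profiling phase, the sketch below times a single pipeline stage with the standard C `clock()` function. The names `stage_fn`, `time_stage`, and `invert_stage` are hypothetical and do not come from the client's pipeline. In practice a dedicated profiler would normally do this job; the sketch only shows the basic idea of measuring where time goes.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stage signature: processes n pixels in place. */
typedef void (*stage_fn)(unsigned char *pixels, size_t n);

/* Returns the average CPU time, in seconds, of one call to `stage`,
   measured over `repeats` consecutive calls. */
double time_stage(stage_fn stage, unsigned char *pixels, size_t n, int repeats)
{
    assert(stage != NULL && repeats > 0);
    clock_t start = clock();
    for (int i = 0; i < repeats; ++i)
        stage(pixels, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}

/* Placeholder stage used only for illustration: inverts each pixel. */
void invert_stage(unsigned char *pixels, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        pixels[i] = (unsigned char)(255 - pixels[i]);
}
```

Timing each stage this way and ranking the results gives a first indication of which module deserves optimization effort.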

Phase 3: This is where the actual optimization happens:

  • Enabling relevant compiler optimizations
  • Cache-friendly algorithms
  • Optimal usage of available registers & memory transfers
  • Hardware specific optimizations
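To make the "cache-friendly algorithms" item concrete, here is a minimal sketch of one common technique: processing an image in small tiles so that the data being read and written stays in cache. It uses a transpose only as a generic example and is not the client's Chroma Correction code. The function name `transpose_tiled` and the tile size of 32 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 32 /* assumed tile edge; tune to the target's cache size */

/* Transposes a rows x cols 8-bit image `src` (row-major) into `dst`
   (cols x rows, row-major), one TILE x TILE block at a time. */
void transpose_tiled(const unsigned char *src, unsigned char *dst,
                     size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t r0 = 0; r0 < rows; r0 += TILE) {
        size_t r1 = (r0 + TILE < rows) ? r0 + TILE : rows;
        for (size_t c0 = 0; c0 < cols; c0 += TILE) {
            size_t c1 = (c0 + TILE < cols) ? c0 + TILE : cols;
            /* Within a tile, both the source rows and the destination
               rows touched are few enough to remain cached. */
            for (size_t r = r0; r < r1; ++r)
                for (size_t c = c0; c < c1; ++c)
                    dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

Without tiling, the writes to `dst` stride through memory by `rows` bytes per step, which tends to evict cache lines before they are fully used on large images.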

The phases and their interdependencies are shown in the figure below.

Figure: Phases of a typical Software Optimization workflow

Solutions Proposed

  • Create a control flow graph
  • Hand-optimize modules to replace OpenCV API calls
  • Design cache-aware algorithms to reduce cache thrashing
  • Loop Optimizations
    • Loop-Invariant Code Motion
    • Iteration Reordering
    • Loop Unrolling

The MulticoreWare Advantage

MulticoreWare has deep-rooted expertise in performance optimization, especially for image and video processing pipelines. We have in-depth experience in creating software solutions and developing tools for multi-core and heterogeneous computing environments. This project combined optimization with video/image processing, another area where MulticoreWare is considered a market leader.

Redefining the Technical Architecture: Drawing on our experience developing bare-metal image/video APIs that are available as open-source SDKs (x265/rpp/rocAL), the MulticoreWare team was able to remove the dependency on third-party libraries such as OpenCV. Once the external dependency was removed, the next step was designing the new control flow.
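As an illustration of what replacing a library call with hand-written code can look like, the sketch below converts interleaved RGB pixels to 8-bit luma using integer fixed-point arithmetic, the kind of operation often delegated to a call such as OpenCV's `cvtColor`. The case study does not say which OpenCV calls were replaced, so this operation, the function name `rgb_to_gray`, and the 8-bit fixed-point weights are all assumptions made for the example. The output is not guaranteed to be bit-exact with OpenCV.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts n interleaved RGB pixels to 8-bit luma using BT.601 weights
   (0.299, 0.587, 0.114) approximated in 8-bit fixed point as
   77/256, 150/256 and 29/256. The +128 rounds to nearest. */
void rgb_to_gray(const uint8_t *rgb, uint8_t *gray, size_t n)
{
    assert(rgb != NULL && gray != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint32_t r = rgb[3 * i];
        uint32_t g = rgb[3 * i + 1];
        uint32_t b = rgb[3 * i + 2];
        gray[i] = (uint8_t)((77u * r + 150u * g + 29u * b + 128u) >> 8);
    }
}
```

A purpose-built routine like this avoids library call overhead and intermediate buffers, and it leaves a simple loop that can be tiled, unrolled, or vectorized for the target hardware.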

Outcome

Within the estimated project timeline, the MulticoreWare team achieved a speedup of ~8x for the algorithm.
