Optimizing & Enhancing the Performance of an Image Processing Algorithm

November 30, 2022

This case study highlights our role in optimizing the pipeline for a Chroma Correction algorithm, and in planning future enhancements, for one of our clients.

The Client

The client is a leading global developer of semiconductor solutions. The client was building the world’s smallest image sensor for smartphone cameras and image signal processors (ISPs), along with the corresponding software pipeline.

The Project

The client had a complex image-processing pipeline as part of their RGB sensor and camera ISP module. The goal of the project was to speed up the Chroma Correction module of this pipeline by a factor of ~10x.


The Challenges

  • Only a very naïve version of the algorithm was available as a starting point
  • Substantial dependency on third-party libraries such as OpenCV
  • Data bandwidth had to be managed carefully across modules

Typical Software Optimization Workflow

A typical Software Optimization workflow can be split into the following phases:

Phase 1: This phase involves modifying, compiling, and building the application on the target platform, ideally with all compiler optimizations disabled. The goal is to establish that the software is correct before any tuning begins.
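One common way to check correctness at this stage is to compare the output of the build under test against a reference output captured from the baseline. The following is a minimal sketch of that idea, not the method used on this project: it assumes 8-bit image buffers, and the helper names `max_abs_diff` and `matches_reference` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: largest per-sample difference between a reference
// image and the output of the build under test (8-bit samples assumed).
int max_abs_diff(const std::vector<std::uint8_t>& reference,
                 const std::vector<std::uint8_t>& candidate) {
    assert(reference.size() == candidate.size());
    int worst = 0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const int d = std::abs(static_cast<int>(reference[i]) -
                               static_cast<int>(candidate[i]));
        if (d > worst) worst = d;
    }
    return worst;
}

// True when every sample is within `tolerance` of the reference.
// A tolerance of 0 demands bit-exact output.
bool matches_reference(const std::vector<std::uint8_t>& reference,
                       const std::vector<std::uint8_t>& candidate,
                       int tolerance = 0) {
    return max_abs_diff(reference, candidate) <= tolerance;
}
```

Running a check such as `matches_reference(golden, output)` after every later optimization step helps confirm that a speedup has not changed the results.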

Phase 2: This phase is called Profiling; its goal is to find the areas of code where the application spends most of its run time.
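Dedicated profiling tools usually give the most detail, but the idea can be illustrated with simple instrumentation. The sketch below accumulates wall-clock time per named stage using `std::chrono::steady_clock`. The class `StageProfiler` and the stage names are hypothetical and are not taken from the client's pipeline.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical profiler: accumulates wall-clock time per named pipeline stage.
class StageProfiler {
public:
    using Clock = std::chrono::steady_clock;

private:
    struct Entry {
        double total_ms = 0.0;
        int calls = 0;
    };
    std::map<std::string, Entry> entries_;

public:
    // RAII guard: times the enclosing scope and records it on destruction.
    class Scope {
    public:
        Scope(StageProfiler& profiler, std::string name)
            : profiler_(profiler), name_(std::move(name)), start_(Clock::now()) {
            assert(!name_.empty());  // stage names must be non-empty
        }
        ~Scope() {
            const std::chrono::duration<double, std::milli> elapsed =
                Clock::now() - start_;
            Entry& e = profiler_.entries_[name_];
            e.total_ms += elapsed.count();
            e.calls += 1;
        }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        StageProfiler& profiler_;
        std::string name_;
        Clock::time_point start_;
    };

    // Accumulated time for a stage in milliseconds (0 if never measured).
    double total_ms(const std::string& name) const {
        const auto it = entries_.find(name);
        return it == entries_.end() ? 0.0 : it->second.total_ms;
    }

    // Number of times a stage was measured (0 if never measured).
    int calls(const std::string& name) const {
        const auto it = entries_.find(name);
        return it == entries_.end() ? 0 : it->second.calls;
    }

    // Name of the stage with the largest accumulated time ("" if none).
    std::string hottest() const {
        std::string best;
        double best_ms = -1.0;
        for (const auto& kv : entries_) {
            if (kv.second.total_ms > best_ms) {
                best_ms = kv.second.total_ms;
                best = kv.first;
            }
        }
        return best;
    }
};
```

Placing `StageProfiler::Scope timer(profiler, "chroma");` at the top of a stage records its duration when the scope ends, and `profiler.hottest()` then names the stage to look at first.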

Phase 3: This phase is where the actual optimization happens:

  • Enabling relevant compiler optimizations
  • Cache Friendly Algorithms
  • Optimal usage of available registers & memory transfers
  • Hardware specific optimizations
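As an illustration of what a cache-friendly algorithm can look like, the sketch below contrasts a naive matrix transpose with a tiled version that works on small square blocks, so the data being read and written is more likely to stay in cache. This is a generic textbook example rather than code from the project; the function names and the default tile size of 32 are assumptions, and a suitable tile size depends on the target hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive transpose of a row-major rows x cols matrix.
// Reads are sequential, but writes jump by `rows` elements each step.
void transpose_naive(const std::vector<float>& src, std::vector<float>& dst,
                     std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols && dst.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
}

// Tiled transpose: handles one tile x tile block at a time so that the
// lines touched by both the reads and the strided writes are reused
// before they are evicted. The tile size is an assumed tuning parameter.
void transpose_tiled(const std::vector<float>& src, std::vector<float>& dst,
                     std::size_t rows, std::size_t cols,
                     std::size_t tile = 32) {
    assert(src.size() == rows * cols && dst.size() == rows * cols);
    assert(tile > 0);
    for (std::size_t r0 = 0; r0 < rows; r0 += tile) {
        const std::size_t r_end = std::min(r0 + tile, rows);
        for (std::size_t c0 = 0; c0 < cols; c0 += tile) {
            const std::size_t c_end = std::min(c0 + tile, cols);
            for (std::size_t r = r0; r < r_end; ++r)
                for (std::size_t c = c0; c < c_end; ++c)
                    dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

Both functions produce the same output; only the order in which memory is visited differs, which is why this kind of change should be validated by profiling on the target platform.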

All the phases and their interdependencies are represented in the figure below.

Phases of a typical Software Optimization workflow

Solutions Proposed

  • Create a control flow graph
  • Hand-optimize modules to replace API calls to OpenCV
  • Design cache-aware algorithms to reduce cache thrashing
  • Loop Optimizations
    • Loop-Invariant Code Motion
    • Iteration Reordering
    • Loop Unrolling
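The three loop optimizations listed above can be sketched on a small example. The code below is illustrative only and is not the client's Chroma Correction algorithm: it assumes a hypothetical operation that scales 8-bit chroma samples around the neutral value 128 using Q8 fixed-point gains, and all function and parameter names are invented for this sketch. Compilers may apply some of these transformations on their own at higher optimization levels, so the gain from doing them by hand should be confirmed by profiling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-sample operation: scale the distance from the neutral
// chroma value 128 by a Q8 fixed-point gain (256 == 1.0), then clamp.
inline std::uint8_t scale_sample(std::uint8_t s, int gain_q8) {
    const int v = 128 + ((static_cast<int>(s) - 128) * gain_q8) / 256;
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Baseline: walks a row-major plane column by column and recomputes the
// combined gain for every sample.
void scale_chroma_naive(std::vector<std::uint8_t>& plane, std::size_t width,
                        std::size_t height, int base_gain_q8, int strength_q8) {
    assert(plane.size() == width * height);
    for (std::size_t x = 0; x < width; ++x) {
        for (std::size_t y = 0; y < height; ++y) {
            const int gain_q8 = (base_gain_q8 * strength_q8) / 256;
            plane[y * width + x] = scale_sample(plane[y * width + x], gain_q8);
        }
    }
}

// Same result with the three loop optimizations applied by hand.
void scale_chroma_optimized(std::vector<std::uint8_t>& plane, std::size_t width,
                            std::size_t height, int base_gain_q8, int strength_q8) {
    assert(plane.size() == width * height);
    // Loop-invariant code motion: the gain does not depend on x or y.
    const int gain_q8 = (base_gain_q8 * strength_q8) / 256;
    // Iteration reordering: rows outermost, so memory is visited sequentially.
    for (std::size_t y = 0; y < height; ++y) {
        std::uint8_t* row = plane.data() + y * width;
        std::size_t x = 0;
        // Loop unrolling by 4: fewer branch and counter updates per sample.
        for (; x + 4 <= width; x += 4) {
            row[x]     = scale_sample(row[x], gain_q8);
            row[x + 1] = scale_sample(row[x + 1], gain_q8);
            row[x + 2] = scale_sample(row[x + 2], gain_q8);
            row[x + 3] = scale_sample(row[x + 3], gain_q8);
        }
        // Remainder loop for widths that are not a multiple of 4.
        for (; x < width; ++x) row[x] = scale_sample(row[x], gain_q8);
    }
}
```

Because the arithmetic is integer-only, the two versions produce identical output, so the optimized version can be checked bit-exactly against the baseline.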

The MulticoreWare Advantage & Approach

MulticoreWare has deep-rooted expertise in performance optimization, especially for image and video processing pipelines. We have in-depth experience in creating software solutions and developing tools for multi-core and heterogeneous computing environments. This project was an ideal mix of optimization and video/image processing, the latter being another area in which MulticoreWare is regarded as a market leader.

Redefining the Technical Architecture – With our experience in developing bare-metal image/video APIs that are available as open-source SDKs (x265/rpp/rocAL), removing the dependency on third-party libraries like OpenCV was a straightforward task for the MulticoreWare team. Once the external dependency was removed, designing the new control flow was the next step.


Within the estimated project timeline, the MulticoreWare team achieved a ~8x performance speedup for the algorithm.
