Enhancing AI Model Support for RISC-V
October 3, 2024

Client
The customer is a RISC-V based AI accelerator company.
Challenge
The customer’s accelerator hardware initially supported only a minimal set of models through their NN software ecosystem. The project’s goal was to extend that support to a much broader range of models.
Our team was tasked with creating end-to-end model inference pipelines, demos, and benchmarks for various CNN and NLP models targeting the customer’s hardware architectures and custom APIs. This included rewriting torch models using functional torch APIs, converting operations to the customer’s custom NN library, and optimizing the functional models for more efficient hardware utilization.
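The functional rewrite described above can be sketched as follows. This is a hypothetical illustration, not the customer's actual code: a small convolution block is expressed with `torch.nn.functional` calls instead of `nn.Module` layers, so that weights are held explicitly and individual operations can later be swapped for a vendor NN library's equivalents.

```python
import torch
import torch.nn.functional as F

# Hypothetical example: the functional equivalent of
# nn.Conv2d(3, 8, kernel_size=3, padding=1) followed by ReLU.
# Holding weights as plain tensors makes each op an explicit call
# that can be mapped one-to-one onto a custom NN library.
def conv_block(x, weight, bias):
    y = F.conv2d(x, weight, bias, stride=1, padding=1)
    return F.relu(y)

x = torch.randn(1, 3, 32, 32)       # NCHW input
w = torch.randn(8, 3, 3, 3)         # out_ch, in_ch, kH, kW
b = torch.randn(8)
out = conv_block(x, w, b)           # shape: (1, 8, 32, 32)
```

Because padding=1 with a 3x3 kernel and stride 1 preserves spatial dimensions, the output keeps the 32x32 resolution while the channel count becomes 8.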
Solution
Leveraging our expertise in end-to-end model inference pipelines across various customer hardware, our team of solution architects successfully added support for models such as Stable Diffusion, Llama2, RoBERTa, Swin, and various other CNN, NLP, and transformer-based models on different architectures of the customer’s hardware. The correctness of each model inference pipeline was verified against PyTorch reference code using the Pearson correlation coefficient (PCC) metric.
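PCC-based verification compares the accelerator's output tensor against the PyTorch reference element-wise. A minimal pure-Python sketch is below; the 0.99 threshold is a hypothetical tolerance for illustration, not the customer's actual acceptance criterion.

```python
import math

def pcc(ref, out):
    """Pearson correlation coefficient between two flattened outputs."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_o = sum(out) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(ref, out))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_o = sum((o - mean_o) ** 2 for o in out)
    return cov / math.sqrt(var_r * var_o)

# Illustrative check: outputs that differ only by scale still give PCC = 1.
reference = [0.1, 0.5, -0.3, 0.9]
device_out = [0.2, 1.0, -0.6, 1.8]
assert pcc(reference, device_out) > 0.99   # passes the hypothetical threshold
```

In practice the two inputs would be the flattened reference tensor and the tensor read back from the accelerator, so a high PCC indicates the hardware pipeline reproduces the reference computation up to small numerical differences.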
With MulticoreWare’s expertise and the rapid development of APIs and features, we swiftly adapted to new APIs and enhanced models by analyzing memory layouts and configurations despite minimal documentation. At the customer’s request, we conducted unit tests of the operations for select models across various input resolutions.
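Per-operation unit testing across input resolutions can be sketched like this. The resolutions, the chosen op (average pooling), and the device stand-in are all hypothetical; on real hardware the second result would come from the accelerator's NN API rather than a second PyTorch call.

```python
import torch
import torch.nn.functional as F

# Hypothetical resolution sweep for a single operation's unit test.
RESOLUTIONS = [(224, 224), (256, 256), (512, 512)]

def check_avg_pool(h, w):
    x = torch.randn(1, 16, h, w)
    ref = F.avg_pool2d(x, kernel_size=2, stride=2)
    # Stand-in for the accelerator run; in a real test this would be
    # the output read back through the vendor's custom NN library.
    dev = F.avg_pool2d(x, kernel_size=2, stride=2)
    assert torch.allclose(ref, dev, atol=1e-5), f"mismatch at {h}x{w}"
    return tuple(ref.shape)

for h, w in RESOLUTIONS:
    shape = check_avg_pool(h, w)   # 2x2/stride-2 pooling halves h and w
```

Sweeping resolutions in this way surfaces operation variants (padding modes, odd dimensions, large tensors) that a single canonical input size would miss, which is how unsupported variants were identified and reported.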
Despite limited documentation and a rapidly evolving repository, our team met the customer’s requirements, demonstrating our ability to quickly learn new technologies and deliver quality results.
Solution Highlights
- Developed an end-to-end model inference pipeline for 35+ models for the customer’s hardware architecture using their NN APIs.
- Benchmarked more than 15 CNN and NLP models against public datasets.
- Conducted unit testing and reported unsupported operation variants and issues for over 15 models.
Business Impact
MulticoreWare enhanced the customer’s market competitiveness by offering a comprehensive AI ecosystem, attracting a broader customer base. The project also increased revenue opportunities through higher adoption of their AI hardware and APIs, leading to business growth for the customer.
Conclusion
MulticoreWare demonstrated proficiency in creating end-to-end model inference pipelines, rapidly adapting to evolving APIs, and performing effective benchmarking and unit testing. Discover how we can help you achieve innovative results. Contact our team at info@multicorewareinc.com.