Damian Bogunowicz, Neural Magic: On the Deep Learning Revolution with CPUs

AI News spoke with Damian Bogunowicz, a machine learning engineer at Neural Magic, to shed light on the company’s innovative approach to deep learning model optimization and CPU inference.

One of the major challenges in developing and implementing deep learning models lies in their size and computational requirements. However, Neural Magic tackles this problem head-on through a concept called compound sparsity.

Compound sparsity combines techniques such as unstructured pruning, quantization, and distillation to significantly reduce the size of neural networks while maintaining their accuracy.
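To make two of these techniques concrete, here is a minimal, illustrative sketch of unstructured magnitude pruning followed by symmetric int8 quantization on a random weight matrix. This is a toy example, not Neural Magic's actual pipeline, which applies these algorithms during training with its own tooling:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)

# Unstructured pruning: zero out the 90% of weights with the
# smallest magnitudes, with no constraint on where they sit.
sparsity = 0.90
threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) > threshold, weights, 0.0)

# Symmetric int8 quantization of what survives: map the largest
# remaining magnitude to 127 and round everything to integers.
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)

print(f"fraction zeroed: {(pruned == 0).mean():.2f}")  # ~0.90
```

A sparsity-aware runtime can then skip the zeroed entries entirely and operate on the compact int8 values, which is what makes the compressed model fast on commodity CPUs.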

“We developed our own sparsity-aware runtime that leverages the CPU architecture to accelerate sparse models. This approach challenges the notion that GPUs are necessary for effective deep learning,” Bogunowicz explains.

Bogunowicz highlighted the benefits of their approach, pointing out that more compact models lead to faster deployments and can run on ubiquitous CPU-based machines. The ability to optimize and run specific networks efficiently without relying on specialized hardware is a game-changer for machine learning professionals, enabling them to overcome the limitations and costs associated with GPU usage.

When asked about the suitability of sparse neural networks for enterprises, Bogunowicz explained that the vast majority of companies can benefit from using sparse models.

By removing up to 90% of parameters without impacting accuracy, companies can achieve more efficient deployments. While highly critical domains like autonomous driving or autonomous aircraft may require the highest accuracy and lowest sparsity, the benefits of sparse models outweigh the limitations for most businesses.

Looking ahead, Bogunowicz expressed excitement about large language models (LLMs) and their applications.

“I am particularly excited about the future of LLMs. Mark Zuckerberg has discussed enabling AI agents that act as personal assistants or salespeople on platforms like WhatsApp,” says Bogunowicz.

One example that caught his eye was a chatbot used by Khan Academy, an AI tutor that guides students through problem solving by making suggestions rather than revealing outright solutions. This application demonstrates the value that LLMs can bring to the education sector by facilitating the learning process and helping students develop problem-solving skills.

“Our research has shown that LLMs can be optimized efficiently for CPU deployment. We published a research paper on SparseGPT demonstrating the removal of about 100 billion parameters using one-shot pruning without compromising model quality,” Bogunowicz explains.
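Some rough, back-of-the-envelope arithmetic shows why removing on the order of 100 billion parameters matters for inference hardware. The model size and byte counts below are assumptions for illustration (a 175-billion-parameter model stored in 16-bit floats; the article does not name the model or format):

```python
# Hypothetical figures: a 175B-parameter LLM in fp16, with one-shot
# pruning removing 100B parameters, the scale the article cites.
total_params = 175e9
removed_params = 100e9
remaining = total_params - removed_params

bytes_per_param_fp16 = 2
dense_gb = total_params * bytes_per_param_fp16 / 1e9       # 350.0 GB
sparse_values_gb = remaining * bytes_per_param_fp16 / 1e9  # 150.0 GB
# A real sparse format also stores indices, so the actual footprint
# depends on the storage scheme; this counts nonzero values only.

print(dense_gb, sparse_values_gb)
```

Shrinking the working set by more than half is what moves a model of this scale from requiring a GPU cluster toward fitting in the RAM of large CPU servers.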

“This means that there may be no need for GPU clusters in the future of AI inference. Our aim is to provide open-source LLMs to the community soon and enable companies to have control over their own products and models, rather than relying on big technology companies.”

Regarding the future of Neural Magic, Bogunowicz revealed two exciting developments that they will share at the upcoming AI & Big Data Expo Europe.

First, they will showcase their support for running AI models on edge devices, specifically x86 and ARM architectures. This expands the possibilities for AI applications in various industries.

Second, they will unveil their model optimization platform, Sparsify, which enables the seamless application of state-of-the-art pruning, quantization, and distillation algorithms via an intuitive web app and simple API calls. Sparsify aims to accelerate inference without sacrificing accuracy, giving enterprises an elegant and intuitive solution.

Neural Magic’s commitment to democratizing machine learning infrastructure by leveraging CPUs is impressive. Their focus on compound sparsity and their upcoming advances in edge computing demonstrate their dedication to empowering businesses and researchers.

While we eagerly await the developments presented at AI & Big Data Expo Europe, it is clear that Neural Magic is poised to make a significant impact in the field of deep learning.

You can watch our full interview with Bogunowicz below:

(Photo by Google DeepMind on Unsplash)

Neural Magic is a key sponsor of this year’s AI & Big Data Expo Europe, taking place in Amsterdam from 26-27 September 2023.

Visit Neural Magic at booth 178 to learn more about how the company enables organizations to use computationally intensive models in a cost-effective and scalable way.

  • Ryan Daws

    Ryan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be spotted at tech conferences with a strong coffee in one hand and a laptop in the other. If it’s geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)

Tags: ai expo, ai expo europe, ai inference, compound sparsity, damian bogunowicz, deep learning, large language models, neural magic
