Edge Impulse, the leading platform for building, refining and deploying machine learning models to edge devices, has launched new capabilities that leverage generative AI to create and manage synthetic data on the edge, from images to speech to audio data.
The Synthetic Data integration offers a new and efficient way to use Edge Impulse for LLM-based data creation, enabling DALL-E for image generation, Whisper for creating speech elements for keyword spotting and ElevenLabs for audible events. Enterprise customers also have the option to add custom LLM sources such as other data providers or self-hosted LLMs. Other LLM toolkits will be added in the coming months. These new features are in addition to Edge Impulse’s existing direct integration with NVIDIA Omniverse Replicator, a framework for developing custom synthetic data generation pipelines to generate highly realistic, physically based datasets tailored to train computer vision models.
Within the new Synthetic Data integration, users can add and refine their prompts quickly. The output, including images and audio fragments, is then displayed, allowing users to evaluate the results and adjust their prompts until they get the desired dataset:
- DALL-E Image Generation Block: Generate image datasets using the DALL-E model.
- Whisper Keyword Spotting Generation Block: Generate keyword-spotting datasets using the Whisper model. Ideal for keyword spotting and speech recognition applications.
- ElevenLabs Synthetic Audio Block: Generate audible events – such as glass breaking or alarm sounds – using the ElevenLabs Sound Effects model.
- Custom LLM Sources: Connect to other LLM data providers or self-hosted LLMs using transformation blocks, including Edge Impulse’s existing integration with GPT-4o for labeling image data.
This iterative workflow will make it easier to determine the right prompts for generating data. Additionally, any data that is not deleted will automatically be added to the project, ensuring seamless data management.
This new toolset streamlines the process of generating and refining prompts to create the desired dataset. It provides an efficient workflow for building models with synthetic data and makes it easier for developers to create high-quality datasets using generative AI.
SOURCE: BusinessWire