Inference at the Edge: Gcore ensures seamless real-time performance of AI applications


Gcore, a global provider of edge AI, cloud, networking, and security solutions, has announced the launch of Inference at the Edge, a solution that enables ultra-low-latency AI applications by distributing pre-trained machine learning (ML) models to edge inference nodes.

Gcore Inference at the Edge enables companies across a wide range of industries, including manufacturing, automotive, retail, and technology, to deploy AI models cost-effectively, scalably, and securely. Use cases such as generative AI, object recognition, real-time behavioral analytics, virtual assistants, and production monitoring can now be implemented quickly and globally.

The service runs on Gcore’s global network of more than 180 edge nodes connected by low-latency smart routing technology. Each high-performance node sits at the edge of the Gcore network, placing servers close to end users, and is equipped with NVIDIA L40S GPUs, which are purpose-built for AI inference. When a user sends a request, the receiving edge node routes it to the available inference region with the lowest latency, achieving a typical response time of under 30 ms.
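In principle, this kind of latency-aware routing amounts to probing candidate inference regions and forwarding each request to the fastest reachable one. The Python sketch below is a minimal illustration of that idea only; the region names, URLs, and the /healthz probe path are hypothetical and do not describe Gcore's actual implementation.

    import time
    import requests

    # Hypothetical inference regions; a real system would discover nodes dynamically.
    REGIONS = {
        "eu-west": "https://eu-west.inference.example.com",
        "us-east": "https://us-east.inference.example.com",
        "ap-south": "https://ap-south.inference.example.com",
    }

    def probe_latency(base_url: str, timeout: float = 1.0) -> float:
        """Measure round-trip time to a region's health endpoint (hypothetical path)."""
        start = time.monotonic()
        try:
            requests.get(f"{base_url}/healthz", timeout=timeout).raise_for_status()
        except requests.RequestException:
            return float("inf")  # Unreachable regions are never selected.
        return time.monotonic() - start

    def pick_region() -> str:
        """Return the reachable region with the lowest measured latency."""
        latencies = {name: probe_latency(url) for name, url in REGIONS.items()}
        return min(latencies, key=latencies.get)

    if __name__ == "__main__":
        print("Routing to:", pick_region())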


The new solution supports a broad range of foundation ML models as well as custom models. Open-source foundation models available in the Gcore ML Model Hub include LLaMA Pro 8B, Mistral 7B, and Stable Diffusion XL. Models can be selected on demand and fine-tuned before being distributed globally to the edge inference nodes. This addresses a key problem for development teams: AI models are typically served from the same servers on which they were trained, which degrades response times for distant end users.
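Once a model is deployed, an application typically reaches it through a plain HTTPS endpoint. The sketch below shows what calling a deployed text-generation model could look like; the URL, token, and JSON payload shape are assumptions made for illustration, since the article does not specify Gcore's request schema.

    import requests

    # Hypothetical endpoint and schema; consult the provider's API docs for the real ones.
    ENDPOINT = "https://example-inference-endpoint.example.com/v1/generate"
    API_TOKEN = "YOUR_API_TOKEN"  # Placeholder credential.

    def generate(prompt: str, max_tokens: int = 128) -> str:
        """Send a prompt to the deployed model and return the generated text."""
        response = requests.post(
            ENDPOINT,
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            json={"prompt": prompt, "max_tokens": max_tokens},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()["text"]  # Assumed response field name.

    if __name__ == "__main__":
        print(generate("Summarize the benefits of edge inference in one sentence."))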

Other benefits of Gcore Inference at the Edge include:

  • Cost-effective deployment: A flexible pricing structure ensures that customers only pay for the resources they actually use.
  • Built-in DDoS protection: Gcore’s infrastructure automatically protects ML endpoints from DDoS attacks.
  • Data protection and security: The solution is designed for GDPR compliance and adheres to the PCI DSS and ISO/IEC 27001 standards.
  • Model autoscaling: Models scale automatically, so capacity is available at any time to absorb peak demand and unexpected load spikes.
  • Unlimited object storage: Scalable S3-compatible cloud storage grows with the model’s needs, giving users a flexible, cost-effective, and reliable store that is easy to integrate (see the sketch after this list).
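Because the storage is S3-compatible, any standard S3 client can talk to it by overriding the endpoint URL. A minimal sketch with boto3 follows; the endpoint, bucket name, and credentials are placeholders, not published Gcore values.

    import boto3

    # Hypothetical endpoint and credentials; any S3-compatible store works the same way.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://storage.example.com",  # Placeholder S3-compatible endpoint.
        aws_access_key_id="YOUR_ACCESS_KEY",
        aws_secret_access_key="YOUR_SECRET_KEY",
    )

    # Upload model weights, then list what the bucket currently holds.
    s3.upload_file("model.safetensors", "my-models", "mistral-7b/model.safetensors")
    for obj in s3.list_objects_v2(Bucket="my-models").get("Contents", []):
        print(obj["Key"], obj["Size"])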

Andre Reitenbach, Managing Director of Gcore, explains: “With Gcore Inference at the Edge, our customers can focus on optimizing their machine learning models instead of worrying about the costs, capacity and infrastructure required to deploy AI applications globally. At Gcore, we believe that the best performance and end-user experience is achieved at the edge of the network. That’s why we continuously evolve our solutions to ensure that every customer experiences unrivaled scale and performance. Gcore Inference at the Edge is powerful but straightforward, providing modern, effective and efficient real-time data processing using artificial intelligence.”

SOURCE: BusinessWire