NVIDIA Boosts AI Factories With DPU-Enhanced Kubernetes Service Proxy
As the landscape of artificial intelligence (AI) continues to evolve, NVIDIA is paving the way for more efficient AI factories with its data processing unit (DPU)-accelerated service proxy for Kubernetes. According to NVIDIA, this innovation is designed to streamline the deployment of complex AI workflows, enhancing both performance and operational efficiency.
Enhanced AI Application Deployment
AI applications have become increasingly sophisticated, transitioning from basic model training to advanced planning and reasoning tasks. This evolution necessitates a robust infrastructure capable of supporting agentic AI applications. NVIDIA’s solution involves a software-defined, hardware-accelerated application delivery controller (ADC), which is powered by NVIDIA BlueField-3 DPUs. This setup allows for dynamic load balancing, robust security, and efficient cloud-native multi-tenancy.
Since the introduction of OpenAI’s ChatGPT in 2022, AI has expanded from simple GPU-based model training to distributed inference. Large language models (LLMs) now integrate enterprise data and employ reasoning models, such as DeepSeek R1, to solve complex problems. NVIDIA’s digital human blueprint exemplifies this advancement, utilizing containerized NVIDIA Inference Microservices (NIM) to create a cohesive agentic workflow.
Optimizing AI Operations
The BlueField-3 DPU plays a crucial role in optimizing data movements within AI clouds. By combining high-performance acceleration engines with efficient Arm compute cores, BlueField enhances performance and flexibility, crucial for programming agentic AI data flows. NVIDIA’s reference architecture for sovereign AI cloud operators underlines the importance of BlueField in managing north-south networking for GPU clusters.
F5 BIG-IP Next for Kubernetes
F5’s BIG-IP Next for Kubernetes (BINK) ADC, when accelerated by BlueField-3, provides necessary infrastructure optimizations for AI clouds. This solution offers high-performance networking, zero-trust security, and efficient resource utilization across multiple customers. BINK's capabilities are particularly beneficial for cloud-native multi-tenancy, allowing efficient GPU resource management without overprovisioning.
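To make the multi-tenancy point concrete, here is a minimal, hypothetical sketch of the idea behind "efficient GPU resource management without overprovisioning": a fixed pool of GPUs is shared across tenants, and a request is rejected rather than oversubscribed when it would exceed capacity. This is an illustration only, not the BINK or BlueField API; the class and method names are invented for the example.

```python
# Hypothetical sketch: a tenant-aware GPU allocator that hands out
# shares of a fixed pool and refuses requests that would overprovision
# the cluster. Not the BINK API; names are illustrative.

class GpuPool:
    def __init__(self, total_gpus: int):
        self.total = total_gpus
        self.allocations: dict[str, int] = {}

    def allocated(self) -> int:
        # GPUs currently granted across all tenants
        return sum(self.allocations.values())

    def request(self, tenant: str, gpus: int) -> bool:
        """Grant the request only if it fits in the remaining capacity."""
        if self.allocated() + gpus > self.total:
            return False  # would overprovision: reject instead of oversubscribing
        self.allocations[tenant] = self.allocations.get(tenant, 0) + gpus
        return True

    def release(self, tenant: str) -> None:
        # Return a tenant's GPUs to the shared pool
        self.allocations.pop(tenant, None)
```

For example, on an 8-GPU pool, a 6-GPU grant followed by a 4-GPU request fails (it would need 10), while a 2-GPU request succeeds; releasing a tenant frees its share for others.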
Additionally, BINK improves energy efficiency by offloading the data path from host CPUs to BlueField’s power-efficient Arm cores, freeing host cycles for application workloads and delivering more throughput per watt on the network path.
Case Study: SoftBank
SoftBank, a leader in supercomputing, conducted a proof of concept (PoC) using BINK on an NVIDIA H100 GPU cluster. The results demonstrated significant improvements in network performance and resource utilization. BINK achieved 77 Gbps of throughput without consuming any host CPU cores, significantly outperforming open-source alternatives such as Nginx, which consumed 30 host cores to deliver lower throughput.
In terms of latency, BlueField-powered BINK cut HTTP GET response times 11-fold compared with Nginx. BlueField-accelerated BINK also showed 99% lower CPU utilization and 190 times higher network energy efficiency than the Nginx baseline.
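A figure like "190 times higher network energy efficiency" is a ratio of throughput per watt between the two data paths. The sketch below shows how such a ratio is computed; the wattage and Nginx throughput numbers are illustrative assumptions for the example, not measurements from the SoftBank PoC, so the resulting ratio differs from the published 190x figure.

```python
# Back-of-the-envelope sketch: "network energy efficiency" as
# throughput per watt. All power and Nginx throughput figures below
# are assumptions for illustration, not SoftBank PoC measurements.

def network_energy_efficiency(throughput_gbps: float, power_watts: float) -> float:
    """Gbps delivered per watt consumed by the data path."""
    return throughput_gbps / power_watts

# DPU-offloaded proxy: 77 Gbps (from the PoC) at an assumed 75 W DPU budget.
dpu_eff = network_energy_efficiency(77.0, 75.0)

# Host-CPU proxy: assumed 20 Gbps using 30 cores at an assumed 30 W per core.
host_eff = network_energy_efficiency(20.0, 30 * 30.0)

# Relative gain of the offloaded path under these assumptions.
gain = dpu_eff / host_eff  # ~46x with these illustrative numbers
```

The actual 190x figure reflects SoftBank's measured throughput and power draw; the formula, not the inputs, is the point of the sketch.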
Conclusion
The collaboration between NVIDIA and F5 marks a significant advancement in AI infrastructure, offering enhanced performance, security, and efficiency. SoftBank's PoC results underscore the potential of offloading and accelerating application delivery with DPUs, positioning AI factories to meet the rigorous demands of contemporary AI workloads.
For a detailed exploration of these capabilities, refer to the NVIDIA blog.