Automatic NF acceleration in ACES

Network Functions (NFs) are pervasive in today’s networks. They implement core network functionality: fundamental features like bridging and network address translation; performance features like WAN optimizers and load balancers; and security features like firewalls, port-scan detectors, and intrusion detection systems.

The ACES network will incorporate accelerated software NFs across the infrastructure to meet the requirements of its use cases. A crucial challenge is guaranteeing the required performance while keeping the flexibility offered by software.

Context. NFs were originally implemented as fixed-function, closed-source appliances, but there has recently been a transition to implementing them in software on commodity off-the-shelf servers. Software NFs gain flexibility and ease of deployment, but at the cost of a harder performance challenge: to process packets at current line-rate speeds (100+ Gbps), one must resort to multiple CPU cores. The difficulty is doing so without breaking the NF’s core functionality.

At high line rates (e.g., 100 Gbps), each packet leaves only a very short processing budget, making inter-core coordination complex and costly. Avoiding this synchronization is both difficult and error-prone: it requires a deep understanding of the NF, a meticulous implementation, and careful avoidance of common parallelization pitfalls.

Automatic NF parallelization. In ACES, we advocate a paradigm shift in NF parallelization: the burden of parallelization should not fall on the developer but should instead be carried automatically by the compiler. This approach lets developers reason about their NFs with a sequential mindset while reaping the full benefits of parallelization. In this context, we developed Maestro, a system that automatically parallelizes software network functions.

Maestro uses static-analysis tools to analyze the sequential implementation of the NF and automatically generates an enhanced parallel version that carefully configures the NIC to distribute traffic across cores while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic.
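The shared-nothing idea can be illustrated with a toy Python sketch (this is an illustration, not Maestro’s generated code: `core_for` stands in for the NIC’s RSS hash, and the per-core tables mimic partitioned per-flow NF state):

```python
import hashlib

NUM_CORES = 4

# Hypothetical shared-nothing layout: one private per-flow table per core,
# never touched by any other core, so no locks are needed.
per_core_state = [dict() for _ in range(NUM_CORES)]

def core_for(flow):
    # Stand-in for the NIC's RSS hash: any deterministic hash of the
    # five-tuple works, as long as every packet of a flow lands on the
    # same core.
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_CORES

def process(flow):
    core = core_for(flow)
    table = per_core_state[core]          # only this core's state is touched
    table[flow] = table.get(flow, 0) + 1  # e.g., a per-flow packet counter
    return core
```

Because all packets of a flow are steered to the same core, each core can update its table without any inter-core coordination; the hard part, which Maestro automates, is proving that the NF’s state can actually be partitioned this way.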

To find a shared-nothing solution, Maestro analyzes the NF and infers how state and packets should be partitioned across cores to avoid synchronization altogether (i.e., a sharding strategy). To realize this strategy, Maestro formulates it as an SMT problem and uses a solver (e.g., Z3) to find a NIC configuration that enforces it. Finally, it configures the NIC accordingly and automatically generates performance-oriented parallel code, handling the pitfalls of parallel programming on the developer’s behalf.

Evaluation. We parallelized 8 common software NFs. The Figure shows how their performance scales with the number of cores. They generally scale linearly until bottlenecked by PCIe when processing small packets (an optimal outcome), or by the 100 Gbps line rate with typical Internet traffic. Maestro also outperforms modern hardware-based transactional-memory mechanisms, even on challenging, parallel-unfriendly workloads.

Maestro was presented at NSDI ’24, and its source code is available on GitHub.
