Google re-invents Avici — Aquila

Adam Dunstan
3 min readApr 18, 2022

--

Just this weekend I was thinking about previous projects and how today’s hardware advances would impact them. In particular I was thinking about Avici Systems, a startup from the 90’s Internet era that developed a Backbone Router based upon supercomputer technology, one of the most interesting projects I have been involved. Designed for massive scale there were few providers that had matching requirements. However, as projected the Internet continued to grow, unfortunately the funding and market enthusiasm for Internet technology did not. Using process technology from that era made it expensive to build and after moderate success Avici shut its doors, and its customers decommissioned the equipment.

Avici TSR

This morning I read a paper on Googles new data-center fabric Aquila, the similarities are striking. (Usenix paper The Next Platform article )

The Avici Terabit Switch Router was based upon a supercomputer fabric technology called a k-ary n-cube, a toroidal mesh. Each network line card was connected to six other cards to create the fabric. Just as described in the Aquila paper, a number of problems needed to be solved.

  1. Virtual Circuits to avoid Head-of-Line blocking. The original k-ary n-cube was the Cray T3D super computer, Dr. Bill Dally (now Chief Scientist at Nvidia and was an advisor to Avici) is considered to have invented the machine. It however suffered from head-of-line blocking. Avici solved this problem using a Virtual Circuit overlay so a process could not block a fabric link.
  2. Cells not packets. This was the 90’s, the networking world was coming to terms with how much money had been wasted on Asynchronous Transfer Mode (ATM). I am still at a loss as to how a 53byte cell ended up being the format, not an even number not a logical boundary… However, variable length packets are problematic in fabrics, especially with the process technology available in the 90’s. So Avici called them “flits”, a fixed cell that was used over the fabric. By using cells instead of packets, the amount of on chip memory and the impact of memory instructions is reduced, therefore reducing latency. Latency management is important in these types of fabrics where traffic transits fabric nodes. Google uses cells with some variability thanks to modern process technology.
  3. Unequal cost fabric routing. In addition to computing the IP routing table, a routing table inside the fabric had to be computed and it needs to use multiple unequal cost paths to achieve cross sectional bandwidth while accommodating ordering for protocols like TCP. Bandwidth is problematic in smaller n-cubes but is alleviated as the n-cube gets bigger with each fabric node having more links. The ordering problem is easily solved with L3/4 hashing that is common in any load balancer today.

As Google use case is data center fabric, not Internet Core Router, there are some additional differences however the concept remains the same. As we were looking for ways to grow Avici, we did discuss data center fabrics like Google have documented, but it would have been an expensive project and the scale was not yet required.

Building the Avici Router in the 90’s was expensive, there were seven custom ASICs. A few of them had gate counts that pushed the boundaries of simulation and process technology at the time. I recall months of manufacturing problems with ASIC ball solder flow failures, a process that is common in today’s manufacturing.

Even though Avici was not a durable commercial success like our competitor Juniper, if I had my time over I would still choose Avici. I worked with some of the smartest people I have ever known at that company, we together solved the unsolvable.

Now 20 years later, Google invents Avici……

--

--

Adam Dunstan

Tech enthusiast, infrastructure specialist, leader & engineer