Abstract
HopliteBuf is a deflection-free, low-cost, high-speed FPGA overlay Network-on-Chip (NoC) with stall-free buffers. It uses an FPGA-friendly 2D unidirectional torus topology and builds on the HopliteRT overlay NoC. The stall-free buffers in HopliteBuf are supported by static analysis tools based on network calculus, which determine worst-case FIFO occupancy bounds for a prescribed workload. We implement these FIFOs using cheap LUT SRAMs (Xilinx SRL32s and Intel MLABs) to reduce cost. HopliteBuf is a hybrid microarchitecture: its stall-free buffers provide the performance benefits of conventional buffered NoCs, while its lightweight unidirectional torus topology retains the cost advantages of deflection-routed NoCs. We present two design variants of the HopliteBuf NoC: (1) single corner-turn FIFO (W -> S) and (2) dual corner-turn FIFO (W -> S + N). The single corner-turn (W -> S) design is simpler and only introduces a buffering requirement for packets changing dimension from the X ring to the downhill Y ring (West to South). The dual corner-turn variant requires two FIFOs, one for packets turning downhill (W -> S) and one for packets turning uphill (W -> N). At the expense of a small increase in resource cost, the dual corner-turn design overcomes the mathematical analysis challenges that single corner-turn designs face for communication workloads with cyclic dependencies between flow traversal paths. Our static analysis delivers latency bounds that are not only lower than those of HopliteRT but also 2-3x tighter. Across 100 randomly generated flowsets mapped to a 5x5 system, HopliteBuf is able to route a larger fraction of these flowsets with FIFO depths below 128, improve worst-case routing latency by approximately 2x for mutually feasible flowsets, and support a 10% higher injection rate than HopliteRT. At 20% injection rates, HopliteRT is only able to route 1-2% of the flowsets, while HopliteBuf can deliver 40-50% sustainability.
Compared to W -> S-bkp, a backpressure-based router, HopliteBuf offers 25-30% better feasibility at 30-40% lower LUT cost.