Abstract
Simulation is the main tool that computer architects and parallel application developers use to develop new architectures and parallel algorithms for many-core machines. Simulating a many-core architecture is a challenge for software (SW) simulators, even when these simulators are parallelized on multi-cores. Field Programmable Gate Arrays (FPGAs) offer an excellent implementation platform due to their inherent parallelism. Existing FPGA-based simulators, however, are mostly execution-driven, which consumes too many FPGA resources. Hence, they still trade off accuracy for simulation speed, as SW simulators do. In this work, an application-level, trace-driven, FPGA-based many-core simulator is presented. A parameterized Verilog template was developed that can generate any number of simulator tiles. The input trace has an architecture-agnostic format that is directly interpreted by the FPGA-based timing model to reconstruct the execution events of the original application with accurate timing. This allows a large number of simulation tiles to fit on a single FPGA without sacrificing simulation speed or accuracy. Experimental results show that the simulator's average accuracy is approximately 14 percent, with simulation speeds ranging from hundreds of MIPS to over 2,200 MIPS for a 16-core target architecture. Hence, with accuracy similar to that of SW simulators, its speed is higher than that of all other FPGA-based simulators.