Abstract
Modern, fast microprocessors are deeply pipelined to enhance their performance. Thus they cannot afford to wait for each instruction to complete before starting the next. When inter-instruction dependencies are encountered it is essential that data are forwarded from their point of production to where they are needed as rapidly as possible. This has been a problem in asynchronous processors because of the lack of synchronisation between the units producing and consuming the data. This paper presents a solution to this problem. The mechanism described allows the depth of speculative execution to be increased, improving memory efficiency by hiding the load latency yet still allowing precise exceptions.