Network On Chip implementation for Tinuso multicore configurations

Rune Ploug

Abstract: The DTU IMM ESE section is developing the FPGA-based many-core Tinuso processor for research into future multi- and many-core processor architectures. The focus of this thesis is to develop the infrastructure needed for Tinuso cores to access main memory over a Network On Chip (NOC).
A NOC is of interest because the Tinuso core is designed for multi- and many-core implementations, so a Tinuso processor can contain more than one core. Implementing this multi-core architecture requires several technologies, and the NOC is one of the main ones. The Tinuso core has already been developed and can access memory directly via an interface. For multiple cores in one processor to access memory, a NOC is needed. This in turn requires a Network Interface Controller between core and NOC, a routing and switch component, and a memory controller with a network interface.
In this project a NOC solution is designed for the Tinuso processor cores so that they can access main memory over the NOC. The design was derived by analyzing current theory and design concepts against a specific prioritization of requirements. This prioritization was reached through agile analysis of the base requirements and ideas for extensions, and by dividing the base requirements into sub-components and technologies. The chosen design is a 2D torus mesh with a YX routing algorithm with a torus-routing twist. The design was implemented in VHDL for FPGAs and tested against a simulated Tinuso core interface. This interface was designed in cooperation with the Tinuso core developer [1], who developed his own test-bench. As a result, slight differences were found between the actual core's expectations and what was used in the simulations here. The differences are documented in the results.
Clock frequency results obtained from synthesis indicate that the NOC has to run at about half the clock frequency of the Tinuso cores [1]. This was expected, as the implementation has not been optimized for clock frequency. These are the raw results from the first total system test of an experimental prototype: the system was synthesized for an FPGA that is too small to even map it in synthesis.
The implemented solution demonstrates the feasibility of the design and network protocol under test. It also demonstrates how many challenges there are in designing and implementing deadlock-free solutions in concurrent systems. Concurrent access to main memory from multiple cores failed in many cases in the test-bench as a result of a specific deadlock situation. Most of the deadlocks were expected, as the testing went beyond the requirements for this version of the NOC system. This was done to test support for interesting extensions such as core-to-core communication or cache coherency. Several solutions to the few deadlock situations experienced in testing have been suggested, demonstrating that there are many ways, with different tradeoffs, to handle deadlocks.
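The abstract names a 2D torus with YX routing but does not spell out the algorithm. The following is only an illustrative sketch of dimension-ordered YX routing on a torus, not the thesis's VHDL implementation. It assumes (x, y) node coordinates, that a packet first travels along Y and then along X, that each dimension takes the shorter way around its ring, and that ties go in the positive direction. The function names `torus_step` and `yx_torus_route` are hypothetical.

```python
def torus_step(src, dst, size):
    """Return -1, 0 or +1: the next hop direction along one ring of `size` nodes."""
    if src == dst:
        return 0
    forward = (dst - src) % size    # hops needed moving in the + direction
    backward = (src - dst) % size   # hops needed moving in the - direction
    # Assumption: ties are broken toward the + direction.
    return 1 if forward <= backward else -1


def yx_torus_route(src, dst, width, height):
    """List the nodes visited from src to dst, routing in Y first, then in X."""
    x, y = src
    path = [(x, y)]
    # Phase 1: resolve the Y coordinate, wrapping around the ring if shorter.
    while y != dst[1]:
        y = (y + torus_step(y, dst[1], height)) % height
        path.append((x, y))
    # Phase 2: resolve the X coordinate the same way.
    while x != dst[0]:
        x = (x + torus_step(x, dst[0], width)) % width
        path.append((x, y))
    return path


if __name__ == "__main__":
    # On a 4x4 torus, going from (0, 0) to (3, 1) uses the X wraparound link.
    print(yx_torus_route((0, 0), (3, 1), 4, 4))
```

In this sketch the wraparound link lets the packet reach column 3 from column 0 in one hop instead of three, which is the benefit a torus offers over a plain mesh. Wraparound links also add cyclic channel dependencies, which is one general reason deadlock handling matters in torus networks.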
Type: Bachelor of Engineering thesis [Academic thesis]
Year: 2011
Publisher: Technical University of Denmark, DTU Informatics, E-mail: reception@imm.dtu.dk
Address: Asmussens Alle, Building 305, DK-2800 Kgs. Lyngby, Denmark
Series: IMM-B.Eng.-2010-67
Note: Supervised by Associate Professor Sven Karlsson, ska@imm.dtu.dk, DTU Informatics
Publication link: http://www.imm.dtu.dk/English.aspx
IMM Group(s): Computer Science & Engineering