ISSN ONLINE(2278-8875) PRINT (2320-3765)


Fast Flexible FPGA-Tuned Networks-on-Chip

Deepan Raj. B (1), T. V. P. Sundararajan (2), K. Shoukath Ali (3)
  1. PG Scholar, Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India
  2. Professor, Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India
  3. Assistant Professor, Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India

Visit for more related articles at International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

Abstract

This work designs 4x4 Mesh, Fat Tree16, Ring16 and Double Ring networks using CONNECT-based NoCs, which embody a set of FPGA-motivated design principles that uniquely influence key NoC design decisions such as topology, flit width, router pipeline depth and flow control. The flexibility, lightweight nature and high performance of CONNECT-based NoCs make them ideal candidates for use in FPGA-based research studies. In this project, these networks are evaluated on different FPGA families, and the most efficient family is identified from synthesis results obtained with the Xilinx ISE software. For the 4x4 Mesh configuration, comparing the synthesis results across FPGA families shows a reduced logic resource cost. To demonstrate CONNECT's flexibility and extensive design space coverage, different CONNECT networks are synthesized.

Keywords

CONNECT, NoC, FPGA, Xilinx

INTRODUCTION

Network-on-Chip (NoC) is an emerging paradigm for communication within large VLSI systems implemented on a single silicon chip. The layered-stack approach to the design of on-chip intercore communication is called the Network-on-Chip (NoC) methodology. In a NoC system, modules such as processor cores, memories and specialized IP blocks exchange data using a network as a public transportation sub-system for the information traffic. Messages are relayed from source to destination over the network links, with routing decisions made at the switches (routers). A NoC is similar to a modern telecommunications network, using digital bit-packet switching over multiplexed links. Although packet switching is sometimes claimed to be a necessity for a NoC, there are several NoC proposals that use circuit-switching techniques. This router-based definition is usually interpreted to mean that a single shared bus, a single crossbar switch or a point-to-point network is not a NoC, whereas practically all other topologies are. This is somewhat confusing, since all of the above are networks (they enable communication between two or more devices) yet they are not considered networks-on-chip. This paper is organised as follows: Section II introduces NoC terminology. Section III describes the CONNECT router architecture. Section IV presents the flit data width analysis. Section V gives the conclusion and future work.

NoC TERMINOLOGIES

A. Packets
Packets are the basic logical unit of transmission at the endpoints of a network.
B. Flits
When traversing a network, packets, especially large ones, are broken into flits (flow control digits), which are the basic unit of resource allocation and flow control within the network. Some NoCs require special additional header or tail flits to carry control information and to mark the beginning and end of a packet [10].
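The splitting of a packet into flits can be illustrated with a minimal sketch. It assumes head and tail markers are carried as flags on ordinary flits rather than as separate header or tail flits; the `Flit` class and `packet_to_flits` function are hypothetical names for illustration and are not part of CONNECT.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    payload: bytes  # at most flit_bytes bytes of packet data
    is_head: bool   # True for the first flit of a packet
    is_tail: bool   # True for the last flit of a packet


def packet_to_flits(packet: bytes, flit_bytes: int) -> List[Flit]:
    """Split a packet into flits of at most flit_bytes bytes each."""
    if flit_bytes <= 0:
        raise ValueError("flit_bytes must be positive")
    if not packet:
        raise ValueError("packet must not be empty")
    chunks = [packet[i:i + flit_bytes]
              for i in range(0, len(packet), flit_bytes)]
    last = len(chunks) - 1
    return [Flit(chunk, i == 0, i == last) for i, chunk in enumerate(chunks)]


# A 10-byte packet carried over 32-bit (4-byte) flits needs three flits.
flits = packet_to_flits(bytes(range(10)), flit_bytes=4)
```

A packet that fits in a single flit yields one flit with both flags set.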
C. Virtual Channels
A channel corresponds to a path between two points in a network. NoCs often employ a technique called virtual channels (VCs) to provide the abstraction of multiple logical channels over a physical underlying channel. Routers implement VCs by having non-interfering flit buffers for different VCs and time-multiplexed sharing of the switches and links. Thus, the number of implemented VCs has a large impact on the buffer requirements of an NoC. Employing VCs can help in the implementation of protocols that require traffic isolation between different message classes (e.g. to prevent deadlock).
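The effect of the VC count on buffering can be made concrete with a rough estimate. The sketch below assumes one flit buffer of equal depth per input port and per VC; the port count of 5 and the depth of 8 flits are hypothetical values chosen only for illustration.

```python
def flit_buffer_bits(num_input_ports: int, num_vcs: int,
                     buffer_depth: int, flit_width_bits: int) -> int:
    """Estimate total flit buffer storage of one router, in bits.

    Assumes a separate buffer of buffer_depth flits for every
    (input port, virtual channel) pair.
    """
    return num_input_ports * num_vcs * buffer_depth * flit_width_bits


# Hypothetical 5-port router, 8-flit buffers, 32-bit flits.
two_vc = flit_buffer_bits(5, 2, 8, 32)   # 2 virtual channels
four_vc = flit_buffer_bits(5, 4, 8, 32)  # 4 virtual channels
```

Under these assumptions the storage grows linearly with the number of VCs, so doubling the VC count doubles the buffer bits.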
D. Flow Control
In lossless networks, a router can only send a flit to a downstream receiving router if it is known that the downstream router's buffer has space to receive the flit. Flow control refers to the protocol for managing and negotiating the available buffer space between routers. Due to physical separation and the speed of router operation, it is not always possible for the sending router to have immediate, up-to-date knowledge of the buffer status at the receiving router. In credit-based flow control, the sending router tracks credits from its downstream receiving routers. At any moment, the number of accumulated credits indicates the guaranteed available buffer space (equal to or less than what is actually available, due to the delay in receiving credits) at the downstream router's buffer. Flow control is typically performed on a per-VC basis.
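The credit mechanism described above can be sketched as a sender-side counter for a single VC. This is a simplified model and not CONNECT's implementation; the `CreditCounter` class and its method names are assumptions made for illustration.

```python
class CreditCounter:
    """Sender-side credit tracking for one virtual channel."""

    def __init__(self, buffer_depth: int) -> None:
        # Initially the whole downstream buffer is known to be free.
        self.max_credits = buffer_depth
        self.credits = buffer_depth

    def can_send(self) -> bool:
        # A flit may be sent only if buffer space is guaranteed.
        return self.credits > 0

    def send_flit(self) -> None:
        if not self.can_send():
            raise RuntimeError("no credits: downstream buffer may be full")
        self.credits -= 1  # one downstream slot is now reserved

    def receive_credit(self) -> None:
        # The downstream router freed a slot and returned a credit.
        if self.credits >= self.max_credits:
            raise RuntimeError("more credits returned than flits sent")
        self.credits += 1


sender = CreditCounter(buffer_depth=2)
sender.send_flit()
sender.send_flit()
blocked = not sender.can_send()   # both slots are reserved
sender.receive_credit()
unblocked = sender.can_send()     # a returned credit re-enables sending
```

Because credits arrive with some delay, the counter can understate the free space but never overstate it, which is what keeps the network lossless.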
E. Input-Output Allocation
Allocation refers to the process or algorithm of matching a router's input requests with the available router outputs. Different allocators offer different trade-offs in terms of hardware cost, speed and matching efficiency [10]. Separable allocators form a class of allocators that are popular in NoCs. They perform matching in two independent steps, which sacrifices matching efficiency for speed and low hardware cost.
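The two-step matching of a separable allocator can be illustrated with an input-first variant. The sketch assumes fixed-priority arbitration (lowest index wins) in both steps, which is a simplification; practical allocators often use round-robin arbiters, and the function name is hypothetical.

```python
from typing import Dict, List


def separable_input_first(requests: List[List[bool]]) -> Dict[int, int]:
    """Match inputs to outputs; requests[i][o] is True if input i wants output o.

    Returns a dict mapping each granted input to its output.
    """
    num_outputs = len(requests[0]) if requests else 0

    # Step 1: each input independently picks one of its requested outputs.
    chosen: Dict[int, int] = {}
    for i, row in enumerate(requests):
        for o, wanted in enumerate(row):
            if wanted:
                chosen[i] = o
                break

    # Step 2: each output independently grants one of the inputs that chose it.
    grants: Dict[int, int] = {}
    for o in range(num_outputs):
        for i in sorted(chosen):
            if chosen[i] == o:
                grants[i] = o
                break
    return grants


# Both inputs request both outputs. Each input picks output 0 in step 1,
# so only one of them is granted although output 1 stays idle.
grants = separable_input_first([[True, True], [True, True]])
```

The example shows the lost matching efficiency: a maximum matching would serve both inputs, but the two independent steps grant only one.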
CONNECT ROUTER ARCHITECTURE

Driven by the special characteristics of FPGAs, a simple router architecture serves as the basic building block for composing CONNECT networks. Our router design was implemented using Bluespec SystemVerilog (BSV). CONNECT routers are heavily configurable; among other parameters, they support a variable number of input and output ports, a variable number of virtual channels (VCs), variable flit width, variable flit buffer depth, two flow control mechanisms, flexible user-specified routing and four allocation algorithms.

FLIT DATA WIDTH ANALYSIS

A. Introduction
In this section, 4x4 Mesh, Fat Tree16, Ring16 and Double Ring networks are designed with 32-, 128- and 256-bit flit data widths using CONNECT-based NoCs. To evaluate the CONNECT NoC architecture and highlight its flexibility and extensive design space coverage, the CONNECT networks are examined in terms of FPGA resource usage (LUTs) and frequency estimates taken from the synthesis reports for different FPGA families (Virtex 4, Virtex 5 and Virtex 6) in Xilinx ISE 14.2.
B. 4x4 Mesh Network
In Figure 1, a 4x4 Mesh network is designed; it consists of 16 routers with a flit data width of 32 bits and 4 virtual channels. Flit data width determines the data transmission rate of the routers. The network is based on a partial mesh topology, in which each router is connected to only some of its neighbouring routers. Routers 5, 6, 9 and 10 are each connected to the maximum of four neighbouring routers. Routing is used to determine an available path from source to destination. The network is quite reliable, as there is often more than one path between source and destination.
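The connectivity of the 4x4 Mesh can be checked with a short sketch. It assumes the routers are numbered 0 to 15 in row-major order, which is an assumption about the figure's numbering; the function name is hypothetical.

```python
from typing import List


def mesh_neighbours(router: int, rows: int = 4, cols: int = 4) -> List[int]:
    """Return the routers directly linked to `router` in a rows x cols mesh."""
    r, c = divmod(router, cols)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [rr * cols + cc
            for rr, cc in candidates
            if 0 <= rr < rows and 0 <= cc < cols]


# Routers that have the maximum of four neighbours (the interior routers).
four_link_routers = [n for n in range(16) if len(mesh_neighbours(n)) == 4]
```

With this numbering the interior routers are 5, 6, 9 and 10, while corner routers have two neighbours and the remaining edge routers have three.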
C. Fat Tree16 Network
In Figure 3, a Fat Tree16 network is designed; it consists of 20 routers with 2 virtual channels and flit data widths of 32, 64 and 128 bits, which determine the data transmission rate of the routers. It has 16 nodes (N0 to N15), which are the resources. The main characteristic of a Fat Tree is that the links connecting nodes at different levels may have different bandwidths depending on their utilization. The complexity of the nodes grows as they get closer to the root. It is a recursively scalable and easily partitionable network.
In comparison Table 2, synthesis results of the 4x4 Mesh network are compared for 32-, 64- and 128-bit flit data widths on the Virtex 6 FPGA family. LUT utilization increases by 1 percent and frequency increases when the network is synthesized with wider flits such as 64 and 128 bits. When each flit width is compared across the different FPGA families, the Virtex 6 family is the most efficient in LUT usage.
D. Ring16 Network
In Figure 4, a Ring16 network is designed; it consists of 16 routers with 4 virtual channels and flit data widths of 32, 64 and 128 bits. Flit data width determines the data transmission rate of the routers. Routers R0 to R15 are connected in a circle, and the network has 16 nodes (N0 to N15). A Ring network is a standard circular topology in which each router is connected directly to exactly two other routers, forming a circular pathway that provides only one path between any two routers.
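Because the ring offers a single path between any two routers, the hop count follows directly from the router indices. The sketch below assumes traffic travels in one direction only, from router i to router i+1 (wrapping from R15 to R0); this direction is an assumption, and the function name is hypothetical.

```python
def ring_hops(src: int, dst: int, num_routers: int = 16) -> int:
    """Hops from src to dst on a ring where flits move from i to i+1."""
    return (dst - src) % num_routers


forward = ring_hops(0, 15)   # R0 to R15 goes almost all the way round
wrapped = ring_hops(15, 0)   # R15 to R0 is a single hop
```

Under this assumption the worst case is 15 hops, which illustrates why a single ring has higher latency than richer topologies of the same size.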
In the synthesis results of Ring16 with 32-, 64- and 128-bit flit data widths, the Virtex 6 FPGA family is the most efficient. Comparison Table 3 shows that the frequency varies widely, whereas LUT usage increases only slightly, when the network is designed with wider flits.
E. Double Ring Network
In Figure 5, a Double Ring network is designed; it consists of 16 routers with 4 virtual channels and flit data widths of 32, 64 and 128 bits. Flit data width determines the data transmission rate of the routers. The network consists of two concentric rings that connect each node, instead of the single ring used in a Ring topology. The secondary ring in a dual-ring topology is redundant: it is used as a backup in case the primary ring fails. In this configuration, data moves in opposite directions around the two rings. Each ring is independent of the other until the primary ring fails, at which point the two rings are connected to continue the flow of data traffic.
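Since data moves in opposite directions around the two rings, each source and destination pair has one path per ring. The sketch below computes both hop counts, assuming the primary ring runs from router i to i+1 and the secondary ring from i to i-1; the directions and the function name are assumptions, and the failover wiring between the rings is not modelled.

```python
from typing import Tuple


def double_ring_hops(src: int, dst: int,
                     num_routers: int = 16) -> Tuple[int, int]:
    """Return (primary_hops, secondary_hops) between two routers.

    The primary ring is assumed to run i -> i+1 and the secondary
    ring i -> i-1, so the two counts cover opposite directions.
    """
    primary = (dst - src) % num_routers
    secondary = (src - dst) % num_routers
    return primary, secondary


primary_hops, secondary_hops = double_ring_hops(0, 15)
```

For distinct routers the two counts always add up to the number of routers, so a destination that is far away on one ring is close on the other.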
In comparison Table 4, synthesis results of the Double Ring network are compared for 32-, 64- and 128-bit flit data widths on the Virtex 6 FPGA family. LUT utilization increases by 1 percent and frequency decreases when the network is synthesized with wider flits such as 64 and 128 bits. When each flit width is compared across the different FPGA families, the Virtex 6 family is the most efficient in LUT usage.

CONCLUSION

Network topologies such as 4x4 Mesh, Fat Tree16, Ring16 and Double Ring are designed using CONNECT-based NoCs. These networks have been designed with flit data widths of 32, 64, 128 and 256 bits. Each network is synthesized with Xilinx ISE 14.2. The synthesis results have been compared across the Virtex 4, Virtex 5 and Virtex 6 FPGA families. In the synthesis reports of these networks, for all flit widths, LUT usage is reduced on the Virtex 6 family. The synthesis reports show that the Virtex 6 family is very efficient in reducing area and FPGA cost compared with the other families. Future work includes designing different networks, performing simulation and analyzing latency and network performance.

References