Keywords
             | 
        
        
            | system-on-chip; low power; energy efficiency; high-performance; full duplex bus. | 
        
        
            
            INTRODUCTION
             | 
        
        
            | There is a growing need for on-chip buses with reduced interface complexity, low cost and low energy consumption, driven by the rapid growth of the Internet of Things (IoT) market. Traditional bus protocols such as the Advanced Microcontroller Bus Architecture (AMBA) Advanced High-performance Bus (AHB) [9] and Advanced eXtensible Interface (AXI) [3] from ARM Holdings, Wishbone from Silicore Corporation [4], Open Core Protocol (OCP) from OCP International Partnership [5] and CoreConnect from IBM [10] are commonly used in industry. The main characteristic of all these buses is that they transfer data linearly. However, in some applications, such as image processing, computer vision and wireless communication, data processing is usually based on relationships among neighbouring data (adjacency, connectivity, regions and boundaries) [6],[8] and on block data load and store [7]. In these cases, transferring data by block or matrix is preferable to transferring it by linear burst. In addition, in some advanced bus structures such as multi-bus and multilayer architectures, bandwidth improves when as many transactions as possible occur within the same bus level or the same bus layer. However, these buses use a large number of wires for several sets of bus signals and define complex hardware structures that are costly in terms of silicon area and energy consumption. These traditional bus-based architectures are therefore poorly suited to battery-driven portable devices, because they are designed for high bandwidth and consume a large amount of energy. | 
        
        
            | To overcome these limitations of traditional buses, a low-cost, low-energy bus termed the master-slave bus (MSBUS) is introduced in brief. It is a dual-bus structure made up of a control bus (MBUS) and a data bus (SBUS). The protocol balances performance against cost and offers features such as low power and high throughput. | 
        
        
            | The rest of the paper is organized as follows. Section II gives a brief survey of system-on-chip bus protocols. Section III describes them in detail with the necessary diagrams. Section IV compares the different on-chip bus protocols. Section V concludes the paper. | 
        
        
            | ARM has developed the AMBA [9] and AMBA AXI [3] protocols, which are widely used in industry. Other organizations have introduced their own protocols, such as the Wishbone bus [4] from Silicore Corporation and Opencores, the Open Core Protocol [5] from OCP International Partnership and CoreConnect [10] from IBM. Each protocol has its own strengths and is applied according to the needs of the system application. Many researchers are working to optimize these protocols in terms of power consumption and hardware connections. | 
        
        
            
            DIFFERENT SYSTEM-ON-CHIP BUS PROTOCOLS
             | 
        
        
            | The following bus protocols are commonly used in industry. They are listed below and then explained briefly in turn. | 
        
        
            | 1. AMBA | 
        
        
            | 2. AMBA AXI | 
        
        
            | 3. WISHBONE BUS | 
        
        
            | 4. OPEN CORE PROTOCOL | 
        
        
            | 5. CORECONNECT BUS | 
        
        
            | 6. MSBUS PROTOCOL | 
        
        
            
            1) AMBA
             | 
        
        
            | The Advanced Microcontroller Bus Architecture (AMBA) specification by ARM defines an on-chip communications standard for designing high-performance embedded microcontrollers. AMBA comprises three buses: the Advanced High-performance Bus (AHB), the Advanced System Bus (ASB) and the Advanced Peripheral Bus (APB). Fig.1 shows a typical AMBA-based architecture. It consists of a backbone bus, able to sustain the external memory bandwidth, on which the CPU, on-chip memory and other Direct Memory Access (DMA) devices reside. | 
        
        
            | The AHB is mainly used for high-performance, high clock frequency system modules. It acts as the high-performance       system backbone bus. It supports multiple bus masters, provides high-bandwidth operation and supports efficient       connection of processors, off-chip external memory interfaces and on-chip memories with low-power peripheral       macrocell functions. It implements features such as burst transfers, split transactions, single-cycle bus master handover,       single-clock edge operation, non-tristate implementation and wider data bus configurations (64/128 bits). AHB Master,       AHB Slave, AHB Arbiter, and AHB Decoder are the main components of the AHB. | 
        
        
            | The AMBA ASB is an alternative system bus for high-performance system modules that can be used whenever the high-performance features of AHB are not required. ASB implements the features needed for high-performance systems, including burst transfers, pipelined transfer operation and support for multiple bus masters. An ASB system design is composed of an ASB Master, ASB Slave, ASB Arbiter and ASB Decoder. | 
        
        
            | The AMBA APB acts as a local secondary bus for low-power peripherals and appears to the system bus as a single AHB or ASB slave device. It is optimized for minimum power consumption and reduced interface complexity in order to support peripheral functions. The APB can be used to interface to any peripheral that has low bandwidth and does not require the high performance of a pipelined bus interface. In the advanced version of APB, all signal transitions are related only to the rising edge of the clock. This allows APB peripherals to be integrated easily into any design flow, with advantages such as high-frequency operation and simpler static timing analysis through the use of a single clock edge. An AMBA APB contains a single APB bridge, which converts ASB or AHB transfers into a suitable format for the slave devices on the APB and is responsible for bus handshaking and control signal retiming on behalf of the local peripheral bus. The bridge latches all address, data and control signals and provides a second level of decoding to generate slave select signals for the APB peripherals. All other modules on the APB are APB slaves. [9] | 
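The second-level decoding performed by the APB bridge can be illustrated with a minimal Python sketch. The address map, the peripheral names and the function names `decode_psel` and `selected` are hypothetical, chosen only for illustration; they are not taken from the AMBA specification, and a real bridge implements this decode in hardware.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical APB address map: (base address, size in bytes, peripheral name).
# These values are illustrative assumptions, not part of the AMBA specification.
APB_MAP: List[Tuple[int, int, str]] = [
    (0x0000, 0x100, "uart"),
    (0x0100, 0x100, "timer"),
    (0x0200, 0x100, "gpio"),
]


def decode_psel(paddr: int) -> Dict[str, int]:
    """Return one select bit per peripheral for the latched address.

    Because the regions in APB_MAP do not overlap, at most one bit is set.
    """
    return {name: int(base <= paddr < base + size) for base, size, name in APB_MAP}


def selected(paddr: int) -> Optional[str]:
    """Return the name of the selected peripheral, or None if the address is unmapped."""
    hits = [name for name, bit in decode_psel(paddr).items() if bit]
    return hits[0] if hits else None


if __name__ == "__main__":
    print(decode_psel(0x0104))
    print(selected(0x0300))
```

Keeping the decode in the bridge means each peripheral only needs to watch its own select line, which is one reason the slave interface can stay small.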
        
        
            
            2) AMBA AXI
             | 
        
        
            | The AMBA AXI protocol from ARM is targeted at high-performance, high-frequency system designs and includes a number of features that make it suitable for high-speed submicron interconnect. Its key features are separate address/control and data phases; support for unaligned data transfers using byte strobes; burst-based transactions with only the start address issued; separate read and write data channels to enable low-cost Direct Memory Access (DMA); the ability to issue multiple outstanding addresses; out-of-order transaction completion; and easy addition of register stages to provide timing closure. | 
        
        
            | The generic architecture of the AXI protocol is shown in Fig.2. It consists of an AXI Master, a Slave and an Interconnect/Arbiter. The AXI master is responsible for initiating transfers on the bus, whereas the AXI slave responds to the requests initiated by the master. The AXI arbiter can handle multiple masters at a time and arbitrates for the bus using a configurable priority scheme. As Fig.2 shows, the protocol provides a single interface definition for describing interfaces between a master and the interconnect, between a slave and the interconnect, and between a master and a slave. | 
        
        
            | The AXI protocol is burst-based. In this protocol, every transaction has address and control information on the address       channel that describes the type of the data to be transferred. The data is transferred between master and slave using a       write data channel to the slave or a read data channel to the master. In write transactions, in which all the data flows       from the master to the slave, the AXI protocol has an additional write response channel to allow the slave to signal to       the master the completion of the write transaction. Fig.3 describes how a read transaction uses the read address and       read data channels. | 
        
        
            | Fig.4 describes how a write transaction uses the write address, write data, and write response channels. The AXI protocol supports variable-length bursts of 1 to 16 data transfers per burst; bursts with a transfer size of 8 to 1024 bits; wrapping, incrementing and fixed bursts; system-level caching and buffering control; secure and privileged access; and atomic operations using exclusive or locked accesses. [3] | 
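Since only the start address of a burst is issued, the slave has to derive the address of each subsequent transfer itself. The Python sketch below shows one way to compute those addresses for the three burst types. The function name `axi_burst_addresses` and its parameters are hypothetical, and the alignment and wrap-boundary rules are written as they are commonly summarized from the AXI specification [3]; the specification remains the authority and should be consulted before relying on this.

```python
from typing import List


def axi_burst_addresses(start: int, size_bytes: int, length: int, burst: str) -> List[int]:
    """Return the address of every transfer in a burst (illustrative sketch).

    start      -- start address issued on the address channel
    size_bytes -- bytes per transfer, a power of two from 1 to 128 (8 to 1024 bits)
    length     -- number of transfers in the burst, 1 to 16
    burst      -- "FIXED", "INCR" or "WRAP"
    """
    if not 1 <= length <= 16:
        raise ValueError("burst length must be between 1 and 16")
    if size_bytes not in (1, 2, 4, 8, 16, 32, 64, 128):
        raise ValueError("transfer size must be a power of two from 1 to 128 bytes")

    if burst == "FIXED":
        # Every transfer targets the same address, e.g. a FIFO port.
        return [start] * length

    aligned = (start // size_bytes) * size_bytes
    if burst == "INCR":
        # The first transfer may be unaligned; later transfers are size-aligned.
        return [start] + [aligned + n * size_bytes for n in range(1, length)]

    if burst == "WRAP":
        if start != aligned:
            raise ValueError("a wrapping burst needs an aligned start address")
        if length not in (2, 4, 8, 16):
            raise ValueError("a wrapping burst needs a length of 2, 4, 8 or 16")
        total = size_bytes * length            # bytes covered by the whole burst
        lower = (start // total) * total       # lower wrap boundary
        return [lower + (start - lower + n * size_bytes) % total for n in range(length)]

    raise ValueError("unknown burst type: " + burst)


if __name__ == "__main__":
    print([hex(a) for a in axi_burst_addresses(0x38, 4, 4, "WRAP")])
```

A wrapping burst of four 4-byte transfers starting at 0x38 visits 0x38, 0x3C, 0x30 and 0x34, which is the access pattern a cache line fill typically needs.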
        
        
            
            3) WISHBONE BUS
             | 
        
        
            | The WISHBONE System-on-Chip (SoC) Interconnection Architecture for Portable IP Cores, from Opencores and Silicore Corporation, is a flexible design methodology for use with semiconductor IP cores. Its main purpose is to foster design reuse by overcoming System-on-Chip integration problems. This is achieved by creating a common interface between IP cores, which improves the reliability and portability of the system and results in faster time-to-market for the end user. Initially, IP cores used non-standard interconnection schemes that were difficult to integrate. With a standard interconnection scheme, the end user can integrate cores quickly and easily. The WISHBONE interconnect is intended as a general-purpose interface that defines a standard data exchange protocol between IP core modules. | 
        
        
            | Wishbone defines two interfaces, termed master and slave. Master interfaces are IPs that are capable of initiating bus cycles, while slave interfaces are capable of accepting bus cycles. Hardware implementations are compatible with various interconnection topologies, such as crossbar switch interconnection (Fig.5), data flow interconnection (Fig.6), point-to-point interconnection (Fig.7) and shared bus interconnection (Fig.8). | 
        
        
            | The crossbar switch interconnection allows modules to connect and communicate. Each connection channel can operate in parallel with the other connection channels, which makes crossbar switches inherently faster than traditional bus schemes. The data flow interconnection is used when data is processed sequentially. Here, each IP core has both a MASTER and a SLAVE interface, and data flows from one core to the next, which is sometimes referred to as pipelining. This arrangement is used in linear systolic array architectures for implementing DSP algorithms. The point-to-point interconnection connects a single Wishbone master to a single slave. The shared bus interconnection connects two or more MASTERs with one or more SLAVEs. In this case a MASTER initiates a bus cycle to a target SLAVE. The target SLAVE then participates in one or more bus cycles with the MASTER. | 
        
        
            | WISHBONE includes features such as simple, compact, logical IP core hardware interfaces that require very few logic gates; a full set of popular data transfer bus protocols, including READ/WRITE cycles, BLOCK transfer cycles and RMW (read-modify-write) cycles; modular data bus widths and operand sizes; support for both LITTLE ENDIAN and BIG ENDIAN data ordering; support for single-clock data transfers; support for normal cycle termination, retry termination and termination due to error; user-defined tags, which are useful for attaching information to an address bus, a data bus or a bus cycle; a master/slave architecture for very flexible system designs; multiprocessing capabilities that allow a wide variety of System-on-Chip configurations; and an arbitration methodology (priority arbiter, round-robin arbiter, etc.) that is defined by the end user. [4] | 
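Because WISHBONE leaves the arbitration methodology to the end user, a shared bus interconnection needs a user-supplied arbiter. The following Python sketch models the behaviour of a simple round-robin arbiter. The class name `RoundRobinArbiter` and its interface are hypothetical and are not part of the WISHBONE specification; an actual arbiter would be written in an HDL.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Behavioural model of a round-robin arbiter (illustrative, not from the spec)."""

    def __init__(self, n_masters: int) -> None:
        self.n = n_masters
        # Start as if the last master was just served, so master 0 is checked first.
        self.last = n_masters - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the master granted the bus, or None if nobody requests.

        The search starts one position after the most recently granted master,
        so every requesting master is served within n arbitration rounds.
        """
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None


if __name__ == "__main__":
    arb = RoundRobinArbiter(3)
    for _ in range(3):
        print(arb.grant([True, True, False]))
```

A fixed-priority arbiter would always scan from index 0 instead. The rotating start point is what prevents a busy low-index master from starving the others.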
        
        
            
            4) OPEN CORE PROTOCOL
             | 
        
        
            | The Open Core Protocol (OCP), from the OCP International Partnership, defines a high-performance, bus-independent interface between IP cores that reduces design risk, design time and manufacturing costs for SoC designs. An IP core can be a simple peripheral core, a high-performance microprocessor, or an on-chip communication subsystem such as a wrapped on-chip bus. OCP promotes IP design reuse by transforming IP cores so that they are independent of the architecture and design of the systems in which they are used. It optimizes die area by configuring into the OCP interfaces only those features needed by the communicating cores, and it simplifies system verification and testing by providing a firm boundary around each IP core that can be observed, controlled and validated. The OCP is similar to the Virtual Socket Interface Alliance's (VSIA) Virtual Component Interface (VCI). The OCP interface thus addresses communication between the functional units (or IP cores) that make up a system-on-chip. Figure 9 describes an SoC design based on OCP, in which a simple system contains a wrapped bus and three IP core entities: one that is a system target, one that is a system initiator, and one that is both. | 
        
        
            | The Open Core Protocol provides features such as a point-to-point synchronous interface; bus independence; read and write commands with five command extensions; an address space organized in 8-bit bytes (octets); a configurable data width that allows multiple bytes to be transferred simultaneously; pipelining of transfers; burst transfers; tags in the OCP interface that control the ordering of responses; support for multiple threads; and sideband (or out-of-band) signalling. [5] | 
        
        
            
            5) CORECONNECT BUS
             | 
        
        
            | The CoreConnect bus architecture, developed by IBM, simplifies the integration and reuse of processor, system and peripheral cores within standard product and VLSI designs. Fig.10 shows the CoreConnect bus architecture. | 
        
        
            | This architecture permits custom SoC designs to be integrated from cores designed according to the given specifications, and it is the foundation for IBM Blue Logic Core Library devices or other non-IBM devices. The main elements of this architecture are the processor local bus (PLB), which is multi-master and synchronous and provides high bandwidth; the on-chip peripheral bus (OPB), which offers fully synchronous operation, dynamic bus sizing, separate address and data buses and support for multiple OPB bus masters; a bus bridge; and a device control register (DCR) bus, which provides fully synchronous movement of GPR data between the CPU and slave logic. Overall system performance is increased by connecting high-performance peripherals to the high-bandwidth, low-latency PLB and slower peripheral cores to the OPB, thereby reducing traffic on the PLB. Advanced features of the CoreConnect bus architecture include an open architecture bus standard; 32-, 64- and 128-bit versions to support a wide variety of applications; IP reuse across multiple designs; and no-fee, no-royalty licensing. [10] | 
        
        
            
            6) MSBUS PROTOCOL
             | 
        
        
            | MSBUS comprises a master bus (MBUS) with a single master, which can be a microprocessor, and a slave bus (SBUS) with a single slave, which can be a memory controller. The MBUS is a control bus that defines a single transfer mode with at least one command cycle and one data cycle. As MBUS is defined with a single master, it does not require any arbitration. It is responsible for low-speed, low-bandwidth functional register operations. It reduces interface complexity and power consumption because the number of wires for a basic word-size data bus is minimal compared with other on-chip protocols. The timing diagram of the MBUS protocol is shown in Fig. 11. A signal m_addr_wdata is defined in MBUS as a shared bus carrying read address, write address and write data information. A valid signal, s_vld, acknowledges each request; this acknowledgement is needed to synchronize signals crossing between the master and slave clock domains and avoids command FIFO overflows. A response delay timer is defined in the MBUS protocol in order to detect command errors. If the response times out, the command is flagged as an "error" and is either retried or discarded by the master. | 
        
        
            | The SBUS is a data bus that is responsible for high-speed, high-bandwidth data transfers. Two handshake signals, m_req and s_gnt, are defined in SBUS to ensure that only one master occupies the read or write channel at a time. SBUS signals are classified into five packets: the command packet, which consists of transfer direction, size and initial address information; the write data packet; the write data mask packet, which indicates the valid bytes of the current word of write data; the read data packet; and the response packet, which indicates that the current write data is ready or that the read data is valid. The timing diagram of the SBUS protocol is shown in Fig.12. | 
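The effect of a write data mask can be illustrated with a short Python sketch that merges a word of write data into the word already held in memory. The function name `apply_write_mask`, the 4-byte default word width, the bit ordering (mask bit i covers byte i, counted from the least significant byte) and the polarity (1 means the byte is valid) are all assumptions made for illustration; the text above states only that the mask indicates the valid bytes.

```python
def apply_write_mask(old_word: int, new_word: int, mask: int, width_bytes: int = 4) -> int:
    """Merge new_word into old_word, byte by byte, under control of mask.

    Assumption: mask bit i == 1 means byte i of new_word is valid and is written;
    mask bit i == 0 means byte i of old_word is kept unchanged.
    """
    result = 0
    for byte in range(width_bytes):
        byte_lane = 0xFF << (8 * byte)
        source = new_word if (mask >> byte) & 1 else old_word
        result |= source & byte_lane
    return result


if __name__ == "__main__":
    # Bytes 0 and 2 come from the new data, bytes 1 and 3 keep their old value.
    print(hex(apply_write_mask(0x11223344, 0xAABBCCDD, 0b0101)))
```

A per-byte mask lets a master update part of a word without first reading it back, which avoids a read-modify-write sequence on the bus.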
        
        
            | The main feature of SBUS is that it supports block data transfer, which is indicated by the most significant bit of the data size signal, len[10:0] for instance. If len[10] is logic 1, the current transfer is a block transfer, with len[9:6] denoting the width of the block and len[5:0] denoting its height. Otherwise, the transfer is linear and the remaining 10 bits, len[9:0], denote the total transfer size. The block transfer mode makes every command that crosses a memory boundary predictable and computable by hardware. Hence, the time spent on software configuration and bus commands is reduced. | 
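The field layout of len[10:0] described above can be decoded as in the following Python sketch. The names `Transfer` and `decode_len` are hypothetical. The sketch returns the raw field values only, since the text does not state the units of width, height and size, or whether the fields are stored with an offset.

```python
from typing import NamedTuple, Optional


class Transfer(NamedTuple):
    is_block: bool
    width: Optional[int]    # raw len[9:6], block transfers only
    height: Optional[int]   # raw len[5:0], block transfers only
    size: Optional[int]     # raw len[9:0], linear transfers only


def decode_len(length: int) -> Transfer:
    """Split the 11-bit data size signal len[10:0] into its fields."""
    if not 0 <= length < (1 << 11):
        raise ValueError("len must fit in 11 bits")
    if (length >> 10) & 1:
        # len[10] == 1: block transfer, width in len[9:6], height in len[5:0].
        return Transfer(True, (length >> 6) & 0xF, length & 0x3F, None)
    # len[10] == 0: linear transfer, total size in len[9:0].
    return Transfer(False, None, None, length & 0x3FF)


if __name__ == "__main__":
    print(decode_len(0b1_0100_001000))   # block transfer: width field 4, height field 8
    print(decode_len(100))               # linear transfer: size field 100
```

With this layout a single command describes a whole two-dimensional region, so the hardware can work out each row's address without a separate software-issued command per row.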
        
        
            | Fig.13 shows a typical MSBUS SoC architecture. MSBUS is a dual-bus structure. The control bus is named Master-Bus because only one master, the microprocessor, is located on MBUS; likewise, the data bus is named Slave-Bus because only one slave is located on SBUS, via the DMA controller. As shown in Fig. 13, all the peripherals, such as UART, Timer, Serial Flash, GPIO and I2C devices, are MBUS slaves. They are configured by software directly through MBUS. On the other side, devices such as USB, Bluetooth and Wi-Fi are the masters of SBUS and access the only slave, the memory, through SBUS. | 
        
        
            | The following table compares the SoC buses with respect to features such as topology, arbitration and bus width. | 
        
        
              | 
        
        
            
            CONCLUSION
             | 
        
        
            | In this brief, we have surveyed different on-chip bus protocols along with their features and architectures, and presented a descriptive comparison between them. MSBUS is an efficient protocol because it transfers blocks of data efficiently, thereby reducing hardware resources and power consumption. This can be verified by implementing an MSBUS-based DMA at RTL in an HDL and comparing it with other protocols on parameters such as transfer time consumption, wire efficiency, valid data bandwidth, dynamic energy efficiency and power consumption. Such verification can be carried out by setting up a Universal Verification Methodology (UVM) environment for the MSBUS SoC architecture, integrating ready-to-use configurable agents such as I2C, GPIO, USB and Bluetooth controllers into the test bench and applying various test cases. | 
        
        
            
            Figures at a glance
             | 
        
        
             
             
             
             
            
                
                    
                        | Figure 1 | Figure 2 | Figure 3 | Figure 4 | Figure 5 |
                        | Figure 6 | Figure 7 | Figure 8 | Figure 9 | Figure 10 |
                        | Figure 11 | Figure 12 | Figure 13 |
                     
                
             
             | 
        
        
            |   | 
        
        
            
            References
             | 
        
        
            
            
                [1] Xiaokun Yang and Jean H. Andrian, “A High-Performance On-Chip Bus (MSBUS) Design and Verification,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. PP, Issue 99, July 2014.
                [2] Xiaokun Yang and Jean H. Andrian, “A Low-Cost and High-Performance Embedded System Architecture and an Evaluation Methodology,” IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2014.
                [3] AMBA AXI Protocol v1.0 Specification, ARM, 2003.
                [4] Wishbone SOC Architecture Specification, Revision B.3, Silicore Corp., USA, 2002.
                [5] Open Core Protocol Specification, OCP Int. Partnership, Beaverton, OR, USA, 2001.
                [6] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed. Englewood Cliffs, NJ, USA: Prentice-Hall, Jun. 2012, pp. 68–99.
                [7] MPEG-2 Standards Part 1 Systems, ISO/IEC, GE, Switzerland, Jun. 2010.
                [8] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specification, IEEE Standard 802.11-1999, 1999.
                [9] AMBA Specification, Rev 2.0, ARM, 1999.
                [10] CoreConnect Bus Architecture, IBM, Yorktown Heights, New York, NY, USA, 1999.
 
                 
             
             |