

# Generation of RTL-code and test facilities for on-chip TDM-network

CNoCCing on Heaven's Door?

Geir Åge Noven

ESTEC Round-Table on SoC, September 2009

# Background



- Master's thesis conducted by Steffen Forså, Oslo, spring 2006
- Moore's law still applies
- Transistor size is shrinking faster than interconnect area
  - Routing consumes relatively more area than gates
  - Can we avoid the «skyscraper limitation»?
  - Can we remove a relatively large number of wires at the cost of only a few gates?



#### www.kongsberg.com

# Routing congestion, signal integrity

- Studies show that «long wires» on chips toggle at 10% of the chip's clock rate, on average
  - Can we, by combining some of the «long wires», reduce the effort needed by the routing algorithms?
- Long, adjacent wires create crosstalk problems and suffer from capacitive delays
  - This problem is moved into a «communication module» where, thus isolated, it is better solved (divide and conquer)
- Routing congestion is a major concern in the design of large ASICs
  - A separate communication network will make overall planning of the layout easier





#### Examples of NoC

- Communication as a separate module on large chips
- DAC 2001: Communication is the next «level of abstraction»
- Several architectures and protocols have been suggested
  - Packet-/line-switched, regular and irregular
- Network on Chip is a well-established area of research in the academic community
- Commercial solutions available since 2000
- This thesis describes a TDM-switched network



Route packets

Make a ring (Sonics Inc.)

| Level of abstraction |
|----------------------|
| Communication        |
| Behavioural          |
| RTL                  |
| Gate                 |
| Transistor           |

"Most economical" network



# What is a NoC?

- An on-chip network ports the LAN concept to the chip level (CAN?)
- The modules communicate via
  - Protocol
  - HW infrastructure, topology
- The general NoC handles
  - Complexity
  - Resources



These have to be designed under the new constraints that a chip imposes compared to a traditional LAN

*Design speed, re-usability, testability, routability*

- Advantages of TDM:
  - Predictable delays
  - Simple hardware
- Disadvantages of TDM:
  - Static connections (predetermined)
  - Not optimal for connections whose bandwidth varies over time

(But: A TDM network can solve some physical problems and function as a customizable **Network layer** for packet switching.)
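The trade-off listed above can be illustrated with a minimal Python sketch of a single statically scheduled TDM switch. The `TdmSwitch` class, the port names and the slot table are hypothetical and not taken from the thesis; the sketch only shows that routing depends on the cycle count alone, which is why delays are predictable and connections are fixed in advance.

```python
from typing import Dict, List, Optional


class TdmSwitch:
    """Toy model of a statically scheduled TDM switch (illustrative only)."""

    def __init__(self, slot_table: List[Dict[str, str]]) -> None:
        # slot_table[t] maps an output port to the input port it listens to
        # during time slot t. The table is fixed at design time.
        if not slot_table:
            raise ValueError("slot table must contain at least one slot")
        self.slot_table = slot_table
        self.period = len(slot_table)

    def step(self, cycle: int, inputs: Dict[str, int]) -> Dict[str, Optional[int]]:
        """Return the value driven on each scheduled output in this cycle."""
        routing = self.slot_table[cycle % self.period]
        return {out: inputs.get(src) for out, src in routing.items()}


# Hypothetical schedule with a period of N = 3 clock cycles.
schedule = [
    {"east": "a"},
    {"east": "b"},
    {"east": "a", "south": "b"},
]
switch = TdmSwitch(schedule)
trace = [switch.step(cycle, {"a": 1, "b": 0}) for cycle in range(6)]
```

Because the schedule repeats every `period` cycles, `trace[3]` equals `trace[0]`: a connection that needs more bandwidth for a while cannot borrow slots, which is the static-connection drawback named above.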



#### **Custom Network on Chip Creator**



# Behavioural synthesis

#### CNoCC in the tool flow





# CTG Communication Trace Graph

- **CTG** is the input to CNoCC
  - List of **Modules**
  - List of communication channels with **Bandwidth** and **Latency** requirements
- Some kind of Graph Partitioning Algorithm (GPA) must be adapted and run (not within the scope of the thesis)
- This type of algorithm is also used by «Floorplanning» tools
- The GPA is also required to suggest positions of **TDM switching** nodes
- If the TDM network is viewed as a Network Layer, the packet switches are to be considered as Modules at this stage, belonging to the Transport Layer
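As a rough illustration of the CTG described above, the following Python sketch models it as a list of modules and a list of channels carrying bandwidth and latency requirements. The class names, field names, units and example values are assumptions made for this sketch; the actual input format of CNoCC is not given here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Channel:
    """One directed communication channel in the CTG (hypothetical fields)."""
    source: str
    sink: str
    bandwidth_bps: int        # required bandwidth in bits per second
    max_latency_cycles: int   # latency bound in network clock cycles


@dataclass
class Ctg:
    """Communication Trace Graph: modules plus channel requirements."""
    modules: List[str] = field(default_factory=list)
    channels: List[Channel] = field(default_factory=list)

    def validate(self) -> None:
        """Reject channels that reference unknown modules or bad numbers."""
        known = set(self.modules)
        for ch in self.channels:
            if ch.source not in known or ch.sink not in known:
                raise ValueError(f"unknown module in channel {ch}")
            if ch.bandwidth_bps <= 0 or ch.max_latency_cycles < 0:
                raise ValueError(f"invalid requirement in channel {ch}")

    def total_bandwidth(self, module: str) -> int:
        """Sum of the bandwidth of all channels touching the module."""
        return sum(ch.bandwidth_bps for ch in self.channels
                   if module in (ch.source, ch.sink))


# Made-up example values, for illustration only.
ctg = Ctg(
    modules=["risc", "mem", "dsp"],
    channels=[
        Channel("risc", "mem", 200_000_000, 8),
        Channel("dsp", "mem", 50_000_000, 16),
    ],
)
ctg.validate()
```

A per-module figure such as `ctg.total_bandwidth("mem")` is the kind of quantity a partitioning step could use when suggesting switch positions.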





Figure: example CTG with module nodes such as risc, mem, upsp, dct, rast, dsp, cpu and vu mem.








# **Operation of CNoCC 1**

Figure: modules A, B and C connected through the NoC; generator GenF2.



- Assume that the modules have an adapted interface towards the network
- Assume that the modules produce/consume a constant-bitrate stream of data, where the frequency is derived from the clock frequency of the network (fs)
  - Choose a number N as the repetition rate of the sequencing (N clock cycles of fs)

This gives (rounded to integers) the bus width and packing scheme produced/consumed by the modules

Figure: generator GenF1, receivers RecF1 and RecF2, labels s0 to s3.
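The step from bitrate, fs and N to a bus width can be sketched as follows. This is one plausible reading, not the thesis's algorithm: it assumes that the bits needed per frame are rounded up, that the bus is made just wide enough to carry them within N cycles, and that the remaining cycles stay idle. The function and field names are hypothetical.

```python
import math
from typing import NamedTuple


class Packing(NamedTuple):
    bits_per_frame: int   # bits that must be moved every N cycles
    bus_width: int        # parallel wires at the module interface
    cycles_used: int      # cycles of the frame that actually carry data


def plan_packing(bitrate_bps: int, fs_hz: int, n: int) -> Packing:
    """Derive an integer bus width and packing for one constant-bitrate stream.

    Assumption: round up so the stream is never under-provisioned.
    """
    if bitrate_bps <= 0 or fs_hz <= 0 or n <= 0:
        raise ValueError("bitrate, fs and N must all be positive")
    bits = math.ceil(bitrate_bps * n / fs_hz)   # bits per frame of N cycles
    width = math.ceil(bits / n)                 # wires needed to fit in N cycles
    cycles = math.ceil(bits / width)            # cycles occupied at that width
    return Packing(bits, width, cycles)


# Made-up numbers: fs = 100 MHz, N = 16.
fast = plan_packing(250_000_000, 100_000_000, 16)
slow = plan_packing(10_000_000, 100_000_000, 16)
```

With these inputs the fast stream needs 40 bits per frame, which fits on 3 wires in 14 of the 16 cycles, while the slow stream needs 2 bits per frame on a single wire.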



# **Operation of CNoCC 2**



1) Build shapes between switches
2) Sequence the bits within the shape
3) Map the input shape to the output shape with the least possible delay







|    | 11 | 1 | 4 | 7 |
|----|----|---|---|---|
| 9  | 12 | 2 | 5 | 8 |
| 10 | 0  | 3 | 6 |   |
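The numbering in the grid above can be reproduced with a short Python sketch. The interpretation is an assumption: rows are taken to be wires, columns clock cycles, the shape a run of 13 consecutive slots counted column by column, and the bit sequence a rotation inside that shape. The function name and parameters are hypothetical.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]


def sequence_shape(wires: int, cycles: int, first_slot: int,
                   n_bits: int, rotation: int) -> Grid:
    """Return grid[wire][cycle] with the bit number in each slot, or None.

    Slots are numbered column by column (all wires of cycle 0, then cycle 1,
    and so on). The shape covers n_bits consecutive slots from first_slot.
    """
    if first_slot < 0 or first_slot + n_bits > wires * cycles:
        raise ValueError("shape does not fit in the frame")
    grid: Grid = [[None] * cycles for _ in range(wires)]
    for bit in range(n_bits):
        pos = (rotation + bit) % n_bits   # position of this bit inside the shape
        slot = first_slot + pos           # slot index inside the whole frame
        grid[slot % wires][slot // wires] = bit
    return grid


# 3 wires, 5 cycles, 13 bits; the first and last slot of the frame stay empty.
grid = sequence_shape(wires=3, cycles=5, first_slot=1, n_bits=13, rotation=4)
```

Under these assumptions `grid` matches the table row for row, with `None` in the two empty corners.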

#### Planning of the network









#### Test generators/receivers





- Using a Perl script, fairly large test networks were built (thousands of generators/consumers with different random bandwidths)
- They were automatically tested by pseudorandom generators/analysers in a VHDL testbench (also automatically generated)
- The networks were synthesized to obtain the component resource usage
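The generator/analyser idea can be sketched in Python as a pseudorandom bit source and a matching checker. The thesis's testbench is written in VHDL and its polynomial is not stated here, so the PRBS7 polynomial (x^7 + x^6 + 1), the seed and the function names below are assumptions chosen for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator


def prbs7(seed: int = 0x7F) -> Iterator[int]:
    """Yield bits from a 7-bit LFSR (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def count_errors(received: Iterable[int], seed: int = 0x7F) -> int:
    """Compare a received bit stream with a locally regenerated reference."""
    reference = prbs7(seed)
    return sum(1 for bit, expected in zip(received, reference)
               if bit != expected)


# Generator side: two full periods of 127 bits.
sent = list(islice(prbs7(), 254))

# Receiver side: a clean channel and one with a single flipped bit.
clean_errors = count_errors(sent)
corrupted = sent.copy()
corrupted[10] ^= 1
flipped_errors = count_errors(corrupted)
```

The analyser needs only the seed to regenerate the expected stream, so no reference data has to be stored alongside each of the many generator/consumer pairs.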



### Empirical tests of the system



- Many test cases with different variations of parameters were tested
- The results show the expected trends, but the packing algorithm and the HW architecture have to be tuned to achieve better results
- The consumption of HW resources is too high





- The area of ASICs and FPGAs is becoming increasingly dominated by routing resources, causing routing congestion
- «The skyscraper problem» may stop devices from growing ever larger
- Traditional routing can be used for local connections (using **lower** metal layers)
- Global routing is transferred to an «economical network» (i.e. using only the necessary amount of routing) with deterministic behaviour and using top metal layers
- The global network (e.g. using static TDM switching) may be customized to provide channels for
  - Low-bandwidth control/data signals with controlled latency
  - A packet-switched system for random-access communication between modules. Some of the modules in the CTG become the routers.
- Based on the CTG (Communication Trace Graph), which specifies all modules and their interconnection needs, this global network may be synthesized by a system such as CNoCC

