QCDOC is QCD On a Chip
Multi-teraflops QCD requirements scale as N^2.5, requiring
Thousands of processors
High-bandwidth, low-latency network
QCDOC architecture exploits the regularity of lattice QCD and supports about 20,000 processors of 1 Gflops each
CPU/FPU and communications hardware integrated on a single chip manufactured by IBM
Single-chip processing node:
1 Gflops, 64-bit IEEE FPU, PowerPC
4 Mbytes on-chip memory
6-dimensional mesh network:
1.5 Gbyte/sec/node network bandwidth
≤ 500 ns network latency
≤ 2 Gbyte DIMM DDR SDRAM external memory per node
Commodity Fast Ethernet provides independent host access to each node
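
The 6-dimensional mesh listed above gives each node a nearest neighbour in the plus and minus direction of every dimension, i.e. 12 links per node. A minimal sketch of that addressing follows. The machine shape `DIMS`, the row-major rank ordering, the periodic wraparound at the mesh edges, and the function names are all assumptions made for illustration; none of them come from the QCDOC design itself.

```python
def rank_to_coords(rank, dims):
    """Convert a linear node index to mesh coordinates (last axis varies fastest)."""
    coords = []
    for extent in reversed(dims):
        coords.append(rank % extent)
        rank //= extent
    return tuple(reversed(coords))


def coords_to_rank(coords, dims):
    """Inverse of rank_to_coords: mesh coordinates back to a linear node index."""
    rank = 0
    for c, extent in zip(coords, dims):
        rank = rank * extent + c
    return rank


def neighbours(rank, dims):
    """Ranks of the +1 and -1 neighbours along every axis.

    Periodic wraparound is ASSUMED here; with an extent of 2 the two
    directions along that axis reach the same node.
    """
    coords = rank_to_coords(rank, dims)
    result = []
    for axis, extent in enumerate(dims):
        for step in (+1, -1):
            shifted = list(coords)
            shifted[axis] = (coords[axis] + step) % extent
            result.append(coords_to_rank(shifted, dims))
    return result


# Hypothetical 1024-node partition of a 6-dimensional mesh.
DIMS = (4, 4, 4, 4, 2, 2)

if __name__ == "__main__":
    print(rank_to_coords(0, DIMS))   # coordinates of node 0
    print(neighbours(0, DIMS))       # its 12 neighbour links
```

Because lattice QCD only exchanges data between adjacent lattice sites, a nearest-neighbour map like this is the only routing a regular domain decomposition needs.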

Block Diagram of the QCDOC ASIC

Four-node development system

Ethernet/JTAG Prototypes
ASIC design nearly finished
Physics code performance running in simulator:
single-node FPU, SU(3) × 2-spinor: 84% (L1 cache), 78% (eDRAM)
cache fill/flush to eDRAM 3.2 GB/s sustained
bandwidth from/to eDRAM 2.5/2.0 GB/s
DDR fill/flush 1.2/1.7 GB/s
2-node Ethernet simulation
4-node OS development system: PowerPC 405GP and Ethernet/JTAG boards
preliminary schedule:
~1.5 TFlops sustained in 2002
~10 TFlops sustained in 2003
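
The SU(3) × 2-spinor kernel named in the simulator results applies a 3×3 complex colour matrix to each of the two spin components of a half spinor. The sketch below shows the operation and its floating-point cost in plain Python. It is illustrative only, not the QCDOC assembly kernel. The function names are hypothetical, and reading the quoted percentages as fractions of the 1 Gflops peak is an assumption.

```python
def su3_times_half_spinor(u, chi):
    """Apply the 3x3 complex colour matrix u to both spin components of chi.

    u   : 3x3 nested list of complex numbers (colour matrix)
    chi : 2x3 nested list of complex numbers (spin x colour)
    """
    return [
        [sum(u[a][b] * chi[s][b] for b in range(3)) for a in range(3)]
        for s in range(2)
    ]


# Real floating-point operations per complex operation.
FLOPS_PER_CMUL = 6   # 4 multiplies + 2 adds
FLOPS_PER_CADD = 2

# One 3x3 complex matrix-vector product needs 9 complex multiplies and
# 6 complex adds; the kernel does this for 2 spin components.
FLOPS_PER_KERNEL = 2 * (9 * FLOPS_PER_CMUL + 6 * FLOPS_PER_CADD)


def sustained_gflops(peak_gflops, efficiency):
    """Sustained rate, ASSUMING efficiency is a fraction of peak."""
    return peak_gflops * efficiency


if __name__ == "__main__":
    identity = [[1 if a == b else 0 for b in range(3)] for a in range(3)]
    chi = [[1 + 2j, 3 + 4j, 5 + 6j], [7 + 8j, 9 + 10j, 11 + 12j]]
    print(su3_times_half_spinor(identity, chi))
    print(FLOPS_PER_KERNEL)
    print(sustained_gflops(1.0, 0.84))
```

Under that assumption, the L1-cache and eDRAM figures would correspond to roughly 0.84 and 0.78 Gflops sustained per node for this kernel.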