Intel's Teraflops Research Chip implements several innovations for multi-core architectures:
- Rapid design - The tiled-design approach allows designers to use smaller cores that can easily be repeated across the chip. A single-core chip of this size (100 million transistors) would take roughly twice as long and twice as many people to design.
- Network on a chip - In addition to the compute element, each core contains a 5-port message-passing router. The routers are connected in a 2D mesh network that implements message passing. This mesh interconnect scheme could prove much more scalable than today's multi-core chip interconnects, allowing for better communication between the cores and delivering more processor performance.
- Fine-grain power management - The individual compute engines and data routers in each core can be activated or put to sleep based on the performance required by the application a person is running. In addition, new circuit techniques give the chip world-class power efficiency: 1 teraflops requires only 62 W, a power draw comparable to that of desktop processors sold today.
- Other innovations - These include sleep transistors, mesochronous clocking, and clock gating.
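
To make the mesh interconnect from the list above more concrete, here is a minimal sketch of how a message might travel between tiles in a 2D mesh. It assumes that the five router ports are four links to neighboring tiles plus one link to the local compute element, and it uses dimension-ordered (XY) routing purely as an illustration; the article does not say which routing scheme the chip uses. The function names and the 4x4 mesh size are hypothetical.

```python
def xy_route(src, dst):
    """Return the (x, y) tiles visited from src to dst, inclusive.

    Dimension-ordered (XY) routing: travel along X until the column
    matches, then along Y. This is an illustrative choice, not the
    chip's documented routing scheme.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


def router_ports(tile, cols, rows):
    """List the ports in use at a tile of a cols x rows mesh.

    Assumption: one port serves the local compute element and one
    port serves each existing mesh neighbor, so an interior tile
    uses all five ports while edge and corner tiles use fewer.
    """
    x, y = tile
    ports = ["local"]
    if x > 0:
        ports.append("west")
    if x < cols - 1:
        ports.append("east")
    if y > 0:
        ports.append("south")
    if y < rows - 1:
        ports.append("north")
    return ports


if __name__ == "__main__":
    # Hypothetical 4x4 mesh, chosen only to keep the example small.
    print(xy_route((0, 0), (2, 1)))
    print(router_ports((1, 1), cols=4, rows=4))
```

In a mesh like this, the number of hops grows with the distance between tiles rather than with the total number of cores, which is one reason such an interconnect may scale better than a shared bus.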
Below is a summary of results from the research chip. Note that while performance gains can still be made through frequency scaling, there is a significant cost in terms of energy efficiency. This underscores the motivation to scale by utilizing more and more cores, instead of just increasing the frequency.
| Frequency | Voltage | Power | Aggregate Bandwidth | Performance |
|-----------|---------|-------|---------------------|-------------|
| 3.16 GHz | 0.95 V | 62 W | 1.62 Terabits/s | 1.01 Teraflops |
| 5.1 GHz | 1.2 V | 175 W | 2.61 Terabits/s | 1.63 Teraflops |
| 5.7 GHz | 1.35 V | 265 W | 2.92 Terabits/s | 1.81 Teraflops |
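
The energy-efficiency cost of frequency scaling can be read directly off the results table. The short sketch below divides performance by power for each operating point; the numbers are copied from the table, while the function and variable names are illustrative.

```python
# Operating points copied from the results table:
# (frequency in GHz, voltage in V, power in W, performance in teraflops)
OPERATING_POINTS = [
    (3.16, 0.95, 62.0, 1.01),
    (5.1, 1.20, 175.0, 1.63),
    (5.7, 1.35, 265.0, 1.81),
]


def gflops_per_watt(teraflops, watts):
    """Energy efficiency in gigaflops per watt."""
    return teraflops * 1000.0 / watts


def efficiency_table(points):
    """Return (frequency_ghz, gflops_per_watt) for each operating point."""
    return [(freq, gflops_per_watt(tflops, power))
            for freq, _voltage, power, tflops in points]


if __name__ == "__main__":
    for freq, eff in efficiency_table(OPERATING_POINTS):
        print(f"{freq:4.2f} GHz: {eff:5.2f} GFLOPS/W")
```

Efficiency falls from roughly 16.3 GFLOPS/W at 3.16 GHz to about 9.3 GFLOPS/W at 5.1 GHz and 6.8 GFLOPS/W at 5.7 GHz. Put differently, going from the lowest to the highest operating point buys about 1.8 times the performance for about 4.3 times the power.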
ASCI Red was the first computer to benchmark at a teraflops (1996). That system used nearly 10,000 Pentium® Pro processors running at 200 MHz and consumed 500 kW of power, plus an additional 500 kW just to cool the room that housed it. Although it is not a general-purpose computing device, the Teraflops Research Chip delivers 1.0 teraflops of performance and 1.6 terabits/s of aggregate core-to-core communication bandwidth while dissipating only 62 W.
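
As a rough back-of-the-envelope check on that comparison, the sketch below computes the ratio of ASCI Red's power draw to the research chip's at about one teraflops each. The figures come from the paragraph above; the names are illustrative, and the comparison ignores that the two systems are very different kinds of machines.

```python
ASCI_RED_COMPUTE_W = 500_000.0  # 500 kW consumed by the system itself
ASCI_RED_COOLING_W = 500_000.0  # additional 500 kW to cool the room
RESEARCH_CHIP_W = 62.0          # research chip at about 1.0 teraflops


def power_ratio(reference_watts, chip_watts):
    """How many times more power the reference system draws than the chip."""
    return reference_watts / chip_watts


if __name__ == "__main__":
    compute_only = power_ratio(ASCI_RED_COMPUTE_W, RESEARCH_CHIP_W)
    with_cooling = power_ratio(ASCI_RED_COMPUTE_W + ASCI_RED_COOLING_W,
                               RESEARCH_CHIP_W)
    print(f"Compute power only: about {compute_only:,.0f}x")
    print(f"Including cooling:  about {with_cooling:,.0f}x")
```

On these figures the chip draws roughly 8,000 times less power than ASCI Red's compute hardware alone, and roughly 16,000 times less once cooling is included.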
Bringing tera-scale computing to PCs and servers requires a new way of building processors that can be thought of as a network of powerful computers on a chip. This Teraflops Research Chip is one important example of how the Intel® Tera-scale Computing Research Program aims to change the future through constant hardware and software innovation.