Tera-scale Computing
Integration Challenges and Tradeoffs for Terascale Architectures
REFERENCES
[1] J. Andrews and N. Baker, "XBOX 360 System Architecture," IEEE Micro, March-April 2006.
[2] J. Balfour and W. J. Dally, "Design Tradeoffs for Tiled CMP On-Chip Networks," International Conference on Supercomputing, June 2006.
[3] J. Barth et al., "A 500MHz Random Cycle 1.5ns-Latency, SOI Embedded DRAM Macro Featuring a 3T Micro Sense Amplifier," IEEE International Solid-State Circuits Conference, Feb. 2007.
[4] B. M. Beckmann, M. R. Marty and D. A. Wood, "ASR: Adaptive Selective Replication for CMP Caches," in Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (Micro), Orlando, FL, December 2006.
[5] R. V. Boppana and S. Chalasani, "Fault-Tolerant Wormhole Routing Algorithms for Mesh Networks," IEEE Trans. Computers, vol. 44, no. 7, pp. 848-864, July 1995.
[6] S. Borkar, "Challenges in Reliable System Design in the Presence of Transistor Variability and Degradation," IEEE Micro, vol. 25, no. 6, pp. 10-16, Nov.-Dec. 2005.
[7] F. Briggs et al., "Intel 870: A Building Block for Cost-Effective Scalable Servers," IEEE Micro, March-April 2002, pp. 36-47.
[8] D. Chaiken, C. Fields, K. Kurihara, A. Agarwal, "Directory-based cache coherence in large-scale multiprocessors," IEEE Computer, June 1990, pp. 49-58.
[9] J. Chang and G. S. Sohi, "Cooperative Caching for Chip Multiprocessors," in Proceedings of the 33rd International Symposium on Computer Architecture, Boston, MA, June 2006.
[10] "Compute-Intensive, Highly Parallel Applications and Uses," Intel Technology Journal, Volume 09 Issue 02, May 2005.
[11] P. Gratz, K. Sankaralingam, H. Hanson, P. Shivakumar, R. McDonald, S. Keckler, D. Burger, "Implementation and Evaluation of a Dynamically Routed Processor Operand Network," IEEE/ACM International Symposium on Networks-on-Chips (NOCS), May 2007.
[12] "IEEE standard for Scalable Coherent Interface (SCI)," IEEE P1596, August 1993.
[13] Intel News Release, "Intel Develops Tera-Scale Research Chips," Sept. 26, 2006, at http://www.intel.com/pressroom/archive/releases/20060926corp_b.htm.
[14] D. N. Jayasimha, B. Zafar, Y. Hoskote, "On-die Interconnection Networks: Why They are Different and How to Compare Them," Technical Report at http://blogs.intel.com/research/terascale/ODI_why-different.pdf, Microprocessor Technology Lab, Corporate Technology Group, Intel Corp.
[15] C. Kim, D. Burger, and S. W. Keckler, "An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches," in Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 2002.
[16] A. Kumar, L-S. Peh, P. Kundu, N. Jha, "Express Virtual Channels: Towards the Ideal Interconnection Fabric," in Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA'07), pp. 150-161, June 2007.
[17] D. Lenoski, J. Laudon, T. Joe, D. Nakahira, L. Stevens, A. Gupta, and J. Hennessy, "The DASH Prototype: Implementation and Performance," in Proceedings of the 19th International Symposium on Computer Architecture, pp. 92-103, Gold Coast, Australia, May 1992.
[18] A. S. Leon, et al., "A power-efficient high-throughput 32-thread SPARC processor," IEEE International Solid-State Circuits Conference, Feb. 2006.
[19] "MPI Performance Measurements" at http://www.llnl.gov/computing/mpi/mpi_benchmarks.html.
[20] C. McNairy and R. Bhatia, "Montecito: A dual-core, dual-threaded Itanium® processor," IEEE Micro, March-April 2005.
[21] C. A. Nicopoulos, D. Park, J. Kim, N. Vijaykrishnan, M. S. Yousif, C. R. Das, "ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip Routers," International Symposium on Microarchitecture (MICRO'06), pp. 333-346, Dec. 2006.
[22] M. K. Qureshi, A. Jaleel, Y. N. Patt, S. C. Steely Jr., J. Emer, "Adaptive Insertion Policies for High Performance Caching," International Symposium on Computer Architecture, June 2007.
[23] N. Sakran, et al., "The Implementation of the 65nm Dual-Core 64b Merom Processor," IEEE International Solid-State Circuits Conference, Feb. 2007.
[24] P. Stenstrom, "A survey of cache coherence schemes for multiprocessors," IEEE Computer, pp. 12-24, June 1990.
[25] M. B. Taylor, W. Lee, S. Amarasinghe, A. Agarwal, "Scalar Operand Networks: On-chip Interconnect for ILP in Partitioned Architectures," International Symposium on High Performance Computer Architecture, February 2003.
[26] A. W. Topol et al., "Three-dimensional integrated circuits," IBM Journal of Research and Development, vol. 50, no. 4/5, 2006, pp. 491-506.
[27] S. Vangal et al., "An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS," IEEE International Solid-State Circuits Conference, Feb. 2007.
[28] H-S. Wang, L-S. Peh, N. Jha, "Power-driven design of router microarchitectures in on-chip networks," International Symposium on Microarchitecture (MICRO'03), pp. 105-116, Nov. 2003.
[29] P. Wu, A. E. Eichenberger, A. Wang, P. Zhao, "An integrated simdization framework using virtual vectors," International Conference on Supercomputing, pp. 169-178, June 2005.
[30] M. Zhang and K. Asanovic, "Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches," MIT CSAIL Technical Report, MIT-CSAIL-TR-2005-064, Cambridge, MA, October 2005.
[31] M. Zhang and K. Asanovic, "Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors," in Proceedings of the 32nd International Symposium on Computer Architecture, Madison, WI, June 2005.
