Lecture 1: Fundamentals of Quantitative Design and Analysis

Lecture 2: Memory Hierarchy Design

Lecture 3: Instruction-Level Parallelism and Its Exploitation

·         H. M. Mathis, A. E. Mericas, J. D. McCalpin, R. J. Eickemeyer, and S. R. Kunkel, “Characterization of simultaneous multithreading (SMT) efficiency in POWER5,” IBM J. Res. & Dev., 49:4/5 (July/September), 555–564.

·         B. Sinharoy, R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner, “POWER5 system microarchitecture,” IBM J. Res. & Dev., 49:4/5, 505–521.

·         J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy, “POWER4 system microarchitecture,” IBM J. Res. & Dev., 46:1, 5–26.

·         N. Tuck and D. Tullsen, “Initial observations of the simultaneous multithreading Pentium 4 processor,” Proc. 12th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT), 2003, pp. 26–34.

Lecture 5: Thread-Level Parallelism

Lecture 6: Data-Level Parallelism

Appendix A: Instruction Set Architecture

·         D. A. Jimenez and C. Lin, “Neural methods for dynamic branch prediction,” ACM Trans. Computer Sys., 20:4 (November), 369–397.

·         C. McNairy and D. Soltis, “Itanium 2 processor microarchitecture,” IEEE Micro, vol. 23, no. 2, pp. 44–55, Mar.-Apr. 2003.