Monday, June 21, 2004

Performance Analysis of High Performance Computing Networks

After a lot of writing, reviewing, and rewriting, my thesis is finally complete! Thanks to Dr. Bauer and Dr. Katchabaw for their help.

(Updated Oct. 2004.) The full thesis can be downloaded here.

The following is the table of contents of my work:

Network Performance Measurement and Analysis in High Performance Computing Environments

Chapter 1 Introduction.

Chapter 2 Background.
2.1 HPC History and Its Convergence to Cluster Computing.
2.2 HPC Networking.
2.2.1 High Performance Network Technologies.
2.2.2 Networking of HPC Clusters.
2.3 Message Passing Interface (MPI).
2.3.1 MPI Introduction.
2.3.2 MPICH.
2.4 Job Management System.
2.4.1 Goals of JMS.
2.4.2 LSF (Load Sharing Facility).
2.5 File Systems in HPC Clusters
2.5.1 Storage Networking.
2.5.2 Cluster File Systems.
2.5.3 Network Storage in SHARCNET.
2.6 Test-bed Specifications

Chapter 3 Implementation of Hpcbench.
3.1 A Survey of Network Measurement Tools.
3.2 Metrics.
3.3 Communication Model.
3.4 Timers and Timing.
3.5 Iteration Estimation and Communication Synchronization.
3.6 System Resource Tracing.
3.7 UDP Communication Measurement Considerations.
3.8 An Overview of Hpcbench.

Chapter 4 Investigation of Gigabit Ethernet in HPC Clusters.
4.1 A Closer Look at Gigabit Ethernet
4.1.1 Protocol Properties.
4.1.2 Interrupt Coalescence and Jumbo Frame Size.
4.1.3 Data Buffers and Zero-Copy Technique.
4.2 Network Performance Analysis of Gigabit Ethernet
4.2.1 Examining Network Protocols Communication Internal
4.2.1.1 Alpha SMP Architecture.
4.2.1.2 Intel Xeon SMP Architecture.
4.2.2 Network Performance vs. Computer Performance.
4.2.3 Blocking and Non-blocking Communication.
4.2.4 Local Communication.
4.2.5 Network Protocols Latency.
4.2.6 TCP/IP Communication Throughput
4.3 A Comparison with Myrinet and Quadrics Interconnects

Chapter 5 Conclusions and Future Work.
5.1 Summary and Conclusions
5.2 Future Work.

References