Sunday, August 13, 2006
HPCBench Now Supports Linux Kernel 2.6.X
You can visit http://hpcbench.sourceforge.net for more information about HPCBench.
Friday, October 08, 2004
HPCBench Now Open Source
Overview
Hpcbench is a Linux-based network benchmark for evaluating high-performance networks such as Gigabit Ethernet, Myrinet and QsNet. Hpcbench measures network latency and achievable throughput between two end points. For each test, Hpcbench can log kernel information, including CPU and memory usage, interrupts, swapping, paging, context switches, and network card statistics.
Hpcbench consists of three independent packages that test UDP, TCP and MPI communication respectively. A kernel resource tracing tool, "sysmon", is also included; its output is similar to that of vmstat but carries more network statistics.
Programming language: C, MPI.
Recommended OS and compiler: Linux kernel 2.4 and gcc.
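On Linux, counters like these are exposed through the /proc filesystem. As a minimal illustration (not Hpcbench's or sysmon's actual code), the following C program reads the per-interface byte and packet counters from /proc/net/dev; sampling them before and after a test gives the network card statistics for that run:

/* sketch: dump per-interface traffic counters from /proc/net/dev,
   the same kernel source a tracing tool such as sysmon can sample */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[512], name[32];
    unsigned long long rx_bytes, rx_pkts, tx_bytes, tx_pkts;
    FILE *fp = fopen("/proc/net/dev", "r");

    if (fp == NULL) {
        perror("fopen /proc/net/dev");
        return 1;
    }
    fgets(line, sizeof(line), fp);    /* skip the two header lines */
    fgets(line, sizeof(line), fp);
    while (fgets(line, sizeof(line), fp) != NULL) {
        char *colon = strchr(line, ':');
        if (colon == NULL)
            continue;
        *colon = ' ';
        /* RX: bytes packets errs drop fifo frame compressed multicast, then TX: bytes packets ... */
        if (sscanf(line, "%31s %llu %llu %*llu %*llu %*llu %*llu %*llu %*llu %llu %llu",
                   name, &rx_bytes, &rx_pkts, &tx_bytes, &tx_pkts) == 5)
            printf("%-8s RX %llu bytes (%llu pkts)  TX %llu bytes (%llu pkts)\n",
                   name, rx_bytes, rx_pkts, tx_bytes, tx_pkts);
    }
    fclose(fp);
    return 0;
}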
UDP communication (see the sketch after this list):
- Microsecond resolution
- Roundtrip time test (UDP ping)
- Throughput test
- Unidirectional and Bidirectional test
- UDP traffic generator (can run in single mode)
- Fixed size and exponential test
- Log throughputs and process resource usage of each test
- Log system resource information of client and server (Linux only)
- Create plot configuration file for gnuplot
- Configurable message size
- Other tunable parameters:
- Port number
- Client and server's UDP socket buffer size
- Message size
- Packet (datagram) size
- Data size of each read/write
- QoS (TOS) type (six pre-defined levels)
- Test time
- Test repetition
- Maximum throughput restriction (unidirectional and UDP traffic generator modes)
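The UDP round-trip (ping) test boils down to timestamping a send and the matching receive with microsecond resolution. The sketch below illustrates that idea only; it is not Hpcbench's actual code and assumes a UDP echo service is already listening at the given address and port:

/* sketch: UDP round-trip time measured with gettimeofday(),
   assuming the peer echoes every datagram back */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    struct sockaddr_in peer;
    struct timeval t1, t2;
    char buf[64];
    double usec;
    int i, sock, trials = 10;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <server-ip> <port>\n", argv[0]);
        return 1;
    }
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = inet_addr(argv[1]);
    peer.sin_port = htons(atoi(argv[2]));

    memset(buf, 'a', sizeof(buf));
    for (i = 0; i < trials; i++) {
        gettimeofday(&t1, NULL);
        sendto(sock, buf, sizeof(buf), 0, (struct sockaddr *)&peer, sizeof(peer));
        recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL);   /* wait for the echo */
        gettimeofday(&t2, NULL);
        usec = (t2.tv_sec - t1.tv_sec) * 1e6 + (t2.tv_usec - t1.tv_usec);
        printf("round-trip %d: %.0f usec\n", i, usec);
    }
    close(sock);
    return 0;
}

The parameters listed above (message size, socket buffer size, QoS level, repetitions) are essentially knobs layered on top of this kind of loop.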
TCP communication (see the sketch after this list):
- Microsecond resolution
- Roundtrip Time test (TCP ping)
- Throughput test
- Unidirectional and Bidirectional test
- Blocking and non-blocking test
- Fixed size and exponential test
- Linux sendfile() test
- Log throughputs and process resource usage of each test
- Log system resource information of client and server (Linux only)
- Create plot configuration file for gnuplot
- Configurable message size
- Other tunable parameters:
- Port number
- Client and server's TCP socket buffer (window) size
- Message size
- Data size of each read/write
- Iteration of read/write
- MTU (MSS) setting
- TCP socket's TCP_NODELAY option setting
- TCP socket's TCP_CORK option setting
- QoS (TOS) type (six pre-defined levels)
- Test time
- Test repetition
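The unidirectional throughput test follows a similar pattern: write fixed-size messages over a connected TCP socket for a number of iterations and divide the bytes moved by the elapsed time. The client-side sketch below is only an illustration, not Hpcbench's code; it assumes a sink server at the given address that simply reads and discards the data:

/* sketch: one-way TCP throughput from the sender's point of view */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MSG_SIZE   65536          /* size of each write()        */
#define ITERATIONS 10000          /* number of messages to send  */

int main(int argc, char *argv[])
{
    struct sockaddr_in server;
    struct timeval t1, t2;
    static char buf[MSG_SIZE];
    long long total = 0;
    double secs;
    int i, sock;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <server-ip> <port>\n", argv[0]);
        return 1;
    }
    sock = socket(AF_INET, SOCK_STREAM, 0);
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = inet_addr(argv[1]);
    server.sin_port = htons(atoi(argv[2]));
    if (connect(sock, (struct sockaddr *)&server, sizeof(server)) < 0) {
        perror("connect");
        return 1;
    }
    memset(buf, 'a', sizeof(buf));

    gettimeofday(&t1, NULL);
    for (i = 0; i < ITERATIONS; i++)
        total += write(sock, buf, sizeof(buf));   /* count bytes actually written */
    gettimeofday(&t2, NULL);

    secs = (t2.tv_sec - t1.tv_sec) + (t2.tv_usec - t1.tv_usec) / 1e6;
    printf("sent %lld bytes in %.3f s: %.2f Mbps\n",
           total, secs, total * 8.0 / secs / 1e6);
    close(sock);
    return 0;
}

Note that timing only the sender's writes ignores data still buffered in flight when the loop ends, which is why a careful benchmark has to synchronize both ends before reporting a result; this is one of the issues discussed in the thesis below (Section 3.5, Iteration Estimation and Communication Synchronization).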
MPI communication (see the sketch after this list):
- Microsecond resolution
- Roundtrip Time test (MPI ping)
- Throughput test
- Unidirectional and Bidirectional test
- Blocking and non-blocking test
- Fixed size and exponential test
- Log throughputs and process resource usage of each test
- Log system resource information of the two processes (nodes) (Linux only)
- Create plot configuration file for gnuplot
- Tunable parameters:
- Message size
- Test time
- Test repetition
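For MPI, the round-trip test is a pair of matched blocking send/receive calls timed with MPI_Wtime(). The sketch below is illustrative only (not Hpcbench's actual code); it measures the average round-trip time between ranks 0 and 1 and is launched with mpirun -np 2:

/* sketch: MPI round-trip time between rank 0 and rank 1 */
#include <stdio.h>
#include <mpi.h>

#define MSG_SIZE 1024
#define TRIALS   100

int main(int argc, char *argv[])
{
    char buf[MSG_SIZE];
    double t1, t2;
    int i, rank;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        t1 = MPI_Wtime();
        for (i = 0; i < TRIALS; i++) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        }
        t2 = MPI_Wtime();
        printf("average round-trip time: %.2f usec\n", (t2 - t1) * 1e6 / TRIALS);
    } else if (rank == 1) {
        for (i = 0; i < TRIALS; i++) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}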
Monday, June 21, 2004
Performance Analysis of High Performance Computing Networks
(Updated Oct. 2004.) The full thesis can be downloaded here.
Following is the table of contents of my work:
Network Performance Measurement and Analysis in High Performance Computing Environments
Chapter 1 Introduction.
Chapter 2 Background.
2.1 HPC history and Its Convergence to Cluster Computing.
2.2 HPC networking.
2.2.1 High Performance Network Technologies.
2.2.2 Networking of HPC Clusters.
2.3 Message Passing Interface (MPI).
2.3.1 MPI Introduction.
2.3.2 MPICH.
2.4 Job Management System.
2.4.1 Goals of JMS.
2.4.2 LSF (Load Sharing Facility).
2.5 File Systems in HPC Clusters
2.5.1 Storage Networking.
2.5.2 Cluster File Systems.
2.5.3 Network Storage in SHARCNET.
2.6 Test-bed specifications
Chapter 3 Implementation of Hpcbench.
3.1 A Survey of Network Measurement Tools.
3.2 Metrics.
3.3 Communication Model.
3.4 Timers and Timing.
3.5 Iteration Estimation and Communication Synchronization.
3.6 System Resource Tracing.
3.7 UDP Communication Measurement Considerations.
3.8 An Overview of Hpcbench.
Chapter 4 Investigation of Gigabit Ethernet in HPC Clusters.
4.1 A Closer Look at Gigabit Ethernet
4.1.1 Protocol Properties.
4.1.2 Interrupts Coalescence and Jumbo Frame Size.
4.1.3 Data Buffers and Zero-Copy Technique.
4.2 Network Performance Analysis of Gigabit Ethernet
4.2.1 Examining Network Protocol Communication Internals
4.2.1.1 Alpha SMP Architecture.
4.2.1.2 Intel Xeon SMP Architecture.
4.2.2 Network Performance vs. Computer Performance.
4.2.3 Blocking and Non-blocking Communication.
4.2.4 Local Communication.
4.2.5 Network Protocols Latency.
4.2.6 TCP/IP Communication Throughput
4.3 A Comparison with Myrinet and Quadrics Interconnects
Chapter 5 Conclusions and Future Work.
5.1 Summary and Conclusions
5.2 Future Work.
Reference
Tuesday, March 16, 2004
MPICH Cluster Setup
This is a test setup of a two-node cluster using my home machines.
The /etc/hosts entries (Red Hat 9.0 with kernel 2.4.20) on both machines:
Dell Inspiron 8100 (master node): 192.182.1.2 node1
Dell Dimension L600 (secondary node): 192.182.1.3 node2
Download MPICH 1.2.5.2 from http://www-unix.mcs.anl.gov/mpi/mpich/ and follow the instructions to install it on both machines. MPICH nodes communicate with each other over rsh or ssh; the default is rsh. If you prefer to use ssh (secure shell) instead, configure with the following parameters in the MPICH source directory:
[root]# ./configure --with-device=ch_p4 --prefix=/usr/local/mpich --rsh=ssh
[root]# make
After installation, add /mpich_install_dir/bin and /mpich_install_dir/util to your $PATH environment variable. To let the master node (laptop node1) know about the secondary nodes, list all nodes in the file /mpich_install_dir/util/machines/machines.LINUX:
[huang]$ cat machines.LINUX
# Change this file to contain the machines that you want to use
# to run MPI jobs on. The format is one host name per line, with either
# hostname
# or
# hostname:n
# where n is the number of processors in an SMP. The hostname should
# be the same as the result from the command "hostname"
#localhost.localdomain
node1
node2
To enable rsh (remote shell), edit /etc/xinetd.d/rsh and change the line "disable = yes" to "disable = no". For convenience, I also enabled the rlogin service the same way. After these modifications, restart the xinetd daemon:
[root]# /etc/rc.d/init.d/xinetd restart
To let node1 (the master node) run programs on node2 automatically without a password prompt, add a .rhosts file to the user's home directory on node2:
[huang] $ cat ~/.rhosts
node1 huang
Also, the /etc/hosts.allow and /etc/hosts.deny files must be configured correctly to allow the rsh service. For simplicity, I added the following line to /etc/hosts.allow to accept all services between the two machines:
ALL: node1 node2 192.182.1.0/255.255.255.0, 127.0.0.1
To allow the superuser root to use the rsh and rlogin services, add the following entries to /etc/securetty (one per line):
rsh
rlogin
rexec
pts/0
pts/1
The authentication file /etc/pam.d/rsh should also be modified:
[root]# cat /etc/pam.d/rsh
#%PAM-1.0
# For root login to succeed here with pam_securetty, "rsh" must be
# listed in /etc/securetty.
auth sufficient /lib/security/pam_nologin.so
auth optional /lib/security/pam_securetty.so
auth sufficient /lib/security/pam_env.so
auth sufficient /lib/security/pam_rhosts_auth.so
account sufficient /lib/security/pam_stack.so service=system-auth
session sufficient /lib/security/pam_stack.so service=system-auth
We can verify the rsh service from the master node (node1):
[huang]$ rsh node2 "ps -ef"
The running processes on node2 will then be shown on node1, and we are ready to run parallel programs. There are some sample programs in the /mpich_install_dir/examples/basic directory; enter that directory and compile the source code with make (on both machines). For example, cpi is an MPI program that computes the value of pi:
[huang]$ mpirun -np 2 cpi
Process 0 of 2 on node1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.001943
Process 1 of 2 on node2
Make sure the executable files on each machine are in the same directory structure. We can also specify a configuration file instead of using the default machines.LINUX configuration:
[huang]$ cat my.conf
node1 0 /home/huang/cpi
node2 1 /huang/cpi
[huang]$ mpirun -p4pg my.conf cpi
Process 0 of 1 on node1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.002097
Process 1 of 2 on node2
P4 procgroup file is my.conf.
[huang]$
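Beyond the bundled examples, a short MPI program of your own is a quick way to confirm that both nodes take part in a run. The hello.c below (a file name made up for this sketch) reports each process's rank and host name; compile it with mpicc on both machines, keep the executable in the same path, and launch it like cpi:

/* hello.c: every MPI process reports its rank and the host it runs on */
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    char host[64];
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(host, sizeof(host));
    printf("Hello from process %d of %d on %s\n", rank, size, host);
    MPI_Finalize();
    return 0;
}

[huang]$ mpicc -o hello hello.c
[huang]$ mpirun -np 2 hello
With node1 and node2 listed in machines.LINUX, you should see one "Hello" line from each host.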
Enjoy the power of parallel computing!