HemeLB Deployment

From RealityGrid

HemeLB deployment

HemeLB is a high-performance lattice-Boltzmann fluid flow solver that is particularly well suited to simulating single-phase fluid behaviour confined in complex systems, either on a single multiprocessor platform or across several such machines. HemeLB is being used in the GENIUS project to characterize blood flow in the arteries and vessels of the brain (cerebral haemodynamics). Computational resources on the TeraGrid, LONI, NW-Grid, NGS, HPCx and HECToR will be used to simulate various cerebral vascular systems. AHE and HARC will be used whenever possible to launch simulations on single and multiple resources across these infrastructures. This page, which will be updated frequently, tracks the progress of HemeLB's installation on the various platforms. For information about the deployment of HARC on those resources, see GENIUS HARC. For information about the integration of HARC into the AHE, see AHE GENIUS.


Running HemeLB

We are in the process of installing stable versions of HemeLB on various machines. The table below gives two directories for an installation: one for the executable and one for the input data. One executable is currently available, 'hemelb_bench', which has been compiled using the standard single-machine MPI. Four datasets are available: 'angio1', 'angio4', 'square_duct_32x16x16' and 'bifurcation_512x512x512'. For simple testing, the square duct dataset is recommended.

HemeLB takes two arguments: the location of the input files and the time to run (for benchmarking purposes). For example, to run on the SDSC IA-64 cluster using angio1 for 3 minutes:

/users/smanos/shared/hemelb/bin/hemelb_bench /path/to/input/files/angio1 3


Be sure to copy the input directory and the files it contains to your own writable disk space, since output files are created within that directory.
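
The copy-then-run step can be scripted. The following is a minimal Python sketch, not part of the HemeLB distribution. The helper names (stage_dataset, bench_command, run_bench) are hypothetical, it assumes each dataset is a directory sitting directly under the dataset location given in the installation table, and it launches the executable directly, without any site-specific MPI launcher or batch system.

```python
import shutil
import subprocess
from pathlib import Path

# Executable path taken from the SDSC IA-64 example on this page.
HEMELB_BENCH = "/users/smanos/shared/hemelb/bin/hemelb_bench"


def stage_dataset(shared_data_dir, dataset, work_dir):
    """Copy one dataset directory into writable space and return the copy's path.

    Assumes the dataset is a directory named `dataset` under `shared_data_dir`.
    """
    src = Path(shared_data_dir) / dataset
    dst = Path(work_dir) / dataset
    shutil.copytree(src, dst)  # raises FileExistsError if dst already exists
    return dst


def bench_command(input_dir, minutes, executable=HEMELB_BENCH):
    """Build the two-argument command line: input location, then run time."""
    return [str(executable), str(input_dir), str(minutes)]


def run_bench(input_dir, minutes, executable=HEMELB_BENCH):
    """Launch the benchmark directly; MPI launchers and queues are not handled."""
    return subprocess.run(bench_command(input_dir, minutes, executable), check=True)
```

For example, `run_bench(stage_dataset("/users/smanos/shared/hemelb/data", "square_duct_32x16x16", "/path/to/your/scratch"), 3)` would copy the square duct dataset and then run it for 3 minutes; the scratch path is a placeholder.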

Installation Status

The following table (under construction) summarises the installation status of HemeLB and of the RealityGrid steering library, and gives the locations of the RealityGrid-instrumented HemeLB installations.



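Infrastructure | Machine | HemeLB | RealityGrid deployed | RealityGrid HemeLB | Binary location | Dataset location
TeraGrid | SDSC IA-64 | Yes | Yes | Yes | /users/smanos/shared/hemelb/bin | /users/smanos/shared/hemelb/data
TeraGrid | NCSA IA-64 | Yes | Yes | Yes | |
TeraGrid | ANL IA-64 | Yes | No | No | |
TeraGrid | NCSA Abe | Yes | No | No | |
TeraGrid | TACC Lonestar | Yes | Yes | Yes | |
TeraGrid | TACC Ranger | No | No | No | |
TeraGrid | PSC Bigben | Yes | No | No | |
TeraGrid | LONI QueenBee | No (network to machine under installation) | No | No | |
LONI | bluedawg | Yes | Yes | Yes | |
LONI | ducky | Yes | Yes | Yes | |
LONI | zeke | Yes | Yes | Yes | |
NW-Grid | man1 | No | Yes | No | |
NW-Grid | man2 | No | Yes | No | |
NW-Grid | lancs1 | No | No | No | |
NW-Grid | lv1 | No | No | No | |
NW-Grid | dll | No | No | No | |
NGS2 | Manchester | Yes | Yes | Yes | |
NGS2 | Oxford | Yes | Yes | Yes | |
NGS2 | STFC-RAL | No (In Progress) | Yes (C only) | No | |
NGS2 | Leeds | Yes | Yes | Yes | |
HPCx | | Yes | No | No | |
HECToR | | Yes | No | No | |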

TeraGrid

HemeLB will soon be installed on the TeraGrid machine Lonestar. It has already been tested on the IA-64 Linux clusters at SDSC and NCSA and on Bigben.

LONI

HemeLB has already been installed on all publicly usable platforms of the LONI infrastructure.

Starting in very early 2008, Queen Bee will become part of the TeraGrid while remaining the central machine on LONI.

NW-Grid

We will request support from NW-Grid (the regional Grid in the North-West of England) to obtain accounts and to install HemeLB there.


NGS2

HemeLB has already been installed on the new NGS2 resources of the UK National Grid Service (NGS) located at Manchester and Oxford.

HPCx

HemeLB has already been installed on HPCx, where users do not currently have permission to make advance reservations using AHE and HARC.


HECToR

The new UK National Service machine HECToR will be available online in October 2007.

HemeLB has been compiled and tested on the HECToR test system by Kevin Roy, and is working on the Cray XT4 [30/8/2007].

RealityGrid

The RealityGrid steering library has been installed and tested at all the sites highlighted in green and yellow in the table above. The yellow sites have notes below. Both file-based (for debugging) and grid-based steering versions have been installed. At present all installations are in my (Robert Haines) home directory, but I hope to move them to a public location soon.

Middleware status

See the Registry Browser for details of running containers and IOProxies. If you can't see the page, you'll need to send me your DN and I'll add you to the ACL.

Site specific notes

  • NGS2 at RAL - I'm having trouble with the Intel compilers. I have submitted a ticket to grid-support.ac.uk.

HemeLB's fluid solver performance

The single-site performance of HemeLB's fluid solver has been tested on HPCx by simulating the flow in an idealized bifurcation of 6.25M fluid lattice sites. The performance, in millions of fluid lattice site updates per second (MSUPS) and time steps per second (TSS), as a function of the number of processors (PEs) is given in the following table:



PEs MSUPS TSS
8 41.98 6.713
32 136.5 21.83
64 262.5 41.98
128 635.4 101.6
256 1185 189.5
512 2097 335.3
1024 3579 572.3

The time needed to check convergence and stability was not included, and the executable was built with "xlc -qstrict -qstrict_induction -O5 -qipa=level=2 -qhot=vector -qcache=auto -lm -bmaxdata:0x70000000".
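
Speedup and parallel efficiency follow directly from the table above. The sketch below is an illustration in Python rather than anything shipped with HemeLB. It takes the 8-PE run as the baseline (an assumption, since no smaller run is reported) and also checks that MSUPS divided by TSS recovers the stated lattice size of about 6.25M sites.

```python
# (PEs, MSUPS, TSS) rows copied from the HPCx table above.
HPCX = [
    (8, 41.98, 6.713),
    (32, 136.5, 21.83),
    (64, 262.5, 41.98),
    (128, 635.4, 101.6),
    (256, 1185.0, 189.5),
    (512, 2097.0, 335.3),
    (1024, 3579.0, 572.3),
]


def scaling(rows):
    """Return (PEs, speedup, efficiency) relative to the first row."""
    base_pes, base_msups, _ = rows[0]
    results = []
    for pes, msups, _ in rows:
        speedup = msups / base_msups
        efficiency = speedup / (pes / base_pes)
        results.append((pes, speedup, efficiency))
    return results


def implied_sites_millions(rows):
    """MSUPS / TSS is the number of sites updated per time step, in millions."""
    return [msups / tss for _, msups, tss in rows]


for pes, speedup, efficiency in scaling(HPCX):
    print(f"{pes:5d} PEs  speedup {speedup:6.2f}  efficiency {efficiency:5.2f}")
```

Relative to 8 PEs, the efficiency at 1024 PEs comes out at roughly two thirds.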

Preliminary cross-site benchmarks have been conducted on LONI machines, simulating the flow field of a patient-specific cerebral system of about 4.69M fluid lattice sites. Each cross-site run used two machines, with half of the total number of processors (PEs) on each. The timing results in MSUPS are reported in the following table (the numbers in parentheses are the TSS):



PEs Single-site Cross-site
16 50.09 (10.68) 49.24 (10.50)
32 104.2 (22.22) 101.2 (21.58)
64 236.7 (50.46) 238.0 (50.74)


In contrast to the bifurcation benchmarks, the time spent checking convergence and stability was taken into account. These cross-site benchmarks are preliminary: the elapsed times were too short, the resolution of the function that returns the elapsed time is too low compared with the duration of a single time step, and the compiler optimization level used on those machines was relatively low ("-O3"). However, the timing results are not far from the definitive ones. Cross-site runs perform very similarly to single-site runs, and the performance is very high in both cases. Superlinear speedup is clearly achieved in both cases.
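
Both observations can be checked against the LONI table with a few lines of Python. This is only an illustrative sketch; the 16-PE runs are taken as the baseline, which is an assumption because no smaller runs are reported.

```python
# (PEs, single-site MSUPS, cross-site MSUPS) rows copied from the LONI table above.
LONI = [
    (16, 50.09, 49.24),
    (32, 104.2, 101.2),
    (64, 236.7, 238.0),
]


def cross_to_single(rows):
    """Ratio of cross-site to single-site throughput for each PE count."""
    return {pes: cross / single for pes, single, cross in rows}


def speedups(rows, column):
    """Measured and ideal (linear) speedup relative to the first row.

    column is 1 for the single-site results and 2 for the cross-site results.
    """
    base_pes, base_rate = rows[0][0], rows[0][column]
    return [(row[0], row[column] / base_rate, row[0] / base_pes) for row in rows]


for pes, ratio in cross_to_single(LONI).items():
    print(f"{pes:3d} PEs  cross-site / single-site = {ratio:.3f}")

for label, column in (("single-site", 1), ("cross-site", 2)):
    for pes, measured, ideal in speedups(LONI, column):
        print(f"{label:11s} {pes:3d} PEs  speedup {measured:.2f} (ideal {ideal:.0f})")
```

With these numbers the cross-site throughput stays within about 3% of the single-site value, and the measured speedup exceeds the ideal one at 32 and 64 PEs in both columns.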
