GENIUS Meeting 18 Minutes
GENIUS Meeting 24/4/08
Item 1: Attendance
ANL: PB, SN
EPCC: SB
UCL: PVC, MM, SM, SZ, OK, LF
Manchester: RH, RP
SDSC: KY
LSU: SJ
Item 2: Clinical update
SM: Have identified kit that we need to purchase to get link put together. Close to getting network link in place.
SM: Have high resolution datasets from scanning equipment. Hope to be able to put images from dataset online.
Item 3: HemeLB
MM: With the latest large angiographic datasets, the code now performs as it should. Interactive simulations need 100-2000 cores (1 hr on 1000 cores). Studying parameter variations to tune simulations.
MM: Working with Kevin Roy on OpenMP/MPI hybrid simulations. First results not successful.
MM: Would like to run HemeLB on 8K+ cores, but need higher resolution datasets.
Item 4: Advance reservations, emergency computing, and back-end node access
SM has provided SB with the details required to get back-end access to HPCx up and running.
SB: Are next-in-queue SPRUCE jobs on HPCx useful?
PVC: We'd be very happy with that.
SM: How do the various resource providers handle real emergency jobs that require immediate access to the machine?
PB: UC have set up their machine to do pre-emption. Regular jobs are charged at a lower rate than emergency jobs. Some resources provide pre-emption on a single queue on the machine. Some IBM resources support freezing a job. Would advise that HPCx provide both next-to-run and pre-emptive jobs to test with.
SN: GENIUS has next-to-run access on two resources at SDSC.
KY: For pre-emption, can only have one most important customer who is allowed to do pre-emption.
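Note: the difference between the next-to-run and pre-emption policies discussed above can be sketched as a toy model. This is a minimal illustration only. It does not use SPRUCE or any site scheduler's interface, the function and policy names are hypothetical, and requeueing the interrupted job at the head of the queue is an assumption (a real system might freeze, kill, or checkpoint it instead).

```python
def start_order(running, queued, urgent, policy):
    """Toy model: return (order in which jobs hold the machine, interrupted jobs).

    running -- name of the job currently on the machine
    queued  -- names of the jobs waiting, in queue order
    urgent  -- name of the newly submitted emergency job
    policy  -- "normal", "next-to-run" or "pre-emption" (hypothetical labels)
    """
    interrupted = []
    if policy == "normal":
        # The urgent job waits behind everything already queued.
        order = [running] + list(queued) + [urgent]
    elif policy == "next-to-run":
        # The urgent job jumps to the head of the queue,
        # but the running job is allowed to finish first.
        order = [running, urgent] + list(queued)
    elif policy == "pre-emption":
        # The urgent job takes the machine immediately; the running job
        # is interrupted and (by assumption) requeued at the head.
        interrupted.append(running)
        order = [urgent, running] + list(queued)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return order, interrupted


if __name__ == "__main__":
    for policy in ("normal", "next-to-run", "pre-emption"):
        print(policy, start_order("A", ["B", "C"], "U", policy))
```

Under next-to-run the emergency job still waits for the running job to finish, which is why SM's question about jobs needing immediate access points towards pre-emption.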
PB: 550 TFlops on ANL large computing facility. Have 1 machine that is unaccounted for - PB can give access to 4000 cores (1 rack) on BlueGene-P. Production machine has 40,000 4-core nodes - 160,000 cores. Have director's discretionary time to give access to machine. LRac is coming up to provide millions of hours to users.
OK: Will request access to test rack.
PB: Don't need to have access to the single-rack test machine before accessing the full machine, if you can demonstrate the code works on BlueGene-P already.
OK: Will go straight on to ANL machine.
PB: Will put OK in touch with applications team re codes and libraries currently available on machine.
PB: Haven't yet enabled urgent computing on big BlueGene. Only 20 projects on machine. Could provide next-to-run access to small BlueGene.
PB: Apply for account on one-rack machine (Surveyor). PB will act as sponsor and approve.
SJ: No progress on MPI flavour of Globus - running into serious build problems.
SM: Would like to run some more accurate benchmarks across multiple resources for papers that are coming up.
Item 5: Software development
MM: Investigating a new LB algorithm with lower memory requirements.
RH: Hoping to get started on many of the things we've discussed in the next week. Nothing else to report right now.
SM: Working on developing proper user interface from tomorrow onwards. Should be ready to go by end of May.
RH: Already has a Java interface for steering up and running.
Item 6: Publications
SM & PVC will attend TG08. SJ and KY will also be there. Not sure who will attend from ANL.
SJ: On the Monday of TG conference, Extreme Scaling working group will be meeting.
PVC: HPDC paper accepted. We may not attend - paper accepted as poster. Looking for someone who is attending to speak.
SJ: Will attend. Would be glad to speak to the poster.
PVC: Will be a show floor with stands at AHM08.
SM: Plan to submit a general GENIUS paper, and also a paper on the infrastructure we've put together. Will also submit a paper on distributed computing.
PVC: SC'08 - timetable for submissions has slipped. Need to produce a video, and need infrastructure up and running ready for submission.
Item 7: AOB
KY: SM can you make executables available for testing GUR?
SM: Will do, next week.
RH: HPCx article has gone into the latest issue. Should appear soon.
Item 8: Next Meeting
Thurs 8 May.
