MPI-over-Grid
Talk given at the 9th AstroGrid-D Meeting
in Potsdam, 25th February 2008
Overview
MPI (Message Passing Interface)
- focus is on parallel communications.
- machines must at least be running at the same time.
- many applications require synchronized communication, which means
  the machines have to run at about the same speed.
- but there are many applications that don't require such synchronization.
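
Why synchronized applications want machines of similar speed can be sketched with a toy model: if every node must wait for the slowest one at each step, the step time is the maximum over all nodes. The function names and the per-step timings below are illustrative assumptions, not measurements.

```python
def synchronized_runtime(step_times, steps):
    """Wall-clock time when every node waits for the slowest node at each step."""
    return steps * max(step_times)


def parallel_efficiency(step_times):
    """Fraction of node time spent computing rather than waiting (mean / max)."""
    return sum(step_times) / (len(step_times) * max(step_times))


# Hypothetical per-step compute times in seconds for four nodes.
matched = [1.0, 1.0, 1.0, 1.0]   # equally fast machines
mixed = [1.0, 1.0, 1.0, 4.0]     # one machine is four times slower

runtime_matched = synchronized_runtime(matched, steps=10)
runtime_mixed = synchronized_runtime(mixed, steps=10)
efficiency_mixed = parallel_efficiency(mixed)
```

In this model a single slow node sets the pace for all the others. An application without synchronization points would not pay this penalty.
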
Grid technology
- primary advantage: authentication via a trust mechanism
- some nice tools: job submission, fast file transfer
How can they go together?
- MPI-over-Grid
- submitting MPI jobs to a cluster using grid technology
Requirements
- each node must be a Grid resource
- requires knowledge of the architecture and Globus flavor on the nodes
- ports for the Globus 2 gateway
Job submission over grid
- big advantage: abstracts away the cluster resource manager.
- important for those who run jobs on many different clusters.
Issues
- synchronization (application dependent)
- if nodes are remote, bandwidth is lower than within a cluster
- communications overhead on a fast cluster interconnect
- job management
- administration, installation of Globus
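
The bandwidth issue above can be made concrete with the usual first-order model of message cost, time = latency + size / bandwidth. This is only a sketch: the latency and bandwidth figures are made-up round numbers chosen for illustration, not measurements of any real cluster or wide-area link.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost of sending one message: startup latency plus payload time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Assumed, illustrative link parameters (not measured values).
CLUSTER_LINK = {"latency_s": 5e-6, "bandwidth_bytes_per_s": 1e9}
WIDE_AREA_LINK = {"latency_s": 20e-3, "bandwidth_bytes_per_s": 10e6}

message_bytes = 1_000_000  # a 1 MB message

t_cluster = transfer_time(message_bytes, **CLUSTER_LINK)
t_wide_area = transfer_time(message_bytes, **WIDE_AREA_LINK)
slowdown = t_wide_area / t_cluster
```

For small messages the latency term dominates, so applications that exchange many small messages are affected most by remote nodes.
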
Topologies
(depend heavily on application)
- heterogeneous nodes
  - synchronization issues
  - job/utilization permission issues
- tightly communicating clusters
  - synchronization issues
  - not clear where the grid comes in here; the clusters
    could communicate without grid authentication
  - some communications overhead
- MPI-over-Grid on remote clusters
  - each node must be a grid resource, but the current installation
    of Globus makes this difficult
- hierarchical
  - compute nodes are not grid resources; only the head nodes are
  - MPI jobs run on the clusters; a controlling process on each
    cluster's head node communicates intermediate results among clusters
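
The hierarchical topology can be sketched as a two-level reduction. The code below is a plain Python simulation, not real MPI: `local_reduce` stands in for a reduction inside one cluster, and the cluster names and values are hypothetical.

```python
def local_reduce(node_values):
    """Stand-in for an MPI reduction inside one cluster (fast local interconnect)."""
    return sum(node_values)


def hierarchical_sum(clusters):
    """Sum values held by compute nodes, exchanging only per-cluster partial results.

    `clusters` maps a head-node name to the list of values on its compute nodes.
    """
    # Level 1: each cluster reduces locally; only its head node holds the result.
    partials = {head: local_reduce(values) for head, values in clusters.items()}
    # Level 2: head nodes (the only grid resources) combine one value per cluster.
    total = sum(partials.values())
    values_crossing_grid = len(partials)
    return total, partials, values_crossing_grid


# Hypothetical clusters: three head nodes with six compute-node values in total.
clusters = {"head_a": [1, 2, 3], "head_b": [4, 5], "head_c": [6]}
total, partials, values_crossing_grid = hierarchical_sum(clusters)
```

Only one intermediate result per cluster crosses the grid here, rather than one value per compute node, which is the appeal of this layout when wide-area links are slow.
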
Problems
- MPICH-G2 is a moribund project
  - if MPI is wanted in the grid, pressure must be applied
- job managers are inadequate (Condor?)
  - must manage arch/flavor, with fine control over when, and how much,
    a resource may be used by whom
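
What such a job manager would need to check can be sketched as a simple matching rule. Everything here is a hypothetical data model invented for illustration (the `Resource` and `Job` classes, their fields, and the example values); it does not describe Condor or any existing job manager.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    arch: str             # e.g. "x86_64" (illustrative value)
    flavor: str           # Globus flavor string (illustrative value)
    allowed_users: set    # who may use the resource
    allowed_hours: range  # when: hours of the day open to grid jobs
    max_cpus: int         # how much: upper limit per job


@dataclass
class Job:
    user: str
    arch: str
    flavor: str
    cpus: int


def may_run(job, resource, hour):
    """True if the job matches arch/flavor and respects who/when/how-much limits."""
    return (
        job.arch == resource.arch
        and job.flavor == resource.flavor
        and job.user in resource.allowed_users
        and hour in resource.allowed_hours
        and job.cpus <= resource.max_cpus
    )


# Hypothetical resource open to two users between 20:00 and 23:59.
node = Resource("node01", "x86_64", "gcc64dbg", {"alice", "bob"}, range(20, 24), 16)
job = Job(user="alice", arch="x86_64", flavor="gcc64dbg", cpus=8)

ok_at_night = may_run(job, node, hour=22)
ok_at_noon = may_run(job, node, hour=10)
```
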