[Carpet] component geometry for multi-core processor chips

Steve White steve.white at aei.mpg.de
Tue Aug 29 10:05:43 CEST 2006


Erik,

That was the easy part.

The harder question is how to efficiently pack the computational domain
with the clumps of processes.

"Clump" is new.  Do we ever shy away from new nomenclature? 

Let me clarify my terminology here:
	core: hardware computational processing unit (previously CPU)
	node: an individual computer in a cluster, each of which may 
	      contain several cores
	computational domain: the rectangular range of xyz values
	component: a rectangular subrange of the computational domain,
	           whose interior calculations are executed on a single
	           processor core
	clump: a set of contiguous components, meant to be assigned to
	       the processor cores of a single node.


* a reasonable assumption is that the number of cores on a node is a
  multiple of 2 (or even a power of 2).

* it will be best for the clumps all to be the same.  That is, the total
  number of processes should be restricted to
	k × n
  where k is the clump size and n is the number of clumps, and the clumps
  should all be arranged the same way.  (As opposed to: on a 4 processor
  machine, having some clumps arranged as 2 × 2 slabs, and others as
  1 × 4 chains.)  Otherwise, several other issues arise, which in the end
  may ruin the performance gain.

  I would suggest to warn the user if they have chosen a geometry that
  doesn't permit a good decomposition into clumps, and fall back to the
  old algorithm in that case.

* for clump size k = 1 or 8, the clumps can be cubical (1 × 1 × 1 or
  2 × 2 × 2 components), so when decomposing the computational domain one
  can treat the clumps just as components were treated before.

* for clump size k = 2 (Peyote, Mike) or 4 (as on Belladonna), the clumps
  can't be cubical, so one has to determine a good orientation for them to
  best pack the computational domain. 
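
To make the orientation question concrete, here is a rough sketch of a
brute-force search over clump shapes.  It is only an illustration, not what
Carpet does: the function names are made up, it assumes the domain must
divide evenly among the processes, and the cost (surface of a clump first,
as a stand-in for inter-node traffic, then surface of a component) is my
own guess at a reasonable measure.

```python
def clump_shapes(k):
    """All (a, b, c) with a*b*c == k: the box arrangements of k items."""
    return [(a, b, k // (a * b))
            for a in range(1, k + 1) if k % a == 0
            for b in range(1, k // a + 1) if (k // a) % b == 0]


def surface(box):
    """Surface area of a box with edge lengths (x, y, z)."""
    x, y, z = box
    return 2 * (x * y + y * z + z * x)


def best_clump_layout(domain, k, n):
    """Pick how n clumps tile the domain and how k components sit in a clump.

    domain: grid points (nx, ny, nz); k: cores per node; n: number of nodes.
    Returns (clump_grid, clump_shape), or None if no even decomposition
    exists (the case where one would warn and fall back to the old algorithm).
    """
    best = None
    for grid in clump_shapes(n):        # arrangement of clumps in the domain
        for shape in clump_shapes(k):   # arrangement of components in a clump
            procs = tuple(g * s for g, s in zip(grid, shape))
            if any(d % p for d, p in zip(domain, procs)):
                continue                # assumption: only even splits allowed
            comp = tuple(d // p for d, p in zip(domain, procs))
            clump = tuple(c * s for c, s in zip(comp, shape))
            cost = (surface(clump), surface(comp))
            if best is None or cost < best[0]:
                best = (cost, grid, shape)
    return None if best is None else best[1:]
```

For k = 4 this tries the 2 × 2 slabs and the 1 × 4 chains in every
orientation, and every clump gets the same arrangement, as suggested above.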

On 28.08.06, Erik Schnetter wrote:
> On Aug 28, 2006, at 01:52:15, Steve White wrote:
> 
> >Second, it would be better not to ask the specific batch system  
> >(PBS) about
> >which processes are running on which nodes.  I would prefer a means  
> >that
> >is independent of batch system.
> >
> >How about:
> >
> >	if this is not the MPI root process
> >		* send to MPI root process the result of
> >			system( "uname -n" )
> 
> You mean Util_GetHostName.
> 
> >	otherwise
> >		* make a hash of
> >			node_name, mpi_rank
> >		* wait for message from each other process
> >		* for each message,
> >			add to the hash the message body with mpi_rank
> >		* add to the hash the present node name with mpi_rank = 0
> >
> >Now you have an easy association of nodes to MPI rank numbers.
> 
> Clever.  I didn't think of that.  I was thinking along the lines of  
> running a short benchmark, determining latencies and bandwidths  
> between the individual processors.  That would unfortunately be quite  
> expensive, since you would need to test each processor pair, and the  
> individual communications would influence each other.
> 
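
For what it's worth, the bookkeeping on the root process in the scheme
quoted above might look roughly like this.  It is a sketch only: it assumes
the host names have already been collected into a list indexed by MPI rank
(by whatever means, e.g. Util_GetHostName plus the messages to the root),
and both function names are hypothetical.

```python
from collections import defaultdict


def ranks_by_node(hostnames):
    """Map each node name to the MPI ranks running on it.

    hostnames[r] is the host name reported by the process with rank r;
    entry 0 is the root's own host name.
    """
    nodes = defaultdict(list)
    for rank, host in enumerate(hostnames):
        nodes[host].append(rank)
    return dict(nodes)


def uniform_clump_size(nodes):
    """Return k if every node runs the same number k of processes, else None."""
    sizes = {len(ranks) for ranks in nodes.values()}
    return sizes.pop() if len(sizes) == 1 else None
```

If uniform_clump_size returns None, the processes are not spread evenly
over the nodes, and one would presumably fall back to the old algorithm.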


-- 
Steve White : Programmer
Max-Planck-Institut für Gravitationsphysik      Albert-Einstein-Institut
Am Mühlenberg 1, D-14476 Golm, Germany                  +49-331-567-7625



