Configuring csWMPI

csWMPI is configured through two files:

  • The Cluster Configuration describes the machines, and optionally the communication methods and security settings for your cluster.

  • The Process Group Configuration describes the initial set of processes at startup. This configuration includes which machines run what executables with what arguments.

Cluster Configuration

The Cluster Configuration file describes the machines available in a cluster, and optionally the communication devices to use and the security context of processes on each machine. Usually a user creates a Cluster Configuration file that contains information about all the machines and then re-uses that file for all csWMPI applications.

The path for this configuration file can be set in the environment variable csWMPI_CLUSTER_CONF_FILE. If this environment variable is not set, csWMPI will search for the default configuration file, named csWMPI.clusterconf, in the current working directory.
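The lookup rule above can be sketched in a few lines of Python. This is only an illustration of the documented rule, not csWMPI's own code; the function name cluster_conf_path and its parameters are hypothetical, and treating an empty variable as "not set" is an assumption.

```python
import os
from pathlib import Path

ENV_VAR = "csWMPI_CLUSTER_CONF_FILE"   # variable name taken from the text above
DEFAULT_NAME = "csWMPI.clusterconf"    # default file name taken from the text above


def cluster_conf_path(env=None, cwd=None):
    """Return the cluster configuration path implied by the documented rule."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    configured = env.get(ENV_VAR)
    if configured:                      # assumption: an empty value counts as "not set"
        return Path(configured)
    return cwd / DEFAULT_NAME           # fall back to the current working directory


if __name__ == "__main__":
    print(cluster_conf_path())
```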

File format of cluster configuration files
A cluster configuration file is a normal text file (ASCII). It should be created and edited using a text editor like Notepad or pico. The lines that start with a hash mark (#) are comment lines. These lines are not interpreted by csWMPI.

A simple cluster configuration file consists of a /Machines line followed by one line for each machine in your cluster. Machines must be identified by a valid network name or IP address, so that all machines can communicate with each other.

A typical configuration can look like this:
/Machines
squirrel
mountain
flan

This describes a cluster with three machines: squirrel, mountain, and flan. Optionally, you can configure how processes should communicate and under which security context they should run; see Advanced Configuration.
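As a rough illustration of the simple file format, the following Python sketch extracts the machine names from a cluster configuration text. It is not the parser csWMPI uses. The function name parse_machines is hypothetical, and two behaviours are assumptions: blank lines are skipped, and any other line starting with "/" ends the machine list.

```python
def parse_machines(text):
    """Collect the machine names listed under a /Machines line."""
    machines = []
    in_machines = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                          # comment lines are not interpreted
        if line.startswith("/"):
            # assumption: another "/Section" line ends the machine list
            in_machines = (line == "/Machines")
            continue
        if in_machines:
            machines.append(line)
    return machines


EXAMPLE = """# my cluster
/Machines
squirrel
mountain
flan
"""

if __name__ == "__main__":
    print(parse_machines(EXAMPLE))
```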


Process Group

A process group is the initial set of processes created at the beginning of a computation. The process group for a given application is configured in a Process Group file, which can be one of two types. PG files (with the .pg extension, and the only type available up to csWMPI v1.6.0) are easier to use for small configurations, while PG2 files (usually with the .pg2 extension) are more flexible and easier to maintain in complex situations. In either case, the file lets you configure which executables are started on all or a subset of the machines specified in the cluster configuration file. Several runtime options can also be specified in a PG2 file.

PG file format (.pg files)
The format of a .pg file is as follows (only files with .pg extension are interpreted as such):
<Machine Name> <number of procs> <executable path> [arguments]
<Machine Name> <number of procs> <executable path> [arguments]
...

Each machine is identified by the same name as in the cluster configuration file. The number of procs denotes how many copies of the executable to run on the specified machine. The executable path is interpreted on the machine on which the executable is to run. Optionally, you can specify arguments for the executable.

For example, for a heterogeneous computation with mountain and squirrel running MS Windows and pacific running Linux, the process group file would look like this:
mountain 2 "c:\mpi apps\weathersim" 10 256
squirrel 1 "c:\mpi apps\weathersim" 10 512
pacific 1 /home/csWMPI-user/apps/weathersim 20 256

The MPI ranks are assigned to processes in the order in which the process entries appear in the configuration file. As in the cluster configuration file, the hash mark (#) can be used to place comments in the configuration file (characters after a hash mark are not interpreted).
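To make the format and the rank ordering concrete, here is a minimal Python sketch that reads .pg text and lists ranks in order of appearance. The function names parse_pg and assign_ranks are hypothetical. The sketch assumes that ranks start at 0, that a "#" never occurs inside a quoted path, and it ignores the extra user-created process of a direct-run start.

```python
import re

TOKEN = re.compile(r'"[^"]*"|\S+')   # a double-quoted string, or a run of non-blank characters


def parse_pg(text):
    """Return (machine, count, executable, args) for each entry line."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments (assumes no '#' in paths)
        if not line:
            continue
        tokens = [t.strip('"') for t in TOKEN.findall(line)]
        if len(tokens) < 3:
            raise ValueError(f"incomplete entry: {raw!r}")
        machine, count, executable, *args = tokens
        entries.append((machine, int(count), executable, args))
    return entries


def assign_ranks(entries, first_rank=0):
    """Give each process copy the next rank, in file order."""
    ranks = []
    for machine, count, executable, args in entries:
        for _ in range(count):
            ranks.append((first_rank + len(ranks), machine, executable, args))
    return ranks


PG_TEXT = r'''mountain 2 "c:\mpi apps\weathersim" 10 256
squirrel 1 "c:\mpi apps\weathersim" 10 512
pacific 1 /home/csWMPI-user/apps/weathersim 20 256
'''

if __name__ == "__main__":
    for rank, machine, executable, args in assign_ranks(parse_pg(PG_TEXT)):
        print(rank, machine, executable, args)
```

Under these assumptions, the example file yields ranks 0 and 1 on mountain, rank 2 on squirrel, and rank 3 on pacific.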

PG2 file format (.pg2 files)
The PG2 file format is XML-based and supports more options than .pg files; the XML structure keeps complex files easy to read and maintain. A simple .pg2 file looks as follows:
<job>
     <set>
         <executable>c:\myapps\myapp.exe</executable>
         <processes>2</processes><!-- processes per machine -->
         <machine name="machine1"/>
         <machine name="machine2"/>
         <machine name="machine3"/>
     </set>
</job>

According to this .pg2 file, 2 processes will be created on each machine, and MPI ranks are distributed according to the order of appearance of the machine elements in the file.
PG2 files allow many more options to be defined; please refer to Advanced Configuration | Process Group for detailed information on all .pg2 XML elements.
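The simple .pg2 layout shown above can be read with a standard XML parser. The following Python sketch, using the standard library's xml.etree.ElementTree, lists the machines in order of appearance with their process counts. The function name summarize_pg2 is hypothetical, and only the four elements from the example (job, set, executable, processes, machine) are handled; other .pg2 elements are outside this sketch.

```python
import xml.etree.ElementTree as ET

PG2_TEXT = r"""<job>
     <set>
         <executable>c:\myapps\myapp.exe</executable>
         <processes>2</processes><!-- processes per machine -->
         <machine name="machine1"/>
         <machine name="machine2"/>
         <machine name="machine3"/>
     </set>
</job>"""


def summarize_pg2(text):
    """Return (machine, processes, executable) tuples in document order."""
    root = ET.fromstring(text)                 # root is the <job> element
    plan = []
    for group in root.findall("set"):
        executable = group.findtext("executable")
        per_machine = int(group.findtext("processes"))
        for machine in group.findall("machine"):
            plan.append((machine.get("name"), per_machine, executable))
    return plan


if __name__ == "__main__":
    plan = summarize_pg2(PG2_TEXT)
    print(plan)
    print("total processes:", sum(count for _, count, _ in plan))
```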

Starting computation with direct-run
When a computation is started by running an MPI executable directly, csWMPI subsequently creates all the processes specified in the .pg2/.pg file. MPI_COMM_WORLD therefore contains all the processes in the .pg2/.pg file plus 1, the user-created process.
One can set the environment variable csWMPI_PG_FILENAME to the path of a .pg2/.pg file. If this variable is set, csWMPI attempts to open the specified file and exits with an error if it cannot. If the environment variable is not defined, csWMPI looks for four different files in the executable's directory, in the following order:

  • [program name].pg2
  • csWMPI.pg2
  • [program name].pg
  • csWMPI.pg
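
The search order for direct-run can be sketched as follows. This Python fragment only mirrors the documented rule. The function names are hypothetical, "[program name]" is assumed to mean the executable's file name without its extension, and returning None when no file is found is a placeholder, since the behaviour in that case is not described here.

```python
import os
from pathlib import Path


def candidate_pg_files(executable):
    """List the four default candidates in the documented order."""
    exe = Path(executable)
    program = exe.stem    # assumption: "[program name]" is the name without extension
    names = (f"{program}.pg2", "csWMPI.pg2", f"{program}.pg", "csWMPI.pg")
    return [exe.parent / name for name in names]


def find_pg_file(executable, env=None):
    """Apply csWMPI_PG_FILENAME first, then the default search order."""
    env = os.environ if env is None else env
    explicit = env.get("csWMPI_PG_FILENAME")
    if explicit:
        if not Path(explicit).is_file():
            # the text says csWMPI exits with an error in this case
            raise FileNotFoundError(explicit)
        return Path(explicit)
    for candidate in candidate_pg_files(executable):
        if candidate.is_file():
            return candidate
    return None           # placeholder: behaviour when nothing is found is not documented here


if __name__ == "__main__":
    for path in candidate_pg_files("/home/csWMPI-user/apps/weathersim"):
        print(path)
```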

    Starting computation with mpiexec
    With the -configfile option of mpiexec, one can specify a .pg2/.pg file containing the process group configuration. Only the processes specified in that file are created, so the size of MPI_COMM_WORLD is the number of processes in the .pg2/.pg file.
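
The difference between the two start methods comes down to one extra process. A small Python sketch of that arithmetic, with a hypothetical function name and hypothetical labels "direct" and "mpiexec" for the two start methods:

```python
def world_size(processes_in_file, started_with="mpiexec"):
    """Expected MPI_COMM_WORLD size for a given start method."""
    if started_with == "mpiexec":
        return processes_in_file          # only the processes listed in the file
    if started_with == "direct":
        return processes_in_file + 1      # plus the process the user started
    raise ValueError(f"unknown start method: {started_with!r}")


if __name__ == "__main__":
    # The weathersim .pg example lists 2 + 1 + 1 = 4 processes.
    print(world_size(4, "mpiexec"), world_size(4, "direct"))
```

For the weathersim .pg example, which lists 4 processes, this gives a size of 4 with mpiexec and 5 with direct-run.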

    Portable .pg2/.pg files
    One can create a portable process group file using the wildcard "." to specify the current machine as the host. Instead of specifying the Machine Name you simply put a ".", for example in a .pg2 file:
    <job>
         <set>
             <executable>c:\mpi apps\weathersim.exe</executable>
             <processes>2</processes><!-- processes per machine -->
             <machine name="."/>
         </set>
    </job>
    or in a .pg file:
    . 2 "c:\mpi apps\weathersim.exe" 10 256 

    Because "." is specified as the machine, weathersim.exe runs on the current machine.
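
One way to picture the wildcard is as a substitution applied to each machine name before processes are placed. The Python sketch below is an illustration only; the function name resolve_machine is hypothetical, and using the local host name as "the current machine" is an assumption about how the name would be obtained.

```python
import socket


def resolve_machine(name, current_machine=None):
    """Replace the "." wildcard with the current machine's name."""
    if name != ".":
        return name                       # explicit names are used as written
    if current_machine is not None:
        return current_machine
    return socket.gethostname()           # assumption: the local host name identifies the machine


if __name__ == "__main__":
    print(resolve_machine("."), resolve_machine("obelix"))
```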


    Testing Configurations

    The application named csWMPItestconf tests a cluster configuration. After a configuration has been modified, it is often more convenient to run this tool than to simply try to run the application.

    What is tested?
    For all machines specified in the cluster configuration, the following is tested:

  • Are the services running?
  • Can the necessary libraries be located?
  • Is the security context (domain\user and password) valid?
  • Is the License Server running?
  • Is the license valid?

    Usage
    csWMPItestconf <cluster configuration file>

    Example
    csWMPItestconf my_cluster.clusterconf

    Assume that my_cluster.clusterconf contains the following configuration:
    /Machines 
    obelix
    alc

    The output of csWMPItestconf, if everything is correct, is:
    
    csWMPItestconf - Copyright 2004 by Critical Software 
    (www.criticalsoftware.com)
    
    License Server Check
    
    + License Server: ideafix
    
    + License:        Valid.
    ---------------------------
    
    
    Cluster configuration Interpretation
    
    The cluster has 2 machines
    
    
     * Machine alc *
    
    + Available devices:
    Device: shmem  Machine id: alc
    Device: tcp  Machine id: alc
    --
    + StartUp Address: alc
    --
    + Connections:
    Connect with machine alc using shmem device.
    Connect with machine obelix using tcp device.
    --
    + Security:
    Domain: CRITICAL
    User: jbrito
    --
    + Verification contact with remote machine:
    All libraries and security context verification succeeded in the 
    remote machine.
    
    
     * Machine obelix *
    
    + Available devices:
    Device: shmem  Machine id: obelix
    Device: tcp  Machine id: obelix
    --
    + StartUp Address: obelix
    --
    + Connections:
    Connect with machine alc using tcp device.
    Connect with machine obelix using shmem device.
    --
    + Security:
    Domain: CRITICAL
    User: jbrito
    --
    + Verification contact with remote machine:
    All libraries and security context verification succeeded in the 
    remote machine.
    
    ---------------------------
    
    csWMPI configuration is correct
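
Report text in the format above can also be scanned by a script, for instance in an automated check. The Python sketch below looks for the final success line and collects the machine names from the "* Machine ... *" headers. The function name summarize_report is hypothetical, and the output format on failure is not shown in this document, so the sketch only distinguishes "success line present" from "success line absent".

```python
SUCCESS_LINE = "csWMPI configuration is correct"   # taken from the sample output above
MACHINE_PREFIX = "* Machine "
MACHINE_SUFFIX = " *"


def summarize_report(text):
    """Return (ok, machines) for csWMPItestconf output text."""
    lines = [line.strip() for line in text.splitlines()]
    machines = [
        line[len(MACHINE_PREFIX):-len(MACHINE_SUFFIX)]
        for line in lines
        if line.startswith(MACHINE_PREFIX) and line.endswith(MACHINE_SUFFIX)
    ]
    ok = SUCCESS_LINE in lines
    return ok, machines


SAMPLE = """
The cluster has 2 machines

 * Machine alc *

+ StartUp Address: alc

 * Machine obelix *

+ StartUp Address: obelix

---------------------------

csWMPI configuration is correct
"""

if __name__ == "__main__":
    print(summarize_report(SAMPLE))
```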
    

    The tool sends the output to the standard output (usually the console). It is advised to redirect the output to a file and open it with a text editor (e.g. Notepad or pico). E.g.:
    > csWMPItestconf my_cluster.clusterconf > test_result.txt
    > notepad test_result.txt


    Registering Users

    In the cluster configuration file you specify the security contexts of the processes running on different machines. The security contexts are the user accounts under which processes should run. To use such accounts, csWMPI needs to know the password(s) of the user(s). csWMPI can store usernames and passwords (encrypted), so that you are not prompted for them each time you run a csWMPI application. In Windows, they are stored in the registry, in the user's personal profile. In Linux, they are saved in a file that, by default, is accessible only by the registering user.

    MS Windows Specific - csWMPIreguser:

    To manage the usernames and passwords database, you should use the tool named csWMPIreguser found in the bin sub-directory of the installation directory.

    When csWMPIreguser is run, a window appears in which users can be added or removed.

    To remove a user, it is not necessary to enter the password. Alternatively, you can use command-line arguments to specify the username, domain, and password (useful for writing scripts that add numerous accounts automatically).

    Usage:

    csWMPIreguser [-a <domain> <username> <password>] [-r <domain> <username>]

    Option                               Description
    -a <domain> <username> <password>    Add a user or alter one's password if the user already exists.
    -r <domain> <username>               Remove a user.
    -help                                Display usage.
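
For scripted registration of several accounts, the -a form shown above can be generated from a list. The Python sketch below only builds the argument lists and does not run anything. The function name add_user_commands and the account data are hypothetical, and the tool is assumed to be reachable under the name csWMPIreguser.

```python
def add_user_commands(accounts, tool="csWMPIreguser"):
    """Build one '-a <domain> <username> <password>' command per account."""
    return [[tool, "-a", domain, username, password]
            for domain, username, password in accounts]


# Hypothetical accounts, for illustration only.
ACCOUNTS = [
    ("CRITICAL", "jbrito", "secret1"),
    ("CRITICAL", "ideafix", "secret2"),
]

if __name__ == "__main__":
    for command in add_user_commands(ACCOUNTS):
        print(command)
```

Each argument list could then be passed to subprocess.run from the Python standard library, which takes the arguments as a list and so avoids shell quoting issues.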

     

    Linux Specific - csWMPIreguser:

    To manage the usernames and passwords database, you should use the tool named csWMPIreguser found in the bin sub-directory of the installation directory.

    Usage:

    csWMPIreguser -a [{<domain>|none} [<username> [<password>]]]
    csWMPIreguser -r {<domain>|none} [<username>]

    Option                                            Description
    -a [{<domain>|none} [<username> [<password>]]]    Add a user or alter one's password if the user already exists. The user will be asked for missing parameters.
    -r [{<domain>|none} [<username>]]                 Remove a user. The user will be asked for missing parameters.
    -help                                             Display usage.

    To register a Linux user with name "ideafix" and password "passwd" you can use, e.g.:
    csWMPIreguser -a none ideafix passwd




    © 2009 Critical Software SA. All trademarks and copyrights on this page are owned by their respective owners.
    cscsWMPI II™, cscsWMPI™ and PatentMPI™ are trademarks of Critical Software SA. All Rights Reserved.