Re: Running Molcas on a Linux Cluster


Posted by Darko Babic on January 17, 2003 at 16:55:10:

In Reply to: Running Molcas on a Linux Cluster posted by Jose C. Corchado on January 16, 2003 at 12:57:29:

I also had problems with armci in Molcas 5.2 (with RedHat 7.2). They disappeared
after replacing armci in molcas52/g/ with the latest version of armci (it was 1.0).
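
For anyone repeating this, a minimal sketch of such a swap follows. It is not the exact procedure used in this thread: `replace_armci` is a hypothetical helper, the location of the unpacked newer ARMCI sources is a placeholder, and the subdirectory name `armci` under molcas52/g/ is an assumption. The GA libraries still have to be rebuilt afterwards with the same options that configure recorded in GAOPTIONS.

```shell
#!/bin/sh
# Hypothetical helper: swap the armci directory inside a GA tree for a newer
# source tree, keeping the old one as a backup.
#   $1 = GA directory (GADIR in the Symbols file, e.g. .../molcas52/g)
#   $2 = directory holding the unpacked newer ARMCI sources (placeholder)
replace_armci() {
    gadir=$1
    newsrc=$2

    # Refuse to touch anything unless both directories exist.
    if [ ! -d "$gadir/armci" ]; then
        echo "no armci directory in $gadir" >&2
        return 1
    fi
    if [ ! -d "$newsrc" ]; then
        echo "new ARMCI sources not found: $newsrc" >&2
        return 1
    fi

    # Do not overwrite an earlier backup.
    if [ -e "$gadir/armci.orig" ]; then
        echo "backup $gadir/armci.orig already exists" >&2
        return 1
    fi

    # Keep the bundled version, then copy the new sources into place.
    mv "$gadir/armci" "$gadir/armci.orig" &&
    cp -R "$newsrc" "$gadir/armci"
}

# Example call (both paths are placeholders):
#   replace_armci /home/corchado/molcas52/g /path/to/new/armci
```

Keeping the old directory as armci.orig makes it easy to go back if the newer armci does not build with the rest of the GA tree.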

But, as Valera Veryazov said, do not expect much from parallel Molcas 5.2.
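
One detail in the log quoted below may be worth a look, although this is only a guess and not a confirmed diagnosis: the "invalid to" values equal the ids of the reporting server tasks reduced modulo 65536, which is what one would see if a negative id passed through an unsigned 16-bit field. The arithmetic can be checked in any POSIX shell (the helper name `wrap16` is made up for this example):

```shell
#!/bin/sh
# wrap16: reduce a signed integer to an unsigned 16-bit value (0..65535).
# Hypothetical helper, only meant to reproduce the numbers seen in the log.
wrap16() {
    echo $(( ($1 % 65536 + 65536) % 65536 ))
}

wrap16 -10001   # prints 55535, the "invalid to" value reported by -10001(s)
wrap16 -10000   # prints 55536, the "invalid to" value reported by -10000(s)
```

If that is indeed what happens, the problem would sit inside armci rather than in the Molcas input, which would fit with a newer armci making the errors go away.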
---------------------------------------------------------------------------------
: Hi,

: I am trying to install Molcas 5.2 on a Red Hat 7.2 Linux Cluster with Portland compilers (Portland's Cluster Development Kit). When I run it on one processor, it works fine, but when I try to run it on two or more processors I get the following error:

: -------- Extract of the output file

: ARMCI configured for 2 cluster nodes
: -10001(s):armci_rcv_req: invalid to: 55535
: -10001(s):armci_rcv_req: invalid to: 55535
: 1:Child process terminated prematurely, status=: 256
: rm_l_1_4949: p4_error: net_recv read: probable EOF on socket: 1
: Last System Error Message from Task -10001:: Resource temporarily unavailable
: 1:Child process terminated prematurely, status=: 256
: Last System Error Message from Task 1:: Resource temporarily unavailable
: -10000(s):armci_rcv_req: invalid to: 55536
: -10000(s):armci_rcv_req: invalid to: 55536
: Last System Error Message from Task -10000:: Resource temporarily unavailable
: 0:Child process terminated prematurely, status=: 256
: 0:Child process terminated prematurely, status=: 256
: Last System Error Message from Task 0:: Resource temporarily unavailable
: /usr/local/PGI/linux86/bin/mpirun: line 1: 29562 Broken pipe /home/corchado/molcas52/bin/seward.exe -p4pg /home/corchado/tmp/test001.input.29354/PI29485 -p4wd /home/corchado/tmp/test001.input.29354
: --- Stop Module: seward at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
: --- Stop Module: seward at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
: bm_list_29563: p4_error: net_recv read: probable EOF on socket: 1
: Non-zero return code - check program input.
: --- Stop Module: automolcas at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
: --- Stop Module: automolcas at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---

: -------- END

: I have tried to run it in a local directory on each node and also in an NFS-shared directory, but neither worked. If the working directory is not NFS-shared, I also get a missing-file error.
: This is the Symbols file that configure generated. I had to change it a little bit in order to make it run on my cluster:

: -------- Symbols file

: # Molcas build symbols generated by ./configure on Fri Jan 10 18:43:27 CET 2003 for MOLCAS version 5.2 patch level 138.

: # ./configure options, DO ONLY CHANGE BY RERUNNING CONFIGURE.
: OS='Linux'
: COMPILER='portland'
: FAST='yes'
: PARALLEL='yes'
: MSGPASS='mpich'

: # Machine.
: HW='i686'

: # Standard commands.
: SH='/bin/ksh'
: MAKE='/usr/bin/gmake'
: CP='/bin/cp'
: MV='/bin/mv'
: RM='/bin/rm'
: LS='/bin/ls'
: AWK='/usr/bin/awk'
: SED='/bin/sed'
: GREP='/bin/grep'
: CHMOD='/bin/chmod'
: FIND='/usr/bin/find'
: MKDIR='/bin/mkdir'
: LN='/bin/ln'
: SOFTLINK='-L'
: WC='/usr/bin/wc'
: MORE='/bin/more'
: CAT='/bin/cat'
: AR='/usr/bin/ar'
: TIME='/usr/bin/time'
: RANLIB='/usr/bin/ranlib'
: LATEX='/usr/bin/latex'
: DVIPS='/usr/bin/dvips'
: MAKEINDEX='/usr/bin/makeindex'
: BIBTEX='/usr/bin/bibtex'

: # Compilers.
: CPP='/usr/bin/cpp'
: CPPFLAGS='-P -C -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
: F77='/usr/local/PGI/linux86/bin/pgf77'
: F77FLAGS='-fast -Minform,warn -Minfo=loop -Munixlogical -I/home/corchado/MOLCAS5/OTRO/molcas52/include -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
: F77NOWARN=' '
: F77STATIC=' '
: F90=''
: F90FLAGS=''
: FPREPROC='F'
: CC='/usr/local/PGI/linux86/bin/pgcc'
: CFLAGS='-O2 -w -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
: LDFLAGS=''

: # External libraries.
: XLIB='-llapack -lblas'

: # Molcas.
: MOLCAS='/home/corchado/MOLCAS5/OTRO/molcas52'
: INCDIR='/home/corchado/MOLCAS5/OTRO/molcas52/include'
: PRGM_LIST=' seward scf rasscf mipi check alaska caspt2 casvb cpfmcpf ffpt funi genano grid_it guga mbpt2 mckinley mclr motra mrci rasread rassi slapaf vibrot '
: UTIL_LIST=' aces2_util amfi_util blas_util casvb_util clones_util dtraf_util essl_util integral_util io_util lapack_util memory_util molcas_ci_util molpro_util nq_util parallel_util pcm_util property_util runfile_util rys_util util '
: MANUALS='manual'
: MOLCASDRIVER='/home/corchado/bin'

: # Global arrays.
: GADIR='/home/corchado/molcas52/g'
: GAINC='/home/corchado/molcas52/g/include'
: GALIB='-L/home/corchado/molcas52/g/lib/LINUX -lma -ltcgmsg-mpi -lglobal -larmci -lpario -L/usr/local/PGI/linux86/lib -lmpich'
: GATARGET='LINUX'
: GAOPTIONS='MPI_LIB=/usr/local/PGI/linux86/lib MPI_INCLUDE=/usr/local/PGI/linux86/include LIBMPI=-lmpich USE_MPI=yes CC=pgcc FC=pgf77'

: # Commands for running executables.
: RUNSCRIPT='$program < $input'
: RUNBINARY='/usr/local/PGI/linux86/bin/mpirun -np $CPUS $program'

: -------- END

: I also tried with p4_ch as the message passing method, but it didn't work either, even though the GA test worked.

: Any suggestion will be deeply appreciated.

: Thank you very much for your help




