Frequently Asked Questions and Error Messages
Last updated Nov 7, 2013
Seeds Framework
MPI Assignment
1. "When I run the matrix multiplication program, the values in the output differ from the values in the file output512x512mult by small quantities, such as:
C =
1329796.50 1266513.38 1330501.88 1262294.12 1428972.12 1292474.62 1365209.62 ...
versus:
C =
1329796.52 1266512.52 1330502.06 1262293.92 1428972.18 1292474.58 1365209.85 ...
"
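Change the data type of the matrices from float to double. A float holds only about 7 significant decimal digits, so values of this size lose their fractional digits and the rounding errors build up as the products are summed. The instructions for Assignment 3 showed the matrix multiplication program with float as the type of the matrices. It was correct in Assignment 2, but not in Assignment 3.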
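As a rough illustration of why the type matters, the sketch below stores a value in a float and reads it back. The function names through_float and show are made up for this example and are not part of the assignment code.

```c
#include <stdio.h>

/* Store a value in a float, then read it back as a double.
   Anything the float cannot represent is rounded away. */
double through_float(double x)
{
    float f = (float)x;
    return (double)f;
}

/* Print the same value as held by a double and by a float. */
void show(double x)
{
    printf("double: %.2f  float: %.2f\n", x, through_float(x));
}
```

Calling show(1329796.52) prints "double: 1329796.52  float: 1329796.50", which is the same kind of difference seen in the first value of the two outputs above.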
2. "My matrix multiplication program works fine for 4, 8, or 16 processors, but when I try to run it for 12, I get the error:
[compute-0-1.local][[6640,1],0][btl_tcp_frag.c:118:mca_btl_tcp_frag_send]
mca_btl_tcp_frag_send: writev error (0x7fffb96329d0, 8272)
Bad address(1)
"
Paraguin Compiler
1. "I can't get the matrix multiplication program to produce the correct results, even though I am transposing the B matrix and scattering it."
Transposing the B matrix and scattering it does not work for the matrix multiplication. Each processor needs the entire B matrix in order to compute the partial results for any single row. So disregard the comment about transposing the matrix and scattering it. You will simply have to broadcast the B matrix. The A matrix can still be scattered.
2. "When I try to compile my program, I'm getting the following output:
mpirun was unable to launch the specified application as it could not access
or execute an executable:
Executable: ./hello.out
Node: compute-0-1.local "
Check that your comments in the job submission file are all on one line. For example, the line
"#$ -pe orte 12 # Specify how many processors we want"
should all be on one line. If the word "want" is on a separate line by itself, it would be a syntax error. Either put the entire comment on one line or put a "#" in front of the part that wraps to the next line.
If that doesn't fix it, it might be because you created the file on your PC and uploaded it as a DOS file. DOS files end each line with a carriage return followed by a linefeed, whereas Unix files end each line with a linefeed alone, so the extra carriage return can be misread as part of the line. You may need to edit the file on babbage, delete the last line, and retype it.
3. "When I try to compile my program, I get the following
error:
In file included from /usr/include/features.h:385,
from /usr/include/stdio.h:28,
from hello.c:7:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
/usr/bin/cpp -D__SCC__ -DPARAGUIN -D_x86_64_
-I/share/apps/suifhome/x86_64-redhat-linux/include -undef -U__GNUC__
-U__GNUC_MINOR__ hello.c /tmp/scc29468_0.i
FAILED (exit status 0x1) "
One possible cause is a typo in the command you are entering to compile. For example, you will get this error if you don't have the proper number of underscores ('_') in "-D__x86_64__". There should be TWO underscores before the "x86" and TWO after the "64". The command in the output above has only one on each side, so the system headers assume a 32-bit build and look for gnu/stubs-32.h.
4. "When I compile my program with the Paraguin compiler,
I'm getting the following error message:
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/include/stdarg.h:40: syntax
error; found `__gnuc_va_list' expecting `;'
"
The lines in the program that declare "__builtin_va_list" need to come BEFORE all the include files, because that type is used in the stdarg.h file.
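The ordering described above might look like the following sketch. The typedef is the form commonly used with Paraguin, but treat it as an assumption and check it against your assignment handout. The function format_greeting is a hypothetical helper, included only so the file has something to compile.

```c
/* This must appear before any #include: stdarg.h (pulled in by
   headers such as stdio.h) refers to this type. Compilers that do
   not define PARAGUIN skip the block entirely. */
#ifdef PARAGUIN
typedef void *__builtin_va_list;
#endif

#include <stdio.h>

/* Hypothetical helper: writes a greeting for the given rank into buf
   and returns the number of characters in the greeting. */
int format_greeting(char *buf, size_t size, int rank)
{
    return snprintf(buf, size, "Hello from process %d", rank);
}
```

If the #ifdef block is moved below the #include line, stdarg.h is processed before the type exists, which would explain the syntax error reported at stdarg.h:40.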
5. "I don't know where I am supposed to put the input file for the matrix multiplication problem."
or
"I get the error:
Usage: ./matrix file
"
The matrix multiplication skeleton program takes the name of the input file as its command-line argument. So the name of the input file you copied into your directory should be given immediately after the name of your program on the command line. If no file name is given, the program prints the usage message shown above.
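The argument handling described above can be sketched roughly as follows. This is not the skeleton's actual code. The function name input_file_name is made up for illustration, and only the usage message is taken from the error quoted in the question.

```c
#include <stdio.h>

/* Returns the input file name given on the command line, or NULL
   (after printing the usage message) when no name was given. */
const char *input_file_name(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "Usage: %s file\n", argv[0]);
        return NULL;
    }
    return argv[1];
}
```

With a check like this, running "./matrix" alone prints the usage message, while "./matrix input.txt" (where input.txt is a placeholder for the file you copied) passes the file name through to the program.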
Assignment 4 (UNC-C cluster issues)
1. I get the error message "Cannot connect to X server localhost:10:0"
Check that your local client is running an X server.
Check that you are forwarding X11 from the server, including through any intermediate servers (e.g. ssh -X ...)
Check if xclock displays.
This is not a server issue.