Matlab Licenses renewed on Opportunity cluster
Just a quick FYI,
The Matlab licenses on the Opportunity cluster have been renewed by the Provost office!
I would like to thank everyone involved!
Leo
Established by the Office of the Provost, managed by the Academic Research Computing User Group and hosted by Information Services
Below is a link to some introductory training slides for the Opportunity cluster. If you are new to clusters... or a seasoned expert, you might find these slides beneficial.
http://opportunity.neu.edu/blog/cluster101-Opportunity.pdf
Leo
In the digital world, there are two types of electronic chips: memory and logic. Memory chips are used to store information. Logic chips are used to manipulate, or interface with, the information contained in memory.
Programmable Logic Devices (PLDs) are "off the shelf" logic chips that the customer, rather than the chip manufacturer, programs to perform a specific function. With the ability to program their own chips, customers realize two key benefits: product design flexibility and faster time to market. Given today's shorter product life cycles, both of these factors can be critical determinants of a product's ultimate success. Electronic equipment manufacturers rely upon PLDs to make fast design changes, accommodate uncertain production volumes, and accelerate the introduction of their products to the marketplace.
To access Xilinx's ISE, follow these four steps:
1) Log in to the cluster via an X terminal of your choice
2) SSH into a compute node
3) Type: 'source /usr/local/Xilinx91i/settings.sh' into your shell
4) Type: 'ise' to launch the Xilinx graphical interface.
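Steps 3 and 4 above can be wrapped in a small helper. This is only a sketch: the function name start_ise and the XILINX_SETTINGS variable are made up for illustration, and the default path is the one quoted in step 3.

```shell
# Sketch: load the Xilinx environment, then start ISE (run on a compute node).
# XILINX_SETTINGS is a hypothetical override; the default is the path from step 3.
XILINX_SETTINGS="${XILINX_SETTINGS:-/usr/local/Xilinx91i/settings.sh}"

start_ise() {
    settings="$1"
    if [ ! -r "$settings" ]; then
        echo "Cannot read $settings; are you on a compute node?" >&2
        return 1
    fi
    # Source the vendor settings so that 'ise' becomes available.
    . "$settings"
    if ! command -v ise >/dev/null 2>&1; then
        echo "'ise' not found after sourcing $settings" >&2
        return 1
    fi
    ise
}

# Usage: start_ise "$XILINX_SETTINGS"
```

The helper stops with an error message instead of launching anything if the settings file is missing, which is the usual symptom of running it on the wrong machine.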
If you have any questions about Xilinx, please let me know.
Leo
There are two file systems without quotas that all Opportunity users and the cluster's compute nodes have access to: /scratch_local and /scratch_global
*** Please note these areas are NOT backed up *****
The system has been configured to have both local and global scratch disk locations for the temporary storage of non-critical data. The global scratch space, /scratch_global, is an NFS-mounted file share exported from the front end to the rest of the cluster. The local scratch space, /scratch_local, is separate disk space residing on each individual compute node (hence the term local).
The policy is that files on these file systems are kept for 30 days.
If you have a need for temporary use of storage, please feel free to use either of these two spaces.
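As a rough sketch of working within the 30-day policy, the helpers below create a working directory under a scratch area and list files older than a given number of days. The per-user subdirectory layout, the function names, and the use of modification time to measure file age are all assumptions; the policy above does not say how age is measured.

```shell
# Sketch: per-user scratch directory plus a check for files past the limit.
# SCRATCH_ROOT defaults to the global scratch path named above.
SCRATCH_ROOT="${SCRATCH_ROOT:-/scratch_global}"

# Create (if needed) and print a per-user working directory under a scratch root.
make_scratch_dir() {
    dir="$1/$(id -un)"
    mkdir -p "$dir" && echo "$dir"
}

# Print regular files under a directory last modified more than N days ago.
list_stale_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Usage:
#   work=$(make_scratch_dir "$SCRATCH_ROOT")
#   list_stale_files "$work" 30
```

Since neither area is backed up, anything the second helper lists should be copied elsewhere if it is still needed.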
Leo
The Gaussian 03 software package has been installed on the Opportunity cluster and is available for use. If you would like to use this software product, please contact l.hill@neu.edu for further information.
Opportunity Cluster
Policy Exception Request
Title/description of the research project:
My work is in the area of large search space enumeration. Typically, the search spaces we deal with do not fit in the distributed memory of a cluster, and our algorithms take advantage of disk both to speed up the discovery process and to make the discovery possible. We have a disk-based technique for exploring one of these search spaces (one with applications in the area of computational group theory) and are looking to run that computation on Opportunity. The search space has 11 billion nodes, each of which requires 50 bytes of storage (compressed to 12 bytes), along with a small amount of additional storage for the path to reach that node.
Is this project in support of an externally funded grant or award: Yes
National Science Foundation grant number ACIR-0342555, "Collaborative Research: Tuning Libraries to Effectively Exploit the Memory Hierarchy", co-principal investigator (joint with David Kaeli (PI, Northeastern U.) and Misha Kilmer (PI, Tufts U.)), 2004-2007
Estimated system requirements:
# Nodes/CPU: 50 (only one of the dual processors)
Amount of Disk (local or global): 9GB per machine
Memory Usage: 100 Megabytes of RAM per node
Network : At least Gigabit Ethernet
Running time: 4 days
Start Date: ASAP (Thurs. Evening – Mon. Morning acceptable)
Expected Research Findings:
We aim to discover a property of the group we are working on (Brauer Tree structure). To do this, we will be running a separate computation over the results obtained here. This, however, can be done externally after the data is transferred from the cluster (approximately 500 gigabytes).
pMatlab has been installed on the Opportunity Cluster. To use pMatlab, copy the
following startup.m file into the directory you will be launching your Matlab jobs from.
cp /scratch_global/pMatlab/startup.m ~/.
To verify that you have access to the pMatlab tools, enter "help pMatlab" at the Matlab
command line.
Leo
Please join me in extending a THANK YOU to the Provost Office
for sponsoring the renewal of the Matlab license on the Opportunity
Cluster. The new license has been installed and is valid through
June 2007.
Leo
NAMD is a parallel molecular dynamics application. A sample launch command using charmrun:
./charmrun ++remote-shell ssh ++nodelist nodelist +p4 ./namd2 ~/apoa1/apoa1.namd
If you have problems, or want to see what's going on in the launch process, add ++verbose to the charmrun command line.
Below is an example of an LSF script for a NAMD job. To run it, do the following:
* Place the contents below into a file "batchjob" in your NAMD directory
* The job is currently set up for 10 processors (note the 10's throughout the script)
################################################################
# hello_cpu10.lsf
# LSF demo NAMD job script.
#
# Use:
# bsub < batchjob
#
#
################################################################
# Define the working directory (RUNDIR), the program to run (PROG),
# the number of CPUs (NPROC and -n), and the output (-o) and error (-e) files:
RUNDIR=$HOME/NAMD_2.6b1_Linux-amd64-TCP
PROG=hello
NPROC=10
#BSUB -n 10
#BSUB -o hello_n10.out
#BSUB -e hello_n10.err
################################################################
# Nothing to edit below this line.
rm -f $RUNDIR/nodelist
echo 'group main' >> $RUNDIR/nodelist
for host in $LSB_HOSTS
do
echo host $host >> $RUNDIR/nodelist
done
./charmrun ++remote-shell ssh ++nodelist $RUNDIR/nodelist +p10 ./namd2 ~/apoa1/apoa1.namd
################################################################
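The nodelist step in the script above can be tried on its own, outside LSF. The sketch below pulls it into a function; the name write_nodelist and the host names are made up for illustration, and under LSF the host list would come from $LSB_HOSTS as in the script.

```shell
# Sketch: build a charmrun nodelist from a list of host names.
write_nodelist() {
    outfile="$1"
    shift
    echo 'group main' > "$outfile"      # overwrite any stale file
    for host in "$@"; do
        echo "host $host" >> "$outfile"
    done
}

# Dry run with hypothetical hosts when LSB_HOSTS is not set by LSF.
LSB_HOSTS="${LSB_HOSTS:-node01 node01 node02 node02}"
nodelist=$(mktemp)
write_nodelist "$nodelist" $LSB_HOSTS   # unquoted on purpose: split on spaces
cat "$nodelist"
rm -f "$nodelist"
```

Writing the 'group main' line with > instead of >> replaces the separate rm -f step, so a leftover nodelist from an earlier run cannot leak into the new one.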
What does "p4_error: semget failed for setnum: 0" mean?
p4_error: semget failed for setnum: 0
This means that the maximum number of allowed semaphores on the master node has been created, and the program you are trying to run cannot allocate a new semaphore for inter-process communication. This can happen when somebody has been testing software that does not exit properly, leaving semaphores and shared memory segments allocated.
If the leftover semaphores are owned by you, it can be fixed by running the following two commands:
/opt/mpich/gnu/sbin/cleanipcs
cluster-fork /opt/mpich/gnu/sbin/cleanipcs
(In this case, using the intel or gnu version doesn't matter. The scripts are identical.)
It is possible that other users may have filled up the semaphore table. In this case, either they or root will need to clean the tables.
To find out who else may be using semaphores, you can execute the commands
ipcs (on opportunity)
cluster-fork ipcs (on opportunity)
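To see at a glance who holds the semaphores, the ipcs output can be summarized per owner. This is a minimal sketch that assumes the usual Linux column layout for 'ipcs -s' (key, semid, owner, perms, nsems); the function name and the sample data are made up.

```shell
# Sketch: count semaphore arrays per owner from 'ipcs -s' style output on stdin.
# Only lines whose first field looks like a key (0x...) are counted, so headers
# and any host name lines printed by cluster-fork are skipped.
count_sems_by_owner() {
    awk '$1 ~ /^0x/ { n[$3]++ } END { for (u in n) print u, n[u] }' | sort
}

# Usage on the front end:  ipcs -s | count_sems_by_owner
# Illustrative sample (not real cluster output):
count_sems_by_owner <<'EOF'
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 32768      alice      600        1
0x00000000 65537      alice      600        1
0x00000000 98306      bob        600        1
EOF
```

For the sample above this prints "alice 2" and "bob 1". The same filter can be fed from 'cluster-fork ipcs -s' to summarize all compute nodes together.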
It has been brought to my attention that users frequently need to submit jobs that consume more than 2 GB of memory (2 GB is the total memory on a compute node divided by its number of job slots/processors).
When this happens, a job requiring 3.5 GB or more can still be released to run. That job and others can then become starved of memory when two jobs are running on the same node.
To help relieve this issue, we have set a couple of conditions that must be met before a new job can be released into a node's free job slot. The new conditions are as follows:
CPU Load (1 and 15 minute) <= .6 / .8
Memory available >= 2 GB (not including swap)
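The check amounts to the small test sketched below. It is not the actual LSF configuration, only an illustration, and it assumes the thresholds read as 1-minute load <= 0.6 and 15-minute load <= 0.8. The function name and the example values are made up.

```shell
# Sketch of the release conditions above; not the actual LSF configuration.
# Arguments: 1-minute load, 15-minute load, free memory in GB (swap excluded).
slot_can_accept_job() {
    awk -v l1="$1" -v l15="$2" -v mem="$3" \
        'BEGIN { exit !(l1 <= 0.6 && l15 <= 0.8 && mem >= 2) }'
}

# Example values are made up.
if slot_can_accept_job 0.4 0.7 2.5; then
    echo "release job to this node"
else
    echo "hold job"
fi
```

With these rules, a node already running a job that uses most of its memory fails the memory test, so the second slot stays empty until memory is freed.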
If you have any questions, please send your questions to l.hill@neu.edu