Frequently Asked Questions [1]
- Who do I contact if I have problems?
- Send email to support at millennium.berkeley.edu. We're happy to respond to your comments, complaints, feature requests, etc.
- Who can use the cluster?
- X Cluster access is currently limited to active members of the OceanStore and ROC projects. Exceptions will be made with the consent of the projects' PIs.
- Where do I log in?
- Reservable hosts are x1.millennium.berkeley.edu through x39.millennium.berkeley.edu. x40 runs FreeBSD and is a reserved Modelnet router. x42 is not reservable and is for testing and debugging your software first.
- Older nodes ibm1.cs.berkeley.edu through ibm8.cs.berkeley.edu are also available.
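If you want to script against the reservable range, the host names above are regular enough to generate. This is only a sketch: the function name and defaults are made up for illustration, and it does not check whether a node is actually reserved for you.

```python
# Hypothetical helper; the x1..x39 range and domain come from the answer above.
DOMAIN = "millennium.berkeley.edu"


def reservable_hosts(first=1, last=39, domain=DOMAIN):
    """Return the fully qualified names x<first>..x<last>."""
    return [f"x{n}.{domain}" for n in range(first, last + 1)]


if __name__ == "__main__":
    hosts = reservable_hosts()
    print(len(hosts), hosts[0], hosts[-1])
```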
- How do I find a free node?
- Please use the Cluster Reservation System. Email support for the login/passwd. Please do not reserve nodes more than 36 hours in advance.
- Use the Ganglia graphs to check system load.
- I need the machine/network specs for this research paper.
- Wait, the specs say these are SMP servers. How come top only lists 1 cpu?
- You are used to the nonstandard behavior of RedHat top.
- On a 2 cpu server, 1 cpu running at full capacity will indicate a load of 1.0 and 50% cpu utilization.
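The arithmetic behind that last answer, as a rough sketch. It assumes the load consists of CPU-bound processes only (load average also counts processes waiting to run, so this is an approximation), and the function name is made up for illustration.

```python
import os


def cpu_utilization(load, ncpus):
    """Approximate fraction of total CPU capacity in use, capped at 1.0."""
    if ncpus <= 0:
        raise ValueError("ncpus must be positive")
    return min(load / ncpus, 1.0)


if __name__ == "__main__":
    # The case from the FAQ: load 1.0 on a 2 cpu server.
    print(cpu_utilization(1.0, 2))
    # The same estimate for the machine this runs on, where supported.
    try:
        one_minute_load = os.getloadavg()[0]
        print(cpu_utilization(one_minute_load, os.cpu_count() or 1))
    except (AttributeError, OSError):
        print("load average not available on this platform")
```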
- What OS is this? It's funny.
- I can't find program foobar, where is it?
- Make sure you have /usr/roc/bin (and optionally /usr/roc/sbin) in your path.
- Various Java distributions are located in /usr/local.
- We don't mount much of IRIS's Linux SWW because it is a) RedHat-centric and b) horribly out of date.
- Additional software can be installed as needed. Just ask.
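A quick way to check the path advice above before asking for help. This is a sketch, not an official tool: the helper names are invented, and it only looks at the PATH value it is handed rather than changing your shell configuration.

```python
import os
import shutil

# Directories named in the FAQ answer above.
ROC_DIRS = ["/usr/roc/bin", "/usr/roc/sbin"]


def missing_dirs(path_value, wanted=ROC_DIRS):
    """Return the wanted directories that are not already on the path."""
    entries = [p for p in path_value.split(os.pathsep) if p]
    return [d for d in wanted if d not in entries]


def extended_path(path_value, wanted=ROC_DIRS):
    """Return path_value with any missing wanted directories appended."""
    entries = [p for p in path_value.split(os.pathsep) if p]
    return os.pathsep.join(entries + missing_dirs(path_value, wanted))


if __name__ == "__main__":
    current = os.environ.get("PATH", "")
    print("missing:", missing_dirs(current))
    # Look for the (placeholder) program foobar on the extended path.
    print("foobar:", shutil.which("foobar", path=extended_path(current)))
```

If the program still is not found with both directories on the path, it probably is not installed, which is the "Just ask" case.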
- Some key libraries are missing, even though the programs are installed. Why?
- The Debian package system likes to separate out all the useful libraries into "-dev" packages. Ask and we shall install.
- Why doesn't rexec work?
- rexec is no longer supported. Please use gexec instead.
- Where do I put my temporary files?
- Each node has a large (10-50GB) file system mounted as /scratch. This file system is striped across both disks for speed. Note: The nodes are considered stateless, and while we will try to preserve data in /scratch, we do not guarantee its safety; /scratch is never backed up.
- We also have a local NFS fileserver for scratch data. It is mounted as /work on [x1-x42]. This storage is RAID 5, but is not backed up. Use policy is similar to /work on the Millennium cluster, except stale data is not automatically deleted.
- Where can I keep my important data safe?
- Any data you're concerned about should be stored on the IRIS-provided fileservers: either your home directory or your group's project space.
- Why is access to my home directory slow/broken?
- The X Cluster is connected to the Millennium Network, not the EECS network. Even though it's fully gigabit connected, traffic from the cluster to the CS file servers travels all the way to Evans Hall and back. Congestion on the campus core routers or at the EECS firewall can cause problems.
- If your home directory is in some other research group's project space, it may not be exported to the cluster, even if you appear in the roc-l yp netgroup. Mail support, and we'll request permission to have that project space exported.
- Gigabit feed, you say? It certainly doesn't seem to go that fast.
- Each machine can only push about 400 megabits per second, though.
- You can check the utilization graphs for the two switches: ocean-gw1 and ocean-gw2. It's nowhere near capacity.
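To set expectations, here is a back-of-the-envelope estimate of transfer time at the roughly 400 megabit per second a single machine can push. It is a sketch that ignores protocol overhead, disk speed and other traffic, and the function name is made up.

```python
def transfer_seconds(size_bytes, rate_mbit_per_sec=400):
    """Best-case seconds to move size_bytes at the given rate in megabits per second."""
    if rate_mbit_per_sec <= 0:
        raise ValueError("rate must be positive")
    bits = size_bytes * 8
    return bits / (rate_mbit_per_sec * 10**6)


if __name__ == "__main__":
    one_gigabyte = 10**9
    print(transfer_seconds(one_gigabyte))        # at about 400 megabit/s
    print(transfer_seconds(one_gigabyte, 1000))  # at a full gigabit, for comparison
```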
- What is this Myrinet thing you keep talking about?
- We inherited a bunch of 2nd generation (LANai 7.2) Myrinet equipment from the Millennium project. The PCI64A provides 1.28Gb/sec bandwidth with very low (<10 µsec) latency. More information is available from Myricom.
- Myrinet provides an Ethernet emulation layer. Ours uses private IP addresses in the range 192.168.10.[201-241] and has associated private DNS entries [x1-x41].myri.
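The two ranges above line up one-to-one (41 addresses, 41 names), so a plausible mapping is xN to 192.168.10.(200+N). That offset is an assumption, not something stated here; check the private DNS entry before relying on it. The function name is invented for the sketch.

```python
def myrinet_address(n):
    """Return (name, ip) for node xN on the Myrinet Ethernet emulation layer.

    ASSUMPTION: xN maps to 192.168.10.(200 + N). The FAQ gives only the two
    ranges, not the pairing.
    """
    if not 1 <= n <= 41:
        raise ValueError("Myrinet entries exist only for x1 through x41")
    return f"x{n}.myri", f"192.168.10.{200 + n}"


if __name__ == "__main__":
    for node in (1, 41):
        print(myrinet_address(node))
```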
- Some of the hosts seem to be missing from the Myrinet; what gives?
- The following hosts have PCI Advanced System Management cards instead of Myrinet: x10, x21, x31, x42.
- User JoeBob is hogging all the CPU time and I can't run my foomulator. Make him stop!
- I found some other bug or need to contact the other users of the cluster...
- There is a mailing list for the cluster users: xcluster-users at millennium.berkeley.edu.
[1] You're right, nobody ever asked these questions. I just don't want to answer them more than once.
Contact: support at millennium.berkeley.edu. Last modified on 15-Mar-2006 01:22:45 -0800