Auckland Rocks Cluster Backup and Restore Procedure

From BeSTGRID

Installing Applications

Possible options:

The standard Rocks procedure for installing RPMs ensures that packages persist when nodes are re-imaged, so we use it and add each RPM to the development roll. When a quick install is required, we follow the same procedure but do not re-image all the nodes immediately; instead, we run the install on the nodes with cluster-fork. This keeps the system stable and avoids an extra outage.
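The quick-install step could look roughly like the sketch below. It assumes the RPM has been copied to a path that every node can read; the function name quick_install and the DRY_RUN switch are illustrative and not part of Rocks. By default the function only prints the cluster-fork command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: quick_install and DRY_RUN are hypothetical helpers, not Rocks tools.
# Assumes the RPM path given as $1 is readable from every compute node.

quick_install() {
    rpm_path=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show the command that would be run on all nodes.
        echo "cluster-fork 'rpm -Uvh $rpm_path'"
    else
        # Install (or upgrade) the package on every node without re-imaging.
        cluster-fork "rpm -Uvh $rpm_path"
    fi
}
```

For example, quick_install /share/apps/rpms/example.rpm prints the command, and DRY_RUN=0 quick_install /share/apps/rpms/example.rpm runs it (the path here is a made-up placeholder). The RPM still has to be added to the development roll so that it survives the next re-image.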

Install Process

Follow these steps if something goes wrong or the whole cluster needs to be reinstalled.

  • Go through the normal Rocks install process: http://www.rocksclusters.org/roll-documentation/base/5.0/
    • As the central server, select the reference cluster (currently bestgrid83.math.auckland.ac.nz)
    • Select the standard rolls, the restore roll and the development roll
    • Download the install scripts from //data.bestgrid.org/eResearch/scripts/install.tar.gz to /home/install on the cluster and run ./install.sh
  • Because of problems with the local network setup during installation, a CD with the restore roll currently needs to be ready
    • If the restore roll is installed afterwards, the X server has to be running
  • To fix MPI warnings, run cluster-fork 'echo "btl=^openib,udapl" >> /etc/openmpi-mca-params.conf'
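Because the MPI fix appends with >>, running it twice leaves a duplicate line in the file. A minimal sketch of a repeatable variant is shown here; ensure_line is a hypothetical helper name, and the approach (check with grep before appending) is a suggestion rather than the procedure used on the cluster.

```shell
#!/bin/sh
# Sketch only: ensure_line is a hypothetical helper, not part of Rocks or Open MPI.

# Append LINE to FILE unless FILE already contains exactly that line.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    # (so the ^ in "btl=^openib,udapl" is taken literally).
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

On a single node this would be called as ensure_line /etc/openmpi-mca-params.conf 'btl=^openib,udapl'. To use it across the cluster, the same grep-then-append logic would have to be passed to cluster-fork as one quoted command.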

Possible Issues

  • The /home/ directory is missing
    • Check /etc/auto.home
    • Run service autofs restart
  • The restore roll does not recover user and system files properly
    • The files can still be recovered from /export/profile/nodes/restore-user-files.xml. They are stored base64-encoded; copy each file's content out of that XML file and decode it with the base64 -d -i command.

Backups

We need to organize backups for