Using the DataFabric
The BeSTGRID DataFabric can be accessed in several different ways, each suited to a different type of user or a different scenario. This page is a guide to the most common ways of accessing the DataFabric. While this guide is still being developed, users may find more relevant information in the ARCS DataFabric user guide.
The three primary means of accessing the DataFabric are:
- Browsing the DataFabric via a web browser
- Suitable for casual users and for browsing existing collections
- Mounting the DataFabric as a filesystem via webDAV
- Suitable for more involved users, for uploading larger collections of files, and for accessing the files on the DataFabric directly from applications.
- Accessing the DataFabric directly via the iRODS protocol with iCommands
- For the most involved users, who need the highest transfer performance available - or who need direct access to the advanced iRODS features (access control, metadata, ...)
For each of these scenarios, the exact use may still differ depending on the authentication mechanism used. The sections below describe how to start using the DataFabric for each of these scenarios.
Web Browser and Shibboleth Login
For users who do have a login in the AAF Shibboleth Federation, a Shibboleth login is the easiest way to access the DataFabric with a browser.
This option is directly available to users at the University of Auckland, the University of Canterbury, and Lincoln University - and also to users with an account at the ARCS IdP. Users without a Shibboleth login can request an account at the ARCS IdP, which is open to the Australian and New Zealand academic and research community.
- To access the DataFabric, go to http://df.bestgrid.org/BeSTGRID/home/
- When prompted, select your institution (you may make this setting permanent)
- When prompted, log in with your institutional username and password
- Start accessing the DataFabric
- If your institution is not listed, request an account on the ARCS IdP at https://idp.arcs.org.au/idp_reg/
- This is a registration process that should be swift if your identity can be easily vetted, but may take up to a few days.
Web Browser and Grid Certificate
For users who have a grid certificate, it may be easiest to use the certificate as their identity on the DataFabric. They delegate their credential to the MyProxy server, protecting the copy of the certificate with a username and password, and then log in to the DataFabric with the MyProxy username and password.
To delegate a certificate into MyProxy:
- Start Grix: Grix Java WebStart
- Select Authentication tab
- Select Local X509 certificate
- Enter your certificate passphrase
- Select a Lifetime for the delegated certificate. It is reasonable to choose 30 or 60 days; the certificate will have to be delegated again after this period.
- Click the Authenticate button. This will create a local proxy certificate.
- Click the Upload button and pick a username and password. This will upload the certificate to MyProxy.
To access the DataFabric:
- Open https://df.bestgrid.org/BeSTGRID/home/ and authenticate with your MyProxy username and password
Linking Shibboleth and Grid identities together
The DataFabric will automatically create an account for each user on the first access. The account will be linked either with the Distinguished Name from the certificate or with the SharedToken received in the Shibboleth login.
For users who have both a Grid certificate and a Shibboleth login, it may be useful to link their two identities together - instead of having two separate DataFabric (iRODS) accounts. Please send a request to help@bestgrid.org to link your DataFabric account with a DN or a SharedToken from your other identity.
In the request, please include the following information:
- The full DN included in your certificate. It is displayed on the first page in Grix, and it can also be obtained with
openssl x509 -subject -noout -in $HOME/.globus/usercert.pem
- Your CN and SharedToken as provided by your IdP - ideally, include in the request the information you get at http://df.bestgrid.org/shared-token/
- Your iRODS username
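The openssl command above prints the DN behind a `subject=` label. As a minimal sketch, the label can be stripped so that only the DN is pasted into the request. The function names `strip_subject_label` and `cert_dn` are hypothetical, and the exact layout of the openssl output may differ between OpenSSL versions, so check the result by eye before sending it.

```shell
#!/bin/sh
# Strip the leading "subject=" label (and any spaces after it) from a line
# as printed by `openssl x509 -subject -noout`.
# Hypothetical helper, not part of Grix or the DataFabric tools.
strip_subject_label() {
    sed -e 's/^subject= *//'
}

# Print the DN of a certificate file.
# Defaults to the certificate path used in the command above.
cert_dn() {
    cert="${1:-$HOME/.globus/usercert.pem}"
    openssl x509 -subject -noout -in "$cert" | strip_subject_label
}
```

Calling `cert_dn` without arguments reads `$HOME/.globus/usercert.pem`; a different certificate file can be passed as the first argument.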
Accessing the DataFabric in anonymous mode
For parts of the DataFabric that have been made available to the anonymous user, it is possible to access them without a user account.
- Via Shibboleth: the DataFabric does not request a Shibboleth login when you directly access an anonymously accessible collection. Go directly to the project URL - like http://df.bestgrid.org/BeSTGRID/home/<anonymous-project>
- Via https: the DataFabric does not prompt you for a username and password. Go directly to the project URL - like https://df.bestgrid.org/BeSTGRID/home/<anonymous-project>
Mounting DataFabric as a disk drive
The DataFabric comes with a webDAV interface, available at the same URL as the DataFabric web interface. The webDAV interface can be mounted into most current operating systems and desktop environments (Windows, Mac, Linux, POSIX). In most cases, the desktop environment already comes with built-in support for mounting a webDAV URL, but it may pay off to install additional tools - which can provide a more efficient and more reliable way of accessing the DataFabric.
In all of the scenarios (except for anonymous access, see below), one has to use a MyProxy username and password to mount the DataFabric from the HTTPS URL, https://df.bestgrid.org/BeSTGRID/home/
- Users who have a grid certificate and have been using the DataFabric via the web interface can use the same login credentials for the webDAV interface.
- Users with a Shibboleth login need to get a SLCS certificate based on their Shibboleth login, upload the certificate into MyProxy (choosing a username and password) and then log in with the MyProxy username and password (unfortunately, this has to be repeated at least every 10 days - the lifetime of the SLCS certificates).
- See Creating a MyProxy login below for more details.
The actual instructions to mount the DataFabric vary across operating systems and desktop environments. The most common cases are below; more information can be found in the ARCS webDAV mount documentation.
Anonymous access
When accessing a project that has been made *anonymously* accessible, and the intention is to only access the project with the permissions of the anonymous user (typically read-only), it is *not necessary* to get a MyProxy username and password. In that case, use the URL to the project home directory (example: https://df.bestgrid.org/BeSTGRID/home/GeoFabric) and when prompted for a username and password, either leave it blank, or enter "irods\anonymous" as the username and anything as the password. Otherwise proceed as documented below.
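Before mounting, it can help to check from the command line that a collection answers webDAV requests. The sketch below sends a PROPFIND request with curl, using the anonymous username described above by default. The function name `df_list` and the variables `DF_USER` and `DF_PASS` are hypothetical, and it is an assumption that the server accepts this kind of request from curl with a plain username and password.

```shell
#!/bin/sh
# List a DataFabric collection over webDAV with a PROPFIND request.
# DF_USER and DF_PASS are hypothetical variable names; when they are unset,
# the anonymous login described above is used.
df_list() {
    url="$1"
    anon='irods\anonymous'
    user="${DF_USER:-$anon}"
    pass="${DF_PASS:-anything}"
    # "Depth: 1" asks for the collection itself and its direct children.
    curl --silent --show-error \
         --user "$user:$pass" \
         --request PROPFIND \
         --header 'Depth: 1' \
         "$url"
}
```

For example, `df_list https://df.bestgrid.org/BeSTGRID/home/GeoFabric` would request the listing of the example project as the anonymous user; setting `DF_USER` and `DF_PASS` to a MyProxy username and password would try the same request with those credentials.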
Windows XP with default webDAV client
- From My Network Places, choose Add a Network Place
- Select "Choose another network location", click Next.
- Enter the following URL as the network address: https://df.bestgrid.org/BeSTGRID/home/
- When prompted, enter your MyProxy username and password
- Choose a name for the connection - e.g., BeSTGRID DataFabric
- Click Finish
- Please note that while this is the easiest solution to get going, the Windows XP *built-in webDAV client has severe limitations*, in particular with files larger than 2GB (it cannot read a directory if it contains a file larger than 2GB). We strongly recommend using one of the alternative solutions, in particular BitKinex.
Windows - alternative solutions
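Because a single file over 2GB can make a whole directory unreadable for this client, it may be worth checking a collection for such files before relying on it. The sketch below assumes a POSIX shell is available where the files are staged (for example on a Linux or Mac machine, or under Cygwin) and takes 2GB to mean 2147483648 bytes, which is an assumption. The function name `list_large_files` is hypothetical.

```shell
#!/bin/sh
# Print every regular file under a directory that is larger than a limit.
# $1: directory to scan
# $2: limit in bytes (defaults to 2147483648, i.e. 2 GiB; an assumption
#     about what "2GB" means for the Windows XP client)
list_large_files() {
    dir="$1"
    limit="${2:-2147483648}"
    # The "c" suffix makes find compare sizes in bytes.
    find "$dir" -type f -size +"${limit}c" -print
}
```

Running `list_large_files /path/to/collection` prints nothing when no file exceeds the limit.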
On Windows Vista and Windows 7, it is necessary to use external tools for mounting a webDAV URL - and Windows XP users may also get additional performance.
The tools available are NetDrive, WebDrive, and BitKinex. Read a brief comparison and review of these clients here; note that only BitKinex has been shown to reliably transfer files greater than 2GB in size.
BitKinex is the recommended alternative client. A complete procedure for getting started with the DataFabric using the BitKinex client is described here.
NOTE: Some of the alternative clients do not work with files or directories with special characters. Use only letters, numbers, spaces, dashes, underscores, and periods in file and directory names.
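The naming rule in the note above can be checked mechanically before uploading. This is a minimal sketch, assuming a POSIX shell is available where the files are prepared and that "letters" means the unaccented letters A-Z and a-z. The function names `is_safe_name` and `list_unsafe_names` are hypothetical.

```shell
#!/bin/sh
# Succeed only if the name is non-empty and consists solely of letters,
# digits, spaces, dashes, underscores, and periods.
# Assumption: "letters" is taken to mean ASCII A-Z and a-z.
is_safe_name() {
    [ -n "$1" ] && ! printf '%s' "$1" | LC_ALL=C grep -q '[^A-Za-z0-9 ._-]'
}

# Print each path under a directory whose last component breaks the rule.
list_unsafe_names() {
    find "$1" | while IFS= read -r path; do
        is_safe_name "$(basename "$path")" || printf '%s\n' "$path"
    done
}
```

`list_unsafe_names /path/to/collection` prints the offending paths, one per line, so that they can be renamed before the transfer.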
NOTE: A Microsoft patch is available that makes the native WebDAV support work: [1]
Mac - using Finder
Finder is a WebDAV client that is bundled with the operating system. To connect:
- In the Finder menu, find "Go", then select "Connect to Server" (or press Cmd-K).
- In Server Address, type in:
https://df.bestgrid.org/BeSTGRID/home/
- Click on "+" to save this URL as a connection favorite.
- Click Connect and enter the MyProxy username and password when prompted.
- Note
- By default, Mac Finder creates a .DS_Store file in each directory it visits. To avoid polluting the DataFabric, it is important that you disable .DS_Store creation.
The following will disable this function for all network connections: SMB/CIFS, AFP, NFS, and WebDAV.
- Open Terminal and run the following command:
defaults write com.apple.desktopservices DSDontWriteNetworkStores true
- Note
- The Inspector application, which displays file information when requested from the Finder, has a bug that prevents it from displaying the correct on-disk size for WebDAV directories. Even though the total directory size is reported correctly, Size on Disk returns very high values unrelated to the actual folder size. This bug is known to the Apple developers. If more exact information is required, the du command can be used from the command line:
- Open Terminal
- Type the command:
du -k /Volumes/home/
The result would be the list of directories in the BeSTGRID home directory with their sizes and the total amount taken by the directories.
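Returning to the .DS_Store note above: if the DataFabric was browsed with Finder before the setting was changed, some .DS_Store files may already exist. The sketch below only lists them for review and deletes nothing. The function name `list_ds_store` is hypothetical, and `/Volumes/home/` is assumed to be the mount point, as in the du example.

```shell
#!/bin/sh
# Print every .DS_Store file found under the given directory.
# Nothing is deleted; the listing is only meant for review.
list_ds_store() {
    find "$1" -type f -name .DS_Store -print
}
```

For example, `list_ds_store /Volumes/home/` walks the mounted volume. To confirm that the setting itself is in place, `defaults read com.apple.desktopservices DSDontWriteNetworkStores` prints the stored value.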
Linux and POSIX systems with KDE - Konqueror
In KDE, you can use the webDAV client built into Konqueror:
- Open up a Konqueror window, and type in:
webdavs://df.bestgrid.org/BeSTGRID/home/
- When prompted, enter the MyProxy username and password
Linux and POSIX systems with Gnome - Nautilus
In Gnome, you can use the webDAV client built into Gnome/Nautilus:
- Select Connect to Server (either from the File menu in any Nautilus window, or from the Places menu in the top status bar)
- In the Connect to Server dialog box, fill in the following details:
Service type: Secure WebDAV (HTTPS)
Host: df.bestgrid.org
Port: (leave empty)
Folder: BeSTGRID
- Leave the username blank - enter it later when prompted
- Select to Add a bookmark and choose a bookmark name.
- Click Connect
- When prompted, enter your MyProxy username and password.
Linux and POSIX systems - DavFS
Alternatively, one can install DavFS, a standalone client library for webDAV.
- Install Davfs
yum install davfs2
- Configure davfs NOT to use file locking (not implemented by the DataFabric)
- Add these lines to /etc/davfs2/davfs2.conf
# ARCS Specific Options
# ---------------------
use_locks 0
drop_weak_etags 1
- Mount the DataFabric with (run this as root):
mount -t davfs https://df.bestgrid.org/BeSTGRID/ /mnt
- To mount the filesystem as accessible by your local account, modify the command to:
mount -t davfs -o uid=username,gid=groupname https://df.bestgrid.org/BeSTGRID /mnt
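The configuration and mount steps above can be wrapped in two small helpers. This is a sketch under stated assumptions: the function names `ensure_davfs_option` and `davfs_mount_command` are hypothetical, the configuration file path is passed in explicitly, and the mount command is only printed so that it can be reviewed before being run as root.

```shell
#!/bin/sh
# Append "key value" to a davfs2 configuration file unless a line that
# sets the same key is already present. An existing line with a different
# value is left untouched and has to be edited by hand.
ensure_davfs_option() {
    conf="$1"; key="$2"; value="$3"
    if ! grep -q "^[[:space:]]*$key[[:space:]]" "$conf" 2>/dev/null; then
        printf '%s %s\n' "$key" "$value" >> "$conf"
    fi
}

# Print (not run) the mount command for the current local user and group.
# $1: mount point (defaults to /mnt, as in the commands above)
davfs_mount_command() {
    mountpoint="${1:-/mnt}"
    printf 'mount -t davfs -o uid=%s,gid=%s %s %s\n' \
        "$(id -un)" "$(id -gn)" \
        'https://df.bestgrid.org/BeSTGRID' "$mountpoint"
}
```

As root, `ensure_davfs_option /etc/davfs2/davfs2.conf use_locks 0` and `ensure_davfs_option /etc/davfs2/davfs2.conf drop_weak_etags 1` would add the two settings only once, even if run repeatedly. `davfs_mount_command /mnt` then prints the mount command with your own username and group filled in.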
Creating a MyProxy login
Because webDAV cannot handle a Shibboleth login, users who use a Shibboleth login on the web interface need to get a MyProxy username and password to access the webDAV interface.
To create a MyProxy login from a Shibboleth login, one needs to get a SLCS certificate based on the Shibboleth login, upload the certificate into MyProxy (choosing a username and password) and then log in with the MyProxy username and password.
Unfortunately, this has to be repeated at least every 10 days - the lifetime of the SLCS certificates.
To get a SLCS certificate and delegate it into MyProxy:
- Start Grix: Grix Java WebStart
- Select Authentication tab
- Select Institution login
- Select your institution in the IdP list
- Enter your institutional username and password
- Click the Authenticate button. This will create a local proxy certificate.
- Click the Upload button and pick a username and password. This will upload the certificate to MyProxy.
This process will not work through an HTTP proxy.
Accessing DataFabric from iRODS
This is an option suitable for users who need the highest performance and data throughput available, and is only available on Linux and POSIX systems.
This option may give users access to fine-grained iRODS features not available via the web interface (detailed access control, metadata access, ...).
- Install iCommands (the client part of iRODS) on your system, following the ARCS iRODS Client installation manual
- This is the same as installing iRODS, without compiling the server.
- Install the same version as the one installed on the BeSTGRID DataFabric (iRODS 2.3 at the time of writing)
- You will need to compile iRODS with GSI support, and you will need Globus (and its development libraries / SDK) for that.
- Either install Globus from VDT as recommended in the ARCS manual
- Or install Globus from source code TODO: link ngdata iRODS
- Create an irodsEnv file:
- First create a .irods directory:
mkdir $HOME/.irods
- Then create $HOME/.irods/.irodsEnv with the following contents
irodsHost 'ngdata.canterbury.ac.nz'
irodsPort 1247
irodsDefResource 'griddata.canterbury.ac.nz'
irodsZone 'BeSTGRID'
irodsAuthScheme 'GSI'
irodsServerDn '/C=NZ/O=BeSTGRID/OU=University of Canterbury/CN=ngdata.canterbury.ac.nz'
irodsUserName your.irods.username
- The last line of the file must contain your iRODS username. You can find your iRODS username in the top right corner of the DataFabric web interface.
- You may also find your username with:
iquest "SELECT USER_NAME where USER_DN = '$( grid-proxy-info | grep ^identity | cut -d : -f 2- | cut -d ' ' -f 2- )'"
- Or alternatively with the iupdate script
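The file described above can also be written by a small script, which avoids typing errors in the server DN. This is a minimal sketch: the function name `write_irods_env` is hypothetical, the settings are copied from the listing above, and it assumes that each setting goes on its own line.

```shell
#!/bin/sh
# Write an .irodsEnv file with the settings listed above.
# $1: your iRODS username
# $2: target directory (defaults to $HOME/.irods)
# Note: an existing .irodsEnv in the target directory is overwritten.
write_irods_env() {
    username="$1"
    dir="${2:-$HOME/.irods}"
    mkdir -p "$dir"
    cat > "$dir/.irodsEnv" <<EOF
irodsHost 'ngdata.canterbury.ac.nz'
irodsPort 1247
irodsDefResource 'griddata.canterbury.ac.nz'
irodsZone 'BeSTGRID'
irodsAuthScheme 'GSI'
irodsServerDn '/C=NZ/O=BeSTGRID/OU=University of Canterbury/CN=ngdata.canterbury.ac.nz'
irodsUserName $username
EOF
}
```

For example, `write_irods_env your.irods.username` creates `$HOME/.irods/.irodsEnv`. With a valid grid proxy in place, an iCommand such as `ils` should then be able to list your home collection.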
For more information:
- See the iRODS iCommands documentation
- Walk through the iRODS tutorial
Accessing DataFabric with iRODS FUSE
An alternative way of mounting the DataFabric is to use the iRODS FUSE (Filesystem in Userspace) module. See the iRODS FUSE page for more information.
