iRODS

For projects that require a data grid that can do…

  • Data analysis of large data sets
  • Data analysis of a large number of data sets
  • Workflow automation
  • Metadata analysis for data discovery
  • Secure collaboration
The Integrated Rule-Oriented Data System (iRODS) is open-source data management software used by research organizations and government agencies worldwide.
iRODS provides a virtual filesystem and unified namespace atop disparate physical data storage systems, a metadata catalog for annotation and data discovery, a rule engine framework to enforce and audit organizational policy, and secure collaboration both within and between organizations.
The iRODS system is unique among data grid systems because it integrates the following features: workflow automation, secure collaboration, metadata creation and queries, and data virtualization. If your project requires several of these features, iRODS is an appropriate tool.
Some more details about the iRODS implementation:

Federation with NCSU, UNCCH, UNCC: Distributed and shared storage means that data can be placed close to the computing resources that will process or analyze it, while still allowing inter-institutional collaboration. This matters for the ROI projects, since they involve investigators from all three institutions.

Workflow with iRODS micro-rules for automating data processing.

Advanced permission administration capabilities:  Users can easily define which directories, collections or data sets are shared with which users.

iCloud browser: Allows users to interface with the data grid through a web server running on the iCAT (iRODS) server. Users can upload and download data and examine metadata from anywhere with a web browser.

Command line or GUI: The iCommands (iRODS) client, run from a VCL Linux image, allows users to interact with the data grid using GUIs or the command line.

iRODS data is backed up. See the FAQ section on “Backup and Restoration”.
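
As an illustration of the permission administration described above, the sketch below shows how sharing might look from the iCommands client. It assumes the standard `ichmod` and `ils -A` commands are available in your session. The function names are only for illustration and are not part of iRODS.

```shell
# Sketch only: thin wrappers around the standard iCommands ichmod and ils.
# The function names are illustrative and not part of iRODS.

# Grant a user read access to a collection and everything below it.
share_collection() {
    ichmod -r read "$1" "$2"
}

# Remove that user's access again.
unshare_collection() {
    ichmod -r null "$1" "$2"
}

# Show the access control list of a collection or data object.
show_access() {
    ils -A "$1"
}
```

For example, `share_collection alice /exampleZone/home/bob/shared` followed by `show_access /exampleZone/home/bob/shared` would grant and then display read access. The user name and collection path here are hypothetical placeholders.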

Accessing iRODS Data

Log in with your Shibboleth credentials at irods-cloudbrowser.hpc.ncsu.edu. This web server gives you limited control of your data: upload, download, and manual metadata creation. For more detailed control, such as running iRules from the command line, log in to the iRODS client as follows:

Connecting to the iRODS Client

  1. Go to vcl.ncsu.edu and click on Make a Reservation
  2. Use your Shibboleth credentials to log in
  3. Click on the Reservations tab
  4. Click on the New Reservations button
    1. From the environments dropdown, select Centos 6 with iRods Client
    2. Choose a time for which you want to schedule the reservation
  5. Wait for the reservation to be created. Once it is ready, you will see a button to connect. Click this and you will see a popup. At the bottom of the popup, you will see a button for the RDP file. Click this and open the file within Windows.
  6. Log in with your Shibboleth credentials.
  7. You should now see the CentOS GUI with icons on the left, one of which is the “Initialize iRods” icon. Double-click on that to initialize iRODS.

You are now connected to the data grid. If you open a terminal window (Applications > System Tools > Terminal), the “ils” command will list all of your files on the data grid.
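
Beyond “ils”, a typical first session might upload a file, annotate it, and download it again. The sketch below strings together standard iCommands (`ipwd`, `iput`, `imeta`, `ils`, `iget`) for that purpose. The function name `grid_roundtrip` and the metadata attribute `project` with value `demo` are assumptions made for illustration only.

```shell
# Sketch only: a round trip through the data grid with standard iCommands.
# grid_roundtrip is an illustrative name; "project"/"demo" is made-up metadata.
grid_roundtrip() {
    file="$1"
    name=$(basename "$file")

    ipwd || return 1                               # show the current collection
    iput "$file" || return 1                       # upload the local file into it
    imeta add -d "$name" project demo || return 1  # attach attribute project=demo
    imeta ls -d "$name" || return 1                # list the metadata just added
    ils -l || return 1                             # long listing of the collection
    iget "$name" "$file.copy"                      # download a copy beside the original
}
```

Calling `grid_roundtrip results.csv` (a hypothetical local file) in the terminal would run each step in order and stop at the first one that fails, for instance if a data object with that name already exists in the collection.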

Connecting to the iRODS Client on the HPC-VCL

…for projects that need access to the HPC storage environment. The HPC-VCL reservation is just like an HPC login node, but it also has the iRODS client (iCommands) running on it. You must have an HPC account to log in to the HPC-VCL reservation, and you must have an iRODS account to use the iRODS client there.

  1. Go to vcl.ncsu.edu and click on Make a Reservation
  2. Use your Shibboleth credentials to log in
  3. Click on the Reservations tab
  4. Click on the New Reservations button
    1. From the environments dropdown, select HPC (CentOS 7.1 64 bit VM)
    2. Choose a time for which you want to schedule the reservation
  5. Wait for the reservation to be created. Once it is ready, you will see a button to connect. Click this and you will see a popup. Note the IP address and use it to SSH into the HPC-VCL reservation (with PuTTY, for instance).
  6. Log in with your Shibboleth credentials. You are now logged in at a Linux command line. You must have an HPC account to do this step.
  7. To get the iRODS client (iCommands) to work, you also need an iRODS account. Assuming you have one, the next step is to initialize iRODS. First, run the following script on the command line: “/usr/bin/irods-auth-setup”. This only has to be done once; it does not need to be repeated for future reservations. Then type iinit, and you will be asked for your PAM password. This needs to be done for every new reservation.
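
The once-only setup and the per-reservation login in step 7 can be combined into one small helper. The following is a minimal sketch, assuming your home directory persists between reservations. The function name `init_irods`, the `IRODS_SETUP` override variable, and the marker file `~/.irods-auth-setup-done` are hypothetical conveniences, not part of iRODS or of the HPC-VCL image.

```shell
# Sketch only: first-time setup plus per-reservation login on the HPC-VCL.
# init_irods, IRODS_SETUP and the marker file are illustrative assumptions.
init_irods() {
    setup="${IRODS_SETUP:-/usr/bin/irods-auth-setup}"
    marker="$HOME/.irods-auth-setup-done"

    # One-time step: skipped once the marker file exists.
    if [ ! -e "$marker" ]; then
        "$setup" && touch "$marker" || return 1
    fi

    # Every new reservation: authenticate (prompts for the PAM password).
    iinit
}
```

After connecting with something like `ssh your_user_id@<IP address from the popup>` (the user ID is a placeholder), you would define the function in your shell and run `init_irods`. Running the two commands by hand, as described in step 7, works just as well.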

More information about using the data grid and iCommands: https://irods.org/uploads/2016/06/irods_beginner_training_2016.pdf (from page 14 onwards).