# Arctic nanopolish

This is a modified bash script that uploads fast5/fastq results into iRODS.
Most sections are commented out because the script has not been tested on the GRID genome sequencing machines.
This is an example based on the customer's existing script and an .odt document outlining the workflow.

## Install the iRODS client on the GRID server (the script uses iCommands)

> https://packages.irods.org/ (set up the package repositories)
> https://github.com/irods/irods_client_icommands (do not follow this; use it only as a reference for the client host)
> `sudo apt-get install irods-icommands irods-dev irods-runtime`

- The iRODS client will require a service account for this host. The host cannot join the domain via SSSD because it is sold as an appliance and updated by the vendor under a service contract.
- The iRODS config will require a resource for the data; a resource is loosely a network disk.
- The iRODS config will require a top-level collection. This is akin to a directory, and permissions can be granted on it recursively to whoever requires access to the data.
- Data objects (files) may be uploaded to the collection and then tagged with metadata, or they can be tagged with metadata on upload using the `iput` command.

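The two tagging routes in the last bullet can be sketched as follows. This is a minimal sketch, not the customer's script: the function names, the file and collection paths, and the attribute names are hypothetical. It assumes `iput` accepts `--metadata=` with `attribute;value;unit` triplets, as listed in `iput -h`, so check the help output of the installed version before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: function names, paths and attributes are hypothetical.

# Route 1: tag on upload.
# Assumption: iput takes --metadata= with attribute;value;unit triplets.
put_with_metadata() {
    local file="$1" collection="$2" avu="$3"
    iput --metadata="$avu" "$file" "$collection"
}

# Route 2: upload first, then tag the data object with imeta.
put_then_tag() {
    local file="$1" collection="$2" attr="$3" value="$4"
    iput "$file" "$collection" || return 1
    # -d marks the target as a data object (a file) rather than a collection
    imeta add -d "$collection/$(basename "$file")" "$attr" "$value"
}
```

For example, `put_then_tag /data/reads.fastq /PHE/projectXYZ/runnameXYZ sample_id S123` would upload the file and then attach `sample_id=S123` to it.
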
A sample client config file follows.
In this case the `irods_user_name` is an LDAP user (Windows Active Directory) authenticated system-wide via the PAM auth stack.
A service account and password pair local to iRODS will likely be used in this file instead, with `irods_authentication_scheme` set to `native`.

```
[toby.seed@phe.gov.uk@smedmaster02 ~]$ cat << EOF > ~/.irods/irods_environment.json
{
    "irods_host": "irodscol01.unix.phe.gov.uk",
    "irods_port": 1247,
    "irods_user_name": "toby.seed@phe.gov.uk",
    "irods_zone_name": "PHE",
    "irods_default_resource": "s3_compound",
    "irods_authentication_scheme": "PAM"
}
EOF
```

With a working config, authenticate the client against the iRODS server with the `iinit` command and check the active settings with the `ienv` command.
Depending on the iRODS server configuration, the token may last for up to two weeks. It may be necessary to have the `~/.bashrc` login script run `iinit` on login or, to keep each job self-contained, to run `iinit` at the top of each workload script.

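Running `iinit` at the top of a workload script could look like the sketch below. The function name and the password file location are hypothetical, and the sketch assumes `iinit` reads the password from standard input when it is not attached to a terminal; verify that against the installed client before use.

```shell
#!/usr/bin/env bash
# Sketch only: the password file location is an assumption, not a convention.
IRODS_PASSWORD_FILE="${IRODS_PASSWORD_FILE:-$HOME/.irods/service_password}"

irods_login() {
    if [ ! -r "$IRODS_PASSWORD_FILE" ]; then
        echo "cannot read password file: $IRODS_PASSWORD_FILE" >&2
        return 1
    fi
    # Assumption: iinit takes the password from stdin when stdin is not a
    # terminal, which avoids the interactive prompt in unattended jobs.
    if ! iinit < "$IRODS_PASSWORD_FILE"; then
        echo "iinit failed: check irods_environment.json and the password" >&2
        return 1
    fi
    # Print the active client settings so they end up in the job log.
    ienv
}
```

A workload script would then call `irods_login || exit 1` before its first upload command. The password file should be readable only by the service account.
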
## Rough requirements to test in a live environment

- a resource
- a service account for this host
- network connectivity to the target iRODS server on TCP port 1247
- a top-level collection with some recursive permissions for users requiring access to the fast5/fastq data

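The network requirement can be checked from the GRID host before anything else is configured. The sketch below is one possible check, not part of the original script: the function name is hypothetical, and it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command being available on the host.

```shell
#!/usr/bin/env bash
# Sketch only: needs bash built with /dev/tcp support and coreutils timeout.

check_irods_port() {
    local host="$1" port="${2:-1247}" wait_s="${3:-5}"
    # Open a TCP connection in a child bash; it exits non-zero on failure
    # and timeout kills it if the connection attempt hangs.
    if timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "ok: $host:$port is reachable"
    else
        echo "FAIL: cannot reach $host:$port" >&2
        return 1
    fi
}
```

With the host from the sample config this would be run as `check_irods_port irodscol01.unix.phe.gov.uk 1247`. A successful connection only shows that the port is open, not that the service account can authenticate.
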
## Using iCommands to upload files to iRODS

> resource ~ network disk
> collection ~ directory
> object ~ file

- generally you would create a collection on your resource to hold your files
- it is likely you would create a collection (runnameXYZ) within a collection that already has recursive permissions for a group of users, for example /PHE/projectXYZ/runnameXYZ
- the `iput` command pushes objects and can also push collections recursively, but `irsync` is much preferred for that task to ensure data integrity
- the `imeta` command lists, adds and removes metadata for collections and objects that have been uploaded
- the `ils` command lists file attributes and permissions
- the `irm` command removes files from iRODS storage
- https://docs.irods.org/master/icommands/user/

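Put together, the commands above might be combined as in the following sketch. It is an illustration under stated assumptions, not the tested upload script: the function name, the local path and the `run_name` attribute are hypothetical, and the collection path reuses the /PHE/projectXYZ/runnameXYZ example from the list.

```shell
#!/usr/bin/env bash
# Sketch only: function name, paths and attribute names are hypothetical.

upload_run() {
    local local_dir="$1" collection="$2"
    local run_name
    run_name="$(basename "$collection")"

    # Create the run collection inside the project collection.
    imkdir -p "$collection" || return 1
    # Copy the run directory recursively; -K calculates and verifies checksums.
    # The i: prefix marks the iRODS side of the transfer.
    irsync -r -K "$local_dir" "i:$collection" || return 1
    # Tag the collection (-C) so the run can be found by its metadata.
    imeta add -C "$collection" run_name "$run_name" || return 1
    # List the uploaded objects together with their access control lists.
    ils -A -r "$collection"
}
```

A call such as `upload_run /data/runnameXYZ /PHE/projectXYZ/runnameXYZ` would create the collection, sync the directory into it, tag it and list the result. Because `irsync` only transfers files that are missing or differ, the same call can be repeated after an interrupted upload.
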