Using the native rule engine
https://docs.irods.org/4.2.8/plugins/pluggable_rule_engine/
https://docs.irods.org/4.1.5/manual/rule_language/
https://docs.irods.org/4.2.8/doxygen/ # list of microservices
https://groups.google.com/g/irod-chat # user group with many examples
https://groups.google.com/g/irod-chat/c/ECt5oWSA978/m/hCtoKUjrBQAJ # dynamic PEP - new way of invoking rules (not tried)
List rule engines installed:
tseed@tseed-irods:~$ irule -a
Level 0: Available rule engine plugin instances:
irods_rule_engine_plugin-python-instance
irods_rule_engine_plugin-irods_rule_language-instance
irods_rule_engine_plugin-cpp_default_policy-instance
NOTE: To test these examples, copy the scripts into a text editor that handles Unix end-of-line characters, then copy/paste them into the script file on the Linux host.
simple client side rule, run from desktop
NOTE: if the Python rule engine is listed first in the rule_engines section of server_config.json, you must select the rule engine explicitly with irule -r (no client-side way to set a default rule engine was found).
tseed@tseed-irods:~$ cat test.r
main() {
writeLine("stdout", "Success!");
}
OUTPUT ruleExecOut
tseed@tseed-irods:~$ irule -r irods_rule_engine_plugin-irods_rule_language-instance -F test.r
Success!
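Since no client-side default for the rule engine was found, one workaround is a small wrapper that always passes the instance name. The sketch below is only an illustration: the helper names `build_irule_command` and `run_rule` are made up here, and the only irule options used are the `-r` and `-F` flags already shown above.

```python
import subprocess

# Instance name taken from the `irule -a` listing above.
NATIVE_INSTANCE = "irods_rule_engine_plugin-irods_rule_language-instance"


def build_irule_command(rule_file, instance=NATIVE_INSTANCE):
    """Return the argv list for running a rule file on a named rule engine instance."""
    return ["irule", "-r", instance, "-F", rule_file]


def run_rule(rule_file, instance=NATIVE_INSTANCE):
    """Run the rule with irule and return its stdout (needs the icommands on PATH)."""
    completed = subprocess.run(
        build_irule_command(rule_file, instance),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Only prints the command line; call run_rule("test.r") on a host with icommands.
    print(" ".join(build_irule_command("test.r")))
```

A shell alias would do the same job; the wrapper is only useful if rules are being driven from a script anyway.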
sample client side rule to illustrate metadata manipulation
This example shows most of the ways to manipulate metadata in the native rule engine. It is designed to be run from the client, but most of the built-in microservices (functions) execute server side; the exceptions are listed in the doxygen documentation.
tseed@tseed-irods:~$ cat test-meta.r
main() {
#### metadata examples ####
# before version 4.2.7 several microservices (functions) are required to achieve metadata manipulation
# generally data structures of key pairs are created, applied or removed from the data-object/collection/resource
# from version 4.2.7 the msiModAVUMetadata microservice greatly simplifies metadata transformation
# for any version - retrieve metadata with sql like queries against the iRODS database, this is equivalent to the 'imeta ls' command
### metadata structures
# sample metadata in the % delimited format for msiString2KeyValPair
*str = "a=10%zebra=horse%hula=dance"
# load kvp structure from string
msiString2KeyValPair(*str, *kvp);
# add some keys with values
msiAddKeyVal(*kvp, "art", "crayon");
msiGetSystemTime(*Time, "human");
msiAddKeyVal(*kvp, "timestamp", *Time);
# print all keys and values
msiPrintKeyValPair("stdout", *kvp);
writeKeyValPairs("stdout", *kvp, " || ")
# print all keys
foreach(*kvp) {
writeLine("stdout", *kvp);
}
# print all keys with values
foreach(*i in *kvp) {
msiGetValByKey(*kvp,*i,*result);
writeLine("stdout", "1: *i = *result");
}
# print key=value where key like zeb*
# print key=value where value matches *Time (set with msiGetSystemTime above)
# print key=value where string value matches 10, convert to int and divide
foreach(*i in *kvp) {
msiGetValByKey(*kvp,*i,*result);
if (*i like "zeb*") then {
writeLine("stdout","2: *i = *result")
} else if (*result == *Time) then {
writeLine("stdout","2: *i = *result")
} else if (*result == "10") then {
*number=int("*result")/2
writeLine("stdout","2: *i = *number")
} else {
writeLine("stdout","2: no match")
}
}
# more conditional behaviour
foreach(*i in *kvp) {
msiGetValByKey(*kvp,*i,*result);
#if (*result == "dance" || *i == "art" || *i == "zebra") then { # this is a valid condition, however multiple OR with != / "not like" operators are not pre-evaluated correctly
if (*i == "a" && *result == str(10)) then { # must precede the else branch or it will never be matched
writeLine("stdout","3: AND *i = *result")
} else if (*result not like "dance") then {
writeLine("stdout","3: *i = *result")
}
}
## Add/Remove metadata for files(data objects) or directories(collections) ##
# print all session variables (rei) with msiGetSessionVarValue, when running the rule locally without script parameters the only useful variables are userNameClient= / rodsZoneClient=
# when run from the server rules engine, there are many more useful session variables
#msiGetSessionVarValue("all", "client");
# access the variables as $<variable name>
*testpath = "/$rodsZoneClient/home/$userNameClient"
*newfile = "/$rodsZoneClient/home/$userNameClient/test.txt"
# test for a valid path; errorcode() is used so the script does not exit on failure, and the comparison returns a boolean instead
*a = errorcode(msiObjStat(*testpath,*status)) >=0
writeLine("stdout","4: collection exists: *testpath *a")
# if path exists add/remove metadata
if (errorcode(msiObjStat(*testpath,*status)) >=0) then {
# remove file without sending to trash and unregister from database
if (errorcode(msiObjStat(*newfile,*status)) >=0) then {
msiDataObjUnlink("objPath=*newfile++++forceFlag=++++unreg=",*status)
writeLine("stdout","4: file removed: *newfile")
}
# create a file; the forceFlag attribute is required to overwrite an existing file. A resource can also be specified here, note the ++++ field delimiter, e.g. "destRescName=demoResc++++forceFlag="
*content = "test.txt content"
msiDataObjCreate(*newfile,"forceFlag=",*file_descriptor)
msiDataObjWrite(*file_descriptor,*content,*write_length)
msiDataObjClose(*file_descriptor,*status)
writeLine("stdout","4: file created: *newfile")
# apply metadata to object from kvp structure
msiAssociateKeyValuePairsToObj(*kvp,*newfile,"-d")
# get data object and collection from a full path string
#*filepath_element = ( size( (split(*newfile,"/")) ) )
#*file = (elem((split(*newfile,"/")), (*filepath_element - 1) ))
#*data_object = (elem( (split(*newfile,"/")), ( (size((split(*newfile,"/")))) - 1) ))
msiSplitPath(*newfile,*collection,*file)
# query iRODS db for metadata of file, load into a new key pair structure
*query = SELECT META_DATA_ATTR_NAME,META_DATA_ATTR_VALUE WHERE DATA_NAME = '*file' AND COLL_NAME = '*collection'
foreach(*row in *query) {
#msiPrintKeyValPair("stdout",*row)
#writeLine("stdout","next row")
msiGetValByKey(*row,"META_DATA_ATTR_NAME",*key);
msiGetValByKey(*row,"META_DATA_ATTR_VALUE",*value);
msiAddKeyVal(*query_kvp, *key, *value);
}
# create a new 'trimmed' metadata structure including the key pairs to be removed
foreach(*i in *query_kvp) {
#writeLine("stdout", "key is *i")
if (*i == "a" || *i == "art") then {
msiGetValByKey(*query_kvp,*i,*result)
writeLine("stdout","4: metadata to keep on *newfile, *i=*result")
} else {
msiGetValByKey(*query_kvp,*i,*result)
writeLine("stdout","4: metadata to remove from *newfile, *i=*result")
msiAddKeyVal(*new_kvp, *i, *result)
}
}
# remove key pairs listed in the new metadata structure from the data object
msiRemoveKeyValuePairsFromObj(*new_kvp,*newfile,"-d")
# create a new kvp structure, add key pairs
msiAddKeyVal(*kvp2, "company", "OCF");
msiAddKeyVal(*kvp2, "department", "Cloud");
msiGetSystemTime(*created_epoc, "unix")
msiAddKeyVal(*kvp2, "create_date_epoc", *created_epoc );
# get system time, load into list and grab elements based on position
msiGetFormattedSystemTime(*created,"human","%d-%02d-%02d-%02d-%02d-%02d")
writeLine("stdout", "4: year:" ++ " " ++ (elem((split(*created,"-")),0)) )
*year = elem((split(*created,"-")),0)
*month = elem((split(*created,"-")),1)
*day = elem((split(*created,"-")),2)
msiAddKeyVal(*kvp2, "create_year", *year );
msiAddKeyVal(*kvp2, "create_month", *month );
msiAddKeyVal(*kvp2, "create_day", *day );
# add meta data to the data object; -d file(data object), -C directory(collection)
msiAssociateKeyValuePairsToObj(*kvp2,*newfile,"-d");
# find files with metadata between an epoch date range
# supported operators
#>=
#<=
#=
#<
#>
#'1' '100'
#
# 1575072000 = 2019-11-30 UTC, 1890691200 = 2029-11-30 UTC (roughly 2020 - 2030)
*query = SELECT DATA_NAME WHERE COLL_NAME = '*collection' AND META_DATA_ATTR_NAME = 'create_date_epoc' AND META_DATA_ATTR_VALUE BETWEEN '01575072000' '01890691200'
foreach(*row in *query) {
msiGetValByKey(*row,"DATA_NAME",*data_name)
writeLine("stdout", "4: file: " ++ "*data_name" ++ " created between 2020 - 2030" )
}
}
### msiModAVUMetadata - change metadata directly on the object/collection/resource ###
# this is a new microservice as of version 4.2.7 and is easy to use
# msiModAVUMetadata allows key, value and unit (AVU) manipulation, much like the imeta icommand
# assigning an additional attribute 'unit' to the key pair is useful and can be treated as a secondary value or left empty ""
# remove all key pairs directly from the data object
msiModAVUMetadata("-d","*newfile","rmw", "%", "%", "%")
# add new key pair directly to the data object
msiModAVUMetadata("-d","*newfile","add", "car", "ford", "string")
# change value for key directly on the data object
msiModAVUMetadata("-d","*newfile","set", "car", "toyoda", "string")
# remove key pair directly on the data object
msiModAVUMetadata("-d","*newfile","rm", "car", "toyoda", "string")
# wildcard remove key pairs directly on the data object
msiModAVUMetadata("-d","*newfile","add", "car", "subaru", "string")
msiModAVUMetadata("-d","*newfile","add", "car", "suzuki", "string")
msiModAVUMetadata("-d","*newfile","add", "car", "saab", "string")
msiModAVUMetadata("-d","*newfile","rmw", "car", "su%", "%")
#msiModAVUMetadata("-d","*newfile","rmw", "ca%", "%", "%")
# add some meta data with arbitrary unit types
msiModAVUMetadata("-d","*newfile","add", "catC", "yes", "damage")
msiModAVUMetadata("-d","*newfile","add", "price", "1200", "sterling")
## searching with metadata
# search for files in a collection where the unit is 'damage' and the value is like 'y%' (matches yes); return the file name and the key name
*query = SELECT DATA_NAME,META_DATA_ATTR_NAME WHERE COLL_NAME = '*collection' AND META_DATA_ATTR_UNITS = 'damage' AND META_DATA_ATTR_VALUE like 'y%'
foreach(*row in *query) {
msiGetValByKey(*row,"DATA_NAME",*target_file)
msiGetValByKey(*row,"META_DATA_ATTR_NAME",*damage_type)
# search for car key value using the file name
*sub_query = SELECT META_DATA_ATTR_VALUE WHERE COLL_NAME = '*collection' AND DATA_NAME = '*target_file' AND META_DATA_ATTR_NAME = 'car'
foreach(*sub_row in *sub_query) {
msiGetValByKey(*sub_row,"META_DATA_ATTR_VALUE",*car)
# search for the price key value under threshold (string is dynamically evaluated as numeric)
*sub_query = SELECT META_DATA_ATTR_VALUE WHERE COLL_NAME = '*collection' AND DATA_NAME = '*target_file' AND META_DATA_ATTR_NAME = 'price' AND META_DATA_ATTR_VALUE < '1201'
foreach(*sub_row in *sub_query) {
msiGetValByKey(*sub_row,"META_DATA_ATTR_VALUE",*price)
#writeLine("stdout", *price)
}
# if the price variable is set, its value is below 1201
if (errorcode(*price) >=0) then {
writeLine("stdout","5: writeoff: *damage_type *car £*price")
}
}
}
}
INPUT null
OUTPUT ruleExecOut
tseed@tseed-irods:~$ irule -h # display command parameters
tseed@tseed-irods:~$ irule -t -r irods_rule_engine_plugin-irods_rule_language-instance -F test-meta.r
a = 10
zebra = horse
hula = dance
art = crayon
timestamp = 2021-05-24.09:46:23
a || 10
zebra || horse
hula || dance
art || crayon
timestamp || 2021-05-24.09:46:23
a
zebra
hula
art
timestamp
1: a = 10
1: zebra = horse
1: hula = dance
1: art = crayon
1: timestamp = 2021-05-24.09:46:23
2: a = 5
2: zebra = horse
2: no match
2: no match
2: timestamp = 2021-05-24.09:46:23
3: AND a = 10
3: zebra = horse
3: art = crayon
3: timestamp = 2021-05-24.09:46:23
4: collection exists: /OCF/home/rods true
4: file removed: /OCF/home/rods/test.txt
4: file created: /OCF/home/rods/test.txt
4: metadata to keep on /OCF/home/rods/test.txt, a=10
4: metadata to keep on /OCF/home/rods/test.txt, art=crayon
4: metadata to remove from /OCF/home/rods/test.txt, hula=dance
4: metadata to remove from /OCF/home/rods/test.txt, timestamp=2021-05-24.09:46:23
4: metadata to remove from /OCF/home/rods/test.txt, zebra=horse
4: year: 2021
4: file: test.txt created between 2020 - 2030
5: writeoff: catC saab £1200
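The epoch bounds in the date-range query are easy to get wrong, so it is worth checking them outside iRODS. The sketch below converts the two bounds to UTC dates and shows why the values are written with a leading zero. It assumes the comparison is done on strings and that the stored value is zero-padded to 11 characters, as the query strings suggest; the helper names are made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc_date(epoch_seconds):
    """Convert Unix epoch seconds to a YYYY-MM-DD string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def pad_epoch(epoch_seconds, width=11):
    """Zero-pad an epoch value; width=11 is an assumption taken from the query above."""
    return str(epoch_seconds).zfill(width)


lower = 1575072000
upper = 1890691200

print(epoch_to_utc_date(lower))  # 2019-11-30
print(epoch_to_utc_date(upper))  # 2029-11-30
print(pad_epoch(lower))          # 01575072000

# With equal-width, zero-padded strings, string order matches numeric order...
assert pad_epoch(999999999) < pad_epoch(lower)
# ...whereas unpadded strings of different lengths do not compare numerically.
assert not (str(999999999) < str(lower))
```

If the catalog compares these values numerically the padding is harmless; if it compares them as text, the padding is what makes the BETWEEN range behave.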
TODO developing server side rules
Two server side "native language" rule examples are in "python server side rule engine"; one more is needed, with some replication to an S3 resource, for reference.
https://groups.google.com/g/irod-chat/c/ABED29dReBs/m/5fWo87WYCAAJ
https://groups.google.com/g/irod-chat/c/ObjcBN7W1j0/m/LpQCzp-OAAAJ # python rule instead of native - this needs to be put in the server side rule engine example
https://groups.google.com/g/irod-chat/c/evYIHiG0R60/m/QaaluRjpBwAJ # print out
https://groups.google.com/g/irod-chat/c/gZSB3Pzv8XM/m/t_eXf0LZAAAJ # nice keyval stuff
https://slides.com/irods/ugm2019-administration-rule-engine-plugins#/5 might explain how the rule engine has changed with new dynamic PEPs (these will mature over time and hopefully be better documented for the transition)
OK, what have we learned?
- Python rulesets must implement ONLY specific PEPs, or the rule engine breaks.
- If the Python rule engine is enabled, any client irule activity must specify which rule engine to use (a bit of a pain; can't find a way to set client side defaults).
- Why do they have it like this? Doesn't matter.
- The Python weighted matching engine is easier to configure in many ways. Keep it, but ensure the policy only overrides single PEPs.
- What if we want a mix of PEPs triggered from the native ruleset vs Python? Additional Python logic has to be written for the PEP to call back into the native (.r) rule engine.
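The last point, a Python PEP handing work back to the native rule engine, might look roughly like the core.py fragment below. This is an untested sketch: the choice of PEP and the native rule name `nativeLogPut` are hypothetical, and it assumes native rules are reachable as attributes of the `callback` object in the same way microservices such as writeLine are.

```python
# Sketch of a core.py fragment for the Python rule engine plugin.
# ASSUMPTIONS: the PEP chosen here and the native rule name `nativeLogPut`
# are hypothetical; they only illustrate the delegation pattern.


def pep_api_data_obj_put_post(rule_args, callback, rei):
    """Dynamic PEP handled in Python that hands the real work to a native rule."""
    # Microservices and rules from other rule engine plugins are called
    # as attributes of the callback object.
    callback.writeLine("serverLog", "python PEP fired, delegating to native rule")
    callback.nativeLogPut("put finished")
```

The native side would then need a matching `nativeLogPut(*msg) { ... }` rule in its .re ruleset, with the native rule engine still listed after the Python one in the server configuration.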