Description
oar-scripting is a helper for writing OAR prologue/epilogue scripts in Ruby. This helper:
- Creates execution logs to make these scripts easy to debug.
- Separates the script into distinct steps.
- Records the execution duration of each step; the oar-scripting-graph tool can generate graphs from these stats.
- Lets you split a script into several files and load them from a directory.
- Lets steps be overwritten.
Requirements
Ruby Gem installation
$ gem install oar-scripting
Writing a script
#!/usr/bin/env ruby
require "rubygems"
require "oar/scripting"
include OAR::Scripting
Init a prologue script
Script.init :prologue
Init an epilogue script
Script.init :epilogue
Job information
job method (prologue/epilogue args)
The job method returns a Hash with job information (the prologue/epilogue arguments, parsed when the Script class is initialized):
job[:id] # id of the job
job[:user] # user, owner of the job
job[:nodesfile] # file containing list of resources
oarstat method
Full description of a job. Runs the oarstat -J command to get information about the job, at most once per script and only if needed. This method returns a Hash.
oarstat["initial_request"]
oarstat["walltime"]
...
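The "at most once, and only if needed" behaviour is the usual lazy memoization pattern. The sketch below illustrates it under stated assumptions: LazyOarstat is a hypothetical class, not part of oar-scripting, and the fetcher block returns made-up sample JSON instead of running oarstat, so the example works without OAR installed.

```ruby
require "json"

# Hypothetical stand-in, not the gem's code: caches the parsed job
# description the first time it is requested.
class LazyOarstat
  attr_reader :calls

  # The fetcher is any block returning a JSON string. A real script would
  # run the oarstat command here; it is injected so the sketch is runnable.
  def initialize(&fetcher)
    @fetcher = fetcher
    @calls = 0
    @data = nil
  end

  def oarstat
    # Fetch and parse on first access only; later calls reuse the Hash.
    @data ||= begin
      @calls += 1
      JSON.parse(@fetcher.call)
    end
  end
end

# Sample data invented for the illustration.
info = LazyOarstat.new { '{"walltime": 7200, "initial_request": "oarsub -I"}' }
info.oarstat["walltime"]
info.oarstat["initial_request"]
puts info.calls
```

If oarstat is never called, the fetcher never runs, which matches the "only if needed" part of the description.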
Step definition
Usage
A step is a Ruby block passed to the step method. It is defined by:
- a name, given as a Ruby Symbol (:name)
- some options, given as a Hash with these default values:
options = { :order => 50, :overwrite => false, :continue => true }
Options
:order
Defines the execution order of the step. Steps with a lower value run first.
:overwrite
If true, this step overwrites another step defined with the same name.
:continue
If false, the script exits when the block of this step raises an exception.
Example
step :kadeploy, :order => 20 do
  # code
end
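To make the effect of :order and :overwrite concrete, here is a self-contained sketch of a step registry. It is not the gem's implementation: StepRegistry and its methods are hypothetical, and the rule "a duplicate name is skipped unless :overwrite is set" is inferred from the option description and from the [step_already_defined] and [step_overwrites] log entries.

```ruby
# Hypothetical sketch, not oar-scripting's code.
class StepRegistry
  DEFAULTS = { :order => 50, :overwrite => false, :continue => true }.freeze

  def initialize
    @steps = []
  end

  # Register a named block. A later definition with the same name is
  # skipped unless it sets :overwrite => true.
  def step(name, options = {}, &block)
    entry = DEFAULTS.merge(options).merge(:name => name, :proc => block)
    existing = @steps.index { |s| s[:name] == name }
    if existing.nil?
      @steps << entry
    elsif entry[:overwrite]
      @steps[existing] = entry
    end
  end

  # Run the blocks by ascending :order and return the names in run order.
  def execute
    @steps.sort_by { |s| s[:order] }.map do |s|
      s[:proc].call
      s[:name]
    end
  end
end

registry = StepRegistry.new
trace = []
registry.step(:oar, :order => 60) { trace << "oar" }
registry.step(:kavlan, :order => 20) { trace << "kavlan" }
registry.step(:kavlan) { trace << "ignored duplicate" }
registry.step(:oar, :order => 60, :overwrite => true) { trace << "oar (overwritten)" }
run_order = registry.execute
puts run_order.inspect
```

The sketch leaves out :continue handling and logging to stay short.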
Disable steps
Steps can be disabled with a single name or a list of names:
Script.disable_steps [:kadeploy, :kavlan, :oar]
Script.disable_steps :storage
Note: see the /etc/oar/epilogue.d/storage.rb example.
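Both call forms above (an Array of names or a single Symbol) can be handled by one method. A minimal sketch, assuming a hypothetical DisabledSteps class that is not part of the gem:

```ruby
require "set"

# Hypothetical sketch: collect disabled step names from either call form.
class DisabledSteps
  def initialize
    @names = Set.new
  end

  # Array() wraps a lone Symbol and leaves an Array unchanged.
  def disable_steps(names)
    @names.merge(Array(names))
  end

  def disabled?(name)
    @names.include?(name)
  end
end

disabled = DisabledSteps.new
disabled.disable_steps [:kadeploy, :kavlan, :oar]
disabled.disable_steps :storage

all_steps = [:kavlan, :bonfire, :kadeploy, :oar, :storage]
remaining = all_steps.reject { |name| disabled.disabled?(name) }
puts remaining.inspect
```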
System calls with logging
Usage
A system call with logging is made with the sh method. It is defined by:
- the command to launch, as a String
- some options, as a Hash with these default values:
options = { :return => false, :stderr => true }
Options
:return
If true, sh returns the output of the command.
:stderr
If false, append 2>/dev/null to the command. If true, append 2>&1 to the command.
Examples
sh %{/usr/sbin/karights3 --overwrite-rights -a -f #{job[:nodesfile]} -p /dev/sda3 -u #{job[:user]}}
KAVLAN_IDS = sh(%{/usr/bin/kavlan -V -j #{job[:id]} -q}, :return => true, :stderr => false).split("\n")
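The :stderr option only changes the redirection appended to the command string. The sketch below shows that mapping with a hypothetical helper, command_with_redirect, which is not the gem's code. It builds the string without running anything, and the job id in it is the one from the sample log.

```ruby
# Hypothetical helper, not oar-scripting's code: build the shell string
# that the :stderr option describes.
SH_DEFAULTS = { :return => false, :stderr => true }.freeze

def command_with_redirect(command, options = {})
  opts = SH_DEFAULTS.merge(options)
  # true merges stderr into stdout, false discards stderr.
  suffix = opts[:stderr] ? "2>&1" : "2>/dev/null"
  "#{command} #{suffix}"
end

merged = command_with_redirect("/usr/bin/kavlan -V -j 420075 -q")
quiet = command_with_redirect("/usr/bin/kavlan -V -j 420075 -q", :stderr => false)
puts merged
puts quiet
```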
Split prologue/epilogue into several files
Load scripts
Script.load_steps
By default, this command will load /etc/oar/prologue.d/*.rb files for the prologue and /etc/oar/epilogue.d/*.rb files for the epilogue.
You can change these paths:
OAR::Scripting::Config[:prologue_d_path] = "/var/lib/oar/prologue.d"
OAR::Scripting::Config[:epilogue_d_path] = "/var/lib/oar/epilogue.d"
or
OAR::Scripting::Config[:oar_conf_path] = "/var/lib/oar" # Same effect as the two lines above
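As an illustration of how a base path can map to the two directories, here is a sketch with hypothetical helpers (steps_dir and step_files are not part of the gem). Sorting the file list is a choice made for this sketch; the order in which oar-scripting actually loads files is not stated here.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: directory searched for step files, derived from a
# base path and a script type (:prologue or :epilogue).
def steps_dir(conf_path, type)
  File.join(conf_path, "#{type}.d")
end

# List the *.rb files of that directory in a stable (sorted) order.
def step_files(conf_path, type)
  Dir.glob(File.join(steps_dir(conf_path, type), "*.rb")).sort
end

# Try it against a throwaway directory so nothing under /etc is touched.
found = Dir.mktmpdir do |base|
  dir = steps_dir(base, :epilogue)
  FileUtils.mkdir_p(dir)
  %w[oar.rb kavlan.rb notes.txt].each { |f| FileUtils.touch(File.join(dir, f)) }
  step_files(base, :epilogue).map { |path| File.basename(path) }
end
puts found.inspect
```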
Execute resulting ordered steps
Script.execute
Generated logs
Logs are written to the /var/log/oar directory by default, with file names of the form jobid-epilogue|prologue-user.log
OAR::Scripting::Config[:log_path] = "/var/lib/oar/scripting/logs" # to change the log path
Example: file /var/log/oar/420075-epilogue-pmorillo.log
# Logfile created on Fri Mar 30 09:29:49 +0200 2012 by logger.rb/22285
I, [2012-03-30T09:29:49.998917 #1246] INFO -- : [begin]
I, [2012-03-30T09:29:50.420609 #1246] INFO -- : [disable_step]bonfire
I, [2012-03-30T09:29:50.420651 #1246] INFO -- : [disable_step]storage
I, [2012-03-30T09:29:50.420683 #1246] INFO -- : [begin_step]kavlan
I, [2012-03-30T09:29:50.420776 #1246] INFO -- : [command] /usr/bin/kavlan -V -j 420075 -q 2>&1
D, [2012-03-30T09:29:50.962787 #1246] DEBUG -- : [result]
4
D, [2012-03-30T09:29:50.963022 #1246] DEBUG -- : [command_success]
I, [2012-03-30T09:29:50.963177 #1246] INFO -- : [command] /usr/sbin/kavlan_rights -u pmorillo -d -i 4 2>&1
D, [2012-03-30T09:29:51.284595 #1246] DEBUG -- : [result]
vlan 4: ok
D, [2012-03-30T09:29:51.284838 #1246] DEBUG -- : [command_success]
I, [2012-03-30T09:29:51.284943 #1246] INFO -- : [end_step]kavlan
I, [2012-03-30T09:29:51.285020 #1246] INFO -- : [begin_step]kadeploy
I, [2012-03-30T09:29:51.285135 #1246] INFO -- : [command] /usr/sbin/karights3 -d -u pmorillo -p /dev/sda3 -f /var/lib/oar/420075 2>&1
D, [2012-03-30T09:29:52.448497 #1246] DEBUG -- : [result]
D, [2012-03-30T09:29:52.448741 #1246] DEBUG -- : [command_success]
I, [2012-03-30T09:29:52.448846 #1246] INFO -- : [command] /usr/bin/kareboot3 -l hard --no-wait -p 2 -r env_recorded -e squeeze-x64-prod -u deploy -f /var/lib/oar/420075 2>&1
D, [2012-03-30T09:30:06.969389 #1246] DEBUG -- : [result]
--- switch_pxe (parapluie cluster)
>>> parapluie-38.rennes.grid5000.fr
--- reboot (parapluie cluster)
>>> parapluie-38.rennes.grid5000.fr
*** A hard reboot will be performed on the nodes parapluie-38.rennes.grid5000.fr
--- set_vlan (parapluie cluster)
>>> parapluie-38.rennes.grid5000.fr
*** Bypass the VLAN setting
D, [2012-03-30T09:30:06.969635 #1246] DEBUG -- : [command_success]
I, [2012-03-30T09:30:06.969723 #1246] INFO -- : [end_step]kadeploy
I, [2012-03-30T09:30:06.969832 #1246] INFO -- : [begin_step]oar
I, [2012-03-30T09:30:06.969938 #1246] INFO -- : [command] /usr/sbin/oarnodesetting -n -s Absent -p available_upto=0 --sql "resource_id IN (select assigned_resources.resource_id from jobs,assigned_resources,resources where assigned_resource_index = 'CURRENT' AND jobs.state = 'Running' AND jobs.job_id = 420075 and moldable_job_id = jobs.assigned_moldable_job AND (resources.resource_id = assigned_resources.resource_id AND resources.type='default'))" 2>&1
D, [2012-03-30T09:30:10.617586 #1246] DEBUG -- : [result]
3804 --> Absent
3805 --> Absent
3806 --> Absent
3807 --> Absent
3808 --> Absent
3809 --> Absent
3810 --> Absent
3811 --> Absent
3812 --> Absent
3813 --> Absent
3814 --> Absent
3815 --> Absent
3816 --> Absent
3817 --> Absent
3818 --> Absent
3819 --> Absent
3820 --> Absent
3821 --> Absent
3822 --> Absent
3823 --> Absent
3824 --> Absent
3825 --> Absent
3826 --> Absent
3827 --> Absent
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
Update property available_upto with value 0 ...DONE
D, [2012-03-30T09:30:10.617851 #1246] DEBUG -- : [command_success]
I, [2012-03-30T09:30:10.617936 #1246] INFO -- : [end_step]oar
I, [2012-03-30T09:30:10.618015 #1246] INFO -- : [end]
I, [2012-03-30T09:30:10.618210 #1246] INFO -- : [stats]{"duration":20.619485,"steps":[{"duration":0.864266,"name":"kavlan","order":20},{"duration":15.684741,"name":"kadeploy","order":40},{"duration":3.648107,"name":"oar","order":60}],"job":{"nodesfile":"/var/lib/oar/420075","user":"pmorillo","resources_count":24,"host_count":1,"id":"420075"}}
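The final [stats] entry is plain JSON after the marker, so it can be read with Ruby's standard json library. A minimal sketch, where parse_stats is a hypothetical helper name and the input is the last line of the log above:

```ruby
require "json"

# Return the parsed JSON payload that follows the [stats] marker,
# or nil when the line is not a stats line.
def parse_stats(line)
  marker = "[stats]"
  index = line.index(marker)
  return nil if index.nil?
  JSON.parse(line[(index + marker.length)..-1])
end

line = 'I, [2012-03-30T09:30:10.618210 #1246] INFO -- : [stats]{"duration":20.619485,"steps":[{"duration":0.864266,"name":"kavlan","order":20},{"duration":15.684741,"name":"kadeploy","order":40},{"duration":3.648107,"name":"oar","order":60}],"job":{"nodesfile":"/var/lib/oar/420075","user":"pmorillo","resources_count":24,"host_count":1,"id":"420075"}}'

stats = parse_stats(line)
slowest = stats["steps"].max_by { |s| s["duration"] }
puts "#{slowest["name"]}: #{slowest["duration"]}s of #{stats["duration"]}s"
```

For this log the slowest step is kadeploy, which accounts for most of the total duration.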
Generate graphs
oar-scripting-graph --output /tmp
This command will create the directory /tmp/oar-scripting-graph with web pages and graphs.
Note: this script only works with the default log path (options will be implemented later).
Full Epilogue example used on a Grid'5000 frontend
This epilogue generated the log above.
File /etc/oar/epilogue
#!/usr/bin/env ruby
# MANAGED BY PUPPET
# Module : oarg5k
# File : puppet:///modules/oarg5k/rennes/epilogue
require 'rubygems'
require 'oar/scripting'
include OAR::Scripting
Script.init :epilogue
Script.load_steps
Script.disable_steps :bonfire unless job[:user] == "bonfire"
step :bonfire, :order => 30 do
  Script.logger.info "Bonfire's job, need to set all nodes in the default Vlan"
  sh %{kavlan -s -i DEFAULT -f #{job[:nodesfile]}}
end
Script.execute
File /etc/oar/epilogue.d/kavlan.rb
# MANAGED BY PUPPET
# Module : kavlang5k
# File : puppet:///modules/kavlang5k/commons/oar/epilogue_kavlan.rb
unless oarstat["initial_request"].match "kavlan"
  Script.disable_steps :kavlan
end
step :kavlan, :order => 20 do
  KAVLAN_IDS = sh(%{/usr/bin/kavlan -V -j #{job[:id]} -q}, :return => true, :stderr => false).split("\n")
  if !KAVLAN_IDS.empty?
    sh %{/usr/sbin/kavlan_rights -u #{job[:user]} -d -i #{KAVLAN_IDS.join(" -i ")}}
    KAVLAN_IDS.each do |vlan|
      if vlan.to_i < 4
        sh %{ssh kavlan-#{vlan} "echo \\"-:ALL:ALL\\" > /var/lib/oar/access.conf; sudo /etc/init.d/dhcp3-server restart; sudo pkill -9 -u #{job[:user]}"}
      end
    end
  end
end
File /etc/oar/epilogue.d/kadeploy.rb
# MANAGED BY PUPPET
# Module : kadeployg5k
# File : puppet:///modules/kadeployg5k/oar/epilogue_kadeploy.rb
step :kadeploy, :order => 40 do
  # Revoke kadeploy rights that were granted to the user for the time of the job
  sh %{/usr/sbin/karights3 -d -u #{job[:user]} -p /dev/sda3 -f #{job[:nodesfile]}}
  # Reboot the nodes.
  # NB: the command is launched in background, because it can take time to
  # return, and oar can grow impatient... For the nohup to work properly, we
  # need to redirect all descriptors.
  sh %{/usr/bin/kareboot3 -l very_hard --no-wait -p 2 -r env_recorded -e squeeze-x64-prod -u deploy -f #{job[:nodesfile]}}
end
File /etc/oar/epilogue.d/oar.rb
# MANAGED BY PUPPET
# Module : kadeployg5k
# File : puppet:///modules/kadeployg5k/oar/epilogue_oar.rb
step :oar, :order => 60 do
  # Mark nodes as Absent as they are rebooting
  sh %{/usr/sbin/oarnodesetting -n -s Absent -p available_upto=0 --sql "resource_id IN (select assigned_resources.resource_id from jobs,assigned_resources,resources where assigned_resource_index = 'CURRENT' AND jobs.state = 'Running' AND jobs.job_id = #{job[:id]} and moldable_job_id = jobs.assigned_moldable_job AND (resources.resource_id = assigned_resources.resource_id AND resources.type='default'))"}
end
File /etc/oar/epilogue.d/storage.rb
# MANAGED BY PUPPET
# Module : kadeployg5k
# File : puppet:///modules/storage5k/oar/epilogue_storage.rb
#
if oarstat["initial_request"].match "chunks"
  Script.disable_steps [:oar, :kadeploy, :kavlan]
else
  Script.disable_steps :storage
end
step :storage, :order => 80 do
  sh %{/usr/bin/storage5k -a job-remove -u #{job[:user]} -j #{job[:id]}}
  sh %{/usr/bin/storage5k -a umount -j #{job[:id]}}
end
Overwrite example and oarstat usage
File /etc/oar/epilogue.d/test.rb
step :kadeploy, :order => 40, :overwrite => true do
  sh %{/usr/sbin/karights3 -d -u #{job[:user]} -p /dev/sda3 -f #{job[:nodesfile]}}
end
step :oar, :order => 60, :overwrite => true do
  Script.logger.debug "Walltime : #{oarstat["walltime"]}"
  Script.logger.debug "Assigned resources : #{oarstat["assigned_resources"].inspect}"
end
Generated logs
root@fqualif.qualif.grid5000.fr(kvm|paramount-srv):~# cat /var/log/oar/404-epilogue-pmorillo.log
# Logfile created on Mon Mar 26 11:23:09 +0200 2012 by logger.rb/22285
I, [2012-03-26T11:23:09.532239 #30551] INFO -- : [begin]
I, [2012-03-26T11:23:09.532818 #30551] INFO -- : [step_overwrites] replace {:proc=>#<Proc:0x00007f41ab7a27b0@/etc/oar/epilogue.d/kadeploy.rb:5>, :continue=>true, :name=>:kadeploy, :order=>40} by {:proc=>#<Proc:0x00007f41ab7a1fe0@/etc/oar/epilogue.d/test.rb:1>, :continue=>true, :name=>:kadeploy, :overwrite=>true, :order=>40}
I, [2012-03-26T11:23:09.533040 #30551] INFO -- : [step_already_defined] skip {:proc=>#<Proc:0x00007f41ab7a0ac8@/etc/oar/epilogue.d/oar.rb:5>, :continue=>true, :name=>:oar, :order=>60}
I, [2012-03-26T11:23:09.533102 #30551] INFO -- : [begin_step]kadeploy
I, [2012-03-26T11:23:09.533178 #30551] INFO -- : [command] /usr/sbin/karights3 -d -u pmorillo -p /dev/sda3 -f /var/lib/oar/404 2>&1
D, [2012-03-26T11:23:10.717976 #30551] DEBUG -- : [result]
D, [2012-03-26T11:23:10.718377 #30551] DEBUG -- : [command_success]
I, [2012-03-26T11:23:10.718456 #30551] INFO -- : [end_step]kadeploy
I, [2012-03-26T11:23:10.718563 #30551] INFO -- : [begin_step]oar
D, [2012-03-26T11:23:11.008116 #30551] DEBUG -- : Walltime : 7200
D, [2012-03-26T11:23:11.008433 #30551] DEBUG -- : Assigned resources : ["5", "6", "7", "8"]
I, [2012-03-26T11:23:11.008504 #30551] INFO -- : [end_step]oar
I, [2012-03-26T11:23:11.008596 #30551] INFO -- : [end]
I, [2012-03-26T11:23:11.008896 #30551] INFO -- : [stats]{"duration":1.476776,"steps":[{"duration":1.185375,"name":"kadeploy","order":40},{"duration":0.289969,"name":"oar","order":60}],"job":{"user":"pmorillo","host_count":1,"nodesfile":"/var/lib/oar/404","id":"404","resources_count":4}}