server-scripts
A gem providing easily usable server scripts for various supercomputers and servers. The following functionality is provided:
- Generate job scripts and run batch jobs on the TSUBAME 3.0, ABCI, and Reedbush machines.
- Parse various kinds of profiling files and generate meaningful output.
Table of Contents
- server-scripts
- Usage
- ENV variables
- Writing job scripts
- Simple openMPI job script
- Intel MPI profiling job script
- Parse intel ITAC output
- Parse starpu worker info
- Usage
Usage
ENV variables
Make sure the SYSTEM variable is set on your machine so that the gem will automatically select the appropriate commands to run.
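As a quick sanity check before using the gem, a script can verify that the variable is present. The sketch below is only an illustration: the helper name require_system! is hypothetical and not part of the gem, and it checks only that SYSTEM is non-empty, since the accepted values are not listed here.

```ruby
# Hypothetical helper (not part of server_scripts): fail early when the
# SYSTEM environment variable is missing or blank.
def require_system!(env = ENV)
  name = env["SYSTEM"].to_s.strip
  raise ArgumentError, "SYSTEM is not set; export it before running" if name.empty?
  name
end

# Passing a plain Hash instead of ENV keeps the example self-contained;
# "my-cluster" is a placeholder, not a value the gem is known to accept.
puts require_system!({ "SYSTEM" => "my-cluster" })
```

In a real script you would call the helper with no argument so that it reads the process environment.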
Writing job scripts
Simple openMPI job script
Use the ServerScripts::BatchJob class in your Ruby code to generate and submit job files. A simple MPI job can be generated and submitted as follows:
require 'server_scripts'
include ServerScripts
task = BatchJob.new do |t|
  t.nodes = 4
  t.npernode = 4
  t.wall_time = "1:30:00"
  t.out_file = "out.log"
  t.err_file = "err.log"
  t.node_type = NodeType::FULL
  t.mpi = OPENMPI
  t.set_env "STARPU_SCHED", "dmda"
  t.set_env "MKL_NUM_THREADS", "1"
  t.executable = "a.out"
  t.options = "3 32768 2048 2 2"
end
task.submit!
This generates a job file with a unique name and submits it using the system's batch job submission command.
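The block-based configuration used above is a common Ruby idiom. The following toy class is a rough sketch of how such an interface can be built. ToyJob and its to_script output are hypothetical; they do not reflect the gem's actual implementation or the directives any real scheduler expects.

```ruby
# Toy illustration of block-style configuration; NOT the gem's BatchJob.
class ToyJob
  attr_accessor :nodes, :npernode, :executable, :options
  attr_reader :env

  def initialize
    @env = {}
    yield self if block_given? # hand the new object to the caller's block
  end

  # Record an environment variable to export before the executable runs.
  def set_env(name, value)
    @env[name] = value
  end

  # Render a plain shell script. A real job file would also carry
  # scheduler-specific directives, which are omitted here; the mpirun
  # line is only an illustrative launch command.
  def to_script
    lines = ["#!/bin/bash"]
    @env.each { |name, value| lines << "export #{name}=#{value}" }
    total_procs = nodes * npernode
    lines << "mpirun -np #{total_procs} ./#{executable} #{options}".strip
    lines.join("\n")
  end
end

job = ToyJob.new do |t|
  t.nodes = 4
  t.npernode = 4
  t.set_env "MKL_NUM_THREADS", "1"
  t.executable = "a.out"
  t.options = "3 32768 2048 2 2"
end

puts job.to_script
```

Yielding the object from the constructor keeps all settings in one place and lets the object be fully configured before anything is written or submitted.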
Intel MPI profiling job script
If you want to generate traces using Intel MPI, you can set additional options such as the ITAC and VTune output file or folder names.
Parse intel VTune output
Output chart of intel VTune
The way VTune classifies its output in the CSV is not obvious, and it is worth understanding before you work with the data. The output can be viewed as a tree that looks like this:
- CPU Time
  - Effective Time
    - Idle
    - Poor
    - Ok
    - Ideal
  - Spin Time
    - Imbalance or Serial Spinning
    - Lock Contention
    - MPI Busy Wait Time
    - Other
  - Overhead Time
    - Scheduling
    - Reduction
    - Atomics
    - Other
- Wait Time
  - Idle
  - Poor
  - Ok
  - Ideal
  - Over
- Wait Count
- PID
- TID
The total time is the sum of CPU Time and Wait Time.
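To make the arithmetic concrete, here is a minimal sketch that sums the leaves of the tree for a single thread. The numbers are invented for illustration, the helper name subtree_sum is hypothetical, and the sketch assumes that each parent entry equals the sum of its children. Wait Count, PID and TID are not times, so they are left out.

```ruby
# Illustrative numbers only (seconds); the keys mirror the tree above.
SAMPLE_THREAD = {
  "CPU Time" => {
    "Effective Time" => { "Idle" => 0.0, "Poor" => 0.5, "Ok" => 1.0, "Ideal" => 2.0 },
    "Spin Time" => { "Imbalance or Serial Spinning" => 0.25, "Lock Contention" => 0.0,
                     "MPI Busy Wait Time" => 0.5, "Other" => 0.0 },
    "Overhead Time" => { "Scheduling" => 0.25, "Reduction" => 0.0,
                         "Atomics" => 0.0, "Other" => 0.0 }
  },
  "Wait Time" => { "Idle" => 0.5, "Poor" => 0.0, "Ok" => 0.0, "Ideal" => 0.0, "Over" => 0.0 }
}.freeze

# Sum every leaf below a node; a leaf is any numeric value.
def subtree_sum(node)
  node.is_a?(Hash) ? node.values.sum { |child| subtree_sum(child) } : node
end

cpu_time  = subtree_sum(SAMPLE_THREAD["CPU Time"])
wait_time = subtree_sum(SAMPLE_THREAD["Wait Time"])
total     = cpu_time + wait_time

puts "CPU #{cpu_time} s + Wait #{wait_time} s = #{total} s"
```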
Usage
A sample program for parsing the first 16 threads reported by the vtune command (the report is generated first, then parsed):
vtune -report hotspots -group-by thread -result-dir result_file.vtune \
  -report-output result_thread_res.csv -csv-delimiter=,
parser = Parser::VTune::Hotspots::SLATE.new(
  "test/artifacts/slate-two-proc-p1.csv", nthreads: 16)
puts parser.total_cpu_time
puts parser.total_cpu_effective_time
puts parser.total_cpu_overhead_time
puts parser.total_wait_time
puts parser.total_mpi_busy_time
puts parser.total_time
Parse intel ITAC output
The Intel ITAC tool is useful for generating traces of parallel MPI programs. This class converts an ITAC file to an ideal trace and then generates the function profile, from which quantities such as the MPI wait time can be obtained.
Usage
For extracting the MPI wait time from an ITAC trace, do the following:
require 'server_scripts'
itac = ServerScripts::Parser::ITAC.new("itac_file.stf")
itac.generate_ideal_trace!
# All times are reported in seconds.
puts itac.mpi_time(kind: :ideal)
puts itac.mpi_time(kind: :real)
puts itac.event_time("getrf_start", how: :total, kind: :real)
puts itac.event_time("getrf_start", how: :per_proc, kind: :real)
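The two mpi_time values can be combined to separate waiting from transfer cost. The sketch below uses made-up numbers and assumes the usual interpretation of an idealized trace (an infinitely fast network), under which MPI time that remains in the ideal trace is attributed to waiting on other processes and the remainder to the interconnect. The helper name split_mpi_time is hypothetical, and the gem may define its quantities differently.

```ruby
# Hypothetical helper (not part of server_scripts). Both arguments are
# MPI times in seconds, e.g. the real-trace and ideal-trace values.
def split_mpi_time(real:, ideal:)
  raise ArgumentError, "ideal MPI time should not exceed real MPI time" if ideal > real

  {
    wait: ideal,                 # assumed: time left with an ideal network
    transfer: real - ideal,      # assumed: time attributable to the network
    wait_fraction: real.zero? ? 0.0 : ideal.to_f / real
  }
end

# Invented sample values, not measurements.
result = split_mpi_time(real: 8.0, ideal: 6.0)
puts "wait: #{result[:wait]} s, transfer: #{result[:transfer]} s"
```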
Parse starpu worker info
The ServerScripts::Parser::StarpuProfile class has various functions for parsing the *.starpu_profile files that StarPU generates with per-worker CPU execution info. These files can be batch-processed with server_scripts by specifying a pattern that matches the profile produced by each process (the example below uses a shell-style wildcard). You can then get either per-worker or per-process information.
Usage
parser = Parser::StarpuProfile.new("test/artifacts/4_proc_profile_8_*.starpu_profile")
puts parser.total_time
puts parser.total_exec_time
puts parser.total_sleep_time
puts parser.total_overhead_time
puts parser.time(event: :total_time, proc_id: 0, worker_id: 4)
puts parser.proc_time event: :exec_time, proc_id: 2
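The difference between per-worker, per-process and overall figures can be sketched with plain Ruby hashes. Everything below is hypothetical: the PROFILE data is invented, the helper names only loosely echo the calls above, and none of it shows how the gem stores or computes its values.

```ruby
# Hypothetical in-memory data: proc_id => worker_id => times in seconds.
PROFILE = {
  0 => { 0 => { exec_time: 3.0, sleep_time: 0.5, overhead_time: 0.5 },
         1 => { exec_time: 2.0, sleep_time: 1.5, overhead_time: 0.5 } },
  1 => { 0 => { exec_time: 3.5, sleep_time: 0.25, overhead_time: 0.25 } }
}.freeze

# Time of one kind for a single worker of a single process.
def worker_time(profile, event:, proc_id:, worker_id:)
  profile.fetch(proc_id).fetch(worker_id).fetch(event)
end

# Time of one kind summed over all workers of one process.
def proc_time(profile, event:, proc_id:)
  profile.fetch(proc_id).values.sum { |times| times.fetch(event) }
end

# Time of one kind summed over every worker of every process.
def total(profile, event:)
  profile.keys.sum { |proc_id| proc_time(profile, event: event, proc_id: proc_id) }
end

puts worker_time(PROFILE, event: :sleep_time, proc_id: 0, worker_id: 1)
puts proc_time(PROFILE, event: :exec_time, proc_id: 0)
puts total(PROFILE, event: :exec_time)
```

Using fetch rather than [] makes a mistyped process, worker or event name raise a KeyError instead of silently returning nil.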
CUBEX profiles
Parse data from profiles generated by Score-P in the .cubex file format.
Usage for getting performance metrics
Use the Parser::Cubex class and provide it with a folder name. The folder should contain a profile.cubex file, which will be parsed for the output. Then use the parse method to obtain various performance counters for any event:
parser = Parser::Cubex.new("test/artifacts/cubex")
puts parser.parse(counter: "PAPI_L3_TCM", event: "gemv")
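Because the parser expects a folder rather than a file, it can help to confirm that the folder really contains profile.cubex before parsing. The helper below is a hypothetical pre-check written with the standard library only; it is not part of the gem.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical pre-check (not part of server_scripts): return the path to
# profile.cubex inside dir, or raise if the file is missing.
def cubex_profile_path(dir)
  path = File.join(dir, "profile.cubex")
  raise ArgumentError, "no profile.cubex in #{dir}" unless File.file?(path)
  path
end

# Demonstrate with a throwaway directory so the example is self-contained.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "profile.cubex"))
  puts File.basename(cubex_profile_path(dir))
end
```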