SLURM Job Accounting Gather Plugin API
Overview
This document describes SLURM job accounting gather plugins and the API that defines them. It is intended as a resource for programmers who wish to write their own SLURM job accounting gather plugins. This is version 1 of the API.
SLURM job accounting gather plugins must conform to the SLURM Plugin API with the following specifications:
const char plugin_name[]="full text name"
A free-formatted ASCII text string that identifies the plugin.
const char plugin_type[]="major/minor"
The major type must be "jobacct_gather". The minor type can be any suitable name for the type of accounting package. We currently use:
- aix: Gathers information from the AIX /proc table and adds this information to the standard rusage information also gathered for each job.
- linux: Gathers information from the Linux /proc table and adds this information to the standard rusage information also gathered for each job.
- none: No information gathered.
The programmer is urged to study src/plugins/jobacct_gather/linux and src/common/jobacct_common.c/.h for a sample implementation of a SLURM job accounting gather plugin.
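As a minimal sketch of the two identification strings described above, a plugin source file might declare them as follows. The name text, the minor type "example", and the helper has_jobacct_gather_major() are hypothetical and exist only for illustration; they are not part of SLURM.

```c
#include <assert.h>
#include <string.h>

/* Identification strings required of every plugin. The text and the
 * minor type "example" are hypothetical placeholders. */
const char plugin_name[] = "Example job accounting gather plugin";
const char plugin_type[] = "jobacct_gather/example";

/* Hypothetical helper (not part of SLURM): returns 1 if `type` starts
 * with the major type "jobacct_gather/" and has a non-empty minor type,
 * otherwise 0. */
int has_jobacct_gather_major(const char *type)
{
    static const char major[] = "jobacct_gather/";
    size_t n = sizeof(major) - 1;

    assert(type != NULL);
    return strncmp(type, major, n) == 0 && type[n] != '\0';
}
```

A check such as has_jobacct_gather_major(plugin_type) can catch a mistyped major type during development.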
API Functions
All of the following functions are required. Functions which are not implemented must be stubbed.
jobacctinfo_t *jobacct_gather_p_create(jobacct_id_t *jobacct_id)
Description:
jobacct_gather_p_create() is used to allocate and initialize a new jobacctinfo structure.
You will need to free the information returned by this function!
Arguments:
jobacct_id
(input) ID of the task being passed in, or (uint16_t)NO_VAL if there is no specific task.
Returns:
jobacctinfo structure pointer on success, or
NULL on failure.
void jobacct_gather_p_destroy(jobacctinfo_t *jobacct)
Description:
jobacct_gather_p_destroy() is used to free the allocation made by jobacct_gather_p_create().
Arguments:
jobacct
(input) structure to be freed.
Returns:
none
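The allocate and free pairing above can be sketched as follows. The struct layouts are hypothetical stand-ins for SLURM's real jobacct_id_t and jobacctinfo_t, and the value given for NO_VAL is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_VAL 0xfffffffe /* assumed value, for illustration only */

/* Hypothetical stand-ins for the real SLURM types. */
typedef struct {
    uint16_t taskid;
} jobacct_id_t;

typedef struct {
    uint16_t taskid;  /* task this record belongs to */
    uint64_t max_rss; /* example statistic, zero-initialized */
} jobacctinfo_t;

/* Allocate a zeroed record; the caller owns the returned pointer. */
jobacctinfo_t *jobacct_gather_p_create(jobacct_id_t *jobacct_id)
{
    jobacctinfo_t *jobacct = calloc(1, sizeof(*jobacct));

    if (jobacct == NULL)
        return NULL;
    /* No id supplied: mark the record as not tied to a specific task. */
    jobacct->taskid = jobacct_id ? jobacct_id->taskid : (uint16_t)NO_VAL;
    return jobacct;
}

/* Release a record obtained from jobacct_gather_p_create(). */
void jobacct_gather_p_destroy(jobacctinfo_t *jobacct)
{
    free(jobacct); /* free(NULL) is a no-op */
}
```

Every successful call to the create function should be matched by exactly one call to the destroy function.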
int jobacct_gather_p_setinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)
Description:
jobacct_gather_p_setinfo() is called to set the values of a jobacctinfo_t to
specific values based on inputs.
Arguments:
jobacct
(input/output) structure to be altered.
type
(input) enum of specific part of jobacct to alter.
data
(input) corresponding data to set jobacct part to.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int jobacct_gather_p_getinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)
Description:
jobacct_gather_p_getinfo() is called to get specific values from a
jobacctinfo_t based on inputs.
Arguments:
jobacct
(input) structure to be queried.
type
(input) enum of specific part of jobacct to get.
data
(output) corresponding data read from the jobacct part.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
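One way to read the setinfo/getinfo pair is as a switch on the type argument that copies data into or out of the matching field. The sketch below assumes a two-field structure and two enum members, all hypothetical; the real enum jobacct_data_type and jobacctinfo_t are defined in the SLURM sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR (-1)

/* Hypothetical enum members and fields, for illustration only. */
enum jobacct_data_type { EXAMPLE_DATA_MAX_RSS, EXAMPLE_DATA_MIN_CPU };

typedef struct {
    uint64_t max_rss;
    uint32_t min_cpu;
} jobacctinfo_t;

/* Copy *data into the field selected by type. */
int jobacct_gather_p_setinfo(jobacctinfo_t *jobacct,
                             enum jobacct_data_type type, void *data)
{
    if (jobacct == NULL || data == NULL)
        return SLURM_ERROR;
    switch (type) {
    case EXAMPLE_DATA_MAX_RSS:
        jobacct->max_rss = *(uint64_t *)data;
        break;
    case EXAMPLE_DATA_MIN_CPU:
        jobacct->min_cpu = *(uint32_t *)data;
        break;
    default:
        return SLURM_ERROR; /* unknown part requested */
    }
    return SLURM_SUCCESS;
}

/* Copy the field selected by type into *data. */
int jobacct_gather_p_getinfo(jobacctinfo_t *jobacct,
                             enum jobacct_data_type type, void *data)
{
    if (jobacct == NULL || data == NULL)
        return SLURM_ERROR;
    switch (type) {
    case EXAMPLE_DATA_MAX_RSS:
        *(uint64_t *)data = jobacct->max_rss;
        break;
    case EXAMPLE_DATA_MIN_CPU:
        *(uint32_t *)data = jobacct->min_cpu;
        break;
    default:
        return SLURM_ERROR;
    }
    return SLURM_SUCCESS;
}
```

Because data is a void pointer, the caller must pass a pointer to a variable whose type matches the requested part.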
void jobacct_gather_p_pack(jobacctinfo_t *jobacct, Buf buffer)
Description:
jobacct_gather_p_pack() packs a jobacctinfo_t into a buffer to send across the network.
Arguments:
jobacct
(input) structure to pack.
buffer
(input/output) buffer to pack structure into.
Returns:
none
int jobacct_gather_p_unpack(jobacctinfo_t *jobacct, Buf buffer)
Description:
jobacct_gather_p_unpack() unpacks a jobacctinfo_t from a buffer received from
the network.
You will need to free the jobacctinfo_t returned by this function!
Arguments:
jobacct
(input/output) structure to fill.
buffer
(input) buffer to unpack structure from.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
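A pack/unpack round trip can be sketched with a small fixed-size byte buffer. The example_buf_t type below is a hypothetical stand-in for SLURM's Buf, the two structure fields are invented for the example, and the helpers put32() and get32() are not SLURM functions. The unpack function returns int here so that it can report the SLURM_SUCCESS or SLURM_ERROR status listed under Returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR (-1)

/* Hypothetical stand-in for SLURM's Buf type. */
typedef struct {
    unsigned char data[64];
    size_t size;      /* bytes written so far */
    size_t processed; /* bytes read so far */
} example_buf_t;
typedef example_buf_t *Buf;

typedef struct {
    uint32_t max_rss;
    uint32_t min_cpu;
} jobacctinfo_t;

/* Append a 32-bit value in big-endian (network) byte order. */
static void put32(Buf b, uint32_t v)
{
    assert(b->size + 4 <= sizeof(b->data));
    b->data[b->size++] = (unsigned char)(v >> 24);
    b->data[b->size++] = (unsigned char)(v >> 16);
    b->data[b->size++] = (unsigned char)(v >> 8);
    b->data[b->size++] = (unsigned char)v;
}

/* Read the next big-endian 32-bit value; fail if the buffer is short. */
static int get32(Buf b, uint32_t *v)
{
    const unsigned char *p;

    if (b->processed + 4 > b->size)
        return SLURM_ERROR;
    p = b->data + b->processed;
    *v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    b->processed += 4;
    return SLURM_SUCCESS;
}

void jobacct_gather_p_pack(jobacctinfo_t *jobacct, Buf buffer)
{
    put32(buffer, jobacct->max_rss);
    put32(buffer, jobacct->min_cpu);
}

int jobacct_gather_p_unpack(jobacctinfo_t *jobacct, Buf buffer)
{
    /* Fields must be read in the same order they were packed. */
    if (get32(buffer, &jobacct->max_rss) != SLURM_SUCCESS)
        return SLURM_ERROR;
    return get32(buffer, &jobacct->min_cpu);
}
```

Writing the fields in a fixed byte order keeps the packed form independent of the endianness of the sending and receiving hosts.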
void jobacct_gather_p_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)
Description:
jobacct_gather_p_aggregate() is called to aggregate and get max values from two
different jobacctinfo structures.
Arguments:
dest
(input/output) initial structure to be applied to.
from
(input) new info to apply to dest.
Returns:
none
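Aggregation as described above combines two kinds of updates: maxima are kept and totals are summed. The following sketch assumes a structure with one field of each kind; both field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fields: one maximum and one running total. */
typedef struct {
    uint64_t max_rss; /* largest value seen so far */
    uint64_t tot_rss; /* sum over all aggregated records */
} jobacctinfo_t;

/* Fold `from` into `dest`: keep the larger maximum, add the totals. */
void jobacct_gather_p_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)
{
    if (dest == NULL || from == NULL)
        return;
    if (from->max_rss > dest->max_rss)
        dest->max_rss = from->max_rss;
    dest->tot_rss += from->tot_rss;
}
```

Only dest is modified, so the same from record can be folded into several destinations.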
int jobacct_gather_p_startpoll(int frequency)
Description:
jobacct_gather_p_startpoll() is called at the start of the slurmstepd.
It starts a thread that polls for information, which can then be queried
at any time until the end of the process.
Put global initialization here.
Arguments:
frequency (input) polling frequency for the polling thread.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int jobacct_gather_p_endpoll()
Description:
jobacct_gather_p_endpoll() is called when the process is finished to stop the
polling thread.
Arguments:
none
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
void jobacct_gather_p_suspend_poll(void)
Description:
jobacct_gather_p_suspend_poll() is called when the process is suspended.
This causes the polling thread to halt until the process is resumed.
Arguments:
none
Returns:
none
void jobacct_gather_p_resume_poll(void)
Description:
jobacct_gather_p_resume_poll() is called when the process is resumed.
This causes the polling thread to resume operation.
Arguments:
none
Returns:
none
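The interaction of the four polling calls can be modeled as two flags that a polling loop consults before each sample. The sketch below deliberately leaves out the thread itself and stays single-threaded: example_poll_once() is a hypothetical function standing in for one iteration of the polling thread, and the static variables are assumptions of this example rather than SLURM internals.

```c
#include <assert.h>
#include <stdbool.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR (-1)

static bool poll_running = false;   /* set between startpoll and endpoll */
static bool poll_suspended = false; /* set between suspend and resume */
static int poll_frequency = 0;      /* value passed to startpoll */
static int poll_count = 0;          /* number of samples taken */

int jobacct_gather_p_startpoll(int frequency)
{
    if (poll_running)
        return SLURM_ERROR; /* already started */
    poll_frequency = frequency;
    poll_suspended = false;
    poll_count = 0;
    poll_running = true; /* a real plugin would create its thread here */
    return SLURM_SUCCESS;
}

int jobacct_gather_p_endpoll(void)
{
    if (!poll_running)
        return SLURM_ERROR;
    poll_running = false; /* a real plugin would stop its thread here */
    return SLURM_SUCCESS;
}

void jobacct_gather_p_suspend_poll(void)
{
    poll_suspended = true;
}

void jobacct_gather_p_resume_poll(void)
{
    poll_suspended = false;
}

/* Hypothetical: one iteration of the polling loop. Returns true if a
 * sample was taken, false if polling is stopped or suspended. */
bool example_poll_once(void)
{
    if (!poll_running || poll_suspended)
        return false;
    poll_count++;
    return true;
}
```

In a threaded implementation these flags would need to be protected by a mutex or similar synchronization, since they are written by the caller and read by the polling thread.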
int jobacct_gather_p_set_proctrack_container_id(uint64_t cont_id)
Description:
jobacct_gather_p_set_proctrack_container_id() is called at the start of
the slurmstepd, once the proctrack container id is known.
If a proctrack plugin is used to track processes, this sets the head
of the process tree in the plugin.
Arguments:
cont_id (input) proctrack container id.
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
int jobacct_gather_p_add_task(pid_t pid, uint16_t tid)
Description:
jobacct_gather_p_add_task() is used to add a task to the poller.
Arguments:
pid (input) Process id
tid (input) SLURM global task id
Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacctinfo_t *jobacct_gather_p_stat_task(pid_t pid)
Description:
jobacct_gather_p_stat_task() is used to get the most recent information about a task.
You need to free the information returned by this function!
Arguments:
pid (input) Process id
Returns:
jobacctinfo structure pointer on success, or
NULL on failure.
jobacctinfo_t *jobacct_gather_p_remove_task(pid_t pid)
Description:
jobacct_gather_p_remove_task() is used to remove a task from the poller.
You need to free the information returned by this function!
Arguments:
pid (input) Process id
Returns:
pointer to the removed jobacctinfo_t structure on success, or
NULL on failure.
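The ownership rules for the three task functions above can be illustrated with a small table keyed by pid. This is a sketch under stated assumptions: the fixed-size table, EXAMPLE_MAX_TASKS, find_slot(), and the structure fields are all hypothetical, and returning a copy from the stat function is one possible way to satisfy the requirement that the caller frees the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define SLURM_SUCCESS 0
#define SLURM_ERROR (-1)
#define EXAMPLE_MAX_TASKS 16 /* hypothetical limit for this sketch */

typedef struct {
    pid_t pid;
    uint16_t taskid;
    uint64_t max_rss; /* example statistic updated by the poller */
} jobacctinfo_t;

static jobacctinfo_t *task_table[EXAMPLE_MAX_TASKS];

/* Return the table index holding pid, or -1 if it is not tracked. */
static int find_slot(pid_t pid)
{
    for (int i = 0; i < EXAMPLE_MAX_TASKS; i++)
        if (task_table[i] != NULL && task_table[i]->pid == pid)
            return i;
    return -1;
}

int jobacct_gather_p_add_task(pid_t pid, uint16_t tid)
{
    if (pid <= 0 || find_slot(pid) >= 0)
        return SLURM_ERROR; /* invalid or already tracked */
    for (int i = 0; i < EXAMPLE_MAX_TASKS; i++) {
        if (task_table[i] == NULL) {
            jobacctinfo_t *info = calloc(1, sizeof(*info));
            if (info == NULL)
                return SLURM_ERROR;
            info->pid = pid;
            info->taskid = tid;
            task_table[i] = info;
            return SLURM_SUCCESS;
        }
    }
    return SLURM_ERROR; /* table full */
}

/* Return a copy of the current record; the caller must free it. */
jobacctinfo_t *jobacct_gather_p_stat_task(pid_t pid)
{
    int i = find_slot(pid);
    jobacctinfo_t *copy;

    if (i < 0)
        return NULL;
    copy = malloc(sizeof(*copy));
    if (copy != NULL)
        *copy = *task_table[i];
    return copy;
}

/* Detach the record from the table; the caller must free it. */
jobacctinfo_t *jobacct_gather_p_remove_task(pid_t pid)
{
    int i = find_slot(pid);
    jobacctinfo_t *removed;

    if (i < 0)
        return NULL;
    removed = task_table[i];
    task_table[i] = NULL;
    return removed;
}
```

Returning a copy from the stat function lets the poller keep updating its own record while the caller reads a stable snapshot.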
void jobacct_gather_p_2_sacct(sacct_t *sacct, jobacctinfo_t *jobacct)
Description:
jobacct_gather_p_2_sacct() is called to transfer information from data structure
jobacct to structure sacct.
Arguments:
sacct
(input/output) initial structure to be applied to.
jobacct
(input) jobacctinfo_t structure containing information to apply to sacct.
Returns:
none
Parameters
These parameters can be used in slurm.conf to select the type of plugin and to set the frequency at which information about running jobs is gathered.
- JobAcctGatherType
- Specifies which plugin should be used.
- JobAcctGatherFrequency
- Tells the plugin how long to wait between polls.
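For illustration, a slurm.conf fragment that selects the linux plugin listed in the Overview might look like the following. The value 30 is an arbitrary example, and reading it as a number of seconds is an assumption, since this document does not state the unit.

```
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
```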
Versioning
This document describes version 1 of the SLURM Job Accounting Gather API. Future releases of SLURM may revise this API. A job accounting gather plugin conveys its ability to implement a particular API version using the mechanism outlined for SLURM plugins.
Last modified 15 April 2011