SLURM Job Accounting Gather Plugin API

Overview

This document describes SLURM job accounting gather plugins and the API that defines them. It is intended as a resource to programmers wishing to write their own SLURM job accounting gather plugins. This is version 1 of the API.

SLURM job accounting gather plugins must conform to the SLURM Plugin API with the following specifications:

const char plugin_name[]="full text name"

A free-formatted ASCII text string that identifies the plugin.

const char plugin_type[]="major/minor"

The major type must be "jobacct_gather". The minor type can be any suitable name for the type of accounting package. We currently use

The sacct program can be used to display gathered data from regular accounting and from these plugins.

The programmer is urged to study src/plugins/jobacct_gather/linux and src/common/jobacct_common.c/.h for a sample implementation of a SLURM job accounting gather plugin.
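As an illustration of the two identification strings described above, a plugin source file might begin roughly as follows. This is a minimal sketch: the name text, the minor type "example", and the helper ex_has_major_type() are hypothetical and are not part of SLURM. The helper only checks that a type string has the required major type followed by a non-empty minor type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical identification strings for an illustrative plugin. */
const char plugin_name[] = "Job accounting gather example plugin";
const char plugin_type[] = "jobacct_gather/example";

/*
 * Hypothetical helper, not part of SLURM: return 1 if type has the form
 * "<major>/<minor>" with a non-empty minor part, 0 otherwise.
 */
int ex_has_major_type(const char *type, const char *major)
{
    size_t len;

    assert(type != NULL && major != NULL);
    len = strlen(major);
    if (strncmp(type, major, len) != 0)
        return 0;
    return type[len] == '/' && type[len + 1] != '\0';
}
```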

API Functions

All of the following functions are required. Functions which are not implemented must be stubbed.

jobacctinfo_t *jobacct_gather_p_create(jobacct_id_t *jobacct_id)

Description:
jobacct_gather_p_create() is used to allocate and initialize a new jobacctinfo structure.

The caller must free the structure returned by this function with jobacct_gather_p_destroy().

Arguments:
jobacct_id (input) id of the task. Send in (uint16_t)NO_VAL if there is no specific task.

Returns:
jobacctinfo structure pointer on success, or
NULL on failure.

void jobacct_gather_p_destroy(jobacctinfo_t *jobacct)

Description:
jobacct_gather_p_destroy() is used to free the allocation made by jobacct_gather_p_create().

Arguments:
jobacct (input) structure to be freed.

Returns:
none
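The pairing of the create and destroy calls can be sketched as below. This is not the SLURM implementation: the ex_ types are simplified stand-ins for jobacct_id_t and jobacctinfo_t, their fields are hypothetical, and EX_NO_VAL is a placeholder for (uint16_t)NO_VAL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_NO_VAL 0xfffe    /* placeholder for (uint16_t)NO_VAL */

/* Hypothetical stand-ins for the SLURM types; fields are illustrative. */
typedef struct {
    uint16_t taskid;        /* task id, or EX_NO_VAL if no specific task */
} ex_jobacct_id_t;

typedef struct {
    ex_jobacct_id_t id;
    uint32_t max_rss;       /* example counter, zeroed on creation */
} ex_jobacctinfo_t;

/* Allocate a zeroed record and remember which task it belongs to. */
ex_jobacctinfo_t *ex_create(const ex_jobacct_id_t *jobacct_id)
{
    ex_jobacctinfo_t *jobacct;

    assert(jobacct_id != NULL);
    jobacct = calloc(1, sizeof(*jobacct));
    if (!jobacct)
        return NULL;
    jobacct->id = *jobacct_id;
    return jobacct;
}

/* Release a record obtained from ex_create(). */
void ex_destroy(ex_jobacctinfo_t *jobacct)
{
    free(jobacct);          /* free(NULL) is a no-op */
}
```

Every successful ex_create() call in this sketch is expected to be matched by exactly one ex_destroy() call.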

int jobacct_gather_p_setinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)

Description:
jobacct_gather_p_setinfo() is called to set the values of a jobacctinfo_t to specific values based on inputs.

Arguments:
jobacct (input/output) structure to be altered.
type (input) enum of specific part of jobacct to alter.
data (input) corresponding data to set jobacct part to.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

int jobacct_gather_p_getinfo(jobacctinfo_t *jobacct, enum jobacct_data_type type, void *data)

Description:
jobacct_gather_p_getinfo() is called to get specific values from a jobacctinfo_t based on inputs.

Arguments:
jobacct (input) structure to be queried.
type (input) enum of specific part of jobacct to get.
data (output) corresponding data read from the jobacct part.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

void jobacct_gather_p_pack(jobacctinfo_t *jobacct, Buf buffer)

Description:
jobacct_gather_p_pack() packs a jobacctinfo_t into a buffer to send across the network.

Arguments:
jobacct (input) structure to pack.
buffer (input/output) buffer to pack structure into.

Returns:
none

int jobacct_gather_p_unpack(jobacctinfo_t *jobacct, Buf buffer)

Description:
jobacct_gather_p_unpack() unpacks a jobacctinfo_t from a buffer received from the network. The caller must free the jobacctinfo_t produced by this function.

Arguments:
jobacct (input/output) structure to fill.
buffer (input) buffer to unpack structure from.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
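The round trip through pack and unpack can be illustrated with a simplified buffer. This is only a sketch: ex_buf_t is a hypothetical fixed-size stand-in for SLURM's Buf type, the helper functions are not SLURM calls, and the unpack routine here fills a caller-supplied structure rather than allocating one. Values are written in network (big-endian) byte order so that both ends agree regardless of host architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_SUCCESS 0        /* stand-in for SLURM_SUCCESS */
#define EX_ERROR (-1)       /* stand-in for SLURM_ERROR */

/* Hypothetical fixed-size buffer standing in for SLURM's Buf type. */
typedef struct {
    unsigned char data[64];
    size_t size;            /* bytes written so far */
    size_t offset;          /* read position for unpacking */
} ex_buf_t;

typedef struct {
    uint32_t max_rss;
    uint32_t max_vsize;
} ex_jobacctinfo_t;

/* Append a 32-bit value in big-endian byte order. */
static void ex_pack32(uint32_t val, ex_buf_t *buf)
{
    assert(buf->size + 4 <= sizeof(buf->data));
    buf->data[buf->size++] = (unsigned char)(val >> 24);
    buf->data[buf->size++] = (unsigned char)(val >> 16);
    buf->data[buf->size++] = (unsigned char)(val >> 8);
    buf->data[buf->size++] = (unsigned char)val;
}

/* Read a 32-bit big-endian value; fail if too few bytes remain. */
static int ex_unpack32(uint32_t *val, ex_buf_t *buf)
{
    const unsigned char *p = buf->data + buf->offset;

    if (buf->size - buf->offset < 4)
        return EX_ERROR;
    *val = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    buf->offset += 4;
    return EX_SUCCESS;
}

/* Pack the fields in a fixed order. */
void ex_pack(const ex_jobacctinfo_t *jobacct, ex_buf_t *buf)
{
    ex_pack32(jobacct->max_rss, buf);
    ex_pack32(jobacct->max_vsize, buf);
}

/* Unpack the fields in the same order they were packed. */
int ex_unpack(ex_jobacctinfo_t *jobacct, ex_buf_t *buf)
{
    if (ex_unpack32(&jobacct->max_rss, buf) != EX_SUCCESS ||
        ex_unpack32(&jobacct->max_vsize, buf) != EX_SUCCESS)
        return EX_ERROR;
    return EX_SUCCESS;
}
```

The important property is that both routines visit the fields in the same order; a mismatch would silently swap values on the receiving side.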

void jobacct_gather_p_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)

Description:
jobacct_gather_p_aggregate() is called to aggregate and get max values from two different jobacctinfo structures.

Arguments:
dest (input/output) initial structure to be applied to.
from (input) new info to apply to dest.

Returns:
none

int jobacct_gather_p_startpoll(int frequency)

Description:
jobacct_gather_p_startpoll() is called at the start of the slurmstepd. It starts a thread that polls for information, which can then be queried at any time until the process ends. Put global initialization here.

Arguments:
frequency (input) poll frequency for polling thread.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

int jobacct_gather_p_endpoll()

Description:
jobacct_gather_p_endpoll() is called when the process is finished to stop the polling thread.

Arguments:
none

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

void jobacct_gather_p_suspend_poll(void)

Description:
jobacct_gather_p_suspend_poll() is called when the process is suspended. This causes the polling thread to halt until the process is resumed.

Arguments:
none

Returns:
none

void jobacct_gather_p_resume_poll(void)

Description:
jobacct_gather_p_resume_poll() is called when the process is resumed. This causes the polling thread to resume operation.

Arguments:
none

Returns:
none
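The interaction of the four polling calls can be sketched as a small state machine. To keep the example short, the thread itself is omitted: ex_poll_once() represents one iteration of the loop that the polling thread would run, sleeping for the configured interval between iterations. All names are hypothetical, rejecting a negative frequency is an assumption of this sketch, and a real plugin would need to protect this state with a mutex because it is shared between threads.

```c
#include <assert.h>

#define EX_SUCCESS 0        /* stand-in for SLURM_SUCCESS */
#define EX_ERROR (-1)       /* stand-in for SLURM_ERROR */

/* Hypothetical poller state; unsynchronized for brevity. */
static int ex_running;      /* set by startpoll, cleared by endpoll */
static int ex_suspended;    /* set by suspend_poll, cleared by resume_poll */
static int ex_frequency;    /* interval between polls */
static unsigned ex_samples; /* number of polls taken so far */

int ex_startpoll(int frequency)
{
    assert(!ex_running);    /* starting twice is treated as a caller bug */
    if (frequency < 0)
        return EX_ERROR;    /* assumption: negative intervals are invalid */
    ex_frequency = frequency;
    ex_suspended = 0;
    ex_samples = 0;
    ex_running = 1;
    /* A real plugin would create the polling thread here. */
    return EX_SUCCESS;
}

int ex_endpoll(void)
{
    if (!ex_running)
        return EX_ERROR;
    ex_running = 0;         /* the loop exits on its next iteration */
    return EX_SUCCESS;
}

void ex_suspend_poll(void)
{
    ex_suspended = 1;
}

void ex_resume_poll(void)
{
    ex_suspended = 0;
}

/* One loop iteration: returns 1 to keep looping, 0 to exit the thread. */
int ex_poll_once(void)
{
    if (!ex_running)
        return 0;
    if (!ex_suspended)
        ex_samples++;       /* stand-in for gathering process statistics */
    return 1;
}
```

While suspended, the loop keeps running but takes no samples, so resuming does not require restarting the thread.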

int jobacct_gather_p_set_proctrack_container_id(uint64_t cont_id)

Description:
jobacct_gather_p_set_proctrack_container_id() is called at the start of the slurmstepd, once the proctrack container id is known. If a proctrack plugin is used to track processes, this sets the head of the process tree in the plugin.

Arguments:
cont_id (input) proctrack container id.

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

int jobacct_gather_p_add_task(pid_t pid, uint16_t tid)

Description:
jobacct_gather_p_add_task() is used to add a task to the poller.

Arguments:
pid (input) Process id
tid (input) slurm global task id

Returns:
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.

jobacctinfo_t *jobacct_gather_p_stat_task(pid_t pid)

Description:
jobacct_gather_p_stat_task() is used to get the most recent information about a task. The caller must free the information returned by this function.

Arguments:
pid (input) Process id

Returns:
jobacctinfo structure pointer on success, or
NULL on failure.

jobacctinfo_t *jobacct_gather_p_remove_task(pid_t pid)

Description:
jobacct_gather_p_remove_task() is used to remove a task from the poller. The caller must free the information returned by this function.

Arguments:
pid (input) Process id

Returns:
Pointer to removed jobacctinfo_t structure on success, or
NULL on failure.
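The ownership rules of the three task calls (add, stat, remove) can be sketched with a simple linked list. This is an illustration only: the list, the structure fields, and the ex_ function names are hypothetical, and locking against the polling thread is omitted. In this sketch the stat call hands back a copy, while the remove call unlinks the tracked record and hands over the record itself; in both cases the caller frees the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define EX_SUCCESS 0        /* stand-in for SLURM_SUCCESS */
#define EX_ERROR (-1)       /* stand-in for SLURM_ERROR */

/* Hypothetical per-task record kept by the poller. */
typedef struct ex_jobacctinfo {
    pid_t pid;
    uint16_t tid;
    uint32_t max_rss;               /* would be updated by the poller */
    struct ex_jobacctinfo *next;
} ex_jobacctinfo_t;

static ex_jobacctinfo_t *ex_tasks;  /* head of the tracked-task list */

/* Start tracking a task. */
int ex_add_task(pid_t pid, uint16_t tid)
{
    ex_jobacctinfo_t *t;

    assert(pid > 0);                /* pids of real processes are positive */
    t = calloc(1, sizeof(*t));
    if (!t)
        return EX_ERROR;
    t->pid = pid;
    t->tid = tid;
    t->next = ex_tasks;
    ex_tasks = t;
    return EX_SUCCESS;
}

/* Return a copy of the record for pid; the caller frees it. */
ex_jobacctinfo_t *ex_stat_task(pid_t pid)
{
    ex_jobacctinfo_t *t, *copy;

    for (t = ex_tasks; t; t = t->next) {
        if (t->pid != pid)
            continue;
        copy = malloc(sizeof(*copy));
        if (copy) {
            *copy = *t;
            copy->next = NULL;      /* the copy is detached from the list */
        }
        return copy;
    }
    return NULL;
}

/* Unlink the record for pid and hand it to the caller, who frees it. */
ex_jobacctinfo_t *ex_remove_task(pid_t pid)
{
    ex_jobacctinfo_t **pp, *t;

    for (pp = &ex_tasks; (t = *pp) != NULL; pp = &t->next) {
        if (t->pid == pid) {
            *pp = t->next;
            t->next = NULL;
            return t;
        }
    }
    return NULL;
}
```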

void jobacct_gather_p_2_sacct(sacct_t *sacct, jobacctinfo_t *jobacct)

Description:
jobacct_gather_p_2_sacct() is called to transfer information from data structure jobacct to structure sacct.

Arguments:
sacct (input/output) initial structure to be applied to.
jobacct (input) jobacctinfo_t structure containing information to apply to sacct.

Returns:
none

Parameters

These parameters can be used in slurm.conf to select the plugin type and the frequency at which information about running jobs is gathered.

JobAcctGatherType
Specifies which plugin should be used.
JobAcctGatherFrequency
Specifies the time interval, in seconds, between polls by the plugin.
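For illustration, the two parameters might appear in slurm.conf as shown below. The plugin type refers to the linux plugin named in the overview, and the interval of 30 is only an example value.

```
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
```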

Versioning

This document describes version 1 of the SLURM Job Accounting Gather API. Future releases of SLURM may revise this API. A job accounting gather plugin conveys its ability to implement a particular API version using the mechanism outlined for SLURM plugins.

Last modified 15 April 2011