13. C API Reference

This chapter describes the API for ArrayObjects and Ufuncs.

ArrayObject C Structure and API
Structures

The PyArrayObject is, like all Python types, a kind of PyObject. Its definition is:

typedef struct {
    PyObject_HEAD
    char *data;
    int nd;
    int *dimensions, *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
} PyArrayObject;

Where PyObject_HEAD is the standard PyObject header, and the other fields are:

char * data

A pointer to the first data element of the array.

int nd

The number of dimensions in the array.

int * dimensions

A pointer to an array of nd integers, describing the number of elements along each dimension. The sizes are in the conventional order, so that for any array a, a.shape == (dimensions[0], dimensions[1], ..., dimensions[nd-1]).

int * strides

A pointer to an array of nd integers, describing the address offset between two successive data elements along each dimension. Note that strides can also be negative! Each number gives the number of bytes to add to a pointer to get to the next element in that dimension. For example, if myptr currently points to an element in a rank-5 array at indices 1,0,5,3,2 and you want it to point to element 1,0,5,4,2 then you should add strides[3] to the pointer: myptr += strides[3] . This works even if (and is especially useful when) the array is not contiguous in memory.

PyObject * base

Used internally in arrays that are created as slices of other arrays. Since the new array shares its data area with the old one, the original array's reference count is incremented. When the subarray is garbage collected, the base array's reference count is decremented.

PyArray_Descr * descr

See below.

int flags

A bitfield indicating whether the array is contiguous and whether it owns its dimensions, its strides, and its data area.

The ownership bits are used by NumPy internally to manage memory allocation and deallocation. They can be false if the array is the result of e.g. a slicing operation on an existing array.
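The stride arithmetic described above (for example myptr += strides[3]) generalizes to computing the address of any element from its indices. The following is a minimal sketch, not code from the library: ArrayView is a hypothetical stand-in that carries only the data, nd, dimensions, and strides fields, and element_ptr is an illustrative helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PyArrayObject fields used here; the real
   struct also carries PyObject_HEAD, base, descr, and flags. */
typedef struct {
    char *data;        /* first data element */
    int nd;            /* number of dimensions */
    int *dimensions;   /* nd extents */
    int *strides;      /* nd byte offsets, possibly negative */
} ArrayView;

/* Return the address of the element at the given indices by adding
   index * stride (in bytes) for every dimension. */
static char *element_ptr(const ArrayView *a, const int *indices)
{
    char *p = a->data;
    int i;
    for (i = 0; i < a->nd; i++) {
        assert(indices[i] >= 0 && indices[i] < a->dimensions[i]);
        p += (ptrdiff_t)indices[i] * a->strides[i];
    }
    return p;
}
```

Because only the strides are consulted, the same helper works for contiguous arrays and for non-contiguous views such as a transpose, where the stride order is swapped.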

PyArray_Descr *descr

a pointer to a data structure that describes the array and has some handy functions. The slots in this structure are:

PyArray_VectorUnaryFunc *cast[]

an array of function pointers which will cast this arraytype to each of the other data types.

PyArray_GetItemFunc *getitem

a pointer to a function which returns a PyObject of the appropriate type given a (char) pointer to the data to get.

PyArray_SetItemFunc *setitem

a pointer to a function which sets the element pointed to by the second argument to the converted Python object given as the first argument.

int type_num

A number indicating the datatype of the array (i.e. a PyArray_XXXX )

char *one

A pointer to a representation of one for this datatype.

char *zero

A pointer to a representation of zero for this datatype (especially useful for PyArray_OBJECT types)

char type

A character representing the array's typecode (one of 'cb1silfdFDO' ).

The ArrayObject API

In the following, op is a pointer to a PyObject and arp is a pointer to a PyArrayObject. Routines which return PyObject * return NULL to indicate failure (and follow the standard exception-setting mechanism). Functions whose signature is followed by a | marker return PyObjects whose reference count has been increased by one (new references). See the Python Extending/Embedding manual for details on reference-count management.

int PyArray_Check(op)

returns 1 if op is a PyArrayObject or 0 if it is not.

int PyArray_SetNumericOps(d)

used internally by umath to set up some of its functions.

int PyArray_INCREF(op)

Used for arrays of python objects ( PyArray_OBJECT ) to increment the reference count of every python object in the array op . User code does not typically need to call this.

int PyArray_XDECREF(op)

Used for arrays of python objects ( PyArray_OBJECT ) to decrement the reference count of every python object in the array op .

PyArrayError

Exports the array error object. I don't know its use.

void PyArray_SetStringFunction(op,repr)

Sets the function used to represent all arrays to op, which should be a callable PyObject. If repr is non-zero, the function for the repr string representation is set; otherwise, the function for the str string representation is set.

PyArray_Descr *PyArray_DescrFromType(type)

returns a pointer to the PyArray_Descr structure for the datatype given by type. The input type can be either one of the enumerated types (PyArray_FLOAT, etc.) or a character (one of 'cb1silfdFDO').

PyObject *PyArray_Cast(arp, type) |

returns a pointer to a PyArrayObject that is arp cast to the array type specified by type . It is just a wrapper around the function defined in arp->descr->cast that handles non-contiguous arrays and arrays of Python objects appropriately.

int PyArray_CanCastSafely(fromtype,totype)

returns 1 if an array of type fromtype can be cast to an array of type totype without loss of accuracy, otherwise it returns 0. It allows conversion of longs to ints, which is not safe on 64-bit machines. The inputs fromtype and totype are the enumerated array types (e.g. PyArray_SBYTE).

int PyArray_ObjectType(op, min_type)

returns the typecode to use for a call to an array creation function given an input python sequence object op and a minimum type value, min_type . It looks at the datatypes used in op , compares this with min_type and returns a consistent type value that can be used to store all of the data in op and satisfying at the minimum the precision of min_type .

int _PyArray_multiply_list(list,n)

is a utility routine that multiplies together the n integers pointed to by list and returns their product.
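Multiplying a list of dimension sizes is how a total element count is typically obtained from a shape. The sketch below is an illustrative re-implementation under assumptions, not the library's source: the name multiply_list is hypothetical, and returning 1 for an empty list is an assumed convention.

```c
/* Hypothetical re-implementation for illustration only.
   Returns the product of the n integers in list; an empty list
   yields 1 (an assumption, matching the usual empty-product rule). */
static int multiply_list(const int *list, int n)
{
    int product = 1;
    int i;
    for (i = 0; i < n; i++) {
        product *= list[i];
    }
    return product;
}
```

Applied to the dimensions of a 2 x 3 x 4 array, such a routine gives the 24 elements the array holds.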

int PyArray_Size(op)

is a useful function for returning the total number of elements in op if op is a PyArrayObject , 0 otherwise.

PyObject *PyArray_FromDims(nd,dims,type) |

returns a pointer to a newly constructed PyArrayObject (returned as a PyObject ) given the number of dimensions in nd , an array dims of nd integers specifying the size of the array, and the enumerated type of the array in type .

PyObject *PyArray_FromDimsAndData(nd,dims,type,data) |

This function should only be used to access global data that will never be freed (like FORTRAN common blocks). It builds a PyArrayObject in the same way as PyArray_FromDims but instead of allocating new memory for the array elements it uses the bytes pointed to by data (a char * ).

PyObject *PyArray_ContiguousFromObject(op,type,min_dim,max_dim) |

returns a contiguous array of type type from the (possibly nested) sequence object op. If op is a contiguous PyArrayObject then a reference is made; if op is a non-contiguous PyArrayObject then a copy is performed to get a contiguous array; if op is not a PyArrayObject then a new PyArrayObject is created from the sequence object and returned. The two parameters min_dim and max_dim let you specify the expected rank of the input sequence. An error will result if the resulting PyArrayObject does not have rank bounded by these limits. To specify an exact rank requirement set min_dim = max_dim. To allow for an arbitrary number of dimensions specify min_dim = max_dim = 0.
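The rank check implied by min_dim and max_dim can be sketched as a small predicate. This is a sketch under an assumption that goes slightly beyond the text: each bound is treated as disabled when it is 0, which covers the documented min_dim = max_dim = 0 case. The function name rank_in_bounds is hypothetical.

```c
/* Returns 1 if rank nd satisfies the requested bounds, 0 otherwise.
   Assumption: a bound of 0 disables that side of the check, so
   min_dim = max_dim = 0 accepts any rank. */
static int rank_in_bounds(int nd, int min_dim, int max_dim)
{
    if (min_dim != 0 && nd < min_dim) {
        return 0;   /* too few dimensions */
    }
    if (max_dim != 0 && nd > max_dim) {
        return 0;   /* too many dimensions */
    }
    return 1;
}
```

With min_dim = max_dim = 2 only rank-2 input passes, which is the exact-rank usage described above.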

PyObject *PyArray_CopyFromObject(op,type,min_dim,max_dim) |

returns a contiguous array similar to PyArray_ContiguousFromObject except that a copy of op is performed even if a shared array could have been used.

PyObject *PyArray_FromObject(op,type,min_dim,max_dim) |

returns a reference to op if op is a PyArrayObject and a newly constructed PyArrayObject if op is any other (nested) sequence object. You must use strides to access the elements of this possibly discontiguous array correctly.

PyObject *PyArray_Return(arp)

returns a pointer to arp with some extra code to check for errors and be sure that zero-dimensional arrays are returned as scalars. If a scalar is returned instead of arp then arp's reference count is decremented, so it is safe to use this function in the form:

    return PyArray_Return(arp);

PyObject *PyArray_Reshape(arp,op) |

returns a reference to arp with a new shape specified by op, which must be a one-dimensional sequence object. One dimension may be specified as unknown by giving a value less than zero; its value will be calculated from the size of arp.
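The calculation of an unknown dimension can be sketched as follows: divide the total element count by the product of the known dimensions. This is an illustrative sketch, not the library's implementation; the name infer_unknown_dim and its error conventions (returning -1) are assumptions.

```c
/* Fill in at most one negative entry of shape[] so that the product of
   all n entries equals total. Returns 0 on success and -1 if more than
   one entry is negative or the sizes cannot be made to match. */
static int infer_unknown_dim(int *shape, int n, int total)
{
    int known = 1;      /* product of the non-negative entries */
    int unknown = -1;   /* index of the negative entry, if any */
    int i;
    for (i = 0; i < n; i++) {
        if (shape[i] < 0) {
            if (unknown >= 0) {
                return -1;          /* more than one unknown dimension */
            }
            unknown = i;
        } else {
            known *= shape[i];
        }
    }
    if (unknown < 0) {
        return known == total ? 0 : -1;
    }
    if (known == 0 || total % known != 0) {
        return -1;                  /* no integer size fits */
    }
    shape[unknown] = total / known;
    return 0;
}
```

For a 12-element array and a requested shape of (2, -1), the sketch fills in 6 for the second dimension.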

PyObject *PyArray_Copy(arp) |

returns an element-for-element copy of arp.

PyObject *PyArray_Take(a,indices,axis) |

the equivalent of take(a, indices, axis), a function defined in the Numeric module that simply calls this routine.

int PyArray_As1D(PyObject **op, char **ptr, int *n, int type)

This function replaces op with a pointer to a contiguous 1-D PyArrayObject (using PyArray_ContiguousFromObject ) and sets as output parameters a pointer to the first byte of the array in ptr and the number of elements in the array in n . It returns -1 on failure ( op is not a 1-D array or sequence object that can be cast to type type ) and 0 on success.

int PyArray_As2D(PyObject **op, char **ptr, int *m, int *n, int type)

This function replaces op with a pointer to a contiguous 2-D PyArrayObject (using PyArray_ContiguousFromObject ). It returns -1 on failure (op is not a 2-D array or nested sequence object that can be cast to type type) and 0 on success. It also sets as output parameters: an array of pointers in ptr which can be used to access the data as a 2-D array so that ptr[i][j] is a pointer to the first byte of element [i,j] in the array; m and n are set to respectively the number of rows and columns of the array.

int PyArray_Free(op,ptr)

is supposed to free the allocated data structures and decrease object references when using PyArray_As1D and PyArray_As2D but there are suspicions that this code is buggy.

Notes

Number formats, overflow issues, NaN/Inf representations, fpectl module, how to deal with 'missing' values.

UfuncObject C Structure and API
C Structure

The ufuncobject is a generic function object that can be used to perform fast operations over Numeric Arrays with very useful broadcasting rules and type conversions performed automatically. The ufuncobject and its API make it easy and graceful to add arbitrary functions to Python which operate over Numeric arrays. All of the unary and binary operators currently available in the Numerical extensions (like sin, cos, +, logical_or, etc.) are implemented using this object. The hooks are all in place to make it very easy to add any function that takes one or two (double) arguments and returns a single (double) argument. It is not difficult to add support routines in order to handle arbitrary functions whose total number of input/output arguments is less than some maximum number (currently 10).

typedef struct {
    PyObject_HEAD
    int *ranks, *canonical_ranks;
    int nin, nout, nargs;
    int identity;
    PyUFuncGenericFunction *functions;
    void **data;
    int ntypes, nranks, attributes;
    char *name, *types;
    int check_return;
} PyUFuncObject;

where:

int *ranks

unused.

int *canonical_ranks

unused.

int nin

the number of input arguments to the function

int nout

the number of output arguments for the function

int nargs

the total number of arguments = nin + nout

int identity

a flag telling whether the identity for this function is 0 or 1 for use in the reduce method for a zero size array input.

PyUFuncGenericFunction *functions

an array of functions that perform the innermost looping over the input and output arrays (I think this is over a single axis). These functions call the underlying math function with the data from the input arguments along this axis and return the outputs of the function into the correct place in the output arrayobject (with appropriate typecasting). These functions are called by the general looping code. There is one function for each of the supported datatypes. Function pointers to do this looping for types 'f' , 'd' , 'F' , and 'D' , are provided in the C-API for functions that take one or two arguments and return one argument. Each PyUFuncGenericFunction returns void and has the following argument list (in order):

args

an array of pointers to the data for each of the input and output arguments with input arguments first and output arguments immediately following. Each element of args is a char * to the first byte in the corresponding input or output array.

dimensions

a pointer to a single int giving the size of the axis being looped over.

steps

an array of int s giving the number of bytes to skip to go to the next element of the array for this loop. There is an entry in the array for each of the input and output arguments, with input arguments first and output arguments immediately following.

func

a pointer to the underlying math function to be called at each point in this inner loop. This is a void * and must be recast to the required type before actually calling the function (e.g. to a pointer to a function that takes two doubles and returns a double). If you need to write your own PyUFuncGenericFunction, it helps to add a typedef statement that defines your specific underlying function type, so that the function pointer cast stays readable.
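The argument list above (args, dimensions, steps, func) can be illustrated with a hand-written inner loop for a function taking one double and returning one double. This is a minimal sketch and not one of the loops shipped in the C-API: the names DoubleUnaryFunc, loop_d_d, and square are hypothetical, and it assumes a platform where a function pointer survives a round trip through void * (true on POSIX systems, though not guaranteed by ISO C).

```c
/* Hypothetical typedef for the underlying math function, so that the
   cast from void * stays readable. */
typedef double (*DoubleUnaryFunc)(double);

/* Inner loop with the argument list described above.
   args[0] is the input, args[1] the output; steps[] gives the byte
   increment for each of them; dimensions[0] is the loop length. */
static void loop_d_d(char **args, int *dimensions, int *steps, void *func)
{
    char *in = args[0];
    char *out = args[1];
    int n = dimensions[0];
    DoubleUnaryFunc f = (DoubleUnaryFunc)func;
    int i;
    for (i = 0; i < n; i++) {
        *(double *)out = f(*(double *)in);
        in += steps[0];    /* advance by bytes, not by elements */
        out += steps[1];
    }
}

/* Example underlying math function (hypothetical). */
static double square(double x)
{
    return x * x;
}
```

Stepping by bytes rather than by elements is what lets the same loop serve non-contiguous inputs: the caller only has to supply a different steps entry.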

void **data

a pointer to an array of functions (each cast to void *) that compute the actual mathematical function for each set of inputs and outputs. There should be a function in the array for each supported data type. This function will be called from the PyUFuncGenericFunction for the corresponding type.

int ntypes

the number of datatypes supported by this function. For datatypes that are not directly supported, a coercion will be performed if it can be done safely; otherwise an error will be reported.

int nranks

unused.

int attributes

unused.

char *name

the name of this function (not the same as the dictionary label for this function object, but it is usually set to the same string). It is printed when __repr__ is called for this object, and defaults to "?" if set to NULL.

char *types

an array of supported types for this function object. I'm not sure why but each supported datatype ( PyArray_FLOAT , etc.) is entered as many times as there are arguments for this function. ( nargs )

int check_return

Usually best to set to 1. If this is non-zero then returned matrices will be cleaned up so that rank-0 arrays will be returned as python scalars. Also, if non-zero, then any math error that sets the errno global variable will cause an appropriate Python exception to be raised.

UfuncObject C API

There are currently 15 pointers in the C-API array for the ufuncobject, which is loaded by import_ufunc(). The macros implemented by this API, available by including the file ufuncobject.h, are given below. The only function normally called by user code is the ufuncobject creation function PyUFunc_FromFuncAndData. Some of the other functions can be used as elements of an array to be passed to this creation function.

int PyUFunc_Check(op)

returns 1 if op is a ufunc object otherwise returns 0 .

PyObject *PyUFunc_FromFuncAndData(functions, data, types, ntypes, nin, nout, identity, name, check_return)

returns the ufunc object given its parameters. This is the most important function call. It requires defining three arrays to be passed as parameters: functions , data , and types . The arguments to be passed are:

functions

an array of functions of type PyUFuncGenericFunction. There should be one function for each supported datatype. The functions should be in order so that datatypes listed toward the beginning of the array could be cast as datatypes listed toward the end.

data

an array of void * pointers, the same size as the functions array and in the same datatype order. Each element of this array is the actual underlying math function (recast to a void *) that will be called from one of the PyUFuncGenericFunctions. It will operate on each element of the input NumPy arrayobject(s) and return its element-by-element result in the output NumPy arrayobject(s). There is one function for each datatype supported (though functions can be repeated if you handle the typecasting appropriately with the PyUFuncGenericFunction).

types

an array of PyArray_Type s. The size of this array should be ( nin+nout ) times the size of one of the previous two arrays. There should be nin+nout copies of PyArray_XXXXX for each datatype explicitly supported. (Remember datatypes not explicitly supported will still be accepted as input arguments to the ufunc if they can be cast safely to a supported type.)

ntypes

the number of supported types for this ufunc.

nin

the number of input arguments

nout

the number of output arguments

identity

PyUFunc_One , PyUFunc_Zero , or PyUFunc_None , depending on the desired value for the identity. This is only relevant for functions that take two input arguments and return one output argument. If not relevant use PyUFunc_None .

name

the name of this ufuncobject for use in the __repr__ method.

check_return

the desired value for check_return for this ufuncobject.
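The layout of the types argument is easiest to see written out. The sketch below shows it for a unary function (nin = 1, nout = 1) supporting two datatypes, with one signature of nin + nout entries per datatype. It is an illustration under assumptions: SKETCH_FLOAT and SKETCH_DOUBLE are hypothetical stand-ins for the PyArray_FLOAT and PyArray_DOUBLE type numbers, and find_signature is an illustrative helper, not the lookup the library performs.

```c
/* Hypothetical stand-ins for the enumerated array type numbers. */
enum { SKETCH_FLOAT = 0, SKETCH_DOUBLE = 1 };

enum { NIN = 1, NOUT = 1, NTYPES = 2 };

/* One signature of NIN + NOUT entries per supported datatype,
   ordered so that earlier types can be cast to later ones. */
static const char sketch_types[NTYPES * (NIN + NOUT)] = {
    SKETCH_FLOAT,  SKETCH_FLOAT,    /* float  in, float  out */
    SKETCH_DOUBLE, SKETCH_DOUBLE    /* double in, double out */
};

/* Return the index of the first signature whose input entries all
   equal type, or -1 if there is none. The same index would select
   the matching entries of the functions and data arrays. */
static int find_signature(const char *types, int ntypes,
                          int nin, int nout, int type)
{
    int k, j;
    for (k = 0; k < ntypes; k++) {
        const char *sig = types + k * (nin + nout);
        for (j = 0; j < nin; j++) {
            if (sig[j] != type) {
                break;
            }
        }
        if (j == nin) {
            return k;
        }
    }
    return -1;
}
```

The functions and data arrays would each hold NTYPES entries in the same order, so index 0 pairs the float loop with its math function and index 1 pairs the double loop with its math function.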

int PyUFunc_GenericFunction(self,args,mps)

allows calling the ufunc from a user C routine. It returns 0 on success and -1 on any failure. This is the core of what happens when a ufunc is called from Python. Its arguments are:

self

the ufunc object to be called. INPUT

args

a Python tuple object containing the input arguments to the ufunc (should be Python sequence objects). INPUT

mps

an array of pointers to PyArrayObjects for the input and output arguments to this function. The input NumPy arrays are elements mps[0]...mps[self->nin-1] . The output NumPy arrays are elements mps[self->nin]...mps[self->nargs-1] . OUTPUT

The following are all functions of type PyUFuncGenericFunction and are suitable for use in the functions argument passed to PyUFunc_FromFuncAndData :

PyUFunc_f_f_As_d_d

for a unary function that takes a double input and returns a double output as a ufunc that takes PyArray_FLOAT input and returns PyArray_FLOAT output.

PyUFunc_d_d

for a unary function that takes a double input and returns a double output as a ufunc that takes PyArray_DOUBLE input and returns PyArray_DOUBLE output.

PyUFunc_F_F_As_D_D

for a unary function that takes a Py_complex input and returns a Py_complex output as a ufunc that takes PyArray_CFLOAT input and returns PyArray_CFLOAT output.

PyUFunc_D_D

for a unary function that takes a Py_complex input and returns a Py_complex output as a ufunc that takes PyArray_CDOUBLE input and returns PyArray_CDOUBLE output.

PyUFunc_O_O

for a unary function that takes a PyObject * input and returns a PyObject * output as a ufunc that takes PyArray_OBJECT input and returns PyArray_OBJECT output.

PyUFunc_ff_f_As_dd_d

for a binary function that takes two double inputs and returns one double output as a ufunc that takes PyArray_FLOAT input and returns PyArray_FLOAT output.

PyUFunc_dd_d

for a binary function that takes two double inputs and returns one double output as a ufunc that takes PyArray_DOUBLE input and returns PyArray_DOUBLE output.

PyUFunc_FF_F_As_DD_D

for a binary function that takes two Py_complex inputs and returns a Py_complex output as a ufunc that takes PyArray_CFLOAT input and returns PyArray_CFLOAT output.

PyUFunc_DD_D

for a binary function that takes two Py_complex inputs and returns a Py_complex output as a ufunc that takes PyArray_CDOUBLE input and returns PyArray_CDOUBLE output.

PyUFunc_OO_O

for a binary function that takes two PyObject * inputs and returns a PyObject * output as a ufunc that takes PyArray_OBJECT input and returns PyArray_OBJECT output.

PyUFunc_O_O_method

for a unary function that takes a PyObject * input and returns a PyObject * output and is pointed to by a Python method as a ufunc that takes PyArray_OBJECT input and returns PyArray_OBJECT output.

PyArrayMap

an exported API that was apparently considered but never implemented, probably because the functionality is already available with Python's map function.
