This chapter describes the API for ArrayObjects and Ufuncs.
The PyArrayObject is, like all Python types, a kind of PyObject. Its definition is:
Where PyObject_HEAD is the standard PyObject header, and the other fields are:
A pointer to an array of nd integers, describing the number of elements along each dimension. The sizes are in the conventional order, so that for any array a, a.shape==(dimensions[0], dimensions[1], ..., dimensions[nd-1]).
A pointer to an array of nd integers, describing the address offset between two successive data elements along each dimension. Note that strides can also be negative! Each number gives the number of bytes to add to a pointer to get to the next element in that dimension. For example, if myptr currently points to an element in a rank-5 array at indices 1,0,5,3,2 and you want it to point to element 1,0,5,4,2 then you should add strides[3] to the pointer: myptr += strides[3]. This works even if (and is especially useful when) the array is not contiguous in memory.
Used internally in arrays that are created as slices of other arrays. Since the new array shares its data area with the old one, the original array's reference count is incremented. When the subarray is garbage collected, the base array's reference count is decremented.
A bitfield indicating whether the array:
The ownership bits are used by NumPy internally to manage memory allocation and deallocation. They can be false if the array is the result of e.g. a slicing operation on an existing array.
a pointer to a data structure that describes the array and has some handy functions. The slots in this structure are:
an array of function pointers which will cast this arraytype to each of the other data types.
a pointer to a function which returns a PyObject of the appropriate type given a (char) pointer to the data to get.
a pointer to a function which sets the element pointed to by the second argument to the converted value of the Python object given as the first argument.
A pointer to a representation of zero for this datatype (especially useful for PyArray_OBJECT types).
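The stride arithmetic described above for the strides field can be sketched without the Numeric headers. The MockArray struct and the mock_ functions below are hypothetical stand-ins invented for illustration; they are not the real PyArrayObject or part of the Numeric API. The sketch assumes the strides are byte offsets, as the text states, and shows that the same address computation works for negative strides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields discussed above;
   this is NOT the real PyArrayObject. */
typedef struct {
    char *data;       /* first byte of the element at index (0, 0, ...) */
    int nd;           /* number of dimensions */
    int *dimensions;  /* number of elements along each dimension */
    int *strides;     /* byte offset between neighbours along each dimension */
} MockArray;

/* Return a pointer to the element at the given indices by adding
   index * stride for every dimension to the data pointer. */
char *mock_element_ptr(const MockArray *a, const int *indices)
{
    char *p = a->data;
    int i;
    for (i = 0; i < a->nd; i++) {
        assert(indices[i] >= 0 && indices[i] < a->dimensions[i]);
        p += (ptrdiff_t)indices[i] * a->strides[i];
    }
    return p;
}

/* Read element [row, col] of a 2-D array whose elements are doubles. */
double mock_get2(const MockArray *a, int row, int col)
{
    int idx[2];
    idx[0] = row;
    idx[1] = col;
    return *(double *)mock_element_ptr(a, idx);
}
```

A column-reversed view of a buffer can be described by pointing data at the last element of the first row and negating the stride of the last dimension; no data is copied.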
In the following, op is a pointer to a PyObject and arp is a pointer to a PyArrayObject. Routines which return PyObject * return NULL to indicate failure (and follow the standard exception-setting mechanism). Functions followed by a dagger (†) are functions which return PyObjects whose reference count has been increased by one (new references). See the Python Extending/Embedding manual for details on reference-count management.
Used for arrays of Python objects (PyArray_OBJECT) to increment the reference count of every Python object in the array op. User code does not typically need to call this.
Used for arrays of Python objects (PyArray_OBJECT) to decrement the reference count of every Python object in the array op.
Sets the function for representation of all arrays to op, which should be a callable PyObject. If repr is non-zero then the function for the repr string representation is set; otherwise, that for the str string representation is set.
returns a PyArray_Descr structure for the datatype given by type. The input type can be either one of the enumerated types (PyArray_FLOAT, etc.) or a character code (one of 'cb1silfdFDO').
returns a pointer to a PyArrayObject that is arp cast to the array type specified by type. It is just a wrapper around the function defined in arp->descr->cast that handles non-contiguous arrays and arrays of Python objects appropriately.
returns 1 if the array with type fromtype can be cast to an array of type totype without loss of accuracy, otherwise it returns 0. It allows conversion of longs to ints, which is not safe on 64-bit machines. The inputs fromtype and totype are the enumerated array types (e.g. PyArray_SBYTE).
returns the typecode to use for a call to an array creation function given an input Python sequence object op and a minimum type value, min_type. It looks at the datatypes used in op, compares them with min_type, and returns a consistent type value that can store all of the data in op while satisfying at least the precision of min_type.
is a utility routine that multiplies together the n integers in the array pointed to by list and returns their product.
returns the total number of elements in op if op is a PyArrayObject, and 0 otherwise.
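The two routines above are closely related: the total number of elements in an array is the product of its dimension sizes. A minimal sketch of that product follows. The name mock_multiply_list is hypothetical and is not the library routine, and returning 1 for an empty list is an assumption made here so that a rank-0 array counts as one element.

```c
#include <assert.h>

/* Hypothetical stand-in for the utility routine described above:
   returns the product of the n integers pointed to by list.
   An empty list yields 1 (an assumption of this sketch). */
int mock_multiply_list(const int *list, int n)
{
    int product = 1;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        product *= list[i];
    return product;
}
```

Passing an array's dimensions and nd to such a routine gives its element count, so a 2 by 3 by 4 array has 24 elements.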
returns a pointer to a newly constructed PyArrayObject (returned as a PyObject) given the number of dimensions in nd, an array dims of nd integers specifying the size of the array, and the enumerated type of the array in type.
This function should only be used to access global data that will never be freed (like FORTRAN common blocks). It builds a PyArrayObject in the same way as PyArray_FromDims but instead of allocating new memory for the array elements it uses the bytes pointed to by data (a char *).
returns a contiguous array of type type from the (possibly nested) sequence object op. If op is a contiguous PyArrayObject then a reference is made; if op is a non-contiguous PyArrayObject then a copy is performed to get a contiguous array; if op is not a PyArrayObject then a new PyArrayObject is created from the sequence object and returned. The two parameters min_dim and max_dim let you specify the expected rank of the input sequence. An error will result if the resulting PyArrayObject does not have rank bounded by these limits. To specify an exact rank requirement set min_dim = max_dim. To allow for an arbitrary number of dimensions specify min_dim = max_dim = 0.
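Whether a copy is needed depends on whether the array is already contiguous. The following is a minimal sketch of one way to test that from the dimensions and strides alone, assuming C (row-major) ordering with the last index varying fastest and strides measured in bytes. The function name and its itemsize parameter are hypothetical, and this is not how the library itself records contiguity.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the strides describe a densely
   packed, row-major block of items of itemsize bytes, 0 otherwise.
   Simplification: a dimension of length 1 must still carry the
   "expected" stride to be accepted. */
int mock_is_contiguous(int nd, const int *dimensions,
                       const int *strides, int itemsize)
{
    int expected = itemsize;
    int i;
    assert(nd >= 0 && itemsize > 0);
    /* Walk from the fastest-varying (last) dimension outward. */
    for (i = nd - 1; i >= 0; i--) {
        if (strides[i] != expected)
            return 0;
        expected *= dimensions[i];
    }
    return 1;
}
```

A transposed view or a slice that skips every other element fails this test, which is the case in which a contiguous copy would have to be made.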
returns a contiguous array similar to PyArray_ContiguousFromObject except that a copy of op is performed even if a shared array could have been used.
returns a reference to op if op is a PyArrayObject and a newly constructed PyArrayObject if op is any other (nested) sequence object. You must use strides to access the elements of this possibly discontiguous array correctly.
returns a pointer to apr with some extra code to check for errors and be sure that zero-dimensional arrays are returned as scalars. If a scalar is returned instead of apr then apr's reference count is decremented, so it is safe to use this function in the form:
return PyArray_Return(apr);
returns a reference to apr with a new shape specified by op, which must be a one-dimensional sequence object. One dimension may be specified as unknown by giving a value less than zero; its value will be calculated from the size of apr.
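The rule for the unknown dimension can be sketched as plain integer arithmetic: the missing size is the total element count divided by the product of the known sizes. The function below is a hypothetical illustration, not the library's code, and its error conventions (returning -1 for more than one unknown entry or for a size that does not divide evenly) are assumptions of this sketch.

```c
#include <assert.h>

/* Hypothetical helper: fill in at most one negative entry of shape so
   that the product of all n entries equals total. Returns 0 on success
   and -1 if the shape cannot describe total elements. */
int mock_resolve_shape(int *shape, int n, int total)
{
    int known = 1;     /* product of the explicitly given sizes */
    int unknown = -1;  /* position of the negative entry, if any */
    int i;
    assert(n >= 0 && total >= 0);
    for (i = 0; i < n; i++) {
        if (shape[i] < 0) {
            if (unknown >= 0)
                return -1;  /* more than one unknown dimension */
            unknown = i;
        } else {
            known *= shape[i];
        }
    }
    if (unknown < 0)
        return known == total ? 0 : -1;
    if (known == 0 || total % known != 0)
        return -1;
    shape[unknown] = total / known;
    return 0;
}
```

For an array of 12 elements, a requested shape of (-1, 4) resolves to (3, 4), while (5, -1) is rejected because 12 is not a multiple of 5.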
is the equivalent of take(a, indices, axis), a function defined in the Numeric module that just calls this function.
This function replaces op with a pointer to a contiguous 1-D PyArrayObject (using PyArray_ContiguousFromObject) and sets as output parameters a pointer to the first byte of the array in ptr and the number of elements in the array in n. It returns -1 on failure (op is not a 1-D array or sequence object that can be cast to type type) and 0 on success.
This function replaces op with a pointer to a contiguous 2-D PyArrayObject (using PyArray_ContiguousFromObject). It returns -1 on failure (op is not a 2-D array or nested sequence object that can be cast to type type) and 0 on success. It also sets as output parameters: an array of pointers in ptr which can be used to access the data as a 2-D array so that ptr[i][j] is a pointer to the first byte of element [i,j] in the array; m and n are set to respectively the number of rows and columns of the array.
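The array of pointers used for 2-D access can be pictured as a table with one entry per row of a contiguous buffer. The sketch below builds such a table by hand. The function name and the itemsize parameter are hypothetical, and the real routine may allocate and lay out its table differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: build a table of m row pointers into a
   contiguous row-major buffer of m * n items of itemsize bytes each.
   The caller releases the table with free(); the data buffer is not
   copied and is not owned by the table. Returns NULL on failure. */
char **mock_row_pointers(char *data, int m, int n, int itemsize)
{
    char **rows;
    int i;
    assert(m >= 0 && n >= 0 && itemsize > 0);
    rows = (char **)malloc((size_t)(m > 0 ? m : 1) * sizeof(char *));
    if (rows == NULL)
        return NULL;
    for (i = 0; i < m; i++)
        rows[i] = data + (size_t)i * (size_t)n * (size_t)itemsize;
    return rows;
}
```

With such a table, element [i,j] of an array of doubles can be read as ((double *)rows[i])[j], which avoids recomputing i * n + j at every access.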
The ufuncobject is a generic function object that can be used to perform fast operations over Numeric Arrays with very useful broadcasting rules and type conversions performed automatically. The ufuncobject and its API make it easy and graceful to add arbitrary functions to Python which operate over Numeric arrays. All of the unary and binary operators currently available in the Numerical extensions (like sin, cos, +, logical_or, etc.) are implemented using this object. The hooks are all in place to make it very easy to add any function that takes one or two (double) arguments and returns a single (double) argument. It is not difficult to add support routines in order to handle arbitrary functions whose total number of input/output arguments is less than some maximum number (currently 10).
PyUFuncGenericFunction *functions;
a flag telling whether the identity for this function is 0 or 1 for use in the reduce method for a zero size array input.
an array of functions that perform the innermost looping over the input and output arrays (I think this is over a single axis). These functions call the underlying math function with the data from the input arguments along this axis and return the outputs of the function into the correct place in the output arrayobject (with appropriate typecasting). These functions are called by the general looping code. There is one function for each of the supported datatypes. Function pointers to do this looping for types 'f', 'd', 'F', and 'D' are provided in the C-API for functions that take one or two arguments and return one argument. Each PyUFuncGenericFunction returns void and has the following argument list (in order):
an array of pointers to the data for each of the input and output arguments with input arguments first and output arguments immediately following. Each element of args is a char * to the first byte in the corresponding input or output array.
an array of ints giving the number of bytes to skip to go to the next element of the array for this loop. There is an entry in the array for each of the input and output arguments, with input arguments first and output arguments immediately following.
a pointer to the underlying math function to be called at each point in this inner loop. This is a void * and must be recast to the required type before actually calling the function (e.g. to a pointer to a function that takes two doubles and returns a double). If you need to write your own PyUFuncGenericFunction, it is best to also have a typedef statement that defines your specific underlying function type so that the function pointer cast stays readable.
a pointer to an array of functions (each cast to void *) that compute the actual mathematical function for each set of inputs and outputs. There should be a function in the array for each supported data type. This function will be called from the PyUFuncGenericFunction for the corresponding type.
the number of datatypes supported by this function. For datatypes that are not directly supported, a coercion will be performed if possible safely, otherwise an error will be reported.
the name of this function (not the same as the dictionary label for this function object, but it is usually set to the same string). It is printed when __repr__ is called for this object, and defaults to "?" if set to NULL.
an array of supported types for this function object. I'm not sure why but each supported datatype (PyArray_FLOAT, etc.) is entered as many times as there are arguments for this function (nargs).
Usually best to set to 1. If this is non-zero then returned matrices will be cleaned up so that rank-0 arrays will be returned as python scalars. Also, if non-zero, then any math error that sets the errno global variable will cause an appropriate Python exception to be raised.
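The inner-loop functions described above can be illustrated with a self-contained sketch that does not include the Numeric headers. The loop below handles a unary function from double to double and follows the typedef advice given for the function pointer cast. All names are hypothetical. In particular, the surviving text describes only the data pointers, the byte steps and the function pointer, so the length argument (called dimensions here) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Typedef for the underlying math function, so the cast from
   void * stays readable. */
typedef double (*mock_unary_double)(double);

/* Hypothetical inner loop: args[0] is the input, args[1] the output,
   steps[] gives the byte step for each, and dimensions[0] is assumed
   to hold the number of elements to process. */
void mock_loop_d_d(char **args, int *dimensions, int *steps, void *func)
{
    /* Converting void * to a function pointer is not guaranteed by
       ISO C, but it is supported on common platforms. */
    mock_unary_double f = (mock_unary_double)func;
    char *in = args[0];
    char *out = args[1];
    int n = dimensions[0];
    int i;
    assert(f != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        *(double *)out = f(*(double *)in);
        in += steps[0];
        out += steps[1];
    }
}

/* Sample underlying math function used to exercise the loop. */
double mock_square(double x)
{
    return x * x;
}
```

Because the loop advances by byte steps rather than by element index, the same function serves contiguous and strided inputs alike.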
There are currently 15 pointers in the C-API array for the ufuncobject, which is loaded by import_ufunc(). The macros implemented by this API, available by including the file ufuncobject.h, are given below. The only function normally called by user code is the ufuncobject creation function PyUFunc_FromFuncAndData. Some of the other functions can be used as elements of an array to be passed to this creation function.
returns the ufunc object given its parameters. This is the most important function call. It requires defining three arrays to be passed as parameters: functions, data, and types. The arguments to be passed are:
an array of functions of type PyUFuncGenericFunction. There should be one function for each supported datatype. The functions should be in order so that datatypes listed toward the beginning of the array could be cast as datatypes listed toward the end.
an array of pointers to void * the same size as the functions array and in the same datatype order. Each element of this array is the actual underlying math function (recast to a void *) that will be called from one of the PyUFuncGenericFunctions. It will operate on each element of the input NumPy arrayobject(s) and return its element-by-element result in the output NumPy arrayobject(s). There is one function call for each datatype supported (though functions can be repeated if you handle the typecasting appropriately with the PyUFuncGenericFunction).
an array of PyArray_Types. The size of this array should be (nin+nout) times the size of one of the previous two arrays. There should be nin+nout copies of PyArray_XXXXX for each datatype explicitly supported. (Remember that datatypes not explicitly supported will still be accepted as input arguments to the ufunc if they can be cast safely to a supported type.)
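The layout of the types array described above is easy to get wrong, so here is a small sketch that builds it mechanically. The enum values and the function name are hypothetical placeholders rather than the real PyArray_XXXXX constants, and the element type of the array is assumed to be int for simplicity.

```c
#include <assert.h>

/* Hypothetical type codes standing in for the enumerated array types. */
enum { MOCK_FLOAT = 0, MOCK_DOUBLE = 1 };

/* Hypothetical helper: write (nin + nout) copies of each supported
   type code into types, in the same order as the supported list.
   Returns the number of entries written, ntypes * (nin + nout). */
int mock_fill_types(int *types, const int *supported,
                    int ntypes, int nin, int nout)
{
    int nargs = nin + nout;
    int t, a;
    int k = 0;
    assert(ntypes >= 0 && nin >= 0 && nout >= 0);
    for (t = 0; t < ntypes; t++)
        for (a = 0; a < nargs; a++)
            types[k++] = supported[t];
    return k;
}
```

For a binary function (two inputs, one output) supporting two datatypes, the result has six entries: three copies of the first type code followed by three copies of the second.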
allows calling the ufunc from a user C routine. It returns 0 on success and -1 on any failure. This is the core of what happens when a ufunc is called from Python. Its arguments are:
a Python tuple object containing the input arguments to the ufunc (should be Python sequence objects). INPUT
an array of pointers to PyArrayObjects for the input and output arguments to this function. The input NumPy arrays are elements mps[0]...mps[self->nin-1]. The output NumPy arrays are elements mps[self->nin]...mps[self->nargs-1]. OUTPUT
The following are all functions of type PyUFuncGenericFunction and are suitable for use in the functions argument passed to PyUFunc_FromFuncAndData :
for a unary function that takes a double input and returns a double output as a ufunc that takes PyArray_FLOAT input and returns PyArray_FLOAT output.
for a unary function that takes a double input and returns a double output as a ufunc that takes PyArray_DOUBLE input and returns PyArray_DOUBLE output.
for a unary function that takes a Py_complex input and returns a Py_complex output as a ufunc that takes PyArray_CFLOAT input and returns PyArray_CFLOAT output.
for a unary function that takes a Py_complex input and returns a Py_complex output as a ufunc that takes PyArray_CDOUBLE input and returns PyArray_CDOUBLE output.
for a unary function that takes a PyObject * input and returns a PyObject * output as a ufunc that takes PyArray_OBJECT input and returns PyArray_OBJECT output.
for a binary function that takes two double inputs and returns one double output as a ufunc that takes PyArray_FLOAT input and returns PyArray_FLOAT output.
for a binary function that takes two double inputs and returns one double output as a ufunc that takes PyArray_DOUBLE input and returns PyArray_DOUBLE output.
for a binary function that takes two Py_complex inputs and returns a Py_complex output as a ufunc that takes PyArray_CFLOAT input and returns PyArray_CFLOAT output.
for a binary function that takes two Py_complex inputs and returns a Py_complex output as a ufunc that takes PyArray_CDOUBLE input and returns PyArray_CDOUBLE output.
for a binary function that takes two PyObject * inputs and returns a PyObject * output as a ufunc that takes PyArray_OBJECT input and returns PyArray_OBJECT output.