README for DB-All.e Python bindings

The DB-All.e Python bindings provide two levels of access to a DB-All.e database: a complete API similar to the Fortran and C++ APIs, and a high-level API called volnd that automatically exports matrices of data out of the database.

The DB-All.e API

Every measured value is held in an object of type dballe.Var:

Var

Wrap a dba_var.

C++ includes: var.h

The methods of dballe.Var are:

variable_code = code()
Get the variable code.
dballe_var = copy()
Create a copy of this variable.
str = enqc()
Get the variable value, as a string.
float = enqd()
Get the variable value, as a double.
int = enqi()
Get the variable value, as an unscaled integer.
str = enqs()
Get the variable value, as a string.
bool = equals(var)

var is a dballe.Var

Check if two variables have the same value.

str = format(nullValue="(undef)")
Create a formatted string representation of the variable value.
dballe_varinfo = info()
Get the variable Varinfo metadata.
bool = isset()
Check if the variable is set.
str = raw()
Get the raw, string-serialised variable value.
set(val)

val is a str

Set the variable value from a string.

unset()
Unset the variable value.
dba_var = var()
Access the underlying dba_var.

Detailed information about a measured value can be accessed using the info() method of dballe.Var, which gives a dballe.Varinfo object:

Varinfo

Access all available information about a DB-All.e measured value.

C++ includes: var.h

The methods of dballe.Varinfo are:

int = bit_len()
The length in bits of the variable when encoded in a bit string (after scaling and changing reference value).
int = bit_ref()
The reference value for bit-encoding. When the variable is encoded in a bit string, this value is added to it.
str = desc()
The variable description.
float = dmax()
Maximum scaled value the field can have.
float = dmin()
Minimum scaled value the field can have.
int = imax()
Maximum unscaled value the field can have.
int = imin()
Minimum unscaled value the field can have.
bool = is_string()
True if the variable is a string; false if it is a numeric value.
int = len()
The length in digits of the integer representation of this variable (after scaling and changing reference value).
int = ref()
The reference value for the variable. When the variable is represented as an integer, this value is added to it after scaling.
int = scale()
The scale of the variable. When the variable is represented as an integer, it is multiplied by 10**scale
str = unit()
The measurement unit of the variable.
variable_code = var()
The variable code.
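
The relationship between the double value (enqd) and the unscaled integer value (enqi) follows from scale() and ref(). The sketch below reads the two descriptions literally: multiply by 10**scale, then add the reference value. It does not call the bindings; the function names and the sample values (scale=1, ref=0) are illustrative assumptions, not values read from a DB-All.e table.

```python
def to_unscaled_int(value, scale, ref=0):
    """Scale a physical value by 10**scale, then add the reference value."""
    return int(round(value * 10 ** scale)) + ref


def from_unscaled_int(ivalue, scale, ref=0):
    """Invert to_unscaled_int: remove the reference value, then undo the scaling."""
    return (ivalue - ref) / float(10 ** scale)


# Illustrative values only: a temperature of 273.2 stored with one decimal digit.
stored = to_unscaled_int(273.2, scale=1)
restored = from_unscaled_int(stored, scale=1)
```

With these assumptions a scale of 1 keeps one decimal digit in the integer form, and a negative scale drops trailing digits instead.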

Data provided as input to the database, and data coming out of the database, are represented as dballe.Var objects grouped in a dballe.Record:

Record

Wrap a dba_record.

C++ includes: record.h

The methods of dballe.Record are:

add(rec)

rec is a dballe.Record

Add to this record the contents of another record.

clear()
Completely empty the record.
clearVars()
Remove all the variables from the record, but leave the context information.
bool = contains(code)

code is a variable code

Check if the record contains the given value.

record = copy()
Create a copy of this record.
record = difference(rec)

rec is a dballe.Record

Create a record with only those fields that change this record into the given record.

dumpToStderr()
Dump the record contents to standard error.
str = enqc_ifset(code)

code is a variable code

Get the string representation of a value (NULL is returned if unset).

float, bool = enqd_ifset(code)

code is a variable code

Get the double representation of a value.

int, bool = enqi_ifset(code)

code is a variable code

Get the unscaled integer representation of a value.

str, bool = enqs_ifset(code)

code is a variable code

Get the string representation of a value.

dballe_var = enq(code)

code is a variable code

Get the Var representation of a value.

bool = equals(rec)

rec is a dballe.Record

Check if the two records have the same content.

iteritems(self)
Iterate all the keyword and variable names in the record, generating (name, value) tuples
iterkeys(self)
Iterate all the keyword and variable names in the record
itervalues(self)
Iterate all the values in the record
itervars(self)
Iterate all the variables in the record
dba_record = rec()
Return the underlying dba_record.
setAnaContext()
Set the record parameters to represent the pseudoana context.
setFromString(assignment)

assignment is a str

Set a record parameter or value from a string in the form "parm=val" or "Bxxyyy=val".

set(parm, value)

parm is a str

value is a str

Set a parameter or value from a string.

setc(parm, value)

parm is a str

value is a str

Set a parameter or value from a string.

setd(parm, value)

parm is a str

value is a float

Set a parameter or value from a double.

seti(parm, value)

parm is a str

value is a int

Set a parameter or value from an unscaled int.

sets(parm, value)

parm is a str

value is a str

Set a parameter or value from a string.

unset(parm)

parm is a str

Unset a parameter or value.

Finally, the database is accessed using a dballe.DB object:

DB

Wrap a dba_db.

C++ includes: db.h

The methods of dballe.DB are:

int = attrQuery(context, var, wanted, res)

context is a int

var is a variable code

wanted is a const vector< dba_varcode > &

res is a Record &

Query the attributes for the given variable in the given context.

attrRemoveList(context, var, attrs)

context is a int

var is a variable code

attrs is a const vector< dba_varcode > &

dba_db = db()
Access the underlying dba_db.
disconnect()
Explicitly disconnect from the database. This is normally performed in the destructor, but an explicit disconnect is needed to support language bindings with unpredictable object destruction patterns, such as Python before 2.5.
exportResults(query, encoding, file)

query is a Record &

encoding is a dba_encoding

file is a str

Export the results of a query to a file, using the given encoding.

exportResultsAsGeneric(query, encoding, file)

query is a Record &

encoding is a dba_encoding

file is a str

Export the results of a query to a file, using the given encoding and a generic BUFR and CREX template.

int = insert(rec, canReplace, addToPseudoana, anaid=None)

rec is a Record &

canReplace is a bool

addToPseudoana is a bool

Insert values from a record into the database.

rec is the record with the values to insert. canReplace is True if existing values can be replaced. addToPseudoana is True if a nonexisting pseudoana entry can be created. anaid: if a pseudoana entry has been created, this is its database ID.

Returns the context id of the values that have been inserted.

cursor = query(query)

query is a dballe.Record

Query data values.

cursor = queryAna(query)

query is a dballe.Record

Query pseudoana information, retrieving the extra pseudoana info for every station.

cursor = queryAnaSummary(query)

query is a dballe.Record

Query pseudoana information, without retrieving the extra pseudoana info.

cursor = queryDateTimes(query)

query is a dballe.Record

Query the list of date and times for which there is data in the database.

cursor = queryIdents(query)

query is a dballe.Record

Query the list of movable station identifiers present in the database.

cursor = queryLevels(query)

query is a dballe.Record

Query the list of levels present in the database.

cursor = queryLevelsAndTimeRanges(query)

query is a dballe.Record

Query the list of levels and time ranges for which there is data in the database.

cursor = queryReports(query)

query is a dballe.Record

Query the report information for which there is data in the database.

cursor = queryTimeRanges(query)

query is a dballe.Record

Query the list of time ranges for which there is data in the database.

cursor = queryVariableTypes(query)

query is a dballe.Record

Query the list of variable types present in the database.

remove(query)

query is a dballe.Record

Remove from the database all the values that match the given query.

removeOrphans()
Remove all the pseudoana and context entries which have no data.
reset(repinfo_file="")
Wipe the database and reinitialize it, taking the initial repinfo values from the given file, or from the default repinfo.csv.

And query results are iterated using a dballe.Cursor object:

Cursor

Iterate through the results of a database query.

C++ includes: db.h

The methods of dballe.Cursor are:

attributes(self, *args)

Read the attributes for the variable pointed to by this record.

If a rec argument is provided, it will write the attributes in that record and return the number of attributes read. If rec is None, it will return a tuple (Record, count) with a newly created Record.

int = attributes(wanted, res)

wanted is a const vector< dba_varcode > &

res is a Record &

Query the attributes for the variable currently referenced by the cursor.

int = contextID()
Get the context id of the last data fetched, when applicable.
bool = next(rec)

rec is a Record &

Fetch the next result into a record. Returns false when there is no more data to read.

int = remaining()
Get the number of remaining results to be fetched.
variable_code = varcode()
Get the varcode of the last data fetched, when applicable.

dballe.Cursor is iterable and so it's rarely used outside of iteration. The normal way of getting data out of the database is:

import dballe

db = dballe.DB("dbname", "user", "passwd")
query = dballe.Record()
# Coordinates are floating point values, so they are set with setd
query.setd("latmin", 10.)
query.setd("latmax", 60.)
query.setd("lonmin", -10.)
query.setd("lonmax", 40.)
# The variable code is a string, so it is set with set
query.set("var", "B12001")
for record in db.query(query):
    print "Temperature:", record.enqvar("B12001")

Sometimes it is useful to access a cursor in order to access variable attributes:

cursor = db.query(query)
for record in cursor:
    print "Temperature:", record.enqvar("B12001")
    # With no rec argument, attributes() returns a (Record, count) tuple
    attrs, count = cursor.attributes()
    print "  Confidence interval:", attrs.enqvar("B33007")
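
The relationship between the explicit next(rec) and remaining() calls and plain iteration can be sketched with a stand-in class. FakeCursor below is hypothetical: it is backed by a list of dicts instead of a database, and plain dicts play the role of dballe.Record. It only illustrates the protocol described above and is not how the bindings are implemented.

```python
class FakeCursor(object):
    """Hypothetical stand-in for dballe.Cursor, backed by a list of dicts."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def remaining(self):
        """Number of results not yet fetched."""
        return len(self._rows) - self._pos

    def next(self, rec):
        """Fetch the next result into rec; return False when there is no more data."""
        if self._pos >= len(self._rows):
            return False
        rec.clear()
        rec.update(self._rows[self._pos])
        self._pos += 1
        return True

    def __iter__(self):
        # Iteration is a loop over next() that hands out a fresh record each time.
        while True:
            rec = {}
            if not self.next(rec):
                return
            yield rec


cursor = FakeCursor([{"B12001": 280.1}, {"B12001": 281.4}])
temperatures = [rec["B12001"] for rec in cursor]
```

Iterating consumes the cursor, so remaining() is 0 once the loop has finished.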

The volnd API

volnd is an easy way of extracting entire matrices of data out of a DB-All.e database.

This module extracts multidimensional matrices of data given a list of dimension definitions. Every dimension definition defines what kind of data goes along that dimension.

Dimension definitions can be shared across different extracted matrices and multiple extractions, so that different matrices can have indexes with the same meaning.
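
What sharing a dimension definition means can be sketched without the database. OrderedIndex below is a hypothetical stand-in for a volnd index: it assigns positions to keys in order of first appearance. The station names and values are made up for illustration.

```python
class OrderedIndex(object):
    """Hypothetical stand-in for a volnd index: keys get positions in order of first appearance."""

    def __init__(self):
        self._keys = []
        self._positions = {}

    def position(self, key):
        """Return the position of key, adding it at the end if it is new."""
        if key not in self._positions:
            self._positions[key] = len(self._keys)
            self._keys.append(key)
        return self._positions[key]

    def __len__(self):
        return len(self._keys)

    def __getitem__(self, i):
        return self._keys[i]


# One index shared by two extractions: position 1 means "st2" in both results.
stations = OrderedIndex()
temperature = {}
pressure = {}
for station, value in [("st1", 280.1), ("st2", 281.4)]:
    temperature[stations.position(station)] = value
for station, value in [("st2", 101300.0), ("st3", 100900.0)]:
    pressure[stations.position(station)] = value
```

Because both extractions go through the same index, a position refers to the same station in either result, even though each result fills only some of the positions.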

This example code extracts temperatures in a station by datetime matrix:

import dballe
from dballe.volnd import read, AnaIndex, DateTimeIndex

# db is an open dballe.DB connection
query = dballe.Record()
query.set("var", "B12001")
query.set("rep_memo", "synop")
query.setlevel(dballe.Level(105, 2, 0, 0))
query.settimerange(dballe.Timerange(0, 0, 0))
vars = read(db.query(query), (AnaIndex(), DateTimeIndex()))
data = vars["B12001"]
# data is now a dballe.volnd.Data object; data.vals is a 2-dimensional
# masked array with the values
#
# Information about what values correspond to an index in the various
# directions can be accessed in data.dims, which contains one list per
# dimension with all the information corresponding to every index.
print "Ana dimension is", len(data.dims[0]), "items long"
print "Datetime dimension is", len(data.dims[1]), "items long"
print "First 10 stations along the Ana dimension:", data.dims[0][:10]
print "First 10 datetimes along the DateTime dimension:", data.dims[1][:10]

This is the list of dimensions supported by dballe.volnd:

AnaIndex

Index for stations, as they come out of the database.

The constructor syntax is: AnaIndex(shared=True, frozen=False, start=None).

The index saves all stations as AnaIndexEntry tuples, in the same order as they come out of the database.

NetworkIndex

Index for networks, as they come out of the database.

The constructor syntax is: NetworkIndex(shared=True, frozen=False, start=None).

The index saves all networks as NetworkIndexEntry tuples, in the same order as they come out of the database.

LevelIndex

Index for levels, as they come out of the database.

The constructor syntax is: LevelIndex(shared=True, frozen=False, start=None).

The index saves all levels as dballe.Level tuples, in the same order as they come out of the database.

TimeRangeIndex

Index for time ranges, as they come out of the database.

The constructor syntax is: TimeRangeIndex(shared=True, frozen=False, start=None).

The index saves all time ranges as dballe.TimeRange tuples, in the same order as they come out of the database.

DateTimeIndex

Index for datetimes, as they come out of the database.

The constructor syntax is: DateTimeIndex(shared=True, frozen=False, start=None).

The index saves all datetime values as datetime.datetime objects, in the same order as they come out of the database.

IntervalIndex

Index by fixed time intervals: index points are at fixed time intervals, and data is acquired in one point only if it is within a given tolerance from the interval.

The constructor syntax is: IntervalIndex(start, step, tolerance=0, end=None, shared=True, frozen=False).

start is a datetime.datetime object giving the starting time of the time interval of this index.

step is a datetime.timedelta object with the interval between sampling points.

tolerance is a datetime.timedelta object specifying the maximum allowed interval between a datum datetime and the sampling step. If the interval is bigger than the tolerance, the data is discarded.

end is an optional datetime.datetime object giving the ending time of the time interval of the index. If omitted, the index will end at the latest accepted datum coming out of the database.
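
The way start, step, tolerance and end interact can be sketched as a small function. This is an illustration under stated assumptions, not the IntervalIndex implementation: interval_position is a hypothetical name, a datum is matched against the nearest sampling point, and data falling before start or after end is discarded.

```python
from datetime import datetime, timedelta


def interval_position(when, start, step, tolerance=timedelta(0), end=None):
    """Return the index of the sampling point that accepts `when`, or None if it is discarded."""
    offset = (when - start).total_seconds()
    # Nearest sampling point, counted in steps from start.
    nearest = int(round(offset / step.total_seconds()))
    if nearest < 0:
        return None
    point = start + nearest * step
    if end is not None and point > end:
        return None
    # Discard the datum if it is farther from the sampling point than the tolerance.
    if abs(when - point) > tolerance:
        return None
    return nearest


start = datetime(2007, 1, 1)
step = timedelta(hours=1)
tolerance = timedelta(minutes=5)
accepted = interval_position(datetime(2007, 1, 1, 2, 3), start, step, tolerance)
discarded = interval_position(datetime(2007, 1, 1, 2, 10), start, step, tolerance)
```

Here a datum at 02:03 lands on the third sampling point (index 2), while one at 02:10 is ten minutes away from it and is dropped.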

The data objects used by AnaIndex and NetworkIndex are:

AnaIndexEntry

AnaIndex entry, with various data about a single station.

It is a tuple of 4 values:
  • station id
  • latitude
  • longitude
  • mobile station identifier, or None

NetworkIndexEntry

NetworkIndex entry, with various data about a single network.

It is a tuple of 2 values:
  • network code
  • network name

The extraction is done using the dballe.volnd.read function:

read(query, dims, filter=None, checkConflicts=True, attributes=None)

query is a dballe.Cursor resulting from a dballe query

dims is the sequence of indexes to use for shaping the data matrixes

filter is an optional filter function that can be used to discard values from the query: if filter is not None, it will be called for every output record and if it returns False, the record will be discarded

checkConflicts tells if we should raise an exception if two values from the database would fill in the same position in the matrix

attributes tells if we should read attributes as well: if it is None, no attributes will be read; if it is True, all attributes will be read; if it is a sequence, then it is the sequence of attributes that should be read.

The result of dballe.volnd.read is a dict mapping output variable names to a dballe.volnd.Data object with the results. All the Data objects share their indexes unless the xxx-Index definitions have been created with shared=False.
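
The roles of dims, filter and checkConflicts can be sketched with plain Python data. fill_matrix below is a hypothetical helper, not the dballe.volnd code: rows are (station, datetime, value) tuples standing in for query results, the two dimensions are fixed to station and datetime, and None marks the cells that a masked array would mask.

```python
def fill_matrix(rows, filter=None, check_conflicts=True):
    """Arrange (station, when, value) rows into a station by datetime matrix."""
    stations, times = [], []  # index entries, in order of first appearance
    cells = {}
    for row in rows:
        # A filter returning False discards the row before it is indexed.
        if filter is not None and not filter(row):
            continue
        station, when, value = row
        if station not in stations:
            stations.append(station)
        if when not in times:
            times.append(when)
        pos = (stations.index(station), times.index(when))
        # Two values for the same cell are a conflict.
        if check_conflicts and pos in cells:
            raise ValueError("two values for matrix position %r" % (pos,))
        cells[pos] = value
    matrix = [[cells.get((i, j)) for j in range(len(times))]
              for i in range(len(stations))]
    return (stations, times), matrix


rows = [("st1", 1, 280.1), ("st2", 1, 281.4), ("st1", 2, 279.8)]
dims, matrix = fill_matrix(rows)
```

In this sketch, station st2 has no value for the second datetime, so that cell stays empty, which is the situation a masked array is meant to represent.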

This is the dballe.volnd.Data class documentation:

Data

Container for collecting variable data. It contains the variable data array and the dimension indexes.

If v is a Data object, you can access the tuple with the dimensions as v.dims, and the masked array with the values as v.vals.

The methods of dballe.volnd.Data are:

append(self, rec)

Collect a new value from the given dballe record.

You need to call finalise() before the values can be used.

appendAttrs(self, rec)

Collect attributes to append to the record.

You need to call finalise() before the values can be used.

finalise(self)
Stop collecting values and create a masked array with all the values collected so far.