The PHP bindings for Xapian are packaged in the xapian
extension. The PHP API provided by this extension largely follows Xapian's C++
API. This document lists the differences and additions.
These bindings can be built for either PHP4 or PHP5, though the C++ API is wrapped slightly differently for each because of differences between the two PHP versions.
PHP strings, arrays, etc., are converted automatically to and from the
corresponding C++ types in the bindings, so generally you can pass arguments as
you would expect. One thing to be aware of though is that SWIG implements
dispatch functions for overloaded methods based on the types of the parameters,
so you can't always pass in a string containing a number (e.g. "42") where a
number is expected, as you usually can in PHP. You need to explicitly convert
to the type required: use (int) to convert to an integer, (string) to convert
to a string, and (double) to convert to a floating point number.
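As a minimal sketch of the explicit conversion described above, the snippet below casts values that arrive as strings (as form input typically does) before they would be handed to an overloaded method. The helper name to_docid and the variable names are hypothetical and not part of the bindings.

```php
<?php
// Hypothetical helper: request data arrives as strings, so cast explicitly
// before passing the value to an overloaded Xapian method.
function to_docid($raw)
{
    return (int) $raw;
}

$docid  = to_docid("42");   // integer 42 rather than the string "42"
$weight = (double) "0.5";   // floating point number
$term   = (string) 2024;    // string "2024"
```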
PHP has a lot of reserved words of various sorts, which sadly clash with common
method names. Because of this, the empty() methods of various container-like
classes are wrapped as is_empty() for PHP, and the clone() method of the
XapianWeight class and its subclasses is wrapped as clone_object().
The examples subdirectory contains examples showing how to use the PHP
bindings, based on the simple examples from xapian-examples.
Assuming you have a suitable version of PHP installed (PHP4 or PHP5), running
configure will automatically enable the PHP bindings, and make install will
install the extension shared library in the location reported by
php-config --extension-dir.
Check that php.ini has a line like
extension_dir = "<location reported by php-config --extension-dir>".
Then add this line to php.ini: extension = xapian.so (or whatever the library
is called; not all UNIX systems use .so as the extension, and MS Windows
uses .dll).
If you're using PHP as a webserver module (e.g. mod_php with Apache), you may need to restart the webserver for this change to take effect.
Alternatively, you can get scripts which use Xapian to explicitly load it.
This approach is useful if you don't have root access and so can't make
changes to php.ini. The simplest setup is to copy xapian.so into the same
directory as your PHP script, and then add the following line to the start of
each PHP script which uses Xapian:
    dl('xapian.so');
You can put xapian.so elsewhere (and it's probably better to), but note that
dl() requires a relative path, so you might have to use something
insane-looking like:
    dl('../../../../usr/lib/php5/20051025/xapian.so');
If you're using PHP5, you also need to add include "xapian.php" to your PHP
scripts which use Xapian in order to get the PHP class wrappers. The PHP4
class wrappers are implemented by the module you load with dl().
In PHP5, we translate exceptions thrown by Xapian into PHP Exception objects which are thrown into the PHP script.
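A minimal PHP5-only sketch of catching such an exception follows. The function name fetch_document is hypothetical, $database is assumed to be an already opened database object, and catching the base Exception class is an assumption made to keep the sketch independent of any specific exception subclass.

```php
<?php
// PHP5 only: try/catch is not available in PHP4.
// Hypothetical wrapper: returns the document, or null if Xapian threw.
function fetch_document($database, $docid)
{
    try {
        return $database->get_document((int) $docid);
    } catch (Exception $e) {
        // $e->getMessage() would give the error text if you want to log it.
        return null;
    }
}
```

Returning null here mirrors the PHP4 behaviour, so calling code can check the result the same way under both versions.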
Unfortunately, exception handling with try and catch isn't supported in PHP4, and there isn't really a good alternative approach.
Most Xapian exceptions are converted to a PHP fatal error.
However, for exceptions which might be commonly encountered we instead
produce a PHP warning and the method returns null. You can then check for
and handle the exception like so:
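    $old_error_reporting = error_reporting();
    // Temporarily suppress warnings so the failure isn't reported twice.
    if ($old_error_reporting & E_WARNING)
        error_reporting($old_error_reporting ^ E_WARNING);
    $doc = $database->get_document($docid);
    if ($doc === null) {
        // get_document() returned null, so the exception occurred.
        exit(1);
    }
    // Restore the previous error reporting level.
    if ($old_error_reporting & E_WARNING)
        error_reporting($old_error_reporting);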
The following exceptions are currently handled like this (this is a fairly new feature, so feedback on exceptions which should and shouldn't be in this list is particularly welcome):
DocNotFoundError
FeatureUnavailableError
As of Xapian 0.9.7, the PHP bindings use a PHP object oriented style. This is very similar for PHP4 and PHP5, but there are a few differences.
In order to construct an object, use $object = new XapianClassName(...);
Objects are destroyed when they go out of scope. To explicitly destroy an
object, you can use unset($object); or $object = null;
You invoke a method on an object using $object->method_name().
In Xapian 1.0.0 and later, the Xapian::Stem, Xapian::QueryParser, and
Xapian::TermGenerator classes all assume text is in UTF-8. If you want
to index strings in a different encoding, use the PHP iconv function to
convert them to UTF-8 before passing them to Xapian, and to convert values
back to your encoding when reading them from Xapian.
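The conversion in both directions might look like the sketch below. ISO-8859-1 is only an assumed source encoding for illustration; substitute whatever encoding your data actually uses.

```php
<?php
// Assumption: the application's text is ISO-8859-1.
$latin1 = "caf\xE9";                              // "cafe" with e-acute
$utf8   = iconv("ISO-8859-1", "UTF-8", $latin1);  // pass this to Xapian

// Convert back when handing data read from Xapian to non-UTF-8 code.
$roundtrip = iconv("UTF-8", "ISO-8859-1", $utf8);
```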
All iterators support next() and equals() methods to move through and test
iterators (as for all language bindings). MSetIterator and ESetIterator also
support prev().
C++ iterators are often dereferenced to get information, e.g. (*it). With PHP
these are all mapped to named methods, as follows:
Iterator | Dereferencing method |
PositionIterator | get_termpos() |
PostingIterator | get_docid() |
TermIterator | get_term() |
ValueIterator | get_value() |
MSetIterator | get_docid() |
ESetIterator | get_term() |
Other methods, such as MSetIterator::get_document(), are available unchanged.
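To illustrate how next(), equals() and the named dereferencing methods fit together, here is a minimal sketch of a loop over an MSet. The function name collect_docids is hypothetical, and it assumes the MSet object exposes begin() and end() under their C++ names.

```php
<?php
// $mset is assumed to be an MSet object returned by a query.
function collect_docids($mset)
{
    $docids = array();
    $it  = $mset->begin();   // assumed to be wrapped as in C++
    $end = $mset->end();
    while (!$it->equals($end)) {
        $docids[] = $it->get_docid();   // named method instead of C++ "*it"
        $it->next();                    // instead of C++ "++it"
    }
    return $docids;
}
```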
MSet objects have some additional methods to simplify access (these work using the C++ array dereferencing):
Method name | Explanation |
get_hit(index) | returns MSetIterator at index |
get_document_percentage(index) | convert_to_percent(get_hit(index)) |
get_document(index) | get_hit(index)->get_document() |
get_docid(index) | get_hit(index)->get_docid() |
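As a rough sketch of the index-based methods in the table above, the function below builds a summary of an MSet without using iterators. The name summarize_mset is hypothetical, and the sketch assumes the MSet's size() method is available under its C++ name.

```php
<?php
// $mset is assumed to be an MSet object returned by a query.
function summarize_mset($mset)
{
    $rows = array();
    for ($i = 0; $i < $mset->size(); ++$i) {   // size() assumed as in C++
        $rows[] = array(
            'docid'   => $mset->get_docid($i),
            'percent' => $mset->get_document_percentage($i),
        );
    }
    return $rows;
}
```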
For PHP4:
Xapian::Auto::open_stub(file) is wrapped as auto_open_stub(file)
Xapian::Flint::open() is wrapped as flint_open()
Xapian::InMemory::open() is wrapped as inmemory_open()
Xapian::Quartz::open(...) is wrapped as quartz_open(...)
Xapian::Remote::open(...) is wrapped as remote_open(...) (both the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to decide which to call).
Xapian::Remote::open_writable(...) is wrapped as remote_open_writable(...) (both the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to decide which to call).
For PHP5:
Xapian::Auto::open_stub(file) is wrapped as Xapian::auto_open_stub(file)
Xapian::Flint::open() is wrapped as Xapian::flint_open()
Xapian::InMemory::open() is wrapped as Xapian::inmemory_open()
Xapian::Quartz::open(...) is wrapped as Xapian::quartz_open(...)
Xapian::Remote::open(...) is wrapped as Xapian::remote_open(...) (both the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to decide which to call).
Xapian::Remote::open_writable(...) is wrapped as Xapian::remote_open_writable(...) (both the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to decide which to call).
For PHP4, constants are wrapped in a similar way to class methods.
So Xapian::DB_CREATE_OR_OPEN is available as Xapian_DB_CREATE_OR_OPEN,
Xapian::Query::OP_OR is available as XapianQuery_OP_OR, and so on.
For PHP5, constants are wrapped as const members of the appropriate class.
So Xapian::DB_CREATE_OR_OPEN is available as Xapian::DB_CREATE_OR_OPEN,
Xapian::Query::OP_OR is available as XapianQuery::OP_OR, and so on.
For PHP5, non-class functions are wrapped in the natural way, so the C++
function Xapian::version_string is wrapped under the same name in PHP. PHP4
doesn't allow this, so we wrap functions with a xapian_ prefix, e.g.
xapian_version_string.
In C++ there's a Xapian::Query constructor which takes a query operator and start/end iterators specifying a number of terms or queries, plus an optional parameter. In PHP, this is wrapped to accept an array listing the terms and/or queries (you can specify a mixture of terms and queries if you wish). For example, for PHP4:
    $subq = new XapianQuery(XapianQuery_OP_AND, "hello", "world");
    $q = new XapianQuery(XapianQuery_OP_AND, array($subq, "foo", new XapianQuery("bar", 2)));
And for PHP5:
    $subq = new XapianQuery(XapianQuery::OP_AND, "hello", "world");
    $q = new XapianQuery(XapianQuery::OP_AND, array($subq, "foo", new XapianQuery("bar", 2)));
There is an additional method get_matching_terms() which takes an
MSetIterator and returns a list of terms in the current query which match the
document given by that iterator. You may find this more convenient than using
the TermIterator directly.
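A minimal sketch of using get_matching_terms() while walking a result set is shown below. The function name matching_terms_by_docid is hypothetical. The sketch assumes the method is called on the Enquire object that produced the MSet, that it returns a PHP array, and that the MSet exposes begin() and end() under their C++ names.

```php
<?php
// $enquire: assumed to be the Enquire object that ran the query.
// $mset:    the MSet object it returned.
function matching_terms_by_docid($enquire, $mset)
{
    $result = array();
    $it  = $mset->begin();
    $end = $mset->end();
    while (!$it->equals($end)) {
        // Terms from the current query which match this document.
        $result[$it->get_docid()] = $enquire->get_matching_terms($it);
        $it->next();
    }
    return $result;
}
```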