PHP Java Bridge in Ubuntu Gutsy with Lucene

The php/java bridge it a pretty awesome little protocol that basically lets us use java classes inside our own PHP applications! This lets you harness the awesome power of all the Java libraries that exist, including the popular Lucene search engine library.

I referenced two excellent blog entries here and here whilst implementing Lucene search for this blog, but I am writing up the experience anyway to compare issues and difficulties and enhance my understanding of the process.

To start with Java, Lucene and the bridge dependancies must be installed (remember to enable multiverse in your apt sources)


apt-get install sun-java6-jre sun-java6-jdk liblucene-java libitext-java
update-java-alternatives -s java-6-sun

Grab the php-java-bridge deb package from sourceforge and install it. The fact it is v4 does not reflect that it is only for PHP version 4! There are RPMs for version 5 which you could turn in to a deb package using alien but at the moment I am feeling lazy so I will see how version 4 works out first.


wget https://downloads.sourceforge.net/php-java-bridge/php-java-bridge_4.3.0-1_i386.deb
dpkg -i php-java-bridge_4.3.0-1_i386.deb

Apache should restart now, if not restart it yourself.

To check that it is working look at the output of phpinfo(), there should be a new shiny java section! Listing the running processes also is interesting!


root 20205 0.0 0.7 664520 15520 ? Sl 17:18 0:00 java -Djava.library.path=/usr/lib/php5/20060613+lfs
-Djava.class.path=/usr/lib/php5/20060613+lfs/JavaBridge.jar -Djava.awt.headless=true
-Dphp.java.bridge.base=/usr/lib/php5/20060613+lfs php.java.bridge.Standalone LOCAL:@java-bridge-4ee9 1

as does netstat


unix 2 [ ACC ] STREAM LISTENING 1913999 @java-bridge-4ee9

I think it gets started when apache starts, as java.so is loaded in to the PHP, I’m still investigating that.

As far as starting the Lucene development goes, this was a pretty good tutorial on how it all works and this site has some good Java example code that I used to work out how the PHP should work.

Below is my PHP Lucene test code, it just creates one document with a description then searches the index description for ‘idi test’ and outputs the match. It’s pretty rad!


java_require('/usr/share/java/lucene.jar');

$analyzer = new Java('org.apache.lucene.analysis.StopAnalyzer');
$writer = new Java('org.apache.lucene.index.IndexWriter', '/path/to/store/lucene/data/in', $analyzer, true);

$doc = new Java('org.apache.lucene.document.Document');
$field = new Java('org.apache.lucene.document.Field','description','idi data test',true, true, true);
$doc->add($field);

$writer->addDocument($doc);

$writer->close();

$indexer = new Java('org.apache.lucene.search.IndexSearcher','/path/to/store/lucene/data/in');
$parser = new Java('org.apache.lucene.queryParser.QueryParser','description',$analyzer);
$query = $parser->parse('rus test');

$hits = $indexer->search($query);

for ($i = 0; $i < $hits->length(); $i++) {
$found = $hits->doc($i);
print $i.".".$found->get('description');
}
?>

Now that it’s working I just have to incorperate it in to the site 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *