Atom feeds with PHP 5 DOM and XSL

All blogs require silly amounts of feed generators, right? And this is a silly blog, so it requires a silly generator. The entire site is written in PHP 5, and my automagic ‘datahandler’ activepage concept builds an XML document using DOM and then uses XSL as a templating engine, so I figured it wouldn’t be too hard to knock up a stylesheet that turns the default datahandler output for the blog into a nice Atom feed! Just make sure you set the content type to application/atom+xml when you generate the page!


<?xml version="1.0" encoding="iso-8859-1"?>
<!-- The Atom namespace is declared on the stylesheet so that the elements in
     every template (feed and entry alike) end up in it. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2005/Atom">
<!-- No doctype: an Atom feed is not an XHTML document. The XML declaration is
     kept so that the iso-8859-1 encoding is announced to feed readers. -->
<xsl:output indent="yes" method="xml" encoding="iso-8859-1" omit-xml-declaration="no" />

<xsl:template match="page">
  <feed>
    <link rel="alternate" type="text/html" href="https://www.idimmu.net/" />
    <link rel="self" href="https://www.idimmu.net/blog/atom.php" />
    <title>idimmu . net</title>
    <link href="https://www.idimmu.net/"/>
    <!-- Feed timestamp: taken from the first blog entry in the list -->
    <updated>
      <xsl:value-of select="concat(datahandler_blog/blog_list/blog/date/year, '-',
                                   datahandler_blog/blog_list/blog/date/month, '-',
                                   datahandler_blog/blog_list/blog/date/day, 'T',
                                   datahandler_blog/blog_list/blog/date/hour, ':',
                                   datahandler_blog/blog_list/blog/date/minute, ':',
                                   datahandler_blog/blog_list/blog/date/second, 'Z')"/>
    </updated>
    <author>
      <name>idimmu</name>
    </author>
    <id>https://www.idimmu.net/</id>
    <xsl:apply-templates select="datahandler_blog"/>
  </feed>
</xsl:template>

<xsl:template match="datahandler_blog">
  <xsl:apply-templates select="blog_list"/>
</xsl:template>

<!-- One Atom entry per blog post -->
<xsl:template match="blog">
  <entry>
    <title><xsl:value-of select="title"/></title>
    <link href="https://www.idimmu.net/{clonefakeurl}"/>
    <id>https://www.idimmu.net/<xsl:value-of select="clonefakeurl"/></id>
    <updated>
      <xsl:value-of select="concat(date/year, '-', date/month, '-', date/day, 'T',
                                   date/hour, ':', date/minute, ':', date/second, 'Z')"/>
    </updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <xsl:value-of select="bb_content" disable-output-escaping="yes"/>
      </div>
    </content>
  </entry>
</xsl:template>

<xsl:template match="blog_list">
  <xsl:apply-templates select="blog"/>
</xsl:template>
</xsl:stylesheet>
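
The stylesheet glues the date parts together exactly as they arrive, so the datahandler has to supply zero-padded values for the result to be a valid Atom (RFC 3339) timestamp. A minimal Python sketch of that padding, assuming the parts are plain integers and already in UTC (the function name is made up for illustration):

```python
def atom_timestamp(year, month, day, hour, minute, second):
    """Join date parts into an RFC 3339 UTC timestamp, zero-padding each field."""
    return "%04d-%02d-%02dT%02d:%02d:%02dZ" % (year, month, day, hour, minute, second)


print(atom_timestamp(2008, 1, 5, 9, 3, 7))  # 2008-01-05T09:03:07Z
```

Without the padding the same date would come out as 2008-1-5T9:3:7Z, which feed validators reject.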

PHP Java Bridge in Ubuntu Gutsy with Lucene

The PHP/Java Bridge is a pretty awesome little protocol that basically lets us use Java classes inside our own PHP applications! This lets you harness the awesome power of all the Java libraries that exist, including the popular Lucene search engine library.

I leaned on two excellent blog entries whilst implementing Lucene search for this blog, but I am writing up the experience anyway to compare issues and difficulties and to improve my own understanding of the process.

To start with, Java, Lucene and the bridge dependencies must be installed (remember to enable multiverse in your apt sources):


apt-get install sun-java6-jre sun-java6-jdk liblucene-java libitext-java
update-java-alternatives -s java-6-sun

Grab the php-java-bridge deb package from SourceForge and install it. The 4 in the version number does not mean it is only for PHP 4! There are RPMs for version 5 of the bridge which you could turn into a deb package using alien, but at the moment I am feeling lazy so I will see how version 4 works out first.


wget https://downloads.sourceforge.net/php-java-bridge/php-java-bridge_4.3.0-1_i386.deb
dpkg -i php-java-bridge_4.3.0-1_i386.deb

Apache should restart now; if not, restart it yourself.

To check that it is working, look at the output of phpinfo(); there should be a shiny new java section! Listing the running processes is also interesting!


root 20205 0.0 0.7 664520 15520 ? Sl 17:18 0:00 java -Djava.library.path=/usr/lib/php5/20060613+lfs
-Djava.class.path=/usr/lib/php5/20060613+lfs/JavaBridge.jar -Djava.awt.headless=true
-Dphp.java.bridge.base=/usr/lib/php5/20060613+lfs php.java.bridge.Standalone LOCAL:@java-bridge-4ee9 1

as does netstat


unix 2 [ ACC ] STREAM LISTENING 1913999 @java-bridge-4ee9

I think it gets started when Apache starts, as java.so is loaded into PHP; I’m still investigating that.

As far as starting the Lucene development goes, I found a pretty good tutorial on how it all works, plus a site with some good Java example code that I used to work out how the PHP should look.

Below is my PHP Lucene test code. It just creates one document with a description, then searches the index description for ‘idi test’ and outputs the match. It’s pretty rad!


<?php
// Load the Lucene classes through the bridge
java_require('/usr/share/java/lucene.jar');

// Create (or overwrite) an index at the given path
$analyzer = new Java('org.apache.lucene.analysis.StopAnalyzer');
$writer = new Java('org.apache.lucene.index.IndexWriter', '/path/to/store/lucene/data/in', $analyzer, true);

// Add a single document with a stored, indexed and tokenized description field
$doc = new Java('org.apache.lucene.document.Document');
$field = new Java('org.apache.lucene.document.Field','description','idi data test',true, true, true);
$doc->add($field);

$writer->addDocument($doc);

$writer->close();

// Search the description field of the index we just wrote
$indexer = new Java('org.apache.lucene.search.IndexSearcher','/path/to/store/lucene/data/in');
$parser = new Java('org.apache.lucene.queryParser.QueryParser','description',$analyzer);
$query = $parser->parse('idi test');

$hits = $indexer->search($query);

// Print each matching document's description
for ($i = 0; $i < $hits->length(); $i++) {
    $found = $hits->doc($i);
    print $i.".".$found->get('description');
}
?>

Now that it’s working I just have to incorporate it into the site 🙂

Copying files between servers with netcat and tar

One of the quickest ways (faster than scp at any rate) of copying a large number of files between two servers is by abusing the awesome powers of Linux pipes, netcat and tar!

Basically, we set up netcat listening on the server we want the files copied to and pipe its output to tar, which extracts anything sent to it.


tanglefoot:/exports/archive# nc -l -p 7878 | tar -xzf -

Then we set up tar on the server we want to copy from, make it create a tarball and pipe it through a netcat which connects to the other server!


fee /home/shared/people # tar -cz MC | nc -q 10 tanglefoot 7878

When the copy has finished, the sending instance of netcat will exit (the -q 10 makes it quit ten seconds after tar closes the pipe)!
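
For hosts without netcat, the same idea can be sketched in Python: one side unpacks a gzip-compressed tar stream straight off a socket while the other side writes one into it. This is only an illustration of the pipeline above, not a replacement for it. The function names are made up, and nothing here is encrypted or authenticated, so it only belongs on a trusted network.

```python
import socket
import tarfile


def receive_tree(listener, dest_dir):
    """Accept one connection and unpack the incoming gzip'd tar stream into dest_dir."""
    conn, _addr = listener.accept()
    with conn, conn.makefile("rb") as stream:
        # "r|gz" reads the archive as a forward-only stream, like tar -xzf -
        with tarfile.open(fileobj=stream, mode="r|gz") as archive:
            archive.extractall(dest_dir)


def send_tree(host, port, src_dir, arcname):
    """Connect to host:port and stream src_dir as a gzip'd tar, like tar -cz | nc."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("wb") as stream:
            # "w|gz" writes the archive sequentially without seeking
            with tarfile.open(fileobj=stream, mode="w|gz") as archive:
                archive.add(src_dir, arcname=arcname)
```

On the receiving side you would create the listener with something like socket.create_server(("", 7878)) and call receive_tree(listener, "/exports/archive"); on the sending side, send_tree("tanglefoot", 7878, "/home/shared/people/MC", "MC").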

Using PowerDNS with PostgreSQL on Ubuntu Gutsy

We handle DNS for thousands of domains for our customers. Whilst our existing solution worked, it was very messy to maintain and work with, so we decided to trial a new solution for our offices to see how it would perform. We wanted something database driven for ease of maintenance, and PowerDNS came personally recommended, so we decided to trial that one first.

For the database we would normally go with MySQL, but we wanted an instance of PostgreSQL to play with as we are considering moving our main platform to it at some point in the future.

Our DNS server is running on Ubuntu Gutsy and, fortunately, everything we need is in the repositories, so installing it is as easy as:


apt-get install pdns-backend-pgsql pdns-doc pdns-recursor pdns-server postgresql postgresql-contrib postgresql-doc

After all the software is installed we need to tell PowerDNS to use our PostgreSQL server in /etc/powerdns/pdns.conf


launch=gpgsql
gpgsql-host=127.0.0.1
gpgsql-user=powerdns
gpgsql-password=password
gpgsql-dbname=powerdns

We then need to configure the database, tables and user permissions in PostgreSQL.

To create the user we must become a superuser, which typically involves changing to the postgres unix user and taking advantage of the ident-based authentication.


# su postgres
$ psql
Welcome to psql 8.2.5, the PostgreSQL interactive terminal.

postgres=# CREATE USER powerdns WITH PASSWORD 'password';
CREATE USER

You can check the user has been created through the psql client too.


postgres=# select * from pg_shadow;
usename | usesysid | usecreatedb | usesuper | usecatupd | passwd | valuntil | useconfig
----------+----------+-------------+----------+-----------+-------------------------------------+----------+-----------
postgres | 10 | t | t | t | | |
powerdns | 16385 | f | f | f | md5e954fb1203f8da7392a0c7406f83d765 | |
(2 rows)

We then need to create and switch to the new database


postgres=# create database powerdns;
CREATE DATABASE

postgres=# \l
List of databases
Name | Owner | Encoding
-----------+----------+----------
postgres | postgres | UTF8
powerdns | postgres | UTF8
template0 | postgres | UTF8
template1 | postgres | UTF8
(4 rows)

postgres=# \c powerdns
You are now connected to database "powerdns".

The table structure is


create table domains (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
master VARCHAR(128) DEFAULT NULL,
last_check INT DEFAULT NULL,
type VARCHAR(6) NOT NULL,
notified_serial INT DEFAULT NULL,
account VARCHAR(40) DEFAULT NULL
);
CREATE UNIQUE INDEX name_index ON domains(name);

CREATE TABLE records (
id SERIAL PRIMARY KEY,
domain_id INT DEFAULT NULL,
name VARCHAR(255) DEFAULT NULL,
type VARCHAR(6) DEFAULT NULL,
content VARCHAR(255) DEFAULT NULL,
ttl INT DEFAULT NULL,
prio INT DEFAULT NULL,
change_date INT DEFAULT NULL,
CONSTRAINT domain_exists
FOREIGN KEY(domain_id) REFERENCES domains(id)
ON DELETE CASCADE
);

CREATE INDEX rec_name_index ON records(name);
CREATE INDEX nametype_index ON records(name,type);
CREATE INDEX domain_id ON records(domain_id);

create table supermasters (
ip VARCHAR(25) NOT NULL,
nameserver VARCHAR(255) NOT NULL,
account VARCHAR(40) DEFAULT NULL
);

GRANT SELECT ON supermasters TO powerdns;
GRANT ALL ON domains TO powerdns;
GRANT ALL ON domains_id_seq TO powerdns;
GRANT ALL ON records TO powerdns;
GRANT ALL ON records_id_seq TO powerdns;

And then we can look at them!


powerdns=# \d
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+----------
public | domains | table | postgres
public | domains_id_seq | sequence | postgres
public | records | table | postgres
public | records_id_seq | sequence | postgres
public | supermasters | table | postgres
(5 rows)

After the user is created we need to edit /etc/postgresql/8.2/main/pg_hba.conf to grant that user access to the database from localhost:


host powerdns powerdns 127.0.0.0/16 md5

We then need to reload PostgreSQL for the changes to take effect.


# /etc/init.d/postgresql-8.2 reload

We then need to populate it with the important SOA and NS records. The records reference a row in the domains table, so the domain itself has to go in first (as NATIVE here, assuming no master/slave replication is in play). All the records take a change date as a timestamp, so we also created a function to return the current timestamp.


create function epoch() returns int AS 'select extract(epoch from now())::int;' LANGUAGE SQL;

insert into domains (name, type) values ('btn.com', 'NATIVE');

insert into records (domain_id, name, type, content, ttl, prio,change_date) values (1, 'btn.com', 'NS', 'dnsserver.btn.com',600,10,epoch());

insert into records (domain_id, name, type, content, ttl, prio,change_date) values (1, 'btn.com', 'SOA', 'dnsserver 2005091301 10800 3600 604800 600',600,10,epoch());

insert into records (domain_id, name, type, content, ttl, prio,change_date) values (1, 'dnsserver.btn.com', 'A', '10.0.0.1',600,10,epoch());
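
Because records.domain_id references domains(id), a row for the zone has to exist in domains before any records can point at it, and deleting that row removes its records too. The sketch below shows that behaviour with Python's built-in sqlite3 module standing in for PostgreSQL, on a cut-down copy of the two tables. It illustrates the constraint only; it is not the full PowerDNS schema, and the 'NATIVE' domain type is an assumption about how the zone is served.

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
db.executescript("""
    CREATE TABLE domains (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        type TEXT NOT NULL
    );
    CREATE TABLE records (
        id INTEGER PRIMARY KEY,
        domain_id INTEGER REFERENCES domains(id) ON DELETE CASCADE,
        name TEXT, type TEXT, content TEXT, ttl INTEGER, change_date INTEGER
    );
""")

now = int(time.time())  # same idea as the epoch() function above


def add_record(domain_id, name, rtype, content):
    db.execute(
        "INSERT INTO records (domain_id, name, type, content, ttl, change_date) "
        "VALUES (?, ?, ?, ?, 600, ?)",
        (domain_id, name, rtype, content, now),
    )


# Without a matching domains row the insert is rejected.
try:
    add_record(1, "btn.com", "NS", "dnsserver.btn.com")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

db.execute("INSERT INTO domains (id, name, type) VALUES (1, 'btn.com', 'NATIVE')")
add_record(1, "btn.com", "NS", "dnsserver.btn.com")
add_record(1, "dnsserver.btn.com", "A", "10.0.0.1")
count_before = db.execute("SELECT COUNT(*) FROM records").fetchone()[0]

db.execute("DELETE FROM domains WHERE id = 1")  # cascades to the records table
count_after = db.execute("SELECT COUNT(*) FROM records").fetchone()[0]

print(orphan_allowed, count_before, count_after)  # False 2 0
```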

Now all we need to do is edit /etc/resolv.conf to use the new nameserver


nameserver 10.0.0.1

and check that it works!


$ host dnsserver.btn.com
dnsserver.btn.com has address 10.0.0.1

Configuring Tomcat 5.5 and Apache 2 with mod_jk

mod_jk is a conduit between a web server and Tomcat, and it supports a variety of web servers including IIS. Using mod_jk to put Apache in front of Tomcat lets you use all the power of Apache (caching, gzip, mod_rewrite, etc.) whilst serving content from Tomcat. With Ubuntu it’s also really easy to set up!

First of all install the software; you will need to enable the backports repository on Dapper for this.


apt-get install sun-java6-bin sun-java6-jdk tomcat5.5 libapache2-mod-jk

The Tomcat 5.5 that comes with Ubuntu already has an AJP connector configured on port 8009, so there is no additional configuration to do to its server.xml file.

We then need to configure a workers.properties file for Apache2 which tells it about the Tomcat instance. I make mine in /etc/apache2/workers.properties:


worker.list=idimmu
worker.idimmu.type=ajp13
worker.idimmu.host=localhost
worker.idimmu.port=8009

Make sure mod_jk is then enabled with a2enmod jk (it probably already is).

And finally we tell Apache2 about the worker instance in /etc/apache2/apache2.conf


JkWorkersFile /etc/apache2/workers.properties
JkLogFile /var/log/apache2/mod_jk.log
JkLogLevel info
JkMount /* idimmu

This will direct every request that reaches the Apache2 server on to the Tomcat server.
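
A mismatch between the workers file and the Tomcat connector is an easy mistake to make, so it can help to read the file back and see what Apache will be told. The sketch below is a small Python parser for the worker.<name>.<key>=value lines shown above. It is not part of mod_jk and covers only this simple layout; the function name is made up for illustration.

```python
def parse_workers(text):
    """Return {worker_name: {key: value}} for the workers named in worker.list."""
    listed = []
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if key == "worker.list":
            listed = [name.strip() for name in value.split(",") if name.strip()]
        elif len(parts) >= 3 and parts[0] == "worker":
            # worker.<name>.<property> = value
            settings.setdefault(parts[1], {})[".".join(parts[2:])] = value
    return {name: settings.get(name, {}) for name in listed}


example = """\
worker.list=idimmu
worker.idimmu.type=ajp13
worker.idimmu.host=localhost
worker.idimmu.port=8009
"""
workers = parse_workers(example)
print(workers["idimmu"]["host"], workers["idimmu"]["port"])  # localhost 8009
```

The host and port that come back should match the AJP connector in Tomcat's server.xml, and the worker name should match the one used in the JkMount line.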

High availability with LVS using LVSadmin

The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server.

We use LVS extensively at work to provide a scalable and highly available website which gets around 300 hits per second. Setting up and managing LVS can be made a lot easier using LVSadmin, a tool written by former colleagues. Written in Perl, it is easily configurable and provides a curses-based front end to manage the servers. Setting up a new LVS cluster is really easy.

For our new cluster we have 2 servers that I want to load balance with LVS:


dev-blobdirector0 10.0.2.4:8889
dev-blobdirector1 10.0.2.17:8889

And I want them presented with the following hostname:


lvs-dev-blobdirector 10.0.2.23

We want two LVS instances for redundancy in case one dies, and they will run on the following servers:


lvs0 10.0.2.18
lvs1 10.0.2.19

The target platform is Ubuntu Dapper, which is our platform of choice at the moment until Hardy is out!

On lvs0 and lvs1, grab the source code for LVSadmin (lvsadmin, LVS.pm) from SourceForge and place it in /usr/local/bin; lvsadmin should be +x.

Then install the following packages


apt-get install perl-modules libcurses-perl libcurses-widgets-perl keepalived

A few variables need to be changed in LVS.pm:

* Change the $MASTER to the hostname of the master server, in our case lvs0
* Change $IF to the interface that packets will be coming from, in our case eth0
* There is a br0 further down the script that needs to reflect the $IF change so again change that to eth0
* Change $PASSWORD to the keepalived password you want
* Find the lvs_id and change that to a new unique instance for this LVS cluster.

LVS.pm on both servers should be identical.

The following files need to then be made in /etc/keepalived

portlist (this is a list of all the realserver ports LVS will manage)


8889

serverlist (these hostnames are resolved to create a meaningful display)


lvs-dev-blobdirector.btn.dbplc.com
dev-blobdirector0
dev-blobdirector1

serverstate (the default state of the servers; lvsadmin will read and write its state to this file when you change things)


10.0.2.23:10.0.2.4:8889:up
10.0.2.23:10.0.2.17:8889:up

services (list of virtual server ports)


8889

viplist (list of virtual server IPs)


10.0.2.23 eth0
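
Judging by the example above, each serverstate line reads virtual IP, real server IP, port and state, separated by colons. That layout is inferred from the sample rather than from any LVSadmin documentation, so treat this Python sketch as a reading aid; the names in it are made up.

```python
from collections import namedtuple

ServerState = namedtuple("ServerState", "vip real_ip port state")


def parse_serverstate(text):
    """Split each vip:real_ip:port:state line into a ServerState tuple."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        vip, real_ip, port, state = line.split(":")
        entries.append(ServerState(vip, real_ip, int(port), state))
    return entries


sample = "10.0.2.23:10.0.2.4:8889:up\n10.0.2.23:10.0.2.17:8889:up\n"
for entry in parse_serverstate(sample):
    print(entry.real_ip, entry.port, entry.state)
```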

Then each of the real servers needs the virtual IP to listen on. In /etc/network/interfaces add:


auto lo:23
iface lo:23 inet static
address 10.0.2.23
netmask 255.255.255.255
broadcast 10.0.255.255
pre-up echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore; echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce

then on each real server start the interface:


# ifup lo:23

That’s all the configuration done. We now just have to start the LVS system; the first time is a little flaky, but from then on it will work smoothly.

On each lvs server start lvsadmin, then go to info and press Shift-S to save; it will then create /etc/keepalived/keepalived.conf:


! Configuration File for keepalived

global_defs {
lvs_id LVS_XEN
}

vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
advert_int 1
smtp_alert
authentication {
auth_type PASS
auth_pass eggsandham
}
virtual_ipaddress {
10.0.2.23 dev eth0
}
}
# Virtualserver: 10.0.2.23
virtual_server 10.0.2.23 8889 {
delay_loop 60
lb_algo wrr
lb_kind DR
protocol TCP
virtualhost www.digitalbrain.com
# Realserver: 10.0.2.4
real_server 10.0.2.4 8889 {
weight 30
inhibit_on_failure
TCP_CHECK {
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
# Realserver: 10.0.2.17
real_server 10.0.2.17 8889 {
weight 30
inhibit_on_failure
TCP_CHECK {
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}

Then start keepalived


/etc/init.d/keepalived start

You should now be able to telnet to the correct port on the virtual server if it’s working!


$ telnet lvs-dev-blobdirector 8889
Trying 10.0.2.23...
Connected to lvs-dev-blobdirector.btn.dbplc.com.
Escape character is '^]'.

To test LVS redundancy take down the master (lvs0) and see if you can still connect to the virtual server.


# /etc/init.d/keepalived stop
Stopping keepalived: keepalived.
# ps aux | grep keep
root 3859 0.0 0.1 3940 900 pts/1 R+ 14:04 0:00 grep keep


$ telnet lvs-dev-blobdirector 8889
Trying 10.0.2.23...
Connected to lvs-dev-blobdirector.btn.dbplc.com.
Escape character is '^]'.
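
The telnet check can also be scripted, so it can be left running while keepalived is stopped on the master. A minimal Python sketch, assuming a plain TCP connect is a good enough health check (the function name is made up for illustration):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

Calling can_connect("lvs-dev-blobdirector", 8889) in a loop during the failover should keep returning True apart from a brief gap while the backup takes over the virtual IP.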

And there we have it, an easy way to create a scalable, highly available server platform!

SVN COPY 502 Bad Gateway error

Our developers were recently experiencing a weird problem with our SVN installation: they couldn’t copy any files in SVN and would always get the following error:


svn: COPY of /project/!svn/bc/5121/trunk/path/file.gif: 502 Bad Gateway (https://svn)

A quick fix would of course have been to just create a new file and copy the contents, but this wouldn’t have kept the file history. A quick google led to a solution!

Our specific problem was that we had tried to be lean and set up a default https config, holding the SSL parameters, that all our SSL sites shared, but we hadn’t explicitly enabled SSL in the svn vhost. SSL was working fine, yet Apache and mod_ssl thought requests to that vhost were arriving on port 80 (http) rather than port 443 (https). As a result the COPY looked like a request to copy a file from one svn repository to a completely different one.

The solution was to put the SSL engine and certificate options back into the svn vhost so Apache would pick up that the connection was indeed https and not http!
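
A WebDAV COPY names its target in a Destination header, and the server only performs the copy itself when that URL points back at the same server. The Python sketch below is a much-simplified illustration of that comparison, not mod_dav's actual code; the function name is hypothetical and the destination path is a placeholder. It shows why a vhost that believes it is serving http treats an https destination as a different server.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def same_server(server_url, destination_url):
    """Compare scheme, host and effective port of two URLs."""
    def key(url):
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        return (parts.scheme, parts.hostname, port)
    return key(server_url) == key(destination_url)


destination = "https://svn/project/trunk/path/copy.gif"
print(same_server("http://svn/", destination))   # False: looks like a foreign server
print(same_server("https://svn/", destination))  # True: the copy can happen locally
```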

Version Control with Subversion

For more SVN advice, I recommend Version Control with Subversion by O’Reilly. It contains everything you need to know when using or managing SVN repositories.

New Year’s Resolutions

A lot of people hate the idea of New Year’s resolutions, but if you want to make some changes and the fact it’s the start of a year will give you motivation, then so be it! Let’s see how many I keep in 2009!

  • Give up smoking
  • Give up caffeine
  • Stick to diet and gym
  • Less red meat and fatty meats
  • No takeaway pizza, McDonalds or Burger King
  • Attend martial arts classes more regularly
  • Join a yoga/pilates class to improve posture/flexibility
  • Visit another continent

Last year my resolution was to give up McDonalds. I only wavered once, when stranded at Liverpool St. Station!

Resize Xen Filesystem

We run a lot of Xen instances for our development and test servers and a few were starting to get full. Fortunately the disks in the real servers were very large and the xenlet partitions were made using LVM so resizing them to add more space was possible!

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             4.0G  3.8G  200M  95% /
varrun                257M   48K  257M   1% /var/run
varlock               257M     0  257M   0% /var/lock
udev                  257M   40K  257M   1% /dev
devshm                257M     0  257M   0% /dev/shm

Basically we just have to shut down the xenlet, extend the logical volume, check and grow the filesystem on it, and then start the xenlet again. Simple!

# xm shutdown dev-myfiles0
# lvextend -L40G /dev/vg0/dev-myfiles0-disk
  Extending logical volume dev-myfiles0-disk to 40.00 GB
  Logical volume dev-myfiles0-disk successfully resized
# e2fsck -f /dev/vg0/dev-myfiles0-disk
e2fsck 1.40.2 (12-Jul-2007)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/vg0/dev-myfiles0-disk: 16541/524288 files (0.9% non-contiguous), 138346/1048576 blocks
# resize2fs /dev/vg0/dev-myfiles0-disk
resize2fs 1.40.2 (12-Jul-2007)
Resizing the filesystem on /dev/vg0/dev-myfiles0-disk to 10485760 (4k) blocks.
The filesystem on /dev/vg0/dev-myfiles0-disk is now 10485760 blocks long.
# cd /etc/xen
# xm create dev-myfiles0.cfg
Using config file "./dev-myfiles0.cfg".
Started domain dev-myfiles0

Wee, lots of free space now!

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              40G  3.8G   37G  10% /
varrun                257M   40K  257M   1% /var/run
varlock               257M     0  257M   0% /var/lock
udev                  257M   40K  257M   1% /dev
devshm                257M     0  257M   0% /dev/shm
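
The block counts reported by e2fsck and resize2fs tie up with the sizes shown by df. A quick Python check of the arithmetic, assuming the 4k block size that resize2fs reports:

```python
BLOCK_SIZE = 4 * 1024  # bytes per filesystem block ("4k" in the resize2fs output)
GIB = 1024 ** 3

blocks_before = 1048576   # total blocks in the e2fsck summary line
blocks_after = 10485760   # total blocks in the resize2fs output

print(blocks_before * BLOCK_SIZE / GIB)  # 4.0
print(blocks_after * BLOCK_SIZE / GIB)   # 40.0
print(40 * GIB // BLOCK_SIZE)            # 10485760
```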