Recently in code Category

koha-map.png

Browsing through subscribed videos this week on YouTube I stumbled upon video SImulating Markers with Tile Layers which described how to create custom tiles for Google maps using perl and PostgreSQL. John Coryat did great job at describing challenges, but also provided useful source code snippets and documentation how to create custom tiles. So, this weekend I decided to try it out using publisher field (260$a) from our Koha to show where are our books coming from?

I had several challenges to overcome, including migrating data from MySQL Koha database to PostgreSQL (so I can use great point data-type), geolocating publisher location using Yahoo's PlaceFinder (I tried to use Google's v3 geolocation API, but it had limit at 10 requests which wasn't really useful). I also added support for different icons (with arbitrary size) depending on the zoom level. In the process I also replaced cgi based tile server with mod_rewrite configuration which does same function but inside Apache itself.

Source code is available at github.com/dpavlin/google-map-tiles and it's really easy to overlay huge amount of data-points over Google maps using custom tiles, so try it out!

If you talked with me in last years or so, you probably heard me mention queues as new paradigm in application development. If your background is web-development, you probably wondered why are they important. This blog will try to explain why they are useful and important, and how you can make your app scale, even on same box.

Problem was rather simple: I needed to make monitoring which will pull data from ~9000 devices using telnet protocol and store it in PostgreSQL. Normal way to solve this would be to write module which first checks if devices are available using something like fping and then telnet to each device and collect data. However, that would involve careful writing of puller, taking care of child processes and so on. This seemed like doable job, but it also seemed a bit complicated for task at hand.

So, I opted to implement system using Gearman as queue server, and leave all scaling to it. I decided to push all functionality in gearman workers. For that, I opted to use Gearman::Driver which allows me to easily change number of workers to test different configurations. Requirement was to pull each machine in 20-minute intervals.

Converting existing perl scripts which collect data into gearman workers was a joy. At first run (with 25 workers) it took 15 minutes to collect all data. Just by increasing number of workers to 100 we managed to cut down this time just over 1 minute. And that's on single core virtual machine (which makes sense, since most of the time we are waiting on network).

For web interface, I decided to use Mojolicious. But, to make it work with Gearman, I write MojoX::Gearman which allows me to invoke gearman functions directly from Mojolicious. In fact, all functionality of web interface is implemented as Gearman workers, even querying database :-)

I have been playing with Linux containers for a while, and finally I decided to take a plunge and migrate one of my servers from OpenVZ to lxc. It worked quite well for testing until I noticed lack of support for automatic startup or shutdown. lxc-users mailing list was helpful in providing useful hints and I found 5 OpenSUSE scripts but decided it's too complicated for task at hand.

I wanted single script, for easy deployment on any Debian box, with following features:

  • reboot and halt from inside container should work as expected
  • cleanly shutdown or reboot container from host (as opposed to lxc-stop which is equivalent to turning power off)
  • init script to start and stop containers on host boot and shutdown

Result is lxc-watchdog.sh. It can start containers (and remember to start it on next host reboot), reboot or halt containers from host (using signals to trigger container's init) and automatically configure your container when you start it for the first time. Here is quick overview:

root@prod:/srv# ./lxc-watchdog.sh status
koha-240 RUNNING boot /virtual/koha-240
koha-241 STOPPED boot /virtual/koha-241
koha-242 STOPPED      /virtual/koha-242

root@prod:/srv# ./lxc-watchdog.sh start
# start koha-240
'koha-240' is RUNNING
# start koha-241
2010-03-16T23:44:16 koha-241 start
'koha-241' is RUNNING
# skip start koha-242

root@prod:/srv# ./lxc-watchdog.sh status
koha-240 RUNNING boot /virtual/koha-240
koha-241 RUNNING boot /virtual/koha-241
koha-242 STOPPED      /virtual/koha-242

root@prod:/srv# ls -al /var/lib/lxc/*/on_boot
-rw-r--r-- 1 root root 9 2010-03-16 21:40 /var/lib/lxc/koha-240/on_boot
-rw-r--r-- 1 root root 9 2010-03-16 21:40 /var/lib/lxc/koha-241/on_boot
-rw-r--r-- 1 root root 0 2010-03-16 22:58 /var/lib/lxc/koha-242/on_boot
As you can see, I used file /var/lib/lxc/name/on_boot to record which machines to bring up. When container is started for the first time, it will have boot enabled (just in case this is production application which you will reboot in 6 months and then wonder why it doesn't work). You can change boot status using:
root@prod:/srv# ./lxc-watchdog.sh boot koha-242
# boot koha-242

root@prod:/srv# ./lxc-watchdog.sh status
koha-240 RUNNING boot /virtual/koha-240
koha-241 RUNNING boot /virtual/koha-241
koha-242 STOPPED boot /virtual/koha-242

root@prod:/srv# ./lxc-watchdog.sh disable koha-242
# disable koha-242
Installation as init script /etc/init.d/lxc-watchdog is easy:
root@prod:/srv# ln -s /srv/lxc-watchdog.sh /etc/init.d/lxc-watchdog

root@prod:/srv# update-rc.d lxc-watchdog defaults
update-rc.d: using dependency based boot sequencing
And finally, it can also be used to manually start, halt or reboot containers:
root@prod:/srv# /etc/init.d/lxc-watchdog start koha-242
# start koha-242
2010-03-16T23:47:46 koha-242 start
'koha-242' is RUNNING

root@prod:/srv# /etc/init.d/lxc-watchdog status
koha-240 RUNNING boot /virtual/koha-240
koha-241 RUNNING boot /virtual/koha-241
koha-242 RUNNING      /virtual/koha-242

root@prod:/srv# /etc/init.d/lxc-watchdog restart koha-242
# restart koha-242
2010-03-16T23:48:46 koha-242 kill -SIGINT 24838

root@prod:/srv# /etc/init.d/lxc-watchdog status
koha-240 RUNNING boot /virtual/koha-240
koha-241 RUNNING boot /virtual/koha-241
koha-242 RUNNING      /virtual/koha-242

root@prod:/srv# /etc/init.d/lxc-watchdog stop koha-242
# stop koha-242
2010-03-16T23:49:55 koha-242 stop
2010-03-16T23:49:55 koha-242 kill -SIGPWR 26086
2010-03-16T23:50:11 koha-242 stoped
In fact, you can use halt or reboot if you don't like stop and restart, just to keep one mapping less in your brain when working with it.

Log files are created for each container in /tmp/name.log. They include lxc-start output with boot messages and any output that started scripts might create which is useful for debugging container installations. Output from watchdog monitoring /var/run/utmp in container is also included, and it reports number of tasks (processes) in container, and here is example of stopping container:

root@prod:/srv# tail -5 /tmp/koha-242.log
2010-03-16T23:49:56 koha-242 66 tasks
2010-03-16T23:50:04 koha-242 22 tasks
2010-03-16T23:50:11 koha-242 runlevel 2 0
2010-03-16T23:50:11 koha-242 halt
2010-03-16T23:50:12 koha-242 watchdog exited
Hopefully this will make your switch to Linux Containers and recent kernels easier...

You have your new shiny application, and LDAP server on the other side. Easy as pie. What can go wrong?

  • you use e-mail as login, and application assumes that logins don't have domain in them and allows you embedding of whole login into DN
  • application can import various interesting fields using LDAP, but you have data somewhere else, and it doesn't really belong into your LDAP
  • you need to provide subset of data in your database as LDAP server to application

I had written about my saga with LDAP about augmenting LDAP search responses and exposing RDBMS data as LDAP server. But today, I added rewrite of bind, so now I can use unmodified Koha, and all needed quirks for my AAI@EduHr schema are outside application.

This made my attempt to virtualize LDAP almost complete so I created a project page on Ohloh. I will write small updates about status there, so If any of this is interesting to you, hop over there.

sack-onion-logo.png Main design goal is to have interactive environment to query perl hashes which are bigger than memory on single machine.

Implementation uses TCP sockets (over ssh if needed) between perl processes. This allows horizontal scalability both on multi-core machines as well as across the network to additional machines.

Reading data into hash is done using any perl module which returns perl hash and supports offset and limit to select just subset of data (this is required to create disjunctive shards). Parsing of source file is done on master node (called lorry) which then splits it to shards and send data to sack nodes.

Views are small perl snippets which are called for each record on each shard with $rec. Views create data in $out hash which is automatically merged on master node.

You can influence default shard merge by adding + (plus sign) in name of your key to indicate that key => value pairs below should have values summed when combining shards on master node.

If view operation generate huge amount of long field names, you might run out of memory on master node when merging results. Solution is to add # to name of key which will turn key names into integers which use less memory.

So, how does it look? Below is small video showing 121887 records spread over 18 cores on 9 machines running first few short views, and than largest one on this dataset.

If your browser doesn't have support for <video> tag, watch Sack video on YouTube or using ttyrec player written in JavaScript.

Source code for Sack is available in my subversion and this is currently second iteration which brings much simpler network protocol (based only on perl objects serialized directly to socket using Storable) and better support for starting and controlling cluster (which used to be shell script).

Update: Sack now has proper home page at Ohloh and even playlist on YouTube (which doesn't really like my Theora encoded videos and doesn't have rss feed natively).

Following video shows improvements in version 0.11 on 22 node cloud hopefully better than video above.

Few days ago, I was writing wiki page with user management description. In it, I wanted to have current data about users. Since I'm using wiki which has RSS feed include, I decided to hack on SQL2XLS until it produced RSS feed which I can include within page.

Usage is very similar to SQL2XSL: you just create a bunch of SQL files, and each of them will create one item in RSS feed. Items links in RSS feed will return just that SQL query.

If you need something similar, and have CGI enabled server, get sql2rss.cgi, create SQL files and you are ready to go...

Roughly six months ago, I wrote first version of Redis perl bindings. For some strange reason, I didn't push it to CPAN at the same time, but recent discussion on Redis list (and two pending e-mail about it) forced me to finally make a push.

And it was worth it! Just a day later we have KiokuDB::Backend::Redis. Did I mentioned it was just one day later? Remember: release often, release early, but also release at right place :-)

First, of all, happy sysadmin day 2009-07-31! So, it seems logical that I'm announcing by project PXElator which aims to replace me with a perl script. It's basically my take on cloud hype. Currently it supports bringing up machines (virtual or physical) from boot onwards. It implements bootp, dhcp, tftp and http server to enable single action boot of new machine.

It all started when I watched Practical Puppet: Systems Building Systems and decided that real power is in expressing system administration as code. I also liked DSL approach which Puppet took in ruby and tried to apply same principle (declarative DSL) in perl. If you take a look at source code it seems work work quite well.

In the spirit of release early, release often this code will be in flux until end of this summer, when I plan to deploy it to create web kiosk environment for browsing our library catalog based on it.

I'm working on Linux version of Sun storage machines, using commodity hardware, OpenVZ and Fuse-ZFS. I'm do have working system in my Sysadmin Cookbook so I might as well write a little bit of documentation about it.

My basic requirements are:

This makes it self-running system which won't fall over itself, so let's see how does it look:

root@opl:~# zpool status
  pool: opl
 state: ONLINE
 scrub: resilver completed after 1h59m with 0 errors on Wed Jun  3 15:29:50 2009
config:

        NAME        STATE     READ WRITE CKSUM
        opl         ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0

errors: No known data errors
root@opl:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
opl                            183G  35.7G    21K  /opl
opl/backup                     180G  35.7G    22K  /opl/backup
opl/backup/212052             76.1G  35.7G  8.12G  /opl/backup/212052
opl/backup/212052@2009-05-01  5.69G      -  7.50G  -
opl/backup/212052@2009-05-10  5.69G      -  7.67G  -
opl/backup/212052@2009-05-15  5.57G      -  7.49G  -
opl/backup/212052@2009-05-22  3.54G      -  7.74G  -
opl/backup/212052@2009-05-25  3.99G      -  8.38G  -
opl/backup/212052@2009-05-26  3.99G      -  8.38G  -
...
opl/backup/212052@2009-06-05  3.72G      -  8.09G  -
opl/backup/212052@2009-06-06      0      -  8.12G  -
opl/backup/212056             1.42G  35.7G   674M  /opl/backup/212056
opl/backup/212056@2009-05-30  37.1M      -   688M  -
opl/backup/212056@2009-05-31  47.3M      -   747M  -
opl/backup/212056@2009-06-01  40.9M      -   762M  -
opl/backup/212056@2009-06-02  62.4M      -   787M  -
...
opl/backup/212056@2009-06-05  12.1M      -  1.02G  -
opl/backup/212056@2009-06-06      0      -   674M  -
opl/backup/212226              103G  35.7G  26.8G  /opl/backup/212226
opl/backup/212226@2009-05-05  4.29G      -  26.7G  -
opl/backup/212226@2009-05-10  4.04G      -  26.6G  -
opl/backup/212226@2009-05-15  4.19G      -  26.6G  -
opl/backup/212226@2009-05-22  4.12G      -  26.7G  -
opl/backup/212226@2009-05-23  4.12G      -  26.7G  -
opl/backup/212226@2009-05-24  4.09G      -  26.6G  -
opl/backup/212226@2009-05-25  4.14G      -  26.7G  -
opl/backup/212226@2009-05-26  4.13G      -  26.7G  -
...
opl/backup/212226@2009-06-05  4.20G      -  26.8G  -
opl/backup/212226@2009-06-06      0      -  26.8G  -
opl/clone                      719M  35.7G    25K  /opl/clone
opl/clone/212056-60018         666M  35.7G  1.39G  /opl/clone/212056-60018
opl/clone/212226-60017        53.0M  35.7G  26.7G  /opl/clone/212226-60017
opl/vz                        1.59G  35.7G  43.5K  /opl/vz
opl/vz/private                1.59G  35.7G    22K  /opl/vz/private
opl/vz/private/60014           869M  35.7G   869M  /opl/vz/private/60014
opl/vz/private/60015           488M  35.7G   488M  /opl/vz/private/60015
opl/vz/private/60016           275M  35.7G   275M  /opl/vz/private/60016
There are several conventions here which are useful:
  • pool is named same as machine (borrowing from Debian way of naming LVM volume groups) which makes it easy to export/import pools on different machines (I did run it with mirror over nbd for a while)
  • snapshots names are dates of snapshot for easy overview
  • clones (writable snapshots) are named using combination of backup and new container ID

There are several things which I wouldn't be able to get without zfs:

  • clones can grows as much as they need
  • data is compressed, which increase disk IO as result
  • zfs and zpool commands are really nice and intuitive way to issue commands to filesystem
  • zpool history is great idea of writing all filesystem operations to internal log
  • ability to re-sliver (read/write all data on platters) together with checksums make it robust to disk errors

I rarely use X11 for system administration. There is one tool, however, which was always invaluable to me: xlax. Sure, there are other solutions but somehow, I got addicted to xlax ever since I was introduced to it by fellow sysadmin.

Until today, that is. I always have source on the disk since it was such a hard thing to find. I run quick xmkmf on it, and... it didn't work! However, since xlax now has a home page which explains everything about XTerm*allowSendEvents I was on right track.

But, now so fast, grasshopper!
dpavlin@llin:/rest/unix/x11/xlax2.4$ make
gcc -m32 -o xlax -g -O2 -fno-strict-aliasing       xlax.o -lXaw -lXmu -lXt -lSM -lICE -lXpm  -lXext -lX11      
xlax.o: In function `SetupInterface':
/rest/unix/x11/xlax2.4/xlax.c:173: undefined reference to `strlcpy'
collect2: ld returned 1 exit status
make: *** [xlax] Error 1
Argh. I started with Digital UNIX (called OSF/1 back then) and part of my brain which deals with minor adjustments hasn't died yet, so I decided to do quick google search for strlcpy which which has handy link to strlcpy implementation in OpenBSD. Licensing terms aside, I decided to give it a try.

After applying following patch:

diff -urw xlax2.4/Imakefile xlax2.4.strlcpy/Imakefile
--- xlax2.4/Imakefile   2008-07-31 22:18:25.000000000 +0200
+++ xlax2.4.strlcpy/Imakefile   2009-04-30 21:32:15.000000000 +0200
@@ -5,8 +5,8 @@
 #            DEFINES = -DDEBUG
             DEPLIBS = XawClientDepLibs
     LOCAL_LIBRARIES = XawClientLibs
-               SRCS = xlax.c
-               OBJS = xlax.o
+               SRCS = xlax.c strlcpy.c
+               OBJS = xlax.o strlcpy.c
 
 ComplexProgramTarget(xlax)

and quick xmkmf && make and I got it compiled.

Another trivial change was to implement automatic ssh to each host in mkxlax by adding -e 'ssh $ARGV[$i]' to system xterm line so I will have remote reminals opened by default.

And now, back to real work :-)

About this Archive

This page is an archive of recent entries in the code category.

conferences is the next category.

Find recent content on the main index or look in the archives to find all content.

Pages

  • pics
OpenID accepted here Learn more about OpenID
Powered by Movable Type 5.04