hack-of-the-week Archives

August 23, 2007

Exhibit facet browsing

We have a few mp3 players which no longer work but are still under warranty, so the idea was to pick another device (which will hopefully last longer). However, on-line shops leave a lot to be desired if you just want to do some quick filtering of data.

By a very fortunate accident, I stumbled upon Exhibit from the SIMILE project at MIT, which also brought us such nice tools as Timeline and Potluck.

So, I scraped the web, converted the result to CSV and tried to do something with it. In the process I once again ran into the problem of semi-structured data: while the data is separated into columns, one column holds the generic description, player name and all characteristics together.

So, what did I do? Well, I started with CPAN and a few hours later I had a script which is rather good at parsing semi-structured CSV files. It supports the following:

  • guess the CSV delimiter on its own (using Text::CSV::Separator)
  • recognize 10 Kb and similar sizes and normalize them (using Number::Bytes::Human)
  • split comma (,) separated values within a single field
  • strip a common prefix from all values in one column
  • group values and produce additional properties in the data
  • generate a specified number of groups for numeric data, useful for price ranges
  • produce JSON output for Exhibit using JSON::Syck
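
To give a feel for two of the items above, here is a minimal shell sketch of size normalization and numeric grouping. It is not the script itself (that one is Perl and uses the CPAN modules listed); the function names to_bytes and bucket are made up for illustration, and treating 1 Kb as 1024 bytes is an assumption.

```shell
#!/bin/sh
# Illustrative sketch only; the real script is Perl and uses CPAN modules.
# to_bytes and bucket are hypothetical helper names.

# to_bytes "10 Kb" -> 10240
# Assumes whole numbers and binary multiples (1 Kb = 1024 bytes).
to_bytes() {
    num=$(printf '%s' "$1" | sed 's/[^0-9].*$//')
    unit=$(printf '%s' "$1" | sed 's/^[0-9]*[[:space:]]*//' | tr 'A-Z' 'a-z')
    case "$unit" in
        ''|b) mult=1 ;;
        k|kb) mult=1024 ;;
        m|mb) mult=1048576 ;;
        g|gb) mult=1073741824 ;;
        *)    echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    echo $((num * mult))
}

# bucket VALUE MIN MAX GROUPS -> 0-based index of the equal-width group
# that VALUE falls into; the maximum value lands in the last group.
bucket() {
    idx=$(( ($1 - $2) * $4 / ($3 - $2) ))
    if [ "$idx" -ge "$4" ]; then
        idx=$(( $4 - 1 ))
    fi
    echo "$idx"
}

to_bytes "10 Kb"        # prints 10240
bucket 75 0 100 4       # prints 3 (fourth of four price groups)
```

Equal-width groups are the simplest choice for price ranges; groups with equal numbers of items would need the sorted data instead.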


So how does it look?

In the end, it is very similar to the way Dabble DB parses your input. But, I never actually had any luck importing data into Dabble DB, so this one works better for me :-)

This will probably evolve into a universal munger from CSV to an arbitrary hash structure. What would be a good name? Text::CSV::Mungler?

This is the first post in a series which will cover one hack a week on my blog. This will (hopefully) force me to write at least one post a week, and also provide some historic trace of my work for later.

September 7, 2007

Subversion tools

In an effort to continue my hack-of-the-week series, here is a quick overview of a few Subversion hacks I have worked on lately:

  • svn-ignore.sh is a tiny shell script which opens the list of all unversioned files in the current svn or svk working copy in your $EDITOR and adds the result of your edit to svn:ignore
  • svndump-move.pl is a more complex perl script which allows you to reorganize the directory layout in your repository while preserving revision history -- it solves problems like: oh, if only the root of my repository were in subdirectory foo...
  • svn2cvs is a somewhat older tool which received attention when Bartek Teodorczyk very patiently started to report problems with it. As a result, it now has a test suite and is much more robust

Most of the documentation for those tools is hidden in Subversion commit messages. If you think they are useful, take a peek there...

October 9, 2007

PowerPC emulation

In an effort to bring kernel 2.6 to the D-Link DSM G600 device, I have been playing with PowerPC emulators, trying to boot Linux on them (with the grand plan of writing support for the DSM G600 if at all possible).

As a start, I tried to boot the DSM G600 Linux kernel with qemu. That didn't work very well... So, I got detoured into trying to make PearPC boot anything. Darwin kind-of worked until halfway through the installation, but nothing else did. Shame, I really like PearPC's debugging capabilities.

So I went back to qemu. And tried, and patched it, and tried again. Didn't help.

However, I had the good fortune to stumble upon dynamips, which is a Cisco 7200 emulator (it also covers other Cisco MIPS and PowerPC routers). It is written much more cleanly, and it was able to boot my unmodified DSM G600 kernel after just a few tweaks to load the kernel ELF image at the beginning of memory, as opposed to the address which is embedded in the ELF file:

setup_arch: enter
setup_arch: bootmem
mpc10x:enter
mpc10x:exit2
Bridge init failed
arch: exit
Memory BAT mapping: BAT2=16Mb, BAT3=0Mb, residual: 0Mb
Total memory is 16777216.
Total memory = 16MB; using 64kB for hash table (at c01f0000)
Linux version 2.4.21-pre4 (dpavlin@brr) (gcc version 2.95.4 20010319 (prerelease)) #523 Wed Sep 5 13:07:36 CDT 2007
Host bridge init failed
Motorola SPS Sandpoint Test Platform
Port by MontaVista Software, Inc. (source@mvista.com)
On node 0 totalpages: 4096
zone(0): 4096 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: root=/dev/sda3 console=ttyS0,9600
No OpenPIC found !
Calibrating delay loop...

It doesn't go much further than this, but then again, this is also great progress :-)

I also evaluated GXemul, which is another multi-CPU emulator (supporting PowerPC among other CPUs), but it doesn't have an MMU implementation for PowerPC, so it wasn't able to map memory around, which is needed for Linux boot (it seems). It's a shame again, because it does have great debugging...

Anders Gavare answered my e-mail about MMU and I stand corrected:

GXemul has MMU (TLB) emulation for PowerPC. Unix-like OSes (such as NetBSD) require it.

However, different powerpc processors may use different kinds of tlbs, so it is possible that the specific PowerPC processor you want to emulate is not implemented in GXemul.

Now, I would really like to have support in qemu. However, qemu code is anything but clean and simple, so this doesn't seem like an achievable goal, at least for me. But if I continue to develop an emulator based on dynamips or GXemul, I will have to write all hardware support myself.

More information about current status can be found in this forum post.

September 18, 2007

Optimized Link State Routing protocol

I have returned from YAXWE with very good impressions of mesh networking. I did some reading and in theory I already knew how this works, but seeing an actual implementation of the idea in the form of Freifunk Firmware was both useful and impressive :-)

On the other hand, installing OLSR on Debian boxes is so easy that there is no excuse for not becoming a local mesh node:

  1. Install packages
    $ apt-get install olsrd olsrd-plugins
    
  2. Put your card in adhoc mode

    This part might be specific to your card, but for my Atheros card it's something like:

    $ wlanconfig ath1 create wlandev wifi0 wlanmode adhoc
    $ iwconfig ath1 essid olsr.example.com

    You should also probably run dhclient or ifconfig to set up your IP address, depending on the configuration of your local mesh network
  3. Change network interface in configuration

    $ vi /etc/olsrd/olsrd.conf


    Something like: Interface "ath1"
  4. Start it automatically

    $ vi /etc/default/olsrd

    Change line to START_OLSRD="YES"
  5. Start daemon

    $ /etc/init.d/olsrd start

    And that's it! You are now part of the mesh network.

September 27, 2007

OpenMoko as a phone

For quite a long time I was complaining (in person) how nice but only half-usable my OpenMoko is. However, thanks to a few great hints I'm now able to dial out and receive calls.

First, you will really want to install the cu package. It contains the old UUCP serial tool, which will be much more useful than you might think! Think of cu as cat for the console.

Turn on your OpenMoko (while holding the AUX button) and type the following:

chown uucp:uucp /dev/ttyACM0 ; cu -l /dev/ttyACM0

You might try to just run cu as root, but it still doesn't work (for me) without the chown first. If someone could tell me how to make this automatic, I would be grateful. So, dear lazyweb, I'm quite sure that there is some udev option for that, and if you know what it is, drop me a note. If not, this might become a topic for another post.

Then, change boot parameters:
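
One possible direction, as an untested sketch: a udev rule that hands matching devices to the uucp user. OWNER, GROUP and MODE are standard udev assignment keys; the helper function and the rule file name below are hypothetical, and matching on ttyACM[0-9]* will catch every USB ACM serial device, not just the Neo.

```shell
#!/bin/sh
# Untested sketch: write a udev rule so /dev/ttyACM* is owned by uucp.
# write_acm_rule and the rule file name used below are made up.
write_acm_rule() {
    cat > "$1" <<'EOF'
# give USB ACM serial devices (such as the Neo's boot console) to uucp
KERNEL=="ttyACM[0-9]*", OWNER="uucp", GROUP="uucp", MODE="0660"
EOF
}

# As root, something like:
#   write_acm_rule /etc/udev/rules.d/99-neo-serial.rules
```

If this works, the chown step before cu would no longer be needed once the device is plugged in again.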

GTA01Bv4 # setenv bootargs_base rootfstype=jffs2 root=/dev/mtdblock4 console=tty0 loglevel=8
GTA01Bv4 # saveenv                
Saving Environment to NAND...
Erasing Nand...Writing to Nand... done
GTA01Bv4 # boot               

NAND read: device 0 offset 0x44000, size 0x1fc000
2080768 bytes read: OK
## Booting image at 32000000 ...
Image Name: OpenMoko Kernel Image Neo1973(GT
Created: 2007-08-31 11:29:10 UTC
Image Type: ARM Linux Kernel Image (gzip compressed)
Data Size: 1637653 Bytes = 1.6 MB
Load Address: 30008000
Entry Point: 30008000
Verifying Checksum ... OK

This will disable output on the serial console, which interferes with gsmd when it tries to open the serial port to communicate with the GSM part.

This is my journey so far... Now I have to wait for my poor old desktop to compile all packages to get freshest copies on my Neo...

October 16, 2007

Stitching maps together

I wanted to review my route to a recent conference, and there are a lot of map servers out there. But none of them offers a good way to print a map (analog copies are always useful :-)

I didn't really want the whole map, just the part which I viewed. And then I noticed that those parts had already arrived on my laptop (because I can view them in the browser). Parts? Packets? With a quick trip to tshark (wireshark, but for the terminal) I captured a network trace of my browsing of the route.

So, now I had all the tiles I needed to print out (to find my way along the route) in pcap capture format. A quick CPAN search and I found Net::Analysis, which seemed to fit the bill. One simple HTTP transaction listener later, I had the tiles saved in directories which are zoom levels, with each tile labeled with its coordinates.

Do I want to stitch them together by hand? Of course not: a handy (misspelled) tool will do that for you. It's written as cleanly as possible, and might be a good illustration of one possible way to write perl code (don't write too much!).

If you didn't check the links to the source code, please keep in mind that the whole working implementation is under 3k of perl code!

It does depend on CPAN, but just imagine how much time I would have spent writing TCP session reassembly and/or multiple-format image handling if I hadn't used CPAN.

Off to the road now...

October 25, 2007

Forget about Cisco VPN binary

I have heard several complaints about Cisco VPN client for Linux. It's a kernel module, so it's a mess.

However, a few days ago I had to connect to one site using Cisco VPN. I even got the Windows client. Which didn't work from emulated Windows on my Linux box... What should I do?

My first idea, to search Debian packages, proved to be the right thing to do: I soon found vpnc, which had all the tools needed for the transition. It even has a tool to convert the Windows Support_template.pcf to a format suitable for vpnc!

I also wrote a short tutorial on how to install and configure it, which might be helpful.

November 11, 2007

RAID5 for home

Several years after I wrote about software RAID1 and RAID5, I must say that I'm happily surprised by how much software and hardware support for software RAID under Linux has improved. My articles are more or less obsolete now, but this is the short story of my RAID5 array for home.

I have a Compaq brand-name desktop PC (refurbished, good buy) with one 160 GB SATA disk. I wanted to add three more 500 GB disks in a RAID5 array to create ~1 TB of storage with redundancy.
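
The ~1 TB figure follows from how RAID5 spends one disk's worth of space on parity. A small sketch of that arithmetic (the helper names are made up, and sizes are nominal marketing gigabytes):

```shell
#!/bin/sh
# RAID5 keeps one disk's worth of parity, so usable space is (n - 1) disks.
# raid5_usable and gb_to_gib are hypothetical helper names.

# raid5_usable DISKS SIZE_PER_DISK -> usable size, same unit as SIZE_PER_DISK
raid5_usable() {
    echo $(( ($1 - 1) * $2 ))
}

# gb_to_gib GB -> whole GiB (decimal gigabytes to binary, rounded down)
gb_to_gib() {
    echo $(( $1 * 1000000000 / 1073741824 ))
}

usable=$(raid5_usable 3 500)
echo "usable: ${usable} GB, about $(gb_to_gib "$usable") GiB"
```

Tools that count in binary units will therefore show a number in the low 930s rather than 1000 for this array.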

The motherboard has just two SATA channels, and from talking with a lot of people it seems that SATA devices can't be chained (please correct me if I'm wrong). I had somehow assumed that I could connect more than one disk to one SATA channel, partly from the SCSI world, partly from the old IDE (err, PATA) master/slave relationship. So an additional controller was in order, and the only one available (with two SATA ports, in PCI Express variant) was a no-name SiI 3132 based one:

20:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)

I googled a bit and found some horror stories (including a binary driver on Silicon Image's site for RHEL kernels), but the latest Debian kernel 2.6.23-1-686 just worked. I wanted to use it just as a SATA controller (without the RAID part), so I was all set. It doesn't do RAID5 anyway.

Now, let's take a short detour and explain why I want RAID5. Isn't RAID1 better/faster/the right stuff for a system?

I know only one use for RAID1: booting. And that's only if you don't have hardware RAID5, which presents the array as one disk anyway. If you want to boot Linux from software RAID5 you are out of luck.

But there is simply no reason to install a system on hardware RAID1 as opposed to hardware RAID5. If you have enough disks (five for example), the worst configuration is 2*RAID1 + 3*RAID5, since you get a maximum of 2 disks of accumulated performance. Much better performance (4 disks accumulated) is just 5*RAID5.

Having finished this rant in favor of RAID5: I also didn't want to boot from it (since I have the system on the 160 GB disk), I just need a reliable and fast disk. Since I could fit just 4 disks in the case, I will get 2 disks of accumulated performance plus reliability (the third disk).

Why not simple striping over three disks (heck, I would get 500 GB more storage)?

I don't really believe in desktop-grade disks (even SATA). Those disks are designed for an 8 hours/day duty cycle and will probably die in two years. And I like my data...

Let's see some performance, first single disk:

root@brr:~# hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads:   2066 MB in  2.00 seconds = 1033.49 MB/sec
 Timing buffered disk reads:  232 MB in  3.00 seconds =  77.22 MB/sec

and RAID5 array:

root@brr:~# hdparm -tT /dev/md0
/dev/md0:
 Timing cached reads:   2030 MB in  2.00 seconds = 1014.76 MB/sec
 Timing buffered disk reads:  444 MB in  3.01 seconds = 147.49 MB/sec

After several hours of uptime, they are not very hot (the order is the same as in the case):

root@brr:~# uptime
 21:53:24 up  5:57,  1 user,  load average: 0.00, 0.00, 0.00
root@brr:~# echo -e 'c\nd\na\nb'  | xargs -i hddtemp /dev/sd{}
/dev/sdc: WDC WD1600JS-60MHB1: 46°C
/dev/sdd: WDC WD5000AAKS-00YGA0: 45°C
/dev/sda: WDC WD5000AAKS-00YGA0: 48°C
/dev/sdb: WDC WD5000AAKS-00YGA0: 44°C

But they do go up to 55°C under heavy load (especially the unhappy sda, which is always a bit warmer than the other disks).

As you can also see, it moved my old 160 GB disk to sdc. Heck, I'm quite sure that I could force the BIOS into re-ordering this (or turn off the RAID BIOS on the controller), but I just don't have time.

And lastly how does it look?

root@brr:~# vgdisplay 
  --- Volume group ---
  VG Name               raid5
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               931.52 GB
  PE Size               4.00 MB
  Total PE              238468
  Alloc PE / Size       76800 / 300.00 GB
  Free  PE / Size       161668 / 631.52 GB
  VG UUID               onzKEw-TaRF-JBe1-9YWN-D6aJ-FIuY-1VRmLo

I didn't write down the exact commands used to create the array, but mdadm --create --help was basically enough. And then I waited for two hours for the RAID array to sync...

There is one more useful tidbit: if you have an ext3 filesystem formatted with the resize_inode option, you can do an on-line resize of the ext3 filesystem with resize2fs. Debian already does this by default (since etch, I believe), but you can check with:
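
For orientation, the sequence probably looked roughly like the dry run below. These are not the commands actually used: the device names (taken from the hddtemp listing), the use of whole disks instead of partitions, and the 300G volume size are assumptions, and plan_raid5 is a made-up helper that only prints.

```shell
#!/bin/sh
# Dry run: prints commands instead of running them (they need root and
# would destroy data on the listed disks). Device names, the use of whole
# disks and the 300G size are assumptions for illustration.
plan_raid5() {
    md=$1; shift
    echo "mdadm --create $md --level=5 --raid-devices=$# $*"
    echo "pvcreate $md"
    echo "vgcreate raid5 $md"
    echo "lvcreate -L 300G -n rest raid5"
}

plan_raid5 /dev/md0 /dev/sda /dev/sdb /dev/sdd
```

Printing first and pasting by hand is a cheap safeguard when the device order has just been shuffled by a new controller.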

root@brr:~# tune2fs -l /dev/raid5/rest | grep resize_inode
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file

So, if you installed LVM, you can resize your logical volume on-line while copying data to it. I did that while copying from old PATA and USB disks at the same time and it works well... It does take much longer (and does more disk I/O) than resize_reiserfs, however.
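
The on-line resize itself is two steps: grow the logical volume, then grow the filesystem into it. A dry-run sketch (plan_grow is a made-up helper and the +100G increment is just an example):

```shell
#!/bin/sh
# Dry run: prints the two steps of growing a mounted ext3 volume on LVM.
# plan_grow is a hypothetical helper; run the printed commands as root.
plan_grow() {
    echo "lvextend -L +$2 $1"   # grow the logical volume first
    echo "resize2fs $1"         # then grow ext3 to fill the volume
}

plan_grow /dev/raid5/rest 100G
```

Without an explicit size, resize2fs grows the filesystem to the size of the underlying device.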

And don't forget noatime for ext3!

December 1, 2007

OpenMoko GPS tracking

Finally, I have waited long enough for a binary GPS driver to become available.

After installing it I wrote a small script which enables you to show GPS data on screen:

  • install gllin driver
  • ipkg install vte
  • install /home/root/gps.sh script below:
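    #!/bin/sh

    if [ -n "$START_TERM" ] ; then
        # second pass: we are running inside the vte terminal

        gllin=/home/root/gllin/gllin

        echo "*** starting gllin"
        $gllin &
        sleep 3

        # one log per day and per run (pid) on the card
        file="/media/card/`date +%Y-%m-%d`.$$"
        echo "*** creating log $file"

        # show NMEA output on screen and save it to the log
        cat /tmp/nmeaNP | tee "$file"

        # clean up once the output ends
        killall gllin
        kill `ps ax | grep cat | grep nmea | awk '{ print $1 }'`

        DISPLAY=:0 /etc/init.d/xserver-nodm start

    else
        # first pass: stop gsmd, power down GSM, then re-run this script in vte

        /etc/init.d/gsmd stop
        echo 0 > /sys/bus/platform/devices/gta01-pm-gsm.0/power_on
        START_TERM=1 DISPLAY=:0 vte -c $0

    fi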


  • create /usr/share/applications/gps.desktop icon so you can start GPS tracking from GUI:

    [Desktop Entry]
    Encoding=UTF-8
    Name=GPS
    Comment=GPS trace output
    Exec=/home/root/gps.sh
    Icon=openmoko-terminal
    Terminal=false
    Type=Application
    Categories=GTK;Application;Utilities
    MimeType=text/x-vcard;
    SingleInstance=true
    StartupNotify=true

This combination will shut down the GSM part (to preserve power), create a new trace file at /media/card/date.pid and open a terminal with the output so you can see what is going on (openmoko-terminal2 doesn't want to accept commands, so you need to install vte for this to work). vte, on the other hand, doesn't accept any arguments, so we need the hack with the START_TERM environment variable. OTOH, mrxvt crashes the X server when you kill it, so it wasn't an option.

[Image: openmoko-gps.png]

Happy GPS hacking...

January 5, 2008

Icecast streaming from K77

For the last week or so, we have been in Berlin (visiting the 24th Chaos Communication Congress (24C3) among other things) and we stayed at Kastanienallee 77, which is a really nice place.

Since we are geeks and didn't move much out of the room (we actually covered it with various fun toys due to our excessive trips to the local computer store), it seemed like a logical idea to offer some of our skills to set up audio streaming for salon bruit, which is downstairs in K77. It was great fun, but setting up streaming half an hour before the program is, well, optimistic :-)

The task is simple: use darkice to encode and icecast2 to stream audio. We had the know-how: both Damjan and Marcell had experience with icecast streaming, and Saša and I were eager to learn how to do it.

Andrea and I managed to duct-tape an ethernet extension adapter between two pieces of network cable (3m and 4m) and connect it via a switch to the house network two floors down at street level, where the cafe is, so we had network there. The problem was that the stage is in the other part of the house, and setting up wireless from the front (network cable limit) to the stage seemed like a logical solution (at that time).

The initial idea was to use Andrea's iBook (freshly re-installed with Debian unstable for powerpc) to do all the stuff. However, the nasty bcm43xx wifi card first didn't want to work as an access point (ruining our chance to use it as a bridge between the wired and wireless networks using the ipmasq Debian package), so we decided to use it to run darkice: catch audio, encode it to ogg and send it to a server in Croatia where icecast2 was running and streaming the content to listeners.

What was the problem? The iBook doesn't have an audio input port! Yes, let's save 30 cents and not put an audio input connector in front of the microphone! Thanks, Apple. Marcell somehow managed to find a USB microphone as an alternative, but at the same time Saša managed to install darkice on his ThinkPad (thanks, IBM, for the audio input, eh...) and I used my script which just bridges ethernet and wifi connections. Since at about the same time the bcm43xx driver gave up again, we decided to stick with the ThinkPad for darkice (conveniently located on top of a speaker) and moved off the stage so that the program could finally start.

Having done that (with just half an hour or so of delay in the program start because we were fiddling with stuff) we started streaming... silence. We had connected line out from the mixing panel to mic in on the laptop, we could ssh into it and tweak alsa settings, but all we got out was silence (we even checked with arecord -f cd foo.wav on the stage laptop and aplay foo.wav on a local laptop to be sure that it wasn't a darkice/icecast problem).

Then Damjan suggested pressing space on Capture in alsamixer (we had Capture only on Mic up to that point) and magically sound appeared. So we had a working stream, and the blog post above got written. Audio levels were sub-optimal (to use a kind word), but the first part was nearly over, so we had a pause in which Saša tweaked audio levels, and we changed the compression setting to lower quality so we could push it through the ADSL upstream more easily.

I even remembered to record stream using wget before second part started, so we'll have a listen to it after we get some sleep to see how good the quality was. We got a couple of listeners from Croatia via #razmjenavjestina IRC channel so we are hoping for some feedback also :-)

All in all, it was a lot of fun, and I plan to write a complete walk-through while installing icecast on bljak. I hope to leave one working darkice client here in K77 so that future streams can be set up much more easily.

January 20, 2008

GNU fdisk broken?

I have been backing up the whole disk image from my Eee PC and mounting it via a loop device to access the partitions in it. However, I have problems with GNU fdisk, which reports the 4 GB image as:

Disk /backup/eee/hda: 3 GB, 3997486080 bytes
255 heads, 63 sectors/track, 486 cylinders, total 7807590 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/backup/eee/hda1 63 4803435 2409718 83 Linux
/backup/eee/hda2 4819563 7759395 1469947 83 Linux
/backup/eee/hda3 7775523 7775460 0 c FAT32 LBA
/backup/eee/hda4 7791588 7791525 0 ef EFI FAT

For a start, disk size is wrong:

$ ls -al hda
-rwxrwxrwx 1 dpavlin root 4001292288 2008-01-20 00:59 hda

And then, even worse, the partition offsets seem to be wrong. When the same image is examined using fdisk from util-linux, sectors are reported like this:

Disk hda: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x332b332a

Device Boot Start End Blocks Id System
hda1 63 4819499 2409718+ 83 Linux
hda2 4819500 7775459 1477980 83 Linux
hda3 7775460 7791524 8032+ c W95 FAT32 (LBA)
hda4 7791525 7807589 8032+ ef EFI (FAT-12/16/32)

And this is correct (let's ignore the size for now). I can verify this by mounting the second file system with:

sudo mount hda 1 -o loop,offset=`expr 4819500 \* 512`

This seems to be an off-by-one error. There is a bug reported against the Debian package which seems related, but then again, in my case I'm examining the same disk image.
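
The numbers above can be checked with plain shell arithmetic. This sketch (helper names are made up) computes the loop offset for hda2. It also shows that the size GNU fdisk printed is exactly the image size rounded down to whole 255 x 63 cylinders, and that its start for hda2 differs from util-linux by 63 sectors (as do hda3 and hda4), which is one track rather than one sector. That is only an observation from the figures, not a diagnosis of the bug.

```shell
#!/bin/sh
# Plain arithmetic on the numbers quoted above; helper names are made up.
SECTOR=512

# byte offset of a partition that starts at the given sector
sector_to_offset() {
    echo $(( $1 * SECTOR ))
}

# image size rounded down to whole cylinders (255 heads * 63 sectors * 512)
whole_cylinder_bytes() {
    cyl=$(( 255 * 63 * SECTOR ))
    echo $(( $1 / cyl * cyl ))
}

sector_to_offset 4819500          # 2467584000, the offset used for mount
whole_cylinder_bytes 4001292288   # 3997486080, the size GNU fdisk printed
echo $(( 4819563 - 4819500 ))     # 63, difference in hda2 start: one track
```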

February 21, 2008

Converting PDF to Flash for web page flipping

What did I do for the last two days or so? I have been looking for a way to turn a pdf magazine archive from simple pdf downloads into something more Web 2.0 like. My first instinct was to hope for a JavaScript solution with simple rendering into bitmaps.

But then I remembered that almost everybody has the Flash plugin by now (at least more people than have a pdf reader or plugin), and Flash can also do smooth scaling. How hard could it be to find a Flash picture viewer with nice page-flipping transitions?

Halfway through that journey I stumbled upon Open Library, which has a very nice interface (and is pure JavaScript!) that was supposed to be available under the GPL, but the source is not available. It would be cool to re-use this reader, but without the source, the compressed version available on the site isn't much help. I sent an e-mail to the developer, and now I'm hoping for some pointers.

Then I found several commercial Flash page-flipping offers (strange how the Flash community is strictly separated into Open Source and commercial parts without much cross-over). If there are so many solutions for turning pdf into Flash, one of them must be Open Source, right?

And there is! SWFTOOLS is a great suite of programs (available in Debian) which can automatically generate swf files from pdf and even add navigation elements over them. To cut it short, here is a little snippet of example usage:

rm tmp/*
pdf2swf -p 1-20 -s insertstop magazine.pdf -o tmp/pages.swf
swfcombine -o tmp/pages+nav.swf SimpleViewer.swf viewport=tmp/pages.swf
jpeg2swf loader.jpg -o tmp/loader.swf
swfcombine -o tmp/magazine.swf PreLoader.swf loader=tmp/loader.swf movie=tmp/pages+nav.swf
cd tmp && swfdump --html magazine.swf > magazine.html
cd -

It's sweet and simple, but the navigation which SimpleViewer gives me is... somewhat too simple. I would like to have buttons for zoom in/out (or a slider) and a page number, for example. However, other than accidental exposure to haxe, I don't know the first thing about Flash. There are at least two CPAN modules with swf support, and I have seen an flv media player in one of them, but they look incredibly complicated to me, almost like the source which produces SimpleViewer itself.

I would love to design something in Inkscape and then convert that to bits of logic (if possible externally controlled by JavaScript), but I don't even know where to start looking for information. I'm only interested in solutions which are scriptable and runnable without human intervention under Linux.

There are quite a few pages to convert, and it seems that pdf2swf has a limit of 65535 objects, which translates to about 20 pages of a 100+ page magazine, so splitting it into single pages seems like a much better approach (and users will have less content to download). If this were plain JavaScript, I would preload one page for quick response, but is that concept even supported in Flash?
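
Splitting could be scripted along these lines. This is only a sketch: page_ranges is a made-up helper, the 45 pages and the output file names are placeholders, and the loop prints the pdf2swf calls instead of running them.

```shell
#!/bin/sh
# page_ranges TOTAL CHUNK -> one "first-last" range per line.
# Hypothetical helper; CHUNK is the number of pages per output file.
page_ranges() {
    first=1
    while [ "$first" -le "$1" ]; do
        last=$(( first + $2 - 1 ))
        if [ "$last" -gt "$1" ]; then
            last=$1
        fi
        echo "$first-$last"
        first=$(( last + 1 ))
    done
}

# Dry run: print one pdf2swf call per range (file names are placeholders).
for r in $(page_ranges 45 20); do
    echo "pdf2swf -p $r -s insertstop magazine.pdf -o tmp/pages-$r.swf"
done
```

With 45 pages and chunks of 20 this prints three calls, for pages 1-20, 21-40 and 41-45; a smaller chunk trades more files for smaller downloads.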

To make things worse, there is a great tutorial over at SitePoint on how to make page flipping using jQuery, which I know and like.

Am I wasting time trying to find a free (as in speech) Flash development environment which would feel good to me? Should I go with my instincts and stick with JavaScript? Time will tell... For a start I will try to convert the presentations on my homepage to Flash as an initial try...

Update: To make things clear, I converted the pdf files of my presentations to Flash using the method described above. Now I would like to replace the arrows at top-left with more complex navigation, something like PdfMeNot has done.

Update^2: This was easier than I thought it would be. Following the examples on the swftools site I found two great viewers, and if that is not enough, I found the original author of the viewer used at PdfMeNot (scroll down to find the download link).

March 4, 2008

Sync part of subversion repository

I had a particular problem at work: we have an upstream Subversion repository which we access over an ssh tunnel (using the svn protocol) and which contains two branches in which we are interested, plus various other stuff we don't care about (and don't want to mirror).

On the other hand, we also wanted to have a local copy of all changes (preserving history), local commit messages and an SVN::Web interface.

In the original idea, I also wanted to keep revision numbers as-is (so I could just check out our local version and be done), but this wasn't possible. One solution that we examined was to use Pushmi and make a local copy, but we didn't want all the other changes.

The other idea was to use svndumpfilter to sync only the two branches we are interested in (it will create dummy commits for revisions which are outside our branches), but since the branches are the result of a copy from parts of the tree we don't want to sync, it didn't work either.

Did I mention that our svn repository can access upstream only through carefully crafted ssh tunnels? Mess, right?

So, in the end, the solution was a hybrid:

  • make a local copy of the two upstream branches using svk (losing the original order of commits, even if we are committing into the same svk mirror copy on our side)
  • install a post-commit hook in the upstream repository which will call (over https) svk sync on our side (I would probably use SMTP to trigger that, but our machine with the svn repository doesn't accept outside e-mail)
  • install a local post-commit hook to send e-mail notifications

The rest of this post is instructions on how to do this. Since I learned a thing or two doing this, I hope it might also be useful for others.

First create the svn-pull.sh shell script, which will run as the user who has ssh keys to log in to the upstream firewall (1.2.3.4 in this example) and set up a tunnel to the upstream svn server (10.1.1.1):

#!/bin/sh
ssh -L 13690:10.1.1.1:3690 1.2.3.4 sleep 2 &
pid=$!
SVKROOT=/home/user/.svk svk sync -a
kill $pid

Now setup mirrors of branches we care about:

svk mirror svn://127.0.0.1:13690/project/carnet-foo /project/foo
svk mirror svn://127.0.0.1:13690/project/carnet-bar /project/bar

This is all nice, but we need to trigger it as the web server user (www-data), which is done with the following in /etc/sudoers:

www-data ALL=(user) NOPASSWD:/home/user/svn-pull.sh

and add a simple cgi script which will trigger the sync operation:

#!/bin/sh
printf 'Content-type: text/plain\r\n\r\n'
sudo -u user /home/user/svn-pull.sh

I used ScriptAlias in apache to make it visible at https://svn-ours.example.com/upstream-svn-update. No need to obfuscate the URL, since it's behind SSL for added points. An IP address limit might also be a good idea:

  <Location /upstream-svn-update>                   
        Order allow,deny
        Allow from 1.2.3.4
  </Location>

Now install the post-commit hook in the upstream repository. We care only about files which have /carnet in the path, since the branches we are interested in have that prefix:

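REPOS="$1"  # repository path, first argument Subversion passes to the hook
REV="$2"    # revision that was just committed
svn log -v -r $REV file://$REPOS | grep ' /carnet' 2>/dev/null \
    && wget -q -O /dev/null https://svn-ours.example.com/upstream-svn-update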

You will notice that there is no locking or any other tweaks, since all the tools have those capabilities anyway, so in fact we are really just using RPC via cgi over https.

Nice and easy, once you know how to do it! It seems like a few bits of configuration all over the place, but I hope that it employs KISS - keep it simple and stupid - at its best.

Update: OK, now we have a local repository (with different revisions), but svn switch --relocate doesn't work because those repositories are not the same (makes sense, eh?)

The following steps are a quick explanation of how to copy .svn directories from the new repository:

cd /srv/carnet-foo
# update repository to last upstream version
svn update
# delete old .svn directories
find . -name ".svn" -exec rm -Rf {} \;
# checkout new repository
cd /srv
svn co svn://svn-ours.example.com/carnet-foo carnet-foo.new
# copy new .svn files to old repository
cd carnet-foo.new
find . -wholename "*/.svn/*" | cpio -pvd ../carnet-foo/
# cleanup
cd /srv
rm -Rf carnet-foo.new
# following shouldn't return any differences
cd carnet-foo
svn diff

March 7, 2008

irc-logger - memory augmentation for #irc

Initially created in 2006, this handy tool is best described by its original commit message:

IRC bot which replace human memory

Here is a quick run-down through available features:

  • web archive with search
  • irc commands: last, grep/search, stat, poll/count
  • tags// in normal irc messages (tagcloud, filter by tag, export as RSS feed)
  • announce /me messages to Twitter (yes, lame, but that was a year ago)
  • tags are available as html links for embedding (in wikis)
  • RSS feed from messages with tags (also nice for embedding)
  • irssi log import (useful for recovery in case of failure of machine or service :-)
  • announce new messages from RSS feeds (nice for wiki changes, blog entries or commits)

It has grown quite a bit from the initial vision of recalling the last messages on the web (and it does jump through some hoops to produce a nice web archive). The addition of tags allowed easy recall of interesting topics, and in a way it now provides a central hub for different content connected to irc.

It's written in perl using POE and it's probably not the best example of POE usage. It is also somewhat PostgreSQL specific, but works well for our small community at the #razmjenavjestina irc channel. Since I have seen some interest in it, this blog post might serve as an announcement of its existence.

I will probably add some documentation to its wiki page and add real multi-channel support (most of the code is there, but the web archive needs filtering by channel). If you are interested in having it /invite-d to your channel, drop me a note.

March 15, 2008

Lecture: SQL from beginner to relational wizard

Today at Razmjena vještina I gave a marathon four-hour lecture which was, I hope, at least somewhat useful. Unfortunately, we didn't manage to go into as much detail as I would have liked, but if nothing else I once again used pgrestraier (which somehow indexes too slowly, I will have to look into why) and another handy little project which I wrote last year for students in Zadar, pg-getfeed, which is actually a small perl stored procedure that lets you run SQL queries on RSS feeds.

April 30, 2008

State of linux wifi (first week with OLPC)

So, it has been a week since a borrowed OLPC entered my family of computers. I have a Thinkpad T60 with an Atheros AR5212 (which works with the ath5k driver from 2.6.25, nice work!) and an Eee PC with Atheros (which works with a special madwifi patch).

Since 802.11s just landed in the upstream kernel git, I was eager to take a look at this mesh network thing. Oh, how ignorant I was. OLPC uses an 802.11s protocol which is different from the official implementation of 802.11s, and with good reason: they are using the embedded processor in the wifi card to do the mesh protocol for them (saving power and enabling the mesh to work when the laptop is suspended). I could have installed olsr on the OLPC, but I'm really trying to have a bigger mesh which is compatible with unmodified OLPCs.

Because my time is limited, I would like to work in user-land if at all possible, and since wpa_supplicant can work on unmodified kernels, it would be nice to have that level of support for the OLPC mesh as well. After a lot of browsing (and reading a few really great wifi hacking sites), I concluded that the only hope is radiotap, which is more or less supported on every pcmcia wifi card that I have (a prism-based 802.11b card and an rt2500). I also found the simplest possible code which uses radiotap to start with.

Now, I would just need another OLPC to save some network traces and start experimenting :-)

Aside from that, I switched entirely to the OLPC for this week, and amazingly enough, I didn't miss my Eee PC one tiny bit. Although a bit slower than the Eee, the OLPC screen is bigger (and better in black-and-white mode in sunlight), which helps a lot with web pages. Browser performance is amazing, so I have little doubt that we will be able to support most web sites on the OLPC without much problem. On the other hand, I did notice a couple of excessive round-trips on one of my web sites while surfing it, but that's for the best anyway :-)

Update: According to a message on the libertas-dev mailing list, there is an effort to use the kernel's 802.11s implementation, which makes my effort to support the OLPC variant obsolete.

May 15, 2008

Bag of useful scripts

Why does console refuse to die?
- It's because of pipes!
This post is the result of my long addiction to console applications. Somehow, when I want to get a quick view of things on my system, I always turn to pipes and do something with them. In the process, I developed a few useful scripts for use within shell pipes, and I would like to introduce my readership to them.

PostgreSQL database size

When I want to see the size of all databases on my system, or the size of the tables in one database, I turn to pg_size. It's a short and sweet script which does a little shell magic (take a look inside) and displays the size of all databases on the system (without any arguments), or the size and number of rows of every table in a database (when used like pg_size database_name), like this:
dpavlin@llin:~$ pg_size dellstore2 | grep -v sql_
4890624 customers 20000
3153920 orderlines 60350
2678784 cust_hist 60350
991232 products 10000
966656 orders 12000
450560 inventory 10000
8192 categories 16
0 reorder 0
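
For the curious, here is a rough sketch of how such a report could be produced. It is not the actual pg_size script: the function names pg_size_sql and pg_size_sketch are made up for illustration, and I'm assuming that the row count may come from the reltuples estimate in pg_class (which is only as fresh as the last ANALYZE) rather than from a real count.

```shell
# pg_size_sql [DATABASE]: print the SQL for the report (hypothetical helper).
pg_size_sql() {
    if [ -z "$1" ]; then
        # no argument: one "size name" row per database
        printf '%s\n' "SELECT pg_database_size(datname), datname
            FROM pg_database ORDER BY 1 DESC;"
    else
        # database given: "size table estimated_rows" for ordinary user tables
        printf '%s\n' "SELECT pg_relation_size(c.oid), c.relname, c.reltuples::bigint
            FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind = 'r'
              AND n.nspname NOT IN ('pg_catalog', 'information_schema')
            ORDER BY 1 DESC;"
    fi
}

# pg_size_sketch [DATABASE]: run the query with psql.
# -A unaligned output, -t rows only, -F ' ' space as field separator
pg_size_sketch() {
    if [ -n "$1" ]; then
        psql -At -F ' ' -d "$1" -c "$(pg_size_sql "$1")"
    else
        psql -At -F ' ' -c "$(pg_size_sql)"
    fi
}
```

Something like pg_size_sketch dellstore2 should then print one "size table rows" line per table, which is the number-space-description shape that pipes well into other tools.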
This is all well and good, but it doesn't really give us the right overview, so move along for...

Nice console graphs

First, a caveat: this tool assumes that each line contains a number, a space, and an optional description. The output above fits this description, so let's try it:
dpavlin@llin:~$ COLUMNS=80 pg_size dellstore2 | grep -v sql_ | sum.pl -h
customers 20000    4776k OOOOOOOOOOOOOOOO                                 4776k
orderlines 60350   3080k OOOOOOOOOO-----------------                      7856k
cust_hist 60350    2616k OOOOOOOOO---------------------------               10M
products 10000      968k OOO-------------------------------------           11M
orders 12000        944k OOO----------------------------------------        12M
inventory 10000     440k O-------------------------------------------       12M
categories 16      8192b ---------------------------------------------      12M
reorder 0              0 ---------------------------------------------      12M
This gives us nice output: the description (followed by the number of rows from the output above), the size, a bar graph, and a running total of the size in human-readable form (if you don't like that, remove the -h flag and you will get raw numbers).
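
If you want to see the idea without reading sum.pl, here is a minimal sketch in awk. It is not the actual script: the function name running_total and its width argument are made up for illustration, it prints raw numbers only (nothing like -h), and it does not align columns. It reads "number description" lines and draws O for the current value and - for everything counted before it, scaled so that the grand total fills the given width.

```shell
# running_total [WIDTH]: read "NUMBER [DESCRIPTION]" lines from stdin and print
# "DESCRIPTION NUMBER BAR RUNNING_TOTAL". WIDTH is the bar length (default 40)
# that corresponds to the grand total.
running_total() {
    awk -v width="${1:-40}" '
    # repeat character ch n times
    function rep(ch, n,    s) { s = ""; while (n-- > 0) s = s ch; return s }
    {
        value[NR] = $1 + 0          # remember the number
        sub(/^[^ ]+ ?/, "")         # drop it, the rest is the description
        desc[NR] = $0
        total += value[NR]
    }
    END {
        for (i = 1; i <= NR; i++) {
            cur  = total ? int(value[i] * width / total) : 0   # this line
            prev = total ? int(run * width / total) : 0        # lines before it
            run += value[i]
            printf "%s %d %s%s %d\n", desc[i], value[i], rep("O", cur), rep("-", prev), run
        }
    }'
}
```

The input has to be read completely before anything is printed, because the scale of the bars depends on the grand total.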

Let's take another example (if you are still reading this and are not interested in PostgreSQL database sizes). Let's see how much traffic pppd transferred over a very slow GPRS link during an 8-day vacation:

dpavlin@llin:~$ grep 'pppd.*: Sent' /var/log/messages | awk '{ print $7 + $10 " " $1 " " $2 }' | sum.pl -h
May 5         0                                                               0
May 5       39k                                                             39k
May 5     7512k OO                                                        7551k
May 6     6352b --                                                        7558k
May 6       20k --                                                        7579k
May 6     1183k ---                                                       8762k
May 8     6869k OO---                                                       15M
May 8       70k -----                                                       15M
May 9     3596k O------                                                     18M
May 9     1998k -------                                                     20M
May 10      32M OOOOOOOOOOOO--------                                        53M
May 10      13k --------------------                                        53M
May 11      44M OOOOOOOOOOOOOOOOO--------------------                       98M
May 12      12M OOOO--------------------------------------                 111M
May 13    7120k OO-------------------------------------------              118M
May 13      20M OOOOOOO----------------------------------------------      139M
Much more interesting! A long time ago, I had a bunch of quick one-liners which used sum.pl to produce output from various other system counters, but somehow they got lost.

As I get only a few comments on my blog, if you find this useful, leave one. I have a few other examples, like this one, which shows the top 5 memory eaters on my system:

dpavlin@llin:~$ ps v | awk '{ print $8 " " $9 " " $10 }' | sort -rn | ~/private/perl/sum.pl | head -5
# RSS %MEM COMMAND
10.6 /usr/lib/iceweasel/firefox-bin 165092 OOOOOOOOOOOOOO                165092
4.3 perl                             67240 OOOOOO---------------         232332
0.5 awesome                           8504 ---------------------         240836
0.4 irssi                             6632 ----------------------        247468
0.3 vi                                5888 ----------------------        253356
but if this is not interesting to my readership, tell me so, and I will stop spamming your already full RSS reader with console output! :-)

May 22, 2008

Group by data in shell pipes

My mind is just too accustomed to RDBMS engines to accept that I can't have GROUP BY in my shell pipes. So I wrote one: groupby.pl.


Aside from the fact that it looks somewhat like perl golfing (which I'm somewhat proud of), let's see how it looks:


dpavlin@llin:~/private/perl$ ps axv | ./groupby.pl 'sum:($6+$7+$8),10,count:10,min:($6+$7+$8),max:($6+$7+$8)' | sort -k1 -nr | head -10 | align
440947 /usr/lib/iceweasel/firefox-bin 1 440947 440947
390913 /usr/sbin/apache2 11 22207 39875
180943 /usr/bin/X 1 180943 180943
135279 /usr/bin/pinot-dbus-daemon 1 135279 135279
122254 mocp 2 25131 97123
84887 pinot 1 84887 84887
78279 postgres: 5 10723 21971
70030 /usr/bin/perl 6 6959 15615
50213 /bin/bash 7 6351 7343
49266 /usr/lib/postgresql/8.2/bin/postgres 2 24631 24635

This displays the total memory usage for each process, its name, the number of such processes, and the range of memory usage. We could then use our old friend sum.pl to produce a console graph, but I have already written about that.
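
To show what the grouping boils down to, here is a minimal awk sketch. It is not groupby.pl and does not understand its 'sum:...,count:...' syntax: the function name group_by and its two arguments (key column, value column) are made up for illustration. It always prints sum, key, count, min and max, in the same order as the output above.

```shell
# group_by KEYCOL VALCOL: group stdin lines by column KEYCOL and print
# "SUM KEY COUNT MIN MAX" of column VALCOL for each group, largest sum first.
group_by() {
    awk -v k="$1" -v v="$2" '
    {
        key = $k; val = $v + 0
        if (!(key in count)) { min[key] = val; max[key] = val }  # first row of a group
        sum[key] += val; count[key]++
        if (val < min[key]) min[key] = val
        if (val > max[key]) max[key] = val
    }
    END {
        for (key in sum)
            print sum[key], key, count[key], min[key], max[key]
    }' | sort -k1,1nr -k2,2
}
```

For example, ps axv | group_by 10 8 would group by the command column and aggregate RSS, assuming the usual ps axv column layout.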


So, let's move on to another example, this time for OpenVZ. Let's see how much memory each virtual machine is using (and get the number of processes for free):



$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1'
2209504127 0 265
611768242 212024 38
162484775 212037 19
170797534 212052 38
104853258 212226 26
712007227 212253 21

But wouldn't it be nice to display hostnames instead of VEID numbers? We can, using the --join and --on options (which are really backticks on steroids):

$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1' --join 'sudo vzlist -H -o veid,hostname' --on 2
2146263206 0 259
675835528 saturn.ffzg.hr 40
162484775 arh.rot13.org 19
170797534 koha-dev.rot13.org 38
104853258 koha.ffzg.hr 26
712011323 zemlja.ffzg.hr 21
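
The join step can be sketched in a few lines of awk as well. Again, this is only an illustration of the idea and not how groupby.pl implements --join and --on: the function name join_on is made up, and I'm assuming that the lookup command prints "key name" pairs, one per line, like the vzlist call above does.

```shell
# join_on COLUMN COMMAND: replace COLUMN of every stdin line with the name
# that COMMAND prints for that key. Lines without a match pass through unchanged.
join_on() {
    # run the lookup command once, away from our stdin, and hand its output
    # to awk through the environment
    JOIN_LOOKUP=$(sh -c "$2" </dev/null) awk -v col="$1" '
    BEGIN {
        n = split(ENVIRON["JOIN_LOOKUP"], rows, "\n")
        for (i = 1; i <= n; i++)
            if (split(rows[i], f, " ") >= 2) name[f[1]] = f[2]
    }
    {
        if (($col) in name) $col = name[$col]
        print
    }'
}
```

With it, something like ... | join_on 2 'sudo vzlist -H -o veid,hostname' would turn the VEID in the second column into a hostname and leave VEID 0 alone, since vzlist does not list it.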

Which brings us to the final result:

$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1' --join 'sudo vzlist -H -o veid,hostname' --on 2 | sort -rn | align | ./sum.pl -h
0 260 2105M OOOOOOOOOOOOOOOOOOO 2105M
zemlja.ffzg.hr 21 679M OOOOOO------------------- 2784M
saturn.ffzg.hr 35 512M OOOO-------------------------- 3296M
koha-dev.rot13.org 38 162M O------------------------------ 3459M
arh.rot13.org 19 154M O-------------------------------- 3614M
koha.ffzg.hr 26 99M ---------------------------------- 3714M

So, there you have it: an SQL-like query language for your shell pipes.

About hack-of-the-week

This page contains an archive of all entries posted to Dobrica Pavlinušić's Weblog / Blog in the hack-of-the-week category. They are listed from oldest to newest.

Creative Commons License
This weblog is licensed under a Creative Commons License.