PXElator introduction

This weekend we were in Split at the Ništa se neće dogoditi event, where I gave a presentation about the first three weeks of PXElator development. It can serve as a gentle introduction to the project. So, here we go...

Introduction

PXElator is just one piece of the puzzle which aims to replace system administration with nice declarative programs in Perl. It's an experiment in replacing my work with reusable Perl snippets.

It tries to solve the following problems:

  • support deployment of new physical or virtual machines (IP, hostname, common configuration)

  • maintain documentation about changes on systems, good enough to be used for disaster recovery (or deployment of a similar system)

  • configure systems in small chunks (virtual machines or containers) for better management and resource tracking, using normal system administration tools (but tracking those changes)

  • provide an overview and monitoring of a network segment and the services on it, with alerting and trending

Deployment of new machines

What really is a machine? For PXElator, it's a MAC and IP address plus some optional parameters (like hostname). This is stored on the file system, under conf/server.ip/machine.ip/hostname, and can be tracked using source control if needed.
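As a rough sketch of how such a layout could be read, assume each parameter is a small file whose content is the value. Only the conf/server.ip/machine.ip/hostname path comes from the description above; the function name and the assumption of one value per file are hypothetical. The sketch is in Python for brevity, although PXElator itself is written in Perl.

```python
from pathlib import Path


def read_machine(conf_root, server_ip, machine_ip):
    """Return {parameter: value} for one machine directory.

    Assumes every regular file under <conf_root>/<server.ip>/<machine.ip>/
    holds a single value, e.g. the file 'hostname' holds the host name.
    """
    machine_dir = Path(conf_root) / server_ip / machine_ip
    params = {}
    for entry in sorted(machine_dir.iterdir()):
        if entry.is_file():
            # File name is the parameter, stripped content is its value
            params[entry.name] = entry.read_text().strip()
    return params
```

Because the state is just files and directories, any daemon (or a plain shell) can read it without a shared library or database connection.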

This on-disk configuration is also the state shared between all daemons implementing network protocols:

  • DHCP (with PXE support)

  • TFTP (to deliver initial kernel and initrd using pxelinux)

  • HTTP (to provide an alternative way to fetch files, plus a user interface)

  • DNS (we already have the data)

  • syslog

  • AMT for remote management

Having all those protocols implemented in the same language enables great flexibility in automatic configuration. I can issue a command from an installation which has only ping, because I can have special DNS names which trigger commands.
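To illustrate the idea of DNS names that trigger commands, here is a minimal sketch of how a DNS daemon might recognize them. The naming scheme (command.machine.cmd.pxelator.example), the zone name and the command list are all invented for this example and are not PXElator's actual convention.

```python
COMMAND_ZONE = "cmd.pxelator.example"                 # hypothetical zone
KNOWN_COMMANDS = {"reboot", "shutdown", "reinstall"}  # hypothetical commands


def parse_command_query(qname):
    """Map a queried DNS name to (command, machine), or None if it is an ordinary name."""
    name = qname.rstrip(".").lower()
    suffix = "." + COMMAND_ZONE
    if not name.endswith(suffix):
        return None
    # What is left in front of the zone must be exactly <command>.<machine>
    labels = name[: -len(suffix)].split(".")
    if len(labels) != 2:
        return None
    command, machine = labels
    if command not in KNOWN_COMMANDS or not machine:
        return None
    return command, machine
```

With something like this in the DNS daemon, running ping against such a name is enough: the lookup itself delivers the request, whether or not the ping ever gets a reply.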

But to get real power, we need to aggregate that data. I'm currently using CouchDB from http://couchdb.apache.org/ to store all audit data from all services in a single database.

I wanted a simple way to write ad-hoc queries without worrying about data structure too much. In the end, I opted to treat the data as an audit trail and used 1-second granularity for the key when storing it. The result is that 133 syslog messages from the kernel right after boot will create a single document with 133 revisions instead of flooding your database.
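The effect of the 1-second key can be sketched with an in-memory stand-in for the database. The class and function names are hypothetical, and this does not talk to CouchDB at all; it only shows how truncating the timestamp folds a burst of messages into one document with many revisions.

```python
from collections import defaultdict


def second_key(timestamp):
    """Truncate a Unix timestamp (float seconds) to 1-second granularity."""
    return str(int(timestamp))


class AuditStore:
    """In-memory stand-in: one document per key, every save adds a revision."""

    def __init__(self):
        self.revisions = defaultdict(list)

    def save(self, timestamp, data):
        key = second_key(timestamp)
        self.revisions[key].append(data)
        # Return the document key and its current revision count
        return key, len(self.revisions[key])
```

In CouchDB itself, each update of an existing document has to supply that document's current _rev, so a real implementation would fetch or remember the revision before saving.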

It would be logical to plug RRDtool http://oss.oetiker.ch/rrdtool/ in somewhere here to provide nice graphs, but that is still on the TODO list.

End user scenarios:

  • Take a new machine, plug it into the network, boot it from the network and configure it for kiosk-style deployment with Webconverger, available at http://webconverger.com/. The kiosk should automatically turn on every morning at 7:30 and turn off at 20:30.

  • Boot a virtual machine (with a new IP and hostname) from a backup snapshot for easy recovery or testing

  • Boot a machine from the network into a fully configurable (writable) system for quick recovery or as a dedicated machine. This is implemented using an NFS server with an aufs read-write overlay on top of a debootstrap base system.

Disaster recovery documentation for me, two years later

I have been trying to write useful documentation snippets for years. My best effort so far is the Sysadmin Cookbook at http://sysadmin-cookbook.rot13.org/, a set of semi-structured shell scripts which can be executed directly on machines.

This part isn't integrated into PXElator yet, but most of the recipes will become some kind of rule which you can enforce on a managed machine.

End user scenario:

  • Install that same thing on this other machine as well

Configure system like you normally would but track changes

This is basically a requirement to track configuration changes. Currently, this feature falls out of having a writable snapshot over a read-only base system. The overlay data is exactly the custom configuration that I did!

Tracking changes on existing machines will be implemented using scp to copy files to the server into a hostname/path/to/local/file directory structure. This structure will be tracked using source control (probably git, as opposed to the Subversion which the PXElator source uses), and a cron job will pull those files at some interval (daily, hourly) to create an rsync+git equivalent of BackupPC http://backuppc.sourceforge.net for this setup.
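Since this part is still planned, the following is only a sketch of the path mapping it implies. The function names and the repository root are hypothetical; the one thing taken from the text is the hostname/path/to/local/file layout.

```python
from pathlib import PurePosixPath


def mirror_path(repo_root, hostname, remote_file):
    """Map /etc/foo on `hostname` to <repo_root>/<hostname>/etc/foo."""
    remote = PurePosixPath(remote_file)
    if not remote.is_absolute():
        raise ValueError("remote_file must be an absolute path")
    if ".." in remote.parts:
        raise ValueError("remote_file must not contain '..'")
    return PurePosixPath(repo_root) / hostname / remote.relative_to("/")


def scp_command(repo_root, hostname, remote_file):
    """Build the argument list for pulling one file into the mirror."""
    target = mirror_path(repo_root, hostname, remote_file)
    # -p preserves modification times and modes of the original file
    return ["scp", "-p", f"{hostname}:{remote_file}", str(target)]
```

A cron job could run such a command for each tracked file (after creating the target directory) and then commit the result, so that the history in source control becomes the change log for the machine.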

It's interesting to look at how this differs from Puppet and resembles cfengine3:

  • All data is kept in normal configuration files on the system -- you don't need to learn new administration tools or somehow maintain two sources of configuration (in configuration management and on the system)

  • It introspects the live system and only applies corrections if needed, which is similar to the cfengine3 approach.
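The "introspect, then correct only if needed" idea can be shown with a tiny example. It is an illustration of the general approach rather than PXElator code, and the function name is hypothetical.

```python
from pathlib import Path


def ensure_line(path, wanted):
    """Make sure `wanted` is a line in the file at `path`.

    Returns True if the file had to be changed, False if it was already correct.
    """
    path = Path(path)
    # Introspect the live state first
    lines = path.read_text().splitlines() if path.exists() else []
    if wanted in lines:
        return False
    # Apply the correction only when the state differs from what we want
    lines.append(wanted)
    path.write_text("\n".join(lines) + "\n")
    return True
```

Running it twice is harmless: the second call finds the line already present and leaves the file alone, so the ordinary configuration file on the system stays the single source of truth.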

End user scenario:

  • Turn a useful how-to into a workable configuration without much effort

Provide overview and monitoring

This falls out of the HTTP interface and the collection of data into CouchDB. For now, PXElator tries to manage the development environment for you, opening xterms (with screen inside for logging and easy scrollback) in different colors and letting you start Wireshark on active network interfaces for debugging.