Tech Life of Recht » archive for 'System'

 Playing it safe: Choosing database engines

  • December 13th, 2007
  • 11:32 pm

The project I’m working on now is using Oracle for data storage. Not only that, it’s good old 9.2. Try installing that on your brand new Ubuntu. Oh well, it’s doable, but recently we’ve begun discussing if we could run it on MySQL or PostgreSQL (I’m voting for PostgreSQL, but that’s mostly because I used MySQL back when foreign keys and other advanced features weren’t exactly implemented, and in the meantime, I’ve come to like PostgreSQL. At the very least, hitting Ctrl-c in psql does not exit the prompt like mysql does).
The way it works now is that if something goes wrong, we can either call Oracle or our strategic Oracle partner, and they’ll fix it. This is just about everything I’m against: paying large amounts of money just because then we can blame somebody else when things screw up.
There are basically two issues with switching to a more light-weight database, and they’re probably more or less the same on many projects. First of all, performance. Oracle probably has a good advantage there, especially in handling large and complex queries. Secondly, maintenance, especially backup and failover.
The only one of these points which I think is valid is backup: it doesn’t really help if it takes a week to back up or restore data. That said, we’re left with two things: performance and reliability. The Oracle way (I’m using the term in the broadest possible sense; replace Oracle with MSSQL or DB2 if you like) is to add more memory, disk, clustering, and high availability. Expensive, but you get to live in your little world where you can just write code against one large database.
The other way, which I prefer (and which is probably in the Web 2.0 spirit), is to distribute: spread both data and processing over a number of autonomous nodes which can operate independently of each other. This is nothing new, and it has been done many times, but it’s not something that’s normally considered when building good old business applications.
The result of distributing is essentially that you think about how you’re accessing your data. Instead of just delegating all of the work to the database and hoping for the best, you’re actually forced to analyze data relationships to discover separate components. If this succeeds, the choice of database should no longer be about whether it can optimize a query over 50 tables with subselects, type conversions, views, functions, and other stuff, but about whether it is efficient at looking up simple data relations. My guess is that all the popular databases can do this, so then you’re free to choose the cheapest, the one which is easiest to work with, or whatever suits your environment.

Returning to the project I’m working on, we have some reservations about some of the queries we’re executing. They’re in Oracle syntax right now, but can probably be converted to regular SQL in a finite amount of time. That doesn’t make them any smaller, but we have a good amount of pretty static data (addresses, classifications, and so on), which is retrieved together with the more dynamic data. If the static data is removed from the SQL, we end up with some much simpler queries, which shouldn’t be a problem for any database engine. The problem which remains is how to retrieve the static data efficiently. At the moment, I’m leaning towards a solution where we implement a service which can take a list of data keys and return the data. Depending on the amount of data, the service can then be implemented as an in-memory map, a memcached cache, or maybe even something like Hadoop. No matter what, basing the model on the principle of isolating static data from dynamic data, and only querying the dynamic data, seems like the way to go as a first step. As a nice side effect, the database’s ability to perform doesn’t matter that much anymore.
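
As a rough sketch of the key-list lookup idea, and nothing more: the function below stands in for the static-data service, with a tab-separated file playing the role of the in-memory map. The file format, the function name static_lookup and the sample keys are all made up for illustration; a real service would more likely live in the application’s own language.

```sh
#!/bin/sh
# Hypothetical stand-in for the static data service (names are assumptions).
# Usage: static_lookup DATA_FILE KEY...
# DATA_FILE holds "key<TAB>value" lines. For each requested key that exists,
# print "key<TAB>value", in the order the keys were requested.
# Unknown keys are silently skipped.
static_lookup() {
    data_file=$1
    shift
    # The requested keys go to awk on stdin, one per line. awk first loads
    # the data file into a map, then answers each key from memory.
    printf '%s\n' "$@" | awk -F '\t' '
        NR == FNR { map[$1] = $2; next }     # first input: the static data
        $1 in map { print $1 "\t" map[$1] }  # second input: requested keys
    ' "$data_file" -
}
```

The dynamic query would then return only keys (an address id, a classification code), and the application would resolve them in one call, for example static_lookup static.tsv c7 a1, instead of joining the static tables in SQL.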

This probably just sounds like drunken ramblings to those who have actually implemented distributed business systems, but bear with me, it’s a first for me, and I need to get things out of my head before the space runs out.

 My xmonad start script

  • October 25th, 2007
  • 10:54 pm

After reinstalling xmonad on my laptop using Ubuntu as the base distribution, I've finally got a setup I like, so I thought I'd share it. So, here is my script which starts xmonad:

CODE:
  # add the ssh key to the agent
  ssh-add ~/.ssh/id_dsa </dev/null >/dev/null

  # background image, and hide the mouse pointer after 2 idle seconds
  Esetroot /usr/share/backgrounds/warty-final-ubuntu.png
  unclutter -idle 2 &

  # urxvt daemon: -f forks into the background, -o keeps the display open
  urxvtd -f -o

  # the Gnome daemons which normally run under Ubuntu
  gnome-settings-daemon &
  gnome-power-manager
  gnome-volume-manager &

  BG=white
  FG=black
  FONT="-xos4-terminus-medium-r-normal--12-120-72-72-c-60-iso8859-1"

  # make the helper scripts next to this one available
  export PATH="$PATH:$(dirname "$0")"

  # create a pipe for xmonad to talk to
  PIPE="$HOME/.xmonad-status"
  rm -f "$PIPE"
  mkfifo -m 600 "$PIPE"
  [ -p "$PIPE" ] || exit 1

  # and a workspace status bar, reading xmonad's stdout
  dzen2 -e '' -w 680 -ta l -fg "$FG" -bg "$BG" -fn "$FONT" -h 15 <"$PIPE" &

  # the general status bar
  "$(dirname "$0")/status" &

  # run xmonad and block until it exits
  xmonad >"$PIPE" &
  wait $!

  # xmonad is gone: shut down the bars and wait for the remaining jobs
  pkill -HUP dzen2

  wait

This script is launched from ~/.xsession. It uses dzen and some of the dzen gadgets (which must be compiled separately from dzen/gadgets). It also starts some of the Gnome daemons which normally run under Ubuntu. This means that power management and sound volume go through Gnome, and devices are mounted automatically when they are added to the system.
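
The pipe plumbing is probably the least obvious part of the start script, so here is the same pattern in isolation, as a sketch under assumptions: cat stands in for dzen2 and printf for xmonad, and the temporary directory and file names are made up for the demo.

```sh
#!/bin/sh
# Demo of the named-pipe pattern from the start script, with stand-ins:
# cat plays dzen2 (the reader), printf plays xmonad (the writer).
# DIR, PIPE and OUT are hypothetical names used only for this demo.
DIR=$(mktemp -d) || exit 1
PIPE=$DIR/status-pipe
OUT=$DIR/reader-output

mkfifo -m 600 "$PIPE"
[ -p "$PIPE" ] || exit 1

# Opening a FIFO blocks until the other end is opened too, so the reader
# has to go into the background before the writer starts.
cat <"$PIPE" >"$OUT" &
READER=$!

# The writer. When it closes the pipe, the reader sees end-of-file and exits.
printf 'workspace 1\nworkspace 2\n' >"$PIPE"

wait "$READER"
RESULT=$(cat "$OUT")
rm -rf "$DIR"
echo "$RESULT"
```

Run as is, this prints the two workspace lines that travelled through the pipe. In the real script, both ends are started in the background for the same reason: neither open completes until the other side is there.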

Two dzens are launched: One for displaying workspaces and one for general status. The general status script looks like this:

CODE:
  #!/bin/sh

  BG=white
  FG=black
  FONT="-xos4-terminus-medium-r-normal--12-120-72-72-c-60-iso8859-1"

  while :; do
    # used and total memory, drawn as a bar
    MEM=`awk '/MemTotal/ {t=$2}; /MemFree/ {f=$2}; END {print t-f " " t}' \
      /proc/meminfo | dbar -s ':' -l 'Mem:' -w 10`
    # current and maximum frequency for each core, drawn as bars
    CF0=$(echo `cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq` \
      `cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq` | dbar -w 7 -l 'CPU0:')
    CF1=$(echo `cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq` \
      `cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq` | dbar -w 7 -l 'CPU1:')
    DATE=`date +"%d-%m %k:%M"`

    # battery: remaining and last full capacity, plus the AC adapter state
    REM=`grep 'remaining capacity' /proc/acpi/battery/BAT0/state | awk '{print $3}'`
    LAST=`grep 'last full' /proc/acpi/battery/BAT0/info | awk '{print $4}'`
    STATE=`awk '{print $2}' /proc/acpi/ac_adapter/ADP1/state`
    if [ "$STATE" = "on-line" ]; then
      BAT=$(echo $REM $LAST | awk '{printf "Bat: %.1f%%, AC", ($1/$2)*100}')
    else
      # on battery: also estimate the minutes left at the present rate
      PRESENT=`grep 'present rate' /proc/acpi/battery/BAT0/state | awk '{print $3}'`
      BAT=$(echo $REM $LAST $PRESENT | awk '{printf "Bat: %.1f%%, %d min", ($1/$2)*100, ($1/$3)*60}')
    fi

    # 1, 5 and 15 minute load averages
    LOAD=`awk '{print $1 " " $2 " " $3}' /proc/loadavg`

    echo "$CF0 $CF1 $MEM | $DATE | $BAT | $LOAD"
    sleep 5
  done | dzen2 -e '' -x 440 -w 1000 -fg "$FG" -bg "$BG" -fn "$FONT" -h 15

The script is made for a MacBook Pro with a dual-core processor and ACPI. It displays the CPU frequencies in a dbar, memory usage, battery status, and load. I would like to display network status too, but I can't figure out how to dig around in dbus efficiently, so that's still missing.
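
The battery arithmetic is easier to check outside the loop. The sketch below repeats it as a function; the name battery_line and the sample numbers are made up, and the guard for a zero rate is an addition that the status script itself doesn't have. It assumes the capacity and rate are in matching units (typically mWh and mW), so that capacity divided by rate gives hours.

```sh
#!/bin/sh
# battery_line REMAINING LAST_FULL RATE   (hypothetical helper name)
# Same arithmetic as the status script: charge as a percentage of the last
# full capacity, and minutes left at the present discharge rate.
# The zero-rate guard is an addition, not part of the original script.
battery_line() {
    echo "$1 $2 $3" | awk '{
        pct = ($1 / $2) * 100
        if ($3 > 0) {
            printf "Bat: %.1f%%, %d min\n", pct, ($1 / $3) * 60
        } else {
            printf "Bat: %.1f%%\n", pct
        }
    }'
}

# 2500 of 5000 remaining, draining at 1250 per hour
battery_line 2500 5000 1250   # prints "Bat: 50.0%, 120 min"
```

Half the capacity is left, and 2500/1250 is two hours, hence 120 minutes.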

 MacBook Pro

  • October 24th, 2007
  • 11:04 pm

Last week, we had some of the local drugheads come into the office at night, and they ran off with my laptop and 6 other laptops. I had a trusty Lenovo Thinkpad z61m with Ubuntu (and xmonad, of course), which is now gone, together with whatever I had on it. Naturally, I didn't have a backup of everything, but at least all of my code was checked into Subversion, so the most critical thing I'm missing now is some course material.
Luckily, it didn't take long before we got permission to go out and buy some new laptops (this is one of the great things about working for a small company: no company procedures which have to be followed). We were told that we should probably go for an IBM, HP or Apple laptop. Now, I don't know about you, but when I buy new hardware, I can't just decide. It takes a couple of days (or even weeks), so deciding in half an hour what to buy and live with for the next couple of years certainly wasn't easy. In the end, I got a MacBook Pro with a 15" display, but only after confirming that it could run Linux, in case OS X got on my nerves too much.
Having lived with xmonad for a couple of months, it didn't take long before I decided that OS X might be pretty nice, but there's just too much focus on the looks, so I installed rEFIt, downloaded the new Ubuntu 7.10 and installed it.

I had anticipated some problems installing, as some of the experience reports on the net indicated that some things might go wrong, but that didn't really happen. Everything just installed (tm), almost: the wireless driver had to be compiled from Subversion. Other than that, everything works: X, accelerated graphics, Xinerama/TwinView, sound, webcam, temp. sensors, keyboard lights, and touchpad with different zones. Only two things aren't working: the remote control and suspend to RAM.
I can live without the remote control (actually, I can't really imagine what I should use it for if it worked), but the missing software suspend is not exactly optimal. It doesn't help that it's not consistent: Sometimes it can actually suspend and resume correctly. Using 'pm-suspend --quirk-post-vbe' seems to increase the number of times it works.
At other times, it doesn't suspend. Sometimes, it just blanks the screen, and nothing else happens. It can also switch to console with a blinking cursor, and then nothing else happens. Sometimes blinking the cursor is so hard it has to turn on the fans. No matter what, a cold boot is the only way forward.
If it manages to suspend, resume often fails. When this happens, the disk turns on, but the screen doesn't, and there's no response to any keypresses. Again, cold boot is the way forward.
I haven't managed to identify why software suspend doesn't work. The only hint I have is that if the network cable is unplugged when I suspend, it seems to suspend successfully more often than otherwise. Acpid should unload the network drivers before suspend, so I can't see why this has anything to do with it. I'm hoping a kernel upgrade will fix it at some point.

 xfsdump/xfsrestore

  • January 5th, 2007
  • 7:50 pm

Because I'm going to need this later, and I don't want to forget:

Dump xfs to a remote host:
xfsdump -l0 - /mnt | gzip -c | ssh user@host dd of=/somewhere/backup.dgz

Restore xfs from remote host:
ssh user@host "dd if=/somewhere/backup.dgz" | gunzip -c | xfsrestore - /mnt
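
Before relying on a dump like this, it doesn't hurt to check that the compressed file survived the trip. A minimal sketch, assuming gzip is available wherever the file lives; the function name check_dump is made up, and the check only covers the gzip layer, not the xfsdump contents inside it:

```sh
#!/bin/sh
# check_dump FILE   (hypothetical helper name)
# Succeeds if FILE is an intact gzip stream, fails otherwise.
# Reading from stdin sidesteps any handling of the non-standard .dgz suffix.
check_dump() {
    gzip -t <"$1" 2>/dev/null
}
```

On the remote host the same test can be run directly, e.g. ssh user@host 'gzip -t </somewhere/backup.dgz' && echo ok.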

 SSL certificates

  • November 3rd, 2006
  • 7:16 pm

It's always a mess to create self-signed certificates and manage them correctly. Instead, I've used CAcert, where you can get free certificates without too much trouble. You just have to import their root certificate, which isn't that bad.
However, you're out of luck if you're stuck with Java 1.4: the CAcert certificates use 4096-bit keys, which Java 1.4 can't handle. It just dies with

CODE:
  keytool error: java.io.IOException: Keystore was tampered with, or password was incorrect

or some other variation of the same error. Nothing can solve that besides upgrading to 5.0, which would be a good choice, but at the moment, I'm stuck with a 1.4 app server.

Instead, I found StartCom Free SSL, which provides more or less the same service, just with lower-grade certificates. And as an extra bonus, their root certificate is already included in Firefox.