ben / tech
This is a page for fragments of text that are each not really big enough to justify getting their own separate page, as well as links to stuff that does have its own page.
Some of the code that I work on is browsable in gitweb.
I have some Java code for converting to/from Roman numerals, and a simple applet to demonstrate.
I also have a Literate Haskell document converting from Roman numerals
to integers (but not the other way round) which is available as
Hrn.lhs.
It implements the same algorithm as the above Java code.
(darcs version hrn 1)
Whilst there are a number of styles for writing Roman numerals (several of which are implemented in the above code), there appears to be a simple algorithm that can correctly parse almost any 'reasonable'-looking Roman number (sufficiently simple that I'm inclined to treat it as a test for whether a number is valid or not). One thing (way down on the todo list) is to gather together examples of the various styles in use (the most bizarre I've seen so far is the (?)Borgias Rooms in the Vatican, though all of these are correctly parsed by my above algorithm, I think).
This code will be under a BSD-like license once I get round to it.
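To give a feel for how little is needed to parse most numerals, here is a minimal Python sketch. It is not the algorithm from Hrn.lhs or the Java code (neither is reproduced on this page); it assumes the common rule that a symbol is subtracted when a larger symbol follows it and added otherwise. The name `roman_to_int` is made up for this example.

```python
# Values of the seven standard Roman numeral symbols.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Leniently parse a Roman numeral (assumed rule, not the author's algorithm).

    A symbol is subtracted if the symbol immediately after it is larger,
    and added otherwise. Non-Roman characters raise KeyError.
    """
    digits = [VALUES[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(digits):
        if i + 1 < len(digits) and value < digits[i + 1]:
            total -= value
        else:
            total += value
    return total
```

Because it only ever looks one symbol ahead, this accepts additive forms such as IIII as well as the usual subtractive ones, but it does nothing to reject strings that no one would actually write.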
A few patches I use on top of msnlib 3.5:
3m for 3 minutes, 2d for two days.
More than a usual amount of CPU time is spent generating my shell prompt:
export PS1='!\! $(if [ "\j" = "0" ]; then echo -n; else echo "[\j] "; fi)\u@\h:\w$(output-git-head-or-blank)\$ '
with output-git-head-or-blank being a simple script:
#!/bin/bash
# Print " #<branch>" when inside a git repository.
PS_GIT=$(git symbolic-ref HEAD 2>/dev/null) && echo " #$(basename "$PS_GIT")" && exit
# else output nothing
Having become frustrated at my inability (due to wine) to finish off a sudoku puzzle that a Japanese girl made me attempt from a copy of British Balls, a magazine for backpackers and British in Australia, I knocked up a solver over a couple of days in Haskell (my first real Haskell code). There is a git repo that is not online yet. I've only tested it on one puzzle, the abovementioned one in British Balls. If I get round to it, I'll type some more in and see if it can deal with them.
The code is under a BSD-like license.
You can view the repository in gitweb.
Spam filtering of my inbound mail is mainly focused around spamassassin, which combines a large number of tests to provide a single numerical score.
On one of my MXes, I use the spamassassin milter to check messages during the delivering SMTP session. Mail that scores very highly is rejected at the SMTP level. Surprisingly (to me), a lot of mail is rejected in this fashion. This is better than sending a bounce email because it is sending back the reject down the same channel as was used to deliver the message, rather than sending a new email to what the headers claim is the original sender.
Unfortunately, this does not catch mail delivered to my lower priority MXes (a common spammer trick), though I've previously used the BOGUS_MX spamassassin plugin (I don't have it running at the moment though) to give extra points for such mail.
Mail which gets this far is never bounced due to spaminess. Instead, it falls into one of three buckets: probably-not-spam, maybe, or probably-spam, based on the spamassassin score. probably-not-spam is for messages which spamassassin gives a 'not spam' rating; maybe is for messages which spamassassin gives a 'spam' rating but with a borderline score; probably-spam is for messages which spamassassin gives a very high score.
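The three-way split can be sketched in a few lines of Python. The thresholds are assumptions: 5.0 is spamassassin's default required_score, and 10.0 is a made-up cut-off for "very high"; the values actually used here are not recorded on this page.

```python
SPAM_THRESHOLD = 5.0   # spamassassin's default required_score (assumed to apply here)
HIGH_THRESHOLD = 10.0  # hypothetical cut-off for a "very high" score

def bucket(score):
    """Map a spamassassin score to one of the three buckets."""
    if score < SPAM_THRESHOLD:
        return "probably-not-spam"   # spamassassin says 'not spam'
    if score < HIGH_THRESHOLD:
        return "maybe"               # rated spam, but only borderline
    return "probably-spam"           # very high score
```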
For outbound mail protection, I have SPF set up. SPF stores an outbound mail policy in DNS. Anyone receiving mail claiming to be from a hawaga.org.uk email address can check this policy to determine if the mail is authorised or not. I've tried a few different policies. At the moment, the policy lists a few IP address blocks that are permitted to send, and SOFTFAILs on everything else (there are still a few places that seem to send mail as me, such as blogger...). Previously, I've had per-user filtering, to list usernames that it was permissible to send mail from (hawaga does not have many outbound email addresses), but this isn't deployed at the moment.
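A policy of that shape (a few permitted address blocks, SOFTFAIL on everything else) looks roughly like the record in the sketch below. The addresses are documentation ranges, not the real hawaga.org.uk policy, and `spf_check` is a hypothetical helper that understands only the ip4:, ip6: and all mechanisms; a real SPF checker handles many more (a, mx, include, redirect and so on).

```python
import ipaddress

# Example record using documentation address ranges (not a real policy).
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ~all"

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_check(record, sender_ip):
    """Evaluate the ip4:/ip6:/all mechanisms of an SPF record, left to right."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"                      # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"                    # a bare mechanism means "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        elif term == "all":
            return QUALIFIERS[qualifier]   # matches every sender
        # any other mechanism is ignored by this sketch
    return "neutral"                       # nothing matched
```

With the example record, mail from 192.0.2.10 passes, while mail from an unlisted address gets a softfail rather than an outright fail, which is what leaves room for the stray legitimate senders.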
Interesting event scrollback in-mud: store "interesting event" lines in their entirety in-mud (rather than just storing that they happened).
Some stuff that I run on my mail (specifically spamassassin) ends up aborting if the CPU load or whatever is too high. It still makes sense to filter this stuff later on (hours later, even; in fact, due to distributed spam DBs, it may be better to filter later on). So it would be cool to have a wee little routine that scans a maildir and, for every message that has not been spamassassined, runs the check on it; or, for every message that has not been checked in the past n minutes, runs the check again. From that result, it would move the message elsewhere, and so on.
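The scanning half of that routine could start out like the Python sketch below. It assumes a message counts as "spamassassined" if it carries an X-Spam-Status header (one of the headers spamassassin adds by default); the function name is made up, and actually running the check and moving the message are left out.

```python
import mailbox

def unchecked_keys(maildir_path, header="X-Spam-Status"):
    """Return the keys of messages in a maildir that lack the given header.

    Assumption: a missing X-Spam-Status header means spamassassin
    never got to process the message.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    try:
        return [key for key, msg in box.iteritems() if msg[header] is None]
    finally:
        box.close()
```

Each returned key could then be fed to spamassassin and the message refiled according to the resulting score.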
For the past few years I've been using the excellent 'album' script from Dave's Marginal Hacks. On top of this, I've ended up with an increasingly unwieldy set of shell scripts to handle things like categorising photos and applying comments automatically.
So I'd like to redo the lot in a slightly more elegant, efficient fashion.
I use a front-end caching mechanism to speed up my photo website, with rough setup notes here.
My spamassassin configuration puts extra headers in to give details about the scores each rule provided, etc. I would like that info to be preserved in the case that the message went through the milter.
Something that behaves mostly like LambdaMOO (at least as far as end user interaction goes) but very distributed (potentially each object running in its own host/security domain/whatnot); it should be sufficiently distributed that there need only be one instance, in the same sense that there is only one internet.
One feature I've seen under VMS, which I think I like better than the concept of doing union mounts as explicit mounts, is something like being able to type cd $PATH or cd $MANPATH and end up in what appears to be a directory with the contents of each directory listed in that variable. I wonder if there is a nice way to do that under unix.
Maybe implement this one day: it would be nice if rsync could detect that files were moved, rather than thinking it was two files, one deleted and one added. Use e.g. SHA-1 on the list of removed and added files to determine this?
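The matching step might look like this Python sketch. It is only an illustration of the idea, not anything rsync does: `detect_moves` is a hypothetical helper, and for simplicity it takes the file contents as in-memory bytes rather than reading from disk.

```python
import hashlib

def detect_moves(removed, added):
    """Pair removed and added paths whose contents have the same SHA-1 digest.

    `removed` and `added` map path -> file contents (bytes).
    Returns a dict mapping old path -> new path for each detected move.
    """
    by_digest = {}
    for path, data in removed.items():
        by_digest.setdefault(hashlib.sha1(data).hexdigest(), []).append(path)

    moves = {}
    for new_path, data in added.items():
        candidates = by_digest.get(hashlib.sha1(data).hexdigest())
        if candidates:
            # Identical contents: treat as a move and use each old path only once.
            moves[candidates.pop(0)] = new_path
    return moves
```

Anything left unmatched would still be handled as an ordinary delete or add.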
An RSS generator for text files. When run given a text file and a (possibly not-existing-yet) RSS file, it generates a new RSS file containing all the old entries plus a new entry for any text that has been added to the text file since the last run. Gives an RSS feed that you can make new entries to by appending to the text file.
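One way this could work, sketched in Python under a big assumption: the text file is only ever appended to, so the concatenation of all existing entry descriptions is exactly the text seen so far, and whatever follows it is the new entry. The function name, the entry titles and the bare-bones channel (no link or description elements) are all made up for the sketch, and it works on strings rather than files.

```python
import xml.etree.ElementTree as ET

def update_feed(old_rss, text, title="txtrss feed"):
    """Return RSS XML with the old entries plus one entry for newly appended text.

    `old_rss` is the previous feed as a string, or None on the first run.
    """
    if old_rss:
        rss = ET.fromstring(old_rss)
        channel = rss.find("channel")
    else:
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = title

    # Text already published = all earlier descriptions joined together.
    seen = "".join(item.findtext("description", "")
                   for item in channel.findall("item"))

    # Only add an entry if the file still starts with the old text and has grown;
    # an edited (rather than appended-to) file is left alone by this sketch.
    if text.startswith(seen) and len(text) > len(seen):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"Entry {len(channel.findall('item'))}"
        ET.SubElement(item, "description").text = text[len(seen):]
    return ET.tostring(rss, encoding="unicode")
```

Running it twice with a file that has grown from "hello\n" to "hello\nworld\n" gives a feed with two entries, the second containing just "world\n".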
Related to the above txtrss: I want to generate blog-like structures on the web, but I don't like the input side of things like blogger or livejournal. They are a little bit too online for my tastes (I'm sporadically connected, and it's the times when I'm not connected, on a train/plane/bus, that I most want to do editing etc.).
A while ago when I worked at ISI, I looked at doing some kind of fragmented XML transmission (imagine something like diff-based XML transmission, except that for what we/I wanted to do, diffs didn't work so well, so a more complicated framework is needed). I thought about this loads but never really got any code written. It would be fun to spend a weekend hacking this up (I think that's enough for a basic prototype). Do it in Haskell for extra language learning points!
Interestingly, googling round a year or so later, I found that there's been some other work that is really close to some of the ideas I was tossing round: http://lambda.uta.edu/webdb05.pdf
During my travels, I've collected a bunch of SIM cards. Each has an on-chip phonebook, which I'd like to keep (partially?) synchronised.
Eventually I may have enough numbers that they will not all fit on some of my cards; thus I might want to have certain UK people just on my UK SIMs, Australia people just on my Australia SIM, Orange-specific numbers (such as balance enquiry) just on my Orange SIM etc., whilst good friends/family should be on all SIMs.
Furthermore, I might be using several SIMs at once (I own three phones in various states of repair) with modifications happening to the SIM DB (mostly additions and modifications; it's probably OK if delete doesn't work perfectly).
So, given a SIM card reader for my computer, how to use that to keep everything synchronised as desired?
Ages ago I wrote the following:
pondering over maildir, which I have been using recently, I have a few questions. I'd be interested if anyone has commentary on such:
procmail creates unique IDs that look like:
_-BF,okOjDB.mundungus:2,S
_-DE.H81pDB.mundungus:2,S
which the pine maildir driver does not appear to see. This syntax looks nothing at all like the unique ID syntax detailed on djb's maildir.html page.
the Courier imap server I use creates filenames that look like:
1136850601.M749735P15170V0E000002I00CB189A_2.piva.local,S=485:2,S
1136899138.M533326P308V0E000002I00CB4AC9_0.piva.hawaga.org.uk,S=663:2,S
The ,S= syntax doesn't seem to match up with the maildir.html syntax.
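For picking these names apart, here is a small Python sketch. It rests on my reading of the formats, which is an assumption: everything before the colon is the unique name, the part after "2," is the flag list, and ,S=<n> is a message size that Courier tacks on to the unique name. The function name is made up.

```python
import re

def parse_maildir_name(name):
    """Split a maildir filename into unique name, optional size, and flags.

    Assumes the layout "<unique>[,S=<size>]:2,<flags>".
    """
    unique, _, info = name.partition(":")
    flags = info[2:] if info.startswith("2,") else ""
    match = re.search(r",S=(\d+)", unique)
    size = int(match.group(1)) if match else None
    return {"unique": unique, "size": size, "flags": flags}
```

On the Courier example this gives a size of 485 and the flag S (seen); on the procmail example it gives no size, with the comma staying part of the unique name.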
This is how I'm using maildirs at the moment. The purpose is to get a decent distributed mailbox that can be modified 'simultaneously' in different locations without network connection (specifically, I have a laptop and an always-net-connected server; usually mail is delivered to the always-net-connected server, but I read mail on a combination of the laptop and the always-net-connected server, depending on my situation). I'd like this to be able to scale up to allow ~10 machines to participate in the pool (for example, one or more work PCs, a home PC, potentially multiple delivery servers). No one particular host should be the master; this means that any host can die and everything else can carry on with little or no reconfiguration.
Shepard tones. See this wikipedia entry. There's java code to generate sox raw data format:
Shepard.java and an mp3 of the output:
shepard_tone.wav.mp3 (3 Mb).
Use like this: java Shepard > a.dat && sox a.dat a.wav
-- end --