Elements or attributes, the eternal question

Sunday, May 10 2009

Nice take on when to use attributes in XML here.

no comments

Tags: xml ~ linky

Perils of late binding in Python

Thursday, April 30 2009

So I haven’t been doing a lot of Python recently, and I got tripped up by something that in retrospect should have been obvious.

You can write code that references an undefined thingy, and Python won’t complain until you actually run the code and try to access the thingy.

Eg:

>>> def f():
...     z()
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
NameError: global name 'z' is not defined

Which kind of sucks when you forgot to write a test for that code, and you get a runtime exception.
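One rough way to spot this without running the code is to look at the names a function's bytecode refers to. This is only a sketch, written for Python 3 (the session above is Python 2). `unresolved_globals` is a made-up helper, not a standard function, and because attribute names also end up in `co_names` it can report false positives.

```python
import builtins

def unresolved_globals(func):
    """List names used by func that are neither module globals nor builtins.

    Hypothetical helper: co_names also holds attribute names, so treat the
    result as a hint rather than proof.
    """
    scope = func.__globals__
    return [name for name in func.__code__.co_names
            if name not in scope and not hasattr(builtins, name)]

def f():
    z()  # z is never defined, but Python only notices when f() runs

def g():
    return len([])  # len is a builtin, so nothing is reported

print(unresolved_globals(f))
print(unresolved_globals(g))
```

A linter such as pyflakes does this sort of check far more thoroughly; the point here is only that the information is available before the call blows up.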

no comments

Tags: python

Helpful features that aren’t

Wednesday, March 25 2009

Dear WhitePages,

Your “suggested locations” dropdown and region searching are broken.

Eg, I want to find “City GPs” in Wellington. As I type “Wellington”, the first suggested location is “Wellington Central”. Aha! I think. They are indeed in Central Wellington.

However, this search yields no results. Neither does “Wellington City” or “Wellington CBD”, even though they are at the city end of Willis St and definitely would be in both those regions. Only plain “Wellington” gives me a result.

To add insult to injury, the suggestion when there are no results is to “refine my search”. But “Wellington CBD” IS more refined than “Wellington”.

At this point, your suggested locations feature is actually more useless and more annoying than if it didn’t exist at all. Either get rid of it, or make your subcategories work as expected.

Yours sincerely

Stephen

1 comment

Tags: usability ~ catalyst ~ misfeature

Physician, heal thyself

Sunday, March 22 2009

The other day I was reading Ryan Tomayko’s blog and I got inspired.

Ryan wrote the Kid templating library which drives this blog, and is quite the Python/Ruby hacker. He also has a very minimalist design. Its principles are outlined here.

With hypertext, the information itself is the interface. The content takes center stage while the chrome and tool areas are placed in the back-seat. This inversion of priorities has created as big a leap in interface innovation as the first graphical user interfaces did to the terminal based applications before them.

And yet, these fine attributes of hypertext are regularly subverted. Since the web’s inception and subsequent boom, people have been trying to get around hypertext’s “limitations” as an interface medium: first with Java Applets and Active X controls, later with Flash sites, and today with Rich Internet Application (RIA) platforms. There was a time when sites were authored with the goal of preventing the vertical scroll-bar from ever appearing! The goal is always the same: invert the web’s superior content-oriented interface back to the GUI era and allow for the types of administrative debris so common and accepted in desktop applications.

I have applied them over on my other channel. (I also made a bunch of other improvements, like per-tag RSS feeds, and better 404 handling.)

I often have rude things to say about other people’s usability, so it feels good to get my own house in order. I am interested though in whether there is such a thing as best-practice design for blogs. For example, are “recent comments” widgets useful? Should you have whole articles rather than excerpts on your home page, and if so, how many? I don’t know, but I’d like to.

Naturally, this blog is still untouched and looks like pus; in fact owing to changes made for the other channel, it’s worse than before. This will not be the case for long.

no comments

Tags: usability ~ burble ~ ryan tomayko ~ catalyst

A skeletal Python script

Tuesday, March 03 2009

Had a burst of hacking over the weekend, and one of the outcomes was the realisation that I have a few practices that could be usefully put into a template for new scripts.

So: here is my current starting point for any new script.

 

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
from optparse import OptionParser

def _test():
    # run any doctests found in this module's docstrings
    import doctest
    doctest.testmod()

def _profile_main(filename):
    # run _main under the profiler and print the ten most expensive calls
    import cProfile, pstats
    prof = cProfile.Profile()
    ctx = """_main(filename)"""
    prof = prof.runctx(ctx, globals(), locals())
    stats = pstats.Stats(prof)
    stats.sort_stats("time")
    stats.print_stats(10)

def _blurt(s):
    # debugging output: a no-op unless --verbose or --test rebinds it below
    pass

def _main(filename):
    pass

if __name__ == "__main__":
    usage = "usage: %prog [options]"
    parser = OptionParser(usage=usage)
    parser.add_option('--profile', '-P',
                      help = "Print out profiling stats",
                      action = 'store_true')
    parser.add_option('--test', '-t',
                      help ='Run doctests',
                      action = 'store_true')
    parser.add_option('--verbose', '-v',
                      help ='print debugging output',
                      action = 'store_true')

    (options, args) = parser.parse_args()

    # assign non-flag arguments here
    filename = args[0] if args else None

    def really_blurt(s):
        print s

    if options.verbose:
        _blurt = really_blurt

    if options.profile:
        _profile_main(filename)
        sys.exit()

    if options.test:
        _blurt = really_blurt
        _test()
        sys.exit()

    _main(filename)

2 comments

Tags: python

Using Gnome Do’s Docky view with dual monitors

Tuesday, March 03 2009

Gnome Do offers a thing called “Docky” which is somewhat like the Mac OS X Dock. I’ve become quite fond of it.

Docky has an autohide mode, so that it will only appear when your mouse goes below the bottom edge of the screen.

I have dual monitors at home and at work, and I’m afraid that if auto-hide is on, Docky disappears and won’t come back, except for the odd flicker. This is a problem, because the only easy way to toggle auto-hide mode is by right-clicking on Docky.

I realised that this setting was probably in gconf. It is. You can use gconf-editor to find Gnome Do’s settings and tweak Docky autohide there. Problem solved.

Also, bug reported.

no comments

Tags: gnome do ~ docky ~ autohide ~ dual monitors

Kiwibank’s KeepSafe feature, and ETAOIN SHRDLU

Friday, January 30 2009

Kiwibank have added a new step to their login process, called KeepSafe.

In this step, the user has already chosen a small set of questions and supplied the answers, like “Where were you born” or “What’s your pet’s name?” When they log in, they are prompted with the questions and asked to select the letters at randomly chosen positions in the answer (eg the 1st and 5th letters).

The aim is to defeat keyloggers. The user uses their mouse to select letters from a display of the alphabet, and they never type the whole answer, so an attacker who logged mouse clicks would have to capture multiple logins.
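To make the mechanism concrete, here is a minimal sketch of that kind of partial-answer challenge, written for Python 3. It is my guess at the general shape, not Kiwibank's implementation: `make_challenge`, `check_response` and the sample answer are all made up, and it ignores how the answer would be stored on the server.

```python
import secrets

def make_challenge(answer, how_many=2):
    """Pick distinct 1-based letter positions to ask the user for."""
    rng = secrets.SystemRandom()
    return sorted(rng.sample(range(1, len(answer) + 1), how_many))

def check_response(answer, positions, letters):
    """Compare the clicked letters with the stored answer, ignoring case."""
    expected = [answer[p - 1].lower() for p in positions]
    return [c.lower() for c in letters] == expected

answer = "wellington"   # hypothetical stored answer
positions = [1, 5]      # fixed here so the example is repeatable
print(check_response(answer, positions, ["w", "i"]))
print(check_response(answer, positions, ["w", "e"]))
```

Each login only ever reveals the letters at the positions asked for, which is why an attacker watching one session learns so little.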

My guess is that password-stealing malware is common enough now that it poses a significant risk to banks.

Unfortunately for users, this system is quite inconvenient. It involves an unaccustomed degree of mental and physical dexterity to select the correct letters. It is also inaccessible for people with text-only browsers, or who have JavaScript turned off (ironically, the very people least likely to be vulnerable to malware).

A friend suggested that their KeepSafe answer would be “KeepSafe is bloody annoying”. This inspired me. I realise now that the savvier user will set all their KeepSafe answers to AAAAAAAAAAAA.

I also wonder whether it wouldn’t be reasonably easy to guess KeepSafe answers. If I were a wily hacker, I’d use my dictionary to compile stats of the most common letters in English words, by word length and position in the word. Let’s see.

#!/usr/bin/python

import string

f = open('/usr/share/dict/words')

# one dict of letter counts per position; 'all' holds the running total
counts = [{'all': 0} for i in range(6)]

# snag all 6 letter words (each line is 7 characters including the newline)
for line in [l.lower().strip() for l in f.readlines() if len(l) == 7]:
    for i in range(6):
        # count the letters in position i
        letter = line[i]
        counts[i][letter] = counts[i].get(letter, 0) + 1
        # keep a total so we can compute a percentage easily
        counts[i]['all'] = counts[i]['all'] + 1

for pos in range(6):
    print "Position %d" % (pos + 1)
    tops = {}
    for letter in string.lowercase:
        # integer division, so percentages are truncated to whole numbers
        tops[letter] = counts[pos].get(letter,0)*100/counts[pos]['all']
    # take the top nine most frequent letters
    for pair in sorted(tops.iteritems(), key=lambda(k,v):(v,k), reverse=True)[0:9]:
        print "%s %02.2f%%" % (pair[0], pair[1]),
    print

Results:

Position 1
s 11.00% c 7.00% b 7.00% p 6.00% m 6.00% t 5.00% r 5.00% d 5.00% a 5.00%
Position 2
a 18.00% o 15.00% e 13.00% i 10.00% u 9.00% r 7.00% l 5.00% n 3.00% h 3.00%
Position 3
r 10.00% a 9.00% n 8.00% l 7.00% s 6.00% o 6.00% i 6.00% t 5.00% e 5.00%
Position 4
i 10.00% e 10.00% t 8.00% a 7.00% n 6.00% l 6.00% o 5.00% s 4.00% r 4.00%
Position 5
e 27.00% n 7.00% l 6.00% a 5.00% t 4.00% r 4.00% o 4.00% i 4.00% u 2.00%
Position 6
s 36.00% d 11.00% e 9.00% r 8.00% y 6.00% n 5.00% t 4.00% g 3.00% a 3.00%

The distribution of letters is quite skewed, and you get three goes with KeepSafe, so a patient intruder could probably guess a substantial minority of answers.
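As a rough worked example, suppose the challenge asks for the 1st and 5th letters of a six-letter answer. The sketch below (Python 3) takes the truncated percentages from the results above and adds up the three likeliest letter pairs. It assumes the two positions are independent and that answers look like dictionary words, neither of which is really true, so treat the figure as a ballpark only. `best_guess_chance` is a made-up helper.

```python
from itertools import product

# Truncated percentages copied from the results above (six-letter words only).
position_1 = {"s": 11, "c": 7, "b": 7, "p": 6, "m": 6, "t": 5, "r": 5, "d": 5, "a": 5}
position_5 = {"e": 27, "n": 7, "l": 6, "a": 5, "t": 4, "r": 4, "o": 4, "i": 4, "u": 2}

def best_guess_chance(first, second, attempts=3):
    """Chance that one of the `attempts` likeliest letter pairs is right,
    treating the two positions as independent (an assumption)."""
    joint = sorted((first[a] * second[b] for a, b in product(first, second)),
                   reverse=True)
    return sum(joint[:attempts]) / 10000.0

chance = best_guess_chance(position_1, position_5)
print("%.2f%%" % (chance * 100))
```

The three best pairs are (s, e), (c, e) and (b, e), which come to about 6.75% between them. Answers that are names or places rather than dictionary words would shift the numbers.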

I’m not sure what the end of this arms race will be.

no comments

Tags: security ~ kiwibank ~ python

A letter to Steven Joyce about S92A of the Copyright Amendment Act

Wednesday, January 28 2009

Dear Mr Joyce

I am writing to you in the hope that you will take action to prevent s92A of the Copyright Amendment Act from taking effect.

The law in question suffers from the following problems:
– it reverses the normal presumption of innocence
– it imposes no penalty for improper accusations
– it provides no easy remedy for people wrongly accused to have their access to an essential service restored
– it is likely to punish people who have done no wrong (for example, parents of teenagers, managers of organisations with careless employees, victims of viruses, flatmates who share an internet connection, etc).

In other jurisdictions, especially the US, recording industry bodies have been both aggressive and inaccurate in their attempts to pursue file sharers. In Australia they are suing ISPs who ask them to verify their accusations. In the UK, a parallel law has already been ruled out as being unworkable from the get-go.

Our government officials are on record as saying that laws against fraud will be sufficient to deter false accusations. This is clearly not so. The recording industry, unlike the typical citizen, is well-funded and well-advised by lawyers. It will be difficult for the police or for a private citizen to prove criminal intent for an incorrect takedown notice.

This law is ill-conceived, attacks the rights of ordinary citizens, and poses a real threat to the livelihood of anyone who depends on a working internet connection.

I look forward to hearing that this legislation from the previous government will be reviewed by the current one in the common sense manner prized by the National party.

Yours sincerely

Stephen Judd

2 comments

Tags: layer 8: politics

Painless html parsing with lxml

Wednesday, January 14 2009

I am working on a Ruminator 2.0. I intend to parse full stories, not just the summaries that appear in RSS.

So I’ve been investigating my options for HTML parsing. There are quite a few options for Python, with varying degrees of speed, flexibility, and tolerance for broken markup.

After a rapturous writeup from Ian Bicking, I thought I’d try lxml, which is a Pythonic wrapper around Gnome’s libxml2 and libxslt libraries. I’m sold. You can even use CSS selectors, just like jQuery! (I like not having too much loaded into my head at once).

Suppose you want to scrape a news story (for statistical analysis, not copyright infringement) from the NZ Herald:

>>> from lxml.html import parse
>>> doc = parse('http://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=10551829&ref=rss&pnum=0').getroot()
>>> paras = doc.cssselect('div.article-holder p')
>>> for p in paras:
...     print p.text_content()

Easy peasy.

1 comment

Tags: python ~ lxml ~ the ruminator

Issues in authentication systems

Friday, November 14 2008

I have my own issues with biometric authentication systems, but this is not one I had foreseen.

To Whom it May concern: It has come to the attention of Recognition Systems that some people have a particular concern about using our hand scanners which relates to their religious beliefs. The concern revolves around the detection or placement of what is described in the Scriptures as “the mark of the Beast.”

Read the whole thing.

no comments

Tags: security ~ authentication ~ biometrics

Rendered at 2009-07-04 20:19:29