Kiwibank’s KeepSafe feature, and ETAOIN SHRDLU

Friday, January 30 2009

Kiwibank have added a new step to their login process, called KeepSafe.

In this step, the user answers one of a small set of questions they have previously selected, like “Where were you born?” or “What’s your pet’s name?” When they log in, they are prompted with the questions and asked to select randomly chosen letters from the answer (e.g. the 1st and 5th letters).

The aim is to defeat keyloggers. The user uses their mouse to select letters from a display of the alphabet, and they never type the whole answer, so an attacker who logged mouse clicks would have to capture multiple logins.
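To make the mechanism concrete, here is a minimal sketch of how a challenge like this might work. It is only my guess at the general shape, not Kiwibank's implementation: the function names `make_challenge` and `check_response` are hypothetical, and I am assuming the site asks for two letter positions per login.

```python
import random

def make_challenge(answer, count=2, rng=random):
    """Pick `count` distinct 1-based letter positions from the stored answer.

    Hypothetical helper: the real number of positions asked for is an assumption.
    """
    return sorted(rng.sample(range(1, len(answer) + 1), count))

def check_response(answer, positions, letters):
    """Return True if the clicked letters match the answer at those positions."""
    expected = [answer[p - 1].lower() for p in positions]
    return expected == [c.lower() for c in letters]

if __name__ == "__main__":
    positions = make_challenge("Wellington")
    print("Please select letters %s of your answer" % positions)
```

Because each login only reveals the letters at the requested positions, someone logging the clicks learns a couple of letters at a time rather than the whole answer.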

My guess is that password-stealing malware is common enough now that it poses a significant risk to banks.

Unfortunately for users, this system is quite inconvenient. Selecting the correct letters takes an unaccustomed degree of mental and physical dexterity. It is also inaccessible to people with text-only browsers, or who have JavaScript turned off (ironically, the very people least likely to be vulnerable to malware).

A friend suggested that their Keepsafe answer would be “Keepsafe is bloody annoying”. This inspired me. I realise now that the savvier user will set all their Keepsafe answers to AAAAAAAAAAAA.

I also wonder whether it wouldn’t be reasonably easy to guess Keepsafe answers. If I were a wily hacker, I’d use my dictionary to compile stats of the most common letters in English words, by word length and position in the word. Let’s see.

#!/usr/bin/python
# Python 2 script

import string

f = file('/usr/share/dict/words')

# one dictionary of letter counts per position in the word
counts = [{'all':0},{'all':0},{'all':0},{'all':0},{'all':0},{'all':0}]

# snag all 6 letter words (length 7 including the trailing newline)
for line in [l.lower().strip() for l in f.readlines() if len(l) == 7]:
    for i in range(6):
        # count the letters in position i
        letter = line[i]
        counts[i][letter] = counts[i].get(letter, 0) + 1
        # keep a total so we can compute a percentage easily
        counts[i]['all'] = counts[i]['all'] + 1

for pos in range(6):
    print "Position %d" % (pos + 1)
    tops = {}
    for letter in string.lowercase:
        # integer division, so percentages are truncated to whole numbers
        tops[letter] = counts[pos].get(letter,0)*100/counts[pos]['all']
    # take the nine most frequent letters
    for pair in sorted(tops.iteritems(), key=lambda(k,v):(v,k), reverse=True)[0:9]:
        print "%s %02.2f%%" % (pair[0], pair[1]),
    print

Results:

Position 1
s 11.00% c 7.00% b 7.00% p 6.00% m 6.00% t 5.00% r 5.00% d 5.00% a 5.00%
Position 2
a 18.00% o 15.00% e 13.00% i 10.00% u 9.00% r 7.00% l 5.00% n 3.00% h 3.00%
Position 3
r 10.00% a 9.00% n 8.00% l 7.00% s 6.00% o 6.00% i 6.00% t 5.00% e 5.00%
Position 4
i 10.00% e 10.00% t 8.00% a 7.00% n 6.00% l 6.00% o 5.00% s 4.00% r 4.00%
Position 5
e 27.00% n 7.00% l 6.00% a 5.00% t 4.00% r 4.00% o 4.00% i 4.00% u 2.00%
Position 6
s 36.00% d 11.00% e 9.00% r 8.00% y 6.00% n 5.00% t 4.00% g 3.00% a 3.00%

The distribution of letters is quite skewed, and you get three goes with Keepsafe, so a patient intruder could probably guess a substantial minority of answers.
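To get a rough feel for the odds, here is a back-of-envelope sketch using the top three letters per position from the results above. It rests on some loud assumptions: the answer is distributed like a six-letter dictionary word, the two requested positions are independent of each other (they certainly aren't in real words), and the intruder gets three attempts at the same pair of positions. The names `TOP`, `best_guesses` and `success_chance` are mine, purely for illustration.

```python
from itertools import product

# Top three letters per position, percentages copied from the results above.
TOP = {
    1: {'s': 11, 'c': 7, 'b': 7},
    2: {'a': 18, 'o': 15, 'e': 13},
    3: {'r': 10, 'a': 9, 'n': 8},
    4: {'i': 10, 'e': 10, 't': 8},
    5: {'e': 27, 'n': 7, 'l': 6},
    6: {'s': 36, 'd': 11, 'e': 9},
}

def best_guesses(pos_a, pos_b, tries=3):
    """Return the `tries` most likely letter pairs with their estimated probability.

    Assumes the two positions are independent, which is a crude approximation.
    """
    pairs = [((a, b), pa * pb / 10000.0)
             for (a, pa), (b, pb) in product(TOP[pos_a].items(), TOP[pos_b].items())]
    pairs.sort(key=lambda item: item[1], reverse=True)
    return pairs[:tries]

def success_chance(pos_a, pos_b, tries=3):
    """Estimated chance that one of the best `tries` guesses is right."""
    return sum(p for _, p in best_guesses(pos_a, pos_b, tries))

if __name__ == "__main__":
    for pos_a, pos_b in [(1, 5), (5, 6)]:
        print(pos_a, pos_b, best_guesses(pos_a, pos_b), success_chance(pos_a, pos_b))
```

Under those assumptions, three guesses at the 1st and 5th letters come out at roughly 7%, and at the 5th and 6th letters roughly 15%. Real answers are names and places rather than dictionary words, so treat these figures as illustration only.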

I’m not sure what the end of this arms race will be.

no comments

Tags: security ~ kiwibank ~ python

A letter to Steven Joyce about S92A of the Copyright Amendment Act

Wednesday, January 28 2009

Dear Mr Joyce

I am writing to you in the hope that you will take action to prevent s92a of the Copyright Amendment Act from taking effect.

The law in question suffers from the following problems:
– it reverses the normal presumption of innocence
– it imposes no penalty for improper accusations
– it provides no easy remedy for people wrongly accused to have their access to an essential service restored
– it is likely to punish people who have done no wrong (for example, parents of teenagers, managers of organisations with careless employees, victims of viruses, flatmates who share an internet connection, etc).

In other jurisdictions, especially the US, recording industry bodies have been both aggressive and inaccurate in their attempts to pursue file sharers. In Australia they are suing ISPs who ask them to verify their accusations. In the UK, a parallel law has already been ruled out as being unworkable from the get-go.

Our government officials are on record as saying that laws against fraud will be sufficient to deter false accusations. This is clearly not so. The recording industry, unlike the typical citizen, is well-funded and well-advised by lawyers. It will be difficult for the police or for a private citizen to prove criminal intent for an incorrect takedown notice.

This law is ill-conceived, attacks the rights of ordinary citizens, and poses a real threat to the livelihood of anyone who depends on a working internet connection.

I look forward to hearing that this legislation from the previous government will be reviewed by the current one in the common sense manner prized by the National party.

Yours sincerely

Stephen Judd

2 comments

Tags: layer 8: politics

Painless html parsing with lxml

Wednesday, January 14 2009

I am working on a Ruminator 2.0. I intend to parse full stories, not just the summaries that appear in RSS.

So I’ve been investigating my options for HTML parsing. There are quite a few options for Python, with varying degrees of speed, flexibility, and tolerance for broken markup.

After a rapturous writeup from Ian Bicking, I thought I’d try lxml, which is a Pythonic wrapper around Gnome’s libxml2 and libxslt libraries. I’m sold. You can even use CSS selectors, just like jQuery! (I like not having too much loaded into my head at once).

Suppose you want to scrape a news story (for statistical analysis, not copyright infringement) from the NZ Herald:

>>> from lxml.html import parse
>>> doc = parse('http://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=10551829&ref=rss&pnum=0').getroot()
>>> paras = doc.cssselect('div.article-holder p')
>>> for p in paras:
...     print p.text_content()

Easy peasy.

1 comment

Tags: python ~ lxml ~ the ruminator


Rendered at 2010-03-13 08:54:39