Wednesday, July 23, 2014

JAFLOC

There is an acronym in programming, "JAFLOC", which stands for "Just A Few Lines Of Code"... and it is the traditional answer to any problem posed.

The Gorse Fox has 24 archives of emails stretching back to 1996 that need to be extracted so that he can mine them for useful historical information. This is a classic JAFLOC... but like most JAFLOCs it is a little more complex than it first seemed. Firstly, it is a matter of detecting the transition from one email to the next, which GF finally worked out is marked by the "0C"x code (hex 0C, the form feed character). Then each email seems to have a number of keywords, but the keyword list does not appear to be consistent. Before GF can decide on a final storage mechanism, he needs to isolate the keywords to decide how to handle them.

His code is currently mining its way through several million lines of text searching for the definitive list of keywords that will have to be handled. GF has processed over 40,000 historical emails so far... but the code is hammering onwards.
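
For the curious, the two steps might look something like the Python sketch below. This is only an illustration, not GF's actual code: the post does not say what language he used or what a keyword line looks like, so the sketch assumes a keyword is a word followed by a colon at the start of a line, and the names split_emails and count_keywords are made up for the example.

```python
import re
from collections import Counter

FORM_FEED = "\x0c"  # the "0C"x code that separates one email from the next

# Assumption: a keyword is a word followed by a colon at the start of a line.
# The real archive format may differ.
KEYWORD_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):")


def split_emails(text):
    """Split an archive into individual emails on the form feed character."""
    return [chunk for chunk in text.split(FORM_FEED) if chunk.strip()]


def count_keywords(emails):
    """Count how many emails each candidate keyword appears in."""
    counts = Counter()
    for email in emails:
        seen = set()
        for line in email.splitlines():
            match = KEYWORD_LINE.match(line)
            if match:
                seen.add(match.group(1))
        counts.update(seen)  # each keyword is counted once per email
    return counts
```

Counting per email rather than per line makes it easier to tell the keywords that turn up almost everywhere from the ones that appear only now and then. Note that any body line that happens to start with a word and a colon would be counted too, so the resulting list would still need checking by eye.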

2 comments:

Anonymous said...

This may be a stupid question, but why?

The Gorse Fox said...

The Gorse Fox maintains a separate blog which contains diary transcripts and other more personal information. This goes back to 1965. The idea of this mail archive was to use it to provide more detail regarding some events, where such detail was available in the emails.