There is an acronym in programming, "JAFLOC", which means Just a Few Lines of Code... and is the traditional answer to any problem posed.
The Gorse Fox has 24 archives of emails stretching back to 1996 that need to be extracted so that he can mine them for useful historical information. This is a classic JAFLOC... but like most JAFLOCs it is a little more complex than it first seemed. Firstly, it is a matter of detecting the transition from one email to the next, which GF finally worked out is marked by the "0C"x code (hex 0C, the form-feed character). Secondly, each email seems to have a number of keywords, but the keyword list does not appear to be consistent. Before GF can decide on a final storage mechanism, he needs to isolate the keywords to decide how to handle them.
His code is currently mining its way through several million lines of text searching for the definitive list of keywords that will have to be handled. GF has processed over 40,000 historical emails so far... but the code is hammering onwards.
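
For the curious, the two steps described above (splitting the archive on the "0C"x separator, then tallying the keywords found in each email) might look something like the Python sketch below. This is not GF's actual code, only an illustration under assumptions: it supposes that a keyword is a word at the start of a line followed by a colon, which the archives may or may not honour, and the names split_emails and count_keywords are made up for the example.

```python
import re
from collections import Counter

# The "0C"x separator between emails: hex 0C, the form-feed byte.
FORM_FEED = b"\x0c"

# ASSUMPTION: a keyword is a word at the start of a line followed by a colon,
# e.g. "Subject:". The real archive format may differ.
KEYWORD_RE = re.compile(rb"^([A-Za-z][A-Za-z0-9_-]*):", re.MULTILINE)


def split_emails(data: bytes) -> list[bytes]:
    """Split one archive into emails on the form-feed byte, dropping empty chunks."""
    return [chunk for chunk in data.split(FORM_FEED) if chunk.strip()]


def count_keywords(emails: list[bytes]) -> Counter:
    """Count how many emails each keyword appears in (case-insensitive)."""
    counts: Counter = Counter()
    for email in emails:
        # A set, so that a keyword repeated within one email is counted once.
        seen = {
            match.group(1).decode("latin-1").lower()
            for match in KEYWORD_RE.finditer(email)
        }
        counts.update(seen)
    return counts
```

Reading an archive as bytes and passing it through split_emails and then count_keywords gives a tally whose keys are the candidate keyword list. Any keyword that appears in only a handful of emails is a good hint that the assumed pattern is picking up body text rather than a genuine keyword.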