Hi Petter,
We here in Norway are now in the process of revitalizing the Norwegian Bokmål and Nynorsk spell checking package, URL:http://no.speling.org/.
Excellent - I just saw the project announcement come across on freshmeat.
In addition, a related group of people is funded by the Norwegian government to create spell checking systems for several of the Saami languages, URL:http://divvun.no/english.html. This work is organized by the University of Tromsø, and this group has access to a corpus, but could use more words.
Yes, I believe Børre Gaup wrote to me about this sometime last year.
Is the collection of files available on the web somewhere?
I don't make the word/frequency lists available on the web because (ironically) I intend them to be used only for open source projects. Can you tell me a bit about the licensing you'll be using for the spell checkers? As I recall, there was some kind of morphological back end being written for Saami. Will that be open source as well, or will you use it to generate a large word list offline? Are you writing affix files too?
Anyway, if you send me your latest word lists for all three languages (with affix flags expanded, if any) I can send lists of "best candidates" for addition that are determined via some naive statistics.
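As a rough illustration of what such "naive statistics" could look like (this is only an assumed sketch, not necessarily the method actually used), one simple approach is to count how often each word occurs in the crawled text and propose the most frequent words that are missing from the existing word list. The function name `best_candidates` and the `min_count` and `top_n` parameters are hypothetical.

```python
from collections import Counter


def best_candidates(corpus_tokens, known_words, min_count=2, top_n=10):
    """Rank corpus words that are absent from the word list by frequency.

    Assumption: tokens are already split on whitespace and the word
    list has its affix flags expanded into full word forms.
    """
    known = {w.lower() for w in known_words}
    # Count only purely alphabetic tokens, case-folded.
    counts = Counter(t.lower() for t in corpus_tokens if t.isalpha())
    candidates = [
        (word, count)
        for word, count in counts.items()
        if word not in known and count >= min_count
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return candidates[:top_n]


if __name__ == "__main__":
    tokens = "hus hus bil bil bil katt".split()
    print(best_candidates(tokens, {"hus"}))
```

The `min_count` threshold is there because words seen only once in web text are often typos, which is exactly what a spell checker should not learn.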
Norwegian Bokmål is missing from your status page. Would you be willing to collect documents for that language as well?
The crawler runs for Bokmål also, but the language has a substantial enough web presence that it doesn't qualify for "minority" status (and so is not listed on the page). In practical terms, this means that I don't let the crawler run to completion, but gather just enough text to use for frequency lists, 3-gram models, etc.
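For readers unfamiliar with the term, a 3-gram model in this setting is usually built from counts of three-character sequences in the collected text. The sketch below shows one minimal way to gather such counts; it assumes character-level trigrams (the message does not say whether character or word 3-grams are meant), and the function name `char_trigrams` is hypothetical.

```python
from collections import Counter


def char_trigrams(text):
    """Count character 3-grams in text.

    Whitespace runs are collapsed to single spaces and the text is
    padded with one space on each side, so word boundaries are
    represented in the counts.
    """
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


if __name__ == "__main__":
    for gram, count in char_trigrams("Bokmål og nynorsk").most_common(5):
        print(repr(gram), count)
```

Counts like these stabilize after a fairly modest amount of text, which fits with gathering only a sample for a well-resourced language rather than crawling to completion.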
I'll be travelling for the next week, but should have access to a computer - so please be patient if I'm slow in responding.
-Kevin