For Bokmål we have had some problems with the encoding of a few words in Debian installer level 1. With the following little snippet (which I got from Christian Perrier) it was easy to find which words/positions were affected:
cat file.po | LC_ALL=C iconv --from utf-8 --to iso-8859-1 >/dev/null
I have fixed the errors in debian-installer.po (and debian-installer-post-sarge.po) so that nb no longer has the status INVALID as reported in the message below.
Hans
PS! It looks like Sami (se) has some encoding errors.
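GNU iconv stops at the first sequence it cannot convert, so the one-liner above reports one problem per run. As a minimal sketch of how every affected line could be listed in one pass, the function below feeds the file to iconv one line at a time and prints the numbers of the lines that fail. The name find_bad_lines is made up for this illustration, and the sketch assumes an iconv (such as the GNU one) that exits with a non-zero status on invalid or unconvertible input.

```shell
#!/bin/sh
# find_bad_lines FILE FROM TO
# Prints the number of every line in FILE that iconv cannot convert
# from encoding FROM to encoding TO. (Hypothetical helper name.)
find_bad_lines() (
    # Run in a subshell with the C locale so lines are handled as raw bytes.
    LC_ALL=C
    export LC_ALL
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # Convert one line; only the exit status matters, output is discarded.
        if ! printf '%s\n' "$line" | iconv -f "$2" -t "$3" > /dev/null 2>&1; then
            echo "$n"
        fi
    done < "$1"
)
```

It could be called as `find_bad_lines debian-installer.po utf-8 iso-8859-1`. Testing line by line is slower than a single iconv run, but it does not depend on the wording of iconv's error message.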
----- Forwarded message from Christian Perrier bubulle@debian.org -----
Date: Fri, 28 Jan 2005 08:14:05 +0100
From: Christian Perrier bubulle@debian.org
To: debian-i18n@lists.debian.org
Subject: [D-I] Testing the validity of encodings for Debian Installer translations
X-Mailing-List: debian-i18n@lists.debian.org archive/latest/3422
The Norwegian translators recently discovered that their translations had a few encoding errors. The PO file for Debian Installer was declared as using UTF-8 encoding, but in a few places it contained invalid characters.
This led me to test these files. Though I've read that iconv cannot always detect every possible error, I've used it on all files.
For each language, I have two possibilities:
- if the translation uses UTF-8, I try converting it to another encoding which is suited for that language.
[snip]
- if the translation uses another encoding, I try converting it to UTF-8
Languages which do not use the master file (pt, nl, uk...) have not been tested.
Below is the result. Please note that some "INVALID" warnings may be false alarms because the chosen alternate encoding is inappropriate. In such a case, please check the alternate encoding table.
Testing gl...UTF-8 to iso-8859-15 --> INVALID
Testing ja...UTF-8 to EUC-JP --> INVALID
Testing nb...UTF-8 to iso-8859-1 --> INVALID
Testing se...UTF-8 to iso-8859-1 --> INVALID
Testing sl...UTF-8 to iso-8859-2 --> INVALID
Testing sq...UTF-8 to iso-8859-1 --> INVALID
The others were found to be correct.
[snip]
----- End forwarded message -----
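The script behind the forwarded report is not shown, so the following is only a sketch of the two-way test it describes: read the charset declared in the PO header, convert UTF-8 files to an alternate encoding, and convert everything else to UTF-8. The names po_charset, test_po and test_all, the `<lang>.po` file layout, and the "OK" verdict for files that convert cleanly are all assumptions made for this illustration. The language/encoding pairs are copied from the report.

```shell
#!/bin/sh
# Sketch of the per-language test described above. All names are hypothetical.

# Print the charset declared in the header of PO file $1, e.g. "UTF-8".
po_charset() {
    LC_ALL=C sed -n 's/.*charset=\([^\\"]*\).*/\1/p' "$1" | head -n 1
}

# test_po LANG FILE ALTERNATE
test_po() {
    declared=$(po_charset "$2")
    case "$declared" in
        "")
            echo "Testing $1...no charset declared"
            return 0 ;;
        [Uu][Tt][Ff]-8)
            from=UTF-8; to=$3 ;;          # UTF-8 file: try the alternate encoding
        *)
            from=$declared; to=UTF-8 ;;   # other encoding: try converting to UTF-8
    esac
    if LC_ALL=C iconv -f "$from" -t "$to" < "$2" > /dev/null 2>&1; then
        verdict=OK
    else
        verdict=INVALID
    fi
    echo "Testing $1...$from to $to --> $verdict"
}

# test_all DIR: run the test for each "language:alternate" pair whose
# file DIR/<lang>.po exists (an assumed layout).
test_all() {
    for pair in gl:iso-8859-15 ja:EUC-JP nb:iso-8859-1 sl:iso-8859-2; do
        lang=${pair%%:*}
        if [ -f "$1/$lang.po" ]; then
            test_po "$lang" "$1/$lang.po" "${pair#*:}"
        fi
    done
}
```

As the report warns, an INVALID verdict from the UTF-8 branch only means that the file could not be converted to the chosen alternate encoding. That can be caused by a malformed byte sequence, or by a valid character that the alternate encoding does not contain.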
Quoting Hans F. Nordhaug (Hans.F.Nordhaug@hiMolde.no):
> For Bokmål we have had some problems with the encoding of a few words in Debian installer level 1. With the following little snippet (which I got from Christian Perrier) it was easy to find which words/positions were affected:
> cat file.po | LC_ALL=C iconv --from utf-8 --to iso-8859-1 >/dev/null
> I have fixed the errors in debian-installer.po (and debian-installer-post-sarge.po) so that nb no longer has the status INVALID as reported in the message below.
The corrected files were committed to the D-I repository a few minutes ago.
> Hans
> PS! It looks like Sami (se) has some encoding errors.
For Sami, I'm not really sure that using ISO-8859-1 as alternate encoding is appropriate. Anyway, the translation is *very* partial.
PS : No, I don't read Norwegian, I'm just guessing..:-)
[Christian Perrier]
> > PS! It looks like Sami (se) has some encoding errors.
> For Sami, I'm not really sure that using ISO-8859-1 as alternate encoding is appropriate. Anyway, the translation is *very* partial.
You are correct. Northern Saami has some characters which are missing in ISO-8859-1. I'm not sure if there is any good 8-bit charset to use. I suggest trying WINSAMI2. Btw, why do you convert to an 8-bit charset? Why not just convert from UTF-8 to UTF-8?
% echo å | iconv -f utf-8 -t utf-8
iconv: illegal input sequence at position 0
%
As for the lack of translation, the only Northern Saami translator we have is focusing on KDE and ignores d-i due to lack of time. We desperately need more Northern Saami translators.
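The UTF-8 to UTF-8 check suggested above can be wrapped in a small helper. This is a minimal sketch, with is_valid_utf8 being a made-up name. It relies only on iconv's exit status and assumes an iconv that rejects malformed UTF-8 when source and target are both UTF-8, as the one in the transcript above does. The error in that transcript presumably arose because the terminal sent "å" as a single Latin-1 byte, although the message does not say so.

```shell
#!/bin/sh
# is_valid_utf8: reads standard input and succeeds only if iconv accepts
# all of it as UTF-8. (Hypothetical helper name.)
is_valid_utf8() {
    LC_ALL=C iconv -f utf-8 -t utf-8 > /dev/null 2>&1
}

# "å" encoded as UTF-8 (bytes C3 A5) passes.
printf '\303\245\n' | is_valid_utf8 && echo "UTF-8 å: valid"

# "å" as a single ISO-8859-1 byte (E5) is rejected.
printf '\345\n' | is_valid_utf8 || echo "Latin-1 å: invalid"

# "č" (U+010D, bytes C4 8D), a letter used in Northern Sami, is not in
# ISO-8859-1 but is well-formed UTF-8, so it passes this check.
printf '\304\215\n' | is_valid_utf8 && echo "UTF-8 č: valid"
```

Because no second encoding is involved, this check cannot raise a false alarm over a character that is merely missing from an alternate encoding. It does not tell whether a file declared as something other than UTF-8 is correct.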
On Thu, 2005-02-17 at 09:42 +0100, Petter Reinholdtsen wrote:
> [Christian Perrier]
> > > PS! It looks like Sami (se) has some encoding errors.
> > For Sami, I'm not really sure that using ISO-8859-1 as alternate encoding is appropriate. Anyway, the translation is *very* partial.
> You are correct. Northern Saami has some characters which are missing in ISO-8859-1. I'm not sure if there is any good 8-bit charset to use. I suggest trying WINSAMI2.
The 8-bit character set for Sami is iso-8859-10. See "http://www.terena.nl/library/multiling/ml-docs/iso-8859.html".
> Btw, why do you convert to an 8-bit charset? Why not just convert from UTF-8 to UTF-8?
Agreed.