[NUUG kart] Yet another tool to convert kartverket data to OSM

Thomas Weidner thomas001le at googlemail.com
Sun Jan 11 13:45:32 CET 2015

On 11.01.2015 at 11:41, Gnonthgol wrote:
> On 11 Jan 2015 at 01:24, Thomas Weidner wrote:
> > There seemed to be no good tool to convert SOSI files to OSM (yes,
> > there is sosi2osm, but it seemed incomplete... why didn't I work on
> > sosi2osm? Good question...)
> That is a good question. I think it would be pretty easy to convert
> your "plugins" to Lua scripts for use in sosi2osm, though, so as not
> to duplicate the work.
Yes, I totally plead guilty to NIH syndrome here. One thing that did not
seem to be possible with sosi2osm is collecting all roads and road route
numbers and outputting a corresponding relation at the end. But then, at
some point I stopped looking at it. If sosi2osm is the project to go with
and you can maybe only copy some SOSI->OSM tag mappings from my source,
this would also be fine for me.
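
To illustrate the "collect first, emit a relation at the end" idea, here
is a minimal sketch. It is not the code from my tool: the class name
RouteCollector and its methods are made up for this example. It only
assumes the usual OSM XML conventions (negative ids for new objects, and
type=route / route=road / ref=* tags on the relation).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: remembers which ways carry which route number
// while the SOSI file is converted, and writes the relations afterwards.
final class RouteCollector {
    // Route number (e.g. "E39") -> ids of the ways that belong to it.
    private final Map<String, List<Long>> waysByRef = new TreeMap<>();

    // Call once for every road way that was emitted.
    void addWay(long wayId, String routeRef) {
        waysByRef.computeIfAbsent(routeRef, k -> new ArrayList<>()).add(wayId);
    }

    // Call after the last way: returns one <relation> element per route.
    // Ids count downwards from firstRelationId (negative = not yet uploaded).
    // Assumption: route numbers contain no characters that need XML escaping.
    String finish(long firstRelationId) {
        StringBuilder xml = new StringBuilder();
        long id = firstRelationId;
        for (Map.Entry<String, List<Long>> e : waysByRef.entrySet()) {
            xml.append("<relation id=\"").append(id--).append("\">\n");
            for (long way : e.getValue()) {
                xml.append("  <member type=\"way\" ref=\"").append(way)
                   .append("\" role=\"\"/>\n");
            }
            xml.append("  <tag k=\"type\" v=\"route\"/>\n");
            xml.append("  <tag k=\"route\" v=\"road\"/>\n");
            xml.append("  <tag k=\"ref\" v=\"").append(e.getKey()).append("\"/>\n");
            xml.append("</relation>\n");
        }
        return xml.toString();
    }
}
```

The point is only that the converter needs some state that survives across
individual SOSI objects, which a purely per-object tag mapping does not
provide.
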
> > This tool is now pretty usable and understands nearly all
> > information available in the freely available "N50" map data. I
> > created some nice maps of the area around Ålesund for my Garmin GPS
> > using it. :-)
> The tool looks pretty usable. There are, however, some bugs.
> Firstly, I do not see where you are reading the "TEGNSETT" tag in the
> header. Not all SOSI files are ISO8859-10, and there may be other
> values for that tag.
You are right, this is a problem, and I am aware of this field.
Unfortunately Java has no nice way of switching the input encoding while
reading the file, so I put it on the TODO list, but it should be fixed.
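
One possible way around this, sketched below, is to look at the header
bytes first and only then decode the whole file. This is an illustration
only, not what the tool currently does: the class SosiCharset is
hypothetical, and the mapping from TEGNSETT values to Java charset names
(turning "ISO8859-x" into "ISO-8859-x") is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pick the charset from the ..TEGNSETT header line.
final class SosiCharset {
    private static final Pattern TEGNSETT =
            Pattern.compile("(?m)^\\.\\.TEGNSETT\\s+(\\S+)");

    static Charset detect(byte[] data, Charset fallback) {
        // The header keywords are plain ASCII, so the first few KB can be
        // decoded as ISO-8859-1 (which accepts any byte) to find the tag.
        int n = Math.min(data.length, 4096);
        String head = new String(data, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = TEGNSETT.matcher(head);
        if (!m.find()) {
            return fallback;
        }
        // Assumption: SOSI writes "ISO8859-10", Java expects "ISO-8859-10".
        String name = m.group(1).replaceFirst("^ISO8859", "ISO-8859");
        try {
            // Not every JDK ships every charset (ISO-8859-10 may be
            // missing), so unsupported names fall back as well.
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalCharsetNameException e) {
            return fallback;
        }
    }

    // Decode the complete file once the charset is known.
    static String decode(byte[] data, Charset fallback) {
        return new String(data, detect(data, fallback));
    }
}
```

This reads the file into memory once instead of switching encodings in the
middle of a stream, which should be acceptable for files of N50 size.
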
> You have probably seen the SOSI standard definitions over at
> http://www.kartverket.no/en/SOSI-Standard-in-English/SOSI/ but it
> would be fun to make a proper parser that follows the standard. I see
> that you have made a regex parser based on the actual standard. I use
> the official library, which does not implement such a parser. I have
> also seen that the files can violate the standard in some areas.
I read the Norwegian standard document about the SOSI format (using
Google Translate, but the BNF is international ;) ). I think that, apart
from the encoding issue, there are three problems:

 1. The parser does not handle data values correctly: strings ("a quoted
    string") are not parsed correctly.
 2. It does not parse references correctly (..REF is handled not by the
 3. To correctly understand files, you seem to have to read the
    definition files. For example, "..VNR X Y Z" is really shorthand
    notation for "..VNR ...VEGKATEGORI X ...VEGSTATUS Y ...VEGNUMMER Z".
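
To make points 1 and 3 concrete, here is a small sketch of what handling
them could look like. It is not taken from my tool: the class SosiValues
is hypothetical, the COMPOUND table is hand-written here and only stands
in for what would really be read from the definition files, and only
double-quoted strings are considered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for splitting one SOSI data line into values.
final class SosiValues {
    // Assumption: stand-in for the definition files. Maps a compound
    // element to the names of its sub-elements, in positional order.
    static final Map<String, List<String>> COMPOUND = Collections.singletonMap(
            "VNR", Arrays.asList("VEGKATEGORI", "VEGSTATUS", "VEGNUMMER"));

    // Split on whitespace, but keep "a quoted string" as a single token
    // (without the surrounding quotes).
    static List<String> tokenize(String line) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '"') {
                int end = line.indexOf('"', i + 1);
                if (end < 0) {
                    end = line.length(); // unterminated: take the rest
                }
                out.add(line.substring(i + 1, end));
                i = end + 1;
            } else {
                int start = i;
                while (i < line.length() && !Character.isWhitespace(line.charAt(i))) {
                    i++;
                }
                out.add(line.substring(start, i));
            }
        }
        return out;
    }

    // Expand shorthand such as "..VNR X Y Z" into named sub-elements.
    // Elements not in the table keep their values joined as one string.
    static Map<String, String> expand(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        List<String> tokens = tokenize(line);
        if (tokens.isEmpty()) {
            return out;
        }
        String name = tokens.get(0).replaceFirst("^\\.+", "");
        List<String> values = tokens.subList(1, tokens.size());
        List<String> parts = COMPOUND.get(name);
        if (parts == null || parts.size() != values.size()) {
            out.put(name, String.join(" ", values));
        } else {
            for (int k = 0; k < parts.size(); k++) {
                out.put(parts.get(k), values.get(k));
            }
        }
        return out;
    }
}
```

With this table, expand("..VNR E V 39") yields
{VEGKATEGORI=E, VEGSTATUS=V, VEGNUMMER=39}, while a line the table does
not know about is passed through under its own name.
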

The FYBA C library seems to be pretty complete, but also pretty ancient.
Maybe having an alternative is not that bad.

- Thomas
