Character encoding for Shapefiles

This is the Forum to discuss the use of SBuilderX (version 3.10 and above). For previous versions of SBuilder please use the "SBuilder for Flight Simulator FS2004" forum.
Mick
Posts: 59
Joined: Fri Oct 05, 2007 7:42 pm
Location: Germany

Character encoding for Shapefiles

Post by Mick » Tue Mar 10, 2015 10:46 am

Hi all,

getting the roads of my area from a shp file is basically easy. But many of the street names contain special characters, and I don't get their proper (German) representation in SBuilder. Looking at the cpg file in the source, I see that the data is UTF-8 encoded.

Can I set / change that somewhere in SBuilderX?

Thank you
Mick

luisfeliztirado
Posts: 436
Joined: Sun May 15, 2005 9:15 am
Location: Santo Domingo

Re: Character encoding for Shapefiles

Post by luisfeliztirado » Tue Mar 10, 2015 9:42 pm

Hello,

I don't think that you can change that. But, perhaps you can edit the shapefiles.

Best regards.
Luis

Mick
Posts: 59
Joined: Fri Oct 05, 2007 7:42 pm
Location: Germany

Re: Character encoding for Shapefiles

Post by Mick » Wed Mar 11, 2015 6:02 pm

Hello Luis,

the shapefiles follow the official standard: the encoding, here Unicode (UTF-8), is declared in one of the shapefile's subfiles (*.cpg). The encoding can only be applied while reading / parsing the file(s), so it doesn't make much sense to edit every street name by hand when there are hundreds of them...
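To illustrate the point about applying the encoding while parsing, here is a minimal Python sketch. It is not SBuilderX code: the helper names `declared_encoding` and `decode_field`, the `cp1252` fallback, and the sample street name are assumptions made for the example. It reads the label from the .cpg file and uses it to decode one raw text field as it would be stored in the dbf file.

```python
import codecs
import tempfile
from pathlib import Path

# Assumption: encoding used when no usable .cpg file is found.
FALLBACK_ENCODING = "cp1252"


def declared_encoding(shp_path, fallback=FALLBACK_ENCODING):
    """Return the codec named in the .cpg file next to shp_path, else the fallback."""
    cpg_path = Path(shp_path).with_suffix(".cpg")
    if not cpg_path.exists():
        return fallback
    label = cpg_path.read_text(encoding="ascii", errors="ignore").strip()
    try:
        # Normalise labels such as "UTF-8" to a codec name Python knows.
        return codecs.lookup(label).name
    except LookupError:
        # Unknown or empty label: fall back instead of failing.
        return fallback


def decode_field(raw, encoding):
    """Decode one fixed-width dbf text field and drop its padding bytes."""
    return raw.rstrip(b"\x00 ").decode(encoding, errors="replace")


# Demo with a temporary .cpg file and a hypothetical street name.
with tempfile.TemporaryDirectory() as tmp:
    shp = Path(tmp) / "roads.shp"
    shp.with_suffix(".cpg").write_text("UTF-8", encoding="ascii")

    # The name as UTF-8 bytes, space-padded like a fixed-width dbf field.
    raw = "Leopoldstraße".encode("utf-8").ljust(20, b" ")

    wrong = decode_field(raw, "cp1252")                # ignores the .cpg file
    right = decode_field(raw, declared_encoding(shp))  # honours the .cpg file

print(wrong)
print(right)
```

The first line printed shows the kind of garbled name described above, the second the intended one. The only difference is which codec is handed to the decoder, which is why the fix belongs in the reader and not in the data.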

Since I guess that many European (or Asian, Russian, Arabic) users have the same problem, I could contribute a C# parser class which works very well in my test application. But it might be much easier if a programmer looks at this issue and corrects some lines in his code?

Just to make SBuilder even more international 8) would you mind passing it on as a feature request?

Thank you, regards
Mick

luisfeliztirado
Posts: 436
Joined: Sun May 15, 2005 9:15 am
Location: Santo Domingo

Re: Character encoding for Shapefiles

Post by luisfeliztirado » Thu Mar 12, 2015 12:15 am

Thank you, Mick. Good idea and I shall definitely pass it along to Luis Sá, the developer of SBuilder. He is, by the way, Portuguese so must often deal with diacritical marks in his language as must I in Spanish. :lol:

Long live accents!

Best regards.
Luis

Mick
Posts: 59
Joined: Fri Oct 05, 2007 7:42 pm
Location: Germany

Re: Character encoding for Shapefiles

Post by Mick » Thu Mar 12, 2015 4:23 pm

Hi Luis,

we Germans have the "ß" in "Straße" (calle), so almost every street has a special character in its name :shock: And, to make it even worse, we have so much less sun :lol:

If you think it's a good idea, let me spell it out in full for you (and Luis Sá):
  • Read (standard) shp-file(s) with proper handling of the Unicode encoding, so that users get a readable result in their local character set (system settings).
  • Automatically assign vector properties to every road, corresponding to the road's properties (width, asphalt/concrete etc.).
    Explanation: The dbf file that belongs to the shp has, e.g. for roads, a 'type' field. It contains a road classification like "primary, secondary, footway, ..." etc. for each road. Using a (probably user-configurable) list of these types together with vector properties (a name and GUID, as used for vector lines), SBuilder could easily assign a GUID to every single road, corresponding to its assumed width (I have to say assumed, because 'primary' roads don't necessarily have a similar width in reality... but it's a good hint at least).
    As a result, the user could set up his conversion list once (= assign vector data to approx. 12 types that make sense in this context, or alternatively use a default list) and then just import the roads from any standard shp file of any size (e.g. my city Munich has approx. 50,000 roads) for his scenery.
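
The conversion list described above could look roughly like the following Python sketch. Everything in it is illustrative: the GUID strings are placeholders and not real FSX or SBuilderX vector GUIDs, and `assign_guids` and the sample road records are made-up names for the example.

```python
# Placeholder GUIDs only; real values would come from the user's
# vector line definitions.
DEFAULT_GUID = "{00000000-0000-0000-0000-000000000000}"

# User-configurable conversion list: road 'type' -> vector GUID.
TYPE_TO_GUID = {
    "motorway": "{00000000-0000-0000-0000-000000000001}",
    "primary": "{00000000-0000-0000-0000-000000000002}",
    "secondary": "{00000000-0000-0000-0000-000000000003}",
    "residential": "{00000000-0000-0000-0000-000000000004}",
    "footway": "{00000000-0000-0000-0000-000000000005}",
}


def assign_guids(roads, mapping=TYPE_TO_GUID, default=DEFAULT_GUID):
    """Return a (name, guid) pair for each road, chosen by its 'type' value."""
    assigned = []
    for road in roads:
        road_type = road.get("type", "").strip().lower()
        # Types missing from the list get the default GUID.
        assigned.append((road.get("name", ""), mapping.get(road_type, default)))
    return assigned


# Hypothetical records as they might be read from the dbf file.
roads = [
    {"name": "Leopoldstraße", "type": "primary"},
    {"name": "Isarweg", "type": "footway"},
    {"name": "Feldweg", "type": "track"},
]

assigned = assign_guids(roads)
for name, guid in assigned:
    print(name, guid)
```

Because the lookup is a plain table, the same list can be reused for any shp file regardless of size, and a type that is not in the list simply falls back to the default.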

Maybe Luis Sá also likes the idea and makes it part of SBuilder. From a .NET programming point of view, it should be easy and not too much of an effort.

Regards
Mick

iangbusa
Posts: 44
Joined: Mon Jul 07, 2014 5:10 pm

Re: Character encoding for Shapefiles

Post by iangbusa » Thu Mar 10, 2016 2:42 pm

I'm curious to know why you need any data other than the vertices or point co-ordinates in SBX to set up lines or polys. If you remove the text fields from the shapefiles, none of this would present a problem, surely? As for the road type, to choose the line type in FSX, the one text field "type" in OSM should be OK.

I say this because it seems you're using OSM shapefiles and want to process the road type. The standard field for these ("type") is in English, as you point out, with no special characters. The type field shouldn't present a problem. I use this all the time, including with EU and Asian OSM files. Processing the "name" seems very ambitious. What will it provide for FSX that "type" will not?

Please, if it's not a "secret objective", tell me what I am missing.

BTW - I completely empathize with the lack of sun! :mrgreen:

ian

Mick
Posts: 59
Joined: Fri Oct 05, 2007 7:42 pm
Location: Germany

Re: Character encoding for Shapefiles

Post by Mick » Thu Mar 10, 2016 4:09 pm

Hi Ian,

no "secret objectives" here :lol: but maybe a temporary urge to overachieve :oops:.

The thought arose while working on a city scenery (part of Munich) where the OSM files don't seem to be very exact, e.g. they contain several line fragments with the same name. Letting SBuilder read the names in the declared encoding wouldn't be a big effort in .NET languages, and the result would be less confusing and look much more professional as well. That's all; there is nothing functional attached to it, and you're completely right to put your finger on that.

The second idea (e.g. letting users map a GUID to each type and assigning it automatically) was meant as an additional feature, which I think would make sense for SBuilder's shapefile functionality. There's more, but it seems that Luis Sá doesn't find the time to extend the code anyway.

Regards - Mick
