MegaGlest Forum
Archives (read only) => Glest Advanced Engine => General discussion => Topic started by: PolitikerNEU on 16 July 2009, 11:26:06
-
I think it is important to be able to translate whole Glest to different languages, not only the user interfaces.
To be able to do that, we need a mechanism to translate things.
I haven't found out where the unit strings and descriptions are stored, but we need to translate voices and sound effects anyway.
I personally think that these points are important for a translation:
* Translation file should not be much longer than the translation itself and the file should be easy to create
* If the original text is updated, the translation may need updates too so we need a mechanism to handle this
* Some languages are similar to each other, so we need a mechanism to specify the fallback-language
At first, the solution to fallback-languages could be an XML-File like this:
<languages>
<!--Language example-->
<lang id="de">
<name><translatable id="trans_de" lang="de">Deutsch</translatable>
<translatable ref="trans_de" lang="en">German</translatable>
</name>
<!--Sublanguages-->
<sub-languages>
<lang id="de-ch">
<name><translatable id="trans_dech" lang="de-ch">Schweizerdeutsch</translatable>
</name>
</lang>
</sub-languages>
<!--Similar languages, then english at last? -->
<fallback-languages>
<lang ref="en" />
</fallback-languages>
</lang>
<!--Small ref example-->
<lang id="en">
<name><translatable id="trans_en" lang="en">English</translatable></name>
</lang>
</languages>
You specify languages using a "lang" tag with the required attribute id. Each language has a translatable name, zero or more sub-languages (which have the same structure as a language) and zero or more fallback languages, which are used if no text for the original language can be found. Since English is Glest's main language, it would be sensible to add it as a fallback language to all languages.
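A minimal sketch of how such a languages file could be read, written in Python for brevity (the engine itself is C++). The function name load_languages and the layout of the returned dictionary are assumptions made for illustration; nothing like this exists in Glest.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the example file (names omitted).
LANGUAGES_XML = """
<languages>
  <lang id="de">
    <sub-languages>
      <lang id="de-ch" />
    </sub-languages>
    <fallback-languages>
      <lang ref="en" />
    </fallback-languages>
  </lang>
  <lang id="en" />
</languages>
"""


def load_languages(xml_text):
    """Return {id: {"parent": id or None, "subs": [ids], "fallbacks": [ids]}}."""
    table = {}

    def visit(node, parent_id):
        lang_id = node.get("id")
        entry = {"parent": parent_id, "subs": [], "fallbacks": []}
        table[lang_id] = entry
        # Fallbacks only reference other languages via ref="...".
        for fallback in node.findall("fallback-languages/lang"):
            entry["fallbacks"].append(fallback.get("ref"))
        # Sub-languages have the same structure, so recurse into them.
        for sub in node.findall("sub-languages/lang"):
            entry["subs"].append(sub.get("id"))
            visit(sub, lang_id)

    for top in ET.fromstring(xml_text).findall("lang"):
        visit(top, None)
    return table


languages = load_languages(LANGUAGES_XML)
print(languages["de"])
```

Keeping parent, sub-languages and fallbacks per language is enough to compute a lookup order later on.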
All data for the translations could be in lang/<language-id>/data (folder-structure like "normal" /data)
So I thought we could add a <translatable>-tag to each glest XML-file
This tag should have one required attribute, an id which is unique in (?) all these translatable-Tags in the glest file.
Optionally you can specify a language there by adding a lang attribute which contains the abbreviation of the base language (e.g. lang="en"). There should also be one optional version attribute which defaults to 0 and should be increased whenever a substantial change (meaning: not a simple correction of mistakes, but different content or formatting) occurs. Maybe there could be three additional attributes to "copy" strings: ref-lang="?" to take a string from a certain language, ref-id="?" to take a string from another id and ref-src="?" to take a string from another XML file (of course the last one is only sensible if ref-id is set).
All the elements in the translatable-tag are the same as "normally", but in another folder (language-specific? e.g. having de/data or something like that?) There is another file which contains the translations: only elements of type <translatable> using ref (Refering to an ID), lang and version-tag.
When Glest loads any XML file and comes across a translatable tag, it does the following:
* if the tag has set the attribute id, the node is saved (for example in a Map<ID,Node> translatableNodes)
* If the lang attribute has a higher priority than the current one in (Map<Id, Node> curTranslation), the node there is replaced.
The directory information in Map<Id, string> currentDirectories is replaced with the directory the file is in
The priority for languages could be as follows:
- The exact specified language in the Glest options
- Any sublanguages of the specified language
- Any parent languages
- Any manually specified fallback-language
- Any fallback-language of the chosen language (earlier fallback-language = higher priority)
- Any fallback-language of any sublanguage of the chosen language
- Any fallback-language of any parent language of the chosen language
- Nothing
* Glest looks for the best-matching language:
It looks for language-files in the order of priorities until the language of curTranslation has higher priority than the language which glest would look for (because then the curLanguage could never be replaced).
* If the attribute "version" exists for both the Nodes in curTranslation and translatableNodes and it is smaller in curTranslation than in translatableNodes, a warning is emitted
* Every $CURDIR in every attribute is replaced by the String of currentDirectories
* the Node in translatableNodes is replaced by the content (!) of the one in curTranslation
* maybe the resulting XML-File is saved in order to cache it?
* The modified XML-file is returned
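The priority list and the version check above could be sketched roughly like this (Python, just to show the idea). The table layout and the names priority_order, pick_translation and is_outdated are assumptions made for the example, not existing Glest code.

```python
# Hypothetical language table: parent, sub-languages and fallbacks per language.
TABLE = {
    "de": {"parent": None, "subs": ["de-ch"], "fallbacks": ["en"]},
    "de-ch": {"parent": "de", "subs": [], "fallbacks": []},
    "en": {"parent": None, "subs": [], "fallbacks": []},
}


def priority_order(table, chosen, manual=()):
    """List languages from highest to lowest priority for the chosen one."""
    order = []

    def add(lang):
        if lang not in order:
            order.append(lang)

    subs = table[chosen]["subs"]
    parents = []
    parent = table[chosen]["parent"]
    while parent is not None:
        parents.append(parent)
        parent = table[parent]["parent"]

    add(chosen)                               # exact language
    for lang in subs:                         # its sub-languages
        add(lang)
    for lang in parents:                      # its parent languages
        add(lang)
    for lang in manual:                       # manually specified fallbacks
        add(lang)
    for lang in table[chosen]["fallbacks"]:   # fallbacks of the chosen language
        add(lang)
    for sub in subs:                          # fallbacks of sub-languages
        for lang in table[sub]["fallbacks"]:
            add(lang)
    for par in parents:                       # fallbacks of parent languages
        for lang in table[par]["fallbacks"]:
            add(lang)
    return order


def pick_translation(candidates, order):
    """candidates maps language -> text; return (language, text) or None."""
    for lang in order:
        if lang in candidates:
            return lang, candidates[lang]
    return None


def is_outdated(original_version=0, translated_version=0):
    """True if the translation is older than the original (warning case)."""
    return translated_version < original_version


print(priority_order(TABLE, "de-ch"))
```

Because the order is computed once per chosen language, the loader can stop searching as soon as it has a hit that no remaining language could beat.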
Example:
Assume we have the following, simple XML-Filepart:
<selection-sounds enabled="true">
<sound path="../archmage_tower/sounds/magic_click_nolang.wav"/>
<sound path="../archmage_tower/sounds/magic_click_1.wav"/>
<sound path="../archmage_tower/sounds/magic_click_2.wav"/>
<sound path="../archmage_tower/sounds/magic_click_3.wav"/>
</selection-sounds>
While magic_click_nolang does not need any translation, the other ones should be translated.
<selection-sounds enabled="true">
<sound path="../archmage_tower/sounds/magic_click_nolang.wav" />
<translatable id="archmage_sounds" version="1" lang="en">
<sound path="$CURDIR../archmage_tower/sounds/magic_click_1.wav"/>
<sound path="$CURDIR../archmage_tower/sounds/magic_click_2.wav"/>
<sound path="$CURDIR../archmage_tower/sounds/magic_click_3.wav"/>
</translatable>
</selection-sounds>
or you can add no default language:
<selection-sounds enabled="true">
<sound path="../archmage_tower/sounds/magic_click_nolang.wav" />
<translatable id="archmage_sounds" />
</selection-sounds>
If the user wants to have e. g. german, there could be the following file in
$GLEST/lang/de/.../archmage_tower.xml:
<translations>
<translatable ref="archmage_sounds" version="1" lang="de">
<sound path="$CURDIR../archmage_tower/sounds/magic_click_1.wav" />
</translatable>
</translations>
because only one sound has been respoken in german.
The XML-Reader would replace the
<translatable id="archmage_sounds" /> of the XML-File with
<sound path="$CURDIR../archmage_tower/sounds/magic_click_1.wav" />
and replace $CURDIR with e.g. /usr/share/glest/lang/de/data/game/techs/magitech/factions/magic/units/archmage_tower/
so the XML-File for german users would be
<selection-sounds enabled="true">
<sound path="../archmage_tower/sounds/magic_click_nolang.wav" />
<sound path="/usr/share/glest/lang/de/data/game/techs/magitech/factions/magic/units/archmage_tower/../archmage_tower/sounds/magic_click_1.wav" />
</selection-sounds>
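To make the replacement step concrete, here is a rough sketch in Python using the standard xml.etree.ElementTree module. The function name expand_translatables and the shortened directory "/lang/de/" are assumptions for illustration only; the real engine would do the equivalent in its C++ XML loader.

```python
import xml.etree.ElementTree as ET

UNIT_XML = (
    '<selection-sounds enabled="true">'
    '<sound path="../archmage_tower/sounds/magic_click_nolang.wav" />'
    '<translatable id="archmage_sounds" />'
    '</selection-sounds>'
)
GERMAN_XML = (
    '<translations>'
    '<translatable ref="archmage_sounds" version="1" lang="de">'
    '<sound path="$CURDIR../archmage_tower/sounds/magic_click_1.wav" />'
    '</translatable>'
    '</translations>'
)


def expand_translatables(root, translations, cur_dir):
    """Replace every <translatable id=...> by the content of its translation."""
    for parent in list(root.iter()):
        # Walk backwards so earlier indices stay valid while we edit.
        for index, child in reversed(list(enumerate(parent))):
            if child.tag != "translatable":
                continue
            # Without a translation, fall back to the inline default content.
            source = translations.get(child.get("id"), child)
            parent.remove(child)
            for offset, node in enumerate(list(source)):
                for name, value in list(node.attrib.items()):
                    node.set(name, value.replace("$CURDIR", cur_dir))
                parent.insert(index + offset, node)
    return root


root = ET.fromstring(UNIT_XML)
translations = {
    node.get("ref"): node
    for node in ET.fromstring(GERMAN_XML).findall("translatable")
}
expand_translatables(root, translations, "/lang/de/")
print(ET.tostring(root, encoding="unicode"))
```

The placeholder is replaced by the content of the translation node, not by the node itself, which matches the step list above.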
Of course the whole translation-thing would be more effective for descriptions etc. but I haven't found any.
EDIT:
Some logical mistakes corrected
-
This is all very well thought out... unfortunately I think it was largely unnecessary.
The game's 'message' strings are translatable ( see <your glest directory>/data/lang )
Scenarios are translatable ( <scenario directory>/*.lng )
Unit names and sounds are defined in xml files, if you wanted to change them to make the game completely multi-lingual, there's nothing stopping you.
-
You gotta be kidding me! While it's a nice idea, it would take lots of work, and I guarantee NO modder is going to want to multi-lingual his XMLs for his mods. It would also be a lot of unnecessary work. Sounds would be difficult, since I don't know about you, but last time I checked no one I knew could speak all of the hundreds of languages in the world. Even limiting to just the popular languages would be just too hard.
Besides, 99% of the Glest community can speak or understand enough English to play the game as it is anyway, and most of the boards and the main part of the glest.org site are in English.
No offense, but if you can't speak English at least enough to read my post... you might want to try learning. (Yes, I can say this, as harsh as it may sound, because I know that you can read English well enough if you understood that!).
I'm sorry, but this is a no-go.
-
While it's a nice idea, it would take lots of work, and I guarantee NO modder is going to want to multi-lingual his XMLs for his mods.
That may be true for the current modders, but if I manage to port my (unpopular) WC3-Map "Das kleine Burggefecht" to Glest (which is my long-term goal in fact), I want to make it at least german/english bilingual (In fact, I even tried to make the map bilingual in WC3 but I faced several problems like that if only a single letter
No offense, but if you can't speak English at least enough to read my post... you might want to try learning. (Yes, I can say this, as harsh as it may sound, because I know that you can read English well enough if you understood that!).
I know enough English to read your post but this point isn't entirely true because I might have used an online translator.
I'm sorry, but this is a no-go.
Even if I implement it myself (which is what I want to do)? Be aware that the only negative consequence for the current XMLs would be a small performance loss, which could be avoided by having an attribute "isTranslatable='true'" with default value false in the root element of each XML document and reduced by caching processed XMLs. If a modder doesn't want to support translation, he'd be free not to use the <translatable> tag and would therefore have no more work than now.
-
I also thought of something like this (and I'm also German :D we're not alone!):
https://forum.megaglest.org/index.php?topic=4480.msg28148#msg28148
But it's really not realistic to modify each XML-File. That's just crazy. But what about a language-file in the techtree/factions-directories, so the directories would look like this:
magitech-directory: (techtree)
factions (directory)
resources (directory)
magitech.xml
deutsch.lng
italiano.lng
tech-directory: (faction)
music (directory)
units (directory)
upgrades (directory)
tech.xml
deutsch.lng
italiano.lng
and the language-files would look like this:
deutsch.lng for tech:
aerodrome=Aerodrom
air_ballista=Ballista
airship=Luftschiff
archer=Bogenschütze
barracks=Baracken
battle_machine=Kampfmaschine
...
I think this isn't much more work for the modders, if there isn't any translation the default english strings could be used. Seems to be perfect, doesn't it? :D
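A small sketch of how such a key=value file could be read and used, in Python for brevity. The function names parse_lng and display_name are made up for this example, and the English fallback (underscores to spaces, words capitalised) only imitates what the engine is described as doing with untranslated names.

```python
def parse_lng(text):
    """Parse 'key=value' lines into a dictionary, skipping anything else."""
    entries = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if "=" not in line:
            continue  # blank lines or lines without an assignment
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def display_name(internal_name, entries):
    """Return the translated name, or a tidied-up English default."""
    if internal_name in entries:
        return entries[internal_name]
    return internal_name.replace("_", " ").title()


german = parse_lng("archer=Bogenschütze\nairship=Luftschiff\n")
print(display_name("archer", german))
print(display_name("air_ballista", german))
```

A missing entry never breaks anything: the unit simply keeps its default English name.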
-
I think this sounds like a great idea for making Glest more readily accessible. Playing a game where you understand all the unit names is a lot nicer. I remember one of the things that bothered me the most about playing Age of Mythology was that a lot of the units' names were in Greek or Old Norse. I certainly never knew what a hersir, prodromos, or gastraphetes is, so the game was very confusing at first. I imagine it's the same way for someone who doesn't speak English playing Glest.
-
I think this sounds like a great idea for making Glest more readily accessible. Playing a game where you understand all the unit names is a lot nicer. I remember one of the things that bothered me the most about playing Age of Mythology was that a lot of the units' names were in Greek or Old Norse. I certainly never knew what a hersir, prodromos, or gastraphetes is, so the game was very confusing at first. I imagine it's the same way for someone who doesn't speak English playing Glest.
Or if you play Nihirilian; the unit names are nonsense to me. :P
I think that possibly the best way to implement new languages would be what I would call the caveman/tsunami approach. While it may not be most eloquent method, if there were separate downloads for each language, it would allow the translators to make translations, and would not interfere with anything in place now. Only the Techs folder would need to be different, because that is where the unit names are. The existing systems for translating the menus, and other things besides the factions I think are good enough. But I wouldn't know if they're good enough...
-
While it sounds like a good idea, would modders want to even TRY to make their XMLs compatible? Personally, I wouldn't myself, as I am too lazy, and perfectly content with the current language stuff. Forgive me, I am very language biased (possibly even discriminating). However, I do wonder where one found a download link for a mod in anything other than english... I also wonder how someone installed this with an english installer, off an english site. Of course, they COULD have used an online translator, in which case, who knows what they'd expect from the game itself. I guess anyone could install it, just keep clicking enough buttons and anyone could get it right eventually... lol.
Or if you play Nihirilian; the unit names are nonsense to me. :P
Well, do you speak alien?
I think that possibly the best way to implement new languages would be what I would call the caveman/tsunami approach. While it may not be most eloquent method, if there were separate downloads for each language, it would allow the translators to make translations, and would not interfere with anything in place now. Only the Techs folder would need to be different, because that is where the unit names are. The existing systems for translating the menus, and other things besides the factions I think are good enough. But I wouldn't know if they're good enough...
I agree, since there SHOULD NOT be a larger file to punish the 99% who can speak a fluent (or close enough) english! But isn't this just modifying the XMLs?
I think this sounds like a great idea for making Glest more readily accessible. Playing a game where you understand all the unit names is a lot nicer. I remember one of the things that bothered me the most about playing Age of Mythology was that a lot of the units' names were in Greek or Old Norse. I certainly never knew what a hersir, prodromos, or gastraphetes is, so the game was very confusing at first. I imagine it's the same way for someone who doesn't speak English playing Glest.
Hmm, good point. I never really minded that... Can't explain why.
-
I don't think it deserves a long discussion to know whether it's worth translating the game and the mods or not. As you said Omega, it's discriminating to restrict the language to only English, and most non-native English speakers prefer playing a game or using any software that speaks their native language, so Glest must have it, that's all.
The interesting point is that allowing translations should require the least work for modders, and I think that's rather easy to do. The way pheder described seems good as it's already how scenario strings translation works and it requires no additional work for modders.
Finally, Omega if you don't want to translate your mods, just don't do it, but don't prevent other people from doing it if they want.
-
While it sounds like a good idea, would modders want to even TRY to make their XMLs compatible? Personally, I wouldn't myself, as I am too lazy, and perfectly content with the current language stuff. Forgive me, I am very language biased (possibly even discriminating).
The thing is that we have a lot of multilingual members here, and for them it really wouldn't be a lot of work to translate a few strings. We seem to especially have a lot of German and Spanish speakers here, and I figure some of them would be more than willing to volunteer the few minutes it would take to translate. Aren't most Canadians bilingual anyway? :P
-
True. I am not against translations, but let's say I could speak another language, and let's say I'm fluent enough in both languages, the current method where you just change the bits in the XMLs could be done for magitech in maybe 20 minutes. There's only the resources, the unit names, the commands, and the upgrade names.
Canadians aren't necessarily bilingual. The only bilinguals I know are a few guys who go to a francophone school (must quote: French girls are much hotter than non-French girls). New Brunswick is officially bilingual, and a good portion of Quebecers can speak both of the official languages. Aside from that, English is dominant. I mean, it's the language of science as well, which can't hurt. That could be like asking CERN to translate texts into dozens of languages just to communicate with all the brainiacs residing there. Ok, maybe I'm getting out of hand. Go ahead and translate as you wish, but I just don't think we should spend time coding anything into the engine, even if it makes it easier to translate.
-
Omega, you know why translators translate although they would understand it without translating? They do it not just for themselves, but also for other people. I think that's the spirit of open source - doing it for other people as well.
Modifying the XMLs is unworkable, just think about distributing (a separate package for every language? That becomes too much.) or updating (changing the XMLs for every language would become quite annoying). The modders would also have to work together with the translators when updating (too much effort), and the job would be a very hard one for translators, since they'd have to open every single XML, modify it, save it and look for the text strings in there. When it comes to updating a translation there would be no central way to do it.
I'm looking for one Glest in many languages, not one Glest per language.
-
I agree with everything Pheder just said. I doubt that it would be hard to implement into the engine either, since there's already code for translating other parts of the game, so it should be easy enough to adapt that to suit our needs. I'm not much of a programmer (only ever had one class) so I don't know this for sure, but that's what my common sense tells me.
-
You are welcome to go ahead, but when it comes to coding this, you're on your own.
There is a large number of features wanted for Glest on a waitlist lasting years, and it gets bigger over time, not smaller. I wouldn't consider this high enough priority to get silnarm or hailstone on it, who are apparently the only active coders at this time... And I just can't help but think that if someone is skilled enough to implement such a feature, why can't they implement some of the others that more people want?
-
You are welcome to go ahead, but when it comes to coding this, you're on your own.
Actually we have a potential contributor atm who has proposed a neat way to solve this little problem, it will work much as pheder suggested.
The same mechanism will allow us to set 'secondary' hot-keys (ie 'b' - build, followed by 'b' - barracks) all in a language neutral way, which won't break multiplayer, or at least won't break it any more than it currently is :)
Interested parties should join the new gae dev mailing list.
https://lists.sourceforge.net/lists/listinfo/glestae-devel
-
I have partly implemented my own specification (In fact I think that this simple ini-format has some advantages like it being simple, but too many disadvantages like not being able to change sounds etc.): I have implemented <translatable [id|ref]="id" lang="language"> in the file and in an external folder structure.
In order to do this, I had to introduce a new Element: <displayName></displayName>. This is the name used to display everything. I have chosen to introduce a new Element (instead of just using the name) because the name is heavily used internally.
Unfortunately it is really cumbersome to translate things - partially because my format uses rather long structures.
Here you can see a screenshot, you see that the resources are in German, the unit name etc.
(http://img32.imageshack.us/img32/8404/shotyc.jpg)
As you can see, there are some issues with umlauts but I hope I can fix it.
The source code isn't really beautiful but of course I'll release it soon - so far I have only patched my own hack-version of "normal" Glest, not GAE.
-
I must say, this could be a great idea for getting more people from different languages playing, if it doesn't break the source and doesn't require the modder to do anything (though they can if they want, or someone else can add the languages). Happy coding. Good luck implementing it with GAE.
-
Thanks for bumping this thread Omega. I haven't read the whole thing (sorry, short on time right now), but I agree that languages other than English should be supported. I'm opposed to the language name being the name of the file, this introduces a barrage of file-system encoding issues that are unnecessary and breaks important apps like subversion and kdiff3, just to name a few. I'm working out a solution now that will solve this, and the language file names will be the ISO 639-1 (or similar) code. This is my first work on internationalizing something for this many different languages, so the APIs are new to me. I'm also looking into gettext, which appears to be the most standard internationalization solution. I also don't want to go too far off onto this right now because there's a lot of other things that I want to get done. But the XML (unit.xml, etc.) is definitely where this belongs.
-
I have partly implemented my own specification (In fact I think that this simple ini-format has some advantages like it being simple, but too many disadvantages like not being able to change sounds etc.):
Nice. I'd like to have a chat about how you've done this at some stage, I'll try to be on IRC as much as possible this week.
Something I meant to bring up with Daniel and Hailstone on the weekend was 'configuration' files... I would like to ditch the PropertyMap stuff at some point, we have an inbuilt configuration language now, we should be using it for more than game scripting ;) While it's probably better known for its use as a game scripting language, this is a 'bread and butter' use of Lua.
-
ahhh, so ic. Well I'll have to look into that later because there's a BIG storm coming and I won't be in IRC because my mommie says that I have to clean my room and do the dishes and then go to bed. Ok, not really. :) But we'll catch up later, I'm all for ditching it if there's a better solution.
-
Well, it's something to discuss... I like the idea myself, but the reasons I like it could well be the same reasons other people don't!
It would probably have the same problem PolitikerNEU currently has, a 'cumbersome' format, although I am still far from mastering Lua, and these same tables could probably be built in a nicer way...
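-- English translation file for magitech
--[[ We should 'translate' English as well, the engine automatically capitalises
words and replaces '_' with space, we should do that 'by hand' in one of these files
as the auto-capitalisation may not be appropriate
]]
-- the tables must exist before fields can be assigned (created here; the engine could also pre-create them)
attack_types, armor_types, resources = {}, {}, {}
magic = { upgrades = {}, units = {}, archmage = { levels = {}, commands = {}, sounds = {} } }
attack_types.slashing = "Slashing"
attack_types.piercing = "Piercing"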
attack_types.impact = "Impact"
attack_types.energy = "Energy"
armor_types.organic = "Organic"
armor_types.leather = "Leather"
armor_types.wood = "Wood"
armor_types.metal = "Metal"
armor_types.stone = "Stone"
resources.energy = "Energy"
resources.food = "Food"
resources.gold = "Gold"
resources.stone = "Stone"
resources.wood = "Wood"
magic.upgrades.ancient_summoning = "Ancient Summoning"
magic.upgrades.dragon_call = "Dragon Call"
-- etc
-- faction.units will store the unit name translations
magic.units.archmage = "Archmage"
magic.units.archmage_tower = "Archmage Tower"
magic.units.battlemage = "Battlemage"
-- etc
-- and faction.unit will store the translatables from within the XML
magic.archmage.levels.expert = "Expert"
magic.archmage.levels.master = "Master"
magic.archmage.levels.legendary = "Legendary"
magic.archmage.commands.stop = "Stop"
magic.archmage.commands.move = "Move"
magic.archmage.commands.ice_nova = "Ice Nova"
-- etc
-- in reality English won't need to do sounds (at least for magitech ;)
-- but it would go something like this: [paths from 'translation' directory]
magic.archmage.sounds.archmage_select1 = "sounds/archmage/whatdoyouwant1.wav"
magic.archmage.sounds.archmage_select2 = "sounds/archmage/whatdoyouwant2.wav"
magic.archmage.sounds.archmage_select3 = "sounds/archmage/yes1.wav"
magic.archmage.sounds.archmage_select4 = "sounds/archmage/yes2.wav"
magic.archmage.sounds.archmage_select5 = "sounds/archmage/sire.wav"
-- etc
-- obviously, any sounds can be ignored and use the defaults, most attack sounds, for example,
-- won't need translating :-)
It gets quite cumbersome when you get down to the units obviously, but this is a programming language, so you can mitigate that somewhat yourself, eg,
lvls = magic.archmage.levels
lvls.expert = "Expert"
lvls.master = "Master"
lvls.legendary = "Legendary"
cmds = magic.archmage.commands
cmds.stop = "Stop"
cmds.move = "Move"
cmds.ice_nova = "Ice Nova"
-- etc
Strings would need to be quoted though...
-
Hmm ... wow, quite a different solution. I thought for about half an hour about whether I should scrap my own solution or not, but I still don't know.
There are some advantages and disadvantages (in comparison to my solution) of this format in my opinion:
Advantages:
- MUCH less cumbersome than my format (my format is much longer)
- Lua scripting could be used to e.g. automatically add <translation of "Build "> + unitName to each command which could be more difficult with my format
- Could be more easily used to switch languages on the fly which is currently not possible with my format
Disadvantages:
- Using lua means it is harder to write tools for language translation - but I don't know if it is necessary (Especially since this format is really shorter than mine)
- I don't like everything placed in one file since it may be more difficult to lookup a missing string - however, having a file for each unit isn't really nice either because you have to create so many files
- Whenever the XML-Format is changed, the translation system would have to be adapted since there is no 1:1-correlation between lua tables and the XML-Format
For my format, I thought of having a default.xml file: Some strings like hold_position occur often and are always translated the same. Because of that, I thought of having a default.xml in the /lang-folder which is used as default translation for every id - but this default can be overridden by placing the translation in e.g. archmage.xml
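The default-plus-override idea could look roughly like this (a Python sketch using collections.ChainMap; the function name build_lookup and the German sample strings are assumptions made for illustration, and reading default.xml and archmage.xml into dictionaries is left out).

```python
from collections import ChainMap


def build_lookup(unit_entries, default_entries):
    """Unit-specific translations win; defaults fill in everything else."""
    return ChainMap(unit_entries, default_entries)


# Hypothetical contents of default.xml and archmage.xml after parsing.
defaults = {"hold_position": "Position halten", "stop": "Stopp"}
archmage = {"stop": "Halt"}

lookup = build_lookup(archmage, defaults)
print(lookup["stop"])           # taken from the unit file
print(lookup["hold_position"])  # taken from the defaults
```

Nothing is copied, so editing the defaults later is still visible through every unit's lookup.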
With tools for language translation, I think of writing a tool (for my format - probably in java) that does the following:
- Go through each XML-File
- In a special XML file there are "expected" <displayName>-Tags (like: /tech-tree/attack-types/attack-type). If the tool doesn't find one, it creates one (maybe using a second xpath to lookup the id)
- Then, for each file it "collects" all translatables for a language and generates the .xml-files in the /lang-Folder. If there already is one xml-file, it appends the missing translatables at the end of the xml-File
- Then, it could display the untranslated Strings and ask the user to translate them
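The core of such a tool, finding which translatables still lack a translation for one language, could be sketched like this (Python instead of the Java mentioned above, only to keep the example short). The function names and the sample XML are assumptions; generating missing <displayName> tags and writing the files back is not shown.

```python
import xml.etree.ElementTree as ET


def collect_ids(source_xml):
    """All ids of <translatable> elements, in document order."""
    root = ET.fromstring(source_xml)
    return [n.get("id") for n in root.iter("translatable") if n.get("id")]


def collect_refs(translation_xml, lang):
    """Ids that already have a translation for the given language."""
    root = ET.fromstring(translation_xml)
    return {
        n.get("ref")
        for n in root.iter("translatable")
        if n.get("lang") == lang and n.get("ref")
    }


def untranslated(source_xml, translation_xml, lang):
    """Ids the translator still has to provide for this language."""
    done = collect_refs(translation_xml, lang)
    return [i for i in collect_ids(source_xml) if i not in done]


SOURCE = (
    '<unit><translatable id="name" />'
    '<commands><translatable id="cmd_stop" /><translatable id="cmd_move" />'
    '</commands></unit>'
)
GERMAN = (
    '<translations>'
    '<translatable ref="name" lang="de">Erzmagier</translatable>'
    '</translations>'
)
print(untranslated(SOURCE, GERMAN, "de"))
```

The resulting list is what the tool would show to the user or append to the end of the translation file.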
Another suggestion for the lua-Format:
I don't think magic.archmage.sounds.archmage_select1 is a good idea, instead I'd prefer
magic.archmage.sounds.archmage_select[1] etc - because then you could add some sounds or delete some
-
- Using lua means it is harder to write tools for language translation - but I don't know if it is necessary (Especially since this format is really shorter than mine)
Well, you couldn't use the fancy XML stuff, but the Lua would be pretty simple, writing out a text file wouldn't be much more work than constructing the XML... maybe less.
- I don't like everything placed in one file since it may be more difficult to lookup a missing string - however, having a file for each unit isn't really nice either because you have to create so many files
It's Lua :) split it up however you like, then use dofile() to pull it all together.
- Whenever the XML-Format is changed, the translation system would have to be adapted since there is no 1:1-correlation between lua tables and the XML-Format
Not entirely sure I get what you mean here, but if some 'translatable' part of the XML format is changed, then all translations will be compromised, regardless of the system.
For my format, I thought of having a default.xml file: Some strings like hold_position occur often and are always translated the same. Because of that, I thought of having a default.xml in the /lang-folder which is used as default translation for every id - but this default can be overridden by placing the translation in e.g. archmage.xml
Yeah, I hadn't thought of this, over-rideable defaults are a must... but I think the Lua solution for doing this will be cleaner too :)
Another suggestion for the lua-Format:
I don't think magic.archmage.sounds.archmage_select1 is a good idea, instead I'd prefer
magic.archmage.sounds.archmage_select[1] etc - because then you could add some sounds or delete some
Then you're tying it to a 'slot' in the XML, I was just using the filename (sans extension) of the file to replace.
I think the fact that it's a scripting language is somewhat off-putting in this context, but it's just being used to fill tables that can then (very) easily be read from C++. And I personally don't much like XML for supposedly human readable stuff, it's too bulky, but then I am a bit strange, so maybe that's just me!
-
Well, you couldn't use the fancy XML stuff, but the Lua would be pretty simple, writing out a text file wouldn't be much more work than constructing the XML... maybe less.
True for Languageformat --> Lua, but not for Lua --> Languageformat because you could do anything in Lua (for example, the default "Build <unitname>"-Thing) which is really difficult to parse in a tool.
It's Lua :) split it up however you like, then use dofile() to pull it all together.
Your point
Not entirely sure I get what you mean here, but if some 'translatable' part of the XML format is changed, then all translations will be compromised, regardless of the system.
Of course, but what I mean is: If you change something in the XML format of Glest and you use my format, you just have to adjust each XML file, nothing more (the parser doesn't have to be changed). If you change something in the XML format of Glest and you use your format, you have to change the table-reading system in Glest (a point which could be forgotten)
Yeah, I hadn't thought of this, over-rideable defaults are a must... but I think the Lua solution for doing this will be cleaner too
Hmm ... could be
Then you're tying it to a 'slot' in the XML, I was just using the filename (sans extension) of the file to replace.
Ah, but this still means you can't add or remove sounds
I think the fact that it's a scripting language is somewhat off-putting in this context, but it's just being used to fill tables that can then (very) easily be read from C++. And I personally don't much like XML for supposedly human readable stuff, it's too bulky, but then I am a bit strange, so maybe that's just me!
Hmm ... I don't know whether a scripting language or XML can be read better - but you are right in that XML adds much bloat.
Another thing is that using lua you'll have difficulties implementing versioning/warnings (so: you update a string and change the version to indicate you have changed something) - while this isn't really necessary I would have liked to have it when editing WC3-Maps (and trying to make them bilingual)
EDIT:
And by the way I think that in the long run we'll end up using XML for descriptions anyway (as soon as we have any) (I mean like:
"The dragon rider is <bad>weak</bad> against hunters but <good>strong</good> against soldiers ...") - and in this case using XML for translation would be more suitable too - of course there will be lua-scripts in description too (if, for example, there will be an ability like: does 0.1 damage for each hitpoint you have) - but these couldn't be written in this translations-file anyway since it would be statically evaluated (and therefore we'd need some additional XML-Tag anyway even for lua)
EDIT II:
Regarding the lua solution (Sorry, I still don't know anything about lua) I think it would be better to have functions like getUnitType(String name) returning a UnitType-Object with functions like setDisplayName(String) (what I mean is: nearly all glest objects should be exposed to lua with many of the glest functions). If you have this, you could write the table --> unit-name (etc.) function in lua (which would also completely avoid the XML-Format-change problem)
-
It's Lua :) split it up however you like, then use dofile() to pull it all together.
Your point
You could organise your translation however you wanted, one big file? fine. A whole bunch of small files? that's fine too. etc...
... but what I mean is: If you change something in the XML-Format of glest, if you use my format, you just have to adjust each XML-File, nothing more (the parser doesn't have to be changed). If you change something in the XML-Format of glest and you use your format, you have to change the Table-Reading-System in glest (a point which could be forgotten)
No, it's the same... nothing can be hard-coded anyway, the 'translatable' strings need to be read from the XML, then translations read in to override them... from XML or Lua it's only slightly different, and I don't think harder in either. You need to determine the translatables at run-time in either case, for the Lua solution you then just try to read each translatable from the Lua state, eg if you've determined you have a translatable string for the archmage's 'ice_nova' command, you query the Lua table for magic.archmage.commands.ice_nova, if there's a string there, you have your translation, if there isn't it hasn't been translated, and you fall back as normal.
Then you're tying it to a 'slot' in the XML, I was just using the filename (sans extension) of the file to replace.
Ah, but this still means you can't add or remove sounds
Ok... you could remove by replacing with a 'nil' path, adding new one might be trickier to support...
EDIT II:
Regarding the lua solution (Sorry, I still don't know anything about lua) I think it would be better to have functions like getUnitType(String name) returning a UnitType-Object with functions like setDisplayName(String) (what I mean is: nearly all glest objects should be exposed to lua with many of the glest functions). If you have this, you could write the table --> unit-name (etc.) function in lua (which would also completely avoid the XML-Format-change problem)
Well, taking the idea to the extreme, you could eliminate XML from the game completely ;)
-
No, it's the same... nothing can be hard-coded anyway, the 'translatable' strings need to be read from the XML, then translations read in to override them... from XML or Lua it's only slightly different, and I don't think harder in either. You need to determine the translatables at run-time in either case, for the Lua solution you then just try to read each translatable from the Lua state, eg if you've determined you have a translatable string for the archmage's 'ice_nova' command, you query the Lua table for magic.archmage.commands.ice_nova, if there's a string there, you have your translation, if there isn't it hasn't been translated, and you fall back as normal.
May be true, but you at least need code to know that if you need the archmage's 'ice_nova' command, you need to look up magic.archmage.commands.ice_nova and not anything else
Well, taking the idea to the extreme, you could eliminate XML from the game completely
True, but IMHO lua should be used for dynamic things and XML for static ones - since again: XML is far easier to parse than lua when it comes to tool support (like a unit editor).
-
... but you at least need code to know that if you need the archmage's 'ice_nova' command, you need to look up magic.archmage.commands.ice_nova and not anything else
You're loading the 'magic' faction, 'archmage' unit, and you have just read the 'ice_nova' command, all the information is at hand, do your lookup. Simple :)
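A rough sketch of that lookup in C++, with a plain std::map standing in for the Lua table. The names makeKey, translate and TranslationTable are made up for this example, and the "faction.unit.commands.command" key layout is only assumed from the magic.archmage.commands.ice_nova example above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical flat translation table: dotted key -> translated text.
typedef std::map<std::string, std::string> TranslationTable;

// Build the lookup key from the pieces already known while loading a command.
std::string makeKey(const std::string &faction, const std::string &unit,
                    const std::string &command) {
    assert(!faction.empty() && !unit.empty() && !command.empty());
    return faction + "." + unit + ".commands." + command;
}

// Return the translation if a non-empty one exists, otherwise the fallback.
std::string translate(const TranslationTable &table, const std::string &key,
                      const std::string &fallback) {
    TranslationTable::const_iterator it = table.find(key);
    if (it != table.end() && !it->second.empty()) {
        return it->second;
    }
    return fallback;
}
```

The loader would call translate(table, makeKey("magic", "archmage", "ice_nova"), "ice_nova") and show whatever comes back, so an untranslated command simply keeps its original name.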
Well, taking the idea to the extreme, you could eliminate XML from the game completely
True, but IMHO lua should be used for dynamic things and XML for static ones - since again: XML is far easier to parse than lua when it comes to tool support (like a unit editor).
Yeah, I'm not suggesting we actually do that. And yes, for utility programs XML is nice...
-
Excuse my cross-quote...
silnarm doesn't like it and wants to have another system.
I never said that ;) I was probably pushing for Lua too hard and came across wrong. The whole Lua Vs XML discussion is actually quite irrelevant.
My main concern is that you may be making it harder for yourself than it needs to be. I know I did that a hell of a lot in my younger days, and still do occasionally now ;) Designing a whole sub-system might be more 'fun' but I'd prefer to look at what we have, what needs translating, and how we can get the job done as easily as possible. Don't start with a 'wish-list', start with a 'need-this-list'.
So anyway, I took the liberty of strolling through the xml loading code again...
These are the 'translatables' we need to collect:
tech-tree: [tech/tech.xml]
Translatables:
Tech-Tree name, directory name (or filename without extension)
Attack-Types, from xml
Armour-Types, from xml
resources: [tech/resources/]
Translatables:
Resource names, sub-directory names
Individual resource XMLs have no translatables
factions: [tech/factions/faction/faction.xml]
Translatables:
Faction name, directory name
XML contains no translatables
upgrades: [tech/factions/faction/upgrades/]
Translatables:
Upgrade names, sub-directory names
Individual upgrade XMLs have no translatables
units: [tech/factions/faction/units/unit/unit.xml]
Translatables:
Unit name, directory name
Levels, from xml
Selection-Sounds, from xml
Command-Sounds, from xml
Skills:
Skill name, from xml*
Skill sounds, from xml
Commands:
Command name, from xml
* Are skill names actually ever displayed in game??
Here's where/how we collect it... (excuse my semi-pseudo-code)
[NB: this removes the OO style references to translatables I was using earlier, so there is no
duplication of strings in the translatables table.]
// 'Global' (probably in 'Lang', which is a singleton)
map<string,string> translatables;

TechTree::load( string &path ) {
    string techname = getNameFromPath( path );
    translatables[techname] = techname; // put in translation tables, with default value
    // code gets directory names from /tech/resources
    foreach ( string name in filenames ) {
        translatables[name] = name;
    }
    // loads tech-tree Xml
    foreach ( XmlNode node in AttackTypeNode.children ) {
        translatables[node["name"]] = node["name"];
    }
    // same for armour types...
    // factions... names were passed in as a parameter, in GAE this is a set, I think vanilla Glest
    // uses a vector.
    foreach ( string name in factionNames ) {
        translatables[name] = name;
    }
    // code loads factions...
}

FactionType::load() {
    // code pre-loads unit and upgrade names...
    foreach ( string name in (unitNames + upgradeNames) ) {
        translatables[name] = name;
    }
    // code loads units
    // code loads upgrades
}

UnitType::load() {
    // code starts loading parameters
    if ( levelsNode ) {
        foreach ( XmlNode node in levelsNode.children ) {
            translatables[node["name"]] = node["name"];
        }
    }
    // code loads more parameters...
    // do something with command and selection sounds
    // Code loads skills and commands
}

SkillType::load() {
    // code gets stuff from xml
    translatables[name] = name;
    // sounds?
}

CommandType::load() {
    // code gets stuff from xml
    translatables[name] = name;
}
and so then you have all your translatables, you could write out a template file...
// create translation template...
FILE *fp = fopen( "translation_template.ini", "w" );
for ( map<string,string>::iterator it = translatables.begin(); it != translatables.end(); ++it ) {
    fprintf( fp, "%s=\n", it->first.c_str() );
}
fclose( fp );
or if this is a game, translate...
// or load translation...
for ( map<string,string>::iterator it = translatables.begin(); it != translatables.end(); ++it ) {
    string translation = Lang::getTranslation( it->first );
    if ( translation.size() ) { // if not empty string
        it->second = translation;
    }
}
That's the best I could come up with. I think it's fairly minimal, clean, and perhaps most importantly, doesn't require modifying any existing XML.
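For anyone who wants to try the idea outside the engine, the collect / write-template / translate steps can be sketched as standalone C++. This is only an illustration under assumptions: addTranslatable, makeTemplate and applyTranslations are made-up names, and a second map stands in for Lang::getTranslation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> StringMap;

// Register a translatable, using the name itself as the default value.
void addTranslatable(StringMap &translatables, const std::string &name) {
    assert(!name.empty());
    translatables.insert(std::make_pair(name, name)); // an existing entry is kept
}

// Produce the template text: one "key=" line per translatable, in key order.
std::string makeTemplate(const StringMap &translatables) {
    std::ostringstream out;
    for (StringMap::const_iterator it = translatables.begin();
         it != translatables.end(); ++it) {
        out << it->first << "=\n";
    }
    return out.str();
}

// Override the defaults with every non-empty translation that is available.
void applyTranslations(StringMap &translatables, const StringMap &translations) {
    for (StringMap::iterator it = translatables.begin();
         it != translatables.end(); ++it) {
        StringMap::const_iterator found = translations.find(it->first);
        if (found != translations.end() && !found->second.empty()) {
            it->second = found->second;
        }
    }
}
```

Entries that are missing or left empty in the translation keep their default, which matches the "fall back as normal" behaviour described earlier.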
-
ooh, lots of discussion here! I haven't read everything posted yet, but I propose we don't re-invent any wheels that don't really need it. There's a GNU utility called gettext (http://www.gnu.org/software/gettext/manual/gettext.html). Out of the box, it will do the translations for stuff in your code, but that won't work completely for us, we'll still need a mechanism to generate a language file from scenarios, and tech trees. However, the nice thing about buying into gettext is that you get to use the INSANE assortment of tools that exist for creating translations. I haven't figured out KBabel yet, but it appears that you can download translation databases for a lot of different languages to automatically produce translations (which I presume one would prefer to later have an actual native of that language edit). These get spit out in these .po files.
As I said, it won't work for us out of the box, we're still going to have to hack it some to get what we want. None the less, I would much prefer to re-use what the community has done than to write my own, providing it doesn't suck of course. :)
And KBabel's stupid web site is down (how's that for instilling confidence?), but here's a screenshot from freshmeat:
(http://freshmeat.net/screenshots/4d/02/4d0272a671e35a9f9cd281cf98237b45_medium.jpg?1237045423)
Anyway, I presume that this isn't the only tool like this. Also, the entire field is called "i18n" which means "internationalization" (see http://en.wikipedia.org/wiki/Internationalization_and_localization for an explanation).
-
Aside from it being (somewhat of) a standard, we would get to experience the full joy of automated translations, like these (Big OT!!!):
[img]http://failblog.files.wordpress.com/2008/07/fail-owned-translation-fail.jpg[/img]
[img]http://failblog.files.wordpress.com/2009/08/fail-owned-translating-fail.jpg[/img]
[img]http://failblog.files.wordpress.com/2008/07/fail-owned-translation-fail1.jpg[/img]
[img]http://failblog.files.wordpress.com/2008/08/fail-owned-engrish-fail.jpg[/img]
And the fun is that when doing this in a language none of us know, we won't know what we're *really* saying until somebody posts a bug (in a language we can speak/read) telling us that we just insulted their entire country!
-
I have made a simple Directory --> .po-File converter as you requested (I hope it is the thing you wanted, of course I can change everything). It doesn't do much except searching for all "interesting" strings by XPath-Queries (there is one file to select interesting "filenames" and one to select interesting attributes)
I made it with java since it is in my opinion far easier to work with it than with e.g. C++ (and since I am not using windows, I won't use C#)
Here are some screenshots - the "Gui" isn't really great, just one FileChooser for selecting the directory and one to place the .po-File
[img]http://img89.imageshack.us/img89/3808/openfolder.png[/img] You can select a folder
[img]http://img225.imageshack.us/img225/9177/savepo.png[/img] You can choose where you want to place the .po-File
[img]http://img225.imageshack.us/img225/8092/pofile.png[/img] The resulting .po-File
As soon as I manage to create a running jar, I'll edit my post
You currently can't read any PO-Files (which is a must IMHO - to just add new strings and remove old ones) and yeah - it can't do much (btw: The string issue "directories" instead of ".po-Files" is already fixed now :-( )
SVN-Url is:
svn co http://subversion.assembla.com/svn/rangliste/TransHelper, will be imported in a few minutes
Hmm ... I got some problems generating the .jar - the files I need aren't looked up correctly (they should be in the folder where the .jar is)
EDIT: And here is the .jar:
[url=http://billhome.at/glest/PoGenerator.zip]http://billhome.at/glest/PoGenerator.zip[/url]Extract and run. (And install java)
EDIT: My webspace is available again. You can download the file from the link above.
-
How awesome! =) Yea, almost everything I learned about internationalizing apps was for Java, it's super-easy in Java because it was designed into the core of the language & VM (for instance, all Strings in Java are wide character -- that doesn't mean that every VM *implements* them that way, but it's never something you need to worry about). I definitely like this, I'll have to give it a spin tomorrow, thanks PolitikerNEU!
Oh yea, I was reading that you can convert your .po files back & forth to Java-style .properties files too, which is essentially the file format of Glest's .lng files. I am accustomed to that format, having never worked with .po files before. Either way, this may work out really well! :)
-
Hmm ... sorry, but the glest .lng-Format is so simple that I think it'll be easier to just write another getStringLng()-Method for the PO-Entry-class returning only msgid=msgstr (or a new IniEntry-Class ... doesn't matter)
Now I have added an IniEntry-Class (only saving, no loading possible currently, but I might add it later that day).
Whenever you save a .po, the corresponding .lng-File is written in the same directory as the .po-File, just with .lng instead of .po. Be warned: the "algorithm" by which the filename is generated is simply: replace .po with .lng, so make sure the extension of your saved file is .po and that the string doesn't occur anywhere else in the filepath. I might fix this bug later.
Screenshot of the generated .lng-File (not really interesting:)
[img]http://img42.imageshack.us/img42/864/savelng.png[/img]
Note that there is an "error" in:
found %d fatal error=s'ha trobat %d error fatal
found %d fatal errors=s'han trobat %d errors fatals
This is because of the converted Test-POs from the GNU PO-Site. While they would still be recognized correctly by glest, the "ID" is wrong because it would have to consist of A-Za-z0-9_- only. A warning is emitted in this case (but currently only on Stderr, I might add some "real" error message display)
Of course not every .po-Entry can be converted to .ini because the .po-Format supports far more than this simple .ini-Format, but it is sufficient for the .po-Files this utility generates.
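The conversion rules described in this post are small enough to sketch. Note that this is not the tool's code (the tool is written in Java); it is a C++ illustration with made-up function names (isValidLngId, toLngLine, lngPathFor), and the suffix-only rename is just one possible way to avoid the filename problem mentioned above.

```cpp
#include <cassert>
#include <string>

// True if the id is non-empty and uses only A-Z, a-z, 0-9, '_' and '-'.
bool isValidLngId(const std::string &id) {
    if (id.empty()) return false;
    for (std::string::size_type i = 0; i < id.size(); ++i) {
        const char c = id[i];
        const bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                        (c >= '0' && c <= '9') || c == '_' || c == '-';
        if (!ok) return false;
    }
    return true;
}

// Format one entry the way a .lng line looks: msgid=msgstr.
std::string toLngLine(const std::string &msgid, const std::string &msgstr) {
    return msgid + "=" + msgstr;
}

// Derive the .lng path by replacing only a trailing ".po", so a ".po"
// elsewhere in the path is left alone. Without that suffix, ".lng" is appended.
std::string lngPathFor(const std::string &poPath) {
    assert(!poPath.empty());
    const std::string ext = ".po";
    if (poPath.size() >= ext.size() &&
        poPath.compare(poPath.size() - ext.size(), ext.size(), ext) == 0) {
        return poPath.substr(0, poPath.size() - ext.size()) + ".lng";
    }
    return poPath + ".lng";
}
```

A converter could call isValidLngId on each msgid and emit the warning for entries like "found %d fatal error" before writing the line.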
-
Hmm, so what is your opinion then? And before you answer that, let me recap a few issues:
- Afaik the gettext library manages all of the encoding complexities, which would remove this complexity from GAE, although we may have already addressed the majority of those.
- Using gettext out of the box, you are supposed to compile .po files into some binary format that gettext can then access quickly. If we continue using the current mechanism, this will not be necessary. I expect gettext to be slightly faster, but I don't think it matters enough to be a serious consideration (skip the rest of this bullet if you don't care about performance details). The current Property class uses a std::map<std::string, std::string>, thus relying upon the speed of std::string::compare(const string&) const (on the surface, it's more complicated than that, but that's what it compiles down to). I believe that std::string::compare() does a character by character comparison and I'm guessing that gettext creates a 32-bit hash code for its strings so that less processing is needed, but then again, it's comparing the entire text string and we're just using message identifiers, which are shorter.
So, we could (sorry for the formatting here, I can't figure out how to get it to do a numbered list :( )
- 1. Continue using the Property class as-is (possibly with further enhancements to better manage various character encoding). This solution is more in-line with the standard Java mechanism (isolating all of your messages/text to a single class and sticking the actual messages in a .properties file). Then we can use PolitikerNEU's tool to generate and manage language files for techs, scenarios and whatever other .xml, converting between the .properties/.ini format and .po format to use other translation tools like KBabel.
- 2. Convert to the .po format and convert the Lang class to call gettext instead of our Properties class, but continue to use message IDs instead of actual English text embedded in the code. This would eliminate the need to further improve the Properties class to manage encoding and leave us directly in the .po format, which a wide assortment of tools have support for. This will also require modders to compile their .po files prior to release (which, again, can be encapsulated in a tool we distribute and maybe even have PolitikerNEU's tool do it).
- 3. Entirely eliminate the Lang class and replace it with direct calls to gettext using the _() macro (not my choice personally, but it's used by a lot of software).
With solutions 1 and 2, we'll still need to add some extra glue to the xgettext program to get it to properly strip out our language strings (probably looking for Lang::getInstance().getString("messageID") or lang.getString("messageID")), however, I've learned that this isn't terribly difficult to do, and it even supports parsing the C and C++ languages! :) We should probably also translate error messages, not all of them, but at least those that may be meaningful to a user. Those that only a developer would normally understand can be left in hard-coded English as long as all of our development team speaks English.
Any other ideas for how to approach this? I'm personally leaning towards solution #1, but I'm not ruling out #2 altogether.
Final thoughts: I wouldn't mind terribly if we came up with some way to "script" an automatic translation process. Apparently, these translation databases are large-ish (200-ish MB each) so maybe we can do this on some server or something, so modders don't have to download 2GB of data to do translations. Finally (and this is thinking ahead a little bit) I hope we can have some kind of mechanism to mark a particular message translation as being human-made or -validated, so that later runs through the translator don't attempt to change them. Lastly, to take i18n all the way, if we really want to do it right, we'll have to have support for languages whose text doesn't read from left to right. Probably, what we have already is enough to bite off for now.
-
I too think that gettext won't be much faster than a "normal" map<string,string> lookup, and I haven't found (after searching for a very short time) a good way to compile .po from java, so I would have to create the binary format from scratch in java - which I don't like.
I don't really know if solution 1 or 2 would be better, but since 2 looks rather complicated, I'd prefer solution 1 - but if virtual functions are fast enough, we could just create a Lang-"interface" and implement this using either .po or .mo - that way, you could use .po for simply creating a mod and .mo if you got a compiled one.
One problem of using gettext may be - I don't know it - that you cannot change the language on the fly, if that is true, I think we need to support method 1. (Changing language on the fly is useful if you are playing e.g. together with a player using another language - both could switch to e.g. english for a short time to be able to know the correct unit name - but this could be tricky maybe)
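Just to illustrate the Lang-"interface" idea together with switching on the fly, here is a small C++ sketch. All names (StringSource, MapSource, LangSwitcher) are invented for this example and the translated strings are only sample data; a real .po or .mo reader would be another class implementing the same lookup().

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface: one implementation per storage format.
class StringSource {
public:
    virtual ~StringSource() {}
    // Returns true and fills 'out' if the id has a translation.
    virtual bool lookup(const std::string &id, std::string &out) const = 0;
};

// In-memory source, standing in for a parsed language file.
class MapSource : public StringSource {
public:
    void set(const std::string &id, const std::string &text) { strings[id] = text; }
    virtual bool lookup(const std::string &id, std::string &out) const {
        std::map<std::string, std::string>::const_iterator it = strings.find(id);
        if (it == strings.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> strings;
};

// Holds the active source; swapping the pointer changes the language at once.
class LangSwitcher {
public:
    LangSwitcher() : current(0) {}
    void use(const StringSource *source) {
        assert(source != 0);
        current = source;
    }
    std::string getString(const std::string &id) const {
        std::string text;
        if (current && current->lookup(id, text)) return text;
        return id; // untranslated: fall back to the id itself
    }
private:
    const StringSource *current;
};
```

Since getString() costs one virtual call plus one map lookup per string, the overhead of the interface should be small compared to the lookup itself, and two players could both call use() with the English source for a moment to compare unit names.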
I don't know anything about xgettext, but I hope this will be possible - maybe using some macro if nothing else is possible? (for example:
#ifdef xgettext
#define GETSTRING(x)
#else
#define GETSTRING(x) Lang::getInstance().getString(x)
#endif
)
But actually I don't think these translation databases would be of much use for glest players since "normal" programs use strings like "File" or "Edit", but not "Initiate" (at least not in the meaning of the unit in glest), "archmage Tower" or something like that - additionally since we use IDs I doubt this translation database would find anything.
Using a server would be certainly nice, for example there is this launchpad-thing (is it open source already?) which could be used maybe, but I don't know it.
(Sorry, I am rather tired right now :-/ )
-
(Sorry, I am rather tired right now :-/ )
hehe, I know that feeling! :)
You can switch languages dynamically with gettext, and now I'm personally leaning towards using our own stuff, and converting back & forth to .po to make use of all of the translation tools. As far as language IDs, we can just put the english version in the .po files when sending them through translation stuff. And as far as accuracy, each language database is about 200 MB, so I'm betting it's better than "File, Edit, etc."
I'm still open to feedback on this. Also, I dunno about the Launchpad thing, I don't think I've ever heard of it.
-
Looking back at what I've been missing, I really liked silnarm's idea for the translations! Although I'm not so sure if it would work well for sounds...?!?
As well, there's no need to translate skill names. Those are just references for the commands so they know which skills to use. If you tried to translate it, you'd probably end up with no working commands! :D