Freeciv - Patches: patch #3027, libicu

 
 

patch #3027: libicu

Submitted by:  Marko Lindqvist <cazfi>
Submitted on:  Sun 06 Nov 2011 11:49:52 PM UTC  
 
Category: general
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Planned Release: 2.6.0


 

Sun 07 Apr 2013 11:04:26 PM UTC, comment #4:

> In some ways it would be a lot easier if the internal character
> set could be relied on to be UTF-8.


Yes, now that I'm looking at libicu, I'd like to replace much of our lowest-level string handling with its UTF-8 handling. For one thing, I think we have at least theoretical bugs in how we currently treat strings that are actually in some other encoding as if they were ASCII. Fixing those to work with any internal encoding is not going to happen, but if we could rely on everything being UTF-8, except at the very last moment when it's written out to a non-UTF-8 "device" or the very first moment when it's read in from one, handling would in most cases be trivially simple (call the equivalent ICU function instead of the C library function).

Is there any reason today not to switch to UTF-8 in this way for 2.6?

Marko Lindqvist <cazfi>
Project Administrator
Sat 30 Mar 2013 09:19:03 PM UTC, comment #3:

http://site.icu-project.org/

Not only UTF-8, but stuff like collations.

As for portability concerns, I added an ICU build to crosser, and it at least builds fine for MinGW.

Marko Lindqvist <cazfi>
Project Administrator
Sat 30 Jun 2012 02:48:43 PM UTC, comment #2:

(When I say "UTF-8" I might actually mean "Unicode".)

Jacob Nevins <jtn>
Project Administrator
Fri 29 Jun 2012 09:24:59 AM UTC, comment #1:

In some ways it would be a lot easier if the internal character set could be relied on to be UTF-8. For instance, I know of portable code to fix bug #17289 without introducing platform dependencies, but only assuming UTF-8. Right now I think I have to code as if it could be any character set (I might end up with Shift-JIS or something), and so be very conservative in my assumptions -- is that right? (I can't immediately locate the developer documentation on character sets; I'm sure I've seen some somewhere.)

Jacob Nevins <jtn>
Project Administrator
Sun 06 Nov 2011 11:49:52 PM UTC, original submission:

Should we adopt libicu for all kinds of UTF-8 manipulation?

This came up after a discussion about translations. Sometimes it's really hard to arrange a sentence in translation so that it doesn't begin with a "%s" that expands to a word starting with a lower-case letter. It was requested that we add some mechanism to upper-case such first letters. Well, that's not trivial to do for the variable-byte UTF-8 texts we have. Libicu would probably help a lot here.
We've also had, and still have, a lot of problems with truncating UTF-8 strings, potentially in the middle of a multibyte character. On a somewhat related note, I'd like to change our naming of variables/constants/macros/whatever so that "len" means the length of text in characters, and "size" is used when the number of bytes is in question.
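
To illustrate the truncation problem and the proposed "len" vs. "size" distinction, here is a minimal sketch in plain C. It does not use libicu, and the function names utf8_len() and utf8_copy_trunc() are hypothetical, not existing Freeciv functions. It assumes the input is valid, NUL-terminated UTF-8, and "character" here means a code point, so combining sequences are not treated as a unit.

```c
#include <stddef.h>
#include <string.h>

/* "len": number of code points in a NUL-terminated UTF-8 string.
 * Counts every byte that is not a continuation byte (10xxxxxx).
 * Assumes valid UTF-8 input. */
size_t utf8_len(const char *s)
{
  size_t len = 0;

  for (; *s != '\0'; s++) {
    if (((unsigned char)*s & 0xC0) != 0x80) {
      len++;
    }
  }

  return len;
}

/* Copy src into dst, whose "size" is dst_size bytes including the
 * terminating NUL. If src does not fit, cut it at a character
 * boundary instead of in the middle of a multibyte sequence.
 * Returns the number of bytes written, excluding the NUL. */
size_t utf8_copy_trunc(char *dst, size_t dst_size, const char *src)
{
  size_t n;

  if (dst_size == 0) {
    return 0;
  }

  n = strlen(src);
  if (n >= dst_size) {
    n = dst_size - 1;
    /* If the first dropped byte is a continuation byte, the cut is
     * inside a character: back up to that character's lead byte. */
    while (n > 0 && ((unsigned char)src[n] & 0xC0) == 0x80) {
      n--;
    }
  }

  memcpy(dst, src, n);
  dst[n] = '\0';

  return n;
}
```

For the string "h\xC3\xA9llo" ("héllo"), the size as reported by strlen() is 6 bytes while utf8_len() gives a len of 5. Copying it into a 3-byte buffer with utf8_copy_trunc() keeps only "h" rather than leaving half of the two-byte "é" behind. A library such as libicu would additionally cover validation and case mapping, which this sketch does not attempt.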

Marko Lindqvist <cazfi>
Project Administrator

 


Items that depend on this one: None found

 

Latest changes:

Date                             Changed By  Updated Field    Previous Value => Replaced By
Sun 07 Apr 2013 10:53:37 PM UTC  cazfi       Dependencies     - => Depends on patch #3838
Sat 30 Mar 2013 09:19:03 PM UTC  cazfi       Planned Release  2.5.0 => 2.6.0