Problem concatenating Unicode strings in Delphi 2010

I'm new to Unicode so hopefully I am doing something wrong. But here is the problem.

I have three variables declared as String (RAD Studio 2010 Update 1).

var
  s1, s2, s3: String;

begin
  s1 := 'abcdefgh';     // These are really two Arabic strings, but
  s2 := 'ijklmno';      // here I use normal letters so you can see what's happening.

  s3 := s1 + s2;        // Instead of s3 getting "abcdefghijklmno" it gets "abcjklmdefghno"
end;

So the result is a jumbled string. 
The project has MultiByte character support set to True and I noticed SizeOf(Char) is 2. The debugger is showing the correct Arabic characters for s1 and s2, but the characters get jumbled when they are concatenated.

Why is this happening? Did I miss a setting?

TIA
Mike
Mike
5/28/2010 4:39:52 PM
embarcadero.delphi.win32

Mike Long wrote:

> var
>      s1, s2, s3: String;
> 
> begin
>   s1 := 'abcdefgh';     // These are really two Arabic strings, but
>   s2 := 'ijklmno';      // here I use normal letters so you can see what's happening.
> 
>   s3 := s1 + s2;        // Instead of s3 getting "abcdefghijklmno" it gets "abcjklmdefghno"
> end;

That shouldn't be a problem. Can you give a real, compilable code
example that fails?

-- 
Pieter

"Every gun that is made, every warship launched, every rocket
 fired signifies in the final sense, a theft from those who
 hunger and are not fed, those who are cold and are not clothed.
 This world in arms is not spending money alone. It is spending
 the sweat of its laborers, the genius of its scientists, the
 hopes of its children. This is not a way of life at all in any
 true sense. Under the clouds of war, it is humanity hanging on
 a cross of iron."
 -- Dwight D. Eisenhower
Pieter
5/29/2010 2:11:51 PM
Pieter Zijlstra wrote:
> Mike Long wrote:
> 
> > var
> >      s1, s2, s3: String;
> > 
> > begin
> >   s1 := 'abcdefgh';     // These are really two Arabic strings, but
> >   s2 := 'ijklmno';      // here I use normal letters so you can see what's happening.
> > 
> >   s3 := s1 + s2;        // Instead of s3 getting "abcdefghijklmno" it gets "abcjklmdefghno"
> > end;
> 
> That shouldn't be a problem, can you give an real (compilable/code)
> example that fails?
> 
> -- 
> Pieter

Pieter,
    I'll try and create a demo. Where do I send the zip file?

Mike


Mike
5/29/2010 5:33:00 PM
Mike Long wrote:

> Pieter Zijlstra wrote:
> > 
> > That shouldn't be a problem, can you give an real (compilable/code)
> > example that fails?
> 
> I'll try and create a demo. Where do I send the zip file?

I was thinking of just a couple of lines of source code showing the
actual problem, but if it is going to take more than that ...
attachments can be posted here:
https://forums.embarcadero.com/forum.jspa?forumID=2

-- 
Pieter

"Never interrupt your enemy when he is making a mistake."
 -- Napoleon
Pieter
5/29/2010 6:02:46 PM
> I was thinking of just a couple of lines of source code showing the
> actual problem, but if it is going to take more than that ...
> attachments can be posted here:
> https://forums.embarcadero.com/forum.jspa?forumID=2

I wrote a test app and it is working fine. Wouldn't you know it? :-)
Arabic is written right to left so maybe that has something to do with it? I have some more experimenting to do and if I run into a problem, I'll post it. Thanks.

Mike
Mike
5/31/2010 2:40:20 PM
Pieter,
    After delving into it some more, the problem still exists and I can't seem to find an easy solution. There are some languages that are written right to left. See http://www.i18nguy.com/temp/rtl.html

One of these languages is Arabic. Here is an example describing what is happening (I'm using random words, so no need to translate):

  S1 := 'أ ب ج د';
  S2 := 'ه و';
  S3 := S1 + S2;
  Edit3.Text := 'S1="'+S1+'" S2="'+S2+'" S1+S2="'+S3+'"';

Edit3.Text displays:
S1="أ ب ج د" S2="ه و" S1+S2="أ ب ج ده و"

So you see that S2 now appears BEFORE S1 in Edit3.Text even though it was concatenated to the END of S1. If I were just concatenating Arabic words together, this would be fine.

But I want to place S2 at a particular location on the line, to the right of S1. I am building up lines of text so that word #1 appears at column 10, word #2 at column 20, word #3 at column 30, etc. (It could just as well be column 11, 24, 55, etc.) The Insert() procedure has the same problem.

So I can't see any way to build up this line so the text is in the correct position on the line. The same routine to build up these lines should handle left to right as well as right to left languages. 
Can you or anyone else shed some light on this? I'm getting dizzy trying to solve this problem. :-)

TIA
Mike
Mike
6/14/2010 3:26:00 PM
Mike Long wrote:

> Pieter,
>     After delving into it some more, the problem still exists and I
> can't seem to find an easy solution. There are some languages that
> are written right to left. See http://www.i18nguy.com/temp/rtl.html
> 
> One of these languages is Arabic. Here is an example describing what
> is happening (I'm using random words, so no need to translate):
> 
>   S1 := 'أ ب ج د';
>   S2 := 'ه و';
>   S3 := S1 + S2;
>   Edit3.Text := 'S1="'+S1+'" S2="'+S2+'" S1+S2="'+S3+'"';
> 
> Edit3.Text displays:
> S1="أ ب ج د" S2="ه و" S1+S2="أ ب ج ده و"
> 
> So you see S2 now appears BEFORE  S1 in Edit3.Text even though it was
> concatenated to the END of S1. Now if I were just concatenating
> Arabic words together then this should be fine.

It's probably still concatenated to the right if you look at each
individual character in the string but just displayed reversed.
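
One way to check this is to ignore the display entirely and print the stored UTF-16 code units. The following is a minimal console sketch, not code from the thread; the helper name CodeUnits is made up for illustration, and the Arabic letters are written as character codes so the source file encoding does not matter.

```delphi
program DumpLogicalOrder;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the UTF-16 code units of S as hex,
// in the order they are stored in memory (logical order).
function CodeUnits(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Ord(S[I]), 4);
  end;
end;

var
  S1, S2, S3: string;
begin
  S1 := #$0623#$0628;  // ARABIC LETTER ALEF WITH HAMZA ABOVE, ARABIC LETTER BEH
  S2 := #$0647#$0648;  // ARABIC LETTER HEH, ARABIC LETTER WAW
  S3 := S1 + S2;
  // The code units of S1 come first, followed by those of S2.
  Writeln(CodeUnits(S3));  // prints: 0623 0628 0647 0648
end.
```

If the dump shows S1's code units followed by S2's, the concatenation itself is correct and only the on-screen order differs.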
 
> But I want to place S2 at a particular location on the line, to the
> right of S1. I am building up lines of text so word #1 should appear
> at position 10, word #2 at column 20, word #3 at column 30 etc.. (It
> could be column 11, 24 55 etc.) The Insert() procedure has the same
> problem.

For what purpose (displaying, writing to file, something else)?
If it is for displaying can you use another component like TListView in
report mode?
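
A rough sketch of the TListView idea, assuming a plain VCL application: each word goes into its own column, so the control positions it rather than a count of spaces. The form and list view are created in code only to keep the example self-contained; the column captions and widths are arbitrary placeholders.

```delphi
program ListViewColumns;

uses
  Forms, Controls, ComCtrls;

var
  Form: TForm;
  ListView: TListView;
  Column: TListColumn;
  Item: TListItem;
begin
  Application.Initialize;
  Form := TForm.CreateNew(nil);
  try
    ListView := TListView.Create(Form);   // owned (and freed) by the form
    ListView.Parent := Form;
    ListView.Align := alClient;
    ListView.ViewStyle := vsReport;       // "report mode": rows and columns

    Column := ListView.Columns.Add;
    Column.Caption := 'Word 1';           // placeholder caption
    Column.Width := 100;                  // placeholder width in pixels
    Column := ListView.Columns.Add;
    Column.Caption := 'Word 2';
    Column.Width := 100;

    Item := ListView.Items.Add;
    Item.Caption := #$0623#$0628;         // first column (S1 in the thread)
    Item.SubItems.Add(#$0647#$0648);      // second column (S2 in the thread)

    Form.ShowModal;
  finally
    Form.Free;
  end;
end.
```

Because every cell is drawn separately, text in one cell cannot be reordered together with text in a neighbouring cell.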

> So I can't see any way to build up this line so the text is in the
> correct position on the line. The same routine to build up these
> lines should handle left to right as well as right to left languages.
> Can you or anyone else shed some light on this? I'm getting dizzy
> trying to solve this problem. :-)

IIRC there is a way to detect whether a string is LTR or RTL, but I don't
know it off the top of my head. If you can detect that you're dealing with
RTL text, it should be possible to concatenate the strings in reverse
order. If that gives you problems, you could fill a fixed-length string
character by character: first the characters of S2, then the characters
of S1 at the next fixed location.
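
The suggestion could be sketched roughly as follows. Both helper names (ContainsRTL, PlaceAt) are hypothetical, and the RTL test is only an approximation that checks a few Unicode block ranges (Hebrew, Arabic and neighbouring scripts) instead of using real bidirectional character properties. The sketch controls where the characters are stored in the string; how a given control and font then display the line has not been verified.

```delphi
program ColumnLine;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: rough test for right-to-left text based on
// Unicode block ranges only (an approximation, not a full bidi check).
function ContainsRTL(const S: string): Boolean;
var
  I: Integer;
  C: Word;
begin
  Result := False;
  for I := 1 to Length(S) do
  begin
    C := Ord(S[I]);
    if ((C >= $0590) and (C <= $08FF)) or    // Hebrew, Arabic, Syriac, Thaana, ...
       ((C >= $FB1D) and (C <= $FDFF)) or    // Hebrew/Arabic presentation forms
       ((C >= $FE70) and (C <= $FEFC)) then  // Arabic presentation forms B
    begin
      Result := True;
      Exit;
    end;
  end;
end;

// Hypothetical helper: writes Value into Line starting at the 1-based
// column Col, padding Line with spaces first if it is too short.
procedure PlaceAt(var Line: string; const Value: string; Col: Integer);
var
  I, Needed: Integer;
begin
  Needed := Col + Length(Value) - 1;
  if Length(Line) < Needed then
    Line := Line + StringOfChar(' ', Needed - Length(Line));
  for I := 1 to Length(Value) do
    Line[Col + I - 1] := Value[I];
end;

var
  S1, S2, Line: string;
begin
  S1 := #$0623#$0628;  // two Arabic letters, given as character codes
  S2 := #$0647#$0648;
  Line := '';
  if ContainsRTL(S1) or ContainsRTL(S2) then
  begin
    // RTL text: fill in reverse order, S2 first and S1 at the next column.
    PlaceAt(Line, S2, 10);
    PlaceAt(Line, S1, 20);
  end
  else
  begin
    PlaceAt(Line, S1, 10);
    PlaceAt(Line, S2, 20);
  end;
  Writeln(Length(Line));  // prints: 21
end.
```

The same routine handles left-to-right text unchanged, since the swap only happens when ContainsRTL reports a match.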

-- 
Pieter

"Science is the great antidote to the poison of enthusiasm and
 superstition."
 -- Adam Smith
Pieter
6/15/2010 9:26:27 PM
Mike Long wrote:

>   S1 := 'أ ب ج د';
>   S2 := 'ه و';
>   S3 := S1 + S2;
>   Edit3.Text := 'S1="'+S1+'" S2="'+S2+'" S1+S2="'+S3+'"';
> 
> Edit3.Text displays:
> S1="أ ب ج د" S2="ه و" S1+S2="أ ب ج ده و"
> 
> So you see S2 now appears BEFORE  S1 in Edit3.Text even though it was
> concatenated to the END of S1. 

Well, of course! Think about it. It is read from right to left, isn't
it? So when reading in that direction, you first see the contents of S1
and then those of S2. I wouldn't have expected anything else.


-- 
Rudy Velthuis (TeamB)        http://www.teamb.com

"Disobedience, in the eyes of anyone who has read history, is
 man's original virtue. It is through disobedience that progress
 has been made, through disobedience and through rebellion."
 -- Oscar Wilde
Rudy
6/29/2010 9:17:44 PM