TStringStream and UTF8

Hi,

I have this scenario for loading a report from the database (the report is
stored in a database Memo field in XML format):

var RptStream: TStringStream;
begin
  Report.Clear;
  Report.Script.Clear;
  RptStream := TStringStream.Create(
    qIzvestaji.FieldByName('REPORTFILE').AsWideString, TEncoding.UTF8);
  Report.LoadFromStream(RptStream);
  RptStream.Free;
  Report.ShowReport;
end;

The Report component holds that XML as the report description, and after
modification all changes must be saved back to the Memo field in the database,
something like this:

var
  RptStream: TStringStream;
begin
  RptStream := TStringStream.Create('', TEncoding.UTF8);
  Report.SaveToStream(RptStream);
  dm.qIzvestaji.Edit;
  TWideMemoField(dm.qIzvestaji.FieldByName('REPORTFILE')).LoadFromStream(RptStream);
  dm.qIzvestaji.Post;
end;

After this change, the data stored in the database consists of Chinese characters,
and the XML description of the report is no longer there.

How can I solve this conversion problem?

Thanks in advance...
Sasa
9/28/2011 10:04:46 PM
embarcadero.delphi.general

"Sasa Mihajlovic" <office@msdinfo.com> wrote in message 
news:406207@forums.embarcadero.com...

> I have this scenario for loading a report from the database (the report is
> stored in a database Memo field in XML format):

What charset does the DB field actually use?

> RptStream := 
> TStringStream.Create(qIzvestaji.FieldByName('REPORTFILE').AsWideString, 
> TEncoding.UTF8);
>  Report.LoadFromStream(RptStream);

You are reading the DB field data as a UTF-16 encoded WideString, then 
telling the TStringStream to re-encode that data to UTF-8 internally, so the 
LoadFromStream() method will receive UTF-8 encoded byte octets, not UTF-16 
encoded byte octets.

>  RptStream := TStringStream.Create('', TEncoding.UTF8);
>  Report.SaveToStream(RptStream);
> TWideMemoField(dm.qIzvestaji.FieldByName('REPORTFILE')).LoadFromStream(RptStream);

SaveToStream() will save its data as-is to the stream (so presumably it will 
save UTF-8 encoded byte octets).  You are then passing the stream to another 
LoadFromStream() method, which will receive the same byte octets as-is 
(actually, it might not receive any bytes at all, as you are not setting the 
stream's Position back to 0 before calling LoadFromStream()).  The stream's 
TEncoding parameter is completely ignored in this situation.
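
A minimal console sketch of the Position point, assuming Delphi 2009 or later (the sample string and variable names are made up for illustration). Writing to a stream leaves Position at the end, so a plain Read() from there returns nothing until the stream is rewound. Whether a particular LoadFromStream() implementation rewinds the source for you depends on that implementation, so setting Position explicitly is the safe habit:

{code:delphi}
program StreamPositionDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;  // System.SysUtils, System.Classes in XE2 and later

var
  Stream: TStringStream;
  Buf: array[0..15] of Byte;
  ReadAtEnd, ReadAfterRewind: Integer;
begin
  Stream := TStringStream.Create('', TEncoding.UTF8);
  try
    // Writing advances Position to the end of the data
    // (9 ASCII characters become 9 UTF-8 bytes).
    Stream.WriteString('<report/>');

    // Reading from the end of the stream yields no bytes.
    ReadAtEnd := Stream.Read(Buf, SizeOf(Buf));

    // Rewind, then read again: now the written bytes come back.
    Stream.Position := 0;
    ReadAfterRewind := Stream.Read(Buf, SizeOf(Buf));

    Writeln('Bytes read at end of stream: ', ReadAtEnd);
    Writeln('Bytes read after rewinding:  ', ReadAfterRewind);
  finally
    Stream.Free;
  end;
end.
{code}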

Since you read the DB field data as a WideString, you should write data back 
to it as a WideString as well.  In which case, the stream's TEncoding 
parameter will take effect to decode the stored data to UTF-16, eg:

{code:delphi}
dm.qIzvestaji.FieldByName('REPORTFILE').AsWideString := RptStream.DataString;
{code}
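
Dropped into the save routine from the question, that assignment might look like the sketch below. It reuses the names from the original post (Report, dm.qIzvestaji, REPORTFILE) and assumes the Report component really writes UTF-8 bytes in SaveToStream(); the try/finally is an addition so the stream is freed even if Post raises:

{code:delphi}
var
  RptStream: TStringStream;
begin
  RptStream := TStringStream.Create('', TEncoding.UTF8);
  try
    // The component writes its raw bytes (presumed UTF-8) into the stream.
    Report.SaveToStream(RptStream);

    dm.qIzvestaji.Edit;
    // DataString decodes the stream's bytes with TEncoding.UTF8 and
    // returns a UTF-16 string, which is what AsWideString expects.
    dm.qIzvestaji.FieldByName('REPORTFILE').AsWideString := RptStream.DataString;
    dm.qIzvestaji.Post;
  finally
    RptStream.Free;
  end;
end;
{code}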

> After this change, the data stored in the database consists of Chinese characters,
> and the XML description of the report is no longer there.

You stored whatever raw bytes were generated by the Report component, which 
likely does not match the actual charset used by the DB field.

-- 
Remy Lebeau (TeamB)
Remy
9/29/2011 12:20:07 AM
Remy Lebeau (TeamB) wrote on 29.09.2011 :
> "Sasa Mihajlovic" <office@msdinfo.com> wrote in message 
> news:406207@forums.embarcadero.com...
>
>> I have this scenario for loading a report from the database (the report is
>> stored in a database Memo field in XML format):
>
> What charset does the DB field actually use?
My database uses the UTF8 charset (Oracle & Firebird).
>
>> RptStream := 
>> TStringStream.Create(qIzvestaji.FieldByName('REPORTFILE').AsWideString, 
>> TEncoding.UTF8);
>>  Report.LoadFromStream(RptStream);
>
> You are reading the DB field data as a UTF-16 encoded WideString, then 
> telling the TStringStream to re-encode that data to UTF-8 internally, so the 
> LoadFromStream() method will receive UTF-8 encoded byte octets, not UTF-16 
> encoded byte octets.
My Report component can't handle a UTF-16 string, so I must re-encode it 
to UTF-8, and then everything works fine. In Delphi versions before D2009 I 
just passed the stream, but now this is the only way to solve the problem. 
Btw, the Report is FastReport VCL.
>
>>  RptStream := TStringStream.Create('', TEncoding.UTF8);
>>  Report.SaveToStream(RptStream);
>> TWideMemoField(dm.qIzvestaji.FieldByName('REPORTFILE')).LoadFromStream(RptStream);
>
> SaveToStream() will save its data as-is to the stream (so presumably it will 
> save UTF-8 encoded byte octets).  You are then passing the stream to another 
> LoadFromStream() method, which will receive the same byte octets as-is 
> (actually, it might not receive any bytes at all, as you are not setting the 
> stream's Position back to 0 before calling LoadFromStream()).  The stream's 
> TEncoding parameter is completely ignored in this situation.
>
> Since you read the DB field data as a WideString, you should write data back 
> to it as a WideString as well.  In which case, the stream's TEncoding 
> parameter will take effect to decode the stored data to UTF-16, eg:
>
> {code:delphi}
> dm.qIzvestaji.FieldByName('REPORTFILE').AsWideString := RptStream.DataString;
> {code}
>
OK, this scenario works, but my users have asked for Cyrillic and accented 
Unicode characters, and when I save Cyrillic characters and read them back 
I get some other letters (not Chinese, but not Cyrillic either). How can I 
solve this problem?
>> After this change, the data stored in the database consists of Chinese characters,
>> and the XML description of the report is no longer there.
>
> You stored whatever raw bytes were generated by the Report component, which 
> likely does not match the actual charset used by the DB field.

Thanks for help...
Sasa
9/29/2011 5:42:05 AM
"Sasa Mihajlovic" <office@msdinfo.com> wrote in message 
news:406341@forums.embarcadero.com...

> OK, this scenario works, but my users have asked for Cyrillic
> and accented Unicode characters, and when I save Cyrillic
> characters and read them back I get some other letters
> (not Chinese, but not Cyrillic either).

What letter exactly? Is it the '?' character, by chance?  Please be more 
specific.  UTF-8 and UTF-16 support all Unicode characters, so the code I 
gave you should be converting from FastReport's UTF-8 to UnicodeString's 
UTF-16 to the DB's UTF-8.  Converting between UTF-8 and UTF-16 is a 
loss-less conversion.  Make sure FastReport is actually returning valid 
UTF-8, and that TStringStream is returning valid UTF-16.  If both are true, 
then the DB component is the culprit.
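
One way to convince yourself that the conversion itself is loss-less is a small round-trip check like the sketch below (Delphi 2009 or later assumed; the three Cyrillic letters are an arbitrary sample, written as code points so the encoding of the source file does not matter). If this prints TRUE on your machine but the data still comes back wrong from the DB, the conversion is not where the characters are being lost:

{code:delphi}
program Utf8RoundTripDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;  // System.SysUtils in XE2 and later

var
  Original, Decoded: string;
  Utf8Bytes: TBytes;
begin
  // Three Cyrillic letters (U+0416, U+0443, U+043A) as a UTF-16 string.
  Original := #$0416#$0443#$043A;

  // UTF-16 -> UTF-8: each of these letters takes two bytes in UTF-8.
  Utf8Bytes := TEncoding.UTF8.GetBytes(Original);

  // UTF-8 -> UTF-16 again.
  Decoded := TEncoding.UTF8.GetString(Utf8Bytes);

  Writeln('UTF-16 characters: ', Length(Original));
  Writeln('UTF-8 bytes:       ', Length(Utf8Bytes));
  Writeln('Round trip equal:  ', Decoded = Original);
end.
{code}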

-- 
Remy Lebeau (TeamB)
Remy
9/29/2011 4:38:29 PM
Remy Lebeau (TeamB) submitted this idea :
> "Sasa Mihajlovic" <office@msdinfo.com> wrote in message 
> news:406341@forums.embarcadero.com...
>
>> OK, this scenario works, but my users have asked for Cyrillic
>> and accented Unicode characters, and when I save Cyrillic
>> characters and read them back I get some other letters
>> (not Chinese, but not Cyrillic either).
>
> What letter exactly? Is it the '?' character, by chance?  Please be more 
> specific.  UTF-8 and UTF-16 support all Unicode characters, so the code I 
> gave you should be converting from FastReport's UTF-8 to UnicodeString's 
> UTF-16 to the DB's UTF-8.  Converting between UTF-8 and UTF-16 is a loss-less 
> conversion.  Make sure FastReport is actually returning valid UTF-8, and that 
> TStringStream is returning valid UTF-16.  If both are true, then the DB 
> component is the culprit.

I made some modifications in both cases, because I want my saved data to be 
readable from more than one application. If I use UTF-16 (WideStringField in 
Delphi 2010) and try to read that string from a web browser that expects 
UTF-8, will the string be valid?

Here is my modification, after which everything works fine. I tried reading 
the data from a web app and from my app, and both give the same result.

Read from the database into the Report component:

  rptStream := TStringStream.Create(
    qIzvestaji.FieldByName('REPORTFILE').AsWideString, TEncoding.UTF8);

Write from the component to the database:

  dm.qIzvestaji.FieldByName('REPORTFILE').AsWideString :=
    UTF8Decode(rptStream.DataString);

Is this scenario ok?

Thanks for help...
Sasa
9/29/2011 6:29:46 PM
"Sasa Mihajlovic" <office@msdinfo.com> wrote in message 
news:406577@forums.embarcadero.com...

> I made some modifications in both cases, because I want my saved
> data to be readable from more than one application. If I use UTF-16
> (WideStringField in Delphi 2010) and try to read that string from
> a web browser that expects UTF-8, will the string be valid?

If the database natively supports UTF-8, and reading/writing data from/to 
the DB works correctly, there is no need to change the DB to UTF-16.  If you 
store UTF-16 data and then want to pass it to a webbrowser, you would have 
to send it as UTF-16 (with a UTF-16 charset specified) or convert it to UTF-8 
(with a UTF-8 charset specified) before sending it to the browser.

>  dm.qIzvestaji.FieldByName('REPORTFILE').AsWideString :=
> UTF8Decode(rptStream.DataString);

That will not work.  In D2010, reading the TStringStream.DataString property 
returns a UTF-16 encoded UnicodeString.  The property getter method takes 
the raw bytes that are in the stream (which will be UTF-8 if that is what 
FastReport is returning) and decodes them to UTF-16 using the specified 
TEncoding class.  Since you are using TEncoding.UTF8 as the encoding, you 
are telling the DataString property to perform a UTF-8-to-UTF-16 conversion 
before your code even sees the data.  UTF8Decode() requires a UTF-8 encoded 
RawByteString as input, but you are passing it a UTF-16 encoded 
UnicodeString instead.  When a UnicodeString is assigned to a RawByteString, 
the RTL converts the UTF-16 data to Ansi using the OS default Ansi codepage, 
which will never be UTF-8 (at least on Windows systems) so you will lose 
data for any non-ASCII characters.  If TWideStringField.AsWideString expects 
a UTF-16 encoded string, and TStringStream.DataString returns a UTF-16 
encoded string, you should be able to just assign the data as-is, like my 
earlier example showed.
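
To see that DataString has already done the decoding, here is a small sketch (Delphi 2009 or later assumed; the sample text and variable names are made up, and the WriteBuffer() call merely stands in for a component that writes raw UTF-8 bytes into the stream). If you ever hold raw UTF-8 bytes rather than a string, decoding them explicitly with TEncoding.UTF8.GetString() is the equivalent step; passing an already decoded string through UTF8Decode() is not:

{code:delphi}
program DataStringDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;  // System.SysUtils, System.Classes in XE2 and later

var
  Stream: TStringStream;
  Expected, FromDataString, FromBytes: string;
  RawUtf8: TBytes;
begin
  // Sample text: three Cyrillic letters given as code points.
  Expected := #$0416#$0443#$043A;
  RawUtf8 := TEncoding.UTF8.GetBytes(Expected);

  Stream := TStringStream.Create('', TEncoding.UTF8);
  try
    // Stand-in for a component saving raw UTF-8 bytes to the stream.
    Stream.WriteBuffer(RawUtf8[0], Length(RawUtf8));

    // DataString decodes the bytes with the stream's encoding,
    // so the result is already a UTF-16 string.
    FromDataString := Stream.DataString;
  finally
    Stream.Free;
  end;

  // Explicit decoding of raw bytes gives the same string.
  FromBytes := TEncoding.UTF8.GetString(RawUtf8);

  Writeln('DataString equals original: ', FromDataString = Expected);
  Writeln('GetString equals original:  ', FromBytes = Expected);
end.
{code}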

-- 
Remy Lebeau (TeamB)
Remy
9/29/2011 7:57:00 PM