Using TMemoryStream to read a UTF8 file and saving it to string

I thought this would be easy in XE2; but, apparently, it is more
involved than that.

What I am trying to do is this:

-- Read a UTF-8 file (with BOM) into a stream (TMemoryStream or
TStringStream).

-- then, copy the content of the stream into string.

Trouble is, when the contents are copied into the string, the BOM chars
at the beginning are copied along with them, which I don't need.

I tried to find a suitable function in TEncoding, but I am not sure
there is one.

I am looking for something that recognizes the encoding of the incoming
data, especially if there's a BOM in it, and converts it to a string
properly (removing the BOM, of course).

What can I use?

[Edited the subject line]
Adem
8/25/2012 10:36:33 AM
embarcadero.delphi.general

Adem Meda wrote:

> I am looking for something that recognizes the encoding of the incoming
> data, especially if there's BOM in it, and convert it to string
> properly (of course, removing BOM stuff).

Try using TStringList. If I remember correctly, you can pass the encoding
as a parameter to LoadFromFile.
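
A minimal sketch of that suggestion, assuming XE2 unit scope names. The helper name LoadUtf8File is made up for illustration; the overload of LoadFromFile that takes a TEncoding is part of TStrings. As far as I know, a leading UTF-8 BOM is skipped during loading, so it does not end up in the string.

```pascal
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: loads a UTF-8 file (with or without BOM)
// and returns its content as a UnicodeString.
function LoadUtf8File(const FileName: string): string;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Decode the file as UTF-8; the BOM, if present, is not part of the result.
    Lines.LoadFromFile(FileName, TEncoding.UTF8);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;
```

Note that Lines.Text rebuilds the text from the individual lines, so line breaks are normalized and a trailing line break is appended. If that matters, decoding the bytes directly may be the better route.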

Dalija Prasnikar
Dalija
8/25/2012 11:09:57 AM
Dalija Prasnikar wrote:

> TStringList

This seems to do it.

FStringList.LoadFromFile(OpenDialog1.FileName, TEncoding.UTF8);
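
For the case in the subject line, where the data already sits in a TMemoryStream (or any other TStream), something along these lines might work. It is only a sketch: StreamToString is a hypothetical helper, and falling back to UTF-8 when no BOM is found is my assumption, not something the RTL decides for you. TEncoding.GetBufferEncoding detects the encoding from the BOM and returns the BOM length, which is then skipped when decoding.

```pascal
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: decodes the whole stream into a string,
// detecting the encoding from the BOM and leaving the BOM out.
function StreamToString(Stream: TStream): string;
var
  Bytes: TBytes;
  Encoding: TEncoding;
  BomLen: Integer;
begin
  Result := '';
  Stream.Position := 0;
  SetLength(Bytes, Stream.Size);
  if Length(Bytes) = 0 then
    Exit;
  Stream.ReadBuffer(Bytes[0], Length(Bytes));

  Encoding := nil; // nil asks GetBufferEncoding to detect from the BOM
  BomLen := TEncoding.GetBufferEncoding(Bytes, Encoding);

  // Assumption: a file without a BOM is treated as UTF-8 here.
  if BomLen = 0 then
    Encoding := TEncoding.UTF8;

  // Decode everything after the BOM.
  Result := Encoding.GetString(Bytes, BomLen, Length(Bytes) - BomLen);
end;
```

Usage would be something like MyString := StreamToString(MyMemoryStream) after MyMemoryStream.LoadFromFile(...). The whole file is held in memory twice for a moment (stream plus byte array), so this only suits files of moderate size.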

Thank you.
Adem
8/25/2012 11:31:15 AM
Hello,

I have the exact same problem with a huge UTF8 file that cannot be loaded completely into a TStringList.
So, how would you go about loading it in portions (say 1MB)?

Do I use TFileStream with a TEncoding?

Thanks for any assistance!
bob
11/30/2012 3:32:08 PM
bob wrote:

> I have the exact same problem with a huge UTF8 file that cannot load
> completely into a TStringList. So, how would you go about loading it
> by portions (say 1MB) ?

It really depends on what you need to do with the file.  If you just need
to read it one line at a time, look at the TStreamReader class.  You can
pass a TEncoding to its constructor, and it has a ReadLine() method.  But
if you need random access to the file, then you are better off using a
memory-mapped file instead and decoding the data manually as needed.  Look
at the Win32 API CreateFileMapping() and MapViewOfFile() functions.
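
A rough sketch of the line-at-a-time approach, assuming XE2 unit scope names. CountLines is a hypothetical example routine; replace the counter with whatever per-line processing is needed. The third constructor argument (DetectBOM) is set to True here so that a BOM at the start of the file is recognized rather than returned as part of the first line.

```pascal
uses
  System.SysUtils, System.Classes;

// Hypothetical example: reads a large UTF-8 file line by line
// without loading it into memory as a whole.
function CountLines(const FileName: string): Integer;
var
  Reader: TStreamReader;
  Line: string;
begin
  Result := 0;
  // UTF-8 is used for decoding; DetectBOM = True lets the reader
  // recognize a BOM at the start of the file.
  Reader := TStreamReader.Create(FileName, TEncoding.UTF8, True);
  try
    while not Reader.EndOfStream do
    begin
      Line := Reader.ReadLine;
      // Process Line here; this sketch only counts the lines.
      Inc(Result);
    end;
  finally
    Reader.Free;
  end;
end;
```

Only one buffer's worth of the file is held in memory at a time, so the total file size should not matter much. This does not cover the random-access case, which would need the memory-mapped approach mentioned above.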


--
Remy Lebeau (TeamB)
Remy
11/30/2012 6:00:47 PM