UTF8, UTF-8, utf8, Utf8 encoding blues

Hi All,

I'm reading loads and loads of very confusing and contradictory information
about UTF8 in Perl.  A lot of posts are also (rightfully IMHO) stating that
UTF8 is an absolute nightmare in Perl.

Can someone shed some light as to what is going on here please:

use Encoding;

SysLog("debug", "1 - DEBUG LENGTH: " . length($Response));
my $unicode_chars = Encode::decode('utf8', $Response);
SysLog("debug", "** ENCODING: " . find_encoding($Response));
my $newunicode_chars = substr($unicode_chars, 0, -3);
my $Body = $newunicode_chars;

Log:
Nov  8 11:44:59 cache12 perl[44786]: DEBUG: 1 - DEBUG LENGTH: 1001 
Nov  8 11:44:59 cache12 perl[44786]: DEBUG: ** ENCODING:  
Nov  8 11:44:59 cache12 perl[44786]: DEBUG: 2 - DEBUG LENGTH: 998

The idea is to remove the last three characters from the string (.\r\n).
Now whilst it looks like it worked because the length is 3 less, the
encoding is entirely whacked.  The encoding at the beginning and the
encoding at the end are different.  find_encoding() does not state which
encoding is used on the string initially, yet Perl is apparently more than
happy to decode it as utf8. When Perl then re-encodes the string as utf8,
the result is simply wrong and the data does not match its CRC checksums.

I know for a FACT that the initial data is encoded using UTF8. When I remove
the code to strip the last 3 characters (.\r\n) from the $Response
everything works absolutely fine.  Unfortunately, I *must* remove these last
three characters.

Can anyone perhaps please shed some light on the subject for me?


savage
11/8/2014 9:53:44 AM

Hi Chris,

On Sat, 8 Nov 2014 11:53:44 +0200
Chris Knipe <savage@savage.za.org> wrote:

> Hi All,
> 
> I'm reading loads, and loads of very confusing and contradicting information
> about UTF8 in Perl.  A lot of posts are also (rightfully IMHO) stating that
> UTF8 is an absolute nightmare in Perl. 
> 
> Can someone shed some light as to what is going on here please:
> 

Can you provide a self-contained, reproducible example that follows best practices?

See http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/ .

Otherwise - your example is hard to reproduce.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Optimising Code for Speed - http://shlom.in/optimise

English spelling aims to be consistent. Publicly and methodically.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
shlomif
11/8/2014 3:52:21 PM
Hi Chris,

On Sat, 8 Nov 2014 18:23:03 +0200
Chris Knipe <savage@savage.za.org> wrote:

> Hi,
> 
> Yes, sorry.  This is an entire mess :-(
> 
> I get the content correctly (I have confirmed that numerous times
> through various different ways), but printing the strings out via
> STDOUT messes things up.
> 
> Sorry for the attachment and all of that (I'm desperate, my entire
> company is currently offline due to this and I've been at it now for
> close to 24 hours), but the attached is a sample of the data I receive
> via a socket, and (to test) I have written it to disk:
> 
> open(my $out, '>:raw', $filename) or die "Unable to open: $!";
> binmode($out);
> print $out $Body;
> close($out);

It's hard to discern from your attachments what's going on. I got two files,
c108fca64135d6162654d1625cbbc063-D and test.bin, and the UNIX "file" command
tells me both are "data". I don't see any self-contained, reproducible code
anywhere. Make sure all these hold for it:

* http://perl-begin.org/tutorials/bad-elements/

* http://www.catb.org/~esr/faqs/smart-questions.html

* http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/

Perhaps you'd like to attach a .zip file or whatever with everything I need?

Regards,

	Shlomi Fish

> 
> That works.  I've used tools to verify the CRC of the yEnc, and the
> CRC matches fine.  So I do receive the data correctly but I am
> apparently not sending it correct from perl back out to the client via
> a print
> 
> When I print this out via STDOUT however (perl script that runs under
> xinetd) the data (output) is messed up completely.  I've tried playing
> with Encode (which breaks other things), Encoding, binmode... I'm at
> my wits end.   I change something and it works, and then almost by
> magic, a few hours later (or days) it just stops working again...
> 
> At this stage I am almost at the point where I'm prepared to have
> someone look at the code and fix it for me ($$$), this is really
> urgent and hurting us badly as a company.  I've been at this now for
> close to 24 hours so my eyes and brains may be overlooking things too
> I suppose.
> 
> If we say, $Output = one of the attached files,
> 
> Surely,
> 
> use utf;
> binmode (STDOUT) or binmode(STDOUT, ':utf8') or binmode(STDOUT,
> ':encoding(utf8)' (can't remember the exact encoding syntax right now)
> print STDOUT $Output;
> 
> Should work... But the data is simply not correct.
> 
> It acts as a proxy, so I receive a request, I collect the data from
> the remote server (capture the packets on the wire with tcpdump), I
> send the data to the client through perl connected to STDOUT (and
> capture it on the wire with tcpdump).
> 
> The entire encoding of the data received from the first tcpdump to the
> parent server, and the tcpdump to the client is WAY different...
> 
-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
What Makes Software Apps High Quality -  http://shlom.in/sw-quality

In the Technion, there are many ways to get from one place to the other, but
they are all the same length.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
shlomif
11/8/2014 4:42:30 PM
On 9 November 2014 05:42, Shlomi Fish <shlomif@shlomifish.org> wrote:

> > Should work... But the data is simply not correct.
> >
> > It acts as a proxy, so I receive a request, I collect the data from
> > the remote server (capture the packets on the wire with tcpdump), I
> > send the data to the client through perl connected to STDOUT (and
> > capture it on the wire with tcpdump).
> >
> > The entire encoding of the data received from the first tcpdump to the
> > parent server, and the tcpdump to the client is WAY different...
>

This segment here suggests you have a stream of data which is not all in a
single encoding, but perhaps, you have pure binary packets at some point in
the stream, with markers saying "500 bytes from here are utf8 bytes" or
something, followed by those utf8 encoded bytes.

Though I haven't fully understood the problem and I'm also tired, so my
tip could be a red herring.

The good news is if you want to remove only 3 *bytes* from the string
instead of 3 *characters* then that could be straightforward.

And I believe ".\r\n" might be exactly 3 bytes regardless of unicode
magics. ( That is, depending on what you're doing you could get away
without the utf8 transformation, but I really don't know what I'm talking
about now )
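
A minimal sketch of that byte-level approach, assuming the response is still the raw byte string read from the socket. The helper name strip_terminator and the sample payload are made up for illustration; nothing here is taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: drop a trailing ".\r\n" from a raw byte string.
# No Encode::decode() is involved, so the remaining bytes are untouched.
sub strip_terminator {
    my ($bytes) = @_;
    $bytes =~ s/\.\r\n\z//;
    return $bytes;
}

# Made-up payload: "caf", the two UTF-8 bytes for e-acute, then ".\r\n".
my $response = "caf\xC3\xA9.\r\n";
my $body     = strip_terminator($response);

printf "%d -> %d bytes\n", length($response), length($body);
```

Because the string is never decoded, length() counts bytes here, and the bytes that remain are exactly what arrived.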

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL

kentfredric
11/8/2014 4:59:34 PM
Hi Kent,

> Though I haven't fully understood the problem and I'm also tired, so my tip
> could be a red herring.
>
> The good news is if you want to remove only 3 *bytes* from the string
> instead of 3 *characters* then that could be straight forward.
>
> And I believe ".\r\n" might be exactly 3 bytes regardless of unicode magics.
> ( That is, depending on what you're doing you could get away without the
> utf8 transformation, but I really don't know what I'm talking about now )

I agree with you - and it also explains what we are seeing, in that
some data comes through clean and other data doesn't.  I too
expect that the *entire stream* is not encoded with UTF8 (even though
it should be).

In terms of removing the last three characters, that is not what is
causing the issue.  Even if I remove the substr and pass literally
$Body = $Response, the data is still corrupted once it goes out via
STDOUT.

What is also VERY strange to me is that for some reason when I just do
something simple like put a use Encoding into the script, everything
works fine.  Then half an hour later, or a day or two later, and it
stops working and starts becoming corrupt again.  And THIS is what has
my mind completely baffled.  I am the only one with access to the
servers (and code), and I am not even logged in on the machines when
this sudden "change" in behaviour happens.

This is however really urgent to me, and I am not by any means the
"expert" in terms of programming either.  As I've stated in one or two
private emails, I am willing to pay someone to look at the code and
fix this for us.  I'm not asking you to do my homework either - it's a
legitimate issue going on here in a semi-big application (and it
worked fine for about a year and a half before all of a sudden just
acting up for some reason).  I won't be surprised if this is an OS
issue even.


-- 

Regards,
Chris Knipe
savage
11/8/2014 5:05:31 PM
Chris:

On Sat, Nov 8, 2014 at 12:05 PM, Chris Knipe <savage@savage.za.org> wrote:
> I agree with you - and it also explains what we are seeing in
> terms of that certain data comes through clean, and others
> doesn't.  I too expect that the *entire stream* is not encoded
> with UTF8 (even though it should be).
>
> In terms of removing the last three characters, that is not
> what is causing the issue.  Even if I remove the substr and
> pass literally $Body = $Response, the data is still corrupted
> once it goes out via STDOUT.
>
> What is also VERY strange to me is that for some reason when I
> just do something simple like put a use Encoding into the
> script, everything works fine.  Then half an hour later, or a
> day or two later, and it stops working and starts becoming
> corrupt again.  And THIS is what has my mind completely
> baffled.  I am the only one with access to the servers (and
> code), and I am not even logged in on the machines when this
> sudden "change" in behaviour happens.

It looks like this thread is quiet for 2 days so does that mean
that you have solved your problem off-list, given up/failed, or
it is still in the works?

It doesn't really look like anybody skimmed through the basics of
UTF-8 support in Perl in this thread. Whether or not you have
already done that yourself appears to be uncertain, but it seems
clear from the OP that you aren't comfortable with it and aren't
confident in your understanding of it. I'm not an expert at
Unicode programming, or Perl, or Unicode programming in Perl, but
I have done a little bit of basic UTF-8 handling and for the most
part understand the basics of the API.

A few basics:

# The utf8 pragma is only useful if you want to write Perl source
# code directly in UTF-8. It doesn't affect how you read from or write
# to streams. It affects how the source code itself, including
# string literals, is understood.

use utf8;

# The Encode module contains the core API for dealing with
# various encodings.

use Encode;

# This assumes that $foo is a UTF-8 encoded byte string. To Perl
# $foo is just an 8-bit string. $bar becomes a proper Perl
# character-based string with all of the Unicode characters being
# understood by Perl.

my $bar = Encode::decode('utf8', $foo);

# This is obviously the reverse of the above. $bar is considered
# a Perl string where Perl properly understands each character.
# $foo becomes a binary byte-string containing UTF-8 encoded
# characters as sequences of bytes.

my $foo = Encode::encode('utf8', $bar);

__END__

Perl has this concept of its strings being flagged as "UTF-8". In
short, strings are normally not flagged as containing UTF-8 data,
and are assumed to be an 8-bit encoding (e.g., US-ASCII or
Latin1). When a string is decoded in Perl (i.e., Encode::decode)
Perl will decide if the string it is decoding needs to be
internally represented using a multi-byte encoding and
automatically does so. It is transparent to the programmer. In
such a case, that string would be internally flagged as being in
UTF-8, and future character-based operations on that string would
automatically take this into account for you. Vice versa, when a
character-based string is "encoded" you are turning these magic
strings back into a fixed byte-based string of raw data (these
bytes may or may not be characters).
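
To make the byte-string versus character-string distinction concrete, here is a small sketch. The sample bytes are made up; utf8::is_utf8() is only used to peek at the internal flag and is not something normal code should depend on.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up sample: the two UTF-8 bytes for e-acute (U+00E9), then "!".
my $bytes = "\xC3\xA9!";

my $chars = decode('UTF-8', $bytes);   # byte string -> character string
my $again = encode('UTF-8', $chars);   # character string -> byte string

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
print "internal flag set: ", (utf8::is_utf8($chars) ? "yes" : "no"), "\n";
print "round trip intact: ", ($again eq $bytes ? "yes" : "no"), "\n";
```

The same data has length 3 before decoding and length 2 after, which is why substr() and length() give different answers depending on which side of decode() you call them on.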

The thing that we programmers need to understand is that Perl
doesn't know what the encoding of data coming from the outside is
so we have to tell it how to interpret (i.e., decode) it. After
that it does everything automatically. Similarly, Perl doesn't
know what encoding the outside world can handle so we need to
tell it how to encode data whenever we share data with the
outside world (i.e., outside of our application).
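
As a rough sketch of "telling Perl how to encode on the way out", the example below writes to in-memory filehandles so it can be run anywhere. The sample string is made up. The same layer names can be given to binmode(), e.g. binmode(STDOUT, ':raw') when the data is already bytes that must pass through unchanged, but whether that applies to your script is an assumption on my part.

```perl
use strict;
use warnings;

my $text = "caf\x{E9}";   # a character string: 4 characters

# Writing through an :encoding(UTF-8) layer encodes on the way out.
my $encoded = '';
open(my $out, '>:encoding(UTF-8)', \$encoded) or die "open: $!";
print {$out} $text;
close($out);

# Writing already-encoded bytes through a :raw layer leaves them as-is.
my $raw = '';
open(my $bin, '>:raw', \$raw) or die "open: $!";
print {$bin} $encoded;
close($bin);

printf "characters: %d, bytes written: %d, raw copy: %d\n",
    length($text), length($encoded), length($raw);
```

The point is to pick one place where encoding happens. Printing already-encoded bytes through an :encoding(UTF-8) or :utf8 layer would encode them a second time, which is one common way for output to end up corrupted.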

Note also that the encoding 'utf8' is "magic" in the sense that
it is not strict in its interpretation of UTF-8. As far as I
know, invalid characters will be silently ignored or perhaps just
silently included as is or something along those lines. Check the
documentation to be sure. I prefer to enforce UTF-8 strictness.

If you know that your data is supposed to be *valid* UTF-8 then I
would consider changing those 'utf8' encodings to literally
'UTF-8' (case-sensitive, IIRC). The 'UTF-8' encoding is the
strict mode that will signal errors when invalid characters are
read. Perhaps that will shed some light on the issue. Or not...
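
One caveat, as far as I can tell from the Encode documentation: decode() only dies on malformed input when it is given a CHECK argument such as Encode::FB_CROAK. With no CHECK, bad bytes are replaced with U+FFFD instead. A minimal sketch of a strict check follows; the helper name decode_strict and the sample strings are made up.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns the decoded string, or undef if the
# bytes are not well-formed UTF-8.
sub decode_strict {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # decode() from modifying its input in place.
    my $chars = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return $chars;
}

my $good = "caf\xC3\xA9s";   # valid UTF-8
my $bad  = "caf\xE9s";       # a lone Latin-1 byte, not valid UTF-8

for my $sample ($good, $bad) {
    my $decoded = decode_strict($sample);
    print defined($decoded) ? "valid\n" : "invalid\n";
}
```

Running something like this over a captured response would show quickly whether the stream really is UTF-8 text or contains binary data that should never be decoded at all.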

What you really need is a specification for the data that you're
reading. If you don't know what you're reading then it's
basically impossible to properly read it.

Regards,


-- 
Brandon McCaig <bamccaig@gmail.com> <bamccaig@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bambams.ca/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
bamccaig
11/10/2014 11:21:43 PM
Hi Brandon,

On Mon, 10 Nov 2014 18:21:43 -0500
Brandon McCaig <bamccaig@gmail.com> wrote:

> Chris:
>
> On Sat, Nov 8, 2014 at 12:05 PM, Chris Knipe <savage@savage.za.org> wrote:
> > I agree with you - and it also explains what we are seeing in
> > terms of that certain data comes through clean, and others
> > doesn't.  I too expect that the *entire stream* is not encoded
> > with UTF8 (even though it should be).
> >
> > In terms of removing the last three characters, that is not
> > what is causing the issue.  Even if I remove the substr and
> > pass literally $Body = $Response, the data is still corrupted
> > once it goes out via STDOUT.
> >
> > What is also VERY strange to me is that for some reason when I
> > just do something simple like put a use Encoding into the
> > script, everything works fine.  Then half an hour later, or a
> > day or two later, and it stops working and starts becoming
> > corrupt again.  And THIS is what has my mind completely
> > baffled.  I am the only one with access to the servers (and
> > code), and I am not even logged in on the machines when this
> > sudden "change" in behaviour happens.
>
> It looks like this thread is quiet for 2 days so does that mean
> that you have solved your problem off-list, given up/failed, or
> it is still in the works?
>

We have already solved the problem off-list. It turned out the problem was
elsewhere.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
List of Portability Libraries - http://shlom.in/port-libs

Fortran - there isn’t a way to do it... oh wait! Now there is.
    -- http://www.shlomifish.org/humour/ways_to_do_it.html

Please reply to list if it's a mailing list post - http://shlom.in/reply .
shlomif
11/11/2014 8:18:57 AM