Oracle encoding query

--Sig_/QO_13dCNFdvWu7rezRC4cx=
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

I have an Oracle databse with NLS_CHARACTERSET US7ASCII

I have a table "land", like

create table land (
  c_land               number     (4) not null,
  zoekarg              varchar2   (5),
  land                 varchar2  (40),
  diac                 number     (1),
  d_in                 number     (8),
  d_end                number     (8),
  opm                  varchar2 (150),
  mut                  number     (1),
  icao                 varchar2   (3),
  land_u               varchar2  (80)
  );

In there the field "land" has the content *without* special characters,
like

  Zuidrhodesie

the special characters are stored elsewhere, indicated by the field diac

The field land_u contains the extended land *with* diacriticals, like

  Zuidrhodesi=C3=AB
             ^

in UTF-8 encoding. Note that this is possible because of US7ASCII

what is stored in the database is

  Zuidrhodesi\303\253

using perl to extract that makes displaying those values easy, but I
have no idea how I can get programs like SQL Developer to show that
content the way it is intended

Is there an oracle function I could use to convert byte-encode UTF-8
to something SQL developer "understands"

  select utf8_bytes_to_utf16_for_sql_developer (land_u) from land;

any hint is welcome. Currently SQL Developer will show

  Zuidrhodesi=EF=BF=BD=EF=BF=BD

where the trailing =EF=BF=BD's are both \x0fffd (\N{REPLACEMENT CHARACTER}),
which is not really helpful

I have been playing with several variants of

  select convert (land_u, 'AL16UTF16', 'UTF8') from land where c_land =3D 7=
072;

but I didn't get SQL Developer to show the =C3=AB

--=20
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.27   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

--Sig_/QO_13dCNFdvWu7rezRC4cx=
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBAgAGBQJaKAO0AAoJEAOhR6E+XcCYGBYH/0CQHH5bKK17QE9Skt5FhQM0
laH2wzk6ae3nn4DeOKSOdCtFsEEBYuKJHrhuWg788O1cwtUGvwu+XNYvDmVMINxP
s1WhlfBNXkUhrg0WXhun4xTIJEiDY+tZgSyor/wryzw3I5YxISjyw/guIekYsp5X
E6dg1wWT3hQzOJ6kmPqQyEpBFr2IVUXkb9SPJ4sdOQShAEtUkjvuG09Rhj8t/lUA
RqSfzH394bRkXJ/XqEeMl+M2j6HwVPKc514ed4Kbx+uymZ2/doWxVmhEX4WEkBqR
rxQMbgW9vJTB8RmfUvgJzrPIbpItTi5zVp2SM/Ry9cwK2pwCdMJdJyN8cwMmw14=
=5fz1
-----END PGP SIGNATURE-----

--Sig_/QO_13dCNFdvWu7rezRC4cx=--
0
h
12/6/2017 2:50:11 PM
perl.dbi.users 11047 articles. 1 followers. Follow

4 Replies
7 Views

Similar Articles

[PageSpeed] 38

--Sig_/j/PSdt2pMRJECaJFNv6SL5D
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wed, 6 Dec 2017 15:50:11 +0100, "H.Merijn Brand"
<h.m.brand@xs4all.nl> wrote:

> I have an Oracle databse with NLS_CHARACTERSET US7ASCII
>=20
> I have a table "land", like
>=20
> create table land (
>   c_land               number     (4) not null,
>   zoekarg              varchar2   (5),
>   land                 varchar2  (40),
>   diac                 number     (1),
>   d_in                 number     (8),
>   d_end                number     (8),
>   opm                  varchar2 (150),
>   mut                  number     (1),
>   icao                 varchar2   (3),
>   land_u               varchar2  (80)
>   );
>=20
> In there the field "land" has the content *without* special characters,
> like
>=20
>   Zuidrhodesie
>=20
> the special characters are stored elsewhere, indicated by the field diac
>=20
> The field land_u contains the extended land *with* diacriticals, like
>=20
>   Zuidrhodesi=C3=AB
>              ^
>=20
> in UTF-8 encoding. Note that this is possible because of US7ASCII
>=20
> what is stored in the database is
>=20
>   Zuidrhodesi\303\253
>=20
> using perl to extract that makes displaying those values easy, but I
> have no idea how I can get programs like SQL Developer to show that
> content the way it is intended
>=20
> Is there an oracle function I could use to convert byte-encode UTF-8
> to something SQL developer "understands"
>=20
>   select utf8_bytes_to_utf16_for_sql_developer (land_u) from land;
>=20
> any hint is welcome. Currently SQL Developer will show
>=20
>   Zuidrhodesi=EF=BF=BD=EF=BF=BD
>=20
> where the trailing =EF=BF=BD's are both \x0fffd (\N{REPLACEMENT CHARACTER=
}),
> which is not really helpful
>=20
> I have been playing with several variants of
>=20
>   select convert (land_u, 'AL16UTF16', 'UTF8') from land where c_land =3D=
 7072;
>=20
> but I didn't get SQL Developer to show the =C3=AB

 select utl_encode.text_encode (land_u) from land where c_land =3D 7072;

=3D>

 Zuidrhodesi=3DC3=3DAB

so SQL developer *does* see the individual bytes as they are stored

I can get the expected display with

 select utl_i18n.raw_to_nchar (utl_i18n.string_to_raw (land_u), 'utf8') fro=
m land where c_land =3D 7072;

which experiences as rather overcomplicated, esp if I need to do this
for all _u fields in the query :(

--=20
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.27   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

--Sig_/j/PSdt2pMRJECaJFNv6SL5D
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBAgAGBQJaKAj7AAoJEAOhR6E+XcCY0LUH/jZ8t/LLcKYweyCNnP370Kl1
hkT2QvXchAoBAwkDUUZqrsz0d7WdVnD/lGEB9zT6+cvyI5QM1VABjCwJvhzfVXR4
/Q7g1dd37q+8UQ3wZrOOAEhyBv3juus0lKhm5eivFnDUSAQJ6jkQyZR9N3XychXE
16R1W1GKZG2tRZ4LAsMWrweRPvMtZeZOos/YfNxhLF2oLv7uKwDfrcrPlHPZakID
wauDpH2pQTsSHhGJrejWpXtzYYCzgJAZmj4N3Xs+FAone3SWdMVwAx3q9DAf4SWP
npS6YSHw2znhXFxKQz5wLSybmPMrp7wflpq5zglNyl/gROhqXF0MLPHVCN3L4M8=
=qdcB
-----END PGP SIGNATURE-----

--Sig_/j/PSdt2pMRJECaJFNv6SL5D--
0
h
12/6/2017 3:12:53 PM
On Wed, Dec 06, 2017 at 04:12:53PM +0100, H.Merijn Brand wrote:
> On Wed, 6 Dec 2017 15:50:11 +0100, "H.Merijn Brand"
> >=20
> > I have been playing with several variants of
> >=20
> >   select convert (land_u, 'AL16UTF16', 'UTF8') from land where c_land=
 =3D 7072;
> >=20
> > but I didn't get SQL Developer to show the =EB
>=20
>  select utl_encode.text_encode (land_u) from land where c_land =3D 7072=
;
>=20
> =3D>
>=20
>  Zuidrhodesi=3DC3=3DAB
>=20
> so SQL developer *does* see the individual bytes as they are stored
>=20
> I can get the expected display with
>=20
>  select utl_i18n.raw_to_nchar (utl_i18n.string_to_raw (land_u), 'utf8')=
 from land where c_land =3D 7072;
>=20
> which experiences as rather overcomplicated, esp if I need to do this
> for all _u fields in the query :(

A view might be useful. Or perhaps define your own function to wrap the e=
xpression.

Tim.
0
Tim
12/6/2017 3:34:17 PM
--Sig_/fySbRug/RjqINrMRfEE9FCW
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wed, 6 Dec 2017 15:34:17 +0000, Tim Bunce <Tim.Bunce@pobox.com>
wrote:

> On Wed, Dec 06, 2017 at 04:12:53PM +0100, H.Merijn Brand wrote:
> > On Wed, 6 Dec 2017 15:50:11 +0100, "H.Merijn Brand" =20
> > >=20
> > > I have been playing with several variants of
> > >=20
> > >   select convert (land_u, 'AL16UTF16', 'UTF8') from land where c_land=
 =3D 7072;
> > >=20
> > > but I didn't get SQL Developer to show the =C3=AB =20
> >=20
> >  select utl_encode.text_encode (land_u) from land where c_land =3D 7072;
> >  =20
> > =3D> =20
> >=20
> >  Zuidrhodesi=3DC3=3DAB
> >=20
> > so SQL developer *does* see the individual bytes as they are stored
> >=20
> > I can get the expected display with
> >=20
> >  select utl_i18n.raw_to_nchar (utl_i18n.string_to_raw (land_u), 'utf8')=
 from land where c_land =3D 7072;
> >=20
> > which experiences as rather overcomplicated, esp if I need to do this
> > for all _u fields in the query :( =20
>=20
> A view might be useful. Or perhaps define your own function to wrap the e=
xpression.
>=20
> Tim.

That worked very well!

CREATE OR REPLACE FUNCTION diac_u (v VARCHAR2) RETURN NCHAR IS
    BEGIN
    RETURN utl_i18n.raw_to_nchar (utl_i18n.string_to_raw (v), 'utf8');
    END;
/

select diac_u (land_u) from land where c_land =3D 7072;

=E2=87=92

Zuidrhodesi=C3=AB

Thank you!

--=20
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.27   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

--Sig_/fySbRug/RjqINrMRfEE9FCW
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBAgAGBQJaKBKLAAoJEAOhR6E+XcCY8jIIAKut485VRQkSJZebCHKAhGeY
aZ5MszgvgDpJBHCYgidjANvJb99BIzilvoP4+ldUwvempneguXo2BmiVvY8jiyH+
/+MKoCiF2JqrpkazNHIn8D/EAloTgN5jyUC7ef6p0rqPWuN1lA5UeHewM8kKPDJI
1ZWyV6ph91b3Bu4xqh2WcglpWHpkIrbohlQ2HcMQ+dhVCzD6DLQQ6G9OcMJWN50m
93N4swoBK56wM6gnvdlRP64fH7dAD2XY2HgJZCW580vyO3byluNyNzC8KF2PZ9lR
aaBd4HmfZI+1OJAqoEnVPTN/PnKfdJwl+jyaMFpw6Gy91whvNMyOy3oEgK38YXo=
=rIyb
-----END PGP SIGNATURE-----

--Sig_/fySbRug/RjqINrMRfEE9FCW--
0
h
12/6/2017 3:53:36 PM
On Wed, Dec 06, 2017 at 04:53:36PM +0100, H.Merijn Brand wrote:
> On Wed, 6 Dec 2017 15:34:17 +0000, Tim Bunce <Tim.Bunce@pobox.com>
> wrote:
> 
> > A view might be useful. Or perhaps define your own function to wrap the expression.
> 
> That worked very well!
> 
> CREATE OR REPLACE FUNCTION diac_u (v VARCHAR2) RETURN NCHAR IS
>     BEGIN
>     RETURN utl_i18n.raw_to_nchar (utl_i18n.string_to_raw (v), 'utf8');
>     END;
> /
> 
> select diac_u (land_u) from land where c_land = 7072;

You might want to add DETERMINISTIC before the IS so the function can be
used more efficiently in some cases.

Tim.
0
Tim
12/11/2017 1:15:39 PM
Reply: