[ID 20010920.003] 5.5.3 vs 5.6.1: A result of split() changed when split's regexp has capture-()s

Hi !

Description
===========

The result of split() changed when split's regexp contains capturing ()s:

5.005_03 - array elements for unmatched ()s were undef
5.6.1    - array elements for unmatched ()s are ''
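To make the difference concrete, here is a minimal sketch with a smaller pattern than the one in the test script. The helper name describe_fields and the sample string are made up for illustration only; the pattern has two alternative capture groups, so on every match exactly one of them does not participate.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a string and describe every returned element,
# showing 'undef' for elements that are not defined.
sub describe_fields {
    my ($string) = @_;
    # Two alternative capture groups; only one can participate per match,
    # so split() returns one "unmatched" element for every separator.
    my @fields = split /(a)|(b)/, $string;
    return map { defined $_ ? "'$_'" : 'undef' } @fields;
}

print join(', ', describe_fields('1a2b3')), "\n";
```

On a perl with the undef behavior this should print '1', 'a', undef, '2', undef, 'b', '3'; on 5.6.1, going by the behavior described above, the two undef entries would show up as '' instead.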

The test case is from real life: parsing HTML &...; character entities.

The test script source and perl -V output are below.


How to reproduce
================

../test -script 'aaa &#123; bbb &#x01CF; ccc &nbsp; ddd'

Output:

5.005_03                       5.6.1
---------                      ------
       
field 'aaa '                   field 'aaa '
$1 UN-defined                  $1 defined and is ''
$2 defined and is '123'        $2 defined and is '123'
$3 UN-defined                  $3 defined and is ''
                        
field ' bbb '                  field ' bbb '
$1 UN-defined                  $1 defined and is ''
$2 UN-defined                  $2 defined and is ''
$3 defined and is '01CF'       $3 defined and is '01CF'
                        
field ' ccc '                  field ' ccc '
$1 defined and is 'nbsp'       $1 defined and is 'nbsp'
$2 UN-defined                  $2 defined and is ''
$3 UN-defined                  $3 defined and is ''
                        
field ' ddd'                   field ' ddd'


Perl -V for 5.005_03
---------------------
Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
  Platform:
    osname=linux, osvers=2.2.10, archname=i586-linux
    uname='linux fatou 2.2.10 #2 smp thu jul 15 15:03:02 mest 1999 i686 unknown '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
  Compiler:
    cc='cc', optimize='-O2 -pipe', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-Dbool=char -DHAS_BOOL -I/usr/local/include'
    ccflags ='-Dbool=char -DHAS_BOOL -I/usr/local/include'
    stdchar='char', d_stdstdio=undef, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Built under linux
  Compiled at Jul 22 1999 21:20:02
  @INC:
    /usr/lib/perl5/5.00503/i586-linux
    /usr/lib/perl5/5.00503
    /usr/lib/perl5/site_perl/5.005/i586-linux
    /usr/lib/perl5/site_perl/5.005
    .


Perl -V for 5.6.1
---------------------
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=freebsd, osvers=4.3-release, archname=i386-freebsd-thread
    uname='freebsd serega.citycat.ru 4.3-release freebsd 4.3-release #0: mon jun 25 12:27:41 msd 2001 root@serega.citycat.ru:usrsrcsyscompileserega i386 '
    config_args='-d -Dusethreads -Duse5005threads'
    hint=previous, useposix=true, d_sigaction=define
    usethreads=define use5005threads=define useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
    optimize='-O',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.3 [FreeBSD] 20010315 (release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-pthread -Wl,-E  -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib
    libs=-lm -lc_r -lcrypt -lutil
    perllibs=-lm -lc_r -lcrypt -lutil
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fpic', lddlflags='-shared  -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Compile-time options: USE_THREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under freebsd
  Compiled at Jul  6 2001 19:15:27
  @INC:
    /usr/local/lib/perl5/5.6.1/i386-freebsd-thread
    /usr/local/lib/perl5/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.1/i386-freebsd-thread
    /usr/local/lib/perl5/site_perl/5.6.1
    /usr/local/lib/perl5/site_perl
    .


Test script source:
----------------------
#!/usr/bin/perl

# Split on HTML entities; each alternative captures into its own group ($1, $2, $3).
@A = split(/&(?:([a-zA-Z0-9]+)|#([0-9]+)|#[xX]([0-9a-fA-F]+));/go,$ARGV[0]);

while(@A)
 {
  print "field '",shift(@A),"'\n";

  last unless(@A);

  foreach (qw(1 2 3))
   {
    $a = shift(@A);
    print '$',$_,' ';
    if (defined $a)
     {
      print "defined and is '$a'\n";
     }
    else
     {
      print "UN-defined\n";
     }
   }
  print "\n";
 }
----------------------  

--------------------------------------
Pavel Yakovlev
mailto: hac@subscribe.ru ICQ 8085803
PPY-RIPN PY125-RIPE
--------------------------------------
"When God talks with me it's the miracle. When I talk with God it's the madness"


hac
9/20/2001 1:45:31 PM
perl.perl5.porters

On Sep 20, hac said:

>A result of split() changed when split's regexp has capture-()s
>
>5.005_03 - array elements for unmatched () was undef
>5.6.1    - array elements for unmatched () is ''

I patched this for bleadperl about a month or so ago.  It'll be
"fixed" (it will go back to the unmatched-()-returns-undef behavior) in the
next release of Perl.
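
Until then, code that has to run on both versions could treat undef and '' the same way. The following is only a sketch, and it assumes that every capture group in the pattern must match at least one character (true for the pattern in the report, where each group uses +), so an empty string can never be a real capture. The helper name group_matched is hypothetical, and the hex class is written as a-f here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true only if a capture group really participated
# in the match. Treats both undef (5.005_03) and '' (5.6.1) as "no match".
# Assumes no group in the pattern can legitimately capture an empty string.
sub group_matched {
    my ($value) = @_;
    return defined($value) && length($value) ? 1 : 0;
}

# Same pattern shape as in the report: named, decimal and hex entities.
my @parts = split /&(?:([a-zA-Z0-9]+)|#([0-9]+)|#[xX]([0-9a-fA-F]+));/,
    'aaa &#123; bbb';

# $parts[0] is the text before the entity; $parts[1..3] are the three groups.
for my $n (1 .. 3) {
    print "\$$n ", (group_matched($parts[$n]) ? "matched: '$parts[$n]'" : "did not match"), "\n";
}
```

With this check the script reports the same groups as matched whether the unmatched ones come back as undef or as ''.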

-- 
Jeff "japhy" Pinyan      japhy@pobox.com      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **

jeffp
9/20/2001 3:20:38 PM