binary test and position?

Hi All,

On a type Buf, what do I use to check for the
position of a byte pattern?


Many thanks,
-T
perl6
2/2/2019 3:22:13 AM

This would work:

    my $b = Buf.new( 0,0,0, 1, 2, 0 );
    my $match = Buf.new( 1, 2 );

    $b.rotor( $match.elems => 1 - $match.elems ).grep(* eqv $match.List, :k)

If you only need the first one, swap out `grep` for `first`
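To make the `rotor` call a little less opaque, here is a minimal sketch of the windows it produces. The pair `$n => 1 - $n` means "take `$n` elements, then step back `$n - 1`", so each window starts one element after the previous one. The variable `$n` is only introduced here for readability, and I have not run this against any particular Rakudo release.

```raku
my $b     = Buf.new( 0, 0, 0, 1, 2, 0 );
my $match = Buf.new( 1, 2 );
my $n     = $match.elems;

# print every overlapping window of $n bytes, one per line
.say for $b.rotor( $n => 1 - $n );

# :k asks for the index of the first matching window instead of the window itself
say $b.rotor( $n => 1 - $n ).first( * eqv $match.List, :k );
```

The index of a window is also the byte offset at which it starts, which is why `:k` gives the position directly.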

Another iffy option is to decode it as latin1

    $b.decode('latin1').index($match.decode('latin1'))
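One reason the decode route is iffy, sketched under the assumption that the data may contain a CR LF byte pair: a Perl 6 `Str` counts graphemes, and CR LF is treated as a single grapheme, so `index` reports a character offset that can be smaller than the byte offset. The byte values below are made up for the illustration.

```raku
my $b     = Buf.new( 0x0D, 0x0A, 0x41, 0x42 );   # CR, LF, "A", "B"
my $match = Buf.new( 0x41, 0x42 );

my $str = $b.decode('latin1');

say $b.elems;                                # 4 bytes in the buffer
say $str.chars;                              # 3, because CR LF is one character
say $str.index( $match.decode('latin1') );   # 1, although the bytes start at offset 2
```

For data that cannot contain such sequences the two offsets agree, which is presumably why the string route can appear to work on a given file.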

On Fri, Feb 1, 2019 at 9:22 PM ToddAndMargo via perl6-users
<perl6-users@perl.org> wrote:
>
> Hi All,
>
> On a type Buf, what do I use to check for the
> position of a byte pattern?
>
>
> Many thanks,
> -T
b2gills
2/2/2019 3:35:53 AM
On 2/1/19 7:22 PM, ToddAndMargo via perl6-users wrote:
> Hi All,
> 
> On a type Buf, what do I use to check for the
> position of a byte pattern?
> 
> 
> Many thanks,
> -T


Basically, what am I doing wrong here?

$ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
$handle.read(5); say $b; say $b[2..4];; if ( $b[2..4] eq 0x90,0x00,0x04 
) {say "y";} else {say "n"}; $handle.close;'
Buf[uint8]:0x<4D 5A 90 00 03>
(144 0 3)
y


I am testing to see if the pattern 0x90 0x00 0x04 exists,
which it does not.
perl6
2/2/2019 3:37:27 AM
 >
 > On Fri, Feb 1, 2019 at 9:22 PM ToddAndMargo via perl6-users
 > <perl6-users@perl.org> wrote:
 >>
 >> Hi All,
 >>
 >> On a type Buf, what do I use to check for the
 >> position of a byte pattern?
 >>
 >>
 >> Many thanks,
 >> -T

On 2/1/19 7:35 PM, Brad Gilbert wrote:
> This would work:
> 
>      my $b = Buf.new( 0,0,0, 1, 2, 0 );
>      my $match = Buf.new( 1, 2 );
> 
>      $b.rotor( $match.elems => 1 - $match.elems ).grep(* eqv $match.List, :k)
> 
> If you only need the first one, swap out `grep` for `first`
> 
> Another iffy option is to decode it as latin1
> 
>      $b.decode('latin1').index($match.decode('latin1'))

I am trying to avoid decoding.

match is giving me a fit about not being an Any:


$ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
$handle.read(5); say $b; say $b[2..4]; if ( $b.match( 0x90,0x00,0x04 ) ) 
{say "y";} else {say "n"}; $handle.close;'

Buf[uint8]:0x<4D 5A 90 00 03>
(144 0 3)
Invocant of method 'match' must be a type object of type 'Any', not an 
object instance of type 'Buf[uint8]'.  Did you forget a 'multi'?
   in block <unit> at -e line 1





-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Computers are like air conditioners.
They malfunction when you open windows
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
perl6
2/2/2019 3:45:05 AM
On 2/1/19 7:37 PM, ToddAndMargo via perl6-users wrote:
> On 2/1/19 7:22 PM, ToddAndMargo via perl6-users wrote:
>> Hi All,
>>
>> On a type Buf, what do I use to check for the
>> position of a byte pattern?
>>
>>
>> Many thanks,
>> -T
> 
> 
> Basically, what am I doing wrong here?
> 
> $ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
> $handle.read(5); say $b; say $b[2..4];; if ( $b[2..4] eq 0x90,0x00,0x04 
> ) {say "y";} else {say "n"}; $handle.close;'
> Buf[uint8]:0x<4D 5A 90 00 03>
> (144 0 3)
> y
> 
> 
> I am testing to see if the pattern 0x90 0x00 0x04 exists,
> which is does not.


Okay, no error now, but the WRONG answer:

$ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
$handle.read(5); say $b; if ( $b[2..4] == Buf.new(0x90,0x00,0x04) ) {say 
"y";} else {say "n"}; $handle.close;'

Buf[uint8]:0x<4D 5A 90 00 03>
y




perl6
2/2/2019 3:50:37 AM
`eq` is string equality
`==` is numeric equality

a Buf is neither.

You want `eqv` (equivalent)

    $b[2..4] eqv (0x90,0x00,0x04)
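A short sketch of why the earlier `eq` test printed `y`, using the five bytes shown in the thread. `eq` binds tighter than the comma, so the condition was really a three-element list, and a non-empty list is true. The name `@parsed-as` is only there for the demonstration. (The `==` attempt presumably printed `y` because both sides were numified before being compared; take that explanation as an assumption.)

```raku
my $b = Buf.new( 0x4D, 0x5A, 0x90, 0x00, 0x03 );

# parsed as ( ($b[2..4] eq 0x90), 0x00, 0x04 )
my @parsed-as = ( $b[2..4] eq 0x90,0x00,0x04 );
say @parsed-as.elems;                    # 3

# eqv compares the two lists element by element
say $b[2..4] eqv (0x90, 0x00, 0x03);     # True
say $b[2..4] eqv (0x90, 0x00, 0x04);     # False
```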

b2gills
2/2/2019 3:57:27 AM
On 2/1/19 7:57 PM, Brad Gilbert wrote:
> `eq` is string equality
> `==` is numeric equality
> 
> a Buf is neither.
> 
> You want `eqv` (equivalent)
> 
>      $b[2..4] eqv (0x90,0x00,0x04)
> 

That was it.  Thank you!

$ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
$handle.read(5); say $b; if ( $b[2..4] eqv (0x90,0x00,0x03) ) {say "y";} 
else {say "n"}; $handle.close;'
Buf[uint8]:0x<4D 5A 90 00 03>
y

$ p6 'my $handle=open("filever.exe", :bin, :ro); my Buf $b; $b= 
$handle.read(5); say $b; if ( $b[2..4] eqv (0x90,0x00,0x04) ) {say "y";} 
else {say "n"}; $handle.close;'
Buf[uint8]:0x<4D 5A 90 00 03>
n





perl6
2/2/2019 4:03:25 AM
How do I find the position of a pattern in a Buf?
perl6
2/2/2019 4:07:14 AM
Need `pos` for Buf
perl6
2/2/2019 4:26:10 AM
Actually I do believe it is the binary equivalent of `index` I
am looking for

perl6
2/2/2019 5:01:21 AM
    sub buf-index ( Buf $buf, +@match ) {
        my $elems = @match.elems;
        $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
    }

    my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);

    say buf-index( $buf, (0x90, 0x00, 0x03)); # 2
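One thing to watch when using the result, sketched here with the same sub and buffer: a match at position 0 is a valid answer but is false in a plain `if`, so it is safer to test for definedness, for example with `with`. The pattern `(0x4D, 0x5A)` is chosen only to produce a match at the very start.

```raku
sub buf-index ( Buf $buf, +@match ) {
    my $elems = @match.elems;
    $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
}

my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);

# `with` runs its block for any defined value, including 0
with buf-index( $buf, (0x4D, 0x5A) ) -> $pos {
    say "found at $pos";
}
else {
    say "not found";
}
```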

b2gills
2/2/2019 2:09:18 PM
On 2/2/19 6:09 AM, Brad Gilbert wrote:
>      sub buf-index ( Buf $buf, +@match ) {
>          my $elems = @match.elems;
>          $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
>      }
> 
>      my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);
> 
>      say buf-index( $buf, (0x90, 0x00, 0x03)); # 2

Hi Brad,

Thank you!

Did you forget the "return" in the sub?

And, would this be faster or slower than converting to
a string and using "index"?

-T
perl6
2/3/2019 3:06:34 AM
On 2/2/19 6:09 AM, Brad Gilbert wrote:
>      sub buf-index ( Buf $buf, +@match ) {
>          my $elems = @match.elems;
>          $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
>      }
> 
>      my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);
> 
>      say buf-index( $buf, (0x90, 0x00, 0x03)); # 2

What did I do wrong?

First I did a byte wise conversion of

    Buf $BinaryFile   to   Str $StrFile

and

    Buf $VersionInfoBuf  to  Str $VersionInfoStr



sub Buf-Index ( Buf $Buffer, +@SubBuf ) {
    # `index` for buffers
    # $Buffer is the buffer to search through
    # $ +@SubBuf is the sub buffer pattern to search for in $Buffer
    # returns the first instance of a match, Nil if no match

    my Int $Position = Nil;
    my $Elems = @SubBuf.elems;

    $Position = $Buffer.rotor( $Elems => 1 - $Elems ).first( * eqv @SubBuf.List, :k );
    return $Position;
}

    $i  = index(     $StrFile,    $VersionInfoStr );
    $bi = Buf-Index( $BinaryFile, $VersionInfoBuf );
    say "i = <$i>   bi = <$bi>";




$ FileVer.pl6
i = <11371>   bi = <>


11371 is correct.



What did I do wrong?

Many thanks,
-T
perl6
2/3/2019 3:49:53 AM
A sub does not need a `return` statement if it is returning the last value.

You also broke the return value of the subroutine that I wrote by
assigning it to a variable.

What I wrote would return `Nil` if it failed to find a match, yours
will return an undefined `Int`.
It should return `Nil` because that is what `index` would return.
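A minimal sketch of that difference, using two made-up subs (`first-even-direct` and `first-even-via-int` are illustrative names, not part of anyone's code in this thread):

```raku
sub first-even-direct ( @values ) {
    @values.first( * %% 2, :k )              # last expression is the return value
}

sub first-even-via-int ( @values ) {
    my Int $position = Nil;                  # typed container: Nil turns into Int
    $position = @values.first( * %% 2, :k );
    return $position;
}

say first-even-direct(  (1, 3, 5) ).^name;   # Nil
say first-even-via-int( (1, 3, 5) ).^name;   # Int
```

Both results are undefined, but only the first one is actually `Nil`.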

Doing bytewise conversion from Buf to Str is pointless. It will break
on Unicode data.
It would also be exactly the same as converting ASCII if it worked.
(It won't work on binary data)

If you are dealing with something that is mostly Unicode but also has
binary data
decode using 'utf8-c8'.

If you are dealing with something that is mostly binary, decode using 'latin1',
or just use raw bytes in a buffer.

    my Buf $buffer = $fh.read(10);
    my Str $string = $buffer.decode('latin1');

    # the first three bytes were really a Utf8 encoded character
    my Str $char = $string.substr(0,3).encode('latin1').decode('utf8');
    # or
    my Str $char = $buffer.subbuf(0,3).decode('utf8');

Also note that `encode` doesn't always return a Buf.

    my Buf $buf = Buf.new( 'hello'.encode('utf8') );
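To see what `encode` hands back in the utf8 case, a quick check along these lines should do (a sketch; the string and the byte written into it are arbitrary):

```raku
my $encoded = 'hello'.encode('utf8');

say $encoded.^name;        # utf8
say $encoded ~~ Blob;      # True
say $encoded ~~ Buf;       # False

# copying it into a Buf gives something that can be modified
my Buf $buf = Buf.new( $encoded );
$buf[0] = 0x48;            # 'H'
say $buf.decode('utf8');   # Hello
```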

---

The subroutine I wrote was simplified to work for an Array or List, not a Buf.

It is also weird that you are using CamelCase for variables,
and a mixture of CamelCase and kebab-case for the subroutine name.


Improving your variant, and changing it so the second parameter is a Buf.

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
        my List $Matcher = $SubBuf.List; # only call .List once
        my Any $Position is default(Nil) = Nil;
        my Int $Elems = $Matcher.elems;

        $Position = $Buffer.rotor($Elems => 1 - $Elems).first(* eqv $Matcher, :k);
        return $Position;
    }

`$Position` has to be `Any` (or `Mu`) so that it can store the value `Nil`.
`Nil` sets a variable to its default, so we have to change the default
with `is default(Nil)`.
(The normal default is the same as the container type)
(The `= Nil;` is always pointless in the declaration of a `$scalar` variable.)

One simplification is to just have the return value as the last thing
in the subroutine without a `return`.
(It may also be slightly faster, but not by much.)

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
        my List $Matcher = $SubBuf.List;
        my Any $Position is default(Nil) = Nil;
        my Int $Elems = $Matcher.elems;

        $Position = $Buffer.rotor($Elems => 1 - $Elems).first(* eqv $Matcher, :k);
        $Position; # <------------
    }

Assignment is an rvalue (it evaluates to the assigned value), so we can remove that last line

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
        my List $Matcher = $SubBuf.List;
        my Any $Position is default(Nil) = Nil;
        my Int $Elems = $Matcher.elems;

        $Position = $Buffer.rotor($Elems => 1 - $Elems).first(* eqv $Matcher, :k);
        # <-----------
    }

`$Position` is now completely pointless.

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
        my List $Matcher = $SubBuf.List;
        my Int $Elems = $Matcher.elems;

        $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
        # ^
    }

If you want `return` (even though it isn't doing anything)

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
        my List $Matcher = $SubBuf.List;
        my Int $Elems = $Matcher.elems;

        return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
        # ^
    }

You could also declare the type of the return value

    sub Buf-Index ( Buf $Buffer, Buf $SubBuf --> Int ) { # <----
        my List $Matcher = $SubBuf.List;
        my Int $Elems = $Matcher.elems;

        return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
    }

Note that `Nil` can sneak around the return value type check.
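A small sketch of what "sneak around" means here. The sub `position-of` is a made-up helper for the illustration: it declares `--> Int`, yet a failed search still comes back as `Nil` without a type error.

```raku
sub position-of ( @values, $wanted --> Int ) {
    return @values.first( * == $wanted, :k );    # Nil when nothing matches
}

say position-of( (10, 20, 30), 20 );             # 1
say position-of( (10, 20, 30), 99 ).^name;       # Nil
```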

---

As an added bonus, here is a subroutine that returns all of the indices.
(Note that the only differences are `grep` rather than `first`, and
the return type)

    sub Buf-Indices ( Buf $Buffer, Buf $SubBuf --> Seq ) {
        my List $Matcher = $SubBuf.List;
        my Int $Elems = $Matcher.elems;

        return $Buffer.rotor($Elems => 1- $Elems).grep(* eqv $Matcher, :k);
    }
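A usage sketch for the sub above, with a made-up buffer in which the pattern overlaps itself. Because the windows overlap, both starting positions should be reported.

```raku
sub Buf-Indices ( Buf $Buffer, Buf $SubBuf --> Seq ) {
    my List $Matcher = $SubBuf.List;
    my Int $Elems = $Matcher.elems;

    return $Buffer.rotor($Elems => 1 - $Elems).grep(* eqv $Matcher, :k);
}

my $data = Buf.new( 1, 1, 1, 2 );

# the pattern 1,1 starts at offset 0 and again at offset 1
say Buf-Indices( $data, Buf.new( 1, 1 ) );   # (0 1)
```

Since the result is a `Seq`, it can only be iterated once; call `.List` on it if the positions are needed more than once.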


b2gills
2/3/2019 5:29:16 AM

Thank you!

I am going to have to read it over slowly!

-T
perl6
2/3/2019 6:33:32 AM
On 2/2/19 9:29 PM, Brad Gilbert wrote:
> It is also weird that you are using CamelCase for variables,
> and a mixture of CamelCase and snake-case for the subroutine name.

Hi Brad,

An explanation.  I do this for "maintainability".

I have been able to "type" since high school typing
class.   Upper and lower are no difference in speed
to me.  (I do realize that "hunt and peckers" go nuts
with upper and lower case.)

So, Camel Case tells me instantly that a variable and
sub name came from me and are not something I copied
from somewhere else (zef), or a reserved word or a
built-in sub name.  There is no "Did I write this
or ...?" when I go months later to fix or add
something.

Since I come from Modula 2, I live and breathe modules.
Awful nice to be able to know at a glance if I wrote
something or I used someone else's module.

And I absolutely ADORE the way Perl 6 does their modules.
Well, importing them needs some work.  They need to
restore Perl 5's functionality where you can explicitly
state which subs you are importing from a module:

      use Term::ANSIColor qw ( BOLD BLUE RED GREEN RESET );
      use Term::ReadKey qw ( ReadKey ReadMode );

Again, for maintainability, so I know where the blazes
something came from.

I work around the problem with my own imposed comments:

      use CurlUtils;    # qw[ CurlDownloadFile  CurlGetWebSite CurlGetHeader CurlExists CurlSendMail, CurlGetRedirectUrl ];

and

    #`{

      Interface to curl from Perl 6

      To use these, place the following at the top(ish) of your program
         use lib "/home/linuxutil";
         use CurlUtils;  # qx[ CurlDownloadFile  CurlGetWebSite CurlGetHeader CurlExists CurlSendMail ]
       }

Oh ya, and I live and die in "Top Down", so I GOT TO HAVE my subs!

Perl is a "Write Only language" if you let it be.

-T
perl6
2/3/2019 10:31:31 AM
On 2/2/19 9:29 PM, Brad Gilbert wrote:
> Subs do not need to have a `return` statement if it is returning the last value.
> 
> You also broke the return value of the subroutine that I wrote by
> assigning it to a variable.
> 
> What I wrote would return `Nil` if it failed to find a match, yours
> will return an undefined `Int`.
> It should return `Nil` because that is what `index` would return.
> 
> Doing bytewise conversion from Buf to Str is pointless. It will break
> on Unicode data.
> It would also be exactly the same as converting ASCII if it worked.
> (It won't work on binary data)
> 
> If you are dealing with something that is mostly Unicode but also has
> binary data
> decode using 'utf8-c8'.
> 
> If you are dealing with something that is mostly binary, decode using 'latin1',
> or just use raw bytes in a buffer.
> 
>      my Buf $buffer = $fh.read(10);
>      my Str $string = $buffer.decode('latin1');
> 
>      # the first three bytes were really a Utf8 encoded character
>      my Str $char = $string.substr(0,3).encode('latin1').decode('utf8');
>      # or
>      my Str $char = $buffer.subbuf(0,3).decode('utf8');
> 
> Also note that `encode` doesn't always return a Buf.
> 
>      my Buf $buf = Buf.new( 'hello'.encode('utf8') );
> 
> ---
> 
> The subroutine I wrote was simplified to work for an Array or List, not a Buf.
> 
> It is also weird that you are using CamelCase for variables,
> and a mixture of CamelCase and snake-case for the subroutine name.
> 
> 
> Improving your variant, and changing it so the second parameter is a Buf.
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List; # only call .List once
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          return $Position;
>      }
> 
> `$Position` has to be `Any` (or `Mu`) so that it can store the value `Nil`.
> `Nil` sets a variable to its default, so we have to change the default
> with `is default(Nil)`.
> (The normal default is the same as the container type)
> (The `= Nil;` is always pointless in the declaration of a `$scalar` variable.)
> 
> One simplification is to just have the return value as the last thing
> in the subroutine without a `return`.
> (It may also be slightly faster, but not by much.)
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          $Position; # <------------
>      }
> 
> An assignment evaluates to the assigned value, so we can remove that last line
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # <-----------
>      }
> 
> `$Position` is now completely pointless.
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # ^
>      }
> 
> If you want `return` (even though it isn't doing anything)
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # ^
>      }
> 
> You could also declare the type of the return value
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf --> Int ) { # <----
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>      }
> 
> Note that `Nil` can sneak around the return value type check.
> 
> ---
> 
> As an added bonus, here is a subroutine that returns all of the indices.
> (Note that the only differences are `grep` rather than `first`, and
> the return type)
> 
>      sub Buf-Indices ( Buf $Buffer, Buf $SubBuf --> Seq ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).grep(* eqv $Matcher, :k);
>      }
> 
> On Sat, Feb 2, 2019 at 10:05 PM ToddAndMargo via perl6-users
> <perl6-users@perl.org> wrote:
>>
>> On 2/2/19 6:09 AM, Brad Gilbert wrote:
>>>       sub buf-index ( Buf $buf, +@match ) {
>>>           my $elems = @match.elems;
>>>           $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
>>>       }
>>>
>>>       my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);
>>>
>>>       say buf-index( $buf, (0x90, 0x00, 0x03)); # 2
>>
>> What did I do wrong?
>>
>> First I did a byte wise conversion of
>>
>>      Buf $BinaryFile   to   Str $StrFile
>>
>> and
>>
>>      Buf $VersionInfoBuf  to  Str $VersionInfoStr
>>
>>
>>
>> sub Buf-Index ( Buf $Buffer, +@SubBuf ) {
>>      # `index` for buffers
>>      # $Buffer is the buffer to search through
>>      # $ +@SubBuf is the sub buffer pattern to search for in $Buffer
>>      # returns the first instance of a match, Nil if no match
>>
>>      my Int $Position = Nil;
>>      my $Elems = @SubBuf.elems;
>>
>>      $Position = $Buffer.rotor( $Elems => 1 - $Elems ).first( * eqv @SubBuf.List, :k );
>>      return $Position;
>> }
>>
>>      $i  = index(     $StrFile,    $VersionInfoStr );
>>      $bi = Buf-Index( $BinaryFile, $VersionInfoBuf );
>>      say "i = <$i>   bi = <$bi>";
>>
>>
>>
>>
>> $ FileVer.pl6
>> i = <11371>   bi = <>
>>
>>
>> 11371 is correct.
>>
>>
>>
>> What did I do wrong?
>>
>> Many thanks,
>> -T


Got it working.  Thank you!

It is a tad slow.  Depending on the file's size, it is
20 to 190 times slower than "index".

I have a lot of thinking to do.
0
perl6
2/5/2019 7:47:02 AM
`index` is an NQP op, which means in this case that it is written in C
(assuming you are using MoarVM)

https://github.com/MoarVM/MoarVM/blob/ddde09508310a5f60c63474db8f9682bc922700b/src/strings/ops.c#L557-L656

The code I gave for finding a Buf inside of another one was quickly
made in a way to prevent certain bugs from even being possible.


I rewrote it using knowledge of the internals to be a bit faster:

    sub blob-index ( Blob:D $buffer, Blob:D $needle, UInt:D $init = 0 --> Int ) {
        use nqp;
        my int $needle-width = $needle.elems;
        my int $elems = $buffer.elems;

        if $elems < $needle-width + $init {
            fail 'needle is larger than the buffer'
        }

        my uint $from = $init;
        my uint $to   = $from + $needle-width - 1;

        loop ( ; $to < $elems ; ++$from,++$to ) {
            return $from if $needle eq nqp::slice($buffer,$from,$to)
            # eq is safe since they are two Blobs/Bufs
        }
        return Nil
    }

It's faster mainly because it doesn't use an iterator.

This was quickly written.

Note that Buf is a subtype of Blob, so it will also work if given a Buf.
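
For anyone who would rather not `use nqp`, roughly the same loop can be sketched with `subbuf`. This is only an illustration: `subbuf-index` is a made-up name, it has not been benchmarked, and it returns `Nil` rather than failing when the needle does not fit.

```raku
# Sketch: first position of $needle in $buffer, using only public methods.
# subbuf-index is a hypothetical name, not part of Perl 6 itself.
sub subbuf-index ( Blob:D $buffer, Blob:D $needle, UInt:D $init = 0 ) {
    my $width = $needle.elems;

    # Try every start offset at which the needle still fits.
    # The range is empty when the needle is longer than what is left.
    for $init .. $buffer.elems - $width -> $from {
        return $from if $buffer.subbuf($from, $width) eq $needle;
    }
    Nil
}

my $buf = Buf.new(0x4D, 0x5A, 0x90, 0x00, 0x03);
say subbuf-index( $buf, Buf.new(0x90, 0x00, 0x03) );
```

It copies a small Buf on every step, so it trades speed for staying away from the internals.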

It could be written to be faster, but it still isn't going to be as
fast as `index`.
For one, `index` uses a subroutine named `knuth_morris_pratt_string_index`.
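
To give an idea of what that algorithm does, here is a rough Knuth-Morris-Pratt sketch over bytes in plain Perl 6. It is not the MoarVM code, and `kmp-index`, `@fail` and the other names are invented for the example. The point is that after a mismatch it reuses what it already matched instead of starting over one byte later.

```raku
# Sketch only: Knuth-Morris-Pratt search for one Blob inside another.
# kmp-index is a hypothetical name; this is not how MoarVM implements it.
sub kmp-index ( Blob:D $haystack, Blob:D $needle ) {
    my int $n = $haystack.elems;
    my int $m = $needle.elems;
    return 0   if $m == 0;     # an empty needle matches at the start
    return Nil if $m > $n;

    # @fail[$i] is the length of the longest proper prefix of the needle
    # that is also a suffix of the needle's first $i + 1 bytes.
    my @fail = 0 xx $m;
    my int $k = 0;
    loop ( my int $i = 1; $i < $m; ++$i ) {
        $k = @fail[$k - 1] while $k > 0 && $needle[$i] != $needle[$k];
        ++$k if $needle[$i] == $needle[$k];
        @fail[$i] = $k;
    }

    # Scan the haystack once; $k counts how many needle bytes match so far.
    $k = 0;
    loop ( my int $j = 0; $j < $n; ++$j ) {
        $k = @fail[$k - 1] while $k > 0 && $haystack[$j] != $needle[$k];
        ++$k if $haystack[$j] == $needle[$k];
        return $j - $m + 1 if $k == $m;
    }
    Nil
}

say kmp-index( Buf.new(0,0,0,1,2,0), Buf.new(1,2) );
```

Written in Perl 6 like this it will probably still lose to the C version by a wide margin; it only shows the idea.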

---

Also Unicode strings are not stored flat in MoarVM

    'a' x 10

That is stored something like the following internally.

    REPEAT( STR('a'), 10 )

What this means is that if you are looking for a 'b' for example, it
could check the first repetition for 'b', and skip the rest of the
REPEAT.
That would save 9 string comparisons in this example.

That is not how Bufs are stored in Perl6 at all.
They are declared with `is repr('VMArray')` which basically means they
are stored as C arrays.

0
b2gills
2/5/2019 3:55:45 PM
If you have glibc (probably yes for Linux or Mac, probably no for Windows),
you can call memmem():

use NativeCall;

sub memmem(Blob $haystack, size_t $haystacklen,
           Blob $needle,   size_t $needlelen --> Pointer) is native {}

sub buf-index(Blob $buffer, Blob $needle) {
    (memmem($buffer, $buffer.bytes, $needle, $needle.bytes) // return)
        - nativecast(Pointer, $buffer)
}

my $buf = Buf.new(0,0,0,1,2,0);
my $needle = Buf.new(1,2);

say buf-index($buf, $needle);

Curt

0
curt
2/5/2019 4:26:43 PM
On 2/5/19 7:55 AM, Brad Gilbert wrote:
> That is not how Bufs are stored in Perl6 at all.
> They are declared with `is repr('VMArray')` which basically means they
> are stored as C arrays.


Thank you!

Hmmmm.  Bufs are C arrays.  Interesting.

I still can't get away with

     my Buf $x=Buf.new(0x55, 0x66, 0x77, 0x78);
     my Buf $y=$x[ 1..2 ];

I have to use
     my Buf $x=Buf.new(0x55, 0x66, 0x77, 0x78);
     my Buf $y=$x.subbuf(1, 2); say $y;
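
The likely reason is that `$x[ 1..2 ]` gives back a List of Ints rather than a Buf, so it does not pass the `Buf` type constraint on `$y`. A small sketch of two ways around it, assuming nothing beyond `Buf.new` and `subbuf`:

```raku
my Buf $x = Buf.new(0x55, 0x66, 0x77, 0x78);

# The slice is a List of Ints; wrapping it in Buf.new makes it a Buf again.
my Buf $y = Buf.new( $x[ 1..2 ] );

# subbuf takes a start offset and a length and hands back a Buf directly.
my Buf $z = $x.subbuf(1, 2);

say $y;
say $z;
say $y eq $z;   # the two should hold the same bytes
```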

-T


-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Computers are like air conditioners.
They malfunction when you open windows
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0
perl6
2/6/2019 4:47:18 AM
On 2/5/19 8:26 AM, Curt Tilmes wrote:
>
> If you have glibc (probably yes for Linux or Mac, probably no for
> Windows), you can call memmem():
>
> use NativeCall;
>
> sub memmem(Blob $haystack, size_t $haystacklen,
>            Blob $needle,   size_t $needlelen --> Pointer) is native {}
>
> sub buf-index(Blob $buffer, Blob $needle) {
>     (memmem($buffer, $buffer.bytes, $needle, $needle.bytes) // return)
>         - nativecast(Pointer, $buffer)
> }
>
> my $buf = Buf.new(0,0,0,1,2,0);
> my $needle = Buf.new(1,2);
>
> say buf-index($buf, $needle);
>
> Curt
>

Thank you!
0
perl6
2/6/2019 4:49:20 AM