Jumbo Signatures extensions discussion

"Your scientists were so preoccupied with whether or not they could,
they didn't stop to think if they should."
    -- Dr Malcolm, Jurassic Park

I want to spend the next year or so adding lots of new features to Perl's
subroutine signatures. Before I start, I need a rough consensus on what
all the new features should look like. Within this thread I will shortly
be creating a number of sub-threads, one for each proposal (with a
suitably altered subject line). To keep things manageable, discussions
about each proposal should occur only within each sub-thread; i.e. don't
discuss things directly within this top-level thread. To discuss something
not covered by the sub-threads, create a new sub-thread with a suitable
new subject line.

The text of each proposal is in a vague pseudo-POD style, but wouldn't
ever pass a POD checker.

I have been slowly developing this set of proposals over the last couple
of years, taking into account: existing CPAN signature modules, Perl 6,
and various p5p discussions (particularly Zefram's ideas). What I have
ended up with is, I hope, a (mostly) coherent set of proposals that
all work with each other syntactically and semantically, and which I am
reasonably confident that I can implement efficiently.  Unless otherwise
stated, each proposal is my (near) last word on the subject, i.e. it's the
way I want things to be done, unless anyone can persuade me otherwise. The
main exception is the proposal on constraints, which I'm not at all
confident about.

I've tried to follow the Perl philosophy of making easy things easy and
hard things possible.

Given the long gestation period, I may have lost/forgotten the original
attributions for various suggestions, so sorry if I haven't credited you.

I intend to allow a month or so for discussions, then start work on
implementing things. I'll concentrate initially on proposals which break
backwards compatibility, with a firm goal of getting all of those into
5.32. Other improvements will come afterwards, and may not all make 5.32.
I'll leave constraints and optimisation till last.

After that plus 2 years, I expect signatures to stop being experimental.

Each proposal is mostly stand-alone, so you can, if you wish, just read
and discuss the ones that interest you (the complete set of proposals is
quite long). However, in terms of referring back to other proposals, they
are designed to be read in roughly the following order. The ones marked X
include potentially non-backwards-compatible suggestions, and so are
probably the ones you should concentrate on initially. You can always come
back in a month or whatever to discuss the others.

    Parameter Attributes
    Named Parameters
    Query Parameters
  X @_ Suppression
    Aliasing and Read-only variables
    Type and Value Constraints and Coercions
    Scope and Ordering
    Miscellaneous suggestions, including
        X Allow a shortcut for a 'default' default value
        X Whitespace
        X Duplicate parameter names should be an error

The "Miscellaneous suggestions" is a bit of a ragtag of random suggestions,
so it would be well worth reading it even if you don't have time to read the
whole set of proposals.

Finally, here is a contrived example of a signature which demonstrates
many of the proposals. Some of it will make more sense after each proposal
has been read.

sub foo (
    $self,
                     # parameter declarations starting with '?' examine,
                     # but don't consume any further arguments:

    ?*@args,         # @args is set to the equivalent of @_ with one
                     # arg shifted; i.e. like a slurpy but peeking ahead
                     # and not actually consuming any args; the '*' means
                     # that each @args element is aliased to a passed arg

    ?$peek_a,        # $peek_a is bound to next arg but without consuming it
    ??$has_a,        # $has_a is set to true if there's an argument for $a
    $a,              # normal scalar parameter
    ?{ print $a },   # embedded code block - runs code; doesn't shift args

    Dog $spot,       # works the same as 'my Dog $spot'
    $b :shared,      # you can use normal variable attributes
    $c :ro,          # at compile time, $c in lvalue context croaks

    \@ary,           # aliases @ary to a passed array reference
    \%hash,          # aliases %hash to a passed hash reference
    *$scalar,        # aliases $scalar to the next argument

                     # Constraints and coercions:
    $d!,             #    croak if $d not defined
    $e isa Foo::Bar, #    croak if $e isn't of that class
    $f is Int
      where $_ > 0,  #    croak if $f not a positive integer

    $x = 0,          # a default value
    $y?,             # short for $y = undef
    \@array?,        # short for \@array = []

    :$name1 = 0,
    :$name2 = 0,     # consume name-value pairs (name1 => ..., name2 => ....)

    @rest           # slurp up any remaining arguments
) { .... }



-- 
"You may not work around any technical limitations in the software"
    -- Windows Vista license
davem
11/28/2019 4:59:46 PM
perl.perl5.porters

=head2 Synopsis:

    sub f ($x :foo, $y :bar(baz) bar2(baz2), ...) { ... }

    analogous to:

    my $x :foo;
    my $y :bar(baz) bar2(baz2);


We should support parameter attributes. I think this is a relatively
uncontroversial proposal.

What exactly should the semantics be? Let's first review the current syntax
as applied to 'my' declarations:

    my ($x, $y) :foo(foo_arg) :bar(bar_arg);

is roughly equivalent to

    use attributes ();
    my ($x,$y);
    attributes->import(__PACKAGE__, \$x, "foo(foo_arg)", "bar(bar_arg)");
    attributes->import(__PACKAGE__, \$y, "foo(foo_arg)", "bar(bar_arg)");

except that some attributes are built-in and are recognised and handled
directly by the lexer / parser, without attributes.pm ever getting
involved.
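
To make the non-built-in path concrete, here is a minimal sketch of what
attributes.pm does today for an attributed 'my' variable. The package name
Demo, the attribute names Foo and Bar (capitalised, since all-lowercase
names are reserved for built-ins) and the @seen array are made up for
illustration; nothing here is part of the proposal itself.

```perl
use strict;
use warnings;

package Demo;

our @seen;    # hypothetical log of the attribute strings we were handed

# attributes->import() looks up this handler in the declaring package.
# It receives the package name, a reference to the variable, and each
# attribute as an uninterpreted string, argument included.
sub MODIFY_SCALAR_ATTRIBUTES {
    my ($pkg, $ref, @attrs) = @_;
    push @seen, @attrs;
    return;    # an empty list means every attribute was accepted
}

my $x :Foo(foo_arg) :Bar(bar_arg);

print "$_\n" for @seen;
```

The handler sees the strings "Foo(foo_arg)" and "Bar(bar_arg)"; a
parameter attribute would presumably reach a handler in the same way.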

Note that attributes.pm says that attributes on variables are currently
experimental, although in practice we've supported things like
my $x : shared; for years.

As an aside, note that any argument to the attribute is scanned as a
single-quoted string - i.e. like q(...) - but is otherwise uninterpreted
by Perl itself.  Thus hypothetically a constraint expressed as an
attribute, e.g.

    sub foo ($x :where($x ne '(' ));
    
wouldn't get correctly parsed unless we handled it specially somehow,
which seems to be an argument for *not* using attributes for such things,
and instead use purpose-designed syntax, like, for example:

    sub foo ($x where $x ne '(', ...)

Second and subsequent attributes may be preceded by a colon, but don't
have to be: these are equivalent:

    my $x :foo :bar(1) :bar(2);
    my $x :foo  bar(1) :bar(2);

Thus for signatures, the obvious semantics would be that

    sub f ($a :foo, ...) {...}
    
is equivalent to

    sub f { my $a :foo; $a = $_[0]; .... }

The exact details of when attributes->import() is called is discussed in
the "Scope and Ordering" thread.

Once available, built-in attributes could in principle be used where Perl
6 uses traits, e.g.

    sub f($x is ro) { ... } # Perl 6
    sub f($x :ro)   { ... } # Perl 5 ???

See the "Aliasing and Read-only variables" thread for more detailed
proposals.

Attributes can't be used on a placeholder parameter:

    ($x :foo) # ok
    ($  :foo) # error

Attributes can't be used with aliasing, except for slurpies (which alias
individual elements rather than the aggregate itself):

    (\$x :foo) # error
    (\@a :foo) # error
    (\%h :foo) # error
    (*$x :foo) # error
    (*@a :foo) # ok - like: my @a: foo;  \$a[0]  = ...;  \$a[1]  = ...
    (*%h :foo) # ok - like: my %h: foo;  \$h{..} = ...;  \$h{..} = ...


Note that in Perl 6 and some CPAN signature modules, the 'method' keyword
declares an implicit $self parameter, whose name can be overridden using
a postfix ':':

    method foo($x, $y)      { $self->{$x} = $y }  # implicit $self
    method foo($me: $x, $y) {   $me->{$x} = $y }  # explicit invocant

I have no plans to introduce such a 'method' keyword, but if we did,
we might need different syntax for the invocant: the ':' would be
interpreted as the start of an attribute unless the toker were clever and
we took great care that all signature syntax could be disambiguated.
davem
11/28/2019 5:01:00 PM
=head2 Synopsis:

    sub foo (
             $pos1,  # positional parameter; consumes 1 arg
             $pos2,  # positional parameter; consumes 1 arg
            :$name1, # named parameter, consumes ("name1", value) arg pair
            :$name2, # named parameter, consumes ("name2", value) arg pair
             @rest,  # consumes all unrecognised name/value pairs
        ) { ... }

This seems a popular feature request: give Perl 5 something similar to
Perl 6's named parameters:

    sub foo(:$name1, :$name2) { .... }
    foo(name2 => 200, name1 => 100);

The Perl 5 variant will have to be a bit different from Perl 6, in that
information about the signature is not available to the caller, i.e. the
perl parser has no concept of named arguments at the call site. So in
Perl 5, named arguments will in fact be just two arguments used as a
name/value pair, which are mapped to a single named parameter. In
particular, while these two are different in Perl 6:

    foo( name  => 100); # a single named arg, with value 100
    foo('name' => 100); # two positional args, with values 'name' and 100

they will be the same in Perl 5. Note also that Perl 5 will not do
compile-time checking of the caller's args for valid names.

I propose a fairly straightforward implementation for Perl 5: that in the
presence of at least one named parameter in the signature, any arguments
following all positional arguments are treated as name/value pairs, and
the values of recognised names are bound to the corresponding named
parameter. Any unrecognised pairs are left for a slurpy array or hash to
consume (or to croak if no slurpy). An odd number of named args will
croak.

So for example, this:

    sub foo($pos1, $pos2, :$named1, :$named2, %rest) {...}

is roughly equivalent to the following:

    sub foo($pos1, $pos2, %rest) {
        my $named1 = delete $rest{'named1'};
        my $named2 = delete $rest{'named2'};
        ...;
    }

but with proper default and error handling, and better performance.

Duplicate named arguments are allowed, the last value being used. This
allows the useful idiom of foo(%defaults, %options) to work.
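
As a rough illustration of the "proper default and error handling" part,
the following sketch emulates a signature with named parameters in today's
Perl. The error messages and the choice of one mandatory and one optional
named parameter are my own assumptions for the example; the real
implementation would not go via a hash.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hand-written emulation of the proposed signature
#     sub foo($pos1, $pos2, :$named1, :$named2 = 0) { ... }
# (no slurpy, so unrecognised names croak).
sub foo {
    croak "Too few arguments" if @_ < 2;
    my ($pos1, $pos2, @pairs) = @_;
    croak "Odd number of named arguments" if @pairs % 2;

    my %named = @pairs;    # for duplicate names the last value wins

    croak "Missing named argument 'named1'" unless exists $named{named1};
    my $named1 = delete $named{named1};
    my $named2 = exists $named{named2} ? delete $named{named2} : 0;

    croak "Unrecognised named argument(s): " . join(', ', sort keys %named)
        if %named;

    return ($pos1, $pos2, $named1, $named2);
}

my %defaults = (named1 => 'a', named2 => 'b');
my %options  = (named2 => 'c');
my @got = foo(1, 2, %defaults, %options);    # (1, 2, 'a', 'c')
```

Because %options is flattened after %defaults, its named2 overrides the
default one, which is the idiom described above.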

The ordering of parameter types within a signature would be extended to be:

    1) zero or more mandatory positional parameters, followed by
    2) zero or more optional positional parameters, followed by
    3) zero or more mandatory named parameters, followed by
    4) zero or more optional named parameters, followed by
    5) zero or one slurpy array or hash

Except that it would be a compile-time error to have both (2) and (3).

(3) and (4) are new. Note that there isn't any semantic need for any
optional named parameters to always follow all mandatory named parameters,
but including that restriction doesn't prevent you doing anything (as far
as I can see), provides consistency with positional parameters, and
potentially allows better optimisations.

Because named arguments can be supplied by the caller in any order, there
are issues as to which order default expressions are evaluated. To put
things on a firm footing, I propose a conceptual pre-sorting of the
argument list, followed by processing of parameters in strict
left-to-right order.  See also the "Query Parameters" thread for how this
sorting is important there too. (This sorting is just conceptual: the
actual implementation can do whatever is most convenient or efficient, as
long as it manifests the same visible behaviour.)

The sort uses the following rules:

    a) We start with the original raw argument list, i.e. before any
       default expressions have been run or added to the argument list.
    b) All positional arguments (which always come first) are left as-is.
    c) any remaining arguments are treated as key value pairs (with a croak
       if not even).
    d) The pairs are stable sorted into order of named parameters, but
       with any unrecognised pairs left in their original order at the
       end.  Duplicate recognised names are de-duplicated, keeping the
       right-most value; unrecognised names *aren't* de-duplicated.

    For example, with

        sub foo($p1, $p2, :$n1, :$n2, :$n3, :$n4 = undef, @rest) { ... }

        foo(1, 2, x => -1, n2 => -2, n1 => 11, y => 1002,
                    x => 1001, n3 => 13, n2 => 12)

    the args list gets sorted to:

        (1, 2,                          # positional args
         n1 => 11, n2 => 12, n3 => 13,  # sorted, de-duped named args
         x => -1, y => 1002, x => 1001) # unrecognised named args
         
Then each parameter is processed in turn in left-to-right order, consuming
the next argument or argument pair as appropriate, or evaluating the default
expression if the argument is missing.
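
The conceptual sort can be sketched in plain Perl. This is only a model of
rules (a) to (d), not a suggestion for how the implementation should work;
the helper name presort_args and its calling convention are invented for
the example.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: $npos is the number of positional parameters,
# $names the named parameters in signature order, @args the raw arguments.
sub presort_args {
    my ($npos, $names, @args) = @_;

    my @positional = splice @args, 0, $npos;                # rule (b)
    croak "Odd number of named arguments" if @args % 2;     # rule (c)

    my %is_named = map { $_ => 1 } @$names;
    my (%value, @unrecognised);
    while (my ($name, $val) = splice @args, 0, 2) {         # rule (d)
        if ($is_named{$name}) {
            $value{$name} = $val;             # de-dup: right-most value wins
        }
        else {
            push @unrecognised, $name, $val;  # original order, no de-dup
        }
    }

    # Emit recognised pairs in the order the parameters were declared.
    my @recognised;
    for my $name (@$names) {
        push @recognised, $name, $value{$name} if exists $value{$name};
    }

    return (@positional, @recognised, @unrecognised);
}

my @sorted = presort_args(
    2, [qw(n1 n2 n3 n4)],
    1, 2, x => -1, n2 => -2, n1 => 11, y => 1002,
          x => 1001, n3 => 13, n2 => 12,
);
```

For the call shown in the example above, @sorted comes out as
(1, 2, n1 => 11, n2 => 12, n3 => 13, x => -1, y => 1002, x => 1001).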

Note that for duplicate arguments, only the right-most value is likely to
be evaluated; for example foo(name => $tied1, name => $tied2) would likely
only call (tied $tied2)->FETCH(), although we don't guarantee this.

By pre-sorting the argument list into parameter order, then binding
arguments to parameters in L-R order, we have reduced any ordering issues
down to the same complexity as we already had with positional arguments,
e.g. whether earlier parameter variables are available for use by later
default expressions and in what order default expressions are evaluated.

Whereas in Perl 6, named parameters are optional by default, I think that
for Perl 5, making them mandatory by default makes more sense, for
consistency with positional parameters. If not, then we would need a new
syntax (like p6's :$foo!) to indicate mandatory.

=head2 Named parameter syntax

In the above discussion I have used the Perl 6 :$foo syntax, but we needn't
necessarily use that. We could potentially use a different character; or
more radically, we could divide the signature into two sections separated
by a semicolon. For example

    sub foo($p1, $p2, :$n1, :$n2, @slurpy)

would instead be written as:

    sub foo($p1, $p2;
            $n1, $n2, @slurpy)

I prefer the former. It's noisier, but conversely the noise makes the fact
that they're named parameters stand out.

Note that in neither case is the slurpy a named parameter: at no point
does the caller do 'foo(..., slurpy => ... )'. And in fact :@foo and :%bar
as parameters are compile-time errors.  If you want a named list, use a
named reference alias instead, e.g.

    sub foo(..., \:@coordinates);
    foo(..., coordinates => [1,2,3]);

For the (...; ...) version, we would need to decide whether an empty named
parameter list is legal. Perhaps allow it, in case we want in future to
add more ';'-separated sections to a signature.

In terms of characters, the following are already taken, or might be taken
under some of the other proposals:

    $ @ %    sigils
    \ *      aliasing
    ?        query parameter
    , )      signature syntax
    #        comment

Personally I think we should stick with ':'.

I don't think the ':' should be considered part of the sigil, and
whitespace should be allowed, e.g. (: $n1, : $n2). Note that Perl 6
doesn't allow whitespace.
davem
11/28/2019 5:01:42 PM
=head2 Synopsis:

    ?$x       peek ahead to the next arg
    ??$x      peek ahead and see if there is a next arg
    ?@a       peek ahead and copy all remaining args
    ?%h       peek ahead and copy all remaining key/val arg pairs
    ?{ code } execute some code without consuming any args

Sometimes you want to find out some extra information about the arguments
and the state of argument processing.  With @_ available, this can be
done, if messily; if @_ isn't populated (see the "@_ Suppression" thread),
then it becomes harder/impossible, unless some other mechanisms are added.
Such things include:

* For an optional argument, distinguishing between whether an undefined arg
  was supplied, or an arg wasn't supplied at all; e.g. in:

        sub get_or_set($self, $val = undef) { ... }

   how do you distinguish between these two:

        $foo->get_or_set();
        $foo->get_or_set(undef);

* Passing the original arguments to a superclass method after shifting
  the invocant, but ignoring default values; e.g. the equivalent of:

    sub foo {
        my $self = shift;
        my ($x,$y) = @_;
        return $self->SUPER::foo(@_) if $x + $y > 0;
    }

Query parameters are what I propose as a general mechanism to provide an
equivalent facility to the now unavailable @_.

Normally, each parameter element in the signature introduces a new lexical
variable, in some fashion binds that variable to the next argument, and
at the same time I<consumes> that argument (i.e. removes it from the list
of arguments remaining to be processed).

I propose that if a parameter element begins with '?' it becomes a "query
parameter". It still typically introduces a new lexical variable, but that
variable is set to the value of some query about the current state of
argument processing, I<without> any arguments being consumed. It is
effectively peeking ahead.

Note that '?' is not part of any sigil - it is the first character of the
parameter element; for example if we decide to allow typed lexicals, then
you would write

    sub f (? Int $argc, .....) { ... }
not 
    sub f (Int  ?$argc, .....) { ... }

This is a general escape mechanism, with the possibility to add whatever
new syntax we like following the '?'; but for now I propose the following
four forms. Note that most of these peek at the sorted/deduplicated arg
list (i.e. the one generated for named arguments), except ?@a, which uses
the original raw list.

=head2 Boolean query parameter

This is of the form ??$foo. It sets the specified parameter variable to a
boolean value indicating whether an appropriate argument is available to
bind to the next (non-query) parameter. For example:

    sub foo($self,  ??$has_x, $x = 0,  ??$also_y, $y = 0) { ... }

    $p->foo();         # $has_x false, $also_y false
    $p->foo(100);      # $has_x true,  $also_y false
    $p->foo(100,200);  # $has_x true,  $also_y true

Before a named parameter, it indicates that at least one available
name-value argument pair matches the name of that parameter:

    sub foo(??$has_x, :$x = 0,
            ??$has_y, :$y = 0,
            ??$has_z, :$z = 0)
        { ... }

    foo(y => 100); # $has_x false, $has_y true, $has_z false

Before a slurpy parameter, it indicates that at least one argument is left
to populate it.

[ I did think about making ??$foo an integer value rather than just a
boolean, indicating how many arguments are left to consume. But this
doesn't work well before a named parameter, because there may be arguments
left (so return a positive integer), yet none of those arguments happen to
pair with the next name (so return a false value).  ]
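
For comparison, this is roughly what the positional example has to look
like today, written against @_. It is a sketch only; the package name
QueryDemo is hypothetical and exists just to provide an invocant.

```perl
use strict;
use warnings;

package QueryDemo;

# Emulation of: sub foo($self, ??$has_x, $x = 0, ??$also_y, $y = 0) { ... }
# written against @_, which the proposal would make unavailable.
sub foo {
    my $self = shift;

    my $has_x  = @_ > 0;                 # peek: is there an arg for $x?
    my $x      = $has_x ? shift : 0;     # consume it, or use the default
    my $also_y = @_ > 0;
    my $y      = $also_y ? shift : 0;

    return ($has_x ? 1 : 0, $x, $also_y ? 1 : 0, $y);
}

my $p = bless {}, __PACKAGE__;
my @none  = $p->foo();         # (0, 0, 0, 0)
my @undef = $p->foo(undef);    # (1, undef, 0, 0): undef was passed explicitly
```

The second call is the case a default value alone cannot detect: $x is
undef, but the flag says an argument was supplied.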

=head2 Scalar query parameter

This is of the form ?$foo. The parameter $foo will be bound to the next
argument, but without consuming it. It behaves the same as ??$foo in terms
of what argument it examines.  If no suitable argument is available, it
will be set to undef. It may have a default value.

For example in the following, $self and $all[0] will have the same
value:

    sub foo(?$self, @all) { bar(@all) if $self->yes }

    # like:

    sub foo { my $self = $_[0]; bar(@_) if $self->yes }

=head2 Array query parameter

This is of the form ?@foo. @foo will be set to any remaining arguments,
without consuming them. So it behaves like a slurpy array, but doesn't
have to be the last parameter. For example:

    sub foo($self,  ?@point,  $x = 0,  $y = 0,  $z = 0) { 
        $self->bar(@point) if $x + $y + $z >= 1;
    }

    $p->foo(1); # calls $p->bar(1), not  $p->bar(1, 0, 0);

Unlike the other query types, this one always examines the raw argument
list, (i.e. before being sorted for named parameters). Because of this,
an array query parameter is forbidden from appearing anywhere to the right
of any named parameter.
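
The equivalent with @_ might look like the sketch below. The class name
PointDemo and the 'seen' bookkeeping are invented so that the effect can be
observed; unlike a real signature, this emulation does not croak on surplus
arguments.

```perl
use strict;
use warnings;

package PointDemo;    # hypothetical class for illustration

sub new { return bless { seen => [] }, shift }
sub bar { my $self = shift; push @{ $self->{seen} }, [@_]; return }

# Emulation of: sub foo($self, ?@point, $x = 0, $y = 0, $z = 0) { ... }
sub foo {
    my $self  = shift;
    my @point = @_;             # copy the remaining raw args, consume none
    my $x = @_ ? shift : 0;
    my $y = @_ ? shift : 0;
    my $z = @_ ? shift : 0;
    $self->bar(@point) if $x + $y + $z >= 1;
    return;
}

my $p = PointDemo->new;
$p->foo(1);    # bar() receives (1), not (1, 0, 0)
```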

Note that the query array can use any extra syntax which is applicable to
slurpy arrays; for example '*' indicates that the elements of the array
are aliased rather than copied (see the "Aliasing and Read-only variables"
thread). So in particular,

    sub foo(? *@args, ......) { ... }   # @args now simulates @_

=head2 Hash query parameter

This is most useful in the presence of named parameters. As discussed in
the "Named Parameters" thread, in the presence of named parameters, the
argument list first goes through a conceptual sorting process before being
applied to parameters in left-to-right order. Any arguments beyond the
positional parameters are treated as (name, value) pairs, and are stable
sorted in order of named parameters, with any unrecognised names going to
the end. Duplicate recognised names discard any earlier pairs.

A hash query parameter acts like a hash slurpy, but without consuming any
arguments. It is applied against the remaining part of the (conceptually)
sorted argument list, (compare that to an array query parameter, which is
applied against the original raw unsorted list). For example, in:

    sub foo (
         $p1,  $p2,         # positional parameters
        :$n1, :$n2,         # named      parameters
        ? %qslurpy,         # query      parameter
        :$n3 = 0, :$n4 = 0, # more named parameters
        %slurpy
    ) { ... }

    foo('a', 'b', n4 => 4, n1 => 1, other => 99, n2 => 2);

    %slurpy  contains (other => 99), and
    %qslurpy contains (n4 => 4, other => 99).
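
    A hand-rolled model of that example, using a hash in place of the
    conceptually sorted list (the error messages are my own invention):

```perl
use strict;
use warnings;
use Carp qw(croak);

# Emulation of:
#   sub foo($p1, $p2, :$n1, :$n2, ? %qslurpy, :$n3 = 0, :$n4 = 0, %slurpy)
sub foo {
    my ($p1, $p2, @pairs) = @_;
    croak "Odd number of named arguments" if @pairs % 2;
    my %rest = @pairs;    # order no longer matters once it is a hash

    for my $name (qw(n1 n2)) {
        croak "Missing named argument '$name'" unless exists $rest{$name};
    }
    my $n1 = delete $rest{n1};
    my $n2 = delete $rest{n2};

    my %qslurpy = %rest;    # peek at what is left without consuming it

    my $n3 = exists $rest{n3} ? delete $rest{n3} : 0;
    my $n4 = exists $rest{n4} ? delete $rest{n4} : 0;

    my %slurpy = %rest;     # whatever nobody claimed
    return (\%qslurpy, \%slurpy);
}

my ($q, $s) = foo('a', 'b', n4 => 4, n1 => 1, other => 99, n2 => 2);
```

    Here %$q holds (n4 => 4, other => 99) and %$s holds (other => 99).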

A hash query parameter can appear anywhere in the signature, including in
amongst positional parameters, but in that case it is a compile-time error
unless an even number of positional parameters follows it before the first
named parameter:

    sub foo($p1, ?%query, $p2, $p3, :$n1, :$n2) { ... }  # ok
    sub foo($p1, $p2, ?%query, $p3, :$n1, :$n2) { ... }  # compile-time err


=head2 Restrictions on query parameters.

Apart from the scalar version ?$foo, they cannot have default values:
??$foo = 0 makes no sense and slurpies (@foo and %foo) aren't allowed
default values anyway.

They cannot be a placeholder: i.e. these aren't legal: ??$, ?$, ?@, ?%.

The boolean form, ??$foo, cannot be aliased. The other types are ok.

They cannot be named parameters, e.g. ?:$foo.

They *can* have attributes; e.g.: ?$foo :ro.

Apart from the scalar version ?$foo, they cannot have constraints (see
the "Type and Value Constraints and Coercions" thread for details about
constraints).

A query parameter used at the end of a signature is a compile-time error.

=head2 Some considerations and alternative proposals

Here are some of the other suggestions and ideas for getting back some of
the information associated with the now abandoned @_, and why I rejected
them.

Note that I'm not fond of anything which implicitly creates variables, e.g.
an :argcount subroutine attribute which magically declares 'my $argc'.
Ditto for implicitly created predicate variables.

The following ticket has a discussion on predicates (i.e. something that
indicates whether a parameter was passed an argument):

    Subject: [perl #132444] parameter predicates in signatures

The main approaches suggested in that ticket are:

    * Follow the real variable with a predicate var preceded by a '?':

        sub abc ( $foo = undef, ?$has_foo) { ... }

      This is the basic approach I have taken, although I've generalised
      it into the concept of query parameters. Also with my proposal, the
      query parameter comes first.

    * Include the predicate var as part of the parameter:

        sub foo ($x, $y ? $has_y) { ... }

      which wasn't well liked.

    * Allow SVs to have an alternative 'undefined' state ('uninitialised',
      say) so that a parameter variable's state can be differentiated
      between undefined and never set. But what happens if such an
      uninitialised variable is used as an argument to a second function?
      Suddenly it's turtles all the way down.
      
      Also, it's not clear to me how this could be implemented.

    * use an attribute:

            sub foo($self, $x : passed($has_x) = undef)  { ... }

      but arguments to attributes are just single-quoted strings, so
      tricking the toker/parser into declaring 'my $has_x' would be messy.

    * get the equivalent of scalar(@_) from caller(). I hate this. It's
      clunky, and using the argument count to determine whether a
      parameter is passed is messy.
davem
11/28/2019 5:02:35 PM
=head2 Synopsis:

    @_ will not be set, unset or localised on entry to or exit from
    a signatured sub; it will still be the @_ of the caller. But any use
    of @_ within the lexical scope of such a sub will warn.

At the moment we still populate @_ on signature sub calls, which entails
considerable overhead. There have been some discussions about when @_
should be populated, and if it isn't, what value (if any) it should have.
See:

    http://nntp.perl.org/group/perl.perl5.porters/233053

I propose the following:

1) @_ is *always* untouched by the call to the signature function; it is
still the same AV as the caller's: it's as if the call had been made using
the '&foo;' argless call mechanism, and any attempt to modify @_ actually
modifies the parent's @_ (or grandparent's or whatever, depending on
whether the parent etc subs also have a signature).
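
For anyone unfamiliar with the '&foo;' form, this small sketch shows the
existing behaviour that signatured subs would adopt: the callee sees, and
can modify, the caller's @_. The sub names are arbitrary.

```perl
use strict;
use warnings;

sub inner {
    push @_, 'added by inner';    # modifies the caller's @_
    return scalar @_;
}

sub outer {
    &inner;       # no parens: inner() sees and shares outer()'s @_
    return @_;
}

my @result = outer(1, 2);    # (1, 2, 'added by inner')
```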

2) There will be *no* facility (via a pragma or otherwise) to re-enable
@_-populating. I think this is probably the most controversial part of
this proposal, but after a lot of thought I feel strongly about this. It
has the huge advantage of not needing twin code paths to support both
types of argument access, and the need for twice as many tests.  It also
side-steps ambiguities: for example, what happens if a parameter's default
expression modifies @_? Is that seen by subsequent parameters?

The main argument I recall being made for retaining both forms is for when
converting existing code to use signatures, where some of that code may
still use @_ directly or indirectly, e.g.

    sub foo { 
        my (...) = @_;  # easy to spot and convert
        ...
        bar(@_);        # might be missed
    }

However, I think (or at least hope) that any use cases of that sort will
be satisfied by my next two points.

3) I propose that within the direct lexical scope of a signature sub, any
code which uses the @_ variable (both rvalue and lvalue use, and as a
container or as individual elements),  where detectable by the parser,
will trigger a compile-time warning. This should catch most of the
code-conversion issues mentioned above. For the occasional case where you
really do need to use @_ (such as tail-call elimination), just turn off
the warning:

    sub foo (...) {
        ...
        no warnings 'signature';
        local @_ = (1,2,3); # 'local' because we're modifying the *parent* @_
        goto &bar;
    }

Such warnings will extend to any use of @_ within evals etc that are
themselves within the lexical scope of the sub, but they don't extend to
other subs within the lexical scope; for example, this won't warn:

    sub foo (...) {
        my $s = sub { my ($x,$y) = @_; .... }
    }

I'm sure there will be ways to bypass the 'use of @_' warning by directly
manipulating stashes and globs, but perl has always allowed you the rope
to hang yourself if you so desire.

4) With my 'Query Parameters' proposal (see "Query Parameters" thread),
just about any need you may have had for @_ - such as determining whether
a parameter was bound to an argument or to a default expression - can now
be done (usually more easily) with query parameters (these 'query' the
remaining arguments by peeking ahead, but don't actually consume any of
them). As a trivial degenerate case, the equivalent of @_ can be
reconstructed as a lexical variable by using an aliasing array query
parameter as the first element of a sub's signature:

    sub foo(?*@args, .....) { ... } # @args equivalent to @_

Note that you can't do

    sub foo(?*@_, .....) { ... }

as we don't allow @_ (nor $_ nor %_) as a lexical, nor as a parameter
variable name. I see no need to change that.

Similarly, boolean query parameters allow you to see whether an argument
was available for the next parameter:

    sub foo(??$saw_x, $x = 0) { ... }

The warnings category: I propose a new category called 'signature'
which includes the @_ warning plus any other new signature-related
warnings we might add. (At the moment there are no such warnings: all
current signature issues are croaks rather than warns.) But I'm open to
other suggestions - perhaps one of the existing categories is suitable?
davem
11/28/2019 5:03:16 PM
=head2 Synopsis:

    sub foo (
        # the lexical vars $a, @b and %c are aliased to the things pointed
        # to by the reference values passed as the first three arguments:

        \$a, \@b, \%c,

        # $d is aliased to the fourth arg:

        *$d,

        # $e is aliased to the fifth arg,  but at compile time, any use
        # of $e in lvalue context within this lexical scope is a compile
        # time error:

        *$e :ro,

        # the elements of @f are aliased to any remaining args,
        # i.e. a slurpy with aliasing ...:

        *@f

        # .. or a slurpy hash; every second remaining arg is aliased to
        # the hash's values:

        *%g
    ) { ... }


=head2 :ro

Before discussing the general issues around aliasing of signature
parameters, I want to make a specific proposal concerning read-only (RO)
variables. This can apply to all lexical variables, not just signature
parameters, although it would be particularly useful for the latter. This
can then inform the aliasing proposals which follow.

In Perl 5, it's hard to efficiently create a read-only alias of another
SV.  You can't mark the original itself as SVf_READONLY, as the original
should still be writeable. Instead you have to create some sort of
read-only proxy SV, which is slow and expensive.

I propose a much more restrictive (but hopefully still useful) RO regime;
rather than making the variable itself RO, instead at compile time, ban
lvalue *uses* of the variable in any code within its lexical scope. More
precisely, for the $x in this example:

    {
        my $x :ro = ...;
        ...;
    }

then within the lexical scope of $x (from the point it's been introduced
onwards), any lvalue-context usage of that variable is a compile-time
error. Note that as well as forbidding all the obvious stuff like $x++,
all the following would also be compile-time errors:

    foo($x);
    for ($x, ...) { ...}
    $y = \$x;
    sub foo: lvalue { ....; return $x }

since they all use $x in (potentially) lvalue context. This is very
restrictive, but at least since the check is at compile time you will find
this out immediately, rather than discovering at run-time months later.
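
To see why those constructs count as lvalue context, here is a short
sketch in today's Perl (without :ro, which doesn't exist yet) showing that
each of them can end up modifying $x. The sub names bump and get_x are
made up for the example.

```perl
use strict;
use warnings;

my $x = 1;

sub bump { $_[0]++; return }    # elements of @_ alias the caller's arguments
bump($x);                       # $x is now 2

for ($x) { $_++ }               # the loop variable aliases $x: now 3

my $ref = \$x;
$$ref++;                        # modified through the reference: now 4

sub get_x :lvalue { $x }
get_x() = 10;                   # an lvalue sub hands back $x itself: now 10
```

Since the compiler can't know what the callee, loop body or holder of the
reference will do, a compile-time :ro has to reject all of them.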

For lexical arrays and hashes, both container and element use are
forbidden in lvalue context:

    my @a :ro = ...;
    my %h :ro = ...;

    # all these are illegal:
    \@a;
    \%h;
    $a[0] = 1;
    foo($h{bar});
    for ($a[$i], $h{$j}) { ... }

Note that a :ro variable could still be modified by, for example, being
tied and the FETCH() modifying itself (think of a tied fetch counter for
example). But STORE() should never be called.

Note that a :ro variable will still be subject to the usual upgrades that
SVs go through:

    sub f(*$x :ro) { $x + 1 } # $x is an alias - see below
    my $s = "123";            # $s is an SvPV
    f($s);                    # $s is now an SvPVIV

In terms of implementation, the basic principle is that the OPf_MOD flag
being set on an OP_PADSV op or similar for such a lexical variable will
croak.

=head2 Aliasing

It would be nice sometimes for a parameter to be an alias of the argument
rather than a copy, for convenience and/or performance. For example
aliasing an array parameter to a passed array ref argument or, less
commonly, providing access to the caller's value so that it can be altered.

First, let's look at how Perl 6 does it. It uses traits, which allow
the following permutations for parameter variables:

     $x is readonly # the default - an alias, but not modifiable
     $x is rw       # direct alias: modifying $x modifies the argument
     $x is copy     # like current Perl 5 signature parameters
     $x is raw      # (I don't understand this one)

     Note that rw on a slurpy parameter is "reserved for future use by
     language designers": (*@a is rw).

I'm not proposing that we use this syntax, but it gives an idea of what
we need to provide in another way.

Note that read-onlyness is orthogonal to aliasing/copying; in principle
you can have all four of these permutations:

    copy   rw  # like P6's 'is copy'
    copy   ro  # no P6 equivalent, and not very useful?
    alias  rw  # like P6's 'is rw'
    alias  ro  # like P6's 'is readonly'

I propose that we use the new :ro attribute (which I discussed above) to
indicate readonly-ness, and use the syntax discussed below to enable
aliasing rather than copying. Each can be selected independently of each
other.

There have been two significant RT tickets that discussed aliasing for
signature parameters.

The first was from May 2016:

    RT #128242: Aliasing via sub signature

and the second by Zefram in Nov 2017:

    RT #132472: aliasing in signatures

which was really a summation of the ideas from the first ticket firmed up
into a coherent proposal.

I am in full agreement with the analysis provided by Zefram, and think we
should use his proposal as-is bar some bike-shedding about the actual
syntax.

Zefram pointed out that there are really two distinct types of aliasing
we might wish to do.

=head3 Reference Aliasing

The first type, which I will call "reference aliasing", expects the
argument to be a I<reference> to something, and the signature processing
code first dereferences that argument (with dereference overloading
honoured) and aliases the parameter to the resulting container - croaking
if it's not a suitable reference. For example:

    sub foo(\$x, \@a, \%h, $other, $stuff) { ... }

    foo(\$X, [], \%H, 1, 2);

Then within the body of foo(), $x is an alias for $X, @a for the anonymous
array, and %h for %H. This type of aliasing is more useful for array and
hash references, but scalars are supported for completeness. Note that @a
and %h are *not* slurpy parameters; they consume a single argument, and more
parameters can follow them.
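
For comparison, roughly the same effect can be had by hand today with the
experimental refaliasing feature (perl 5.22 onwards). This is a sketch of
what the signature would save you from writing, not of how it would be
implemented; note that the hand-written version makes no attempt to honour
dereference overloading.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Hand-written stand-in for:  sub foo(\$x, \@a, \%h, $other, $stuff) { ... }
sub foo {
    my ($xref, $aref, $href, $other, $stuff) = @_;
    \my $x = $xref;    # dies unless $xref is a scalar reference
    \my @a = $aref;    # dies unless $aref is an array reference
    \my %h = $href;    # dies unless $href is a hash reference

    $x = "changed";
    push @a, $other;
    $h{stuff} = $stuff;
    return;
}

my ($X, @A, %H) = ("original");
foo(\$X, \@A, \%H, 1, 2);
print "$X @A $H{stuff}\n";    # prints "changed 1 2"
```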

Any default value expressions must return a reference to a suitable
container type, or croak: e.g.:

    sub foo(\$x = \1, \@y = [], \%z = \%::Defaults) { ... }

As an aside, in the "Miscellaneous suggestions" thread, I propose a
'default default' for optional expressions, where for example ($x?) is
short for ($x=undef), \@a? is short for \@a=[] and \%h? is short for
\%h={}.

The use of \ to prefix the parameter name seems uncontroversial, since
it mimics the existing lexical variable aliasing syntax:

    my \$x = \$X;
    my \@a = \@A;

Placeholder parameters would check that the argument is a suitable
reference, then throw it away:

    sub foo(\@, \%, ...)

Similarly, default expressions for placeholder parameters will still be
evaluated and checked before being thrown away. The optional placeholder,
'\@=' checks the argument, if present, then skips it.

=head3 Direct Aliasing

The second form of aliasing which should be supported can be thought of as
'direct': it doesn't use references, and it aliases the *elements* of
arrays and hashes rather than the containers. Zefram proposed using a
trailing \ to indicate this, but I'm not keen on that; instead for now
I'll use a '*' prefix, which I'll try to justify later. Direct aliasing
can be applied to both scalar parameters and to slurpy parameters (of
which there can of course be only one, located at the end of each
signature). For example, given:

    sub foo(*$a, *$b, *@c     ) { ... }
        foo( $A,  $B,  $C, @D);

then within the body of foo(),

    $a    is an alias of $A;
    $b    is an alias of $B;
    $c[0] is an alias of $C;
    $c[1] is an alias of $D[0];
    $c[2] is an alias of $D[1];
    etc

It works similarly for a *%h hash slurpy, except that only the hash's
*values* (and not keys) are aliased. Given:

    sub foo(*%h) { ... }
    foo($k1, $v1, $k2, $v2);

then within the body of foo(),

    $h{$k1} is an alias of $v1;
    $h{$k2} is an alias of $v2;

while the keys of the hash are just plain strings as usual.
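
The scalar and array cases can be approximated in current Perl by taking
references to the elements of @_, which are already aliases of the
caller's arguments. The sketch below assumes the experimental refaliasing
feature, and renames the parameters to $p, $q and @rest to stay clear of
the special sort variables $a and $b; the hash case is not attempted.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Hand-written stand-in for:  sub foo(*$p, *$q, *@rest) { ... }
sub foo {
    \my $p    = \$_[0];                         # alias, not a copy
    \my $q    = \$_[1];
    \my @rest = sub { \@_ }->(@_[2 .. $#_]);    # elements alias remaining args

    $p       = "p";
    $q       = "q";
    $rest[0] = "r0";
    $rest[1] = "r1";
    return;
}

my ($A, $B, $C, @D) = (1, 2, 3, 4, 5);
foo($A, $B, $C, @D);
print "$A $B $C @D\n";    # prints "p q r0 r1 5"
```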

Placeholder direct alias parameters are forbidden. There's no (*$, *@).

Default values are allowed for scalar direct aliased parameters, but not for
direct aliased slurpies (since slurpies aren't allowed defaults).

Direct aliasing would be most useful with :ro. It allows the performance
gain of not copying, but with safety. So for example

    sub foo { $foo[$_[0]] }

becomes

    sub foo (*$i :ro) { $foo[$i] }

Now for the bikeshedding about what syntax to use for direct aliasing.
Zefram tentatively proposed using $x\, whereas I propose *$x.

Here's why I prefer my suggestion:

* By having a single syntactical slot to indicate aliasing, that slot can
  be populated with only one of two chars to indicate which type of aliasing
  is wanted (\$x and *$x). With two slots there's ambiguity: what does
  \$x\ mean?

* A trailing \ implies some sort of (de)referencing, but there isn't any
  for direct aliasing. (Conversely, \$x is good for reference aliasing
  because the \ correctly implies that a reference is involved.)

* A trailing \ could potentially interfere with any new syntax which
  follows the parameter name, or could look like it's trying to escape it.

I'm happy to consider other character candidates for the direct aliasing
syntactic slot, but I like '*' because:

* It has a loose mnemonic association with aliasing via typeglob
  assignment: *foo = ....;.

* It has a loose association with Perl 6's array flattening syntax, *@a,
  which flattens the elements of the array rather than referring to the
  container itself; by analogy, whereas \@a aliases the array's container,
  *@a aliases the elements of @a.

The alias character should be considered syntax rather than being part of
the sigil, and in particular, whitespace should be allowed: (\ @x, * @a).

=head3 Default Aliasing Behaviour

There is the question of default behaviour. In Perl 5 currently the
default is to copy, while in Perl 6 the default is a read-only alias.
I think changing Perl 5 to match Perl 6 is probably too big a step. We'd
also have to introduce a new attribute, :copy say, which indicates that we
*don't* want :ro.

A second possible default is what to do in the explicit presence of :ro.
For parameters, this attribute is most useful for aliasing, so one
possibility is to make aliasing the default in the presence of :ro. This
means that rather than typing (*$x :ro) you could just type ($x :ro).
However, the problems with that are:

1) it's ambiguous what sort of aliasing is being automatically enabled
   (\ or *)
2) we'd need some extra syntax to be able to turn off the aliasing.

So all in all, I think we should leave things as they are:

    ($x)      read-write copy
    (*$x :ro) Perl 6-style read-only alias.
davem
11/28/2019 5:04:01 PM
[
This proposal is the one where I am most outside of my comfort zone.
It attempts to supply useful features, and to allow expandable hooks so
that systems like Moose which have their own constraint systems can make
use of it, but I'm not intimately acquainted with such systems, and there
may be better ways to do this.

It's something I probably wouldn't use myself much, and were I to
implement it, it's probably the thing I would do last or nearly last.
]

=head2 Synopsis:

    sub f(
            $self isa Foo::Bar,         # croak unless $self->isa('Foo::Bar');
            $foo  isa Foo::Bar?,        # croak unless undef or of that class
            $a!,                        # croak unless $a is defined
            $b    is  Int,              # croak if $b not int-like
            $c    is  Int?,             # croak unless undefined or int-like
            $d    is PositiveInt,       # user-defined type
            $e    is Int where $_ >= 1, # multiple constraints
            $f    is \@,                # croak unless array ref
            $aref as ref ? $_ : [ $_ ]  # coercions: maybe modify the param
    ) { ...};


=head2 Background

It seems that people express a desire to be able to write something like:

    sub f (Int $x) { ... }

as a shorthand for something like:

    sub f ($x)  {  croak unless
                             defined $x
                          && !ref($x)
                          && $x =~ /\A-?[0-9]+\Z/;
                    ....;
                }

Similarly people want

    sub f (Some::Arbitrary::Class $x) { ... }

as a shorthand for

    sub f ($x)  {  croak unless
                             defined $x
                          && ref($x)
                          && $x->isa('Some::Arbitrary::Class');
                    ...;
                }

Furthermore, there are also Perl 6 generic constraints:

    sub f ($x where * < 10*$y) { ... }

which (in Perl 6 at least) can be used either as-is, or as part of a
multi-method dispatch - which selects whichever version of f() best
matches its (constant) argument constraints.

In any such scheme, there would have to be support for built-in types
(such as Int) plus the ability to extend the type system with
user-defined (or pragma-defined) types (e.g. PositiveInt say).

So, what's the best way of giving the public what they want?

First, we should be very clear that what is contained within this
proposal is an argument-checking system, not a type system for variables.
It will guarantee that, at the start of execution of the main body of the
sub, the parameter variable has a value that meets certain constraints,
possibly via some initial modification; and if this is not possible, the
sub will instead have croaked at that point.

It *doesn't* mean that the sub's caller is in any way constrained at compile
time as to what arguments it can pass. Unlike prototypes, which affect
caller compilation, signatures are at heart just efficient syntactic sugar
for code at the start of the body of a sub which checks and manipulates
the contents of @_ while binding them to local lexical variables.

Similarly, a constraint doesn't constrain the value of a lexical parameter
variable later in the body of the sub. For example:

    sub f (Int $x) {
        ...;         # $x contains a valid integer value at this point
        $x = [];     # legal, even though not an Int value
    }

    f( [] );         # compiles ok; only croaks when f() is called.
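
The same behaviour can be seen today with a hand-written check. The sketch
below stands in for the hypothetical 'Int' constraint using the regex from
the Background section; it is an illustration of the run-time-only nature
of the check, not a proposed implementation.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hand-written stand-in for a hypothetical Int constraint on $x.
sub f {
    my ($x) = @_;
    croak "argument is not an integer"
        unless defined $x && !ref($x) && $x =~ /\A-?[0-9]+\Z/;

    $x = [];               # legal: the check applied only on entry
    return ref $x;
}

print f(42), "\n";                       # prints "ARRAY"
my $ok = eval { f([]); 1 };              # compiles fine; croaks only when called
print $ok ? "accepted\n" : "rejected at run time: $@";
```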

This is in contrast to a "real" type system. For example, the existing
'my Dog $spot' syntax can be made (in conjunction with 'use fields') to
croak on invalid hash keys:

    package Dog;
    use fields qw(nose tail);
    my Dog $spot = {};
    $spot->{fins} = 1; # compile time error

It might also, in some hypothetical future version of perl, support:

    my Int $x = 1;
    $x = []; # error

A "real" type system might also allow optimisations: e.g. storing the
value of $x directly as an integer rather than an SV, and planting special
versions of arithmetic ops which deal directly with an int on the stack
rather than an SV.

So, given that a constraint type system and a "real" type system are two
separate things (unless someone smarter than me can suggest a way of
unifying them), I think that they should be kept syntactically separate.
In particular, we shouldn't use a type prefix before the variable name to
specify a constraint; that should be reserved for a hypothetical future
type system.

=head2 Main Proposal

Instead, what I propose is a special postfix '!' symbol, plus four
parameter postfix keywords (akin to Perl 6 traits): C<where>, C<as>, C<is>
and C<isa>. These can be applied in any order, and more than once, to each
parameter. Syntactically, they are similar to statement modifiers, except
that they can be stacked. For a given parameter, they come after all of
the parameter's other syntax (including default values).  They are
processed against the lexical parameter, after any binding of arguments or
default value. In detail:

=over

=item $param!

This is a special short cut for what I assume is a common requirement.
It is equivalent to

    $param where defined $_

Unlike the other constraints, the exclamation mark goes directly after
the parameter name:

    $param! :shared = 0

But like the other constraints, it is tested I<after> any default
value has been applied.

=item where <boolean expression>

This temporarily aliases $_ to the current parameter, and croaks if
the expression doesn't return a true value: e.g.

    sub f ($x = 0 where defined && $_ < 10, ...) { ... }

=item isa <Class::Name>

    sub f ($x isa Class::name)

is roughly shorthand for

    sub f ($x where    defined $_
                    && ref($_)
                    && $_->isa('Class::name'))

Although if Paul Evans' 'isa' infix keyword is accepted into core, then
the signature 'isa' trait should become exactly shorthand for:

    sub f ($x where $_ isa Class::Name) { ... }

and any rules regarding whether the class name is quoted and/or
has a trailing '::' should be the same.

Note that it uses perl package/class names, not constraint type names.

A class name followed by '?' indicates that an undefined value is
allowed:

    sub f ($x isa Class::name?)

is effectively shorthand for

    sub f ($x where    !defined $_
                    || (   ref($_)
                        && $_->isa('Class::name')))


=item as <coercion expression>

This temporarily aliases $_ to the current parameter, evaluates the
expression, and assigns the result to $_, which may cause the parameter
lexical variable to be updated: e.g.

    sub f ($array_ref as (ref ? $_ : [ $_ ]), ...) { ... }

So

    ($x as expr, ...)

can be thought of as shorthand for

    ($x where ($_ = expr, 1), ...)

In practice I would expect 'as' to be used mostly by pragma writers
to define custom types for use by 'is' as described below; 'as' itself
would appear less frequently in signatures themselves.

=item is <constraint-type-name>

This implements the functionality desired by the hypothetical 'Int $x'
example above to check whether the parameter's value satisfies the
named constraint, possibly coercing it too. It supports using the
hints mechanism to allow pragmata to add new constraint types in
addition to those already built in. For example:

    # built-in type:
    sub foo ($x is Int ) { ... }
    # roughly equivalent to: die unless defined && /^-?[0-9]+$/

    # user-defined type:
    sub foo ($x is PositiveInt) { ... }
    # roughly equivalent to: ($x is Int where $_ >= 0)

See below for details of how custom constraint types can be created.

Like 'isa', 'is' type names can be followed by '?', indicating that an
undefined value is also allowed. If the argument is undefined, the
type check is skipped. (So a bit like Moose's MaybeRef etc.)

    sub foo ($x is Int?) { ... }

Type names as used by 'is' occupy a different namespace than perl
packages and classes, and in particular they can't include '::'
in their name, so they are less likely to be confused with typical
Foo::Bar package names.

Note that in an earlier draft of this proposal I used the 'isa'
trait to handle both 'isa' and 'is'; the idea being that the type name
would be first looked up as a built-in/custom type name, and if not
recognised, would fall back to an isa() check. But after some private
discussions, I think it's best to keep the two concepts (and name
spaces) entirely separate.

Note that there are some specific advantages of having the type as a
postfix trait, i.e. ($x is Foo, ...) rather than (Foo $x, ...): it makes
it consistent with the other constraint features (where/as/isa), and keeps
everything being processed in a strict left to right order; for example in

    sub f ($x :shared is Int where $_ > 10)

all the constraints are processed *after* the 'shared' attribute code
is called; in

    sub f (Int $x :shared where $_ > 10)

the order is all mixed up.

The built-in constraint types will also coerce the resultant parameter
to be a suitable type of SV. So for example,

    sub f($i is Int, $s is Str) { ...}

would do the rough equivalent of

    sub f { my ($i, $s) = (int($_[0]), "$_[1]"); ... }

So even if called as f("123"), $i won't have an initial string value
(internally it will be an SVt_IV, not an SVt_PVIV).

Similarly, even if the argument is overloaded, the resulting
parameter won't be - but may trigger calling the relevant overload
conversion method (int, "" or whatever) to get the plain value.

This means that (for example), if a Math::BigInt value is passed as
the argument for $i, the resulting parameter will just be a plain int
and any extra data or behaviour will have been lost.

On the occasions where this is unacceptable, the coder can of course
just not declare a constraint in the signature and do any checks
manually in the body of the function.

=back
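
The aliasing of $_ that 'where' and 'as' rely on can be imitated in
current Perl with a single-element foreach. The following is only a
sketch of the intended semantics under that assumption, with explicit
defined($_) and ref($_) calls used for clarity; the parameter names are
taken from the examples above.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Stand-in for:  sub f ($x = 0 where defined && $_ < 10,
#                       $list as (ref ? $_ : [ $_ ])) { ... }
sub f {
    my $x    = @_ >= 1 ? $_[0] : 0;    # bind argument or default first
    my $list = $_[1];

    for ($x) {                         # 'where': $_ aliases the parameter
        croak "constraint failed for \$x"
            unless defined($_) && $_ < 10;
    }
    for ($list) {                      # 'as': assign the result back to $_
        $_ = ref($_) ? $_ : [ $_ ];
    }
    return ($x, $list);
}

my ($x, $list) = f(3, "a");
print "$x @$list\n";                   # prints "3 a"
```

Because $_ is an alias, the assignment in the second loop updates $list
itself, which is the behaviour described for 'as'.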

So that's the basic idea. I think that most of the time end-users will
just use 'is Type', 'is CustomType' or 'isa Some::Class', while pragmata
writers will make more extensive use of 'where' and 'as' to create custom
constraint types such as 'CustomType'. So most code will use e.g. 'is
AlwaysArrayRef' and only behind the scenes is this defined fully as
something like 'where defined($_) as (ref ? $_ : [ $_ ])'.

Hopefully this proposal provides a general-enough framework such that the
implementers of systems like Moose can make use of it to make them run
the same (but faster) on newer releases of perl.

=head2 Some general rules for constraints

Constraints apart from '!' and 'isa' cannot be used on a parameter which
is a direct alias (e.g.  *$x), since this might trigger coercing the
passed argument and thus causing unexpected action at a distance.

The where/as/isa/is keywords will only be recognised as keywords at the
appropriate point(s) when lexing a signature parameter; elsewhere, they
are treated as normal barewords / function names as before.

At the start of constraint processing, $_ is aliased to the lexical
parameter variable, and any modification of $_ will modify the parameter,
with the change being visible to any further constraints.

The behaviour of $_ if it becomes unaliased from the lexical parameter
(e.g. via local *_ = \$x) is undefined for any further constraints in the
current parameter declaration which make explicit or implicit use of $_,
such as for $_->isa(...) and for the variable which the result of 'as' is
assigned to. The variable being used/modified might end up actually being
either $_ or the lexical parameter, and this might vary between perl
releases and levels of optimisation.

The where/as/isa/is clauses for a parameter are collectively enclosed in
their own single logical scope, in order that $_ can be
efficiently localised just once. This means that any lexical variables
declared inside will not be visible outside of those clauses. For example:

    my $foo;
    sub f ($x    is Int where (my $foo=2*$x) < 10,    $y = $foo) { $foo }

is treated kind of like:

    my $foo;
    sub f ($x {  is Int where (my $foo=2*$x) < 10  }, $y = $foo) { $foo }

in that the $y parameter and the body of the sub both see the outer $foo,
not the inner one.

Note that in my proposal, constraints are applied to a parameter's value
*after* binding, regardless of whether that value was from an argument or
from a default expression. This is because in something like:

    ($x,  $y = $x is Int)

you have no say over what $x might contain. This does however mean that
you may get the inefficiency of applying constraints to e.g. constant
default values. It might be possible in this case to run the constraint
checker against the default value once at compile time, then skip the
check at run time if the default value is used.

Constraints can only be supplied to scalar parameters; in particular they
can't be applied to:

* Slurpy parameters like @array and %hash;

* Reference-aliased aggregate parameters like \@array and \%hash (but in
  these cases perl will already croak at runtime if the supplied arg isn't
  an array/hash ref);

* Query parameters apart from scalar, ?$x.

* Placeholder (nameless) parameters. In the very rare cases where you
  actually want to check the passed argument while throwing it away
  anyway, you can always fall back to using a named parameter:

    sub foo ($self, $       is Int where $_ > 0) { ... }   # illegal
    sub foo ($self, $unused is Int where $_ > 0) { ... }   # ok

  Imposing this restriction makes implementing and optimising constraints
  easier.

=head2 Constraint type names

I envisage that type names will be allowed the same set of characters as
normal identifiers such as variables, and that this set is extended as
expected when in the scope of 'use utf8'. But they aren't allowed ':' (and
specifically not '::') to avoid confusion with package/class names, which
are a separate namespace.

I have a further suggestion (which caused at least one porter to privately
recoil in horror).  I think that '+', '-' and '!' characters should also be
allowed as part of the type name (but not as the first character, and
possibly only as a trailing character). So just for example, either
perl itself or a pragma could define these additional types:

    $x is Int--   equivalent to:   $x is Int where $_ <  0
    $x is Int-    equivalent to:   $x is Int where $_ <= 0
    $x is Int+    equivalent to:   $x is Int where $_ >= 0
    $x is Int++   equivalent to:   $x is Int where $_ >  0

    $x is Str+    equivalent to:   $x is Str where length($_) >  0

which are easier to type and read than "PositiveInt", "StrictlyPositiveInt"
etc, say.

Similarly, a trailing '!' as part of the name might imply a stricter
version of a type. For example, "Int!" might croak if passed any value
which can't be losslessly converted to an integer; so 123.4 and "123.4"
would croak, while 123 and "123" would pass. Plain "Int" would allow both
of those, but would croak on "123abc".
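
To make the distinction concrete, here is one possible reading of the two
checks written as ordinary subs. The names int_loose() and int_strict()
are made up for this sketch, and the exact rules are an assumption; for
instance, values such as 1e3 or infinity also pass the strict check here,
and a real implementation would need to decide about those.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Loose check, along the lines of plain "Int": any plain numeric value.
sub int_loose {
    my ($v) = @_;
    return (defined($v) && !ref($v) && looks_like_number($v)) ? 1 : 0;
}

# Strict check, along the lines of "Int!": numeric with no fractional part.
sub int_strict {
    my ($v) = @_;
    return (int_loose($v) && $v == int($v)) ? 1 : 0;
}

for my $v (123, "123", 123.4, "123.4", "123abc") {
    printf "%-8s loose=%d strict=%d\n", $v, int_loose($v), int_strict($v);
}
```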

I think we should also include a few built-in "symbol" constraint type
names, specifically:

    is \$   # must be a scalar ref
    is \@   # must be an array reference
    is \%   # must be a hash reference
    is \&   # must be a code ref
    is \*   # must be a glob ref

Which are less clunky than 'is ArrayRef' etc.  I think a plain ref is
better specified as 'is Ref' rather than 'is \ ' though.

(Note however that '$aref is \@' will often be easier to write as '\@a';
i.e. get perl itself to deref and alias the array, doing the check for
free. Ditto \%.)

=head2 Details on 'is' built-in constraint types

We need to decide exactly what built-in types perl should support, and
what value(s) those built-ins (e.g. Int, Int!, Num, Str etc) should accept
and what coercions they perform. I think that these details are still up
for discussion and I don't have any strong feelings. For example in the
discussion above about Int and Int!, I'm assuming that perl will convert a
string containing a valid integer value into an integer rather than
croaking. Perhaps people would prefer instead that a string like "123"
should croak if being coerced to an Int. Or perhaps only a lower-case
variant, "int", should croak. Which of these count as Int:

        undef
        1.2
        ""
        "123"
        " 123 "
        "1.2"
        "0 but true"
        "0.0"
        "0abc"
        "0E0"

etc? The one thing I'm mostly certain of is that Int should *not* just be
a check that the argument has the SVf_IOK flag set.

Perhaps lower-case-only names (like int) should be reserved for perl
built-ins?

=head2 Custom constraint types

At compile time it will be possible for pragmata and similar to add
lexically-scoped type hook functions via the hints mechanism. These will
allow constraint type names to be looked up and handled according to the
pragma's wishes.

It is intended that the lexical scope of the hooks allows built-in types
to be overridden, e.g.

    sub f1($i is Int) {} # built-in Int
    {
        use Types::MakeIntMoreStrict;
        sub f2($i is Int) {} # Int as defined by Types::MakeIntMoreStrict
        {
            use Types::EvenStricter;
            sub f3($i is Int) {} # Int as defined by Types::EvenStricter
        }
    }

In the presence of hooks, the hook functions are called at the
subroutine's *compile* time to look up the constraint type name. The
return value of the hook can indicate either:

1) An error string.

2) Unrecognised: pass through to the next hook, or in the absence of
further hooks, treat as a built-in.

3) A returned checker sub ref which will be called at run-time each time
the parameter is processed. The sub ref takes a single argument, which is
the parameter being processed, and the return value(s) can indicate
either:

    * an error string;
    * the parameter is ok;
    * or a return value  which should be used in place of the
      parameter (this allows coercion).

Note that the hook sub ref itself can have a signature with constraints.
So the extra constraint processing done by the sub ref can be handled
either as explicit code in its body, or implicitly with its own signature
constraints.

Also, the sub ref can (if it chooses) modify its $_[0], which means it's
modifying $_, which is aliased to the parameter of the caller currently
being processed and possibly aliased to the checker sub's caller's
caller's argument too.  In fact arguably it should achieve coercion by
modifying $_[0] rather than returning a new value as was suggested above.

The sub should only croak on some sort of internal error; when detecting a
constraint violation, it should just return an error string; this allows
for the possibility of alternations (although I'm not keen on allowing
alternations).

4) Return a string containing a source code snippet to be inserted into
the source text at that point.

This option is in many ways the most interesting, as it effectively allows
pragmata to inject extra constraints into the source code. For example,
suppose there's a user-written pragma called Type::IntRanges; then with
this code:

    use Type::IntRanges;
    sub f ($x is PositiveInt) { ...}

At 'use' compile time the pragma registers itself in the lexically scoped
hints. Then when the signature is parsed and compiled, the pragma's hook
function is looked up in the hints, then called with the type name
'PositiveInt'; the hook returns the string

    'is Int where $_ >= 0'

which is injected into the source code stream as if the coder had instead
directly written:

    sub f ($x is Int where $_ >= 0) { ...}

Similarly, ($x is AlwaysRef) might be translated at compile time into
        ($x where defined($_) as ref ? $_ : [$_] )

(i.e. coerce into a ref, but croak if not defined).

This can be nested; the injected source code can also contain a custom
type name which will also trigger a source code injection.

[ Note: I have no idea how easy it will be to inject raw src text into the
input stream, especially if the lexer has already processed the token
following the type name and passed it to the parser as the lookahead
token. If not viable, then I may have to drop this option. ]

While powerful, this code injection has a couple of downsides.  First, you
may get compiler warnings or errors appearing to come from a place in your
source code where there is no such syntax. To avoid this, hook writers
should be encouraged to write hooks which only supply simple, well tested
code snippets which shouldn't produce warnings or compile errors (they can
of course cause constraint errors). Secondly, there's nothing to stop a
hook returning 'bobby tables'-like source code like 'is Int,
$extra_param'. The docs should state that doing anything other than
injecting extra constraints into the current parameter is undefined
behaviour.

Conversely, using a sub ref to process every parameter avoids the
confusion of code injection, but is slow: a sub call for every parameter
in the current sub call. Also, error messages may appear to come from the
hook sub ref buried somewhere in a pragma.pm module, rather than the
user's code.

5) This is a tentative suggestion that would replace options 3) and 4).
This would be for a constraint hook to be specified as an empty-bodied sub
with a single parameter. The constraint(s) specified for that parameter
become the custom constraints which that hook provides. In some fashion
the code previously compiled for that "prototype" sub's constraint is
copied and/or executed. This would be more efficient than calling a whole
sub for each parameter, and would be more constrained than injecting text
into the source code. It would of course be nestable; for example:

    hook 1: 'PositiveInt' maps to: sub ($x is Int where $_ >= 0) {}
    hook 2: 'OddPosInt'   maps to: sub ($x is PositiveInt where $_ % 2) {}

    sub foo($self, $arg is OddPosInt) { ... }

Most of these hooking methods may have issues with deparsing correctly, so
this needs careful implementation.
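
The lookup chain from options 2) and 3) might be modelled in plain Perl
roughly as follows. Everything here is hypothetical: the %builtin table,
the @hooks list and lookup_type() are stand-ins invented for this sketch,
and a real implementation would live in the lexically scoped hints rather
than in file-scoped variables. Checkers return undef for "ok" and an error
string otherwise; coercion and error-returning hooks are left out.

```perl
use strict;
use warnings;

# Hypothetical built-in checkers: return undef if ok, else an error string.
my %builtin = (
    Int => sub {
        (defined($_[0]) && $_[0] =~ /\A-?[0-9]+\z/) ? undef : "not an Int";
    },
);

# Hypothetical hook: maps a type name to a checker sub, or returns nothing
# to mean "unrecognised, pass through to the next hook".
my @hooks = (
    sub {
        my ($name) = @_;
        return unless $name eq 'PositiveInt';
        return sub {
            my $err = $builtin{Int}->($_[0]);
            return $err if defined $err;
            return $_[0] >= 0 ? undef : "not a PositiveInt";
        };
    },
);

# Done once per constraint, at (simulated) compile time.
sub lookup_type {
    my ($name) = @_;
    for my $hook (@hooks) {
        my $checker = $hook->($name);
        return $checker if $checker;
    }
    return $builtin{$name} // die "Unknown constraint type '$name'\n";
}

my $check = lookup_type('PositiveInt');
for my $value (5, -5) {
    my $err = $check->($value);
    printf "%d: %s\n", $value, $err // "ok";
}
```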

=head2 Checking types outside of signatures.

I propose that for each built-in constraint type there will be a
corresponding function in the 'is::' namespace which returns a boolean
indicating whether the argument passes that constraint. This would be
particularly useful where the constraint is too complex to be specified in
the signature, e.g.

    sub f ($n) { die unless is::Int($n) || is::Num($n); ... }

Note that these functions would only do the checking part of the type's
action, not the coercion part (if any).

The is:: namespace would behave similarly to utf8::, in that the functions
are always present without requiring 'use is'.
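
Purely to show how such functions would read at the call site, here is a
pure-Perl mock-up. The two subs below are stand-ins written for this
sketch, with acceptance rules that are assumptions rather than the
proposed built-in semantics.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical pure-Perl stand-ins for the proposed built-in functions.
sub is::Int {
    return (defined($_[0]) && !ref($_[0]) && $_[0] =~ /\A-?[0-9]+\z/) ? 1 : 0;
}
sub is::Num {
    return (defined($_[0]) && !ref($_[0]) && looks_like_number($_[0])) ? 1 : 0;
}

sub f {
    my ($n) = @_;
    die "bad argument\n" unless is::Int($n) || is::Num($n);
    return $n * 2;
}

print f(21), " ", f("1.5"), "\n";    # prints "42 3"
```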

I'm not sure whether a similar facility can be provided for custom types.
Perhaps have an is::is($x, 'Type') function which at runtime looks up
"Type" using the same lexical hints, to find the right hook. This would
require custom hooks to provide info for both the signature compilation
and a function to be called at runtime. This is a bit hand-wavey. It would
also be useful for built-ins having extra characters in them like I
suggested above, e.g. is::is($x, 'Int++') and is::is($aref, '\@').

=head2 Moosey extensions to 'is'

Moose supports aggregate and alternation / composite constraints; for
example, ArrayRef[Int] and [Int|Num].

Personally I think that we shouldn't support these; it will make things
far too complex. Also, the nested HashRef[ArrayRef[Int]] form quickly
becomes a performance nightmare, with every element of the HoA having to
be checked for Int-ness on every call to the function.
davem
11/28/2019 5:04:42 PM
We need to determine and document how the various parts of a signature
behave as regards to lexical scope, visibility, tainting and ordering of
things like default expressions and constraints.

=head2 Scope

I propose for lexical scoping that:

    sub f($a, $b, $c, ... ) { BODY; }

is logically equivalent to:

    sub f {
        my $a = ....;
        my $b = ....;
        my $c = ....;
        ....;
        BODY;
    }

In particular, each parameter element is independent taint-wise, and each
parameter variable has been fully introduced and is visible to default
expressions and the like in further parameters to the right of it.

For example, the first default expression in

    sub foo($x = $x + 1, $y = $x + 2)
    
sees any outer or global $x rather than the parameter (similar to
C<my $x = $x + 1>), while the second expression sees the parameter $x.
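
This can be tried on a current perl with signatures enabled. The sketch
below uses a package variable as the outer $x; the value 10 is arbitrary.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

our $x = 10;    # the outer (package) $x

sub foo($x = $x + 1, $y = $x + 2) { return ($x, $y) }

my @defaults = foo();     # first default reads the outer $x
my @one_arg  = foo(1);    # second default reads the parameter $x
print "@defaults | @one_arg\n";    # prints "11 13 | 1 3"
```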

'my' declarations in default expressions are visible to further parameter
elements and the main body of the sub, e.g.

        sub f($a = 1 + (my $x = 1), $b = $x, ...) { ... $x }

Formally, lexical parameter variables are introduced at the end of the
parameter declaration, and in particular are not visible to 'where' and
'as' clauses for the current parameter.

'local' declarations have similar scope.

(This is all already the current behaviour.)

However, there will be an implicit scope around the collection of
where/as/isa/is traits (see the "Type and Value Constraints and Coercions"
thread).

=head2 Ordering of evaluation of terms

There is much external visibility, both from explicit execution of things
like default expressions and constraints, and implicitly from things like
FETCH(), overloaded stringify, and attribute handlers. The question is how
are these ordered, and what do we guarantee?

I propose that a few aspects of ordering are well defined; everything
else is left undefined, to allow us to change things in different releases
for the purposes of optimisation etc.

Within a single parameter element, we guarantee this order:

 1) attributes->import() is called as appropriate for any :attribute;
 2) the default expression (if present and needed) is run;
 3) the parameter variable is bound to its argument or default value;
 4) the constraint expression (if any) is run.

Between parameter elements, we guarantee that parameters are processed in
left-to-right order. This means that when calling any explicit code
for parameter N+1 (such as a constraint or default expression), all such
code for parameters 1..N will already have been called, and that
parameters 1..N will have already been bound to their arguments.
This applies to named parameters too, regardless of the ordering of the
name/value pairs in the argument list.

Anything else is undefined and subject to change. In particular:

* There are no guarantees exactly when error checking is performed and
  thus when a croak() might happen; for example, an odd number of
  arguments to a hash slurpy might be detected at the start of signature
  processing, or only at the end of assigning to the hash.

* There are no guarantees of which order or when arguments are processed:
  this may become visible for example as FETCH() calls for arguments,
  string overloading for arguments treated as parameter names, dereference
  overload for \@a-type aliasing, or uninitialized-value warnings.

  Note that this doesn't apply to argument expressions; for example, in
  f($x, g(), $y), the function g() will definitely have been called before
  f's signature processing is started. However, if $x, $y and the return
  value of g() are all overloaded, then there is no guarantee which of the
  three overload method calls will be performed first.

  This lack of ordering is especially important for handling named
  parameters sanely and efficiently.

Perl will be free to re-order things internally, as long as it has no
user-visible side-effects that violate the promises given above.
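
The left-to-right guarantee for explicit code can be observed with default
expressions that log as they run. This is only a sketch against the existing
signatures feature; note() and @log are helper names made up for the example:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

my @log;

# Record that the default expression for $tag ran, then pass the value on.
sub note($tag, $value) { push @log, $tag; return $value }

sub f($p = note('p', 1), $q = note('q', $p + 1), $r = note('r', $q + 1)) {
    return ($p, $q, $r);
}

my @all_defaults = f();      # every default expression runs
my $order_all    = "@log";   # p q r

@log = ();
my @one_given  = f(10);      # default for $p is skipped
my $order_some = "@log";     # q r
```

Each default can rely on the parameters to its left already being bound,
which is the part of the ordering the proposal promises to keep.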

=head2 Flow control

To maintain sanity and ordering, 'goto LABEL' I<into> a block containing a
default expression or constraint (or any other such code we might add to
parameter elements) should be explicitly disallowed and should croak.
It's currently deprecated.

That will stop abominations such as

    sub f ($a = do { goto FOO }, $b = ..., $c = do { FOO: ...; }) { ... }

(what happens to $b here?)

It's okay to exit a sub via flow control within such a block e.g.:

    sub f ($a = do { next SKIP if ... }, $b = do { return if ...; }) { ... }

Ditto last, redo. It's also okay to die, exit and _exit.

While a real fork should be okay, an ithreads pseudo-fork might have
difficulties, as would creating a new ithread. The documentation should
note this.

Any goto *out* of such a block and into the main body of the sub
should croak (or at least be undefined behaviour if not detectable):

    sub f ($a = do { goto FOO }, $b = ...) {
        ...;
      FOO:
        ...;
    }

This is because it skips over the initialisation of further elements,
possibly ignoring constraints etc.

It's okay to goto out of a sub altogether.
0
davem
11/28/2019 5:05:21 PM
=head2 Synopsis:

    sub foo (
        Dog $spot,       # same as my Dog $spot
        $x ||= $default, # use default value if arg is missing or false
        $x //= $default, # use default value if arg is missing or undef
        $foo?,           # short  for $foo = undef
        \@bar?,          # short  for \@bar = []
        \%baz?,          # short  for \%baz = {}
    ) { ...}

    bar(=$x, =$y);       # short for bar(x => $x, y => $y)

Here are a few random suggestions that people have made at various times
(or that I thought up all by myself!).


=head2 Defined-or

In [perl #132444], Ovid suggested

    sub f($x //= expr) { ... }

which is like

    sub f($x = expr) { ... }

except that it uses the default expression if the argument is undef as
well as if it is missing.

Presumably we should also have ||= .
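
For comparison, this is roughly what has to be written by hand when a
signature only supports a plain '=' default. It is a sketch, with greet()
and repeat() invented for the example; the body assignments stand in for the
proposed //= and ||= forms:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Emulates the proposed ($name //= 'world'): default if missing or undef.
sub greet($name = undef) {
    $name //= 'world';
    return "hello, $name";
}

# Emulates the proposed ($count ||= 1): default if missing or false.
sub repeat($text, $count = 0) {
    $count ||= 1;
    return $text x $count;
}

my @greetings = (greet(), greet(undef), greet('perl'));
my @repeats   = (repeat('ab'), repeat('ab', 0), repeat('ab', 2));
```

A plain ($name = 'world') would leave greet(undef) with an undefined $name,
which is the gap the //= form is meant to close.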


=head2 Allow typed variables.

See the big proposal for a constraint system. Part of that proposal is that
the syntactic slot for types (similar to 'my Foo $foo') shouldn't be used
for a constraint (that becomes ($x is Foo) or ($x isa Foo) instead).

So I propose we allow

    sub f(Dog $spot, ...) { ..}

and make it mean exactly the same as the currently supported

    my Dog $spot;

which (among other things) allows compile- and runtime checking of
subscripts of hash references. If the semantics of 'my Dog $spot' ever
expand in the future, then the meaning of 'sub f (Dog $spot)' expands in
lockstep.
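
To show what 'my Dog $spot' buys you today, here is a sketch using the core
'fields' pragma. The Dog class and its field names are invented for the
example:

```perl
use strict;
use warnings;

package Dog {
    use fields qw(name age);    # declares the permitted hash keys

    sub new {
        my Dog $self = shift;
        $self = fields::new($self) unless ref $self;
        return $self;
    }
}

package main;

my Dog $spot = Dog->new;
$spot->{name} = 'Spot';         # fine: 'name' is a declared field

# A misspelt key is rejected, so run it in a string eval to capture the
# error instead of aborting this script.
my $ok    = eval q{ my Dog $rex = Dog->new; $rex->{nmae} = 'Rex'; 1 };
my $error = $ok ? '' : $@;
```

Under the proposal, a parameter declared as (Dog $spot) would get the same
checking as the 'my Dog $spot' variables above.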

Placeholder parameters wouldn't be allowed types.

=head2 Allow a shortcut for a 'default' default value

allow $foo? as a shortcut for $foo=undef
  and $?    as a shortcut for $=

Similarly for reference aliases,
    \$s? becomes a shortcut for \$s = undef,
    \@a? becomes a shortcut for \@a = [],
    \%h? becomes a shortcut for \%h = {},

At the same time, ban the existing legal syntax '$=' which means an
optional placeholder, and allow only the new '$?'. This would make things
more consistent, as '$foo=' is currently illegal. Also, I find a bare
trailing equals sign ugly, and it could potentially clash with future
syntax which might be added to the end of a parameter.

Note that ?$foo is a query parameter "borrowing" the next argument, while
$foo? is an optional parameter which is assigned an undef value if no
argument is present.


=head2 Auto-declare $self

Perhaps allow simple syntax to auto-declare $self as the first argument?

In Perl 6, the invocant is implicit in method subs, and can be accessed
using the 'self' keyword:

    method foo ($x) {  self.do($x, 'foo') }

but can be explicitly named (note the lack of comma):

    method foo ($me: $x) {  $me.do($x, 'foo') }

Cperl supports a similar auto-declaration with an added 'method' keyword:

    method foo ()           { $self->{foo} }
    method bar ($this:, $x) { $this->{$x}  }

Perl 5 of course doesn't have a 'method' keyword, and if we were to add
it, we would need to decide what semantics it brought to the table.

I don't have any strong urge to add such a feature.


=head2 Allow a code block

Using the general query parameter escape mechanism (which doesn't consume
an argument), perhaps a parameter starting with '?{' could be a code block
which would be executed at that point in the argument processing. E.g. 

    sub foo($x, ?{ print "x=$x\n" }, $y = $x+1) { ... }

(The docs will need to warn that it may affect (as in remove) optimisation
of subsequent parameter processing.)

I suppose the question is, whether this is useful, and whether it allows
you to do things that can't be done with default value expressions, with
the proposed $x //= 0 'undef parameter' handling, and with the proposed
constraint syntax (where/as etc)?

The '?' before the '{' isn't strictly necessary syntax-wise, but grouping
it in with the 'query parameter' syntax emphasises that this parameter
doesn't consume an argument.


=head2 Whitespace

We need to decide where whitespace is allowed or forbidden in things
like 
    ??$foo
    :\@bar=[]
etc.

My feeling (and as expressed in other individual proposals here) is that
everything apart from the sigil and parameter name is signature syntax and
can have optional whitespace around it, like perl stuff generally can.
E.g. both these are allowed:

    Dog\:$foo:shared=0 is Int where$_>0
    Dog \ : $foo : shared = 0 is Int where $_ > 0

Similarly, both of these are ok:

    ??$has_x
    ? ? $has_x

The remaining issue is whether whitespace is allowed between a sigil and a
parameter name. Perl 5 currently allows:

    my $ x;
    my $
    y;

and similarly allows:

    sub f ($ x, $
       y)
    { ... }

On the other hand, Perl 6 doesn't allow whitespace. Should we similarly
ban it from Perl 5 signatures? My gut feeling is yes: fix this while still
experimental.


=head2 Other traits

(This section is just a vague bit of hand-waving.)

The current Constraints proposal defines 4 traits which can follow a
parameter declaration (an attribute and a default value have been included
in these examples to demonstrate where traits fit in with them):

    $x :shared = 0 where ...
    $x :shared = 0 as    ...
    $x :shared = 0 isa   ...
    $x :shared = 0 is    ...

Should we in some fashion allow additional user/pragma defined traits?
E.g. 'does', 'has' etc? I have absolutely no idea of how they could be
hooked in (or even whether they could), or how useful they would be, or
what they (in general terms) would do.

Also, is the existing attribute mechanism sufficient instead? The main
differences are that:
  1. Attributes take simple q()-quoted strings as their argument, and
     are called immediately after the parameter lexical variable has been
     created but before the argument has been processed and bound to the
     parameter.
  2. The currently proposed constraint traits are processed after argument
     and/or default value binding, and what follows them is general perl
     syntax, at a precedence such that only a ',', ')' or another trait
     can terminate them. In addition, the complete collection of trait
     code is enclosed in a logical scope where $_ has been initially
     aliased to the parameter variable. Presumably custom traits would
     follow a similar pattern.


=head2 Order of features within a parameter declaration

I propose the following order:

    ?                 Optional start of query parameter
    ?                 Optional start of boolean query parameter (??$x)
    Int               Optional type
    '\' or '*'        Aliasing
    :                 Named parameter
    [$@%]foo          Sigil with optional parameter name
    !                 Optional "croak if undef"
    :foo(...)         Optional attribute(s)
    ?                 'default' default value (instead of default value below)
    = ....            Default value
    where/as/isa/is ... Constraint
    

=head2 Duplicate parameter names should be an error

At the moment, this just gives a warning:

    sub f ($a,$a) { ... }

    "my" variable $a masks earlier declaration in same scope

I think it should croak instead. (p5hack agreed).


=head2 Signature introspection API.

It has been suggested that there should be a Signature Introspection API
(possibly via a CPAN XS module) which, say, given a code ref, allows perl
code to return information about the sub's signature declaration.

Should perl supply such an API? Failing that, should perl make it easy
(e.g. by guaranteeing a stable optree layout) for a 3rd party to provide
such an API? Or should we declare that this is A Bad Thing - that the
signature is a private implementation detail of a sub which it's free to
change, and that external code inspecting it is wrong? In which case we would
offer no support or guarantees to anything attempting to implement such an
API.

IIRC Aaron Crane has been championing this, and that in a moment of
weakness I may have encouraged him (or at least not discouraged him).
Now from a more sober standpoint, I'm reverting to my usual gut feeling
that we shouldn't provide guarantees about opcodes etc.


=head2 Caller parameter name auto-generation

(This isn't strictly speaking a proposal about signatures.)

When passing named parameters, you often end up passing both a name and
a variable with that same name:

    sub foo(:$x, :$y) { print $x + $y }

    my ($x, $y) = ....;
    foo(x => $x, y => $y); # 'x' and 'y' appear twice in the source

Perl 6 has a bit of syntactic sugar which allows you to avoid repeating
yourself:

    foo(:$x, :$y);

is short for

    foo(x => $x, y => $y)

It would be nice if Perl 5 provided a similar syntax. In fact this
wouldn't just apply to sub calls, it could be used anywhere in list
context, e.g.

    %hash = (:$x, :$y);

It would be a compile-time error in void/scalar context. Unknown context
would be treated as list context, so e.g. these

    return :$y;
    return :$x, :$y;

would be compiled respectively as

    return y => $y;
    return x => $x,  y => $y;

and if that sub was called in scalar context, the sub would return just
the last element, ie. $y.
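
The scalar-context behaviour is the same as for the long-hand form in
current Perl, as this sketch shows (point() is a made-up sub for the
example):

```perl
use strict;
use warnings;

sub point {
    my ($x, $y) = (3, 4);
    return x => $x, y => $y;    # what 'return :$x, :$y' would compile to
}

my %as_hash   = point();        # list context: (x => 3, y => 4)
my $as_scalar = point();        # scalar context: last element, i.e. $y

print join(',', map { "$_=$as_hash{$_}" } sort keys %as_hash), "\n";
print "$as_scalar\n";
```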

Of course, the specific syntax ':$x' won't work in Perl 5, as it's seen as
part of a ? : conditional. In fact most punctuation characters appearing
before a sigil already have some sort of meaning in Perl 5. These appear
to be free still (at least in the context of when a term is expected):

    ^$x
    =$x
    >$x
    .$x

A second possibility is some sort of punctuation char between the sigil
and variable name, e.g. $*foo. Most of the time this currently gets parsed
as a special punctuation variable followed immediately by a bareword.
Possibly the lexer's behaviour could be changed so that a specific
punctuation var followed immediately by a bareword would be treated
specially. This then gives lots of possibilities, e.g.

    $!x
    $"x

etc. However, personally I prefer the special char being before the sigil,
and of the four listed above I think I prefer '='. So that's

    foo(=$x, =$y);
    %points = (=$x, =$y);

etc.

Whitespace should be allowed:

    @points = (= $x, = $y);

Note that it would be a parse error for '=' to precede anything other than
a plain scalar lexical or package variable name. So these are legal:

    = $foo
    = $1
    = ${^FOO} # same as "^FOO" => ${^FOO} # or should this be illegal?
    = $Foo::Bar                           # or should this be illegal?

while these are illegal:

    = @foo
    = $foo[1]

Although arguably

    =@foo
    =%foo

could be shorthand for

    foo => \@foo
    foo => \%foo

(I'm not entirely convinced, though).
0
davem
11/28/2019 5:06:21 PM
[I'm forwarding this on behalf of Toby Inkster]

This is a reply to Dave Mitchell's email, but it will probably get
threaded badly as I've only just subscribed to perl5-porters and don't
have the original email to click "reply" on.


This is all pretty over-engineered. It can be simplified to something
that would work well with existing type implementations such as
MooseX::Types, MouseX::Types, Type::Tiny, and Specio.

First, make this:

  sub f ($x is Int) {
    ...;
  }

effectively a shorthand for:

  sub f {
    my ($x) = @_;
    Int->check($x) or Carp::croak(Int->get_message($x));
  }

Second, define these:

  use Scalar::Util qw(blessed);
  sub UNIVERSAL::check {
    my ($class, $object) = @_;
    blessed($object) && !blessed($class) && $object->DOES($class);
  }
  sub UNIVERSAL::get_message {
    my ($class, $value) = @_;
    "$value is not $class";
  }

Third, nothing. There is no third.

Once you've got those two things working, then the following will
"just work":

  use MooseX::Types::Common qw(PositiveInt);
  sub add_counts ($x is PositiveInt, $y is PositiveInt) {
    return $x + $y;
  }

This works because PositiveInt is just a sub that returns an
object with `check` and `get_message` methods.
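
To make the desugaring concrete without depending on any CPAN type library,
here is a sketch with a hand-rolled type object. My::Type, %TYPE and this
PositiveInt are hypothetical stand-ins, not the MooseX::Types versions; only
the `check` / `get_message` method names come from the proposal:

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical stand-in for a type library: an object offering the two
# methods the proposed 'is' trait would call.
package My::Type {
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub check { my ($self, $value) = @_; return $self->{test}->($value) }
    sub get_message {
        my ($self, $value) = @_;
        $value = 'undef' unless defined $value;
        return "$value is not $self->{name}";
    }
}

package main;

our %TYPE = (
    PositiveInt => My::Type->new(
        name => 'PositiveInt',
        test => sub { defined $_[0] && !ref $_[0] && $_[0] =~ /^[1-9][0-9]*$/ },
    ),
);

# The type "name" is just a sub returning the type object.
sub PositiveInt () { $TYPE{PositiveInt} }

# Hand-written equivalent of: sub add_counts ($x is PositiveInt, $y is PositiveInt)
sub add_counts {
    my ($x, $y) = @_;
    PositiveInt->check($x) or Carp::croak(PositiveInt->get_message($x));
    PositiveInt->check($y) or Carp::croak(PositiveInt->get_message($y));
    return $x + $y;
}

my $sum   = add_counts(2, 3);
my $error = eval { add_counts(2, -1); 1 } ? '' : $@;
```

Because a sub named PositiveInt is already declared, Perl parses
`PositiveInt->check($x)` as a method call on the sub's return value rather
than on a class of that name.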

And this will also "just work" (thanks to UNIVERSAL):

  sub fetch_page ($ua is HTTP::Tiny, $url) {
    my $response = $ua->get($url);
    $response->{success} or die $response->{message};
    return $response->{content};
  }

Yeah, `check` and `get_message` are pretty generic-sounding methods
to be adding to UNIVERSAL and there are arguments in favour of, say,
`CHECK` and `GET_MESSAGE`.

The advantage of `check` and `get_message` is that the following
already works out of the box in MooseX::Types, MouseX::Types, and
Type::Tiny, and can be made to work with Specio with a pretty small
shim.

  Int->check($x) or Carp::croak(Int->get_message($x));

Shim for Specio is:

  sub Int () { return t("Int") }

> Also, the nested HashRef[ArrayRef[Int]] form quickly becomes a
> performance nightmare, with every element of the AoH having to
> be checked for Int-ness on every call to the function.

It's not as bad for performance as you might think.

MouseX::Types and Type::Tiny are capable of checking
HashRef[ArrayRef[Int]] with a single XS sub call. MooseX::Types
and Specio will check it without XS, but it's still a single
sub call, just with a lot of loops and regexp checks.

That said, there are performance improvements that can be made.
One would be at compile time, when Perl sees:

  sub f ($x is Int) {
    ...;
  }

It would call:

  $code = eval { Int->inline_check('$x') };

The Int object would return a string of Perl code like:

  q{ defined($x) && !ref($x) && $x =~ /^-?[0-9]+$/ }

And this would be inlined into the function like:

  sub f {
    my ($x) = @_;
    do {
      defined($x) && !ref($x) && $x =~ /^-?[0-9]+$/
    } or Carp::croak(Int->get_message($x));
  }

(Note I'm using the block form of eval when Perl fetches the inline
code, so Perl isn't evaluating the string of code at run time. It
just allows Int to throw an exception if it's unable to inline the
check. Some checks are hard or impossible to inline.)
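
The effect of inlining can be approximated in plain Perl by building the
sub's source once and compiling it with a string eval. This is a sketch
only: the $inline string is hard-coded here rather than fetched from a real
`inline_check` method, and the error message is a placeholder:

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical result of Int->inline_check('$x'): a string of Perl code.
my $inline = q{ defined($x) && !ref($x) && $x =~ /^-?[0-9]+$/ };

# Splice the check into the sub's source once, when the sub is built,
# so no method call is needed each time the sub runs.
my $source = 'sub { my ($x) = @_; '
           . 'do {' . $inline . '} or Carp::croak("argument is not an Int"); '
           . 'return $x + 1 }';
my $f = eval $source or die $@;

my $result = $f->(41);
my $error  = eval { $f->('abc'); 1 } ? '' : $@;
```

The string eval here happens once per sub; each call then runs only the
inlined expression.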

Once again, there's discussion to be had about the name of the
method `inline_check`, but Type::Tiny and Specio already offer an
`inline_check` method exactly like this. And Moose offers
`_inline_check`. Mouse offers neither, so Perl would just fall
back to doing Int->check($x) to check the value.


Coercions are a whole different kettle of fish and how they interact
with aliasing and read only parameters can get confusing. For this
reason, I'd recommend simply leaving them out of signatures, at least
while people get used to having type constraints in signatures.

People can coerce stuff manually in the body of their sub. This is not
hard, it's probably what they're doing already, and it's almost
certainly more readable than any syntax you can squeeze into the
signature.
0
davem
11/29/2019 2:44:38 PM
Dave Mitchell <davem@iabyn.com> wrote:
:Duplicate named arguments are allowed, the last value being used. This
:allows the useful idiom of foo(%defaults, %options) to work.
[...]
:Note that for duplicate arguments, only the right-most value is likely to
:be evaluated; for example foo(name => $tied1, name => $tied2) would likely
:only call (tied $tied2)->FETCH(), although we don't guarantee this.

Such a guarantee would be useful for the foo(%defaults, %options) idiom.
It may be worth revisiting the practicality of such a guarantee once
there is an implementation.

Hugo
0
hv
11/29/2019 4:44:24 PM
Dave Mitchell <davem@iabyn.com> wrote:
:=head2 Array query parameter
[...]
:Unlike the other query types, this one always examines the raw argument
:list, (i.e. before being sorted for named parameters). Because of this,
:an array query parameter is forbidden from appearing anywhere to the right
:of any named parameter.

Is the transition point only the actual declaration of a named parameter?
It would also be plausible to say that in sub foo(?$have_name, :$name)
the boolean query already introduces the start of processing for named
parameters, and therefore that it should not be permitted to write
sub foo(?$have_name, ?@all, :$name).

:=head2 Hash query parameter
:
:This is most useful in the presence of named parameters. [...]
:
:A hash query parameter can appear anywhere in the signature, including in
:amongst positional parameters, but in that case it is an error unless it
:is an even number of positional parameters before any first named
:parameter:
:
:    sub foo($p1, ?%query, $p2, $p3, :$n1, :$n2) { ... }  # ok
:    sub foo($p1, $p2, ?%query, $p3, :$n1, :$n2) { ... }  # compile-time err

Maybe most useful, but not only useful in the presence of named parameters.
As such, it needs to allow for optional positionals. I think for this case
it should be defined explicitly _not_ to check parity, and silently include
a trailing undef if needed.

Hugo
0
hv
11/29/2019 4:51:15 PM
Dave Mitchell <davem@iabyn.com> wrote:
:3) I propose that within the direct lexical scope of a signature sub, any
:code which uses the @_ variable (both rvalue and lvalue use, and as a
:container or as individual elements),  where detectable by the parser,
:will trigger a compile-time warning.

If possible this should also warn on implicit uses such as goto.

Hugo
0
hv
11/29/2019 4:52:20 PM
Dave Mitchell <davem@iabyn.com> wrote:
:=head2 :ro
[...]
:then within the lexical scope of $x (from the point it's been introduced
:onwards), any lvalue-context usage of that variable is a compile-time
:error. Note that as well as forbidding all the obvious stuff like $x++,
:all the following would also be compile-time errors:
:
:    foo($x);
:    for ($x, ...) { ...}
:    $y = \$x;
:    sub foo: lvalue { ....; return $x }

I can imagine that being inconvenient. Is there an easy way to get an
unmodified temp copy of the value (for uses such as foo($x) or for (...))
without needing another variable? Modified temp copies are easy to get
with eg "$x" or 0+$x, but I don't offhand know of an easy way to get an
unmodified copy.

If we can make it convenient enough, I can see a lot of value in using
this to write more obviously functional code.

Hugo
0
hv
11/29/2019 4:56:55 PM
Dave Mitchell <davem@iabyn.com> wrote:
:=item isa <Class::Name>
:
:    sub f ($x isa Class::Name)
:
:is roughly shorthand for
:
:    sub f ($x where    defined $_
:                    && ref($_)
:                    && $_->isa('Class::Name'))
:
:Although if Paul Evans' 'isa' infix keyword is accepted into core, then
:the signature 'isa' trait should become exactly shorthand for:
:
:    sub f ($x where $_ isa Class::Name) { ... }
:
:and any rules regarding whether the class name is quoted and/or
:has a trailing '::' should be the same.

Will this permit '$x isa $expr' (or does it depend on whether we get
Paul's keyword)? If so, is the expression resolved to a constant at
compile time or at runtime?

:=item is <constraint-type-name>
[...]
:This means that (for example), if a Math::BigInt value is passed as
:the argument for $i, the resulting parameter will just be a plain int
:and any extra data or behaviour will have been lost.

I use bigints a lot, and most functions that don't check their params
too carefully work just fine. I'd love to see an easy way to bypass
the coercion when I know what I'm doing, but fear I won't get one.

The majority don't know or care about bigints, so I anticipate that
Int declarations will get sprayed about liberally primarily as a means
of writing more self-documenting code.

Hugo
0
hv
11/29/2019 5:05:29 PM
On Thu, Nov 28, 2019 at 12:07 PM Dave Mitchell <davem@iabyn.com> wrote:

> =head2 Allow a shortcut for a 'default' default value
>
> allow $foo? as a shortcut for $foo=undef
>   and $?    as a shortcut for $=
>
> Similarly for reference aliases,
>     \$s? becomes a shortcut for \$s = undef,
>     \@a? becomes a shortcut for \@a = [],
>     \%h? becomes a shortcut for \%h = {},
>
> At the same time, ban the existing legal syntax '$=' which means an
> optional placeholder, and allow only the new '$?'. This would make things
> more consistent, as '$foo=' is currently illegal. Also, I find a bare
> trailing equals sign ugly, and it could potentially clash with future
> syntax which might be added to the end of a parameter.
>

While I don't disagree with this from a design standpoint, I would point
out that of the breaking changes that have been proposed, this is the only
one that could make it difficult to write signatures reusable back to the
5.20 versions in the common case. You could use '$dummy = undef' in both
cases, but it might be nice to continue allowing trailing = and maybe allow
it to go through a deprecation cycle later.

-Dan

0
grinnz
11/29/2019 5:49:10 PM
On Thu, 28 Nov 2019 17:01:00 +0000
Dave Mitchell <davem@iabyn.com> wrote:

> Note that in Perl 6 and some CPAN signature modules, the 'method'
> keyword declares an implicit $self parameter, whose name can be
> overridden using a postfix ':':
> 
>     method foo($x, $y)      { $self->{$x} = $y }  # implicit $self
>     method foo($me: $x, $y) {   $me->{$x} = $y }  # explicit invocant
> 
> I have no plans to introduce such a 'method' keyword, but if we did,
> we might need different syntax for the invocant, as the ':' would be
> interpreted as the start of an attribute unless the toker was clever
> and we are very careful that all signature syntax is capable of being
> disambiguated.

That should be fine - I don't see a need to offer users ability to
rename the `$self` anyhow.

In my "Object::Pad" experiment on CPAN, playing around with what ideas
that sort of `method` keyword might add, I didn't stop to consider
letting users change the invocant variable name; there's nothing that
can be gained by adding it as compared to all the complexity of
implementing it. Even the most contrived example, that of a (named)
method generating a method closure which still wants to refer to the
outer $self, can be obtained by making a new lexical for the purpose:

  method generate_closure() {
    my $outer_self = $self;
    return method {
      say "My invocant is $self but outer was $outer_self";
    };
  }

I think that's clear enough for those rare cases, and avoids any
complication that comes from trying to offer customisation of the $self
lexical.

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
0
leonerd
11/30/2019 3:20:11 PM
I am cautiously optimistic on this one.

On Thu, 28 Nov 2019 17:01:42 +0000
Dave Mitchell <davem@iabyn.com> wrote:

....
> In terms of characters, the following are already taken, or might
> be taken under some of the other proposals:
> 
>     $ @ %    sigils
>     \ *      aliasing
>     ?        query parameter
>     , )      signature syntax
>     #        comment
> 
> Personally I think we should stick with ':'.

I agree there aren't many characters left, but I wonder if this feature
combined with attributes both using the colon, might get into any
syntax ambiguities? There is already the existing collision between
colon for sub arguments vs. named labels, which causes

  sub :attr { code here }

to either parse as an attributed anonymous sub, or as a labeled call to
attr() with a hashref constructor.

It may be useful to stare carefully at both these features in
combination to satisfy ourselves it won't be ambiguous.

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
0
leonerd
11/30/2019 3:30:14 PM
On Thu, 28 Nov 2019 17:04:01 +0000
Dave Mitchell <davem@iabyn.com> wrote:

> =head2 :ro
....

I like this and its semantics, and I like the fact it isn't called
":const", thus saving that for perhaps an even-stronger version one day
that would guarantee no memory mutation (i.e. SV type upgrades), as may
be useful for shared memory or other ideas.

+1

> =head3 Reference Aliasing
> 
> The first type, which I will call "reference aliasing", expects the
> argument to be a I<reference> to something, and the signature
> processing code first dereferences that argument (with dereference
> overloading honoured) and aliases the parameter to the resulting
> container - croaking if it's not a suitable reference. For example:
> 
>     sub foo(\$x, \@a, \%h, $other, $stuff) { ... }
> 
>     foo(\$X, [], \%H, 1, 2);
> 
> Then within the body of foo(), $x is an alias for $X, @a for the
> anonymous array, and %h for %H. This type of aliasing is more useful
> for array and hash references, but scalars are supported for
> completeness. Note that @a and %h are *not* slurpy parameters; they
> consume a single argument, and more parameters can follow them.

An often-overlooked thought - what about ref-aliases to code blocks?

Now we have lexical subs, I wonder if we'd permit the following:

  sub MY_map(\&body, @args) {
    my @ret;
    push @ret, body($_) for @args;
    return @ret;
  }

It seems a simple-enough addition of the idea to ref-alias a
`my sub foo` here.
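
For reference, the nearest equivalent with the existing signatures feature
takes the code ref as an ordinary scalar and calls it through an arrow. A
sketch, keeping the MY_map name from above:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Same shape as the \&body version, but $body is a plain code ref.
sub MY_map($body, @args) {
    my @ret;
    push @ret, $body->($_) for @args;
    return @ret;
}

my @doubled = MY_map(sub ($n) { $n * 2 }, 1, 2, 3);
print "@doubled\n";    # 2 4 6
```

The proposed \&body form would let the body write body($_) instead of
$body->($_).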

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
0
leonerd
11/30/2019 3:52:06 PM
On Thu, 28 Nov 2019 17:04:42 +0000
Dave Mitchell <davem@iabyn.com> wrote:

> 5) This is a tentative suggestion that would replace options 3) and
> 4). This would be for a constraint hook to be specified as an
> empty-bodied sub with a single parameter. The constraint(s) specified
> for that parameter become the custom constraints which that hook
> provides. In some fashion the code previously compiled for that
> "prototype" sub's constraint is copied and/or executed. This would be
> more efficient than calling a whole sub for each parameter, and would
> be more constrained than injecting text into the source code.

This would be non-trivial to implement, but I have a growing collection
of situations in which I would love to be able to copy (partial)
optrees from one sub into another. I've often found it doesn't quite
work in a naive approach but I suspect a proper effort into making it
work wouldn't be that difficult. Somewhat outside of the scope of
signatures as such, but it may be worth us collecting up a list of
situations this could help with, to motivate an effort into looking at
making it work.

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
leonerd
11/30/2019 4:22:53 PM
On Thu, 28 Nov 2019 17:06:21 +0000
Dave Mitchell <davem@iabyn.com> wrote:

> =head2 Auto-declare $self
> 
> Perhaps allow simple syntax to auto-declare $self as the first
> argument?
> 
> In Perl 6, the invocant is implicit in method subs, and can be
> accessed using the 'self' keyword:
> 
>     method foo ($x) {  self.do($x, 'foo') }
> 
> but can be explicitly named (note the lack of comma):
> 
>     method foo ($me: $x) {  $me.do($x, 'foo') }
> 
> Cperl supports a similar auto-declaration with an added 'method'
> keyword:
> 
>     method foo ()           { $self->{foo} }
>     method bar ($this:, $x) { $this->{$x}  }
> 
> Perl 5 of course doesn't have a 'method' keyword, and if we were to
> add it, we would need to decide what semantics it brought to the
> table.
> 
> I don't have any strong urge to add such a feature.

As mentioned elsewhere - I've been experimenting in Object::Pad with
adding this, and in summary I don't feel a great need to allow it to be
customised. The invocant variable is called "$self", just like e.g. the
core builtin to print text is called "print".

> =head2 Allow a code block
....

I think allowing a custom code block gives people an extension mechanism
to do more-or-less whatever they want that can't (yet) be provided; and
we can suitably annotate it with performance/etc. warnings. In
addition it may be worth making it subject to an additional `use
feature` that people have to specifically request; so maybe one day we
decide it's too experimental and remove it without affecting anything
else.

That would allow us the space to experiment with what other features
people need, with the aim of adding them as core things later on, letting
people replace their code blocks with those features natively.

> =head2 Whitespace
....
> On the other hand, Perl 6 doesn't allow whitespace. Should we
> similarly ban it from Perl 5 signatures? My gut feeling is yes: fix
> this while still experimental.

+1  Please forbid that "$  foo" whitespace :)

> =head2 Other traits
....
> Should we in some fashion allow additional user/pragma defined traits?
> E.g. 'does', 'has' etc? I have absolutely no idea of how they could be
> hooked in (or even whether they could), or how useful they would be,
> or what they (in general terms) would do.

I think saying "we might add more or an extension facility in future"
is sufficient here. Until a concrete requirement for something turns up
there's no point guessing around it. People can use code-blocks for now
to experiment with whatever else they want, until officially-sanctioned
features exist.

> =head2 Signature introspection API.
> 
> It has been suggested that  there should be a Signature Introspection
> API (possibly via a CPAN XS module) which say, given a code ref,
> allows perl code to return information about the sub's signature
> declaration.

I don't have much requirement for an XS API on signatures, but I can
imagine that Future::AsyncAwait is going to need to know, when
suspending one of these new CVs, that the @_ suppression is in effect.
Normally it has to take special measures around the PL_defav variable,
so it would need to act differently if such is not being used. Aside
from that, parameters get parsed into regular-looking lexical
variables, so it would act the same from that point onwards.

Will there be some CV flag or other mechanism by which I can ask "Is
this CV using signatures and hence snail suppression", and if so skip
that part?

> =head2 Caller parameter name auto-generation
....
>     foo(=$x, =$y);
>     %points = (=$x, =$y);
....
> (I'm not entirely convinced, though).

I dislike these. The entire attraction of Perl6's version is that the
callsite syntax looks the same as the callee's declaration. If we
couldn't provide that syntactic equivalence there's less in favour of
doing it. While DRY is good to an extent, I've never felt overly
burdened by having to type a little bit more just to pass

  draw(colour => $colour);

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
leonerd
11/30/2019 4:46:03 PM
Dave Mitchell wrote:
>            :$name1, # named parameter, consumes ("name1", value) arg pair

I'm dubious about this being something that the core should do.  Named
parameters are not a feature of Perl 5 sub calling per se, the way they
are built into Perl 6.  Seems like something that a plugin on CPAN should
do (once there's a signature plugin mechanism).  I worry about a repeat
of the smartmatch debacle, where the Perl 5 core inhaled too big a dose
of a specific person's rough translation of a bunch of Perl 6 semantics.
Especially with the new semantics being available only in this one way
through this quite specific feature, rather than being a new paradigm
that shows up in several places.

If it is implemented (especially if in the core), then I'd want there to
be some way to separate the naming of the parameter in the argument list
(part of the sub's API) from the naming of the lexical variable (part
of the sub's implementation).  You didn't mention any such mechanism.
Although it's good to have some shorthand for the common case of the names
coinciding, it really ought to be possible to give the lexical variable
a different name, or indeed no name to ignore it.  If you go with the
Perl 6 ":$name" syntax, then the Perl 6 ":apiname($internalname)" and
":apiname($)" syntax would serve.

However, I also have a concern about copying the Perl 6 ":$name" syntax.
It's not a priori bad syntax in the Perl 5 context, but you're certainly
not getting the benefits that that syntax has in the Perl 6 context.
In Perl 6, in an ordinary expression context ":$name" is shorthand for
"name => $name", and the syntax for a named argument with API name
different from internal name likewise resembles a pair-construction
syntax.  In the Perl 5 context, it's tempting to imitate pair syntax
afresh, getting "apiname => $internalname" and "apiname => $" for
the names-differing cases, but then we have no Perl 5 syntax for
the shorthand pair constructor.  Perhaps the signature syntax for the
names-matching case should be "=> $name", although that can never become
the corresponding pair-constructing shorthand in expression context.

I'm also dubious about the way "\" and ":" interact.  You show
"\:@coordinates".  I think the naming indicator, whether ":" or "=>",
should go before the "\".  "foo => \@foo" would accurately imitate a
potential call site, so seems less confusing than anything with the
"\" first.

-zefram
perl5
12/2/2019 1:44:54 PM
Dave Mitchell wrote:
>    @_ will not be set, unset or localised on entry to or exit from
>    a signatured sub;

Bad idea.  Having a warning for recognisable use of @_ is better than
nothing, but according to decisions we already made it's not enough.
Way back, we determined that detection of the use of @_ wouldn't
be sufficiently complete to serve as the trigger for @_ suppression.
The reasons for that decision haven't changed.  It is right that we paid
attention to that issue in repeated decisions that we would not tie @_
suppression to signatures.  Nothing in the rationale for those decisions
has changed either.

@_ suppression should be controlled by means orthogonal to signatures,
probably a per-sub attribute (which can turn suppression either on or off)
with a lexical feature flag setting a default.  Default suppression can
become part of a feature bundle.

-zefram
perl5
12/2/2019 1:56:40 PM
Dave Mitchell wrote:
>    ?$x       peek ahead to the next arg
>    ??$x      peek ahead and see if there is a next arg

Poor Huffman coding: the latter will be used more often.  I think the
peek-ahead cases should (all, consistently) have a second character after
the "?" and before the sigil, and the predicate case should only have one
"?".

>They cannot be a placeholder: i.e. these aren't legal: ??$, ?$, ?@, ?%.

Seems to me that those should be legal no-ops.  Except for the hash one,
which isn't even a no-op, because it would enforce even parity of the
remaining argument list.
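
To illustrate the parity point with signatures as they exist today: a
named slurpy hash already croaks on an odd number of remaining arguments,
which is presumably the check a non-consuming `?%` would perform. This is
a sketch only; the sub name `takes_pairs` is hypothetical.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# A slurpy hash parameter requires the remaining arguments to pair up.
sub takes_pairs ($first, %rest) {
    return scalar keys %rest;
}

my $ok  = eval { takes_pairs('x', a => 1, b => 2) };   # even remainder: fine
my $bad = eval { takes_pairs('x', 'a'); 1 };            # odd remainder: croaks
my $err = $@;

print "even: $ok pairs\n";
print "odd:  ", ($bad ? "accepted" : "rejected"), "\n";
```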

-zefram
0
perl5
12/2/2019 2:08:59 PM
Dave Mitchell wrote:
>            $self isa Foo::Bar,         # croak unless $self->isa('Foo::Bar');
>            $foo  isa Foo::Bar?,        # croak unless undef or of that class
>            $a!,                        # croak unless $a is defined
>            $b    is  Int,              # croak if $b not int-like
>            $c    is  Int?,             # croak unless undefined or int-like
>            $d    is PositiveInt,       # user-defined type
>            $e    is Int where $_ >= 1, # multiple constraints
>            $f    is \@,                # croak unless  array ref
>            $aref as ref ? $_ : [ $_ ]  # coercions: maybe modify the param

Yuck.  This is a huge amount of new syntax to add.  The new syntax doesn't
pull its weight, given that it can only be used in this one context.
If you're adding a bunch of syntax for type constraints, it should also
be available for type checking purposes outside signatures.

It's also rather too Perl6ish for Perl 5: all these consecutive barewords
will cause a bunch of new parsing ambiguities.  Thinking about how the
constraint syntax would be made available in general expression contexts
might help in coming up with less troublesome syntax.

>So, given that a constraint type system and a "real" type system are two
>separate things (unless someone smarter than me can suggest a way of
>unifying them), I think that they should be kept syntactically separate.

Yes, this is a good decision.

>processed against the lexical parameter, after any binding of arguments or
>default value.

In previous discussion, we were leaning towards exempting default values
from constraints.  Given that this is about constraining arguments,
rather than applying types to lexical variables, I still think exempting
defaults is advantageous.
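
A minimal sketch of what exempting defaults would mean, written with
manual @_ unpacking so that it runs on current perls. The `is_int_like`
helper and its regex are assumptions standing in for whatever "is Int"
ends up meaning, and the default value 'none' is chosen deliberately so
that it would fail the constraint.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an "Int" constraint.
sub is_int_like { defined $_[0] && $_[0] =~ /\A-?[0-9]+\z/ }

sub f {
    my $x;
    if (@_ >= 1) {
        # An argument was supplied: constrain it.
        $x = $_[0];
        die "f: argument is not int-like\n" unless is_int_like($x);
    }
    else {
        # No argument: the default is used as-is, unchecked.
        $x = 'none';
    }
    return $x;
}

print "f()      -> ", f(), "\n";
print "f(3)     -> ", f(3), "\n";
print "f('abc') -> ", (eval { f('abc') } // "rejected"), "\n";
```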

As for aliasing, it seems to me that in a signature (\@foo), \@foo is
a scalar value capable of being constrained.  It makes perfect sense to
apply a constraint to an argument that is received by aliasing, and no,
the automatic constraint to it being an array reference isn't enough.

>    sub f ($x isa Class::name)

Would "($x isa $other_class)" be legal?  The stuff about postfix
"?" suggests that this syntax is too specific to permit the use of an
arbitrary expression.  But forbidding general expressions would be an
annoying limitation on the use of "isa".

>    sub foo ($x is Int ) { ... }

Although you say you're not creating a core type system, you are somewhat
doing exactly that here.  You're certainly inventing a namespace populated
with a bunch of type-like objects.  This is not to be done lightly,
and deciding what "Int" means is a substantial task.  You're very much
importing Perl 6 syntax that's tied to semantics that Perl 5 doesn't have.
Perl 6 already has a well defined thing called "Int", which knows which
values satisfy it and which don't, whereas Perl 5 has a semantic that
*anything* is an integer if you want to treat it that way.  We certainly
can come up with concepts of "integer" for Perl 5 that identify a proper
subset of values, but there are many possible concepts, and there's no
precedent for the Perl 5 core being concerned with any of them.
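
As a rough illustration of how quickly two plausible concepts of
"integer" diverge, here is a sketch with two predicates. Both helper
names are hypothetical, and neither is claimed to be what "Int" should
mean.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Concept 1: the string is written as an optional sign plus decimal digits.
sub int_by_syntax { defined $_[0] && $_[0] =~ /\A[+-]?[0-9]+\z/ }

# Concept 2: the value is numeric and has no fractional part.
sub int_by_value  { looks_like_number($_[0]) && $_[0] == int($_[0]) }

for my $v ('42', '1e3', '4.0', '4.5', 'abc') {
    printf "%-5s syntax=%d value=%d\n",
        $v, int_by_syntax($v) ? 1 : 0, int_by_value($v) ? 1 : 0;
}
```

The two predicates agree on '42', '4.5' and 'abc', but disagree on '1e3'
and '4.0', which only the value-based test accepts.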

>    sub foo ($x is PositiveInt) { ... }
>    # roughly equivalent to: ($x is Int where $x >= 0)

I hope you don't think that zero is a positive integer.

>Like 'isa', 'is' type names can be followed by '?', indicating that an
>undefined value is also allowed.

This again implies that general expressions won't be permitted on the
rhs of "is".  It's a bigger problem for "is" than for "isa".

>Type names as used by 'is' occupy a different namespace than perl
>packages and classes,

It's quite necessary to make this distinction, and particularly to
distinguish between "isa" and "is".  But I have issues with the new
namespace used by "is"; see below.

>The built-in constraint types will also coerce the resultant parameter

It's a bad idea to mix these separate concerns.  Constraint checking and
type coercion are different ideas that should remain distinct.  Also,
just as there are multiple ideas of what is an integer in Perl 5, there
are multiple ideas of what turning a value into a `purer' integer entails.
Remember, passing "is Int" implies that the supplied argument *is* an
integer (whatever that means), so coercing it *to* an integer should
be the identity operation.  If you're doing a non-identity coercion,
that means you've got a second, stricter, concept of integer in play.
Wanting to check that an argument satisfies one concept of integer
does not imply which stricter concept of integer you'd like it to be
converted to.

>Constraints apart from '!' and 'isa' cannot be used on a parameter which
>is a direct alias (e.g.  *$x), since this might trigger coercing the
>passed argument and thus causing unexpected action at a distance.

It seems essential, to me, that constraints should be applicable to
aliased parameters.  Constraint checking code should not have such bad
taste as to side-effect its parameter.  Coercion would also better be
seen as a function applied to the parameter to return a coerced value,
rather than mutating its input.  But if code is written such that it
does behave so badly, well, it's not the first nor even the fifth place
in Perl 5 that side effects can surprise distant code.

>The complete collection of where/as/isa/is clauses are collectively
>enclosed in their own logical single scope,

This sits uneasily with the interleaving of these clauses with default
value expressions.  I'm not sure what the scope of lexical variables
introduced in a where clause *should* be, but I'm pretty sure it shouldn't
be visible in a later where clause without also being visible in an
intervening default value expression.  I think it's also difficult
to implement such selective visibility, given the way lexical scopes
are managed.

I'm concerned about the idea of this scope, whether in its lexical or
dynamic aspects, being at all visible to the programmer.  It has the
whiff of implementation leaking out.

>Constraints can only be supplied to scalar parameters; in particular they
>can't be applied to:
....
>* Placeholder (nameless) parameters.

Bad idea.  It should be possible to type check an argument that is
otherwise ignored.  If it's just a matter of the implementation wanting
a lexical variable to apply the constraint logic to, you can perfectly
well create a lexical variable (pad slot) without any name.

>    $x is Int+    equivalent to:   $x is Int where $_ >= 0
>    $x is Str+    equivalent to:   $x is Str where length($_) >  0

Failure to make the "+" parts analogous.  This suggests that this kind of
name (for which there's no precedent in Perl) would be fairly confusing.

>I think we should also include a few built-in "symbol" constraint type
>names,

Doesn't seem worth the irregularity.

>At compile time it will be possible for pragmata and similar to add
>lexically-scoped type hook functions via the hints mechanism.

It seems to me that Perl already has serviceable namespacing mechanisms,
and doesn't need a new kind of namespace just for type constraints.
It would be better for the rhs of "is" to take an arbitrary expression,
and use the value to which that expression evaluates as the type
constraint object.  This way we get to use all our existing mechanisms
to manage the names of type constraints.  One "use" declaration and
the programmer can have "Int" et al defined the way you imagine.
This would also avoid the core taking some arbitrary position on what
"Int" `really' means.

The ability to construct type constraints in a general expression can
easily subsume "isa" and "where".  There's no need for so much syntax.

We also already have a serviceable mechanism for type constraint objects:
objects that overload the smartmatch operator.  No need to reinvent
the wheel.

>4) Return a string containing a source code snippet to be inserted into
>the source text at that point.

Yuck.  Terrible plugin mechanism; very vulnerable to lexical state
affecting the parsing.  Don't bring Devel::Declare crack into the core,
and don't encourage people to write fragile plugin code.  It's already
possible for a constraint checking sub to inline itself via call checker
magic.

>This would be for a constraint hook to be specified as an empty-bodied sub
>with a single parameter. The constraint(s) specified for that parameter
>become the custom constraints which that hook provides.

Nasty.

>I propose that for each built-in constraint type there will be a
>corresponding function in the 'is::' namespace which returns a boolean
>indicating whether the argument passes that constraint.

Too limited.  If there's special syntax on the rhs of "is", then the
whole thing, including "where" clauses, "?" decorations, and references
to user-defined type constraints, should be available in some kind of
expression context.  Essentially, "$x is Int where $_ > 3" should be a
truth-value expression.  Of course, this runs into the problem of the
bareword-based syntax not playing nicely with existing expression syntax;
the syntax would have to be redesigned to fix that.

>be also be useful for built-ins having extra characters in them like I
>suggested above, e.g. is::is($x, 'Int++') and is::is($aref, '\@');

Wrong way to do it.  It would mean essentially implementing the type
constraint syntax twice: once in the actual parser, for signatures,
and a second time to handle the string argument to is::is().

>Moose supports aggregate and alternation / composite constraints; for
>example, ArrayRef[Int] and [Int|Num].
>
>Personally I think that we shouldn't support these; it will make things
>far too complex.

Semantically, things like ArrayRef[Int] and junctions are quite
frequently needed.  It should be easy to construct such type constraints.
Predeclaring and giving them monomial names as user-defined type
constraints seems rather cumbersome.  This is part of why I favour the
rhs of "is" being a general expression context.
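
A sketch of what building such constraints from ordinary expressions
could look like. Each constraint here is simply a code ref returning true
or false; that representation, and the constructor names `Int`,
`ArrayRef` and `AnyOf`, are assumptions made for illustration, not part
of any proposal.

```perl
use strict;
use warnings;

# An "integer" constraint, under one of many possible definitions.
sub Int {
    return sub { defined $_[0] && !ref($_[0]) && $_[0] =~ /\A-?[0-9]+\z/ };
}

# Array reference whose every element satisfies the inner constraint.
sub ArrayRef {
    my ($inner) = @_;
    return sub {
        my ($v) = @_;
        return 0 unless ref($v) eq 'ARRAY';
        for my $elem (@$v) {
            return 0 unless $inner->($elem);
        }
        return 1;
    };
}

# Junction: satisfied if any of the alternatives is satisfied.
sub AnyOf {
    my @alts = @_;
    return sub {
        my ($v) = @_;
        for my $c (@alts) {
            return 1 if $c->($v);
        }
        return 0;
    };
}

# Composed inline, with no predeclared monomial name needed.
my $int_or_ints = AnyOf(Int(), ArrayRef(Int()));
print $int_or_ints->([1, 2, 3]) ? "ok\n" : "not ok\n";
```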

>                 Also, the nested HashRef[ArrayRef[Int]] form quickly
>becomes a performance nightmare, with every element of the HoA having to
>be checked for Int-ness on every call to the function.

If that's the type checking that's actually required, then the cost
of checking must be borne.  It is a false economy to discourage the
programmer from making the proper checks.

-zefram
perl5
12/2/2019 3:50:20 PM
Dave Mitchell wrote:
>    copy   ro  # no P6 equivalent, and not very useful?

Slightly useful: unlike a ro alias, its value couldn't change in a way
that the sub doesn't know about.

>Placeholder direct alias parameters are forbidden. There's no (*$, *@).

I don't see a good reason to forbid these.

>Now for the bikeshedding about what syntax to use for direct aliasing.
>Zefram tentatively proposed using $x\, whereas I propose *$x.

I'm fairly happy with the prefix "*".  I do find it clearer than the
postfix "\".  However, I don't find any of your specific arguments for
it meritorious.  In particular:

>* By having a single syntactical slot to indicate aliasing, that slot can
>  be populated with only one of two chars to indicate which type of aliasing
>  is wanted (\$x and *$x). With two slots there's ambiguity: what does
>  \$x\ mean?

"\$x\" would of course be prohibited, just as "*\$x" or "\*$x" would
be prohibited.  I don't find either syntax for both kinds of aliasing at
once to be more or less inviting than the other.  With "\*$x" actually
being legal (and useful) expression syntax, it really doesn't look as
though "\" or "*" syntactically precludes the other.

>* It has a loose mnemonic association with aliasing via typeglob
>  assignment: *foo = ....;.

I find this association to be an argument *against* prefix "*".  The
usage you're proposing has nothing at all to do with typeglobs, so the
association with the glob sigil is misleading.  It also might interfere
a bit with any future work to process actual typeglobs in signatures.
(It's quite feasible to add a facility for lexical glob names in the
future, and there's motivation in that I/O handles are very often wrapped
in globs.)

>* It has a loose association with Perl 6's array flattening syntax, *@a,

This association, too, is misleading, because the facility is semantically
very different.

>There is the question of default behaviour. In Perl 5 currently the
>default is to copy, while in Perl 6 the default is a read-only alias.

The default should remain copying.  Read-only aliasing is nicer
behaviour, but a Perl 5 :ro alias (as you described it) is not, because
of the inability to then pass the variable as a subroutine argument.
Writable aliasing would also be surprising behaviour, because it is
customary to copy arguments, and therefore writing to argument-derived
variables normally doesn't write to the arguments.

>possibility is to make aliasing the default in the presence of :ro.

Too confusing.  Until now, ":ro" has been an attribute that gets
orthogonally composed with other signature features.  Don't break the
orthogonality.

-zefram
perl5
12/2/2019 4:19:32 PM
Dave Mitchell wrote:
>However, there will be an implicit scope around the collection of
>where/as/isa/is traits (see the "Type and Value Constraints and Coercions"
>thread).

I'm dubious about this scope, as I described on that subthread.

>I propose that a few aspects of ordering are well defined; everything
>else is left undefined,

This sounds unPerlish.  Traditionally we define order of evaluation
quite strongly.  You can also bet that any change to order of evaluation
will bite someone.

-zefram
perl5
12/2/2019 4:27:10 PM
Dave Mitchell wrote:
>    sub f($x //= expr) { ... }

I'm dubious about that.  It's implying that, after $x has been initialised
from the argument, "$x //= expr" is performed.  The problem with that is
that it suggests that other kinds of assignment should work similarly:
not only "||=", but consider "sub f($x *= 2) {...}".  This could be
implemented just fine: initialise $x from the argument, and then perform
"$x *= 2" so that the body sees twice the argument value.  But then that
implies semantics for "sub f($x = expr) {...}" that are very different
from what we've already decided on.  I think if "//=" is going to be
allowed here then the defaulting "=" has to change to "!!=" or some such.
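
The semantic difference at stake can be sketched with manual unpacking,
which runs on any perl from 5.10 on. The sub names are hypothetical; the
first mimics the existing "=" default (applied only when the argument is
missing), the second the proposed "//=" behaviour.

```perl
use strict;
use warnings;

# "=" style default: applies only when the argument is missing.
sub with_eq {
    my ($x) = @_;
    $x = 5 if @_ < 1;
    return $x;
}

# "//=" style default: also applies when undef is passed explicitly.
sub with_dor {
    my ($x) = @_;
    $x //= 5;
    return $x;
}

printf "with_eq(undef)  -> %s\n", with_eq(undef)  // 'undef';
printf "with_dor(undef) -> %s\n", with_dor(undef) // 'undef';
```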

>    sub f(Dog $spot, ...) { ..}

Meh.  I suppose if attributes are supported then this should be supported
too.

In the spirit of matching the "my" usage, the type should always be
adjacent to the specification of the lexical variable.  Thus "*Dog $spot",
not "Dog *$spot".

>allow $foo? as a shortcut for $foo=undef

Meh.  I'm generally not a fan of these shortcuts.

>  and $?    as a shortcut for $=

If "$foo?" is allowed then "$?" certainly should be, but "$=" seems
short enough already.

>    \$s? becomes a shortcut for \$s = undef,

Presumably you mean "\$s = \undef"; "\$s = undef" would signal an error
in the default case.

>    \@a? becomes a shortcut for \@a = [],
>    \%h? becomes a shortcut for \%h = {},

Defaulting to empty aggregates, sure, but fresh mutable ones?  To parallel
the scalar default they should alias *im*mutable empty aggregates.
Probably always the same immutable array and the same immutable hash.

If there's serious doubt about what the "?" defaults should be, then the
"?" shorthand clearly isn't going to work.

>At the same time, ban the existing legal syntax '$='

I disagree.  This syntax is fine as it is, part of a coherent design.
"=" is consistently used in the syntax for optional parameters.

>Perhaps allow simple syntax to auto-declare $self as the first argument?

Imposing a specific variable name on the user?

It's a bit difficult to discuss this absent any "method" keyword or
anything else that implies a subroutine taking one or more stereotyped
positional parameters.

>    sub foo($x, ?{ print "x=$x\n" }, $y = $x+1) { ... }
....
>I suppose the question is, whether this is useful, and whether it allows
>you to do things that can't be done with default value expressions,

Clearly it can do things that would be awkward otherwise.  For example,
it can munge the value of a parameter variable in arbitrary fashion,
making the munged value visible to expressions associated with setting up
later parameter variables.  Linking to the "//=" discussion above, "($x,
?{ $x //= 5 }, $y = $x+1)" lets one easily achieve the effect proposed
for "($x //= 5, $y = $x+1)", without needing to build in a "//=" operator.

So yes, it's useful, and it reduces the pressure for a bunch of other
features of dubious value.

>The '?' before the '{' isn't strictly necessary syntax-wise, but grouping
>it in with the 'query parameter' syntax emphasises that this parameter
>doesn't consume an argument.

It's a bit ugly, but the "query parameter" concept may make up for that.

>My feeling (and as expressed in other individual proposals here) is that
>everything apart from the sigil and parameter name is signature syntax and
>can have optional whitespace around it, like perl stuff generally can.

Yes, these things should permit whitespace.

>On the other hand, Perl 6 doesn't allow whitespace. Should we similarly
>ban it from Perl 5 signatures? My gut feeling is yes: fix this while still
>experimental.

No.  Perl 5 permits whitespace between sigil and identifier in all sorts
of non-experimental situations, including of course the non-experimental
"my" syntax for declaring lexical variables.  The signature syntax should
be consistent with the rest of Perl 5.  There is no justification for
importing a Perl 6 syntax rule just for this one case.

>Should we in some fashion allow additional user/pragma defined traits?
>E.g. 'does', 'has' etc?

"does" and "has" sound like additional forms of type constraint, and as
such should be handled as user-defined type constraints.  Preferably by
permitting arbitrary expressions on the rhs of "is", as I discussed in
that subthread.

As for other traits, we don't seem to have any generalised concept of
what traits are for.  Of all the traits you've proposed, one is for
coercion, and all the others are about constraints.  (Except for the bit
where you imagine "is Int" performing both constraint *and* coercion,
which I reckon is a bad idea.)  It would only be meaningful to have a
generalised trait system if there were a generalised way for plugged-in
trait code to do meaningful things.  We kinda already have such a system
with attributes, but it turns out there's very little that attribute code
can meaningfully do.  In Perl 6, trait handlers have a rich metaobject
ecosystem to mess about in; Perl 5 doesn't have anything like that,
so I think a generalised trait system would be at least as useless as
the generalised attribute system.

>I propose the following order:
....
>    Int               Optional type

This should go immediately before the sigilled name, as part of the
declaration of the lexical variable rather than anything to do with
arguments per se.

>    :foo(...)         Optional attribute(s)

This should go immediately after the sigilled name, for the same reason.

>    = ....            Default value
>    where/as/isa/is ... Constraint

This order is appropriate if constraints and coercion apply to default
values.  However, it means that trait keywords screw up expression syntax
in default value expressions, which should be avoided.  If default values
are exempted from constraint checking (as I've suggested), the order
should be swapped.  If traits come before default value then there's
no such syntactic problem: the defaulting "=" is a fine delimiter for
a trait expression, based purely on operator precedence.

>    sub f ($a,$a) { ... }
>
>    "my" variable $a masks earlier declaration in same scope
>
>I think it should croak instead. (p5hack agreed).

I think it should be consistent with other ways of declaring lexical
variables.  The rules should not be different just for signatures.
The programmer is free to make shadowing warnings fatal, and if we think
this is terrible style then we're free to implement a stricture (which
can be part of a Perl-version feature bundle) to make shadowing fatal.
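
For ordinary "my" declarations, making the shadowing warning fatal might
look like the sketch below. It assumes perl 5.28 or later, where this
warning lives in its own 'shadow' category, and uses string eval only so
that the compile-time failure can be caught and shown.

```perl
use strict;
use warnings;

# Assumes perl 5.28+ ('shadow' warnings category).
my $plain = eval q{
    no warnings 'shadow';             # silenced here; normally just a warning
    my $v = 1; my $v = 2;
    $v;
};

my $fatal = eval q{
    use warnings FATAL => 'shadow';   # the same code no longer compiles
    my $v = 1; my $v = 2;
    $v;
};
my $fatal_err = $@;

print "plain: ", (defined $plain ? $plain : 'failed'), "\n";
print "fatal: ", (defined $fatal ? $fatal : 'failed to compile'), "\n";
```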

>=head2 Signature introspection API.
....
>Should perl supply such an API?

Introspecting a signature should be exactly as easy (no easier and
no more difficult) as introspecting any other op in a sub's body.
It should absolutely not be promoted as subroutine metadata to be
examined independently from the rest of the body.  It's an internal
implementation detail that should not leak out independent of the rest
of the implementation.  If it is perceived as metadata, there's a danger
that it'll be treated as part of the sub's API, and become something
that isn't allowed to change between module versions.

>    foo(:$x, :$y);
....
>It would be nice if Perl 5 provided a similar syntax.

Yes, a bit.  If such a syntax is added, the named parameter syntax for
the common same-name case should imitate the syntax for this shorthand.

>It would be a compile-time error in void/scalar context.

That sounds like a bad idea.  List expressions are generally permitted in
void and scalar context.  I think this feature should be pure shorthand
for (foo=>$foo).

>Of course, the specific syntax ':$x' won't work in Perl 5, as it's seen as
>part of a ? : conditional.

Yes.  I think it might also clash with some of the other uses of colon.

>                                                            These appear
>to be free still (at least in the context of when a term is expected):
>
>    ^$x
>    =$x
>    >$x
>    .$x

"=" would be problematic because of clashes with POD syntax.  There's
already a bit of a clash, but it only arises where a new paragraph
(as judged for POD purposes) happens in the middle of an expression.
"=$x" would mean that expression syntax can *start* with "=", making it
much easier to mistake code for POD.

Of these options, I have some preference for ">$x", which at least looks
like an abbreviation of "x=>$x".  But all these options are ugly.

>A second possibility is some sort of punctuation char between the sigil
>and variable name, e.g. $*foo.

This is worse, for the reason you outlined.

>    =@foo
>    =%foo

This seems more confusing than it's worth, with the implicit
enreferencement.  On the receiving end of a named parameter, I'd want
the syntax to be "foo => \@foo", to make the aliasing clear, and the
call should be similar.

-zefram
perl5
12/2/2019 5:35:04 PM
Paul "LeoNerd" Evans wrote:
>imagine that Future::AsyncAwait is going to need to know, when
>suspending one of these new CVs, that the @_ suppression is in effect.

Yes, that's a good point.  @_ suppression is technically part of the
internal implementation, and so shouldn't be promoted as something for
general users to introspect.  But it is of a different character from the
sub body, and there's legitimate reason for modules like yours to look
at it.  Presumably you'd be able to look at the same flag, probably one
of the CvFLAGS(), that the core looks at in deciding whether to set up @_.

And we should be clear that what you're introspecting here is whether @_
gets set up, *not* whether the sub has a signature.

-zefram
perl5
12/2/2019 5:41:55 PM

I like prefix backslash because it is similar to C++ pass-by-reference
syntax. As that's the default in Perl though -- for direct access into @_
at least -- I'm gathering the idea is

   sub mysub($foo,$bar, baz){ ...

is the same as
   sub mysub{ my ($foo, $bar, $baz) = @_       ; ... # copies

while
    sub mysub($foo, $bar, \$baz) { ...
copies $foo and $bar but $baz is an alias? Just like a C++ argument
declaration of
    scalar_t mysub (scalar_t foo, bar, &baz) {...
would do?

does that match the big proposal I didn't read in detail?

also, does it match my-list assignment syntax, so the transform from having
an argument list to starting the subroutine with assignment to the argument
list would still work, were it to be implemented as a source filter?
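
As a point of reference, the copy/alias distinction being asked about can
already be seen in current Perl without signatures. The following is only
a sketch of existing @_ behaviour, not of the proposed syntax; the sub
names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a copy, so the caller's variable is untouched.
sub bump_copy {
    my ($n) = @_;      # list assignment copies the argument
    $n++;
    return $n;
}

# Hypothetical helper: $_[0] is an alias to the caller's variable.
sub bump_alias {
    $_[0]++;           # modifies the caller's scalar in place
    return $_[0];
}

my $count = 1;
bump_copy($count);     # $count is still 1
bump_alias($count);    # $count is now 2
```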





-- 
Coming to you live, from behind Sneelock's store, in the big vacant lot.

0
davidnicol
12/2/2019 7:16:05 PM
On Mon, Dec 2, 2019 at 8:57 AM Zefram via perl5-porters <
perl5-porters@perl.org> wrote:

> Dave Mitchell wrote:
> >    @_ will not be set, unset or localised on entry to or exit from
> >    a signatured sub;
>
> Bad idea.  Having a warning for recognisable use of @_ is better than
> nothing, but according to decisions we already made it's not enough.
> Way back, we determined that detection of the use of @_ wouldn't
> be sufficiently complete to serve as the trigger for @_ suppression.
> The reasons for that decision haven't changed.  It is right that we paid
> attention to that issue in repeated decisions that we would not tie @_
> suppression to signatures.  Nothing in the rationale for those decisions
> has changed either.
>

This is not conditional on detection, but on enabling of the signatures
feature, a deterministic and obvious mechanism.


> @_ suppression should be controlled by means orthogonal to signatures,
> probably a per-sub attribute (which can turn suppression either on or off)
> with a lexical feature flag setting a default.  Default suppression can
> become part of a feature bundle.
>

While I'm not disagreeing with your argument, I also feel that practically
they are linked as far as the user is concerned, and so separating them for
the user will only lead to extra boilerplate for everyone. Suppressing @_
without enabling signatures would make subroutines virtually impossible to
use, and @_ is useless once signatures are enabled (provided the proposed
aliasing and predicate functionality is implemented).

-Dan

0
grinnz
12/2/2019 8:46:25 PM
Dan Book wrote:
>This is not conditional on detection, but on enabling of the signatures
>feature, a deterministic and obvious mechanism.

The point is that the acceptability of controlling @_ suppression
based on signatures is being founded on the warning that is based on
automatic detection of references to @_.  The automatic detection being
too incomplete means that that warning can't be relied upon to catch @_
references, which means we have the problem of signature introduction
being liable to leave undetected uses of @_ in place.
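
To make the concern concrete, here are two uses of @_ in current Perl
where the array never appears as a plain "@_" token in the sub body's
code. Whether the proposed warning would catch either is not stated in
this thread; they are offered only as examples of what a purely lexical
scan could plausibly miss. All sub names are hypothetical.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

# String eval: at compile time the reference to @_ exists only inside a string.
sub via_string_eval {
    return eval 'scalar @_';
}

# "&name;" with no parentheses calls the sub with the current @_ shared.
sub via_ampersand_call {
    return &count_args;
}
```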

-zefram
0
perl5
12/2/2019 9:55:01 PM
On Thu, Nov 28, 2019 at 05:01:42PM +0000, Dave Mitchell wrote:
> The ordering of parameter types within a signature would be extended to be:
> 
>     1) zero or more mandatory positional parameters, followed by
>     2) zero or more optional positional parameters, followed by
>     3) zero or more mandatory named parameters, followed by
>     4) zero or more optional named parameters, followed by
>     5) zero or one slurpy array or hash
> 
> Except that it would be a compile-time error to have both (2) and (3).
> 
> (3) and (4) are new. Note that there isn't any semantic need for any
> optional named parameters to always follow all mandatory named parameters,
> but including that restriction doesn't prevent you doing anything (as far
> as I can see), provides consistency with positional parameters, and
> potentially allows better optimisations.

I'd prefer to be able to mix mandatory and optional named parameters,
a function might take one mandatory parameter that's necessary for the
function, and an optional parameter with a reasonable default that
describes how to interpret that parameter.

Allowing those to remain together in the signature improves the
readability of the signature.

For example, pixels and type in

https://metacpan.org/pod/distribution/Imager/lib/Imager/Draw.pod#setscanline
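
A rough sketch of that shape in today's Perl, with a mandatory named
argument sitting next to the optional one that qualifies it. The sub is
hypothetical and only loosely modelled on the pixels/type pairing; the
'8bit' default is an assumption for illustration, not a statement about
Imager.

```perl
use strict;
use warnings;

# Hypothetical sub: "pixels" is required, "type" is optional with a default.
sub set_scanline {
    my %arg = @_;
    die "missing required argument 'pixels'\n" unless exists $arg{pixels};
    my $pixels = $arg{pixels};                              # mandatory
    my $type   = exists $arg{type} ? $arg{type} : '8bit';   # optional, defaulted
    return sprintf '%s x %d', $type, scalar @$pixels;
}
```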

Tony
0
tony
12/2/2019 11:32:40 PM
On Mon, Dec 2, 2019 at 6:37 PM Tony Cook <tony@develop-help.com> wrote:

> On Thu, Nov 28, 2019 at 05:01:42PM +0000, Dave Mitchell wrote:
> > The ordering of parameter types within a signature would be extended to be:
> >
> >     1) zero or more mandatory positional parameters, followed by
> >     2) zero or more optional positional parameters, followed by
> >     3) zero or more mandatory named parameters, followed by
> >     4) zero or more optional named parameters, followed by
> >     5) zero or one slurpy array or hash
> >
> > Except that it would be a compile-time error to have both (2) and (3).
> >
> > (3) and (4) are new. Note that there isn't any semantic need for any
> > optional named parameters to always follow all mandatory named parameters,
> > but including that restriction doesn't prevent you doing anything (as far
> > as I can see), provides consistency with positional parameters, and
> > potentially allows better optimisations.
>
> I'd prefer to be able to mix mandatory and optional named parameters,
> a function might take one mandatory parameter that's necessary for the
> function, and an optional parameter with a reasonable default that
> describes how to interpret that parameter.
>
> Allowing those to remain together in the signature improves the
> readability of the signature.
>
> For example, pixels and type in
>
>
> https://metacpan.org/pod/distribution/Imager/lib/Imager/Draw.pod#setscanline


I agree. This restriction isn't necessary, doesn't have any clear benefit,
and means existing calls would very likely have to be adjusted when such a
signature is applied in a codebase.

-Dan

0
grinnz
12/2/2019 11:41:10 PM
On Mon, Dec 02, 2019 at 02:08:59PM +0000, Zefram via perl5-porters wrote:
> Dave Mitchell wrote:
> >    ?$x       peek ahead to the next arg
> >    ??$x      peek ahead and see if there is a next arg
> 
> Poor Huffman coding: the latter will be used more often.  I think the
> peek-ahead cases should (all, consistently) have a second character after
> the "?" and before the sigil, and the predicate case should only have one
> "?".

I too think the shorter version should be the predicate rather than
the peek ahead.

> 
> >They cannot be a placeholder: i.e. these aren't legal: ??$, ?$, ?@, ?%.
> 
> Seems to me that those should be legal no-ops.  Except for the hash one,
> which isn't even a no-op, because it would enforce even parity of the
> remaining argument list.

I'd rather they were illegal if they're no-ops, since we might find a
later use for the syntax that isn't a no-op.

That said all this punctuation might be a bit too much, with perl's
reputation as a line-noise language being further cemented.

But COBOL lies down the other road.

Tony
0
tony
12/3/2019 4:59:39 AM
Tony Cook wrote:
>I'd rather they were illegal if they're no-ops, since we might find a
>later use for the syntax that isn't a no-op.

They can't be acceptably used for any other meaning, because the logic
of combining the peek-ahead syntax with the ignored-parameter syntax
provides a clear implication that they're no-ops.

-zefram
0
perl5
12/3/2019 8:31:45 AM
I wrote:
>Dave Mitchell wrote:
>>    =@foo
>>    =%foo
>
>This seems more confusing than it's worth, with the implicit
>enreferencement.

Further thought: I'd be OK with a similar shorthand that made the
enreferencement explicit.  So, accepting "=" for the purposes of this
illustration, "=\@foo" would be shorthand for "(foo=>\@foo)".  Probably
only permit one reference deep, but probably permit it on scalars, subs,
and globs, not just on arrays and hashes.  Matching syntax should be
available in signatures: "foo=>\@bar" would be the syntax for an argument
named "foo" that has to be an array ref and results in the lexical named
"@bar" being aliased to that array, then "=\@foo" would be signature
shorthand for "foo=>\@foo".
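
For comparison, the receiving-end behaviour described here (a named
argument that must be an array ref, with a lexical array aliased to it)
can be approximated today with the experimental refaliasing feature
(Perl 5.22 or later). This is a sketch under that assumption, not the
proposed signature syntax, and the sub name is hypothetical.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Hypothetical sub: expects foo => \@array and aliases a lexical to it.
sub append_marker {
    my %arg = @_;
    die "foo must be an array reference\n" unless ref $arg{foo} eq 'ARRAY';
    \my @bar = $arg{foo};   # @bar is now an alias of the caller's array
    push @bar, 'marker';
    return scalar @bar;
}

my @foo = (1, 2);
append_marker(foo => \@foo);   # @foo now has three elements
```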

-zefram
0
perl5
12/3/2019 12:11:19 PM
I wrote:
>usage you're proposing has nothing at all to do with typeglobs, so the
>association with the glob sigil is misleading.  It also might interfere
>a bit with any future work to process actual typeglobs in signatures.

Thinking about it more, I find the case compelling to avoid using
any sigil character as a prefix meaning anything other than the sigil.
But I agree with a prefix character for this aliasing use being superior
to a postfix (though not for your reason).

I suggest that "=" would be a better prefix character, mnemonically
suggesting a stronger-than-usual equality between the argument and the
parameter variable.  In assigning that character I'm assuming that "="
wouldn't be used for the pair-constructor shorthand (because of the POD
clash) and therefore also wouldn't be used for named-parameter shorthand
in signatures.  However, making it possible to have "=" at the start
of a signature item invokes a milder version of the POD clash that I
described in respect of pair shorthand.  It's milder because this isn't
letting an actual expression begin with "=", and therefore isn't letting
a *statement* begin with "=".

If "=" is somehow ruled out for this purpose, there are many options
of weaker mnemonic value that are about as good as each other.  "+"
is probably my second favourite.

-zefram
0
perl5
12/3/2019 12:29:57 PM
On Fri, Nov 29, 2019 at 04:51:15PM +0000, hv@crypt.org wrote:
> Dave Mitchell <davem@iabyn.com> wrote:
> :=head2 Array query parameter
> [...]
> :Unlike the other query types, this one always examines the raw argument
> :list, (i.e. before being sorted for named parameters). Because of this,
> :an array query parameter is forbidden from appearing anywhere to the right
> :of any named parameter.
> 
> Is the transition point only the actual declaration of a named parameter?
> It would also be plausible to say that in sub foo(?$have_name, :$name)
> the boolean query already introduces the start of processing for named
> parameters, and therefore that it should be not permitted to write
> sub foo(?$have_name, ?@all, :$name).

I think it could be argued either way. The raw and sorted arg lists
are identical up until the first named arg is consumed, so
(?$have_name, ?@all, :$name) has (or can have) a well-defined meaning.

> 
> :=head2 Hash query parameter
> :
> :This is most useful in the presence of named parameters. [...]
> :
> :A hash query parameter can appear anywhere in the signature, including in
> :amongst positional parameters, but in that case it is an error unless it
> :is an even number of positional parameters before any first named
> :parameter:
> :
> :    sub foo($p1, ?%query, $p2, $p3, :$n1, :$n2) { ... }  # ok
> :    sub foo($p1, $p2, ?%query, $p3, :$n1, :$n2) { ... }  # compile-time err
> 
> Maybe most useful, but not only useful in the presence of named parameters.
> As such, it needs to allow for optional positionals. I think for this case
> it should be defined explicitly _not_ to check parity, and silently include
> a trailing undef if needed.

So you mean something like:

    sub foo(?%all, $p1, $p2 = 0)

?

My current intent would be:

In the absence of any named params (or trailing hash slurpy), no runtime
parity check is done on the args.

If called as foo("a"), then %all becomes ("a", undef).

What about 

    sub foo(?%all, $p1, $p2, $p3 = 0)

should that be a compile-time error or work with an implicit undef 4th arg?

-- 
Dave's first rule of Opera:
If something needs saying, say it: don't warble it.
0
davem
12/3/2019 1:19:45 PM
On Mon, Dec 02, 2019 at 02:08:59PM +0000, Zefram via perl5-porters wrote:
> Dave Mitchell wrote:
> >    ?$x       peek ahead to the next arg
> >    ??$x      peek ahead and see if there is a next arg
> 
> Poor Huffman coding: the latter will be used more often.  I think the
> peek-ahead cases should (all, consistently) have a second character after
> the "?" and before the sigil, and the predicate case should only have one
> "?".

This came about because originally I had just ?$x, ?@a, ?%h with ?$x being
the predicate. Someone pointed out that by analogy with ?@a and ?%h, ?$x
might be expected to peek and copy the next arg - hence the predicate was
re-invented as ??$x.

I'm happy for it to become the other way round:

    ?$x       peek ahead and see if there is a next arg
    ??$x      peek ahead to the next arg
    ??@a      peek ahead and copy all remaining args
    ??%h      peek ahead and copy all remaining key/val arg pairs
    ?{ code } execute some code without consuming any args

or am open to other syntax suggestions.
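
For readers following along, this is roughly what each of those
operations looks like when written by hand against @_ in current Perl.
It is a sketch only; the sub and variable names are hypothetical and
nothing here uses the proposed syntax.

```perl
use strict;
use warnings;

# Hypothetical sub: performs each "peek" by hand before consuming anything.
sub inspect_args {
    my $has_next = @_ > 0 ? 1 : 0;   # predicate: is there a next arg at all?
    my $next     = $_[0];            # peek: copy the next arg without consuming it
    my @rest     = @_;               # peek: copy all remaining args
    my $x        = shift;            # ordinary parameter: consumes the arg
    return ($has_next, $next, scalar @rest, scalar @_);
}
```

Note that the predicate distinguishes "no argument" from "undef passed":
inspect_args(undef) reports that an argument was present.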

Although in a way I quite like ??$x for the predicate - it makes it stand
out more from the plain parameters which it is likely to be interspersed
with. Which do people prefer:

    (?$has_x, $x, ?$has_y, $y, ?$has_z, $z)

    (??$has_x, $x, ??$has_y, $y, ??$has_z, $z)

> >They cannot be a placeholder: i.e. these aren't legal: ??$, ?$, ?@, ?%.
> 
> Seems to me that those should be legal no-ops.  Except for the hash one,
> which isn't even a no-op, because it would enforce even parity of the
> remaining argument list.

Parity is already enforced if there are any named parameters. If not,
then whether ?% should/would enforce and/or be useful is tied up in the
separate subthread started by Hugo about ?%foo in the absence of named
params.

But personally I don't like the idea of quietly ignoring noops - I can't
see any good reason not to croak. Maybe it indicates a typo or thinko?

-- 
"Strange women lying in ponds distributing swords is no basis for a system
of government. Supreme executive power derives from a mandate from the
masses, not from some farcical aquatic ceremony."
    -- Dennis, "Monty Python and the Holy Grail"
0
davem
12/3/2019 1:33:32 PM
Dave Mitchell wrote:
>In the absence of any named params (or trailing hash slurpy), no runtime
>parity check is done on the args.

These semantics are sounding awkward.  You've got one set of semantics
for consuming signature items, and a separate set of semantics for
lookahead signature items that have parallel syntax.  I think it needs
to be simpler: a lookahead signature item should be able to contain any
signature item, and should have identical semantics to its sub-item apart
from not consuming any arguments (and consequently not restricting what
order items can come in).

This simple rule, if done fully, incidentally implies that you can't have
"?" to introduce lookahead alongside "??" for a predicate, because there's
no semantic problem with nesting lookahead inside lookahead.  I suggest
that lookahead should be signalled by "?=", imitating the regexp syntax.
So, matching semantics, "?=%foo" has to impose a parity requirement,
and "?=$foo" has to require that there be at least one more argument.
"?=%" and "?=$" (with no identifier) are actually not no-ops: they impose
these argc requirements without doing anything else.
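
The regexp construct being imitated is the positive lookahead assertion,
which must match at the current position but consumes nothing. A short
reminder of how it behaves in current Perl (the variable names are
arbitrary):

```perl
use strict;
use warnings;

my $str = 'foobar';

# (?=bar) requires "bar" to follow but does not consume it.
$str =~ /foo(?=bar)/g or die "no match";
my $pos_after = pos($str);      # the match stopped before "bar"

# A failed lookahead makes the whole match fail.
my $matched_baz = 'foobaz' =~ /foo(?=bar)/ ? 1 : 0;
```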

-zefram
0
perl5
12/3/2019 1:43:52 PM
Dave Mitchell wrote:
>I'm happy for it to become the other way round:
>
>    ?$x       peek ahead and see if there is a next arg
>    ??$x      peek ahead to the next arg

I dislike "??".  Looks too much like you're doing something really weird:
it's so weird that you have to warn the reader with a double question
mark.  The general lookahead is certainly a weirder operation than
the predicate, but I don't think either of them justify such prominent
signalling.  In another message on this subthread I've suggested "?="
for lookahead, imitating the regexp syntax.

>But personally I don't like the idea of quietly ignoring noops - I can't
>see any good reason not to croak. Maybe it indicates a typo or thinko?

It would be unPerlish to croak.  This isn't Python.  There are some
problems that arise from keeping quiet about no-ops that look like they do
something, such as passing an import list to an undefined import method,
or passing excess arguments to a sub that doesn't check how many it got.
But those situations involve action at a distance, where the call site
looks like it's doing something meaningful, and you can only work out that
it's a no-op by examining the distant code that it invokes.  The no-op
signature items that we're concerned with *look* like no-ops.  You don't
have to look at any other bit of code, their no-op nature is inherent
and just naturally arises from the composition of signature features.
Orthogonal composition is the greater value here.
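
The "action at a distance" case mentioned above, excess arguments to a
sub that never checks its argument count, can be sketched in current
Perl as follows. The signatured version is included only to show the
existing arity check; both sub names are hypothetical.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Classic sub: never looks past the first argument, so extras vanish silently.
sub first_only { return $_[0] }

# Signatured sub: the arity check turns the same call into an exception.
sub first_checked ($x) { return $x }

my $quiet = first_only(1, 2, 3);                  # no complaint at all
my $ok    = eval { first_checked(1, 2, 3); 1 };   # false: the call died
```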

-zefram
0
perl5
12/3/2019 1:56:12 PM
On Mon, Dec 02, 2019 at 01:56:40PM +0000, Zefram via perl5-porters wrote:
> Dave Mitchell wrote:
> >    @_ will not be set, unset or localised on entry to or exit from
> >    a signatured sub;
> 
> Bad idea.  Having a warning for recognisable use of @_ is better than
> nothing, but according to decisions we already made it's not enough.
> Way back, we determined that detection of the use of @_ wouldn't
> be sufficiently complete to serve as the trigger for @_ suppression.
> The reasons for that decision haven't changed.  It is right that we paid
> attention to that issue in repeated decisions that we would not tie @_
> suppression to signatures.  Nothing in the rationale for those decisions
> has changed either.

As I recall, you were very keen on it being orthogonal, and I hated the
idea. Since then, I have added two further proposals:

* query parameters, doing away with the need for @_
* a warning on most/all spottable uses of @_ in lexical scope.

So the only residual reason for keeping them orthogonal, as I understand
it, is when refactoring existing code to use signatures, where a function
makes use of @_ in such a way that it isn't spotted by visual code
inspection, isn't spotted by the lexical warning, and doesn't trip
anything in the code's test suite.

And in a case like that, you'd still have the problem if the refactorer
failed to set the attribute / pragmata correctly to preserve @_, and if
they got that wrong, they wouldn't spot it due to all the above reasons.

This really doesn't seem to me a good reason to maintain a whole duplicate
infrastructure, adding a bunch of extra syntax, and having all the
signature code written so that it can pull the next arg from @_ or from
the stack, etc. Plus all the ambiguity, loss of optimisation potential,
and bugs if code blocks etc can manipulate @_ while it's being processed.

-- 
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
    -- Things That Never Happen in "Star Trek" #10
0
davem
12/3/2019 2:08:18 PM
There's Ada's (and PL/SQL's, etc.) convention of marking all parameters
as IN, OUT, or INOUT depending on whether they carry values (IN), are
pass-by-reference (INOUT), or are meant for accepting results (OUT).
Assignment operator in function signatures generally means default
value, does it not?  Looked at that way, standard Perl subs are all the
same, with positional INOUT parameters and one OUT value -- except for
scalar/list context of course.

https://cs.lmu.edu/~ray/notes/subroutines/ is a nice overview
including Ada and everything else

On 12/3/19, Zefram via perl5-porters <perl5-porters@perl.org> wrote:
> I wrote:
>>usage you're proposing has nothing at all to do with typeglobs, so the
>>association with the glob sigil is misleading.  It also might interfere
>>a bit with any future work to process actual typeglobs in signatures.
>
> Thinking about it more, I find the case compelling to avoid using
> any sigil character as a prefix meaning anything other than the sigil.
> But I agree with a prefix character for this aliasing use being superior
> to a postfix (though not for your reason).
>
> I suggest that "=" would be a better prefix character, mnemonically
> suggesting a stronger-than-usual equality between the argument and the
> parameter variable.  In assigning that character I'm assuming that "="
> wouldn't be used for the pair-constructor shorthand (because of the POD
> clash) and therefore also wouldn't be used for named-parameter shorthand
> in signatures.  However, making it possible to have "=" at the start
> of a signature item invokes a milder version of the POD clash that I
> described in respect of pair shorthand.  It's milder because this isn't
> letting an actual expression begin with "=", and therefore isn't letting
> a *statement* begin with "=".
>
> If "=" is somehow ruled out for this purpose, there are many options
> of weaker mnemonic value that are about as good as each other.  "+"
> is probably my second favourite.
>
> -zefram
>


-- 
Coming to you live, from behind Sneelock's store, in the big vacant lot.
0
davidnicol
12/3/2019 2:21:25 PM
Dave Mitchell <davem@iabyn.com> wrote:
:On Fri, Nov 29, 2019 at 04:51:15PM +0000, hv@crypt.org wrote:
:> Dave Mitchell <davem@iabyn.com> wrote:
:> :=head2 Hash query parameter
:> :
:> :This is most useful in the presence of named parameters. [...]
:> :
:> :A hash query parameter can appear anywhere in the signature, including in
:> :amongst positional parameters, but in that case it is an error unless it
:> :is an even number of positional parameters before any first named
:> :parameter:
:> :
:> :    sub foo($p1, ?%query, $p2, $p3, :$n1, :$n2) { ... }  # ok
:> :    sub foo($p1, $p2, ?%query, $p3, :$n1, :$n2) { ... }  # compile-time err
:> 
:> Maybe most useful, but not only useful in the presence of named parameters.
:> As such, it needs to allow for optional positionals. I think for this case
:> it should be defined explicitly _not_ to check parity, and silently include
:> a trailing undef if needed.
:
:So you mean something like:
:
:    sub foo(?%all, $p1, $p2 = 0)
:
:?

Yes.

:
:My current intent would be:
:
:In the absence of any named params (or trailing hash slurpy), no runtime
:parity check is done on the args.
:
:If called as foo("a"), then %all becomes ("a", undef).

Good, that's what I would have hoped.

:What about 
:
:    sub foo(?%all, $p1, $p2, $p3 = 0)
:
:should that be a compile-time error or work with an implicit undef 4th arg?

I think it should get an implicit undef if parity requires it: called with
two arguments, %all should have just one pair, and not an additional
(undef => undef) pair. I.e. it should behave the same as C< %all = @_ >.
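
The reference behaviour of a plain hash assignment from @_ can be checked
in current Perl. This sketch (hypothetical sub name) locally silences the
"Odd number of elements in hash assignment" warning so that only the
resulting contents are shown.

```perl
use strict;
use warnings;

# Hypothetical sub: mirrors a plain "%all = @_" and returns the result.
sub pairs_seen {
    my %all;
    {
        no warnings 'misc';   # silence "Odd number of elements in hash assignment"
        %all = @_;
    }
    return \%all;
}
```

An odd-length list gets one implicit undef value for its last key, and an
even-length list gets nothing extra.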

Hugo
0
hv
12/3/2019 2:38:23 PM
Dave Mitchell wrote:
>signature code written so that it can pull the next arg from @_ or from
>the stack,

I wouldn't do it that way.  The signature should work from the arguments
actually passed, regardless of whether @_ was also populated from those
arguments.

>                Plus all the ambiguity, loss of optimisation potential,
>and bugs if code blocks etc can manipulate @_ while it's being processed.

I don't see how any of these arise.

-zefram
0
perl5
12/4/2019 4:27:56 PM
Another thought: If `@_` is not generated, how do modules like
Carp::Always inspect the args of all of the callstack? Currently those
look at (I believe) @DB::args though the exact logic confuses me. Will
that remain working here?


$ perl -MCarp::Always -E 'sub A { B(123); } sub B { C(456); } sub C { die 789; } A'
789 at -e line 1.
        main::C(456) called at -e line 1
        main::B(123) called at -e line 1
        main::A() called at -e line 1

https://metacpan.org/pod/Carp::Always

I think it's implemented by:

  https://metacpan.org/release/Carp/source/lib/Carp.pm#L329-391

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/
0
leonerd
12/4/2019 5:49:04 PM
Paul "LeoNerd" Evans wrote:
>look at (I believe) @DB::args though the exact logic confuses me. Will
>that remain working here?

Ooh, this turns out to be more interesting than it first seemed.
Obviously we'd want @DB::args to still be populated: it's still a
useful facility, and logically the concept of "the arguments with which
the subroutine was invoked" (as stated in the caller() doc) applies
just as well regardless of whether those arguments went into an @_.
But the implementation in pp_caller() is very tied up with @_, actually
referring to the AV that serves as the sub's @_.  (Some, but not all,
alterations to @_ affect what shows up in @DB::args.)  This wouldn't
work with @_ suppression.  It'll be necessary to implement a new way
for context stack frames to refer to a sub's arguments; probably just
pointing to them on the stack.
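
For context, this is the existing mechanism in miniature: caller() fills
@DB::args only when it is called from code compiled in package DB. The
sketch below shows today's behaviour for ordinary subs, not how it would
work under @_ suppression; the sub names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the arguments its caller was invoked with.
sub args_of_caller {
    package DB;              # caller() only sets @DB::args when called from package DB
    @DB::args = ();          # start from a known-empty list
    my @frame = caller(1);   # frame 1 is the sub that called args_of_caller()
    return @DB::args;
}

sub target { return args_of_caller() }

my @seen = target('x', 'y');   # the arguments target() was called with
```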

-zefram
0
perl5
12/4/2019 10:00:11 PM