Re: RFC 155 - Remove geometric functions from core

A friend pointed out, technically most are trigonometric functions,
not geometric.  atan2, cos, sin, acos, asin and tan are all trig.
exp, log, sqrt are... just math I guess.

So I suppose the proposed module would be Math::Trig or some such.  Or
maybe, as the source code suggests, Math::High::Falutin.


However, since those functions take up about 200 lines in the core, are
very stable and relatively easy to document, what do we win by
removing them?


PS  The idea of adding acos, asin and tan is good.


-- 

Michael G Schwern      http://www.pobox.com/~schwern/      schwern@pobox.com
Just Another Stupid Consultant                      Perl6 Kwalitee Ashuranse
Our business in life is not to succeed but to continue to fail in high
spirits.
                -- Robert Louis Stevenson
schwern
8/24/2000 9:41:44 PM

>A friend pointed out, technically most are trigonometric functions,
>not geometric.  atan2, cos, sin, acos, asin and tan are all trig.
>exp, log, sqrt are... just math I guess.

>So I suppose the proposed module would be Math::Trig or some such.  Or
>maybe, as the source code suggests, Math::High::Falutin.

>However, since those functions take up about 200 lines in the core, are
>very stable and relatively easy to document, what do we win by
>removing them?

>PS  The idea of adding acos, asin and tan is good.

It's sounding like folks want to delete this, delete that.  Let's
just cut to the chase and propose that anything that can be redefined
be deleted.  After all, if it's so mutable and boring that people
can diddle its definition, then who cares?  Yes, that means that
anything that's not a reserved word should be banished from Perl,
relegated to some third-class back-of-the-bus module.

Which are those?

Those for which the C-language keyword() function in the toke.c file
in your Perl source kit returns a negative number may be overridden.
Keywords that *cannot* be overridden are chop, defined, delete, do,
dump, each, else, elsif, eval, exists, for, foreach, format, glob,
goto, grep, if, keys, last, local, m, map, my, next, no, package,
pop, pos, print, printf, prototype, push, q, qq, qw, qx, redo,
return, s, scalar, shift, sort, splice, split, study, sub, tie,
tied, tr, undef, unless, unshift, untie, until, use, while, and y.
All the rest can--and so, it seems, should go to the back of the
module bus.  If it had been all that important, it wouldn't have
been mutable.
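
To make "mutable" concrete, here is a minimal Perl 5 sketch: any
keyword *not* on that list can be replaced just by predeclaring a
sub of the same name (the crude series is only for illustration):

    use subs 'sin';                             # predeclare the override
    sub sin { my $x = shift; $x - $x**3/6 }     # ours, not perl's
    print sin(0.5), "\n";                       # calls ours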

--tom
tchrist
8/24/2000 9:54:05 PM
Tom Christiansen <tchrist@chthon.perl.com> writes:

> Keywords that *cannot* be overridden are chop, defined, delete, do,
> dump, each, else, elsif, eval, exists, for, foreach, format, glob,
> goto, grep, if, keys, last, local, m, map, my, next, no, package,
> pop, pos, print, printf, prototype, push, q, qq, qw, qx, redo,
> return, s, scalar, shift, sort, splice, split, study, sub, tie,
> tied, tr, undef, unless, unshift, untie, until, use, while, and y.

Hmm. Quite a few of these should no longer be special:

 chop, defined, delete, dump, each, exists, glob, grep, keys, map,
 pop, pos, print, printf, prototype, push, scalar, shift, sort,
 splice, split, study, tie, tied, undef?, unshift, untie.

-- Johan
JVromans
8/25/2000 9:12:58 AM
Lightning flashed, thunder crashed and Michael G Schwern <schwern@pobox.com> whispered:
| However, since those functions take up about 200 lines in the core, are
| very stable and relatively easy to document, what do we win by
| removing them?
| 
| PS  The idea of adding acos, asin and tan is good.

You just answered your own question.  It is very difficult to add new
functions to the core.  It is very easy to write new modules.  Doesn't it
make sense that if you have to use Math::Trig to get to acos and friends,
you might as well make the language definition clean and say acos and
all its friends should be in that module, not some in the core?
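
To make that concrete, the user-visible difference is one line; a
sketch of the module interface (Perl 5's actual Math::Trig already
exports acos and friends along these lines):

    use Math::Trig;                        # imports acos, asin, tan, ...
    my $angle = acos(0.5);                 # no core support needed
    printf "acos(0.5) = %.4f\n", $angle;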

That's the basic goal behind my RFCs for moving things to modules.  In
general, I hope to make the language cleaner, easier to learn and use, and
easier to extend.  If at the same time the language became better
performing because of a removal of some of the core, all the better.  As
you say, 200 lines isn't much.  But combine that with the IPC, the
environment, the system, etc it all adds up.

-spp
spp
8/25/2000 1:12:32 PM
At 09:12 AM 8/25/00 -0400, Stephen P. Potter wrote:
>  As you say, 200 lines isn't much.  But combine that with the IPC, the
>environment, the system, etc it all adds up.

Not to much, though. We've been down this road for perl 5. You'd be 
surprised at how little code gets removed if you yank most of the functions 
under discussion. (They're generally trivial wrappers around library calls, 
with very little code involved)

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
8/25/2000 1:20:53 PM
>That's the basic goal behind my RFCs for moving things to modules.  In
>general, I hope to make the language cleaner, easier to learn and use, and
>easier to extend.  

"Clean"?  What is "clean"?  Huh?  And since when has Perl ever been 
supposed to be "clean"?  I've got plenty of quotage from Larry to the
contrary.

You must now define objective and quantifiable criteria for what
does and thus what does not constitute:

    * Making a language cleaner
    * Making a language easier to learn
    * Making a language easier to use
    * Making a language easier to extend

Provide working frameworks for those in concrete terms that can be
applied to your proposals in a non-feel-good fashion, one that is
discretely measurable by any party, and only then we can see whether
they actually make any sense.  Right now, they seem like random
fantasies without any basis in reality.

Note that it is more important that a language be easier to use
than anything else you've listed.

>If at the same time the language became better
>performing because of a removal of some of the core, all the better.  As
>you say, 200 lines isn't much.  But combine that with the IPC, the
>environment, the system, etc it all adds up.

What the bloody blazes are you going to do with a language that can't
do systems work?  Number crunching, I suppose.  Oh wait--you already
stripped that out, too.  :-(

Remember: big languages make small programs, and small languages make
big programs.   Larry in observing this has clearly weighed in on the
side of small programs.  Micro-language people always have Forth.

--tom
tchrist
8/25/2000 1:28:55 PM
>Not to much, though. We've been down this road for perl 5. You'd be 
>surprised at how little code gets removed if you yank most of the functions 
>under discussion. (They're generally trivial wrappers around library calls, 
>with very little code involved)

Thanks -- I wish people wouldn't forget this.

And it seems crazy to worry about sacrificing convenient functionality
by saving a few bytes when memory is as incredibly cheap as it is.

--tom
tchrist
8/25/2000 1:30:19 PM
On Fri, Aug 25, 2000 at 09:12:32AM -0400, Stephen P. Potter wrote:
> Lightning flashed, thunder crashed and Michael G Schwern <schwern@pobox.com> whispered:
> | PS  The idea of adding acos, asin and tan is good.
> 
> You just answered your own question.  It is very difficult to add new
> functions to the core.  It is very easy to write new modules.  Doesn't it
> make sense that if you have to use Math::Trig to get to acos and friends,
> you might as well make the language definition clean and say acos and
> all its friends should be in that module, not some in the core?

Actually I was suggesting that acos, asin and tan be added to the core.

Most likely, splitting this out into a module wouldn't make anything
much simpler.  To get anything like the performance math functions
need, Math::Trig could not be written in plain Perl.  It would have to
be written in C or XS (or whatever perl6 winds up being written in)
probably calling the system's math libraries and still dragging in all
the basic problems of a core patch and configuration.
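
For the curious, the XS side of such a module is small -- a minimal
sketch, assuming the usual h2xs/Makefile.PL scaffolding around it
(xsubpp generates the glue to libm's acos() and asin() itself):

    /* Trig.xs -- sketch only */
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"
    #include <math.h>

    MODULE = Math::Trig    PACKAGE = Math::Trig

    double
    acos(x)
            double  x

    double
    asin(x)
            double  x

But small or not, it still has to be compiled and configured per
platform, which is the point.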


> That's the basic goal behind my RFCs for moving things to modules.  In
> general, I hope to make the language cleaner, easier to learn and use, and
> easier to extend.  If at the same time the language became better
> performing because of a removal of some of the core, all the better.  As
> you say, 200 lines isn't much.  But combine that with the IPC, the
> environment, the system, etc it all adds up.

I think I basically agree with tchrist here.  This is nickel and dime
stuff.  Cutting out the math functions isn't going to do squat.  IPC
and the system... that might do something, but Dan pointed out that it
doesn't amount to much either.

I think we should back up and reconsider what the intent of this RFC
is.  

If the intent is to make perl more maintainable and easier to patch,
I've already commented on that.  You're just shuffling code around.
And only a small fraction at that.

If you wish to make perl smaller and faster, just tearing things out
isn't going to help.  It's hit-or-miss optimizing.  You could remove
half the core functions and find out you only gained 5%.

Like all other optimizing attempts, the first step is analysis.
People have to sit down and systematically go through and find out
what parts of perl (and Perl) are eating up space and speed.  The
results will be very surprising, I'm sure, but it will give us a
concrete idea of what we can do to really help out perl's performance.

There should probably be an RFC to this effect, but I'm just visiting
here in perl6-language, so I'll dump it on somebody else.

        exit('stage' << 1);


-- 

Michael G Schwern      http://www.pobox.com/~schwern/      schwern@pobox.com
Just Another Stupid Consultant                      Perl6 Kwalitee Ashuranse
When faced with desperate circumstances, we must adapt.
        - Seven of Nine
schwern
8/25/2000 9:29:38 PM
On Fri, Aug 25, 2000 at 09:20:53AM -0400, Dan Sugalski wrote:
> At 09:12 AM 8/25/00 -0400, Stephen P. Potter wrote:
> >  As you say, 200 lines isn't much.  But combine that with the IPC, the
> >environment, the system, etc it all adds up.
> 
> Not to much, though. We've been down this road for perl 5. You'd be 
> surprised at how little code gets removed if you yank most of the functions 
> under discussion. (They're generally trivial wrappers around library calls, 
> with very little code involved)

Here are some numbers for people who have forgotten the above.
Using the latest perl development sources, using an unnamed UNIX:

								bytes

microperl, which has almost nothing os dependent (*) in it	1212416
shared libperl 1277952 bytes + perl 32768 bytes			1310720
dynamically linked perl						1376256
statically linked perl with all the core extensions		2129920

  (*) I haven't tried building it in non-UNIX boxes, so I can't be certain
  of how fastidiously features have been disabled.

So ripping all this 'cruft' would save us about 100-160 kB, still
leaving us with well over a 1MB-plus executable.  It's Perl itself
that's big, not the thin glue to the system functions.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
jhi
8/25/2000 10:48:59 PM
>So ripping all this 'cruft' would save us about 100-160 kB, still
>leaving us with well over a 1MB-plus executable.  It's Perl itself
>that's big, not the thin glue to the system functions.

Moreover, if you rip out that 100-160 kb, allocating it to modules,
then I can guarantee you that it will cost significantly more memory
as a module to pull in than if it were already there.  There's always
some overhead involved in these things.  Notice how the Byteloader
produces much bigger executables than the originals.  Some work has
been done to fix that, but as a minimum, you've got the Byteloader 
module itself.  And it gets worse... 

Disastrously, you will then also lose the shared text component,
which is what makes all this cheap when Perl loads.  Since the
modules will have to be pasted in the data segment of each process
that wants them, they aren't going to be in a shared region, except
perhaps for some of the non-perl parts of them on certain architectures.
But certainly the Perl parts are *NEVER* shared.  That's why the
whole CGI.pm or IO::whatever.pm stuff hurts so badly: you run with
10 copies of Perl on your system (as many people do, if not much
more than that), then you have to load them, from disk, into each
process that wants them, and the result of what you've loaded cannot
be shared, since you loaded and compiled source code into non-shared
parse trees.  This is completely abysmal.  Loading bytecode is no win:
it's not shared text.

--tom
tchrist
8/26/2000 12:01:15 AM
Tom Christiansen wrote:
> 
> >So ripping all this 'cruft' would save us about 100-160 kB, still
> >leaving us with well over a 1MB-plus executable.  It's Perl itself
> >that's big, not the thin glue to the system functions.
> 
> Moreover, if you rip out that 100-160 kb, allocating it to modules,
> then I can guarantee you that it will cost significantly more memory
> as a module to pull in than if it were already there.  There's always
> some overhead involved in these things.  Notice how the Byteloader
> produces much bigger executables than the originals.  Some work has
> been done to fix that, but as a minimum, you've got the Byteloader
> module itself.  And it gets worse...

Depends on your definition of "module". Many people seem to be assuming
"module" eq "shared library". I submit that a better definition would be
anything that uses the Perl6 Core API.

The vision I have is that all existing builtins _could_ be linked with
the "perl core". The new builtins (a small list) always will be -- they
require inside knowledge of data structures that should specifically not
be exposed to extensions. (With a "please", not a shotgun.) Any or all
of the "core modules" could be placed into one or separate shared
libraries. They would have no more access to the internals than any 3rd
party module. (That's my real definition of a module -- it's a property
of the architecture, not of the linkage, preprocessing, author, or
whatever else.)

What this means is that we have to define a recommended API for modules
to use (and warn that compatibility is not assured if they dig deeper
than this), and to make sure that sin(), socket() etc. strictly obey
this discipline. If this API is too raw for the convenience of many
extensions, then we can always layer it with a kindler gentler API, but
the bottom-most API should be the important one. At compile time, you
can choose which things you want linked into the perl core,  which ones
to stick in a shared library, and which ones to leave out entirely, with
a warning "what you are about to compile is not Perl" if they choose to
leave out any "core module" entirely. In fact, the binary should then be
called "phbbbbt" rather than "perl".

Whether this is implemented via the preprocessor, autogenerated code, or
? is a separate issue. It seems easily possible to define an API that
allows either static or dynamic linking with no performance loss.
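
Purely by way of illustration, the bottom-most API might look like
this (a hypothetical sketch; every name here is invented):

    /* opaque to modules, core and third-party alike */
    typedef struct perl6_interp perl6_interp;
    typedef int (*perl6_opfunc)(perl6_interp *);

    /* same registration call whether sin() is linked in statically
       or loaded from a shared library */
    int perl6_register_op(perl6_interp *ip, const char *name,
                          perl6_opfunc fn);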

Maybe this is what some people are already assuming. I just thought I
should spell it out. It's probably implied by the RFC that says user
subs should be indistinguishable from opcodes (doesn't one that sound
something like that exist? I'm too lazy to go look.)
sfink
8/26/2000 1:19:56 AM
Tom Christiansen <tchrist@chthon.perl.com> writes:
>
>Disastrously, you will then also lose the shared text component,
>which is what makes all this cheap when Perl loads.  

But we can on modern OSes have shared data too.

>Since the
>modules will have to be pasted in the data segment of each process
>that wants them, they aren't going to be in a shared region, except
>perhaps for some of the non-perl parts of them on certain architectures.
>But certainly the Perl parts are *NEVER* shared.  That's why the
>whole CGI.pm or IO::whatever.pm stuff hurts so badly: you run with
>10 copies of Perl on your system (as many people do, if not much
>more than that), then you have to load them, from disk, into each
>process that wants them, and the result of what you've loaded cannot
>be shared, since you loaded and compiled source code into non-shared
>parse trees.  This is completely abysmal.  Loading bytecode is no win:
>it's not shared text.

Loading perl5 bytecode is a non-win I agree 110%.

But if perl6 bytecode does not need to be modified to be used
it can be mmap()'ed shared, read-only and hence page-cached and reused.

-- 
Nick Ing-Simmons

nick
8/27/2000 8:28:04 PM
Jarkko Hietaniemi <jhi@iki.fi> writes:
>								bytes
>
>microperl, which has almost nothing os dependent (*) in it	1212416
>shared libperl 1277952 bytes + perl 32768 bytes			1310720
>dynamically linked perl						1376256
>statically linked perl with all the core extensions		2129920
>
>  (*) I haven't tried building it in non-UNIX boxes, so I can't be certain
>  of how fastidiously features have been disabled.

"bytes" of what? - size of executable, size of .text, ???
If we are talking executable size with -g, then a lot of that is symbol-table
and is tedious repetition of "sv.h" & co. reiterated in each .o file.

But the basic point is that these things are small.

>
>So ripping all this 'cruft' would save us about 100-160 kB, still
>leaving us with well over a 1MB-plus executable.  It's Perl itself
>that's big, not the thin glue to the system functions.

My support for the idea is not to reduce the size of perl in the UNIX
case, but to allow replacement. I would also like to have the mechanism
worked out and "proven" on something that we know gets used so 
that we can have good solid testing of the mechanism. Then something 
less obvious (say Damian's any/all operators) which might be major
extra size and not of universal appeal can use a well-tried mechanism,
and we can flip the default to re-link sockets or sin/cos/tan into the core.

-- 
Nick Ing-Simmons

nick
8/27/2000 8:37:38 PM
On Sun, Aug 27, 2000 at 08:37:38PM +0000, Nick Ing-Simmons wrote:
> Jarkko Hietaniemi <jhi@iki.fi> writes:
> >								bytes
> >
> >microperl, which has almost nothing os dependent (*) in it	1212416
> >shared libperl 1277952 bytes + perl 32768 bytes			1310720
> >dynamically linked perl						1376256
> >statically linked perl with all the core extensions		2129920
> >
> >  (*) I haven't tried building it in non-UNIX boxes, so I can't be certain
> >  of how fastidiously features have been disabled.
> 
> "bytes" of what? - size of executable, size of .text, ???

Sizes of the executables/libs, code fully optimized -- and symbols stripped.

> If we are talking executable size with -g, then a lot of that is symbol-table
> and is tedious repetition of "sv.h" & co. reiterated in each .o file.
> 
> But the basic point is that these things are small.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
jhi
8/27/2000 8:57:24 PM
Lightning flashed, thunder crashed and Steve Fink <sfink@digital-integrity.com> whispered:
| Depends on your definition of "module". Many people seem to be assuming
| "module" eq "shared library".

Yes, exactly.  I use module as a generic term for something other than the
main perl binary itself, a black box if you will.  The more modular we can
make perl, I think the easier it is for it to be ported or embedded or
whatever.  It is very easy to simply not install a "module" or write stub
functions, or even to add a new function.  It is not so easy to remove
things from "the core".  Note some of the deprecated features we've lived
with for years (such as EQ and friends).

-spp
spp
8/28/2000 1:50:24 AM
Nick Ing-Simmons <nick@ing-simmons.net> writes:

> But if perl6 bytecode does not need to be modified to be used 

I'd assume that.

-- Johan
JVromans
8/28/2000 6:21:57 PM
Tom Christiansen <tchrist@chthon.perl.com> writes:
> Keywords that *cannot* be overridden are chop, defined, delete, do,
> dump, each , else, elsif, eval, exists, for, foreach, format, glob,
> goto, grep, if, keys, last, local, m, map, my, next, no, package,
> pop, pos, print, printf, prototype, push, q, qq, qw, qx, redo,
> return, s, scalar, shift, sort, splice, split, study, sub, tie,
> tied, tr, undef, unless, unshift, untie, until, use, while, and y.

Thanks! That's a really helpful list!

2000-08-25-05:12:58 Johan Vromans:
> Hmm. Quite a few of these should no longer be special:
>
>  chop, defined, delete, dump, each, exists, glob, grep, keys, map,
>  pop, pos, print, printf, prototype, push, scalar, shift, sort,
>  splice, split, study, tie, tied, undef?, unshift, untie.

If that were to be done, it'd be a nice clean superset of solving
RFC 70: if (as best I can tell by eyeballing that list) glob, print,
and printf were overridable, then Fatal.pm could be completed
usefully for those of us who would rather have to explicitly catch
errors to prevent them from exiting with an error message, rather
than having to explicitly check for errors to prevent them from
being ignored.
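
To illustrate (a sketch -- Fatal.pm can already wrap the overridable
builtins today; it's the non-overridable ones that block completing it):

    use Fatal qw(open close);       # overridable, so this works now
    open(FH, "/no/such/file");      # dies instead of returning false
    # use Fatal qw(print);          # can't work: print isn't overridable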

-Bennett

bet
8/29/2000 3:49:08 PM


Well then.  It is impossible to rearchitect it to make it shared
text?  Perhaps the first instance of perl sets up some vast shared
memory segments and a way for the newcomers to link in to it and look
at the modules that have been loaded, somewhere on this system, and use
the common copy?




This handwringing naysaying is depressing.




Tom Christiansen wrote:

> Disastrously, you will then also lose the shared text component,
> which is what makes all this cheap when Perl loads.  Since the
> modules will have to be pasted in the data segment of each process
> that wants them, they aren't going to be in a shared region, except
> perhaps for some of the non-perl parts of them on certain architectures.
> But certainly the Perl parts are *NEVER* shared.

This sounds like a problem to be fixed.  Relax, Tom, we'll take it from
here.


> That's why the
> whole CGI.pm or IO::whatever.pm stuff hurts so badly: you run with
> 10 copies of Perl on your system (as many people do, if not much
> more than that), then you have to load them, from disk, into each
> process that wants them, and eth result of what you've loaded cannot
> be shared, since you loaded and compiled source code into non-shared
> parse trees.  This is completely abysmal.  Loading bytecode is no win:
> it's not shared text.
> 
> --tom

-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
                                        Ask me about sidewalk eggs
david
8/29/2000 5:43:05 PM
David L. Nicol writes:
> This handwringing naysaying is depressing.

Yes, it's depressing to find out there are problems in one's grand
plans.  However, I'm very glad that people (including Tom) are
pointing out problems *before* we commit to a course of action.

Nat
gnat
8/29/2000 5:48:22 PM
>Well then.  It is impossible to rearchitect it to make it shared
>text?  Perhaps the first instance of perl sets up some vast shared
>memory segments and a way for the newcomers to link in to it and look
>at the modules that have been loaded, somewhere on this system, and use
>the common copy?

I'd be astonished to see a general-purpose, cross-platform, and
maintainable solution to this problem.  I predict that you'd, at the
very best, only address this in a few places.  Feel free to astonish me.

>This handwringing naysaying is depressing.

Very well, then: I'll save it for an after-the-fact I-TOLD-YOU-SO,
which, believe it or not, is truly *not* a pleasant thing to be
able to say.

--tom
tchrist
8/29/2000 6:02:53 PM
On Tue, 29 Aug 2000, David L. Nicol wrote:

> Well then.  It is impossible to rearchitect it to make it shared
> text?  Perhaps the first instance of perl sets up some vast shared
> memory segments and a way for the newcomers to link in to it and look
> at the modules that have been loaded, somewhere on this system, and use
> the common copy?

That approach invites big security problems.  Any system that involves one
program trusting another program to load executable code into their memory
space is vulnerable to attack.  This kind of thing works for forking
daemons running identical code since the forked process trusts the parent
process.  In the general case of a second perl program starting on a
machine, why would this second program trust the first program to not
load a poison module?

I don't believe you can simply "rearchitect it to make it shared text".

> This sounds like a problem to be fixed.  Relax, Tom, we'll take it from
> here.

Are you so sure?  From where I'm sitting he's got some pretty tough points
there.  If you've got a solution then I'm quite surprised, which would be
great.  If not then I suggest you avoid writing the proverbial bad check.

-sam


sam
8/29/2000 6:04:27 PM
At 12:02 PM 8/29/00 -0600, Tom Christiansen wrote:
> >Well then.  It is impossible to rearchitect it to make it shared
> >text?  Perhaps the first instance of perl sets up some vast shared
> >memory segments and a way for the newcomers to link in to it and look
> >at the modules that have been loaded, somewhere on this system, and use
> >the common copy?
>
>I'd be astonished to see a general-purpose, cross-platform, and
>maintainable solution to this problem.  I predict that you'd, at the
>very best, only address this in a few places.  Feel free to astonish me.

It's possible we'll manage this with mmap()ing predigested bytecode, but 
I'm not entirely sure that's feasible--it means the optree (or whatever) 
that perl runs through would have to have no pointers at all in it, and 
that might be a rather big speed hit. (Presumably branches would need to do 
relative lookups and such)
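
A hypothetical sketch of a pointer-free op, with successors stored as
relative offsets so the mmap()ed image needs no per-process fixups
(this is where the relative-lookup cost would come from):

    #include <stdint.h>

    typedef struct {
        uint32_t opcode;
        int32_t  next;      /* offset, in ops, to the next op        */
        int32_t  branch;    /* offset to the branch target, 0 = none */
    } rel_op;

    /* following an edge is an add, not a pointer dereference */
    const rel_op *advance(const rel_op *op, int taken) {
        return op + ((taken && op->branch) ? op->branch : op->next);
    }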

On the other hand, if we go the TIL route (which has its own cross-platform 
headaches) we could precompile segments of code into something that would 
be shareable as it'd be real executable code.

> >This handwringing naysaying is depressing.
>
>Very well, then: I'll save it for an after-the-fact I-TOLD-YOU-SO,
>which, believe it or not, is truly *not* a pleasant thing to be
>able to say.

Personally I'd rather have someone throwing wet blankets now, rather than 
later. If the ideas have merit enough to be worth doing then they'll 
survive the uncomfortable scrutiny. If they don't, better to get the ego 
bruising out of the way now, rather than after spending a month or more of 
wasted time.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
8/29/2000 6:16:50 PM
On Tue, 29 Aug 2000, Sam Tregar wrote:

> On Tue, 29 Aug 2000, David L. Nicol wrote:
> 
> > Well then.  It is impossible to rearchitect it to make it shared
> > text? 

> I don't believe you can simply "rearchitect it to make it shared text".

That depends on what the meaning of "it" is. :-)

If "it" is primarily XS code, then I can easily imagine building a new
perl binary with the compiled XS code linked in (kind of like the current
static linking of extensions).  If "it" is primarily perl code (like
CGI.pm) then it's harder.  But if the compiler back end works out well
enough, then CGI.o could be linked into the main executable too.  Would
that solve the shared text problem?

Just because we can leave almost everything out and dynamically link it in
later doesn't mean that the default build should be so extreme.  After
all, it's rather silly to dynamically load something you're going to load
almost every time you run.

-- 
    Andy Dougherty		doughera@lafayette.edu
    Dept. of Physics
    Lafayette College, Easton PA 18042

doughera
8/29/2000 6:40:06 PM
Sam Tregar wrote:
> 
> On Tue, 29 Aug 2000, David L. Nicol wrote:
> 
> > Well then.  It is impossible to rearchitect it to make it shared
> > text?  Perhaps the first instance of perl sets up some vast shared
> > memory segments and a way for the newcomers to link in to it and look
> > at the modules that have been loaded, somewhere on this system, and use
> > the common copy?
> 
> That approach invites big security problems.  Any system that involves one
> program trusting another program to load executable code into their memory
> space is vulnerable to attack.  This kind of thing works for forking
> daemons running identical code since the forked process trusts the parent
> process.  In the general case of a second perl program starting on a
> machine, why would this second program trust the first program to not
> load a poison module?

does sysV shm not support the equivalent security as the file system?

Did I not just describe how a .so or a DLL works currently?

Yes, the later Perls would have to trust the first one to load the modules
into the shared space correctly, and none of them would be allowed to
barf on the couch.  

A paranoid mode would be required in which
you don't use the shared pre-loaded module pool.

In the ever-imminent vaporware implementation, this whole thing may
be represented as a big file into which we can seek() to locate stuff.
 




-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
                                               Yum, sidewalk eggs!
david
8/29/2000 6:47:52 PM
Lightning flashed, thunder crashed and Tom Christiansen <tchrist@chthon.perl.co
m> whispered:
| Very well, then: I'll save it for an after-the-fact I-TOLD-YOU-SO,
| which, believe it or not, is truly *not* a pleasant thing to be
| able to say.

Tom, we appreciate your constructive comments and your help in making sure
we've considered all issues before we embark on a particular path.  It is
when you start calling us stupid for even suggesting that we look at the
other paths that problems arise.

There are lots of paths out there.  Some of them may have quicksand in the
middle of them, but can be safely navigated if we see them first.  Some of
them may be rabbit paths instead of super highways.  Some of them may even
lead to the pits of despair.  But, if no one ever says "what about this
path?" we'll never go anywhere except around the block.

-spp
spp
8/29/2000 7:06:59 PM
David L . Nicol <david@kasey.umkc.edu> writes:
>
>does sysV shm not support the equivalent security as the file system?

mmap() has the file system.

>
>Did I not just describe how a .so or a DLL works currently?

And behind the scenes that does something akin to:

/* assumes <fcntl.h>, <sys/mman.h>, <sys/stat.h>, <unistd.h> and a
   code_t typedef; error handling omitted */
int fd = open("file_of_posn_indepenant_byte_code",O_RDONLY);
struct stat st;
fstat(fd,&st);
code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
close(fd);

strace (linux) or truss (solaris) will show you what I mean.

And then trusts the OS to honour MAP_SHARED.  (mmap() is POSIX.)

Win32 has "something similar" but I don't remember the function names off
hand.

Or you can embed your bytecode in 

const char script[] = {...};

and link/dlopen() it and then you have classical shared text.



-- 
Nick Ing-Simmons

nick
8/29/2000 7:32:36 PM
On Tue, 29 Aug 2000, David L. Nicol wrote:

> does sysV shm not support the equivalent security as the file system?

Well, no, I don't think it does.  It supports permissions on individual
segments but it doesn't support anything like directory permissions.  It
might be enough, and it might not be.  A user can run two programs and not
expect one to have an automatic exploit on the other just because they're
both Perl!  Think "nobody".

Yes, you'd provide a paranoid mode for experts to use to avoid the
problems to which most users would be exposed.  Great.

> Did I not just describe how a .so or a DLL works currently?

Certainly not.  You wrote only a few sentences.  I'm no expert but I don't
think that shared libraries are that simple.  I also don't think they're
implemented using SysV IPC shared memory, but you might know differently.

> In the ever-imminent vaporware implementation, this whole thing may
> be represented as a big file into which we can seek() to locate stuff.

Zuh?  What are you talking about?  Is this some kind of Inline.pm-esque
system?

-sam


sam
8/29/2000 7:43:14 PM
At 07:32 PM 8/29/00 +0000, Nick Ing-Simmons wrote:
>David L . Nicol <david@kasey.umkc.edu> writes:
> >
> >Did I not just describe how a .so or a DLL works currently?
>
>And behind the scenes that does something akin to:
>
>int fd = open("file_of_posn_indepenant_byte_code",O_RDONLY);
>struct stat st;
>fstat(fd,&st);
>code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
>close(fd);

Don't forget the fixup work that needs to be done afterwards. Loading the 
library into memory's only the first part--after that the loader needs to 
twiddle with transfer vectors and such so the unresolved calls into the 
routines in the newly loaded library get resolved.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
8/29/2000 8:15:06 PM
On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
> David L . Nicol <david@kasey.umkc.edu> writes:
> >
> >does sysV shm not support the equivalent security as the file system?
> 
> mmap() has the file system.

I wasn't aware that mmap() was part of SysV shared memory.  My
mistake?  It's not on the SysV IPC man pages on my Linux system.  The mmap
manpage doesn't mention SysV IPC either.

-sam



sam
8/29/2000 8:28:47 PM
Sam Tregar <sam@tregar.com> writes:
>On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
>> David L . Nicol <david@kasey.umkc.edu> writes:
>> >
>> >does sysV shm not support the equivalent security as the file system?
>> 
>> mmap() has the file system.
>
>I wasn't aware that mmap() was part of SysV shared memory. 

It is NOT. It is another (POSIX) way of getting shared memory between 
processes. Even without MAP_SHARED the OS will share un-modified pages 
between processes.

It happens to be the way modern UNIX implements "shared .text".
i.e. the ".text" part of the object file is mmap()'ed  into 
each process.

>My
>mistake?  It's not on the SysV IPC man pages on my Linux system.  The mmap
>manpage doesn't mention SysV IPC either.

SysV IPC is a mess IMHO. 

My point was that if the "file system" is considered
sufficient then mmap()ing file system objects will get you "shared code"
or "shared data" without any tedious reinventing of wheels.

-- 
Nick Ing-Simmons

nick
8/29/2000 8:31:23 PM
>>> mmap() has the file system.
>>I wasn't aware that mmap() was part of SysV shared memory. 

>It is NOT. It is another (POSIX) way of getting shared memory between 
>processes. Even without MAP_SHARED the OS will share un-modified pages 
>between processes.

....

>SysV IPC is a mess IMHO. 

For a good time, see Camel 3's introductory discussion of SysV IPC. :-)

--tom
tchrist
8/29/2000 9:57:02 PM
>>>>> "ST" == Sam Tregar <sam@tregar.com> writes:

  ST> On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
  >> David L . Nicol <david@kasey.umkc.edu> writes:
  >> >
  >> >does sysV shm not support the equivalent security as the file system?
  >> 
  >> mmap() has the file system.

  ST> I wasn't aware that mmap() was part of SysV shared memory.  My
  ST> mistake?  It's not on the SysV IPC man pages on my Linux system.
  ST> The mmap manpage doesn't mention SysV IPC either.

mmap came from berkeley. i used it on early versions of sunos which was
based on BSD. so calling it SysV IPC is wrong.

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  -----------  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  ----------  http://www.northernlight.com
uri
8/29/2000 10:43:34 PM
On Tue, Aug 29, 2000 at 06:43:34PM -0400, Uri Guttman wrote:
> >>>>> "ST" == Sam Tregar <sam@tregar.com> writes:
> 
>   ST> On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
>   >> David L . Nicol <david@kasey.umkc.edu> writes:
>   >> >
>   >> >does sysV shm not support the equivalent security as the file system?
>   >> 
>   >> mmap() has the file system.
> 
>   ST> I wasn't aware that mmap() was part of SysV shared memory.  My
>   ST> mistake?  It's not on the SysV IPC man pages on my Linux system.
>   ST> The mmap manpage doesn't mention SysV IPC either.
> 
> mmap came from berkeley. i used it on early versions of sunos which was
> based on BSD. so calling it SysV IPC is wrong.

Yup.  I think somebody said that mmap() is POSIX.  It isn't.  POSIX
realtime extensions (and Single UNIX Spec) have shared memory objects,
which are different from either SysV IPC or mmap().  The SMOs have a
system-wide flat namespace, not connected to the usual filesystem
namespace (in the definition, that is; nobody of course forbids making
it visible).  mmap() is also in the SUS.

Executive summary: there are three different "shared memory" APIs.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
jhi
8/29/2000 10:51:52 PM
>Yup.  I think somebody said that mmap() is POSIX.  It isn't.  

Are you sure?

    http://slacvx.slac.stanford.edu/HELP/POSIX/CALLABLE_FUNCTIONS/MMAP


       The mmap() function maps process addresses to a memory object.

       IEEE Std 1003.1b-1993, §12.2.1.

       C Format

	 #include <sys/mman.h>

	 void *mmap(void *addr, size_t len, int prot, int flags, int
		   fildes, off_t off);
     


Googling for "POSIX mmap" comes up with various hits.  But I don't have
the stuff to look up.

--tom
tchrist
8/29/2000 11:12:02 PM
On Tue, Aug 29, 2000 at 05:12:02PM -0600, Tom Christiansen wrote:
> >Yup.  I think somebody said that mmap() is POSIX.  It isn't.  
> 
> Are you sure?

Rats, I was wrong.  I dug up my copy and mmap(), mlock(), etc are all
in the latest edition of 1003.1. 

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
jhi
8/29/2000 11:25:00 PM
>>>>> "TC" == Tom Christiansen <tchrist@chthon.perl.com> writes:

  >> Yup.  I think somebody said that mmap() is POSIX.  It isn't.  
  TC> Are you sure?

  TC>     http://slacvx.slac.stanford.edu/HELP/POSIX/CALLABLE_FUNCTIONS/MMAP


  TC>        The mmap() function maps process addresses to a memory object.

i think he meant POSIX didn't create the mmap call (which we agree was a
bsd thing first). jarkko already said that posix has both mmap and their
own shared memory api.

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  -----------  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  ----------  http://www.northernlight.com
uri
8/29/2000 11:29:00 PM
On Tue, Aug 29, 2000 at 07:29:00PM -0400, Uri Guttman wrote:
> >>>>> "TC" == Tom Christiansen <tchrist@chthon.perl.com> writes:
> 
>   >> Yup.  I think somebody said that mmap() is POSIX.  It isn't.  
>   TC> Are you sure?
> 
>   TC>     http://slacvx.slac.stanford.edu/HELP/POSIX/CALLABLE_FUNCTIONS/MMAP
> 
> 
>   TC>        The mmap() function maps process addresses to a memory object.
> 
> i think he meant POSIX didn't create the mmap call (which we agree was a
> bsd thing first). jarkko already said that posix has both mmap and their
> own shared memory api.

Valiant attempt to interpret me but I really meant that POSIX doesn't
have mmap, and I was valiantly wrong :-)

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
jhi
8/29/2000 11:31:07 PM
Dan Sugalski wrote:

> 
> Don't forget the fixup work that needs to be done afterwards. Loading the
> library into memory's only the first part--after that the loader needs to
> twiddle with transfer vectors and such so the unresolved calls into the
> routines in the newly loaded library get resolved.
> 
>                                         Dan


This is what I was talking about when I suggested the language maintain
a big list of all the addresses of each function, and after the function
gets loaded or compiled it is added to the big list, and after this stage
the placeholder in the op can be replaced with a longjump.

Since the shared segments live at different addresses in different
processes (or should I have stayed awake through that lecture)


And there you go, a JIT.
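
A rough sketch of the idea, hypothetical names and all:

    #include <stddef.h>

    typedef void (*fn_t)(void);

    enum { N_FUNCS = 4096 };
    static fn_t fn_table[N_FUNCS];      /* the big list of addresses */

    /* run once, when slot i's function gets loaded or compiled */
    void patch(size_t i, fn_t resolved) { fn_table[i] = resolved; }

    /* ops only ever jump through the table, so patching one slot
       redirects every later call -- no per-call-site fixups */
    void call_op(size_t i) { fn_table[i](); }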


-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
david
8/29/2000 11:52:59 PM
David L Nicol <david@kasey.umkc.edu> writes:

> This is what I was talking about when I suggested the language maintain
> a big list of all the addresses of each function, and after the function
> gets loaded or compiled it is added to the big list, and after this
> stage the placeholder in the op can be replaced with a longjump.

> Since the shared segments live at different addresses in different
> processes (or should I have stayed awake through that lecture)

I'm not sure I'm completely following what you're arguing for here, but be
careful not to go too far down the road of duplicating what the dynamic
loader already knows how to do.  There be dragons; that stuff is seriously
baroque.  You really don't want to reimplement it.

I'd love to see Perl aggressively take advantage of new capabilities in
dynamic loaders, though.  Among other things, I'll point out that symbol
versioning is the way that things like libc manage to be backward
compatible while still changing things, and we should probably seriously
consider using symbol versioning in a shared libperl.so as a means to
provide that much desired and extremely difficult to implement stable API
for modules and the XS-equivalent.
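
For what it's worth, the GNU toolchain spells this ".symver"; a sketch
with invented libperl names (old binaries keep binding to the 6.0
symbol, newly built ones get the 6.1 default):

    void perl_call_old(void *interp)            { /* 6.0 behaviour */ }
    void perl_call_new(void *interp, int flags) { /* 6.1 behaviour */ }

    __asm__(".symver perl_call_old, perl_call@PERL_6.0");
    __asm__(".symver perl_call_new, perl_call@@PERL_6.1");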

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>
rra
8/30/2000 12:43:18 AM
On 29 Aug 2000, Russ Allbery wrote:

> I'm not sure I'm completely following what you're arguing for here, but be
> careful not to go too far down the road of duplicating what the dynamic
> loader already knows how to do.  There be dragons; that stuff is seriously
> baroque.  You really don't want to reimplement it.

I'd very much like to not do it. It's bad enough to have to do that sort
of thing with perl code we completely control--having to do it portably
for what's essentially native code is more work than I'd really like to
do. Clever People have already provided the capabilities on individual
platforms and that's just fine by me.
 
> I'd love to see Perl aggressively take advantage of new capabilities in
> dynamic loaders, though.  Among other things, I'll point out that symbol
> versioning is the way that things like libc manage to be backward
> compatible while still changing things, and we should probably seriously
> consider using symbol versioning in a shared libperl.so as a means to
> provide that much desired and extremely difficult to implement stable API
> for modules and the XS-equivalent.

This is where my lack of strange Unix knowledge comes to the fore. Is this
really a problem? It seems to me to be a standard sort of thing to be
dealing with. (OTOH, my platform of choice has 20-year-old executables as
part of its test suite and a strong engineering bent, so I may be coming
at things from a different direction than most folks)

I've been working on a spec for the API that'll hopefully isolate
extensions enough that we can yank the guts around at will without
affecting them. I'm really hoping that an extension built against perl
6.0.1 (presumably the first stable release... :) will work against perl
6.12.4. That's the plan, at least.

					Dan

dan
8/30/2000 1:59:17 AM
Dan Sugalski <dan@sidhe.org> writes:
> On 29 Aug 2000, Russ Allbery wrote:
 
>> I'd love to see Perl aggressively take advantage of new capabilities in
>> dynamic loaders, though.  Among other things, I'll point out that
>> symbol versioning is the way that things like libc manage to be
>> backward compatible while still changing things, and we should probably
>> seriously consider using symbol versioning in a shared libperl.so as a
>> means to provide that much desired and extremely difficult to implement
>> stable API for modules and the XS-equivalent.

> This is where my lack of strange Unix knowledge comes to the fore. Is
> this really a problem? It seems to me to be a standard sort of thing to
> be dealing with. (OTOH, my platform of choice has 20-year-old
> executables as part of its test suite and a strong engineering bent, so
> I may be coming at things from a different direction than most folks)

Well, it depends on what your goals are, basically.  For most shared
libraries, people don't take the trouble.

Basically, no matter how well you design the API up front, if it's at all
complex you'll discover that down the road you really want to *change*
something, not just add something new (maybe just add a new parameter to a
function).  At that point, the standard Perl thing up until now to do is
to just change it in a major release and require people to relink their
modules against the newer version.  And relink their applications that
embed Perl.

Not a big deal, and that's certainly doable.  But it's possible to do more
than that if you really want to.  The glibc folks have decided to commit
to nearly full binary compatibility for essentially forever; the theory is
that upgrading libc should never break a working application even if the
ABI changes.  I'm not familiar with the exact details of how symbol
versioning works, but as I understand it, this is what it lets you do.
Both the old and the new symbol are available, and newly built
applications use the new one while older applications continue to use the
previous symbol.  That means that all your older binary modules keep
working, and if your applications that embed Perl are linked dynamically,
you can even upgrade Perl underneath them without having to rebuild them.
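
With GNU ld, the version nodes come from a version script -- again a
sketch, all names invented:

    /* libperl.map -- hypothetical */
    PERL_6.0 {
        global:
            perl_alloc; perl_construct; perl_run;
        local:
            *;              /* everything else stays private */
    };

    PERL_6.1 {
        global:
            perl_run;       /* a changed ABI gets a new default version */
    } PERL_6.0;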

I'm not sure it's worth the trouble, but it's something to consider.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>
rra
8/30/2000 2:37:36 AM
At 07:37 PM 8/29/00 -0700, Russ Allbery wrote:
>Dan Sugalski <dan@sidhe.org> writes:
> > On 29 Aug 2000, Russ Allbery wrote:
>
> >> I'd love to see Perl aggressively take advantage of new capabilities in
> >> dynamic loaders, though.  Among other things, I'll point out that
> >> symbol versioning is the way that things like libc manage to be
> >> backward compatible while still changing things, and we should probably
> >> seriously consider using symbol versioning in a shared libperl.so as a
> >> means to provide that much desired and extremely difficult to implement
> >> stable API for modules and the XS-equivalent.
>
> > This is where my lack of strange Unix knowledge comes to the fore. Is
> > this really a problem? It seems to me to be a standard sort of thing to
> > be dealing with. (OTOH, my platform of choice has 20-year-old
> > executables as part of its test suite and a strong engineering bent, so
> > I may be coming at things from a different direction than most folks)
>
>Well, it depends on what your goals are, basically.  For most shared
>libraries, people don't take the trouble.

That's OK--we will. :)

>Basically, no matter how well you design the API up front, if it's at all
>complex you'll discover that down the road you really want to *change*
>something, not just add something new (maybe just add a new parameter to a
>function).  At that point, the standard Perl thing up until now to do is
>to just change it in a major release and require people to relink their
>modules against the newer version.  And relink their applications that
>embed Perl.

It's just hit me why VMS' system service interface has managed to handle 
this as well as it has over the years. Unfortunately one of the things that 
helped its longevity is rather inconvenient for the average C programmer.

I'll write up something more concrete once I've batted it around some in my 
brain, and we can see if I'm off-base or, if not, whether it's worth it.

>Not a big deal, and that's certainly doable.  But it's possible to do more
>than that if you really want to.  The glibc folks have decided to comment
>to nearly full binary compatibility for essentially forever; the theory is
>that upgrading libc should never break a working application even if the
>ABI changes.  I'm not familiar with the exact details of how symbol
>versioning works, but as I understand it, this is what it lets you do.
>Both the old and the new symbol are available, and newly built
>applications use the new one while older applications continue to use the
>previous symbol.  That means that all your older binary modules keep
>working, and if your applications that embed Perl are linked dynamically,
>you can even upgrade Perl underneath them without having to rebuild them.
>
>I'm not sure it's worth the trouble, but it's something to consider.

I'm sure it is. I really, *really* want long-term binary compatibility.

Luckily for us, perl may end up with a reasonably small external API, 
which'll make life easier.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
8/30/2000 4:49:01 PM
On Tue, 29 Aug 2000, Russ Allbery wrote:

> Not a big deal, and that's certainly doable.  But it's possible to do more
> than that if you really want to.  The glibc folks have decided to commit
> to nearly full binary compatibility for essentially forever; the theory is
> that upgrading libc should never break a working application even if the
> ABI changes.  I'm not familiar with the exact details of how symbol
> versioning works, but as I understand it, this is what it lets you do.

I'm sure the glibc folks indeed work very hard at this and are largely
successful.  I also know, however, that over the past couple of years or
so, I've had to recompile nearly all of my applications on several
occasions when I've upgraded glibc.  Other times, glibc upgrades have gone
without a hitch.  It's probably my fault and probably somewhere deep in my
personal library I'm incorrectly fiddling with stdio internals or
something, but I just wanted to offer a counter data point that doing
this sort of thing robustly is, indeed, very hard.

-- 
    Andy Dougherty		doughera@lafayette.edu
    Dept. of Physics
    Lafayette College, Easton PA 18042

doughera
8/30/2000 6:56:35 PM
On Wed, 30 Aug 2000, Andy Dougherty wrote:

> On Tue, 29 Aug 2000, Russ Allbery wrote:
> 
> > Not a big deal, and that's certainly doable.  But it's possible to do more
> > than that if you really want to.  The glibc folks have decided to commit
> > to nearly full binary compatibility for essentially forever; the theory is
> > that upgrading libc should never break a working application even if the
> > ABI changes.  I'm not familiar with the exact details of how symbol
> > versioning works, but as I understand it, this is what it lets you do.
> 
> I'm sure the glibc folks indeed work very hard at this and are largely
> successful.  I also know, however, that over the past couple of years or
> so, I've had to recompile nearly all of my applications on several
> occasions when I've upgraded glibc.  Other times, glibc upgrades have gone
> without a hitch.  It's probably my fault and probably somewhere deep in my
> personal library I'm incorrectly fiddling with stdio internals or
> something, but I just wanted to offer a counter data point that doing
> this sort of thing robustly is, indeed, very hard.

I think we can pull this off if we're careful and draw really strict lines
about what is and isn't public stuff. It's not easy, and it will mean
we'll have to have some reference executables in a test suite for
verification, but I think we can manage that.

I do want to have a set of C/XS/whatever sources as part of the test suite
as well--right now perl's test suite only tests the language, and I think
we should also test the HLL interface we present, as it's just as
important in some ways.

					dan

dan
8/30/2000 7:29:07 PM
Dan Sugalski <dan@sidhe.org> writes:
>At 07:32 PM 8/29/00 +0000, Nick Ing-Simmons wrote:
>>David L . Nicol <david@kasey.umkc.edu> writes:
>> >
>> >Did I not just describe how a .so or a DLL works currently?
>>
>>And behind the scenes that does something akin to:
>>
>>int fd = open("file_of_posn_indepenant_byte_code",O_RDONLY);
>>struct stat st;
>>fstat(fd,&st);
>>code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
>>close(fd);
>
>Don't forget the fixup work that needs to be done afterwards. Loading the 
>library into memory's only the first part--after that the loader needs to 
>twiddle with transfer vectors and such so the unresolved calls into the 
>routines in the newly loaded library get resolved.

I finessed the "fixup work" by saying "position-independent byte code".
The fixups break the shareability of the pages, which is why you compile
shared libs with -fPIC. So we should strive to have minimal fixups and
collect them in one place (which vtables do very nicely).


-- 
Nick Ing-Simmons

nick
8/30/2000 7:58:10 PM
On Wed, 30 Aug 2000, Dan Sugalski wrote:
>
> I think we can pull this off if we're careful and draw really strict lines
> about what is and isn't public stuff. It's not easy, and it will mean
> we'll have to have some reference executables in a test suite for
> verification, but I think we can manage that.
> 
> I do want to have a set of C/XS/whatever sources as part of the test suite
> as well--right now perl's test suite only tests the language, and I think
> we should also test the HLL interface we present, as it's just as
> important in some ways.

One of the off-the-cuff suggestions I sent in to one of the myriad
polls for "fixing Perl", is that the bulk of Perl should be written in
the XS replacement.  This would force a two-pass build of perl, of
course - the first pass to build the XSR, and the second to build perl
itself.  But the advantage was that it would really solidify the
internals interface.  And give you your test suite above.  :-)

 -- 
Bryan C. Warnock
(bwarnock@gtemail.net)
bwarnock
8/30/2000 8:46:03 PM
Andy Dougherty <doughera@lafayette.edu> writes:

> I'm sure the glibc folks indeed work very hard at this and are largely
> successful.  I also know, however, that over the past couple of years or
> so, I've had to recompile nearly all of my applications on several
> occasions when I've upgraded glibc.  Other times, glibc upgrades have
> gone without a hitch.  It's probably my fault and probably somewhere
> deep in my personal library I'm incorrectly fiddling with stdio
> internals or something, but I just wanted to offer a counter data point
> that doing this sort of thing robustly is, indeed, very hard.

It may not be your fault... my understanding is that glibc 2.0 really
didn't do things right, and that glibc 2.1 did break some binary
compatibility to fix some serious bugs.  It's probably only fair to start
holding glibc to this standard from 2.2 and up.

Perl *should* have a *much* easier task than glibc, given that our
interface is positively tiny compared to the entire C library.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>
rra
8/30/2000 9:35:59 PM
Dan Sugalski wrote:

> I do want to have a set of C/XS/whatever sources as part of the test suite
> as well--right now perl's test suite only tests the language, and I think
> we should also test the HLL interface we present, as it's just as
> important in some ways.

I want to see Perl become a full-blown C/C++ JIT.  Since Perl is for
a large part a compatible subset of C I don't see this as unrealistic.

Delaying any post-token parsing of barewords until after looking at
what local declarations are in effect is part of it; dealing with the
one or two differences in operator precedence that exist is another.

(Old precedence semantics unless a new-ism like a declared typed bareword
exists in the current or a surrounding block would be the easiest way to
do it, I think.)

Typed barewords as an available good syntax would please those who find
perl overpunctuated.

XS would become a more proper part of the language; the line would blur
as we could mix Perl and C freely, with very little performance loss due
to late binding except in things that are not known at "compile time" --
things which by definition cannot be resolved without run-time inputs.



-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
0
david
8/30/2000 10:49:31 PM
"David L. Nicol" wrote:
> I want to see Perl become a full-blown C/C++ JIT.  Since Perl is for
> a large part a compatible subset of C I don't see this as unrealistic.
> [...]


A C JIT is an interesting idea.  

I think that a project works best when it has a set of goals (I haven't
really seen one yet for Perl 6).  Unless this is one of the goals, I can
easily see how this could become a serious distraction to what I
perceive as the likely goals of Perl6.
-- 
David Corbin 		
Mach Turtle Technologies, Inc.
http://www.machturtle.com
dcorbin@machturtle.com
0
dcorbin
8/31/2000 12:05:15 PM
David Corbin wrote:

 
> A C JIT is an interesting idea.
> 
> I think that a project works best when it has a set of goals (I haven't
> really seen one yet for Perl 6).  Unless this is one of the goals, I can
> easily see how this could become a serious distraction to what I
> perceive as the likely goals of Perl6.
> --
> David Corbin


What is and what is not a goal?  The danger of getting bogged down in
semantics about what the conversation is about -- arguments over whether
something is a function, a subroutine, or a method, and why, for
instance -- is very real.

Perl looks, and AFAIK has always looked, like "C plus lune noise" to
many people.  To adopt that as a listed goal -- yet another extended C --
may not be a new goal but rather a slightly different viewpoint for
including several previously stated goals, including:

	strong typing
	polymorphism
	run-time efficiency

The ability to parse various input syntaxes _is_ on the perl6 agenda,
since LW mentioned it in his initial announcement.  The idea of a
"C to Perl translator" has been kicked around as a funny joke in various
forums, such as the FunWithPerl list for one.  The "C to Perl Translator"
is funny (with current perl) for these reasons:

	1: Efficiency-wise, it is backwards.  C is speedier and is to be preferred
when you have something that works in C.

	2: It seems like it would be trivial to accomplish

	3: If you already have working C code, why would you want to translate
it to Perl rather than just use it as is?


One of the more recently stated goals is for perl6 to be fast, fast, fast.
If we have a C language front end for it, we will be able to compare its 
approach with the mature compilers -- we may very well get something that
can take you from source-code to running-process faster than `make && make run`.

Since C is very well defined and is very similar to perl -- the matching brackets
are mostly the same, for instance, and the idea of what can be a variable name is
very similar if not identical -- developing C mode might be easier than developing
Python mode as an alternate mode for the "different front ends" goal.


....

-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
0
david
8/31/2000 4:14:10 PM
On Thu, 31 Aug 2000, David L. Nicol wrote:

> Perl looks, and AFAIK has always looked, like "C plus lune noise" to
> many people.

I think Perl looks like "C plus moon noise" to former C programmers.  I
imagine some people see it and think "Csh plus Awk noise".  Perl is a lot
more than C-with-scalars.

> 	strong typing

C's typing is not particularly strong.  Witness the common abuse of
"(void *)".  Witness enums that are all compatible with integers.  If we
want strong typing (I don't), there are better places to look.
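
For instance, all of this is accepted silently (a contrived sketch; most
C compilers won't even warn):

enum color { RED, GREEN };

void example(void)
{
    enum color c = 12;   /* no relation to RED or GREEN */
    void *p = &c;
    int *ip = p;         /* (void *) forgets the type entirely */
    *ip = RED + 7;       /* enums are just ints in disguise */
}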

> 	run-time efficiency

C doesn't get run-time efficiency from its syntax, so we can't really
expect to get anything here.  It gets it from its compilation
architecture.  If you want to build a Perl frontend for GCC I think you
might find a way to leverage C's efficiency but you won't get it just by
accepting C syntax.

-sam


0
sam
8/31/2000 4:31:10 PM

A Perl frontend to GCC would make my life wonderful. Who would I talk
to about that? I'm not about to pretend that I have any idea how to
do that.

--Dave

-> -----Original Message-----
-> From: Sam Tregar [mailto:sam@tregar.com]
-> Sent: Thursday, August 31, 2000 9:31 AM
-> To: perl6-language@perl.org
-> Subject: Re: the C JIT
->
-> [...]


0
dolbersen
8/31/2000 4:32:31 PM
[perl6-language removed from the follow-up]

"David L. Nicol" wrote:
> I want to see Perl become a full-blown C/C++ JIT.  Since Perl is for
> a large part a compatible subset of C I don't see this as unrealistic.

Trolling? First, Perl is more like lisp with a good syntax -- in other
words about as far from C as you can get. (Perl is so good at what it does
though that lots of people would agree with you.)  Second, lots of
people don't have a C compiler but still want to run Perl code.

I'm not opposed to producing intermediate C code with a Perl compiler,
but a C-based JIT is not possible.

- Ken
0
kfox
8/31/2000 8:07:41 PM
Ken Fox wrote:

> Trolling? 


No, I'm not; it's the direction RFC 61 ends up going if you let it
take you there.

fast perl6 becomes, as well as slicing, dicing and scratching your
back, a drop-in replacement for gcc.

-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
          Kansas City Perl Mongers will meet Sept. 20th at 7:00 in
          Westport Flea Market Bar & Grill  http://tipjar.com/kcpm
0
david
8/31/2000 9:03:01 PM
"David L. Nicol" wrote:
> No, I'm not, it's the direction that RFC 61 ends up if you let it
> take you there.

You seem to be confusing:

   (1) linking C code with Perl

with

   (2) compiling Perl to C code

There is a world of difference. Swig does (1) pretty well already.
If you want a first class blessed/tied interface you learn the perl
types and internals by reading perlguts. This is *not* XS.

If you want (2) then you've got a lot of work. For example, you can't
use built-in C data types or even the C function call stack because
they aren't compatible with Perl. It's possible to design a large
library of data structures to help, but then you've simply re-invented
libperl.so. The real problems of exception handling, closures, dynamic
scoping, etc. are just not possible to solve using simple C code.
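
Closures are the clearest case.  A contrived sketch of why a captured
variable can't live on the C stack:

/* a Perl closure must keep its captured lexical alive after the
   creating sub returns; the literal C translation dangles */
int *make_counter(void)
{
    int n = 0;
    return &n;   /* n's frame is gone the moment we return */
}

Perl instead heap-allocates the pad and keeps it alive by refcount --
which is to say, you end up re-inventing exactly that machinery.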

- Ken
0
kfox
8/31/2000 9:45:31 PM
Sam Tregar wrote:
> 
> On Thu, 31 Aug 2000, David L. Nicol wrote:

> >       run-time efficiency
> 
> C doesn't get run-time efficiency from its syntax, so we can't really
> expect to get anything here.  It gets it from its compilation
> architecture.  If you want to build a Perl frontend for GCC I think you
> might find a way leverage C's efficiency but you won't get it just by
> accepting C syntax.
> 
> -sam


C gets specificity from its syntax.  cc depends on that specificity.
Perl does not require specificity, which is good.  But when I want
to rewrite my Perl program faster, in C, I have to rewrite the whole
thing instead of just aggressively specifying the bottlenecks.


We're talking about making a faster Perl.  C's syntax requires enough
clarity to compile to something quick.  It is a very short hop from
	my dog $spot;
to
	dog spot;

If we only allow this where enough info is available to allocate dog-sized
pieces of memory directly, Perl can blaze through the code that deals with
dogs.




-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
          Kansas City Perl Mongers will meet Sept. 20th at 7:00 in
          Westport Flea Market Bar & Grill  http://tipjar.com/kcpm
0
david
8/31/2000 10:29:23 PM
Ken Fox wrote:
> . The real problems of exception handling, closures, dynamic
> scoping, etc. are just not possible to solve using simple C code.
> 
> - Ken

I'm not talking about translating perl to C code, I'm talking about
translating perl to machine language.  

C is babytalk compared to Perl, when it comes to being something
which is translatable to machine language.  Ug.


-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
          Kansas City Perl Mongers will meet Sept. 20th at 7:00 in
          Westport Flea Market Bar & Grill  http://tipjar.com/kcpm
0
david
8/31/2000 10:50:46 PM
>>>>> "DLN" == David L Nicol <david@kasey.umkc.edu> writes:

  DLN> Ken Fox wrote:
  >> . The real problems of exception handling, closures, dynamic
  >> scoping, etc. are just not possible to solve using simple C code.
  >> 
  >> - Ken

  DLN> I'm not talking about translating perl to C code, I'm talking about
  DLN> translating perl to machine language.  

  DLN> C is babytalk compared to Perl, when it comes to being something
  DLN> which is translatable to machine language.  Ug.

the best fit is the TIL (threaded inline code) model we have
discussed. it generates just the sub calls and stack stuff in machine
code. the rest of the work is done with subs. it has some benefits on
both sides. it is not as hard to generate as full compilation and you
can get a good speedup by bypassing the opcode dispatch loop of the
interpreter. 

this would just be a plugin to the backend and it could support multiple
cpu types with a set of architecture specific modules. 

you could then deliver perl as a single binary (though easily decoded)
which many want for ease of delivery (but what about loading modules?).

in any case, TIL is not JIT but a full pass done by a backend to
generate the sub calls.
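
to show the shape of it, a rough sketch (stub op functions, nothing like
the real internals):

/* interpreter today: every op goes through the dispatch loop */
typedef void (*op_fn)(void);

static void op_push(void) { /* push a constant (stub) */ }
static void op_add(void)  { /* add the top two slots (stub) */ }

/* ops is a NULL-terminated array of op functions */
static void run(op_fn *ops)
{
    int pc;
    for (pc = 0; ops[pc]; pc++)
        ops[pc]();             /* fetch, dispatch, repeat */
}

/* what a TIL backend would emit instead: straight-line code that is
   nothing but sub calls -- the dispatch loop is gone */
static void compiled_body(void)
{
    op_push();
    op_push();
    op_add();
}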

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  -----------  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  ----------  http://www.northernlight.com
0
uri
8/31/2000 11:11:58 PM
On Thu, 31 Aug 2000, David L. Nicol wrote:

> We're talking about making a faster Perl.  C's syntax requires enough
> clarity to compile to something quick.  It is a very short hop from
> 	my dog $spot;
> to
> 	dog spot;

What about the second version would result in faster execution?  Do you
think that the "$" slows down Perl?  Is it that dropping the "my" would
make "spot" global and thus faster?  What are you getting at?

Again: C doesn't get its speed from its syntax.  Supporting C-esque syntax
doesn't make Perl faster.

-sam


0
sam
9/1/2000 12:00:31 AM
Ken Fox wrote:
> Perl is more like lisp with a good syntax -- in other
> words about as far from C as you can get. 

I agree 100%.

-- 
John Porter

0
jdporter
9/1/2000 3:20:37 PM
David L. Nicol wrote:
> Ken Fox wrote:
> > . The real problems of exception handling, closures, dynamic
> > scoping, etc. are just not possible to solve using simple C code.
> > 
> > - Ken
> 
> I'm not talking about translating perl to C code, I'm talking about
> translating perl to machine language.  

Same diff.   C is just portable assembly language.


> C is babytalk compared to Perl, when it comes to being something
> which is translatable to machine language.  Ug.

If you meant what it looks like you wrote, then you're wrong.
C is as translatable to machine language as anything in the
world, and more so than most, at least if you don't consider
assembly languages proper.  Perl's machine model is extremely
different from any "real" machine ever made -- with the possible
exception of the Lisp Machine (I don't know, never used one) --
and this alone makes it hard to translate into machine code,
even going through C.  And this is true even if our target is
something like the JVM, which is still essentially a low-level
machine, not unlike silicon.

-- 
John Porter

	We're building the house of the future together.

0
jdporter
9/1/2000 3:24:39 PM
Uri Guttman wrote:
> 
> the best fit is the TIL (threaded inline code) model we have
> discussed. 

Yes!


-- 
John Porter

0
jdporter
9/1/2000 3:25:06 PM
Sam Tregar wrote:
> 
> On Thu, 31 Aug 2000, David L. Nicol wrote:
> 
> > We're talking about making a faster Perl.  C's syntax requires enough
> > clarity to compile to something quick.  It is a very short hop from
> >       my dog $spot;
> > to
> >       dog spot;
> 
> What about the second version would result in faster execution?  Do you
> think that the "$" slows down Perl?  Is it that dropping the "my" would
> make "spot" global and thus faster?  What are you getting at?

No, I imagine that dropping both the my and the $ would make spot a
fixed-size stacked lexical, as it does in C.  A speed increase would
result from the compiler being able to resolve references to spot
within the current and enclosed blocks as a fixed offset from a known
pointer rather than through five or six levels of indirection.


> Again: C doesn't get its speed from its syntax.  Supporting C-esque syntax
> doesn't make Perl faster.

C syntax and Perl syntax are mostly compatible, in the sense that what
is legal in one is usually not a syntax error in the other.

A lot of code is in C.

Including it -- like,

	#include "somecode.c"

-- has immense potential for re-use value.

Supporting C semantics -- fixed offsets from a stack pointer instead of
table lookups -- may very well make perl faster.
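
A contrived sketch of the contrast (hypothetical names; perl's real pad
code is subtler than a per-access lookup, but this is the shape of it):

/* table-lookup style: find the variable's slot at run time */
double pad_lookup(const char *name);   /* declaration only, for show */

/* fixed-offset style: "dog spot;" becomes a slot whose offset is
   known at compile time */
struct frame {
    double spot;
};

double use_spot(struct frame *fp)
{
    return fp->spot;   /* one load at a fixed offset from fp */
}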



-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
          Kansas City Perl Mongers will meet Sept. 20th at 7:00 in
          Westport Flea Market Bar & Grill  http://tipjar.com/kcpm
0
david
9/1/2000 4:36:49 PM