RFC 326 (v1) Symbols, symbols everywhere

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Symbols, symbols everywhere

=head1 VERSION

  Maintainer: Paolo Molaro <lupus@debian.org>
  Date: 26 Sep 2000
  Mailing List: perl6-internals@perl.org
  Number: 326
  Version: 1
  Status: Developing

=head1 ABSTRACT

Perl should adopt scheme-like symbols, both at the language level
and at the internals level.

=head1 DESCRIPTION

Symbols can be useful in a variety of ways both at the language
level and as an implementation detail for efficiency reasons.

=head2 Symbols at the language level

A possible operator to create a symbol could be qs// for B<q>uote B<s>ymbol.
A one character operator could be useful as well, but since ^ is likely
to be taken by curried expressions, an alternative could be C<:>.

	$symbol = qs(methodname); # or ^methodname or :methodname
	$object->$symbol;

They can also be useful in XS modules to map constants more efficiently
than today's constant() function.

Characters following the one-char operator are limited to the usual
identifier characters. The same character used as the operator should
probably also be used in subroutine prototypes to allow for compile-time
optimizations.

Only equality and inequality comparisons should be allowed on symbols.

=head2 Symbols in the implementation

Symbols are "interned" during compilation. From then on symbols are
actually treated like numbers, so comparisons are faster. They should be
used to hold package names, method names and so on. This way symbol tables
can be implemented as optimized integer hash tables instead of string hash
tables (for packages with few methods a binary search could also be a win).
A symbol should be used anywhere a package name or method name is used now
at the API level (this applies to variable names, too).
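The interning step described above can be sketched in plain Perl. This is only an illustration under stated assumptions: the names C<intern>, C<symbol_name> and C<call_method> are hypothetical, and a per-package array indexed by symbol id stands in for the "optimized integer hash table" (with a global id space a real implementation would likely want something denser).

```perl
use strict;
use warnings;

# Hypothetical intern table: each distinct string gets a small integer
# the first time it is seen, and the string is remembered for printing.
my %id_of;      # string  -> integer id
my @name_of;    # integer -> string

sub intern {
    my ($name) = @_;
    return $id_of{$name} if exists $id_of{$name};
    push @name_of, $name;
    return $id_of{$name} = $#name_of;
}

sub symbol_name {
    my ($id) = @_;
    return $name_of[$id];
}

# A toy "symbol table" for one package, indexed by symbol id instead of
# by method-name string.
my @methods;
$methods[ intern('DESTROY') ] = sub { return 'destroying' };
$methods[ intern('new') ]     = sub { return 'constructing' };

sub call_method {
    my ($id) = @_;
    my $code = $methods[$id]
        or die "no method " . symbol_name($id) . "\n";
    return $code->();
}
```

Interning the same name twice returns the same id, so a caller that keeps the id around never needs to compare or hash the method-name string again.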

=head1 IMPLEMENTATION

Should be trivial. A symbol scalar could be a read-only scalar with both
the string and integer properties set. Note that additional memory should
not be allocated for each copy of a symbol, and that once "interned",
symbols are never freed; the vtable stuff can make this very simple.
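A scalar "with both the string and integer properties set" can already be approximated in Perl 5 with C<dualvar> from Scalar::Util. The sketch below is an assumption-laden illustration, not the proposed implementation: C<make_symbol> is a hypothetical stand-in for qs//, the scalars are not actually read-only, and each copy still carries its own string buffer, so it models the semantics rather than the memory savings.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my %symbol_for;      # name -> interned dual-valued scalar
my $next_id = 1;     # ids are handed out in order of first use

# Hypothetical stand-in for the proposed qs// operator: the numeric value
# is the symbol id, the string value is the symbol name.
sub make_symbol {
    my ($name) = @_;
    $symbol_for{$name} = dualvar($next_id++, $name)
        unless exists $symbol_for{$name};
    return $symbol_for{$name};
}

my $destroy = make_symbol('DESTROY');
my $again   = make_symbol('DESTROY');
my $new     = make_symbol('new');

print "same symbol\n"      if $destroy == $again;   # integer comparison
print "different symbol\n" if $destroy != $new;
print "name: $new\n";                               # string value is kept
```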

=head1 REFERENCES

Your preferred scheme book.

GLib's GQuarks and Xlib's Atoms for implementations.

perl6
9/27/2000 5:37:06 AM

At 05:37 AM 9/27/00 +0000, Perl6 RFC Librarian wrote:
>Perl should adopt scheme-like symbols, both at the language level
>and at the internals level.

The explanation of this isn't that clear for me. (I have no scheme 
experience at all)

It sounds like a sort of dynamically-created version of C's enum or a 
mutant version of the pseudo-hash stuff in perl 5.6. Is that more or less 
right?

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
9/27/2000 6:34:01 AM

On Wed, 27 Sep 2000, Dan Sugalski wrote:

> At 05:37 AM 9/27/00 +0000, Perl6 RFC Librarian wrote:
> >Perl should adopt scheme-like symbols, both at the language level
> >and at the internals level.
> 
> The explanation of this isn't that clear for me. (I have no scheme 
> experience at all)

	It isn't terribly clear to me either, but I think that what he's
saying is that you can qs() a method name, get a "thingie" out, store the
thingie in a scalar, and then that scalar becomes a direct portal to the
method...somewhat like a coderef, but without a required deref.

			Dave

dstorrs
9/27/2000 4:27:18 PM
Dave Storrs wrote:
> It isn't terribly clear to me either

Well, he does give a couple references that would clear it up.
X11 Atoms are well documented.

> saying is that you can qs() a method name, get a "thingie" out, store the
> thingie in a scalar, and then that scalar becomes a direct portal to the
> method...somewhat like a coderef, but without a required deref.

Actually it's more trivial than that. When you "intern" a symbol, you're
adding a string-to-integer mapping to the compiler's symbol table. Whenever
the compiler sees the string, it replaces it with the corresponding
integer. (The type stays "symbol" though; I'm sort of mixing implementation
and semantics.) Think of it like a compile-time hash for barewords.

Dan was right to think of this as a C enum equivalent. The only real
differences being that you don't have a chance to define the integer
mapping and that the printable identity of the symbol is remembered by
the run-time.

I'm not sure I'm as enthusiastic about symbols speeding things up as
the RFC author though. I guess it speeds up hash lookups and stuff, but
that could be memoized by the compiler anyways.

- Ken
kfox
9/27/2000 7:07:19 PM
On Wed, Sep 27, 2000 at 03:07:19PM -0400, Ken Fox wrote:
> Dan was right to think of this as a C enum equivalent. The only real
> differences being that you don't have a chance to define the integer
> mapping and that the printable identity of the symbol is remembered by
> the run-time.

I don't yet understand why this is useful.

-- 
"Having just ordered 40 books and discovered I have no change out of a grand, 
I'm thinking of getting a posse together and going after some publishers. I'd 
walk into a petrol station and buy lots of petrol on Monday, too, but I think 
I'd get funny looks. More funny looks." - Mark Dickerson
simon
9/27/2000 7:20:01 PM
On 09/27/00 Ken Fox wrote:
> Dave Storrs wrote:
> > It isn't terribly clear to me either
> 
> Well, he does give a couple references that would clear it up.
> X11 Atoms are well documented.
> 
> > saying is that you can qs() a method name, get a "thingie" out, store the
> > thingie in a scalar, and then that scalar becomes a direct portal to the
> > method...somewhat like a coderef, but without a required deref.
> 
> Actually it's more trivial than that. When you "intern" a symbol, you're
> adding a string-to-integer mapping to the compiler's symbol table. Whenever
> the compiler sees the string, it replaces it with the corresponding
> integer. (The type stays "symbol" though; I'm sort of mixing implementation
> and semantics.) Think of it like a compile-time hash for barewords.

Not only that: every time the compiler sees another symbol with the
same string representation, it uses the already created symbol, so
it doesn't use more memory.
A non-trivial program will probably use several packages (or binary
modules that use several packages, e.g. Gtk). Let's look at the DESTROY
method. Currently a string is malloc()ed (in the symbol table for
every package), so that takes 8 bytes for the string + the malloc
overhead (at least 4 bytes, probably 8 on 32-bit systems). This
doesn't count other memory that could be saved by using hash tables
optimized for symbols (i.e. integers instead of strings).
Repeat that for all the duplicated method names in a class hierarchy
and you'll easily save several KB of memory.
As a bonus you'll get better performance (as an integer compare is
faster than a strcmp).

As for the possible uses in the language, I should have used a
better example. Let's consider an XML/SGML file with all those
ugly tags and attributes we love (well, no!). An XML parser
loads the file and stores the tag and attribute names as
strings: a lot of tags appear many times in an XML file,
leading to a huge memory consumption problem. Now, if the
parser could use symbols, the memory for a tag name would
be allocated only once (so it's also faster because it
doesn't call malloc() that often).
Walking the tree in your perl program, you could use integer
comparison instead of string comparison.
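To make the XML case concrete, here is a minimal sketch in today's Perl, assuming a hypothetical intern_tag() helper and a hand-built tree in place of a real parser. Each node stores only the integer id of its tag, and the tree walk compares integers.

```perl
use strict;
use warnings;

my %tag_id;      # tag name -> integer id (hypothetical intern table)
my @tag_name;    # integer id -> tag name, stored once per distinct tag

sub intern_tag {
    my ($name) = @_;
    return $tag_id{$name} if exists $tag_id{$name};
    push @tag_name, $name;
    return $tag_id{$name} = $#tag_name;
}

# A tiny pre-parsed document: each node is [ tag id, child nodes... ].
my $doc = [ intern_tag('html'),
    [ intern_tag('body'),
        [ intern_tag('p') ],
        [ intern_tag('p') ],
        [ intern_tag('div'), [ intern_tag('p') ] ],
    ],
];

# Count the elements with a given tag using integer comparison only.
sub count_tag {
    my ($node, $wanted) = @_;
    my ($id, @children) = @$node;
    my $count = $id == $wanted ? 1 : 0;
    $count += count_tag($_, $wanted) for @children;
    return $count;
}

my $p = intern_tag('p');
print "paragraphs: ", count_tag($doc, $p), "\n";
print "distinct tag names stored: ", scalar(@tag_name), "\n";
```

The tree above has six elements but only four tag-name strings are ever stored, which is the saving being argued for.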

use Benchmark;

$num1 = 10;
$num2 = 20;
$string1 = 'htmltag';
$string2 = 'htmltag';
$string3 = 'buffy';

timethese(10000000, {
	'number' => '$num1 == $num2',
	'stringe' => '$string1 eq $string2', # worst case
	'string' => '$string1 eq $string3',  # best case: length differs
});

Gives:
Benchmark: timing 10000000 iterations of number, string, stringe...
    number:  4 wallclock secs ( 3.87 usr +  0.00 sys =  3.87 CPU)
    string:  6 wallclock secs ( 4.28 usr +  0.01 sys =  4.29 CPU)
   stringe:  7 wallclock secs ( 5.97 usr +  0.00 sys =  5.97 CPU)

In the internals using C the performance gains are way more than
the 30% average here.

So, both for internal use and as a language feature there are
advantages, and implementation is easy. If no one shows a significant
drawback, it's a deal:-)

The only real problem I see is choosing the single character for
using symbols in the language. I suggested ^ or :, but * may work
as well if typeglobs go away.

Thanks,
	lupus

-- 
Paolo Molaro, Open Source Developer, Linuxcare, Inc.
+39.049.8043411 tel, +39.049.8043412 fax
lupus@linuxcare.com, http://www.linuxcare.com/
Linuxcare. Support for the revolution.
lupus
10/2/2000 1:42:24 PM
At 03:42 PM 10/2/00 +0200, Paolo Molaro wrote:
>On 09/27/00 Ken Fox wrote:
> > Dave Storrs wrote:
> > > It isn't terribly clear to me either
> >
> > Well, he does give a couple references that would clear it up.
> > X11 Atoms are well documented.
> >
> > > saying is that you can qs() a method name, get a "thingie" out, store the
> > > thingie in a scalar, and then that scalar becomes a direct portal to the
> > > method...somewhat like a coderef, but without a required deref.
> >
> > Actually it's more trivial than that. When you "intern" a symbol, you're
> > adding a string-to-integer mapping to the compiler's symbol table. Whenever
> > the compiler sees the string, it replaces it with the corresponding
> > integer. (The type stays "symbol" though; I'm sort of mixing implementation
> > and semantics.) Think of it like a compile-time hash for barewords.
>
>Not only that: every time the compiler sees another symbol with the
>same string representation, it uses the already created symbol, so
>it doesn't use more memory.

Ah, I see what you're asking.

Whether this sort of thing is user-visible is a separate issue (and one for 
-language). Personally I don't think it should be--there's reasonably 
little value at the user level.

For the internals, though...

This would be very useful, and it's a feature I'd really like to implement. 
Basically you're asking for pre-computed, indirect, shared hash keys. This 
sounds like a Good Plan to me.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
10/2/2000 8:13:33 PM
>>>>> "DS" == Dan Sugalski <dan@sidhe.org> writes:

DS> For the internals, though...

DS> This would be very useful, and it's a feature I'd really like to implement. 
DS> Basically you're asking for pre-computed, indirect, shared hash keys. This 
DS> sounds like a Good Plan to me.

Why precomputed? Any 'interned' string has a unique value (e.g. address).
Though wouldn't they have to be garbage collected? Short lived hashes
with constantly changing keys, the shared hash keys would keep growing.

Actually, this might be something useful at the user level. Many times
I do this

	@record{@keys} = new_values();

Using a set of 'intern'ed strings might make it more efficient. Unless
we are able to note that @keys is always the same, the hash values would
have to keep getting recomputed. With the symbols we might be able to
recognize the constant set.
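One way to approximate "recognizing the constant set" by hand in current Perl is to resolve the keys to array slots once and fill records positionally. This is only a sketch of the idea, not what symbols would do internally; make_layout() and new_record() are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve a fixed list of keys to array slots once.
sub make_layout {
    my @keys = @_;
    my %slot_of;
    @slot_of{@keys} = (0 .. $#keys);
    return \%slot_of;
}

my @keys   = qw(name host port);
my $layout = make_layout(@keys);

# Slots looked up once, up front, play the role of interned symbols.
my ($NAME, $HOST, $PORT) = @{$layout}{@keys};

# Building a record is a positional copy: no key is hashed here.
sub new_record {
    my @values = @_;
    return [@values];
}

my $record = new_record('www', 'example.org', 80);
print "port is $record->[$PORT]\n";
```

The keys are hashed once in make_layout() and once more when the slots are fetched, rather than once per record as in the hash-slice assignment.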

<chaim>
-- 
Chaim Frenkel					     Nonlinear Knowledge, Inc.
chaimf@pobox.com				               +1-718-236-0183
chaimf
10/6/2000 3:54:05 AM
At 11:54 PM 10/5/00 -0400, Chaim Frenkel wrote:
> >>>>> "DS" == Dan Sugalski <dan@sidhe.org> writes:
>
>DS> For the internals, though...
>
>DS> This would be very useful, and it's a feature I'd really like to implement.
>DS> Basically you're asking for pre-computed, indirect, shared hash keys. This
>DS> sounds like a Good Plan to me.
>
>Why precomputed? Any 'interned' string has a unique value (e.g. address).
>Though wouldn't they have to be garbage collected? Short lived hashes
>with constantly changing keys, the shared hash keys would keep growing.

I'm thinking of a central store for text constants used as hash keys. (How 
many times do we use AUTOLOAD in perl 5, for example?) Something reasonably 
automagic, such that we don't have to go recompute hash values every 
fscking time we access something with a hash value known at program compile 
time. (Or perl compile time, for that matter)
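A rough Perl-level sketch of such a central store follows. Everything here is an assumption made for illustration: shared_key() and toy_hash() are hypothetical, the hash function is a toy, and looking a string up in %store of course still hashes it in this model; the point is only that every user of a given text ends up holding the same record, whose hash value was computed once.

```perl
use strict;
use warnings;

my %store;           # text -> the one shared key record for that text
my $hash_calls = 0;  # counts how often the hash function actually runs

# Toy string hash, standing in for whatever the interpreter would use.
sub toy_hash {
    my ($text) = @_;
    $hash_calls++;
    my $h = 0;
    $h = ($h * 33 + ord($_)) % 1_000_003 for split //, $text;
    return $h;
}

# Return the shared record for this text, hashing it only on first use.
sub shared_key {
    my ($text) = @_;
    return $store{$text} ||= { text => $text, hash => toy_hash($text) };
}
```

A compiler could call something like shared_key('AUTOLOAD') once per constant at compile time and embed the returned reference, so run-time lookups start from a known hash value.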

I'm also wondering if attaching a hash value to SVs would be a performance 
win--probably not, but it's tempting to check and see.


					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

dan
10/9/2000 4:24:14 PM
>>>>> "DS" == Dan Sugalski <dan@sidhe.org> writes:

>> Why precomputed? Any 'interned' string has a unique value (e.g. address).
>> Though wouldn't they have to be garbage collected? Short lived hashes
>> with constantly changing keys, the shared hash keys would keep growing.

DS> I'm thinking of a central store for text constants used as hash keys. (How 
DS> many times do we use AUTOLOAD in perl 5, for example?) Something reasonably 
DS> automagic, such that we don't have to go recompute hash values every 
DS> fscking time we access something with a hash value known at program compile 
DS> time. (Or perl compile time, for that matter)

DS> I'm also wondering if attaching a hash value to SVs would be a performance 
DS> win--probably not, but it's tempting to check and see.

Well, if having a dual-natured string/number is a win, then the first
time a string is used as a hash key, the hash value could be saved.

But the biggest win would probably be compile time hash values. 

Hmm,

my @values : forKeys = qw( .... );


<chaim>
-- 
Chaim Frenkel					     Nonlinear Knowledge, Inc.
chaimf@pobox.com				               +1-718-236-0183
chaimf
10/10/2000 1:31:42 AM