destabilizing mergesort

--001a113cdf3c975ccb055448d5a5
Content-Type: text/plain; charset="UTF-8"

In what must establish a new standard for "leisurely", in a posting from
September 11, 2013, I said

If this gets enough +1s, then if someone points me at the documentation for
creating a git branch, I'll take a (leisurely) whack at making stability
optional.


I have finally gotten around to looking into this. It got stalled because
the simplicity of destabilizing the merge phase fooled me into thinking it
would also be simple to destabilize the code that establishes the initial
runs. This appears to *not* be the case. Stability is baked into that code
in far more subtle ways than it is in the merge phase. A few days of hard
thinking didn't lead me anywhere, so I essentially gave up. But it's been
nagging at me, and I now think I understand the run setup logic well enough
that I can see a path to destabilizing it.

As a proof of concept that I know how to file a patch using git, I just
filed a bug report on an unrelated issue in pp_sort.c. The S_mergesortsv
code starts with

    SVCOMPARE_t savecmp = NULL;

    if (nmemb <= 1) return;                     /* sorted trivially */

    if ((flags & SORTf_DESC) != 0) {
        savecmp = PL_sort_RealCmp;      /* Save current comparison routine,
if any */
        PL_sort_RealCmp = cmp;  /* Put comparison routine where cmp_desc
can find it */
        cmp = cmp_desc;
    }

and ends with

    if (flags) {
         PL_sort_RealCmp = savecmp;     /* Restore current comparison
routine, if any */
    }
    return;

But that just feels *wrong*. I think SORTf_STABLE is on in flags by
default, so we will be restoring something that was never saved. My bug
report replaces that final code with

    if (savecmp != NULL) {
         PL_sort_RealCmp = savecmp;     /* Restore current comparison
routine, if any */
    }
    return;

I'm puzzled that this has never caused problems, which makes me wonder if I
really understand what is going on.

Anyway, I have already done some testing with a destabilized merge phase,
and that's looking good in terms of reducing the number of comparison calls
when

    no sort "stable";

is invoked. Assuming my bug report goes through, I'll do some modifications
to the sort testing and documentation, and incorporate the (simple) changes
to the merge phase in another bug report, then take on the more complicated
issue of destabilizing the run setup code. And I'll try to be less
leisurely about it :-).

--001a113cdf3c975ccb055448d5a5
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">In what must establish a new standard for &quot;leisurely&=
quot;, in a posting from September 11, 2013, I said<div><br></div><blockquo=
te style=3D"margin:0px 0px 0px 40px;border:none;padding:0px"><div><div>If t=
his gets enough +1s, then if someone points me at the documentation for</di=
v></div><div><div>creating a git branch, I&#39;ll take a (leisurely) whack =
at making stability optional.</div></div></blockquote><div><br></div><div>I=
 have finally gotten around to looking into this. It got stalled because th=
e simplicity of destabilizing the merge phase fooled me into thinking it wo=
uld also be simple to destabilize the code that establishes the initial run=
s. This appears to <b>not</b> be the case. Stability is baked into that cod=
e in far more subtle ways than it is in the merge phase. A few days of hard=
 thinking didn&#39;t lead me anywhere, so I essentially gave up. But it&#39=
;s been nagging at me, and I now think I understand the run setup logic wel=
l enough that I can see a path to destabilizing it.</div><div><br></div><di=
v>As a proof of concept that I know how to file a patch using git, I just f=
iled a bug report on an unrelated issue in pp_sort.c. The=C2=A0S_mergesorts=
v code starts with</div><div><br></div><div><div>=C2=A0 =C2=A0 SVCOMPARE_t =
savecmp =3D NULL;</div><div><br></div><div>=C2=A0 =C2=A0 if (nmemb &lt;=3D =
1) return; =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 /* sorted trivially */</div><div><br></div><div>=C2=A0 =C2=A0 if ((f=
lags &amp; SORTf_DESC) !=3D 0) {</div><div>=C2=A0 =C2=A0 =C2=A0 =C2=A0 save=
cmp =3D PL_sort_RealCmp; =C2=A0 =C2=A0 =C2=A0/* Save current comparison rou=
tine, if any */</div><div>=C2=A0 =C2=A0 =C2=A0 =C2=A0 PL_sort_RealCmp =3D c=
mp; =C2=A0/* Put comparison routine where cmp_desc can find it */</div><div=
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 cmp =3D cmp_desc;</div><div>=C2=A0 =C2=A0 }</d=
iv></div><div><br></div><div>and ends with</div><div><br></div><div><div>=
=C2=A0 =C2=A0 if (flags) {</div><div>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0PL_s=
ort_RealCmp =3D savecmp; =C2=A0 =C2=A0 /* Restore current comparison routin=
e, if any */</div><div>=C2=A0 =C2=A0 }</div><div>=C2=A0 =C2=A0 return;</div=
></div><div><br></div><div>But that just feels <b>wrong</b>. I think=C2=A0S=
ORTf_STABLE is on in flags by default, so we will be restoring something th=
at was never saved. My bug report replaces that final code with</div><div><=
br></div><div><div>=C2=A0 =C2=A0 if (savecmp !=3D NULL) {</div><div>=C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0PL_sort_RealCmp =3D savecmp; =C2=A0 =C2=A0 /* Re=
store current comparison routine, if any */</div><div>=C2=A0 =C2=A0 }</div>=
<div>=C2=A0 =C2=A0 return;</div></div><div><br></div><div>I&#39;m puzzled t=
hat this has never caused problems, which makes me wonder if I really under=
stand what is going on.</div><div><br></div><div>Anyway, I have already don=
e some testing with a destabilized merge phase, and that&#39;s looking good=
 in terms of reducing the number of comparison calls when</div><div><br></d=
iv><div>=C2=A0 =C2=A0 no sort &quot;stable&quot;;</div><div><br></div><div>=
is invoked. Assuming my bug report goes through, I&#39;ll do some modificat=
ions to the sort testing and documentation, and incorporate the (simple) ch=
anges to the merge phase in another bug report, then take on the more compl=
icated issue of destabilizing the run setup code. And I&#39;ll try to be le=
ss leisurely about it :-).</div></div>

--001a113cdf3c975ccb055448d5a5--
0
jpl
7/14/2017 3:38:00 PM
perl.perl5.porters 46587 articles. 0 followers. Follow

7 Replies
16 Views

Similar Articles

[PageSpeed] 19

John P. Linderman wrote:
> As a proof of concept that I know how to file a patch using git, I =
just
> filed a bug report on an unrelated issue in pp_sort.c.

I have not seen any bug reports from you show up.  I find the easiest =
way is just to send e-mail to perlbug@perl.org with a patch as an =
attachment.  That Just Works.=
0
sprout
7/23/2017 8:55:44 PM
--Apple-Mail=_D69EE97F-3F2F-44AE-ABFB-B99664EDEE96
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=windows-1252

John P. Linderman wrote:
> I agree with the comments (back in 2013!) that said that a stable sort
> *must* be the default, as it has been since mergesort was integrated =
into
> pp_sort.c. This is complicated by the quirky communication between the =
sort
> pragma, via op.c, to the code in pp_sort.c. Simply put, the =
SORTf_STABLE
> bit that qsort and mergesort see has *not* been interrogated by the =
merge
> sort. And it is, by default (in sort.pm and op.c) *not* set: stability =
has
> been *implicit* when mergesort is invoked. So if I allow mergesort to =
be
> unstable, how do we communicate that? I see three possibilities, none =
of
> them very good.
>=20
>    - Change sort.pm and op.c so that SORTf_STABLE *is* set by default. =
The
>    mergesort code would start interrogating that bit to determine if =
stability
>    was demanded. The downside is that it would make qsort stable by =
default.
>    If qsort finds SORTf_STABLE *on*, it will be much less efficient on =
the
>    kind of input where qsort might have a performance advantage over
>    mergesort. And turning stability *off* for qsort would be tricky. =
The
>    _qsort subpragma cannot turn the $sort::stable_bit hint off, =
because that
>    would make "use sort qw(stable _qsort);" or "use sort qw(stable); =
use sort
>    qw(_qsort);" ineffective. Mind you, a stabilized quicksort will =
still
>    return legitimate results, so it cannot break existing code. But it =
could
>    slow things down.
>    - Add a new subpragma and sort bit for "unstable". mergesort would
>    continue to ignore SORTf_STABLE, and instead test SORTf_UNSTABLE. =
qsort
>    would continue to test (only) SORTf_STABLE, which would continue to =
be off
>    by default. The downside is primarily the confusion of having two
>    subpragmas for what seems like the same thing.
>    - Maybe it's time to pull the plug on quicksort. It's a lot of =
code, and
>    I suspect it is seldom invoked. "perldoc sort" has advised that =
subpragmas
>    beginning with "_" may not persist beyond Perl 5.8, and we're long =
past
>    that. We could "phase it out gently" using the first suggestion, =
with some
>    suitable warning.

It has already been suggested in ticket #119635, but nobody could give a =
definitive answer.  (No pumpking spoke up at the time.)

>=20
> This shouldn't be my choice alone. Maybe other porters have better
> suggestions. Meanwhile, I'll plod on with trying to make stability =
optional
> in mergesort, and worry about the details of expressing the option =
later.

I think it is as simply as leaving each algorithm to use its own default =
(stable for mergesort; unstable for qsort), but let =91use/no sort =
'stable'=92 set it specifically for a given scope.  So:

    use sort 'stable'; # makes no difference, unless we are using qsort

    no sort 'stable'; # make mergesort unstable

Internally we can have two flags, stable and unstable, but that =
implementation detail does not need to be exposed to the user.

The attached patch (untested) is one way this might be accomplished.  Of =
course, it is up to the mergesort code to make use of the SORTf_WOBBLY =
flag.


--Apple-Mail=_D69EE97F-3F2F-44AE-ABFB-B99664EDEE96
Content-Disposition: attachment;
	filename=open_XsmdTqc9.txt
Content-Type: text/plain;
	name="open_XsmdTqc9.txt"
Content-Transfer-Encoding: quoted-printable

diff --git a/lib/sort.pm b/lib/sort.pm
index 7c8e50d..839b1ec 100644
--- a/lib/sort.pm
+++ b/lib/sort.pm
@@ -9,6 +9,7 @@ $sort::quicksort_bit   =3D 0x00000001;
 $sort::mergesort_bit   =3D 0x00000002;
 $sort::sort_bits       =3D 0x000000FF; # allow 256 different ones
 $sort::stable_bit      =3D 0x00000100;
+$sort::unstable_bit    =3D 0x00000200;
=20
 use strict;
=20
@@ -29,6 +30,7 @@ sub import {
 	    $^H{sort} |=3D  $sort::mergesort_bit;
 	} elsif ($_ eq 'stable') {
 	    $^H{sort} |=3D  $sort::stable_bit;
+	    $^H{sort} &=3D ~$sort::unstable_bit;
 	} elsif ($_ eq 'defaults') {
 	    $^H{sort} =3D   0;
 	} else {
@@ -53,6 +55,7 @@ sub unimport {
 	    $^H{sort} &=3D ~$sort::sort_bits;
 	} elsif ($_ eq 'stable') {
 	    $^H{sort} &=3D ~$sort::stable_bit;
+	    $^H{sort} |=3D  $sort::unstable_bit;
 	} else {
 	    require Carp;
 	    Carp::croak("sort: unknown subpragma '$_'");
diff --git a/op.c b/op.c
index fedaf67..41a9043 100644
--- a/op.c
+++ b/op.c
@@ -11062,7 +11062,7 @@ Perl_ck_sort(pTHX_ OP *o)
 		if ((sorthints & HINT_SORT_QUICKSORT) !=3D 0)
 		    o->op_private |=3D OPpSORT_QSORT;
 		if ((sorthints & HINT_SORT_STABLE) !=3D 0)
-		    o->op_private |=3D OPpSORT_STABLE;
+		    o->op_private |=3D OPpSORT_UNSTABLE;
 	    }
     }
=20
diff --git a/perl.h b/perl.h
index 01f8960..2f4a0e5 100644
--- a/perl.h
+++ b/perl.h
@@ -5406,7 +5406,8 @@ typedef enum {
 #define HINT_SORT_SORT_BITS	0x000000FF /* allow 256 different ones =
*/
 #define HINT_SORT_QUICKSORT	0x00000001
 #define HINT_SORT_MERGESORT	0x00000002
-#define HINT_SORT_STABLE	0x00000100 /* sort styles (currently =
one) */
+#define HINT_SORT_STABLE	0x00000100 /* sort styles */
+#define HINT_SORT_UNSTABLE	0x00000200
=20
 /* flags for PL_sawampersand */
=20
diff --git a/pp_sort.c b/pp_sort.c
index a54768a..58b9085 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@@ -46,6 +46,7 @@
 #define SORTf_DESC   1
 #define SORTf_STABLE 2
 #define SORTf_QSORT  4
+#define SORTf_WOBBLY 8
=20
 /*
  * The mergesort implementation is by Peter M. Mcilroy =
<pmcilroy@lucent.com>.
@@ -1494,6 +1495,8 @@ PP(pp_sort)
 	sort_flags |=3D SORTf_QSORT;
     if ((priv & OPpSORT_STABLE) !=3D 0)
 	sort_flags |=3D SORTf_STABLE;
+    if ((priv & OPpSORT_UNSTABLE) !=3D 0)
+	sort_flags |=3D SORTf_WOBBLY;
=20
     if (gimme !=3D G_ARRAY) {
 	SP =3D MARK;
diff --git a/regen/op_private b/regen/op_private
index 3a2a5d8..1d6e4c7 100644
--- a/regen/op_private
+++ b/regen/op_private
@@ -665,6 +665,7 @@ addbits('sort',
     4 =3D> qw(OPpSORT_DESCEND  DESC   ), # Descending sort
     5 =3D> qw(OPpSORT_QSORT    QSORT  ), # Use quicksort (not =
mergesort)
     6 =3D> qw(OPpSORT_STABLE   STABLE ), # Use a stable algorithm
+    7 =3D> qw(OPpSORT_UNSTABLE UNSTABLE),# Use an unstable algorithm
 );
=20
=20

--Apple-Mail=_D69EE97F-3F2F-44AE-ABFB-B99664EDEE96--
0
sprout
7/23/2017 9:20:09 PM
On Jul 23, 2017, at 2:20 PM, Father Chrysostomos wrote:

> The attached patch (untested) is one way this might be accomplished.  =
Of course, it is up to the mergesort code to make use of the =
SORTf_WOBBLY flag.
>=20
> <open_XsmdTqc9.txt>

I forgot to mention, the patch needs =91make regen=92 run, which I =
omitted in order to keep the patch readable.=
0
sprout
7/23/2017 9:22:25 PM
--001a113cc00cf4c93b0555108b40
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Sun, Jul 23, 2017 at 5:20 PM, Father Chrysostomos <sprout@cpan.org>
wrote:

> John P. Linderman wrote:
> > I agree with the comments (back in 2013!) that said that a stable sort
> > *must* be the default, as it has been since mergesort was integrated in=
to
> > pp_sort.c. This is complicated by the quirky communication between the
> sort
> > pragma, via op.c, to the code in pp_sort.c. Simply put, the SORTf_STABL=
E
> > bit that qsort and mergesort see has *not* been interrogated by the mer=
ge
> > sort. And it is, by default (in sort.pm and op.c) *not* set: stability
> has
> > been *implicit* when mergesort is invoked. So if I allow mergesort to b=
e
> > unstable, how do we communicate that? I see three possibilities, none o=
f
> > them very good.
> >
> >    - Change sort.pm and op.c so that SORTf_STABLE *is* set by default.
> The
> >    mergesort code would start interrogating that bit to determine if
> stability
> >    was demanded. The downside is that it would make qsort stable by
> default.
> >    If qsort finds SORTf_STABLE *on*, it will be much less efficient on
> the
> >    kind of input where qsort might have a performance advantage over
> >    mergesort. And turning stability *off* for qsort would be tricky. Th=
e
> >    _qsort subpragma cannot turn the $sort::stable_bit hint off, because
> that
> >    would make "use sort qw(stable _qsort);" or "use sort qw(stable); us=
e
> sort
> >    qw(_qsort);" ineffective. Mind you, a stabilized quicksort will stil=
l
> >    return legitimate results, so it cannot break existing code. But it
> could
> >    slow things down.
> >    - Add a new subpragma and sort bit for "unstable". mergesort would
> >    continue to ignore SORTf_STABLE, and instead test SORTf_UNSTABLE.
> qsort
> >    would continue to test (only) SORTf_STABLE, which would continue to
> be off
> >    by default. The downside is primarily the confusion of having two
> >    subpragmas for what seems like the same thing.
> >    - Maybe it's time to pull the plug on quicksort. It's a lot of code,
> and
> >    I suspect it is seldom invoked. "perldoc sort" has advised that
> subpragmas
> >    beginning with "_" may not persist beyond Perl 5.8, and we're long
> past
> >    that. We could "phase it out gently" using the first suggestion, wit=
h
> some
> >    suitable warning.
>
> It has already been suggested in ticket #119635, but nobody could give a
> definitive answer.  (No pumpking spoke up at the time.)
>
> >
> > This shouldn't be my choice alone. Maybe other porters have better
> > suggestions. Meanwhile, I'll plod on with trying to make stability
> optional
> > in mergesort, and worry about the details of expressing the option late=
r.
>
> I think it is as simply as leaving each algorithm to use its own default
> (stable for mergesort; unstable for qsort), but let =E2=80=98use/no sort =
'stable'=E2=80=99
> set it specifically for a given scope.  So:
>
>     use sort 'stable'; # makes no difference, unless we are using qsort
>
>     no sort 'stable'; # make mergesort unstable
>
> Internally we can have two flags, stable and unstable, but that
> implementation detail does not need to be exposed to the user.
>
> The attached patch (untested) is one way this might be accomplished.  Of
> course, it is up to the mergesort code to make use of the SORTf_WOBBLY fl=
ag.
>
> Very nice!!! *Please* apply it, even though it will have no effect until
I modify pp_sort.c, but I don't see how it could break any existing code.
(I'd probably call it SORTf_UNSTABLE rather than SORTf_WOBBLY, just to
reduce cognitive stress, but that'll be *my* problem).
If you approve of a "reverse" subpragma, this is an obvious time to add it,
but it's entirely up to you. All I'll care about is *if* SORTf_DESC gets
set in the flags, not *how*.

--001a113cc00cf4c93b0555108b40
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><br><div class=3D"gmail_extra"><br><div class=3D"gmail_quo=
te">On Sun, Jul 23, 2017 at 5:20 PM, Father Chrysostomos <span dir=3D"ltr">=
&lt;<a href=3D"mailto:sprout@cpan.org" target=3D"_blank">sprout@cpan.org</a=
>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0p=
x 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><s=
pan class=3D"gmail-m_-8946112828945557777gmail-">John P. Linderman wrote:<b=
r>
&gt; I agree with the comments (back in 2013!) that said that a stable sort=
<br>
</span>&gt; *must* be the default, as it has been since mergesort was integ=
rated into<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt; pp_sort.c. This is =
complicated by the quirky communication between the sort<br>
&gt; pragma, via op.c, to the code in pp_sort.c. Simply put, the SORTf_STAB=
LE<br>
</span>&gt; bit that qsort and mergesort see has *not* been interrogated by=
 the merge<br>
&gt; sort. And it is, by default (in <a href=3D"http://sort.pm" rel=3D"nore=
ferrer" target=3D"_blank">sort.pm</a> and op.c) *not* set: stability has<br=
>
&gt; been *implicit* when mergesort is invoked. So if I allow mergesort to =
be<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt; unstable, how do we=
 communicate that? I see three possibilities, none of<br>
&gt; them very good.<br>
&gt;<br>
</span>&gt;=C2=A0 =C2=A0 - Change <a href=3D"http://sort.pm" rel=3D"norefer=
rer" target=3D"_blank">sort.pm</a> and op.c so that SORTf_STABLE *is* set b=
y default. The<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt;=C2=A0 =C2=A0 merges=
ort code would start interrogating that bit to determine if stability<br>
&gt;=C2=A0 =C2=A0 was demanded. The downside is that it would make qsort st=
able by default.<br>
</span>&gt;=C2=A0 =C2=A0 If qsort finds SORTf_STABLE *on*, it will be much =
less efficient on the<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt;=C2=A0 =C2=A0 kind o=
f input where qsort might have a performance advantage over<br>
</span>&gt;=C2=A0 =C2=A0 mergesort. And turning stability *off* for qsort w=
ould be tricky. The<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt;=C2=A0 =C2=A0 _qsort=
 subpragma cannot turn the $sort::stable_bit hint off, because that<br>
&gt;=C2=A0 =C2=A0 would make &quot;use sort qw(stable _qsort);&quot; or &qu=
ot;use sort qw(stable); use sort<br>
&gt;=C2=A0 =C2=A0 qw(_qsort);&quot; ineffective. Mind you, a stabilized qui=
cksort will still<br>
&gt;=C2=A0 =C2=A0 return legitimate results, so it cannot break existing co=
de. But it could<br>
&gt;=C2=A0 =C2=A0 slow things down.<br>
</span>&gt;=C2=A0 =C2=A0 - Add a new subpragma and sort bit for &quot;unsta=
ble&quot;. mergesort would<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt;=C2=A0 =C2=A0 contin=
ue to ignore SORTf_STABLE, and instead test SORTf_UNSTABLE. qsort<br>
&gt;=C2=A0 =C2=A0 would continue to test (only) SORTf_STABLE, which would c=
ontinue to be off<br>
&gt;=C2=A0 =C2=A0 by default. The downside is primarily the confusion of ha=
ving two<br>
&gt;=C2=A0 =C2=A0 subpragmas for what seems like the same thing.<br>
</span>&gt;=C2=A0 =C2=A0 - Maybe it&#39;s time to pull the plug on quicksor=
t. It&#39;s a lot of code, and<br>
<span class=3D"gmail-m_-8946112828945557777gmail-">&gt;=C2=A0 =C2=A0 I susp=
ect it is seldom invoked. &quot;perldoc sort&quot; has advised that subprag=
mas<br>
&gt;=C2=A0 =C2=A0 beginning with &quot;_&quot; may not persist beyond Perl =
5.8, and we&#39;re long past<br>
&gt;=C2=A0 =C2=A0 that. We could &quot;phase it out gently&quot; using the =
first suggestion, with some<br>
&gt;=C2=A0 =C2=A0 suitable warning.<br>
<br>
</span>It has already been suggested in ticket #119635, but nobody could gi=
ve a definitive answer.=C2=A0 (No pumpking spoke up at the time.)<br>
<span class=3D"gmail-m_-8946112828945557777gmail-"><br>
&gt;<br>
&gt; This shouldn&#39;t be my choice alone. Maybe other porters have better=
<br>
&gt; suggestions. Meanwhile, I&#39;ll plod on with trying to make stability=
 optional<br>
&gt; in mergesort, and worry about the details of expressing the option lat=
er.<br>
<br>
</span>I think it is as simply as leaving each algorithm to use its own def=
ault (stable for mergesort; unstable for qsort), but let =E2=80=98use/no so=
rt &#39;stable&#39;=E2=80=99 set it specifically for a given scope.=C2=A0 S=
o:<br>
<br>
=C2=A0 =C2=A0 use sort &#39;stable&#39;; # makes no difference, unless we a=
re using qsort<br>
<br>
=C2=A0 =C2=A0 no sort &#39;stable&#39;; # make mergesort unstable<br>
<br>
Internally we can have two flags, stable and unstable, but that implementat=
ion detail does not need to be exposed to the user.<br>
<br>
The attached patch (untested) is one way this might be accomplished.=C2=A0 =
Of course, it is up to the mergesort code to make use of the SORTf_WOBBLY f=
lag.<br>
<br>
</blockquote></div>Very nice!!! <b>Please</b> apply it, even though it will=
 have no effect until I modify pp_sort.c, but I don&#39;t see how it could =
break any existing code.</div><div class=3D"gmail_extra">(I&#39;d probably =
call it SORTf_UNSTABLE rather than SORTf_WOBBLY, just to reduce cognitive s=
tress, but that&#39;ll be <b>my</b> problem).</div><div class=3D"gmail_extr=
a">If you approve of a &quot;reverse&quot; subpragma, this is an obvious ti=
me to add it, but it&#39;s entirely up to you. All I&#39;ll care about is <=
b>if</b>=C2=A0SORTf_DESC gets set in the flags, not <b>how</b>.</div><div c=
lass=3D"gmail_extra"><br></div></div>

--001a113cc00cf4c93b0555108b40--
0
jpl
7/24/2017 1:53:59 PM
--001a113cc00c0f9032055510cedb
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Sun, Jul 23, 2017 at 5:22 PM, Father Chrysostomos <sprout@cpan.org>
wrote:

>
> On Jul 23, 2017, at 2:20 PM, Father Chrysostomos wrote:
>
> > The attached patch (untested) is one way this might be accomplished.  O=
f
> course, it is up to the mergesort code to make use of the SORTf_WOBBLY fl=
ag.
> >
> > <open_XsmdTqc9.txt>
>
> I forgot to mention, the patch needs =E2=80=98make regen=E2=80=99 run, wh=
ich I omitted in
> order to keep the patch readable.


I guess I should emphasize that I am not the least bit qualified or
comfortable with making any changes to op.c, which is why I asked you to
apply the patch
(and do a 'make regen', whatever that means). There's no danger of my
completing the modifications to pp_sort.c any time real soon. Having those
changes in place will make it much easier for me to do thorough tests on
the modified code.

--001a113cc00c0f9032055510cedb
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">=
On Sun, Jul 23, 2017 at 5:22 PM, Father Chrysostomos <span dir=3D"ltr">&lt;=
<a href=3D"mailto:sprout@cpan.org" target=3D"_blank">sprout@cpan.org</a>&gt=
;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 =
..8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=3D""><br>
On Jul 23, 2017, at 2:20 PM, Father Chrysostomos wrote:<br>
<br>
&gt; The attached patch (untested) is one way this might be accomplished.=
=C2=A0 Of course, it is up to the mergesort code to make use of the SORTf_W=
OBBLY flag.<br>
&gt;<br>
</span>&gt; &lt;open_XsmdTqc9.txt&gt;<br>
<br>
I forgot to mention, the patch needs =E2=80=98make regen=E2=80=99 run, whic=
h I omitted in order to keep the patch readable.</blockquote><div><br></div=
><div>I guess I should emphasize that I am not the least bit qualified or c=
omfortable with making any changes to op.c, which is why I asked you to app=
ly the patch</div><div>(and do a &#39;make regen&#39;, whatever that means)=
.. There&#39;s no danger of my completing the modifications to pp_sort.c any=
 time real soon. Having those changes in place will make it much easier for=
 me to do thorough tests on the modified code.</div></div><br></div></div>

--001a113cc00c0f9032055510cedb--
0
jpl
7/24/2017 2:12:28 PM
--Apple-Mail=_5754A222-E9F9-4FA8-9168-F0BBE0EF9909
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=windows-1252


On Jul 24, 2017, at 6:53 AM, John P. Linderman <jpl.jpl@gmail.com> =
wrote:

>=20
>=20
> On Sun, Jul 23, 2017 at 5:20 PM, Father Chrysostomos <sprout@cpan.org> =
wrote:
> The attached patch (untested) is one way this might be accomplished.  =
Of course, it is up to the mergesort code to make use of the =
SORTf_WOBBLY flag.
>=20
> Very nice!!! Please apply it, even though it will have no effect until =
I modify pp_sort.c, but I don't see how it could break any existing =
code.

It took me a while, but I have finally done it, as commit afe59f35c2b4.

> (I'd probably call it SORTf_UNSTABLE rather than SORTf_WOBBLY, just to =
reduce cognitive stress, but that'll be my problem).

Yes, of course.

> If you approve of a "reverse" subpragma, this is an obvious time to =
add it, but it's entirely up to you. All I'll care about is if =
SORTf_DESC gets set in the flags, not how.
>=20

I have it set up so that =91no sort "stable"=92 will enable the unstable =
flag.

BTW, concerning =91make regen=92, some of the source files are =
computer-generated (such as those dealing with private op flags), and =
=91make regen=92 will regenerate those files.  Patches are easier to =
read without =91make regen=92 having been run, but it needs to be run =
when such patches are applied.


--Apple-Mail=_5754A222-E9F9-4FA8-9168-F0BBE0EF9909
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html;
	charset=windows-1252

<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html =
charset=3Dwindows-1252"></head><body style=3D"word-wrap: break-word; =
-webkit-nbsp-mode: space; -webkit-line-break: after-white-space; =
"><br><div><div>On Jul 24, 2017, at 6:53 AM, John P. Linderman &lt;<a =
href=3D"mailto:jpl.jpl@gmail.com">jpl.jpl@gmail.com</a>&gt; =
wrote:</div><br class=3D"Apple-interchange-newline"><blockquote =
type=3D"cite"><div dir=3D"ltr"><br><div class=3D"gmail_extra"><br><div =
class=3D"gmail_quote">On Sun, Jul 23, 2017 at 5:20 PM, Father =
Chrysostomos <span dir=3D"ltr">&lt;<a href=3D"mailto:sprout@cpan.org" =
target=3D"_blank">sprout@cpan.org</a>&gt;</span> wrote:<br><blockquote =
class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8ex; =
border-left-width: 1px; border-left-style: solid; border-left-color: =
rgb(204, 204, 204); padding-left: 1ex; position: static; z-index: auto; =
">The attached patch (untested) is one way this might be =
accomplished.&nbsp; Of course, it is up to the mergesort code to make =
use of the SORTf_WOBBLY flag.<br>
<br>
</blockquote></div>Very nice!!! <b>Please</b> apply it, even though it =
will have no effect until I modify pp_sort.c, but I don't see how it =
could break any existing =
code.</div></div></blockquote><div><br></div><div>It took me a while, =
but I have finally done it, as =
commit&nbsp;afe59f35c2b4.</div><br><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra">(I'd probably call it =
SORTf_UNSTABLE rather than SORTf_WOBBLY, just to reduce cognitive =
stress, but that'll be <b>my</b> =
problem).</div></div></blockquote><div><br></div><div>Yes, of =
course.</div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra">If you approve of a "reverse" subpragma, this is =
an obvious time to add it, but it's entirely up to you. All I'll care =
about is <b>if</b>&nbsp;SORTf_DESC gets set in the flags, not =
<b>how</b>.</div><div class=3D"gmail_extra"><br></div></div>
</blockquote></div><br><div>I have it set up so that =91no sort =
"stable"=92 will enable the unstable flag.</div><div><br></div><div>BTW, =
concerning =91make regen=92, some of the source files are =
computer-generated (such as those dealing with private op flags), and =
=91make regen=92 will regenerate those files. &nbsp;Patches are easier =
to read without =91make regen=92 having been run, but it needs to be run =
when such patches are applied.</div><div><br></div></body></html>=

--Apple-Mail=_5754A222-E9F9-4FA8-9168-F0BBE0EF9909--
0
sprout
8/22/2017 5:28:26 AM
--001a113d3938aace3305575c8461
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Thanks very much! I'll clone the latest before stirring in my (ongoing)
changes.

It is my intent to make the sort as easy as possible to "steal". So I'll
#ifdef NO_PERL_PLEASE to remove all the terrifying perl internals. It'll be
easier for me (and other thieves) to stress-test the code without having to
go through the usual perl channels. I'm also planning to #ifdef INSTRUMENT
some code to count comparisons. It should add only microscopically to the
run times, but unless there's a perl-friendly way to retrieve the counts,
there's no reason to enable it in perl-land. I'm also planning to preserve
the existing dynprep run code, and the existing qsort code, so I can do
some comparison-count measurements. I'll clean out the old code before
committing the code. My guess (and it is mostly that at this point) is that
stabilizing qsort is waste of code. The two main reasons why qsort might be
preferred to mergesort are the in-place implementation, saving an extra
copy of the pointer array, and the "fat pivot" that can speed things up if
one is sorting a large number of records with a small number of keys. Both
of these vanish when stabilizing. If my intuition is correct, I'll just
turn a request for a stable qsort into a request for a stable mergesort.
The results, by definition, will be indistinguishable.

Thanks again for the commit.

On Tue, Aug 22, 2017 at 1:28 AM, Father Chrysostomos <sprout@cpan.org>
wrote:

>
> On Jul 24, 2017, at 6:53 AM, John P. Linderman <jpl.jpl@gmail.com> wrote:
>
>
>
> On Sun, Jul 23, 2017 at 5:20 PM, Father Chrysostomos <sprout@cpan.org>
> wrote:
>
>> The attached patch (untested) is one way this might be accomplished.  Of
>> course, it is up to the mergesort code to make use of the SORTf_WOBBLY f=
lag.
>>
>> Very nice!!! *Please* apply it, even though it will have no effect until
> I modify pp_sort.c, but I don't see how it could break any existing code.
>
>
> It took me a while, but I have finally done it, as commit afe59f35c2b4.
>
> (I'd probably call it SORTf_UNSTABLE rather than SORTf_WOBBLY, just to
> reduce cognitive stress, but that'll be *my* problem).
>
>
> Yes, of course.
>
> If you approve of a "reverse" subpragma, this is an obvious time to add
> it, but it's entirely up to you. All I'll care about is *if* SORTf_DESC
> gets set in the flags, not *how*.
>
>
> I have it set up so that =E2=80=98no sort "stable"=E2=80=99 will enable t=
he unstable flag.
>
> BTW, concerning =E2=80=98make regen=E2=80=99, some of the source files ar=
e
> computer-generated (such as those dealing with private op flags), and =E2=
=80=98make
> regen=E2=80=99 will regenerate those files.  Patches are easier to read w=
ithout
> =E2=80=98make regen=E2=80=99 having been run, but it needs to be run when=
 such patches are
> applied.
>
>

--001a113d3938aace3305575c8461
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Thanks very much! I&#39;ll clone the latest before stirrin=
g in my (ongoing) changes.<div><br></div><div>It is my intent to make the s=
ort as easy as possible to &quot;steal&quot;. So I&#39;ll #ifdef NO_PERL_PL=
EASE to remove all the terrifying perl internals. It&#39;ll be easier for m=
e (and other thieves) to stress-test the code without having to go through =
the usual perl channels. I&#39;m also planning to #ifdef INSTRUMENT some co=
de to count comparisons. It should add only microscopically to the run time=
s, but unless there&#39;s a perl-friendly way to retrieve the counts, there=
&#39;s no reason to enable it in perl-land. I&#39;m also planning to preser=
ve the existing dynprep run code, and the existing qsort code, so I can do =
some comparison-count measurements. I&#39;ll clean out the old code before =
committing the code. My guess (and it is mostly that at this point) is that=
 stabilizing qsort is waste of code. The two main reasons why qsort might b=
e preferred to mergesort are the in-place implementation, saving an extra c=
opy of the pointer array, and the &quot;fat pivot&quot; that can speed thin=
gs up if one is sorting a large number of records with a small number of ke=
ys. Both of these vanish when stabilizing. If my intuition is correct, I&#3=
9;ll just turn a request for a stable qsort into a request for a stable mer=
gesort. The results, by definition, will be indistinguishable.</div><div><b=
r></div><div>Thanks again for the commit.</div></div><div class=3D"gmail_ex=
tra"><br><div class=3D"gmail_quote">On Tue, Aug 22, 2017 at 1:28 AM, Father=
 Chrysostomos <span dir=3D"ltr">&lt;<a href=3D"mailto:sprout@cpan.org" targ=
et=3D"_blank">sprout@cpan.org</a>&gt;</span> wrote:<br><blockquote class=3D=
"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding=
-left:1ex"><div style=3D"word-wrap:break-word"><br><div><span class=3D""><d=
iv>On Jul 24, 2017, at 6:53 AM, John P. Linderman &lt;<a href=3D"mailto:jpl=
..jpl@gmail.com" target=3D"_blank">jpl.jpl@gmail.com</a>&gt; wrote:</div><br=
 class=3D"m_8869748582788466892Apple-interchange-newline"></span><blockquot=
e type=3D"cite"><div dir=3D"ltr"><br><div class=3D"gmail_extra"><br><div cl=
ass=3D"gmail_quote"><span class=3D"">On Sun, Jul 23, 2017 at 5:20 PM, Fathe=
r Chrysostomos <span dir=3D"ltr">&lt;<a href=3D"mailto:sprout@cpan.org" tar=
get=3D"_blank">sprout@cpan.org</a>&gt;</span> wrote:<br></span><span class=
=3D""><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;b=
order-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,=
204);padding-left:1ex">The attached patch (untested) is one way this might =
be accomplished.=C2=A0 Of course, it is up to the mergesort code to make us=
e of the SORTf_WOBBLY flag.<br>
<br>
</blockquote></span></div><span class=3D"">Very nice!!! <b>Please</b> apply=
 it, even though it will have no effect until I modify pp_sort.c, but I don=
&#39;t see how it could break any existing code.</span></div></div></blockq=
uote><div><br></div><div>It took me a while, but I have finally done it, as=
 commit=C2=A0afe59f35c2b4.</div><span class=3D""><br><blockquote type=3D"ci=
te"><div dir=3D"ltr"><div class=3D"gmail_extra">(I&#39;d probably call it S=
ORTf_UNSTABLE rather than SORTf_WOBBLY, just to reduce cognitive stress, bu=
t that&#39;ll be <b>my</b> problem).</div></div></blockquote><div><br></div=
></span><div>Yes, of course.</div><span class=3D""><br><blockquote type=3D"=
cite"><div dir=3D"ltr"><div class=3D"gmail_extra">If you approve of a &quot=
;reverse&quot; subpragma, this is an obvious time to add it, but it&#39;s e=
ntirely up to you. All I&#39;ll care about is <b>if</b>=C2=A0SORTf_DESC get=
s set in the flags, not <b>how</b>.</div><div class=3D"gmail_extra"><br></d=
iv></div>
</blockquote></span></div><br><div>I have it set up so that =E2=80=98no sor=
t &quot;stable&quot;=E2=80=99 will enable the unstable flag.</div><div><br>=
</div><div>BTW, concerning =E2=80=98make regen=E2=80=99, some of the source=
 files are computer-generated (such as those dealing with private op flags)=
, and =E2=80=98make regen=E2=80=99 will regenerate those files.=C2=A0 Patch=
es are easier to read without =E2=80=98make regen=E2=80=99 having been run,=
 but it needs to be run when such patches are applied.</div><div><br></div>=
</div></blockquote></div><br></div>

--001a113d3938aace3305575c8461--
0
jpl
8/22/2017 7:22:37 PM
Reply: