helper scripts

I wrote some small helper scripts for parsing logs.  Would they be
useful enough to include in qpsmtpd?


-- 
JT Moree

Attachment: qpgrep

#!/bin/sh

usage()
{
  cat << FOO
  $0 - utility to parse qpsmtpd log messages for a given string and get the whole transaction 

  Usage:  $0 <logfile> [grep options] text_to_find

We will find the last instance of the given string and show all lines pertaining to that message
FOO
}

LOG=$1
if [ "$LOG" = "" ] || [ "$LOG" = "-h" ] ; then
  usage
  exit 1
fi
# shift only after the check: with no arguments, some shells abort on "shift 1"
shift 1

if [ ! -f "$LOG" ] ; then
  echo  "Invalid logfile given!" >&2
  exit 1
fi

if [ $# -eq 0 ] ; then
  echo "No data given" >&2
  exit 1
fi

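# Take the first field of the last matching line (expected to be the
# process id), then print every line that starts with that field.
N=`grep -h "$@" "$LOG" | tail -n1 | awk '{print $1;}'`
if [ -n "$N" ] ; then
  grep -h "^$N" "$LOG"
fi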

Attachment: qplog

#!/bin/sh

usage()
{
  cat << FOO
  $0 - utility to parse qpsmtpd log messages for a given id

  Usage:  $0 <logfile> <number> [<number> . . . .]
FOO
}

LOG=$1
if [ "$LOG" = "" ] || [ "$LOG" = "-h" ] ; then
  usage
  exit 1
fi
# shift only after the check: with no arguments, some shells abort on "shift 1"
shift 1

if [ ! -f "$LOG" ] ; then
  echo  "Invalid logfile given!" >&2
  exit 1
fi

for N in "$@" ; do
  grep "^$N" "$LOG"
done

0
jtmoree
8/22/2007 5:28:04 PM
perl.qpsmtpd
>
> I wrote some small helper scripts for parsing logs.  Would they be
> useful enough to include in qpsmtpd?
>
>
> --
> JT Moree

You assume here that the process numbers are different for each message.
While it will work for forkserver and tcpserver, that is not the case
with prefork or Apache (also preforking).

Sydney.
0
Sydney
8/22/2007 7:44:26 PM
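
The concern raised above (one process id serving many connections) can be worked around in a log helper to some extent. What follows is only a minimal sketch, assuming that the process id is the first field of each line (as qpgrep expects) and that every connection logs a recognizable start line. The marker string is an assumption and depends on the logging plugin in use, and the function name last_session is made up for illustration. It keeps only the lines a given process id has logged since its most recent connection start:

```shell
# Sketch only: last_session is a hypothetical helper, not part of qpsmtpd.
# $1 = log file, $2 = process id (first field), $3 = connection-start marker
last_session() {
  awk -v pid="$2" -v marker="$3" '
    $1 == pid {
      # a new connection for this pid: forget the older lines
      if (index($0, marker)) buf = ""
      buf = buf $0 "\n"
    }
    END { printf "%s", buf }
  ' "$1"
}
```

It could be called as `last_session logfile 4892 'Connection from'`. Lines logged by the same process before the marker line of its latest connection are dropped, so the result is only as good as the marker.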
Sydney Bogaert wrote:
>> I wrote some small helper scripts for parsing logs.  Would they be
>> useful enough to include in qpsmtpd?
>>
> You assume here that the process numbers are different for each message.
> While it will work for forkserver and tcpserver, that is not the case
> with prefork or Apache (also preforking).

Is there a message ID that is unique to each message?

-- 
JT Moree
0
jtmoree
8/22/2007 7:48:09 PM
On Wed, 22 Aug 2007, Sydney Bogaert wrote:

> You assume here that the process numbers are different for each message.
> While it will work for forkserver and tcpserver, that is not the case
> with prefork or Apache (also preforking).

related: is there any work being done on standardizing qpsmtpd log messages
& levels?

James


0
jwa
8/22/2007 8:06:37 PM
On Wed, 2007-08-22 at 13:06 -0700, James W. Abendschan wrote:
> On Wed, 22 Aug 2007, Sydney Bogaert wrote:
> 
> > You assume here that the process numbers are different for each message.
> > While it will work for forkserver and tcpserver, that is not the case
> > with prefork or Apache (also preforking).
> 
> related: is there any work being done on standardizing qpsmtpd log messages
> & levels?

As far as standardization, I think this is a non-starter.  There seem to
be arbitrary ways to do logging via plugins and I'm pretty sure there
are at least 3 different formats being used.

As to levels, there are constants defined in Qpsmtpd::Constants (iirc),
which match the usual syslog levels.

If I am wrong about standardization, I would love to hear it ;-).

> 
> James
> 
> 

-- 
--gh


0
gwhulbert
8/22/2007 9:45:31 PM
On Wed, 22 Aug 2007 12:48:09 -0700
JT Moree <jtmoree@kahalacorp.com> wrote:
> Sydney Bogaert wrote:
> > You assume here that the process numbers are different for each message.
> > While it will work for forkserver and tcpserver, that is not the case
> > with prefork or Apache (also preforking).
> 
> Is there a message ID that is unique to each message?
Not until the sending client has submitted a 'Message-ID' header ;-)

But it would be easy to generate a transaction id after every
reset_transaction() call. This could be logged instead of (or in
addition to) the PID.
Something like the attached diff would do. If it's OK, I'll submit
it to svn trunk.
Hmm, the spool filename should also include this transaction id; then
it's easier to figure out which session was incomplete or is currently
running.

	Hanno

Attachment: transaction_id.diff

Index: lib/Qpsmtpd/Transaction.pm
===================================================================
--- lib/Qpsmtpd/Transaction.pm	(revision 774)
+++ lib/Qpsmtpd/Transaction.pm	(working copy)
@@ -207,6 +207,14 @@
   return shift->{_body_file};
 }
 
+sub id {
+  my $self = shift;
+  unless ($self->{_transaction_id}) {
+    $self->{_transaction_id} = sprintf("%08X", rand(2**32 - 1));
+  }
+  return $self->{_transaction_id};
+}
+
 sub DESTROY {
   my $self = shift;
   # would we save some disk flushing if we unlinked the file before
Index: plugins/logging/warn
===================================================================
--- plugins/logging/warn	(revision 774)
+++ plugins/logging/warn	(working copy)
@@ -31,7 +31,8 @@
   return DECLINED if defined $plugin and $plugin eq $self->plugin_name; 
 
   warn 
-    join(" ", $$ .
+    join(" ", $$ . 
+         ($transaction->sender ? $transaction->id : "00000000"),
          (defined $plugin ? " $plugin plugin:" :
           defined $hook   ? " running plugin ($hook):"  : ""),
          @log), "\n"
Index: plugins/logging/file
===================================================================
--- plugins/logging/file	(revision 774)
+++ plugins/logging/file	(working copy)
@@ -274,7 +274,9 @@
 
     my $f = $self->{_f};
     print $f strftime($self->{_tsformat}, localtime), ' ',
-             hostname(), '[', $$, ']: ', @log, "\n";
+             hostname(), '[', $$, ']: ',
+             ($txn->sender ? $txn->id : '00000000'),' ',
+             @log, "\n";
     return DECLINED;
 }
 
Index: lib/Qpsmtpd/SMTP.pm
===================================================================
--- lib/Qpsmtpd/SMTP.pm	(revision 774)
+++ lib/Qpsmtpd/SMTP.pm	(working copy)
@@ -388,8 +388,8 @@
     }
     else { # includes OK
       $self->log(LOGINFO, "getting mail from ".$from->format);
-      $self->respond(250, $from->format . ", sender OK - how exciting to get mail from you!");
       $self->transaction->sender($from);
+      $self->respond(250, $from->format . ", sender OK - your transaction id is ".$self->transaction->id);
     }
 }
 

0
hah
8/23/2007 4:50:33 PM
Hanno Hecker wrote:
>> Is there a message ID that is unique to each message?
> Not until the sending client has submitted a 'Message-ID' header ;-)

> But it would be easy to add / generate a transaction id after every
> reset_transaction() call. This could be logged instead of (or as
> addition to) the PID.

> +    $self->{_transaction_id} = sprintf("%08X", rand(2**32 - 1));

Is this unique enough?  What is the chance of getting the same random
number again?  Should it be a combination of the PID + time + rand?

-- 
JT Moree
0
jtmoree
8/23/2007 5:24:58 PM
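
The question above can be estimated with the usual birthday approximation: if n ids are drawn uniformly from 2^bits possible values, the chance of at least one duplicate is roughly 1 - exp(-n(n-1) / (2 * 2^bits)). The sketch below only evaluates that formula; the function name collision_probability is made up for illustration, and it assumes the ids really are uniformly distributed over the full range, which is the best case for a random 32-bit id.

```shell
# Sketch only: birthday approximation for duplicate ids.
# $1 = number of ids drawn, $2 = number of random bits per id
collision_probability() {
  awk -v n="$1" -v bits="$2" 'BEGIN {
    printf "%.4f\n", 1 - exp(-n * (n - 1) / (2 * 2^bits))
  }'
}

collision_probability 77163 32   # about 0.5
```

So with 32 random bits, a log that covers about 77,000 transactions already has an even chance of containing at least one repeated id.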
JT Moree wrote:
> Hanno Hecker wrote:
>>> Is there a message ID that is unique to each message?
>> Not until the sending client has submitted a 'Message-ID' header ;-)
> 
>> But it would be easy to add / generate a transaction id after every
>> reset_transaction() call. This could be logged instead of (or as
>> addition to) the PID.
> 
>> +    $self->{_transaction_id} = sprintf("%08X", rand(2**32 - 1));

I calculate transaction ids like this (hires time):

my $mid = sprintf("%.4f%d", time(), $self->{_id});

$self->{_id} is the id of the server and currently calculated by using
the last digit of the IP address.  This allows transaction ids to be
unique across 10 servers.  When servers get fast enough this may end up
causing dup ids.  A scheme like this allows easy auditing of problems
because the id contains the "when and where".

Cheers,

ds



> 
> Is this unique enough?  What is the chance of getting the same random
> number again?  Should it be a combination of the PID + time + rand?

0
dave
8/23/2007 6:34:13 PM
I currently use the following in our sql logging plugin to generate a unique
id in the database:

my @sname = split(/\./, $self->qp->config("me"));
my $sqlIdent = $sname[0].$$.'r'.int( (( time ^ $$ ) * rand($$)) /
rand(time/$$));

With each mx box running at ~30 cps we get dups often enough I'm considering
changing it.



On 8/23/07 1:34 PM, "David Sparks" <dave@ca.sophos.com> wrote:

> JT Moree wrote:
>> Hanno Hecker wrote:
>>>> Is there a message ID that is unique to each message?
>>> Not until the sending client has submitted a 'Message-ID' header ;-)
>> 
>>> But it would be easy to add / generate a transaction id after every
>>> reset_transaction() call. This could be logged instead of (or as
>>> addition to) the PID.
>> 
>>> +    $self->{_transaction_id} = sprintf("%08X", rand(2**32 - 1));
> 
> I calculate transaction ids like this (hires time):
> 
> my $mid = sprintf("%.4f%d", time(), $self->{_id});
> 
> $self->{_id} is the id of the server and currently calculated by using
> the last digit of the IP address.  This allows transaction ids to be
> unique across 10 servers.  When servers get fast enough this may end up
> causing dup ids.  A scheme like this allows easy auditing of problems
> because the id contains the "when and where".
> 
> Cheers,
> 
> ds
> 
> 
> 
>> 
>> Is this unique enough?  What is the chance of getting the same random
>> number again?  Should it be a combination of the PID + time + rand?
> 

-- 
Ed McLain
Sr. Data Center Engineer
TekLinks, Inc.
205.314.6634
hosting@teklinks.com

0
EMcLain
8/23/2007 7:13:34 PM
JT Moree wrote:
> 
> Is this unique enough?  What is the chance of getting the same random
> number again?  Should it be a combination of the PID + time + rand?
> 

# 1
my @sname = split(/\./, $self->qp->config("me"));
$id = $sname[0].$$.'r'.int( (( time ^ $$ ) * rand($$)) / rand(time/$$));

# 2
$id = sprintf("%08X", rand(2**32 - 1));

# 3
$self->qp->config("me") =~ m/\.(\d{1,3})$/;  #not tested
$self->{_id} = $1;
$id = sprintf("%.4f%d", time(), $self->{_id});

# 4
$id = sprintf("%.4f", time()) .".". $self->qp->config("me") .
  sprintf("%08X", rand(2**32 - 1));  #how expensive is this?

These are the approaches suggested so far.  I added the last one as a
combination of the others.  Can we see a show of hands for the one
people like the best?

Can we get Hanno to modify his patch if people like one of these
approaches?  Can we get it tested by some people?  Can we get it checked
into svn?

-- 
JT Moree
0
jtmoree
8/24/2007 6:52:07 PM
On Fri, 2007-08-24 at 11:52 -0700, JT Moree wrote:
> JT Moree wrote:
> > 
> > Is this unique enough?  What is the chance of getting the same random
> > number again?  Should it be a combination of the PID + time + rand?
> > 
> 
> my @sname = split(/\./, $self->qp->config("me"));
> = $sname[0].$$.'r'.int( (( time ^ $$ ) * rand($$)) / rand(time/$$));
> 
> = sprintf("%08X", rand(2**32 - 1));
> 
> $self->qp->config("me") =~ m/\.(\d{1,3}$/;  #not tested
> $self->{_id} = $1;
> = sprintf("%.4f%d", time(), $self->{_id});
> 
> = sprintf("%.4f", time()) .".". $self->qp->config("me") . \
>   sprintf("%08X", rand(2**32 - 1));  #how expensive is this?
> 
> These are the approaches suggested so far.  I added the last one as a
> combination of the others.  Can we see a show of hands for the one

Using rand is bogus.  A random number generator will repeat values.

Time (with sufficient resolution) is equivalent to a sequence ... but
with threads, you would need a lock on the sequence generator.

> people like the best?
> 
> Can we get Hanno to modify his patch if people like one of these
> approaches?  Can we get it tested by some people?  Can we get it checked
> into svn?

-- 
--gh


0
gwhulbert
8/24/2007 7:44:52 PM
On Fri, 24 Aug 2007, Guy Hulbert wrote:

> > These are the approaches suggested so far.  I added the last one as a
> > combination of the others.  Can we see a show of hands for the one
>
> Using rand is bogus.  A random number generator will repeat values.
>
> Time (with sufficient resolution) is equivalent to a sequence ... but
> with threads, you would need a lock on the sequence generator.

fqdn + time + peer TCP port will be pretty unique, regardless of
whether you're forking, selecting, or threading.  (fortunately,
multiplexed SMTP does not yet exist.)

Looks like remote_port is set in qpsmtpd-forkserver, at least.

James




0
jwa
8/24/2007 8:18:00 PM
Guy Hulbert wrote:
> Using rand is bogus.  A random number generator will repeat values.

So you would definitely not like #2 and probably not #1.  How about #3
and #4?

> Time (with sufficient resolution) is equivalent to a sequence ... but
> with threads, you would need a lock on the sequence generator.

In our case a repetition is not a highly critical problem.  (Not enough
to justify using a centralized sequence generator.)  Repetition just
reduces the readability of the logs.  Given that the logs are even less
readable without these id's I'd say we are in a better position to
implement something rather than nothing.

-- 
JT Moree
0
jtmoree
8/24/2007 8:22:59 PM
On Fri, 24 Aug 2007, James W. Abendschan wrote:

> On Fri, 24 Aug 2007, Guy Hulbert wrote:
>
> > > These are the approaches suggested so far.  I added the last one as a
> > > combination of the others.  Can we see a show of hands for the one
> >
> > Using rand is bogus.  A random number generator will repeat values.
> >
> > Time (with sufficient resolution) is equivalent to a sequence ... but
> > with threads, you would need a lock on the sequence generator.
>
> fqdn + time + peer TCP port will be pretty unique, regardless of
> whether you're forking, selecting, or threading.  (fortunately,
> multiplexed SMTP does not yet exist.)

whoops; s/fqdn/peer IP/

James


0
jwa
8/24/2007 8:32:29 PM
James W. Abendschan wrote:
> On Fri, 24 Aug 2007, Guy Hulbert wrote:
>
>   
>>> These are the approaches suggested so far.  I added the last one as a
>>> combination of the others.  Can we see a show of hands for the one
>>>       
>> Using rand is bogus.  A random number generator will repeat values.
>>
>> Time (with sufficient resolution) is equivalent to a sequence ... but
>> with threads, you would need a lock on the sequence generator.
>>     
>
> fqdn + time + peer TCP port will be pretty unique, regardless of
> whether you're forking, selecting, or threading.  (fortunately,
> multiplexed SMTP does not yet exist.)
>   
mmh, multiplexed?
A mailserver can send multiple mails within one tcp-connection:
"There may be zero or more transactions in a session." - RFC 2821

-- 
Jens

0
jens
8/24/2007 8:33:23 PM
On Fri, 24 Aug 2007, Jens Weibler wrote:

> mmh, multiplexed?
> A mailserver can send multiple mails within one tcp-connection:
> "There may be zero or more, transactions in a session." - RFC2821

Ah, good point.  Okay then, obviously qpsmtpd now needs to be rewritten
to make me right -- after leaving the DATA state, reject anything other
than QUIT :-)

I suppose a counter could be tacked on to the ID and incremented every
time a message is queued..

James


0
jwa
8/24/2007 8:42:53 PM
>> = sprintf("%.4f", time()) .".". $self->qp->config("me") . \
>>   sprintf("%08X", rand(2**32 - 1));  #how expensive is this?
>>
>> These are the approaches suggested so far.  I added the last one as a
>> combination of the others.  Can we see a show of hands for the one
> 
> Using rand is bogus.  A random number generator will repeat values.
> 
> Time (with sufficient resolution) is equivalent to a sequence ... but
> with threads, you would need a lock on the sequence generator.

I'm using the poll server which means that there aren't threads to worry
about.  However the future probably means running multiple daemons to
take advantage of multi-core systems so there would need to be a daemon
id encoded in there.

The big advantage to using time() + id as the least significant digit is
that you can put the id in a db server as a double or unixtime which
comes in quite handy when you've got a lot of volume.

Cheers,

ds
0
dave
8/24/2007 10:40:50 PM
On Fri, 2007-08-24 at 13:18 -0700, James W. Abendschan wrote:
> On Fri, 24 Aug 2007, Guy Hulbert wrote:
> 
> > > These are the approaches suggested so far.  I added the last one as a
> > > combination of the others.  Can we see a show of hands for the one
> >
> > Using rand is bogus.  A random number generator will repeat values.
> >
> > Time (with sufficient resolution) is equivalent to a sequence ... but
> > with threads, you would need a lock on the sequence generator.
> 
> fqdn + time + peer TCP port will be pretty unique, regardless of

fqdn is the trivial part

rand will be "pretty unique" ...

time by itself is sufficient if the resolution is fine enough ... the
problem is that when systems are "fast enough", whatever fixed
resolution you picked will not be enough.

However, at present the linux kernel gives microseconds (%.6f rather
than %.4f) and it seems to take about .0001 seconds to fork a process so
if forking, microseconds seem to be sufficient for a few years.

.... but threads may be able to get the same value ...

The problem with a sequence is to continue through a crash without
repeating values.

> whether you're forking, selecting, or threading.  (fortunately,
> multiplexed SMTP does not yet exist.)
> 
> Looks like remote_port is  set in qpsmtpd-forkserver, at least..
> 
> James
> 
> 
> 
> 

-- 
--gh


0
gwhulbert
8/24/2007 11:13:56 PM
On Fri, 2007-08-24 at 13:22 -0700, JT Moree wrote:
> Guy Hulbert wrote:
> > Using rand is bogus.  A random number generator will repeat values.
> 
> So you would definitely not like #2 and probably not #1.  How about #3
> and #4?

I can't think of anything that guarantees a unique number ... except
pulling a sequence from an ACID database (where the problem of system
crashes is already solved).

> 
> > Time (with sufficient resolution) is equivalent to a sequence ... but
> > with threads, you would need a lock on the sequence generator.
> 
> In our case a repetition is not a highly critical problem.  (Not enough

Repetition will break anything using a hash to sort messages by ID.

> to justify using a centralized sequence generator.)  Repetition just
> reduces the readability of the logs.  Given that the logs are even less

Ah.  You never know.  DJB had a clever method to pick message IDs for
the queue by using the inode ... but it is useless for log analysis
where the mail queue has a dedicated reiser partition and the load is
very LOW.  I found that every message used the same inode when there was
only one message at a time on the system ... :-(  It solves his problem
of picking a message ID which will not conflict with any other in the
queue _at the same time_.

> readable without these id's I'd say we are in a better position to
> implement something rather than nothing.

-- 
--gh


0
gwhulbert
8/24/2007 11:18:11 PM
On 2007-08-22 21:44:26 +0200, Sydney Bogaert wrote:
> >
> > I wrote some small helper scripts for parsing logs.  Would they be
> > useful enough to include in qpsmtpd?
> >
> >
> > --
> > JT Moree
>
> You assume here that the process numbers are different for each message.
> While it will work for forkserver and tcpserver, that is not the case
> with prefork or Apache (also preforking).

It doesn't even work for forkserver and tcpserver. On a busy system PIDs
have a rather high probability of repeating within a single log file. Of
course you can try to detect the end of one connection or the start of
another one.

In the logging/file_connection I chose this approach:

In register, choose a prefix which should be unique to every running
"instance" of qpsmtpd (in the case of forkserver, that's the parent
process):

    $self->{_log_session_id_prefix} = sprintf("%08x%04x", time(), $$);

(that may not be safe for tcpserver: It is possible that two processes
get the same pid within one second - use check_earlytalker to prevent
this :-))

then, in the pre-connection hook, just increment a counter:

    $self->{_log_session_id} =
	$self->{_log_session_id_prefix} . "." .
	++$self->{_log_session_id_counter};

and use the concatenation as the session id which is written to log
files.

So entries look like this:

2007-08-25T10:26:07+0200 46cfde417471.124  Accepted connection 0/15 from 70.84.4.138 / virtual.virtualtoolsets.com
2007-08-25T10:26:07+0200 46cfde417471.124  Connection from virtual.virtualtoolsets.com [70.84.4.138]
2007-08-25T10:26:08+0200 46cfde417471.124  check_earlytalker plugin: remote host said nothing spontaneous, proceeding
2007-08-25T10:26:08+0200 46cfde417471.124  220 mx.luga.at ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.

In this example the first line is written by the parent process of
forkserver before the fork, the others by the child process after the
fork - the session id stays the same.

But note that this is a session/connection id, not a message id: If the
client sends several messages within a single connection, they will be
recorded with the same id. That's what I want, but if you want a unique
transaction id, it should be easy to add another counter which is
incremented for each transaction.

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/25/2007 8:29:17 AM
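
With a session id in every line, the qpgrep idea from the start of the thread no longer has to rely on process ids. A minimal sketch, assuming the log layout of the example entries above (timestamp in the first field, session id in the second); the function name qpgrep_session is made up for illustration. It finds the last line containing a fixed string and prints every line that carries the same session id:

```shell
# Sketch only: qpgrep_session is a hypothetical helper, not part of qpsmtpd.
# $1 = log file, $2 = fixed string to look for
qpgrep_session() {
  # session id (second field) of the last line that contains the string
  id=$(grep -F -- "$2" "$1" | tail -n 1 | awk '{ print $2 }')
  [ -n "$id" ] || return 1
  # appending "" forces a string comparison, so ids are never compared as numbers
  awk -v id="$id" '$2 "" == id ""' "$1"
}
```

Because this is a session id, several messages sent over one connection are printed together, as described above.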
On 24-Aug-07, at 6:40 PM, David Sparks wrote:

> I'm using the poll server which means that there aren't threads to worry
> about.  However the future probably means running multiple daemons to
> take advantage of multi-core systems so there would need to be a daemon
> id encoded in there.

Yeah, we use HiRes::time() . ".$$" and we don't get any file stomping  
(and we're doing millions of emails/day).

0
matt
8/25/2007 2:15:32 PM
On Sat, 2007-08-25 at 10:15 -0400, Matt Sergeant wrote:
> On 24-Aug-07, at 6:40 PM, David Sparks wrote:
> 
> > I'm using the poll server which means that there aren't threads to worry
> > about.  However the future probably means running multiple daemons to
> > take advantage of multi-core systems so there would need to be a daemon
> > id encoded in there.
> 
> Yeah, we use HiRes::time() . ".$$" and we don't get any file stomping

This works until you can run 65K processes per microsecond or 2G processes
per microsecond if the PID is 32 bits ... we'll probably need quantum
computers before that would break ;-)

>   
> (and we're doing millions of emails/day).

Could you clarify?  There is no HiRes module in CPAN.  Here's what I
see:

        DateTime::HiRes
        Rose::DB::Object::Metadata::Column::Epoch::HiRes
        Time::HiRes
        Time::HiRes::Value

I've been using Time::HiRes, which does not have a time() call but ships
in the debian perl-modules package.


> 

-- 
--gh


0
gwhulbert
8/25/2007 2:41:44 PM
On 25-Aug-07, at 10:41 AM, Guy Hulbert wrote:

>> Yeah, we use HiRes::time() . ".$$" and we don't get any file stomping
>
> This works until you can run 65K processes per microsecond or 2G processes
> per microsecond if the PID is 32 bits ... we'll probably need quantum
> computers before that would break ;-)

Yeah, and honestly IPv4 would break before that. And I'm hoping the  
spammers won't be switching to IPv6 before I retire.

>> (and we're doing millions of emails/day).
>
> Could you clarify?  There is no HiRes module in CPAN.  Here's what I
> see:

Time::HiRes::time().

Matt.
0
matt
8/25/2007 2:51:14 PM
On 2007-08-25 10:41:44 -0400, Guy Hulbert wrote:
> On Sat, 2007-08-25 at 10:15 -0400, Matt Sergeant wrote:
> > On 24-Aug-07, at 6:40 PM, David Sparks wrote:
> >
> > Yeah, we use HiRes::time() . ".$$" and we don't get any file stomping
> > (and we're doing millions of emails/day).
>
> Could you clarify?  There is no HiRes module in CPAN.  Here's what I
> see:
>
>         DateTime::HiRes
>         Rose::DB::Object::Metadata::Column::Epoch::HiRes
>         Time::HiRes
>         Time::HiRes::Value
>
> I've been using Time::HiRes, which does not have a time() call but ships
> in the debian perl-modules package.

It has. If you use it like this:

use Time::HiRes qw(time);

it will replace time with a function which returns the timestamp as a
floating point number. I use that all the time for benchmarking.

(Hmm, I see that it is missing from the synopsis but documented in the
text of the POD)

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/25/2007 3:00:55 PM
On Sat, 2007-08-25 at 17:00 +0200, Peter J. Holzer wrote:
> > I've been using Time::HiRes, which does not have a time() call but ships
> > in the debian perl-modules package.
> 
> It has. If you use it like this:
> 
> use Time::HiRes qw(time);
> 
> it will replace time with a function which returns the timestamp as a

Ah ... that's a useful trick ...

> floating point number. I use that all the time for benchmarking.

I just cut-and-paste the code from the synopsis ;-)

> 
> (Hmm, I see that it is missing from the synopsis but documented in the
> text of the POD)

Yup.  After Matt replied, I read through the POD ... 

thanks Peter.

-- 
--gh


0
gwhulbert
8/25/2007 4:14:44 PM
David Sparks wrote:
>>> = sprintf("%.6f", time()) .".". $self->qp->config("me") . \
>>>   sprintf("%08X", rand(2**32 - 1));  #how expensive is this?

> The big advantage to using time() + id as the least significant digit is
> that you can put the id in a db server as a double or unixtime which
> comes in quite handy when you've got a lot of volume.

Would each thread have a unique PID or are all the threads under the
parent PID?  Is there a thread ID we could use.  The system knows how to
differentiate each thread.  Can we use that in combination with time and IP?
-- 
JT Moree
0
jtmoree
8/25/2007 8:21:05 PM
On Fri, 24 Aug 2007, Guy Hulbert wrote:

> > fqdn + time + peer TCP port will be pretty unique, regardless of
>
> fqdn is the trivial part
>
> rand will be "pretty unique" ...

Initial connection time, peer IP, and peer port will only
repeat if the connection is torn down and reestablished with
the same peer reusing the same local port within the resolution
of the timer.

The check_earlytalker plugin ensures at least a one
second pause in every SMTP session, so time() + peer IP
+ peer port will be far more unique than a random number :-)

This combo would be unique among all hosts attached to the same
routable networks -- two hosts on two different, unconnected
networks could possibly get a connection from the same
private IP + local port at the same time, but this "should
be impossible" if the networks are connected.

Adding this to plugins/logging/syslog works pretty well for
forkserver:

    use Time::HiRes;

....

    if (!$self->{_logid})
    {
        if ($self->connection->remote_ip)
        {
            $self->{_timestamp} = Time::HiRes::time();
            $self->{_logid} = "t=" . $self->{_timestamp} . "/peer=" . $self->connection->remote_ip  . ":" . $self->connection->remote_port;
        }
    }

    if ($self->connection->remote_ip)
    {
        $header = $self->{_logid} . " ";
    }

    syslog $priority, '%s%s', $header, join(' ', @log);


syslog messages look like this:

  Aug 25 14:31:27 mailfoo qpsmtpd[4892]: t=1188077487.69488/peer=10.1.253.1:40911 check_earlytalker


If there's an existing way to count the number of messages sent
during the connection, then append the count to _logid and it
becomes a message ID generator.  If this isn't already somewhere
in SMTP.pm, the queueing plugin could increment a counter..
or the logging plugin could watch for the string 'to email address :' &
increment a (thread-safe) counter.  That's a smidge brittle, tho..
a proper message counter would be less hacky.

James


0
jwa
8/25/2007 9:55:40 PM
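
A log id of this shape is also easy to pick out of a syslog file with standard tools. A minimal sketch, assuming lines that look like the example message above, with the id as a whitespace-separated token starting with "t=" and containing "/peer="; the function name list_logids is made up for illustration. It prints each distinct connection id once, in order of first appearance:

```shell
# Sketch only: list_logids is a hypothetical helper, not part of qpsmtpd.
# $1 = syslog file containing tokens like t=<time>/peer=<ip>:<port>
list_logids() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^t=[0-9.]+\/peer=/ && !seen[$i]++)
        print $i
  }' "$1"
}
```

Any id printed here can then be passed to `grep -F` to pull out all lines of that connection.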
On Sat, 2007-08-25 at 14:55 -0700, James W. Abendschan wrote:
> The check_earlytalker plugin ensures at least a one
> second pause in every SMTP session, so time() + peer IP
> + peer port will be far more unique than a random number :-)

IF you use check_earlytalker configured as you describe then you can
just use time() ... as long as you are single threaded ... I'm going to
look at what Ask is proposing ...

-- 
--gh


0
gwhulbert
8/26/2007 12:23:49 AM
James W. Abendschan wrote:
> The check_earlytalker plugin ensures at least a one
> second pause in every SMTP session, so time() + peer IP
> + peer port will be far more unique than a random number :-)

This has been suggested a few times but I'd rather not have to have ids
for the system depend on using a plugin.  I'm pushing for adding this id
to core qpsmtpd.

> This combo would be unique among all hosts attached to the same
> routable networks -- two hosts on two different, unconnected
> networks could possibly get a connection from the same
> private IP + local port at the same time, but this "should
> be impossible" if the networks are connected.

As in two clients behind a NAT sending to our server at the exact same
time?  Might be possible from server farms or distributed mailing list
systems?

What do you guys think?

-- 
JT Moree
0
jtmoree
8/28/2007 4:49:16 PM
On 8/28/07, JT Moree <jtmoree@kahalacorp.com> wrote:
> James W. Abendschan wrote:
> > The check_earlytalker plugin ensures at least a one
> > second pause in every SMTP session, so time() + peer IP
> > + peer port will be far more unique than a random number :-)
>
> This has been suggested a few times but I'd rather not have to have ids
> for the system depend on using a plugin.  I'm pushing for adding this id
> to core qpsmtpd.
>
> > This combo would be unique among all hosts attached to the same
> > routable networks -- two hosts on two different, unconnected
> > networks could possibly get a connection from the same
> > private IP + local port at the same time, but this "should
> > be impossible" if the networks are connected.
>
> As in two clients behind a NAT sending to our server at the exact same
> time?  Might be possible from server farms or distributed mailing list
> systems?
>
> What do you guys think?

that won't be an issue. the nat box will rewrite the outgoing packets
to say they are coming from a unique port on its external interface,
and that is all you can see on your end.

remoteIP + remotePort + fineGrainedTime is what we use in-house for
some high-speed http logging that needs a unique handle. it works just
fine with a fair number of concurrent clients behind a nat or proxy.
but, my installation is not massive :)
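
A minimal sketch of a handle built that way, for illustration only: the helper name make_handle and the separator layout are assumptions, not code from any installation mentioned in this thread. Time::HiRes supplies the sub-second clock.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: combine the peer address with a
# microsecond-resolution timestamp into one log handle.
sub make_handle {
    my ($remote_ip, $remote_port) = @_;
    return sprintf "%s:%d.%.6f", $remote_ip, $remote_port, time();
}

print make_handle("192.0.2.1", 40123), "\n";
```

Two connections from behind the same NAT arrive with different remote ports, so the handles differ even when the timestamps happen to match.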

allan

-- 
"The truth is an offense, but not a sin"
0
kitno455
8/28/2007 5:23:58 PM
Why not use something like Data::UUID?

http://search.cpan.org/~rjbs/Data-UUID-1.148/UUID.pm

There it reads:

"It provides reasonably efficient and reliable framework for generating
UUIDs and supports fairly high allocation rates -- 10 million per second
per machine -- and therefore is suitable for identifying both extremely
short-lived and very persistent objects on a given system as well as
across the network."

I used this in a former project for unique persistent object ids.

--
Ernesto




0
ernest
8/28/2007 5:46:25 PM
> remoteIP + remotePort + fineGrainedTime is what we use in-house for
> some high-speed http logging that needs a unique handle. it works just
> fine with a fair number of concurrent clients behind a nat or proxy.
> but, my installation is not massive :)

Add PID and a per-process message-counter and you should always be
unique.
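
As a sketch of that suggestion (the name next_id and the field order are made up for illustration, not what qpsmtpd does): the PID separates concurrent processes, and the counter separates messages handled by one process even if the clock does not advance between them.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Per-process counter; every forked child gets its own copy.
my $counter = 0;

# Hypothetical helper: hires time, PID, counter, remote IP, remote port.
sub next_id {
    my ($remote_ip, $remote_port) = @_;
    $counter++;
    return join ".", sprintf("%.6f", time()), $$, $counter,
        $remote_ip, $remote_port;
}

print next_id("192.0.2.1", 40123), "\n";
print next_id("192.0.2.1", 40123), "\n";
```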


Regards
Michael

-- 
It's an insane world, but i'm proud to be a part of it. -- Bill Hicks
0
kju
8/28/2007 6:10:38 PM
I've checked in $transaction->id support now. Please let me know if  
you think it's OK.

Matt.
0
matt
8/28/2007 6:42:17 PM
Matt Sergeant wrote:
> I've checked in $transaction->id support now. Please let me know if you
> think it's OK.

which method did you use?
-- 
JT Moree
0
jtmoree
8/28/2007 7:12:57 PM
On 28-Aug-07, at 3:12 PM, JT Moree wrote:

> Matt Sergeant wrote:
>> I've checked in $transaction->id support now. Please let me know  
>> if you
>> think it's OK.
>
> which method did you use?

hires_time.pid.local_port

Matt.


0
matt
8/28/2007 7:33:16 PM
Matt Sergeant wrote:
> On 28-Aug-07, at 3:12 PM, JT Moree wrote:
> 
>> Matt Sergeant wrote:
>>> I've checked in $transaction->id support now. Please let me know if you
>>> think it's OK.
>>
>> which method did you use?
> 
> hires_time.pid.local_port

I found the svn web interface:

  # generate id
  my $conn = $args{connection};
  my $ip = $conn->local_port || "0";
  my $start = time;
  my $id = "$start.$$.$ip";

Some people have suggested adding the remote IP address.  I'm curious
why use local port instead of remote port?  would both be better?

  my $ip = $conn->remote_ip;
  my $rport = $conn->remote_port || "0";
  my $lport = $conn->local_port || "0";
  my $start = time;
  # braces keep Perl from reading "$start_" and "$lport_" as variable names
  my $id = "${start}_$$.${lport}_$ip:$rport";


Thanks for checking something in.  Progress is being made. ;)
-- 
JT Moree
0
jtmoree
8/28/2007 7:51:52 PM
On 28-Aug-07, at 3:51 PM, JT Moree wrote:

> I found the svn web interface:
>
>   # generate id
>   my $conn = $args{connection};
>   my $ip = $conn->local_port || "0";
>   my $start = time;
>   my $id = "$start.$$.$ip";
>
> Some people have suggested adding the remote IP address.  I'm curious
> why use local port instead of remote port?  would both be better?

Err, actually I had a brain fart. It should be remote_port.

Matt.


0
matt
8/28/2007 8:06:15 PM
On Tue, 28 Aug 2007, Matt Sergeant wrote:

> On 28-Aug-07, at 3:51 PM, JT Moree wrote:
>>> hires_time.pid.local_port
....
>>    my $conn = $args{connection};
>>    my $ip = $conn->local_port || "0";
>>    my $start = time;
>>    my $id = "$start.$$.$ip";
>>
>>  Some people have suggested adding the remote IP address.  I'm curious
>>  why use local port instead of remote port?  would both be better?
>
> Err, actually I had a brain fart. It should be remote_port.

No, it should be remote_IP.remote_port.local_port and should include a 
transaction_within_connection count. I don't think that pid adds anything.

0
charlieb
8/29/2007 3:04:56 AM
On Tue, 2007-08-28 at 23:04 -0400, Charlie Brady wrote:
> > On 28-Aug-07, at 3:51 PM, JT Moree wrote:
> >>> hires_time.pid.local_port
> ...
> >>    my $conn = $args{connection};
> >>    my $ip = $conn->local_port || "0";
> >>    my $start = time;
> >>    my $id = "$start.$$.$ip";
> >>
> >>  Some people have suggested adding the remote IP address.  I'm
> curious
> >>  why use local port instead of remote port?  would both be better?
> >
> > Err, actually I had a brain fart. It should be remote_port.
> 
> No, it should be remote_IP.remote_port.local_port and should include
> a 
> transaction_within_connection count. I don't think that pid adds
> anything.
> 
> 
> 
> 

This does not guarantee a unique message ID.  That's why we are using
hi_res time.


-- 
--gh



0
gwhulbert
8/29/2007 10:36:41 AM
On 28-Aug-07, at 11:04 PM, Charlie Brady wrote:

>> Err, actually I had a brain fart. It should be remote_port.
>
> No, it should be remote_IP.remote_port.local_port and should  
> include a transaction_within_connection count. I don't think that  
> pid adds anything.

Please try any way you can to get the algorithm I've used to generate  
a duplicate transaction id. Feel free to use your fastest hardware.

I've tried, and cannot conceive of any way to get a repeat with this  
algorithm. Perhaps in 30 years maybe (when computers are that fast),  
but for now it works well.

Matt.
0
matt
8/29/2007 1:12:08 PM
> From:  Charlie Brady <charlieb-qpsmtpd@budge.apana.org.au>
> Date:  Tue, 28 Aug 2007 23:04:56 -0400 (EDT)
>
> No, it should be remote_IP.remote_port.local_port and should include a 
> transaction_within_connection count. I don't think that pid adds anything.

Isn't localport always 25?

Chris

-- 
Chris Garrigues                         Trinsic Solutions
President                               710-B West 14th Street
                                        Austin, TX  78701-1798
http://www.trinsics.com/blog
http://www.trinsics.com			512-322-0180

                 Would you rather proactively pay for
                uptime or reactively pay for downtime?

			  Trinsic Solutions
		Your Trusted Friends in Proactive IT.



0
cwg
8/29/2007 1:58:54 PM
Chris Garrigues wrote:
>> From:  Charlie Brady <charlieb-qpsmtpd@budge.apana.org.au>
>> Date:  Tue, 28 Aug 2007 23:04:56 -0400 (EDT)
>>
>> No, it should be remote_IP.remote_port.local_port and should include a
>> transaction_within_connection count. I don't think that pid adds anything.
>
> Isn't localport always 25?

Most of the time, yes.
But it can also be 465.

-- 
Jens
0
jens
8/29/2007 2:09:22 PM
Charlie Brady wrote:
> 
> On Tue, 28 Aug 2007, Matt Sergeant wrote:
> 
>> On 28-Aug-07, at 3:51 PM, JT Moree wrote:
>>>> hires_time.pid.local_port
> ...
>>>    my $conn = $args{connection};
>>>    my $ip = $conn->local_port || "0";
>>>    my $start = time;
>>>    my $id = "$start.$$.$ip";
>>>
>>>  Some people have suggested adding the remote IP address.  I'm curious
>>>  why use local port instead of remote port?  would both be better?
>>
>> Err, actually I had a brain fart. It should be remote_port.
> 
> No, it should be remote_IP.remote_port.local_port and should include a 
> transaction_within_connection count. I don't think that pid adds anything.

You could still have a machine with several IPs / interfaces, so 
remote_IP.remote_port.local_port.transaction_within_connection is not 
enough either.

-Johan
0
johan
8/29/2007 2:21:31 PM
On Wed, 29 Aug 2007 the voices made Guy Hulbert write:

GH> On Tue, 2007-08-28 at 23:04 -0400, Charlie Brady wrote:
GH> > > On 28-Aug-07, at 3:51 PM, JT Moree wrote:
GH> > >>> hires_time.pid.local_port
GH> > ...
GH> > >>    my $conn = $args{connection};
GH> > >>    my $ip = $conn->local_port || "0";
GH> > >>    my $start = time;
GH> > >>    my $id = "$start.$$.$ip";
GH> > >>
GH> > >>  Some people have suggested adding the remote IP address.  I'm
GH> > curious
GH> > >>  why use local port instead of remote port?  would both be better?
GH> > >
GH> > > Err, actually I had a brain fart. It should be remote_port.
GH> > 
GH> > No, it should be remote_IP.remote_port.local_port and should include
GH> > a 
GH> > transaction_within_connection count. I don't think that pid adds
GH> > anything.

GH> This does not guarantee a unique message ID.  That's why we are using
GH> hi_res time.

 If this is meant to go into the core it might not be a bad idea to make the 
first part of a "guaranteed" unique message ID identify the server itself; just 
to avoid someone writing a plugin which relies on the ID being unique, when in 
some (maybe very rare) cases it isn't so in a busy multi-mx setup.

 I'm thinking aggregated/centralized logging, simple/custom support systems 
relying on an already provided "unique" ID, mail queues/filters/whatever using 
this ID to know where to pick up their work after an interruption, "report this 
e-mail as spam" links added in e-mails... but most of all I'm thinking that if 
it might be possible to time an attack such that something shows a behaviour 
that is, at least to some, unexpected, like the creation of a unique ID that 
isn't "unique enough" for its usage, then someone out there will sooner or 
later figure out a way to use it to his advantage.


 Including the PID might be enough to differentiate the IDs, but if it is, it's 
more, from a security POV, works-by-accident than by design; 
PID+random-string-created-when-qpsmtpd-is-started might be a tempting way to 
keep resource usage at a minimum, but would once again not feel as safe as 
might be preferred... but using a public IP address used by qpsmtpd might do 
the trick. Then again, if we start to think about security, there might be 
several (semi-paranoid) reasons why one wouldn't want to just use the/an IP 
address (esp. since it might freak out some qpsmtpd users if they use the ID 
externally without knowing about this); so how about, at the start of qpsmtpd, 
creating a unique ID for that startup by hashing something along the lines of 
(IPs-listened-to, PID, hi-res-time), possibly with randomized positions or a 
random string somewhere.
 That unique server ID would then be used as the first part of the unique IDs 
created for each e-mail processed on that server.
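
 A rough sketch of that startup step, under stated assumptions: make_server_id is a hypothetical helper, SHA-1 via Digest::SHA is just one convenient hash, and truncating to 12 hex digits is an arbitrary choice that trades length for a higher collision chance.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use Time::HiRes qw(time);

# Hypothetical helper, meant to run once when the daemon starts.
sub make_server_id {
    my @listen_ips = @_;
    # rand() is not cryptographically strong; it only adds a little salt.
    my $salt = join "", map { sprintf "%02x", int(rand(256)) } 1 .. 8;
    my $seed = join "|", sort(@listen_ips), $$, time(), $salt;
    return substr(sha1_hex($seed), 0, 12);
}

# The server ID becomes the first part of every per-message ID.
my $server_id = make_server_id("192.0.2.10", "192.0.2.11");
my $id = join ".", $server_id, $$, sprintf("%.6f", time());
print "$id\n";
```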

 Sure, this is more convoluted than truly needed for the requirements of a 
unique ID in this case, and one could rightfully argue that the usage of a 
hashing algorithm will increase the chance that two servers will end up with 
the same ID; further weaknesses would be in situations where qpsmtpd isn't 
using any IPaddresses uniquely assigned (such as on a local network) or where 
two servers are intentionally designed to be identical and maybe (re)activated 
from the same saved image of a previously started system (perhaps a teeny weeny 
bit more esoteric usage of qpsmtpd to ever be setup by anyone that would rely 
on a "guaranteed" unique ID without checking the code thoroughly).

 HOWEVER, this never was about cryptographically perfect security; but about 
greatly reducing the potential number of situations in which 1) a vulnerability 
might be introduced, and 2) where vulnerable setups might be identified.

 A scheme such as that I've outlined would do just that, it would also reduce 
the window of opportunity for the already unlikely error/exploit to no longer 
than when qpsmtpd is restarted on one of the two affected servers; to the cost 
of a very slight extra resource usage at the startup.


 (The only reason I gave this some thought was because a few days ago I wanted 
unique IDs traceable to a single server, without having to use statically 
assigned identifiers. The pure security implications of adding unique IDs to 
qpsmtpd that might turn out less unique in a multi-mx setup seem quite 
negligible; but still easily avoidable... )


	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/29/2007 2:53:14 PM
On Wed, 2007-08-29 at 16:53 +0200, Tony L. Svanstrom wrote:
>  (Only reason I gave this some thought was because I few days ago wanted unique 
> IDs traceable to a single server, without having to use statically assigned 

Look at UUID.  That is what it is for.

> identifiers. The purely security implications of adding unique IDs to qpsmtpd 
> without them possibly being less unique in a multi-mx setup seems quite 
> neglectable; but still easily avoidable... )

IIRC, it uses a timestamp plus MAC address (MAC address is not unique but it is
supposed to be) plus some other stuff.  It is used for subversion
repository IDs ... there are about 5 flavours (including microsoft ;-).

During this discussion, someone referenced an implementation able to
provide 10M unique IDs per second.

-- 
--gh


0
gwhulbert
8/29/2007 3:21:31 PM
On Wed, 29 Aug 2007 the voices made Guy Hulbert write:

GH> On Wed, 2007-08-29 at 16:53 +0200, Tony L. Svanstrom wrote:
GH> >  (Only reason I gave this some thought was because I few days ago wanted unique 
GH> > IDs traceable to a single server, without having to use statically assigned 
GH> 
GH> Look at UUID.  That is what it is for.

 Oh, it wasn't exactly a huge problem; just thought I'd throw some thoughts out 
there to make sure it was at least thought about when adding "unique" IDs in 
qpsmtpd, as people might end up using them outside of the intended scope. 
(Sorry for any strange wording etc; but I'm tired, in a hurry and English isn't 
my language of choice.)


	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/29/2007 3:34:52 PM
Given that we are still disagreeing on what is the best way to do it;
Can we use all information used so far to get the most unique possible
for now?  Even if it's not perfect, it's a start.  Even if some of the
information seems extraneous to some people (and may be) it's still
better than nothing.

Short of using UUID I'd say doing something like this.  I've tried to
put the order of information from most static to most dynamic.

Using Time::HiRes

my $ip = $conn->remote_ip;
my $rport = $conn->remote_port || "0";
my $lport = $conn->local_port || "0";
my $start = time;
# "$$_" inside a string would dereference $_, so the PID is concatenated
my $id = $$ . "_$start.${lport}_$ip:$rport";

-- 
JT Moree
0
jtmoree
8/29/2007 3:41:59 PM
On 8/29/07, JT Moree <jtmoree@kahalacorp.com> wrote:
> Given that we are still disagreeing on what is the best way to do it;
> Can we use all information used so far to get the most unique possible
> for now?  Even if it's not perfect, it's a start.  Even if some of the
> information seems extraneous to some people (and may be) it's still
> better than nothing.
>
> Short of using UUID i'd say doing something like this.  I've tried to
> put the order of information from most static to most dynamic.
>
> Using HiRes::Time
>
> my $ip = $conn->remote_ip($ip);
> my $rport = $conn->remote_port || "0";
> my $lport = $conn->local_port || "0";
> my $start = time;
> my $id = "$$_$start.$lport_$ip:$rport";
>
> --
> JT Moree
>

if you want to be paranoid, you have to have all 4 data points from
the connection (local port/ip and remote port/ip), plus the local box's
time with high granularity. if you re-gen '$start' with each
transaction within the connection, you don't need a per-connection
counter, provided that your time is fine enough to prevent collisions.

If you leave out any of the local info, an installation with two
servers with un-synced times could still gen the same id. if you add
it, then the only way you could have a collision is if your time is
not granular enough or gets set back.

tcp sequence numbers can also be useful here as a replacement for
time, but might be hard to get within perl?

allan

-- 
"The truth is an offense, but not a sin"
0
kitno455
8/29/2007 3:53:19 PM
On Wed, 2007-08-29 at 11:53 -0400, m. allan noah wrote:
> On 8/29/07, JT Moree <jtmoree@kahalacorp.com> wrote:
> > Given that we are still disagreeing on what is the best way to do it;
> > Can we use all information used so far to get the most unique possible
> > for now?  Even if it's not perfect, it's a start.  Even if some of the
> > information seems extraneous to some people (and may be) it's still
> > better than nothing.
> >
> > Short of using UUID i'd say doing something like this.  I've tried to
> > put the order of information from most static to most dynamic.
> >
> > Using HiRes::Time

i.e.

use Time::HiRes qw(time);

> >
> > my $ip = $conn->remote_ip($ip);
> > my $rport = $conn->remote_port || "0";
> > my $lport = $conn->local_port || "0";
> > my $start = time;
> > my $id = "$$_$start.$lport_$ip:$rport";
> >
> > --
> > JT Moree
> >
> 
> if you want to be paranoid, you have to have all 4 data points from

Why is there all this confusion about "security" ?  The goal is to have
a unique MessageID for logs ... 

[snip]
> tcp sequence numbers can also be useful here as a replacement for

I doubt it very much.  TCP sequence numbers have a history of poor
implementation.

> time, but might be hard to get within perl?
> 
> allan

-- 
--gh


0
gwhulbert
8/29/2007 4:05:14 PM
> > Isn't localport always 25?
> the most time: yes.
> But it can also be 465

Also port 587 (message submission as per RFC2476).


Regards
Michael

-- 
It's an insane world, but i'm proud to be a part of it. -- Bill Hicks
0
kju
8/29/2007 4:06:35 PM
> If you leave out any of the local info, an installation with two
> servers with un-synced times could still gen the same id. if you add
> it, then the only way you could have a collision is if your time is
> not granular enough or gets set back.

I'm ok with that

 Using Time::HiRes

my $lip = $conn->local_ip();
my $rip = $conn->remote_ip();
my $rport = $conn->remote_port || "0";
my $lport = $conn->local_port || "0";
my $start = time;
# braces and concatenation avoid the "$$_", "$start_" and "$lport_" misparses
my $id = $$ . "_${start}_$lip:${lport}_$rip:$rport";

-- 
JT Moree
0
jtmoree
8/29/2007 4:08:56 PM
On Wed, 29 Aug 2007 the voices made Guy Hulbert write:

GH> Why is there all this confusion about "security" ?  The goal is to have
GH> a unique MessageID for logs ... 

 Then forget about the word "security", and let's just say that people might 
want to have unique IDs that'll be unique even when they've got more than one 
server and centralized/aggregated logging... But we're not even there right 
now, "we" are still stuck on how to make the IDs 100% unique within a single 
server as it might be setup by "any" qpsmtpd-user.



	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/29/2007 4:15:44 PM
On 8/29/07, Guy Hulbert <gwhulbert@eol.ca> wrote:
> > if you want to be paranoid, you have to have all 4 data points from
>
> Why is there all this confusion about "security" ?  The goal is to have
> a unique MessageID for logs ...

i never said security. i said paranoid, specifically about collisions.

allan

-- 
"The truth is an offense, but not a sin"
0
kitno455
8/29/2007 4:23:43 PM
On Wed, 2007-08-29 at 12:23 -0400, m. allan noah wrote:
> On 8/29/07, Guy Hulbert <gwhulbert@eol.ca> wrote:
> > > if you want to be paranoid, you have to have all 4 data points from
> >
> > Why is there all this confusion about "security" ?  The goal is to have
> > a unique MessageID for logs ...
> 
> i never said security. i said paranoid, specifically about collisions.

If the message ID is "unique" there will be no collisions.  So I
misinterpreted your "paranoia" ... my bad.

> 
> allan

-- 
--gh


0
gwhulbert
8/29/2007 4:27:15 PM
On Wed, 2007-08-29 at 18:15 +0200, Tony L. Svanstrom wrote:
> On Wed, 29 Aug 2007 the voices made Guy Hulbert write:
> 
> GH> Why is there all this confusion about "security" ?  The goal is to have
> GH> a unique MessageID for logs ... 
> 
>  Then forget about the word "security", and let's just say that people might 
> want to have unique IDs that'll be unique even when they've got more than one 
> server and centralized/aggregated logging... But we're not even there right 
> now, "we" are still stuck on how to make the IDs 100% unique within a single 
> server as it might be setup by "any" qpsmtpd-user.

There have been several adequate suggestions.  This is only a problem if
it goes into the qpsmtpd core since some of the suggestions are reported
to be in use already.

Perhaps it would help to agree on a list of requirements.  From what I
can remember these are:

	1. A unique ID per message (on one server).
	2. Ability to distinguish per recipient.
	3. Ability to identify the server.

A sequence solves (1) except for simultaneous processing of
incoming messages via:

	a) async
	b) threads/multiple cpus
	c) local ports (possibly on multiple addresses)

Except with multiple CPUs, time with sufficient resolution is a
satisfactory replacement for a sequence.

It may be useful to log things like remote_port but it doesn't seem to
help directly to solve problem 1.

A counter solves 2.

Any tag which is unique per server solves 3.  It is probably simpler to
make this configurable by the end-user.

> 
> 
> 
> 	/Tony
-- 
--gh


0
gwhulbert
8/29/2007 5:07:06 PM
A UUID is preferable to the other solutions because you can condense it
down to 128 bits of binary data ... and put it in a database. :)

The other solutions are not as database friendly.  It seems to me if
we're trying to solve the problem of guaranteeing unique transaction ids
for extremely high volume sites, then we should make sure that the
transaction id itself is high volume friendly.
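
To illustrate the 128-bit point without depending on any UUID module: a textual UUID is 32 hex digits, so pack can squeeze it into 16 raw bytes for a binary database column and unpack can restore it. The helper names below are hypothetical.

```perl
use strict;
use warnings;

# Textual UUID (36 characters) to 16 raw bytes.
sub uuid_to_bin {
    my ($uuid) = @_;
    (my $hex = $uuid) =~ tr/-//d;    # drop the four dashes
    return pack "H32", $hex;
}

# 16 raw bytes back to the usual 8-4-4-4-12 text form.
sub bin_to_uuid {
    my ($bin) = @_;
    my $hex = unpack "H32", $bin;
    return join "-", unpack("A8 A4 A4 A4 A12", $hex);
}

my $bin = uuid_to_bin("f9c31c2d-b3fb-0310-82b0-c4cdd2013627");
print length($bin), " bytes\n";
print bin_to_uuid($bin), "\n";
```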

Cheers,

ds
0
dave
8/29/2007 5:14:46 PM
On Wed, 2007-08-29 at 10:14 -0700, David Sparks wrote:
> A UUID is preferable to the other solutions because you can condense it
> down to 128 bits of binary data ... and put it in a database. :)

A Time::HiRes timestamp is 64 bits ... leaving 64 bits for the server tag.

> 
> The other solutions are not as database friendly.  It seems to me if
> we're trying to solve the problem of guaranteeing unique transaction ids
> for extremely high volume sites, then we should make sure that the
> transaction id itself is high volume friendly.
> 
> Cheers
-- 
--gh


0
gwhulbert
8/29/2007 5:43:38 PM
Guy Hulbert wrote:
> There have been several adequate suggestions.  This is only a problem if
> it goes into the qpsmtpd core since some of the suggestions are reported
> to be in use already.
How is this a problem?  Those uses should still work even if we start
with the same variable because they would overwrite what is in core.
The plugin maintainers can update as they have time.

good idea about the requirements.

> 	1. A unique ID per message (on one server).
> 	2. Ability to distinguish per recipient.
> 	3. Ability to identify the server.

2) per recipient or per message?  I don't see a way to make an id per
recipient since any message can have multiple recipients.

3) which server are we talking about?

> A sequence solves (1) except for simultaneous processing of
<snip>
> A counter solves 2.
> 
> Any tag which is unique per server solves 3.  It is probably simpler to
> make this configurable by the end-user.

if A solves 1, B solves 2, and C solves 3 then A+B+C should solve all
three and it's pretty simple to do so let's just do it.

While letting the end user make changes is nice it defeats the purpose
of putting a transaction ID into core where everyone can know and rely
on it working the same way.

-- 
JT Moree
0
jtmoree
8/29/2007 6:16:28 PM
Tony L. Svanstrom wrote:
>  Then forget about the word "security", and let's just say that people might 
> want to have unique IDs that'll be unique even when they've got more than one 
> server and centralized/aggregated logging... But we're not even there right 
> now, "we" are still stuck on how to make the IDs 100% unique within a single 
> server as it might be setup by "any" qpsmtpd-user.

No, that much works, as far as I've been able to prove. It's just a 
bunch of bikeshed painting going on now :-)

I'd be happy to add a quick hash of the server in.

Matt.
0
matt
8/29/2007 7:06:18 PM
On Wed, 2007-08-29 at 11:16 -0700, JT Moree wrote:
> Guy Hulbert wrote:
> > There have been several adequate suggestions.  This is only a problem if
> > it goes into the qpsmtpd core since some of the suggestions are reported
> > to be in use already.
> how is this a problem.  those uses should still work even if we start

I think you answered this at the end.

> with the same variable because they would overwrite what is in core.
> The plugin maintainers can update as they have time.
> 
> good idea about the requirements.

Well if people restrict their input to the requirements it simplifies
things.

> 
> > 	1. A unique ID per message (on one server).
> > 	2. Ability to distinguish per recipient.
> > 	3. Ability to identify the server.
	4. Well-defined format (e.g. UUID).
> 
> 2) per recipient or per message?  I don't see a way to make an id per
> recipient since any message can have multiple recipients.

There was a suggestion way back in the thread that this was required.  I
don't really know if it is required but it has been mentioned more than
once (by people besides me).

> 
> 3) which server are we talking about?

If you use syslog you can have all your logs in one place but if you are
running multiple mail servers then you might want to know which server
is responsible for a particular message ID.

> 
> > A sequence solves (1) except for simultaneous processing of
> <snip>
> > A counter solves 2.
> > 
> > Any tag which is unique per server solves 3.  It is probably simpler to
> > make this configurable by the end-user.
> 
> if A solves 1, B solves 2, and C solves 3 then A+B+C should solve all
> three and it's pretty simple to do so let's just do it.

I would just use either what Matt Sergeant is using 

http://www.nntp.perl.org/group/perl.qpsmtpd/2007/08/msg7116.html

        Yeah, we use HiRes::time() . ".$$" and we don't get any file stomping  
        (and we're doing millions of emails/day).

or something like a UUID ... 

Here is an old UUID I have lying around:
	f9c31c2d-b3fb-0310-82b0-c4cdd2013627
so we can make it look something like that.

        use Time::HiRes qw( gettimeofday );
        print sprintf("%08x-%08x-%04x\n",gettimeofday,$$);

	46d5cf96-000045e3-3348

This, at least looks a bit like a UUID and can be extended with -%04x
formatted pieces.  As long as (2) and (3) are not needed, we are done.
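
A sketch of such an extension, assuming a small numeric server tag and a per-connection counter as the extra pieces (both are illustrative choices; format_id is not part of qpsmtpd):

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper: seconds, microseconds and PID as before,
# followed by two more -%04x pieces.
sub format_id {
    my ($server_tag, $counter) = @_;
    my ($sec, $usec) = gettimeofday();
    return sprintf "%08x-%08x-%04x-%04x-%04x",
        $sec, $usec, $$, $server_tag, $counter;
}

print format_id(1, 2), "\n";
```

Note that %04x is a minimum width, so a PID above 65535 simply produces a longer field.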

I have to run now ...

> While letting the end user make changes is nice it defeats the purpose
> of putting a transaction ID into core where everyone can know and rely
> on it working the same way.

> 
-- 
--gh


0
gwhulbert
8/29/2007 8:06:56 PM
On 2007-08-29 13:07:06 -0400, Guy Hulbert wrote:
> On Wed, 2007-08-29 at 18:15 +0200, Tony L. Svanstrom wrote:
> > On Wed, 29 Aug 2007 the voices made Guy Hulbert write:
> >
> > GH> Why is there all this confusion about "security" ?  The goal is to have
> > GH> a unique MessageID for logs ...
> >
> >  Then forget about the word "security", and let's just say that people might
> > want to have unique IDs that'll be unique even when they've got more than one
> > server and centralized/aggregated logging... But we're not even there right
> > now, "we" are still stuck on how to make the IDs 100% unique within a single
> > server as it might be setup by "any" qpsmtpd-user.
>
> There have been several adequate suggestions.  This is only a problem if
> it goes into the qpsmtpd core since some of the suggestions are reported
> to be in use already.
>
> Perhaps it would help to agree on a list of requirements.  From what I
> can remember these are:
>
> 	1. A unique ID per message (on one server).

I'd rephrase that as "unique ID per transaction". Not every transaction
results in a message (indeed, on my systems 90+% of transactions don't
result in a message).


> 	2. Ability to distinguish per recipient.

I'm not even sure what "per recipient" should mean here. Does it mean
"per RCPT command", so that a log file looks something like this:

abcdef.0 Accepted connection 1/15 from 192.0.2.1 /foo.example.com
abcdef.0 check_earlytalker plugin: remote host said nothing spontaneous, proceeding
abcdef.0 220 ns1.hjp.at ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.
abcdef.0 dispatching EHLO foo.example.com
abcdef.0 250-ns1.hjp.at Hi foo.example.com [192.0.2.1]
abcdef.0 250-PIPELINING
abcdef.0 250-8BITMIME
abcdef.0 250 STARTTLS
abcdef.0 dispatching MAIL FROM:<somebody@example.com>
abcdef.0 from email address : [<somebody@example.com>]
abcdef.0 Plugin check_badmailfrom, hook mail returned DECLINED
abcdef.0 250 <somebody@example.com>, sender OK - how exciting to get mail from you!
abcdef.1 dispatching RCPT TO:<hjp@hjp.at>
abcdef.1 to email address : [<hjp@hjp.at>]
abcdef.1 Plugin aliases_check, hook rcpt returned DECLINED,
abcdef.1 Plugin spamhaus, hook rcpt returned DECLINED,
abcdef.1 250 <hjp@hjp.at>, recipient ok
abcdef.2 dispatching RCPT TO:<postmaster@hjp.at>
abcdef.2 to email address : [<postmaster@hjp.at>]
abcdef.2 Plugin aliases_check, hook rcpt returned DECLINED,
abcdef.2 Plugin spamhaus, hook rcpt returned DECLINED,
abcdef.2 250 <postmaster@hjp.at>, recipient ok
abcdef.0 dispatching DATA
...

or really distinguish recipients? The latter doesn't make much sense to
me (before the first RCPT there are 0 recipients, and after the second
(successful) RCPT there is more than one, so there are a lot of cases
where this is ambiguous). As for the former, I don't see that much use in
it, either. Grouping lines from "dispatching ..." to the response
together seems easy enough, and if you find that hard for some reason,
it doesn't apply only to recipients - you might want a command counter.


> 	3. Ability to identify the server.

	4. Ability to identify the connection.

	   A connection can contain several transactions, and I would not
	   like to lose the information that two log entries are from
	   the same connection.

If we want transaction (and possibly command) ids, I would derive them
from the connection id via simple counters:

$transaction_id = "$connection_id.$transaction_counter"
$command_id = "$transaction_id.$command_counter"

where the counters are local to their parent and start at 0.
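
A minimal sketch of that derivation using closures. new_connection and the generators it returns are hypothetical names; only the ID layout comes from the description above.

```perl
use strict;
use warnings;

# Returns a generator that yields, per transaction, the transaction id
# and a second generator for the command ids inside that transaction.
sub new_connection {
    my ($connection_id) = @_;
    my $transaction_counter = 0;
    return sub {
        my $transaction_id = "$connection_id." . $transaction_counter++;
        my $command_counter = 0;    # local to this transaction, starts at 0
        return ($transaction_id,
                sub { "$transaction_id." . $command_counter++ });
    };
}

my $next_transaction = new_connection("abcdef");
my ($tid, $next_command) = $next_transaction->();
print "$tid\n";
print $next_command->(), "\n";
print $next_command->(), "\n";
```

Because each id carries its parent as a prefix, grepping for the connection id still finds every transaction and command that belongs to it.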

> A sequence solves (1) except for simultaneous processing of
> incoming messages via:
>
> 	a) async
> 	b) threads/multiple cpus
> 	c) local ports (possibly on multiple addresses)

I think you'll have to define "sequence". If you have one global
sequence, that will work in all of these cases. Or you can have multiple
sequences, but then you need a prefix to distinguish them.

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/29/2007 9:13:55 PM
On 2007-08-29 09:08:56 -0700, JT Moree wrote:
> > If you leave out any of the local info, an installation with two
> > servers with un-synced times could still gen the same id. if you add
> > it, then the only way you could have a collision is if your time is
> > not granular enough or gets set back.
>
> I'm ok with that
>
>  Using HiRes::Time
>
> my $lip = $conn->local_ip();

    up to 15 characters (39 with IPv6)

> my $rip = $conn->remote_ip();

    up to 15 characters (39 with IPv6)

> my $rport = $conn->remote_port || "0";

    up to 5 characters

> my $lport = $conn->local_port || "0";

    up to 5 characters

> my $start = time;

    up to 16 characters

$$

    up to 5 characters (10 for 32bit PIDs)

> my $id = "$$_$start_$lip:$lport_$rip:$rport";

    5 + 1 + 16 + 1 + 15 + 1 + 5 + 1 + 15 + 1 + 5 = 66 characters

or even

    10 + 1 + 16 + 1 + 39 + 1 + 5 + 1 + 39 + 1 + 5 = 119 characters

on some systems. Much too long for an ID which is included in each log
line. You could condense it by using base 36 instead of base 10, but
it's still quite bulky.
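
To make the length argument concrete, here is a rough sketch that builds the decimal form of such an id and a base-36 variant and prints both lengths. All values are made-up examples, and base36() is a helper written for this illustration, not an existing qpsmtpd function. Note also that inside a double-quoted Perl string, "$start_" and "$lport_" are read as variable names that include the underscore, so the sketch writes the fields as ${...}.

```perl
use strict;
use warnings;

# Encode a non-negative integer in base 36 (digits 0-9, a-z).
sub base36 {
    my ($n) = @_;
    my @digits = (0 .. 9, 'a' .. 'z');
    my $out = '';
    do {
        $out = $digits[$n % 36] . $out;
        $n   = int($n / 36);
    } while ($n > 0);
    return $out;
}

# Example values only; in qpsmtpd they would come from the connection.
my ($pid, $secs, $usecs) = (32767, 1188422523, 999999);
my ($lip, $lport) = ('192.168.100.200', 25);
my ($rip, $rport) = ('192.168.100.201', 65535);

# Braces keep Perl from reading "$secs_" or "$lport_" as variable names.
my $decimal = "${pid}_${secs}.${usecs}_${lip}:${lport}_${rip}:${rport}";

# Same fields, with numbers in base 36 and IPv4 addresses as 32-bit integers.
my $ip36 = sub { base36(unpack('N', pack('C4', split(/\./, $_[0])))) };
my $short = join '_', base36($pid), base36($secs), base36($usecs),
    $ip36->($lip), base36($lport), $ip36->($rip), base36($rport);

printf "%3d chars: %s\n", length($decimal), $decimal;
printf "%3d chars: %s\n", length($short),   $short;
```

The base-36 form is shorter, but it still carries seven fields, which is the "quite bulky" point made above.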

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/29/2007 9:22:03 PM
On 29-Aug-07, at 1:07 PM, Guy Hulbert wrote:

> On Wed, 2007-08-29 at 18:15 +0200, Tony L. Svanstrom wrote:
>>
>>  Then forget about the word "security", and let's just say that  
>> people might
>> want to have unique IDs that'll be unique even when they've got  
>> more than one
>> server and centralized/aggregated logging... But we're not even  
>> there right
>> now, "we" are still stuck on how to make the IDs 100% unique  
>> within a single
>> server as it might be setup by "any" qpsmtpd-user.
>
> There have been several adequate suggestions.  This is only a  
> problem if
> it goes into the qpsmtpd core since some of the suggestions are  
> reported
> to be in use already.

That doesn't matter as they haven't created $tran->id - they've just  
put something in ->notes() which will continue to work.

> Perhaps it would help to agree on a list of requirements.  From what I
> can remember these are:
>
> 	1. A unique ID per message (on one server).
> 	2. Ability to distinguish per recipient.
> 	3. Ability to identify the server.

I think you've made #2 confusing... I think what you mean is we want  
a new id when the transaction is reset (i.e. same connection, new  
email). That's fine.

> A sequence solves (1) except for simultaneous processing of
> incoming messages via:
>
> 	a) async
> 	b) threads/multiple cpus
> 	c) local ports (possibly on multiple addresses)

I don't think any of these break when using the timer. But to  
settle that concern I've updated the implementation again to use even  
finer grained time (microseconds) and add in rand() in case the  
gettimeofday timer is on a slow clock.

So now it's:

    "secs.microsecs.rand.pid"

There's a requirement that I'd like to add in: the ability to use the  
id as a filename for storage, and have it sort by time.
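
A minimal sketch of an id in the "secs.microsecs.rand.pid" layout that also sorts by time when used as a filename. This is not the code committed to svn; new_id() and the field widths are assumptions made for illustration. Zero-padding the first two fields is what makes plain string order match time order, down to the microsecond.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical generator: fixed-width seconds and microseconds first, so
# that string comparison of two ids follows their creation time.
sub new_id {
    my ($secs, $usecs) = gettimeofday();
    return sprintf '%010d.%06d.%04d.%d', $secs, $usecs, int(rand(10000)), $$;
}

my @ids = map { new_id() } 1 .. 5;
print "$_\n" for @ids;
```

Ids created within the same microsecond are ordered by the random field, so their relative order is arbitrary.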

> Except with multiple CPUs, time with sufficient resolution is a
> satisfactory replacement for a sequence.

I don't see what difference multiple CPUs makes. Adding in pid takes  
care of that.

> It may be useful to log things like remote_port but it doesn't seem to
> help directly to solve problem 1.

Yup. I removed it now - it was stupid to add it in - I just wasn't  
thinking.

> A counter solves 2.

Consider counter = rand().

> Any tag which is unique per server solves 3.  It is probably  
> simpler to
> make this configurable by the end-user.

I've added in a basic hashed version of hostname now.

Matt.
0
matt
8/29/2007 9:50:15 PM
On Wed, 29 Aug 2007, Guy Hulbert wrote:

> 	1. A unique ID per message (on one server).
> 	2. Ability to distinguish per recipient.
> 	3. Ability to identify the server.
>
> A sequence solves (1) except for simultaneous processing of
> incoming messages via:
>
> 	a) async
> 	b) threads/multiple cpus
> 	c) local ports (possibly on multiple addresses)
>
> Except with multiple CPUs, time with sufficient resolution is a
> satisfactory replacement for a sequence.

"Except with multiple CPUs" is a big problem. OTOH, as has been mentioned 
multiple times, a four-tuple identifying the TCP connection plus a 
timestamp will be satisfactory with any number of CPUs, and with very fast 
networks.

> It may be useful to log things like remote_port but it doesn't seem to
> help directly to solve problem 1.
>
> A counter solves 2.
>
> Any tag which is unique per server solves 3.  It is probably simpler to
> make this configurable by the end-user.

A four-tuple identifying the TCP connection also identifies the server.

---
Charlie
0
charlieb
8/29/2007 9:50:28 PM
On 29-Aug-07, at 5:50 PM, Charlie Brady wrote:

> "Except with multiple CPUs" is a big problem. OTOH, as has been  
> mentioned multiple times, a four-tuple identifying the TCP  
> connection plus a timestamp will be satisfactory with any number of  
> CPUs, and with very fast networks.

pid entirely satisfies this problem.

Matt.
0
matt
8/29/2007 9:52:25 PM
On Wed, 29 Aug 2007, David Sparks wrote:

> A UUID is preferable to the other solutions because you can condense it
> down to 128 bits of binary data ... and put it in a database. :)

A UUID, OTOH, only makes collisions unlikely, not impossible, and also 
uses up entropy, which may be a relatively scarce resource.

> The other solutions are not as database friendly.

Really?

---
Charlie
0
charlieb
8/29/2007 9:52:38 PM
On Wed, 29 Aug 2007, Matt Sergeant wrote:

> On 28-Aug-07, at 11:04 PM, Charlie Brady wrote:
>
>> > Err, actually I had a brain fart. It should be remote_port.
>> 
>> No, it should be remote_IP.remote_port.local_port and should include a 
>> transaction_within_connection count. I don't think that pid adds anything.
>
> Please try any way you can to get the algorithm I've used to generate a 
> duplicate transaction id. Feel free to use your fastest hardware.

My fastest hardware isn't relevant. And I don't have any fast hardware :-)

> I've tried, and cannot conceive of any way to get a repeat with this 
> algorithm.

"This algorithm" I take to mean your proposal of time() . $$ . 
$remote_port.

That is just asserting that no single process could receive two 
connections in the same tick of time() (because if it could, it's trivial 
to arrange for them to have the same remote port). I can conceive of that 
happening, so we should do better. Use the four-tuple.

> Perhaps in 30 years maybe (when computers are that fast), but for 
> now it works well.

But not perfectly :-) Nor as well as it could with a tiny bit more effort.

---
Charlie
0
charlieb
8/29/2007 10:03:19 PM
On Wed, 2007-08-29 at 18:03 -0400, Charlie Brady wrote:
> That is just asserting that no single process could receive two 
> connections in the same tick of time() (because if it could, it's
> trivial 

Just assume that time() can have the granularity of the CPU instruction
counter[1].  However, with a 16 bit PID and 65K processors you might run
into collisions with the PID ... but I doubt anyone has a connection
machine to run qpsmtpd on.

I think time() + PID is sufficient "for now" ... unless threads share
the PID ... ( otoh, qpsmtpd is not even threaded is it ? ).


[1] In principle it could and in practice microseconds are probably
sufficient.

-- 
--gh


0
gwhulbert
8/29/2007 10:36:12 PM
On Wed, 29 Aug 2007 the voices made Matt Sergeant write:

MS> I've added in a basic hashed version of hostname now.

 Would this be a bad time to mention that people might get the idea that they 
want to run two different setups of qpsmtpd on the same server? Like one for 
incoming e-mails and one for outgoing (logging, whitelisting, preventing 
spam/viruses from exiting).

 Yeah, I saw the crypt+rand, but if something is worth doing... =)


	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/29/2007 10:38:51 PM
Peter.

I think it might help if you were to just rewrite "the requirements"
properly.  I don't have strong opinions on what the solution should be
nor what the requirements should be.  As long as the total number is
small and they are written concisely they will either converge or, if
necessary, we can vote.

On Wed, 2007-08-29 at 23:13 +0200, Peter J. Holzer wrote:
> >       1. A unique ID per message (on one server).
> 
> I'd rephrase that as "unique ID per transaction". Not every
> transaction
> results in a message (indeed, on my systems 90+% of transactions don't
> result in a message).

fine ... I was not clear on the distinction and I think the person who
started the thread has already started using "transaction ID"

> 
> 
> >       2. Ability to distinguish per recipient.
> 
> I'm not even sure what "per recipient" should mean here. Does it mean
> "per RCPT command", so that a log file looks something like this:

Yes. [ Again I'm not clear but "per RCPT command" was the previous
context I was referring to. ]

-- 
--gh


0
gwhulbert
8/29/2007 10:40:07 PM
> > my $lip = $conn->local_ip();
>     up to 15 characters (39 with IPv6)
> > my $rip = $conn->remote_ip();
>     up to 15 characters (39 with IPv6)
> > my $rport = $conn->remote_port || "0";
>     up to 5 characters
> > my $lport = $conn->local_port || "0";
>     up to 5 characters
> > my $start = time;
>     up to 16 characters
> $$
>     up to 5 characters (10 for 32bit PIDs)
> > my $id = "$$_$start_$lip:$lport_$rip:$rport";
>     5 + 1 + 16 + 1 + 15 + 1 + 5 + 1 + 15 + 1 + 5 = 66 characters
> or even
>     10 + 1 + 16 + 1 + 39 + 1 + 5 + 1 + 39 + 1 + 5 = 119 characters

Better encode it binary. E.g. for IPv4:

my $id = pack("NCNNNN",$$,$start,$lip,$lport,$rip,$rport)

Sum: 21 Bytes. Encoded in Base64: 28 Bytes.
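
A runnable sketch of the binary encoding idea for IPv4. It deliberately uses a different template from the one above, on the assumption that the intent is 32 bits each for PID, time and addresses and 16 bits for each port (a "C" field would keep only one byte). The addresses and ports are made-up example values.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);
use MIME::Base64 qw(encode_base64);

# Illustrative values; a real caller would take them from the connection.
my ($pid, $start) = ($$, time);
my ($lip, $lport) = ('192.0.2.1', 25);
my ($rip, $rport) = ('198.51.100.7', 40123);

# N = 32-bit big-endian, n = 16-bit big-endian, a4 = 4 raw bytes.
# inet_aton() already returns a dotted-quad address as 4 packed bytes.
my $bin = pack 'N N a4 n a4 n',
    $pid, $start, inet_aton($lip), $lport, inet_aton($rip), $rport;

my $id = encode_base64($bin, '');   # '' suppresses the trailing newline

printf "%d bytes binary, %d characters base64: %s\n",
    length($bin), length($id), $id;
```

With this template the binary form is 20 bytes and the Base64 form is 28 characters, including one "=" of padding.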


Regards
Michael

-- 
It's an insane world, but i'm proud to be a part of it. -- Bill Hicks
0
kju
8/29/2007 10:49:35 PM
On 8/29/07, Matt Sergeant <matt@sergeant.org> wrote:
> On 29-Aug-07, at 5:50 PM, Charlie Brady wrote:
>
> > "Except with multiple CPUs" is a big problem. OTOH, as has been
> > mentioned multiple times, a four-tuple identifying the TCP
> > connection plus a timestamp will be satisfactory with any number of
> > CPUs, and with very fast networks.
>
> pid entirely satisfies this problem.

not on multiple machines with centralized logging, which is a fairly
common design.

allan

-- 
"The truth is an offense, but not a sin"
0
kitno455
8/29/2007 11:02:53 PM
On Thu, 2007-08-30 at 00:49 +0200, Michael Holzt wrote:
> > or even
> >     10 + 1 + 16 + 1 + 39 + 1 + 5 + 1 + 39 + 1 + 5 = 119 characters
> 
> Better encode it binary. E.g. for IPv4:

And better get the number of bits correct.  An IP address is a 32 bit
integer, not 15 characters.  Although perl converts scalars on-demand,
it correctly preserves integer values.

>
> my $id = pack("NCNNNN",$$,$start,$lip,$lport,$rip,$rport)

I thought port numbers had been pretty much refuted already.

If not, it would help to add a requirement to the list and explain why
they provide the best solution.

> 
> Sum: 21 Bytes. Encoded in Base64: 28 Bytes.

For discussion, I started using sprintf("0#x%- ... ") since I expect my
logs to be text and this looks like a UUID.  The number of bits is just
4* the number of characters.  It is simple enough to switch to pack or
base64 later or use pack as an option for those who want binary logs.

So far no-one has come up with much objection to
	sprintf("%08x-%08x-%04x",gettimeofday,$$)
This satisfies requirements (1) and (4) and Matt says it works for him.

If we need a 16-bit, RCPT counter, req (2), then:
	sprintf("%08x-%08x-%04x-%04x",gettimeofday,$$,$RCPT)

If we need a host ID, req (3), then using IP-Addr (IPv4 only):
	sprintf("%08x-%08x-%04x-%04x-%08x",gettimeofday,$$,$RCPT,$IP)
though I think a substring of config('me') might do the job and would be
IPv-agnostic.  At this point we are at 128 bits (16 bytes) - same as
UUID.


-- 
--gh


0
gwhulbert
8/29/2007 11:15:37 PM
On Wed, 2007-08-29 at 19:15 -0400, Guy Hulbert wrote:
[snip]
> And better get the number of bits correct.  An IP address is a 32 bit
> integer, not 15 characters.  Although perl converts scalars on-demand,

I should have said 'unsigned'.

[snip]
> IPv-agnostic.  At this point we are at 128 bits (16 bytes) - same as
> UUID.

guy@cal:~$ ./quid
46d604d6-000827f7-4adc-00000000-c0a80005

------- 'quid' = qpsmtpd uid -----
#!/usr/bin/perl -w

use strict;
use Time::HiRes qw( gettimeofday );
use Socket;
use vars qw ( $rcpt $iaddr $me );
$me='cal';
$iaddr = unpack('N',((gethostbyname($me))[4]));
$rcpt = 0;
print sprintf("%08x-%08x-%04x-%08x-%08x\n",gettimeofday,$$,$rcpt,$iaddr);
----------------------------------

Requirements
 1. Unique transaction ID (gettimeofday, $$)
 2. RCPT counter ($rcpt)
 3. Multiple hosts ($iaddr)
 4. Format (e.g. %0#x-)

[snip]
-- 
--gh


0
gwhulbert
8/29/2007 11:51:02 PM
On 29-Aug-07, at 6:03 PM, Charlie Brady wrote:

> That is just asserting that no single process could receive two  
> connections in the same tick of time() (because if it could, it's  
> trivial to arrange for them to have the same remote port). I can  
> conceive of that happening, so we should do better. Use the four- 
> tuple.

Just because you can conceive of it doesn't make it so. I can  
conceive of flying monkeys too.

And yes, remote_port was dumb. It's gone now.

Matt.

0
matt
8/30/2007 12:20:35 AM
On 29-Aug-07, at 6:38 PM, Tony L. Svanstrom wrote:

> On Wed, 29 Aug 2007 the voices made Matt Sergeant write:
>
> MS> I've added in a basic hashed version of hostname now.
>
>  Would this be a bad time to mention that people might get the idea  
> that they
> want to run two different setups of qpsmtpd on the same server?

No that's fine. PID is still in there taking care of that.

Matt.
0
matt
8/30/2007 12:21:22 AM
On 29-Aug-07, at 7:02 PM, m. allan noah wrote:

> On 8/29/07, Matt Sergeant <matt@sergeant.org> wrote:
>> On 29-Aug-07, at 5:50 PM, Charlie Brady wrote:
>>
>>> "Except with multiple CPUs" is a big problem. OTOH, as has been
>>> mentioned multiple times, a four-tuple identifying the TCP
>>> connection plus a timestamp will be satisfactory with any number of
>>> CPUs, and with very fast networks.
>>
>> pid entirely satisfies this problem.
>
> not on multiple machines with centralized logging, which is a fairly
> common design.

Hostname is also part of the id (hashed down to a few chars).

Matt.

0
matt
8/30/2007 12:22:38 AM
On 2007-08-29 18:36:12 -0400, Guy Hulbert wrote:
> On Wed, 2007-08-29 at 18:03 -0400, Charlie Brady wrote:
> > That is just asserting that no single process could receive two
> > connections in the same tick of time() (because if it could, it's
> > trivial
>
> Just assume that time() can have the granularity of the CPU instruction
> counter[1].

It could (if your perl implementation uses 128 bit long doubles), but it
isn't guaranteed to have that. You have to plan for the worst case and
that's probably a 60 Hz counter.

Here are some (measured) resolutions of gettimeofday on various systems:

Linux/i386:      1 ms
Linux/SPARC:     2 ms
HP-UX/PA-RISC:   2 ms
Linux/Alpha:   976 ms (1024 Hz)

Ok, so the Alpha is obsolete, and Sun and HP hardware seems to include a
timer with reasonably high resolution (both systems are a bit old I'd
expect newer gear get . I don't know anything about
PowerPC hardware though, and maybe we should worry about ARM for
embedded devices (although it could be argued that this is the least
problem for anybody building a mail-toaster on very small hardware).
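
For anyone who wants to check their own platform, here is one way such a measurement could be made. It is an assumption about method, not necessarily how the figures above were obtained: it reports the smallest non-zero step seen between consecutive gettimeofday() calls, which is an upper bound on the resolution because it includes call overhead. The function name min_step_usec is made up for this sketch.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Estimate the clock resolution as the smallest non-zero step seen
# between consecutive gettimeofday() calls, in microseconds.
sub min_step_usec {
    my ($samples) = @_;
    my $min;
    my ($ps, $pu) = gettimeofday();
    for (1 .. $samples) {
        my ($s, $u) = gettimeofday();
        my $delta = ($s - $ps) * 1_000_000 + ($u - $pu);
        next if $delta <= 0;   # clock has not advanced yet (or stepped back)
        $min = $delta if !defined $min || $delta < $min;
        ($ps, $pu) = ($s, $u);
    }
    return $min;   # undef if the clock never advanced
}

my $step = min_step_usec(100_000);
my $msg  = defined($step)
    ? "smallest observed step: $step us"
    : "clock did not advance during the test";
print "$msg\n";
```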


> However, with a 16 bit PID and 65K processors you might run
> into collisions with the PID ...

I don't see how that follows. The PID still has to be unique at any
particular time. If a system can run more than 32k processes in parallel
it must use a 32 bit PID.

The combination of hires time and pid is more likely to be non-unique
for an async server. It might be possible to call accept() and
gettimeofday() twice within the same microsecond.

> but I doubt anyone has a connection machine to run qpsmtpd on.
>
> I think time() + PID is sufficient "for now" ... unless threads share
> the PID ...

They do on most systems - but you could use the TID instead of the PID.

> ( otoh, qpsmtpd is not even threaded is it ? ).

It might be possible to run Apache::Qpsmtpd on a multithreaded Apache.

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/30/2007 8:08:36 AM
On 2007-08-30 10:08:36 +0200, Peter J. Holzer wrote:
> Here are some (measured) resolutions of gettimeofday on various systems:
>
> Linux/i386:      1 ms
> Linux/SPARC:     2 ms
> HP-UX/PA-RISC:   2 ms
> Linux/Alpha:   976 ms (1024 Hz)
>
> Ok, so the Alpha is obsolete, and Sun and HP hardware seems to include a
> timer with reasonably high resolution (both systems are a bit old I'd
> expect newer gear get .

The sentence in the parentheses was supposed to read: "both systems are
a bit old - I'd expect newer gear to get full microsecond resolution".
Don't know how I managed to garble it that badly.

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/30/2007 8:33:37 AM
On Wed, 29 Aug 2007 the voices made Matt Sergeant write:

> On 29-Aug-07, at 6:38 PM, Tony L. Svanstrom wrote:
> 
> > On Wed, 29 Aug 2007 the voices made Matt Sergeant write:
> > 
> > MS> I've added in a basic hashed version of hostname now.
> > 
> > Would this be a bad time to mention that people might get the idea that 
> > they want to run two different setups of qpsmtpd on the same server?
> 
> No that's fine. PID is still in there taking care of that.

 True, but the code makes both the security guy and the programmer in me 
twitch...

 The part of the unique ID meant to identify the server is now focusing on the 
OS/computer instead of the instance of qpsmtpd; which one can only get away 
with as the PID is in the connection ID-part, and thus we shouldn't get more
collisions just because we run more than one instance on the same server.

 However, this is not only (currently) an undocumented and somewhat unobvious 
feature of the ID-generation, but it's also an unnecessary limitation.
 If people ever were to remove the PID, maybe as soon as at the end of this 
discussion, they might not think about fixing the $SALT_HOST.

 Using the IPs + port ought to be the way to go.


	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/30/2007 8:45:58 AM
On Thu, 2007-08-30 at 10:08 +0200, Peter J. Holzer wrote:
> On 2007-08-29 18:36:12 -0400, Guy Hulbert wrote:
[snip]
> > Just assume that time() can have the granularity of the CPU instruction
> > counter[1].
> 
> It could (if your perl implementation uses 128 bit long doubles), but it

Or you could have gettimeofday return 3 ints (sec, nano-sec, atto-sec)
instead of 2.

> isn't guaranteed to have that. You have to plan for the worst case and
> that's probably a 60 Hz counter.

Or you can just warn that the transaction ID may be broken on some
systems.  Does it provide some critical internal function or is it just
for logging ?

Or we can provide some alternate hack.

> 
> Here are some (measured) resolutions of gettimeofday on various systems:
> 
> Linux/i386:      1 ms
> Linux/SPARC:     2 ms
> HP-UX/PA-RISC:   2 ms
> Linux/Alpha:   976 ms (1024 Hz)

'ms' is usually milli-seconds but it appears you mean micro-seconds ( I
pretend that u=mu and write it 'us' ).

The alpha is a problem then.  However, Time::HiRes seems to be over 10
years old ... are the alpha boxes still being sold ?



[snip]
> > However, with a 16 bit PID and 65K processors you might run
> > into collisions with the PID ...
> 
> I don't see how that follows. The PID still has to be unique at any
> particular time. If a system can run more than 32k processes in parallel
> it must use a 32 bit PID. 

Doh.  Yeah, iirc it's been 32 bits on AIX since 1992.

[snip]
> > but I doubt anyone has a connection machine to run qpsmtpd on.
> > 
> > I think time() + PID is sufficient "for now" ... unless threads share
> > the PID ...
> 
> They do on most systems - but you could use the TID instead of the PID.

Yup.

> 
> > ( otoh, qpsmtpd is not even threaded is it ? ).
> 
> It might be possible to run Apache::Qpsmtpd on a multithreaded Apache.

Unlikely.  PHP people still won't bless mt apache.  Postgres people
discovered a problem with crypt() - from libc - you must not use crypt()
passwords with Pg on mt apache (the problem is only seen with very high
loads though).

> 
> 	hp

-- 
--gh


0
gwhulbert
8/30/2007 11:07:51 AM
On Thu, 2007-08-30 at 10:45 +0200, Tony L. Svanstrom wrote:
> > > Would this be a bad time to mention that people might get the idea that 
> > > they want to run two different setups of qpsmtpd on the same server?
> > 
> > No that's fine. PID is still in there taking care of that.
> 
>  True, but the code makes both the security guy and the programmer in me 
> twitch...

rotfl

> 
>  The part of the unique ID meant to identify the server is now focusing on the 
> OS/computer instead of the instance of qpsmtpd; which one can only get away 
> with as the PID is in the connection ID-part, and thus we shouldn't get more
> collisions just because we run more than one instance on the same server.
> 
>  However, this is not only (currently) an undocumented and somewhat unobvious 
> feature of the ID-generation, but it's also an unnecessary limitation.
>  If people ever were to remove the PID, maybe as soon as at the end of this 
> discussion, they might not think about fixing the $SALT_HOST.

wtf does this mean - the *purpose* of the discussion is to *fix* a
*unique* transaction ID when the discussion is over it is *fixed* and
the discussion *documents* the implementation.

What do you mean "people ever were to remove the PID" ?  If you make
random changes to any piece of code it's going to break.

> 
>  Using the IPs + port ought to be the way to go.

Please clarify.

Given sufficient resolution in the time(), you cannot have two processes
on the CPU for the same transaction ID.  I thought there might be a
problem if you have multiple CPUs but Peter H. has pointed out that the
PIDs must be different in that case.

You need to have context-switching faster than the clock resolution for
collisions in time() -- Peter has shown that the clock resolution is
(close to) 1 us for all likely systems other than the alpha.


> 

-- 
--gh


0
gwhulbert
8/30/2007 11:23:51 AM
On 30-Aug-07, at 4:45 AM, Tony L. Svanstrom wrote:

>  True, but the code makes both the security guy and the programmer  
> in me
> twitch...

Well, don't think of it for security then :-)

>  The part of the unique ID meant to identify the server is now  
> focusing on the
> OS/computer instead of the instance of qpsmtpd;

Not really. It uses a random salt. So every instance will be different.

Matt.
0
matt
8/30/2007 1:14:36 PM
On Thu, 2007-08-30 at 09:14 -0400, Matt Sergeant wrote:
> >  The part of the unique ID meant to identify the server is now  

Is this "unique ID" the "transaction ID" we've been discussing.

Has someone already implemented it in svn - I thought it was a new
proposal (I'm just a bit confused here) ?

> > focusing on the
> > OS/computer instead of the instance of qpsmtpd;
> 
> Not really. It uses a random salt. So every instance will be
> different.

That is not true.  Random numbers do not give unique results.  Also,
hash functions have collisions.  This is not a problem when using a hash
in perl because there is a collision-resolution mechanism.  For the
requirement of logging multiple independent qpsmtpd servers to a central
point there is no trivial mechanism to compare the results of the hash
function so you must use a predictable function on something unique to
the server.

The IP address (for IPv6 the 32 most-significant bits would probably
work) is one choice.  However, I think it might be better to use a value
derived from config('me') but it cannot be a hash.  A suitable
non-random choice might be substr(config('me')) padded with '_' to a
fixed length.  Since the sysadmin has to configure qpsmtpd to use it,
he can make sure that his configurations will work together (if he
cares).

-- 
--gh


0
gwhulbert
8/30/2007 1:34:38 PM
On 30-Aug-07, at 9:34 AM, Guy Hulbert wrote:

> On Thu, 2007-08-30 at 09:14 -0400, Matt Sergeant wrote:
>>>  The part of the unique ID meant to identify the server is now
>
> Is this "unique ID" the "transaction ID" we've been discussing.

Yes.

> Has someone already implemented it in svn - I thought it was a new
> proposal (I'm just a bit confused here) ?

Yes, it's in svn.

>>> focusing on the
>>> OS/computer instead of the instance of qpsmtpd;
>>
>> Not really. It uses a random salt. So every instance will be
>> different.
>
> That is not true.  Random numbers do not give unique results.

True enough. But I'm going out on a limb to assume that it's good  
enough for logging. It's not a security feature.

Matt.
0
matt
8/30/2007 2:01:27 PM
On Thu, 30 Aug 2007 the voices made Guy Hulbert write:

GH> wtf does this mean - the *purpose* of the discussion is to *fix* a
GH> *unique* transaction ID when the discussion is over it is *fixed* and
GH> the discussion *documents* the implementation.

 I meant undocumented as in it in Transaction.pm currently says "Generate 
unique id" without mentioning that the earlier defined $SALT_HOST relies on 
certain aspects of the ID-generation, without which the $id might not be unique 
in cases where there's more than one instance of qpsmtpd running on a single 
server. (Or on two different servers with the same hostname, which isn't 
exactly unheard of; it happens both by mistake and by design, for instance if 
setting up a testserver... which you still might want to use with whatever 
centralized logging you've got.)

GH> What do you mean "people ever were to remove the PID" ?  If you make
GH> random changes to any piece of code it's going to break.

 Random changes yes, but as this discussion has clearly shown it isn't 
unreasonable to consider creating unique IDs without using the PID 
(incrementing counter etc); and it isn't unreasonable to view the transaction 
ID and the server ID as two separate things, which combined creates a 
(hopefully) universally unique ID. Even the (current) code structure reflects 
such thinking.

 To then use a server ID that I think everyone on this list can agree on has a 
lesser chance of being unique, esp. if minor changes are made to it, isn't as 
future/idiot-proof as it easily could be; and if it's easily done at least I 
prefer to write code that minimizes the chances that people will mess up when 
working with it.
 It's enough that someone removes the crypt+rand to easier search the logs for 
this solution (hostname-based) to theoretically start creating trouble/break 
(well, at least crack slightly in a corner or two).

GH> >  Using the IPs + port ought to be the way to go.
GH> 
GH> Please clarify.

 To qpsmtpd the hostname isn't as unique as the IPs + port used by it is.

 Actually, although IPs+port IMHO is better than hostname it was silly of me to 
say that it "ought to be the way to go", as it doesn't deal with special-use 
addresses well enough... but it'd be easy to catch those and 
create/output a warning.

 I think I'll exit the discussion here; you can battle it out among yourselves, 
and if I'm unhappy with the results I'll just show up with some code and 
restart the fire... ;-)

GH> Given sufficient resolution in the time(), you cannot have two processes
GH> on the CPU for the same transaction ID.  I thought there might be a
GH> problem if you have multiple CPUs but Peter H. has pointed out that the
GH> PIDs must be different in that case.
GH> 
GH> You need to have context-switching faster than the clock resolution for
GH> collisions in time() -- Peter has shown that the clock resolution is
GH> (close to) 1 us for all likely systems other than the alpha.

 Or the same hostname on a second server, which is something we shouldn't rule 
out...



	/Tony
-- 
"Generally speaking, taunting mentally unstable people is a bad idea."
0
tony
8/30/2007 2:07:52 PM
On Thu, 2007-08-30 at 10:01 -0400, Matt Sergeant wrote:
> > That is not true.  Random numbers do not give unique results.
> 
> True enough. But I'm going out on a limb to assume that it's good  
> enough for logging. It's not a security feature.

But this (by design[*]) doesn't meet the requirement.

The (ok, one) purpose of logging is to be able to trace the results of
running the service and if your hash collides ALL the messages from the
two servers where it collides will be ambiguous (by source).  Using a
non-random and predictable function on config('me') allows the user to
avoid this problem (without modifying the core code).

[*] As opposed to the implementation - where Peter has pointed out some
limitations of Time::HiRes on one old platform.

Thanks for the clarification on svn ... I'll have to check it out (but
not today) to see it.

> 
> Matt.
-- 
--gh


0
gwhulbert
8/30/2007 2:13:19 PM
On Thu, 2007-08-30 at 16:07 +0200, Tony L. Svanstrom wrote:
>  To qpsmtpd the hostname isn't as unique as the IPs + port used by it
> is.

But for qpsmtpd the "hostname" is configurable ( config('me') ).  As
long as a hash is not used (see my follow-up to Matt) and the function
used is documented, e.g.: substr(config('me') . '_' x 8, 0, 8) so:

me = linux1
	-> linux1__

me = linux2.example.com
 	-> linux2.e

If you run two instances you can call them 'thing1' and 'thing2'.
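
A small sketch of that padding function, to show it produces the two tags listed above. server_tag() is a name invented for this example, and the value of config('me') is passed in as a plain string so the snippet runs outside qpsmtpd.

```perl
use strict;
use warnings;

# Hypothetical helper: first $width characters of the configured name,
# right-padded with '_' to a fixed length. No hashing is involved, so the
# result is predictable and the sysadmin can keep it unique across servers.
sub server_tag {
    my ($me, $width) = @_;
    $width = 8 unless defined $width;
    return substr($me . ('_' x $width), 0, $width);
}

print server_tag('linux1'), "\n";               # linux1__
print server_tag('linux2.example.com'), "\n";   # linux2.e
```

Two names that share their first eight characters would get the same tag, so the width (or the names) would have to be chosen with that in mind.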

-- 
--gh


0
gwhulbert
8/30/2007 2:18:03 PM
On 30-Aug-07, at 10:07 AM, Tony L. Svanstrom wrote:

> On Thu, 30 Aug 2007 the voices made Guy Hulbert write:
>
> GH> wtf does this mean - the *purpose* of the discussion is to *fix* a
> GH> *unique* transaction ID when the discussion is over it is *fixed* and
> GH> the discussion *documents* the implementation.
>
>  I meant undocumented as in it in Transaction.pm currently says "Generate
> unique id" without mentioning that the earlier defined $SALT_HOST relies on
> certain aspects of the ID-generation, without which the $id might not be
> unique in cases where there's more than one instance of qpsmtpd running on
> a single server.

Including PID takes care of that. And you're assuming a broken srand()
too.

Admittedly, there's a very very remote freak possibility that given  
two identical hostnames, a rand() with a broken srand(), and those  
servers starting at the exact same microsecond time with the exact  
same PID, that you MIGHT, just MAYBE, get a duplicate transaction id.

It seems to me the only alternative that would satisfy your
security-paranoid mind is Data::UUID, which is an extra dependency I
don't want to add in.

Matt.
0
matt
8/30/2007 2:30:56 PM
On Thu, 2007-08-30 at 10:30 -0400, Matt Sergeant wrote:
> On 30-Aug-07, at 10:07 AM, Tony L. Svanstrom wrote:
> 
> > On Thu, 30 Aug 2007 the voices made Guy Hulbert write:
[snip]
> > GH> the discussion *documents* the implementation.
> >
> >  I meant undocumented as in it in Transaction.pm currently says 

In principle, the documentation will be updated when the discussion is
complete.

> > "Generate unique id" without mentioning that the earlier defined
> > $SALT_HOST relies on certain aspects of the ID-generation, without
> > which the $id might not be unique in cases where there's more than
> > one instance of qpsmtpd running on a single server.
> 
> Including PID takes care of that. And you're assuming a broken srand()
> too.
> 
> Admittedly, there's a very very remote freak possibility that given  
> two identical hostnames, a rand() with a broken srand(), and those  
> servers starting at the exact same microsecond time with the exact  
> same PID, that you MIGHT, just MAYBE, get a duplicate transaction id.

Nope.  I reject this.  The design ASSUMES that the clock has "sufficient
resolution".  It is the implementation which chooses Time::HiRes.  There
are two "perfect" solutions (bikesheds ;-):

1. Use a timer based directly on the values in the instruction count
register.  IIRC, the linux kernel clock (at least on intel) just
quantizes this in either micro- or nano- seconds. [bikeshed = kernel
patch]

2. Implement our own "clock" using a sequence generator, which reads the
last value out of the tail of the log on startup (and is
thread/async-safe).
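
Option 2 could be sketched roughly as follows, in Python for illustration
only. It assumes a hypothetical log format in which each line starts with a
decimal sequence number, reads every line handed to it rather than only the
tail, and makes no attempt at the thread/async safety mentioned above:

```python
import itertools
from typing import Iterable, Iterator


def last_sequence(lines: Iterable[str]) -> int:
    """Return the sequence number of the last line that starts with one, else -1."""
    last = -1
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0].isdecimal():
            last = int(fields[0])
    return last


def resume_sequence(lines: Iterable[str]) -> Iterator[int]:
    """Continue counting after the last sequence number found in the old log."""
    return itertools.count(last_sequence(lines) + 1)
```

On startup one would pass the open log file (or just its last few lines) to
`resume_sequence` and call `next()` on the result for each new id.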

I think that using PID is a bit of a hack but it seems to work in every
case that anyone has come up with.  It should be changed to TID, should
qpsmtpd ever be blessed as "thread-safe" but I'm not holding my breath
for that to happen ;-) ... besides, async is a much better choice
(compare lighttpd with apache).

> 
> The alternative seems to me the only way to satisfy your security  
> paranoid mind is to use Data::UUID, which is an extra dependency I 

I think the use of the adjective "security" in this context is rather
generous.

>  
> don't want to add in.
> 
> Matt.

-- 
--gh


0
gwhulbert
8/30/2007 2:57:55 PM
Woah - bikeshedding galore!

I just got my email downloaded to my mac (I'm traveling) and Mail.app  
says there are 61 mails in this thread (plus those I deleted  
earlier!?!).

Enough already.

If anyone has a serious realistic concern with what Matt did, please  
provide a perl implementation of mod_unique_id from Apache -  
otherwise then let's leave this alone for now.


  - ask
0
ask
8/30/2007 4:59:44 PM
Guy Hulbert wrote:

> me = linux1
> 	-> linux1__
> 
> me = linux2.example.com
>  	-> linux2.e
> 
> If you run two instances you can call them 'thing1' and 'thing2'.
> 
I'd rather not.

-- 
JT Moree
0
jtmoree
8/30/2007 5:37:53 PM
On Fri, 2007-08-31 at 00:59 +0800, Ask Bjørn Hansen wrote:
> Woah - bikeshedding galore!
> 
> I just got my email downloaded to my mac (I'm traveling) and Mail.app  
> says there are 61 mails in this thread (plus those I deleted  
> earlier!?!).
> 
> Enough already.

There might have been a little less chat if he'd posted the code to the
list ... fwiw, here it is.

> 
> If anyone has a serious realistic concern with what Matt did, please  

http://svn.perl.org/qpsmtpd/trunk/lib/Qpsmtpd/Transaction.pm

  # Generate unique id
  # use gettimeofday for microsec precision
  # add in rand() in case gettimeofday clock is slow (e.g. bsd?)
  # add in $$ in case srand is set per process
  my ($start, $mstart) = gettimeofday();
  my $id = sprintf("%d.%06d.%s.%d.%d",
      $start,
      $mstart,
      $SALT_HOST, 
      rand(10000),
      $$,
  );



> provide a perl implementation of mod_unique_id from Apache -  
> otherwise then let's leave this alone for now.
> 
> 
>   - ask

-- 
--gh


0
gwhulbert
8/30/2007 5:53:47 PM
On Thu, 2007-08-30 at 13:53 -0400, Guy Hulbert wrote:
> On Fri, 2007-08-31 at 00:59 +0800, Ask Bjørn Hansen wrote:
> > Woah - bikeshedding galore!
> > 
> > I just got my email downloaded to my mac (I'm traveling) and Mail.app  
> > says there are 61 mails in this thread (plus those I deleted  
> > earlier!?!).
> > 
> > Enough already.
> 
> There might have been a little less chat if he'd posted the code to the
> list ... fwiw, here it is.
> 
> > 
> > If anyone has a serious realistic concern with what Matt did, please  
> 
> http://svn.perl.org/qpsmtpd/trunk/lib/Qpsmtpd/Transaction.pm
> 

Sorry, I missed this bit:

        my $SALT_HOST => crypt(hostname, chr(65+rand(57)).chr(65+rand(57)));
        $SALT_HOST =~ tr/A-Za-z0-9//cd;
        

>   # Generate unique id
>   # use gettimeofday for microsec precision
>   # add in rand() in case gettimeofday clock is slow (e.g. bsd?)
>   # add in $$ in case srand is set per process
>   my ($start, $mstart) = gettimeofday();
>   my $id = sprintf("%d.%06d.%s.%d.%d",
>       $start,
>       $mstart,
>       $SALT_HOST, 
>       rand(10000),
>       $$,
>   );
> 
> 
> 
> > provide a perl implementation of mod_unique_id from Apache -  
> > otherwise then let's leave this alone for now.
> > 
> > 
> >   - ask

-- 
--gh


0
gwhulbert
8/30/2007 5:55:08 PM
On 30-Aug-07, at 10:57 AM, Guy Hulbert wrote:

> Nope.  I reject this.  The design ASSUMES that the clock has  
> "sufficient
> resolution".  It is the implementation which chooses Time::HiRes.

Fine, so on Alpha, you have a qpsmtpd installation that is using  
async and doing more than 1000 mails/second? And given that it has  
rand(10000) in there, you also need a rand() collision in that  
millisecond. You're reaching for a problem.

On "normal" platforms the minimum granularity is on the order of 1  
billion mails/sec. Let me know when you're building the single CPU  
system that can do that, I'd like to buy one.

Note that mod_unique_id is only designed for 64k hits/sec.

Matt.
0
matt
8/30/2007 6:19:18 PM
Ask asked us to stop ... but what the heck ;-).

Perhaps we should drop the list after this one though.

On Thu, 2007-08-30 at 14:19 -0400, Matt Sergeant wrote:
> On 30-Aug-07, at 10:57 AM, Guy Hulbert wrote:
> 
> > Nope.  I reject this.  The design ASSUMES that the clock has  
> > "sufficient
> > resolution".  It is the implementation which chooses Time::HiRes.
> 
> Fine, so on Alpha, you have a qpsmtpd installation that is using  

First, what I'm saying, is that I don't think we should be particularly
worried about an almost obsolete platform.  Also, I am quite happy with
whatever you decide as long as it reflects the requirements that
everyone has requested (which it seems to do).

> async and doing more than 1000 mails/second? And given that it has  
> rand(10000) in there, you also need a rand() collision in that

However.

Nope.

The problem with random number generators is that their output is
*random*.  That means that you will occasionally get results very close
together and when you quantize it (e.g. rand(100000)) it means you will
get the same number consecutively.  This is exactly what you do not want
when your problem is insufficiently resolved times.  You'd be better off
using a block-cipher (e.g. DES) which scatters results *uniformly*.

But either case is a hack so rand() will do since it's available.
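
To put a rough number on how often a quantized random value repeats, here
is a small birthday-problem sketch in Python (illustration only). It assumes
the draws are independent and uniform over `buckets` values, which a real
rand() only approximates; `collision_probability` is a hypothetical helper:

```python
def collision_probability(draws: int, buckets: int = 10000) -> float:
    """Probability that at least two of `draws` uniform picks from `buckets` coincide."""
    if draws > buckets:
        return 1.0  # pigeonhole: more draws than distinct values
    p_distinct = 1.0
    for i in range(draws):
        p_distinct *= (buckets - i) / buckets
    return 1.0 - p_distinct
```

With two ids generated in the same clock tick the chance of a clash is
1 in 10000; with a hundred in the same tick it is already around 0.39.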

Actually, I think the right answer is just a sequence generator (mod
10000).  That guarantees different consecutive results.  In python you
could just use an iterator ... I'm not sure about perl.
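
The Python iterator mentioned above might look like this (a sketch only;
`sequence_mod` is a hypothetical name):

```python
import itertools
from typing import Iterator


def sequence_mod(modulus: int = 10000) -> Iterator[int]:
    """Yield 0, 1, ..., modulus - 1 and then wrap around to 0 again."""
    return itertools.cycle(range(modulus))


if __name__ == "__main__":
    seq = sequence_mod()
    print([next(seq) for _ in range(3)])
```

Consecutive values always differ as long as the modulus is greater than 1,
and a value can only repeat after `modulus` ids have been handed out.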

Have you read Knuth on random number generators ?  It's quite amusing.

>   
> millisecond. You're reaching for a problem.
> On "normal" platforms the minimum granularity is on the order of 1  
> billion mails/sec. Let me know when you're building the single CPU  
> system that can do that, I'd like to buy one.
> 
> Note that mod_unique_id is only designed for 64k hits/sec.
-- 
--gh


0
gwhulbert
8/30/2007 6:52:10 PM
On 2007-08-30 13:55:08 -0400, Guy Hulbert wrote:
> On Thu, 2007-08-30 at 13:53 -0400, Guy Hulbert wrote:
> > On Fri, 2007-08-31 at 00:59 +0800, Ask Bjørn Hansen wrote:
> > > Woah - bikeshedding galore!
> > >
> > > I just got my email downloaded to my mac (I'm traveling) and Mail.app
> > > says there are 61 mails in this thread (plus those I deleted
> > > earlier!?!).
> > >
> > > Enough already.
> >
> > There might have been a little less chat if he'd posted the code to the
> > list ... fwiw, here it is.
> >
> > >
> > > If anyone has a serious realistic concern with what Matt did, please
> >
> > http://svn.perl.org/qpsmtpd/trunk/lib/Qpsmtpd/Transaction.pm
> >
>
> Sorry, I missed this bit:
>
>         my $SALT_HOST => crypt(hostname, chr(65+rand(57)).chr(65+rand(57)));
                        ^^
			And Matt apparently missed this ;-).

>         $SALT_HOST =~ tr/A-Za-z0-9//cd;
>
>
> > >   # Generate unique id
> > >   # use gettimeofday for microsec precision
> > >   # add in rand() in case gettimeofday clock is slow (e.g. bsd?)
> > >   # add in $$ in case srand is set per process
> > >   my ($start, $mstart) = gettimeofday();
> > >   my $id = sprintf("%d.%06d.%s.%d.%d",
> > >       $start,
> > >       $mstart,
> > >       $SALT_HOST,
> > >       rand(10000),
> > >       $$,
> > >   );

Anyway, I do have serious realistic concerns with that. To demonstrate
them I modified logging/warn slightly:

--- plugins/logging/warn        2007-08-30 21:16:13.000000000 +0200
+++ plugins/logging/transaction_id      2007-08-30 20:51:41.000000000 +0200
@@ -31,7 +31,7 @@
   return DECLINED if defined $plugin and $plugin eq $self->plugin_name;
 
   warn
-    join(" ", $$ .
+    join(" ", ($transaction ? $transaction->id : "???") .
          (defined $plugin ? " $plugin plugin:" :
           defined $hook   ? " running plugin ($hook):"  : ""),
          @log), "\n"


And here is some sample output (at level LOGINFO):

     1  1188500891.616465.eUqA14oEas5n2.2440.24861 Loaded Qpsmtpd::Plugin::logging::transaction_id=HASH(0x87d2c00)
     2  1188500891.616465.eUqA14oEas5n2.2440.24861 Listening on 0.0.0.0:2525
     3  1188500891.616465.eUqA14oEas5n2.2440.24861 Running as user hjp, group hjp
     4  1188500891.616465.eUqA14oEas5n2.2440.24861 Initializing spool_dir
     5  1188500891.616465.eUqA14oEas5n2.2440.24861 Permissions on spool_dir /home/hjp/tmp/ are not 0700
     6  1188500891.616465.eUqA14oEas5n2.2440.24861 size_threshold set to 0
     7  1188500891.616465.eUqA14oEas5n2.2440.24861 Accepted connection 0/15 from 127.0.0.1 / localhost
     8  1188500891.616465.eUqA14oEas5n2.2440.24861 Connection from localhost [127.0.0.1]
     9  1188500891.616465.eUqA14oEas5n2.2440.24861 check_earlytalker plugin: remote host said nothing spontaneous, proceeding
    10  1188500891.616465.eUqA14oEas5n2.2440.24861 220 hrunkner ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.
    11  1188500891.616465.eUqA14oEas5n2.2440.24861 dispatching ehlo foo
    12  1188500891.616465.eUqA14oEas5n2.2440.24861 250-hrunkner Hi localhost [127.0.0.1]
    13  1188500891.616465.eUqA14oEas5n2.2440.24861 250-PIPELINING
    14  1188500891.616465.eUqA14oEas5n2.2440.24861 250-8BITMIME
    15  1188500891.616465.eUqA14oEas5n2.2440.24861 250 AUTH PLAIN LOGIN CRAM-MD5
    16  1188500891.616465.eUqA14oEas5n2.2440.24861 dispatching mail from:<>
    17  1188500925.148306.eUqA14oEas5n2.6284.24865 full from_parameter: from:<>

The transaction id changes here (as it should), but how are we to know
that line 17 belongs to the same connection as line 16? Here there is
only one parallel connection (I am awful at speaking SMTP in parallel
:-), but if there are several it could happen that the "dispatching
mail" and "full from_parameter" lines of multiple connections are
interleaved. Then you lose the information about the client.


    18  1188500925.148306.eUqA14oEas5n2.6284.24865 from email address : [<>]
    19  1188500925.148306.eUqA14oEas5n2.6284.24865 getting mail from <>
    20  1188500925.148306.eUqA14oEas5n2.6284.24865 250 <>, sender OK - how exciting to get mail from you!
    21  1188500925.148306.eUqA14oEas5n2.6284.24865 dispatching rcpt to:<bla>
    22  1188500925.148306.eUqA14oEas5n2.6284.24865 to email address : [<bla>]
    23  1188500925.148306.eUqA14oEas5n2.6284.24865 501 could not parse recipient
    24  1188500925.148306.eUqA14oEas5n2.6284.24865 dispatching rset
    25  1188500939.224301.eUqA14oEas5n2.2814.24865 250 OK

Similar to the above. The transaction id changes. Here the process id
stays the same because I'm using forkserver, but with -async all
connections use the same pid, so that doesn't help.


    26  1188500939.224301.eUqA14oEas5n2.2814.24865 dispatching mail from:<somebody@example.net>
    27  1188500988.137621.eUqA14oEas5n2.4740.24865 full from_parameter: from:<somebody@example.net>
    28  1188500988.137621.eUqA14oEas5n2.4740.24865 from email address : [<somebody@example.net>]
    29  1188500988.137621.eUqA14oEas5n2.4740.24865 getting mail from <somebody@example.net>
    30  1188500988.137621.eUqA14oEas5n2.4740.24865 250 <somebody@example.net>, sender OK - how exciting to get mail from you!
    31  1188500988.137621.eUqA14oEas5n2.4740.24865 dispatching rcpt to:<foo@localhost>
    32  1188500988.137621.eUqA14oEas5n2.4740.24865 to email address : [<foo@localhost>]
    33  1188500988.137621.eUqA14oEas5n2.4740.24865 550 Relaying denied (#5.7.1)
    34  1188500988.137621.eUqA14oEas5n2.4740.24865 dispatching quit
    35  1188500988.137621.eUqA14oEas5n2.4740.24865 221 hrunkner closing connection. Have a wonderful day.
    36  1188500988.137621.eUqA14oEas5n2.4740.24865 click, disconnecting
    37  1188500891.616465.eUqA14oEas5n2.2440.24861 cleaning up after 24865

Ok, we're back in the parent now and using the same transaction id
again.

    38  1188500891.616465.eUqA14oEas5n2.2440.24861 Accepted connection 0/15 from 127.0.0.1 / localhost
    39  1188500891.616465.eUqA14oEas5n2.2440.24861 Connection from localhost [127.0.0.1]
    40  1188500891.616465.eUqA14oEas5n2.2440.24861 check_earlytalker plugin: remote host said nothing spontaneous, proceeding

And here we've accepted the next connection and forked and the
transaction id is still the same - so the beginnings of each connection
(up to the first MAIL command) cannot be distinguished.

    41  1188500891.616465.eUqA14oEas5n2.2440.24861 220 hrunkner ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.
    42  1188500891.616465.eUqA14oEas5n2.2440.24861 dispatching ehlo foo
    43  1188500891.616465.eUqA14oEas5n2.2440.24861 250-hrunkner Hi localhost [127.0.0.1]
    44  1188500891.616465.eUqA14oEas5n2.2440.24861 250-PIPELINING
    45  1188500891.616465.eUqA14oEas5n2.2440.24861 250-8BITMIME
    46  1188500891.616465.eUqA14oEas5n2.2440.24861 250 AUTH PLAIN LOGIN CRAM-MD5
    47  1188500891.616465.eUqA14oEas5n2.2440.24861 dispatching mail from:<another@connection>
    48  1188501079.319659.eUqA14oEas5n2.83.24915 full from_parameter: from:<another@connection>
    49  1188501079.319659.eUqA14oEas5n2.83.24915 from email address : [<another@connection>]
    50  1188501079.319659.eUqA14oEas5n2.83.24915 getting mail from <another@connection>
    51  1188501079.319659.eUqA14oEas5n2.83.24915 250 <another@connection>, sender OK - how exciting to get mail from you!
    52  1188501079.319659.eUqA14oEas5n2.83.24915 dispatching rset
    53  1188501082.198193.eUqA14oEas5n2.7497.24915 250 OK
    54  1188501082.198193.eUqA14oEas5n2.7497.24915 dispatching quit
    55  1188501082.198193.eUqA14oEas5n2.7497.24915 221 hrunkner closing connection. Have a wonderful day.
    56  1188501082.198193.eUqA14oEas5n2.7497.24915 click, disconnecting
    57  1188500891.616465.eUqA14oEas5n2.2440.24861 cleaning up after 24915

As I wrote before I consider a connection id more important than a
transaction id: If I have a connection, I can always split it into
transactions with a bit of SMTP knowledge. But if I have only
transactions, I cannot reassemble them back into connections, so
information which is only recorded once per connection (like the client
address or EHLO parameter) is lost.
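
The "bit of SMTP knowledge" needed to split one connection into transactions
could be sketched like this (Python, illustration only). It assumes the
messages have already been filtered down to a single connection id and have
the id prefix stripped, and it keys on the "dispatching ..." lines seen in the
sample output; `split_transactions` is a hypothetical helper, not part of
qpsmtpd:

```python
from typing import Iterable, List, Optional


def split_transactions(messages: Iterable[str]) -> List[List[str]]:
    """Group one connection's log messages into per-transaction lists.

    A transaction is assumed to start at a "dispatching mail" line and to end
    at the next "dispatching rset", "dispatching quit" or "dispatching mail".
    Messages outside any transaction (greeting, EHLO, ...) are skipped.
    """
    transactions: List[List[str]] = []
    current: Optional[List[str]] = None
    for msg in messages:
        lowered = msg.lower()
        if lowered.startswith("dispatching mail"):
            current = [msg]
            transactions.append(current)
        elif lowered.startswith(("dispatching rset", "dispatching quit")):
            current = None
        elif current is not None:
            current.append(msg)
    return transactions
```

The reverse direction has no such helper: nothing in a transaction-only id
says which EHLO or client address it belongs to.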

Unfortunately a similar patch to Qpsmtpd::Connection doesn't work with
forkserver, because it just reuses the same connection object over and
over (all changes are done after the fork, so the parent always has a
"pristine" connection). But we can just reinitialize the id before the
pre_connection hook.

Then a similar session to the one above looks like this:

     1  1188504537.392238.w16URlX9gyk.6628.26545 Loaded Qpsmtpd::Plugin::logging::connection_id=HASH(0x87d38dc)
     2  1188504537.392238.w16URlX9gyk.6628.26545 Listening on 0.0.0.0:2525
     3  1188504537.392238.w16URlX9gyk.6628.26545 Running as user hjp, group hjp
     4  1188504537.392238.w16URlX9gyk.6628.26545 Initializing spool_dir
     5  1188504537.392238.w16URlX9gyk.6628.26545 Permissions on spool_dir /home/hjp/tmp/ are not 0700
     6  1188504537.392238.w16URlX9gyk.6628.26545 size_threshold set to 0
     7  1188504546.152131.w16URlX9gyk.7327.26545 Accepted connection 0/15 from 127.0.0.1 / localhost

We have accepted a new connection and it has a new id.

     8  1188504546.152131.w16URlX9gyk.7327.26545 Connection from localhost [127.0.0.1]
     9  1188504546.152131.w16URlX9gyk.7327.26545 check_earlytalker plugin: remote host said nothing spontaneous, proceeding
    10  1188504546.152131.w16URlX9gyk.7327.26545 220 hrunkner ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.
    11  1188504546.152131.w16URlX9gyk.7327.26545 dispatching ehlo localhost
    12  1188504546.152131.w16URlX9gyk.7327.26545 250-hrunkner Hi localhost [127.0.0.1]
    13  1188504546.152131.w16URlX9gyk.7327.26545 250-PIPELINING
    14  1188504546.152131.w16URlX9gyk.7327.26545 250-8BITMIME
    15  1188504546.152131.w16URlX9gyk.7327.26545 250 AUTH PLAIN LOGIN CRAM-MD5
    16  1188504546.152131.w16URlX9gyk.7327.26545 dispatching mail from:<>
    17  1188504546.152131.w16URlX9gyk.7327.26545 full from_parameter: from:<>

We have received a mail from: command, and it was in the connection from
localhost [127.0.0.1].

    18  1188504546.152131.w16URlX9gyk.7327.26545 from email address : [<>]
    19  1188504546.152131.w16URlX9gyk.7327.26545 getting mail from <>
    20  1188504546.152131.w16URlX9gyk.7327.26545 250 <>, sender OK - how exciting to get mail from you!
    21  1188504546.152131.w16URlX9gyk.7327.26545 dispatching rset

Still the same connection.

    22  1188504546.152131.w16URlX9gyk.7327.26545 250 OK
    23  1188504546.152131.w16URlX9gyk.7327.26545 dispatching mail from:<other@send.er>
    24  1188504546.152131.w16URlX9gyk.7327.26545 full from_parameter: from:<other@send.er>
    25  1188504546.152131.w16URlX9gyk.7327.26545 from email address : [<other@send.er>]
    26  1188504546.152131.w16URlX9gyk.7327.26545 getting mail from <other@send.er>
    27  1188504546.152131.w16URlX9gyk.7327.26545 250 <other@send.er>, sender OK - how exciting to get mail from you!
    28  1188504546.152131.w16URlX9gyk.7327.26545 dispatching quit
    29  1188504546.152131.w16URlX9gyk.7327.26545 221 hrunkner closing connection. Have a wonderful day.
    30  1188504546.152131.w16URlX9gyk.7327.26545 click, disconnecting

The end of the first connection.

    31  1188504546.152131.w16URlX9gyk.7327.26545 cleaning up after 26551

Back in the parent. Still using the old connection id, but that single
message can be easily ignored.

    32  1188504596.595799.w16URlX9gyk.4468.26545 Accepted connection 0/15 from 127.0.0.1 / localhost

A new connection with a new id.

    33  1188504596.595799.w16URlX9gyk.4468.26545 Connection from localhost [127.0.0.1]
    34  1188504596.595799.w16URlX9gyk.4468.26545 check_earlytalker plugin: remote host said nothing spontaneous, proceeding
    35  1188504596.595799.w16URlX9gyk.4468.26545 220 hrunkner ESMTP qpsmtpd 0.40 ready; send us your mail, but not your spam.
    36  1188504596.595799.w16URlX9gyk.4468.26545 dispatching ehlo localhost
    37  1188504596.595799.w16URlX9gyk.4468.26545 250-hrunkner Hi localhost [127.0.0.1]
    38  1188504596.595799.w16URlX9gyk.4468.26545 250-PIPELINING
    39  1188504596.595799.w16URlX9gyk.4468.26545 250-8BITMIME
    40  1188504596.595799.w16URlX9gyk.4468.26545 250 AUTH PLAIN LOGIN CRAM-MD5
    41  1188504596.595799.w16URlX9gyk.4468.26545 dispatching mail from:<other@conn.ecti.on>
    42  1188504596.595799.w16URlX9gyk.4468.26545 full from_parameter: from:<other@conn.ecti.on>
    43  1188504596.595799.w16URlX9gyk.4468.26545 from email address : [<other@conn.ecti.on>]
    44  1188504596.595799.w16URlX9gyk.4468.26545 getting mail from <other@conn.ecti.on>
    45  1188504596.595799.w16URlX9gyk.4468.26545 250 <other@conn.ecti.on>, sender OK - how exciting to get mail from you!
    46  1188504596.595799.w16URlX9gyk.4468.26545 dispatching quit
    47  1188504596.595799.w16URlX9gyk.4468.26545 221 hrunkner closing connection. Have a wonderful day.
    48  1188504596.595799.w16URlX9gyk.4468.26545 click, disconnecting
    49  1188504596.595799.w16URlX9gyk.4468.26545 cleaning up after 26572

I haven't tested it with anything but forkserver, but I checked it in on
trunk so others can play with it.

I think it should be possible to derive the transaction id from the
connection id, but I see no way to get from a transaction object to the
underlying connection object. Am I blind?

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/30/2007 8:22:39 PM
On 30-Aug-07, at 2:52 PM, Guy Hulbert wrote:

> Actually, I think the right answer is just a sequence generator (mod
> 10000).  That guarantees different consecutive results.

I think so too. In my testing perl only switches to floating point at  
or around 2**50 on 32 bit platforms, which should allow enough email  
between restarts for even the fastest mail systems on the planet.

Consider rand() gone and a sequence used instead.

Matt.
0
matt
8/30/2007 8:37:36 PM
On 2007-08-29 17:50:28 -0400, Charlie Brady wrote:
> A four-tuple identifying the TCP connection also identifies the server.

Right. And the tuple must not be reused for some time (2*MSL or 4 minutes
according to RFC 793), so you don't even need a high resolution timer.

However, what if there is no TCP connection yet? For example, in
forkserver, the plugins are loaded before the first connection is
accepted and you want to log a failure to load one of them (or the
plugins may want to log something in their register method). You could
just fill the remote part with zeros, but you can have multiple
processes listening on the same port and you can't distinguish them in
this case.
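
A sketch of the four-tuple idea, including the "fill the remote part with
zeros" fallback (Python, illustration only; `tuple_id` and its output format
are hypothetical):

```python
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


def tuple_id(local: Endpoint, remote: Optional[Endpoint] = None) -> str:
    """Format a TCP four-tuple as a log id; the remote part is zeroed if absent."""
    remote_ip, remote_port = remote if remote is not None else ("0.0.0.0", 0)
    local_ip, local_port = local
    return f"{local_ip}:{local_port}-{remote_ip}:{remote_port}"


if __name__ == "__main__":
    print(tuple_id(("192.0.2.1", 25), ("198.51.100.7", 40000)))
    print(tuple_id(("0.0.0.0", 2525)))  # before any connection is accepted
```

Two processes listening on the same local address and port get the same
zero-filled id, which is exactly the ambiguity described above.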

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/30/2007 8:49:20 PM
On 30-Aug-07, at 4:22 PM, Peter J. Holzer wrote:

> I think it should be possible to derive the transaction id from the
> connection id, but I see no way to get from a transaction object to  
> the
> underlying connection object. Am I blind?

No, there isn't. In an earlier patch I passed in the connection to  
the constructor so that I could use the connection information.

So to satisfy your requirement, I'd go with:

Connection id = time.hosthash.$$.seq
Transaction id = Connection-id.tran-num

Where tran-num is 0, 1, 2, 3, etc every time the transaction is reset.
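
The scheme above could be sketched as follows (Python for illustration; the
real code would live in Perl in Qpsmtpd::Connection and
Qpsmtpd::Transaction). The class and function names are hypothetical, the
host hash is simply passed in, and whole seconds are used for "time" as an
assumption:

```python
import itertools
import os
import time

_seq = itertools.count()  # per-process sequence, standing in for rand()


def new_connection_id(host_hash: str, now=None, pid=None) -> str:
    """Build 'time.hosthash.pid.seq' for a freshly accepted connection."""
    now = time.time() if now is None else now
    pid = os.getpid() if pid is None else pid
    return f"{int(now)}.{host_hash}.{pid}.{next(_seq)}"


class Connection:
    def __init__(self, host_hash: str, now=None, pid=None):
        self.id = new_connection_id(host_hash, now, pid)
        self._tran_num = -1  # no transaction started yet

    def reset_transaction(self) -> str:
        """Start the next transaction and return 'connection-id.tran-num'."""
        self._tran_num += 1
        return f"{self.id}.{self._tran_num}"
```

Every transaction id then carries its connection id as a prefix, so a grep
for the connection id finds all of that connection's transactions.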

Matt.
0
matt
8/30/2007 8:54:41 PM
On 2007-08-30 07:07:51 -0400, Guy Hulbert wrote:
> On Thu, 2007-08-30 at 10:08 +0200, Peter J. Holzer wrote:
> > On 2007-08-29 18:36:12 -0400, Guy Hulbert wrote:
> > Here are some (measured) resolutions of gettimeofday on various systems:
> >
> > Linux/i386:      1 ms
> > Linux/SPARC:     2 ms
> > HP-UX/PA-RISC:   2 ms
> > Linux/Alpha:   976 ms (1024 Hz)
>
> 'ms' is usually milli-seconds but it appears you mean micro-seconds ( I
> pretend that u=mu and write it 'us' ).

Fortunately I am using a German keyboard so I can claim an AltGr
key malfunction ;-) (AltGr+m = µ)


> > > ( otoh, qpsmtpd is not even threaded is it ? ).
> >
> > It might be possible to run Apache::Qpsmtpd on a multithreaded Apache.
>
> Unlikely.  PHP people still won't bless mt apache.

But mod_perl people do, AFAIK.

> Postgres people discovered a problem with crypt() - from libc

Interesting. This bug has been known for a long time (Rasmus Lerdorf
wrote 2004 that he tracked it down "a couple of years ago"), yet crypt
in the glibc still isn't threadsafe even though that should be very easy
to fix.  Obviously few people invoke crypt in multithreaded programs.

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/30/2007 9:17:10 PM
On Thu, 30 Aug 2007, Peter J. Holzer wrote:

> On 2007-08-29 17:50:28 -0400, Charlie Brady wrote:
>> A four-tuple identifying the TCP connection also identifies the server.
>
> Right. And the tuple must not be reused for some time (2*MSL or 4 minutes
> according to RFC 793), so you don't even need a high resolution timer.

Indeed.

> However, what if there is no TCP connection yet? For example, in
> forkserver, the plugins are loaded before the first connection is
> accepted and you want to log a failure to load one of them (or the
> plugins may want to log something in their register method).

I consider that to be a different issue. Log messages at that stage aren't 
related to and don't need to be correlated with an email message.

> You could just fill the remote part with zeros, but you can have 
> multiple processes listening on the same port and you can't distinguish 
> them in this case.

You can't have multiple processes bound to the same 
local_IP/local_port, so you could distinguish hosts and processes by 
filling in the local part of the four-tuple. There's still an edge case 
where multiple processes are started with the same local port 
configuration, all but one of which will fail. Do we really ever expect to 
be merging logs from such errant processes?
0
charlieb
8/31/2007 1:12:15 AM
On 2007-08-30 21:12:15 -0400, Charlie Brady wrote:
> On Thu, 30 Aug 2007, Peter J. Holzer wrote:
> >On 2007-08-29 17:50:28 -0400, Charlie Brady wrote:
> >>A four-tuple identifying the TCP connection also identifies the server.
> >
> >Right. And the tuple must not be reused for some time (2*MSL or 4 minutes
> >according to RFC 793), so you don't even need a high resolution timer.
>
> Indeed.
>
> >However, what if there is no TCP connection yet? For example, in
> >forkserver, the plugins are loaded before the first connection is
> >accepted and you want to log a failure to load one of them (or the
> >plugins may want to log something in their register method).
>
> I consider that to be a different issue. Log messages at that stage aren't
> related to and don't need to be correlated with an email message.

Right, but we still want to log them and find out what logged them.

>
> >You could just fill the remote part with zeros, but you can have
> >multiple processes listening on the same port and you can't distinguish
> >them in this case.
>
> You can't have multiple processes bound to the same
> local_IP/local_port,

Sure I can:

habanero:~ 9:50 101# lsof -i :80 | grep LISTEN
httpd    9875    root   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9946 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9967    root   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9970 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9974 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9977 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9980 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9981 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd    9991 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   10397 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   10400 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   10403 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   10790 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   11176 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   11728 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14183 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14186 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14187 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14194 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14195 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14198 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14201 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)
httpd   14207 oraport   27u  IPv4  81804443       TCP *:http (LISTEN)

The httpd in question is an Apache btw, so I'd expect an Apache::Qpsmtpd
installation to look similar.

> so you could distinguish hosts and processes by filling in the local
> part of the four-tuple.

That's what I meant with "fill the remote part with zeros".

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/31/2007 7:57:35 AM
> You can't have multiple processes bound to the same 
> local_IP/local_port,

Of course you can.

bind -> listen -> fork
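
A minimal sketch of that sequence, written in Python rather than qpsmtpd's Perl purely for illustration. The function name is made up, and `os.fork()` limits it to Unix-like systems. The child never calls bind() or listen() itself, yet it can accept() because it inherited the listening descriptor:

```python
import os
import socket


def demo_inherited_listener():
    """Bind and listen once, then fork; the child accepts on the inherited socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    pid = os.fork()
    if pid == 0:
        # Child: shares the same local address/port as the parent by inheritance.
        conn, _ = listener.accept()
        conn.sendall(str(os.getpid()).encode())
        conn.close()
        os._exit(0)

    # Parent: act as the client and read which process answered.
    client = socket.create_connection(("127.0.0.1", port))
    answered_by = int(client.recv(64).decode())
    client.close()
    os.waitpid(pid, 0)
    listener.close()
    return pid, answered_by


child_pid, answered_by = demo_inherited_listener()
print(child_pid == answered_by)
```

While both processes hold the descriptor, a tool such as lsof would list each of them as listening on the same port.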


Regards
Michael

-- 
It's an insane world, but i'm proud to be a part of it. -- Bill Hicks
0
kju
8/31/2007 9:12:39 AM
On Fri, 31 Aug 2007, Michael Holzt wrote:

>> You can't have multiple processes bound to the same
>> local_IP/local_port,
>
> Of course you can.
>
> bind -> listen -> fork

Yes, brain fart at my end. s/$/ except by inheritance post-fork/.

If we stop listening post-fork (as qpsmtpd-forkserver does) then this 
state only occurs briefly. And since the fork occurs after accept(), then 
we already have a TCP four-tuple during that time interval.

However, there is still an issue with Peter's "zero out remote 
address components" proposal: prior to accept(), qpsmtpd-forkserver may 
have multiple listening sockets. Some of those sockets (e.g. 127.0.0.1:25) 
won't be unique across multiple hosts.
0
charlieb
8/31/2007 2:42:37 PM
On 2007-08-31 10:42:37 -0400, Charlie Brady wrote:
>
> On Fri, 31 Aug 2007, Michael Holzt wrote:
>
> >>You can't have multiple processes bound to the same
> >>local_IP/local_port,
> >
> >Of course you can.
> >
> >bind -> listen -> fork
>
> Yes, brain fart at my end. s/$/ except by inheritance post-fork/.
>
> If we stop listening post-fork (as qpsmtpd-forkserver does) then this
> state only occurs briefly. And since the fork occurs after accept(), then
> we already have a TCP four-tuple during that time interval.
>
> However, there is still an issue with Peter's "zero out remote
> address components" proposal: prior to accept(), qpsmtpd-forkserver may
> have multiple listening sockets. Some of those sockets (e.g. 127.0.0.1:25)
> won't be unique across multiple hosts.

127.0.0.1 is a problem even after establishing the connection: With
"normal" routing arrangements the remote IP address will be 127.0.0.1,
too, so the only variable is the remote port. If you aggregate log
messages from several hosts which receive locally generated messages,
that can be a problem.

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
8/31/2007 3:18:02 PM
On 8/31/07, Peter J. Holzer <hjp@hjp.at> wrote:
> On 2007-08-31 10:42:37 -0400, Charlie Brady wrote:
> >
> > On Fri, 31 Aug 2007, Michael Holzt wrote:
> >
> > >>You can't have multiple processes bound to the same
> > >>local_IP/local_port,
> > >
> > >Of course you can.
> > >
> > >bind -> listen -> fork
> >
> > Yes, brain fart at my end. s/$/ except by inheritance post-fork/.
> >
> > If we stop listening post-fork (as qpsmtpd-forkserver does) then this
> > state only occurs briefly. And since the fork occurs after accept(), then
> > we already have a TCP four-tuple during that time interval.
> >
> > However, there is still an issue with Peter's "zero out remote
> > address components" proposal: prior to accept(), qpsmtpd-forkserver may
> > have multiple listening sockets. Some of those sockets (e.g. 127.0.0.1:25)
> > won't be unique across multiple hosts.
>
> 127.0.0.1 is a problem even after establishing the connection: With
> "normal" routing arrangements the remote IP address will be 127.0.0.1,
> too, so the only variable is the remote port. If you aggregate log
> messages from several hosts which receive locally generated messages,
> that can be a problem.
>

questions:

1. why would the remote ip be localhost once a tcp connection is established?

2. why do we need a 'transaction ID' prior to a connection?

3. can we separate 'startup' type messages from transaction-based ones?

-- 
"The truth is an offense, but not a sin"
0
kitno455
8/31/2007 3:28:55 PM
On Fri, 31 Aug 2007, Peter J. Holzer wrote:

> On 2007-08-31 10:42:37 -0400, Charlie Brady wrote:
>>
>> However, there is still an issue with Peter's "zero out remote
>> address components" proposal: prior to accept(), qpsmtpd-forkserver may
>> have multiple listening sockets. Some of those sockets (e.g. 127.0.0.1:25)
>> won't be unique across multiple hosts.
>
> 127.0.0.1 is a problem even after establishing the connection: With
> "normal" routing arrangements the remote IP address will be 127.0.0.1,
> too, so the only variable is the remote port.

Just to clarify, you are referring to SMTP connections from other 
processes local to the qpsmtpd server, i.e. connecting over loopback. 
Correct?

Yes, I can see that in that case 127.0.0.1:nnn:127.0.0.1:25 would not 
identify the host and would not be unique across multiple servers.
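
A small illustration of that collision, sketched in Python. The host names `mx1` and `mx2`, the port numbers and the log entries are all invented for the example; only the `remote:port:local:port` layout follows the text above:

```python
from collections import defaultdict


def four_tuple_id(remote_ip, remote_port, local_ip, local_port):
    """Format a connection ID in the remote:port:local:port style used above."""
    return f"{remote_ip}:{remote_port}:{local_ip}:{local_port}"


# Hypothetical aggregated log entries: (reporting host, connection ID).
entries = [
    ("mx1", four_tuple_id("127.0.0.1", 40312, "127.0.0.1", 25)),
    ("mx2", four_tuple_id("127.0.0.1", 40312, "127.0.0.1", 25)),
    ("mx1", four_tuple_id("192.0.2.7", 51000, "198.51.100.1", 25)),
]

hosts_by_id = defaultdict(set)
for host, conn_id in entries:
    hosts_by_id[conn_id].add(host)

# IDs reported by more than one host are ambiguous once the logs are merged.
ambiguous = sorted(cid for cid, hosts in hosts_by_id.items() if len(hosts) > 1)
print(ambiguous)
```

Only the loopback connection ends up ambiguous; the connection to a routable local address already identifies its host.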

---
Charlie
0
charlieb
8/31/2007 5:44:44 PM
On 2007-08-31 13:44:44 -0400, Charlie Brady wrote:
>
> On Fri, 31 Aug 2007, Peter J. Holzer wrote:
> >On 2007-08-31 10:42:37 -0400, Charlie Brady wrote:
> >127.0.0.1 is a problem even after establishing the connection: With
> >"normal" routing arrangements the remote IP address will be 127.0.0.1,
> >too, so the only variable is the remote port.
>
> Just to clarify, you are referring to SMTP connections from other
> processes local to the qpsmtpd server, i.e. connecting over loopback.
> Correct?

Yes.

> Yes, I can see that in that case 127.0.0.1:nnn:127.0.0.1:25 would not
> identify the host and would not be unique across multiple servers.

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
9/1/2007 6:54:42 AM
On 2007-08-31 11:28:55 -0400, m. allan noah wrote:
> On 8/31/07, Peter J. Holzer <hjp@hjp.at> wrote:
> > On 2007-08-31 10:42:37 -0400, Charlie Brady wrote:
> > > However, there is still an issue with Peter's "zero out remote
> > > address components" proposal: prior to accept(), qpsmtpd-forkserver may
> > > have multiple listening sockets. Some of those sockets (e.g. 127.0.0.1:25)
> > > won't be unique across multiple hosts.
> >
> > 127.0.0.1 is a problem even after establishing the connection: With
> > "normal" routing arrangements the remote IP address will be 127.0.0.1,
> > too, so the only variable is the remote port. If you aggregate log
> > messages from several hosts which receive locally generated messages,
> > that can be a problem.
> >
>
> questions:
>
> 1. why would the remote ip be localhost once a tcp connection is established?

When a client doesn't explicitly bind() its socket before calling
connect(), the OS will choose a port number and IP address. The IP
address will generally be that of the interface that the connection goes
out of. If the server IP address is local, then the same IP address will
be chosen for the client. So, as a special case, if the server listens
on 127.0.0.1:25, any connection coming in on that port will be from
127.0.0.1:nnnnn.
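
This is easy to observe. The following is a minimal Python sketch (the function name is invented) that listens on an OS-chosen loopback port, connects without an explicit bind(), and returns the client address as seen from both ends:

```python
import socket


def loopback_client_address():
    """Connect to a loopback listener without bind() and report the source address."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))  # no bind(): the OS picks address and port

    conn, peer = server.accept()         # peer is the remote side as the server sees it
    local = client.getsockname()         # the same endpoint as the client sees it
    for sock in (conn, client, server):
        sock.close()
    return peer, local


peer, local = loopback_client_address()
print(peer[0], peer == local)
```

With ordinary loopback routing the address part is 127.0.0.1 on both sides, so only the ephemeral port distinguishes one such connection from another.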


> 2. why do we need a 'transaction ID' prior to a connection?

Don't think of it as a 'transaction ID'. Think of it as a 'logging ID',
which identifies the entity to which the log message belongs. There are
things which have to be logged before the first connection (e.g.,
problems with loading a plugin) and you want to identify where they come
from.


> 3. can we separate 'startup' type messages from transaction-based ones?

Probably. In logging/file_connection I used a "server instance id"
(startup timestamp + pid of the forkserver parent process) plus a simple
counter for the connections. Due to a quirk which I never investigated,
all the "startup" messages have a "connection count" of 2, the connections
start at 3. In an earlier message I suggested extra counters for the
transactions and possibly commands, so the full scheme could be
something like:

$instance_id	# could be opaque or structured to include server name
                # or IP, PID, etc.
$instance_id.$connection_id 	# identifies a connection handled
				# by this instance
$instance_id.$connection_id.$transaction_id	# identifies a
						# transaction within
						# this connection.
...
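
A rough sketch of how such nested IDs could be handed out, in Python for illustration only. The class and method names are hypothetical and this is not how qpsmtpd implements it; it assumes the connection and transaction parts are plain counters and that the transaction counter restarts with each connection:

```python
import itertools


class LogIds:
    """Hands out instance.connection.transaction IDs from simple counters."""

    def __init__(self, instance_id):
        self.instance_id = instance_id           # opaque or structured string
        self._connections = itertools.count(1)
        self._transactions = None
        self.connection_id = None
        self.transaction_id = None

    def new_connection(self):
        self.connection_id = next(self._connections)
        self._transactions = itertools.count(1)  # restart for each connection
        self.transaction_id = None
        return self.current()

    def new_transaction(self):
        if self._transactions is None:
            raise RuntimeError("no connection has been started yet")
        self.transaction_id = next(self._transactions)
        return self.current()

    def current(self):
        """Return the most specific ID available at this point."""
        parts = [self.instance_id, self.connection_id, self.transaction_id]
        return ".".join(str(part) for part in parts if part is not None)


ids = LogIds("mx1-3165")
print(ids.current())          # instance-level ID for startup messages
print(ids.new_connection())
print(ids.new_transaction())
```

Messages logged before any connection exists simply carry the bare instance ID.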

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
9/1/2007 7:39:43 AM
On 2007-08-29 19:15:37 -0400, Guy Hulbert wrote:
> On Thu, 2007-08-30 at 00:49 +0200, Michael Holzt wrote:
> > > or even
> > >     10 + 1 + 16 + 1 + 39 + 1 + 5 + 1 + 39 + 1 + 5 = 119 characters
> >
> > Better encode it binary. E.g. for IPv4:
>
> And better get the number of bits correct.  An IP address is a 32 bit
> integer, not 15 characters.

You've snipped the context. JT was calling the
Qpsmtpd::Connection::local_ip method which does indeed return a string
of up to 15 characters, not an integer of 32 bits.

> Although perl converts scalars on-demand, it correctly preserves
> integer values.

JT was using string concatenation, so that doesn't help.

Yes, it would be possible to call inet_aton on the return value of
local_ip, do the equivalent on local_port, then concatenate them, and
send them through base64, thus encoding 48 bits of information in 8
characters. But JT didn't do this, so his scheme needs 21 characters to
encode the same information.
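
For comparison, here is a sketch of the packed variant in Python (the thread's code is Perl; the function names are invented for the example). Four address bytes plus two port bytes come to exactly 48 bits, which base64 turns into 8 characters with no padding:

```python
import base64
import socket
import struct


def pack_endpoint(ip, port):
    """Encode an IPv4 address and port (48 bits) as 8 base64 characters."""
    raw = socket.inet_aton(ip) + struct.pack("!H", port)   # 4 + 2 bytes
    return base64.b64encode(raw).decode("ascii")


def unpack_endpoint(token):
    """Reverse pack_endpoint()."""
    raw = base64.b64decode(token)
    return socket.inet_ntoa(raw[:4]), struct.unpack("!H", raw[4:])[0]


token = pack_endpoint("192.168.100.200", 65535)
print(len(token), len("192.168.100.200:65535"))
print(unpack_endpoint(token))
```

The saving comes at the cost of readability: the packed form has to be decoded before anyone can tell which address it refers to.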

	hp

-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
9/1/2007 8:08:17 AM
On Sat, 2007-09-01 at 10:08 +0200, Peter J. Holzer wrote:
> > > Better encode it binary. E.g. for IPv4:
> > 
> > And better get the number of bits correct.  An IP address is a 32 bit
> > integer, not 15 characters.
> 
> You've snipped the context. JT was calling the
> Qpsmtpd::Connection::local_ip method which does indeed return a string
> of up to 15 characters, not an integer of 32 bits.

An IPv4 address is a 32 bit unsigned integer.  The "string of 15
characters" is a human-readable representation of it.  AFAICT the
context was obtaining an efficient packing of the data in question (see
the post on binary logging to a database). I chose the IP address as an
example -- the ID being created was not even close to what we had been
discussing and what Matt implemented, and I did not want to go on at
length about an example which appeared to be only marginally on-topic.

-- 
--gh


0
gwhulbert
9/1/2007 12:31:04 PM
On 2007-08-23 18:50:33 +0200, Hanno Hecker wrote:
> But it would be easy to add / generate a transaction id after every
> reset_transaction() call. This could be logged instead of (or as
> addition to) the PID.
[...]

> Index: lib/Qpsmtpd/SMTP.pm
> ===================================================================
> --- lib/Qpsmtpd/SMTP.pm	(revision 774)
> +++ lib/Qpsmtpd/SMTP.pm	(working copy)
> @@ -388,8 +388,8 @@
>      }
>      else { # includes OK
>        $self->log(LOGINFO, "getting mail from ".$from->format);
> -      $self->respond(250, $from->format . ", sender OK - how exciting to get mail from you!");
>        $self->transaction->sender($from);
> +      $self->respond(250, $from->format . ", sender OK - your transaction id is ".$self->transaction->id);
>      }
>  }
>  

I like that, although I suspect that most clients will just discard it.

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
9/2/2007 4:48:00 PM
On Sun, 2 Sep 2007 18:48:00 +0200
"Peter J. Holzer" <hjp@hjp.at> wrote:


> > +      $self->respond(250, $from->format . ", sender OK - your transaction id is ".$self->transaction->id);
> I like that, although I suspect that most clients will just discard it.
But some clients show the transaction log in case of errors. This would
make it easier to identify the transaction.

	Hanno
0
vetinari
9/2/2007 7:41:25 PM
How does qmail do it?
0
davidnicol
9/2/2007 10:52:23 PM
On Sun, 2007-09-02 at 17:52 -0500, David Nicol wrote:
> How does qmail do it?

Uses the inode number ... doesn't work for qpsmtpd ... and it's crap for
logging (see my comment earlier in the thread) since the inodes get
recycled.

-- 
--gh


0
gwhulbert
9/2/2007 11:52:28 PM
> $instance_id	# could be opaque or structured to include server name
>                 # or IP, PID, etc.
> $instance_id.$connection_id 	# identifies a connection handled
> 				# by this instance
> $instance_id.$connection_id.$transaction_id	# identifies a
> 						# transaction within 
> 						# this connection.

I notice that svn code has moved to this model but it still has these
lines in it

  my $SALT_HOST = crypt(hostname, chr(65+rand(57)).chr(65+rand(57)));
  $SALT_HOST =~ tr/A-Za-z0-9//cd;

Is this being used anymore?  I don't find a reference to $SALT_HOST in
the same file.

-- 
JT Moree
0
jtmoree
9/4/2007 2:59:15 PM
On 2007-09-04 07:59:15 -0700, JT Moree wrote:
> > $instance_id	# could be opaque or structured to include server name
> >                 # or IP, PID, etc.
> > $instance_id.$connection_id 	# identifies a connection handled
> > 				# by this instance
> > $instance_id.$connection_id.$transaction_id	# identifies a
> > 						# transaction within
> > 						# this connection.
>
> I notice that svn code has moved to this model but it still has these
> lines in it
>
>   my $SALT_HOST = crypt(hostname, chr(65+rand(57)).chr(65+rand(57)));
>   $SALT_HOST =~ tr/A-Za-z0-9//cd;
>
> Is this being used anymore?  I don't find a reference to $SALT_HOST in
> the same file.

I was playing around a bit on the weekend, yes. Since neither Matt nor
Ask have cried out in horror on what I did, I guess it's time to present
that to a wider audience:

The instance id basically identifies a Qpsmtpd::SMTP object. Looking
through the sources of various servers I found that there is always
exactly one per process (although with forkserver it is inherited by the
child processes), so I thought that

    time when object was created (seconds.microseconds since the epoch)
    "host_id"
    process id

should always be unique. I replaced $SALT_HOST as the "host_id" with the
primary IP address (in hex), because I think a predictable host id is
useful (so that you can find the relevant host from the log entry -
otherwise the host id could be removed). It may be useful to replace the
IP address with something else, most likely the (abbreviated) hostname.
That could be a configuration option. (So this answers your question:
$SALT_HOST is obsolete and I just forgot to delete it)

The connection id and transaction id are simple counters.

So a complete log entry (without timestamp or whatever else the logging
mechanism may add) looks like this:

1188729346.156197.7f000101.3165.2.1 Accepted connection 0/15 from 127.0.1.1 / Unknown

So this is instance "1188729346.156197.7f000101.3165" (started at
1188729346.156197 on host 127.0.1.1 (oops - the joys of dhcp and strange
/etc/hosts files) in process 3165). This is connection number 2 (i.e.
the first "real" connection on this instance, since connection number 1 is
used up during startup), and the first transaction within this
connection (since a "mail from" command always starts a new transaction,
you can think of transaction 2 as the first "real" transaction).
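
To make the layout concrete, here is a sketch of how a log analysis tool might take such an ID apart. It is written in Python with invented names, and it assumes the field order shown in the example entry, an IPv4 host id in hex, and microseconds zero-padded to six digits:

```python
import socket
from typing import NamedTuple


class LogId(NamedTuple):
    started_seconds: int
    started_micros: int
    host: str
    pid: int
    connection: int
    transaction: int


def parse_log_id(log_id):
    """Split a seconds.micros.hexip.pid.connection.transaction ID into fields."""
    secs, micros, host_hex, pid, conn, txn = log_id.split(".")
    return LogId(
        started_seconds=int(secs),
        started_micros=int(micros),
        host=socket.inet_ntoa(bytes.fromhex(host_hex)),   # 7f000101 -> 127.0.1.1
        pid=int(pid),
        connection=int(conn),
        transaction=int(txn),
    )


parsed = parse_log_id("1188729346.156197.7f000101.3165.2.1")
print(parsed.host, parsed.pid, parsed.connection, parsed.transaction)
```

Splitting on "." only works here because the number of fields is fixed; an instance-only ID with no connection or transaction part would need separate handling.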

Apart from the fact that the "host id" thingy should probably be
configurable, there are some other things I'm not completely happy with:

* The id is rather long. That is written into every log line and the
  first 33 characters are always the same until you restart the
  instance. If you have only a handful of instances (which is quite
  likely) that's 33 characters for a few bits of information (at least
  it will compress well with gzip). We could use base64 instead of base
  10/16. Then the timestamp reduces to 6+4 (or 6+3 if we are content
  with 4 µs resolution) characters, the IP address to 6 characters and
  the PID to 3 characters. Now that's 20 characters including the
  dot, but it's quite opaque.

* The same delimiter is used within the instance id and between the
  instance id and the connection and transaction ids. This may make life
  unnecessarily hard for log analysis tools.
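
The base64 arithmetic in the first point can be checked with a short Python sketch. The function names are invented, the URL-safe alphabet is my own choice (so the result would also be usable in file names), and the 16-bit PID is an assumption that does not hold on systems with larger PIDs:

```python
import base64
import socket
import struct


def _b64(raw):
    """URL-safe base64 without the trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def compact_instance_id(seconds, micros, ip, pid):
    """Pack the instance id fields into base64 with no separators."""
    return "".join([
        _b64(struct.pack("!I", seconds)),      # 32 bits -> 6 characters
        _b64(struct.pack("!I", micros)[1:]),   # low 24 bits -> 4 characters
        _b64(socket.inet_aton(ip)),            # 32 bits -> 6 characters
        _b64(struct.pack("!H", pid)),          # 16 bits -> 3 characters
    ])


compact = compact_instance_id(1188729346, 156197, "127.0.1.1", 3165)
print(len(compact), len("1188729346.156197.7f000101.3165"))
```

That gives 19 characters before any separator is added, against 31 for the decimal/hex form of the same instance id.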

	hp


-- 
   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | hjp@hjp.at         |
__/   | http://www.hjp.at/ |	-- Sam in "Freefall"

0
hjp
9/4/2007 4:43:54 PM
On 4-Sep-07, at 12:43 PM, Peter J. Holzer wrote:

> I was playing around a bit on the weekend, yes. Since neither Matt nor
> Ask have cried out in horror on what I did,

FWIW I didn't object simply because it seems so pointless with  
everyone having such conflicting ideas about what this should all be  
about.

Honestly I'd be much happier with the timestamp being the time of the  
connection. I have no idea why we want an "id" for the times we're  
outside of a connection/transaction. The idea being that if you're  
writing the file to disk you can use the transaction id as the  
filename and it will be guaranteed unique, but also contain a  
timestamp-like component.

But frankly if we're going to keep going around in circles on the  
implementation I'd rather just concede.

Matt.

0
matt
9/4/2007 6:14:37 PM
On Sep 4, 2007, at 9:43, Peter J. Holzer wrote:

> I was playing around a bit on the weekend, yes. Since neither Matt nor
> Ask have cried out in horror on what I did, I guess it's time to  
> present
> that to a wider audience:

I just got back from vacation and am hopelessly behind on reading up  
on this thread.  I'm planning to catch up over the next few weeks.


  - ask

-- 
http://develooper.com/ - http://askask.com/


0
ask
9/4/2007 6:45:54 PM
On 9/4/07 1:14 PM, "Matt Sergeant" <matt@sergeant.org> wrote:

> On 4-Sep-07, at 12:43 PM, Peter J. Holzer wrote:
> 
>> I was playing around a bit on the weekend, yes. Since neither Matt nor
>> Ask have cried out in horror on what I did,
> 
> FWIW I didn't object simply because it seems so pointless with
> everyone having such conflicting ideas about what this should all be
> about.
> 

There seems to be consensus that a/n {connection|session|transaction} id
would be useful.  

Would it be possible to implement ->id as a hook?  The actual key could then
be left to the creativity of the user.  The plugin could then implement the
other hooks and tune the id as necessary (connect, mail, queue, etc.).
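
The general shape of that idea, sketched in Python. This is a generic illustration and not qpsmtpd's plugin API: the class, the `connection` dictionary and the "first non-None result wins" rule are all assumptions made for the example.

```python
import itertools
import os
import time

_counter = itertools.count(1)


def default_id(connection):
    """Fallback generator: timestamp, PID and a per-process counter."""
    return f"{int(time.time())}.{os.getpid()}.{next(_counter)}"


class IdHooks:
    """Runs user-registered generators in order; the first non-None result wins."""

    def __init__(self):
        self._generators = []

    def register(self, generator):
        self._generators.append(generator)

    def make_id(self, connection):
        for generator in self._generators:
            result = generator(connection)
            if result is not None:
                return result
        return default_id(connection)      # nobody claimed it: use the default


def remote_endpoint_id(connection):
    """Example user hook: build the ID from the remote endpoint if it is known."""
    if "remote_ip" in connection:
        return f"{connection['remote_ip']}-{connection['remote_port']}"
    return None


hooks = IdHooks()
hooks.register(remote_endpoint_id)
print(hooks.make_id({"remote_ip": "192.0.2.7", "remote_port": 51000}))
```

A site that wants a different key would register its own generator, and the core would only need to supply the fallback.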

peter

0
peter
9/4/2007 11:05:35 PM
On Tue, 4 Sep 2007, Peter Eisch wrote:

> Would it be possible to implement ->id as a hook?  The actual key could then
> be left to the creativity of the user.  The plugin could then implement the
> other hooks and tune the id as necessary (connect, mail, queue, etc.).

Yes, it's possible to do it that way. See the current hook for received 
headers for example code of how it has to work.

Matt.
0
matt
9/5/2007 4:55:50 PM