DBI 1.14 doc patches and performance tests

I've done a talk on Amsterdam.pm about what happens /behind/ DBI at the DBD
side, and to explain a few facts, I did some benchmarks, which I've added here.
They show the same trend for both Unify and Oracle, so it's probably best
mentioned (if at all) in the DBI docs under performance issues. The benches
were done to show the difference between the 4 fetch methods like this:

    $sel = $dbh->prepare ("select c_ll from ll_go");
    timethese (5, {
	array    => sub {
			$sel->execute;
			while (my @arr = $sel->fetchrow_array   ) {};
			$sel->finish;
			},
	arrayref => sub {
			$sel->execute;
			while (my $arr = $sel->fetchrow_arrayref) {};
			$sel->finish;
			},
	hashref  => sub {
			$sel->execute;
			while (my $hsh = $sel->fetchrow_hashref ) {};
			$sel->finish;
			},
	bindcol  => sub {
			$sel->execute;
			$sel->bind_columns (\$c_ll);
			while (          $sel->fetch            ) {};
			$sel->finish;
			},
	});
    $sel->finish;

Every percentage is the performance gain towards the previous (slower) fetch
method. What surprised my audience was the fact that using bind_columns with
fetch was faster along the line than fetch without bind_columns (everyone using
DBI in the audience seemed to use fetchrow_arrayref all the time, and they were
not aware of the fact that fetch was an alias. Noone was using bind_columns).

UNIFY
============== Benchmarks (1000 records)
Benchmark: running array, arrayref, bindcol, hashref, each for at least 30 CPU seconds...
  hashref: 34 wallclock secs (33.56 usr +  0.01 sys = 33.57 CPU) @  4.56/s (n=153)
    array: 32 wallclock secs (31.42 usr +  0.02 sys = 31.44 CPU) @  5.73/s (n=180)  25.66 %
 arrayref: 33 wallclock secs (31.90 usr +  0.01 sys = 31.91 CPU) @  6.27/s (n=200)   9.42 %
  bindcol: 32 wallclock secs (31.56 usr +  0.01 sys = 31.57 CPU) @  6.43/s (n=203)   2.55 %

============== Benchmarks (2500 records)
Benchmark: running array, arrayref, bindcol, hashref, each for at least 30 CPU seconds...
  hashref: 41 wallclock secs (39.26 usr +  0.01 sys = 39.27 CPU) @  1.81/s (n= 71)
    array: 35 wallclock secs (33.68 usr +  0.01 sys = 33.69 CPU) @  2.29/s (n= 77)  26.52 %
 arrayref: 34 wallclock secs (33.31 usr +  0.01 sys = 33.32 CPU) @  2.49/s (n= 83)   8.73 %
  bindcol: 34 wallclock secs (33.28 usr +  0.01 sys = 33.29 CPU) @  2.58/s (n= 86)   3.61 %

============== Benchmarks (~ 200 000 records) [8 fields]
Benchmark: timing 5 iterations of array, arrayref, bindcol, hashref...
  hashref: 188 wallclock secs (183.68 usr +  0.47 sys = 184.15 CPU)  5155.63 rec/s
    array: 140 wallclock secs (133.01 usr +  0.53 sys = 133.54 CPU)  7109.56 rec/s  37.90 %
 arrayref: 127 wallclock secs (124.30 usr +  0.44 sys = 124.74 CPU)  7611.11 rec/s   7.05 %
  bindcol: 125 wallclock secs (122.22 usr +  0.49 sys = 122.71 CPU)  7737.02 rec/s   1.65 %

============== Benchmarks (~ 200 000 records) [1 field]
Benchmark: timing 5 iterations of array, arrayref, bindcol, hashref...
  hashref: 113 wallclock secs (108.99 usr +  0.48 sys = 109.47 CPU)  8672.79 rec/s
    array:  76 wallclock secs ( 73.85 usr +  0.47 sys =  74.32 CPU) 12774.60 rec/s  47.30 %
 arrayref:  73 wallclock secs ( 70.89 usr +  0.42 sys =  71.31 CPU) 13313.80 rec/s   4.22 %
  bindcol:  69 wallclock secs ( 67.44 usr +  0.47 sys =  67.91 CPU) 13980.40 rec/s   5.01 %

Oracle
============== Benchmarks (914 107 records) [8 fields]
Benchmark: timing 1 iterations of array, arrayref, bindcol, hashref...
  hashref: 93 wallclock secs (67.55 usr +  0.43 sys = 67.98 CPU)   13446.70 rec/s
    array: 54 wallclock secs (28.20 usr +  0.37 sys = 28.57 CPU)   31995.34 rec/s  137.94 %
 arrayref: 42 wallclock secs (17.86 usr +  0.36 sys = 18.22 CPU)   50170.53 rec/s   56.81 %
  bindcol: 41 wallclock secs (16.50 usr +  0.35 sys = 16.85 CPU)   54249.67 rec/s    8.13 %

============== Benchmarks (914 107 records) [1 field]
Benchmark: timing 1 iterations of array, arrayref, bindcol, hashref...
  hashref: 38 wallclock secs (33.18 usr +  0.08 sys = 33.26 CPU)   27483.67 rec/s
    array: 15 wallclock secs ( 9.91 usr +  0.07 sys =  9.98 CPU)   91593.89 rec/s  233.27 %
 arrayref: 14 wallclock secs ( 8.62 usr +  0.07 sys =  8.69 CPU)  105190.68 rec/s   14.84 %
  bindcol: 12 wallclock secs ( 6.55 usr +  0.07 sys =  6.62 CPU)  138082.63 rec/s   31.27 %

These figures are free for use by anyone ...

These are some small changes to the documentation part of DBI (the pod stuff),
to be more up to data with the latest p5p doc changes (it seems chop is dropped
everywhere in favour of chomp in all docs, thanks to Michael Schwern) and using
or in favour of || for /all/ examples.

--8<---
*** DBI.pm.org	Wed Mar 21 16:49:37 2001
--- DBI.pm	Wed Mar 21 16:59:41 2001
***************
*** 1291,1297 ****
    $sth = $dbh->prepare("INSERT INTO table(foo,bar,baz) VALUES (?,?,?)");
  
    while(<CSV>) {
!     chop;
      my ($foo,$bar,$baz) = split /,/;
  	$sth->execute( $foo, $bar, $baz );
    }
--- 1291,1297 ----
    $sth = $dbh->prepare("INSERT INTO table(foo,bar,baz) VALUES (?,?,?)");
  
    while(<CSV>) {
!     chomp;
      my ($foo,$bar,$baz) = split /,/;
  	$sth->execute( $foo, $bar, $baz );
    }
***************
*** 1407,1412 ****
--- 1407,1413 ----
    solid_   DBD::Solid
    syb_     DBD::Sybase
    tuber_   DBD::Tuber
+   uni_     DBD::Unify
    xbase_   DBD::XBase
  
  
***************
*** 1489,1501 ****
  
    my $sth = $dbh->prepare(q{
      INSERT INTO sales (product_code, qty, price) VALUES (?, ?, ?)
!   }) || die $dbh->errstr;
    while (<>) {
!       chop;
        my ($product_code, $qty, $price) = split /,/;
!       $sth->execute($product_code, $qty, $price) || die $dbh->errstr;
    }
!   $dbh->commit || die $dbh->errstr;
  
  See L</execute> and L</bind_param> for more details.
  
--- 1490,1502 ----
  
    my $sth = $dbh->prepare(q{
      INSERT INTO sales (product_code, qty, price) VALUES (?, ?, ?)
!   }) or die $dbh->errstr;
    while (<>) {
!       chomp;
        my ($product_code, $qty, $price) = split /,/;
!       $sth->execute($product_code, $qty, $price) or die $dbh->errstr;
    }
!   $dbh->commit or die $dbh->errstr;
  
  See L</execute> and L</bind_param> for more details.
  
***************
*** 1521,1529 ****
  =item C<connect>
  
    $dbh = DBI->connect($data_source, $username, $password)
!             || die $DBI::errstr;
    $dbh = DBI->connect($data_source, $username, $password, \%attr)
!             || die $DBI::errstr;
  
  Establishes a database connection, or session, to the requested C<$data_source>.
  Returns a database handle object if the connection succeeds. Use
--- 1522,1530 ----
  =item C<connect>
  
    $dbh = DBI->connect($data_source, $username, $password)
!             or die $DBI::errstr;
    $dbh = DBI->connect($data_source, $username, $password, \%attr)
!             or die $DBI::errstr;
  
  Establishes a database connection, or session, to the requested C<$data_source>.
  Returns a database handle object if the connection succeeds. Use
***************
*** 1640,1648 ****
  =item C<connect_cached> I<NEW>
  
    $dbh = DBI->connect_cached($data_source, $username, $password)
!             || die $DBI::errstr;
    $dbh = DBI->connect_cached($data_source, $username, $password, \%attr)
!             || die $DBI::errstr;
  
  C<connect_cached> is like L</connect>, except that the database handle
  returned is also
--- 1641,1649 ----
  =item C<connect_cached> I<NEW>
  
    $dbh = DBI->connect_cached($data_source, $username, $password)
!             or die $DBI::errstr;
    $dbh = DBI->connect_cached($data_source, $username, $password, \%attr)
!             or die $DBI::errstr;
  
  C<connect_cached> is like L</connect>, except that the database handle
  returned is also
***************
*** 2152,2160 ****
  
  =item C<do>
  
!   $rc  = $dbh->do($statement)           || die $dbh->errstr;
!   $rc  = $dbh->do($statement, \%attr)   || die $dbh->errstr;
!   $rv  = $dbh->do($statement, \%attr, @bind_values) || ...
  
  Prepare and execute a single statement. Returns the number of rows
  affected or C<undef> on error. A return value of C<-1> means the 
--- 2153,2161 ----
  
  =item C<do>
  
!   $rc  = $dbh->do($statement)           or die $dbh->errstr;
!   $rc  = $dbh->do($statement, \%attr)   or die $dbh->errstr;
!   $rv  = $dbh->do($statement, \%attr, @bind_values) or ...
  
  Prepare and execute a single statement. Returns the number of rows
  affected or C<undef> on error. A return value of C<-1> means the 
***************
*** 2181,2187 ****
    my $rows_deleted = $dbh->do(q{
        DELETE FROM table
        WHERE status = ?
!   }, undef, 'DONE') || die $dbh->errstr;
  
  Using placeholders and C<@bind_values> with the C<do> method can be
  useful because it avoids the need to correctly quote any variables
--- 2182,2188 ----
    my $rows_deleted = $dbh->do(q{
        DELETE FROM table
        WHERE status = ?
!   }, undef, 'DONE') or die $dbh->errstr;
  
  Using placeholders and C<@bind_values> with the C<do> method can be
  useful because it avoids the need to correctly quote any variables
***************
*** 2259,2266 ****
  
  =item C<prepare>
  
!   $sth = $dbh->prepare($statement)          || die $dbh->errstr;
!   $sth = $dbh->prepare($statement, \%attr)  || die $dbh->errstr;
  
  Prepares a single statement for later execution by the database
  engine and returns a reference to a statement handle object.
--- 2260,2267 ----
  
  =item C<prepare>
  
!   $sth = $dbh->prepare($statement)          or die $dbh->errstr;
!   $sth = $dbh->prepare($statement, \%attr)  or die $dbh->errstr;
  
  Prepares a single statement for later execution by the database
  engine and returns a reference to a statement handle object.
***************
*** 2326,2332 ****
  
  =item C<commit>
  
!   $rc  = $dbh->commit     || die $dbh->errstr;
  
  Commit (make permanent) the most recent series of database changes
  if the database supports transactions and AutoCommit is off.
--- 2327,2333 ----
  
  =item C<commit>
  
!   $rc  = $dbh->commit     or die $dbh->errstr;
  
  Commit (make permanent) the most recent series of database changes
  if the database supports transactions and AutoCommit is off.
***************
*** 2338,2344 ****
  
  =item C<rollback>
  
!   $rc  = $dbh->rollback   || die $dbh->errstr;
  
  Rollback (undo) the most recent series of uncommitted database
  changes if the database supports transactions and AutoCommit is off.
--- 2339,2345 ----
  
  =item C<rollback>
  
!   $rc  = $dbh->rollback   or die $dbh->errstr;
  
  Rollback (undo) the most recent series of uncommitted database
  changes if the database supports transactions and AutoCommit is off.
***************
*** 2351,2357 ****
  
  =item C<disconnect>
  
!   $rc = $dbh->disconnect  || warn $dbh->errstr;
  
  Disconnects the database from the database handle. C<disconnect> is typically only used
  before exiting the program. The handle is of little use after disconnecting.
--- 2352,2358 ----
  
  =item C<disconnect>
  
!   $rc = $dbh->disconnect  or warn $dbh->errstr;
  
  Disconnects the database from the database handle. C<disconnect> is typically only used
  before exiting the program. The handle is of little use after disconnecting.
***************
*** 2872,2880 ****
  
  =item C<bind_param>
  
!   $rc = $sth->bind_param($p_num, $bind_value)  || die $sth->errstr;
!   $rv = $sth->bind_param($p_num, $bind_value, \%attr)     || ...
!   $rv = $sth->bind_param($p_num, $bind_value, $bind_type) || ...
  
  The C<bind_param> method can be used to bind a value
  with a placeholder embedded in the prepared statement. Placeholders
--- 2873,2881 ----
  
  =item C<bind_param>
  
!   $rc = $sth->bind_param($p_num, $bind_value)  or die $sth->errstr;
!   $rv = $sth->bind_param($p_num, $bind_value, \%attr)     or ...
!   $rv = $sth->bind_param($p_num, $bind_value, $bind_type) or ...
  
  The C<bind_param> method can be used to bind a value
  with a placeholder embedded in the prepared statement. Placeholders
***************
*** 2948,2956 ****
  
  =item C<bind_param_inout>
  
!   $rc = $sth->bind_param_inout($p_num, \$bind_value, $max_len)  || die $sth->errstr;
!   $rv = $sth->bind_param_inout($p_num, \$bind_value, $max_len, \%attr)     || ...
!   $rv = $sth->bind_param_inout($p_num, \$bind_value, $max_len, $bind_type) || ...
  
  This method acts like L</bind_param>, but also enables values to be
  updated by the statement. The statement is typically
--- 2949,2957 ----
  
  =item C<bind_param_inout>
  
!   $rc = $sth->bind_param_inout($p_num, \$bind_value, $max_len)  or die $sth->errstr;
!   $rv = $sth->bind_param_inout($p_num, \$bind_value, $max_len, \%attr)     or ...
!   $rv = $sth->bind_param_inout($p_num, \$bind_value, $max_len, $bind_type) or ...
  
  This method acts like L</bind_param>, but also enables values to be
  updated by the statement. The statement is typically
***************
*** 2979,2986 ****
  
  =item C<execute>
  
!   $rv = $sth->execute                || die $sth->errstr;
!   $rv = $sth->execute(@bind_values)  || die $sth->errstr;
  
  Perform whatever processing is necessary to execute the prepared
  statement.  An C<undef> is returned if an error occurs.  A successful
--- 2980,2987 ----
  
  =item C<execute>
  
!   $rv = $sth->execute                or die $sth->errstr;
!   $rv = $sth->execute(@bind_values)  or die $sth->errstr;
  
  Perform whatever processing is necessary to execute the prepared
  statement.  An C<undef> is returned if an error occurs.  A successful
***************
*** 3414,3420 ****
  If the C<RaiseError> attribute is not set, then DBI calls would need to be
  manually checked for errors, typically like this:
  
!   $h->method(@args) || die $h->errstr;
  
  With C<RaiseError> set, the DBI will automatically C<die> if any DBI method
  call on that handle (or a child handle) fails, so you don't have to
--- 3415,3421 ----
  If the C<RaiseError> attribute is not set, then DBI calls would need to be
  manually checked for errors, typically like this:
  
!   $h->method(@args) or die $h->errstr;
  
  With C<RaiseError> set, the DBI will automatically C<die> if any DBI method
  call on that handle (or a child handle) fails, so you don't have to
***************
*** 3456,3470 ****
  Here's a complete example program to select and fetch some data:
  
    my $dbh = DBI->connect("dbi:DriverName:db_name", $user, $password)
!       || die "Can't connect to $data_source: $DBI::errstr";
  
    my $sth = $dbh->prepare( q{
            SELECT name, phone
            FROM mytelbook
!   }) || die "Can't prepare statement: $DBI::errstr";
  
    my $rc = $sth->execute
!       || die "Can't execute statement: $DBI::errstr";
  
    print "Query will return $sth->{NUM_OF_FIELDS} fields.\n\n";
    print "Field names: @{ $sth->{NAME} }\n";
--- 3457,3471 ----
  Here's a complete example program to select and fetch some data:
  
    my $dbh = DBI->connect("dbi:DriverName:db_name", $user, $password)
!       or die "Can't connect to $data_source: $DBI::errstr";
  
    my $sth = $dbh->prepare( q{
            SELECT name, phone
            FROM mytelbook
!   }) or die "Can't prepare statement: $DBI::errstr";
  
    my $rc = $sth->execute
!       or die "Can't execute statement: $DBI::errstr";
  
    print "Query will return $sth->{NUM_OF_FIELDS} fields.\n\n";
    print "Field names: @{ $sth->{NAME} }\n";
***************
*** 3490,3496 ****
  
    open FH, "<phone.csv" or die "Unable to open phone.csv: $!";
    while (<FH>) {
!       chop;
        my ($name, $phone) = split /,/;
        $sth->execute($name, $phone);
    }
--- 3491,3497 ----
  
    open FH, "<phone.csv" or die "Unable to open phone.csv: $!";
    while (<FH>) {
!       chomp;
        my ($name, $phone) = split /,/;
        $sth->execute($name, $phone);
    }
-->8---

-- 
H.Merijn Brand           Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl-5.005.03, 5.6.0, 5.6.1, 5.7.1 & 623 on HP-UX 10.20 & 11.00, AIX 4.2
   AIX 4.3, WinNT 4, Win2K pro & WinCE 2.11 often with Tk800.022 &/| DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/

0
h
3/21/2001 4:18:31 PM
perl.dbi.dev 1960 articles. 0 followers. Follow

1 Replies
388 Views

Similar Articles

[PageSpeed] 45

The only surprise for me is that there wasn't more gain using
bind_columns()/fetch().  The documentation does mention it as faster than
the other methods and I never use anything else.  The aliasing of fetch()
from fetchrow_arrayref() isn't important, but using bind_col() or
bind_columns() is.

Maybe there is some way to emphasize it more strongly in the documentation.
A lot of people seen enamored with fetchrow_array() and fetchrow_hashref()
and may not realize the performance penalty they are incurring.

From DBI.pm:

=item B<fetchrow_arrayref>

  $ary_ref = $sth->fetchrow_arrayref;
  $ary_ref = $sth->fetch;    # alias

Fetches the next row of data and returns a reference to an array
holding the field values.  Null field values are returned as undef.
This is the fastest way to fetch data, particularly if used with
$sth->bind_columns.

--
Mac :})
** I normally forward private database questions to the DBI mail lists. **
Give a hobbit a fish and he'll eat fish for a day.
Give a hobbit a ring and he'll eat fish for an age.
----- Original Message -----
From: "H.Merijn Brand" <h.m.brand@hccnet.nl>
To: "Tim Bunce" <Tim.Bunce@ig.co.uk>
Cc: "DBI developers" <dbi-dev@perl.org>
Sent: Wednesday, March 21, 2001 8:18 AM
Subject: DBI 1.14 doc patches and performance tests


> . . .
> Every percentage is the performance gain towards the previous (slower)
fetch
> method. What surprised my audience was the fact that using bind_columns
with
> fetch was faster along the line than fetch without bind_columns (everyone
using
> DBI in the audience seemed to use fetchrow_arrayref all the time, and they
were
> not aware of the fact that fetch was an alias. Noone was using
bind_columns).
> . . .

0
mchase
3/21/2001 9:21:59 PM
Reply: