Advanced DBI

Perl Conference 3.0 Tutorial

August 1999

Tim Bunce

Topical Topics

Wheels within Wheels

Error Checking and Handling

Transactions

Speed Speed Speed!

Portability

The Power of the Proxy

DBI for the Web

Handling LONG/BLOB data

Tainting

Flexing the Multiplex

Wheels within Wheels

The DBI architecture

and how to watch it at work

Under the hood

DBI defines and implements an interface

Driver modules do much of the real work

DBI provides default methods, functions, tools etc for drivers

Not limited to the lowest common denominator - mechanism provided for driver specific extensions

Designed and built for speed

Valuable detailed call tracing/debugging built-in

A picture is worth?

Setting the scene

Inner and outer worlds

Application and Drivers

Inner and outer handles

DBI handles are references to tied hashes

The DBI Method Dispatcher

gateway between the worlds and the heart of the DBI

… Now we'll go all deep and visual for a while...

Architecture of the DBI classes #1

Architecture of the DBI classes #2

Anatomy of a DBI handle

Method call walk-through

Consider a simple prepare call:
$dbh->prepare(...)

$dbh is reference to a DBI::db object (regardless of driver)

DBI::db::prepare is an alias for DBI dispatch method

Dispatch calls driver’s prepare method something like this:

   my $inner_hash_ref = …     # from tie magic

   my $implementor_class = … # from DBI magic data

   $inner_hash_ref->$implementor_class::prepare(...)

Driver code gets the inner hash so it has fast access to the hash contents

Watching the DBI in action

DBI has detailed call tracing built-in

The trace can be very helpful in understanding application behavior and for debugging

Shows parameters and results

Trace information can be written to a file

Not used often enough!

Our program for today...

1: #!/usr/bin/perl -w

2:

3: use DBI;

4:

5: $dbh = DBI->connect('', '', '', { RaiseError => 1 });

6:

7: $upd = $dbh->prepare("UPDATE prices SET price=? WHERE prod=?");

8: $ins = $dbh->prepare("INSERT INTO prices (prod,price) VALUES(?,?)");

9: $rows = $upd->execute(42, "Widgets");

10: $ins->execute("Widgets", 42) if $rows == 0;

11:

12: $dbh->disconnect;

Enabling Tracing

Per handle

$h->trace($level);

$h->trace($level, $filename);

Only effects that handle and any new child handles created from it

Child handles get trace level of parent in effect at time of creation

Global (internal to application)

DBI->trace(...);

Sets effective global default minimum trace level

Global (external to application)

Enabled using DBI_TRACE environment variable

DBI_TRACE=digits DBI->trace(digits);

DBI_TRACE=filename DBI->trace(2, filename);

DBI_TRACE=digits=filename DBI->trace(digits, filename);

Trace level 1

Trace level 1 shows method results and line numbers:

    <- connect= DBI::db=HASH(0xe0abc) at DBI.pm line 356.

    <- STORE('PrintError', 1)= 1 at DBI.pm line 382.

    <- STORE('AutoCommit', 1)= 1 at DBI.pm line 382.

    <- STORE('RaiseError', 1)= 1 at DBI.pm line 382.

    <- prepare('UPDATE …')= DBI::st=HASH(0xe1238) at test.pl line 7.

    <- prepare('INSERT …')= DBI::st=HASH(0xe1504) at test.pl line 8.

    <- execute= '0E0' at test.pl line 9.

    <- execute= 1 at test.pl line 10.

    <- disconnect= 1 at test.pl line 11.

    <- DESTROY= undef

    <- DESTROY= undef

    <- DESTROY= undef

    <- DESTROY= undef

Trace level 2 and above

Trace level 2 shows calls with parameters and more:

    -> connect for DBD::ODBC::dr (DBI::dr=HASH(0x13dfec)~0xe14a4

                                  '' '' **** HASH(0xe0a10))

    <- connect= DBI::db=HASH(0xe0ab0) at DBI.pm line 356.

    -> STORE for DBD::ODBC::db (DBI::db=HASH(0xe0abc)~INNER 'PrintError' 1)

    <- STORE= 1 at DBI.pm line 382.

    -> prepare for DBD::ODBC::db (DBI::db=HASH(0xe0ab0)~0xe0abc

                                  'UPDATE prices SET price=? WHERE prod=?')

    <- prepare= DBI::st=HASH(0xe1274) at test.pl line 7.

Trace level 3 and above shows more processing details

Error Checking and Handling

To err is human, to detect, divine.

The importance of error checking

Errors happen!

Failure happens when you don't expect errors!

database crash / network disconnection

lack of disk space for insert or select (sort space for order by)

server math error on select (divide by zero)

and maybe, just maybe, errors in your own code (Gasp!)

Beat failure by expecting errors!

Detect errors early to limit effects

Defensive Programming, e.g., check assumptions

Through Programming, e.g., check for errors after fetch loops

Error checking - ways and means

Error checking the hard way...

$h->method || die "DBI method failed: $DBI::errstr";

$h->method || die "DBI method failed: $DBI::errstr";

$h->method || die "DBI method failed: $DBI::errstr";

Error checking the easy way...

$h->{RaiseError} = 1;

$h->method;

$h->method;

$h->method;

Handling errors the smart way

For simple applications immediate death on error is fine

For more advanced applications greater control is needed

Life after death:

$h->{RaiseError} = 1;

eval { … $h->method; … };

if ($@) { … }

Bonus prize

Other, non-DBI, code within the eval block may also raise an exception that will be caught and handled cleanly

Transactions

To do or to undo,

that is the question

Transactions - What's it all about?

Far more than just locking

The A.C.I.D. test

True transactions give true safety

even from power failures and system crashes!

Incomplete transactions are automatically rolled-back by the database server when it's restarted.

Also removes burden of undoing incomplete changes

Hard to implement (for the vendor) and has significant performance cost

A very large topic worthy of an entire tutorial

Transactions - Life Preserving

Classic: system crash between one bank account being debited and another being credited.

Dramatic: power failure during update statement on 3 million rows after 2 minutes when only part way through.

Real-world: complex series of inter-related updates, deletes and inserts on many separate tables fails at the last step due to a duplicate primary key on an insert.

Transaction recovery would handle all these situations automatically.

Makes a system far more robust and trustworthy over the long term.

Use transactions if your database supports them.

If it doesn't and you need them, switch to a different database.

Transactions - How the DBI helps

Tools of the trade:

Set AutoCommit off, and RaiseError on

Wrap eval { … } around the code

Use $dbh->commit; and $dbh->rollback;

Disable AutoCommit via $dbh->{AutoCommit} = 0;

to enable transactions and thus rollback-on-error

Enable RaiseError via $dbh->{RaiseError} = 1;

to automatically 'throw an exception' on error

The surrounding eval { … }

catches the exception and puts the error message into $@

Test $@ and rollback() if set, else commit()

Transactions - Example code

$dbh->{RaiseError} = 1;

eval {

    $dbh->method(…);   # assorted DBI calls

    foo(...);          # application code

    $dbh->commit;      # commit the changes

};

if ($@) {

    warn "Transaction aborted because $@";

    $dbh->rollback;

    ...

}

Transactions - Further comments

The eval { … } catches all exceptions

not just from DBI calls. Also catches fatal runtime errors from Perl.

Put commit() inside the eval

ensures commit failure is caught cleanly.

remember that commit() itself may fail for many reasons.

Don't forget that rollback() may also fail

due to database crash or network failure etc.

Other points:

Always explicitly commit or rollback before disconnect().

Destroying a connected $dbh should always rollback.

END blocks can catch exit-without-disconnect to rollback and disconnect cleanly.

Speed Speed Speed!

What helps,what doesn't

Give me speed!

DBI was designed for speed from day one

DBI dispatch written in hand-crafted XS/C

Dispatch to XS driver method calls optimized

Cached attributes returned directly by DBI dispatch

DBI overhead is generally insignificant

So we'll talk about other speed issues instead ...

Partition for speed

Application partitioning

stop and think - work smarter not harder

Pick the right database for the job, if you have the choice

Work close to the data

Use stored procedures where appropriate

Moving data to/from the client is always expensive

think of latency as well as bandwidth

Use proprietary bulk-load where appropriate

Consider database maths vs. Perl maths

Multiple simple queries with 'joins' in Perl may be faster

Consider local caching, in memory or DBM file etc.

Mix 'n Match techniques as needed

Prepare for speed

prepare() - what happens in the server

Receive and parse the SQL statement into internal form

Get details for all the selected tables

Check access rights for each

Get details for all the selected fields

Check data types in expressions

Get details for all the indices of the tables

Develop an optimised query 'access plan' for best execution

Return a handle for all this cached information

This can be an expensive process - especially the 'access plan'

Some databases, like MySQL, don't cache the information but have simpler, and thus faster, plan creation

The Best Laid Plans

Query optimisation is hard

Intelligent high quality cost based query optimisation is really hard!

Know your optimiser

Oracle, Informix, Sybase, DB2, SQL Server etc. all slightly different.

Check what it's doing

Use tools to see the plans used for your queries - very helpful

Help it along

Most 'big name' databases have a mechanism to analyse and store the key distributions of indices to help the optimiser make good plans

beware: keep it fresh, old key distributions might be worse than none

Some also allow you to embed 'hints' into the SQL as comments

beware: take it easy, over hinting hinders dynamic optimisation

Respect your server's SQL cache

Optimised Access Plan etc. is cached within the server

keyed by the exact original SQL string used

Without placeholders, SQL string varies each time

so cached one is not reused

and time is wasted creating a new access plan

the new statement and access plan are added to cache

so the cache fills and other statements get pushed out

Compare do("insert … $id");

with do("insert … ?", $id);

Hot handles

Use prepare() to get a handle on the statement in the SQL cache and keep it there.

The handle for the pre-prepared statement avoids a round-trip to server for cache-check / parsing etc.

Avoid using $dbh->do(…) in a loop.

Use $sth = $dbh->prepare(…)

    and $sth->execute() instead.

For example, convert looped do("insert … ?", $id)

    to prepare("insert … ?")

    and looped execute($id).

Let the DBI cache your handles

Sometimes it's not easy to hold all your handles

e.g., library code to lookup values from the database given a $dbh

The prepare_cached() method gives you a client side statement handle cache:

while ( ($field, $value) = each %search_fields ) {

push @sql, "$field = ?";

push @values, $value;

}

$where = "";

$where = "where ".join(" and ", @sql) if @sql;

$sth = $dbh->prepare_cached("select * from table $where");

$sth->execute(@values);

Can avoid the need for global statement handle variables

Sling less for speed

while(@row = $sth->fetchrow_array) { }

one field: 3,100 fetches per cpu second

ten fields: 1,000 fetches per cpu second

while($row = $sth->fetchrow_arrayref) { }

one field: 5,300 fetches per cpu second

ten fields: 4,000 fetches per cpu second

Notes:

Timings made on an old SPARC 10 using DBD::Oracle

Timings assume instant record fetch within driver

Fields all one char. @row would be even slower for more/bigger fields

Use bind_columns() for direct access to fetched fields

Keep a handle on your databases

Connecting to a database can be slow

Try to connect once and stay connected where practical

We'll discuss web server issues later

New method:

The new connect_cached() method acts like prepare_cached() but for database handles

It also checks the connection and reconnects if it's broken

Similar, but not quite the same as Apache::DBI

New and subject to change with experience

Potentially most useful with DBD::Proxy & DBI::ProxyServer

Portability

A Holy Grail

To be taken with a pinch of salt

Portability in practice

Portability requires care and testing - it can be tricky

Platform Portability - the easier bit

Availability of database client software (and server if required)

Availability of DBD driver

DBD::Proxy can address both these issues - see later

Database Portability - more tricky

Differences in SQL dialects cause most problems

Differences in data types can also be a problem

Driver capabilities (placeholders etc)

DBIx::Compat module (in DBIx::RecordSet) may be useful.

A standard DBI test suite is needed (and planned).

The Power of the Proxy

Thin clients ...

and other buzz words

DBD::Proxy & DBI::ProxyServer

Networking for Non-networked databases

DBD::Proxy driver forwards calls over network to remote DBI::ProxyServer

No changes in application behavior

Only the DBI->connect statement needs to be changed

Proxy can be made completely transparent by setting the DBI_AUTOPROXY environment variable

So not even the DBI->connect statement needs to be changed!

DBI::ProxyServer works on Win32

Access to Access and other Win32 ODBC and ADO data sources

Developed by Jochen Wiedmann

A proxy picture

Thin clients and other buzz words

Proxying for remote access: "thin-client"

No need for database client code on the DBI client

Proxying for network security: "encryption"

Can use Crypt::IDEA, Crypt::DES etc.

Proxying for "access control" and "firewalls"

handy for web servers

Proxying for action control

e.g., only allow specific select or insert statements

Proxying for local access: "n-tier middleware"

connect_cached(), memory use, licensing limits

Proxying for performance: "compression"

Can compress data transfers using Compress::Zlib

DBI for the Web

Hand waving from 30,000 feet

Web DBI - Connect speed

Databases can be slow to connect

Traditional CGI forces a new connect per request

Move Perl and DBI into the web server

Apache with mod_perl and Apache::DBI module

Microsoft IIS with ActiveState's PerlEx

Connections can then persist between requests

Apache::DBI automatically used by DBI if loaded

No CGI script changes required to get persistence

Take care not to change the shared session behaviour

Leave the $dbh in the same state you found it!

Other alternatives include

FastCGI, CGI::SpeedyCGI and CGI::MiniSvr

Web DBI - Too many connections

Limits on Database Connections

Database platform resources or licensing

Memory consumption of web servers

Partition web servers into General and Database

Direct database work to the Database web servers

Use Reverse Proxy / Redirect / Rewrite to achieve this

Allows each subset of servers to be tuned to best fit workload

And be run on appropriate platforms

Web DBI - State-less-ness

No fixed client-server pair

Each request can be handled by a different process.

So can't simply stop fetching when one page is complete and continue fetching from the same $sth when the next page is requested.

And transactions can't span requests. Even if they could you'd have problems with database locks being held etc.

Need access to 'accumulated state' somehow:

via the client (e.g., hidden form fields)

via the database (records in a session_state table)

via the web server file system (DBM files etc)

Need to purge old state info if stored on server, so timestamp it

See Apache::Session module

DBI::ProxyServer + connect_cached with session ID may suit

Web DBI - Browsing pages of results

Re-execute query each time then count/discard (simple but expensive)

works well for small results sets or where users rarely view many pages

fast initial response, degrades gradually for later pages

count/discard in server is better but still inefficient for large result sets

affected by inserts and deletes from other processes

Re-execute query with where clause using min/max keys from last results

works well where original query can be qualified in that way, not common

Select and cache full result rows somewhere for fast access

can be expensive for large result sets with big fields

Select and cache only the row keys, fetch full rows as needed

optimisation of above, use ROWID if supported, "select … where … in (…)"

If data is static and queries predictable then custom pre-built indexes may be useful

The cache can be on web server, using DBM file for example, or on database server using a table keyed by session id

Web DBI - Concurrent editing

How to prevent updates overwriting each other?

You can use Optimistic Locking via 'qualified update':

update table set ...=...

where key = $old_key

and field1 = $old_field1

and field2 = $old_field2 and … for all fields

Potential problems with floating point data values not matching

Some databases support a high-resolution 'update timestamp' field that can be checked instead

Check the update row count

If it's zero then you know the record has been changed or deleted by another process

Web DBI - Tips for the novice

Test one step at a time

Test perl + DBI + DBD driver outside the web server first

Test web server + non-DBI CGI next

Remember that CGI scripts run as a different user with a different environment - expect to be tripped up by that

DBI trace is your friend - use it

Use the Perl "-w" and "-T" options. Always "use strict;"

Read and inwardly digest the WWW Security FAQ:

http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html

Read the Perl CGI FAQ:

http://www.perl.com/cgi-bin/pace/pub/doc/FAQs/cgi/perl-cgi-faq.html

And, if appropriate, read the mod_perl information available from:

http://perl.apache.org

Handling LONG/BLOB data

What makes LONG / BLOB / MEMO data special?

Not practical to pre-allocate fixed size buffers for worst case

Fetching LONGs - treat as normal fields after setting:

$dbh->{LongReadLen} - buffer size to allocate for expected data

$dbh->{LongTruncOk} - should truncating-to-fit be allowed

Inserting LONGs

The limitations of string literals

The benefits of placeholders

Chunking / Piecewise processing not supported

So you're limited to available memory

Some drivers support an unofficial blob_read() method

DBI security tainting

By default the DBI ignores Perl tainting

doesn't taint returned data

doesn't check that parameters are not tainted

The Taint attribute enables that behaviour

If Perl itself is in taint mode

Each handle has it's own inherited Taint attribute

So can be enabled for a connection and disabled for particular statements, for example:

$dbh = DBI->connect(…, { Taint => 1 });

$sth = $dbh->prepare("select * from safe_table");

$sth->{Taint} = 0; # no tainting on this handle

Flexing the Multiplex

A new kid on the block

DBD::Multiplex

DBD::Multiplex - new and experimental

Connects to multiple databases at once (via DBI)

Executes statements on all of them by default

Could be configured to:

insert into all databases but select from one

fallback to alternate database if primary is unavailable

select from all and check results (pick most common)

select round-robin / or at random to distribute load

Can be used with DBD::Proxy, either above or below

May also acquire fancy caching in later versions

Watch this space!

Reference Materials

http://www.symbolstone.org/technology/perl/DBI

The DBI Home Page

http://www.isc.org/dbi-lists.html

To subscribe to the DBI Mailing Lists

http://www.perl.com/CPAN/authors/id/TIMB/DBI_Talk1_1997.tar.gz

My DBI session at The Perl Conference 1.0 - general introduction

http://www.perl.com/CPAN/authors/id/TIMB/DBI_Talk2_1998.tar.gz

My DBI session at The Perl Conference 2.0 - more depth

http://www.perl.com/CPAN/authors/id/TIMB/DBI_Talk3_1999.tar.gz

This DBI tutorial at The Perl Conference 3.0!

http://www.oreilly.com/catalog/perldbi/

or http://www.amazon.com/exec/obidos/ASIN/1565926994/dbi

The forthcoming DBI book! Due late 1999 or early 2000.

(Order in advance at the conference to get special discounted rate :-)

The end.

Till next year...


	Wheels within Wheels
	Error Checking and Handling
	Transactions
	Speed Speed Speed!
	Portability
	The Power of the Proxy
	DBI for the Web
	Handling LONG/BLOB data
	Tainting
	Flexing the Multiplex


	DBI defines and implements an interface
	Driver modules do much of the real work
	DBI provides default methods, functions, tools etc for drivers
	Not limited to the lowest common denominator - mechanism provided for driver specific extensions
	Designed and built for speed
	Valuable detailed call tracing/debugging built-in


	Inner and outer worlds
		Application and Drivers

	Inner and outer handles
		DBI handles are references to tied hashes

	The DBI Method Dispatcher
		gateway between the worlds and the heart of the DBI


	… Now we'll go all deep and visual for a while...


	Consider a simple prepare call: $dbh->prepare(...)
	$dbh is reference to a DBI::db object (regardless of driver)
	DBI::db::prepare is an alias for DBI dispatch method
	Dispatch calls driver’s prepare method something like this:
	my $inner_hash_ref = … # from tie magic
	my $implementor_class = … # from DBI magic data
	$inner_hash_ref->$implementor_class::prepare(...)
	Driver code gets the inner hash so it has fast access to the hash contents


	DBI has detailed call tracing built-in
	The trace can be very helpful in understanding application behavior and for debugging
	Shows parameters and results
	Trace information can be written to a file
	Not used often enough!


	1: #!/usr/bin/perl -w
	2:
	3: use DBI;
	4:
	5: $dbh = DBI->connect('', '', '', { RaiseError => 1 });
	6:
	7: $upd = $dbh->prepare("UPDATE prices SET price=? WHERE prod=?");
	8: $ins = $dbh->prepare("INSERT INTO prices (prod,price) VALUES(?,?)");
	9: $rows = $upd->execute(42, "Widgets");
	10: $ins->execute("Widgets", 42) if $rows == 0;
	11:
	12: $dbh->disconnect;


Per handle
		$h->trace($level);
		$h->trace($level, $filename);
	Only effects that handle and any new child handles created from it
	Child handles get trace level of parent in effect at time of creation
Global (internal to application)
		DBI->trace(...);
	Sets effective global default minimum trace level
Global (external to application)
	Enabled using DBI_TRACE environment variable
		DBI_TRACE=digits DBI->trace(digits);
		DBI_TRACE=filename DBI->trace(2, filename);
		DBI_TRACE=digits=filename DBI->trace(digits, filename);


	Trace level 1 shows method results and line numbers:

	<- connect= DBI::db=HASH(0xe0abc) at DBI.pm line 356.
	<- STORE('PrintError', 1)= 1 at DBI.pm line 382.
	<- STORE('AutoCommit', 1)= 1 at DBI.pm line 382.
	<- STORE('RaiseError', 1)= 1 at DBI.pm line 382.

	<- prepare('UPDATE …')= DBI::st=HASH(0xe1238) at test.pl line 7.
	<- prepare('INSERT …')= DBI::st=HASH(0xe1504) at test.pl line 8.
	<- execute= '0E0' at test.pl line 9.
	<- execute= 1 at test.pl line 10.
	<- disconnect= 1 at test.pl line 11.

	<- DESTROY= undef
	<- DESTROY= undef
	<- DESTROY= undef
	<- DESTROY= undef


	Trace level 2 shows calls with parameters and more:

	-> connect for DBD::ODBC::dr (DBI::dr=HASH(0x13dfec)~0xe14a4
	'' '' **** HASH(0xe0a10))
	<- connect= DBI::db=HASH(0xe0ab0) at DBI.pm line 356.

	-> STORE for DBD::ODBC::db (DBI::db=HASH(0xe0abc)~INNER 'PrintError' 1)
	<- STORE= 1 at DBI.pm line 382.

	-> prepare for DBD::ODBC::db (DBI::db=HASH(0xe0ab0)~0xe0abc
	'UPDATE prices SET price=? WHERE prod=?')
	<- prepare= DBI::st=HASH(0xe1274) at test.pl line 7.

	Trace level 3 and above shows more processing details


Errors happen!
	Failure happens when you don't expect errors!
		database crash / network disconnection
		lack of disk space for insert or select (sort space for order by)
		server math error on select (divide by zero)
		and maybe, just maybe, errors in your own code (Gasp!)

	Beat failure by expecting errors!

	Detect errors early to limit effects
		Defensive Programming, e.g., check assumptions
		Through Programming, e.g., check for errors after fetch loops


	Error checking the hard way...

			$h->method \|\| die "DBI method failed: $DBI::errstr";
			$h->method \|\| die "DBI method failed: $DBI::errstr";
			$h->method \|\| die "DBI method failed: $DBI::errstr";

	Error checking the easy way...

			$h->{RaiseError} = 1;
			$h->method;
			$h->method;
			$h->method;


For simple applications immediate death on error is fine

For more advanced applications greater control is needed

Life after death:
		$h->{RaiseError} = 1;
		eval { … $h->method; … };
		if ($@) { … }

Bonus prize
	Other, non-DBI, code within the eval block may also raise an exception that will be caught and handled cleanly


	Far more than just locking
	The A.C.I.D. test
	True transactions give true safety
			even from power failures and system crashes!
			Incomplete transactions are automatically rolled-back by the database server when it's restarted.
	Also removes burden of undoing incomplete changes
	Hard to implement (for the vendor) and has significant performance cost
	A very large topic worthy of an entire tutorial


	Classic: system crash between one bank account being debited and another being credited.
	Dramatic: power failure during update statement on 3 million rows after 2 minutes when only part way through.
	Real-world: complex series of inter-related updates, deletes and inserts on many separate tables fails at the last step due to a duplicate primary key on an insert.

	Transaction recovery would handle all these situations automatically.
	Makes a system far more robust and trustworthy over the long term.
	Use transactions if your database supports them.
	If it doesn't and you need them, switch to a different database.


Tools of the trade:
	Set AutoCommit off, and RaiseError on
	Wrap eval { … } around the code
	Use $dbh->commit; and $dbh->rollback;
Disable AutoCommit via $dbh->{AutoCommit} = 0;
		to enable transactions and thus rollback-on-error
Enable RaiseError via $dbh->{RaiseError} = 1;
		to automatically 'throw an exception' on error
The surrounding eval { … }
		catches the exception and puts the error message into $@
Test $@ and rollback() if set, else commit()


				$dbh->{RaiseError} = 1;
				eval {
				$dbh->method(…); # assorted DBI calls
				foo(...); # application code
				$dbh->commit; # commit the changes
				};
				if ($@) {
				warn "Transaction aborted because $@";
				$dbh->rollback;
				...
				}


	The eval { … } catches all exceptions
			not just from DBI calls. Also catches fatal runtime errors from Perl.
	Put commit() inside the eval
			ensures commit failure is caught cleanly.
			remember that commit() itself may fail for many reasons.
	Don't forget that rollback() may also fail
			due to database crash or network failure etc.
	Other points:
			Always explicitly commit or rollback before disconnect().
			Destroying a connected $dbh should always rollback.
			END blocks can catch exit-without-disconnect to rollback and disconnect cleanly.


	DBI was designed for speed from day one
	DBI dispatch written in hand-crafted XS/C
	Dispatch to XS driver method calls optimized
	Cached attributes returned directly by DBI dispatch
	DBI overhead is generally insignificant




			So we'll talk about other speed issues instead ...


Application partitioning
		stop and think - work smarter not harder
	Pick the right database for the job, if you have the choice
	Work close to the data
	Use stored procedures where appropriate
	Moving data to/from the client is always expensive
		think of latency as well as bandwidth
	Use proprietary bulk-load where appropriate
	Consider database maths vs. Perl maths
	Multiple simple queries with 'joins' in Perl may be faster
	Consider local caching, in memory or DBM file etc.
	Mix 'n Match techniques as needed


prepare() - what happens in the server
		Receive and parse the SQL statement into internal form
		Get details for all the selected tables
		Check access rights for each
		Get details for all the selected fields
		Check data types in expressions
		Get details for all the indices of the tables
		Develop an optimised query 'access plan' for best execution
		Return a handle for all this cached information

	This can be an expensive process - especially the 'access plan'
	Some databases, like MySQL, don't cache the information but have simpler, and thus faster, plan creation


Query optimisation is hard
	Intelligent high quality cost based query optimisation is really hard!
Know your optimiser
	Oracle, Informix, Sybase, DB2, SQL Server etc. all slightly different.
Check what it's doing
	Use tools to see the plans used for your queries - very helpful
Help it along
	Most 'big name' databases have a mechanism to analyse and store the key distributions of indices to help the optimiser make good plans
		beware: keep it fresh, old key distributions might be worse than none
	Some also allow you to embed 'hints' into the SQL as comments
		beware: take it easy, over hinting hinders dynamic optimisation


	Optimised Access Plan etc. is cached within the server
		keyed by the exact original SQL string used

	Without placeholders, SQL string varies each time
		so cached one is not reused
		and time is wasted creating a new access plan
		the new statement and access plan are added to cache
		so the cache fills and other statements get pushed out

	Compare do("insert … $id");
	with do("insert … ?", $id);


	Use prepare() to get a handle on the statement in the SQL cache and keep it there.
	The handle for the pre-prepared statement avoids a round-trip to server for cache-check / parsing etc.
	Avoid using $dbh->do(…) in a loop.
	Use $sth = $dbh->prepare(…)
	and $sth->execute() instead.
	For example, convert looped do("insert … ?", $id)
	to prepare("insert … ?")
	and looped execute($id).


Sometimes it's not easy to hold all your handles
		e.g., library code to lookup values from the database given a $dbh
The prepare_cached() method gives you a client side statement handle cache:
	while ( ($field, $value) = each %search_fields ) {
	push @sql, "$field = ?";
	push @values, $value;
	}
	$where = "";
	$where = "where ".join(" and ", @sql) if @sql;
	$sth = $dbh->prepare_cached("select * from table $where");
	$sth->execute(@values);
Can avoid the need for global statement handle variables


	while(@row = $sth->fetchrow_array) { }
		one field: 3,100 fetches per cpu second
		ten fields: 1,000 fetches per cpu second

	while($row = $sth->fetchrow_arrayref) { }
		one field: 5,300 fetches per cpu second
		ten fields: 4,000 fetches per cpu second

	Notes:
		Timings made on an old SPARC 10 using DBD::Oracle
		Timings assume instant record fetch within driver
		Fields all one char. @row would be even slower for more/bigger fields
		Use bind_columns() for direct access to fetched fields


	Connecting to a database can be slow
	Try to connect once and stay connected where practical
		We'll discuss web server issues later

	New method:
		The new connect_cached() method acts like prepare_cached() but for database handles
		It also checks the connection and reconnects if it's broken
		Similar, but not quite the same as Apache::DBI
		New and subject to change with experience
		Potentially most useful with DBD::Proxy & DBI::ProxyServer


	Portability requires care and testing - it can be tricky
	Platform Portability - the easier bit
		Availability of database client software (and server if required)
		Availability of DBD driver
		DBD::Proxy can address both these issues - see later
	Database Portability - more tricky
		Differences in SQL dialects cause most problems
		Differences in data types can also be a problem
		Driver capabilities (placeholders etc)
	DBIx::Compat module (in DBIx::RecordSet) may be useful.
	A standard DBI test suite is needed (and planned).


	Networking for Non-networked databases
	DBD::Proxy driver forwards calls over network to remote DBI::ProxyServer
	No changes in application behavior
		Only the DBI->connect statement needs to be changed
	Proxy can be made completely transparent by setting the DBI_AUTOPROXY environment variable
		So not even the DBI->connect statement needs to be changed!
	DBI::ProxyServer works on Win32
		Access to Access and other Win32 ODBC and ADO data sources
	Developed by Jochen Wiedmann


	Proxying for remote access: "thin-client"
			No need for database client code on the DBI client
	Proxying for network security: "encryption"
			Can use Crypt::IDEA, Crypt::DES etc.
	Proxying for "access control" and "firewalls"
			handy for web servers
	Proxying for action control
			e.g., only allow specific select or insert statements
	Proxying for local access: "n-tier middleware"
			connect_cached(), memory use, licensing limits
	Proxying for performance: "compression"
			Can compress data transfers using Compress::Zlib


Databases can be slow to connect
		Traditional CGI forces a new connect per request
Move Perl and DBI into the web server
		Apache with mod_perl and Apache::DBI module
		Microsoft IIS with ActiveState's PerlEx
Connections can then persist between requests
		Apache::DBI automatically used by DBI if loaded
		No CGI script changes required to get persistence
Take care not to change the shared session behaviour
		Leave the $dbh in the same state you found it!
Other alternatives include
	FastCGI, CGI::SpeedyCGI and CGI::MiniSvr