DBI: The Neophyte’s Guide
Alligator Descartes
<descarte@symbolstone.org>
Copyright ©1999 Arcane Technologies Ltd.

Introduction
What’s in this talk?
What’s not in this talk?!
What is the DBI?
What are DBDs?
Why do we need the DBI?
What about ODBC? Doesn’t ODBC do this?
What is oraperl? Ingperl? MsqlPerl?
DBI Resources

What’s in this talk?
Overview of the DBI and positioning beside ODBC
Overview of the DBI Architecture
Connecting to your databases
Manipulating the contents of your database
Database proxying
Question and Answer Session

What’s not in this talk?!
Because of time constraints, a lot of alternative strategies haven’t been discussed
I will stick primarily to discussing the most portable aspects of DBI. Topics such as table creation and stored procedures are not covered
The internal workings of DBI, error handing, tracing, magic and performance are discussed by Tim Bunce in his tutorial later today

What is DBI?
DBI = DataBase Interface
DBI is database-independent
DBI allows you to write code that interacts with databases independent of the underlying database
A DBI program will work with little, or no, modification on Oracle, Informix, MySQL and so on...

What are DBDs?
DBD = DataBase Driver
Drivers are used by DBI to perform the actual database work
Drivers are database-dependent
Drivers exist for most popular databases including Oracle, Informix, Ingres, mSQL, MySQL, Solid and many more

Why do we need the DBI?
Abstracted interface hides database-specific functionality from the programmer
DBI scripts are generally completely portable between databases with little modification
Faster application development and debugging cycle

What about ODBC?
ODBC and DBI share similar goals and heritage: platform- and database-independence
DBI is far more compact and less complex than ODBC. ODBC has a richer feature set
Although ODBC drivers exist for many operating systems, they can be expensive and difficult to use. DBI is free and easy to configure and use
A driver  called DBD::ODBC exists and can be used to connect to ODBC-based databases such as Microsoft Access

What is oraperl? Ingperl? ...
Prior to the development of DBDs for certain databases, database-specific Perl modules were developed
Oraperl for Oracle, Ingperl for Ingres and so on…
These modules are not portable between databases
DBI has emulation layers for some of these legacy modules using DBI under the bonnet

DBI Resources
The DBI home page
http://www.symbolstone.org/technology/perl/DBI
``Programming the Perl DBI’’ -- Alligator Descartes and Tim Bunce. Can be ordered via amazon.com from the DBI home page
This presentation and demonstration source code can be downloaded from the DBI home page

Any Questions?

DBI Architecture
Architectural Overview
A Handy Reminder
Handles
Driver Handles
Database Handles
Statement Handles
Basic Database Operations
Connecting to the Database
Data Sources
Disconnecting from the Database

Architectural Overview

A Handy Reminder
A tip to help you remember which modules do what in DBI
DBI is DataBase Independent
DBD is DataBase Dependent

Handles
All operations in DBI are performed via
DBI module as a static method
Driver Handles
Database Handle instance method
Statement Handle instance method
Driver Handles are not used explicitly by the programmer in scripts

Architectural Overview of Handles

Driver Handles
Driver Handles encapsulate a DBD
There exists exactly one driver handle for each loaded DBD
For example, a script using both MySQL and Oracle will have two instantiated driver handles
Driver handles are never used directly in programs and are for internal use only by DBI

Driver Handle Implications
With the DBI architecture as it is, there are several implications for driver handles
One or more Driver Handles can co-exist within a Perl script simultaneously
This implies that you can connect to two different types of database simultaneously

Database Handles
Database Handles encapsulate a single connection to a database
Database handles are created via a driver handle for the desired database type. This is done via the DBI->connect() method
   $dbh = DBI->connect( … );

Database Handle Implications
Depending on the DBD, one or more Database Handles can be simultaneously created either through the same driver handle or different driver handles
 For example, you might wish to connect to an Oracle database and an mSQL database, read data from tables in Oracle and write that data into mSQL

Statement Handles
Statement Handles encapsulate a statement to be issued to the database
Statement handles are created via a database handle. That is, a statement is issued to the database represented by the database handle
$sth = $dbh->prepare( “SELECT yadda FROM blah” );
Statement handles also allow you to fetch data from the issued statement from the database

Statement Handle Implications
Depending on the DBD, one or more statement handles can be simultaneously created
For example, you might wish to select data from two different tables simultaneously in different statements then combine that data in Perl
Additionally, this allows you to pre-prepare all your statements in advance and execute them as needed

Handle Naming Conventions
Throughout the example code in this presentation and other DBI texts, you might see the following naming convention of variables
$drh = Driver Handle
$dbh = Database Handle
$sth = Statement Handle

Any Questions?

Starting Out with DBI
Initializing the DBI
Data Sources
Checking Available Drivers
Connecting to the Database
Disconnecting from the Database
Example

Initializing the DBI
DBI can be initialized very simply by adding
    use DBI;
   to your scripts
DBI internally handles all driver loading automatically and DBDs should not normally be explicitly use’d or require’d in your scripts

Data Sources
Data Sources define at least two things:
The underlying database type to which we wish to connect
The name or location of that database
Data Source strings are usually defined as
 dbi:database_type:database_location:key=value;...

Data Sources ( cont. )
For example, connecting to an Oracle database called DEV can be achieved with the DSN:
   dbi:Oracle:DEV
If not already loaded, DBI will automatically load the desired underlying DBD, in this case DBD::Oracle
The database name is defined by the underlying database-specific requirements

Data Sources ( cont. )
Oracle typically uses alias files that map complex connection descriptors into simple aliases
mSQL and MySQL typically require hostname and port numbers to connect to
File-based DBDs such as DBD::CSV use directory names in which the database files are located

Data Sources ( cont. )
In summary, Data Source specification is about the most non-portable aspect of DBI
In reality, a well-written DBI script that uses standard SQL can be ported by simply changing the DSN in the DBI->connect() call

Checking Available Drivers
Prior to attempting connection to database, you can check which drivers are available by calling
            DBI->available_drivers()
This returns a list of drivers you have installed on your system
Can be used in conjunction with
    DBI->data_sources( $driver_name )
   which lists all available data sources for that driver that the DBD can probe

listdsns: Lists available data sources
#!/usr/bin/perl -w
### Load DBI
use DBI;
### Probe DBI for the installed drivers
@drivers = DBI->available_drivers();
### Iterate through the drivers and list the data
### sources for each one
foreach $driver ( @drivers ) {
    print "Driver: $driver\n";
    @dataSources = DBI->data_sources( $driver );
    foreach $dataSource ( @dataSources ) {
        print "\tData Source is $dataSource\n";
    }
    print "\n";
}
exit;

listdsns: Sample Output
    Driver: ADO
    Driver: CSV
        Data source is DBI:CSV:f_dir=megaliths
        Data source is DBI:CSV:f_dir=pictish_stones
    Driver: ExampleP
        Data Source is dbi:ExampleP:dir=.
    Driver: ODBC
Driver: Oracle
Data Source is DBI:Oracle:DEV
    Driver: Proxy
    Driver: XBase
        Data Source is dbi:XBase:.

Connecting to the Database
The very first database operation you must do!
Use the DBI->connect()method
For example
        $dbh = DBI->connect( ‘dbi:Oracle:DEV’,
                          ‘user’, ‘pass’ );
That is, we minimally specify the Data Source Name of the database, a username and password
If successful, this will return a valid database handle

Disconnecting from the Database
Once all work has been completed, you must disconnect from the database
Frees any used database and system resources
Achieved by calling disconnect() against a valid database handle. For example:
        $dbh->disconnect();

Example 1
This example connects to a single Oracle database called DEV
#!/usr/bin/perl -w
### Load the DBI module
use DBI;
### Perform the connection using the Oracle driver
$dbh = DBI->connect( "dbi:Oracle:DEV", "username",
                     "password" )
    or die "Can't connect to Oracle database: $DBI::errstr\n";
### Disconnect from the database
$dbh->disconnect();
exit;

Example 2
This example connects to two Oracle databases simultaneously
#!/usr/bin/perl -w
### Load the DBI module
use DBI;
### Perform the connection using the Oracle driver
$dbh1 = DBI->connect( "dbi:Oracle:DEV", "username", "password" )
    or die "Can't connect to 1st Oracle database: $DBI::errstr\n";
$dbh2 = DBI->connect( "dbi:Oracle:OTHERDB", "username2", "password2" )
    or die "Can't connect to 2nd Oracle database: $DBI::errstr\n";
$dbh1->disconnect();
$dbh2->disconnect();
exit;

Example 3
This example demonstrates connecting to two different databases of different types
#!/usr/bin/perl -w
### Load the DBI module
use DBI;
### Perform the connection using the Oracle driver
$dbh1 = DBI->connect( "dbi:Oracle:DEV", "username", "password" )
    or die "Can't connect to Oracle database: $DBI::errstr\n";
$dbh2 = DBI->connect( "dbi:mSQL:host:dbname:1114", "username", "password" )
    or die "Can't connect to mSQL database: $DBI::errstr\n";
$dbh1->disconnect();
$dbh2->disconnect();
exit;

Any Questions?

Interacting with the Database
Simple Queries
Preparing Statements
Executing Statements
Fetching Data
Non-SELECT Statements
Bind Values
Optimizing $dbh->do()

Simple Queries
The process of retrieving data from the database is a 4-stage cycle
Preparing the statement
Executing the statement
Fetching the data
Finishing the statement

Preparing the Statement
The $dbh->prepare( $statement ) method is used
This can mean different things depending on the underlying DBD used. Oracle will use this stage to send the statement to the database for parsing. However, mSQL does nothing here at all
The SQL statement is simply a Perl string and therefore can be supplied as a literal string or as a variable holding a built on-the-fly statement

Preparing the Statement ( cont. )
#!/usr/bin/perl -w
### Load the DBI module
use DBI;
### The database handle
$dbh = DBI->connect( "dbi:Oracle:DEV", "username", "password" );
### Prepare the statement handle
$sth = $dbh->prepare( "SELECT id, name FROM megaliths" );
...
exit;

Preparing the Statement ( cont. )
For example, you might have a DBI script driven via CGI. The form has checkboxes indicating which selection criteria to use
### Collect the selected field names
@fields = ();
### Work out which checkboxes have been selected
push @fields, "name"     if $nameCheckbox     eq "CHECKED";
push @fields, "location" if $locationCheckbox eq "CHECKED";
### Sanity-check that *something* was selected
die "No fields were selected for querying!\n"
    unless @fields;
### Now build the SQL statement
$statement = sprintf "SELECT %s FROM megaliths WHERE name = %s",
    join(", ", @fields), $dbh->quote($siteNameToQuery);
### Perform the query
$sth = $dbh->prepare( $statement ) or die ...

Executing the Statement
Once the statement has been successfully prepared, it must be executed
Depending on the database, this stage typically executes the prepared statement within the database and generates a result set ready for fetching
Simply a case of calling $sth->execute() against your prepared statement handle

Fetching the Data
The fetch stage retrieves the data from the database row-by-row by use of a cursor
The most commonly used fetch style is to retrieve the data as a Perl list via
   @row = $sth->fetchrow_array()
You should continuously loop calling fetchrow_array() until all rows are returned

Fetching Data - Example 1
#/usr/bin/perl -w
### Load the DBI module
use DBI;
### Connect to the database
$dbh = DBI->connect( ‘dbi:Oracle:DEV’, ‘username’, ‘password’ );
### Prepare and execute the statement
$sth = $dbh->prepare( “SELECT name, type FROM megaliths” );
$sth->execute();
### Fetch the data
while ( @row = $sth->fetchrow_array() ) {
    print "Megalith site $row[0] is a $row[1]\n";
  }
### Disconnect from the database
$dbh->disconnect();
exit;

Fetching Data - Example 2
#/usr/bin/perl -w
### Load the DBI module
use DBI;
### Connect to the database
$dbh = DBI->connect( ‘dbi:Oracle:DEV’, ‘username’, ‘password’ );
### Prepare and execute the statement
$sth = $dbh->prepare( “SELECT name, type FROM megaliths” );
$sth->execute();
### Fetch the data
while ( ( $sitename, $sitetype ) = $sth->fetchrow_array() ) {
    print "Megalith site $sitename is a $sitetype\n";
  }
### Disconnect from the database
$dbh->disconnect();
exit;

Fetching the Data
There are additional methods for fetching data from the statement handle:
fetchrow_arrayref()
fetchrow_hashref()

Finishing the Statement
DBI automatically marks the statement handles as inactive once all the data has been fetched from it
You can manually mark statement handles as being inactive via the $sth->finish() method, but you should usually not need to invoke this method in your scripts

Non-SELECT Statements
Statements other than queries can be issued to the database via DBI
For example, UPDATE, DELETE and INSERT statements
These statements are issued by calling
     $dbh->do()
    which combines preparation and execution in a single step

Non-SELECT Statements ( cont. )
For example:
$dbh->do( “DELETE FROM megaliths” );
$dbh->do( “DROP TABLE megaliths” );
$dbh->do( “INSERT INTO megaliths VALUES ( 1, ‘Stonehenge’,
                                          ‘Wiltshire’ )” );

Bind Values
A bind value is a value that can be bound to a placeholder declared within a SQL statement
This is similar to creating an on-the-fly SQL statement such as:
        $sth = $dbh->prepare( "
                    SELECT name, location
                    FROM megaliths
                    WHERE name = " . $dbh->quote( $siteName ) . "
                " );

Bind Values ( cont. )
However, instead of interpolating the generated value into the SQL statement, you specify a placeholder and then bind the generated value to that
For example:
        $sth = $dbh->prepare( "
                    SELECT name, location
                    FROM megaliths
                    WHERE name = ?
                " );
        $sth->bind_param( 1, $dbh->quote( ‘Avebury’ ) );

Bind Values ( cont. )
The benefits of this system depend on your underlying database
For Oracle, bind values will work well with the Shared SQL Cache which result in a pre-prepared statement handle being reused instead of the statement being reparsed
Over periods of executing the same statement with different bind values, you will notice a considerable performance benefit

Bind Values ( cont. )
Multiple bind values can also be specified, e.g.:
        $sth = $dbh->prepare( "
                    SELECT name, location
                    FROM megaliths
                    WHERE name = ?
                    AND mapref = ?
                    AND type LIKE ?
                " );
        $sth->bind_param( 1, $dbh->quote( "Avebury” ) );
        $sth->bind_param( 2, $dbh->quote( $mapreference ) );
        $sth->bind_param( 3, $dbh->quote( "%Stone Circle%” ) );

Bind Values ( cont. )
Data typing is also important and may need to be explicitly stated:
    ### No need for a data type for this value. It's a string.
    $sth->bind_param( 1, $dbh->quote( "Avebury” ) );
    ### This one is obviously a number, so no type again
    $sth->bind_param( 2, 21 );
    ### However, this one is a string but looks like a number
    $sth->bind_param( 3, $dbh->quote( 123500 ), { TYPE => SQL_VARCHAR } );
    ### Alternative short-hand form of the previous statement
    $sth->bind_param( 3, $dbh->quote( 123500 ), SQL_VARCHAR );

Bind Values ( cont. )
Bind values can be more quickly assigned by passing arguments to the $sth->execute() method
       $sth->execute( $dbh->quote( “Avebury” ), 21,  $dbh->quote( 123500 ),
                      $dbh->quote( “%Stone Circle%” ) );

Bind Values ( cont. )
Values that cannot be automatically typed can be specified with bind_param()
These settings are sticky for the lifetime of that statement handle or until the bind value type is reset via bind_param()
        $sth->bind_param( 3, $dbh->quote( ‘’ ), SQL_VARCHAR );
        $sth->execute( $dbh->quote( 123500 ) );

Optimizing $dbh->do()
The default implementation of $dbh->do() calls prepare(), execute() and finish()for each invocation of $dbh->do()
This can be optimized for repeated iterations by hand-preparing the statement but repeatedly executing the same statement handle
This is very efficient when used with bind values

Optimizing $dbh->do()
        ### Prepare the statement handle
        $sth = $dbh->prepare( “DELETE FROM megaliths WHERE id = ?” );
        ### Remove the first 100 rows one-by-one...
        $loopCounter = 0;
        while ( $loopCounter < 100 ) {
            $sth->execute( $loopCounter );
            $loopCounter++;
          }
        $sth->finish();
        ...
This is far faster than repeatedly preparing the same statement over and over again

Any Questions?

Database Proxying
This feature of DBI allows you to connect to remote databases using a database-independent protocol
For example, if you have an Microsoft Access database on an NT machine and wish to query it from a Linux machine
Database Proxying allows you to configure secure remote database access with access lists and ciphered communications streams

Proxy Architecture
The architecture of the DBI proxy system is a classic client/server design
A DBI Proxy Server (not a WWW proxy server) runs on the machine with the target database. In the common case, this might be an NT machine
A DBI Proxy Client which is a Perl script that uses the DBD::Proxy driver to make a connection as opposed to the `real’ DBD

Proxy Server Setup
DBI comes with a small program called dbiproxy which is a small DBI proxy server and can be run almost immediately
You must also install the appropriate modules required to interact with the target database. For example, if you want to fetch data from an Access database, you must also install DBD::ODBC
dbiproxy can now be started with a port number as an argument:
   dbiproxy --localport 3333

Proxy Client Setup
Connecting to the remote database is very simple and is achieved by connecting via the proxy server
For example, if you wished a local Perl script to connect to the Access database, you would write:
$dbh = DBI->connect( ‘dbi:ODBC:megaliths’, ‘’, ‘’ );

Proxy Client Setup ( cont. )
However, when using the DBI Proxy Server, we need the DSN to connect through that instead
        $dbh = DBI->connect( "dbi:Proxy:hostname=fowliswester;port=3333; \
                              dsn=dbi:ODBC:megaliths”, ‘’, ‘’ );
( The DSN above is split over the line-break for readability. In reality, it’s one long string! )

Proxying Data Flow
The sequence of events is that the client script will connect typically via TCP/IP to the port of the DBI Proxy Server
The proxy server will examine the incoming DSN and work out the database to connect to
The proxy server will then call DBI->connect() with the subsidiary DSN and make the connection to the database

Proxying Data Flow ( cont. )
At this stage, all database calls are routed to the proxy server who will forward them to the database
All returned data and other information from the database is then transmitted back to the Perl script
Therefore, this system allows us to access remote databases via raw TCP/IP without requiring special client/server software to be installed

Benefits of Proxying
You can access databases on remote machines more easily than before, especially if those databases were not network-aware (CSV files!)
Reduced licensing costs
Secure, encrypted transport
Highly configurable for access list support for enhanced security

Demonstration Database Tables

Final Questions and Answers