![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
require LWP::UserAgent; $ua = new LWP::UserAgent;
$request = new HTTP::Request('GET', 'file://localhost/etc/motd');
$response = $ua->request($request); # or $response = $ua->request($request, '/tmp/sss'); # or $response = $ua->request($request, \&callback, 4096);
sub callback { my($data, $response, $protocol) = @_; .... }
LWP::UserAgent
is a class implementing a simple World-Wide Web user agent in Perl. It
brings together the HTTP::Request, HTTP::Response and the LWP::Protocol
classes that form the rest of the core of libwww-perl library. For simple
uses this class can be used directly to dispatch WWW requests,
alternatively it can be subclassed for application-specific behaviour.
In normal usage the application creates a UserAgent object, and then
configures it with values for timeouts proxies, name, etc. The next step is
to create an instance of HTTP::Request
for the request that needs to be performed. This request is then passed to
the UserAgent request()
method, which dispatches it using the
relevant protocol, and returns a HTTP::Response
object.
The basic approach of the library is to use HTTP style communication for
all protocol schemes, i.e. you will receive an HTTP::Response
object also for gopher or ftp requests. In order to achieve even more
similarities with HTTP style communications, gopher menus and file
directories will be converted to HTML documents.
The request()
method can process the content of the response
in one of three ways: in core, into a file, or into repeated calls of a
subroutine. You choose which one by the kind of value passed as the second
argument to request().
The in core variant simply returns the content in a scalar attribute called
content()
of the response object, and is suitable for small
HTML replies that might need further parsing. This variant is used if the
second argument is missing (or is undef).
The filename variant requires a scalar containing a filename as the second
argument to request(),
and is suitable for large WWW objects
which need to be written directly to the file, without requiring large
amounts of memory. In this case the response object returned from
request()
will have empty content().
If the
request fails, then the content()
might not be empty, and the
file will be untouched.
The subroutine variant requires a reference to callback routine as the
second argument to request()
and it can also take an optional
chuck size as third argument. This variant can be used to construct
``pipe-lined'' processing, where processing of received chuncks can begin
before the complete data has arrived. The callback function is called with
3 arguments: the data received this time, a reference to the response
object and a reference to the protocol object. The response object returned
from request()
will have empty content().
If the
request fails, then the the callback routine will not have been called, and
the response->content() might not be empty.
The request can be aborted by calling die()
within the
callback routine. The die message will be available as the ``X-Died''
special response header field.
The library also accepts that you put a subroutine reference as content in the request object. This subroutine should return the content (possibly in pieces) when called. It should return an empty string when there is no more content.
The user of this module can finetune timeouts and error handling by calling
the use_alarm()
and use_eval()
methods.
By default the library uses alarm()
to implement timeouts,
dying if the timeout occurs. If this is not the prefered behaviour or it
interferes with other parts of the application one can disable the use
alarms. When alarms are disabled timeouts can still occur for example when
reading data, but other cases like name lookups etc will not be timed out
by the library itself.
The library catches errors (such as internal errors and timeouts) and present them as HTTP error responses. Alternatively one can switch off this behaviour, and let the application handle dies.
$request
should be a reference to a HTTP::Request
object with values defined for at least the method()
and
url()
attributes.
If $arg
is a scalar it is taken as a filename where the content of the response is
stored.
If $arg
is a reference to a subroutine, then this routine is called as chunks of
the content is received. An optional $size
argument is taken as a hint for an appropriate chunk size.
If $arg
is omitted, then the content is stored in the response object itself.
The arguments are the same as for simple_request()
.
request()
before it tries to do any
redirects. It should return a true value if the redirect is allowed to be
performed. Subclasses might want to override this.
The default implementation will return FALSE for POST request and TRUE for all others.
get_basic_credentials()
method
instead.
request()
to retrieve credentials for a
Realm protected by Basic Authentication or Digest Authentication.
Should return username and password in a list. Return undef to abort the authentication resolution atempts.
This implementation simply checks a set of pre-stored member variables.
Subclasses can override this method to e.g. ask the user for a
username/password. An example of this can be found in
lwp-request
program distributed with this library.
The user agent string should be one or more simple product identifiers with an optional version number separated by the ``/'' character. Examples are:
$ua->agent('Checkbot/0.4 ' . $ua->agent); $ua->agent('Mozilla/5.0');
$ua->from('aas@sn.no');
timeout()
value is 180 seconds, i.e. 3 minutes.
alarm()
when
implementing timeouts. The default is TRUE, i.e. to use alarm. Disable this
on systems that does not implement alarm, or if this interfers with other
uses of alarm in your application.
scheme
. The scheme
might be a string (like 'http' or 'ftp') or it might be an URI::URL object
reference.
$ua->proxy(['http', 'ftp'], 'http://proxy.sn.no:8001/'); $ua->proxy('gopher', 'http://proxy.sn.no:8001/');
The first form specifies that the URL is to be used for proxying of access methods listed in the list in the first method argument, i.e. 'http' and 'ftp'.
The second form shows a shorthand form for specifying proxy URL for a single access scheme.
*_proxy
environment variables. You
might specify proxies like this (sh-syntax):
gopher_proxy=http://proxy.my.place/ wais_proxy=http://proxy.my.place/ no_proxy="my.place" export gopher_proxy wais_proxy no_proxy
Csh or tcsh users should use the setenv
command to define these envirionment variables.
$ua->no_proxy('localhost', 'no', ...);
$CommentsMailTo = "perl5@dcs.ed.ac.uk"; include("../syssies_footer.inc");?>