Read the FAQs and documentation specific to the port of perl to your operating system (eg, perlvms, perlplan9, ...). These should contain more detailed information on the vagaries of your perl.
system()
instead.
Term::Cap Standard perl distribution Term::ReadKey CPAN Term::ReadLine::Gnu CPAN Term::ReadLine::Perl CPAN Term::Screen CPAN
Term::Cap Standard perl distribution Curses CPAN Term::ANSIColor CPAN
Tk CPAN
There's an example of this in crypt. First, you put the terminal into ``no echo'' mode, then just read the
password normally. You may do this with an old-style ioctl()
function, POSIX terminal control (see the POSIX manpage, and Chapter 7 of the Camel), or a call to the stty program, with varying degrees of portability.
You can also do this for most systems using the Term::ReadKey module from CPAN, which is easier to use and in theory more portable.
sysopen()
and O_RDWR|O_NDELAY|O_NOCTTY
from the Fcntl module (part of the standard perl distribution). See
sysopen for more on this approach.
print DEV "atv1\012"; # wrong, for some devices
print DEV "atv1\015"; # right, for some devices
Even though with normal text files, a ``\n'' will do the trick, there is still no unified scheme for terminating a line that is portable between Unix, DOS/Win, and Macintosh, except to terminate ALL line ends with ``\015\012'', and strip what you don't need from the output. This applies especially to socket I/O and autoflushing, discussed next.
print()
them, you'll want to autoflush that filehandle, as in
the older
use FileHandle; DEV->autoflush(1);
and the newer
use IO::Handle; DEV->autoflush(1);
You can use select()
and the $|
variable to control autoflushing (see $| and select):
$oldh = select(DEV); $| = 1; select($oldh);
You'll also see code that does this without a temporary variable, as in
select((select(DEV), $| = 1)[0]);
As mentioned in the previous item, this still doesn't work when using socket I/O between Unix and Macintosh. You'll need to hardcode your line terminators, in that case.
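As a minimal sketch of the two points above (hard-coded terminators plus autoflushing), the snippet below writes a CRLF-terminated line to an unbuffered handle. The temporary file standing in for the device or socket and the helper name send_line are assumptions made purely for illustration.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my $CRLF = "\015\012";    # explicit terminator, independent of what "\n" means locally

# Hypothetical stand-in for a device or socket: a temporary file.
my ($dev, $path) = tempfile(UNLINK => 1);
binmode $dev;             # no newline translation on any platform
$dev->autoflush(1);       # every print goes out immediately

# send_line is an illustrative helper, not part of any module.
sub send_line {
    my ($fh, $text) = @_;
    print {$fh} $text, $CRLF;
}

send_line($dev, "atv1");
```

The reader on the other end then strips whichever of the two terminator bytes it does not need.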
read()
or sysread(),
you'll have to arrange for an alarm handler to provide a timeout (see
alarm). If you have a non-blocking open, you'll likely have a non-blocking read,
which means you may have to use a 4-arg select()
to determine
whether I/O is ready on that device (see
select).
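The 4-arg select() approach might be sketched as follows. This assumes a Unix-like system where select() works on pipes; the helper name ready_to_read and the pipe used as a stand-in for the device are illustrative only.

```perl
use strict;
use warnings;

# Return true if $fh has data to read within $timeout seconds.
sub ready_to_read {
    my ($fh, $timeout) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;    # set the bit for this descriptor
    my $rout = $rin;                  # select() modifies its bitmask in place
    my $nfound = select($rout, undef, undef, $timeout);
    return $nfound > 0;               # 0 means timeout, -1 means error
}

# A pipe stands in for the device here.
pipe(my $reader, my $writer) or die "pipe: $!";
my $before = ready_to_read($reader, 0.1);    # nothing has been written yet
syswrite($writer, "x");
my $after  = ready_to_read($reader, 0.1);    # one byte is now waiting
```

Once ready_to_read() reports true, a sysread() on the handle should not block.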
Seriously, you can't if they are Unix password files - the Unix password system employs one-way encryption. Programs like Crack can forcibly (and intelligently) try to guess passwords, but don't (can't) guarantee quick success.
If you're worried about users selecting bad passwords, you should
proactively check when they try to change their password (by modifying
passwd(1),
for example).
system("cmd &")
or you could use fork as documented in fork, with further examples in the perlipc manpage. Some things to be aware of, if you're on a Unix-like system:
system("cmd&")
.
$SIG{CHLD} = sub { wait };
See Signals for other examples of code to do this. Zombies are not an issue with system("prog &")
.
Be warned that very few C libraries are re-entrant. Therefore, if you
attempt to print()
in a handler that got invoked during
another stdio operation your internal structures will likely be in an
inconsistent state, and your program will dump core. You can sometimes
avoid this by using syswrite()
instead of
print().
Unless you're exceedingly careful, the only safe things to do inside a
signal handler are: set a variable and exit. And in the first case, you
should only set a variable in such a way that malloc()
is not
called (eg, by setting a variable that already has a value).
For example:
$Interrupted = 0; # to ensure it has a value
$SIG{INT} = sub {
    $Interrupted++;
    syswrite(STDERR, "ouch\n", 5);
}
However, because syscalls restart by default, you'll find that if you're in
a ``slow'' call, such as <FH>, read(),
connect(),
or wait(),
that
the only way to terminate them is by ``longjumping'' out; that is, by
raising an exception. See the time-out handler for a blocking
flock()
in Signals or chapter 6 of the Camel.
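Here is a hedged sketch of ``longjumping'' out of a slow call: an alarm handler raises an exception with die(), and an enclosing eval catches it. The wrapper name with_timeout is hypothetical, and a long 4-arg select() stands in for the slow operation.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds. Returns (1, results) or (0) on timeout.
sub with_timeout {
    my ($seconds, $code) = @_;
    my @result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timed out\n" };    # the "longjmp"
        alarm $seconds;
        @result = $code->();
        alarm 0;                                         # finished in time
        1;
    };
    alarm 0;                      # make sure no alarm is left pending
    die $@ if !$ok && $@ ne "timed out\n";    # re-raise unrelated errors
    return $ok ? (1, @result) : (0);
}

# The slow operation here just blocks for five seconds.
my ($finished) = with_timeout(1, sub { select(undef, undef, undef, 5) });
```

The handler does nothing but die, in keeping with the advice above about doing as little as possible inside a signal handler.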
pwd_mkdb(8)
to install it (see pwd_mkdb(5) for more details).
date(1)
program. (There is no way to set the time and date on a per-process basis.)
This mechanism will work for Unix, MS-DOS, Windows, and NT; the VMS
equivalent is set time
.
However, if all you want to do is change your timezone, you can probably get away with setting an environment variable:
$ENV{TZ} = "MST7MDT"; # unixish
$ENV{'SYS$TIMEZONE_DIFFERENTIAL'} = "-5"; # vms
system "trn comp.lang.perl";
sleep()
function provides, the easiest way is to use the
select()
function as documented in select. If your system has itimers and syscall()
support, you can
check out the old example in http://www.perl.com/CPAN/doc/misc/ancient/tutorial/eg/itimers.pl
.
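The select() technique for sub-second sleeps can be sketched like this; the wrapper name nap is an illustrative assumption.

```perl
use strict;
use warnings;

# Sleep for a possibly fractional number of seconds using 4-arg select()
# with no filehandles to watch, so only the timeout applies.
sub nap {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
    return;
}

nap(0.25);    # pause for roughly a quarter of a second
```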
In general, you may not be able to. But if your system supports both the
syscall()
function in Perl as well as a system call like
gettimeofday(2),
then you may be able to do something like
this:
require 'sys/syscall.ph';
$TIMEVAL_T = "LL";
$done = $start = pack($TIMEVAL_T, ());
syscall( &SYS_gettimeofday, $start, 0) != -1 or die "gettimeofday: $!";
##########################
# DO YOUR OPERATION HERE #
##########################
syscall( &SYS_gettimeofday, $done, 0) != -1 or die "gettimeofday: $!";
@start = unpack($TIMEVAL_T, $start); @done = unpack($TIMEVAL_T, $done);
# fix microseconds
for ($done[1], $start[1]) { $_ /= 1_000_000 }
$delta_time = sprintf "%.4f", ($done[0] + $done[1] ) - ($start[0] + $start[1] );
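Where the Time::HiRes module is available (it is bundled with more recent perls, but it is not the method described above), the same measurement can be sketched without syscall() or *.ph files. The 0.1 second select() call merely stands in for the operation being timed.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my $start = [gettimeofday];    # [seconds, microseconds]

# DO YOUR OPERATION HERE; a short pause is used as a placeholder.
select(undef, undef, undef, 0.1);

my $done = [gettimeofday];
my $delta_time = sprintf "%.4f", tv_interval($start, $done);
```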
atexit().
Each package's END block is called when the program
or thread ends (see the perlmod manpage for more details). It isn't called when untrapped signals kill the
program, though, so if you use END blocks you should also use
use sigtrap qw(die normal-signals);
Perl's exception-handling mechanism is its eval()
operator.
You can use eval()
as setjmp and die()
as
longjmp. For details of this, see the section on signals, especially the
time-out handler for a blocking flock()
in Signals and chapter 6 of the Camel.
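A minimal sketch of eval() and die() used as an exception mechanism, with no signals involved; the function name risky and its error message are made up for the example.

```perl
use strict;
use warnings;

# A function that raises an exception on bad input.
sub risky {
    my ($n) = @_;
    die "negative input\n" if $n < 0;    # the "longjmp"
    return sqrt $n;
}

# eval is the "setjmp": control returns here if anything inside dies.
my $value = eval { risky(-1) };
my $error = $@;                          # empty string if nothing went wrong

print "caught: $error" if $error;
```

Copying $@ into a variable straight away avoids losing it to a later eval.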
If exception handling is all you're interested in, try the exceptions.pl library (part of the standard perl distribution).
If you want the atexit()
syntax (and an rmexit()
as well), try the AtExit module available from CPAN.
Note that even though SunOS and Solaris are binary compatible, these values are different. Go figure.
syscall(),
you can use the syscall function (documented in
the perlfunc manpage).
Remember to check the modules that came with your distribution, and CPAN as well - someone may already have written a module to do it.
cpp(1)
directives in C header files to files containing subroutine definitions,
like &SYS_getitimer, which you can use as arguments to your functions.
It doesn't work perfectly, but it usually gets most of the job done. Simple
files like errno.h, syscall.h, and socket.h were fine, but the hard ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files:
1. become super-user
2. cd /usr/include
3. h2ph *.h */*.h
If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See the perlxstut manpage for how to get started with h2xs.
If your system doesn't support dynamic loading, you still probably ought to use h2xs. See the perlxstut manpage and MakeMaker for more information (in brief, just use make perl instead of a plain make to rebuild perl with a new static extension).
pipe(),
fork(),
and exec()
to do the job. Make sure you
read the deadlock warnings in its documentation, though (see Open2).
system()
and backticks (``).
system()
runs a command and returns exit status information
(as a 16 bit value: the low 8 bits are the signal the process died from, if
any, and the high 8 bits are the actual exit value). Backticks (``) run a
command and return what it sent to STDOUT.
$exit_status = system("mail-users");
$output_string = `ls`;
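The 16 bit status value can be pulled apart as sketched below. The helper name decode_status is an assumption; note also that, more precisely, the low 7 bits hold the signal number and the eighth bit flags a core dump. The child here is just the running perl ($^X) asked to exit with status 3.

```perl
use strict;
use warnings;

# Split a wait status (as returned by system() or left in $?) into parts.
sub decode_status {
    my ($status) = @_;
    return (
        exit   => $status >> 8,               # high 8 bits: exit value
        signal => $status & 127,              # low 7 bits: signal number
        core   => ($status & 128) ? 1 : 0,    # bit 8: core dumped
    );
}

# List form of system(): no shell is involved.
my $status = system($^X, '-e', 'exit 3');
my %info   = decode_status($status);
print "exit value $info{exit}, signal $info{signal}\n";
```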
system $cmd; # using system()
$output = `$cmd`; # using backticks (``)
open (PIPE, "cmd |"); # using open()
With system(),
both STDOUT and STDERR will go to the same place
as the script's versions of these, unless the command redirects them.
Backticks and open()
read only the STDOUT of your command.
With any of these, you can change file descriptors before the call:
open(STDOUT, ">logfile");
system("ls");
or you can use Bourne shell file-descriptor redirection:
$output = `$cmd 2>some_file`;
open (PIPE, "cmd 2>some_file |");
You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT:
$output = `$cmd 2>&1`;
open (PIPE, "cmd 2>&1 |");
Note that you cannot simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:
open(STDERR, ">&STDOUT");
$alloutput = `cmd args`; # stderr still escapes
This fails because the open()
makes STDERR go to where STDOUT
was going at the time of the open().
The backticks then make
STDOUT go to a string, but don't change STDERR (which still goes to the old
STDOUT).
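The working shell-level redirection can be sketched as follows, assuming a Bourne-compatible sh and /dev/null are available. The command is an invented one that writes one line to each stream.

```perl
use strict;
use warnings;

# A command that writes "out" to STDOUT and "err" to STDERR.
my $cmd = q{sh -c 'echo out; echo err 1>&2'};

my $merged   = `$cmd 2>&1`;           # STDERR joins STDOUT in the captured string
my $out_only = `$cmd 2>/dev/null`;    # STDERR is discarded by the shell

print "merged:   $merged";
print "out only: $out_only";
```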
Note that you must use Bourne shell (sh(1)) redirection syntax in backticks, not
csh(1)!
Details on why Perl's system()
and
backtick and pipe opens all use the Bourne shell are in http://www.perl.com/CPAN/doc/FMTEYEWTK/versus/csh.whynot
.
You may also use the IPC::Open3 module (part of the standard perl distribution), but be warned that it has a different order of arguments from IPC::Open2 (see Open3).
fork()/exec()
paradigm (eg, Unix), it works like
this: open()
causes a fork().
In the parent,
open()
returns with the process ID of the child. The child
exec()s
the command to be piped to/from. The parent can't know
whether the exec()
was successful or not - all it can return
is whether the fork()
succeeded or not. To find out if the
command succeeded, you have to catch SIGCHLD and wait()
to get
the exit status. You should also catch SIGPIPE if you're writing to the
child -- you may not have found out the exec()
failed by the
time you write. This is documented in the perlipc manpage.
On systems that follow the spawn()
paradigm,
open()
might do what you expect - unless perl uses a shell to start your command. In
this case the fork()/exec()
description still applies.
`cp file file.bak`;
And now they think ``Hey, I'll just always use backticks to run programs.''
Bad idea: backticks are for capturing a program's output; the
system()
function is for running programs.
Consider this line:
`cat /etc/termcap`;
You haven't assigned the output anywhere, so it just wastes memory (for a
little while). Plus you forgot to check $?
to see whether the program even ran correctly. Even if you wrote
print `cat /etc/termcap`;
In most cases, this could and probably should be written as
system("cat /etc/termcap") == 0 or die "cat program failed!";
This will get the output quickly (as it's generated, instead of only at the end) and also check the return value.
system()
also provides direct control over whether shell
wildcard processing may take place, whereas backticks do not.
@ok = `grep @opts '$search_string' @filenames`;
You have to do this:
my @ok = ();
if (open(GREP, "-|")) {
    while (<GREP>) {
        chomp;
        push(@ok, $_);
    }
    close GREP;
} else {
    exec 'grep', @opts, $search_string, @filenames;
}
Just as with system(),
no shell escapes happen when you
exec()
a list.
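On perls that support the list form of a piped open() (an assumption: it is not available everywhere or in older releases), the fork and exec can be folded into one call. In this sketch the helper name capture_lines is hypothetical and a child perl ($^X) simply echoes its arguments, to show that shell metacharacters arrive untouched.

```perl
use strict;
use warnings;

# Run a command given as a list (no shell) and return its output lines.
sub capture_lines {
    my @command = @_;
    open(my $pipe, '-|', @command) or die "can't run $command[0]: $!";
    chomp(my @lines = <$pipe>);
    close $pipe;
    return @lines;
}

# The child prints each argument on its own line.
my @got = capture_lines($^X, '-e', 'print "$_\n" for @ARGV',
                        'a b', '$HOME', '*');
print "$_\n" for @got;
```

Neither the dollar sign nor the asterisk is expanded, because no shell ever sees them.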
clearerr()
that you can use. That is the
technically correct way to do it. Here are some less reliable workarounds:
$where = tell(LOG); seek(LOG, $where, 0);
If all you want to do is pretend to be telnet but don't need the initial telnet handshaking, then the standard dual-process approach will suffice:
use IO::Socket; # new in 5.004
$handle = IO::Socket::INET->new('www.perl.com:80')
    || die "can't connect to port 80 on www.perl.com: $!";
$handle->autoflush(1);
if (fork()) { # XXX: undef means failure
    select($handle);
    print while <STDIN>; # everything from stdin to socket
} else {
    print while <$handle>; # everything from socket to stdout
}
close $handle;
exit;
To actually alter the visible command line, you can assign to the variable
$0
as documented in the perlvar manpage. This won't work on all operating systems, though. Daemon programs like
sendmail place their state there, as in:
$0 = "orcus [accepting connections]";
eval()ing
the script's output in your shell; check out the
comp.unix.questions FAQ for details.
%ENV
persist after Perl exits, but directory changes
do not.
fork && exit;
-t STDIN
and -t STDOUT
can give clues, sometimes not.
if (-t STDIN && -t STDOUT) { print "Now what? "; }
On POSIX systems, you can test whether your own process group matches the current process group of your controlling terminal as follows:
use POSIX qw/getpgrp tcgetpgrp/;
open(TTY, "/dev/tty") or die $!;
$tpgrp = tcgetpgrp(TTY);
$pgrp = getpgrp();
if ($tpgrp == $pgrp) {
    print "foreground\n";
} else {
    print "background\n";
}
alarm()
function, probably in conjunction with a
signal handler, as documented in Signals and chapter 6 of the Camel. You may instead use the more flexible
Sys::AlarmCall module available from CPAN.
wait()
when a SIGCHLD is received, or else use the
double-fork technique described in fork.
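A SIGCHLD handler of the kind described might look like the sketch below, assuming a Unix-like system with fork() and the POSIX module. The hash %exit_of and the polling loop in the parent are illustrative; a real program would get on with its own work instead.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %exit_of;    # pid => exit value of each reaped child

sub reap_children {
    local ($!, $?);    # don't clobber the main program's status variables
    # Several children may have exited for one signal, so loop.
    while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
        $exit_of{$kid} = $? >> 8;
    }
}
$SIG{CHLD} = \&reap_children;

defined(my $pid = fork()) or die "can't fork: $!";
if ($pid == 0) {
    POSIX::_exit(7);    # child: leave at once, skipping END blocks
}

# Parent: poll (with a bound) until the child has been reaped.
for (1 .. 100) {
    last if exists $exit_of{$pid};
    select(undef, undef, undef, 0.05);
    reap_children();
}
```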
system()
call (see the perlipc manpage for sample code) and then have a signal handler for the INT signal that
passes the signal on to the subprocess.
sysopen():
use Fcntl;
sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
    or die "can't open /tmp/somefile: $!";
If your version of perl is compiled without dynamic loading, then you just need to replace step 3 (make) with make perl and you will get a new perl binary with your extension linked in.
See MakeMaker for more details on building extensions, and see also the question ``How do I keep my own module/library directory?''
perl Makefile.PL PREFIX=/u/mydir/perl
then either set the PERL5LIB environment variable before you run scripts that use the modules/libraries (see the perlrun manpage) or say
use lib '/u/mydir/perl';
See Perl's lib manpage for more information.
use FindBin;
use lib "$FindBin::Bin";
use your_own_modules;
the PERLLIB environment variable
the PERL5LIB environment variable
the perl -Idir command line flag
the use lib pragma, as in use lib "$ENV{HOME}/myown_perllib";
The latter is particularly useful because it knows about machine dependent architectures. The lib.pm pragmatic module was first included with the 5.002 release of Perl.
#!/usr/bin/perl -w
use strict;
$| = 1;
for (1..4) {
    my $got;
    print "gimme: ";
    $got = getone();
    print "--> $got\n";
}
exit;

BEGIN {
    use POSIX qw(:termios_h);

    my ($term, $oterm, $echo, $noecho, $fd_stdin);

    $fd_stdin = fileno(STDIN);

    $term = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm = $term->getlflag();

    $echo = ECHO | ECHOK | ICANON;
    $noecho = $oterm & ~$echo;

    sub cbreak {
        $term->setlflag($noecho);
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub getone {
        my $key = '';
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }

}
END { cooked() }
perlfaq9 - Networking
Seriously, if you can demonstrate that you've read the following FAQs and that your problem isn't something simple that can be easily answered, you'll probably receive a courteous and useful reply to your question if you post it on comp.infosystems.www.authoring.cgi (if it's something to do with HTTP, HTML, or the CGI protocols). Questions that appear to be Perl questions but are really CGI ones that are posted to comp.lang.perl.misc may not be so well received.
The useful FAQs are:
http://www.perl.com/perl/faq/idiots-guide.html
http://www3.pair.com/webthing/docs/cgi/faqs/cgifaq.shtml
http://www.perl.com/perl/faq/perl-cgi-faq.html
http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html
http://www.boutell.com/faq/
Many folks attempt a simple-minded regular expression approach, like
s/<.*?>//g
, but that fails in many cases because the tags may continue over line
breaks, they may contain quoted angle-brackets, or HTML comments may be
present. Plus folks forget to convert entities, like &lt;
for example.
Here's one ``simple-minded'' approach, that works for most files:
#!/usr/bin/perl -p0777
s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
If you want a more complete solution, see the 3-stage striphtml program in http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz .
#!/usr/bin/perl -n00
# qxurl - tchrist@perl.com
print "$2\n" while m{
    < \s*
      A \s+ HREF \s* = \s* (["']) (.*?) \1
    \s* >
}gsix;
This version does not adjust relative URLs, understand alternate bases, deal with HTML comments, deal with HREF and NAME attributes in the same tag, or accept URLs themselves as arguments. It also runs about 100x faster than a more ``complete'' solution using the LWP suite of modules, such as the http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz program.
start_multipart_form()
method, which isn't the same as the
startform()
method.
$html_code = `lynx -source $url`;
$text_data = `lynx -dump $url`;
The libwww-perl (LWP) modules from CPAN provide a more powerful way to do this. They work through proxies, and don't require lynx:
# print HTML from a URL
use LWP::Simple;
getprint "http://www.sn.no/libwww-perl/";

# print ASCII from HTML from a URL
use LWP::Simple;
use HTML::Parse;
use HTML::FormatText;
my ($html, $ascii);
$html = get("http://www.perl.com/");
defined $html
    or die "Can't fetch HTML from http://www.perl.com/";
$ascii = HTML::FormatText->new->format(parse_html($html));
print $ascii;
$string = "http://altavista.digital.com/cgi-bin/query?pg=q&what=news&fmt=.&q=%2Bcgi-bin+%2Bperl.exe";
$string =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge;
Encoding is a bit harder, because you can't just blindly change all the
non-alphanumunder characters (\W
) into their hex escapes. It's important that characters with special
meaning like /
and ?
not be translated. Probably the easiest way to get this right is to avoid
reinventing the wheel and just use the URI::Escape module, which is part of
the libwww-perl package (LWP) available from CPAN.
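Purely to illustrate what such escaping involves, here is a hand-rolled sketch; URI::Escape remains the safer choice. The function names are hypothetical, the input is assumed to be a string of bytes, and the encoder is meant for a single component such as one query value (so it does escape / and ?), never for a whole URL.

```perl
use strict;
use warnings;

# Turn %XX escapes back into the characters they stand for.
sub uri_decode {
    my ($string) = @_;
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $string;
}

# Escape everything except letters, digits, and - _ . ~ in ONE component.
sub uri_encode_component {
    my ($string) = @_;
    $string =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $string;
}

my $encoded = uri_encode_component("a b&c/d");
my $decoded = uri_decode($encoded);
print "$encoded\n$decoded\n";
```

Applying the encoder to a complete URL would destroy its structure, which is exactly the pitfall described above.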
Content-Type
as the headers of your reply, send back a Location:
header. Officially this should be a
URI:
header, so the CGI.pm module (available from CPAN) sends back both:
Location: http://www.domain.com/newpage
URI: http://www.domain.com/newpage
Note that relative URLs in these headers can cause strange effects because of ``optimizations'' that servers do.
use HTTPD::UserAdmin ();
HTTPD::UserAdmin
    ->new(DB => "/foo/.htpasswd")
    ->add($username => $password);
In brief: use tainting (see the perlsec manpage), which makes sure that data from outside your script (eg, CGI parameters)
are never used in
eval
or system
calls. In addition to tainting, never use the single-argument form of
system()
or exec().
Instead, supply the command
and arguments as a list, which prevents shell globbing.
$/ = '';
$header = <MSG>;
$header =~ s/\n\s+/ /g; # merge continuation lines
%head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header );
That solution doesn't do well if, for example, you're trying to maintain all the Received lines. A more complete approach is to use the Mail::Header module from CPAN (part of the MailTools package).
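The quick approach can be tried on a string without opening a mailbox, as in this sketch. The sample message and the function name parse_header are invented for illustration, and the limitation just mentioned still applies: repeated fields such as Received overwrite one another.

```perl
use strict;
use warnings;

# Parse an already-extracted header block into a hash of field => value.
sub parse_header {
    my ($header) = @_;
    $header =~ s/\n\s+/ /g;    # merge continuation lines
    my %head = ('UNIX_FROM_LINE', split /^([-\w]+):\s*/m, $header);
    s/\s+\z// for values %head;    # trim the trailing newline of each value
    return %head;
}

# Invented sample header with one continuation line.
my $sample = "From alice Mon Jan  1 00:00:00 1998\n"
           . "Subject: hello\n"
           . "  world\n"
           . "To: bob\@example.com\n";

my %head = parse_header($sample);
print "$_: $head{$_}\n" for sort keys %head;
```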
$ENV{CONTENT_LENGTH}
and
$ENV{QUERY_STRING}
. It's true that this can work, but there are also a lot of versions of
this floating around that are quite simply broken!
Please do not be tempted to reinvent the wheel. Instead, use the CGI.pm or CGI_Lite.pm modules (available from CPAN), or if you're trapped in the module-free land of perl1 .. perl4, you might look into cgi-lib.pl (available from http://www.bio.cam.ac.uk/web/form.html).
Without sending mail to the address and seeing whether it bounces (and even then you face the halting problem), you cannot determine whether an email address is valid. Even if you apply the email header standard, you can have problems, because there are deliverable addresses that aren't RFC-822 (the mail header standard) compliant, and addresses that aren't deliverable which are compliant.
Many are tempted to try to eliminate many frequently-invalid email
addresses with a simple regexp, such as
/^[\w.-]+\@(?:[\w-]+\.)+\w+$/
. However, this also throws out many valid ones, and says nothing about
potential deliverability, so is not suggested. Instead, see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz
, which actually checks against the full RFC spec (except for nested
comments), looks for addresses you may not wish to accept email to (say,
Bill Clinton or your postmaster), and then makes sure that the hostname
given can be looked up in DNS. It's not fast, but it works.
Here's an alternative strategy used by many CGI script authors: Check the email address with a simple regexp (such as the one above). If the regexp matched the address, accept the address. If the regexp didn't match the address, request confirmation from the user that the email address they entered was correct.
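That strategy might be sketched as follows. The function names and the 'accept'/'confirm' return values are assumptions, and the pattern is a deliberately simple one that says nothing about deliverability.

```perl
use strict;
use warnings;

# Cheap plausibility test; many valid addresses will fail it.
sub looks_plausible {
    my ($address) = @_;
    return $address =~ /^[\w.-]+\@(?:[\w-]+\.)+\w+$/ ? 1 : 0;
}

# Decide what to do with an address typed into a form.
sub check_address {
    my ($address) = @_;
    return looks_plausible($address)
        ? 'accept'      # take it as given
        : 'confirm';    # ask the user whether it is really right
}

print check_address('gnat@frii.com'), "\n";
print check_address('no at sign here'), "\n";
```

An address that fails the test is never rejected outright; the user simply gets a chance to confirm or correct it.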
use MIME::Base64;
$decoded = decode_base64($encoded);
A more direct approach is to use the unpack()
function's ``u''
format after minor transliterations:
tr#A-Za-z0-9+/##cd; # remove non-base64 chars
tr#A-Za-z0-9+/# -_#; # convert to uuencoded format
$len = pack("c", 32 + 0.75*length); # compute length byte
print unpack("u", $len . $_); # uudecode and print
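A small round trip with the MIME::Base64 module shows both directions; the sample text is arbitrary, and the empty second argument to encode_base64() suppresses the trailing newline it would otherwise append.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

my $original = "Hello, world!";
my $encoded  = encode_base64($original, "");    # "" means no line ending
my $decoded  = decode_base64($encoded);

print "$encoded\n$decoded\n";
```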
use Sys::Hostname;
$address = sprintf('%s@%s', scalar getpwuid($<), hostname);
Company policies on email addresses can mean that this generates addresses that the company's email system will not accept, so you should ask for users' email addresses when this matters. Furthermore, not all systems on which Perl runs are as forthcoming with this information as Unix is.
The Mail::Util module from CPAN (part of the MailTools package) provides a
mailaddress()
function that tries to guess the mail address of
the user. It makes a more intelligent guess than the code above, using
information given when the module was installed, but it could still be
incorrect. Again, the best way is often just to ask the user.
# sending mail
use Mail::Internet;
use Mail::Header;
# say which mail host to use
$ENV{SMTPHOSTS} = 'mail.frii.com';
# create headers
$header = new Mail::Header;
$header->add('From', 'gnat@frii.com');
$header->add('Subject', 'Testing');
$header->add('To', 'gnat@frii.com');
# create body
$body = 'This is a test, ignore';
# create mail object (the body is passed as a reference to an array of lines)
$mail = new Mail::Internet(undef, Header => $header, Body => [$body]);
# send it
$mail->smtpsend or die;
`hostname`
program. While sometimes expedient, this isn't very portable. It's one of
those tradeoffs of convenience versus portability.
The Sys::Hostname module (part of the standard perl distribution) will give
you the hostname after which you can find out the IP address (assuming you
have working DNS) with a gethostbyname()
call.
use Socket;
use Sys::Hostname;
my $host = hostname();
my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost'));
Probably the simplest way to learn your DNS domain name is to grok it out of /etc/resolv.conf, at least under Unix. Of course, this assumes several things about your resolv.conf configuration, including that it exists.
(We still need a good DNS domain name-learning method for non-Unix systems.)
perl -MNews::NNTPClient -e 'print News::NNTPClient->new->list("newsgroups")'