Saturday, March 20, 2010

Command line parameters vs use pragmas

Everyone knows we should enable warnings, strict and taint checking. What most people don't think about is whether to enable them on the command line or with use pragmas.

strict: Enabled with use strict, preferably at the beginning of your script and of every package file. Unlike warnings, it has no dedicated interpreter switch.

warnings: Can be enabled either by passing the -w parameter to the interpreter (on the command line or on the #! line of the script) or with use warnings. The decision depends on how much control you have over the underlying packages. If you have full control, passing -w is recommended: it is global, so every package and require'd file has warnings enabled automatically. If you depend on code you cannot fix, use warnings keeps the checks limited to your own files.

taint: Can only be enabled at the interpreter level using the -T parameter. Don't forget that if you pass the script as an argument to the perl command, the interpreter has already started by the time the #! line in the script is read. With taint checking in particular, perl refuses to run a script that has -T on its #! line unless -T was also given on the command line.

Remember that the scope of the warnings and strict pragmas is limited to the enclosing block (or the file, if used outside any block) and does not cover files loaded via use, require or do. Because the pragmas are lexically scoped, the corresponding no pragma turns the checking off only within the scope where no was called.
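As a small sketch of that scoping rule, the snippet below switches one warning category off inside a block and counts how many warnings are raised. The variable names and the __WARN__ handler are only there for illustration.

```perl
use strict;
use warnings;

my @caught;
# Collect warnings instead of printing them, so the effect is easy to see.
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $undef;
{
    no warnings 'uninitialized';      # switched off only inside this block
    my $quiet = "value: " . $undef;   # no warning raised here
}
my $noisy = "value: " . $undef;       # the block has ended, so this warns

print scalar(@caught), " warning(s) caught\n";   # 1 warning(s) caught
```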

use vs require

Yes, I know most places will tell you to do a use since that is the newer way of doing things. But require still has its uses, and as long as you understand the differences you might choose to do a require over a use.

Let's start with the basics
require: Does the following at the point of execution
   * Takes as parameter a filename, a module or a version
   * If it is a filename - a path starting with /, ./ or ../ is loaded exactly as given; any other filename is searched for in the folders listed in @INC (which used to include the current folder by default, but no longer does since Perl 5.26)
   * If it is a module - traverses @INC and loads the module from the first folder in which it can find the corresponding .pm file. Barewords are treated as module names: :: is replaced with / and .pm is appended to locate the file.
   * If it is a version - a numeric value (e.g. 5.006) is compared against the perl version number stored in $], and a v-string literal (e.g. v5.6.1) is compared against $^V; an exception is thrown if the running perl is older than the required version.
   * If the file or module is found, its content is compiled and executed, and an exception is thrown if it doesn't return a true value. Each file is loaded only once; the files already loaded are recorded in %INC.
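The different forms of require listed above can be sketched as follows. File::Spec is used only because it ships with perl; any installed module would do.

```perl
use strict;
use warnings;

require 5.006;            # numeric version, checked against $]
require v5.6.1;           # v-string version, checked against $^V

require File::Spec;       # bareword: looks for File/Spec.pm in @INC
require "File/Spec.pm";   # same file by name; already loaded, so nothing happens

# %INC records which file each loaded module came from.
print "File::Spec was loaded from $INC{'File/Spec.pm'}\n";
```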

use:  Does the following at the point of being compiled
   * Takes as parameter a module, pragma or a version
   * If it is a module - traverses @INC and loads the module from the first folder in which it can find the corresponding .pm file. In the module name :: is replaced with / and .pm is appended to locate the file.
   * If it is a version - a numeric value is compared against the perl version number stored in $], and a v-string literal is compared against $^V; an exception is thrown if the running perl is older than the required version.
   * If the module is found, its content is compiled and executed, and an exception is thrown if it doesn't return a true value.
   * Calls the module's import method, which by convention brings the exported items (the default set, or the ones you list) into the current namespace
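Put together, a use statement behaves roughly like a require followed by a call to import, both wrapped in a BEGIN block so that they run at compile time. A minimal sketch, using the core module List::Util as a stand-in:

```perl
use strict;
use warnings;

# Roughly what "use List::Util qw(sum);" does behind the scenes.
BEGIN {
    require List::Util;
    List::Util->import('sum');
}

# sum() is available here because the import already ran at compile time.
print sum(1, 2, 3), "\n";   # 6
```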


Now in most cases it makes sense to do a use rather than a require for the following reasons:
   * use is applied at compile time, so a missing module or a syntax error in it is caught before any of your code runs
   * use only loads modules, hence it forces you to create packages and ensures separate namespaces are created
   * use allows control over which symbol table entries are imported - with a require'd file that declares no package of its own, potentially everything could land in your namespace

There are a few scenarios where require can provide a benefit
   * Some modules import a lot of items into your namespace and do a lot of work when they are loaded - if such a module is only needed optionally, we can require it on the code path that actually needs it
   * Some modules can be initialized only after some pre-processing has taken place. e.g. I have a common module for setting up Log4perl but this can be initialized only after the command line parameters have been processed. This won't be possible with a use, which runs at compile time, but require works perfectly for my needs.
   * To implement polymorphism, where you require only the relevant modules and not the universe of all potential modules.
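The first scenario might look like the sketch below. The describe function and its debug flag are hypothetical; Data::Dumper stands in for any module that is only needed on one code path.

```perl
use strict;
use warnings;

# Hypothetical helper: dump a data structure only when debugging is requested.
sub describe {
    my ($data, $debug) = @_;
    return "ok" unless $debug;

    # Loaded at run time, and only when this code path is reached.
    require Data::Dumper;
    return Data::Dumper->new([$data])->Terse(1)->Indent(0)->Dump;
}

print "before: ", (exists $INC{'Data/Dumper.pm'} ? "loaded" : "not loaded"), "\n";
print describe([1, 2], 0), "\n";
print describe([1, 2], 1), "\n";
print "after: ", (exists $INC{'Data/Dumper.pm'} ? "loaded" : "not loaded"), "\n";
```

With a use Data::Dumper at the top of the file, the module would be loaded even when the debug path is never taken.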


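The bareword concept is illustrated below. In the two non-bareword cases require looks for a file literally named Foo::Bar in @INC, which will normally fail.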
   require Foo::Bar;    # a splendid bareword
   $class = 'Foo::Bar';
   require $class;      # $class is not a bareword
   require "Foo::Bar";  # not a bareword because of the ""
   eval "require $class"; # it is bareword again inside the eval
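If the class name is only known at run time, an alternative to the string eval is to build the file name yourself, the same way require does for a bareword. A minimal sketch, where load_class is a hypothetical helper and File::Spec is just a convenient core module to load:

```perl
use strict;
use warnings;

# Load a class whose name is only known at run time.
sub load_class {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;   # File::Spec -> File/Spec.pm
    require $file;                           # dies if the file cannot be found
    return $class;
}

my $class = load_class('File::Spec');
print $class->catfile('a', 'b'), "\n";
```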

How to declare variables

So today we will look at the various options we have to declare a variable and how to decide which one to use. Primarily we will discuss the difference between use vars, our, my and local. I have quite often been confused as to when to use my and when to use our so I will elaborate a bit more on their differences in this article.

For the sake of this document I am using the following terms with the meanings given below
declare: Associates a name with a variable - your perl interpreter knows there is a variable to be used.
define: Actually allocates storage for the variable - if it was defined earlier the variable will lose that old definition.

my: This will declare and define a variable local to the enclosing block, file, or eval. Almost always you want to do this for all your variables unless you have a specific reason to do otherwise. If you define it outside any block in a Perl script, the variable is visible from that point to the end of the file. my variables aren't associated with any package, so you can't access such a variable from another file or through a package-qualified name. If you re-declare a variable in an inner scope, it creates another instance of the variable in that scope with the new data. Once the inner scope ends, the name refers to the original variable again, whose value was never changed.
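A short sketch of both points, with made-up variable names: an inner my hides the outer variable without changing it, and a my variable cannot be reached through a package-qualified name.

```perl
use strict;
use warnings;

my $name = 'outer';
my @seen;

{
    my $name = 'inner';    # a new variable that hides the outer one
    push @seen, $name;
}
push @seen, $name;         # the outer variable is visible again, unchanged

print "@seen\n";           # inner outer

# $main::name is a separate package variable that was never assigned.
my $pkg_value = defined $main::name ? $main::name : '(undefined)';
print "\$main::name is $pkg_value\n";   # $main::name is (undefined)
```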

our: This will at least declare a variable, and if you wish it can also define one. It makes a package variable usable by its short name under use strict. You would primarily encounter our with package globals such as @ISA and @EXPORT, or with global variables of the same package that were defined in another file. (Built-in globals such as @INC and %ENV are exempt from strict and need no declaration.) You need to define the variable only once; every other our declaration of that name in the same package refers to the same variable. You also don't need to declare it more than once in a file if it is declared at the top level, since the declaration is visible through the rest of the file unless restricted by scoping. A consequence that often surprises people is that changing an our variable in one function changes it globally, even if another function has its own our declaration for it.
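The shared nature of our variables can be seen in a sketch like this one, where the Counter package and its functions are invented for the example:

```perl
use strict;
use warnings;

package Counter;

sub bump {
    our $count;          # short name for $Counter::count
    return ++$count;
}

sub current {
    our $count;          # the same package variable, not a fresh one
    return $count;
}

package main;

Counter::bump() for 1 .. 3;
print Counter::current(), "\n";   # 3
print $Counter::count, "\n";      # 3, also reachable by its qualified name
```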

use vars: This will declare a variable in the current package (main, if no package statement is in effect). It is the older mechanism that our has largely replaced. The main place where use vars behaves differently from our is when you have multiple packages in one .pm file. The short name introduced by our stays visible across the following package statements (and keeps referring to the variable of the package where it was declared), whereas a use vars declaration applies only to the package in which it was made.
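A sketch of that difference with two hypothetical packages in one file:

```perl
use strict;
use warnings;

package Alpha;
use vars qw($shared);     # declares $Alpha::shared for package Alpha only
our $label = 'alpha';     # lexically scoped short name for $Alpha::label
$shared = 'from Alpha';

package Beta;
# $label still works here: the name introduced by our spans the rest of
# the file and still refers to $Alpha::label.
my $seen_label = $label;
# An unqualified $shared would not compile here under strict, because
# use vars only covered package Alpha. It has to be fully qualified.
my $seen_shared = $Alpha::shared;

package main;
print "$seen_label / $seen_shared\n";   # alpha / from Alpha
```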

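local: This does not create a new variable. It temporarily saves the value of an existing global (package) variable and gives it a new value, if any, until the enclosing block, file, or eval exits, at which point the old value is restored. Functions called from within that scope also see the temporary value. In most cases you want to do a my instead of a local as that is faster and considered safer. The only reason you want to do a local is if you are giving a special variable or a global variable a different value for the duration of the current scope.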
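The difference between local and my shows up as soon as another function is called from inside the scope. A minimal sketch with invented names:

```perl
use strict;
use warnings;

our $level = 'global';

sub show { return $level }        # reads whatever value is current when called

sub with_local {
    local $level = 'localized';   # temporary value, restored on scope exit
    return show();                # called functions see the temporary value
}

sub with_my {
    my $level = 'lexical';        # a brand-new variable, invisible to show()
    return show();
}

print with_local(), "\n";   # localized
print with_my(), "\n";      # global
print show(), "\n";         # global
```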


To highlight the difference between my and our I would like to quote from http://stackoverflow.com/questions/845060/what-is-the-difference-between-my-and-our-in-perl which says

Available since Perl 5, my is a way to declare:
    * non-package variables, that are
    * private,
    * new,
    * non-global variables,
    * separate from any package. So that the variable cannot be accessed in the form of $package_name::variable.

On the other hand, our variables are:
    * package variables, and thus automatically
    * global variables,
    * definitely not private,
    * nor are they necessarily new; and they
    * can be accessed outside the package (or lexical scope) with the qualified namespace, as $package_name::variable.


A good example to demonstrate the difference between my and our is as follows

$ cat test.pl
require "./rt.pl";
print $hello . "\n";
$ cat rt.pl
our $hello = 'world';
$ perl test.pl
world

Changing our $hello to my $hello would result in only an empty line being printed. The reason is that our made the variable part of the main package, whereas the my variable stays private to rt.pl and nothing outside that file can access it.
