Unix Scripting: some Traps, Pitfalls and Recommendations

Version 1.02, 7 February 2002

by

Marc Dobson

 


Acknowledgements

Thanks to Doris Burkhart, who suggested I write this document, and to all those who have contributed information: Reiner Hauser and Piotr Galonka.

 

1. Introduction

There is much confusion over scripts: how to call them, how to write them, and the many bad practices which complicate debugging in a multi-developer environment. In this document I show some of the pitfalls which arise from using the same scripts on different Unix platforms, and some which arise from sourcing or executing scripts. Each section presents examples to illustrate the pitfalls and recommendations for avoiding them.

 

2. Shells and Script Interpreter Locations on some Unix Platforms

The shells or script interpreters which I have come across, and which are discussed in more detail in this document, are SH, BASH, PERL and EXPECT. Many more exist, and much of what is said below applies to those as well. Examples of other shells or script interpreters are CSH, TCSH, ZSH and KSH.

The location of the shells or script interpreters is important as executed scripts (see section 3 for the different calling methods for a script, i.e. sourcing or executing) must include this information in a special line which is the first line of the script. This line for the SH shell will look like:
#!/bin/sh
(Note there is no space between the #! and the path /bin/sh)

In Unix the "standard" shell is (or used to be) the Bourne Shell, which is usually located in /bin/sh. This location is valid for Irix, Linux, Solaris, and LynxOS and is therefore more "standard" than most other shells or scripting interpreters (see below for other examples).

Note: on many platforms /bin/sh points these days to BASH anyway (BASH stands for Bourne Again Shell).

Under Linux the default location for BASH is /bin/bash, and the default locations for PERL and EXPECT are /usr/bin/perl and /usr/bin/expect.

Note: CERN have these also in /usr/local/bin/ and this is the officially recommended place. The reason for this is that this location points to the ASIS repository which is updated regularly. The standard Linux location is often not updated regularly unless your distribution is updated regularly.

Under Solaris 2.5.1, 2.6 or 2.7, BASH is located in /usr/bin/bash, PERL and EXPECT are located in /usr/local/bin/perl and /usr/local/bin/expect.

Under LynxOS 3.0.1 running on PowerPC (I do not have access to anything else) BASH seems to be located under /bin/bash, PERL and EXPECT seem to be located under /usr/bin/perl and /usr/bin/expect.

As one can see the locations vary even in the very limited set of platforms which I have had access to.

Another way to start a script is to use the ENV command, which simply runs a program in an environment. On Linux, for example, env bash will run BASH. It is therefore possible to write the following as the first line of a script:
#!/usr/bin/env bash
in order to run a BASH script. ENV finds out where BASH, or any other script interpreter, is located (from the default search path and from the user's PATH variable), and executes the script with that interpreter.

Writing scripts in this way implies that the only path that needs to be known or that needs to be the same on the various platforms is the path for ENV.

The path for ENV is /usr/bin/env on Linux and Solaris, and /bin/env on LynxOS.
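Before choosing a first line for a script, it can help to check where the interpreters actually live on a given machine. The following is a minimal sketch, assuming a shell which provides the "command -v" builtin; find_interp is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Report where each interpreter is found on the current PATH.
# find_interp is a hypothetical helper, not part of any standard tool.
find_interp() {
    # command -v prints the resolved location, or nothing if it is not found
    command -v "$1" 2>/dev/null || echo "not found"
}

for name in sh bash perl expect env; do
    printf '%-8s %s\n' "$name" "$(find_interp "$name")"
done
```

Running this on each platform gives a quick picture of which absolute paths differ, and whether ENV itself is where the script expects it to be.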

Recommendation 1: I suggest that wherever possible scripts should run in the Bourne Shell (for simple scripts at least).

Recommendation 2: for minimal change to the machine setups I suggest the use of the ENV method of starting a script, which requires only one link in the LynxOS setup: /usr/bin/env pointing to /bin/env.

Note: If people want to keep the most important BASH scripts free of ENV, then the suggestion is that /bin/bash is used and that under Solaris a link /bin/bash is created which points to /usr/bin/bash.

 

3. Methods of Calling Script Interpreters

Starting a shell interactively allows one or more commands to be typed and executed. It is also possible to type the commands into a file and tell the shell to read each command in turn from the file and execute it. The shell is then said to "source" the file, which is done with the following command:
source /path-to-file/file_of_commands
or
. /path-to-file/file_of_commands
which are equivalent in BASH (a strict Bourne shell only provides the "." form, and TCSH only provides the source form).

The file of commands would look something like:
echo "Starting to execute the list of commands"
cd /tmp
ls
cd -

Warning: BASH has some very unintuitive behaviour if you source a script and do not provide a path to it. See section 5.2 for the details.

Another way of executing commands is to put them into an executable script file which is run like any other program. An example script would be:
#!/bin/bash

echo "Starting to execute the list of commands"
cd /tmp
ls
cd -

The execute bit should be set on this file so that it can be executed straight from your shell in the following way:
./script_file
if the current directory is the same as that of the file, or:
/path-to-script/script_file
if the current directory is different.

The shell you are currently in knows, because of the executable bit, that this file can be executed, and the first line of the script, #!/bin/bash, tells it that the script should be dealt with by the scripting interpreter located at /bin/bash. The BASH interpreter is therefore started with the file as argument. Therefore the script runs inside a new BASH shell, not in the shell from which the script was executed. In the example above the result is the same whichever method is used.
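The fact that an executed script runs in a new shell can be made visible with a variable. The following is a minimal sketch under a few assumptions: the file is run with "sh file" rather than through its execute bit (the effect, a child shell, is the same), the "." form is used so that the sketch also runs under a plain Bourne shell, and DEMO_VAR and the temporary file name are hypothetical.

```shell
#!/bin/sh
# A variable set by a file is visible afterwards only if the file was sourced.
tmpdir=$(mktemp -d)
printf 'DEMO_VAR=set_by_file\n' > "$tmpdir/commands"

unset DEMO_VAR
sh "$tmpdir/commands"            # runs in a child shell
after_exec=${DEMO_VAR:-unset}

. "$tmpdir/commands"             # runs in the current shell
after_source=${DEMO_VAR:-unset}

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
rm -rf "$tmpdir"
```

After executing, the variable is still unset in the calling shell; after sourcing, it holds the value assigned in the file.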

Recommendation 3: wherever a script is sourced, I recommend that the command SOURCE is explicitly used, in place of the abbreviation ". " as this avoids any ambiguity with "./script_file" while reading through a script which sources or executes another script.

Note: The command-file (as opposed to the script-file) does not have the first line which is found in the script-file, i.e. #!/bin/bash. This line would not stop the script-file from being sourced, and indeed the script-file in the above example works perfectly well if sourced, but this is not always the case. See the next section for the subtleties of sourcing versus executing scripts.

WARNING: If the command-file (as opposed to the script-file) has the execute bit set, it can be executed even though it has no first line of the type #!/bin/bash. This is true as long as the first line does not have a hash symbol (#) as its first character. In TCSH, the SH script interpreter is then used to interpret the commands, whereas in BASH, the BASH script interpreter is used. If a hash symbol is present as the first character of the first line, then various things can happen depending on the rest of the line and on the shell from which the command-file is executed. The various behaviours are too complex to relate here. However, if the programmer has no choice as to whether a command-file has the execute bit set, and absolutely wants to prevent the user from executing it, it is possible to set the first line to something of the form:
#!/some_directory/
such as #!/. This will not interfere with the sourcing of the script but will stop the script being executed and report the same error as if the execute bit was not set. In BASH the error looks like:
bash: ./command_file: Permission denied
and in TCSH like:
./command_file: Permission denied.
The reason for this is that a directory cannot be executed. Note this is a general and empirical way of stopping a sourced script from being executed regardless of the shell being used. Note also that each shell might have an option to do this which the author does not know about (if you know of one then please send me a mail about it).
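The behaviour described in this warning can be checked with a small experiment. The following is a minimal sketch, assuming a writable temporary directory; GUARD_DEMO and the file names are hypothetical, and the exact error text (discarded here) varies between shells and versions.

```shell
#!/bin/sh
# A command-file whose first line is "#!/" can be sourced but not executed.
tmpdir=$(mktemp -d)
cat > "$tmpdir/command_file" <<'EOF'
#!/
GUARD_DEMO=sourced
EOF
chmod +x "$tmpdir/command_file"

# Try to execute it: this should fail because "/" is a directory.
if "$tmpdir/command_file" 2>/dev/null; then
    exec_status=0
else
    exec_status=$?
fi

# Source it: the first line is just a comment to the shell.
. "$tmpdir/command_file"

echo "exit status when executed: $exec_status"
echo "value after sourcing: $GUARD_DEMO"
rm -rf "$tmpdir"
```

The attempt to execute the file returns a non-zero status, while sourcing it still sets the variable.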

 

4. Sourcing versus Executing

As the reader probably knows, if an interactive shell is started, typing the EXIT command will exit the shell. The same applies to the command-file or the script-file. In the command-file, an EXIT command will exit the shell where the SOURCE command was executed, as it is that shell which reads each command in turn and executes it. In contrast, in a script-file (if executed, not sourced) the EXIT command will exit only the shell/interpreter which was started to execute the commands in the script. The executed script-file will therefore just stop and return to the shell which called it.

The danger is therefore if a script-file is sourced instead of being executed. If in that script file there is an EXIT command then the shell which called the script will be exited and not just the script.

As an example take the following two scripts. Script 1 is:
#!/bin/bash

echo "Executing script2"
./script2
if [ $? -eq 0 ]; then
   echo "Executing ls in /tmp/md"
   ls -l /tmp/md
else
   echo "Exiting"
   exit 1
fi

And script 2 is:
#!/bin/bash

echo "In script 2"
if [ -e "/tmp/md" ]; then
   echo "/tmp/md exists"
else
   echo "/tmp/md does not exist"
   exit 1
fi

Both scripts should have the execute bit set. Start a BASH shell by typing bash, and at the next prompt execute script 1. The following output is produced:
If directory /tmp/md exists:
Executing script2
In script 2
/tmp/md exists
Executing ls in /tmp/md
total 0

If directory /tmp/md does not exist:
Executing script2
In script 2
/tmp/md does not exist
Exiting

Now change script 1 to source script 2 instead of executing it (source ./script2 instead of ./script2). When script 1 is executed the following output will be produced:
If directory /tmp/md exists:
Executing script2
In script 2
/tmp/md exists
Executing ls in /tmp/md
total 0

If directory /tmp/md does not exist:
Executing script2
In script 2
/tmp/md does not exist

If the directory /tmp/md exists then the output is the same and exactly the same commands were executed. If however the directory /tmp/md does not exist, then script 2 calls EXIT and, as it was sourced from script 1, it is actually script 1 which exits, without the desired effect of printing "Exiting". In this case it is not very important, but it could have very profound consequences with complex scripts.

The ambiguity in this case is compounded by the difference in coding in the two branches of the IF statement of script 2. For the case when the directory exists the EXIT command is implicit (the script goes to the end and exits normally), whereas for the case when the directory does not exist the EXIT command is explicit (this is the one which causes the exit from script 1).
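The effect of an explicit EXIT in a sourced file can be reproduced in a self-contained way. The following is a minimal sketch: the inner "sh -c" stands in for script 1, the temporary file stands in for a reduced script 2, and the "." form is used so that the sketch also runs under a plain Bourne shell. The file and variable names are hypothetical.

```shell
#!/bin/sh
# An exit inside a sourced file terminates the shell that sourced it.
tmpdir=$(mktemp -d)
cat > "$tmpdir/script2" <<'EOF'
echo "In script 2"
exit 1
EOF

# The caller sources script2 and then tries to print "Exiting".
caller_status=0
caller_output=$(sh -c '. "$1"; echo "Exiting"' caller "$tmpdir/script2") || caller_status=$?

echo "$caller_output"
echo "caller exit status: $caller_status"
rm -rf "$tmpdir"
```

The caller never reaches its own echo: the only line it prints comes from the sourced file, and its exit status is the one passed to EXIT in that file.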

If the programmer wishes to exit from a sourced script file (as he would with the EXIT command in an executed script), he may do so with:
return [n]
where "[n]" is the return value that can be tested for in the script/shell which sourced the script file (as with the EXIT command). Beware though that the RETURN command is also used to exit a function, therefore make sure that the RETURN command is placed in the appropriate place for the desired effect.
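As an illustration of RETURN in a sourced file, the following is a minimal sketch, assuming a shell which accepts return at the top level of a sourced file (BASH does). CHECK_DIR, check_dir and the status variables are hypothetical names, and the "." form is used so that the sketch also runs under a plain Bourne shell.

```shell
#!/bin/sh
# A sourced file leaves with return, so the calling shell keeps running.
tmpdir=$(mktemp -d)
cat > "$tmpdir/check_dir" <<'EOF'
# Command-file: meant to be sourced, so it uses return, not exit.
if [ ! -d "$CHECK_DIR" ]; then
    echo "$CHECK_DIR does not exist"
    return 1
fi
echo "$CHECK_DIR exists"
EOF

CHECK_DIR="$tmpdir/missing"
if . "$tmpdir/check_dir"; then status_missing=0; else status_missing=$?; fi

CHECK_DIR="$tmpdir"
if . "$tmpdir/check_dir"; then status_present=0; else status_present=$?; fi

echo "caller still running, statuses: $status_missing $status_present"
rm -rf "$tmpdir"
```

The calling shell can test the value given to RETURN exactly as it would test the exit status of an executed script, and it carries on afterwards in both cases.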

Recommendation 4: If the script is meant to be sourced then do not put a line at the top of the file of the format #!/bin/bash and do not set the execute bit on this file (see the 'Warning' in the previous section if you do not have control over the execute bit).

Recommendation 5: The EXIT command should NEVER be used in a command-file (sourced script) unless the developer is ABSOLUTELY sure that it does what he intends it to do.

Recommendation 6: some people keep the top line #!/bin/bash to show "humans" which script interpreter the commands in the file are meant for. I would recommend that if a reminder is needed, a naming convention be used such as: command-file.bash, or command-file.csh.

Recommendation 7: If the script is meant to be executed, the first line should be of the form #!/bin/bash and the execute bit should be set for this file.

Recommendation 8: The EXIT command can be used in an executed script file without any risk with respect to other scripts which might call it.

Recommendation 9: The RETURN command can be used to exit a sourced script file, however be careful as it is also used to exit functions (either in sourced or executed script files).

The above six recommendations will make it easier for people who do not know the scripts to understand how they should be used, i.e. sourced or executed, without any ambiguity and without having to read and understand the scripts.

 

5. Special Cases

5.1 Duplicate Functionality

If the same functionality (i.e. the same commands) needs to be sourced in one script and executed in another, there is a simple way to do this without conflicting with the above recommendations.

Put the commands in a command-file (as opposed to a script-file) and source this file from the script which needs to source the commands. Create another file which is this time a script-file and inside this source the command-file. Now from the script that needs to execute the commands call the second file that was created.

As an example, find below the two files to create. The first file (command-file) could be:
echo "The commands to run start now"
cd /tmp
ls
cd -
echo "The commands to run finish now"

The second file (the script-file) could be:
#!/bin/bash

echo "We have been executed"
echo "Running the commands ..."
source ./command-file
echo "Exiting"

If the commands need to be sourced then call:
source ./command-file
and if they need to be executed then use:
./script-file
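One point to watch with this arrangement is that "source ./command-file" is resolved relative to the current directory, not relative to the script-file. The following is a minimal sketch of one way around this, assuming the two files live in the same directory and that the script-file is started with a path to it; the variable names are hypothetical, and the "." form is used so that the sketch also runs under a plain Bourne shell.

```shell
#!/bin/sh
# The wrapper locates the command-file next to itself via dirname "$0".
tmpdir=$(mktemp -d)
printf 'echo "The commands to run start now"\n' > "$tmpdir/command-file"
cat > "$tmpdir/script-file" <<'EOF'
#!/bin/sh
here=$(dirname "$0")        # directory containing this script-file
. "$here/command-file"
EOF

# Run the wrapper from an unrelated directory.
wrapper_output=$(cd / && sh "$tmpdir/script-file")

echo "$wrapper_output"
rm -rf "$tmpdir"
```

With this form the script-file finds its command-file whatever the caller's current directory is.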

5.2 Unintuitive Bash Sourcing Behaviour

By default, when the command "source filename" does not contain a slash (i.e. does not include a path for the file), BASH searches the directories in the $PATH environment variable for the filename, and only if it does not find it there does it search the current directory. Furthermore, this happens whether or not the file has the executable bit set (it should not be set anyway if the file is meant to be sourced, see Recommendation 4 in section 4).

This is counterintuitive to say the least; TCSH, for example, does NOT do this search.

One can disable this behaviour with the BASH "shopt" builtin command:
shopt -u sourcepath

Note: the shopt builtin command is available in BASH version 2.02 and onwards but is not available in BASH version 1.14 (which is the default version used on RedHat 6.1). In the latter there is no way to change this behaviour.

Recommendation 10: when sourcing a script always use a path name for the file or check that the correct script is the first with that name in the path. If the script is in the current directory then use the command "source ./filename".
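The search order can be observed with two files of the same name. The following is a minimal sketch, assuming a BASH with the shopt builtin is installed and found on the PATH, and that it is not running in POSIX mode; setup_demo and the directory names are hypothetical.

```shell
#!/bin/sh
# Two files named setup_demo: one in a PATH directory, one in the current one.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/on_path" "$tmpdir/cwd"
echo 'echo "sourced from PATH directory"' > "$tmpdir/on_path/setup_demo"
echo 'echo "sourced from current directory"' > "$tmpdir/cwd/setup_demo"

# No slash in the name: BASH searches PATH first.
plain_name=$(cd "$tmpdir/cwd" && PATH="$tmpdir/on_path:$PATH" bash -c 'source setup_demo')

# Explicit path: no search takes place.
with_path=$(cd "$tmpdir/cwd" && PATH="$tmpdir/on_path:$PATH" bash -c 'source ./setup_demo')

# sourcepath switched off: PATH is not searched.
sourcepath_off=$(cd "$tmpdir/cwd" && PATH="$tmpdir/on_path:$PATH" bash -c 'shopt -u sourcepath; source setup_demo')

echo "plain name:     $plain_name"
echo "with ./ prefix: $with_path"
echo "sourcepath off: $sourcepath_off"
rm -rf "$tmpdir"
```

Only the first form picks up the file from the PATH directory; the explicit "./" prefix and the disabled sourcepath option both pick up the file in the current directory.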

Recommendation 11: as a precaution you can also put the option to change the default BASH behaviour in your ".bash_profile" setup file. Note however that this will not help if you source a file from within an executed script.

An example of code to include in your ".bash_profile" file is:
# shopt does not exist in BASH 1.14, so only call it where the builtin exists
if type shopt > /dev/null 2>&1; then
  shopt -u sourcepath
fi

Recommendation 12: always choose a unique script name.

Unique script names can easily be obtained by prefixing the script name with the project name (e.g. onlinesw_setup for the setup script of the Online Software project). Bad names are, for example, setup, configure, etc.

 

6. Conclusions

I hope that this short document will be of some use to Unix script developers and I would welcome any feedback, additions or constructive criticism.

If you have information about different Unix platforms, I will be glad to integrate that information into this document.

 

A. Appendix

A.1 Change Log

The Change Log is organised in reverse date order so that the most recent changes are first.

7 February 2002, Version 1.02
Added part about executing a command-file if the execute bit is set, and the way to stop it from working.
 
28 January 2002, Version 1.01
Added part about unique script names. Other minor changes (typos etc...)
 
25 January 2002, Version 1.00
Added part on the RETURN command.
Added the version number, date, author.
Added the Appendix section (change log).
 
17 January 2002
Added section on "Unintuitive Bash Sourcing Behaviour", from information received from Reiner Hauser.
 
16 January 2002
Added paragraph on the "Standard" shell being /bin/sh.
 
15 January 2002
First draft of the document.

 


Marc DOBSON
Last modified: Thu Feb 7 15:53:36 CET 2002
http://cern.ch/Marc.Dobson/onlinesw/unix-scripting.html