Python 7: Exception Handling and Operating System Interfaces

7.1 Error Handling

Programs written in languages which don't have exception handling often need more code to handle exceptional conditions (i.e. exceptions) than to do the job the program was written for. Examples of exceptions include input files that do not yet exist, out-of-memory conditions, and input values of the wrong range or type.

There are two problems with this approach:

a. The inherent logic of the application is obscured by all the exception coding.

b. The programmer has to consider every possible exception in advance and code specifically for these.

Maintainable source code should be easy to read. When reviewing an unfamiliar application a maintenance programmer will want to know how the program does the job it was designed for. If the exceptions dominate the business logic the source code will not be easy to read. Even if you maintain your own programs the details will probably become unfamiliar to you a few months after you have written them.

Item b. may be considered an advantage in some environments, e.g. for safety critical programming of a system on which human lives depend, or where a robust application is needed for other reasons. However, it is a disadvantage in rapid prototyping environments where programming time is critical to the success of the project and early program users are co-developers who help test what will become a more robust and fully-understood program.

Another reason for building exception handling into a programming language is that it allows an exception to be raised at the point where it occurs, and then caught and handled by the code best placed to deal with it. For example, a library function may be the best place to raise an exception when it receives a request to write to a file for which the user doesn't have write access. However, the designer of this function probably won't know what to do about the situation, as he or she has no knowledge of the applications which use the function. The programmer who creates an application using this function can either ignore the exception, in which case it propagates up through the calling functions or modules and the application stops if nothing catches it, or trap the error and decide what to do with it. For example, if the error concerns lack of write access to a file in which entered data is to be saved, it might be useful to let the user save their work elsewhere in preference to the program crashing with lost input data.
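
The division of labour described above can be sketched as follows. This is only an illustration, written in Python 3 syntax rather than the Python 2 used elsewhere in these notes; the function names save_text and save_with_fallback, and the choice of the system temporary folder as the fallback location, are assumptions made for the example.

```python
import os
import tempfile


def save_text(path, text):
    """Library-level function: it lets any OSError from open() propagate."""
    with open(path, "w") as handle:
        handle.write(text)


def save_with_fallback(preferred_path, text):
    """Application-level code: decides what to do when saving fails."""
    try:
        save_text(preferred_path, text)
        return preferred_path
    except OSError as err:
        # Assumption for this sketch: fall back to the system temporary folder.
        fallback = os.path.join(tempfile.gettempdir(),
                                os.path.basename(preferred_path))
        print("could not write %s (%s); saving to %s instead"
              % (preferred_path, err, fallback))
        save_text(fallback, text)
        return fallback
```

The library function knows nothing about fallbacks; only the application code decides that saving elsewhere is better than losing the data.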

A situation where many programmers first encounter exceptions is when handling input. This is because input comes in various forms, a number of assumptions have to be made for input to be available at the time the program expects it, and there is little guarantee that the kind of input received will be the kind intended. Simply pressing the enter key twice accidentally in response to a prompt for input can cause the request to fail. Other common input-related exceptions include unexpected end-of-file conditions or problems opening an external file, e.g. the file not existing or the program user not having the access privileges needed.

One of our first programs demonstrated a potential exception:

import math
radius=int(raw_input("Please enter a radius"))
area=math.pi*radius**2
print "For a circle of radius %s the area is %s" % (radius,area)

When inputting a value of 1.5 the program crashed and we received the error dump:

Traceback (most recent call last):
  File "C:\Python21\Pythonwin\pywin\framework\scriptutils.py",
  line 301, in RunScript
    exec codeObject in __main__.__dict__
  File "H:\rich\python\circarea2.py", line 2, in ?
    radius=int(raw_input("Please enter a radius"))
ValueError: invalid literal for int(): 1.5

This message gave us details of the error and the program lines where the error occurred. We were then able to improve our program by changing the int() conversion to float() so it would handle real numbers as well as integers. However, this didn't prevent all possible errors causing the program to crash, e.g. when a value of "fred" was input or the return key was pressed without the user having typed anything.

7.2 Trapping an error

The next stage in arranging for more robust input handling involved trapping or "catching" the error and handling this in a more controlled manner:

import math
try:
  radius=float(raw_input("Please enter a radius"))
  area=math.pi*radius**2
  print "For a circle of radius %s the area is %s" % (radius,area)
except(ValueError):
  print "you entered invalid input"

The keywords try and except do the error catching. They operate in a similar way to the if and else keywords: the try: block runs in full when no ValueError is raised, and the except: block runs when one is.

Here we were catching a specific and known kind of error, the ValueError. Python contains a number of built-in exception classes and ValueError is one of these.

We can get a list of the built-in exception classes (followed by other built-in names, e.g. range and len) by entering dir(__builtins__) at the interpreter >>> prompt:

['ArithmeticError', 'AssertionError', 'AttributeError', 'DeprecationWarning',
'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'FloatingPointError',
'IOError', 'ImportError', 'IndentationError', 'IndexError', 'KeyError',
'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None',
'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError',
'RuntimeError', 'RuntimeWarning', 'StandardError', 'SyntaxError',
'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'TypeError',
'UnboundLocalError', 'UnicodeError', 'UserWarning', 'ValueError', 'Warning',
'WindowsError', 'ZeroDivisionError', '__debug__', '__doc__', ... other builtin
names ...]

A slightly neater way to have written the above program would be like this:

import math
try:
  radius=float(raw_input("Please enter a radius"))
except(ValueError):
  print "you entered invalid input"
else:
  area=math.pi*radius**2
  print "For a circle of radius %s the area is %s" % (radius,area)

This makes it clear that the only statement being protected by the try: clause is one containing the call to the float() and raw_input() functions. The rest of the normal behaviour of the program occurs in the else: block, which is only executed if there is no exception.
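
The same try / except / else layout can be tried without a keyboard by passing the text in as a parameter. The sketch below uses Python 3 syntax, and the function name circle_area_from_text is made up for the illustration.

```python
import math


def circle_area_from_text(text):
    """Return the circle area for a radius given as text, or None if it is not a number."""
    try:
        radius = float(text)            # the only statement being protected
    except ValueError:
        return None                     # invalid input: fail gracefully
    else:
        return math.pi * radius ** 2    # runs only when float() succeeded


print(circle_area_from_text("1.5"))
print(circle_area_from_text("fred"))
```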

In the above case the program was either given correct input and performed its desired function, or it received incorrect input and failed gracefully with a suitable error message. In larger applications it may be undesirable for the program to fail, especially if there is unsaved data, or the user has spent some time getting it into its current state to allow other actions to be performed. Where the programmer intends to keep prompting the user until usable data is input, the above exception handling needs to be placed inside a loop. In Python this can be achieved with a while loop:

import math
while 1:
  try:
    radius=float(raw_input("Please enter a radius"))
  except(ValueError):
    print "you entered invalid input, try again"
  else:
    break

area=math.pi*radius**2
print "For a circle of radius %s the area is %s" % (radius,area)

In the above example, alphabetic and null inputs are suitably dealt with, and the program prompts for the required data until it gets valid input. Without a break statement, which only occurs if there is no ValueError exception, while 1: loops forever.
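
The retry logic can also be exercised without a user by feeding it a list of canned replies. In this Python 3 sketch a for loop over the replies plays the part of while 1:, and the function name first_valid_radius is hypothetical.

```python
def first_valid_radius(replies):
    """Work through simulated replies until one converts to a float."""
    for reply in replies:
        try:
            radius = float(reply)
        except ValueError:
            print("you entered invalid input, try again")
        else:
            return radius    # equivalent to the break in the loop above
    return None              # ran out of replies without a valid number


print(first_valid_radius(["fred", "", "2.5", "7"]))
```

The alphabetic and empty replies are rejected in turn, and the first usable value ends the loop.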

Running the above code in the PythonWin IDE environment allows another kind of input error which the above program doesn't trap. This environment uses a dialog box to collect input which has a cancel button. Pressing this button crashes the program. Looking at the interactive response window we get on the last line of the crash dump:

KeyboardInterrupt: operation cancelled

The more robust approach to this is that if we have already trapped one kind of error associated with our use of raw_input() we can trap another. Changing our

except(ValueError):

error trap to

except(ValueError,KeyboardInterrupt):

allows both possible input errors to be caught. This incremental approach based on extensive program testing and observation has the advantage that if other errors which we have not yet thought about occur in future the program will still crash allowing the crash dump to be investigated, so the programmer can be made aware of any exceptions not yet catered for.
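
A small sketch of this incremental approach, in Python 3 syntax: the tuple names exactly the two errors observed so far, so anything else still escapes with a full traceback. The function read_radius and the stand-in cancelled are assumptions made so the example runs without a dialog box.

```python
def read_radius(reader):
    """Call reader() for a reply; return a float, or None on a known input error."""
    try:
        return float(reader())
    except (ValueError, KeyboardInterrupt) as err:
        print("input problem: %s" % type(err).__name__)
        return None


def cancelled():
    # Simulates the user pressing a cancel button.
    raise KeyboardInterrupt("operation cancelled")


print(read_radius(lambda: "3"))
print(read_radius(lambda: "fred"))
print(read_radius(cancelled))
```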

7.3 The catchall approach

A quicker approach to handling this situation is to catch all possible errors by not specifying which ones, by changing the except line to:

except:

This catches every exception raised in the associated try: block. This approach is useful in some situations, e.g. for rapid prototyping, but it isn't helpful everywhere. You might, for instance, consider putting your entire program into a try-except block, or into a function called main and then surrounding your call to main() with a try-except block:

def main():
  import math
  radius=float(raw_input("Please enter a radius"))
  area=math.pi*radius**2
  print "For a circle of radius %s the area is %s" % (radius,area)

try:
  main()
except:
  print "program failed"

This will trap all run-time errors, including a SyntaxError raised at run time (for example by an imported module; a syntax error in the script file itself is reported before the try: block ever runs). The problem with such a "high level" approach is that when something does go wrong you don't know anything about what went wrong or why. The stack trace that would give you some of this information is suppressed, putting an effective barrier on further program development.

Program testers need information about what went wrong and the system gives better information without this kind of error trap.
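
If a top-level trap is wanted anyway, one possible compromise is to keep the stack trace rather than discard it. The Python 3 sketch below uses the standard traceback module; the names run_and_report and faulty_task are hypothetical, and it catches Exception rather than using a bare except: so that KeyboardInterrupt and SystemExit still pass through.

```python
import traceback


def run_and_report(task):
    """Run task(); on an error return the formatted stack trace instead of hiding it."""
    try:
        task()
    except Exception:
        # format_exc() returns the text the interpreter would have printed.
        return traceback.format_exc()
    return None


def faulty_task():
    return float("fred")    # raises ValueError


report = run_and_report(faulty_task)
print(report)
```

The returned text could equally be written to a log file, so testers still see where the failure happened.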

7.4 Raising your own exceptions

You can do this using the raise keyword. You might want to do this if you are expecting one of a set of input values from a user, or a value within a range. There are many other errors your program might detect which are better handled using try-except in preference to the if-else idiom. The place where you raise the error should be within a try: block, but this could be from within a function or method called from the try block.

import math
while 1:
  try:
    radius=float(raw_input("Enter a radius between 1.0 and 10.0"))
    if not (1.0 <= radius <= 10.0):
      raise ValueError
  except(ValueError,KeyboardInterrupt):
    print "You entered invalid input. Please try again."
  else:
    break

area=math.pi*radius**2
print "For a circle of radius %s the area is %s" % (radius,area)
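
The range check can also live in a function called from the try: block, with a message attached to the exception to say what was wrong. This is a Python 3 sketch; the function name checked_radius and its default limits are assumptions for the illustration.

```python
def checked_radius(text, low=1.0, high=10.0):
    """Convert text to a float and insist that it lies between low and high."""
    radius = float(text)    # float() raises ValueError by itself for non-numbers
    if not (low <= radius <= high):
        raise ValueError("radius %s is outside %s to %s" % (radius, low, high))
    return radius


for reply in ["2.5", "11", "fred"]:
    try:
        print(checked_radius(reply))
    except ValueError as err:
        print("invalid input: %s" % err)
```

Both kinds of bad input arrive at the same except clause, so the caller needs only one error path.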

7.5 Forcing cleanup with try - finally

You can use a finally block instead of except; it will execute whether or not an exception was raised within the try block. This is useful if something must be done by your program whether or not the protected statement succeeds. Code that typically appears within finally blocks includes cleanup actions, e.g. closing a file or removing a lock file before the program exits.

In the following example, if the user enters anything other than 0 or 0.0 the inverse is printed. If the user inputs zero the null object: None is printed and the ZeroDivisionError is raised.

inverse=None # Python's NULL object
try:
   value=float(raw_input("enter a number and I will give you the inverse"))
   inverse=1/value # might be a ZeroDivisionError here
finally:
   print inverse # always printed, but might be None

An exception raised in this code will still propagate up the function call stack and crash the program if it is not caught somewhere else. You can still catch it using another try-except block, for example within a calling function:

def print_inverse(value):
  inverse=None # Python's NULL object
  try:
    inverse=1/value # might be a ZeroDivisionError here
  finally:
    print inverse # object always printed, value might be None

try:
  val=float(raw_input("enter a number and I will invert it"))
  print_inverse(val)
except(ZeroDivisionError):
  print "tried to divide by zero"
except(ValueError):
  print "didn't enter a number"
else:
  print "everything OK"

The above code doesn't always call the print_inverse function, because if the user enters something that isn't a number and float() raises an exception, execution continues with the except(ValueError) block. However, if print_inverse is called, the object referred to by the name inverse is printed, even if this is still the default None object.
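
The order of events can be made visible by recording them in a list. In this Python 3 sketch the function invert and the list events are hypothetical names; the point is that the finally: block runs both when the division succeeds and when it fails.

```python
def invert(value, log):
    """Return 1/value; always append a note to log, even when the division fails."""
    inverse = None
    try:
        inverse = 1 / value
        return inverse
    finally:
        log.append("inverse is %s" % inverse)    # runs on success and on failure


events = []
print(invert(4.0, events))
try:
    invert(0.0, events)
except ZeroDivisionError:
    events.append("tried to divide by zero")
print(events)
```

The second call records "inverse is None" before the ZeroDivisionError reaches the caller's except clause.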

7.6 Raising your own kind of error

You might not consider any of the built in exception objects as suitable to carry your error condition, and might want to associate additional information with your error, such that this additional information can be emailed to a maintenance programmer or written to a log. This may be the only way developers can access information about program bugs in an environment where programs provide 24 x 7 services accessed over networks, as opposed to standalone same-computer user interaction. Python also allows you to create your own exception classes including subclass hierarchies of these if you so wish. This might be useful in enabling code to trap various errors of a related type in one place, especially in a very large and complex system. The most common DIY exception object is a string:

# program demonstrates user defined exception objects

import sys # need this for sys.exc_info() to get full error information

def weight_check(weight):
    if weight < 20:
        raise tl # too light exception
    elif weight < 40:
        print "you need to eat more"
    elif weight < 70:
        print "you seem to be normal"
    elif weight < 150:
        print "you need to go on a diet"
    else:
        raise bs # broke scales exception

bs="broke the scales"
tl="too light for adult"
weight=float(raw_input("enter your weight in Kg: "))
try:
   weight_check(weight)
except tl: # too light for an adult
   print sys.exc_info()[0]
   weight_range=1
except bs: # broke the scales
   print sys.exc_info()[0]
   weight_range=3
else:
   weight_range=2
print weight_range # named weight_range to avoid hiding the built-in range()

sys.exc_info() returns a tuple of three items: the exception type, the exception value and the traceback. Here we are just accessing the first element of this tuple, [0], which for a string exception is the error string itself.
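
The exception classes and subclass hierarchies mentioned above might look like the following sketch. It is written in Python 3 syntax, where string exceptions are no longer available, and the class names WeightError, TooLightError and BrokeScalesError are invented for the illustration.

```python
class WeightError(Exception):
    """Base class, so callers can trap every weight problem in one place."""

    def __init__(self, weight, message):
        Exception.__init__(self, message)
        self.weight = weight    # additional information carried with the error


class TooLightError(WeightError):
    pass


class BrokeScalesError(WeightError):
    pass


def weight_check(weight):
    if weight < 20:
        raise TooLightError(weight, "too light for adult")
    if weight >= 150:
        raise BrokeScalesError(weight, "broke the scales")
    return "weight accepted"


try:
    weight_check(200.0)
except WeightError as err:    # catches both subclasses
    print("%s: %s (weight=%s)" % (type(err).__name__, err, err.weight))
```

Catching the base class handles related errors together, while an except clause naming one subclass would handle only that case.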

7.7 Using command line parameters in Python

If you know how to use the system command line you will probably be able to automate your computer to take regular scheduled backups without buying expensive third-party software, so that all the operator has to do is put the correct media in the tape or CD-RW drive before each scheduled backup. If you don't know how to do this you will probably not be taking very good backups (most people don't) and so are at greater risk of losing more than a day's or week's worth of unsecured data.

The batch-file or shell-script programming environment makes extensive use of command lines complete with parameters, put into files which can automate routine and/or complex operations. Batch or shell scripts can also call Python programs passing suitable command line parameters, e.g. telling them which folders to copy to CD-R on which days of the week or month.

These parameters, the extra words which follow the command that runs the program itself, are available in the list argv, which is accessed through the sys module. The following program prints out this list:

# file: printargs.py
import sys
argc=0
for arg in sys.argv:
  print "parameter %d is: %s" % (argc,arg)
  argc+=1

Running printargs.py using the command:

python printargs.py this that another

gave the following output:

parameter 0 is: H:\rich\python\printargs.py
parameter 1 is: this
parameter 2 is: that
parameter 3 is: another

Parameter 0 is generally the name of the file containing the program.
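
Because sys.argv is an ordinary list, the program name and the parameters proper can be separated by indexing and slicing. A minimal Python 3 sketch, in which the function name describe_args is hypothetical:

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the program name and its parameters."""
    program = argv[0] if argv else ""
    parameters = argv[1:]
    return program, parameters


program, parameters = describe_args(sys.argv)
print("program: %s, %d parameter(s): %s" % (program, len(parameters), parameters))
```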

7.8 Use of the glob module for file wildcarding

This module uses Unix shell wildcard expansion rules for generating a list of filenames matching a particular wildcard or pattern in a particular directory. Anyone who has used batch or shell commands of the form:

MS-DOS:	c:\> copy a:\*.txt c:\flop_bak
UNIX:	$ cp /mnt/floppy/*.txt ~/flop_bak

will know how much time and wrist strain can be saved by using filename wildcards compared to repetitive pointing and clicking using a file explorer to manipulate a large set of files with related names. The shell commands above are the equivalent MS-DOS and Unix commands for copying all .txt (text) files from the floppy disk drive into another folder.

>>> import glob
>>> glob.glob("circ*py")
['circarea.py', 'circarea0.py', 'circarea2.py', 'circarea3.py', 'circarea4.py',
'circarea5.py', 'circarea6.py', 'circarea7.py']
>>> glob.glob("h:\\rich\\*")
['h:\\rich\\cs+p1', 'h:\\rich\\cs+p2', 'h:\\rich\\csp1ex', 'h:\\rich\\download',
'h:\\rich\\inetprog', 'h:\\rich\\java', 'h:\\rich\\markbak', 'h:\\rich\\mscwud',
'h:\\rich\\perl', 'h:\\rich\\perlcourse', 'h:\\rich\\pfw3', 'h:\\rich\\py',
'h:\\rich\\python', 'h:\\rich\\pythoncourse', 'h:\\rich\\sv125.tmp',
'h:\\rich\\sv1j9.tmp', 'h:\\rich\\sweng3', 'h:\\rich\\WPA']

The first example obtained a list of files starting with "circ" and ending with "py". The second example obtained a list of folders within a folder. Unix shell style wildcards include:

* matches zero or more characters, which can include dots, except that it will not match a dot at the start of a filename.

? matches any single character.

[...] matches any single character from the bracketed set or range, e.g. [0-9] or [abc].

Wildcards can be combined with literals, e.g. [2-4][0-9][0-9].txt matches a filename consisting of a number between 200 and 499 followed by .txt
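
A pattern can be tried out against a list of names, without touching the disk, using the standard fnmatch module, which applies the same shell-style rules (except that fnmatch gives no special treatment to a leading dot). The file names in this Python 3 sketch are made up for the illustration.

```python
import fnmatch

names = ["199.txt", "200.txt", "345.txt", "499.txt", "500.txt", "34.txt", "345.text"]
pattern = "[2-4][0-9][0-9].txt"

# fnmatchcase() compares case-sensitively on every platform.
matches = [name for name in names if fnmatch.fnmatchcase(name, pattern)]
print(matches)
```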

These wildcards have certain similarities with regular expressions, but the glob wildcard rules aren't exactly the same.

If you want more selective matching of a set of filenames than you can easily achieve with the above patterns, it might be easier to read a larger list of filenames into your program with a simple file glob, and then filter this list into a smaller set of matching filenames using regular expressions.
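
The two-stage approach might look like this Python 3 sketch. It builds its own scratch folder with tempfile so that it can be run anywhere; the file names and the regular expression (one or more digits before .py, which a glob pattern cannot express directly) are assumptions for the example.

```python
import glob
import os
import re
import tempfile

# Create a scratch folder containing a few empty files to search.
folder = tempfile.mkdtemp()
for name in ["circarea.py", "circarea2.py", "circarea10.py", "circle.txt", "notes.py"]:
    open(os.path.join(folder, name), "w").close()

# Stage 1: a simple glob collects a generous list of candidates.
candidates = glob.glob(os.path.join(folder, "circ*"))

# Stage 2: a regular expression keeps only names with one or more digits before ".py".
numbered = re.compile(r"^circarea[0-9]+\.py$")
selected = sorted(
    os.path.basename(path)
    for path in candidates
    if numbered.match(os.path.basename(path))
)
print(selected)
```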

The glob module is useful within programs which are expected automatically to process a number of files matching a particular filename pattern, e.g. a program which inserts a <body> tag bgcolor attribute into all files ending with .htm or .html within a particular folder, or a tree of folders.

7.9 Use of the os, stat and time modules

Changing and listing directories

The os module has various functions for operating system access. Probably the most commonly used are for changing and listing directories. os.chdir changes the working directory. os.listdir returns a list of the files and directories within the directory or folder whose path it is given.

>>> import os
>>> os.chdir("\\")
>>> os.listdir("h:\\")
['crats', 'jbuild', 'mscjava', 'My docs', 'notes1.html', 'perlmarks', 'rich']
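
A portable variation on the session above, in Python 3 syntax: it creates a scratch folder with tempfile (an assumption, so that the example does not depend on drive h:), changes into it, lists it, and uses try / finally to restore the original working directory.

```python
import os
import tempfile

start_dir = os.getcwd()                      # remember where we started
folder = tempfile.mkdtemp()
open(os.path.join(folder, "notes1.html"), "w").close()
os.mkdir(os.path.join(folder, "rich"))

try:
    os.chdir(folder)
    entries = sorted(os.listdir("."))        # "." means the current directory
    kinds = ["dir" if os.path.isdir(entry) else "file" for entry in entries]
finally:
    os.chdir(start_dir)                      # always restore the working directory

print(list(zip(entries, kinds)))
```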

Spawning programs and changing environment variables

This module can also obtain the path of the current directory, create and remove directories, rename and delete files, and set environment variables, including PYTHONPATH. For example:

>>> import os
>>> os.startfile("c:\\winnt\\system32\\cmd.exe")
>>> os.environ["OS"]="Linux rules"
>>> os.startfile("c:\\winnt\\system32\\cmd.exe")

Within each of the 2 MS-DOS shells spawned, the environment was inspected using the set command. os.environ changed the value of the OS (operating system) environment variable as seen by the current program and within the second MS-DOS shell started by os.startfile, but copies of the environment are unaffected elsewhere. (To do this more generally see the notes on PYTHONPATH).

Many programs read environment variables to configure internal actions. This approach also enables your Python programs to change environments when run, e.g. to allow for storage and import of Python modules elsewhere on the system using the PYTHONPATH environment variable if the systems administrator has prevented you from using the control panel.
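
os.startfile is only available on Windows, so the following Python 3 sketch swaps in the standard subprocess module to show the same effect portably: a change made through os.environ is inherited by programs started afterwards. The variable name DEMO_SETTING is hypothetical.

```python
import os
import subprocess
import sys

os.environ["DEMO_SETTING"] = "changed by parent"    # hypothetical variable name

# Start a child Python interpreter that prints the variable it inherited.
child_code = "import os; print(os.environ.get('DEMO_SETTING'))"
output = subprocess.check_output([sys.executable, "-c", child_code])
seen_by_child = output.decode().strip()
print(seen_by_child)
```

The change affects this program and the children it starts; the environment of the shell which launched the program is left as it was.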

File statistics and formatting of dates and times

The os.stat() function gets various information about a file, including its size in bytes, its modification date/time and (on systems which have file-level security) file ownership and permission values. The stat module contains various constants and functions which help us to interpret the tuple (at least 10 numbers) returned by os.stat(). Some of this information (access and modification times) is measured in seconds since the epoch, which Python takes to be the start of 1970 on both Unix and Windows, and we can use functions within the time module to convert it into more useful date/time formats. The following Python session obtained and formatted some of this information into human-readable form:

>>> import os,stat
>>> fso=os.stat("balls.py")       # fso is a file stats tuple object
>>> if stat.S_ISDIR(fso[stat.ST_MODE]):  # stat.ST_* constants are indexes 0-9
...   print "balls.py is a directory" # it isn't so this won't print
...
>>> if stat.S_ISREG(fso[stat.ST_MODE]):
...   print "balls.py is a regular file" # it is so this will print
...
balls.py is a regular file
>>> print "size of balls.py is: %d bytes" % fso[stat.ST_SIZE]
size of balls.py is: 1427 bytes
>>> import time
>>> # convert the file modification time to a time object
>>> timeobj=time.localtime(fso[stat.ST_MTIME])
>>> mod_dt=time.asctime(timeobj) # converts timeobj into string date/time
>>> print "file was last modified on: %s" % mod_dt
file was last modified on: Mon Mar 11 13:02:44 2002

For more information about these and related conversions, see the entries for the os, stat and time modules in the Python library documentation index (http://www.python.org/doc/current/lib/modindex.html). For example, while the output printed by asctime() will do in many cases, strftime() enables you to specify the output format used to convert a time object, such as is returned by time.gmtime() (Greenwich Mean Time) or time.localtime(), into a printed representation:

>>> import time
>>> time.strftime(
              "%a, %d %b %Y %H:%M:%S +0000", time.gmtime())
'Mon, 11 Mar 2002 13:21:34 +0000'
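
The steps of this section can be combined into one short script. The Python 3 sketch below creates its own temporary file rather than assuming balls.py exists, and reads the size and modification time through the st_size and st_mtime attributes of the os.stat() result, which hold the same values as the ST_SIZE and ST_MTIME tuple positions used above.

```python
import os
import tempfile
import time

# Create a small scratch file so there is something to inspect.
handle, path = tempfile.mkstemp(suffix=".py")
os.write(handle, b"print('hello')\n")
os.close(handle)

info = os.stat(path)
size = info.st_size                          # size in bytes
modified = time.localtime(info.st_mtime)     # seconds since the epoch -> time object
stamp = time.strftime("%Y-%m-%d %H:%M:%S", modified)
print("%s is %d bytes, last modified %s" % (path, size, stamp))

os.remove(path)                              # tidy up the scratch file
```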