Project | BASIC variable declaration checker
For | Own project
Date | 1993-1994
Platforms | C/C++, Windows/Linux PC
On this page I describe a small tool that I created some time ago, in my own time, while working in a job doing development and maintenance on a large BASIC program. As a C programmer, I had gotten used to the compiler catching typing errors in variable names, through the rule that variables have to be declared before use. BASIC lacks this mechanism, so I filled the gap in the BASIC compiler by quickly creating this tool, which allowed me to develop more efficiently.
A problem with BASIC, and also with most scripting languages, is that variables do not have to
be declared before use.
The absence of a declaration requirement may sound user-friendly, but in my view it makes
the language unsuitable for serious software projects, because of the following problem.
In a language like BASIC, when a typing error is made in the name of a variable, the compiler
silently creates a new variable at that point. Thus, typing errors in variable names
are not caught immediately by the compiler (as they are in languages that do require variable
declarations, like C/C++, Java, Pascal, or FORTRAN with IMPLICIT NONE), but lead to a
successfully compiled executable with a bug in it (which then has to be detected the slow way,
via testing and debugging).
Consider the following BASIC code:

    DIM varOne AS INTEGER
    DIM varTwo AS INTEGER

    varOne = 1
    varTwo = 2
    mySwap( varOne, varTwo )
    END

    SUB mySwap( firstParam AS INTEGER, secondParam AS INTEGER )
        DIM tmp AS INTEGER
        tmp = firstParam
        fisrtParam = secondParam    ' misspelled: firstParam was intended
        secondParam = tmp
    END SUB

The name of the parameter firstParam is misspelled in the body of the sub mySwap(), so the sub does not do what it is intended to do (which is to swap the values of the two variables varOne and varTwo on the caller's side). Instead of being copied into firstParam, the original value of the parameter secondParam is written to a newly created variable named fisrtParam, which is not used any further. Thus, in the sub call, the original value of the caller's variable varTwo is overwritten and lost, and both variables end up holding the original value of varOne.
Such errors are easily detected by a set of small scripts or programs, as described below. These are intended for projects that follow the coding rule that all variables must be declared before use; the tool finds all occurrences of variables that are used without a prior declaration. In the description below, I arbitrarily assume that the BASIC dialect used is similar to QBASIC.
The basis of the tool is a routine that processes one source file as follows: after comment removal, it makes one pass through the file to detect where the SUB, FUNCTION, and TYPE definitions begin and end. These boundaries are easy to detect from the keywords SUB ... END SUB, FUNCTION ... END FUNCTION, and TYPE ... END TYPE, with only rudimentary lexical processing. This routine is the skeleton (template) for the two routines or scripts that the actual tool consists of.
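The skeleton routine can be sketched as follows. This is a minimal illustration in Python, not the original C/C++ implementation; the function names strip_comment and find_blocks are hypothetical, and the sketch assumes one statement per line, apostrophe and REM comments, and no nested definition blocks.

```python
import re

# Keywords that open a definition block; each is closed by "END <keyword>".
BLOCK_KEYWORDS = ("SUB", "FUNCTION", "TYPE")


def strip_comment(line):
    """Remove a whole-line REM comment or a trailing apostrophe comment.

    Apostrophes inside string literals are left alone. Other
    dialect-specific comment forms are not handled in this sketch.
    """
    if re.match(r"\s*REM\b", line, re.IGNORECASE):
        return ""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string
        elif ch == "'" and not in_string:
            return line[:i]
    return line


def find_blocks(source):
    """Return (kind, name, first_line, last_line) for each definition block.

    Line numbers are 1-based and inclusive.
    """
    blocks = []
    current = None  # (kind, name, first_line) of the block being scanned
    for number, raw in enumerate(source.splitlines(), start=1):
        words = strip_comment(raw).split()
        if not words:
            continue
        first = words[0].upper()
        if current is None:
            if first in BLOCK_KEYWORDS and len(words) > 1:
                # The name ends at "(" if a parameter list follows directly.
                name = words[1].split("(")[0]
                current = (first, name, number)
        elif first == "END" and len(words) > 1 and words[1].upper() == current[0]:
            blocks.append(current + (number,))
            current = None
    return blocks
```

Applied to the mySwap example, find_blocks would report a single SUB block named mySwap, spanning the lines from SUB to END SUB. Lines such as a bare END or EXIT SUB are ignored because they neither open a block nor match the closing keyword pair.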
The first routine actually used in the tool extracts from a given BASIC source file the names
of all global objects defined in it, i.e. global variables, subs/functions, and datatypes
defined in TYPE definitions.
For this, we only need to extend the skeleton routine so that, outside of
SUB/FUNCTION/TYPE definition blocks, it additionally detects all statements beginning
with the keyword DIM, which are the definitions of global variables.
This globals-extracting routine is then further extended with one more piece of functionality:
on encountering an $INCLUDE line, it opens the included file and processes that
file in the same way (by calling itself recursively).
The output of this final extended routine is a list of the names of all the global objects
defined in the BASIC source file, plus in all source files $INCLUDEd by it.
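A possible shape for this globals-extracting routine is sketched below in Python (again, not the original C/C++ code). The names extract_globals and dim_names are hypothetical. The sketch assumes that an $INCLUDE line looks like '$INCLUDE: 'file.bi' or $INCLUDE "file.bi", that included paths are relative to the including file, and that each statement sits on its own line. The seen set, which guards against circular includes, is an addition not described in the text.

```python
import os
import re

BLOCK_KEYWORDS = ("SUB", "FUNCTION", "TYPE")

# Assumed $INCLUDE syntax: optional comment marker (' or REM), $INCLUDE,
# optional colon, then a file name in single or double quotes.
INCLUDE_RE = re.compile(
    r"""^\s*(?:'|REM\b)?\s*\$INCLUDE\s*:?\s*['"]([^'"]+)['"]""", re.IGNORECASE)


def strip_comment(line):
    """Remove a whole-line REM comment or a trailing apostrophe comment."""
    if re.match(r"\s*REM\b", line, re.IGNORECASE):
        return ""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string
        elif ch == "'" and not in_string:
            return line[:i]
    return line


def dim_names(words):
    """Names declared by a DIM statement, given the words after DIM."""
    if words and words[0].upper() == "SHARED":
        words = words[1:]
    rest = re.sub(r"\([^()]*\)", "", " ".join(words))  # drop array bounds
    return [item.split()[0] for item in rest.split(",") if item.strip()]


def extract_globals(path, seen=None):
    """Return the names of global objects in `path` and in its includes."""
    seen = set() if seen is None else seen
    path = os.path.abspath(path)
    if path in seen:  # already processed: avoid endless recursion
        return []
    seen.add(path)

    names = []
    block = None  # keyword of the SUB/FUNCTION/TYPE block we are inside
    with open(path) as handle:
        for raw in handle:
            include = INCLUDE_RE.match(raw)
            if include:
                included = os.path.join(os.path.dirname(path), include.group(1))
                names.extend(extract_globals(included, seen))  # recursive call
                continue
            words = strip_comment(raw).split()
            if not words:
                continue
            first = words[0].upper()
            if block is None:
                if first in BLOCK_KEYWORDS and len(words) > 1:
                    names.append(words[1].split("(")[0])  # sub/function/type name
                    block = first
                elif first == "DIM":
                    names.extend(dim_names(words[1:]))  # global variables
            elif first == "END" and len(words) > 1 and words[1].upper() == block:
                block = None
    return names
```

DIM statements inside a SUB or FUNCTION are skipped because they declare locals, and the fields inside a TYPE block are skipped because only the type name itself is a global object.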
Undeclared variables detection routine
The second routine actually used in the tool then does the final work of processing one given
BASIC source file individually, and finding all uses of undeclared variables, as follows.
The first step is to run the first routine on the source file, to create a list of the names of
the global objects defined in the source file and in the files $INCLUDEd by it.
We will call this the globals list.
After this, the source file is split up into chunks of text in such a way that each chunk contains
the definition of one of the subs and functions in the file, plus there is a chunk that contains
all the code outside of SUB/FUNCTION/TYPE definition blocks, which is then treated the
same as the sub/function definition blocks except that there are no sub/function parameters.
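The splitting step might look like the following Python sketch (not the original implementation). The function name split_into_chunks and the chunk label "<main>" are hypothetical, and the input is assumed to have had its comments removed already. TYPE blocks are dropped here because they contain field definitions rather than code.

```python
def split_into_chunks(source):
    """Split comment-free BASIC source into a list of (name, lines) chunks.

    The first chunk, named "<main>", collects all code outside
    SUB/FUNCTION/TYPE blocks; every SUB or FUNCTION gets its own chunk.
    """
    main = []
    chunks = [("<main>", main)]
    target = main   # list that receives the current line (None inside TYPE)
    closing = None  # keyword that ends the block we are inside, if any
    for line in source.splitlines():
        words = line.split()
        first = words[0].upper() if words else ""
        second = words[1].upper() if len(words) > 1 else ""
        if closing is None and first in ("SUB", "FUNCTION") and second:
            name = words[1].split("(")[0]
            target, closing = [line], first
            chunks.append((name, target))
        elif closing is None and first == "TYPE" and second:
            target, closing = None, "TYPE"  # TYPE bodies belong to no chunk
        elif closing is not None and first == "END" and second == closing:
            if target is not None:
                target.append(line)  # keep END SUB / END FUNCTION in the chunk
            target, closing = main, None
        elif target is not None:
            target.append(line)
    return chunks
```

Keeping the SUB or FUNCTION header line inside its chunk means the parameter list is available when the chunk is analysed later, while the main chunk simply has no header to parse.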
Each of the resulting chunks of text is then processed individually, as follows. Perform
a simple lexical analysis and parsing to do the following: