
Analyze a MUMPS system

The steps to process a MUMPS system are:

Unpack the MUMPS code

As easy as this step sounds, there is more to it than meets the eye. In an ideal world, the system administrator will have simply run the MUMPS RoutineSave application and saved the system to a tape (or even a CD-ROM). Unfortunately, I have never had it that easy. I usually have to write special applications for each new system just to read the media the system was sent on. For the last system I examined, the system administrator did multiple routine saves, then used VMS Backup to back up the routine-save files to a tape in the proprietary VMS backup format. Once I had deciphered the directory structure of the tape, I was able to extract the routine-save files from it.

Once we have the routine-save files, they can be unpacked. By standard (or by convention), a routine-save file begins with a header that identifies the system, the date, and other relevant information, followed by a blank line. Each MUMPS routine is then written in turn, with a single blank line marking the end of the routine. The format of each routine is:
SomeName
RoutineName(optional-explicit-parameters)-tab-; comment or empty
code.....
more code....
blank line

Here SomeName is ideally the same as RoutineName, RoutineName is the name of the routine, code is a collection of MUMPS commands and expressions, and the routine is terminated by a blank line. The parentheses and optional-explicit-parameters are usually not used.
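
The layout described above can be split mechanically. What follows is a minimal sketch in Python, not the application I actually use: the function name split_routine_save is hypothetical, and it assumes the file has already been read into a string, that the file header contains no blank lines, and that RoutineName ends at the first parenthesis, tab, space, or semicolon.

```python
import re


def split_routine_save(text):
    """Split routine-save text into (header_lines, routines).

    Each routine is a (some_name, routine_name, lines) tuple; lines starts
    with the line that carries RoutineName and excludes the SomeName line.
    """
    lines = text.splitlines()
    # The file header runs up to the first blank line.
    header_end = next((i for i, line in enumerate(lines) if not line.strip()), None)
    if header_end is None:
        return lines, []
    header = lines[:header_end]

    routines = []
    block = []
    # A trailing sentinel blank line flushes the last routine.
    for line in lines[header_end + 1:] + [""]:
        if line.strip():
            block.append(line)
            continue
        # Blank line: end of the current routine (blocks shorter than
        # two lines cannot hold both names and are skipped).
        if len(block) >= 2:
            some_name = block[0].strip()
            # Assumption: the name ends at the first "(", tab, space, or ";".
            routine_name = re.split(r"[(\t ;]", block[1], maxsplit=1)[0]
            routines.append((some_name, routine_name, block[1:]))
        block = []
    return header, routines
```

Real routine-save files vary between vendors and sites, so the header test in particular should be checked against the file in hand.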

Identify files to be excluded from further analysis

When unpacking the routine-save file, SomeName and RoutineName are compared. If they match, a file named RoutineName.m is generated and the lines from the routine-save file are copied into it until a blank line is encountered. The blank line terminates the RoutineName.m file.
If SomeName does not match RoutineName, the MUMPS file is assigned a pseudo name, RoutineName_pseudo_NNNN_RoutineName.m. The pseudo name is saved, along with RoutineName and SomeName, to a file that is processed later to determine whether the pseudo file should be excluded from further analysis.
The routine name must be unique. If a file is found that has the same RoutineName as a pseudo file, the pseudo file is excluded. If two pseudo files have the same routine name, the first pseudo file is kept and the second is excluded.
MUMPS files are edited in the MUMPS environment. Since the environment does not automatically make backup copies of edited files, a MUMPS programmer who wants to keep a backup copy must save the routine under another name, one that doesn't match the routine name. In an ideal world, I could discard all the pseudo files, but in the real world, discarding them all would leave unresolved references. Sometimes routines are saved under a different name. I do not know a good reason for this, but it occurs too often to be an oversight.
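
The exclusion rules can be stated compactly in code. This is a sketch only: select_routines is a hypothetical name, and the input is assumed to be a list of (SomeName, RoutineName) pairs in the order they appear in the routine-save file. The text does not say what happens when two files with matching names share a routine name, so the sketch simply keeps them all.

```python
def select_routines(entries):
    """Apply the exclusion rules to (some_name, routine_name) pairs.

    Returns (kept, excluded), each a list of the original pairs.
    """
    entries = list(entries)
    # Routine names that have a properly named file (SomeName == RoutineName).
    matched = {routine for some, routine in entries if some == routine}

    kept, excluded = [], []
    seen_pseudo = set()
    for some_name, routine_name in entries:
        if some_name == routine_name:
            kept.append((some_name, routine_name))
        elif routine_name in matched:
            # A real file exists for this routine, so the pseudo file goes.
            excluded.append((some_name, routine_name))
        elif routine_name in seen_pseudo:
            # Second pseudo file for the same routine: keep only the first.
            excluded.append((some_name, routine_name))
        else:
            seen_pseudo.add(routine_name)
            kept.append((some_name, routine_name))
    return kept, excluded
```

A pseudo file that is the only copy of its routine survives, which is what keeps references to it from going unresolved.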

Check the MUMPS code for errors

Although errors are ignored by most scripts, the check script's function is to examine the errors and report as much information about them as possible. It is highly probable that code containing errors was never executed, because an interpreter reports an error only at execution time.

Determine the system complexity

Metrics and various complexity data are generated by examining the MUMPS code. The metrics consist of the number of instances of each MUMPS command with an empty argument list, the number of arguments associated with each command, and the number of postconditions associated with each command.
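
To make the three counts concrete, here is a deliberately simplified sketch in Python. It is not the metrics script itself. The function names are hypothetical, and it assumes the input is the body of one MUMPS line with the label and leading whitespace (and any block-structure dots) already removed, that commands are separated by spaces, that a command with an empty argument list is followed by two spaces or the end of the line, and that a colon after the command word introduces a postcondition.

```python
from collections import Counter


def strip_comment(code):
    """Drop everything from the first ';' that is outside a quoted string."""
    in_string = False
    for i, ch in enumerate(code):
        if ch == '"':
            in_string = not in_string
        elif ch == ";" and not in_string:
            return code[:i]
    return code


def split_outside(text, sep):
    """Split on sep, ignoring separators inside quoted strings or parentheses."""
    parts, current, depth, in_string = [], [], 0, False
    for ch in text:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == sep and depth == 0:
                parts.append("".join(current))
                current = []
                continue
        current.append(ch)
    parts.append("".join(current))
    return parts


def command_metrics(body):
    """Return (empty_args, arg_counts, postconds) Counters keyed by command."""
    empty_args, arg_counts, postconds = Counter(), Counter(), Counter()
    tokens = split_outside(strip_comment(body), " ")
    i = 0
    while i < len(tokens):
        if not tokens[i]:          # stray or trailing space
            i += 1
            continue
        name, colon, _ = tokens[i].partition(":")
        name = name.upper()
        # The token after the command word is its argument list (may be empty).
        args = tokens[i + 1] if i + 1 < len(tokens) else ""
        if colon:
            postconds[name] += 1
        if args:
            arg_counts[name] += len(split_outside(args, ","))
        else:
            empty_args[name] += 1
        i += 2
    return empty_args, arg_counts, postconds
```

Commands are counted as written, so S and SET appear as separate keys unless abbreviations are normalized first.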

Identify vendor proprietary functions

Part of the metrics is a report of all the commands and functions used. The function list is then examined manually to determine which functions are proprietary to a specific vendor.
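
The manual examination can be given a head start. The sketch below (Python, hypothetical names) collects intrinsic function names of the form $NAME( and flags those beginning with $Z, since the MUMPS standard reserves names starting with Z for implementation-specific extensions. This is only a first-pass hint and not a substitute for reading the list: the pattern does not skip text inside quoted strings, and it ignores extrinsic functions written with $$.

```python
import re
from collections import Counter

# "$" not preceded by another "$", then a name, then "(".
FUNCTION_CALL = re.compile(r"(?<!\$)\$([A-Za-z][A-Za-z0-9]*)\(")


def function_usage(lines):
    """Count intrinsic function names (upper-cased, with the leading '$')."""
    counts = Counter()
    for line in lines:
        for name in FUNCTION_CALL.findall(line):
            counts["$" + name.upper()] += 1
    return counts


def vendor_candidates(counts):
    """Names starting with $Z are candidates for vendor-specific functions."""
    return sorted(name for name in counts if name.startswith("$Z"))
```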

Fix errors found in code

If we do not intend to transform the MUMPS code to C++, this step can be skipped. However, these errors would have been caught if the interpreter had ever executed the code, and the fact that a piece of code has not been executed so far does not mean that it never will be.

Report errors, size and complexity

At this point, the size of the transformation can be estimated.

Identify naked references and report them

Lazy programmers make heavy use of naked references. This is a case where laziness causes more work and headaches for the maintenance programmer. It is a dangerous practice at best. The naked reference scanner identifies the line where the naked reference is used and the probable global referenced and, if that global is unique, substitutes the full global reference for the naked reference. If the naked reference could refer to more than one global, it is reported as an error and no substitution is made.
I received an email about this statement arguing that the naked reference is another example of "the power of MUMPS". The writer pointed out that a naked reference can be used to call a common routine to process the referenced global data (i.e., as a global implicit parameter). I still feel that it is a dangerous practice, just as implicit parameters are.
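
For readers who have not met them: a naked reference such as ^(3) reuses the name of the most recently referenced global and all of its subscripts except the last, so after ^X(1,2) it means ^X(1,3). The sketch below shows the substitution for the easy case only, and it is not the scanner described above. It works on a single line, takes references in left-to-right textual order (which is not always the order MUMPS evaluates them in, for example in a SET), assumes subscripts contain no parentheses or quoted commas, and does not distinguish a routine call such as D ^RTN(1) from a global. A naked reference with no earlier full reference on the same line is counted as unresolved and left alone. The name resolve_naked is hypothetical.

```python
import re

# "^", an optional global name, then a parenthesized subscript list.
# The lookbehind skips LABEL^ROUTINE(...) style references.
GLOBAL_REF = re.compile(r"(?<![A-Za-z0-9%])\^([A-Za-z%][A-Za-z0-9]*)?\(([^()]*)\)")


def resolve_naked(line):
    """Return (rewritten_line, unresolved_count) for one line of code."""
    base = None          # (global name, subscripts minus the last one)
    unresolved = 0

    def substitute(match):
        nonlocal base, unresolved
        name = match.group(1)
        subs = match.group(2).split(",")
        if name is None:                 # naked reference
            if base is None:
                unresolved += 1
                return match.group(0)    # leave it untouched
            name, subs = base[0], base[1] + subs
        # Every reference, naked or not, resets the naked indicator.
        base = (name, subs[:-1])
        return "^" + name + "(" + ",".join(subs) + ")"

    return GLOBAL_REF.sub(substitute, line), unresolved
```

Deciding whether the global is unique across labels, gotos, and calls needs the control-flow information from the later steps, which is why this sketch stops at the line boundary.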

Run mflow analysis with parameters

Control flow is very useful to a maintenance programmer. With implicit parameter information, it is even more valuable. My preference is to have as much information on a page as possible (as long as it is formatted to be readable). A fan-in and fan-out report is also generated by this script to help estimate the complexity of the flow.
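
As an illustration of the fan-in and fan-out figures, here is a small Python sketch. It assumes the usual definitions (fan-in is the number of distinct modules that call a module, fan-out the number of distinct modules it calls) and a hypothetical input of (caller, callee) pairs; the format of the real mflow report is not shown here.

```python
from collections import defaultdict


def fan_report(calls):
    """Map each module to (fan_in, fan_out) from (caller, callee) pairs."""
    callers = defaultdict(set)   # module -> modules that call it
    callees = defaultdict(set)   # module -> modules it calls
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)
    modules = sorted(set(callers) | set(callees))
    # Repeated calls between the same pair count once.
    return {m: (len(callers[m]), len(callees[m])) for m in modules}
```

A module with high fan-in is widely shared, and one with high fan-out coordinates a lot of other code. Both are places where a change has far-reaching effects.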

Run WebFlow to generate documentation

WebFlow generates the same information as mflow, but it uses Java and HTML to present the control flow graphically. Each picture shows only the calls, gotos, jobs, etc. for a single label^routine, called a module. Clicking the main module in a picture presents the information for that module. The information consists of explicit parameters, implicit parameters, a report of global and local data references, and calling modules. The local data references include local data that is used before being set (probably an implicit parameter), and local data that is used, set, used in a for loop, or used with indirection.