CHAPTER THIRTEEN: MS-DOS, PC-BIOS AND FILE I/O (Part 10)

The Art of ASSEMBLY LANGUAGE PROGRAMMING

13.4 UCR Standard Library File I/O Routines

Although MS-DOS' file I/O facilities are not too bad, the UCR Standard Library provides a file I/O package that makes blocked sequential I/O as easy as character-at-a-time file I/O. Furthermore, with a tiny amount of effort, you can use all the StdLib routines like printf, print, puti, puth, putc, getc, gets, etc., when performing file I/O. This greatly simplifies text file operations in assembly language.

Note that record-oriented (binary) I/O is probably best left to pure DOS, as is any file access that requires random access within a file. The Standard Library routines really only support sequential text I/O. Nevertheless, this is the most common form of file I/O around, so the Standard Library routines are quite useful indeed.

The UCR Standard Library provides eight file I/O routines: fopen, fcreate, fclose, fflush, fgetc, fread, fputc, and fwrite. Fgetc and fputc perform character-at-a-time I/O; fread and fwrite let you read and write blocks of data; the other four (fopen, fcreate, fclose, and fflush) open, create, close, and flush files.

The UCR Standard Library uses a special file variable to keep track of file operations. There is a special record type, FileVar, declared in stdlib.a[8]. When using the StdLib file I/O routines you must create a variable of type FileVar for every file you need open at the same time. This is very easy; just use a definition of the form:

	MyFileVar       FileVar {}

Please note that a Standard Library file variable is not the same thing as a DOS file handle. It is a structure which contains the DOS file handle, a buffer (for blocked I/O), and various index and status variables. The internal structure of this type is of no interest (remember data encapsulation!) except to the implementor of the file routines. You will pass the address of this file variable to the various Standard Library file I/O routines.

13.4.1 Fopen

Entry parameters:       
                ax-     File open mode
                 0- File opened for reading 
                 1- File opened for writing
                dx:si-  Points at a zero terminated string containing 
                        the filename.
                es:di-  Points at a StdLib file variable.

Exit parameters:        
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Fopen opens a sequential text file for reading or writing. Unlike DOS, you cannot open a file for both reading and writing. Furthermore, this is a sequential text file which does not support random access. Note that the file must exist or fopen will return an error. This is true even when you open the file for writing.

Note that if you open a file for writing and that file already exists, any data written to the file will overwrite the existing data. When you close the file, any data appearing in the file after the data you wrote will still be there. If you want to erase the existing file before writing data to it, use the fcreate function.
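
As a minimal sketch of the calling sequence described above, the following fragment opens an existing file for reading. The names InFileName, InFile, and OpenError are hypothetical, and the fragment assumes it sits in a program that includes stdlib.a:

; In the data segment (hypothetical names):

InFileName      byte    "MYFILE.TXT",0          ;Zero terminated filename.
InFile          FileVar {}                      ;StdLib file variable.

; In the code segment:

                mov     ax, 0                   ;Mode 0: open for reading.
                mov     dx, seg InFileName      ;DX:SI points at the
                mov     si, offset InFileName   ; filename string.
                mov     di, seg InFile          ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset InFile
                fopen
                jc      OpenError               ;AX holds DOS error code.

Loading 1 into ax instead of 0 would open the same (existing) file for writing.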

13.4.2 Fcreate

Entry parameters:       
                dx:si-  Points at a zero terminated string containing 
                        the filename.
                es:di-  Points at a StdLib file variable.

Exit parameters:        
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Fcreate creates a new file and opens it for writing. If the file already exists, fcreate deletes the existing file and creates a new one. It initializes the file variable for output but is otherwise identical to the fopen call.

13.4.3 Fclose

Entry parameters:       
                es:di-  Points at a StdLib file variable.
Exit parameters:        
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Fclose closes a file and updates any internal housekeeping information. It is very important that you close all files opened with fopen or fcreate using this call. When making DOS file calls, if you forget to close a file DOS will automatically do that for you when your program terminates. However, the StdLib routines cache up data in internal buffers. The fclose call automatically flushes these buffers to disk. If you exit your program without calling fclose, you may lose some data written to the file but not yet transferred from the internal buffer to the disk.

If you are in an environment where it is possible for someone to abort the program without giving you a chance to close the file, you should call the fflush routine (see the next section) on a regular basis to avoid losing too much data.

13.4.4 Fflush

Entry parameters:       
                es:di-  Points at a StdLib file variable.
Exit parameters:        
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

This routine immediately writes any data in the internal file buffer to disk. Note that you should only use this routine in conjunction with files opened for writing (or opened by fcreate). If you write data to a file and then need to leave the file open, but inactive, for some time period, you should perform a flush operation in case the program terminates abnormally.
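
A minimal sketch of such a flush, assuming LogFile is a hypothetical FileVar already opened for writing and FlushError is a hypothetical error handler:

                mov     di, seg LogFile         ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset LogFile
                fflush                          ;Force buffered data to disk.
                jc      FlushError              ;AX holds DOS error code.

The file remains open after the call, so you must still call fclose when you are finished with it.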

13.4.5 Fgetc

Entry parameters:       
                es:di-  Points at a StdLib file variable.
Exit parameters:        
                If the carry flag is clear, 
                 al contains the character read from the file.
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function). ax will contain zero
                 if you attempt to read beyond the end of file.

Fgetc reads a single character from the file and returns this character in the al register. Unlike calls to DOS, single character I/O using fgetc is relatively fast since the StdLib routines use blocked I/O. Of course, multiple calls to fgetc will never be faster than a call to fread (see the next section), but the performance is not too bad.

Fgetc is very flexible. As you will see in a little bit, you may redirect the StdLib input routines to read their data from a file using fgetc. This lets you use the higher level routines like gets and getsm when reading data from a file.
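
The following sketch shows one way to read a file a character at a time with fgetc, echoing each character with putc. It assumes InFile is a hypothetical FileVar already opened for reading; ReadLoop, ReadDone, and ReadError are hypothetical labels. The code reloads es:di on every pass rather than relying on any register being preserved:

ReadLoop:       mov     di, seg InFile          ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset InFile
                fgetc
                jc      ReadDone                ;Carry set: EOF or error.
                putc                            ;Echo the character in AL.
                jmp     ReadLoop

ReadDone:       or      ax, ax                  ;AX = 0 means end of file.
                jnz     ReadError               ;Otherwise a DOS error code.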

13.4.6 Fread

Entry parameters:       
                es:di-  Points at a StdLib file variable.
                dx:si-  Points at an input data buffer.
                cx-     Contains a byte count.
Exit parameters:        
                If the carry flag is clear, 
                 ax contains the actual number of bytes 
                 read from the file.
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Fread is very similar to the DOS read command. It lets you read a block of bytes, rather than just one byte, from a file. Note that if all you are doing is reading a block of bytes from a file, the DOS call is slightly more efficient than fread. However, if you have a mixture of single byte reads and multi-byte reads, the combination of fread and fgetc works very well.

As with the DOS read operation, if the byte count returned in ax does not match the value passed in the cx register, then you've read the remaining bytes in the file. When this occurs, the next call to fread or fgetc will return an EOF error (carry will be set and ax will contain zero). Note that fread does not return EOF unless there were zero bytes read from the file.
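
A short sketch of a block read, under the assumption that InFile is a hypothetical FileVar already opened for reading. Buffer, ReadFailed, and LastBlock are likewise hypothetical names:

; In the data segment:

Buffer          byte    256 dup (?)             ;Input data buffer.

; In the code segment:

                mov     di, seg InFile          ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset InFile
                mov     dx, seg Buffer          ;DX:SI points at the
                mov     si, offset Buffer       ; input buffer.
                mov     cx, 256                 ;Bytes requested.
                fread
                jc      ReadFailed              ;AX = 0 for EOF, else error.
                cmp     ax, 256
                jb      LastBlock               ;Short count: rest of file read.

At LastBlock, ax still holds the number of bytes actually placed in Buffer.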

13.4.7 Fputc

Entry parameters:       
                es:di-  Points at a StdLib file variable.
                al-     Contains the character to write to the file.

Exit parameters:        
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Fputc writes a single character (in al) to the file specified by the file variable whose address is in es:di. This call simply adds the character in al to an internal buffer (part of the file variable) until the buffer is full. Whenever the buffer is filled or you call fflush (or close the file with fclose), the file I/O routines write the data to disk.

13.4.8 Fwrite

Entry parameters:       
                es:di-  Points at a StdLib file variable.
                dx:si-  Points at an output data buffer.
                cx-     Contains a byte count.
Exit parameters:        
                If the carry flag is clear, 
                 ax contains the actual number of bytes 
                 written to the file.
                If the carry is set, 
                 ax contains the returned DOS error code 
                 (see DOS open function).

Like fread, fwrite works on blocks of bytes. It lets you write a block of bytes to a file opened for writing with fopen or fcreate.
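
As an illustration only, the fragment below writes a fixed message with a single fwrite call. OutFile is assumed to be a hypothetical FileVar already opened for writing; Msg, MsgLen, WriteError, and ShortWrite are hypothetical names:

; In the data segment:

Msg             byte    "Hello, file",0dh,0ah   ;Data to write.
MsgLen          =       $-Msg                   ;Length of Msg in bytes.

; In the code segment:

                mov     di, seg OutFile         ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset OutFile
                mov     dx, seg Msg             ;DX:SI points at the
                mov     si, offset Msg          ; output data.
                mov     cx, MsgLen              ;Bytes to write.
                fwrite
                jc      WriteError              ;AX holds DOS error code.
                cmp     ax, MsgLen
                jne     ShortWrite              ;Fewer bytes written than asked.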

13.4.9 Redirecting I/O Through the StdLib File I/O Routines

The Standard Library provides very few file I/O routines. Fputc and fwrite are the only two output routines, for example. The "C" programming language standard library (on which the UCR Standard Library is based) provides many routines like fprintf, fputs, fscanf, etc. None of these are necessary in the UCR Standard Library because the UCR library provides an I/O redirection mechanism that lets you reuse all existing I/O routines to perform file I/O.

The UCR Standard Library putc routine consists of a single jmp instruction. This instruction transfers control to some actual output routine via an indirect address internal to the putc code. Normally, this pointer variable points at a piece of code which writes the character in the al register to the DOS standard output device. However, the Standard Library also provides four routines which let you manipulate this indirect pointer. By changing this pointer you can redirect the output from its current routine to a routine of your choosing. All Standard Library output routines (e.g., printf, puti, puth, puts) call putc to output individual characters. Therefore, redirecting the putc routine affects all the output routines.

Likewise, the getc routine is nothing more than an indirect jmp whose pointer variable normally points at a piece of code which reads data from the DOS standard input. Since all Standard Library input routines call the getc function to read each character you can redirect file input in a manner identical to file output.

The Standard Library GetOutAdrs, SetOutAdrs, PushOutAdrs, and PopOutAdrs are the four main routines which manipulate the output redirection pointer. GetOutAdrs returns the address of the current output routine in the es:di registers. Conversely, SetOutAdrs expects you to pass the address of a new output routine in the es:di registers and it stores this address into the output pointer. PushOutAdrs and PopOutAdrs push and pop the pointer on an internal stack. These do not use the 80x86's hardware stack. You are limited to a small number of pushes and pops. Generally, you shouldn't count on being able to push more than four of these addresses onto the internal stack without overflowing it.

GetInAdrs, SetInAdrs, PushInAdrs, and PopInAdrs are the complementary routines for the input vector. They let you manipulate the input routine pointer. Note that the stack for PushInAdrs/PopInAdrs is not the same as the stack for PushOutAdrs/PopOutAdrs. Pushes and pops to these two stacks are independent of one another.
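
One possible use of GetOutAdrs and SetOutAdrs is to save the output pointer in your own variable when you would rather not consume an entry on the small internal stack. The sketch below assumes ds addresses the segment holding the hypothetical variable SavedHook:

; In the data segment:

SavedHook       dword   ?                       ;Holds the saved pointer.

; In the code segment:

                GetOutAdrs                      ;ES:DI = current routine.
                mov     word ptr SavedHook, di  ;Save the offset.
                mov     word ptr SavedHook+2, es ;Save the segment.

        <Change the output pointer and do some output here>

                les     di, SavedHook           ;ES:DI = saved pointer.
                SetOutAdrs                      ;Put the old routine back.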

Normally, the output pointer (which we will henceforth refer to as the output hook) points at the Standard Library routine PutcStdOut[9]. Therefore, you can return the output hook to its normal initialization state at any time by executing the statements[10]:

                mov     di, seg SL_PutcStdOut
                mov     es, di
                mov     di, offset SL_PutcStdOut
                SetOutAdrs

The PutcStdOut routine writes the character in the al register to the DOS standard output, which itself might be redirected to some file or device (using the ">" DOS redirection operator). If you want to make sure your output is going to the video display, you can always call the PutcBIOS routine which calls the BIOS directly to output a character[11]. You can force all Standard Library output through the BIOS, and therefore to the video display, using a code sequence like:

                mov     di, seg SL_PutcBIOS
                mov     es, di
                mov     di, offset SL_PutcBIOS
                SetOutAdrs

Generally, you would not simply blast the output hook by storing a pointer to your routine over the top of whatever pointer was there and then restoring the hook to PutcStdOut upon completion. Who knows if the hook was pointing at PutcStdOut in the first place? The best solution is to use the Standard Library PushOutAdrs and PopOutAdrs routines to preserve and restore the previous hook. The following code demonstrates a gentler way of modifying the output hook:

                PushOutAdrs             ;Save current output routine.
                mov     di, seg Output_Routine
                mov     es, di
                mov     di, offset Output_Routine
                SetOutAdrs

        <Do all output to Output_Routine here>

                PopOutAdrs              ;Restore previous output routine.

Handle input in a similar fashion using the corresponding input hook access routines and the SL_GetcStdIn and SL_GetcBIOS routines. Always keep in mind that there are a limited number of entries on the input and output hook stacks, so watch how many items you push onto these stacks without popping anything off.
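
By analogy with the output case, the input hook can be saved, changed, and restored as sketched below. Input_Routine is a hypothetical far routine that returns a character in al, and the sketch assumes SetInAdrs takes its pointer in es:di just as SetOutAdrs does:

                PushInAdrs              ;Save current input routine.
                mov     di, seg Input_Routine
                mov     es, di
                mov     di, offset Input_Routine
                SetInAdrs

        <Do all input from Input_Routine here>

                PopInAdrs               ;Restore previous input routine.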

To redirect output to a file (or redirect input from a file) you must first write a short routine which writes (reads) a single character to (from) a file. This is very easy. The code for a subroutine to output data to a file described by file variable OutputFile is:

ToOutput        proc    far
                push    es
                push    di

; Load ES:DI with the address of the OutputFile variable. This
; code assumes OutputFile is of type FileVar, not a pointer to
; a variable of type FileVar.

                mov     di, seg OutputFile
                mov     es, di
                mov     di, offset OutputFile

; Output the character in AL to the file described by "OutputFile"

                fputc

                pop     di
                pop     es
                ret
ToOutput        endp

Now with only one additional piece of code, you can begin writing data to an output file using all the Standard Library output routines. That is a short piece of code which redirects the output hook to the "ToOutput" routine above:

SetOutFile      proc
                push    es
                push    di

                PushOutAdrs             ;Save current output hook.
                mov     di, seg ToOutput
                mov     es, di
                mov     di, offset ToOutput
                SetOutAdrs

                pop     di
                pop     es
                ret
SetOutFile      endp

There is no need for a separate routine to restore the output hook to its previous value; PopOutAdrs will handle that task by itself.
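
Putting the pieces together, a complete redirection sequence might look like the following sketch. It uses the ToOutput and SetOutFile routines and the OutputFile variable from above; OutName, CreateError, and CloseError are hypothetical names, and the near call assumes SetOutFile lives in the same code segment as the caller:

; In the data segment:

OutName         byte    "OUTPUT.TXT",0          ;Zero terminated filename.

; In the code segment:

                mov     dx, seg OutName         ;DX:SI points at the
                mov     si, offset OutName      ; filename string.
                mov     di, seg OutputFile      ;ES:DI points at the
                mov     es, di                  ; file variable.
                mov     di, offset OutputFile
                fcreate
                jc      CreateError             ;AX holds DOS error code.

                call    SetOutFile              ;Hook now points at ToOutput.

                print                           ;This text goes to the file.
                byte    "Written through the output hook",0dh,0ah,0

                PopOutAdrs                      ;Restore previous output hook.

                mov     di, seg OutputFile      ;Reload ES:DI and close the
                mov     es, di                  ; file to flush its buffer.
                mov     di, offset OutputFile
                fclose
                jc      CloseError              ;AX holds DOS error code.

Restoring the hook before calling fclose ensures that no Standard Library output routine can call ToOutput after the file has been closed.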

[8] Actually, it's declared in file.a. Stdlib.a includes file.a so this definition appears inside stdlib.a as well.

[9] Actually, the routine is SL_PutcStdOut. The Standard Library macro by which you would normally call this routine is PutcStdOut.

[10] If you do not have any calls to PutcStdOut in your program, you will also need to add the statement "externdef SL_PutcStdOut:far" to your program.

[11] It is possible to redirect even the BIOS output, but this is rarely done and not easy to do from DOS.


28 SEP 1996