We also meet a type of File which is not an ImageFile and so does not hold its contents as a series of records. It is called a ByteFile and has three subclasses, InByteFile, OutByteFile and DirectByteFile.
The ImageFile generally views files as ordered lists of records. The DirectFile views them as numbered lists of records. The difference is rather like the difference between linked lists and arrays. To reach item 100 on a linked list you must count past the first 99, while to reach item 100 in an array you can go straight to it using a subscripted variable. With an InFile or OutFile, in order to read or write record 100, you must first read or write the previous 99, while with DirectFile you can go straight to record 100, using a procedure called Locate, which is an attribute of DirectFile.
The other difference with a DirectFile is that a program can both read from and write to the same file, without closing and reopening as a different type of file. This makes DirectFile a very useful concept for what is generally known as database work. This involves retrieving, storing and updating information.
When a DirectFile is first created it is empty, but it has certain properties. One of these is the record length, which must be the same for all its records. Another may be the maximum number of records that the file can hold, but not all systems will fix this in advance.
Once some output has been transferred to a DirectFile, through OutImage, this will contain one or more non-empty records. Each of these will have a sequence number, called its Location. As it is not necessary to write to records in the order of their sequence numbers, these non-empty records may be mixed up with empty or unwritten records.
It is central to the understanding of DirectFile to realise that there can be "holes" in the sequence of Locations of written records and that these represent unwritten records. This is particularly crucial when an attempt is made to read from a particular location. The effect will depend upon whether or not a record has been written there.
Consider example 15.1. The only unfamiliar concept is that of Locate. This simply moves the current position of the program within the sequence of records in a DirectFile to the record whose location is given as a parameter to Locate.
Example 15.1: Writing to a DirectFile.
begin
   ref(DirectFile) Direct;
   Direct :- new DirectFile("Data");
   inspect Direct do
   begin
      Open(Blanks(80));
      OutText("First");
      OutImage;
      Locate(4);
      OutText("Fourth");
      OutImage;
      Locate(10);
      OutText("Tenth");
      OutImage;
      OutText("Eleventh");
      OutImage;
      OutText("Sixth");
      Locate(6);
      OutImage;
      Close
   end--of--inspect
end**of**program

Diagram 15.1 shows the contents of the file at the end of the program. Note the holes in the record sequence.
Diagram 15.1: Contents of file after example 15.1.
Location   Content
    1      "First"
    2      unwritten
    3      unwritten
    4      "Fourth"
    5      unwritten
    6      "Sixth"
    7      unwritten
    8      unwritten
    9      unwritten
   10      "Tenth"
   11      "Eleventh"

Having described informally what a DirectFile represents, we can now move on to consider the attributes of ImageFile class DirectFile. These are as specified in the 1984 SIMULA Standard and may not all be implemented on some older systems. As usual, you should check the documentation for the system you are using.
Other attributes of ImageFile are redefined slightly for DirectFile. The differences depend on the current value of Location and what is found there. We shall consider the attributes dealing with the image locations first, to make it easier to understand these redefinitions.
Since not all the images in a DirectFile object's permitted range may be filled, the integer procedure LastLoc is provided. This gives the index number of the highest numbered image location currently in use.
Location is an integer procedure which gives the index number of the image location which is currently being accessed. It is legal for the current location to be unused.
Procedure Locate takes one integer parameter. It resets the currently accessed image location to the one whose index matches the parameter. An attempt to exceed the value of MaxLoc causes a runtime error.
These procedures allow programs to access image locations in any order and to check that the locations being accessed match the current contents of the file.
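As a minimal sketch of how these procedures combine, the fragment below moves to a requested location only when it lies within the permitted range, and then reports whether that location is beyond the last written record. The helper name SafeLocate and the file name "Data" are illustrative assumptions, not part of DirectFile; MaxLoc, LastLoc, Location and Locate are the attributes described above.

```simula
begin
   ref(DirectFile) Direct;

   ! Hypothetical helper: move to Wanted only if it is a legal location;
   Boolean procedure SafeLocate(F, Wanted); ref(DirectFile) F; integer Wanted;
   begin
      if Wanted >= 1 and Wanted <= F.MaxLoc then
      begin
         F.Locate(Wanted);
         SafeLocate := True
      end
   end--of--SafeLocate;

   Direct :- new DirectFile("Data");
   Direct.Open(Blanks(80));
   if SafeLocate(Direct, 4) then
   begin
      OutText("Now at location ");
      OutInt(Direct.Location, 4);
      if Direct.Location > Direct.LastLoc then
         OutText(" (beyond the last written record)");
      OutImage
   end;
   Direct.Close
end**of**program
```

Checking against MaxLoc before calling Locate avoids the runtime error mentioned above, at the cost of one extra test per access.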
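Now let us look at attributes with, mostly, familiar names. First, the image handling procedures.
InImage reads the contents of the current location into text attribute Image, in a similar way to the InImage of InFile. It differs because the state of the current location gives more possibilities. Essentially there are three cases, summarised here as the SIMULA Standard defines them:
1. The current location has been written to and not subsequently deleted. Its contents are transferred to Image.
2. The current location is unwritten, but is not greater than LastLoc. Image is filled with null characters (ISO rank zero).
3. The current location is greater than LastLoc. EndFile becomes true and the end of file character (ISO rank 25) is placed in Image.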
OutImage transfers the contents of text attribute Image to the current location. It makes a previously unwritten location into a written one. If the current location is initially greater than LastLoc, LastLoc will be updated to the current location. The current location is then increased by one.
DeleteImage removes the current location's image. This leaves the current location unwritten, i.e. as if it had never been filled. If the location deleted is the same as LastLoc, LastLoc will be reduced to the index of the next highest position written to.
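The following is a sketch of how InImage might be used to list a DirectFile and show its holes. It assumes, as in the SIMULA Standard, that reading an unwritten location at or below LastLoc fills Image with null characters (ISO rank zero) and that InImage, like OutImage, moves the current location on by one. Some systems may behave differently. The file name "Data" and the variable names are only illustrative.

```simula
begin
   ref(DirectFile) Direct;
   integer Here;
   character First;

   Direct :- new DirectFile("Data");
   Direct.Open(Blanks(80));
   Direct.Locate(1);
   while Direct.Location <= Direct.LastLoc do
   begin
      Here := Direct.Location;     ! Remember it, since InImage moves on by one;
      Direct.InImage;
      Direct.Image.SetPos(1);
      First := Direct.Image.GetChar;
      OutInt(Here, 4);
      if First = Char(0) then
         OutText("  unwritten")    ! Assumed marker: an image of null characters;
      else
      begin
         OutText("  ");
         OutText(Direct.Image.Strip)
      end;
      OutImage
   end--of--scan;
   Direct.Close
end**of**program
```

The loop is controlled by LastLoc rather than EndFile, so it never attempts to read beyond the last written record.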
In fact it may be better to lock only that part of the file which the particular program wants to access, leaving the rest free for others to access. This may or may not be possible, depending on the system.
Three procedures are provided for this.
The other two parameters are both integers. They indicate the range of locations within the file which this program wishes to lock. Some systems will lock the whole file regardless. If both integers are zero, the whole file is to be locked. If the system does not support locking of files, it returns a negative result to indicate this.
If Lock succeeds within its time limit, zero is returned.
A second call of Lock, with the file already locked, will cause it to be first unlocked and then locked again. This may mean that it becomes locked first by another program.
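A sketch of a typical locking sequence follows. It assumes that the first parameter of Lock is a real time limit, taken here to be in seconds, that the lock is released with Unlock, and that any result other than zero means the lock was not obtained. The file name and the record written are purely illustrative.

```simula
begin
   ref(DirectFile) Records;
   integer Result;

   Records :- new DirectFile("StaffRecs");
   Records.Open(Blanks(80));
   ! Ask for locations 1 to 10, waiting no longer than the time limit;
   Result := Records.Lock(5.0, 1, 10);
   if Result = 0 then
   begin
      Records.Locate(4);
      Records.OutText("Updated");
      Records.OutImage;
      Records.Unlock               ! Release the lock as soon as possible;
   end
   else if Result < 0 then
      OutText("Locking is not supported on this system")
   else
      OutText("The lock was not obtained");
   OutImage;
   Records.Close
end**of**program
```

Testing the result before writing matters, since a program which ignores a failed Lock may update records that another program is using.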
Example 15.2: Use of DirectFile for numbered records.

begin
   ref(DirectFile) Records;
   ref(InFile) InPut;
   InPut :- new InFile("Additions");
   InPut.Open(Blanks(86)); ! First 6 hold employee number;
   Records :- new DirectFile("StaffRecs");
   Records.Open(Blanks(80));
   while not InPut.EndFile do
   begin
      Records.Locate(InPut.InInt);
      Records.OutText(InPut.InText(80));
      Records.OutImage
   end--of--reading--in--records;
   while not Records.EndFile do OutText(Records.InText(80))
end**of**program
Not all records contain a convenient number of this sort. It may be necessary to scan a file checking for the required record. Even so, there is often an advantage in being able to search and write to a file without copying it into a new file. Think back to our earlier label programs and see how much simpler they would be with a DirectFile.
A particular example of efficient searching on a non-numerical key is known as hashing. DirectFiles are very useful for simple hashing. Most textbooks on searching and sorting will explain this in full.
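To give the flavour of the idea, here is a minimal sketch of hashing with a DirectFile. The hash function (the sum of the character codes, folded into the range 1 to TableSize), the table size of 101, the file name and the procedure names are all illustrative assumptions, and TableSize is assumed not to exceed MaxLoc. A location already in use is handled by trying the following locations in turn. As in the SIMULA Standard, an unwritten location at or below LastLoc is assumed to read as null characters.

```simula
begin
   ref(DirectFile) Table;
   integer TableSize;

   ! Hypothetical hash function: fold the sum of the codes into 1..TableSize;
   integer procedure Hash(Key); text Key;
   begin
      integer Sum;
      Key.SetPos(1);
      while Key.More do Sum := Sum + Rank(Key.GetChar);
      Hash := Sum - (Sum // TableSize) * TableSize + 1
   end--of--Hash;

   ! Write Key at its hashed location, or at the next free one after it;
   procedure Store(Key); text Key;
   begin
      integer Slot, Tries;
      Boolean Free;
      Slot := Hash(Key);
      while not Free and Tries < TableSize do
      begin
         if Slot > Table.LastLoc then Free := True
         else
         begin
            Table.Locate(Slot);
            Table.InImage;
            Table.Image.SetPos(1);
            Free := Table.Image.GetChar = Char(0)
         end;
         if not Free then
         begin
            Slot := if Slot = TableSize then 1 else Slot + 1;
            Tries := Tries + 1
         end
      end;
      if Free then
      begin
         Table.Image := notext;    ! Clear any null characters left by InImage;
         Table.SetPos(1);
         Table.Locate(Slot);
         Table.OutText(Key);
         Table.OutImage
      end
   end--of--Store;

   TableSize := 101;
   Table :- new DirectFile("HashData");
   Table.Open(Blanks(80));
   Store("Smith");
   Store("Jones");
   Table.Close
end**of**program
```

Looking a key up again would follow the same path: hash the key, read that location, and step forward until the key or an unwritten location is found.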
This approach is often quite natural. It has its origins in the use of punched cards, usually holding up to 80 characters, for input and line printers, printing up to 132 characters per line, for output. The name Image is a contraction of the old term "card image", referring to how the contents of a punched card are stored in a particular computer's memory.
This view of the world has never covered all the possible devices for input and output for computers. It certainly does not represent the memory in which most information is stored on modern computers. Neither does it represent "screen oriented" input and output, nor graph plotter output.
In fact, most computers use a large number of different structures for representing data. Some are held in memory, others are connections to external sources and destinations. Only some of them can be adequately thought of in terms of records.
Even when it is possible to pretend that an unstructured file is made up of records, this may slow down access as make believe records are constructed or disassembled. In recognition of the need to provide a solution, SIMULA has a type of file called a ByteFile. This attempts to provide the most general way of reading or writing information, with no assumptions about what that information looks like. This approach is sometimes called "stream oriented" input and output.
As far as most programmers are concerned, the way in which a computer stores information is irrelevant. They are normally interested in manipulating characters and numbers. Numbers are held in most computers as sequences of binary digits (bits) with a fixed maximum length. In real numbers some of these bits represent the position of the point (the exponent) and the others the significant digits. In integers they all represent the digits. In general an integer and a real occupy the same number of bits, and a location of this size is called a "word".
Long reals and short integers may be stored in longer or shorter locations as appropriate.
A computer's memory is an enormous number of bits, divided into fixed size locations which are words. The number of bits in a word is usually several times larger than that needed to represent one ISO character, which requires a minimum of eight bits for the full set. Most computers divide the words in their memory into smaller locations called bytes. Each byte can hold one character and so is, normally, at least eight, but possibly more, bits long.
Diagram 15.2 shows some typical memory locations on what is called a "32 bit" computer architecture. Computers are often categorised by the number of bits in one word of their memory.
Diagram 15.2: 32 bit memory locations.
integer        |byte1|byte2|byte3|byte4|                          -> 1 word  -> 32 bits
real           |byte1|byte2|byte3|byte4|                          -> 1 word  -> 32 bits
character      |byte1|                                            -> 1 byte  ->  8 bits
long real      |byte1|byte2|byte3|byte4|byte5|byte6|byte7|byte8|  -> 2 words -> 64 bits
short integer  |byte1|byte2|                                      -> 2 bytes -> 16 bits

N.B. this varies from system to system, even among 32 bit machines. Consult your documentation carefully if you use ByteFiles.
Clearly most programs handle information by the word (integers and reals) or by the byte (characters and texts). Furthermore, it is usually possible to treat a word as a sequence of bytes. Thus a file type which allows byte by byte access to memory can be used to read words.
Records are also sequences of bytes. They contain a fixed number of bytes (fixed length images), are prefixed by a byte or word indicating their length (variable length images), end with a special character (also variable length images) or are marked in any of a large number of possible ways. Thus reading a byte at a time allows records to be accessed as well. In fact many new ways of structuring files can be built on top of the ByteFile.
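As an illustration of building records on top of bytes, the sketch below reads a file in which each record is prefixed by a single byte giving its length. It assumes a well formed file, that each byte holds a character code acceptable to Char, and the SIMULA Standard behaviour in which InByte returns zero and sets EndFile when no bytes remain. The file name "SOURCE" and the variable names are illustrative.

```simula
begin
   ref(InByteFile) Source;
   integer Count, I;

   Source :- new InByteFile("SOURCE");
   Source.Open;
   Count := Source.InByte;         ! The length byte in front of the first record;
   while not Source.EndFile do
   begin
      ! Copy the Count bytes of this record to SysOut as characters;
      for I := 1 step 1 until Count do OutChar(Char(Source.InByte));
      OutImage;
      Count := Source.InByte       ! Length of the next record, if there is one;
   end--of--records;
   Source.Close
end**of**program
```

OutChar is used rather than OutText so that a record longer than the output line is simply continued on the next line.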
Example 15.3: Use of ByteFiles.
begin
   ref(InByteFile) LocalChars;
   ref(OutByteFile) ISOChars;
   LocalChars :- new InByteFile("SOURCE");
   ISOChars :- new OutByteFile("OUTPUT");
   inspect LocalChars do
   begin
      Open;
      inspect ISOChars do
      begin
         Open;
         SetAccess("bytesize:8"); ! Standard for ISO/ASCII files;
         while not EndFile do OutByte(Rank(ISOChar(InByte)));
         Close
      end--of--inspecting--ISOChars;
      Close
   end..of..inspecting..LocalChars
end**of**program
The program will convert a file from ISO into local characters, by reading it as a stream of bytes. Each of these will occupy the standard eight bits for an ISO character and the value read will be the ISORank for it. By passing this to ISOChar a local character corresponding to this ISO internal code is generated. By passing this to Rank, the local internal code is generated. This is written out to a file as a local byte, with the appropriate number of bits.
Class ByteFile has a short integer procedure ByteSize, which will return the number of bits in a byte on that SIMULA system. The value of ByteSize is fixed.
It also has an Open procedure. Note that this requires no parameters, since there is no Image.
SetAccess also works for ByteFile. The mode bytesize is especially provided for use when files with non-standard byte sizes for a particular system, such as those brought from another computer, are to be processed.
The concepts of direct access and byte oriented access have been outlined.
We have seen the attributes and uses of ImageFile class DirectFile.
We have seen briefly the attributes of ByteFile and its subclasses InByteFile, OutByteFile and DirectByteFile.