Files in A+


Script Files

As described in earlier chapters, the Load and the Load and Remove system functions, and the commands of the same names, deal with script files. Each loads such a file into the active workspace by interpreting every one of its lines, starting at the top, essentially as if the lines had been entered directly in the active workspace. The difference is that after a file is loaded, the current context and the current directory are the same as they were when the Load command or function was initiated: each is automatically restored if it was changed during execution of the lines of the file. See "Workspaces and Scripts" for more details concerning this kind of file.

Mapped Files

Data files in A+ are called mapped files.

From the viewpoint of the A+ primitive functions, a mapped file is (for the most part) an ordinary array when accessed through an associated variable. A mapped file can always be referenced and (if opened for writing) selectively assigned (see "Selective Assignment") as if it were an ordinary array.

Ordinary (as opposed to selective) Assignment conveys not only value (including shape) but also mapping status to its target. In particular, a target that is mapped will remain so if and only if the right-hand side of the Assignment is mapped (see bullet items below); otherwise, it will become an ordinary array. Also, since passing arguments and results acts like ordinary Assignment, if a function is called with a mapped file as an argument, that (local variable) argument in the function will be mapped, and if the result of the last expression executed in a function is mapped, so is the function's result.

A+ provides a special syntax that is particularly useful for updating all elements of a mapped file:
     a[]←b
and a special syntax that is particularly useful for appending new items to a mapped file:
     a[,]←b

Again, see Selective Assignment for definitions of these expressions. Appending items to a mapped file with the latter special sequence is permitted only when enough space already has been allocated for the new items. If not enough space has been allocated, a maxitems error is reported. Such allocation is accomplished with _items ("Items of a Mapped File"). After allocation, the file must be remapped.
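
The allocate-then-remap rule has a close analogue at the Unix level, which the following minimal sketch illustrates in Python rather than A+. It is not A+ code and does not reproduce _items or the maxitems error; the scratch file name and the byte counts are hypothetical. It only shows that a mapping cannot grow past the space it was created over, and that after more space is allocated in the file a new mapping is needed to use it.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file; 16 bytes stand in for the allocated space.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"\0" * 16)

with open(path, "r+b") as f:
    m = mmap.mmap(f.fileno(), 0)        # map the whole file, read/write
    m[0:16] = b"0123456789abcdef"       # fits in the allocated space
    try:
        m[16:20] = b"more"              # nothing is allocated past byte 16
        overflow_rejected = False
    except (IndexError, ValueError):
        overflow_rejected = True
    m.close()

    f.truncate(32)                      # allocate more space in the file ...
    m = mmap.mmap(f.fileno(), 0)        # ... then remap to make it usable
    m[16:20] = b"more"
    grown = bytes(m[0:20])
    m.close()
```

In this sketch the failed write plays the role of the maxitems error, f.truncate plays the role of _items, and the second mmap call plays the role of remapping.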

On AIX, the workspace size can't be enlarged when there is a file mapped (1 0_dbg{`display;`beam} is nonzero), because A+ requires that the entire workspace be contiguous and AIX requires that all space be taken from one end of the address space; thus, the mapped file blocks workspace enlargement.

When one mapped file is created from another, by ordinary Assignment, the new mapped file has only enough space to contain the data; the value of _items{¯1;} for the old file is not carried over to the new file.

Mapped files can be shared by different A+ processes, and all A+ processes on the same machine will immediately see any updates by other A+ processes on that same machine. A+ processes on a different machine, however, may not see the updates immediately, or ever. Whether or not the updates are seen depends on when and if the underlying operating system refreshes certain virtual memory pages. They may be only partly seen when items do not correspond to pages.

When a mapped file is an argument to a defined function, the corresponding local variable is, of course, also a mapped file. If that argument is a global variable, then both variables, global and local, refer to the same file, and therefore can affect each other's values if they have write access to the file. If they have read access only and one variable is used for writing, then it is no longer a mapped file, but this change in its status naturally does not affect the other variable. In short, at the point at which the function is called they are two independent variables referring to the same file.

To obtain just the value of a mapped file, i.e., a copy of the file, the Right function can be used. See the examples below.

The primitive function ⌶ (Map or Map In, or Beam) is used to create, reference and update mapped files.

   Error Messages

If a Map operation fails, you should get a domain error.  A more specific error message appears in the session log if one is generated by Unix itself.  (The message is not guaranteed to appear, but Unix so reports most problems.)  If the Map is within a protected do, the do result shows the domain error, not the specific error.  To help determine the cause of the failure, you can use sys.errno{} to retrieve the Unix error number, although exactly which system call failed will sometimes be ambiguous.

If an "Operation would block" (or perhaps "interrupted system call") error message is received, it usually means that there has been a temporary network outage or extreme slowdown, so the map operation timed out. Before reporting this problem to the user, A+ makes repeated attempts to carry out the operation, over a period of about a minute. Be aware that such a delay can occur.

If a "not an `a object" error message is received, it may mean that the address space for all mapped files (including atmp) has been exhausted, and that therefore the file header does not agree with the file body. Mortgage files can be especially large, and may contribute to the exhaustion of this 1Gb to 4Gb space.

If a file opened for reading only is "wrong-endian", it is copied into a variable in atmp and a warning message is issued. For writing, including local writing, Beam rejects a "wrong-endian" file.

   Concurrent Reading and Writing of a Mapped File

When two users, A and B, say, map the same NFS file (as A+ variables), A for writing and B only for reading, what B sees when A makes a change depends upon the location of their machines and the file in the network:

  1. A and B are on the same machine and A updates the file in place. B will see the change immediately, regardless of whether B mapped the file before or after A made the change.

  2. A and B are on the same machine and A uses _items and appends to the file before B maps (or remaps) the file. B will see the change immediately.

  3. Like 2, but A uses _items after B maps the file. There is no problem unless A has created a new page and B attempts to reference the new page, in which case B will get a segv error.

  4. Like 1, 2, and 3 except that A and B are on different machines. The results are more or less the same as in those three cases but may be delayed. Unix looks for a page in its cache first, and uses it if it is there. Depending on system activity, a page may remain in the cache even after it has been changed on another machine, as NFS has no way of knowing that the page has been changed. Hence B will see A's change only after the affected old pages are removed from the cache by the system to make way for more recently referenced pages. B can get the new pages by remapping the file before referencing it (and after A's changes), unless A has engaged in the bad practice of changing the file on a machine other than the one on which it resides - in which case any changed pages will have to make their way via NFS to the host machine for the file before users on any other machines see them.

  5. A and B and the file are all on the same machine and A rewrites (remaps) the file. The system creates a new file (a new "inode") and gives it the old name. If anyone has the old copy of the file opened or mapped (the reference count is greater than zero), the system keeps the old file around with the old inode. When B references the mapped variable (without remapping), the reference is by inode and therefore to B's private copy of the old file; A's newly written file is not seen. These old copies take up space on disk and remain until no user has them opened or mapped.

  6. Like 5 except that B is on a different machine from A and the file, and enough activity has occurred after A rewrote the file to flush the cache in B's machine. A reference to the mapped file by B will produce a "Stale NFS file handle" error report if no one on A's machine has happened to keep the file's reference count above zero and thus preserved the old inode. The system in B's machine has gone looking for an inode that no longer exists (inodes are unique within a domain).

  7. Like 6 except that A renames the file before rewriting (remapping) it (under its original name). Then B sees the old file, as it was before A's actions, even if enough activity has occurred to flush the cache in B's machine, because the old copy of the file, under the old inode, is still around. (For example, files in the mas database that get rewritten are first renamed.)

Warning! An application can crash, with a bus error, if two or more users are writing (not necessarily at the same instant) in the same mapped file. The mapped file mechanism does not mediate independent updates.
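
The inode behavior described in case 5 can be observed on a single machine with the following minimal Python sketch (an analogue, not A+ code). It assumes that rewriting the file amounts to creating a new file and giving it the old name, which is imitated here by writing a temporary file and renaming it over the original; the file names are hypothetical.

```python
import mmap
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "data.bin")      # hypothetical file name
with open(path, "wb") as f:
    f.write(b"old contents")

# Reader B maps the file for reading only.
with open(path, "rb") as f:
    b_view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
old_inode = os.stat(path).st_ino

# Writer A rewrites the file: a new file (new inode) takes over the old name.
tmp = path + ".new"
with open(tmp, "wb") as f:
    f.write(b"new contents")
os.replace(tmp, path)
new_inode = os.stat(path).st_ino

# B's existing mapping refers to the old inode, so B still sees the old data.
seen_by_b = bytes(b_view[:])
b_view.close()

# Only after remapping does B see what A wrote.
with open(path, "rb") as f:
    b_view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
seen_after_remap = bytes(b_view[:])
b_view.close()
```

The old copy stays on disk until B's mapping is closed, which is the disk-space cost mentioned in case 5.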

   Mapped Files On Remote Machines

Using mapped files across NFS or AFS is problematic for anything other than a simple read, and sometimes even for that. A job which will run on one machine and access data in mapped files on another machine should probably have a server process on the machine where the files are. The application should then submit queries and updates and receive data through an adap connection to that server.

   Map y⌶x

        Arguments
y is an integer and x is a symbol or character string, or y is a symbol or a character string and x is a simple character or numeric array.
        Definition
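There are two cases, both illustrated in the examples below.

When y is an integer, x names an existing mapped file and the result is a mapping of that file: y is 0 for reading only, 1 for reading and writing, and 2 for reading and local writing.

When y is a symbol or a character string, y names the file, and the simple array x is written to it as a mapped file.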

When a file is mapped, the Unix command sequence is  open(); mmap(); close().  close() does not unmap the file but does free the file descriptor. When the file is unmapped - by assigning a new value to the variable, expunging the variable, ending the A+ session, or whatever - munmap() is called.
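
The same call sequence can be sketched in Python, whose os and mmap modules wrap these Unix calls. This is an illustration of the sequence only, not of how A+ itself is implemented, and the file name is hypothetical. Note that Python's mmap object keeps its own duplicate of the descriptor, so the early os.close is safe here for that reason as well.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mapped.bin")   # hypothetical file
with open(path, "wb") as f:
    f.write(b"hello, mapped file")

fd = os.open(path, os.O_RDONLY)                     # open()
view = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)    # mmap()
os.close(fd)                                        # close(): the mapping stays

first_word = bytes(view[:5])                        # mapping is still readable
view.close()                                        # munmap()
closed = view.closed
```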

   Map In ⌶x

        Argument
x is either a symbol or a character vector.
        Definition
⌶x is equivalent to 0⌶x.

   Examples

     fro←0⌶`file    ⍝ Create a mapping of the file file.m for reading only.
                    ⍝ If fro is changed, it becomes an unmapped array.
     frw←1⌶`file    ⍝ Create a mapping of file.m for reading and writing.
     frlw←2⌶`file   ⍝ Read and local write: see others' changes; don't show own.
     frw[i]←new     ⍝ Changes in place to frw also modify the file file.m
     frlw[i]≡new
 1                  ⍝ Change seen.
     frlw[i]←newer  ⍝ This change will not modify file.m.
     frw[i]≡newer   ⍝ Only frlw was modified:
 0                  ⍝ new and newer are unequal.
     `file⌶array    ⍝ Write a simple array as mapped file file.m

     'test.m'⌶0 4⍴0   ⍝ Create a file for a matrix with 0 rows.
      ⍝ Make the allocation 10 rows. Left argument is the total number of items in the file.
     _items{10;'test.m'}
 0                  ⍝ Successful allocation: result is the former number of items.
     t←1⌶'test.m'   ⍝ Map the file test.m in read/write mode.
     ⍴t
 0 4
     t[,]←10 20 30 40     ⍝ Append a new row.
     ⍴t
 1 4
     t[,]←1 2 3 4         ⍝ Append another new row.
     ⍴t
 2 4
     t
 10 20 30 40
  1  2  3  4
     _items{¯1;'test.m'}  ⍝ Determine number of allocated items.
 10                       ⍝ Still 10 allocated.
      ⍝ Make the allocation 20 rows. Left argument is the total number of items in the file.
     _items{20;'test.m'}
 10                   ⍝ Successful: result is former number of items.
                      ⍝ Since _items was executed, remap the file test.m
     t←1⌶'test.m'
     a←1⌶`file        ⍝ a is a mapped file.
     b←a              ⍝ b is a mapped file.
     c←⊢a             ⍝ c is not a mapped file, but has same current value as a
     f{x;y}:{...}
     f{a;⊢a}          ⍝ In this invocation of f, arg x is a mapped file and y is not.
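
For readers who know the Unix mapping calls, the three Map modes resemble the three access modes of Python's mmap module. The following sketch is an analogue under that assumption, not A+ code; the file name and contents are hypothetical. One difference is flagged in the comments: Python refuses a write through a read-only mapping, whereas in A+ the changed variable simply becomes an unmapped array.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.m")   # hypothetical file
with open(path, "wb") as f:
    f.write(b"aaaa")

def read_file():
    """Return the current contents of the file on disk."""
    with open(path, "rb") as f:
        return f.read()

with open(path, "r+b") as f:
    fro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)    # like mode 0
    frw = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)   # like mode 1
    frlw = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)   # like mode 2

frw[0:1] = b"N"                  # a change in place through frw ...
frw.flush()
after_shared_write = read_file() # ... also modifies the file
seen_by_reader = bytes(fro[:])   # and is seen by the read-only mapping

frlw[1:2] = b"L"                 # a local write ...
local_byte = bytes(frlw[1:2])
after_local_write = read_file()  # ... does not modify the file

try:
    fro[0:1] = b"X"              # Python rejects this; A+ would unmap fro instead
    ro_rejected = False
except TypeError:
    ro_rejected = True
```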

Unix Text Files

A simple way to read a Unix text file f is to enter
     m←sys.readmat{f}
where f is a character vector giving a path name. The file is read into m as a matrix, in which all rows have been made equal in length to the longest row in the file, by appended blanks.
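
The padding rule can be made concrete with a small Python sketch. It is only an illustration of the result described above, not the implementation of sys.readmat; the function name read_matrix and the sample file are hypothetical.

```python
import os
import tempfile

def read_matrix(path):
    """Read a text file as a list of rows, each padded with blanks
    to the length of the longest line."""
    with open(path) as f:
        lines = f.read().splitlines()
    width = max((len(line) for line in lines), default=0)
    return [line.ljust(width) for line in lines]

# Hypothetical sample file with lines of unequal length.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("ab\ncdef\n\ng\n")

rows = read_matrix(path)
```

Every element of rows has length 4 here, the length of the longest line, cdef.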

A file can be read as a vector, with newline characters embedded, by a Pipe In command like

    $<FileVarName cat FileName

or by a function such as:

read{file}:{
     if (0>s←sys.filesize{file}) ↑'read failed: ',⍕s;
     a←s⍴' ';
     fd←sys.open{file;`O_RDONLY;0};
     sys.read{fd;a;s};
     sys.close{fd};
     a
     }

Partition Count and Partition (⊂) can turn such a vector into a nested array of lines.

In this function the local variable s tells sys.read how many characters to read. Clearly, with minor alterations to the code shown, the file can also be read a portion at a time. Moreover, the function sys.lseek can be used to choose a point from which to start reading a portion of the file.
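
Reading a portion from a chosen starting point looks like this at the Unix level, sketched here in Python, whose os module wraps the same open, lseek, read and close calls. It is an analogue, not A+ code; the function name read_portion and the sample file are hypothetical.

```python
import os
import tempfile

def read_portion(path, offset, count):
    """Return up to count bytes of the file, starting at byte offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)   # choose the starting point
        return os.read(fd, count)           # read just this portion
    finally:
        os.close(fd)

# Hypothetical sample file with two lines.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(b"line one\nline two\n")

second_line = read_portion(path, 9, 8)
```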

Data can be written to a Unix text file by Pipe Out and Pipe Out Append commands like

	$>FileVarName FileName
	$>>FileVarName FileName
when the data is a character vector with embedded newlines, or by functions like the following, which also convert any character matrix arguments to vectors, deleting trailing blanks and appending newline characters as required:

write{file;data}:{
     if ((`char=∨data)^2=⍴⍴data) data←clean{data};
     if (0<fd←sys.open{file;
                       `O_CREAT`O_TRUNC`O_WRONLY;
                       8⊥6 4 4})
        {
        sys.write{fd;data;#data};
        sys.close{fd};
        };
     }

clean{n}:⊃(cleanline¨<@1 n),¨<"\n"

cleanline{x}:(-+/^\' '=⌽x)↓x

You can easily devise variations for yourself, for both reading and writing. For more details, such as the meanings and permissible values of the arguments to the sys functions, see "The sys Context".
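
As one such variation, the same conversion and write can be sketched in Python. This is an analogue for illustration, not a translation guaranteed to match the A+ functions in every detail; the names clean and write_text and the sample file are hypothetical. The flags and the octal mode 644 mirror the arguments given to sys.open above.

```python
import os
import tempfile

def clean(rows):
    """Turn a list of equal-length rows into one string, deleting
    trailing blanks from each row and appending a newline."""
    return "".join(row.rstrip(" ") + "\n" for row in rows)

def write_text(path, data):
    """Write data to path; a list of rows is first converted by clean."""
    if not isinstance(data, str):
        data = clean(data)
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_WRONLY, 0o644)
    try:
        os.write(fd, data.encode("utf-8"))
    finally:
        os.close(fd)

# Hypothetical sample: a three-row character matrix padded with blanks.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
write_text(path, ["ab  ", "cdef", "g   "])
with open(path) as f:
    written = f.read()
```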

doc@aplusdev.org© Copyright 1995–2008 Morgan Stanley Dean Witter & Co. All rights reserved.