next up previous
Next: Results Up: Parallel Algorithms Previous: Global Temporary Files (GTF)

Local Temporary Files (LTF)

The second algorithm we consider in this paper uses file systems that are local to each PE; such local file systems exist on many parallel machines, including most clusters of workstations. This algorithm, presented here for the first time, takes advantage of the fast communication channels available on parallel computers and utilizes local disk space for the temporary line list files. This local disk space is frequently large enough for the temporary line database and may offer high local I/O performance. In addition, I/O on the local disks of a PE does not require any inter-PE communication, whereas globally accessible file systems often use the same communication channel as explicit inter-PE communication. The latter can lead to network congestion if messages are exchanged simultaneously with global I/O operations.

For the line selection, we could use the algorithm described above, with the difference that the I/O PE would create a global (or local) database of selected lines. After the line selection is finished, this temporary database could simply be distributed to all PEs and stored on their local disks for subsequent use. However, this approach is likely to be slower than the GTF algorithm in all cases.

Instead, we use a ``ring'' algorithm that creates the local databases directly. Each of the N PEs selects lines for one block of the master database (blocks are distributed among the PEs in round-robin fashion). After the selection for this block is complete, each PE sends the selected data to its next neighbor in the ring; PE $i$ sends its results to PE $(i+1) {\rm \;mod\;} N$ (so that PE $N-1$ sends to PE 0) and, simultaneously, receives data from the previous PE in the ring. This can easily be realized with the MPI mpi_sendrecv call, which allows each member of the ring to send and receive data simultaneously. Each PE stores the data it receives in a buffer, and the exchange is repeated until the selected lines from all N blocks are buffered on all N PEs. The PEs then transfer the buffered line data into their local temporary databases. This cycle is repeated until the line selection phase is complete. The line opacity calculations then proceed in the same way as outlined above; however, the temporary line databases are now local to each PE.
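The ring exchange described above can be sketched in plain Python (a hypothetical simulation without MPI; with MPI, each step of the loop would be one mpi_sendrecv per PE):

```python
# Simulates one round of the LTF "ring" exchange: each of N PEs selects
# lines for one block, then after N-1 send/receive steps every PE has
# buffered the selected lines of all N blocks.

N = 4  # number of PEs

def select_lines(block):
    # stand-in for reading the master database and selecting one block's lines
    return f"lines(block={block})"

# Each PE starts with the block it selected itself (round-robin assignment).
buffers = [[select_lines(pe)] for pe in range(N)]

# N-1 ring steps: PE i forwards its most recently received data to
# PE (i+1) mod N while receiving from PE (i-1) mod N. With MPI this
# send and receive would happen simultaneously via mpi_sendrecv.
for step in range(N - 1):
    outgoing = [buf[-1] for buf in buffers]        # data each PE sends
    for pe in range(N):
        buffers[(pe + 1) % N].append(outgoing[pe])  # data each PE receives

# Every PE now holds the selected lines of all N blocks and can write
# them to its local temporary database.
for pe in range(N):
    assert sorted(buffers[pe]) == [select_lines(b) for b in range(N)]
print("all PEs hold all blocks")
```

After the N-1 steps, each PE's buffer contains the data of every block (in ring order), which is exactly what is needed to build the complete local temporary database.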

This approach has the advantage that accessing the temporary databases does not incur any (indirect) Network File System (NFS) communication between the PEs, as each of them has its own copy of the database. However, during the line selection phase a much larger amount of data has to be communicated over the network, because now every PE, not only the I/O PE of the first algorithm, has to ``know'' all selected lines.

The key insight here is that low-cost parallel computers built from commodity workstations typically have a fast communication network (100 Mb/s to 1 Gb/s) but relatively slow NFS performance. This means that trading the extra communication for fewer NFS disk accesses, as the LTF algorithm does, is likely to give better performance.
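A back-of-envelope estimate illustrates this trade-off; all numbers below are illustrative assumptions, not measurements from this paper:

```python
# Compare distributing the temporary line database once over a fast network
# (LTF) against reading it repeatedly over NFS (GTF). All figures are
# assumed for illustration only.

data_mb = 500.0           # assumed size of the temporary line database (MB)
net_mb_per_s = 100 / 8    # 100 Mb/s network ~ 12.5 MB/s
nfs_mb_per_s = 5.0        # assumed effective NFS throughput (MB/s)
n_reads = 10              # assumed number of passes over the database

net_time = data_mb / net_mb_per_s              # one-time ring distribution
nfs_time = n_reads * data_mb / nfs_mb_per_s    # repeated NFS reads

print(f"network distribution: {net_time:.0f} s, repeated NFS reads: {nfs_time:.0f} s")
```

Under these assumed figures, the one-time network distribution costs tens of seconds while repeated NFS reads cost many minutes, which is why the LTF strategy pays off despite the larger communication volume.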



Peter Hauschildt
2001-04-16