
Conclusions

We have obtained a significant speed-up of our serial model atmosphere code with only a modest number of changes. We find that a parallel version of PHOENIX based on MPI library calls and a distributed-memory model was easier to add to our existing serial code than a shared-memory version would have been, because in a shared-memory model we must take great care that specific memory locations are updated correctly. The speed-ups we have achieved are below the theoretical maximum; this is not unexpected once loop overhead and communication costs are taken into account. The situation is very similar to the earlier transition from strictly serial codes to vectorized codes, where the theoretical maximum vector speed-up is very rarely reached in practical applications. The parallel speed-up of PHOENIX is important for practical applications and, in addition, allows much larger problems (in terms of both memory size and CPU time) to be handled.
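The gap between the achieved and the theoretical maximum speed-up can be illustrated with a simple Amdahl-type estimate. The sketch below is not taken from PHOENIX: the function name `modeled_speedup`, the linear communication-cost term, and the numerical values (a parallel fraction of 0.95 and an overhead of 0.002 per additional node) are assumptions chosen only to show the qualitative behavior.

```python
def modeled_speedup(n_nodes, parallel_fraction, comm_overhead_per_node=0.0):
    """Estimate the speed-up on n_nodes relative to a single node.

    The serial run time is normalized to 1. All parameters are
    illustrative assumptions, not measured PHOENIX values.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    serial_part = 1.0 - parallel_fraction        # work that cannot be distributed
    parallel_part = parallel_fraction / n_nodes  # work shared evenly by the nodes
    # Assumed overhead model: cost grows linearly with the number of extra nodes.
    overhead = comm_overhead_per_node * (n_nodes - 1)
    return 1.0 / (serial_part + parallel_part + overhead)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        model = modeled_speedup(n, parallel_fraction=0.95,
                                comm_overhead_per_node=0.002)
        print(f"{n:3d} nodes: ideal {float(n):5.1f}, modeled {model:5.2f}")
```

In this toy model the modeled speed-up equals the node count only when the parallel fraction is 1 and the overhead is zero; any serial remainder or communication cost pulls it below that limit.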

Future improvements of the parallel version of PHOENIX will include the distribution of the NLTE groups to different nodes (improving the degree of parallelization and allowing much larger problems to be handled on machines with less memory per node, e.g., the Cray T3D), as well as additional optimization of the code based on experience with large-scale production runs on parallel machines.
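One possible way to assign groups to nodes is a contiguous block partition, sketched below. This is only an illustration of the general idea, not the scheme planned for PHOENIX: the function name `groups_for_rank` and the group and node counts are hypothetical, and in an MPI program the values of `rank` and `n_ranks` would come from `MPI_Comm_rank` and `MPI_Comm_size`.

```python
def groups_for_rank(n_groups, n_ranks, rank):
    """Return the contiguous range of group indices owned by `rank`.

    Hypothetical helper: groups are split into blocks whose sizes
    differ by at most one, so no node holds much more than its share.
    """
    if not 0 <= rank < n_ranks:
        raise ValueError("rank out of range")
    base, extra = divmod(n_groups, n_ranks)
    # The first `extra` ranks take one additional group each.
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return range(start, start + count)


if __name__ == "__main__":
    # Example with assumed sizes: 10 groups spread over 4 nodes.
    for r in range(4):
        print(r, list(groups_for_rank(10, 4, r)))
```

With such a partition each node needs to store only the data for its own groups, which is the property that would reduce the memory required per node.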

It is a pleasure to thank D. Branch, P. Nugent, A. Schweitzer, S. Shore, and S. Starrfield for stimulating discussions. We thank the anonymous referee for suggestions which improved the presentation. This work was supported in part by NASA LTSA grants NAGW 4510 and NAGW 2628 and by NASA ATP grant NAG 5-3067 to Arizona State University, by NSF grant AST-9417242, and by an IBM SUR Grant to the University of Oklahoma. Some of the calculations presented in this paper were performed at the Cornell Theory Center (CTC), the San Diego Supercomputer Center (SDSC), supported by the NSF, and the Paderborn Center for Parallel Computing; we thank them for a generous allocation of computer time.


Peter H. Hauschildt
4/27/1999