We define a task as a logical unit of code that treats one aspect of the physics of the simulation, such as radiative transfer or NLTE rate calculations. In many cases, different tasks are independent and can be executed in parallel; we call this coarse-grained parallelism task parallelism. Within each task, there may also be an opportunity for data parallelism, a fine-grained parallelism, e.g., at the level of individual loops whose iterations are independent of one another.
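The distinction between the two kinds of parallelism can be sketched as follows. This is a minimal illustration, not the code described in the text: the function names `radiative_transfer` and `nlte_rates` are hypothetical stand-ins for the physics tasks, and Python's `concurrent.futures` module merely plays the role of the underlying parallel runtime.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two independent physics tasks.
def radiative_transfer(grid):
    return [2.0 * x for x in grid]   # placeholder computation

def nlte_rates(grid):
    return [x + 1.0 for x in grid]   # placeholder computation

def square(x):
    # One independent loop iteration in a data-parallel loop.
    return x * x

grid = list(range(8))

with ThreadPoolExecutor() as pool:
    # Task parallelism (coarse grained): the two independent
    # tasks are submitted and may execute concurrently.
    rt = pool.submit(radiative_transfer, grid)
    rates = pool.submit(nlte_rates, grid)

    # Data parallelism (fine grained): independent loop
    # iterations are distributed over the available workers.
    squares = list(pool.map(square, grid))
```

Here `submit` expresses the task-level decomposition, while `map` expresses the loop-level one; the two forms are orthogonal and can be mixed freely.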
We use the term node to indicate a single processing element, i.e., the smallest separate computational unit of the parallel computer. A node might be a single CPU (e.g., an IBM SP2 thin node) or it might be a multi-CPU SMP node (e.g., a dual Pentium Pro system that is part of a networked cluster used as a parallel machine). Each node is assumed to have local virtual memory and a means of communicating with the other nodes in the system. In addition, the node has access to a global filesystem and, possibly, a local filesystem as well. These assumptions are fulfilled by essentially all currently available parallel supercomputer systems (with the exception of the Cray T3E, which does not support virtual memory).
A single node can execute a number of tasks, either serially on a single CPU or in parallel, e.g., on an SMP node of a distributed shared-memory supercomputer. In addition, data parallelism can be used across nodes, and any combination of task and data parallelism can be employed simultaneously.
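The combined use of both forms can be sketched by nesting them: tasks run concurrently at the outer level, and each task internally distributes its independent loop iterations. Again a hedged illustration only; `data_parallel_task`, `square`, and `increment` are hypothetical names, and thread pools stand in for the nodes and CPUs of a real parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def increment(x):
    return x + 1

def data_parallel_task(func, data, workers=2):
    # Data parallelism inside one task: independent loop
    # iterations are spread over an inner pool of workers.
    with ThreadPoolExecutor(max_workers=workers) as inner:
        return list(inner.map(func, data))

data = list(range(6))

# Task parallelism across tasks, data parallelism within each:
# the two submitted tasks may run concurrently, and each one
# internally parallelizes its own loop.
with ThreadPoolExecutor(max_workers=2) as outer:
    f1 = outer.submit(data_parallel_task, square, data)
    f2 = outer.submit(data_parallel_task, increment, data)
    squares, increments = f1.result(), f2.result()
```

On a real distributed shared-memory machine, the outer level would correspond to distributing tasks over nodes and the inner level to the CPUs within an SMP node.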