As demonstrated by Olson et al. (1987) and Olson & Kunasz (1987), the coefficients , , and can be used to construct diagonal and tri-diagonal operators for 1D radiation transport problems. In fact, up to the full matrix can be constructed by a straightforward extension of the idea (Hauschildt & Baron, 2004; Hauschildt et al., 1994). These non-local operators not only lead to excellent convergence rates but they also avoid the problem of false convergence that is inherent in the iteration method and can also be an issue for diagonal (purely local) operators. Therefore, it is highly desirable to implement a non-local also for the 3D case. The tri-diagonal operator in the 1D case is simply a nearest neighbor that considers the interaction of a point with its two direct neighbors. In the 3D case, the nearest neighbor considers the interaction of a voxel with the (up to) surrounding voxels (this definition considers a somewhat larger range of voxels that a strictly face-centered view of just 6 nearest neighbors). This means that the non-local requires the storage of 27 (26 surrounding voxels plus local, i.e., diagonal effects) times the total number of voxels elements. This can be reduced, for example if one considers only the voxels that share a face to a total of 7 elements for each voxel. However, this would ignore the effect of characteristics that pass 'diagonally' through a voxel and would therefore lead to a slower convergence rate.

The construction of the operator proceeds in the same way as discussed in Hauschildt (1992). In the 3D case, the 'previous' and 'next' voxels along each characteristic must be known so that the contributions can be attributed to the correct voxel. Therefore, we use a data structure that attaches to each voxel its effects on its neighbors. The scheme can be extended trivially to include longer range interactions for better convergence rates (in particular on larger voxel grids), however the memory requirements to simply store ultimately scales like where is the total number of voxels. The storage requirements can be reduced by, e.g., using 's of different widths for different voxels. Storage requirements are not so much a problem if a domain decomposition parallelization method is used and enough processors are available. Below we will show some results for test cases with larger operators.

We describe here the general procedure of calculating the
with *arbitrary* bandwidth, up to the full -operator, for the method in
spherical symmetry (Hauschildt et al., 1994). The construction of the
is
described in Olson & Kunasz (1987), so that we here summarize the relevant formulae. In the
method of Olson & Kunasz (1987), the elements of the row of
are computed by setting the
incident intensities (boundary conditions) to zero and setting
for one voxel and performing a formal
solution analytically.

We describe the construction of
using the example of a single
characteristic. The contributions to the
at a voxel are given by

These contributions are computed along a characteristic, here labels the voxels