Convergence

The convergence properties of the line transfer tests presented here are shown in Figs. 7-10. In each figure, we show the convergence rates, as measured by the relative corrections per iteration, for a number of test runs. In all tests shown here we have used

points and $n_\theta=n_\phi=32$ solid angle points for the 3D test case and 64 radial points for the 1D comparison test. The iterations were started with $S_l={\bar J}=B$ at all spatial points; this initial guess causes a relative error of about

in ${\bar J}$ at the outer zones for the case with $\epsilon _l=10^{-2}$ and about

in ${\bar J}$ at the outer zones for the case with $\epsilon_l=10^{-8}$. The plots show that the $\Lambda$ iteration is useless even for the relatively benign case of $\epsilon_l=10^{-2}$. The operator splitting method delivers much larger corrections and is substantially accelerated by the Ng method, similar to the results shown in Paper I. The nearest-neighbor operator gives substantially better convergence rates than the diagonal operator, cf. Fig. 7. For the test cases with $\epsilon_l < 10^{-2}$, the convergence behavior of the diagonal operator is unstable and the corrections tend to oscillate. The nearest-neighbor operator shows stable convergence with quickly declining corrections for all test cases, and its convergence rate can be accelerated with Ng's method. The total number of iterations required for the nearest-neighbor operator is essentially identical to the 1D case with a tri-diagonal operator.
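The qualitative difference between $\Lambda$ iteration and operator splitting can be reproduced with a small toy problem. The sketch below is not the 3D framework used for the tests. It assumes a two-level atom with $S=(1-\epsilon){\bar J}+\epsilon B$, a hypothetical two-stream $\Lambda$ matrix on a uniform optical-depth grid (`toy_lambda`), and approximate operators obtained by keeping only a band of that matrix (`banded`). All function names and the grid parameters are illustrative assumptions, and Ng acceleration is not included.

```python
import numpy as np

def toy_lambda(n=64, dtau=2.0):
    """Toy two-stream Lambda matrix on a uniform optical-depth grid (assumption)."""
    idx = np.arange(n)
    # Kernel tanh(dtau/2) * exp(-|tau_i - tau_j|): rows sum to 1 in an infinite
    # medium and to less than 1 near the two boundaries of the finite slab.
    return np.tanh(dtau / 2.0) * np.exp(-dtau * np.abs(idx[:, None] - idx[None, :]))

def banded(lam, half_width):
    """Keep the band |i-j| <= half_width of lam; a negative width gives zero."""
    i, j = np.indices(lam.shape)
    return np.where(np.abs(i - j) <= half_width, lam, 0.0)

def iterate(lam, lam_star, eps, planck, n_iter):
    """Operator splitting for S = (1-eps)*Lambda[S] + eps*B, started at S = B.

    With lam_star = 0 this reduces to plain Lambda iteration.
    Returns the final S and the maximum relative correction of each iteration.
    """
    s = planck.copy()
    lhs = np.eye(len(planck)) - (1.0 - eps) * lam_star
    corrections = []
    for _ in range(n_iter):
        rhs = (1.0 - eps) * ((lam - lam_star) @ s) + eps * planck
        s_new = np.linalg.solve(lhs, rhs)
        corrections.append(np.max(np.abs(s_new - s) / s_new))
        s = s_new
    return s, corrections

eps = 1e-2
lam = toy_lambda()
planck = np.ones(lam.shape[0])
# Reference solution from a direct solve of the full linear system.
exact = np.linalg.solve(np.eye(len(planck)) - (1.0 - eps) * lam, eps * planck)

results = {}
for name, half_width in [("lambda iteration", -1),
                         ("diagonal", 0),
                         ("nearest-neighbor", 1)]:
    s, corr = iterate(lam, banded(lam, half_width), eps, planck, 300)
    results[name] = (np.max(np.abs(s - exact)), corr[-1])
    print(f"{name:17s} max error {results[name][0]:.2e}  "
          f"last relative correction {results[name][1]:.2e}")
```

In this toy setup the plain $\Lambda$ iteration still differs noticeably from the direct solution after 300 iterations although its corrections are already small, whereas both banded operators reach it, the nearest-neighbor band faster than the diagonal one. The numbers depend on the assumed grid and kernel and should not be compared quantitatively with Figs. 7-10.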

**Table 1:** Scaling results for different parallel configurations. $N_{\rm worker}$ is the number of MPI processes working on a formal solution for a single wavelength, $N_{\rm cluster}$ is the number of sets of $N_{\rm worker}$ processes working on different wavelengths, and the total number of MPI processes is $N_{\rm cluster}\times N_{\rm worker}$. The column 'FS+$\Lambda^*$+OS step' gives the wallclock time (in seconds) for the calculation of the first formal solution plus the construction of the $\Lambda^*$ operator plus the time for the first operator splitting step; the column 'FS+OS step' is the time for the second (and subsequent) formal solution and operator splitting step.
| $N_{\rm worker}$ | $N_{\rm cluster}$ | FS+$\Lambda^*$+OS step (s) | FS+OS step (s) |
|---|---|---|---|
| 128 | 1 | 3018 | 1143 |
| 64 | 2 | 2595 | 1072 |
| 32 | 4 | 2340 | 1032 |
| 16 | 8 | 2308 | 1018 |
| 8 | 16 | 2264 | 1052 |
| 4 | 32 | 2318 | 1054 |