SOC – dust radiative transfer with OpenCL

SOC is the name of a program that is used to model the scattering and thermal emission of interstellar dust. The name was originally an acronym for Scattering with OpenCL, which refers to the fact that the program is implemented using OpenCL.

The program was used in the code comparison published in Gordon et al. 2017. However, since then the main program has been translated from C++ to Python and some additional features have been added. The initial Python version was described in Juvela 2019. Because the main computations are carried out in OpenCL kernels that are compiled to machine code, the use of Python in the main program did not result in any significant degradation of the run times. On the contrary, the main program is now much shorter and allows easier modifications and tests of alternative implementation details.

This page describes the use of the Python version of SOC. Our goal has been to keep SOC simple so that it only concentrates on the actual radiative transfer calculations. SOC needs only minimal information of the dust models, which is given through input files. SOC can calculate dust temperatures for grains that are in equilibrium with the radiation field but the emission of stochastically heated grains is handled outside SOC (for example, using DustEM).

A SOC run needs the following files that are described below:

  • A parameter file that lists the run parameters and includes references to additional files.
  • A cloud file that describes the density distribution of the model cloud.
  • A dust file that defines the absorption and scattering properties of the dust. There may be separate files for tabulated scattering functions.
  • Files describing the radiation field. This is separate for isotropic background, anisotropic background, and for (internal and external) point sources.

Parameter file (ini-file)

The parameter file (ini-file) is what is given to SOC on the command line and from where SOC extracts all information about other input files and the parameters of the run. It is a text file where each line starts with a keyword that is possibly followed by one or more parameters of that keyword. Lines that do not start with a known keyword are discarded and one can also use the “#” character to indicate a comment. The following list contains most of the common keywords with their arguments (before colon) with a short explanation.

In the descriptions we refer to the total number of cells in the model as CELLS and the number of simulated frequencies as NFREQ. When talking about data arrays, for example array [CELLS, NFREQ] corresponds to a storage order where the index over frequencies runs faster.

  • absorbed filename: gives the name for the file that will contain the number of absorbed photons in each cell and frequency
  • ali: turns on accelerated lambda iterations (only for dust emission, in calculations with multiple iterations, not for stochastically heated grains)
  • background filename [k]: specifies the isotropic background where the intensities are read from file filename, optionally scaled by a factor of k
  • bgpackets #: specifies the number of photon packages to be simulated (per iteration and per frequency) from the background (see background and hpbg)
  • cellpackets #: specifies the number of photon packages to be simulated (per iteration and per frequency) from dust within the model volume
  • CLE: telss SOC to calculate dust emission on device instead of in the host Python code (much faster; only for equilibrium dust)
  • CLT: tells SOC to calculate dust temperatures on device instead of in the host Python code (much faster; only for equilibrium dust)
  • cloud filename: gives the name of the cloud file (containing densities and information of the grid hierarchy)
  • colden filename: asks for column density maps (corresponding to the actual maps being calculated) to be saved to file named based on filename
  • DEFS: to ease experimentation with different kernels, this can be used to add definitions that are passed to the kernel at compile time; this is a string that should follow the format “-D definition” (or any strings understood by the compiler)
  • density #: specifies additional scaling for the density values that are read from the cloud file (default is 1.0, no rescaling)
  • device string: specifies the device (in some versions devices) to be used; c stands for CPU (the default) and g for GPU
  • diffpack #: gives the number of photon packages (per frequency) that are used to simulate emission from the diffuse emission sources specified with diffuse; this is separate from the dust emission from the medium
  • diffuse filename [#]: specifies a file with additional sources of diffuse radiation (photons/cm3/Hz); this is a plain binary file containing CELLS and NFREQ (two int32 numbers) followed by CELLS float32 numbers for the actual emission; see keyword diffpack; note: this diffuse emission is completely separate from the emission that is calculated for the actual dust medium
  • directions theta phi: specifies the direction towards the observer that is used when calculating the the output maps; theta is the angle from positive Z axis and phi the rotation around the +Z axis, the angle increasing from +X towards +Y (thus “directions 90.0 90.0” means that the observer is viewing the model from the direction of the positive Y axis)
  • dsc filename: specifies the name for the scattering function file; there should be either a single line with dsc or as many instances as there are dust components
  • emitted filename: specifies the name for a file where the emissions (number of emitted photons per cell and frequency) are stored
  • emweight #: for numbers #>0, generates photon packages from the dust medium according to a weighting scheme (currently options enweight 1 and emweight 2 exist)
  • EXEC: executes the remainder of the line in a shell at the time when the ini-file is read. Because of obvious security concerns, this is by default commented out in the code (see the file pySOCAux.py)
  • gridlength #: the size of a root grid cell in parsecs
  • hpbg filename # #: specifies a Healpix file that is used to describe the surface brightness of the background sky (instead of or in addition to (check!) an isotropic background); the first optional numerical parameter is a scaling factor applied to the intensity values in the file (default=1) and the second optional parameter, if >0, tells SOC to use weighted sampling when generating photon packages from the background
  • iterations #: number of iterations (when dust emission is simulated and the self-coupling between this emission and dust temperature is significant)
  • loadtemp filename: in case of equilibrium dust (and only a single dust population), recalculates emission based on dust temperatures in the given file; this is used mainly to save disk space, storing CELLS temperature values instead of directly the CELLS×NFREQ emission values
  • local #: overrides the default size of OpenCL work groups; the defaults are 8 for CPU and 32 for GPU
  • mapping nx ny step: specifies the number of pixels and the pixel size of the output maps; nx and ny are the map dimensions and step the pixel size in units of the root grid cells (=in units of gridlength)
  • noabsorbed: in dust emission calculations with equilibrium-temperature dust (not stochastically heated dust), the integral of absorbed energy in each cell can be calculated on the fly, without saving the per-frequency values to a file – by using this keyword, one can skip the writing of the large (CELLS×NFREQ×4 bytes) file of absorptions
  • nomap: skip the writing of intensity and polarisation maps
  • nosolve: skip the solving of dust temperatures and dust emission inside SOC (even when calculating a single dust species with grains in equilibrium temperature; used for example when SOC should write spectrum files based on emission that was already calculated in a previous run or with an external program)
  • optical filename [abundance-file]: this specifies the name of the file containing the dust optical parameters (see Dust Files below); the second optional parameters gives the name of a binary file of [CELLS] float32 numbers that specifies the relative abundance of the dust in every cell (the default would be 1.0 for every cell)
  • optishalf: when calculations include multiple dust components with variable abundances, we need to pass to the kernel large arrays that contain absorption and scattering cross section separately for each cell – with this keyword these are passed as half-precision floating point numbers (16 bits) allowing larger models to be calculated (especially on GPUs) but with some loss of accuracy (saving 4×CELLS bytes of device memory)
  • platform #: select a specific OpenCL platform; by default SOC uses the first available OpenCL platform, based on the CPU vs. GPU selection made with the keyword device
  • p0 #: if keyword polred is not used, specifies the intrinsic polarisation fraction (= polarisation fraction for uniform plane-of-the-sky magnetic field); note that the keyword is letter “p” followed by zero (0, not o)
  • pointsource x y z filename [scale]: adds a point-like radiation source to the model; the position is given by (x,y,z) in units of the root grid, filename refers to the luminosity file (see section Radiation Field below), and scale is an optional scaling (a floating point number) for the source luminosity; there can be many lines with this keyword
  • polmap Bx By Bz: asks SOC to write out (I, Q, U) maps using the three data files given as the arguments; these are hierarchical files with the same structure as the density file and contain the components of the magnetic field vectors for each cell
  • polred filename: file of polarisation reduction factors R for each cell; the format is a plain binary file with the number of cells (int32) followed by CELLS floating point numbers (float32); the input file format could be changed in the code, in pySOCR.py (look for USER.file_polred) also to a hierarchical file with the same format as the density file
  • psmethod #: alternative schemes for the simulation of point sources when they are located outside the model volume; the default is to simulate photon packages isotropically; method 1 sends packages only in the 2 pi solid angle that contains the cloud (a small improvement in SNR); method 2 sends photon packages only towards visible cloud surface elements but the weighting between visible cloud sides is not quite optimal; method 5 sends photon packages towards a cone that is aligned with one of the coordinate axes and contains the cloud (less optimal if the point source position is far from the cloud centre when both are projected onto the plane perpendicular to the centre direction of the cone); currently psmethod 2 is still the recommended one, especially if the point sources is far outside the model volume
  • pspackets #: specifies the number of photon packages (per frequency and iteration) that are used to simulate emission from point sources
  • prefix <string>: specifies the prefix attached to the names of some output files
  • reference: turns on the use of a reference field (only for dust emission with multiple iterations, in practice not for stochastically heated grains)
  • remit min_um max_um: limits the produced emission file (see keyword emitted) to contain data only for the given wavelength range, specified with lower and upper limits in micrometers; used mainly to reduce the size of the output files when surface brightness data are needed only for a limited set of wavelengths (note: in case of stochastically heated grains, the helper script A2E_MABU.py will also read this option)
  • saveint #: saves intensity values for all cells; if parameter # is equal to 1, SOC saves to ISRF.DAT intensities estimated from photons entering a cell (two int32 numbers CELLS, NFREQ followed by float32 array [CELLS, NFREQ]); with value 2 it saves to the file also the net vector sum using additional three vector components (file begins with three int32 numbers CELLS, NFREQ, 4 that are followed by the float32 array [CELLS, NFREQ, 4]); with parameter value 3 SOC saves intensities that are calculated based on the absorptions within the cell (effective value for the intensity over the cell volume, not the average intensity on the cell boundary), the file starts with two int32 numbers (CELLS, NFREQ) followed by the float32 array [NFREQ, CELLS] of intensity values in cgs units
  • seed #: specifies a floating point seed value for the random number generation; if the provided value is negative, the seed is taken from the clock
  • simum # #: specifies in units of µm the wavelength range used in the simulations (can be used for example to skip long wavelengths that do not contribute to heating); default range is 0.003µm-3m
  • singleabu: in the special case of calculation having two dust components for which the sum of relative abundances is one, this advises SOC to store internally only an array for the abundances of the first dust component, the second having implicitly abundances one minus the abundance of the first dust (saving 4×CELLS bytes of host memory)
  • temperature filename: in case of calculations with a single equilibrium temperature dust, specifies a file where the dust temperatures of all cells are stored (the hierarchical file format is the same as for the cloud density file)
  • wavelengths # #: specifies on optional smaller wavelength range for the output spectrum files (resulting in smaller files, if spectra are needed only for a smaller wavelength range); the arguments are the minimum and maximum wavelength in units of µm


Cloud file

We describe here the format for the density file that uses a hierarchical grid. This is a binary file where all values are either 4 byte integers or 4 byte floats. The file begins with five integers: [Nx, Ny, Nz, levels, cells]. Here Nx-Nz are the dimensions of the cartesian root grid, levels is the number of hierarchy levels and cells the total number of cells in the model. This is followed, for each level in the grid hierarchy, by the number of cells on the current level and the vector of density values for the cells on that level. The file thus in a sense consists of separate density vectors, one for each hierarchy level of the disretisation. We follow the C-ordering where the index of the X-axis is running fastest.

If the whole includes only the root grid, the file thus starts with [Nx, Ny, Nz, 1, NX×Ny×Nz, Nx×Ny×Nz], where 1 says that there is only one level. In this case the number Nx×Ny×Nz appears twice, first as the total number of cells, then as the number of cells on the first level. These six integers are followed by Nx×Ny×Nz floating point numbers. Note that if you write the density cube from a 3D python array, the last index is running fastest so that for SOC the Python array element n[k,j,i] would correspond to coordinates x=i, y=j, and z=k.

If the cloud contains multiple hierarchy levels, the density value of a refined cell should be replaced with a link to its first subcell that is part of the density vector for the next hierarchy level. If the first subcell is the k:th cell in the vector containing all cells of the next level, then the density value of the parent cell is replaced by the bit representation of the integer number –k. The fact that it is negative tells SOC that it cannot be a density value and it is thus interpreted this way to find the child cells. Note that in bit representation the value is still integer, not a floating point value like the rest of the elements in the density vector. Thus, for example when working with Python, you cannot simply take a negative value m from the density vector of floats, multiply it by -1 and assume that int(-m) is the index of the child cell. You have to convert it to integer bit representation using Python ctypes library, something like ctypes.c_int.from_buffer(ctypes.c_float(m)).value. The converse transformation from integer to floating point number would be ctypes.c_float.from_buffer(ctypes.c_int(i)).value.

The density values are typically density of hydrogen atoms per cm³ or density of hydrogen molecules per cm³. The correct unit is determined by the dust model, relative to which density the dust opacities are defined.

Dust files

SOC concentrates on the actual radiative transfer (RT) simulations and was written to be dust model agnostic. Many things that are later needed to compute dust temperatures are not needed during the RT phase, even when model includes several possibly stochastically heated dust populations. Therefore the input file for dust properties can be quite simple and is made only slightly more complex by being backwards compatible with the format used for the CRT program.

Here is an example of the beginning of the dust file that in the SOC ini-file is specified using the optical keyword.

eqdust
1.00000e-07
1.00000e-04
184
2.99792e+09 0.00000 6.22363e-15 2.97781e-27
3.67645e+09 0.00000 7.83909e-15 6.73353e-27
4.50854e+09 0.00000 9.87451e-15 1.51629e-26
5.52896e+09 0.00000 1.24392e-14 3.40209e-26
6.78033e+09 0.00000 1.56729e-14 7.67112e-26
8.31493e+09 0.00000 1.97494e-14 1.73108e-25
...

The first line is always eqdust, the second line contains the ratio x between the number of grains and the number of particles, whose number density is given in the cloud file. The third row is the radius a of the grains [cm] and the forth row the number of frequencies in the main table. The main table below the four-line header has four columns: frequency, asymmetry parameter g of the scattering function, absorption efficiency Qabs and scattering efficiency Qsca. The actual cross section for absorption per unit density is thus pi×a²×Qabs×x and similarly the cross section for scattering per unit density is pi×a²×Qsca×x and only the values of these products are used. In other words, the fact that a and x are listed in the header is superfluous because those could as well be set to a value of one and the true cross sections per unit density given directly in the third and fourth column of the main table. Also, for real dust models the cross sections are integrals over size distributions so the dust does not have a single a. For historical reasons, the format is as described above. However, in future we may allow input files that contain just the main table of four columns – or just three columns because also the g parameter is superfluous when the scattering function is in any case specified separately (see below).

In the ini-file keyword dsc is used to specify the file that contains data on the discrete scattering function. The file contains two data tables. The first one is discretised linearly as a function of the cosine of the scattering angle theta (corresponding to linspace(-1.0,+1.0, bins) where bins is the number of bins). For each value of cos(theta), one lists the scattering probability per unit solid angle (not per unit step in theta or in cosine of theta!). This is repeated for every frequency used in the simulation. The second half of the file includes the cumulative probability functions of cos(theta) that provide a mapping from values [0,1] to [-1,+1]. SOC uses this directly to generate random scattering directions. This is also simply a vector of bins elements that implicitly correspond to linear discretisation of the input range [0,1] and that is repeated for every frequency included in the simulation. All in all, the scattering function file consists of bins×NFREQ+bins×NFREQ floats of 4 bytes each.

When runs include several dust species with varying abundances, SOC must still have information of the dust opacities and scattering properties separately for each dust component. In that case the ini-file should just have one line with optical and one line with dsc keyword per each dust component (of course also specifying dust-specific abundances). If one has many dust components but their relative abundances do not vary over the model volume, it is better (=faster) to combine them into a single dust species with a single scattering function.

Radiation field

Radiation field can be produced by isotropic and non-isotropic background sources and point-like radiation sources that reside inside or outside the actual model volume.

Specification of the intensity of an isotropic background component requires a binary file that lists the intensity in cgs units for each frequency (in the order of increasing frequency) included in the simulation. It is thus a plain binary file with NFREQ 4 byte floating point numbers.

Anisotropic background is specified using a Healpix-type binary file. At the moment the resolution of the background map is hardcoded to correspond to healpix maps with size NSIDE=64 (some 49 152 pixels over the sky, each pixel ~55 arcmin in size) in the RING order. The file simply lists the pixel values (4 byte floats) for 49152 pixels on the first frequency, followed by similar vectors for the remaining frequencies in the order of increasing frequency. The total size of the file is thus always NFREQ×49152×4 bytes. For example, if the simulation uses 100 frequencies, the healpix file for the background intensity is 18.8 MB in size. The background file can be prepared in Python for example using the healpy library.

For point sources one of course specifies luminosity (in cgs units, erg/s/Hz) instead of intensity. The file is a plain binary file with NFREQ 4 byte floats, one number per simulated frequency, in the order of increasing frequency.