casa-formats-io documentation

Scope

casa-formats-io is a small package for reading data stored in CASA formats (such as .image datasets). The implementation is independent of casacore and does not use it. The motivation for this package is to provide:

  • Efficient data access via dask arrays

  • Cross-platform data access, supporting Linux, MacOS X and Windows

  • Data access with all modern Python versions, from 3.6 to the latest Python version

At this time (November 2020), only reading .image datasets is supported. Reading measurement sets (.ms) and writing data of any kind are not yet supported.

Using casa-formats-io

To construct a dask array backed by a .image dataset, use the image_to_dask() function:

>>> from casa_formats_io.casa_dask import image_to_dask
>>> dask_array = image_to_dask('my_dataset.image/')
>>> dask_array
dask.array<CASA Data 6bd6f684-0d21-4614-b953, shape=(2114, 1, 2450, 2450), dtype=float32, chunksize=(14, 1, 350, 2450), chunktype=numpy.ndarray>

Note that image_to_dask() does not use the native CASA chunk size as the dask chunk size, since this would be extremely inefficient for large datasets (which may contain a million CASA chunks or more). Instead, it automatically joins neighbouring CASA chunks together on the fly, which gives significantly better performance.
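To give a feel for the numbers involved, the following sketch counts how many chunks are needed to tile the array from the example output above. The shape and dask chunk size are taken from that output; the native CASA tile shape is a hypothetical value chosen only for illustration, and count_chunks is a helper written for this example, not part of the package.

```python
import math


def count_chunks(shape, chunk_shape):
    """Return the number of chunks needed to tile `shape` with `chunk_shape` blocks."""
    total = 1
    for n, c in zip(shape, chunk_shape):
        # Partial chunks at the edge of an axis still count as one chunk.
        total *= math.ceil(n / c)
    return total


# Shape and dask chunk size from the example output above.
shape = (2114, 1, 2450, 2450)
dask_chunks = (14, 1, 350, 2450)

# Hypothetical native CASA tile shape (an assumption, for illustration only).
native_tiles = (2, 1, 50, 50)

n_native = count_chunks(shape, native_tiles)
n_dask = count_chunks(shape, dask_chunks)

print(n_native)            # 2537857
print(n_dask)              # 1057
print(n_native // n_dask)  # 2401
```

Under this assumed tile shape, each dask chunk covers 2401 native tiles, so the dask task graph has about a thousand chunks to schedule rather than a few million.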

In addition to image_to_dask(), this package implements getdesc() and getdminfo() which aim to return the same results as CASA’s getdesc and getdminfo respectively.
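Both functions return dictionaries, which can be deeply nested. A small helper such as the one sketched below can be handy for getting an overview of the keys. The summarize function and the stand-in dictionary are hypothetical and written for this example; the keys shown are made up and do not reflect the actual getdesc() or getdminfo() output.

```python
def summarize(mapping, indent=0, max_depth=3):
    """Return lines listing the keys of a nested dict and the type of each leaf."""
    lines = []
    for key, value in mapping.items():
        prefix = "  " * indent + str(key)
        if isinstance(value, dict) and indent + 1 < max_depth:
            # Descend into nested dictionaries until max_depth is reached.
            lines.append(prefix + ":")
            lines.extend(summarize(value, indent + 1, max_depth))
        else:
            lines.append(prefix + " (" + type(value).__name__ + ")")
    return lines


# Stand-in dictionary with made-up keys. With the package installed, one
# would instead pass the result of, for example:
#     from casa_formats_io import getdminfo
#     getdminfo('my_dataset.image')
example = {"outer": {"shape": [4, 4], "name": "demo"}, "count": 1}

print("\n".join(summarize(example)))
```

The max_depth argument limits how far the helper descends, which keeps the listing short for large metadata dictionaries.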

Finally, this package provides coordsys_to_astropy_wcs(), which can be used to convert CASA WCS information to astropy WCS objects.

Table reader (experimental)

This package includes an experimental generic table reader which integrates with the astropy Table class. To use it, first import the casa_formats_io module, which registers the reader, then use the Table.read method:

>>> import casa_formats_io
>>> from astropy.table import Table
>>> table = Table.read('my_dataset.ms')

If the table contains a DATA_DESC_ID column, which is the case for e.g. measurement sets, you will also need to pass the data_desc_id= argument to Table.read with a valid integer DATA_DESC_ID value:

>>> table_3 = Table.read('my_multims.ms', data_desc_id=3)
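When several DATA_DESC_ID values are of interest, the call above can be repeated in a loop. The following is a minimal sketch: read_by_data_desc_id is a hypothetical helper written for this example, and the set of valid DATA_DESC_ID values depends on the dataset and has to be supplied by the caller. The reader argument exists only so that the helper can be tried without a dataset; by default it falls back to Table.read as used above.

```python
def read_by_data_desc_id(filename, data_desc_ids, reader=None):
    """Read one table per DATA_DESC_ID value and return them keyed by that value."""
    if reader is None:
        # Imported lazily so the helper can be defined without the packages installed.
        import casa_formats_io  # noqa: F401  (importing registers the reader)
        from astropy.table import Table
        reader = Table.read
    return {i: reader(filename, data_desc_id=i) for i in data_desc_ids}


# Example call, assuming 0 and 3 are valid DATA_DESC_ID values for this file:
#     tables = read_by_data_desc_id('my_multims.ms', [0, 3])
#     table_3 = tables[3]
```

Returning a dictionary keeps each table associated with the DATA_DESC_ID value it was read with.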

Reference/API

casa_formats_io Package

Functions

coordsys_to_astropy_wcs(coordsys)

Convert a casac.coordsys object into an astropy WCS object.

getdesc(filename[, endian])

Return the same output as CASA's getdesc() function, namely a dictionary with metadata about the .image file, parsed from the table.dat file.

getdminfo(filename[, endian])

Return the same output as CASA's getdminfo() function, namely a dictionary with metadata about the .image file, parsed from the table.f0 file.

image_to_dask(imagename[, memmap, mask, ...])

Read a CASA image (a folder containing a table.f0_TSM0 file) into a dask array.
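Since a CASA image is described here as a folder containing a table.f0_TSM0 file, a quick check before calling image_to_dask() can be sketched as follows. The looks_like_casa_image helper is hypothetical and is not part of the package; it only tests for the presence of that file and does not validate the contents.

```python
from pathlib import Path


def looks_like_casa_image(path):
    """Return True if `path` is a directory that contains a table.f0_TSM0 file."""
    folder = Path(path)
    return folder.is_dir() and (folder / "table.f0_TSM0").is_file()


# Example use, assuming 'my_dataset.image' is the dataset of interest:
#     if looks_like_casa_image('my_dataset.image'):
#         dask_array = image_to_dask('my_dataset.image')
```

This is only a heuristic based on the description above; a folder that passes the check may still fail to load if other files are missing or damaged.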