casa-formats-io documentation
Scope
The casa-formats-io package is a small package which implements functionality to read data stored in CASA formats (such as .image datasets). This implementation is independent of and does not use casacore. The motivation for this package is to provide:
Efficient data access via dask arrays
Cross-platform data access, supporting Linux, macOS and Windows
Data access with all modern Python versions, from 3.8 to the latest Python version
At this time (November 2020), only reading .image datasets is supported. Reading measurement sets (.ms) or writing data of any kind are not yet supported.
casa-formats-io supports Python versions >=3.8.
Using casa-formats-io
To construct a dask array backed by a .image dataset, use the image_to_dask() function:
>>> from casa_formats_io.casa_dask import image_to_dask
>>> dask_array = image_to_dask('my_dataset.image/')
>>> dask_array
dask.array<CASA Data 6bd6f684-0d21-4614-b953, shape=(2114, 1, 2450, 2450), dtype=float32, chunksize=(14, 1, 350, 2450), chunktype=numpy.ndarray>
Note that rather than using the native CASA chunk size as the dask chunk size, which is extremely inefficient for large datasets (for which there may be a million CASA chunks or more), the casa_formats_io.image_to_dask() function automatically joins neighbouring chunks together on the fly, which provides significantly better performance.
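To give a feel for why joining chunks matters, the following minimal sketch counts how many chunks are needed to tile the array from the example output above. The shape and dask chunk size are taken from that output; the native CASA chunk shape is a hypothetical value chosen purely for illustration, and the helper name count_chunks is not part of casa-formats-io. The sketch only compares chunk counts and says nothing about how the joining is implemented.

```python
import math


def count_chunks(shape, chunk_shape):
    """Return the number of chunks needed to tile `shape` with `chunk_shape`."""
    # -(-n // c) is integer ceiling division, so partial edge chunks are counted.
    return math.prod(-(-n // c) for n, c in zip(shape, chunk_shape))


# Shape and dask chunk size as shown in the example repr above.
shape = (2114, 1, 2450, 2450)
dask_chunks = (14, 1, 350, 2450)

# Hypothetical native CASA chunk shape (an assumption, not read from a dataset).
native_chunks = (2, 1, 50, 50)

n_native = count_chunks(shape, native_chunks)
n_dask = count_chunks(shape, dask_chunks)
print(f"native chunks: {n_native}, dask chunks: {n_dask}")
```

With the assumed native chunk shape, the task graph would contain millions of entries, whereas the joined chunks keep it to roughly a thousand.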
In addition to image_to_dask(), this package implements getdesc() and getdminfo(), which aim to return the same results as CASA’s getdesc and getdminfo respectively.
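Both functions return dictionaries of metadata, which may be nested. As a minimal sketch of one way to get an overview of such a dictionary, the helper below lists the dotted path of every leaf entry. The helper name key_paths and the sample dictionary are hypothetical; the sample is a made-up stand-in and not real getdesc() or getdminfo() output.

```python
def key_paths(mapping, prefix=""):
    """Yield a dotted path for every leaf entry of a nested dictionary."""
    for key, value in mapping.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the path.
            yield from key_paths(value, path)
        else:
            yield path


# Made-up stand-in for a metadata dictionary (not real getdesc() output).
example = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
paths = list(key_paths(example))
print(paths)
```

In practice the same helper could be applied to the result of getdesc() or getdminfo(), assuming they are called with the path to the dataset.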
Finally, this package provides coordsys_to_astropy_wcs(), which can be used to convert CASA WCS information to astropy WCS objects.
Table reader (experimental)
This package includes an experimental generic table reader which integrates with the astropy Table class. To use it, first import the casa_formats_io module, which registers the reader, then use the Table.read method:
>>> import casa_formats_io
>>> from astropy.table import Table
>>> table = Table.read('my_dataset.ms')
If the table contains a DATA_DESC_ID column, which is the case for e.g. measurement sets, you will also need to pass the data_desc_id= argument to Table.read with a valid integer DATA_DESC_ID value:
>>> table_3 = Table.read('my_multims.ms', data_desc_id=3)
Reference/API
casa_formats_io Package
Functions
coordsys_to_astropy_wcs() | Convert a casac.coordsys object into an
getdesc() | Return the same output as CASA's getdesc() function, namely a dictionary with metadata about the .image file, parsed from the
getdminfo() | Return the same output as CASA's getdminfo() function, namely a dictionary with metadata about the .image file, parsed from the
image_to_dask() | Read a CASA image (a folder containing a