# Source code for turn_by_turn.io

"""
IO
--

This module contains high-level I/O functions to read and write turn-by-turn data objects to and from different formats.

Reading Data
============
Since version ``0.9.0`` of the package, data can be loaded either from file or from in-memory structures exclusive to certain codes (e.g. tracking simulation results from *MAD-NG* or *xtrack*).
Two different APIs are provided for these use cases.

1. **To read from file**, use the ``read_tbt`` function (exported as ``read`` at the package's level). The file format is detected or specified by the ``datatype`` parameter.
2. **To load in-memory data**, use the ``convert_to_tbt`` function (exported as ``convert`` at the package's level). This applies to tracking simulation results from e.g. *xtrack*, or to results sent back by *MAD-NG*.

In both cases, the returned value is a structured ``TbtData`` object.
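The split between the two entry points can be sketched as a small dispatcher. The stub functions below are hypothetical stand-ins for the real readers and converters, used only to illustrate which API handles which kind of input:

```python
from pathlib import Path

# Hypothetical stand-ins for the real module functions; the real API
# returns TbtData objects, these return descriptive strings instead.
def _read_from_file(path: Path, datatype: str) -> str:
    return f"TbtData(read from {path.name} as {datatype})"

def _convert_in_memory(obj: object, datatype: str) -> str:
    return f"TbtData(converted {type(obj).__name__} via {datatype})"

def load(source, datatype: str = "lhc") -> str:
    """Route file paths to the read API and in-memory objects to the convert API."""
    if isinstance(source, (str, Path)):
        return _read_from_file(Path(source), datatype)
    return _convert_in_memory(source, datatype)

print(load("measurement.sdds"))               # a path goes through the file reader
print(load({"x": [0.1]}, datatype="madng"))   # an in-memory object goes through the converter
```

In the actual package the two routes are kept as separate functions (``read`` / ``convert``) rather than a single dispatcher, which keeps the signatures explicit.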

Writing Data
============
The single entry point for writing to disk is the ``write_tbt`` function (exported as ``write`` at the package's level). This writes a ``TbtData`` object to disk, typically in the LHC SDDS format (by default). The output file extension and format are determined by the ``datatype`` argument.

The following cases arise:

- If ``datatype`` is set to ``lhc``, ``sps`` or ``ascii``, the output will be in SDDS format and the file extension will be set to ``.sdds`` if not already present.
- If ``datatype`` is set to ``madng``, the output will be in a TFS file (extension ``.tfs`` is recommended).
- Other supported datatypes (see ``WRITERS``) will use their respective formats and conventions if implemented.

The ``datatype`` parameter controls both the output format and any additional options passed to the underlying writer.
Should the ``noise`` parameter be used, random noise will be added to the data before writing. A ``seed`` can be provided for reproducibility.

Example::

    from turn_by_turn import write
    write("output.sdds", tbt_data)  # writes in SDDS format by default
    write("output.tfs", tbt_data, datatype="madng")  # writes a TFS file in MAD-NG's tracking results format
    write("output.sdds", tbt_data, noise=0.01, seed=42)  # reproducibly adds noise before writing

While data can be loaded from the formats of different machines/codes (each through its own reader module), writing currently defaults to the ``LHC``'s **SDDS** format unless another supported format is specified. The interface is designed to be future-proof and easy to extend to new formats.


Supported Modules and Limitations
=================================

The following table summarizes which modules support disk reading and in-memory conversion, and any important limitations:

+----------------+---------------------+-----------------------+----------------------------------------------------------+
| Module         | Disk Reading        | In-Memory Conversion  | Notes / Limitations                                      |
+================+=====================+=======================+==========================================================+
| lhc            | Yes (SDDS, ASCII)   | No                    | Reads LHC SDDS and legacy ASCII files.                   |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| sps            | Yes (SDDS, ASCII)   | No                    | Reads SPS SDDS and legacy ASCII files.                   |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| doros          | Yes (HDF5)          | No                    | Reads DOROS HDF5 files.                                  |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| madng          | Yes (TFS)           | Yes                   | In-memory: only via pandas/tfs DataFrame.                |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| xtrack         | No                  | Yes                   | Only in-memory via xtrack.Line.                          |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| ptc            | Yes (trackone)      | No                    | Reads MAD-X PTC trackone files.                          |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| esrf           | Yes (Matlab .mat)   | No                    | Experimental/untested.                                   |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| iota           | Yes (HDF5)          | No                    | Reads IOTA HDF5 files.                                   |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| ascii          | Yes (legacy ASCII)  | No                    | For legacy ASCII files only.                             |
+----------------+---------------------+-----------------------+----------------------------------------------------------+
| trackone       | Yes (MAD-X)         | No                    | Reads MAD-X trackone files.                              |
+----------------+---------------------+-----------------------+----------------------------------------------------------+

- Only ``madng`` and ``xtrack`` support in-memory conversion.
- Most modules are for disk reading only.
- Some modules (e.g., ``esrf``) are experimental or have limited support.
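The support matrix above boils down to a simple validity check before dispatching to a module. The sketch below mirrors the keys of the module's ``TBT_MODULES`` and ``TBT_CONVERTERS`` tables with plain sets, and uses ``ValueError`` as a stand-in for the package's ``DataTypeError``:

```python
# Keys mirror the support table above; a stand-in for the real lookup tables.
READERS = {"lhc", "sps", "doros", "madng", "ptc", "esrf", "iota", "ascii", "trackone"}
CONVERTERS = {"madng", "xtrack"}

def check_datatype(datatype: str, in_memory: bool = False) -> str:
    """Validate a datatype against the supported readers or converters."""
    supported = CONVERTERS if in_memory else READERS
    key = datatype.lower()  # the real API is case-insensitive as well
    if key not in supported:
        raise ValueError(
            f"Unsupported datatype '{datatype}', expected one of {sorted(supported)}"
        )
    return key

print(check_datatype("MADNG", in_memory=True))  # accepted: madng supports conversion
```

Asking for an in-memory conversion with a disk-only datatype (e.g. ``lhc``) raises immediately, which matches the behaviour documented in the table.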

API
===
"""

from __future__ import annotations

import logging
from pathlib import Path
from typing import TYPE_CHECKING, Any

from turn_by_turn import (
    ascii,  # noqa: A004
    doros,
    esrf,
    iota,
    lhc,
    madng,
    ptc,
    sps,
    trackone,
    xtrack_line,
)
from turn_by_turn.ascii import write_ascii
from turn_by_turn.errors import DataTypeError
from turn_by_turn.utils import add_noise_to_tbt

LOGGER = logging.getLogger(__name__)

if TYPE_CHECKING:
    from pandas import DataFrame
    from xtrack import Line

    from turn_by_turn.structures import TbtData

TBT_MODULES = {
    "lhc": lhc,
    "doros": doros,
    "doros_positions": doros,
    "doros_oscillations": doros,
    "sps": sps,
    "iota": iota,
    "esrf": esrf,
    "ptc": ptc,
    "trackone": trackone,
    "ascii": ascii,
    "madng": madng,
    "xtrack": xtrack_line,
}

# Modules supporting in-memory conversion to TbtData (not file readers)
TBT_CONVERTERS = ("madng", "xtrack")

# implemented writers
WRITERS = (
    "lhc",
    "sps",
    "doros",
    "doros_positions",
    "doros_oscillations",
    "ascii",
    "madng",
)

write_lhc_ascii = write_ascii  # Backwards compatibility <0.4


def read_tbt(file_path: str | Path, datatype: str = "lhc") -> TbtData:
    """
    Calls the appropriate loader for the provided matrices type and returns a ``TbtData``
    object of the loaded matrices.

    Args:
        file_path (str | Path): path to a file containing TbtData.
        datatype (str): type of matrices in the file, determines the reader to use.
            Case-insensitive, defaults to ``lhc``.

    Returns:
        A ``TbtData`` object with the loaded matrices.
    """
    file_path = Path(file_path)
    LOGGER.info(f"Loading turn-by-turn matrices from '{file_path}'")
    try:
        module = TBT_MODULES[datatype.lower()]
    except KeyError as error:
        LOGGER.exception(
            f"Unsupported datatype '{datatype}' was provided, should be one of {list(TBT_MODULES.keys())}"
        )
        raise DataTypeError(datatype) from error
    else:
        return module.read_tbt(file_path, **additional_args(datatype))

# Note: I don't specify tfs.TfsDataFrame as this inherits from pandas.DataFrame
def convert_to_tbt(file_data: DataFrame | Line, datatype: str = "xtrack") -> TbtData:
    """
    Convert a pandas or tfs DataFrame (MAD-NG) or a Line (xtrack) to a ``TbtData`` object.

    Args:
        file_data (DataFrame | Line): the data to convert.
        datatype (str): the type of the data, either ``xtrack`` or ``madng``.
            Case-insensitive, defaults to ``xtrack``.

    Returns:
        TbtData: The converted ``TbtData`` object.
    """
    if datatype.lower() not in TBT_CONVERTERS:
        raise DataTypeError(f"Only {', '.join(TBT_CONVERTERS)} converters are implemented for now.")
    module = TBT_MODULES[datatype.lower()]
    return module.convert_to_tbt(file_data)  # no additional arguments needed (doros is read-only from disk)

def write_tbt(
    output_path: str | Path,
    tbt_data: TbtData,
    noise: float | None = None,
    seed: int | None = None,
    datatype: str = "lhc",
) -> None:
    """
    Write a ``TbtData`` object's data to file, by default in the ``LHC``'s **SDDS** format.

    Args:
        output_path (str | Path): path to the disk location where to write the data.
        tbt_data (TbtData): the ``TbtData`` object to write to disk.
        noise (float): optional noise to add to the data.
        seed (int): a given seed to initialise the RNG if one chooses to add noise. This is
            useful to ensure the exact same RNG state across operations. Defaults to ``None``,
            which means any new RNG operation in noise addition will pull fresh entropy from
            the OS.
        datatype (str): type of matrices in the file, determines the writer to use.
            Case-insensitive, defaults to ``lhc``.
    """
    output_path = Path(output_path)
    if datatype.lower() not in WRITERS:
        raise DataTypeError(f"Only {', '.join(WRITERS)} writers are implemented for now.")

    if datatype.lower() in ("lhc", "sps", "ascii") and output_path.suffix != ".sdds":
        # I would like to remove this, but I'm afraid of compatibility issues with omc3 (jdilly, 2024)
        output_path = output_path.with_name(f"{output_path.name}.sdds")

    # The datatype was validated against WRITERS above, and every writer key is also a key
    # of TBT_MODULES, so this lookup cannot fail.
    module = TBT_MODULES[datatype.lower()]
    if noise is not None:
        tbt_data = add_noise_to_tbt(tbt_data, noise=noise, seed=seed)
    return module.write_tbt(output_path, tbt_data, **additional_args(datatype))

def additional_args(datatype: str) -> dict[str, Any]:
    """
    Additional parameters to be passed to the reader/writer function.

    Args:
        datatype (str): type of the data.
    """
    if datatype.lower() == "doros_oscillations":
        return {"data_type": doros.DataKeys.OSCILLATIONS}
    if datatype.lower() == "doros_positions":
        return {"data_type": doros.DataKeys.POSITIONS}
    return {}