API Reference
Collection
Advanced reading and writing functionality for TFS files.
- class tfs.collection.Tfs(*args, **kwargs)[source]
Class to mark attributes as TFS attributes.
Any parameter given to this class will be passed to the _get_filename() method, together with the plane if two_planes=False is not present.
- class tfs.collection.TfsCollection(directory: Path, allow_write: bool | None = None)[source]
Abstract class to lazily load and write TFS files.
Classes inheriting from this abstract class will be able to define TFS files as readable or writable, and read or write them just as attribute access or assignments. All attributes will be read and written as TfsDataFrame objects.
Example
If ./example is a directory that contains two TFS files beta_phase_x.tfs and beta_phase_y.tfs with BETX and BETY columns respectively:
>>> # All TFS attributes must be marked with the Tfs(...) class,
... # and generated attribute names will be appended with _x / _y
... # depending on files found in "./example"
... class ExampleCollection(TfsCollection):
...     beta = Tfs("beta_phase_{}.tfs")  # A TFS attribute
...     other_value = 7  # A traditional attribute.
...     def get_filename(template: str, plane: str) -> str:
...         return template.format(plane)
>>> example = ExampleCollection("./example")
>>> # Get the BETX / BETY column from "beta_phase_x.tfs":
>>> beta_x_column = example.beta_x.BETX  # / example.beta_x.BETY
>>> # Get the BETY column from "beta_phase_y.tfs":
>>> beta_y_column = example.beta_y.BETY
>>> # The planes can also be accessed as items (both examples below work):
>>> beta_y_column = example.beta["y"].BETY
>>> beta_y_column = example.beta["Y"].BETY
>>> # This will write an empty DataFrame to "beta_phase_y.tfs":
>>> example.allow_write = True
>>> example.beta["y"] = DataFrame()
If the file to be loaded is not defined for two planes then the attribute can be declared and accessed as:
>>> coupling = Tfs("getcouple.tfs", two_planes=False)  # declaration
>>> f1001w_column = example.coupling.F1001W  # access
No file will be loaded until the corresponding attribute is accessed, and the loaded TfsDataFrame will be buffered. The user should thus expect an IOError if the requested file is not in the provided directory (only the first time, but it is better to always take it into account!).
When a TfsDataFrame is assigned to one attribute, it will be set as the buffer value. If the self.allow_write attribute is set to True, an assignment on one of the attributes will trigger the corresponding file write.
- clear()[source]
Clear the file buffer.
Any subsequent attribute access will try to load the corresponding file again.
- flush()[source]
Write the current state of the TfsDataFrames into their respective files.
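A minimal sketch, assuming the ExampleCollection defined above and a hypothetical modified_beta_x dataframe:
>>> example = ExampleCollection("./example", allow_write=True)
>>> example.beta_x = modified_beta_x  # buffered (and written, since allow_write is True)
>>> example.flush()                   # re-write the current state of all buffered dataframes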
- get_path(name: str) Path [source]
Return the actual file path of the property name (convenience function).
- Parameters:
name (str) -- Property name of the file.
- Returns:
A pathlib.Path of the actual name of the file in directory. The path to the file is then self.directory / filename.
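A short illustration, assuming the ExampleCollection defined above (the resolved filename depends on the collection's get_filename):
>>> example = ExampleCollection("./example")
>>> example.get_path("beta_x")  # e.g. Path("example/beta_phase_x.tfs")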
- read_tfs(filename: str) TfsDataFrame [source]
Reads the TFS file from self.directory with the given filename.
This function can be overwritten to use something instead of tfs-pandas to load the files. It does not set the TfsDataFrame into the buffer (that is the job of _load_tfs)!
- Parameters:
filename (str) -- The name of the file to load.
- Returns:
A TfsDataFrame built from reading the requested file.
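As a hedged sketch, overriding this hook in a subclass could look as follows; TaggedCollection and the LOADED_BY header key are purely illustrative:
>>> class TaggedCollection(ExampleCollection):
...     def read_tfs(self, filename: str) -> TfsDataFrame:
...         df = super().read_tfs(filename)  # default tfs-pandas read
...         df.headers["LOADED_BY"] = "TaggedCollection"  # illustrative extra header
...         return df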
- write_tfs(filename: str, data_frame: DataFrame)[source]
Write the TFS file to self.directory with the given filename.
This function can be overwritten to use something instead of tfs-pandas to write out the files. It does not check for allow_write and does not set the DataFrame into the buffer (that is the job of _write_tfs)!
- Parameters:
filename (str) -- The name of the file to write.
data_frame (TfsDataFrame) -- TfsDataFrame to write.
Constants
General constants used throughout tfs-pandas, relating to the standard of TFS files.
Errors
Errors that can be raised during the handling of TFS files.
- exception tfs.errors.TfsFormatError[source]
Raised when a wrong format is detected in the TFS file.
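A hedged sketch of handling this error, assuming it is raised while reading a malformed file:
>>> try:
...     df = tfs.read("possibly_malformed.tfs")
... except tfs.errors.TfsFormatError as error:
...     print(f"Could not parse the TFS file: {error}")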
Frame
Contains the class definition of a TfsDataFrame, inherited from the pandas DataFrame, as well as a utility function to validate the correctness of a TfsDataFrame.
- class tfs.frame.TfsDataFrame(*args, **kwargs)[source]
Class to hold the information of the TFS file as an extended pandas DataFrame, together with a way of getting the headers of the TFS file. The file headers are stored in a dictionary upon read. To get a header value use data_frame.headers["header_name"], or data_frame["header_name"] if it does not conflict with a column name in the dataframe.
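A short illustration of header access; the ENERGY header name is only an assumption about the file's contents:
>>> data_frame = tfs.read("filename.tfs")
>>> data_frame.headers["ENERGY"]  # explicit lookup in the headers dictionary
>>> data_frame["ENERGY"]          # shortcut, valid as long as no column is named ENERGY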
- merge(right: TfsDataFrame | DataFrame, how_headers: str | None = None, new_headers: dict | None = None, **kwargs) TfsDataFrame [source]
Merge TfsDataFrame objects with a database-style join. Data manipulation is done by the pandas.DataFrame method of the same name. Resulting headers are either merged according to the provided how_headers method or as given via new_headers.
- Parameters:
right (TfsDataFrame | pd.DataFrame) -- The TfsDataFrame to merge with the caller.
how_headers (str) -- Type of merge to be performed for the headers. Either left or right. Refer to tfs.frame.merge_headers() for behavior. If None is provided and new_headers is not provided, the final headers will be empty. Case-insensitive, defaults to None.
new_headers (dict) -- If provided, will be used as headers for the merged TfsDataFrame. Otherwise these are determined by merging the headers from the caller and the other TfsDataFrame according to the method defined by the how_headers argument.
- Keyword Arguments:
Any keyword argument is given to pandas.DataFrame.merge(). The default values for all these parameters are left as set in the pandas codebase. To see these, refer to the DataFrame.merge documentation (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html).
- Returns:
A new TfsDataFrame with the merged data and merged headers.
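A hedged usage sketch, assuming both files contain a NAME column to join on; further keyword arguments (here on) are forwarded to pandas.DataFrame.merge:
>>> df_a = tfs.read("file_a.tfs")
>>> df_b = tfs.read("file_b.tfs")
>>> merged = df_a.merge(df_b, how_headers="left", on="NAME")
>>> # header keys from df_a win on duplicates; the data is joined on the NAME column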
- tfs.frame.concat(objs: Sequence[TfsDataFrame | pd.DataFrame], how_headers: str | None = None, new_headers: dict | None = None, **kwargs) TfsDataFrame [source]
Concatenate TfsDataFrame objects along a particular axis with optional set logic along the other axes. Data manipulation is done by the pandas.concat function. Resulting headers are either merged according to the provided how_headers method or as given via new_headers.
Warning
Please note that when using this function on many TfsDataFrames, leaving the contents of the final headers dictionary to the automatic merger can become unpredictable. In this case it is recommended to provide the new_headers argument to ensure the final result, or leave both how_headers and new_headers as None (their defaults) to end up with empty headers.
- Parameters:
objs (Sequence[TfsDataFrame | pd.DataFrame]) -- The TfsDataFrame objects to be concatenated.
how_headers (str) -- Type of merge to be performed for the headers. Either left or right. Refer to tfs.frame.merge_headers() for behavior. If None is provided and new_headers is not provided, the final headers will be empty. Case-insensitive, defaults to None.
new_headers (dict) -- If provided, will be used as headers for the merged TfsDataFrame. Otherwise these are determined by successively merging the headers from all concatenated TfsDataFrames according to the method defined by the how_headers argument.
- Keyword Arguments:
Any keyword argument is given to pandas.concat(). The default values for all these parameters are left as set in the pandas codebase. To see these, refer to the pandas.concat documentation (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html).
- Returns:
A new TfsDataFrame with the merged data and merged headers.
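A hedged sketch of concatenating several files, providing new_headers explicitly to keep the final headers predictable (file names are placeholders):
>>> dfs = [tfs.read(name) for name in ("part1.tfs", "part2.tfs", "part3.tfs")]
>>> full = tfs.frame.concat(dfs, new_headers={"TITLE": "Concatenated parts"})
>>> # further keyword arguments, e.g. ignore_index=True, are forwarded to pandas.concat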
- tfs.frame.merge_headers(headers_left: dict, headers_right: dict, how: str) dict [source]
Merge headers of two TfsDataFrames together.
headers_left (dict) -- Headers of the caller (left) TfsDataFrame when calling .append, .join or .merge. Headers of the left (preceding) TfsDataFrame when calling tfs.frame.concat.
headers_right (dict) -- Headers of the other (right) TfsDataFrame when calling .append, .join or .merge. Headers of the right (following) TfsDataFrame when calling tfs.frame.concat.
how (str) -- Type of merge to be performed, either left or right. If left, prioritize keys from headers_left in case of duplicate keys. If right, prioritize keys from headers_right in case of duplicate keys. Case-insensitive. If None is given, an empty dictionary will be returned.
- Returns:
A new dictionary as the merge of the two provided dictionaries.
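A short illustration of the priority rule; the header names are only examples:
>>> left = {"TITLE": "Left frame", "ENERGY": 6500}
>>> right = {"TITLE": "Right frame", "TUNE": 0.31}
>>> merged = tfs.frame.merge_headers(left, right, how="left")
>>> # TITLE keeps the value from left; ENERGY and TUNE are both carried over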
- tfs.frame.validate(data_frame: TfsDataFrame | DataFrame, info_str: str = '', non_unique_behavior: str = 'warn') None [source]
Check if a data frame contains finite values only, strings as column names and no empty headers or column names.
Methodology
This function performs several different checks on the provided dataframe:
Checking no single element is a list or tuple, which is done with a custom vectorized function applied column-by-column on the dataframe.
Checking for non-physical values in the dataframe, which is done by applying the isna function with the right option context.
Checking for duplicates in either indices or columns.
Checking for column names that are not strings.
Checking for column names including spaces.
- Parameters:
data_frame (TfsDataFrame | pd.DataFrame) -- The dataframe to check on.
info_str (str) -- Additional information to include in logging statements.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
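A small usage sketch, for instance re-checking a dataframe that was loaded without validation:
>>> df = tfs.read("filename.tfs", validate=False)
>>> # ... manipulate df ...
>>> tfs.frame.validate(df, info_str="after manipulation", non_unique_behavior="raise")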
HDF5 I/O
Additional tools for reading and writing TfsDataFrames into hdf5 files.
- tfs.hdf.read_hdf(path: Path | str) TfsDataFrame [source]
Read TfsDataFrame from hdf5 file. The DataFrame needs to be stored in a group named data, while the headers are stored in headers.
- Parameters:
path (Path, str) -- Path of the file to read.
- Returns:
A TfsDataFrame object with the loaded data from the file.
- tfs.hdf.write_hdf(path: Path | str, df: TfsDataFrame, **kwargs) None [source]
Write TfsDataFrame to hdf5 file. The dataframe will be written into the group data, the headers into the group headers. Only one dataframe per file is allowed.
- Parameters:
path (Path, str) -- Path of the output file.
df (TfsDataFrame) -- TfsDataFrame to write.
kwargs -- kwargs to be passed to pandas DataFrame.to_hdf(). key is not allowed and mode needs to be w if the output file already exists (w will be used in any case, even if the file does not exist, but only a warning is logged in that case).
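A hedged round-trip sketch; an HDF5 backend (e.g. pytables) is assumed to be installed, and the file names are placeholders:
>>> df = tfs.read("filename.tfs")
>>> tfs.hdf.write_hdf("filename.h5", df)
>>> restored = tfs.hdf.read_hdf("filename.h5")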
Reader
Reading functionality for TFS files.
- tfs.reader.read_headers(tfs_file_path: Path | str) dict [source]
Parses the top of the tfs_file_path and returns the headers.
- Parameters:
tfs_file_path (pathlib.Path | str) -- Path object to the TFS file to read. Can be a string, in which case it will be cast to a Path object.
- Returns:
A dictionary with the headers read from the file.
Examples
>>> headers = read_headers("filename.tfs")
Just as with the read_tfs function, it is possible to load from compressed files if the compression format is supported by pandas. The compression format detection is handled automatically from the extension of the provided tfs_file_path suffix. For instance:
>>> headers = read_headers("filename.tfs.gz")
- tfs.reader.read_tfs(tfs_file_path: Path | str, index: str | None = None, non_unique_behavior: str = 'warn', validate: bool = True) TfsDataFrame [source]
Parses the TFS table present in tfs_file_path and returns a TfsDataFrame.
Note
Loading and reading compressed files is possible. Any compression format supported by pandas is accepted, which includes: .gz, .bz2, .zip, .xz, .zst, .tar, .tar.gz, .tar.xz or .tar.bz2. See below for examples.
Warning
Through the validate argument, one can skip dataframe validation after loading it from a file. While this can speed up the execution time of this function, it is not recommended and is not the default behavior of this function. The option, however, is left for the user to use at their own risk should they wish to avoid lengthy validation of large TfsDataFrames (such as for instance a sliced FCC lattice).
Methodology
This function first calls a helper which parses and returns all metadata from the file (headers content, column names & types, number of lines parsed). The rest of the file (the dataframe part) is handed to pandas.read_csv with the right options to make use of its C engine's speed. After this, conversion to TfsDataFrame is made, proper types are applied to columns, the index is set and the frame is eventually validated before being returned.
tfs_file_path (pathlib.Path | str) -- Path object to the TFS file to read. Can be a string, in which case it will be cast to a Path object.
index (str) -- Name of the column to set as index. If not given, looks in tfs_file_path for a column starting with INDEX&&&.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
validate (bool) -- Whether to validate the dataframe after reading it. Defaults to True.
- Returns:
A TfsDataFrame object with the loaded data from the file.
Examples
Reading from a file is simple, as most arguments have sane default values. The simplest usage goes as follows:
>>> tfs.read("filename.tfs")
One can also pass a Path object to the function:
>>> tfs.read(pathlib.Path("filename.tfs"))
If one wants to set a specific column as index, this is done as:
>>> tfs.read("filename.tfs", index="COLUMN_NAME")
If one wants to, for instance, raise an error on non-unique indices or columns, one can do so as:
>>> tfs.read("filename.tfs", non_unique_behavior="raise")
One can choose to skip dataframe validation at one’s own risk after reading from file. This is done as:
>>> tfs.read("filename.tfs", validate=False)
It is possible to load compressed files if the compression format is supported by pandas (see above). The compression format detection is handled automatically from the extension of the provided tfs_file_path suffix. For instance:
>>> tfs.read("filename.tfs.gz") >>> tfs.read("filename.tfs.bz2") >>> tfs.read("filename.tfs.zip")
Testing
Testing functionality for TfsDataFrames.
- tfs.testing.assert_tfs_frame_equal(df1: TfsDataFrame, df2: TfsDataFrame, compare_keys: bool = True, **kwargs)[source]
Compare two TfsDataFrame objects, with df1 being the reference that df2 is compared to. This is mostly intended for unit tests. Comparison is done on both the contents of the headers dictionaries (with pandas's assert_dict_equal) as well as the data itself (with pandas's assert_frame_equal).
Note
The compare_keys argument is inherited from pandas's assert_dict_equal function and is quite unintuitive. It means to check that both dictionaries have the exact same set of keys. Whether this is given as True or False, the values are compared anyway for all keys in the first (reference) dict. In the case of this helper function, all keys present in df1's headers will be checked for in df2's headers and their corresponding values compared. If given as True, then both headers should be the exact same dictionary.
- Parameters:
df1 (TfsDataFrame) -- The first TfsDataFrame to compare.
df2 (TfsDataFrame) -- The second TfsDataFrame to compare.
compare_keys (bool) -- If True, checks that both headers have the exact same set of keys. See the above note for exact meaning and caveat. Defaults to True.
**kwargs -- Additional keyword arguments are transmitted to pandas.testing.assert_frame_equal for the comparison of the dataframe parts themselves.
Example
reference_df = tfs.read("path/to/file.tfs")
new_df = some_function(*args, **kwargs)
assert_tfs_frame_equal(reference_df, new_df)
Tools
Additional functions to modify TFS files.
- tfs.tools.remove_header_comments_from_files(list_of_files: list[str | Path]) None [source]
Checks the files in the provided list for invalid headers (no type defined) and removes those in place when found.
- Parameters:
list_of_files (list[str | Path]) -- List of paths to TFS files meant to be checked. The entries of the list can be strings or Path objects.
- tfs.tools.remove_nan_from_files(list_of_files: list[str | Path], replace: bool = False) None [source]
Remove NaN entries from files in list_of_files.
- Parameters:
list_of_files (list[str | Path]) -- List of paths to TFS files meant to be sanitized. The elements of the list can be strings or Path objects.
replace (bool) -- If True, the provided files will be overwritten. Otherwise new files with dropna appended to the original filenames will be written to disk. Defaults to False.
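A short usage sketch; the file names are placeholders:
>>> tfs.tools.remove_nan_from_files(["measurement_1.tfs", "measurement_2.tfs"])
>>> # with replace=False (the default), sanitized copies with "dropna" appended to
>>> # their names are written to disk instead of overwriting the originals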
- tfs.tools.significant_digits(value: float, error: float, return_floats: bool = False) tuple[str, str] | tuple[float, float] [source]
Computes value and its error properly rounded with respect to the size of error.
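A hedged usage sketch; per the signature, the rounded value and error are returned as strings by default, while return_floats=True yields floats:
>>> value_str, error_str = tfs.tools.significant_digits(3.14159, 0.01)
>>> value_flt, error_flt = tfs.tools.significant_digits(3.14159, 0.01, return_floats=True)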
Writer
Writing functionality for TFS files.
- tfs.writer.write_tfs(tfs_file_path: Path | str, data_frame: TfsDataFrame | DataFrame, headers_dict: dict | None = None, save_index: str | bool = False, colwidth: int = 20, headerswidth: int = 20, non_unique_behavior: str = 'warn', validate: bool = True) None [source]
Writes the provided DataFrame to disk at tfs_file_path, optionally with headers_dict as the headers dictionary.
Note
Compression of the output file is possible, by simply providing a valid compression extension as the tfs_file_path suffix. Any compression format supported by pandas is accepted, which includes: .gz, .bz2, .zip, .xz, .zst, .tar, .tar.gz, .tar.xz or .tar.bz2. See below for examples.
Warning
Through the validate argument, one can skip dataframe validation before writing it to file. While this can speed up the execution time of this function, it is not recommended and is not the default behavior of this function. The option, however, is left for the user to use at their own risk should they wish to avoid lengthy validation of large TfsDataFrames (such as for instance a sliced FCC lattice).
- Parameters:
tfs_file_path (pathlib.Path | str) -- Path object to the output TFS file. Can be a string, in which case it will be cast to a Path object.
data_frame (TfsDataFrame | pd.DataFrame) -- TfsDataFrame or pandas.DataFrame to write to file.
headers_dict (dict) -- Headers for the data_frame. If not provided, assumes a TfsDataFrame was given and tries to use data_frame.headers.
save_index (str | bool) -- Bool or string. Defaults to False. If True, saves the index of data_frame to a column identifiable by INDEX&&&. If given as string, saves the index of data_frame to a column named by the provided value.
colwidth (int) -- Column width, cannot be smaller than MIN_COLUMN_WIDTH.
headerswidth (int) -- Used to format the header width for both keys and values.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
validate (bool) -- Whether to validate the dataframe before writing it to file. Defaults to True.
Examples
Writing to file is simple, as most arguments have sane default values. The simplest usage goes as follows:
>>> tfs.write("filename.tfs", dataframe)
If one wants to, for instance, raise an error on non-unique indices or columns, one can do so as:
>>> tfs.write("filename.tfs", dataframe, non_unique_behavior="raise")
One can choose to skip dataframe validation at one’s own risk before writing it to file. This is done as:
>>> tfs.write("filename.tfs", dataframe, validate=False)
It is possible to directly have the output file be compressed, by specifying a valid compression extension as the tfs_file_path suffix. The detection and compression is handled automatically. For instance:
>>> tfs.write("filename.tfs.gz", dataframe)