API Reference
Collection
Advanced TFS files reading and writing functionality.
- class tfs.collection.Tfs(*args, **kwargs)
Class to mark attributes as TFS attributes.
Any parameter given to this class will be passed to the _get_filename() method, together with the plane unless two_planes=False is given.
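The forwarding described above can be pictured with a small stand-alone sketch. It does not use tfs-pandas: FileMarker and resolve_filenames are hypothetical names, the module-level get_filename function stands in for the _get_filename() hook, and passing the plane as the last positional argument is an assumption.

```python
from typing import Callable


class FileMarker:
    """Hypothetical stand-in for a Tfs-like marker: it only stores its arguments."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


def resolve_filenames(marker: FileMarker, get_filename: Callable[..., str]) -> dict:
    """Forward the marker arguments to get_filename, adding the plane unless two_planes=False."""
    kwargs = dict(marker.kwargs)
    two_planes = kwargs.pop("two_planes", True)
    if not two_planes:
        return {"": get_filename(*marker.args, **kwargs)}
    # Assumption: the two planes are "x" and "y", matching the _x / _y attribute suffixes.
    return {plane: get_filename(*marker.args, plane, **kwargs) for plane in ("x", "y")}


def get_filename(template: str, plane: str = "") -> str:
    return template.format(plane)


print(resolve_filenames(FileMarker("beta_phase_{}.tfs"), get_filename))
print(resolve_filenames(FileMarker("getcouple.tfs", two_planes=False), get_filename))
```

The first call yields one filename per plane, the second a single filename, which mirrors the two declaration styles a collection can use.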
- class tfs.collection.TfsCollection(directory: Path, allow_write: bool | None = None)
Abstract class to lazily load and write TFS files.
Classes inheriting from this abstract class can define TFS files as readable or writable, and read or write them through plain attribute access or assignment. All attributes are read and written as TfsDataFrame objects.
Example
If ./example is a directory that contains two TFS files, beta_phase_x.tfs and beta_phase_y.tfs, with BETX and BETY columns respectively:
from pandas import DataFrame
from tfs.collection import Tfs, TfsCollection

# All TFS attributes must be marked with the Tfs(...) class,
# and generated attribute names will be appended with _x / _y
# depending on files found in "./example"
class ExampleCollection(TfsCollection):
    beta = Tfs("beta_phase_{}.tfs")  # A TFS attribute
    other_value = 7  # A traditional attribute.

    # Receives the Tfs(...) arguments, plus the plane for two-plane files
    def _get_filename(self, template: str, plane: str = "") -> str:
        return template.format(plane)

example = ExampleCollection("./example")
# Get the BETX column from "beta_phase_x.tfs":
beta_x_column = example.beta_x.BETX
# Get the BETY column from "beta_phase_y.tfs":
beta_y_column = example.beta_y.BETY
# The planes can also be accessed as items (both examples below work):
beta_y_column = example.beta["y"].BETY
beta_y_column = example.beta["Y"].BETY
# This will write an empty DataFrame to "beta_phase_y.tfs":
example.allow_write = True
example.beta["y"] = DataFrame()
If the file to be loaded is not defined for two planes then the attribute can be declared and accessed as:
coupling = Tfs("getcouple.tfs", two_planes=False)  # declaration
f1001w_column = example.coupling.F1001W  # access
No file will be loaded until the corresponding attribute is accessed, and the loaded TfsDataFrame will be buffered. The user should therefore expect an IOError if the requested file is not in the provided directory (only on first access, but it is better to always take it into account!).
When a TfsDataFrame is assigned to one attribute, it will be set as the buffer value. If the self.allow_write attribute is set to True, an assignment on one of the attributes will trigger the corresponding file write.
- clear()
Clear the file buffer.
Any subsequent attribute access will try to load the corresponding file again.
- flush()
Write the current state of the TfsDataFrames into their respective files.
- get_path(name: str) -> Path
Return the actual file path of the property name (convenience function).
- Parameters:
name (str) -- Property name of the file.
- Returns:
A pathlib.Path of the actual name of the file in directory. The path to the file is then self.directory / filename.
- read_tfs(filename: str) -> TfsDataFrame
Reads the TFS file from self.directory with the given filename.
This function can be overridden to use something other than tfs-pandas to load the files. It does not set the TfsDataFrame into the buffer (that is the job of _load_tfs)!
- Parameters:
filename (str) -- The name of the file to load.
- Returns:
A TfsDataFrame built from reading the requested file.
- write_tfs(filename: str, data_frame: DataFrame)
Write the TFS file to self.directory with the given filename.
This function can be overridden to use something other than tfs-pandas to write out the files. It does not check for allow_write and does not set the DataFrame into the buffer (that is the job of _write_tfs)!
- Parameters:
filename (str) -- The name of the file to write.
data_frame (TfsDataFrame) -- TfsDataFrame to write.
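The lazy loading, buffering, clear() and flush() behavior described for TfsCollection can be sketched with the standard library alone. The class below is a hypothetical analogue that handles plain text files rather than TFS files, and it is not the library's implementation. In particular, having flush() write regardless of allow_write is an assumption of this sketch.

```python
import tempfile
from pathlib import Path


class LazyTextCollection:
    """Hypothetical, simplified analogue of a lazily loading collection (plain text, not TFS)."""

    def __init__(self, directory: Path, allow_write: bool = False):
        self.directory = Path(directory)
        self.allow_write = allow_write
        self._buffer = {}

    def get(self, filename: str) -> str:
        # Read from disk only on first access; later accesses hit the buffer.
        if filename not in self._buffer:
            self._buffer[filename] = (self.directory / filename).read_text()
        return self._buffer[filename]

    def set(self, filename: str, content: str) -> None:
        # An assignment always updates the buffer, and writes only if allowed.
        self._buffer[filename] = content
        if self.allow_write:
            (self.directory / filename).write_text(content)

    def clear(self) -> None:
        # Forget buffered content so the next access reloads from disk.
        self._buffer.clear()

    def flush(self) -> None:
        # Write every buffered entry back to its file (assumption: no allow_write check).
        for filename, content in self._buffer.items():
            (self.directory / filename).write_text(content)


with tempfile.TemporaryDirectory() as tmp:
    collection = LazyTextCollection(Path(tmp))
    try:
        collection.get("missing.txt")
    except OSError as error:  # IOError is an alias of OSError in Python 3
        print(f"first access failed: {type(error).__name__}")

    collection.set("beta.txt", "buffered only")
    print((Path(tmp) / "beta.txt").exists())  # False: allow_write is off
    collection.flush()
    print((Path(tmp) / "beta.txt").read_text())
```

Keeping the buffer separate from the write permission is what lets an assignment be cheap by default and hit the disk only when explicitly requested.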
Constants
General constants used throughout tfs-pandas, relating to the standard of TFS files.
Errors
Errors that can be raised during the handling of TFS files.
- exception tfs.errors.AbsentColumnNameError(file_path: Path)
Raised when a TFS file does not provide column names.
- exception tfs.errors.AbsentColumnTypeError(file_path: Path)
Raised when a TFS file does not provide column type identifiers.
- exception tfs.errors.AbsentTypeIdentifierError(header_line_elements: list[str])
Raised when a TFS file’s header line does not provide a type identifier.
- exception tfs.errors.DuplicateColumnsError
Raised when a TfsDataFrame has duplicate columns.
- exception tfs.errors.DuplicateIndicesError
Raised when a TfsDataFrame has duplicate indices.
- exception tfs.errors.InvalidBooleanHeaderError(header_value: str)
Raised when an unaccepted boolean header value is read in the TFS file.
- exception tfs.errors.IterableInDataFrameError
Raised when a list / tuple is found in the column of a TfsDataFrame.
- exception tfs.errors.MADXCompatibilityError
Raised when validation for MADX compatibility fails.
- exception tfs.errors.NonStringColumnNameError
Raised when a TfsDataFrame has non-string type column names.
- exception tfs.errors.SpaceinColumnNameError
Raised when a TfsDataFrame has spaces in column names.
- exception tfs.errors.TfsFormatError
Raised when an issue is detected in the TFS file or dataframe.
- exception tfs.errors.UnknownTypeIdentifierError(type_identifier: str)
Raised when a TFS file contains an unknown type identifier.
Frame
Contains the class definition of a TfsDataFrame, inherited from the pandas DataFrame, as well as a utility function to validate the correctness of a TfsDataFrame.
- class tfs.frame.TfsDataFrame(*args, **kwargs)
Class to hold the information of an extended pandas.DataFrame, together with a way of getting the headers of the TFS file. The file headers are stored in a dictionary upon read. To get a header value use data_frame.headers["header_name"], or data_frame["header_name"] if it does not conflict with a column name in the dataframe.
- merge(right: TfsDataFrame | DataFrame, how_headers: str | None = None, new_headers: dict | None = None, **kwargs) -> TfsDataFrame
Merge TfsDataFrame objects with a database-style join. Data manipulation is done by the pandas.DataFrame method of the same name. Resulting headers are either merged according to the provided how_headers method or as given via new_headers.
- Parameters:
right (TfsDataFrame | pd.DataFrame) -- The TfsDataFrame to merge with the caller.
how_headers (str) -- Type of merge to be performed for the headers. Either left or right. Refer to tfs.frame.merge_headers for behavior. If None is provided and new_headers is not provided, the final headers will be empty. Case-insensitive, defaults to None.
new_headers (dict) -- If provided, will be used as headers for the merged TfsDataFrame. Otherwise these are determined by merging the headers from the caller and the other TfsDataFrame according to the method defined by the how_headers argument.
**kwargs -- Arguments for pandas.DataFrame.merge, with the same default values as set in the pandas codebase.
- Returns:
A new TfsDataFrame with the merged data and merged headers.
- tfs.frame.concat(objs: Sequence[TfsDataFrame | pd.DataFrame], how_headers: str | None = None, new_headers: dict | None = None, **kwargs) -> TfsDataFrame
Concatenate TfsDataFrame objects along a particular axis with optional set logic along the other axes. Data manipulation is done by the pandas.concat function. Resulting headers are either merged according to the provided how_headers method or as given via new_headers.
Warning
Please note that when using this function on many TfsDataFrames, leaving the contents of the final headers dictionary to the automatic merger can become unpredictable. In this case it is recommended to provide the new_headers argument to ensure the final result, or leave both how_headers and new_headers as None (their defaults) to end up with empty headers.
- Parameters:
objs (Sequence[TfsDataFrame | pd.DataFrame]) -- The TfsDataFrame objects to be concatenated.
how_headers (str) -- Type of merge to be performed for the headers. Either left or right. Refer to tfs.frame.merge_headers for behavior. If None is provided and new_headers is not provided, the final headers will be empty. Case-insensitive, defaults to None.
new_headers (dict) -- If provided, will be used as headers for the merged TfsDataFrame. Otherwise these are determined by successively merging the headers from all concatenated TfsDataFrames according to the method defined by the how_headers argument.
**kwargs -- Any keyword argument is given to pandas.concat.
- Returns:
A new TfsDataFrame with the merged data and merged headers.
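The warning about many frames is easier to see on plain dictionaries. The sketch below assumes the successive, pairwise merging described for new_headers and uses made-up header values. It relies only on functools.reduce and dict unpacking, not on tfs-pandas.

```python
from functools import reduce

# Hypothetical headers of three frames, in concatenation order.
headers = [
    {"TITLE": "first", "ENERGY": 6500},
    {"TITLE": "second"},
    {"TITLE": "third", "BEAM": 1},
]

# "left" keeps the earliest value of a duplicated key, "right" keeps the latest.
merged_left = reduce(lambda left, right: {**right, **left}, headers)
merged_right = reduce(lambda left, right: {**left, **right}, headers)

print(merged_left)
print(merged_right)
```

Both results hold the same keys, but the value kept for TITLE depends on the merge direction and on the order of the inputs, which is why passing new_headers explicitly is the safer choice.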
- tfs.frame.merge_headers(headers_left: dict, headers_right: dict, how: str) -> dict
Merge headers of two TfsDataFrames together.
- Parameters:
headers_left (dict) -- Headers of caller (left) TfsDataFrame when calling .append, .join or .merge. Headers of the left (preceding) TfsDataFrame when calling tfs.frame.concat.
headers_right (dict) -- Headers of other (right) TfsDataFrame when calling .append, .join or .merge. Headers of the right (following) TfsDataFrame when calling tfs.frame.concat.
how (str) -- Type of merge to be performed, either left or right. If left, prioritize keys from headers_left in case of duplicate keys. If right, prioritize keys from headers_right in case of duplicate keys. Case-insensitive. If None is given, an empty dictionary will be returned.
- Returns:
A new dictionary as the merge of the two provided dictionaries.
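The rules for the how argument can be restated on plain dictionaries. The function below is an illustrative sketch, not the library's code: the name merge_headers_sketch is hypothetical, and raising ValueError for an unknown how value is an assumption.

```python
from typing import Optional


def merge_headers_sketch(headers_left: dict, headers_right: dict, how: Optional[str]) -> dict:
    """Illustrative restatement of the documented rules, not the library's code."""
    if how is None:
        return {}
    how = how.lower()  # the argument is documented as case-insensitive
    if how == "left":
        return {**headers_right, **headers_left}  # left values win on duplicates
    if how == "right":
        return {**headers_left, **headers_right}  # right values win on duplicates
    raise ValueError(f"Invalid 'how' argument: {how!r}")  # assumption of this sketch


left = {"TITLE": "a", "ENERGY": 1}
right = {"TITLE": "b", "BEAM": 2}
print(merge_headers_sketch(left, right, "LEFT"))
print(merge_headers_sketch(left, right, "right"))
print(merge_headers_sketch(left, right, None))
```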
- tfs.frame.validate(data_frame: TfsDataFrame | DataFrame, info_str: str = '', non_unique_behavior: str = 'warn', compatibility: str = 'madx') -> None
Enforce validity rules on a TfsDataFrame (see admonition below). Additional checks are performed for compatibility with either MAD-X or MAD-NG as provided by the compatibility parameter.
Methodology
This function performs several different checks on the provided dataframe. Some checks are performed for all compatibility modes (MAD-X and MAD-NG).
When checking for MAD-X compatibility, which is more restrictive than MAD-NG, the following additional checks are performed:
Checking the dataframe has headers.
Checking no boolean values are in the dataframe headers.
Checking no complex values are in the dataframe headers.
Checking a ‘TYPE’ entry is in the dataframe headers.
Checking no boolean-dtype columns are in the dataframe.
Checking no complex-dtype columns are in the dataframe.
- Parameters:
data_frame (TfsDataFrame | pd.DataFrame) -- The dataframe to check on.
info_str (str) -- Additional information to include in logging statements.
compatibility (str) -- Which code to check for compatibility with. Accepted values are madx, mad-x, madng and mad-ng, case-insensitive. Defaults to madx.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
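The header-related MAD-X checks listed above can be expressed on a plain dictionary. This is a minimal sketch under stated assumptions: madx_header_problems is a hypothetical helper, it only inspects built-in bool and complex values, and it does not reproduce the library's own validation or its error types.

```python
def madx_header_problems(headers: dict) -> list:
    """Return the MAD-X-specific header problems found (hypothetical helper)."""
    problems = []
    if not headers:
        problems.append("no headers")
    if any(isinstance(value, bool) for value in headers.values()):
        problems.append("boolean header value")
    if any(isinstance(value, complex) for value in headers.values()):
        problems.append("complex header value")
    if "TYPE" not in headers:
        problems.append("missing 'TYPE' header")
    return problems


print(madx_header_problems({"TYPE": "TWISS", "ENERGY": 6500.0}))
print(madx_header_problems({"FLAG": True}))
```

Collecting every problem before reporting, rather than stopping at the first one, is a design choice of this sketch that makes a single pass over the headers enough to fix a file.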
HDF5 I/O
Additional tools for reading and writing TfsDataFrames into hdf5 files.
- tfs.hdf.read_hdf(path: Path | str) -> TfsDataFrame
Read TfsDataFrame from hdf5 file. The DataFrame needs to be stored in a group named data, while the headers are stored in headers.
- Parameters:
path (Path, str) -- Path of the file to read.
- Returns:
A TfsDataFrame object with the loaded data from the file.
- tfs.hdf.write_hdf(path: Path | str, df: TfsDataFrame, **kwargs) -> None
Write the TfsDataFrame to HDF file. The dataframe will be written into the group data, the headers into the group headers. Only one dataframe per file is allowed.
- Parameters:
path (Path, str) -- Path of the output file.
df (TfsDataFrame) -- TfsDataFrame to write.
**kwargs -- Any keyword argument is given to pandas.DataFrame.to_hdf. Note that key is not allowed and mode needs to be w if the output file already exists (w will be used in any case, even if the file does not exist, but only a warning is logged in that case).
Reader
Reading functionality for TFS files.
- tfs.reader.read_headers(tfs_file_path: Path | str) -> dict
Parses the top of the tfs_file_path and returns the headers.
- Parameters:
tfs_file_path (pathlib.Path | str) -- Path to the TFS file to read.
- Returns:
A dictionary with the headers read from the file.
Examples
headers = read_headers("filename.tfs")
Just as with the read_tfs function, it is possible to load from compressed files if the compression format is supported by pandas. The compression format is detected automatically from the suffix of the provided tfs_file_path. For instance:
headers = read_headers("filename.tfs.gz")
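As a rough picture of what reading only the headers involves, the following standard-library sketch collects the leading lines of the form "@ NAME %type value" from a text file, choosing gzip by suffix. It is not how read_headers is implemented: read_header_lines is a hypothetical helper, it handles only the .gz suffix, and it distinguishes only string types (ending in s) from numbers.

```python
import gzip
import shlex
import tempfile
from pathlib import Path


def read_header_lines(path: Path) -> dict:
    """Collect '@ NAME %type value' lines from the top of a (possibly gzipped) text file."""
    opener = gzip.open if path.suffix == ".gz" else open
    headers = {}
    with opener(path, "rt") as stream:
        for line in stream:
            if not line.startswith("@"):
                break  # header lines sit at the top of the file
            _, name, type_id, value = shlex.split(line)
            # Simplification: strings stay strings, everything else becomes a float.
            headers[name] = value if type_id.endswith("s") else float(value)
    return headers


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.tfs.gz"
    with gzip.open(path, "wt") as stream:
        stream.write('@ TITLE %05s "My title"\n@ ENERGY %le 6500.0\n* NAME S\n$ %s %le\n')
    print(read_header_lines(path))
```

Stopping at the first line that is not a header means the table body is never read, which is the point of having a headers-only reader.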
- tfs.reader.read_tfs(tfs_file_path: Path | str, index: str | None = None, non_unique_behavior: str = 'warn', validate: str | None = None) -> TfsDataFrame
Parses the TFS table present in tfs_file_path and returns a TfsDataFrame. Note that this function is also exported at the top-level of the package as tfs.read.
Note
Loading and reading compressed files is possible. Any compression format supported by pandas is accepted, which includes: .gz, .bz2, .zip, .xz, .zst, .tar, .tar.gz, .tar.xz or .tar.bz2. See below for examples.
Warning
Through the validate argument, one can activate dataframe validation after loading it from a file, which can significantly slow the execution of this function, e.g. in case of large TfsDataFrames such as a sliced FCC lattice. Note that validation can be performed at any time by using the tfs.frame.validate function.
Methodology
This function first calls a helper which parses and returns all metadata from the file (headers content, column names & types, number of lines parsed). The rest of the file (dataframe part) is handed to pandas.read_csv for parsing, with the right options to make use of its C engine’s speed. After this, conversion to TfsDataFrame is made and, if requested, the index is set and validation performed, before the frame is returned.
- Parameters:
tfs_file_path (pathlib.Path | str) -- Path to the TFS file to read. Can be a string, in which case it will be cast to a Path object.
index (str) -- Name of the column to set as index. If not given, looks in tfs_file_path for a column starting with INDEX&&&.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
validate (str) -- If an accepted value is given, validation will be performed after loading. Defaults to None, which skips validation. Accepted validation modes are madx, mad-x, madng and mad-ng, case-insensitive. See the tfs.frame.validate function for more information on validation.
- Returns:
A TfsDataFrame object with the loaded data from the file.
Examples
Reading from a file is simple, as most arguments have sane default values. The simplest usage goes as follows:
tfs.read("filename.tfs")
One can also pass a Path object to the function:
tfs.read(pathlib.Path("filename.tfs"))
It is possible to load compressed files if the compression format is supported by pandas (see above). The compression format is detected automatically from the suffix of the provided tfs_file_path. For instance:
tfs.read("filename.tfs.gz")
tfs.read("filename.tfs.bz2")
tfs.read("filename.tfs.zip")
If one wants to set a specific column as index (and drop it from the data), this is done as:
tfs.read("filename.tfs", index="COLUMN_NAME")
One can choose to perform dataframe validation after reading from file, for compatibility with a certain code, by providing a valid argument:
tfs.read("filename.tfs", validate="MAD-NG") # or validate="MAD-X"
If one wants to raise an error on non-unique indices or columns when performing validation, one can do so as:
tfs.read("filename.tfs", non_unique_behavior="raise")
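The index parameter of read_tfs describes a fallback: when no column is named, a column starting with INDEX&&& is looked for. A minimal sketch of that selection on a list of column names could look as follows. The helper find_index_column is hypothetical, and raising KeyError for an unknown column is an assumption of this sketch.

```python
from typing import List, Optional

INDEX_PREFIX = "INDEX&&&"  # marker quoted in the index parameter description


def find_index_column(columns: List[str], index: Optional[str] = None) -> Optional[str]:
    """Return the explicitly requested column, else the first one starting with the marker."""
    if index is not None:
        if index not in columns:
            raise KeyError(f"Column {index!r} not found")  # assumption of this sketch
        return index
    return next((name for name in columns if name.startswith(INDEX_PREFIX)), None)


print(find_index_column(["INDEX&&&NAME", "S", "BETX"]))
print(find_index_column(["NAME", "S"], index="NAME"))
print(find_index_column(["NAME", "S"]))
```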
Testing
Testing functionality for TfsDataFrames.
- tfs.testing.assert_tfs_frame_equal(df1: TfsDataFrame, df2: TfsDataFrame, compare_keys: bool = True, **kwargs)
Compare two TfsDataFrame objects, with df1 being the reference that df2 is compared to. This is mostly intended for unit tests. Comparison is done on both the contents of the headers dictionaries (with pandas’s assert_dict_equal) and the data itself (with pandas’s assert_frame_equal).
Note
The compare_keys argument is inherited from pandas’s assert_dict_equal function and is quite unintuitive. It means to check that both dictionaries have the exact same set of keys.
Whether this is given as True or False, the values are compared anyway for all keys in the first (reference) dict. In the case of this helper function, all keys present in df1’s headers will be checked for in df2’s headers and their corresponding values compared. If given as True, then both headers should be the exact same dictionary.
- Parameters:
df1 (TfsDataFrame) -- The first TfsDataFrame to compare.
df2 (TfsDataFrame) -- The second TfsDataFrame to compare.
compare_keys (bool) -- If True, checks that both headers have the exact same set of keys. See the above note for exact meaning and caveat. Defaults to True.
**kwargs -- Additional keyword arguments are transmitted to pandas.testing.assert_frame_equal for the comparison of the dataframe parts themselves.
Example
reference_df = tfs.read("path/to/file.tfs")
new_df = some_function(*args, **kwargs)
assert_tfs_frame_equal(reference_df, new_df)
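Since the compare_keys semantics are easy to misread, here is a small sketch of them on plain dictionaries. The helper headers_match is hypothetical and returns a boolean instead of raising, and it compares values with plain equality, which is a simplification compared to what pandas does.

```python
def headers_match(reference: dict, other: dict, compare_keys: bool = True) -> bool:
    """Mimic the documented semantics on plain dicts (hypothetical helper)."""
    if compare_keys and set(reference) != set(other):
        return False
    # Values are compared for every key of the reference, whatever compare_keys is.
    return all(key in other and other[key] == reference[key] for key in reference)


reference = {"TITLE": "a"}
new = {"TITLE": "a", "EXTRA": 1}
print(headers_match(reference, new, compare_keys=False))  # True: extra keys in new are ignored
print(headers_match(reference, new, compare_keys=True))   # False: key sets differ
print(headers_match(new, reference, compare_keys=False))  # False: EXTRA is missing from the other dict
```

The asymmetry between the first and last calls is the caveat mentioned in the note: the reference drives which keys are checked.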
Tools
Additional functions to modify TFS files.
- tfs.tools.remove_header_comments_from_files(list_of_files: list[str | Path]) -> None
Checks the files in the provided list for invalid headers (no type defined) and removes those in place when found.
- Parameters:
list_of_files (list[str | Path]) -- List of paths to TFS files meant to be checked. The entries of the list can be strings or Path objects.
- tfs.tools.remove_nan_from_files(list_of_files: list[str | Path], replace: bool = False) -> None
Remove NaN entries from files in list_of_files.
- Parameters:
list_of_files (list[str | Path]) -- List of paths to TFS files meant to be sanitized. The elements of the list can be strings or Path objects.
replace (bool) -- If True, the provided files will be overwritten. Otherwise new files with dropna appended to the original filenames will be written to disk. Defaults to False.
- tfs.tools.significant_digits(value: float, error: float, return_floats: bool = False) -> tuple[str, str] | tuple[float, float]
Computes value and its error properly rounded with respect to the size of error.
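The description of significant_digits does not spell out a rounding convention. As an illustration only, the sketch below assumes a common one: round to the leading digit of the error, keeping one extra digit when that leading digit is 1. The function round_to_error is hypothetical and may differ from what significant_digits actually does.

```python
import math


def round_to_error(value: float, error: float) -> tuple:
    """Round value and error to the error's leading digit (two digits if that digit is 1)."""
    if error <= 0:
        raise ValueError("error must be positive")
    digits = -math.floor(math.log10(error))
    if math.floor(error * 10**digits) == 1:
        digits += 1  # keep one more digit when the leading digit is 1
    decimals = max(digits, 0)  # no decimals are printed for errors of 10 or more
    return f"{round(value, digits):.{decimals}f}", f"{round(error, digits):.{decimals}f}"


print(round_to_error(3.14159, 0.0123))
print(round_to_error(12.3456, 0.05))
print(round_to_error(1234.5, 23.0))
```

Returning strings preserves trailing zeros that a float would drop, which may be why the documented function offers both string and float outputs.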
Writer
Writing functionality for TFS files.
- class tfs.writer.ValueToStringFormatter
Formatter class to be called for proper formatting of values (headers, dataframe data) into strings to write to file.
- tfs.writer.write_tfs(tfs_file_path: Path | str, data_frame: TfsDataFrame | DataFrame | Series, headers_dict: dict | None = None, save_index: str | bool = False, colwidth: int = 20, headerswidth: int = 20, non_unique_behavior: str = 'warn', validate: str | None = None) -> None
Writes the provided DataFrame to disk at tfs_file_path. If headers_dict is provided it is written to disk as the headers. Note that this function is also exported at the top-level of the package as tfs.write.
Note
Compression of the output file is possible, by simply providing a valid compression extension as the tfs_file_path suffix. Any compression format supported by pandas is accepted, which includes: .gz, .bz2, .zip, .xz, .zst, .tar, .tar.gz, .tar.xz or .tar.bz2. See below for examples.
Warning
Through the validate argument, one can perform dataframe validation before writing to file. By default no validation is performed, which improves performance but is not recommended if the file needs to be read by MAD-X or MAD-NG. Skipping validation is left for the user to use (at their own risk) should they wish to avoid lengthy validation of large TfsDataFrames (such as for instance a sliced FCC lattice).
- Parameters:
tfs_file_path (pathlib.Path | str) -- Path to the output TFS file.
data_frame (TfsDataFrame | pd.DataFrame | pd.Series) -- The dataframe to write to file. If a Series-like object is given, it will be converted to a TfsDataFrame first and written with a single column.
headers_dict (dict) -- Headers for the data_frame. If not provided, assumes a TfsDataFrame was given and tries to use data_frame.headers. Writes with empty headers if those are not found either.
save_index (str | bool) -- bool or string. Defaults to False. If True, saves the index of data_frame to a column identifiable by INDEX&&&. If given as string, saves the index of data_frame to a column named by the provided value.
colwidth (int) -- Column width, cannot be smaller than MIN_COLUMN_WIDTH.
headerswidth (int) -- Used to format the header width for both keys and values.
non_unique_behavior (str) -- Behavior to adopt if non-unique indices or columns are found in the dataframe. Accepts warn and raise as values, case-insensitively, which dictates to respectively issue a warning or raise an error if non-unique elements are found.
validate (str) -- Determines if and which validation will be performed before writing. By default no validation is performed. Accepted values are madx, mad-x, madng and mad-ng (case-insensitive), for compatibility with MAD-X and MAD-NG codes, respectively. See the tfs.frame.validate function for more information on the validation steps.
Examples
Writing to file is simple, as most arguments have sane default values. The simplest usage goes as follows:
tfs.write("filename.tfs", dataframe)
One can choose to perform dataframe validation before writing it to file. This can be done by providing an accepted compatibility mode to check for (either madx or madng), as:
tfs.write("filename.tfs", dataframe, validate="madx")
If one wants to, for instance, raise an error on non-unique indices or columns when validating the dataframe, one can do so as:
tfs.write("filename.tfs", dataframe, non_unique_behavior="raise", validate="madng")
It is possible to directly have the output file be compressed, by specifying a valid compression extension as the tfs_file_path suffix. The detection and compression is handled automatically. For instance:
tfs.write("filename.tfs.gz", dataframe)
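The three possible values of save_index can be summarized in a short sketch of how the extra column might be named. The helper index_column_name is hypothetical, and appending the index's own name after the INDEX&&& marker is an assumption, since the parameter description only says the column is identifiable by that marker.

```python
from typing import Optional, Union

INDEX_PREFIX = "INDEX&&&"  # marker quoted in the save_index parameter description


def index_column_name(save_index: Union[str, bool], index_name: str = "") -> Optional[str]:
    """Name of the extra column holding the index, or None if the index is not saved."""
    if isinstance(save_index, str):
        return save_index  # explicit column name given by the caller
    if save_index:
        return INDEX_PREFIX + index_name  # identifiable by its prefix when read back
    return None


print(index_column_name(False))
print(index_column_name(True, "NAME"))
print(index_column_name("ELEMENT"))
```

Using a recognizable prefix is what allows a reader to restore the index without being told which column held it.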