Submitter
HTCondor Utilities
This module provides functionality to create HTCondor jobs and submit them to HTCondor.
write_bash
creates bash scripts that execute either a Python or MAD-X script. Takes as input a DataFrame, the job type, and optional additional command-line arguments for the script. A shell script is created in each job directory in the dataframe.
make_subfile
takes the job dataframe and creates the .sub files required for submission to HTCondor. The .sub file is placed in the working directory. The maximum runtime of one job can be specified; the default is 8h.
- pylhc_submitter.submitter.htc_utils.create_multijob_for_bashfiles(job_df: DataFrame, **kwargs) str [source]
Creates the HTCondor submission content for all job-scripts, i.e. bash-files, in the job_df.
- Keyword Arguments:
output_dir (str) -- output directory that will be transferred. Defaults to None.
jobflavour (str) -- maximum duration of the job. Needs to be one of the HTCondor job flavours. Defaults to workday.
group (str) -- force use of an accounting group. Defaults to None.
retries (int) -- maximum number of retries. Defaults to 3.
notification (str) -- notify under certain conditions. Defaults to error.
priority (int) -- priority to order your jobs. Defaults to None.
- Returns:
HTCondor submission definition.
- Return type:
str
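A minimal usage sketch, assuming job_df already contains the bash-script paths added by write_bash (the output directory name is illustrative):

    from pylhc_submitter.submitter.htc_utils import create_multijob_for_bashfiles

    # job_df: DataFrame as returned by write_bash (contains the bash-script paths)
    submission = create_multijob_for_bashfiles(
        job_df,
        output_dir="Outputdata",  # illustrative name for the transferred directory
        jobflavour="workday",     # must be a valid HTCondor job flavour
        retries=3,
    )
    print(submission)  # the HTCondor submission definition as a string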
- pylhc_submitter.submitter.htc_utils.create_subfile_from_job(cwd: Path, submission: str | Submit) Path [source]
Write the file to submit to HTCondor.
- Parameters:
cwd (Path) -- working directory
submission (str, htcondor.Submit) -- HTCondor submission definition (i.e. the content of the file)
- Returns:
path to sub-file
- Return type:
Path
- pylhc_submitter.submitter.htc_utils.make_subfile(cwd: Path, job_df: DataFrame, **kwargs) Path [source]
Creates the submit-file for HTCondor. For the kwargs, see create_multijob_for_bashfiles.
- Parameters:
cwd (Path) -- working directory
job_df (DataFrame) -- DataFrame containing all the job-information
- Returns:
path to the submit-file
- Return type:
Path
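For example, to write the submit-file into the working directory (values are illustrative; the keyword arguments are forwarded to create_multijob_for_bashfiles):

    from pathlib import Path
    from pylhc_submitter.submitter.htc_utils import make_subfile

    subfile = make_subfile(
        cwd=Path("my_study"),   # working directory, illustrative
        job_df=job_df,          # DataFrame with the job-information
        jobflavour="tomorrow",  # forwarded kwarg, see above
    )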
- pylhc_submitter.submitter.htc_utils.map_kwargs(add_dict: Dict[str, Any]) Dict[str, Any] [source]
Maps the kwargs for the job-file. Some arguments have pre-defined choices and defaults; the remaining ones are passed on unchanged.
- Parameters:
add_dict (Dict[str, Any]) -- additional kwargs to add to the defaults.
- Returns:
The mapped kwargs.
- Return type:
Dict[str, Any]
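A sketch of the expected behaviour per the description above (the exact key handling is internal to the module):

    from pylhc_submitter.submitter.htc_utils import map_kwargs

    # Known keys are validated against their pre-defined choices and merged
    # with the defaults; unknown keys are assumed to be passed on unchanged.
    mapped = map_kwargs({"jobflavour": "workday"})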
- pylhc_submitter.submitter.htc_utils.submit_jobfile(jobfile: Path, ssh: str) None [source]
Submit the subfile to HTCondor via subprocess.
- Parameters:
jobfile (Path) -- path to sub-file
ssh (str) -- ssh target
- pylhc_submitter.submitter.htc_utils.write_bash(job_df: DataFrame, output_dir: Path = None, executable: str = 'madx', cmdline_arguments: dict = None, mask: str | Path = None) DataFrame [source]
Write the bash-files to be called by HTCondor, which in turn call the executable. Takes as input a DataFrame, the job type, and optional additional command-line arguments for the script. A shell script is created in each job directory in the dataframe.
- Parameters:
job_df (DataFrame) -- DataFrame containing all the job-information
output_dir (str) -- output directory that will be transferred. Defaults to None.
executable (str) -- name of the executable. Defaults to madx.
cmdline_arguments (dict) -- additional command-line arguments for the executable
mask (Union[str, Path]) -- string or path to the mask-file. Defaults to None.
- Returns:
The provided job_df, but with the added paths to the scripts.
- Return type:
DataFrame
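Putting the htc_utils functions together, a hedged end-to-end sketch (paths and argument values are illustrative):

    from pathlib import Path
    from pylhc_submitter.submitter import htc_utils

    cwd = Path("my_study")  # working directory, illustrative

    # 1) Create one shell script per job directory in the dataframe.
    job_df = htc_utils.write_bash(job_df, executable="madx", mask=cwd / "job.madx.mask")

    # 2) Write the .sub file into the working directory.
    subfile = htc_utils.make_subfile(cwd, job_df, jobflavour="workday")

    # 3) Submit via subprocess (ssh target assumed optional; None for direct submission).
    htc_utils.submit_jobfile(subfile, ssh=None)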
Job Submitter IO-Tools
Tools for input and output for the job-submitter.
- class pylhc_submitter.submitter.iotools.CreationOpts(working_directory: Path, mask: Path | str, jobid_mask: str, replace_dict: Dict[str, Any], output_dir: Path, output_destination: Path | str, append_jobs: bool, resume_jobs: bool, executable: str, check_files: Sequence[str], script_arguments: Dict[str, Any], script_extension: str)[source]
Options for creating jobs.
- should_drop_jobs() bool [source]
Check if jobs should be dropped after creating the whole parameter space, e.g. because they already exist.
- pylhc_submitter.submitter.iotools.create_folders(job_df: TfsDataFrame, working_directory: Path, destination_directory: Path | str = None) TfsDataFrame [source]
Create the folder-structure in the given working directory and, if given, in the destination directory. This creates a folder per job, in which the job-scripts and bash-scripts can then be stored.
- Parameters:
job_df (tfs.TfsDataFrame) -- DataFrame containing all the job-information
working_directory (Path) -- Path to the working directory
destination_directory (Path, optional) -- Path to the destination directory, i.e. the directory to copy the outputs to manually. Defaults to None.
- Returns:
The job-dataframe again, but with the added paths to the job-dirs.
- Return type:
tfs.TfsDataFrame
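A brief sketch (the working directory is illustrative):

    from pathlib import Path
    from pylhc_submitter.submitter.iotools import create_folders

    # Creates one sub-folder per job in the working directory; a destination
    # directory, if given, gets the same structure for the copied outputs.
    job_df = create_folders(
        job_df,
        working_directory=Path("my_study"),
        destination_directory=None,
    )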
- pylhc_submitter.submitter.iotools.create_jobs(opt: CreationOpts) TfsDataFrame [source]
Main function to prepare all the jobs and the folder structure. This creates the value-grid based on the replace-dict and checks for existing jobs (if so desired). A job-dataframe is created - and written out - containing all the information, and its values are used to generate the job-scripts. It also creates bash-scripts to call the executable for the job-scripts.
- Parameters:
opt (CreationOpts) -- Options for creating jobs
- Returns:
The job-dataframe containing information for all jobs.
- Return type:
tfs.TfsDataFrame
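A sketch of driving the creation step directly; all field values are illustrative, and in normal use these options are assembled by the job_submitter entrypoint:

    from pathlib import Path
    from pylhc_submitter.submitter.iotools import CreationOpts, create_jobs

    opts = CreationOpts(
        working_directory=Path("my_study"),
        mask=Path("job.madx.mask"),
        jobid_mask="job.seed_%(SEED)d",     # naming mask, illustrative
        replace_dict={"SEED": [1, 2, 3]},   # parameter grid, illustrative
        output_dir=Path("Outputdata"),
        output_destination=None,
        append_jobs=False,
        resume_jobs=False,
        executable="madx",
        check_files=[],
        script_arguments={},
        script_extension="madx",
    )
    job_df = create_jobs(opts)  # writes the job-dataframe and all scripts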
- pylhc_submitter.submitter.iotools.get_server_from_uri(path: Path | str) str [source]
Get server information from a path. E.g.: root://eosuser.cern.ch//eos/user/a/ -> root://eosuser.cern.ch/
- pylhc_submitter.submitter.iotools.is_eos_uri(path: Path | str | None) bool [source]
Check if the given path is an EOS-URI as eos cp only works with those. E.g.: root://eosuser.cern.ch//eos/user/a/anabramo/banana.txt
This function does not check the double slashes, to avoid a path malformed by accident being silently treated as a plain path. That check is done in pylhc_submitter.job_submitter.check_opts().
- pylhc_submitter.submitter.iotools.print_stats(new_jobs: Sequence[str | int], finished_jobs: Sequence[str | int])[source]
Print some quick statistics.
- pylhc_submitter.submitter.iotools.uri_to_path(path: Path | str) Path [source]
Strip EOS server information from a path. EOS paths for HTCondor can be given as URIs; strip the server part for direct writing. E.g.: root://eosuser.cern.ch//eos/user/a/anabramo/banana.txt -> /eos/user/a/anabramo/banana.txt
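The three URI helpers together, using the example path from the docstrings (the return values follow from the descriptions above):

    from pylhc_submitter.submitter.iotools import (
        get_server_from_uri, is_eos_uri, uri_to_path,
    )

    uri = "root://eosuser.cern.ch//eos/user/a/anabramo/banana.txt"
    is_eos_uri(uri)           # True
    get_server_from_uri(uri)  # 'root://eosuser.cern.ch/'
    uri_to_path(uri)          # Path('/eos/user/a/anabramo/banana.txt')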
Mask Resolver
This module provides functionality to resolve and write script masks for HTCondor job submission.
- pylhc_submitter.submitter.mask.check_percentage_signs_in_mask(mask: str) None [source]
Checks for '%' signs in the mask that are not replacement variables.
- pylhc_submitter.submitter.mask.create_job_scripts_from_mask(job_df: DataFrame, maskfile: Path, replace_keys: dict, file_ext: str) DataFrame [source]
Takes the path to the mask file, the keys of the parameters to be replaced, and a pandas dataframe containing, per job, the job directory where the processed mask is to be put, as well as columns with the parameter values, named like the replace parameters. The job directories have to be created beforehand. The processed (madx) mask has the same filename as the mask, but with the given file extension. The input dataframe is returned with an additional column containing the paths to the processed script files.
- Parameters:
job_df (pd.DataFrame) -- Job parameters as defined in the description.
maskfile -- Path object to the mask file.
replace_keys -- keys to be replaced (must correspond to columns in job_df).
file_ext -- file extension to use (defaults to madx).
- Returns:
The provided job_df, but with the added paths to the scripts.
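A sketch, assuming the job directories already exist (e.g. created via iotools.create_folders) and job_df carries one column per replace key:

    from pathlib import Path
    from pylhc_submitter.submitter.mask import create_job_scripts_from_mask

    replace_dict = {"SEED": [1, 2, 3]}  # illustrative; keys match job_df columns

    job_df = create_job_scripts_from_mask(
        job_df,
        maskfile=Path("job.madx.mask"),  # illustrative path
        replace_keys=replace_dict,       # per the description, its keys must match job_df columns
        file_ext="madx",
    )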
- pylhc_submitter.submitter.mask.find_named_variables_in_mask(mask: str) Set[str] [source]
Find all variable-names in the mask.
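For instance, assuming the %(NAME)s placeholder style used by the masks:

    from pylhc_submitter.submitter.mask import find_named_variables_in_mask

    mask = "beam, energy = %(ENERGY)s;\ncall, file = '%(SEQUENCE)s';"
    find_named_variables_in_mask(mask)  # -> {'ENERGY', 'SEQUENCE'}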
- pylhc_submitter.submitter.mask.generate_jobdf_index(old_df: DataFrame, jobid_mask: str, keys: Sequence[str], values: ArrayLike) List[str] | Iterable[int] [source]
Generates index for jobdf from mask for job_id naming.
- Parameters:
old_df (pd.DataFrame) -- Existing jobdf.
jobid_mask (str) -- Mask for naming the jobs.
keys (Sequence[str]) -- Keys to be replaced in the mask.
values (np.array_like) -- Values-Grid to be replaced in the mask.
- Returns:
Index for jobdf, either list of strings (the filled jobid_masks) or integer-range.
- Return type:
List[str]
- pylhc_submitter.submitter.mask.is_mask_file(mask: str) bool [source]
Check if the given string points to a file.
- pylhc_submitter.submitter.mask.is_mask_string(mask: str) bool [source]
Check that the given string does not point to a file.
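The two checks are complementary, e.g.:

    from pylhc_submitter.submitter.mask import is_mask_file, is_mask_string

    is_mask_file("job.madx.mask")         # True only if such a file exists
    is_mask_string("option, x = %(X)s;")  # True: not a path to an existing file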
Job Submitter Runners
Defines the methods to run the job-submitter, locally or on HTC.
- class pylhc_submitter.submitter.runners.RunnerOpts(working_directory: pathlib.Path, jobflavour: str | None = None, output_dir: str | None = None, ssh: str | None = None, dryrun: bool | None = False, htc_arguments: Dict[str, Any] | None = <factory>, run_local: bool | None = False, num_processes: int | None = 4)[source]
Options for running the submission.
- pylhc_submitter.submitter.runners.run_htc(job_df: TfsDataFrame, opt: RunnerOpts) None [source]
Create the submission file and submit the jobs to HTCondor.
- Parameters:
job_df (tfs.TfsDataFrame) -- DataFrame containing all the job-information
opt (RunnerOpts) -- Parameters for the runner
- pylhc_submitter.submitter.runners.run_jobs(job_df: TfsDataFrame, opt: RunnerOpts) None [source]
Selects how to run the jobs.
- Parameters:
job_df (tfs.TfsDataFrame) -- DataFrame containing all the job-information
opt (RunnerOpts) -- Parameters for the runner
- pylhc_submitter.submitter.runners.run_local(job_df: TfsDataFrame, opt: RunnerOpts) None [source]
Run all jobs locally.
- Parameters:
job_df (tfs.TfsDataFrame) -- DataFrame containing all the job-information
opt (RunnerOpts) -- Parameters for the runner
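A sketch of the final step, assuming job_df was prepared beforehand (e.g. by iotools.create_jobs); all RunnerOpts values are illustrative:

    from pathlib import Path
    from pylhc_submitter.submitter.runners import RunnerOpts, run_jobs

    opt = RunnerOpts(
        working_directory=Path("my_study"),
        jobflavour="workday",  # used for the HTCondor submission
        run_local=False,       # True would run the jobs locally instead
        num_processes=4,       # only relevant for local runs
        dryrun=False,
    )
    run_jobs(job_df, opt)  # dispatches to run_htc or run_local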