config package¶
Submodules¶
config.config module¶
- class config.config.TextConfig(mean_length: int, standard_deviation: int)[source]¶
Bases: object
Configuration for generated text length distribution.
- mean_length¶
Average length of generated text.
- Type:
int
- standard_deviation¶
Standard deviation for text length.
- Type:
int
- mean_length: int¶
- standard_deviation: int¶
- __init__(mean_length: int, standard_deviation: int) → None¶
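As a minimal sketch of how these two fields might drive text length, the snippet below draws lengths from a normal distribution. The stand-in dataclass and the `sample_length` helper are hypothetical and not part of the `config` package; the choice of a normal distribution and the clamp to a minimum of 1 are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class TextConfig:
    """Stand-in mirroring the documented fields (not the real class)."""
    mean_length: int
    standard_deviation: int


def sample_length(cfg: TextConfig, rng: random.Random) -> int:
    """Hypothetical helper: draw one text length, clamped to at least 1."""
    return max(1, round(rng.gauss(cfg.mean_length, cfg.standard_deviation)))


cfg = TextConfig(mean_length=200, standard_deviation=50)
rng = random.Random(0)  # fixed seed so the run is repeatable
lengths = [sample_length(cfg, rng) for _ in range(5)]
print(lengths)
```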
- class config.config.RunType(*values)[source]¶
Bases: Enum
Execution mode of the data generation pipeline.
- PRODUCTION = 'PRODUCTION'¶
- DEVELOPMENT = 'DEVELOPMENT'¶
- class config.config.Config(release_information: ReleaseInformation, number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int, text_config: TextConfig, assistants: dict[AssistantName, Assistant], run_type: RunType = RunType.DEVELOPMENT, time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin'))[source]¶
Bases: object
Holds all configuration parameters for a dataset generation run.
- release_information¶
Metadata about this release.
- Type:
ReleaseInformation
- number_of_tickets¶
Total number of synthetic tickets to generate.
- Type:
int
- number_translation_nodes¶
Number of translation steps per ticket.
- Type:
int
- graph_runs_per_batch¶
Number of graph executions per batch.
- Type:
int
- text_config¶
Settings for text length variation.
- Type:
TextConfig
- assistants¶
Mapping of assistant roles.
- Type:
dict[AssistantName, Assistant]
- time_zone¶
Time zone for timestamps; defaults to Europe/Berlin.
- Type:
ZoneInfo
- release_information: ReleaseInformation¶
- number_of_tickets: int¶
- number_translation_nodes: int¶
- graph_runs_per_batch: int¶
- text_config: TextConfig¶
- assistants: dict[AssistantName, Assistant]¶
- time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin')¶
- property output_file: Path¶
Get the output file path for this release.
- Returns:
Path where generated dataset will be saved.
- Return type:
Path
- property run_information: RunInformation¶
Construct run metrics for reporting and logging.
- Returns:
Object summarizing batch and ticket counts.
- Return type:
RunInformation
- __init__(release_information: ReleaseInformation, number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int, text_config: TextConfig, assistants: dict[AssistantName, Assistant], run_type: RunType = RunType.DEVELOPMENT, time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin')) → None¶
config.run_information module¶
- class config.run_information.RunInformation(number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int)[source]¶
Bases: object
Holds parameters and derived metrics for a run of the dataset generator.
- number_of_tickets¶
Total number of tickets to generate.
- Type:
int
- number_translation_nodes¶
Number of translation nodes per ticket.
- Type:
int
- graph_runs_per_batch¶
Number of graph executions in each batch.
- Type:
int
- number_of_tickets: int¶
- number_translation_nodes: int¶
- graph_runs_per_batch: int¶
- property graph_ticket_runs: int¶
Calculate how many ticket graphs will be run.
- Returns:
The number of ticket-graph runs (number_of_tickets divided by number_translation_nodes).
- Return type:
int
- property amount_batches: int¶
Calculate how many batches are required.
- Returns:
Total number of batches, computed as (number_of_tickets / graph_runs_per_batch) / number_translation_nodes.
- Return type:
int
- __init__(number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int) → None¶
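The two derived properties can be sketched as follows. This is a stand-in written from the descriptions above, not the package's source; in particular, the use of floor division to obtain an `int` is an assumption, since the documentation does not say how fractional results are rounded.

```python
from dataclasses import dataclass


@dataclass
class RunInformation:
    """Stand-in mirroring the documented fields (not the real class)."""
    number_of_tickets: int
    number_translation_nodes: int
    graph_runs_per_batch: int

    @property
    def graph_ticket_runs(self) -> int:
        # Tickets divided by translation nodes (floor division is an assumption).
        return self.number_of_tickets // self.number_translation_nodes

    @property
    def amount_batches(self) -> int:
        # (tickets / batch size) / translation nodes, again with floor division.
        per_batch = self.number_of_tickets // self.graph_runs_per_batch
        return per_batch // self.number_translation_nodes


info = RunInformation(
    number_of_tickets=1000,
    number_translation_nodes=10,
    graph_runs_per_batch=5,
)
print(info.graph_ticket_runs)  # 100
print(info.amount_batches)     # 20
```

With 1000 tickets, 10 translation nodes and 5 graph runs per batch, the sketch yields 100 ticket-graph runs spread over 20 batches.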
config.version module¶
- class config.version.VersionType(*values)[source]¶
Bases: Enum
SemVer part types and their index in the version tuple.
- MAJOR¶
Major version part (index 0).
- Type:
- MINOR¶
Minor version part (index 1).
- Type:
- PATCH¶
Patch version part (index 2).
- Type:
- MAJOR = 0¶
- MINOR = 1¶
- PATCH = 2¶
- class config.version.VersionTag(*values)[source]¶
Bases: Enum
Optional tags appended to version strings.
- TEST¶
“test” tag.
- Type:
- PROD¶
“prod” tag.
- Type:
- TEST = 'test'¶
- PROD = 'prod'¶
- class config.version.Version(version_parts: tuple[int, int, int], version_tag: VersionTag | None = None)[source]¶
Bases: object
Semantic version representation with optional tag.
- version_parts¶
(major, minor, patch).
- Type:
tuple[int, int, int]
- version_tag¶
Optional tag to append.
- Type:
VersionTag | None
- VERSION_SPLIT = '.'¶
- VERSION_FILE_SPLIT = '_'¶
- VERSION_FILE_PREFIX = 'v'¶
- VERSION_LIMIT = 100¶
- VERSION_PARTS = 3¶
- version_parts: tuple[int, int, int]¶
- version_tag: VersionTag | None = None¶
- static from_string(version_string: str) → Version[source]¶
Parse a dotted version string into a Version, padding missing parts with zero.
- with_tag(version_tag: VersionTag) → Version[source]¶
Return a new Version with the given tag.
- next_version(version_type: VersionType) → Version[source]¶
Increment the specified part and return a new Version.
- __init__(version_parts: tuple[int, int, int], version_tag: VersionTag | None = None) → None¶
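The parsing and bumping behaviour described above might look roughly like the sketch below. It is a simplified stand-in, not the package's implementation: tags are omitted, and resetting the lower-order parts to zero after a bump follows the usual SemVer convention, which is an assumption here because the documentation only says the specified part is incremented.

```python
from dataclasses import dataclass
from enum import Enum


class VersionType(Enum):
    MAJOR = 0
    MINOR = 1
    PATCH = 2


@dataclass(frozen=True)
class Version:
    """Stand-in for illustration only (not the real class)."""
    version_parts: tuple[int, int, int]

    @staticmethod
    def from_string(version_string: str) -> "Version":
        parts = [int(p) for p in version_string.split(".")]
        parts += [0] * (3 - len(parts))  # pad missing parts with zero
        return Version((parts[0], parts[1], parts[2]))

    def next_version(self, version_type: VersionType) -> "Version":
        parts = list(self.version_parts)
        parts[version_type.value] += 1
        # Assumption: lower-order parts reset to zero, as in SemVer.
        for i in range(version_type.value + 1, 3):
            parts[i] = 0
        return Version((parts[0], parts[1], parts[2]))


v = Version.from_string("1.2")
bumped = v.next_version(VersionType.MINOR)
print(v.version_parts)       # (1, 2, 0)
print(bumped.version_parts)  # (1, 3, 0)
```

Both methods return new objects rather than mutating the original, matching the "return a new Version" wording in the descriptions.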
- class config.version.ReleaseInformation(version: Version, output_dir: Path, file_prefix: str = 'ticket-dataset-', file_extension: str = '.json', release_type: VersionType = VersionType.PATCH)[source]¶
Bases: object
Holds metadata for a release and computes output file paths.
- output_dir¶
Directory to write the file.
- Type:
Path
- file_prefix¶
Prefix for the output filename.
- Type:
str
- file_extension¶
Extension for the output filename.
- Type:
str
- release_type¶
Which part to bump next.
- Type:
VersionType
- output_dir: Path¶
- file_prefix: str = 'ticket-dataset-'¶
- file_extension: str = '.json'¶
- release_type: VersionType = VersionType.PATCH¶
- get_next_existing() → Self[source]¶
Recursively find the next release that does not exist yet, raising an error if the version limit is exceeded.
- property next_version: Self¶
Return a new ReleaseInformation with its version bumped.
- __init__(version: Version, output_dir: Path, file_prefix: str = 'ticket-dataset-', file_extension: str = '.json', release_type: VersionType = VersionType.PATCH) → None¶
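The search for an unused release could be sketched as below. Everything here is illustrative: `output_file` and `next_free_release` are hypothetical helpers, the file name layout (prefix, then `v`, then the parts joined by `_`, then the extension) is only inferred from the documented constants and defaults, and treating `VERSION_LIMIT` as a cap on the number of attempts is an assumption. The sketch also uses a loop where the documented method is recursive.

```python
import tempfile
from pathlib import Path

VERSION_LIMIT = 100  # same value as the documented Version.VERSION_LIMIT


def output_file(output_dir: Path, parts: tuple[int, int, int],
                file_prefix: str = "ticket-dataset-",
                file_extension: str = ".json") -> Path:
    # Assumed naming: prefix + "v" + parts joined by "_" + extension.
    version = "_".join(str(p) for p in parts)
    return output_dir / f"{file_prefix}v{version}{file_extension}"


def next_free_release(output_dir: Path,
                      parts: tuple[int, int, int]) -> tuple[int, int, int]:
    """Hypothetical helper: bump the patch part until no file has that name."""
    for _ in range(VERSION_LIMIT):
        if not output_file(output_dir, parts).exists():
            return parts
        parts = (parts[0], parts[1], parts[2] + 1)
    raise RuntimeError("version limit exceeded")


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    output_file(out, (1, 0, 0)).touch()  # pretend 1.0.0 was already released
    free = next_free_release(out, (1, 0, 0))
    name = output_file(out, free).name

print(free, name)
```

Because 1.0.0 already has a file in the temporary directory, the helper skips it and settles on 1.0.1.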