config package
Submodules
config.config module
- class config.config.TextConfig(mean_length: int, standard_deviation: int)
Bases:
object
Configuration for generated text length distribution.
- mean_length
Average length of generated text.
- Type:
int
- standard_deviation
Standard deviation for text length.
- Type:
int
- mean_length: int
- standard_deviation: int
- __init__(mean_length: int, standard_deviation: int) -> None
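The reference does not say how the two fields are consumed. As a minimal sketch, assuming the length is drawn from a normal distribution, the stand-in class below mirrors the documented fields and a hypothetical helper `sample_length` (not part of the package) turns them into a concrete length:

```python
import random
from dataclasses import dataclass


@dataclass
class TextConfig:
    """Stand-in mirroring the documented fields; not the package's own class."""

    mean_length: int
    standard_deviation: int


def sample_length(cfg: TextConfig, rng: random.Random) -> int:
    # Hypothetical helper: draw from a normal distribution around mean_length
    # and clamp the result so a length is always at least 1.
    return max(1, round(rng.gauss(cfg.mean_length, cfg.standard_deviation)))


cfg = TextConfig(mean_length=200, standard_deviation=50)
rng = random.Random(0)  # seeded so repeated runs draw the same values
lengths = [sample_length(cfg, rng) for _ in range(5)]
```

The clamp to 1 is a design choice of this sketch; the real pipeline may handle very short samples differently.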
- class config.config.RunType(*values)
Bases:
Enum
Execution mode of the data generation pipeline.
- PRODUCTION = 'PRODUCTION'
- DEVELOPMENT = 'DEVELOPMENT'
- class config.config.Config(release_information: ReleaseInformation, number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int, text_config: TextConfig, assistants: dict[AssistantName, Assistant], run_type: RunType = RunType.DEVELOPMENT, time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin'))
Bases:
object
Holds all configuration parameters for a dataset generation run.
- release_information
Metadata about this release.
- Type:
ReleaseInformation
- number_of_tickets
Total synthetic tickets to generate.
- Type:
int
- number_translation_nodes
How many translation steps per ticket.
- Type:
int
- graph_runs_per_batch
Number of graph executions per batch.
- Type:
int
- text_config
Settings for text length variation.
- Type:
TextConfig
- assistants
Mapping of assistant roles.
- Type:
dict[AssistantName, Assistant]
- time_zone
Time zone for timestamps; defaults to Europe/Berlin.
- Type:
ZoneInfo
- release_information: ReleaseInformation
- number_of_tickets: int
- number_translation_nodes: int
- graph_runs_per_batch: int
- text_config: TextConfig
- assistants: dict[AssistantName, Assistant]
- time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin')
- property output_file: Path
Get the output file path for this release.
- Returns:
Path where generated dataset will be saved.
- Return type:
Path
- property run_information: RunInformation
Construct run metrics for reporting and logging.
- Returns:
Object summarizing batch and ticket counts.
- Return type:
RunInformation
- __init__(release_information: ReleaseInformation, number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int, text_config: TextConfig, assistants: dict[AssistantName, Assistant], run_type: RunType = RunType.DEVELOPMENT, time_zone: ZoneInfo = zoneinfo.ZoneInfo(key='Europe/Berlin')) -> None
config.run_information module
- class config.run_information.RunInformation(number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int)
Bases:
object
Holds parameters and derived metrics for a run of the dataset generator.
- number_of_tickets
Total number of tickets to generate.
- Type:
int
- number_translation_nodes
Number of translation nodes per ticket.
- Type:
int
- graph_runs_per_batch
Number of graph executions in each batch.
- Type:
int
- number_of_tickets: int
- number_translation_nodes: int
- graph_runs_per_batch: int
- property graph_ticket_runs: int
Calculate how many ticket graphs will be run.
- Returns:
The number of ticket-graph runs (number_of_tickets divided by number_translation_nodes).
- Return type:
int
- property amount_batches: int
Calculate how many batches are required.
- Returns:
Total batches, computed as (number_of_tickets / graph_runs_per_batch) / number_translation_nodes.
- Return type:
int
- __init__(number_of_tickets: int, number_translation_nodes: int, graph_runs_per_batch: int) -> None
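The two derived properties can be sketched as follows. This is a stand-in class built only from the formulas stated above, not the package's source. Both properties return int, but the reference does not say how non-integer results are rounded, so the floor division used here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunInformation:
    """Stand-in reproducing the documented fields and formulas."""

    number_of_tickets: int
    number_translation_nodes: int
    graph_runs_per_batch: int

    @property
    def graph_ticket_runs(self) -> int:
        # Tickets divided by translation nodes (floor division is assumed).
        return self.number_of_tickets // self.number_translation_nodes

    @property
    def amount_batches(self) -> int:
        # (tickets / batch size) / translation nodes, again with floor division.
        per_batch = self.number_of_tickets // self.graph_runs_per_batch
        return per_batch // self.number_translation_nodes


info = RunInformation(
    number_of_tickets=1000,
    number_translation_nodes=5,
    graph_runs_per_batch=10,
)
print(info.graph_ticket_runs)  # 1000 // 5 = 200
print(info.amount_batches)     # (1000 // 10) // 5 = 20
```

With inputs that do not divide evenly, floor division drops the remainder; an implementation that rounds up instead would report one more batch.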
config.version module
- class config.version.VersionType(*values)
Bases:
Enum
SemVer part types and their index in the version tuple.
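- MAJOR
Major version, index 0 in the version tuple.
- Type:
VersionType
- MINOR
Minor version, index 1 in the version tuple.
- Type:
VersionType
- PATCH
Patch version, index 2 in the version tuple.
- Type:
VersionType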
- MAJOR = 0
- MINOR = 1
- PATCH = 2
- class config.version.VersionTag(*values)
Bases:
Enum
Optional tags appended to version strings.
- TEST
“test” tag.
- Type:
VersionTag
- PROD
“prod” tag.
- Type:
VersionTag
- TEST = 'test'
- PROD = 'prod'
- class config.version.Version(version_parts: tuple[int, int, int], version_tag: VersionTag | None = None)
Bases:
object
Semantic version representation with optional tag.
- version_parts
(major, minor, patch).
- Type:
tuple[int, int, int]
- version_tag
Optional tag to append.
- Type:
VersionTag | None
- VERSION_SPLIT = '.'
- VERSION_FILE_SPLIT = '_'
- VERSION_FILE_PREFIX = 'v'
- VERSION_LIMIT = 100
- VERSION_PARTS = 3
- version_parts: tuple[int, int, int]
- version_tag: VersionTag | None = None
- static from_string(version_string: str) -> Version
Parse a dotted version string into a Version, padding missing parts with zero.
- with_tag(version_tag: VersionTag) -> Version
Return a new Version with the given tag.
- next_version(version_type: VersionType) -> Version
Increment the specified part and return a new Version.
- __init__(version_parts: tuple[int, int, int], version_tag: VersionTag | None = None) -> None
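The behavior of from_string, with_tag and next_version can be illustrated with a small stand-in. It is a sketch written from the descriptions above, not the package's implementation. In particular, resetting the lower-order parts to zero after a bump follows the usual SemVer convention and is an assumption; the reference only says the specified part is incremented.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from enum import Enum


class VersionType(Enum):
    MAJOR = 0
    MINOR = 1
    PATCH = 2


class VersionTag(Enum):
    TEST = "test"
    PROD = "prod"


@dataclass(frozen=True)
class Version:
    """Stand-in for the documented class; only three methods are sketched."""

    version_parts: tuple[int, int, int]
    version_tag: VersionTag | None = None

    # Unannotated class attributes are constants, not dataclass fields.
    VERSION_SPLIT = "."
    VERSION_PARTS = 3

    @staticmethod
    def from_string(version_string: str) -> Version:
        parts = [int(p) for p in version_string.split(Version.VERSION_SPLIT)]
        # Pad missing parts with zero, so "1.2" becomes (1, 2, 0).
        parts += [0] * (Version.VERSION_PARTS - len(parts))
        return Version((parts[0], parts[1], parts[2]))

    def with_tag(self, version_tag: VersionTag) -> Version:
        # Returns a copy; the original instance is left unchanged.
        return replace(self, version_tag=version_tag)

    def next_version(self, version_type: VersionType) -> Version:
        parts = list(self.version_parts)
        parts[version_type.value] += 1
        # Assumption: lower-order parts reset to zero (SemVer convention).
        for i in range(version_type.value + 1, len(parts)):
            parts[i] = 0
        return replace(self, version_parts=(parts[0], parts[1], parts[2]))


padded = Version.from_string("1.2")
bumped = Version.from_string("1.2.3").next_version(VersionType.MINOR)
tagged = padded.with_tag(VersionTag.TEST)
```

Using the enum value as the tuple index is what the documented `MAJOR = 0`, `MINOR = 1`, `PATCH = 2` assignments suggest.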
- class config.version.ReleaseInformation(version: Version, output_dir: Path, file_prefix: str = 'ticket-dataset-', file_extension: str = '.json', release_type: VersionType = VersionType.PATCH)
Bases:
object
Holds metadata for a release and computes output file paths.
- output_dir
Directory to write the file.
- Type:
Path
- file_prefix
Prefix for the output filename.
- Type:
str
- file_extension
Extension for the output filename.
- Type:
str
- release_type
Which part to bump next.
- Type:
VersionType
- output_dir: Path
- file_prefix: str = 'ticket-dataset-'
- file_extension: str = '.json'
- release_type: VersionType = VersionType.PATCH
- get_next_existing() -> Self
Recursively find the next release that does not exist yet, raising an error if the version limit is exceeded.
- property next_version: Self
Return a new ReleaseInformation with its version bumped.
- __init__(version: Version, output_dir: Path, file_prefix: str = 'ticket-dataset-', file_extension: str = '.json', release_type: VersionType = VersionType.PATCH) -> None
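The recursive search described for get_next_existing might look roughly like the sketch below. Several details are assumptions not confirmed by this reference: the stand-in class `Release`, its method name `first_free`, the file naming scheme (prefix, then `v`, then the parts joined by `_`), the patch-only bump, and the way the limit of 100 is applied as a recursion depth.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass, replace
from pathlib import Path

VERSION_LIMIT = 100  # mirrors the documented Version.VERSION_LIMIT value


@dataclass(frozen=True)
class Release:
    """Hypothetical stand-in for ReleaseInformation using a plain tuple version."""

    version: tuple[int, int, int]
    output_dir: Path
    file_prefix: str = "ticket-dataset-"
    file_extension: str = ".json"

    @property
    def output_file(self) -> Path:
        # Assumed naming scheme: <prefix>v<major>_<minor>_<patch><extension>
        name = "v" + "_".join(str(part) for part in self.version)
        return self.output_dir / f"{self.file_prefix}{name}{self.file_extension}"

    @property
    def next_version(self) -> Release:
        # Simplification: always bump the patch part.
        major, minor, patch = self.version
        return replace(self, version=(major, minor, patch + 1))

    def first_free(self, depth: int = 0) -> Release:
        # Recurse until a release without an output file is found.
        if depth >= VERSION_LIMIT:
            raise RuntimeError("version limit exceeded")
        if not self.output_file.exists():
            return self
        return self.next_version.first_free(depth + 1)


with tempfile.TemporaryDirectory() as tmp:
    base = Release(version=(1, 0, 0), output_dir=Path(tmp))
    base.output_file.touch()               # pretend 1.0.0 was already written
    base.next_version.output_file.touch()  # and 1.0.1 as well
    free = base.first_free()
    free_version = free.version
    free_name = free.output_file.name
```

Bounding the recursion keeps a misconfigured output directory from causing an unbounded search.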