util package¶

Subpackages¶

Submodules¶

util.csv_json_converter module¶

util.csv_json_converter.json_to_csv_splitting_tags(json_file_path: Path, columns: list[str], output_file: Path, delete_duplicate_value_columns: list[str], tag_column_key: str = 'tags', tag_columns_prefix: str = 'tag_', order_by: str | None = None, shuffle=True, number_tags=5, max_size=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)[source]¶

util.csv_json_converter.json_to_csv(json_file_path: Path, csv_file_path: Path, shuffle=True, max_size=1000000)[source]¶

util.csv_json_converter.is_value_in_set(value: str | list | dict, _set: set) → bool[source]¶

util.csv_json_converter.delete_duplicate_values(tickets: list[dict], columns: list[str]) → list[dict][source]¶

util.csv_json_converter.csv_to_json(csv_file_path, json_file_path)[source]¶

util.lin_reg_plot_helper module¶

util.lin_reg_plot_helper.plot_regression(x, y, coeffs, prediction_xs=None)[source]¶

util.lin_reg_plot_helper.plot_scatter(x1s, y1s, label1, x2s, y2s, label2)[source]¶

class util.lin_reg_plot_helper.LinRegPredictor(coeffs)[source]¶

Bases: object

__init__(coeffs)[source]¶

predict(x)[source]¶

get_quadratic_coeff()[source]¶

util.merge_datasets module¶

class util.merge_datasets.FileInformation(file_path: pathlib.Path, version: int)[source]¶

Bases: object

file_path: Path¶

version: int¶

__init__(file_path: Path, version: int) → None¶

util.merge_datasets.merge_json_and_assign_uuid(files: list[FileInformation], output: Path)[source]¶

util.number_interval_generator module¶

class util.number_interval_generator.NumberInterval(lower_bound: int, upper_bound: int)[source]¶

Bases: object

Represents a numeric interval with a lower and upper bound.

lower_bound¶

The lower limit of the interval.

Type:: int

upper_bound¶

The upper limit of the interval.

Type:: int

lower_bound: int¶

upper_bound: int¶

static create_unbounded_interval()[source]¶: Creates a NumberInterval with no bounds, spanning from negative to positive infinity.

static create_positive_unbounded_interval()[source]¶: Creates a NumberInterval spanning from 0 to positive infinity.

property range: int¶: Returns the range of the interval.

__init__(lower_bound: int, upper_bound: int) → None¶
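
The following is a minimal sketch of how an integer-typed interval with "unbounded" constructors could look. It is not the module's source: the class name `IntervalSketch`, the sentinel `UNBOUNDED_LIMIT`, and the reading of `range` as upper bound minus lower bound are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical sentinel standing in for "infinity" on an int-typed field.
# The value actually used by the module may differ.
UNBOUNDED_LIMIT = 10**100


@dataclass(frozen=True)
class IntervalSketch:
    """Illustrative stand-in for NumberInterval (not the real class)."""

    lower_bound: int
    upper_bound: int

    @staticmethod
    def create_unbounded_interval() -> "IntervalSketch":
        # Spans from the negative sentinel to the positive sentinel.
        return IntervalSketch(-UNBOUNDED_LIMIT, UNBOUNDED_LIMIT)

    @staticmethod
    def create_positive_unbounded_interval() -> "IntervalSketch":
        # Spans from 0 to the positive sentinel.
        return IntervalSketch(0, UNBOUNDED_LIMIT)

    @property
    def range(self) -> int:
        # Assumption: "range" means upper bound minus lower bound.
        return self.upper_bound - self.lower_bound
```

Using a large integer sentinel instead of `float("inf")` keeps both bounds as `int`, which matches the annotated field types above.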

class util.number_interval_generator.NormalizedNumberGenerator(*, mean: float, number_bounds: NumberInterval = NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), standard_deviation: Annotated[float, Gt(gt=0)])[source]¶

Bases: BaseModel

Generates random numbers based on a normal distribution, constrained by a numeric interval.

mean¶

The mean of the normal distribution.

Type:: float

number_bounds¶

The bounds within which generated numbers must fall.

Type:: NumberInterval

standard_deviation¶

The standard deviation of the normal distribution.

Type:: float

mean: float¶

number_bounds: NumberInterval¶

standard_deviation: float¶

model_post_init(context: Any, /) → None¶: We need to both initialize private attributes and call the user-defined model_post_init method.

generate_bounded_number() → int[source]¶: Generates a random number that falls within the specified bounds.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}¶: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'mean': FieldInfo(annotation=float, required=True), 'number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)])}¶

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.

class util.number_interval_generator.NumberIntervalGenerator(*, mean: float, standard_deviation: Annotated[float, Gt(gt=0)], upper_bound_difference_log_factor: Annotated[float, Ge(ge=1)] = 5, min_upper_bound_log_base: Annotated[float, Gt(gt=1)] = 2, lower_number_bounds: NumberInterval = NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), lower_number_generator: NormalizedNumberGenerator = None)[source]¶

Bases: BaseModel

Generates a random numeric interval based on a normal distribution. The difference between the lower and upper bound is calculated on a logarithmic scale, so that larger values get a larger difference. For example, with log_base = 2 and factor = 5:

text_length=8 -> log(8, 2) * 5 = 3 * 5 = 15

text_length=256 -> log(256, 2) * 5 = 8 * 5 = 40

mean¶

The mean of the normal distribution.

Type:: float

standard_deviation¶

The standard deviation of the normal distribution.

Type:: float

upper_bound_difference_log_factor¶

Factor used to determine the upper bound relative to the lower bound.

Type:: float

min_upper_bound_log_base¶

Base for the logarithmic calculation of the upper bound.

Type:: float

lower_number_bounds¶

Bounds for generating the lower number.

Type:: NumberInterval

lower_number_generator¶

Generator for the lower bound value.

Type:: NormalizedNumberGenerator

mean: float¶

standard_deviation: float¶

upper_bound_difference_log_factor: float¶

min_upper_bound_log_base: float¶

lower_number_bounds: NumberInterval¶

lower_number_generator: NormalizedNumberGenerator¶

model_post_init(_NumberIntervalGenerator__context: any) → None[source]¶: Initializes the lower number generator if it is not provided.

generate_interval() → NumberInterval[source]¶

Generates a random numeric interval consisting of a lower and upper bound.

Returns:: A randomly generated numeric interval.
Return type:: NumberInterval
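
The logarithmic scaling described in the class docstring can be reproduced with a few lines of arithmetic. This is a sketch of the formula only, not the class's implementation: the function names are hypothetical, and adding the rounded difference to the lower bound to obtain the upper bound is an assumption.

```python
import math


def log_scaled_difference(value: float, log_base: float = 2.0, factor: float = 5.0) -> float:
    """Return log(value, log_base) * factor, as in the docstring example."""
    if value <= 0:
        raise ValueError("value must be positive to take its logarithm")
    if log_base <= 1:
        raise ValueError("log_base must be greater than 1")
    return math.log(value, log_base) * factor


def sketch_upper_bound(lower_bound: int, log_base: float = 2.0, factor: float = 5.0) -> int:
    # Assumption: the upper bound is the lower bound plus the rounded difference.
    return lower_bound + round(log_scaled_difference(lower_bound, log_base, factor))


# With log_base=2 and factor=5, a value of 8 gives a difference of about 15
# and a value of 256 gives about 40, so the gap grows slowly with the value.
for text_length in (8, 256):
    print(text_length, log_scaled_difference(text_length))
```

Because the difference grows with the logarithm of the value, a 32-fold increase from 8 to 256 widens the gap by less than a factor of three.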

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}¶: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'lower_number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'lower_number_generator': FieldInfo(annotation=NormalizedNumberGenerator, required=False, default=None), 'mean': FieldInfo(annotation=float, required=True), 'min_upper_bound_log_base': FieldInfo(annotation=float, required=False, default=2, metadata=[Gt(gt=1)]), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)]), 'upper_bound_difference_log_factor': FieldInfo(annotation=float, required=False, default=5, metadata=[Ge(ge=1)])}¶

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.

util.number_interval_generator_test module¶

util.number_interval_generator_test.test_equal_method()[source]¶

util.number_interval_generator_test.get_random_intervals(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[NumberInterval][source]¶

util.number_interval_generator_test.get_random_numbers(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[int][source]¶

util.number_interval_generator_test.test_random_interval_distribution(text_length_mean, text_length_standard_deviation)[source]¶

util.number_interval_generator_test.test_random_numbers_intervals_are_in_bound(lower_number_min_value)[source]¶

util.number_interval_generator_test.test_text_length_upper_bound()[source]¶

Module contents¶