util package

Subpackages

Submodules

util.csv_json_converter module

util.lin_reg_plot_helper module

util.lin_reg_plot_helper.plot_regression(x, y, coeffs, prediction_xs=None)

util.lin_reg_plot_helper.plot_scatter(x1s, y1s, label1, x2s, y2s, label2)

class util.lin_reg_plot_helper.LinRegPredictor(coeffs)

Bases: object

__init__(coeffs)

predict(x)

get_quadratic_coeff()

util.merge_datasets module

class util.merge_datasets.FileInformation(file_path: Path, version: int)

Bases: object

file_path: Path

version: int

__init__(file_path: Path, version: int) → None

util.merge_datasets.merge_json_and_assign_uuid(files: list[FileInformation], output: Path)
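The signature above does not document the file format or the merge behavior. The following is only a hypothetical sketch of what such a function might do, assuming each input file holds a JSON array of objects and that every merged record receives a fresh UUID plus the version of its source file. The names FileInformationSketch and merge_json_and_assign_uuid_sketch, and the "uuid" and "version" keys, are illustrative assumptions, not the module's actual implementation.

```python
import json
import uuid
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileInformationSketch:
    """Hypothetical stand-in mirroring the documented fields of FileInformation."""

    file_path: Path
    version: int


def merge_json_and_assign_uuid_sketch(
    files: list[FileInformationSketch], output: Path
) -> None:
    """Merge records from several JSON files and tag each with a UUID."""
    merged = []
    for info in files:
        # Assumption: each input file contains a JSON array of objects.
        records = json.loads(info.file_path.read_text(encoding="utf-8"))
        for record in records:
            # Assumption: the key names "uuid" and "version" are illustrative.
            merged.append(
                {**record, "uuid": str(uuid.uuid4()), "version": info.version}
            )
    output.write_text(json.dumps(merged, indent=2), encoding="utf-8")
```

Generating the identifier with uuid.uuid4() keeps records distinguishable after merging even when two source files contain identical entries.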

util.number_interval_generator module

class util.number_interval_generator.NumberInterval(lower_bound: int, upper_bound: int)

Bases: object

Represents a numeric interval with a lower and upper bound.

lower_bound

The lower limit of the interval.

Type: int

upper_bound

The upper limit of the interval.

Type: int

lower_bound: int

upper_bound: int

static create_unbounded_interval(): Creates a NumberInterval with no bounds, spanning from negative to positive infinity.

static create_positive_unbounded_interval(): Creates a NumberInterval spanning from 0 to positive infinity.

property range: int: Returns the range of the interval.

__init__(lower_bound: int, upper_bound: int) → None
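As an illustration of how the members above fit together, here is a minimal stand-in sketch. It assumes that "unbounded" is represented by a very large finite integer (the hypothetical constant LARGE_NUMBER) and that the range property is the width of the interval; neither assumption is confirmed by this page, and NumberIntervalSketch is not the real class.

```python
from dataclasses import dataclass

# Assumption: a very large sentinel stands in for "infinity"; the real
# module may use a different constant.
LARGE_NUMBER = 10**100


@dataclass
class NumberIntervalSketch:
    """Hypothetical stand-in for NumberInterval."""

    lower_bound: int
    upper_bound: int

    @staticmethod
    def create_unbounded_interval() -> "NumberIntervalSketch":
        # Spans from a very large negative to a very large positive value.
        return NumberIntervalSketch(-LARGE_NUMBER, LARGE_NUMBER)

    @staticmethod
    def create_positive_unbounded_interval() -> "NumberIntervalSketch":
        # Spans from 0 to a very large positive value.
        return NumberIntervalSketch(0, LARGE_NUMBER)

    @property
    def range(self) -> int:
        # Assumption: "range" means the width of the interval.
        return self.upper_bound - self.lower_bound


interval = NumberIntervalSketch(lower_bound=10, upper_bound=25)
width = interval.range  # upper_bound - lower_bound under the width assumption
```

Using a dataclass gives value-based equality for free, so two intervals with the same bounds compare equal.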

class util.number_interval_generator.NormalizedNumberGenerator(*, mean: float, number_bounds: NumberInterval = NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), standard_deviation: Annotated[float, Gt(gt=0)])[source]

Bases: BaseModel

Generates random numbers based on a normal distribution, constrained by a numeric interval.

mean

The mean of the normal distribution.

Type: float

number_bounds

The bounds within which generated numbers must fall.

Type: NumberInterval

standard_deviation

The standard deviation of the normal distribution.

Type: float

mean: float

number_bounds: NumberInterval

standard_deviation: float

model_post_init(context: Any, /) → None: We need to both initialize private attributes and call the user-defined model_post_init method.

generate_bounded_number() → int: Generates a random number that falls within the specified bounds.
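This page does not say how out-of-bounds draws are handled. One plausible approach is rejection sampling: draw from the normal distribution and retry until the rounded value lies inside the bounds. The sketch below assumes that strategy; the function name generate_bounded_number_sketch and the rng and max_attempts parameters are hypothetical, and the real method may clamp values or work differently.

```python
import random
from typing import Optional


def generate_bounded_number_sketch(
    mean: float,
    standard_deviation: float,
    lower_bound: int,
    upper_bound: int,
    rng: Optional[random.Random] = None,
    max_attempts: int = 10_000,
) -> int:
    """Draw from a normal distribution until the rounded value is in bounds."""
    if standard_deviation <= 0:
        raise ValueError("standard_deviation must be greater than 0")
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    rng = rng or random.Random()
    for _ in range(max_attempts):
        # Round the float sample to the nearest integer before checking bounds.
        candidate = round(rng.gauss(mean, standard_deviation))
        if lower_bound <= candidate <= upper_bound:
            return candidate
    # Assumption: give up instead of looping forever when the bounds lie
    # far away from the mean.
    raise RuntimeError("no sample fell within the bounds")
```

Rejection sampling preserves the shape of the distribution inside the bounds, whereas clamping would pile probability mass onto the two boundary values.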

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'mean': FieldInfo(annotation=float, required=True), 'number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)])}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class util.number_interval_generator.NumberIntervalGenerator(*, mean: float, standard_deviation: Annotated[float, Gt(gt=0)], upper_bound_difference_log_factor: Annotated[float, Ge(ge=1)] = 5, min_upper_bound_log_base: Annotated[float, Gt(gt=1)] = 2, lower_number_bounds: NumberInterval = NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), lower_number_generator: NormalizedNumberGenerator = None)[source]

Bases: BaseModel

Generates a random numeric interval based on a normal distribution. The difference between the lower and upper bound is calculated on a logarithmic scale, so that larger values get a larger difference. For example, with log_base = 2 and factor = 5:

text_length = 8 -> log(8, 2) * 5 = 3 * 5 = 15
text_length = 256 -> log(256, 2) * 5 = 8 * 5 = 40

mean

The mean of the normal distribution.

Type: float

standard_deviation

The standard deviation of the normal distribution.

Type: float

upper_bound_difference_log_factor

Factor used to determine the upper bound relative to the lower bound.

Type: float

min_upper_bound_log_base

Base for the logarithmic calculation of the upper bound.

Type: float

lower_number_bounds

Bounds for generating the lower number.

Type: NumberInterval

lower_number_generator

Generator for the lower bound value.

Type: NormalizedNumberGenerator

mean: float

standard_deviation: float

upper_bound_difference_log_factor: float

min_upper_bound_log_base: float

lower_number_bounds: NumberInterval

lower_number_generator: NormalizedNumberGenerator

model_post_init(_NumberIntervalGenerator__context: any) → None: Initializes the lower number generator if it is not provided.

generate_interval() → NumberInterval

Generates a random numeric interval consisting of a lower and upper bound.

Returns: A randomly generated numeric interval.
Return type: NumberInterval
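The logarithmic difference from the class description, log(value, log_base) * factor, can be reproduced with a few lines of Python. The sketch below is an illustration under assumptions: the function names are hypothetical, and the rule that the upper bound equals the lower bound plus the rounded difference is a guess at how generate_interval combines the two values.

```python
import math


def upper_bound_difference(
    value: float, log_base: float = 2, factor: float = 5
) -> float:
    """Logarithmic difference, mirroring log(value, log_base) * factor."""
    return math.log(value, log_base) * factor


def interval_from_lower_sketch(
    lower: int, log_base: float = 2, factor: float = 5
) -> tuple[int, int]:
    """Derive an upper bound from a lower bound on a logarithmic scale."""
    if lower < 1:
        # Assumption: the sketch rejects values whose logarithm is
        # undefined or negative; the real class may handle them differently.
        raise ValueError("lower must be at least 1")
    # Assumption: upper bound = lower bound + rounded logarithmic difference.
    difference = round(upper_bound_difference(lower, log_base, factor))
    return lower, lower + difference


small = upper_bound_difference(8)    # about 15, as in the description
large = upper_bound_difference(256)  # about 40, as in the description
```

Because the difference grows with the logarithm of the value, a 32-fold increase in the lower value (8 to 256) widens the interval by less than a factor of three.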

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'lower_number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'lower_number_generator': FieldInfo(annotation=NormalizedNumberGenerator, required=False, default=None), 'mean': FieldInfo(annotation=float, required=True), 'min_upper_bound_log_base': FieldInfo(annotation=float, required=False, default=2, metadata=[Gt(gt=1)]), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)]), 'upper_bound_difference_log_factor': FieldInfo(annotation=float, required=False, default=5, metadata=[Ge(ge=1)])}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

util.number_interval_generator_test module

util.number_interval_generator_test.test_equal_method()

util.number_interval_generator_test.get_random_intervals(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[NumberInterval]

util.number_interval_generator_test.get_random_numbers(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[int]

util.number_interval_generator_test.test_random_interval_distribution(text_length_mean, text_length_standard_deviation)

util.number_interval_generator_test.test_random_numbers_intervals_are_in_bound(lower_number_min_value)

util.number_interval_generator_test.test_text_length_upper_bound()
