util package
Subpackages
- util.formatting package
- util.text_similarity package
- Subpackages
- Submodules
- util.text_similarity.max_independent_set_calc module
- util.text_similarity.max_independent_set_calc_test module
PAIRS_LIN_LOG_FUNC()
CalcNumPairs
StaticCalcNumPairs
LogCalcNumPairs
LinSquareRootCalcNumPairs
RandomGraphGenerator
MeasurementResult
MeasureIndependentSetCalc
greedy_calc()
approx_calc()
optimal_calc()
random_graph_gen()
dense_graph_gen()
test_find_max_set()
test_time_complexity()
ScatterData
plot_algos()
test_greedy_accuracy()
- util.text_similarity.texts_similarity_filter module
- util.text_similarity.texts_similarity_filter_test module
- Module contents
Submodules
util.csv_json_converter module
util.lin_reg_plot_helper module
util.merge_datasets module
- class util.merge_datasets.FileInformation(file_path: pathlib._local.Path, version: int)[source]
Bases:
object
- file_path: Path
- version: int
- __init__(file_path: Path, version: int) → None
- util.merge_datasets.merge_json_and_assign_uuid(files: list[FileInformation], output: Path)[source]
util.number_interval_generator module
- class util.number_interval_generator.NumberInterval(lower_bound: int, upper_bound: int)[source]
Bases:
object
Represents a numeric interval with a lower and upper bound.
- lower_bound
The lower limit of the interval.
- Type:
int
- upper_bound
The upper limit of the interval.
- Type:
int
- lower_bound: int
- upper_bound: int
- static create_unbounded_interval()[source]
Creates a NumberInterval with no bounds, spanning from negative to positive infinity.
- static create_positive_unbounded_interval()[source]
Creates a NumberInterval, spanning from 0 to positive infinity.
- property range: int
Returns the range of the interval.
- __init__(lower_bound: int, upper_bound: int) → None
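The attributes and helpers listed for NumberInterval can be pictured with a minimal sketch. This is not the module's source: the class name `NumberIntervalSketch`, the `UNBOUNDED_LIMIT` constant standing in for infinity, and the definition of `range` as `upper_bound - lower_bound` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for "infinity"; the real module may use another value.
UNBOUNDED_LIMIT = 10**100


@dataclass
class NumberIntervalSketch:
    """Illustrative stand-in for NumberInterval, not the actual class."""

    lower_bound: int
    upper_bound: int

    @staticmethod
    def create_unbounded_interval() -> "NumberIntervalSketch":
        # Spans from a very large negative to a very large positive limit.
        return NumberIntervalSketch(-UNBOUNDED_LIMIT, UNBOUNDED_LIMIT)

    @staticmethod
    def create_positive_unbounded_interval() -> "NumberIntervalSketch":
        # Spans from 0 to a very large positive limit.
        return NumberIntervalSketch(0, UNBOUNDED_LIMIT)

    @property
    def range(self) -> int:
        # Assumption: the range is the distance between the two bounds.
        return self.upper_bound - self.lower_bound
```

Under these assumptions, `NumberIntervalSketch(3, 10).range` is the difference of the two bounds, and both factory methods return ordinary instances that can be passed wherever an interval is expected.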
- class util.number_interval_generator.NormalizedNumberGenerator(*, mean: float, number_bounds: NumberInterval = NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), standard_deviation: Annotated[float, Gt(gt=0)])[source]
Bases:
BaseModel
Generates random numbers based on a normal distribution, constrained by a numeric interval.
- mean
The mean of the normal distribution.
- Type:
float
- number_bounds
The bounds within which generated numbers must fall.
- Type:
NumberInterval
- standard_deviation
The standard deviation of the normal distribution.
- Type:
float
- mean: float
- number_bounds: NumberInterval
- standard_deviation: float
- model_post_init(context: Any, /) → None
We need to both initialize private attributes and call the user-defined model_post_init method.
- generate_bounded_number() → int [source]
Generates a random number that falls within the specified bounds.
- model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[dict[str, FieldInfo]] = {'mean': FieldInfo(annotation=float, required=True), 'number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)])}
Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
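One way to draw a normally distributed number that respects an interval is rejection sampling: draw, round, and retry until the value lands inside the bounds. The sketch below illustrates that idea only. Whether NormalizedNumberGenerator retries, clamps, or does something else is not stated on this page, and the standalone function signature and the `rng` parameter are hypothetical.

```python
import random
from typing import Optional


def generate_bounded_number(
    mean: float,
    standard_deviation: float,
    lower_bound: int,
    upper_bound: int,
    rng: Optional[random.Random] = None,
) -> int:
    """Draw from a normal distribution until the rounded value is in bounds."""
    if standard_deviation <= 0:
        raise ValueError("standard_deviation must be greater than 0")
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    rng = rng or random.Random()
    while True:
        # Assumption: out-of-range draws are discarded rather than clamped.
        candidate = round(rng.gauss(mean, standard_deviation))
        if lower_bound <= candidate <= upper_bound:
            return candidate
```

Rejection keeps the shape of the distribution inside the interval, but it can loop for a long time when the bounds lie many standard deviations away from the mean; clamping would avoid that at the cost of piling values up on the bounds.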
- class util.number_interval_generator.NumberIntervalGenerator(*, mean: float, standard_deviation: Annotated[float, Gt(gt=0)], upper_bound_difference_log_factor: Annotated[float, Ge(ge=1)] = 5, min_upper_bound_log_base: Annotated[float, Gt(gt=1)] = 2, lower_number_bounds: NumberInterval = NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000), lower_number_generator: NormalizedNumberGenerator = None)[source]
Bases:
BaseModel
Generates a random numeric interval based on a normal distribution. The difference between the lower and upper bound is calculated on a logarithmic scale, so that larger values get a larger difference. For example, with log_base = 2 and factor = 5:
text_length=8 -> log(8, 2) * 5 = 3 * 5 = 15
text_length=256 -> log(256, 2) * 5 = 8 * 5 = 40
- mean
The mean of the normal distribution.
- Type:
float
- standard_deviation
The standard deviation of the normal distribution.
- Type:
float
- upper_bound_difference_log_factor
Factor used to determine the upper bound relative to the lower bound.
- Type:
float
- min_upper_bound_log_base
Base for the logarithmic calculation of the upper bound.
- Type:
float
- lower_number_bounds
Bounds for generating the lower number.
- Type:
NumberInterval
- lower_number_generator
Generator for the lower bound value.
- Type:
NormalizedNumberGenerator
- mean: float
- standard_deviation: float
- upper_bound_difference_log_factor: float
- min_upper_bound_log_base: float
- lower_number_bounds: NumberInterval
- lower_number_generator: NormalizedNumberGenerator
- model_post_init(_NumberIntervalGenerator__context: any) → None [source]
Initializes the lower number generator if it is not provided.
- generate_interval() → NumberInterval [source]
Generates a random numeric interval consisting of a lower and upper bound.
- Returns:
A randomly generated numeric interval.
- Return type:
NumberInterval
- model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[dict[str, FieldInfo]] = {'lower_number_bounds': FieldInfo(annotation=NumberInterval, required=False, default=NumberInterval(lower_bound=0, upper_bound=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)), 'lower_number_generator': FieldInfo(annotation=NormalizedNumberGenerator, required=False, default=None), 'mean': FieldInfo(annotation=float, required=True), 'min_upper_bound_log_base': FieldInfo(annotation=float, required=False, default=2, metadata=[Gt(gt=1)]), 'standard_deviation': FieldInfo(annotation=float, required=True, metadata=[Gt(gt=0)]), 'upper_bound_difference_log_factor': FieldInfo(annotation=float, required=False, default=5, metadata=[Ge(ge=1)])}
Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
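The logarithmic scaling described in the NumberIntervalGenerator docstring can be checked with a few lines of arithmetic. The function below reproduces only the formula log(value, log_base) * factor from the worked example; the name `upper_bound_difference` is hypothetical, and how the class combines this difference with the generated lower bound is an assumption not confirmed by this page.

```python
import math


def upper_bound_difference(value: float, log_base: float = 2, factor: float = 5) -> float:
    """Return log(value, log_base) * factor, the formula from the docstring."""
    if value <= 0:
        raise ValueError("value must be positive")
    if log_base <= 1:
        raise ValueError("log_base must be greater than 1")
    return math.log(value, log_base) * factor


# The two cases from the docstring: text_length=8 and text_length=256.
for text_length in (8, 256):
    print(text_length, round(upper_bound_difference(text_length), 6))
```

The defaults mirror the documented defaults of min_upper_bound_log_base (2) and upper_bound_difference_log_factor (5). Because the difference grows with the logarithm of the value, a 32-fold increase in text length (8 to 256) widens the interval by less than a factor of three.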
util.number_interval_generator_test module
- util.number_interval_generator_test.get_random_intervals(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[NumberInterval] [source]
- util.number_interval_generator_test.get_random_numbers(mean: int, standard_deviation: int, _lower_number_min_value: int = 0) → list[int] [source]
- util.number_interval_generator_test.test_random_interval_distribution(text_length_mean, text_length_standard_deviation)[source]