dm.coefficients package

Submodules

dm.coefficients.AbstractLineCoefficients module

Gets the coefficients of a line equation in the general form ax + by + c = 0 from an equation in the slope-intercept form y = kx + q.

class dm.coefficients.AbstractLineCoefficients.AbstractLineCoefficients

Bases: abc.ABC

abstract calculate(data, interval, col1, col2, col3, point_x, point_y)

It calculates the slope of one or more lines.

Parameters
  • data – training data

  • interval – interval that denotes a ventilation length (class, cluster)

  • col1 – column name containing values of specific humidity measured when the events started

  • col2 – column name containing values of specific humidity measured when the events finished

  • col3 – column name containing values of differences between indoor and outdoor specific humidity when the events started

  • point_x – x-coordinate of a point (cluster centroid)

  • point_y – y-coordinate of a point (cluster centroid)

Returns

slope of line(s)

convert_line(coeffs)

It converts the line equation y = kx + q to the general form ax + by + c = 0.

Parameters

coeffs – coefficients of a line

Returns

general form of a line
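
The conversion can be illustrated with a minimal sketch. It assumes the line is given by its slope k and intercept q (the actual layout of coeffs is not documented here). Rearranging y = kx + q to kx - y + q = 0 gives a = k, b = -1, c = q. The name to_general_form is hypothetical and is not part of the package.

```python
def to_general_form(k, q):
    """Rewrite y = k*x + q as a*x + b*y + c = 0.

    Assumed sign convention: a = k, b = -1, c = q. Any nonzero
    multiple of (a, b, c) describes the same line.
    """
    return k, -1.0, q


a, b, c = to_general_form(2.0, 3.0)

# The point (1, 5) lies on y = 2x + 3, so it must satisfy the general form.
residual = a * 1.0 + b * 5.0 + c
```

A residual of zero for a point known to lie on the line is a quick check that the chosen signs are consistent.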

dm.coefficients.CenterLineSlope module

Calculates slope of a line passing through a point.

class dm.coefficients.CenterLineSlope.CenterLineSlope

Bases: dm.coefficients.AbstractLineCoefficients.AbstractLineCoefficients

calculate(data, interval, col1, col2, col3, point_x, point_y)

It calculates the slope of a line passing through a point.

Parameters
  • data – training data

  • interval – interval that denotes a ventilation length (class, cluster)

  • col1 – column name containing values of specific humidity measured when the events started

  • col2 – column name containing values of specific humidity measured when the events finished

  • col3 – column name containing values of differences between indoor and outdoor specific humidity when the events started

  • point_x – x-coordinate of a point (cluster centroid)

  • point_y – y-coordinate of a point (cluster centroid)

Returns

slope of a line passing through a point

dm.coefficients.DistanceToLine module

Calculates the distance between two points and between a line and a point.

Finds clusters of data using the decrease in indoor specific humidity and the difference between indoor and outdoor humidity. It then calculates the distance between a point and the cluster centroid and the distance from the point to the cluster trendline.

class dm.coefficients.DistanceToLine.DistanceToLine(training)

Bases: object

distance_point_line(a1, a2, a, b, c)

It calculates the distance from a point to a line.

Parameters
  • a1 – point coordinate x

  • a2 – point coordinate y

  • a – coefficient a of the line equation ax + by + c = 0

  • b – coefficient b of the line equation ax + by + c = 0

  • c – coefficient c of the line equation ax + by + c = 0

Returns

distance from point to line
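
For a line in the general form ax + by + c = 0, the standard point-to-line distance is |a·a1 + b·a2 + c| / sqrt(a² + b²). The sketch below applies that formula. It is an illustration, not the package's own code, and the name point_line_distance and the zero check are assumptions.

```python
import math


def point_line_distance(a1, a2, a, b, c):
    """Distance from point (a1, a2) to the line a*x + b*y + c = 0."""
    norm = math.hypot(a, b)
    if norm == 0:
        # a = b = 0 does not describe a line.
        raise ValueError("a and b must not both be zero")
    return abs(a * a1 + b * a2 + c) / norm


# Distance from the origin to the line 3x + 4y - 10 = 0.
d = point_line_distance(0.0, 0.0, 3.0, 4.0, -10.0)
```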

distance_point_point_Euclidean(a1, a2, b1, b2)

It calculates the Euclidean distance between two points.

Parameters
  • a1 – point 1 coordinate x

  • a2 – point 1 coordinate y

  • b1 – point 2 coordinate x

  • b2 – point 2 coordinate y

Returns

distance from point to point
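
The Euclidean distance between (a1, a2) and (b1, b2) is sqrt((b1 - a1)² + (b2 - a2)²). The sketch below uses only the Python standard library. The name euclidean_distance is hypothetical, and the package's own implementation may differ.

```python
import math


def euclidean_distance(a1, a2, b1, b2):
    """Euclidean distance between points (a1, a2) and (b1, b2)."""
    return math.hypot(b1 - a1, b2 - a2)


# The points (1, 2) and (4, 6) differ by 3 in x and by 4 in y.
d = euclidean_distance(1.0, 2.0, 4.0, 6.0)
```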

exec(intervals, data_testing, col1, col2, col3, strategy, strategyFlag, one_line, test_points, cluster_boundaries, cluster_boundaries_all, precision=2)

It calculates the coefficients used in the general form of a line and the x-coordinate and y-coordinate of a cluster centroid, and it also creates an object for matplotlib using the humidity_clusters function. Moreover, it can create an output file containing the trendline or clusters. It creates a testing set and optionally plots the testing data points.

Parameters
  • intervals – list of intervals that denote ventilation lengths (class, cluster)

  • data_testing – testing data

  • col1 – column name containing values of specific humidity measured when the events started

  • col2 – column name containing values of specific humidity measured when the events finished

  • col3 – column name containing values of differences between indoor and outdoor specific humidity when the events started

  • strategy – strategy how to compute a slope of line(s)

  • strategyFlag – flag that denotes a used strategy for computation of a slope of line(s)

  • one_line – if only one line should be plotted

  • test_points – true if test points are plotted

  • cluster_boundaries – if cluster boundaries should be plotted

  • cluster_boundaries_all – if all cluster boundaries should be plotted

  • precision

Returns

coefficients used in general form of a line, x-coordinate and y-coordinate of a cluster centroid and object for matplotlib

humidity_clusters(training, col1, col2, col3, intervals, strategy, strategy_flag, one_line, cluster_boundaries, cluster_boundaries_all)

It calculates the coefficients used in the general form of a line and the x-coordinate and y-coordinate of a cluster centroid, and it creates an object for matplotlib.

Parameters
  • training – training data

  • col1 – column name containing values of specific humidity measured when the events started

  • col2 – column name containing values of specific humidity measured when the events finished

  • col3 – column name containing values of differences between indoor and outdoor specific humidity when the events started

  • intervals – list of intervals that denote ventilation lengths (class, cluster)

  • strategy – strategy how to compute a slope of line(s)

  • strategy_flag – flag that denotes a used strategy for computation of a slope of line(s)

  • one_line – if only one line should be plotted

  • cluster_boundaries – if cluster boundaries should be plotted

  • cluster_boundaries_all – if all cluster boundaries should be plotted

Returns

coefficients used in general form of a line, x-coordinate and y-coordinate of a cluster centroid and object for matplotlib

static select_attributes(data, attributes)

It selects required attributes.

Parameters
  • data – training or testing data

  • attributes – list of required attributes

Returns

list of required attributes

static ventilation_length_events(training: list, ventilation_length: int)

It gets the events assigned to a given class, that is, the events in which ventilation lasted approximately the same time.

Parameters
  • training – training data

  • ventilation_length – length of ventilation that corresponds with a class

Returns

list of events assigned to a given class

dm.coefficients.MathLineAvgSlope module

dm.coefficients.PolyfitLineAvgSlope module

Calculates average slope of multiple lines.

class dm.coefficients.PolyfitLineAvgSlope.PolyfitLineAvgSlope

Bases: dm.coefficients.AbstractLineCoefficients.AbstractLineCoefficients

calculate(data, interval, col1, col2, col3, point_x, point_y)

It calculates the average slope of multiple lines.

It uses the polyfit function from the NumPy library.

Parameters
  • data – training data

  • interval – interval that denotes a ventilation length (class, cluster)

  • col1 – column name containing values of specific humidity measured when the events started

  • col2 – column name containing values of specific humidity measured when the events finished

  • col3 – column name containing values of differences between indoor and outdoor specific humidity when the events started

  • point_x – deprecated

  • point_y – deprecated

Returns

average slope of multiple lines
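
The sketch below shows how numpy.polyfit can produce a slope. How the module builds lines from events and how it averages them is not documented here. The input format (one pair of x and y sequences per line), the plain arithmetic mean, and the name average_slope are all assumptions made for illustration.

```python
import numpy as np


def average_slope(lines):
    """Mean slope of several lines.

    lines: iterable of (xs, ys) pairs, one pair of sequences per line
    (hypothetical input format).
    """
    slopes = []
    for xs, ys in lines:
        # A degree-1 fit returns the coefficients [slope, intercept].
        slope, _intercept = np.polyfit(xs, ys, 1)
        slopes.append(slope)
    return float(np.mean(slopes))


lines = [
    ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),  # points on y = 2x + 1
    ([0.0, 1.0, 2.0], [0.0, 4.0, 8.0]),  # points on y = 4x
]
avg = average_slope(lines)
```

Because polyfit performs a least-squares fit, the result is approximate, so the averaged slope should be compared with a tolerance, not tested for exact equality.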

Module contents