Histogram.py - Various functions to deal with histograms
- Author
- Tags
Python
Histograms can be calculated from a list/tuple/array of values. The histogram returned is then a list of tuples of the format [(bin1,value1), (bin2,value2), …].
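To make the [(bin, value), …] format concrete, here is a minimal, self-contained sketch that builds such a list from raw values. The helper `equal_width_histogram` is hypothetical and not part of this module; labelling each bin by its lower edge is an assumption made for illustration only.

```python
def equal_width_histogram(values, num_bins):
    """Hypothetical helper: return [(bin_start, count), ...] for equal-width bins."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if all values are identical.
    width = (hi - lo) / num_bins or 1.0
    counts = [0] * num_bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    # Assumption: each bin is labelled by its lower edge.
    return [(lo + i * width, c) for i, c in enumerate(counts)]


h = equal_width_histogram([1, 2, 2, 3, 7, 8, 9], 4)
print(h)  # [(1.0, 3), (3.0, 1), (5.0, 0), (7.0, 3)]
```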
-
Histogram.CalculateFromTable(dbhandle, field_name, from_statement, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None)
Get a histogram using an SQL statement. Intervals can either be supplied directly or be built from the data by providing the number of bins and, optionally, a minimum or maximum value.
If no number of bins is provided, the bin size is 1.
This function uses the INTERVAL function from MySQL, i.e. a bin value determines the upper boundary of a bin.
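For readers unfamiliar with MySQL's INTERVAL(N, N1, N2, …), it returns 0 if N < N1, 1 if N < N2, and so on. The sketch below mimics that lookup in plain Python with `bisect`; the name `interval_index` is hypothetical, and how CalculateFromTable turns the returned index into bin labels is not shown in this documentation.

```python
import bisect


def interval_index(value, boundaries):
    """Hypothetical helper mimicking MySQL INTERVAL(value, b1, b2, ...).

    Assumes `boundaries` is sorted in increasing order and `value` is not NULL.
    Returns the number of boundaries that are <= value.
    """
    return bisect.bisect_right(boundaries, value)


print(interval_index(23, [1, 15, 17, 30, 44, 200]))  # 3
print(interval_index(10, [1, 10, 100, 1000]))        # 2
```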
-
Histogram.CalculateConst(values, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None, combine=None)
Calculate a histogram based on a list or tuple of values.
-
Histogram.Calculate(values, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None, combine=None, no_empty_bins=0, dynamic_bins=False, ignore_out_of_range=True)
Calculate a histogram based on a list or tuple of values.
Uses scipy for the calculation.
-
Histogram.Scale(h, scale=1.0)
Rescale bins in histogram.
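One plausible reading of "rescale bins" is that every bin label is multiplied by the scale factor while the values are left unchanged. The sketch below illustrates that reading only; `scale_bins_sketch` is a hypothetical name and the behaviour is an assumption, not a statement about the module's implementation.

```python
def scale_bins_sketch(h, scale=1.0):
    """Hypothetical sketch: multiply each bin label by `scale`, keep values as-is."""
    # Assumption: only the bin labels are rescaled, not the counts.
    return [(b * scale, v) for b, v in h]


print(scale_bins_sketch([(1, 4), (2, 6)], 0.5))  # [(0.5, 4), (1.0, 6)]
```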
-
Histogram.convert(h, i, no_empty_bins=0)
Add bins to histogram.
-
Histogram.Combine(source_histograms, missing_value=0)
Combine a list of histograms. Each histogram is a sorted list of bins and counts. The counts can be tuples.
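The idea of combining histograms can be sketched as a merge over the union of all bins, with `missing_value` standing in wherever a histogram lacks a bin. The function `combine_sketch` and its output layout (one tuple of counts per bin) are assumptions for illustration.

```python
def combine_sketch(histograms, missing_value=0):
    """Hypothetical sketch: merge histograms on the union of their bins."""
    # Union of all bins, in sorted order.
    bins = sorted({b for h in histograms for b, _ in h})
    lookups = [dict(h) for h in histograms]
    # Assumption: the result holds one tuple of counts per bin,
    # in the order the source histograms were given.
    return [(b, tuple(d.get(b, missing_value) for d in lookups)) for b in bins]


h1 = [(1, 2), (2, 3)]
h2 = [(2, 5), (3, 1)]
print(combine_sketch([h1, h2]))  # [(1, (2, 0)), (2, (3, 5)), (3, (0, 1))]
```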
-
Histogram.Print(h, intervalls=None, format=0, nonull=None, format_value=None, format_bin=None)
Print a histogram.
A histogram can either be a list/tuple of values or a list/tuple of lists/tuples where the first value contains the bin and second contains the values (which can again be a list/tuple).
- Parameters
format – output format. 0 = print histogram in several lines, 1 = print histogram on single line
-
Histogram.Write(outfile, h, intervalls=None, format=0, nonull=None, format_value=None, format_bin=None)
Write a histogram to outfile.
A histogram can either be a list/tuple of values or a list/tuple of lists/tuples where the first value contains the bin and second contains the values (which can again be a list/tuple).
- Parameters
format – output format. 0 = print histogram in several lines, 1 = print histogram on single line
-
Histogram.Fill(h)
Fill every empty value in histogram with previous value.
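A forward fill of this kind might look like the sketch below. What counts as "empty" is not specified here, so the sketch assumes a value of 0 means empty; `fill_forward_sketch` and its `empty` parameter are hypothetical.

```python
def fill_forward_sketch(h, empty=0):
    """Hypothetical sketch: replace empty values with the last non-empty value."""
    out, previous = [], empty
    for b, v in h:
        if v == empty:
            # Assumption: leading empty values have no predecessor and stay empty.
            v = previous
        else:
            previous = v
        out.append((b, v))
    return out


print(fill_forward_sketch([(0, 5), (1, 0), (2, 0), (3, 2), (4, 0)]))
# [(0, 5), (1, 5), (2, 5), (3, 2), (4, 2)]
```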
-
Histogram.Add(h1, h2)
Add the values of histograms h1 and h2 and return a new histogram.
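Adding two histograms amounts to summing the values of matching bins. The sketch below is one way to do that; `add_sketch` is a hypothetical name, and treating a bin present in only one histogram as 0 in the other is an assumption.

```python
def add_sketch(h1, h2):
    """Hypothetical sketch: sum the values of two (bin, value) histograms."""
    totals = dict(h1)
    for b, v in h2:
        # Assumption: a bin missing from one histogram counts as 0 there.
        totals[b] = totals.get(b, 0) + v
    # Return a new list sorted by bin; the inputs are not modified.
    return sorted(totals.items())


print(add_sketch([(0, 1), (1, 2)], [(1, 3), (2, 4)]))  # [(0, 1), (1, 5), (2, 4)]
```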
-
Histogram.SmoothWrap(histogram, window_size)
Smooth a histogram with a sliding-window method in which the window wraps around the borders. The sum of all values in the window is entered at the center of the window.
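The wrapped sliding window can be sketched with modular indexing. This is an illustration under stated assumptions: the histogram is taken to be a plain list of values, the window size is assumed odd so that it has a well-defined center, and `smooth_wrap_sketch` is a hypothetical name.

```python
def smooth_wrap_sketch(values, window_size):
    """Hypothetical sketch: circular sliding-window sum, written at the window center."""
    n = len(values)
    half = window_size // 2  # assumption: window_size is odd
    return [
        # Indices wrap around both borders via the modulo operation.
        sum(values[(i + k) % n] for k in range(-half, half + 1))
        for i in range(n)
    ]


print(smooth_wrap_sketch([1, 2, 3, 4], 3))  # [7, 6, 9, 8]
```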
-
Histogram.PrintAscii(histogram, step_size=1)
Print histogram ascii-style.
-
Histogram.Count(data)
Count categorized data. Returns a list of tuples with (count, token).
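Counting categorized data into (count, token) tuples can be sketched with `collections.Counter`. The name `count_sketch` is hypothetical, and sorting the result by increasing count is an assumption; the ordering used by the module is not documented here.

```python
from collections import Counter


def count_sketch(data):
    """Hypothetical sketch: return [(count, token), ...] for categorical data."""
    # Assumption: results are sorted by count, then by token.
    return sorted((n, token) for token, n in Counter(data).items())


print(count_sketch(["a", "b", "a", "c", "a", "b"]))  # [(1, 'c'), (2, 'b'), (3, 'a')]
```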
-
Histogram.Accumulate(h, num_bins=2, direction=1)
Add successive counts in histogram. Bins are labelled by group average.
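Grouping successive bins and labelling each group by the average of its bins could look like the following sketch. `accumulate_sketch` is hypothetical, and the `direction` parameter is left out because its meaning is not described here.

```python
def accumulate_sketch(h, num_bins=2):
    """Hypothetical sketch: merge every `num_bins` successive bins into one."""
    out = []
    for start in range(0, len(h), num_bins):
        group = h[start:start + num_bins]
        # The merged bin is labelled by the average of its member bins.
        label = sum(b for b, _ in group) / len(group)
        out.append((label, sum(v for _, v in group)))
    return out


print(accumulate_sketch([(0, 1), (1, 2), (2, 3), (3, 4)]))  # [(0.5, 3), (2.5, 7)]
```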
-
Histogram.Cumulate(h, direction=1)
Calculate cumulative distribution.
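A cumulative distribution over a (bin, value) histogram can be sketched with `itertools.accumulate`. The interpretation of `direction` below (1 accumulates from the first bin upwards, anything else from the last bin downwards) is an assumption, and `cumulate_sketch` is a hypothetical name.

```python
from itertools import accumulate


def cumulate_sketch(h, direction=1):
    """Hypothetical sketch: running totals over a (bin, value) histogram."""
    bins = [b for b, _ in h]
    values = [v for _, v in h]
    if direction == 1:
        # Assumption: direction=1 accumulates from the first bin upwards.
        running = list(accumulate(values))
    else:
        # Assumption: any other direction accumulates from the last bin downwards.
        running = list(accumulate(reversed(values)))[::-1]
    return list(zip(bins, running))


print(cumulate_sketch([(0, 1), (1, 2), (2, 3)]))     # [(0, 1), (1, 3), (2, 6)]
print(cumulate_sketch([(0, 1), (1, 2), (2, 3)], 0))  # [(0, 6), (1, 5), (2, 3)]
```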
-
Histogram.AddRelativeAndCumulativeDistributions(h)
Add relative and cumulative percents to a histogram.
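The arithmetic behind relative and cumulative percents is shown in the sketch below. The output layout (bin, value, relative percent, cumulative percent) and the name `add_relative_and_cumulative_sketch` are assumptions; the sketch also assumes the histogram total is non-zero.

```python
def add_relative_and_cumulative_sketch(h):
    """Hypothetical sketch: append relative and cumulative percents to each bin."""
    total = sum(v for _, v in h)  # assumption: total is non-zero
    out, running = [], 0
    for b, v in h:
        running += v
        # Assumed layout: (bin, value, relative %, cumulative %).
        out.append((b, v, 100.0 * v / total, 100.0 * running / total))
    return out


for row in add_relative_and_cumulative_sketch([(0, 1), (1, 1), (2, 2)]):
    print(row)
```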
-
Histogram.histogram(values, mode=0, bin_function=None)
Return a list of (value, count) pairs, summarizing the input values. Sorted by increasing value, or if mode=1, by decreasing count. If bin_function is given, map it over values first.
Ex:
    vals = [100, 110, 160, 200, 160, 110, 200, 200, 220]
    histogram(vals) ==> [(100, 1), (110, 2), (160, 2), (200, 3), (220, 1)]
    histogram(vals, 1) ==> [(200, 3), (160, 2), (110, 2), (100, 1), (220, 1)]
    histogram(vals, 1, lambda v: round(v, -2)) ==> [(200.0, 6), (100.0, 3)]
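The behaviour described above can be approximated in a few lines with `collections.Counter`. This is a sketch and not the module's code: `histogram_sketch` is a hypothetical name, the order of entries with equal counts under mode=1 may differ from the module's, and under Python 3 `round(v, -2)` on integers returns integers rather than floats.

```python
from collections import Counter


def histogram_sketch(values, mode=0, bin_function=None):
    """Hypothetical sketch: (value, count) pairs, optionally binned first."""
    if bin_function is not None:
        values = map(bin_function, values)
    counts = Counter(values)
    if mode == 0:
        # Sorted by increasing value.
        return sorted(counts.items())
    # Sorted by decreasing count; tie order is an assumption.
    return sorted(counts.items(), key=lambda item: -item[1])


vals = [100, 110, 160, 200, 160, 110, 200, 200, 220]
print(histogram_sketch(vals))
print(histogram_sketch(vals, 1, lambda v: round(v, -2)))
```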
-
Histogram.cumulate(histogram)
Cumulate histogram in place.
histogram is a list of (bin, value) or (bin, (values,)).
-
Histogram.normalize(histogram)
Normalize histogram in place.
histogram is a list of (bin, value) or (bin, (values,)).
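"In place" for both cumulate and normalize means the list passed in is modified rather than a new list being returned. The sketch below illustrates this for the simple (bin, value) form only; the (bin, (values,)) form is not handled, and both function names are hypothetical.

```python
def cumulate_in_place_sketch(histogram):
    """Hypothetical sketch: replace each value by the running total, in place."""
    running = 0
    for i, (b, v) in enumerate(histogram):
        running += v
        histogram[i] = (b, running)


def normalize_in_place_sketch(histogram):
    """Hypothetical sketch: divide each value by the total, in place."""
    total = float(sum(v for _, v in histogram))  # assumption: total is non-zero
    for i, (b, v) in enumerate(histogram):
        histogram[i] = (b, v / total)


h = [(0, 1), (1, 3)]
normalize_in_place_sketch(h)
print(h)  # [(0, 0.25), (1, 0.75)]
```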
-
Histogram.fill(iterator, bins)
Fill a histogram from bins.
The values are given by an iterator so that the histogram can be built on the fly.
Description:
Count the number of times values from the iterator fall into the numerical ranges defined by bins. Range x is given by bins[x] <= value < bins[x+1] for x = 0, ..., N-2, where N is the length of the bins array. The last range is given by bins[N-1] <= value < infinity. Values less than bins[0] are not included in the histogram.
- Parameters
iterator – The iterator.
bins – 1D array. Defines the ranges of values to use during histogramming.
Returns: 1D array. Each value represents the occurrences for a given bin (range) of values.
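The binning rule described for fill can be sketched with `bisect`: each value goes into the last bin whose lower edge is less than or equal to it, the final bin is open-ended, and values below the first edge are dropped. `fill_sketch` is a hypothetical name, and the sketch assumes that bins is sorted in increasing order.

```python
import bisect


def fill_sketch(iterator, bins):
    """Hypothetical sketch: count values per bin, consuming the iterator once."""
    counts = [0] * len(bins)
    for value in iterator:
        # Index of the last bin edge that is <= value (assumes sorted bins).
        idx = bisect.bisect_right(bins, value) - 1
        if idx >= 0:  # values below bins[0] are not counted
            counts[idx] += 1
    return counts


print(fill_sketch(iter([0.5, 1, 1.5, 2, 10, -3]), [1, 2, 5]))  # [2, 1, 1]
```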
-
Histogram.fillHistograms(infile, columns, bins)
Fill several histograms from several columns in a file.
The histograms are built on the fly.
Description:
Count the number of times values from each column fall into the numerical ranges defined by bins. Range x is given by bins[x] <= value < bins[x+1] for x = 0, ..., N-2, where N is the length of the bins array. The last range is given by bins[N-1] <= value < infinity. Values less than bins[0] are not included in the histogram.
- Parameters
infile – The input file.
columns – columns to use.
bins – a list of 1D arrays. Defines the ranges of values to use during histogramming.
Returns: a list of 1D arrays. Each value represents the occurrences for a given bin (range) of values.
WARNING: missing values in columns are ignored.
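Filling several histograms in one pass over a file could be sketched as follows. The details are assumptions made for illustration: the input is taken to be tab-separated, `columns` holds 0-based indices, `bins` holds one sorted list of edges per column, and `fill_histograms_sketch` is a hypothetical name.

```python
import bisect
import io


def fill_histograms_sketch(infile, columns, bins):
    """Hypothetical sketch: one histogram per column, built in a single pass."""
    counts = [[0] * len(b) for b in bins]
    for line in infile:
        # Assumption: fields are tab-separated.
        fields = line.rstrip("\n").split("\t")
        for i, col in enumerate(columns):
            try:
                value = float(fields[col])
            except (IndexError, ValueError):
                continue  # missing or non-numeric values are ignored
            idx = bisect.bisect_right(bins[i], value) - 1
            if idx >= 0:  # values below the first edge are not counted
                counts[i][idx] += 1
    return counts


data = io.StringIO("1\t10\n2\t\n\t30\n")
print(fill_histograms_sketch(data, [0, 1], [[0, 2], [0, 20]]))  # [[1, 1], [1, 1]]
```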