Data Grouping and Values Approximation

Overview

To better represent large data sets graphically, the AnyChart Stock component can group data points and display each group as a single point. This feature is called data grouping.

Grouped data makes charts more readable and faster to render, because fewer elements are drawn.

The illustration below shows two charts depicting the same data in ungrouped and grouped modes:

With grouping turned on, you should explicitly set the maximum number of points on the X axis that can be shown on the chart for the given time interval, regardless of the current scale. When the number of points in the given interval exceeds this maximum, the data points are grouped.
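The rule above can be sketched in a few lines of Python. This is only an illustration of the threshold check as the text describes it, not the component's actual implementation; the function name needs_grouping is hypothetical.

```python
def needs_grouping(points_in_range: int, max_visible_points: int) -> bool:
    """Return True when the visible range holds more points than the limit allows."""
    return points_in_range > max_visible_points


# A year of daily data (365 points) against a limit of 150 points.
print(needs_grouping(365, 150))
# Three months of daily data (90 points) fit within the same limit.
print(needs_grouping(90, 150))
```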

Basic Settings

The basic grouping settings are the on/off status and the maximum number of visible points:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<stock xmlns="http://anychart.com/products/stock/schemas/1.9.0/schema.xsd">
  <settings>
    <data_grouping enabled="true" max_visible_points="150" />
  </settings>
</stock>

JSON:

{
  settings: {
    dataGrouping: {
      enabled: true,
      maxVisiblePoints: 150
    }
  }
}

Approximation Types

When you use grouping, you need to specify how several values are combined into one. To do that, set the approximation type for each field of a Data Provider in the Data Provider definition.

The approximation type is set in the approximation_type attribute of the <field> node. The XML below shows the syntax for a Data Provider definition that sets the approximation type for a single field:

XML:

<data_provider data_set="dataSet1" id="dpMsft">
  <fields>
    <field type="Value" column="4" approximation_type="Close" />
  </fields>
</data_provider>

JSON:

{
  ...
    {
      dataSet: "dataSet1",
      id: "dpMsft",
      fields: [
        { type: "Value", column: 4, approximationType: "Close" }
      ]
    }
  ]
}

There are seven approximation types in the AnyChart Stock component:

Type | Description
Average | Calculates the arithmetic mean of values in the interval.
Open | Returns the first value in the interval.
Low | Finds the minimal value in the interval.
High | Finds the maximal value in the interval.
Close | Returns the last value in the interval.
Sum | Calculates the sum of all values in the interval.
WeightedAverage | Described below.
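To make the table concrete, here is a minimal Python sketch of the six simple approximation types applied to the values of one grouping interval. The APPROXIMATIONS mapping and the approximate function are hypothetical names used for illustration; they are not part of the component.

```python
from statistics import mean

# Hypothetical mapping from approximation type names to reducing functions.
APPROXIMATIONS = {
    "Average": mean,                      # arithmetic mean of the values
    "Open": lambda values: values[0],     # first value in the interval
    "Low": min,                           # minimal value
    "High": max,                          # maximal value
    "Close": lambda values: values[-1],   # last value in the interval
    "Sum": sum,                           # sum of all values
}


def approximate(values, approximation_type):
    """Collapse the values of one grouping interval into a single value."""
    return APPROXIMATIONS[approximation_type](values)


interval_values = [10, 14, 8, 12]
for name in APPROXIMATIONS:
    print(name, approximate(interval_values, name))
```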

The illustration below highlights a period whose values are to be grouped. Depending on the chosen approximation type, one of the values shown by the red points is used:

The live sample below shows how different approximation types change the look of the chart. All lines use the same data from the Data Set, and a separate Data Provider is created for each line. Each provider uses one of the approximation types; as a result, the lines are drawn from different values on large intervals and from the same values on small intervals (where grouping isn't used):

Live Sample:  Comparing Types of Approximation

The following settings are correct for grouping an OHLC series:

XML:

<data_provider data_set="dataSet1" id="dpMsft">
  <fields>
    <field type="Open" column="1" approximation_type="Open" />
    <field type="High" column="2" approximation_type="High" />
    <field type="Low" column="3" approximation_type="Low" />
    <field type="Close" column="4" approximation_type="Close" />
    <field type="Volume" column="5" approximation_type="Average" />
  </fields>
</data_provider>

JSON:

{
  ...
    {
      dataSet: "dataSet1",
      id: "dpMsft",
      fields: [
        { type: "Open", column: 1, approximationType: "Open" },
        { type: "High", column: 2, approximationType: "High" },
        { type: "Low", column: 3, approximationType: "Low" },
        { type: "Close", column: 4, approximationType: "Close" },
        { type: "Volume", column: 5, approximationType: "Average" }
      ]
    }
  ]
}
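Conceptually, grouping an OHLC series means that a grouped bar opens at the first open of the interval, closes at its last close, and spans its highest high and lowest low. The Python sketch below illustrates this under that assumption; the group_ohlc function and the row layout (open, high, low, close, volume) are hypothetical, and the volume is averaged as in the sample settings.

```python
def group_ohlc(rows):
    """Collapse consecutive (open, high, low, close, volume) rows into one bar."""
    opens, highs, lows, closes, volumes = zip(*rows)
    return {
        "open": opens[0],                       # first open in the interval
        "high": max(highs),                     # highest high
        "low": min(lows),                       # lowest low
        "close": closes[-1],                    # last close in the interval
        "volume": sum(volumes) / len(volumes),  # average volume
    }


three_days = [
    (10, 12, 9, 11, 100),
    (11, 15, 10, 14, 300),
    (14, 14, 7, 8, 200),
]
print(group_ohlc(three_days))
```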

Weighted-Average Approximation Type

You can also use the Weighted-Average approximation type, which requires some additional settings. The primary objective of this type is preserving peak changes when grouping data.

A further description of this type is available at: http://en.wikipedia.org/wiki/Weighted_mean

To use Weighted Average in AnyChart Stock, specify the column with weighting coefficients in the data provider.

Here is the XML syntax for configuring Weighted Average:

XML:

<data_provider data_set="dataSet1" id="dp1">
  <fields>
    <field type="Value" column="1" approximation_type="WeightedAverage" weights_column="2" />
  </fields>
</data_provider>

JSON:

{
  ...
    {
      dataSet: "dataSet1",
      id: "dp1",
      fields: [
        { type: "Value", column: 1, approximationType: "WeightedAverage", weightsColumn: 2 }
      ]
    }
  ]
}

As you can see, once the approximation type is set to Weighted Average, we define the weights_column attribute in the <field> node, which sets the zero-based index of the column with weights.

The illustration below shows a table with three columns: Timestamp, Value and Weight:

A weight can be any positive number: the greater the weight, the more strongly that point's value influences the value that represents the grouped period.
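The weighted mean referenced above is the sum of each value multiplied by its weight, divided by the sum of the weights. A minimal Python sketch follows; the function name weighted_average and the sample numbers are illustrative only.

```python
def weighted_average(values, weights):
    """Weighted mean: sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [10, 50, 12]   # the middle point is a peak
weights = [1, 8, 1]     # the peak carries most of the weight

print(sum(values) / len(values))          # plain average
print(weighted_average(values, weights))  # weighted average
```

With these numbers the plain average is 24, while the weighted average is about 42.2, which stays much closer to the peak value of 50.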

To demonstrate the advantages of the Weighted-Average type, the live sample below compares it to the Average and Close types. The sample uses the following data table: values_and_weights.csv.

For each approximation type, we have placed axis markers colored red and green that represent the minimal and maximal values. Use the scroller or range selector buttons to see how each approximation type works:

Live Sample:  Weighted Average Approximation Type

If you study the chart at different ranges, you can see that the Average approximation line changes and loses the peak values and the Close approximation turns into an almost straight line, while the Weighted-Average approximation type preserves the peak values, since those values have appropriate weights in the weight column.

Managing Intervals

Grouping intervals are time periods that are used for grouping data when zooming.

By default, the engine uses a predefined all-purpose set of grouping intervals. If that set doesn't suit you, you can modify it.

Intervals List

To redefine the list of intervals, use the <intervals> node with several <interval> subnodes in it.

Here is the list of intervals that the component uses by default:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<stock xmlns="http://anychart.com/products/stock/schemas/1.9.0/schema.xsd">
  <settings>
    <data_grouping enabled="true" max_visible_points="720">
      <intervals>
        <interval unit="Millisecond" count="1" />
        <interval unit="Millisecond" count="50" />
        <interval unit="Millisecond" count="100" />
        <interval unit="Millisecond" count="500" />
        <interval unit="Second" count="1" />
        <interval unit="Second" count="10" />
        <interval unit="Second" count="20" />
        <interval unit="Second" count="30" />
        <interval unit="Second" count="45" />
        <interval unit="Minute" count="1" />
        <interval unit="Minute" count="15" />
        <interval unit="Minute" count="30" />
        <interval unit="Hour" count="1" />
        <interval unit="Hour" count="2" />
        <interval unit="Hour" count="6" />
        <interval unit="Day" count="1" />
        <interval unit="Week" count="1" />
        <interval unit="Month" count="1" />
        <interval unit="Year" count="1" />
      </intervals>
    </data_grouping>
  </settings>
</stock>

JSON:

{
  settings: {
    dataGrouping: {
      enabled: true,
      maxVisiblePoints: 720,
      intervals: [
        { unit: "Millisecond", count: 1 },
        { unit: "Millisecond", count: 50 },
        { unit: "Millisecond", count: 100 },
        { unit: "Millisecond", count: 500 },
        { unit: "Second", count: 1 },
        { unit: "Second", count: 10 },
        { unit: "Second", count: 20 },
        { unit: "Second", count: 30 },
        { unit: "Second", count: 45 },
        { unit: "Minute", count: 1 },
        { unit: "Minute", count: 15 },
        { unit: "Minute", count: 30 },
        { unit: "Hour", count: 1 },
        { unit: "Hour", count: 2 },
        { unit: "Hour", count: 6 },
        { unit: "Day", count: 1 },
        { unit: "Week", count: 1 },
        { unit: "Month", count: 1 },
        { unit: "Year", count: 1 }
      ]
    }
  }
}

The table below describes the attributes of the <interval> node:

Attribute | Possible values | Description
unit | Year, Semester, Quarter, Month, ThirdOfMonth, Week, Day, Hour, Minute, Second, Millisecond | Defines the time unit that, taken count times, represents one grouping domain.
count | Any positive integer. | Sets the number of units that together make up one grouping domain.
max_visible_points | Any positive integer. | Sets the maximum visible point count for this grouping interval. If not set, the default value set in max_visible_points of the <data_grouping> node is used.

Take a look at the sample XML, where only three intervals are used (week, month and three months):

XML:

<?xml version="1.0" encoding="UTF-8"?>
<stock xmlns="http://anychart.com/products/stock/schemas/1.9.0/schema.xsd">
  <settings>
    <data_grouping>
      <intervals>
        <interval unit="Week" count="1" />
        <interval unit="Month" count="1" />
        <interval unit="Month" count="3" />
      </intervals>
    </data_grouping>
  </settings>
</stock>

JSON:

{
  settings: {
    dataGrouping: {
      intervals: [
        { unit: "Week", count: 1 },
        { unit: "Month", count: 1 },
        { unit: "Month", count: 3 }
      ]
    }
  }
}

Note: If you manually define at least one interval in the list, all the default intervals are reset. Thus, if the new list doesn't contain an interval matching the original data, the points are grouped from the very start.
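The effect described in the note can be illustrated with a small Python sketch. It assumes that the engine walks the interval list from the smallest interval to the largest and picks the first one that keeps the point count within the limit; this selection rule, the pick_interval function and the approximate unit lengths are assumptions made for illustration, not documented behavior.

```python
# Approximate unit lengths in days; real calendar months vary in length.
UNIT_DAYS = {"Day": 1, "Week": 7, "Month": 30}


def pick_interval(range_days, intervals, max_visible_points):
    """Return the first (unit, count) pair that keeps the point count within the limit.

    `intervals` must be ordered from the smallest interval to the largest.
    """
    for unit, count in intervals:
        points = range_days / (UNIT_DAYS[unit] * count)
        if points <= max_visible_points:
            return unit, count
    return intervals[-1]  # nothing fits: fall back to the coarsest interval


# Custom list without a Day interval, applied to daily data.
custom = [("Week", 1), ("Month", 1), ("Month", 3)]

print(pick_interval(30, custom, 150))     # a 30-day range
print(pick_interval(1825, custom, 150))   # roughly 5 years
print(pick_interval(18250, custom, 150))  # roughly 50 years
```

Because the custom list has no Day interval, even a 30-day range of daily data ends up grouped by week in this sketch.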

To explain intervals better, let's take a look at the following sample.

There is a data set with daily reports for a long period of time, say, over 80 years, and we need to show the data in the following way:

To have the AnyChart Stock component show the data according to this scenario, you should configure the intervals as follows:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<stock xmlns="http://anychart.com/products/stock/schemas/1.9.0/schema.xsd">
  <settings>
    <data_grouping enabled="true">
      <intervals>
        <interval unit="Day" count="1" max_visible_points="1092" />
        <interval unit="Week" count="1" max_visible_points="780" />
        <interval unit="Month" count="1" />
      </intervals>
    </data_grouping>
  </settings>
</stock>

JSON:

{
  settings: {
    dataGrouping: {
      enabled: true,
      intervals: [
        { unit: "Day", count: 1, maxVisiblePoints: 1092 },
        { unit: "Week", count: 1, maxVisiblePoints: 780 },
        { unit: "Month", count: 1 }
      ]
    }
  }
}

And here is the explanation for each interval:

Now take a look at the live sample with the settings shown and described above. This sample uses the dji_daily_close.csv data set with daily close values of the Dow Jones Industrial Average from 1928 to 2009, which is over 20 thousand points:

Live Sample:  Data Grouping - Grouping Intervals
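As a rough check of the per-interval limits used in this sample, the arithmetic below converts each max_visible_points value into the time span it covers. It assumes one point per calendar day and 52 weeks per year; both are simplifications, and the variable names are illustrative only.

```python
DAYS_PER_WEEK = 7
WEEKS_PER_YEAR = 52  # simplification used for this back-of-the-envelope check

daily_limit = 1092   # max_visible_points of the Day interval
weekly_limit = 780   # max_visible_points of the Week interval

# Time span covered by each limit, in years.
daily_span_years = daily_limit / DAYS_PER_WEEK / WEEKS_PER_YEAR
weekly_span_years = weekly_limit / WEEKS_PER_YEAR

print(daily_span_years, weekly_span_years)
```

Under these assumptions, daily points cover ranges of up to about 3 years and weekly points cover ranges of up to about 15 years; wider ranges fall to the Month interval.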

Fixed Intervals

For many applications, especially those that deal with financial data, it may be necessary to show charts grouped in some predefined interval. Usually the grouping interval is set with an external control such as a ComboBox, where the end user selects options like Daily, Weekly, etc.

By default, the component groups or shows the original data according to the predefined intervals, as shown above. To change this and make the component show data grouped with a fixed precision, define a single interval of your choice in the interval list. The sample XML below demonstrates how to make AnyChart Stock always show data grouped in weeks:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<stock xmlns="http://anychart.com/products/stock/schemas/1.9.0/schema.xsd">
  <settings>
    <data_grouping enabled="true" max_visible_points="0">
      <intervals>
        <interval unit="Week" count="1" />
      </intervals>
    </data_grouping>
  </settings>
</stock>

JSON:

{
  settings: {
    dataGrouping: {
      enabled: true,
      maxVisiblePoints: 0,
      intervals: [
        { unit: "Week", count: 1 }
      ]
    }
  }
}
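To show what a fixed weekly interval means for daily data, here is a minimal Python sketch that buckets daily points by week and keeps the last value of each week (the Close approximation). It assumes weeks start on Monday; the function names are hypothetical, and the component's own week boundaries may differ.

```python
from datetime import date, timedelta
from itertools import groupby


def week_start(day: date) -> date:
    """Monday of the week containing `day` (an assumption for this sketch)."""
    return day - timedelta(days=day.weekday())


def group_weekly_close(points):
    """Group (date, value) pairs, sorted by date, into weeks and keep each week's last value."""
    return [
        (start, list(group)[-1][1])
        for start, group in groupby(points, key=lambda p: week_start(p[0]))
    ]


# Ten consecutive daily points starting on Monday, 4 May 2009.
daily = [(date(2009, 5, 4) + timedelta(days=i), 100 + i) for i in range(10)]
for start, close in group_weekly_close(daily):
    print(start, close)
```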

The live sample below uses a data source with daily data, but the chart always shows the data grouped in weeks, because of the custom interval settings:

Live Sample:  Data Grouping - Using Fixed Intervals

Please see the Custom Sample that shows how an external control can be used for switching fixed grouping intervals on the fly:

Online HTML/JavaScript Sample

Adaptive Date Time Formatting

When grouping changes the interval, the level of detail in the data changes as well. For example, when a chart is shown in daily intervals, the date and time are shown as "21 May, 2009" or "16 Apr, 2009"; but when the data is grouped into months, they are shown as "1 May, 2009" or "1 Apr, 2009", because all of a month's data is represented by the first day of the month.

A user may be confused by such a label and prefer to see a label like "May, 2009".

To prevent this confusion, use adaptive date-time formatting, which lets you set how the elements change their format depending on the grouping level.
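The idea can be sketched in Python as a lookup of a date format by grouping level. The FORMATS table and the format_label function are hypothetical, and the patterns are Python strftime codes rather than AnyChart format tokens.

```python
from datetime import date

# Hypothetical date format for each grouping level (Python strftime patterns).
FORMATS = {
    "Day": "%d %b, %Y",   # day, abbreviated month, year
    "Month": "%b, %Y",    # abbreviated month and year, no day
    "Year": "%Y",         # year only
}


def format_label(point_date: date, grouping_unit: str) -> str:
    """Format a point's date according to the current grouping level."""
    return point_date.strftime(FORMATS[grouping_unit])


print(format_label(date(2009, 5, 21), "Day"))
print(format_label(date(2009, 5, 1), "Month"))
```

With the default locale, a month-level point dated on the first of May is labeled "May, 2009" instead of showing the misleading day number.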

This mechanism is described in detail in:
