Downsampling data

Is there a way to downsample data?

For example, I need to store raw metrics for 1 month, and metrics aggregated over 30-minute intervals for 1 year.

VictoriaMetrics/VictoriaMetrics

Answer from valyala:

@jujo1, did you try using rollup_candlestick(m[d]) on the raw data? This function returns four time series for each input time series: open, high, low, and close. See https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL for details.
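For reference, the function can be exercised through the standard Prometheus-compatible /api/v1/query_range endpoint. The sketch below is illustrative only: it assumes a single-node instance on localhost:8428, a hypothetical raw metric named trade_price, and that the four returned series are distinguished by a "rollup" label (open/high/low/close), which should be verified against your VictoriaMetrics version:

```python
import requests

# Hypothetical example: 30-minute OHLC candles over one day of raw data.
resp = requests.get(
    "http://localhost:8428/api/v1/query_range",
    params={
        "query": "rollup_candlestick(trade_price[30m])",
        "start": "2020-01-01T00:00:00Z",
        "end": "2020-01-02T00:00:00Z",
        "step": "30m",
    },
)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    # One input series yields four output series; the "rollup" label
    # (assumed here) tells open/high/low/close apart.
    print(series["metric"].get("rollup"), len(series["values"]), "points")
```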

Could you answer the following questions so we can better understand your use case:

  1. What is the interval between data points in a single time series? Note that the minimum interval between data points supported by VictoriaMetrics is 1 millisecond.
  2. How many unique time series are queried by a single query on average and on maximum?
  3. What is the average and the maximum query interval? (hour, day, week, month, year, etc.)
  4. Could you provide a slow query from your case, so we can optimize it without resorting to downsampling?

VictoriaMetrics is able to scan up to 50 million data points per second per CPU core, and the performance scales almost linearly with the number of CPU cores. For instance, 20 CPU cores can yield a scan speed of up to 1 billion data points per second.

Example calculation: if new points for each time series arrive every 100ms, then a single time series covering one year contains 10*3600*24*365 = 315M data points. This means VictoriaMetrics could process up to 3 such year-long time series per second on 20 CPU cores.
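Spelling out the arithmetic as a quick back-of-the-envelope check:

```python
points_per_year = 10 * 3600 * 24 * 365  # 10 points/sec for a year: 315,360,000
scan_rate = 20 * 50_000_000             # 20 cores * 50M points/sec/core: 1e9/sec
print(scan_rate / points_per_year)      # ~3.17 year-long series scanned per second
```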

In the meantime, you can create a custom script that periodically fetches downsampled OHLC numbers from the raw data and puts them into a separate VictoriaMetrics instance for fast query processing over long time ranges.
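One possible shape for such a script, as a hedged sketch: it assumes two single-node instances (raw data on port 8428, aggregates on 8429), a hypothetical metric name, the standard /api/v1/query_range read endpoint, VictoriaMetrics' /api/v1/import/prometheus write endpoint, and the "rollup" label mentioned above; label-value escaping and scheduling are left out.

```python
import time
import requests

RAW = "http://localhost:8428"          # instance holding raw metrics (assumed)
AGGREGATED = "http://localhost:8429"   # instance holding 30-minute aggregates (assumed)

def downsample(metric, start, end, step="30m"):
    """Fetch OHLC candles for `metric` from the raw instance and
    re-import them into the aggregated instance."""
    resp = requests.get(f"{RAW}/api/v1/query_range", params={
        "query": f"rollup_candlestick({metric}[{step}])",
        "start": start, "end": end, "step": step,
    })
    resp.raise_for_status()

    # Re-encode the results as Prometheus exposition lines with
    # millisecond timestamps, e.g.:
    #   trade_price_close{pair="BTCUSD"} 7195.2 1577836800000
    lines = []
    for series in resp.json()["data"]["result"]:
        labels = dict(series["metric"])
        name = labels.pop("__name__", metric)
        ohlc = labels.pop("rollup", "close")   # open/high/low/close (assumed label)
        sel = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        sel = f"{{{sel}}}" if sel else ""
        for ts, value in series["values"]:
            lines.append(f"{name}_{ohlc}{sel} {value} {int(ts) * 1000}")

    resp = requests.post(f"{AGGREGATED}/api/v1/import/prometheus",
                         data="\n".join(lines))
    resp.raise_for_status()

if __name__ == "__main__":
    now = int(time.time())
    # Downsample the last 24 hours of a hypothetical metric; in practice
    # this would run from cron on a 30-minute or hourly schedule.
    downsample("trade_price", start=now - 24 * 3600, end=now)
```

Writing the four OHLC values as separate metric names (`_open`, `_high`, ...) is just one encoding choice; keeping the "rollup" label on a single metric name would work equally well.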
