
Improving Zabbix + PostgreSQL performance with partitioning and indexing

by admin

About a year ago, my colleagues and I were tasked with getting to grips with Zabbix, the popular network infrastructure monitoring system. After reviewing the documentation we went straight to load testing: we wanted to estimate how many parameters Zabbix could handle without noticeable performance loss. We used PostgreSQL as the only database management system.
During the tests, our attention was drawn to some architectural peculiarities of the database layout and to the behavior of the monitoring system itself, which by default do not allow it to reach its maximum performance. As a result, a number of optimization measures were developed, applied and tested, mainly concerning database configuration.
In this article I want to share the results of that work. It will be useful for Zabbix administrators, PostgreSQL DBAs, and anyone who wants to understand the popular PostgreSQL DBMS better.
A little spoiler: on a weak machine under a load of 200,000 parameters per minute, we managed to reduce the CPU iowait metric from 20% to 2%, cut the write time to the primary data tables by a factor of 250 and to the aggregate data tables by a factor of 32, shrink the indexes by a factor of 5-10, and speed up historical data retrieval in some cases by a factor of 18.

Load testing

The load testing was done with the following scheme: one Zabbix server, one active Zabbix proxy, and two agents. Each agent was configured to deliver 50 thousand integer and 50 thousand string parameters per minute (in total, the two agents generate 200 thousand parameters per minute, or 3333 parameters per second). To generate the agent parameters we used the zabbix_module_stress plugin for Zabbix. To check the maximum number of parameters an agent can generate, there is a special script from the author of the same zabbix_module_stress plugin. The Zabbix web admin has difficulty registering large templates, so we split the parameters into 20 templates of 5 thousand parameters each (2500 numeric and 2500 string parameters).
Python script that generates templates for the stress test

import argparse
import sys

"""
Generates templates for the load test.
Twenty templates of 5000 parameters each were generated
(2500 of each kind: the echo parameter, type string; the ping parameter, type number)
"""

TEMP_HEAD = """<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
    <version>2.0</version>
    <date>2015-08-17T23:15:01Z</date>
    <groups>
        <group>
            <name>Templates</name>
        </group>
    </groups>
    <templates>
        <template>
            <template>Template Zabbix Srv Stress {count} passive {char}</template>
            <name>Template Zabbix Srv Stress {count} passive {char}</name>
            <description/>
            <groups>
                <group>
                    <name>Templates</name>
                </group>
            </groups>
            <applications/>
            <items>
"""

TEMP_END = """            </items>
            <discovery_rules/>
            <macros/>
            <templates/>
            <screens/>
        </template>
    </templates>
</zabbix_export>"""

TEMP_ITEM = """<item>
    <name>{k}</name><type>0</type><snmp_community/><multiplier>0</multiplier><snmp_oid/>
    <key>{k}</key><delay>1m</delay><history>3</history><trends>365</trends><status>0</status>
    <value_type>{t}</value_type><allowed_hosts/><units/><delta>0</delta>
    <snmpv3_contextname/><snmpv3_securityname/><snmpv3_securitylevel>0</snmpv3_securitylevel>
    <snmpv3_authprotocol>0</snmpv3_authprotocol><snmpv3_authpassphrase/>
    <snmpv3_privprotocol>0</snmpv3_privprotocol><snmpv3_privpassphrase/>
    <formula>1</formula><delay_flex/><params/><ipmi_sensor/><data_type>0</data_type>
    <authtype>0</authtype><username/><password/><publickey/><privatekey/><port/><description/>
    <inventory_link>0</inventory_link><applications/><valuemap/><logtimefmt/>
</item>"""

TMP_FNAME_DEFAULT = "Template_App_Zabbix_Server_Stress_{count}_passive_{char}.xml"
chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='zabbix stress test template generator')
    parser.add_argument('--items', dest='items', type=int, default=1000,
                        help='Number of data elements (default: 1000)')
    parser.add_argument('--templates', dest='templates', type=int, default=1,
                        help=f'Number of templates [1-{len(chars)}] (default: 1)')
    args = parser.parse_args()
    items_count = args.items
    tmps_count = args.templates
    if not (tmps_count >= 1 and tmps_count <= len(chars)):
        sys.exit(f"Templates must be in range 1 - {len(chars)}")
    for i in range(tmps_count):
        fname = TMP_FNAME_DEFAULT.format(count=items_count, char=chars[i])
        with open(fname, "w") as output:
            output.write(TEMP_HEAD.format(count=items_count, char=chars[i]))
            # the ping items are numeric (value_type 3), the echo items are text (value_type 4)
            for k, t in [('stress.ping[{}-I-{:06d}]', 3), ('stress.echo[{}-S-{:06d}]', 4)]:
                for j in range(int(items_count / 2)):
                    output.write(TEMP_ITEM.format(k=k.format(chars[i], j), t=t))
            output.write(TEMP_END)

The CPU iowait metric serves as a good indicator of Zabbix performance: it reflects the fraction of time the CPU spends waiting for disk access. The higher it is, the more the disk is busy with read and write operations, which indirectly degrades the performance of the monitoring system as a whole. So this metric is a sure sign that something is wrong with monitoring. By the way, the question "how to remove the iowait trigger in Zabbix" is quite popular on the web, so it is a sore subject: there are many reasons why the iowait value may grow.
This is the picture we originally got for the CPU iowait metric over three days:
(graph: CPU iowait over three days, before optimization)
And here is the same metric, also over three days, after all the optimizations described below:
(graph: CPU iowait over three days, after optimization)
As you can see from the graphs, CPU iowait dropped from almost 20% to 2%, which indirectly sped up all queries that add and read data. Now let's analyze why the overall performance of the monitoring system degrades under the default database settings and how to fix it.

Causes of Zabbix performance degradation

Once each primary data table accumulates more than 10 million parameter values, the performance of the monitoring system drops dramatically, for the following reasons:

  • the iowait metric of the server CPU exceeds 20%, which indicates an increase in the time the CPU spends waiting for disk read and write operations
  • the indexes of the tables storing the monitoring data become significantly bloated
  • utilization of the disk holding the monitoring data rises to 100%, which means the disk is fully occupied with read and write operations
  • obsolete values are not removed from the history tables in time when the housekeeper runs on schedule

The situation is exacerbated at the beginning of each hour, when the aggregated hourly statistics are calculated on top of all this: index pages are actively read from and written to disk and obsolete data is removed from history, which leads to the same result – degraded database performance and longer query times (at the extreme, we observed queries lasting up to 5 minutes!).
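A rough way to see how far the housekeeper is lagging behind is to count rows that have already outlived their retention period; below is a minimal sketch, assuming a 30-day history retention (the 30 days is an illustrative value, not taken from the original setup):

 -- rows older than the assumed 30-day retention that are still waiting to be removed
 select count(*) from history
 where clock < extract(epoch from now() - interval '30 days')::int;
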
A quick note on how Zabbix organizes monitoring data storage. It keeps primary data and aggregated data in different tables, separated by parameter type. Each table has an itemid field (an implicit reference to the data item registered in the system), a clock timestamp of value registration in unix timestamp format (with milliseconds in a separate column), and the value itself in a separate column (the exception is the log table, which has more fields, similar to an event log):

Table name Purpose Value data type
history Primary monitoring data numeric(16,4)
history_uint Primary monitoring data numeric(20,0)
history_str Primary monitoring data varchar(255)
history_text Primary monitoring data text
history_log Primary monitoring data text and int fields
trends Aggregated monitoring data numeric(16,4)
trends_uint Aggregated monitoring data numeric(20,0)

Optimization measures

Various optimization measures were taken to improve PostgreSQL performance; the main ones are partitioning and changing indexes. But it is worth mentioning a few more important and useful activities that can speed up any database running on PostgreSQL.
An important note. At the time of writing we were using Zabbix 4.0, although 4.2 is already out and 4.4 is in preparation. Why mention this? Because starting with 4.2 Zabbix supports a powerful dedicated time-series extension, TimescaleDB, but so far in experimental mode: for all the advantages of using this extension, some queries become slower and there are performance problems that have not been solved yet (they will be addressed in 4.4) – read this article. In the next article I plan to write about the results of load testing with the TimescaleDB extension, compared with the solution described here. PostgreSQL 10 was used, but everything described is also relevant for versions 11 and 12 (we are waiting for it!).
So, taking things in order:

  • Configuration file setup with pgtune utility
  • moving the database to a separate physical disk
  • partitioning history tables with pg_pathman
  • changing history table index types to brin (clock) and btree-gin (itemid)
  • pg_stat_statements query statistics collection and analysis
  • setting up parameters for monitoring physical disks
  • hardware performance improvement
  • creating a distributed cluster (material outside the scope of this article)

Setting up a configuration file with the pgtune utility

Actually, PostgreSQL is a pretty lightweight database. Its default configuration file is set up to, as my colleague says, "work even on a coffee machine", i.e. on very modest hardware. So it is absolutely necessary to configure PostgreSQL for your server, taking into account the amount of memory, number of processors, type of DB usage, disk type (HDD or SSD) and the number of connections.
Alas, there is no single formula for configuring any DBMS, but there are certain rules and patterns that suit most configurations (finer tuning is up to the expert). To simplify the DBA's life, the pgtune utility was written; the user le0pard – author of an interesting and useful book on PostgreSQL administration – added a web version of it.
An example of running the utility in the console for 100 connections (Zabbix has a demanding web admin interface) with the "Data warehouses" application type:

 pgtune -i postgresql.conf -o new_postgresql.conf -T DW -c 100 

Configuration parameters that are changed by the pgtune utility, with a description of the purpose (values are given as an example)

# DB Version: 11
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 8 GB
# CPUs num: 1
# Connections num: 100
# Data Storage: hdd
max_connections = 100                # maximum number of simultaneous connections to the database
shared_buffers = 2GB                 # memory for various buffers (mainly table and index block cache) in shared memory
effective_cache_size = 6GB           # maximum memory size assumed available for executing a query using indexes
maintenance_work_mem = 512MB         # affects the speed of VACUUM, ANALYZE, CREATE INDEX operations
checkpoint_completion_target = 0.7   # target time for completing the checkpoint procedure
wal_buffers = 16MB                   # memory used in shared memory for transaction logging
default_statistics_target = 100      # amount of statistics collected by ANALYZE - larger values make planning slower, but better
random_page_cost = 4                 # estimated cost of index (random) access to data pages - affects the decision to use an index
effective_io_concurrency = 2         # number of asynchronous I/O operations the DBMS will try to execute in a single session
work_mem = 10485kB                   # memory used for sorting and hash table operations before spilling to temporary files on disk
min_wal_size = 1GB                   # lower bound on the amount of WAL files recycled for future use
max_wal_size = 2GB                   # upper bound on the amount of WAL files recycled for future use

Some useful postgresql configuration options

# management of parallel query workers
max_worker_processes = 8              # maximum number of background processes - at least one per database
max_parallel_workers_per_gather = 4   # maximum number of parallel workers within one query
max_parallel_workers = 8              # maximum number of worker processes the system can support for parallel operations

# logging settings (an easy way to learn query execution times without the pg_stat_statements extension)
log_min_duration_statement = 3000     # log all statements whose execution time >= the specified value in ms
log_duration = off                    # log the duration of each completed statement
log_statement = 'none'                # which SQL statements to log: none (disabled), ddl, mod or all
debug_print_plan = off                # output the query plan tree for further analysis

# squeeze the most out of the database and be ready to get burned if something goes wrong
# (for the daredevils who can do without an SSD or a distributed cluster)
#fsync = off               # disables physical flushing of changes to disk; gives a speed advantage, but can lead to irrecoverable failures
#synchronous_commit = off  # allows replying to the client before the transaction reaches the WAL - an almost safe alternative to disabling fsync
#full_page_writes = off    # disabling speeds up normal operation, but can lead to corruption or silent data corruption after a system crash

Moving database to a separate physical disk

This step is not obligatory and is rather a transitional solution on the way to a full-fledged distributed cluster, but it is useful to know about the possibility. To speed up the database, you can move it to a separate disk. We mounted a whole disk into the base directory where all the PostgreSQL databases are stored, but in general you can do it differently: create a new tablespace and move the database (or even just part of it – the tables of primary and aggregated monitoring data) into that tablespace on a separate disk.
An example of mounting: the disk has to be formatted with the ext4 file system and mounted to the server first. The database disk must be mounted with the noatime option:

 mount /dev/sdc1 /var/lib/pgsql/10/data/base -o noatime 

For permanent mounting, add the following line to /etc/fstab:

# where UUID is the disk ID, you can look it up with the blkid utility
UUID=121efe29-70bf-410b-bc71-90704568ce3b /var/lib/pgsql/10/data/base ext4 defaults,noatime 0 0

Partitioning history tables with pg_pathman

One of the problems we encountered during Zabbix load testing is that PostgreSQL does not manage to remove obsolete data from the database in time. Partitioning lets you split a table into child parts, thereby reducing the size of the indexes and of the child parts of the super-table, which has a positive effect on the performance of the database as a whole.
Partitioning solves two problems at once:
1. speeding up the deletion of obsolete data by dropping entire tables
2. splitting the indexes across each child table
There are four mechanisms for partitioning in PostgreSQL:
1. standard constraint_exclusion
2. the pg_partman extension (not to be confused with pg_pathman)
3. the pg_pathman extension
4. creating and maintaining partitions manually
In our opinion, the most convenient, reliable, and optimized solution for partitioning is the pg_pathman extension. With this partitioning method, the query planner flexibly determines which partitions to search for data in. Rumor has it that PostgreSQL 12 will finally have excellent partitioning out of the box.
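For reference, native declarative partitioning already exists since PostgreSQL 10 and keeps improving in 11 and 12; a minimal sketch of what it looks like is below (the table and partition names are illustrative, and the column set merely mimics the Zabbix history table):

 -- a range-partitioned table keyed by the clock column
 create table history_decl (
     itemid bigint                      not null,
     clock  integer       default 0     not null,
     value  numeric(16,4) default 0.0000 not null,
     ns     integer       default 0     not null
 ) partition by range (clock);
 -- one day-sized partition: 1552424400 .. 1552510800 (unix timestamps)
 create table history_decl_2019_03_13 partition of history_decl
     for values from (1552424400) to (1552510800);
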
So, with pg_pathman, we started writing the monitoring data for each day into a separate table inherited from the super-table, and deleting obsolete parameter values turned into dropping all the obsolete tables at once, which is much cheaper for the DBMS. Deletion is done by calling a user-defined database function as a monitoring parameter of the Zabbix server at 2 a.m., passing it the allowed retention period for the statistics.
Installing and configuring partitioning for PostgreSQL 10. Install and configure the pg_pathman extension from the standard OS repository (see the same github repository for instructions on building the latest version of the extension from source):

yum install pg_pathman10
nano /var/pgsqldb/postgresql.conf
shared_preload_libraries = 'pg_pathman'  # important - list pg_pathman last here

Restart the DBMS, create the extension in the database and perform the partitioning setup (1 day for the primary monitoring data and 3 days for the aggregated monitoring data – it could have been 1 day each):

systemctl restart postgresql-10.service
psql -d zabbix -U postgres
CREATE EXTENSION pg_pathman;

# configure one-day partitions for the primary monitoring data tables
# 1552424400 - start as a unix timestamp, 86400 - seconds in a day
select create_range_partitions('history', 'clock', 1552424400, 86400);
select create_range_partitions('history_uint', 'clock', 1552424400, 86400);
select create_range_partitions('history_text', 'clock', 1552424400, 86400);
select create_range_partitions('history_str', 'clock', 1552424400, 86400);
select create_range_partitions('history_log', 'clock', 1552424400, 86400);

# configure three-day partitions for the aggregated monitoring data tables
# 1545771600 - start as a unix timestamp, 259200 - seconds in three days
select create_range_partitions('trends', 'clock', 1545771600, 259200);
select create_range_partitions('trends_uint', 'clock', 1545771600, 259200);

If any of the tables has no data yet, you need to pass one more argument, p_count => 0, when calling the create_range_partitions function.
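For example, a sketch for an empty table, with the same start timestamp and one-day step as above ('some_empty_table' is a placeholder for whichever table has no rows yet):

 -- p_count must be given explicitly because pg_pathman cannot infer the partition count from an empty table
 select create_range_partitions('some_empty_table', 'clock', 1552424400, 86400, p_count := 0);
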
Useful queries for monitoring and managing partitioning:

# a general list of partitioned tables, the main configuration store:
select * from pathman_config;
# a view with all existing partitions, as well as their parents and range boundaries:
select * from pathman_partition_list;
# additional parameters that override the standard pg_pathman behavior:
select * from pathman_config_params;
# copy the contents back to the parent table and drop the partitions:
select drop_partitions('table_name'::regclass, false);

A useful script for viewing statistics on the number and size of partitions:

/* Statistics on the number and size of partitions */
SELECT
    nspname AS schemaname, relname, relkind, cast(reltuples as int),
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM
    pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
    nspname NOT IN ('pg_catalog', 'information_schema')
    and (relname like 'history%' or relname like 'trends%') and relkind = 'r'
    -- and reltuples > 0
    -- and pg_relation_size(C.oid) >= 0
ORDER BY schemaname, relname

Automatic deletion of obsolete partitions (beware: a big SQL function). To set up automatic partition deletion, you must create the following function in the database:

CREATE OR REPLACE FUNCTION public.delete_old_partitions(history_days integer, trends_days integer, str_days integer)
RETURNS text
LANGUAGE plpgsql
AS $function$
/*
The function deletes all partitions older than the specified number of days:
history_days - for the history_x, history_uint_x partitions
trends_days  - for the trends_x, trends_uint_x partitions
str_days     - for the history_str_x, history_text_x, history_log_x partitions
*/
declare clock_today_start int = 0;
declare clock_delete_less_history int = 0;
declare clock_delete_less_trends int = 0;
declare clock_delete_less_strings int = 0;
declare clock_delete_less int = 0;
declare iterator int = 0;
declare result_str text = '';
declare buf_table_size text;
declare buf_table_len text;
declare partition_name text;
declare clock_max text;
declare err_detail text;
declare t_start timestamp = clock_timestamp();
declare t_end timestamp;
begin
    if $1 <= 0 then return 'ups, something wrong: history_days argument must be positive integer value'; end if;
    if $2 <= 0 then return 'ups, something wrong: trends_days argument must be positive integer value'; end if;
    if $3 <= 0 then return 'ups, something wrong: str_days argument must be positive integer value'; end if;
    clock_today_start = extract(epoch from date_trunc('day', now()))::int;
    clock_delete_less_history = extract(epoch from date_trunc('day', now()) - ($1::text || ' days')::interval)::int;
    clock_delete_less_trends = extract(epoch from date_trunc('day', now()) - ($2::text || ' days')::interval)::int;
    clock_delete_less_strings = extract(epoch from date_trunc('day', now()) - ($3::text || ' days')::interval)::int;
    clock_delete_less = least(clock_delete_less_history, clock_delete_less_trends, clock_delete_less_strings);
    --raise notice 'clock_today_start % (%)', to_timestamp(clock_today_start), clock_today_start;
    --raise notice 'clock_delete_less_history % (%) % days', to_timestamp(clock_delete_less_history), clock_delete_less_history, $1;
    --raise notice 'clock_delete_less_trends % (%) % days', to_timestamp(clock_delete_less_trends), clock_delete_less_trends, $2;
    --raise notice 'clock_delete_less_strings % (%) % days', to_timestamp(clock_delete_less_strings), clock_delete_less_strings, $3;
    for partition_name, clock_max in
        select partition, range_max from pathman_partition_list
        where range_max::int <= greatest(clock_delete_less_history, clock_delete_less_trends, clock_delete_less_strings)
          and (partition::text like 'history%' or partition::text like 'trends%')
        order by partition asc
    loop
        if (partition_name ~ 'history_uint_d' and clock_max::int <= clock_delete_less_history)
            or (partition_name ~ 'history_d' and clock_max::int <= clock_delete_less_history)
            or (partition_name ~ 'trends_d' and clock_max::int <= clock_delete_less_trends)
            or (partition_name ~ 'history_log_d' and clock_max::int <= clock_delete_less_strings)
            or (partition_name ~ 'history_str_d' and clock_max::int <= clock_delete_less_strings)
            or (partition_name ~ 'history_text_d' and clock_max::int <= clock_delete_less_strings)
        then
            iterator = iterator + 1;
            raise notice '%', format('!!! delete %s %s', partition_name, clock_max);
            select max(reltuples::int), pg_size_pretty(sum(pg_relation_size(pg_class.oid))) as "size"
              into strict buf_table_len, buf_table_size
              from pg_class where relname like partition_name || '%';
            if result_str != '' then result_str = result_str || ', '; end if;
            result_str = result_str || format('%s (dt < %s, len %s, %s)', partition_name,
                to_char(to_timestamp(clock_max::int), 'YYYY-MM-DD'), buf_table_len, buf_table_size);
            execute format('drop table if exists %s', partition_name);
        end if;
    end loop;
    if iterator = 0 then
        result_str = format('there is no partitions to delete older, then %s date', to_char(to_timestamp(clock_delete_less), 'YYYY-MM-DD'));
    else
        result_str = format('deleted %s partitions in %s seconds: ', iterator,
            trunc(extract(seconds from (clock_timestamp() - t_start))::numeric, 3)) || result_str;
    end if;
    --raise notice '%', result_str;
    return result_str;
exception when others then
    get stacked diagnostics err_detail = PG_EXCEPTION_CONTEXT;
    return format('ups, something wrong: %s [err code %s], %s', sqlerrm, sqlstate, err_detail);
end;
$function$;

To call the partition auto-clean function automatically, you need to create one data item of the "DB Monitor" type for the Zabbix server host with the following settings:

- type: database monitor
- name: delete_old_history_partitions
- key: db.odbc.select[delete_old_history_partitions, zabbix]
- sql expression: select delete_old_partitions(3, 30, 30);
  # the storage times in days are given here as the parameters of the delete_old_partitions call:
  # for numeric values, aggregated numeric values, and string values respectively
- data type: Text
- refresh interval: 0
- custom interval: scheduled at h2
- history storage period: 90 days
- data element group: Database

As a result, we get partition cleanup statistics like this:

 2019-09-16 02:00:00, deleted 3 partitions in 0.024 seconds: trends_78 (dt < 2019-08-17, len 1, 48 kB), history_193 (dt < 2019-09-13, len 85343, 9448 kB), history_uint_186 (dt < 2019-09-13, len 27969, 3480 kB) 

Important! After setting up automatic partition deletion via the data item and the user-defined function, you must turn off history and trend clearing in the Zabbix housekeeper task scheduler: in the Zabbix menu choose "Administration" -> "General" -> select "History clearing" from the list in the right corner -> disable all checkboxes in the "History" and "Trends" sections.

Changing history table index types to brin (clock) and btree-gin (itemid)

Special thanks to the user erogov for an excellent series of review articles on PostgreSQL indexes, and to the entire PostgresPRO team. Prompted by those articles, we played around with different index types on the monitoring data tables and worked out which index types on which fields give the best performance gains.
We noticed that a composite btree(itemid, clock) index is created by default on all monitoring data tables – it is fast for searching, especially for monotonically ordered values, but it becomes very "puffy" on disk when there is a lot of data – more than 10 million rows.
On the hourly aggregated statistics tables, a unique index is even created by default, although these tables serve as a data warehouse, uniqueness is already guaranteed at the application server level, and the unique index only slows down data insertion.
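Before changing anything, it is useful to see how much space the default indexes actually occupy; a quick sketch (it relies on the default Zabbix index names looking like history_1, trends_uint_1 and so on):

 -- total size of every index belonging to the history and trends tables
 select relname, pg_size_pretty(pg_relation_size(oid)) as index_size
 from pg_class
 where relkind = 'i'
   and (relname like 'history%' or relname like 'trends%')
 order by pg_relation_size(oid) desc;
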
During the testing of different indexes, the most successful combination was found: a brin index on the clock field and a btree-gin index on the itemid field for all monitoring data tables.
The brin index is ideal for monotonically increasing data, such as the timestamp of an event, i.e. for time series. And the btree-gin index is essentially a gin index over standard data types, which is generally much faster than the classic btree index because a gin index is not rebuilt as new values are added, only appended to. btree-gin is available as a PostgreSQL extension.
A comparison of sampling speed for this indexing strategy and for the default indexes in the Zabbix database is shown below. During the load tests we accumulated three days of data in three partitions:

Partition name Rows (millions) Size (MB)
history_uint_1 81.3 4119
history_uint_2 74.9 4426
history_uint_3 100.7 5387

Three kinds of queries were run to evaluate the results:

  1. data for one particular itemid for the last month, effectively the last three days (1660 records in total)
     explain analyze select * from history_uint
     where itemid = 313300
       and clock >= extract(epoch from '2019-03-09 00:00:00'::timestamp)::int
       and clock <= extract(epoch from '2019-04-09 12:00:00'::timestamp)::int;
  2. data for one particular parameter for 12 hours of one day (649 records in total)
     explain analyze select * from history_text
     where itemid = 310650
       and clock >= extract(epoch from '2019-04-09 00:00:00'::timestamp)::int
       and clock <= extract(epoch from '2019-04-09 12:00:00'::timestamp)::int;
  3. data for one particular parameter for one hour (61 records in total):
     explain analyze select count(*) from history_text
     where itemid = 336540
       and clock >= extract(epoch from '2019-04-08 11:00:00'::timestamp)::int
       and clock <= extract(epoch from '2019-04-08 12:00:00'::timestamp)::int;

The test results are summarized in the table below:

index type size in MB* query 1**, ms query 2**, ms query 3**, ms
btree(itemid, clock) 14741 7154.3 2205.3 1860.4
brin(clock) + btree-gin(itemid) 0.42 and 1329 2958.2 1820.4 102.1

* MB size is the total of the three partitions
** query type 1 – data for 3 days, query type 2 – data for 12 hours, query type 3 – data for one hour
The comparison table shows that for large data tables with more than 100 million records, replacing the standard composite btree index with the two indexes brin and btree-gin reduces index size and speeds up query execution.
The efficiency of indexing and partitioning is shown below using the example of queries that add new records to the history_uint and trends_uint tables (inserts average 2000 values per query).

Table Average query time before improvements, ms Average query time after improvements, ms
history_uint 2201.48 8.72
trends_uint 1997.27 62.16

Summarizing the test results for different index configurations on the Zabbix monitoring data tables: this change of the default indexes has a positive effect on overall system performance, felt most strongly once 10 million records or more have accumulated. Do not forget the indirect effect of the default btree index "bloating" either: frequent rebuilds of a multi-gigabyte index put a heavy load on the disk (the utilization metric), which in turn increases disk operation times and the CPU's wait for disk access (the iowait metric).
However, for the btree-gin index to handle the bigint (int8) data type of the itemid column, you need to register a bigint operator family for the btree-gin index.
Registration of the bigint type operator family for the btree-gin index

/*
The script allows the gin index to be fully used on bigint and integer data types
without explicit type conversion.
The problem is that the gin index lacks the operator families for int2, int4, int8,
so when an index is built on a field of bigint type, it is not used in a query
if the specified number is less than bigint (<= 2147483647).
To create an index with support for the new integer_ops operator family, execute:
create index on tablename using gin(columnname int8_family_ops) with (fastupdate = false);
*/

-- don't forget to install the btree_gin extension
CREATE EXTENSION btree_gin;

CREATE OPERATOR FAMILY integer_ops using gin;

CREATE OPERATOR CLASS int4_family_ops
FOR TYPE int4 USING gin FAMILY integer_ops
AS
    OPERATOR 1 <,
    OPERATOR 2 <=,
    OPERATOR 3 =,
    OPERATOR 4 >=,
    OPERATOR 5 >,
    FUNCTION 1 btint4cmp(int4, int4),
    FUNCTION 2 gin_extract_value_int4(int4, internal),
    FUNCTION 3 gin_extract_query_int4(int4, internal, int2, internal, internal),
    FUNCTION 4 gin_btree_consistent(internal, int2, anyelement, int4, internal, internal),
    FUNCTION 5 gin_compare_prefix_int4(int4, int4, int2, internal),
STORAGE int4;

CREATE OPERATOR CLASS int8_family_ops
FOR TYPE int8 USING gin FAMILY integer_ops
AS
    OPERATOR 1 <,
    OPERATOR 2 <=,
    OPERATOR 3 =,
    OPERATOR 4 >=,
    OPERATOR 5 >,
    FUNCTION 1 btint8cmp(int8, int8),
    FUNCTION 2 gin_extract_value_int8(int8, internal),
    FUNCTION 3 gin_extract_query_int8(int8, internal, int2, internal, internal),
    FUNCTION 4 gin_btree_consistent(internal, int2, anyelement, int4, internal, internal),
    FUNCTION 5 gin_compare_prefix_int8(int8, int8, int2, internal),
STORAGE int8;

ALTER OPERATOR FAMILY integer_ops USING gin add
    OPERATOR 1 <(int4, int8),
    OPERATOR 2 <=(int4, int8),
    OPERATOR 3 =(int4, int8),
    OPERATOR 4 >=(int4, int8),
    OPERATOR 5 >(int4, int8);

ALTER OPERATOR FAMILY integer_ops USING gin add
    OPERATOR 1 <(int8, int4),
    OPERATOR 2 <=(int8, int4),
    OPERATOR 3 =(int8, int4),
    OPERATOR 4 >=(int8, int4),
    OPERATOR 5 >(int8, int4);

The following script reassigns all indexes in the Zabbix PostgreSQL database from the default configuration to the optimal configuration described above.

/* Index reassignment for primary and aggregate monitoring data tables */

-- drop the old indexes
drop index history_1;
drop index history_uint_1;
drop index history_str_1;
drop index history_text_1;
drop index history_log_1;

-- on the aggregate data tables, the indexes were created as unique PKs
-- (this is not needed at all, we only waste time checking uniqueness on insertion)
alter table trends drop constraint trends_pk;
alter table trends_uint drop constraint trends_uint_pk;

-- create new btree-gin indexes on the itemid fields of the monitoring data tables
-- btree-gin support for bigint is described in the comment to the post at the link
-- https://habr.com/ru/company/postgrespro/blog/340978/#comment_10545932
-- first: create extension btree_gin;
create index on history using gin(itemid int8_family_ops) with (fastupdate = false);
create index on history_uint using gin(itemid int8_family_ops) with (fastupdate = false);
create index on history_str using gin(itemid int8_family_ops) with (fastupdate = false);
create index on history_text using gin(itemid int8_family_ops) with (fastupdate = false);
create index on history_log using gin(itemid int8_family_ops) with (fastupdate = false);
create index on trends using gin(itemid int8_family_ops) with (fastupdate = false);
create index on trends_uint using gin(itemid int8_family_ops) with (fastupdate = false);

-- create new brin indexes on the clock fields of the monitoring data tables
-- the zone size for the brin index is left at the standard 128 pages, but on big data
-- larger zone sizes can be effective for the primary monitoring data tables;
-- the selection method is described in the post https://habr.com/ru/company/postgrespro/blog/346460/
create index on history using brin(clock) with (pages_per_range = 128);
create index on history_uint using brin(clock) with (pages_per_range = 128);
create index on history_str using brin(clock) with (pages_per_range = 128);
create index on history_text using brin(clock) with (pages_per_range = 128);
create index on history_log using brin(clock) with (pages_per_range = 128);
create index on trends using brin(clock) with (pages_per_range = 128);
create index on trends_uint using brin(clock) with (pages_per_range = 128);

For our data volume, at an intensity of 100 thousand parameters per minute (100 thousand into history and 100 thousand into history_uint), we observed that on the primary monitoring data tables a brin index with a zone size of 512 pages works twice as fast as with the standard 128 pages, but this is individual and depends on table size and server configuration. In any case, the brin index takes very little space, and its speed can be slightly improved by fine-tuning the zone size, provided the intensity of the data flow does not change much.
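If you want to experiment with the zone size, a sketch could look like this (the index names here are illustrative; check the actual auto-generated name with \di before dropping anything):

 -- replace the 128-page brin index on history_uint with a 512-page one
 drop index if exists history_uint_clock_idx;
 create index history_uint_clock_brin512 on history_uint using brin(clock) with (pages_per_range = 512);
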
Finally, there is a limitation related to the architecture of Zabbix itself: the "Latest data" tab collects the two most recent values for each parameter, taking filtering into account. The values are queried in the database separately for each parameter, so the more parameters are selected, the longer the query takes. The fastest way to find the latest data is with a btree(itemid, clock desc) index on the history tables, sorted by time, but that index of course "puffs up" on disk and in general indirectly slows down the database, which brings back the problem described above.
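For completeness, a sketch of that "fast latest data" index for a single history table (shown only to illustrate the trade-off, not as a recommendation; the index name is illustrative):

 create index history_itemid_clock_desc_idx on history using btree (itemid, clock desc);
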
So there are three ways out:

  1. perform the index manipulations described above and try not to select values for more than 100 parameters at once in the "Latest data" tab (i.e. accept that the data in the "Latest data" tab will appear a bit more slowly)
  2. rework the mechanism inside the Zabbix server so that the last two values for all parameters are written by a trigger to a separate table holding the last two values, from which they are returned to the queries that draw the "Latest data" tab
  3. leave the indexes as they are by default and limit yourself to partitioning only, in order to get fairly large selections on the "Latest data" tab for many parameters at once (however, we noticed that the Zabbix web server still limits the number of parameter values displayed simultaneously on the "Latest data" tab – when we tried to display 5000 values, the database computed the result, but the server could not prepare the web page and display that amount of data).

Collection and analysis of pg_stat_statements query statistics

Pg_stat_statements is an extension for gathering server-wide query execution statistics. The advantage of this extension is that it does not need to collect and parse PostgreSQL logs.
Using the pg_stat_statements extension. Install the extension via psql:

 CREATE EXTENSION pg_stat_statements; 

Settings for the extension in the postgresql.conf configuration file:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000   # maximum number of sql statements kept by the extension (entries with the fewest calls are evicted)
pg_stat_statements.track = all   # all - all statements (including inside functions), top - only client/server statements, none - disable collection
pg_stat_statements.save = true   # save statistics between DBMS restarts

Reset the statistics:

 SELECT pg_stat_statements_reset(); 

A useful query for viewing the average execution time of operations on tables:

select substring(query from '[^(]*') as query_sub,
       sum(calls) as calls,
       avg(mean_time) as mean_time
from pg_stat_statements
where query ~ 'insert into' or query ~ 'update trends'
group by substring(query from '[^(]*')
order by calls desc
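
As a companion, here is a sketch of the ten statements with the highest cumulative execution time (total_time and mean_time are the column names in PostgreSQL 10-12; they are named differently in later versions):

 select substring(query, 1, 60) as short_query,
        calls,
        round(total_time::numeric, 1) as total_ms,
        round(mean_time::numeric, 2) as mean_ms
 from pg_stat_statements
 order by total_time desc
 limit 10;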

Configuring physical disk monitoring settings

Out of the box, Zabbix provides only the vfs.dev.read and vfs.dev.write parameters for monitoring hard disks. These parameters give no information about disk utilization. Useful criteria for troubleshooting hard disk performance problems are the utilization factor, the request time, and the disk request queue length.
As a rule, high disk utilization correlates with high CPU iowait and with longer SQL query execution times, which is exactly what we found when load-testing a Zabbix server with the standard configuration, without partitioning and without the alternative indexes. You can add these hard disk monitoring parameters using steps spotted in an article by comrade lesovsky and improved upon: iostat parameters are now collected separately for each drive into an intermediate JSON parameter, from which they are post-processed into the final monitoring parameters.
While the pull request is pending, you can try deploying disk parameter monitoring using the detailed instructions in my fork.
After all the described steps, you can add a custom graph with CPU iowait and utilization parameters for the system disk and the database disk (if they are different) to the main Zabbix server monitoring panel. The result may look like this (sda is the main disk, sdc is the database disk):
(graph: CPU iowait and disk utilization for the system disk and the database disk)

Hardware performance improvement

After configuring the DBMS, indexing and partitioning, you can proceed to vertical scaling – improving the hardware characteristics of the server: add RAM, change the drives to solid state and add processor cores. This is a guaranteed performance gain, but it is better to do it only after software optimization.

Creating a distributed cluster

After moderate vertical scaling, the next step is horizontal scaling – building a distributed cluster: either sharding or master-slave replication. But that is a separate topic for a separate article (how to build a cluster out of sticks and duct tape), as is comparing the Zabbix database optimization described here, based on pg_pathman and indexing, with the TimescaleDB extension.
In the meantime, we hope that the material in this article was useful and informative!
