info:'Disk Utilization measures the amount of time the disk was busy with something. This is not related to its performance. 100% means that the system always had an outstanding operation on the disk. Keep in mind that depending on the underlying technology of the disk, 100% here may or may not be an indication of congestion.'
},
'disk.busy':{
colors:'#FF5588',
info:'Disk Busy Time measures the amount of time the disk was busy with something.'
},
'disk.backlog':{
colors:'#0099CC',
info:'Backlog is an indication of the duration of pending disk operations. On every I/O event the system is multiplying the time spent doing I/O since the last update of this field with the number of pending operations. While not accurate, this metric can provide an indication of the expected completion time of the operations in progress.'
},
'disk.io':{
heads:[
netdataDashboard.gaugeChart('Reads','12%','reads'),
netdataDashboard.gaugeChart('Writes','12%','writes')
],
info:'The amount of data transferred to and from the disk.'
},
'disk_ext.io':{
info:'<p>The number (after merges) of completed discard/flush requests.</p>'+
'<p><b>Discard</b> commands inform disks which blocks of data are no longer considered to be in use and therefore can be erased internally. '+
'They are useful for solid-state drives (SSDs) and thinly-provisioned storage. '+
'Discarding/trimming enables the SSD to handle garbage collection more efficiently, '+
'which would otherwise slow down future write operations to the involved blocks.</p>'+
'<p><b>Flush</b> operations transfer all modified in-core data (i.e., modified buffer cache pages) to the disk device '+
'so that all changed information can be retrieved even if the system crashes or is rebooted. '+
'Flush requests are executed by disks. Flush requests are not tracked for partitions. '+
'Before being merged, flush operations are counted as writes.</p>'
},
'disk.qops':{
info:'I/O operations currently in progress. This metric is a snapshot - it is not an average over the last interval.'
},
'disk.iotime':{
height:0.5,
info:'The sum of the duration of all completed I/O operations. This number can exceed the interval if the disk is able to execute I/O operations in parallel.'
},
'disk_ext.iotime':{
height:0.5,
info:'The sum of the duration of all completed discard/flush operations. This number can exceed the interval if the disk is able to execute discard/flush operations in parallel.'
},
'disk.mops':{
height:0.5,
info:'The number of merged disk operations. The system is able to merge adjacent I/O operations. For example, two 4KB reads can become one 8KB read before being given to the disk.'
},
'disk_ext.mops':{
height:0.5,
info:'The number of merged discard disk operations. Discard operations which are adjacent to each other may be merged for efficiency.'
},
'disk.svctm':{
height:0.5,
info:'The average service time for completed I/O operations. This metric is calculated using the total busy time of the disk and the number of completed operations. If the disk is able to execute multiple operations in parallel, the reported average service time will be misleading.'
},
'disk.latency_io':{
height:0.5,
info:'Disk I/O latency is the time it takes for an I/O request to be completed. Latency is the single most important metric to focus on when it comes to storage performance, under most circumstances. For hard drives, an average latency somewhere between 10 to 20 ms can be considered acceptable. For SSD (Solid State Drives), depending on the workload it should never reach higher than 1-3 ms. In most cases, workloads will experience less than 1ms latency numbers. The dimensions refer to time intervals. This chart is based on the <a href="https://github.com/cloudflare/ebpf_exporter/blob/master/examples/bio-tracepoints.yaml" target="_blank">bio_tracepoints</a> tool of the ebpf_exporter.'
},
'disk_ext.await':{
info:'The average time for discard/flush requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.'
},
'disk.inodes':{
info:'Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold. It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available.'
},
'disk.bcache_hit_ratio':{
info:'<p><b>Bcache (block cache)</b> is a cache in the block layer of Linux kernel, '+
'which is used for accessing secondary storage devices. '+
'It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), '+
'to act as a cache for one or more slower storage devices, such as hard disk drives (HDDs).</p>'+
'<p>Percentage of data requests that were fulfilled right from the block cache. '+
'Hits and misses are counted per individual IO as bcache sees them. '+
'A partial hit is counted as a miss.</p>'
},
'disk.bcache_rates':{
info:'Throttling rates. '+
'To avoid congestion bcache tracks latency to the cache device, and gradually throttles traffic if the latency exceeds a threshold. '+
'If the writeback percentage is nonzero, bcache tries to keep around this percentage of the cache dirty by '+
'throttling background writeback and using a PD controller to smoothly adjust the rate.'
},
'disk.bcache_size':{
info:'The amount of dirty data for this backing device in the cache.'
},
'disk.bcache_usage':{
info:'The percentage of cache device which does not contain dirty data, and could potentially be used for writeback.'
},
'disk.bcache_cache_read_races':{
info:'<b>Read races</b> happen when a bucket was reused and invalidated while data was being read from the cache. '+
'When this occurs the data is reread from the backing device. '+
'<b>IO errors</b> are decayed by the half life. '+
'If the decaying count reaches the limit, dirty data is written out and the cache is disabled.'
},
'disk.bcache':{
info:'Hits and misses are counted per individual IO as bcache sees them; a partial hit is counted as a miss. '+
'Collisions happen when data was going to be inserted into the cache from a cache miss, '+
'but raced with a write and data was already present. '+
'Cache miss reads are rounded up to the readahead size, but without overlapping existing cache entries.'
},
'disk.bcache_bypass':{
info:'Hits and misses for IO that is intended to skip the cache.'
},
'disk.bcache_cache_alloc':{
info:'<p>Working set size.</p>'+
'<p><b>Unused</b> is the percentage of the cache that does not contain any data. '+
'<b>Dirty</b> is the data that is modified in the cache but not yet written to the permanent storage. '+
'<b>Clean</b> data matches the data stored on the permanent storage. '+
'<b>Metadata</b> is bcache\'s metadata overhead.</p>'
},
'mysql.net':{
info:'The amount of data sent to mysql clients (<strong>out</strong>) and received from mysql clients (<strong>in</strong>).'
},
'mysql.queries':{
info:'The number of statements executed by the server.<ul>'+
'<li><strong>queries</strong> counts the statements executed within stored SQL programs.</li>'+
'<li><strong>questions</strong> counts the statements sent to the mysql server by mysql clients.</li>'+
'<li><strong>slow queries</strong> counts the number of statements that took more than <a href="http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_long_query_time" target="_blank">long_query_time</a> seconds to be executed.'+
' For more information about slow queries check the mysql <a href="http://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html" target="_blank">slow query log</a>.</li>'+
'</ul>'
},
'mysql.handlers':{
info:'Usage of the internal handlers of mysql. This chart provides very good insights into what the mysql server is actually doing.'+
' (If the chart is not showing all these dimensions, it is because they are zero. Set <strong>Which dimensions to show?</strong> to <strong>All</strong> from the dashboard settings to render even the zero values.)<ul>'+
'<li><strong>commit</strong>, the number of internal <a href="http://dev.mysql.com/doc/refman/5.7/en/commit.html" target="_blank">COMMIT</a> statements.</li>'+
'<li><strong>delete</strong>, the number of times that rows have been deleted from tables.</li>'+
'<li><strong>prepare</strong>, a counter for the prepare phase of two-phase commit operations.</li>'+
'<li><strong>read first</strong>, the number of times the first entry in an index was read. A high value suggests that the server is doing a lot of full index scans; e.g. <strong>SELECT col1 FROM foo</strong>, with col1 indexed.</li>'+
'<li><strong>read key</strong>, the number of requests to read a row based on a key. If this value is high, it is a good indication that your tables are properly indexed for your queries.</li>'+
'<li><strong>read next</strong>, the number of requests to read the next row in key order. This value is incremented if you are querying an index column with a range constraint or if you are doing an index scan.</li>'+
'<li><strong>read prev</strong>, the number of requests to read the previous row in key order. This read method is mainly used to optimize <strong>ORDER BY ... DESC</strong>.</li>'+
'<li><strong>read rnd</strong>, the number of requests to read a row based on a fixed position. A high value indicates you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly.</li>'+
'<li><strong>read rnd next</strong>, the number of requests to read the next row in the data file. This value is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have.</li>'+
'<li><strong>rollback</strong>, the number of requests for a storage engine to perform a rollback operation.</li>'+
'<li><strong>savepoint</strong>, the number of requests for a storage engine to place a savepoint.</li>'+
'<li><strong>savepoint rollback</strong>, the number of requests for a storage engine to roll back to a savepoint.</li>'+
'<li><strong>update</strong>, the number of requests to update a row in a table.</li>'+
'<li><strong>write</strong>, the number of requests to insert a row in a table.</li>'+
'</ul>'
},
'mysql.table_locks':{
info:'MySQL table locks counters: <ul>'+
'<li><strong>immediate</strong>, the number of times that a request for a table lock could be granted immediately.</li>'+
'<li><strong>waited</strong>, the number of times that a request for a table lock could not be granted immediately and a wait was needed. If this is high and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.</li>'+
'</ul>'
},
'mysql.innodb_deadlocks':{
info:'A deadlock happens when two or more transactions mutually hold and request locks, creating a cycle of dependencies. See <a href="https://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks-handling.html" target="_blank">how to minimize and handle deadlocks</a> for more information.'
},
'mysql.galera_cluster_status':{
info:
'<code>-1</code>: unknown, '+
'<code>0</code>: primary (primary group configuration, quorum present), '+
'<code>1</code>: non-primary (non-primary group configuration, quorum lost), '+
'<code>2</code>: disconnected (not connected to the group, retrying).'
},
'mysql.galera_cluster_state':{
info:
'<code>0</code>: Undefined, '+
'<code>1</code>: Joining, '+
'<code>2</code>: Donor/Desynced, '+
'<code>3</code>: Joined, '+
'<code>4</code>: Synced, '+
'<code>5</code>: Inconsistent.'
},
'mysql.galera_cluster_weight':{
info:'The value is calculated as the sum of <code>pc.weight</code> of the nodes in the current Primary Component.'
},
'mysql.galera_connected':{
info:'<code>0</code> means that the node has not yet connected to any of the cluster components. '+
'This may be due to misconfiguration.'
},
'mysql.open_transactions':{
info:'The number of locally running transactions which have been registered inside the wsrep provider, '+
'that is, transactions which have performed operations that caused write set population. '+
'Read-only transactions are not counted.'
},
'postgres.db_stat_blks':{
info:'<ul>'+
'<li><strong>blks_read:</strong> number of disk blocks read in this database.</li>'+
'<li><strong>blks_hit:</strong> number of times disk blocks were found already in the buffer cache, so that a read was not necessary (this only includes hits in the PostgreSQL buffer cache, not the operating system\'s file system cache).</li>'+
'</ul>'
},
'postgres.db_stat_tuple_write':{
info:'<ul><li>Number of rows inserted/updated/deleted.</li>'+
'<li><strong>conflicts:</strong> number of queries canceled due to conflicts with recovery in this database. (Conflicts occur only on standby servers; see <a href="https://www.postgresql.org/docs/10/static/monitoring-stats.html#PG-STAT-DATABASE-CONFLICTS-VIEW" target="_blank">pg_stat_database_conflicts</a> for details.)</li>'+
'</ul>'
},
'postgres.db_stat_temp_bytes':{
info:'Temporary files can be created on disk for sorts, hashes, and temporary query results.'
},
'postgres.db_stat_temp_files':{
info:'<ul>'+
'<li><strong>files:</strong> number of temporary files created by queries. All temporary files are counted, regardless of why the temporary file was created (e.g., sorting or hashing).</li>'+
'</ul>'
},
'postgres.archive_wal':{
info:'WAL archiving.<ul>'+
'<li><strong>total:</strong> total files.</li>'+
'<li><strong>ready:</strong> WAL files waiting to be archived. A growing number of ready WAL files can indicate that <code>archive_command</code> is failing; see <a href="https://www.postgresql.org/docs/current/static/continuous-archiving.html" target="_blank">Continuous Archiving and Point-in-Time Recovery</a>.</li>'+
'</ul>'
},
'postgres.checkpointer':{
info:'Number of checkpoints.<ul>'+
'<li><strong>scheduled:</strong> when checkpoint_timeout is reached.</li>'+
'<li><strong>requested:</strong> when max_wal_size is reached.</li>'+
'</ul>'+
'For more information see <a href="https://www.postgresql.org/docs/current/static/wal-configuration.html" target="_blank">WAL Configuration</a>.'
},
'postgres.autovacuum':{
info:'PostgreSQL databases require periodic maintenance known as vacuuming. For many installations, it is sufficient to let vacuuming be performed by the autovacuum daemon. '+
'For more information see <a href="https://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM" target="_blank">The Autovacuum Daemon</a>.'
},
'postgres.standby_delta':{
info:'Streaming replication delta.<ul>'+
'<li><strong>sent_delta:</strong> replication delta sent to standby.</li>'+
'<li><strong>write_delta:</strong> replication delta written to disk by this standby.</li>'+
'<li><strong>flush_delta:</strong> replication delta flushed to disk by this standby server.</li>'+
'<li><strong>replay_delta:</strong> replication delta replayed into the database on this standby server.</li>'+
'</ul>'+
'For more information see <a href="https://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION" target="_blank">Synchronous Replication</a>.'
},
'postgres.replication_slot':{
info:'Replication slot files.<ul>'+
'<li><strong>wal_keeped:</strong> WAL files retained by each replication slot.</li>'+
'<li><strong>pg_replslot_files:</strong> files present in pg_replslot.</li>'+
'</ul>'+
'For more information see <a href="https://www.postgresql.org/docs/current/static/warm-standby.html#STREAMING-REPLICATION-SLOTS" target="_blank">Replication Slots</a>.'
},
'postgres.backend_usage':{
info:'Connections usage against maximum connections allowed, as defined in the <i>max_connections</i> setting.<ul>'+
'<li><strong>available:</strong> maximum new connections allowed.</li>'+
'<li><strong>used:</strong> connections currently in use.</li>'+
'</ul>'+
'This assumes that non-superuser accounts are used to connect to Postgres (so <i>superuser_reserved_connections</i> are subtracted from <i>max_connections</i>).<br/>'+
'For more information see <a href="https://www.postgresql.org/docs/current/runtime-config-connection.html" target="_blank">Connections and Authentication</a>.'
},
'postgres.forced_autovacuum':{
info:'Percent towards forced autovacuum for one or more tables.<ul>'+
'<li><strong>percent_towards_forced_autovacuum:</strong> a forced autovacuum will run once this value reaches 100.</li>'+
'</ul>'+
'For more information see <a href="https://www.postgresql.org/docs/current/routine-vacuuming.html" target="_blank">Preventing Transaction ID Wraparound Failures</a>.'
},
'postgres.tx_wraparound_oldest_current_xid':{
info:'The oldest current transaction id (xid).<ul>'+
'<li><strong>oldest_current_xid:</strong> oldest current transaction id.</li>'+
'</ul>'+
'If for some reason autovacuum fails to clear old XIDs from a table, the system will begin to emit warning messages when the database\'s oldest XIDs reach eleven million transactions from the wraparound point.<br/>'+
'For more information see <a href="https://www.postgresql.org/docs/current/routine-vacuuming.html" target="_blank">Preventing Transaction ID Wraparound Failures</a>.'
},
'postgres.percent_towards_wraparound':{
info:'Percent towards transaction wraparound.<ul>'+
'<li><strong>percent_towards_wraparound:</strong> transaction wraparound may occur when this value reaches 100.</li>'+
'</ul>'+
'For more information see <a href="https://www.postgresql.org/docs/current/routine-vacuuming.html" target="_blank">Preventing Transaction ID Wraparound Failures</a>.'
},
'netdata.response_time':{
info:'The netdata API response time measures the time netdata needed to serve requests. This time includes everything, from the reception of the first byte of a request to the dispatch of the last byte of its reply; therefore it includes all network latencies involved (i.e. a client over a slow network will influence these metrics).'
},
'netdata.ebpf_threads':{
info:'Show total number of threads and number of active threads. For more details about the threads, see the <a href="https://learn.netdata.cloud/docs/agent/collectors/ebpf.plugin#ebpf-programs">official documentation</a>.'
},
'netdata.ebpf_load_methods':{
info:'Show the number of threads loaded using legacy code (independent binary) or <code>CO-RE (Compile Once Run Everywhere)</code>.'
},
'cgroup.io_full_pressure':{
info:'<b>Full</b> indicates the share of time in which all non-idle tasks are stalled on I/O simultaneously. '+
'In this state actual CPU cycles are going to waste, '+
'and a workload that spends extended time in this state is considered to be thrashing. '+
'The ratios (in %) are tracked as recent trends over 10-, 60-, and 300-second windows.'
},
'cgroup.swap_read':{
info:'The function <code>swap_readpage</code> is called when the kernel reads a page from swap memory. This chart is provided by the eBPF plugin.'
},
'cgroup.swap_write':{
info:'The function <code>swap_writepage</code> is called when the kernel writes a page to swap memory. This chart is provided by the eBPF plugin.'
},
'cgroup.fd_open':{
info:'Calls to the internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> we add a kprobe to <code>do_sys_openat2</code>), which is the common function called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/open.2.html" target="_blank">open(2)</a> and <a href="https://www.man7.org/linux/man-pages/man2/openat.2.html" target="_blank">openat(2)</a>.'
},
'cgroup.fd_open_error':{
info:'Failed calls to the internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> we add a kprobe to <code>do_sys_openat2</code>).'
},
'cgroup.fd_close':{
info:'Calls to the internal function <a href="https://elixir.bootlin.com/linux/v5.10/source/fs/file.c#L665" target="_blank">__close_fd</a> or <a href="https://elixir.bootlin.com/linux/v5.11/source/fs/file.c#L617" target="_blank">close_fd</a> according to your kernel version, which is called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/close.2.html" target="_blank">close(2)</a>.'
},
'cgroup.fd_close_error':{
info:'Failed calls to the internal function <a href="https://elixir.bootlin.com/linux/v5.10/source/fs/file.c#L665" target="_blank">__close_fd</a> or <a href="https://elixir.bootlin.com/linux/v5.11/source/fs/file.c#L617" target="_blank">close_fd</a> according to your kernel version.'
},
'cgroup.vfs_unlink':{
info:'Calls to the function <a href="https://www.kernel.org/doc/htmldocs/filesystems/API-vfs-unlink.html" target="_blank">vfs_unlink</a>. This chart does not show all events that remove files from the filesystem, because filesystems can create their own functions to remove files.'
},
'cgroup.vfs_write':{
info:'Successful calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'cgroup.vfs_write_error':{
info:'Failed calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'cgroup.vfs_read':{
info:'Successful calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'cgroup.vfs_read_error':{
info:'Failed calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'cgroup.vfs_write_bytes':{
info:'Total of bytes successfully written using the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>.'
},
'cgroup.vfs_read_bytes':{
info:'Total of bytes successfully read using the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>.'
},
'cgroup.process_create':{
info:'Calls to either <a href="https://programming.vip/docs/the-execution-procedure-of-do_fork-function-in-linux.html" target="_blank">do_fork</a>, or <code>kernel_clone</code> if you are running a kernel newer than <code>5.9.16</code>, to create a new task, which is the common name used to define processes and tasks inside the kernel. Netdata identifies the process by counting the number of calls to <a href="https://linux.die.net/man/2/clone" target="_blank">sys_clone</a> that do not have the flag <code>CLONE_THREAD</code> set.'
},
'cgroup.thread_create':{
info:'Calls to either <a href="https://programming.vip/docs/the-execution-procedure-of-do_fork-function-in-linux.html" target="_blank">do_fork</a>, or <code>kernel_clone</code> if you are running a kernel newer than <code>5.9.16</code>, to create a new task, which is the common name used to define processes and tasks inside the kernel. Netdata identifies the threads by counting the number of calls to <a href="https://linux.die.net/man/2/clone" target="_blank">sys_clone</a> that have the flag <code>CLONE_THREAD</code> set.'
},
'cgroup.task_exit':{
info:'Calls to the function responsible for closing (<a href="https://www.informit.com/articles/article.aspx?p=370047&seqNum=4" target="_blank">do_exit</a>) tasks.'
},
'cgroup.task_close':{
info:'Calls to the functions responsible for releasing (<a href="https://www.informit.com/articles/article.aspx?p=370047&seqNum=4" target="_blank">release_task</a>) tasks.'
},
'cgroup.task_error':{
info:'Number of errors encountered while creating a new process or thread. This chart is provided by the eBPF plugin.'
},
'cgroup.dc_ratio':{
info:'Percentage of file accesses that were present in the directory cache. 100% means that every file that was accessed was present in the directory cache. If files are not present in the directory cache, either 1) they do not exist in the file system or 2) they have not been accessed before. Read more about the <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>. Netdata also gives a summary for these charts in the <a href="#menu_filesystem_submenu_directory_cache__eBPF_">Filesystem submenu</a>.'
},
'cgroup.dc_reference':{
info:'Counters of file accesses. <code>Reference</code> is when there is a file access, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'cgroup.dc_not_cache':{
info:'Counters of file accesses. <code>Slow</code> is when there is a file access and the file is not present in the directory cache, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'cgroup.dc_not_found':{
info:'Counters of file accesses. <code>Miss</code> is when there is file access and the file is not found in the filesystem, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'cgroup.shmget':{
info:'Number of times the syscall <code>shmget</code> is called. Netdata also gives a summary for these charts in <a href="#menu_system_submenu_ipc_shared_memory">System overview</a>.'
},
'cgroup.net_bytes_recv':{
info:'Bytes received by the function <code>tcp_cleanup_rbuf</code>. We use <code>tcp_cleanup_rbuf</code> instead of <code>tcp_recvmsg</code>, because the latter misses <code>tcp_read_sock()</code> traffic and we would also need more probes to get the socket and packet size.'
},
'cgroup.cachestat_ratio':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is there, a page cache hit has occurred and the read is from the cache. If the entry is not there, a page cache miss has occurred and the kernel allocates a new entry and copies in data from the disk. Netdata calculates the percentage of accessed files that are cached in memory. <a href="https://github.com/iovisor/bcc/blob/master/tools/cachestat.py#L126-L138" target="_blank">The ratio</a> is calculated by counting the accessed cached pages (without counting dirty pages and pages added because of read misses) divided by total accesses without dirty pages.'
},
'cgroup.cachestat_dirties':{
info:'Number of <a href="https://en.wikipedia.org/wiki/Page_cache#Memory_conservation" target="_blank">dirty (modified) pages</a> in the page cache. Pages in the page cache modified after being brought in are called dirty pages. Since non-dirty pages in the page cache have identical copies in <a href="https://en.wikipedia.org/wiki/Secondary_storage" target="_blank">secondary storage</a> (e.g. hard disk drive or solid-state drive), discarding and reusing their space is much quicker than paging out application memory, and is often preferred over flushing the dirty pages into secondary storage and reusing their space.'
},
'cgroup.cachestat_hits':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is there, a page cache hit has occurred and the read is from the cache. Hits show pages accessed that were not modified (we are excluding dirty pages); this count also excludes the recent pages inserted for read.'
},
'cgroup.cachestat_misses':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is not there, a page cache miss has occurred and the kernel allocates a new entry and copies in data from the disk. Misses count page insertions into memory that are not related to writing.'
},
'services.throttle_io_write':{
info:'The amount of data transferred to specific devices as seen by the throttling policy.'
},
'services.throttle_io_ops_read':{
info:'The number of read operations performed on specific devices as seen by the throttling policy.'
},
'services.throttle_io_ops_write':{
info:'The number of write operations performed on specific devices as seen by the throttling policy.'
},
'services.queued_io_ops_read':{
info:'The number of queued read requests.'
},
'services.queued_io_ops_write':{
info:'The number of queued write requests.'
},
'services.merged_io_ops_read':{
info:'The number of read requests merged.'
},
'services.merged_io_ops_write':{
info:'The number of write requests merged.'
},
'services.swap_read':{
info:'The function <code>swap_readpage</code> is called when the kernel reads a page from swap memory. This chart is provided by the eBPF plugin.'
},
'services.swap_write':{
info:'The function <code>swap_writepage</code> is called when the kernel writes a page to swap memory. This chart is provided by the eBPF plugin.'
},
'services.fd_open':{
info:'Calls to the internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> we add a kprobe to <code>do_sys_openat2</code>), which is the common function called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/open.2.html" target="_blank">open(2)</a> and <a href="https://www.man7.org/linux/man-pages/man2/openat.2.html" target="_blank">openat(2)</a>.'
},
'services.fd_open_error':{
info:'Failed calls to the internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> we add a kprobe to <code>do_sys_openat2</code>).'
},
'services.fd_close':{
info:'Calls to the internal function <a href="https://elixir.bootlin.com/linux/v5.10/source/fs/file.c#L665" target="_blank">__close_fd</a> or <a href="https://elixir.bootlin.com/linux/v5.11/source/fs/file.c#L617" target="_blank">close_fd</a> according to your kernel version, which is called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/close.2.html" target="_blank">close(2)</a>.'
},
'services.fd_close_error':{
info:'Failed calls to the internal function <a href="https://elixir.bootlin.com/linux/v5.10/source/fs/file.c#L665" target="_blank">__close_fd</a> or <a href="https://elixir.bootlin.com/linux/v5.11/source/fs/file.c#L617" target="_blank">close_fd</a> according to your kernel version.'
},
'services.vfs_unlink':{
info:'Calls to the function <a href="https://www.kernel.org/doc/htmldocs/filesystems/API-vfs-unlink.html" target="_blank">vfs_unlink</a>. This chart does not show all events that remove files from the filesystem, because filesystems can create their own functions to remove files.'
},
'services.vfs_write':{
info:'Successful calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'services.vfs_write_error':{
info:'Failed calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'services.vfs_read':{
info:'Successful calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'services.vfs_read_error':{
info:'Failed calls to the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>. This chart may not show all filesystem events if it uses other functions to store data on disk.'
},
'services.vfs_write_bytes':{
info:'Total of bytes successfully written using the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_write</a>.'
},
'services.vfs_read_bytes':{
info:'Total of bytes successfully read using the function <a href="https://topic.alibabacloud.com/a/kernel-state-file-operation-__-work-information-kernel_8_8_20287135.html" target="_blank">vfs_read</a>.'
},
'services.process_create':{
info:'Calls to either <a href="https://programming.vip/docs/the-execution-procedure-of-do_fork-function-in-linux.html" target="_blank">do_fork</a>, or <code>kernel_clone</code> if you are running a kernel newer than 5.9.16, to create a new task, the common name used inside the kernel for both processes and threads. Netdata identifies processes by counting the number of calls to <a href="https://linux.die.net/man/2/clone" target="_blank">sys_clone</a> that do not have the flag <code>CLONE_THREAD</code> set.'
},
'services.thread_create':{
info:'Calls to either <a href="https://programming.vip/docs/the-execution-procedure-of-do_fork-function-in-linux.html" target="_blank">do_fork</a>, or <code>kernel_clone</code> if you are running a kernel newer than 5.9.16, to create a new task, the common name used inside the kernel for both processes and threads. Netdata identifies threads by counting the number of calls to <a href="https://linux.die.net/man/2/clone" target="_blank">sys_clone</a> that have the flag <code>CLONE_THREAD</code> set.'
},
'services.task_exit':{
info:'Calls to the functions responsible for closing (<a href="https://www.informit.com/articles/article.aspx?p=370047&seqNum=4" target="_blank">do_exit</a>) tasks.'
},
'services.task_close':{
info:'Calls to the functions responsible for releasing (<a href="https://www.informit.com/articles/article.aspx?p=370047&seqNum=4" target="_blank">release_task</a>) tasks.'
},
'services.task_error':{
info:'Number of errors encountered while creating a new process or thread. This chart is provided by the eBPF plugin.'
},
'services.dc_ratio':{
info:'Percentage of file accesses that were present in the directory cache. 100% means that every file that was accessed was present in the directory cache. If a file is not present in the directory cache, it either does not exist in the filesystem or it has not been accessed before. Read more about the <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>. Netdata also gives a summary for these charts in the <a href="#menu_filesystem_submenu_directory_cache__eBPF_">Filesystem submenu</a>.'
},
'services.dc_reference':{
info:'Counters of file accesses. <code>Reference</code> is when there is a file access, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'services.dc_not_cache':{
info:'Counters of file accesses. <code>Slow</code> is when there is a file access and the file is not present in the directory cache, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'services.dc_not_found':{
info:'Counters of file accesses. <code>Miss</code> is when there is file access and the file is not found in the filesystem, see the <code>filesystem.dc_reference</code> chart for more context. Read more about <a href="https://www.kernel.org/doc/htmldocs/filesystems/the_directory_cache.html" target="_blank">directory cache</a>.'
},
'services.shmget':{
info:'Number of times the syscall <code>shmget</code> is called. Netdata also gives a summary for these charts in <a href="#menu_system_submenu_ipc_shared_memory">System overview</a>.'
},
'services.shmat':{
info:'Number of times the syscall <code>shmat</code> is called.'
},
'services.shmdt':{
info:'Number of times the syscall <code>shmdt</code> is called.'
},
'services.shmctl':{
info:'Number of times the syscall <code>shmctl</code> is called.'
},
'services.net_bytes_send':{
info:'Bytes sent by the function <code>tcp_sendmsg</code>.'
},
'services.net_bytes_recv':{
info:'Bytes received by the function <code>tcp_cleanup_rbuf</code>. We use <code>tcp_cleanup_rbuf</code> instead of <code>tcp_recvmsg</code>, because the latter misses <code>tcp_read_sock()</code> traffic and would also require more probes to get the socket and packet size.'
},
'services.net_tcp_send':{
info:'The function <code>tcp_sendmsg</code> is used to collect the number of bytes sent over TCP connections.'
},
'services.net_tcp_recv':{
info:'The function <code>tcp_cleanup_rbuf</code> is used to collect the number of bytes received over TCP connections.'
},
'services.cachestat_ratio':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is there, a page cache hit has occurred and the read is from the cache. If the entry is not there, a page cache miss has occurred and the kernel allocates a new entry and copies in data from the disk. Netdata calculates the percentage of accessed files that are cached on memory. <a href="https://github.com/iovisor/bcc/blob/master/tools/cachestat.py#L126-L138" target="_blank">The ratio</a> is calculated counting the accessed cached pages (without counting dirty pages and pages added because of read misses) divided by total access without dirty pages.'
},
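// The cached-pages ratio described above can be sketched as follows. This is a
// simplified, hypothetical illustration of the bcc cachestat arithmetic, not
// Netdata's actual collector code; the function name is invented here.

```javascript
// Sketch of the cachestat ratio: accessed cached (hit) pages divided by
// total accesses, both excluding dirty pages and read-miss insertions.
function cachestatRatio(totalAccesses, misses, dirties) {
    const hits = totalAccesses - misses - dirties;   // accessed cached pages
    const accesses = totalAccesses - dirties;        // accesses without dirty pages
    return accesses > 0 ? (hits / accesses) * 100 : 0;
}
```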
'services.cachestat_dirties':{
info:'Number of <a href="https://en.wikipedia.org/wiki/Page_cache#Memory_conservation" target="_blank">dirty (modified) pages</a> in the page cache. Pages in the page cache that are modified after being brought in are called dirty pages. Since non-dirty pages in the page cache have identical copies in <a href="https://en.wikipedia.org/wiki/Secondary_storage" target="_blank">secondary storage</a> (e.g. hard disk drive or solid-state drive), discarding and reusing their space is much quicker than paging out application memory, and is often preferred over flushing the dirty pages into secondary storage and reusing their space.'
},
'services.cachestat_hits':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is there, a page cache hit has occurred and the read is from the cache. Hits show accessed pages that were not modified (dirty pages are excluded); this count also excludes pages recently inserted for reads.'
},
'services.cachestat_misses':{
info:'When the processor needs to read or write a location in main memory, it checks for a corresponding entry in the page cache. If the entry is not there, a page cache miss has occurred and the kernel allocates a new entry and copies in data from the disk. Misses count page insertions into memory that are not related to writing.'
},
'ceph.general_usage':{
info:'The used and available space in the Ceph cluster.'
},
'ceph.general_objects':{
info:'Total number of objects stored in the Ceph cluster.'
},
'ceph.general_bytes':{
info:'Amount of data read and written by the cluster per second.'
},
'ceph.general_operations':{
info:'Number of read and write operations per second.'
},
'ceph.general_latency':{
info:'Sum of the apply and commit latency in all OSDs. The apply latency is the total time taken to flush an update to disk. The commit latency is the total time taken to commit an operation to the journal.'
},
'ceph.pool_usage':{
info:'The used space in each pool.'
},
'ceph.pool_objects':{
info:'Number of objects present in each pool.'
},
'ceph.pool_read_bytes':{
info:'The amount of data read per second from each pool.'
},
'ceph.pool_write_bytes':{
info:'The amount of data written per second to each pool.'
},
'ceph.pool_read_objects':{
info:'Number of objects read per second from each pool.'
},
'ceph.pool_write_objects':{
info:'Number of objects written per second to each pool.'
},
'web_log.response_statuses':{
info:'Web server responses by type. <code>success</code> includes <b>1xx</b>, <b>2xx</b>, <b>304</b> and <b>401</b>, <code>error</code> includes <b>5xx</b>, <code>redirect</code> includes <b>3xx</b> except <b>304</b>, <code>bad</code> includes <b>4xx</b> except <b>401</b>, <code>other</code> are all the other responses.',
},
'web_log.response_codes':{
info:'Web server responses by code family. '+
'According to the standards <code>1xx</code> are informational responses, '+
'<code>2xx</code> are successful responses, '+
'<code>3xx</code> are redirects (although they include <b>304</b> which is used as "<b>not modified</b>"), '+
'<code>4xx</code> are bad requests, '+
'<code>5xx</code> are internal server errors, '+
'<code>other</code> are non-standard responses, '+
'<code>unmatched</code> counts the lines in the log file that are not matched by the plugin (<a href="https://github.com/netdata/netdata/issues/new?title=web_log%20reports%20unmatched%20lines&body=web_log%20plugin%20reports%20unmatched%20lines.%0A%0AThis%20is%20my%20log:%0A%0A%60%60%60txt%0A%0Aplease%20paste%20your%20web%20server%20log%20here%0A%0A%60%60%60" target="_blank">let us know</a> if you have any unmatched).'
},
'web_log.detailed_response_codes':{
info:'Number of responses for each response code individually.'
},
'web_log.requests_per_ipproto':{
info:'Web server requests received per IP protocol version.'
},
'web_log.clients':{
info:'Unique client IPs accessing the web server, within each data collection iteration. If data collection is <b>per second</b>, this chart shows <b>unique client IPs per second</b>.'
},
'web_log.clients_all':{
info:'Unique client IPs accessing the web server since the last restart of netdata. This plugin keeps in memory all the unique IPs that have accessed the web server. On very busy web servers (several millions of unique IPs) you may want to disable this chart (check <a href="https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/web_log/web_log.conf" target="_blank"><code>/etc/netdata/python.d/web_log.conf</code></a>).'
},
'web_log.squid_response_codes':{
info:'Web server responses by code family. '+
'According to HTTP standards <code>1xx</code> are informational responses, '+
'<code>2xx</code> are successful responses, '+
'<code>3xx</code> are redirects (although they include <b>304</b> which is used as "<b>not modified</b>"), '+
'<code>4xx</code> are bad requests, '+
'<code>5xx</code> are internal server errors. '+
'Squid also defines <code>000</code> mostly for UDP requests, and '+
'<code>6xx</code> for broken upstream servers sending wrong headers. '+
'Finally, <code>other</code> are non-standard responses, and '+
'<code>unmatched</code> counts the lines in the log file that are not matched by the plugin (<a href="https://github.com/netdata/netdata/issues/new?title=web_log%20reports%20unmatched%20lines&body=web_log%20plugin%20reports%20unmatched%20lines.%0A%0AThis%20is%20my%20log:%0A%0A%60%60%60txt%0A%0Aplease%20paste%20your%20web%20server%20log%20here%0A%0A%60%60%60" target="_blank">let us know</a> if you have any unmatched).'
},
'web_log.squid_detailed_response_codes':{
info:'Number of responses for each response code individually.'
},
'web_log.squid_clients':{
info:'Unique client IPs accessing squid, within each data collection iteration. If data collection is <b>per second</b>, this chart shows <b>unique client IPs per second</b>.'
},
'web_log.squid_clients_all':{
info:'Unique client IPs accessing squid since the last restart of netdata. This plugin keeps in memory all the unique IPs that have accessed the server. On very busy squid servers (several millions of unique IPs) you may want to disable this chart (check <a href="https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/web_log/web_log.conf" target="_blank"><code>/etc/netdata/python.d/web_log.conf</code></a>).'
},
'web_log.squid_transport_methods':{
info:'Breakdown per delivery method: <code>TCP</code> are requests on the HTTP port (usually 3128), '+
'<code>UDP</code> are requests on the ICP port (usually 3130), or HTCP port (usually 4128). '+
'If ICP logging was disabled using the log_icp_queries option, no ICP replies will be logged. '+
'<code>NONE</code> are used to state that squid delivered an unusual response or no response at all. '+
'Seen with cachemgr requests and errors, usually when the transaction fails before being classified into one of the above outcomes. '+
'Also seen with responses to <code>CONNECT</code> requests.'
},
'web_log.squid_code':{
info:'These are combined squid result status codes. A breakdown per component is given in the following charts.'
},
'web_log.squid_handling_tags':{
info:'These tags are optional and describe why the particular handling was performed or where the request came from. '+
'<code>CLIENT</code> means that the client request placed limits affecting the response. Usually seen when the client issued a <b>no-cache</b>, or an analogous cache control command, along with the request. Thus, the cache has to validate the object. '+
'<code>IMS</code> states that the client sent a revalidation (conditional) request. '+
'<code>ASYNC</code> is used when the request was generated internally by Squid. Usually this is background fetches for cache information exchanges, background revalidation from stale-while-revalidate cache controls, or ESI sub-objects being loaded. '+
'<code>SWAPFAIL</code> is assigned when the object was believed to be in the cache, but could not be accessed. A new copy was requested from the server. '+
'<code>REFRESH</code> when a revalidation (conditional) request was sent to the server. '+
'<code>SHARED</code> when this request was combined with an existing transaction by collapsed forwarding. NOTE: the existing request is not marked as SHARED. '+
'<code>REPLY</code> when particular handling was requested in the HTTP reply from server or peer. Usually seen on DENIED due to http_reply_access ACLs preventing delivery of servers response object to the client.'
},
'web_log.squid_object_types':{
info:'These tags are optional and describe what type of object was produced. '+
'<code>NEGATIVE</code> is only seen on HIT responses, indicating the response was a cached error response. e.g. <b>404 not found</b>. '+
'<code>STALE</code> means the object was cached and served stale. This is usually caused by stale-while-revalidate or stale-if-error cache controls. '+
'<code>OFFLINE</code> when the requested object was retrieved from the cache during offline_mode. The offline mode never validates any object. '+
'<code>INVALID</code> when an invalid request was received. An error response was delivered indicating what the problem was. '+
'<code>FAIL</code> is only seen on <code>REFRESH</code> to indicate the revalidation request failed. The response object may be the server provided network error or the stale object which was being revalidated depending on stale-if-error cache control. '+
'<code>MODIFIED</code> is only seen on <code>REFRESH</code> responses to indicate revalidation produced a new modified object. '+
'<code>UNMODIFIED</code> is only seen on <code>REFRESH</code> responses to indicate revalidation produced a <b>304</b> (Not Modified) status, which was relayed to the client. '+
'<code>REDIRECT</code> when squid generated an HTTP redirect response to this request.'
},
'web_log.squid_cache_events':{
info:'These tags are optional and describe whether the response was loaded from cache, network, or otherwise. '+
'<code>HIT</code> when the response object delivered was the local cache object. '+
'<code>MEM</code> when the response object came from memory cache, avoiding disk accesses. Only seen on HIT responses. '+
'<code>MISS</code> when the response object delivered was the network response object. '+
'<code>DENIED</code> when the request was denied by access controls. '+
'<code>NOFETCH</code> an ICP specific type, indicating service is alive, but not to be used for this request (sent during "-Y" startup, or during frequent failures, a cache in hit only mode will return either UDP_HIT or UDP_MISS_NOFETCH. Neighbours will thus only fetch hits). '+
'<code>TUNNEL</code> when a binary tunnel was established for this transaction.'
},
'web_log.squid_transport_errors':{
info:'These tags are optional and describe some error conditions which occurred during response delivery (if any). '+
'<code>ABORTED</code> when the response was not completed due to the connection being aborted (usually by the client). '+
'<code>TIMEOUT</code>, when the response was not completed due to a connection timeout.'
},
'fronius.power':{
info:'Positive <code>Grid</code> values mean that power is coming from the grid. Negative values are excess power that is going back into the grid, possibly selling it. '+
'<code>Photovoltaics</code> is the power generated from the solar panels. '+
'<code>Accumulator</code> is the stored power in the accumulator, if one is present.'
},
'fronius.autonomy':{
commonMin:true,
commonMax:true,
valueRange:"[0, 100]",
info:'The <code>Autonomy</code> is the percentage of how autonomous the installation is. An autonomy of 100 % means that the installation is producing more energy than it needs. '+
'The <code>Self consumption</code> indicates the ratio between the current power generated and the current load. When it reaches 100 %, the <code>Autonomy</code> declines, since the solar panels can not produce enough energy and need support from the grid.'
},
'portcheck.latency':{
info:'The <code>latency</code> describes the time spent connecting to a TCP port. No data is sent or received. '+
'Currently, the accuracy of the latency is low and should be used as reference only.'
},
'portcheck.status':{
valueRange:"[0, 1]",
info:'The <code>status</code> chart verifies the availability of the service. '+
'Each status dimension will have a value of <code>1</code> if triggered. Dimension <code>success</code> is <code>1</code> only if a connection could be established. '+
'This chart is most useful for alarms and third-party apps.'
},
'chrony.system':{
info:'In normal operation, chronyd never steps the system clock, because any jump in the timescale can have adverse consequences for certain application programs. Instead, any error in the system clock is corrected by slightly speeding up or slowing down the system clock until the error has been removed, and then returning to the system clock’s normal speed. A consequence of this is that there will be a period when the system clock (as read by other programs using the <code>gettimeofday()</code> system call, or by the <code>date</code> command in the shell) will be different from chronyd\'s estimate of the current true time (which it reports to NTP clients when it is operating in server mode). The value reported on this line is the difference due to this effect.',
colors:NETDATA.colors[3]
},
'chrony.offsets':{
info:'<code>last offset</code> is the estimated local offset on the last clock update. <code>RMS offset</code> is a long-term average of the offset value.',
height:0.5
},
'chrony.stratum':{
info:'The <code>stratum</code> indicates how many hops away from a computer with an attached reference clock we are. Such a computer is a stratum-1 computer.',
decimalDigits:0,
height:0.5
},
'chrony.root':{
info:'Estimated delays against the root time server this system is synchronized with. <code>delay</code> is the total of the network path delays to the stratum-1 computer from which the computer is ultimately synchronised. <code>dispersion</code> is the total dispersion accumulated through all the computers back to the stratum-1 computer from which the computer is ultimately synchronised. Dispersion is due to system clock resolution, statistical measurement variations etc.'
},
'chrony.frequency':{
info:'The <code>frequency</code> is the rate by which the system\'s clock would be wrong if chronyd was not correcting it. It is expressed in ppm (parts per million). For example, a value of 1ppm would mean that when the system\'s clock thinks it has advanced 1 second, it has actually advanced by 1.000001 seconds relative to true time.',
colors:NETDATA.colors[0]
},
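// A quick illustration of the ppm figure above (a hypothetical helper, not part
// of this dashboard): a clock running fast by f ppm advances 1 + f/1,000,000
// apparent seconds per true second.

```javascript
// Hypothetical helper illustrating the ppm relation: a clock that is fast
// by `ppm` parts per million gains ppm/1e6 seconds for every true second.
function apparentSeconds(trueSeconds, ppm) {
    return trueSeconds * (1 + ppm / 1e6);
}
```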
'chrony.residualfreq':{
info:'This shows the <code>residual frequency</code> for the currently selected reference source. '+
'It reflects any difference between what the measurements from the reference source indicate the '+
'frequency should be and the frequency currently being used. The reason this is not always zero is '+
'that a smoothing procedure is applied to the frequency. Each time a measurement from the reference '+
'source is obtained and a new residual frequency computed, the estimated accuracy of this residual '+
'is compared with the estimated accuracy (see <code>skew</code>) of the existing frequency value. '+
'A weighted average is computed for the new frequency, with weights depending on these accuracies. '+
'If the measurements from the reference source follow a consistent trend, the residual will be '+
'driven to zero over time.',
height:0.5,
colors:NETDATA.colors[3]
},
'chrony.skew':{
info:'The estimated error bound on the frequency.',
height:0.5,
colors:NETDATA.colors[5]
},
'couchdb.active_tasks':{
info:'Active tasks running on this CouchDB <b>cluster</b>. Four types of tasks currently exist: indexer (view building), replication, database compaction and view compaction.'
},
'couchdb.replicator_jobs':{
info:'Detailed breakdown of any replication jobs in progress on this node. For more information, see the <a href="http://docs.couchdb.org/en/latest/replication/replicator.html" target="_blank">replicator documentation</a>.'
},
'couchdb.open_files':{
info:'Count of all files held open by CouchDB. If this value seems pegged at 1024 or 4096, your server process is probably hitting the open file handle limit and <a href="http://docs.couchdb.org/en/latest/maintenance/performance.html#pam-and-ulimit" target="_blank">needs to be increased.</a>'
},
'btrfs.disk':{
info:'Physical disk usage of BTRFS. The disk space reported here is the raw physical disk space assigned to the BTRFS volume (i.e. <b>before any RAID levels</b>). BTRFS uses a two-stage allocator, first allocating large regions of disk space for one type of block (data, metadata, or system), and then using a regular block allocator inside those regions. <code>unallocated</code> is the physical disk space that is not allocated yet and is available to become data, metadata or system on demand. When <code>unallocated</code> is zero, all available disk space has been allocated to a specific function. Healthy volumes should ideally have at least five percent of their total space <code>unallocated</code>. You can keep your volume healthy by running the <code>btrfs balance</code> command on it regularly (check <code>man btrfs-balance</code> for more info). Note that some of the space listed as <code>unallocated</code> may not actually be usable if the volume uses devices of different sizes.',
},
'btrfs.data':{
info:'Logical disk usage for BTRFS data. Data chunks are used to store the actual file data (file contents). The disk space reported here is the usable allocation (i.e. after any striping or replication). Healthy volumes should ideally have no more than a few GB of free space reported here persistently. Running <code>btrfs balance</code> can help here.'
},
'btrfs.metadata':{
info:'Logical disk usage for BTRFS metadata. Metadata chunks store most of the filesystem internal structures, as well as information like directory structure and file names. The disk space reported here is the usable allocation (i.e. after any striping or replication). Healthy volumes should ideally have no more than a few GB of free space reported here persistently. Running <code>btrfs balance</code> can help here.'
},
'btrfs.system':{
info:'Logical disk usage for BTRFS system. System chunks store information about the allocation of other chunks. The disk space reported here is the usable allocation (i.e. after any striping or replication). The values reported here should be relatively small compared to Data and Metadata, and will scale with the volume size and overall space usage.'
},
'rabbitmq.global_counts':{
info:'Overall totals for channels, consumers, connections, queues and exchanges.'
},
'rabbitmq.file_descriptors':{
info:'Total number of used file descriptors. See <code><a href="https://www.rabbitmq.com/production-checklist.html#resource-limits-file-handle-limit" target="_blank">Open File Limits</a></code> for further details.',
colors:NETDATA.colors[3]
},
'rabbitmq.sockets':{
info:'Total number of used socket descriptors. Each used socket also counts as a used file descriptor. See <code><a href="https://www.rabbitmq.com/production-checklist.html#resource-limits-file-handle-limit" target="_blank">Open File Limits</a></code> for further details.',
colors:NETDATA.colors[3]
},
'rabbitmq.processes':{
info:'Total number of processes running within the Erlang VM. This is not the same as the number of processes running on the host.',
colors:NETDATA.colors[3]
},
'rabbitmq.erlang_run_queue':{
info:'Number of Erlang processes the Erlang schedulers have queued to run.',
colors:NETDATA.colors[3]
},
'rabbitmq.memory':{
info:'Total amount of memory used by RabbitMQ. This is a complex statistic that can be further analyzed in the management UI. See <code><a href="https://www.rabbitmq.com/production-checklist.html#resource-limits-ram" target="_blank">Memory</a></code> for further details.',
colors:NETDATA.colors[3]
},
'rabbitmq.disk_space':{
info:'Total amount of disk space consumed by the message store(s). See <code><a href="https://www.rabbitmq.com/production-checklist.html#resource-limits-disk-space" target="_blank">Disk Space Limits</a></code> for further details.',
colors:NETDATA.colors[3]
},
'rabbitmq.queue_messages':{
info:'Total number of messages and their states in this queue.',
},
'ntpd.sys_offset':{
info:'For hosts without any time critical services an offset of < 100 ms should be acceptable even with high network latencies. For hosts with time critical services an offset of about 0.01 ms or less can be achieved by using peers with low delays and configuring optimal <b>poll exponent</b> values.',
colors:NETDATA.colors[4]
},
'ntpd.sys_jitter':{
info:'The jitter statistics are exponentially-weighted RMS averages. The system jitter is defined in the NTPv4 specification; the clock jitter statistic is computed by the clock discipline module.'
},
'ntpd.sys_frequency':{
info:'The frequency offset is shown in ppm (parts per million) relative to the frequency of the system. The frequency correction needed for the clock can vary significantly between boots and also due to external influences like temperature or radiation.',
colors:NETDATA.colors[2],
height:0.6
},
'ntpd.sys_wander':{
info:'The wander statistics are exponentially-weighted RMS averages.',
colors:NETDATA.colors[3],
height:0.6
},
'ntpd.sys_rootdelay':{
info:'The rootdelay is the round-trip delay to the primary reference clock, similar to the delay shown by the <code>ping</code> command. A lower delay should result in a lower clock offset.',
colors:NETDATA.colors[1]
},
'ntpd.sys_stratum':{
info:'The distance in "hops" to the primary reference clock.'
},
'ntpd.sys_tc':{
info:'Time constants and poll intervals are expressed as exponents of 2. The default poll exponent of 6 corresponds to a poll interval of 64 s. For typical Internet paths, the optimum poll interval is about 64 s. For fast LANs with modern computers, a poll exponent of 4 (16 s) is appropriate. The <a href="http://doc.ntp.org/current-stable/poll.html" target="_blank">poll process</a> sends NTP packets at intervals determined by the clock discipline algorithm.',
},
'ntpd.peer_offset':{
info:'The offset of the peer clock relative to the system clock in milliseconds. Smaller values here weight peers more heavily for selection after the initial synchronization of the local clock. For a system providing time service to other systems, these should be as low as possible.'
},
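// The poll-exponent relation described above is just a power of two; a minimal
// sketch with a hypothetical helper name:

```javascript
// The poll interval in seconds is 2 raised to the poll exponent,
// so the default exponent of 6 yields a 64-second interval.
function pollIntervalSeconds(pollExponent) {
    return 2 ** pollExponent;
}
```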
'ntpd.peer_delay':{
info:'The round-trip time (RTT) for communication with the peer, similar to the delay shown by the <code>ping</code> command. Not as critical as either the offset or jitter, but still factored into the selection algorithm (because as a general rule, lower delay means more accurate time). In most cases, it should be below 100ms.'
},
'ntpd.peer_dispersion':{
info:'This is a measure of the estimated error between the peer and the local system. Lower values here are better.'
},
'ntpd.peer_jitter':{
info:'This is essentially a remote estimate of the peer\'s <code>system_jitter</code> value. Lower values here weight highly in favor of peer selection, and this is a good indicator of overall quality of a given time server (good servers will have values not exceeding single digit milliseconds here, with high quality stratum one servers regularly having sub-millisecond jitter).'
},
'ntpd.peer_xleave':{
info:'This variable is used in interleaved mode (used only in NTP symmetric and broadcast modes). See <a href="http://doc.ntp.org/current-stable/xleave.html" target="_blank">NTP Interleaved Modes</a>.'
},
'ntpd.peer_rootdelay':{
info:'For a stratum 1 server, this is the access latency for the reference clock. For lower stratum servers, it is the sum of the <code>peer_delay</code> and <code>peer_rootdelay</code> for the system they are syncing off of. Similarly to <code>peer_delay</code>, lower values here are technically better, but have limited influence in peer selection.'
},
'ntpd.peer_rootdisp':{
info:'Is the same as <code>peer_rootdelay</code>, but measures accumulated <code>peer_dispersion</code> instead of accumulated <code>peer_delay</code>.'
},
'ntpd.peer_hmode':{
info:'The <code>peer_hmode</code> and <code>peer_pmode</code> variables give info about what mode the packets being sent to and received from a given peer are. Mode 1 is symmetric active (both the local system and the remote peer have each other declared as peers in <code>/etc/ntp.conf</code>), Mode 2 is symmetric passive (only one side has the other declared as a peer), Mode 3 is client, Mode 4 is server, and Mode 5 is broadcast (also used for multicast and manycast operation).',
height:0.2
},
'ntpd.peer_pmode':{
height:0.2
},
'ntpd.peer_hpoll':{
info:'The <code>peer_hpoll</code> and <code>peer_ppoll</code> variables are log2 representations of the polling interval in seconds.',
height:0.5
},
'ntpd.peer_ppoll':{
height:0.5
},
'ntpd.peer_precision':{
height:0.2
},
'spigotmc.tps':{
info:'The running 1, 5, and 15 minute average number of server ticks per second. An idealized server will show 20.0 for all values, but in practice this almost never happens. Typical servers should show approximately 19.98-20.0 here. Lower values indicate progressively more server-side lag (and thus that you need better hardware for your server or a lower user limit). For every 0.05 ticks below 20, redstone clocks will lag behind by approximately 0.25%. Values below approximately 19.50 may interfere with complex free-running redstone circuits and will noticeably slow down growth.'
},
'boinc.tasks':{
info:'The total number of tasks and the number of active tasks. Active tasks are those which are either currently being processed, or are partially processed but suspended.'
},
'boinc.states':{
info:'Counts of tasks in each task state. The normal sequence of states is <code>New</code>, <code>Downloading</code>, <code>Ready to Run</code>, <code>Uploading</code>, <code>Uploaded</code>. Tasks which are marked <code>Ready to Run</code> may be actively running, or may be waiting to be scheduled. <code>Compute Errors</code> are tasks which failed for some reason during execution. <code>Aborted</code> tasks were manually cancelled, and will not be processed. <code>Failed Uploads</code> are otherwise finished tasks which failed to upload to the server, and usually indicate networking issues.'
},
'boinc.sched':{
info:'Counts of active tasks in each scheduling state. <code>Scheduled</code> tasks are the ones which will run if the system is permitted to process tasks. <code>Preempted</code> tasks are on standby, and will run if a <code>Scheduled</code> task stops running for some reason. <code>Uninitialized</code> tasks should never be present, and indicate that the scheduler has not tried to schedule them yet.'
},
'boinc.process':{
info:'Counts of active tasks in each process state. <code>Executing</code> tasks are running right now. <code>Suspended</code> tasks have an associated process, but are not currently running (either because the system isn\'t processing any tasks right now, or because they have been preempted by higher priority tasks). <code>Quit</code> tasks are exiting gracefully. <code>Aborted</code> tasks exceeded some resource limit, and are being shut down. <code>Copy Pending</code> tasks are waiting on a background file transfer to finish. <code>Uninitialized</code> tasks do not have an associated process yet.'
},
'w1sensor.temp':{
info:'Temperature derived from 1-Wire temperature sensors.'
},
'logind.sessions':{
info:'Shows the number of active sessions of each type tracked by logind.'
},
'logind.users':{
info:'Shows the number of active users of each type tracked by logind.'
},
'logind.seats':{
info:'Shows the number of active seats tracked by logind. Each seat corresponds to a combination of a display device and input device providing a physical presence for the system.'
'<code>1=ONLINE</code> backend server is fully operational, '+
'<code>2=SHUNNED</code> backend server is temporarily taken out of use, either because of too many connection errors in a short time, or because replication lag exceeded the allowed threshold, '+
'<code>3=OFFLINE_SOFT</code> when a server is put into OFFLINE_SOFT mode, new incoming connections aren\'t accepted anymore, while the existing connections are kept until they become inactive. In other words, connections are kept in use until the current transaction is completed. This allows a backend to be detached gracefully, '+
'<code>4=OFFLINE_HARD</code> when a server is put into OFFLINE_HARD mode, the existing connections are dropped, while new incoming connections aren\'t accepted either. This is equivalent to deleting the server from a hostgroup, or temporarily taking it out of the hostgroup for maintenance work, '+
'<code>slow_queries</code> number of queries that ran for longer than the threshold in milliseconds defined in global variable <code>mysql-long_query_time</code>. '
info:'Percentage of used machine memory: <code>consumed</code> / <code>machine-memory-size</code>.'
},
'vsphere.host_mem_usage':{
info:
'<code>granted</code> is amount of machine memory that is mapped for a host, '+
'it equals sum of all granted metrics for all powered-on virtual machines, plus machine memory for vSphere services on the host. '+
'<code>consumed</code> is amount of machine memory used on the host, it includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. '+
'For details see <a href="https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.resmgmt.doc/GUID-BFDC988B-F53D-4E97-9793-A002445AFAE1.html" target="_blank">Measuring and Differentiating Types of Memory Usage</a> and '+
'<code>granted</code> is amount of guest “physical” memory that is mapped to machine memory, it includes <code>shared</code> memory amount. '+
'<code>consumed</code> is amount of guest “physical” memory consumed by the virtual machine for guest memory, '+
'<code>consumed</code> = <code>granted</code> - <code>memory saved due to memory sharing</code>. '+
'<code>active</code> is amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages. '+
'<code>shared</code> is amount of guest “physical” memory shared with other virtual machines (through the VMkernel’s transparent page-sharing mechanism, a RAM de-duplication technique). '+
'For details see <a href="https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.resmgmt.doc/GUID-BFDC988B-F53D-4E97-9793-A002445AFAE1.html" target="_blank">Measuring and Differentiating Types of Memory Usage</a> and '+
info:'Total number of requests (log lines read). It includes <code>unmatched</code>.'
},
'squidlog.excluded_requests':{
info:'<code>unmatched</code> counts the lines in the log file that are not matched by the plugin parser (<a href="https://github.com/netdata/netdata/issues/new?title=squidlog%20reports%20unmatched%20lines&body=squidlog%20plugin%20reports%20unmatched%20lines.%0A%0AThis%20is%20my%20log:%0A%0A%60%60%60txt%0A%0Aplease%20paste%20your%20squid%20server%20log%20here%0A%0A%60%60%60" target="_blank">let us know</a> if you have any unmatched).'
},
'squidlog.type_requests':{
info:'Requests by response type:<br>'+
'<ul>'+
' <li><code>success</code> includes 1xx, 2xx, 0, 304, 401.</li>'+
' <li><code>error</code> includes 5xx and 6xx.</li>'+
' <li><code>redirect</code> includes 3xx except 304.</li>'+
' <li><code>bad</code> includes 4xx except 401.</li>'+
' </ul>'
},
'squidlog.http_status_code_class_responses':{
info:'The HTTP response status code classes. According to <a href="https://tools.ietf.org/html/rfc7231" target="_blank">rfc7231</a>:<br>'+
'<ul>'+
' <li><code>1xx</code> is informational responses.</li>'+
' <li><code>2xx</code> is successful responses.</li>'+
' <li><code>3xx</code> is redirects.</li>'+
' <li><code>4xx</code> is bad requests.</li>'+
' <li><code>5xx</code> is internal server errors.</li>'+
' </ul>'+
'Squid also uses <code>0</code> when the result code is unavailable, and <code>6xx</code> to signal an invalid header (a proxy error).'
},
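/*
  A minimal, standalone sketch (illustrative only, not part of this dashboard) of how a
  status code maps to the class labels described above, including the Squid-specific
  <code>0</code> value; the function name is hypothetical:

```javascript
// Map an HTTP status code to its class label; 0 means the result code was unavailable.
function statusCodeClass(code) {
    if (code === 0) return 'unavailable';
    const cls = Math.floor(code / 100);
    return (cls >= 1 && cls <= 6) ? cls + 'xx' : 'unknown';
}
```
*/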
'squidlog.http_status_code_responses':{
info:'Number of responses for each http response status code individually.'
},
'squidlog.uniq_clients':{
info:'Unique clients (requesting instances), within each data collection iteration. If data collection is <b>per second</b>, this chart shows <b>unique clients per second</b>.'
},
'squidlog.bandwidth':{
info:'The size is the amount of data delivered to the clients. Mind that this does not constitute the net object size, as headers are also counted. '+
'Also, failed requests may deliver an error page, the size of which is also logged here.'
},
'squidlog.response_time':{
info:'The elapsed time considers how many milliseconds the transaction busied the cache. It differs in interpretation between TCP and UDP:'+
'<ul>'+
' <li><code>TCP</code> this is basically the time from having received the request to when Squid finishes sending the last byte of the response.</li>'+
' <li><code>UDP</code> this is the time between scheduling a reply and actually sending it.</li>'+
' </ul>'+
'Please note that <b>the entries are logged after the reply finished being sent</b>, not during the lifetime of the transaction.'
},
'squidlog.cache_result_code_requests':{
info:'The Squid result code is composed of several tags (separated by underscore characters) which describe the response sent to the client.'
},
'squidlog.cache_result_code_transport_tag_requests':{
info:'These tags are always present and describe the delivery method.<br>'+
'<ul>'+
' <li><code>TCP</code> requests on the HTTP port (usually 3128).</li>'+
' <li><code>UDP</code> requests on the ICP port (usually 3130) or HTCP port (usually 4128).</li>'+
' <li><code>NONE</code> Squid delivered an unusual response or no response at all. Seen with cachemgr requests and errors, usually when the transaction fails before being classified into one of the above outcomes. Also seen with responses to <code>CONNECT</code> requests.</li>'+
' </ul>'
},
'squidlog.cache_result_code_handling_tag_requests':{
info:'These tags are optional and describe why the particular handling was performed or where the request came from.<br>'+
'<ul>'+
' <li><code>CF</code> at least one request in this transaction was collapsed. See <a href="http://www.squid-cache.org/Doc/config/collapsed_forwarding/" target="_blank">collapsed_forwarding</a> for more details about request collapsing.</li>'+
' <li><code>CLIENT</code> usually seen when the client issued a "no-cache" or analogous cache control command along with the request, so the cache has to validate the object.</li>'+
' <li><code>IMS</code> the client sent a revalidation (conditional) request.</li>'+
' <li><code>ASYNC</code> the request was generated internally by Squid. Usually this is background fetches for cache information exchanges, background revalidation from <i>stale-while-revalidate</i> cache controls, or ESI sub-objects being loaded.</li>'+
' <li><code>SWAPFAIL</code> the object was believed to be in the cache, but could not be accessed. A new copy was requested from the server.</li>'+
' <li><code>REFRESH</code> a revalidation (conditional) request was sent to the server.</li>'+
' <li><code>SHARED</code> this request was combined with an existing transaction by collapsed forwarding.</li>'+
' <li><code>REPLY</code> the HTTP reply from server or peer. Usually seen on <code>DENIED</code> due to <a href="http://www.squid-cache.org/Doc/config/http_reply_access/" target="_blank">http_reply_access</a> ACLs preventing delivery of servers response object to the client.</li>'+
' </ul>'
},
'squidlog.cache_code_object_tag_requests':{
info:'These tags are optional and describe what type of object was produced.<br>'+
'<ul>'+
' <li><code>NEGATIVE</code> only seen on HIT responses, indicating the response was a cached error response. e.g. <b>404 not found</b>.</li>'+
' <li><code>STALE</code> the object was cached and served stale. This is usually caused by <i>stale-while-revalidate</i> or <i>stale-if-error</i> cache controls.</li>'+
' <li><code>OFFLINE</code> the requested object was retrieved from the cache during <a href="http://www.squid-cache.org/Doc/config/offline_mode/" target="_blank">offline_mode</a>. The offline mode never validates any object.</li>'+
' <li><code>INVALID</code> an invalid request was received. An error response was delivered indicating what the problem was.</li>'+
' <li><code>FAILED</code> only seen on <code>REFRESH</code> to indicate the revalidation request failed. The response object may be the server provided network error or the stale object which was being revalidated depending on stale-if-error cache control.</li>'+
' <li><code>MODIFIED</code> only seen on <code>REFRESH</code> responses to indicate revalidation produced a new modified object.</li>'+
' <li><code>UNMODIFIED</code> only seen on <code>REFRESH</code> responses to indicate revalidation produced a 304 (Not Modified) status. The client gets either a full 200 (OK), a 304 (Not Modified), or (in theory) another response, depending on the client request and other details.</li>'+
' <li><code>REDIRECT</code> Squid generated an HTTP redirect response to this request.</li>'+
' </ul>'
},
'squidlog.cache_code_load_source_tag_requests':{
info:'These tags are optional and describe whether the response was loaded from cache, network, or otherwise.<br>'+
'<ul>'+
' <li><code>HIT</code> the response object delivered was the local cache object.</li>'+
' <li><code>MEM</code> the response object came from memory cache, avoiding disk accesses. Only seen on HIT responses.</li>'+
' <li><code>MISS</code> the response object delivered was the network response object.</li>'+
' <li><code>DENIED</code> the request was denied by access controls.</li>'+
' <li><code>NOFETCH</code> an ICP specific type, indicating service is alive, but not to be used for this request.</li>'+
' <li><code>TUNNEL</code> a binary tunnel was established for this transaction.</li>'+
' </ul>'
},
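/*
  The cache result codes counted in these charts combine tags from the groups documented
  above (transport, handling, object, load source, error). As an illustrative, standalone
  sketch (not used by this dashboard; the tag subsets and function name are abbreviated
  assumptions), a code such as TCP_MEM_HIT_ABORTED can be split back into those groups:

```javascript
// Split a Squid result code into its tag groups (illustrative subset of tags).
const TRANSPORT = new Set(['TCP', 'UDP', 'NONE']);
const LOAD_SOURCE = new Set(['HIT', 'MEM', 'MISS', 'DENIED', 'NOFETCH', 'TUNNEL']);
const ERRORS = new Set(['ABORTED', 'TIMEOUT', 'IGNORED']);

function parseResultCode(code) {
    const tags = code.split('_');
    return {
        transport: tags.filter(t => TRANSPORT.has(t)),
        loadSource: tags.filter(t => LOAD_SOURCE.has(t)),
        error: tags.filter(t => ERRORS.has(t)),
        other: tags.filter(t => !TRANSPORT.has(t) && !LOAD_SOURCE.has(t) && !ERRORS.has(t)),
    };
}
```
*/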
'squidlog.cache_code_error_tag_requests':{
info:'These tags are optional and describe some error conditions which occurred during response delivery.<br>'+
'<ul>'+
' <li><code>ABORTED</code> the response was not completed due to the connection being aborted (usually by the client).</li>'+
' <li><code>TIMEOUT</code> the response was not completed due to a connection timeout.</li>'+
' <li><code>IGNORED</code> while refreshing a previously cached response A, Squid got a response B that was older than A (as determined by the Date header field). Squid ignored response B (and attempted to use A instead).</li>'+
' </ul>'
},
'squidlog.http_method_requests':{
info:'The request method to obtain an object. Please refer to section <a href="https://wiki.squid-cache.org/SquidFaq/SquidLogs#Request_methods" target="_blank">request-methods</a> for available methods and their description.'
},
'squidlog.hier_code_requests':{
info:'A code that explains how the request was handled, e.g. by forwarding it to a peer, or going straight to the source. '+
'Any hierarchy tag may be prefixed with <code>TIMEOUT_</code> if a timeout occurred while waiting for all ICP replies to return from the neighbours. The timeout is either dynamic, if <a href="http://www.squid-cache.org/Doc/config/icp_query_timeout/" target="_blank">icp_query_timeout</a> was not set, or the time configured there has run out. '+
'Refer to <a href="https://wiki.squid-cache.org/SquidFaq/SquidLogs#Hierarchy_Codes" target="_blank">Hierarchy Codes</a> for details on hierarchy codes.'
},
'squidlog.server_address_forwarded_requests':{
info:'The IP address or hostname where the request (if a miss) was forwarded. For requests sent to origin servers, this is the origin server\'s IP address. '+
'For requests sent to a neighbor cache, this is the neighbor\'s hostname. NOTE: older versions of Squid would put the origin server hostname here.'
},
'squidlog.mime_type_requests':{
info:'The content type of the object as seen in the HTTP reply header. Please note that ICP exchanges usually don\'t have any content type.'
info:'Current combined CPU utilization, calculated as <code>(user+system)/num of logical CPUs</code>.'
},
'cockroachdb.host_disk_bandwidth':{
info:'Summary disk bandwidth statistics across all system host disks.'
},
'cockroachdb.host_disk_operations':{
info:'Summary disk operations statistics across all system host disks.'
},
'cockroachdb.host_disk_iops_in_progress':{
info:'Summary disk IOPS in progress statistics across all system host disks.'
},
'cockroachdb.host_network_bandwidth':{
info:'Summary network bandwidth statistics across all system host network interfaces.'
},
'cockroachdb.host_network_packets':{
info:'Summary network packets statistics across all system host network interfaces.'
},
'cockroachdb.live_nodes':{
info:'Will be <code>0</code> if this node is not itself live.'
},
'cockroachdb.total_storage_capacity':{
info:'Entire disk capacity. It includes non-CR data, CR data, and empty space.'
},
'cockroachdb.storage_capacity_usability':{
info:'<code>usable</code> is sum of empty space and CR data, <code>unusable</code> is space used by non-CR data.'
},
'cockroachdb.storage_usable_capacity':{
info:'Breakdown of <code>usable</code> space.'
},
'cockroachdb.storage_used_capacity_percentage':{
info:'<code>total</code> is % of <b>total</b> space used, <code>usable</code> is % of <b>usable</b> space used.'
},
'cockroachdb.sql_bandwidth':{
info:'The total amount of SQL client network traffic.'
},
'cockroachdb.sql_errors':{
info:'<code>statement</code> counts statements resulting in a planning or runtime error, '+
'<code>transaction</code> counts SQL transaction abort errors.'
},
'cockroachdb.sql_started_ddl_statements':{
info:'The amount of <b>started</b> DDL (Data Definition Language) statements. '+
'These statements change the database schema. '+
'It includes <code>CREATE</code>, <code>ALTER</code>, <code>DROP</code>, <code>RENAME</code>, <code>TRUNCATE</code> and <code>COMMENT</code> statements.'
},
'cockroachdb.sql_executed_ddl_statements':{
info:'The amount of <b>executed</b> DDL (Data Definition Language) statements. '+
'These statements change the database schema. '+
'It includes <code>CREATE</code>, <code>ALTER</code>, <code>DROP</code>, <code>RENAME</code>, <code>TRUNCATE</code> and <code>COMMENT</code> statements.'
},
'cockroachdb.sql_started_dml_statements':{
info:'The amount of <b>started</b> DML (Data Manipulation Language) statements.'
},
'cockroachdb.sql_executed_dml_statements':{
info:'The amount of <b>executed</b> DML (Data Manipulation Language) statements.'
},
'cockroachdb.sql_started_tcl_statements':{
info:'The amount of <b>started</b> TCL (Transaction Control Language) statements.'
},
'cockroachdb.sql_executed_tcl_statements':{
info:'The amount of <b>executed</b> TCL (Transaction Control Language) statements.'
},
'cockroachdb.live_bytes':{
info:'The amount of live data used by both applications and the CockroachDB system.'
' <li><code>write too old</code> restarts due to a concurrent writer committing first.</li>'+
' <li><code>write too old (multiple)</code> restarts due to multiple concurrent writers committing first.</li>'+
' <li><code>forwarded timestamp (iso=serializable)</code> restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE.</li>'+
' <li><code>possible replay</code> restarts due to possible replays of command batches at the storage layer.</li>'+
' <li><code>async consensus failure</code> restarts due to async consensus writes that failed to leave intents.</li>'+
' <li><code>read within uncertainty interval</code> restarts due to reading a new value within the uncertainty interval.</li>'+
' <li><code>aborted</code> restarts due to an abort by a concurrent transaction (usually due to deadlock).</li>'+
' <li><code>push failure</code> restarts due to a transaction push failure.</li>'+
' <li><code>unknown</code> restarts due to unknown reasons.</li>'+
' </ul>'
},
'cockroachdb.ranges':{
info:'CockroachDB stores all user data (tables, indexes, etc.) and almost all system data in a giant sorted map of key-value pairs. '+
'This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range.'
},
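/*
  The range model above can be sketched as a lookup over contiguous key intervals. This is
  an illustrative standalone example (not CockroachDB code; keys, node numbers, and the
  function name are made up): each range covers [startKey, endKey), so every key maps to
  exactly one range.

```javascript
// Three ranges partitioning the keyspace; endKey is exclusive.
const ranges = [
    { startKey: '',  endKey: 'g',      node: 1 },
    { startKey: 'g', endKey: 'p',      node: 2 },
    { startKey: 'p', endKey: '\uffff', node: 3 },
];

// Find the single range containing a key (lexicographic comparison).
function rangeFor(key) {
    return ranges.find(r => key >= r.startKey && key < r.endKey);
}
```
*/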
'cockroachdb.ranges_replication_problem':{
info:'Ranges with a suboptimal number of replicas:<br>'+
'<ul>'+
' <li><code>unavailable</code> ranges with fewer live replicas than needed for quorum.</li>'+
' <li><code>under replicated</code> ranges with fewer live replicas than the replication target.</li>'+
' <li><code>over replicated</code> ranges with more live replicas than the replication target.</li>'+
' </ul>'
},
'cockroachdb.replicas':{
info:'CockroachDB replicates each range (3 times by default) and stores each replica on a different node.'
},
'cockroachdb.replicas_leaders':{
info:'For each range, one of the replicas is the <code>leader</code> for write requests; <code>not leaseholders</code> is the number of Raft leaders whose range lease is held by another store.'
},
'cockroachdb.replicas_leaseholders':{
info:'For each range, one of the replicas holds the "range lease". This replica, referred to as the <code>leaseholder</code>, is the one that receives and coordinates all read and write requests for the range.'
},
'cockroachdb.queue_processing_failures':{
info:'Failed replicas breakdown by queue:<br>'+
'<ul>'+
' <li><code>gc</code> replicas which failed processing in the GC queue.</li>'+
' <li><code>replica gc</code> replicas which failed processing in the replica GC queue.</li>'+
' <li><code>replication</code> replicas which failed processing in the replicate queue.</li>'+
' <li><code>split</code> replicas which failed processing in the split queue.</li>'+
' <li><code>consistency</code> replicas which failed processing in the consistency checker queue.</li>'+
' <li><code>raft log</code> replicas which failed processing in the Raft log queue.</li>'+
' <li><code>raft snapshot</code> replicas which failed processing in the Raft repair queue.</li>'+
' <li><code>time series maintenance</code> replicas which failed processing in the time series maintenance queue.</li>'+
' </ul>'
},
'cockroachdb.rebalancing_queries':{
info:'Number of kv-level requests received per second by the store, averaged over a large time period as used in rebalancing decisions.'
},
'cockroachdb.rebalancing_writes':{
info:'Number of keys written (i.e., applied by Raft) per second to the store, averaged over a large time period as used in rebalancing decisions.'
},
'cockroachdb.slow_requests':{
info:'Requests that have been stuck for a long time.'
},
'cockroachdb.timeseries_samples':{
info:'The amount of metric samples written to disk.'
},
'cockroachdb.timeseries_write_errors':{
info:'The amount of errors encountered while attempting to write metrics to disk.'
info:'An IPC < 1.0 likely means memory bound, and an IPC > 1.0 likely means instruction bound. For more details about the metric take a look at this <a href="https://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html" target="_blank">blog post</a>.'
},
'filesystem.vfs_deleted_objects':{
info:'This chart may not show all events that remove files from the file system, because file systems can implement their own removal functions; it shows calls to the function <code>vfs_unlink</code>.'
},
'filesystem.vfs_io':{
info:'Successful or failed calls to the functions <code>vfs_read</code> and <code>vfs_write</code>. This chart may not show all file system events if the file system uses other functions to store data on disk.'
},
'filesystem.vfs_fsync':{
info:'Successful or failed calls to the function <code>vfs_fsync</code>.'
},
'filesystem.vfs_fsync_error':{
info:'Failed calls to the function <code>vfs_fsync</code>.'
},
'filesystem.vfs_open':{
info:'Successful or failed calls to the function <code>vfs_open</code>.'
},
'filesystem.vfs_open_error':{
info:'Failed calls to the function <code>vfs_open</code>.'
},
'filesystem.vfs_create':{
info:'Successful or failed calls to the function <code>vfs_create</code>.'
},
'filesystem.vfs_create_error':{
info:'Failed calls to the function <code>vfs_create</code>.'
},
'filesystem.ext4_read_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>ext4_file_read_iter</code>.'
},
'filesystem.ext4_write_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>ext4_file_write_iter</code>.'
},
'filesystem.ext4_open_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>ext4_file_open</code>.'
},
'filesystem.ext4_sync_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>ext4_sync_file</code>.'
},
'filesystem.xfs_read_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>xfs_file_read_iter</code>.'
},
'filesystem.xfs_write_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>xfs_file_write_iter</code>.'
},
'filesystem.xfs_open_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>xfs_file_open</code>.'
},
'filesystem.xfs_sync_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>xfs_file_sync</code>.'
},
'filesystem.nfs_read_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>nfs_file_read</code>.'
},
'filesystem.nfs_write_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>nfs_file_write</code>.'
},
'filesystem.nfs_open_latency':{
info:'Netdata is attaching <code>kprobes</code> for the functions <code>nfs_file_open</code> and <code>nfs4_file_open</code>.'
},
'filesystem.nfs_attribute_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>nfs_getattr</code>.'
},
'filesystem.zfs_read_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>zpl_iter_read</code>.'
},
'filesystem.zfs_write_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>zpl_iter_write</code>.'
},
'filesystem.zfs_open_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>zpl_open</code>.'
},
'filesystem.zfs_sync_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>zpl_fsync</code>.'
},
'filesystem.btrfs_read_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>btrfs_file_read_iter</code> (kernels newer than 5.9.16) or <code>generic_file_read_iter</code> (older kernels).'
},
'filesystem.btrfs_write_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>btrfs_file_write_iter</code>.'
},
'filesystem.btrfs_open_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>btrfs_file_open</code>.'
},
'filesystem.btrfs_sync_latency':{
info:'Netdata is attaching <code>kprobes</code> for the function <code>btrfs_sync_file</code>.'
},
'mount_points.call':{
info:'Monitor calls to syscalls <code>mount(2)</code> and <code>umount(2)</code> that are responsible for attaching or removing filesystems.'
},
'mount_points.error':{
info:'Monitor errors in calls to syscalls <code>mount(2)</code> and <code>umount(2)</code>.'
info:'Calls to internal Linux kernel functions. The open dimension is attached to the kernel internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> a kprobe is attached to <code>do_sys_openat2</code>), which is the common function called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/open.2.html" target="_blank">open(2)</a> and <a href="https://www.man7.org/linux/man-pages/man2/openat.2.html" target="_blank">openat(2)</a>. '+
' The close dimension is attached to the function <code>__close_fd</code> or <code>close_fd</code> according to your kernel version, which is called from system call'+
info:'Failed calls to the kernel internal function <code>do_sys_open</code> (for kernels newer than <code>5.5.19</code> a kprobe is attached to <code>do_sys_openat2</code>), which is the common function called from'+
' <a href="https://www.man7.org/linux/man-pages/man2/open.2.html" target="_blank">open(2)</a> and <a href="https://www.man7.org/linux/man-pages/man2/openat.2.html" target="_blank">openat(2)</a>. '+
' The close dimension is attached to the function <code>__close_fd</code> or <code>close_fd</code> according to your kernel version, which is called from system call'+
info:'The function <code>swap_readpage</code> is called when the kernel reads a page from swap memory. Netdata also gives a summary for these charts in <a href="#menu_system_submenu_swap">System overview</a>.'
info:'Number of times the syscall <code>shmget</code> is called. Netdata also gives a summary for these charts in <a href="#menu_system_submenu_ipc_shared_memory">System overview</a>.'
info:'Service units start and control daemons and the processes they consist of. '+
'For details, see <a href="https://www.freedesktop.org/software/systemd/man/systemd.service.html#" target="_blank"> systemd.service(5)</a>.'
},
'systemd.socket_unit_state':{
info:'Socket units encapsulate local IPC or network sockets in the system, useful for socket-based activation. '+
'For details about socket units, see <a href="https://www.freedesktop.org/software/systemd/man/systemd.socket.html#" target="_blank"> systemd.socket(5)</a>, '+
'for details on socket-based activation and other forms of activation, see <a href="https://www.freedesktop.org/software/systemd/man/daemon.html#" target="_blank"> daemon(7)</a>.'
},
'systemd.target_unit_state':{
info:'Target units are useful to group units, or provide well-known synchronization points during boot-up, '+