[[how-metricbeat-works]]
== How Metricbeat works

Metricbeat consists of modules and metricsets. A Metricbeat _module_ defines the basic logic for collecting data from a specific service, such as Redis, MySQL, and so on. The module specifies details about the service, including how to connect, how often to collect metrics, and which metrics to collect.

Each module has one or more metricsets. A _metricset_ is the part of the module that fetches and structures the data. Rather than collecting each metric as a separate event, metricsets retrieve a list of multiple related metrics in a single request to the remote system. For example, the Redis module provides an `info` metricset that collects information and statistics from Redis by running the http://redis.io/commands/INFO[`INFO`] command and parsing the returned result.

image:./images/module-overview.png[Modules and metricsets]

Likewise, the MySQL module provides a `status` metricset that collects data from MySQL by running a http://dev.mysql.com/doc/refman/5.7/en/show-status.html[`SHOW GLOBAL STATUS`] SQL query. Metricsets simplify collection by grouping related metrics together so that they can be fetched from the remote server in a single request.

Metricbeat retrieves metrics by periodically interrogating the host system based on the `period` value that you specify when you configure the module. Because multiple metricsets can send requests to the same service, Metricbeat reuses connections whenever possible. If Metricbeat cannot connect to the host system within the time specified by the `timeout` config setting, it returns an error. Metricbeat sends the events asynchronously, which means the event retrieval is not acknowledged. If the configured output is not available, events may be lost.

When Metricbeat encounters an error (for example, when it cannot connect to the host system), it sends an error event to the specified output. This means that Metricbeat always sends an event, even when there is a failure.
This allows you to monitor for errors and see debug messages to help you diagnose what went wrong.

The following topics provide more detail about the structure of Metricbeat events:

* <<metricbeat-event-structure>>
* <<error-event-structure>>

For more about the benefits of using Metricbeat, see <<key-features>>.

[[metricbeat-event-structure]]
=== Event structure

Every event sent by Metricbeat has the same basic structure. It contains the following fields:

*`@timestamp`*:: Time when the event was captured
*`beat.hostname`*:: Hostname of the server on which the Beat is running
*`beat.name`*:: Name given to the Beat
*`metricset.module`*:: Name of the module that the data is from
*`metricset.name`*:: Name of the metricset that the data is from
*`metricset.rtt`*:: Round trip time of the request in microseconds
*`type`*:: This is always "metricsets"

For example:

[source,json]
----
{
  "@timestamp": "2016-06-22T22:05:53.291Z",
  "beat": {
    "hostname": "host.example.com",
    "name": "host.example.com"
  },
  "metricset": {
    "module": "system",
    "name": "process",
    "rtt": 7419
  },
  .
  .
  .
  "type": "metricsets"
}
----

For more information about the exported fields, see <>.

[[error-event-structure]]
=== Error event structure

Metricbeat sends an error event when the service is not reachable. The error event has the same structure as the <<metricbeat-event-structure,base event>>, but also has an `error` field that contains an error string. This makes it possible to check for errors across all metric events.

The following example shows an error event sent when the Apache server is not reachable:

[source,json]
----
{
  "@timestamp": "2016-03-18T12:18:57.124Z",
  "apache-status": {},
  "beat": {
    "hostname": "host.example.com",
    "name": "host.example.com"
  },
  "error": {
    "message": "Get http://127.0.0.1/server-status?auto: dial tcp 127.0.0.1:80: getsockopt: connection refused"
  },
  "metricset": {
    "module": "apache",
    "name": "status",
    "rtt": 1082
  },
  .
  .
  .
  "type": "metricsets"
}
----

[[key-features]]
=== Key Metricbeat features

Metricbeat has some key features that are critical to how it works:

* <<metricbeat-error-events>>
* <<no-aggregations>>
* <<more-than-numbers>>
* <<multiple-events-in-one>>

[[metricbeat-error-events]]
==== Metricbeat error events

Metricbeat sends more than just metrics. When it cannot retrieve metrics, it sends error events. The error is not simply a flag, but a full error string that is created during fetching from the host systems. This enables you to monitor not only the metrics, but also any errors that occur during metrics monitoring.

Because you see the full error message, you can track down the error faster. Metricbeat is installed locally on the host machine, which means that you can differentiate errors that happen locally from other issues, such as network problems.

Each metricset is retrieved based on a predefined period, so when Metricbeat fails to retrieve metrics for more than one interval, you can infer that there is potentially something wrong with the host or host connectivity.

[[no-aggregations]]
==== No aggregations when data is fetched

Metricbeat doesn't do aggregations like gauge, sum, counters, and so on. Metricbeat sends the raw data retrieved from the host to the output for processing. When using Elasticsearch, this has the advantage that all raw data is available on the Elasticsearch host for drilling down into the details, and the data can be reprocessed at any time. It also reduces the complexity of Metricbeat.

[[more-than-numbers]]
==== Sends more than just numbers

Metricbeat sends more than just numbers. The metrics that Metricbeat sends can also contain strings to report status information. This is useful when you're using Elasticsearch to store the metrics data. Because each metricset has a predefined structure, Elasticsearch knows in advance which types will be stored, and it can optimize storage.

Basic meta information about each metric (such as the host) is also sent as part of each event.
[[multiple-events-in-one]]
==== Multiple metrics in one event

Rather than containing a single metric, each event created by Metricbeat contains a list of metrics. This means that you can retrieve all the metrics in a single request to the host system, resulting in less load on the host system. If you are sending the metrics to Elasticsearch as the output, Elasticsearch can directly store and query the metrics as a nested JSON document, which makes sending metrics data to Elasticsearch very efficient.

Because the full raw event data is available, Metricbeat or Elasticsearch can do any required transformations on the data later. For example, if you need to store data in the http://metrics20.org/[Metrics2.0] format, you could generate the format out of the existing event by splitting up the full event into multiple Metrics2.0 events.

Meta information about the type of each metric is stored in the mapping template. Meta information that is common to all metric events, such as host and timestamp, is part of the event structure itself and is only stored once for all events in the metricset.

Having all the related metrics in a single event also makes it easier to look at other values when one of the metrics for a service seems off.
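
The idea of splitting one full event into several single-metric records can be sketched as follows. This is an illustrative sketch only: it does not produce the actual Metrics2.0 schema, and the `demo` namespace, the metric names inside it, and the helper names `flatten` and `split_event` are all hypothetical. It assumes that the metricset data lives under one top-level key of the event.

```python
def flatten(data, prefix=""):
    """Yield (dotted_name, value) pairs for every leaf in a nested dict."""
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def split_event(event, namespace):
    """Split one nested event into one flat record per metric.

    The common meta information (timestamp, host, module, metricset)
    is copied into every record.
    """
    common = {
        "@timestamp": event["@timestamp"],
        "host": event["beat"]["hostname"],
        "module": event["metricset"]["module"],
        "metricset": event["metricset"]["name"],
    }
    return [
        {**common, "name": name, "value": value}
        for name, value in flatten(event.get(namespace, {}))
    ]


# Hypothetical event: the "demo" payload is invented for this example.
sample_event = {
    "@timestamp": "2016-06-22T22:05:53.291Z",
    "beat": {"hostname": "host.example.com", "name": "host.example.com"},
    "metricset": {"module": "demo", "name": "status", "rtt": 7419},
    "demo": {"connections": {"active": 3, "idle": 1}, "state": "ok"},
    "type": "metricsets",
}

records = split_event(sample_event, "demo")
for record in records:
    print(record["name"], record["value"])
```

Each resulting record repeats the host and timestamp, which shows the storage cost that the single-event layout avoids.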