[[defining-processors]]
=== Define processors

You can use processors to filter and enhance data before sending it to the
configured output. To define a processor, you specify the processor name, an
optional condition, and a set of parameters:

[source,yaml]
------
processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>

  - <processor_name>:
      when:
        <condition>
      <parameters>

...
------

Where:

* `<processor_name>` specifies a <<processors,processor>> that performs some kind
of action, such as selecting the fields that are exported or adding metadata to
the event.
* `<condition>` specifies an optional <<conditions,condition>>. If the
condition is present, then the action is executed only if the condition is
fulfilled. If no condition is passed, then the action is always executed.
* `<parameters>` is the list of parameters to pass to the processor.
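
For example, the following sketch puts these pieces together, using the
<<drop-fields,`drop_fields`>> processor with an <<condition-equals,`equals`>>
condition (the field names and values are illustrative):

[source,yaml]
------
processors:
  - drop_fields:
      when:
        equals:
          http.response.code: 200   # illustrative condition
      fields: ["http.request.headers"]   # illustrative field to drop
------
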
[[where-valid]]
==== Where are processors valid?

// TODO: ANY NEW BEATS THAT RE-USE THIS TOPIC NEED TO DEFINE processor-scope.

ifeval::["{beatname_lc}"=="filebeat"]
:processor-scope: input
endif::[]

ifeval::["{beatname_lc}"=="auditbeat" or "{beatname_lc}"=="metricbeat"]
:processor-scope: module
endif::[]

ifeval::["{beatname_lc}"=="packetbeat"]
:processor-scope: protocol
endif::[]

ifeval::["{beatname_lc}"=="heartbeat"]
:processor-scope: monitor
endif::[]

ifeval::["{beatname_lc}"=="winlogbeat"]
:processor-scope: event log
endif::[]

Processors are valid:

* At the top-level in the configuration. The processor is applied to all data
collected by {beatname_uc}.
* Under a specific {processor-scope}. The processor is applied to the data
collected for that {processor-scope}.
ifeval::["{beatname_lc}"=="filebeat"]
For example:
+
[source,yaml]
------
- type: <input_type>
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
...
------
+
Similarly, for {beatname_uc} modules, you can define processors under the
`input` section of the module definition.
endif::[]
ifeval::["{beatname_lc}"=="metricbeat"]
For example:
+
[source,yaml]
----
- module: <module_name>
  metricsets: ["<metricset_name>"]
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="auditbeat"]
For example:
+
[source,yaml]
----
auditbeat.modules:
- module: <module_name>
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="packetbeat"]
For example:
+
[source,yaml]
----
packetbeat.protocols:
- type: <protocol_type>
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----

* Under `packetbeat.flows`. The processor is applied to the data in
<<configuration-flows,network flows>>:
+
[source,yaml]
----
packetbeat.flows:
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="heartbeat"]
For example:
+
[source,yaml]
----
heartbeat.monitors:
- type: <monitor_type>
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="winlogbeat"]
For example:
+
[source,yaml]
----
winlogbeat.event_logs:
- name: <event_log_name>
  processors:
    - <processor_name>:
        when:
          <condition>
        <parameters>
----
endif::[]

[[processors]]
==== Processors

The supported processors are:

* <<add-cloud-metadata,`add_cloud_metadata`>>
* <<add-locale,`add_locale`>>
* <<decode-json-fields,`decode_json_fields`>>
* <<drop-event,`drop_event`>>
* <<drop-fields,`drop_fields`>>
* <<include-fields,`include_fields`>>
* <<rename-fields,`rename`>>
* <<add-kubernetes-metadata,`add_kubernetes_metadata`>>
* <<add-docker-metadata,`add_docker_metadata`>>
* <<add-host-metadata,`add_host_metadata`>>
* <<dissect,`dissect`>>
* <<processor-dns,`dns`>>
* <<add-process-metadata,`add_process_metadata`>>

[[conditions]]
==== Conditions

Each condition receives a field to compare. You can specify multiple fields
under the same condition; they are combined with `AND` (for example,
`field1 AND field2`).
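
For example, the following sketch lists two fields under one `equals`
condition; both must match for the condition to be fulfilled (the values are
illustrative):

[source,yaml]
------
equals:
  http.response.code: 200
  status: OK
------
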
For each field, you can specify a simple field name or a nested map, for
example `dns.question.name`.

See <<exported-fields>> for a list of all the fields that are exported by
{beatname_uc}.

The supported conditions are:

* <<condition-equals,`equals`>>
* <<condition-contains,`contains`>>
* <<condition-regexp,`regexp`>>
* <<condition-range,`range`>>
* <<condition-has_fields,`has_fields`>>
* <<condition-or,`or`>>
* <<condition-and,`and`>>
* <<condition-not,`not`>>

[float]
[[condition-equals]]
===== `equals`

With the `equals` condition, you can check whether a field has a certain
value. The condition accepts only an integer or a string value.

For example, the following condition checks if the response code of the HTTP
transaction is 200:

[source,yaml]
-------
equals:
  http.response.code: 200
-------

[float]
[[condition-contains]]
===== `contains`

The `contains` condition checks if a value is part of a field. The field can
be a string or an array of strings. The condition accepts only a string value.

For example, the following condition checks if an error is part of the
transaction status:

[source,yaml]
------
contains:
  status: "Specific error"
------

[float]
[[condition-regexp]]
===== `regexp`

The `regexp` condition checks the field against a regular expression. The
condition accepts only strings.

For example, the following condition checks if the process name starts with
`foo`:

[source,yaml]
-----
regexp:
  system.process.name: "foo.*"
-----

[float]
[[condition-range]]
===== `range`

The `range` condition checks if the field is in a certain range of values. The
condition supports `lt`, `lte`, `gt` and `gte`. The condition accepts only
integer or float values.

For example, the following condition checks for failed HTTP transactions by
testing whether the `http.response.code` field is greater than or equal to
400:

[source,yaml]
------
range:
  http.response.code:
    gte: 400
------

This can also be written as:

[source,yaml]
----
range:
  http.response.code.gte: 400
----

The following condition checks if the CPU usage in percentage has a value
between 0.5 and 0.8:

[source,yaml]
------
range:
  system.cpu.user.pct.gte: 0.5
  system.cpu.user.pct.lt: 0.8
------

[float]
[[condition-has_fields]]
===== `has_fields`

The `has_fields` condition checks if all the given fields exist in the
event. The condition accepts a list of string values denoting the field names.

For example, the following condition checks if the `http.response.code` field
is present in the event:

[source,yaml]
------
has_fields: ['http.response.code']
------

[float]
[[condition-or]]
===== `or`

The `or` operator receives a list of conditions:

[source,yaml]
-------
or:
  - <condition1>
  - <condition2>
  - <condition3>
  ...
-------

For example, to configure the condition
`http.response.code = 304 OR http.response.code = 404`:

[source,yaml]
------
or:
  - equals:
      http.response.code: 304
  - equals:
      http.response.code: 404
------

[float]
[[condition-and]]
===== `and`

The `and` operator receives a list of conditions:

[source,yaml]
-------
and:
  - <condition1>
  - <condition2>
  - <condition3>
  ...
-------

For example, to configure the condition
`http.response.code = 200 AND status = OK`:

[source,yaml]
------
and:
  - equals:
      http.response.code: 200
  - equals:
      status: OK
------

To configure a condition like `<condition1> OR <condition2> AND <condition3>`:

[source,yaml]
------
or:
  - <condition1>
  - and:
      - <condition2>
      - <condition3>
------

[float]
[[condition-not]]
===== `not`

The `not` operator receives the condition to negate:

[source,yaml]
-------
not:
  <condition>
-------

For example, to configure the condition `NOT status = OK`:

[source,yaml]
------
not:
  equals:
    status: OK
------

[[add-cloud-metadata]]
=== Add cloud metadata

The `add_cloud_metadata` processor enriches each event with instance metadata
from the machine's hosting provider. At startup it detects the hosting
provider and caches the instance metadata.

The following cloud providers are supported:

- Amazon Elastic Compute Cloud (EC2)
- Digital Ocean
- Google Compute Engine (GCE)
- https://www.qcloud.com/?lang=en[Tencent Cloud] (QCloud)
- Alibaba Cloud (ECS)
- Azure Virtual Machine
- Openstack Nova

The simple configuration below enables the processor:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_cloud_metadata: ~
-------------------------------------------------------------------------------

The `add_cloud_metadata` processor has one optional configuration setting named
`timeout` that specifies the maximum amount of time to wait for a successful
response when detecting the hosting provider. The default timeout value is
`3s`.

If a timeout occurs, no instance metadata is added to the events. This makes
it possible to enable this processor for all your deployments (in the cloud or
on-premises).
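
For example, a sketch that sets the timeout explicitly (shown here with its
default value of `3s`):

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_cloud_metadata:
      timeout: 3s   # maximum wait for provider detection (the default)
-------------------------------------------------------------------------------
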
The metadata that is added to events varies by hosting provider. Below are
examples for each of the supported providers.

_EC2_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "us-east-1c",
      "instance_id": "i-4e123456",
      "machine_type": "t2.medium",
      "provider": "ec2",
      "region": "us-east-1"
    }
  }
}
-------------------------------------------------------------------------------

_Digital Ocean_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "instance_id": "1234567",
      "provider": "digitalocean",
      "region": "nyc2"
    }
  }
}
-------------------------------------------------------------------------------

_GCE_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "projects/1234567890/zones/us-east1-b",
      "instance_id": "1234556778987654321",
      "machine_type": "projects/1234567890/machineTypes/f1-micro",
      "project_id": "my-dev",
      "provider": "gce"
    }
  }
}
-------------------------------------------------------------------------------

_Tencent Cloud_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "gz-azone2",
      "instance_id": "ins-qcloudv5",
      "provider": "qcloud",
      "region": "china-south-gz"
    }
  }
}
-------------------------------------------------------------------------------

_Alibaba Cloud_

This metadata is only available when VPC is selected as the network type of the
ECS instance.

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "cn-shenzhen",
      "instance_id": "i-wz9g2hqiikg0aliyun2b",
      "provider": "ecs",
      "region": "cn-shenzhen-a"
    }
  }
}
-------------------------------------------------------------------------------

_Azure Virtual Machine_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "provider": "az",
      "instance_id": "04ab04c3-63de-4709-a9f9-9ab8c0411d5e",
      "instance_name": "test-az-vm",
      "machine_type": "Standard_D3_v2",
      "region": "eastus2"
    }
  }
}
-------------------------------------------------------------------------------

_Openstack Nova_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "provider": "openstack",
      "instance_name": "test-998d932195.mycloud.tld",
      "availability_zone": "xxxx-az-c",
      "instance_id": "i-00011a84",
      "machine_type": "m2.large"
    }
  }
}
-------------------------------------------------------------------------------

[[add-locale]]
=== Add the local time zone

The `add_locale` processor enriches each event with the machine's time zone
offset from UTC or with the name of the time zone. It supports one
configuration option named `format` that controls whether an offset or time
zone abbreviation is added to the event. The default format is `offset`. The
processor adds a `beat.timezone` value to each event.

The configuration below enables the processor with the default settings:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_locale: ~
-------------------------------------------------------------------------------

This configuration enables the processor and configures it to add the time
zone abbreviation to events:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_locale:
      format: abbreviation
-------------------------------------------------------------------------------

NOTE: The `add_locale` processor differentiates between daylight savings time
(DST) and regular time. For example, `CEST` indicates DST and `CET` is regular
time.

[[decode-json-fields]]
=== Decode JSON fields

The `decode_json_fields` processor decodes fields containing JSON strings and
replaces the strings with valid JSON objects.

[source,yaml]
-----------------------------------------------------
processors:
  - decode_json_fields:
      fields: ["field1", "field2", ...]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
-----------------------------------------------------

The `decode_json_fields` processor has the following configuration settings:

`fields`:: The fields containing JSON strings to decode.
`process_array`:: (Optional) A boolean that specifies whether to process
arrays. The default is false.
`max_depth`:: (Optional) The maximum parsing depth. The default is 1.
`target`:: (Optional) The field under which the decoded JSON will be written.
By default the decoded JSON object replaces the string field from which it was
read. To merge the decoded JSON fields into the root of the event, specify
`target` with an empty string (`target: ""`). Note that the `null` value
(`target:`) is treated as if the field was not set at all.
`overwrite_keys`:: (Optional) A boolean that specifies whether keys that
already exist in the event are overwritten by keys from the decoded JSON
object. The default value is false.
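
For example, a minimal sketch that decodes a JSON payload carried in the
`message` field and merges the decoded keys into the root of the event (the
field name is illustrative):

[source,yaml]
-----------------------------------------------------
processors:
  - decode_json_fields:
      fields: ["message"]       # illustrative source field
      target: ""                # merge decoded keys into the event root
      overwrite_keys: true      # let decoded keys win on conflict
-----------------------------------------------------
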
[[drop-event]]
=== Drop events

The `drop_event` processor drops the entire event if the associated condition
is fulfilled. The condition is mandatory, because without one, all the events
are dropped.

[source,yaml]
------
processors:
  - drop_event:
      when:
        <condition>
------

See <<conditions>> for a list of supported conditions.
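
For example, a sketch that drops events whose transaction status contains an
illustrative error string (reusing the <<condition-contains,`contains`>>
example from above):

[source,yaml]
------
processors:
  - drop_event:
      when:
        contains:
          status: "Specific error"   # illustrative value
------
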
[[drop-fields]]
=== Drop fields from events

The `drop_fields` processor specifies which fields to drop if a certain
condition is fulfilled. The condition is optional. If it's missing, the
specified fields are always dropped. The `@timestamp` and `type` fields cannot
be dropped, even if they show up in the `drop_fields` list.

[source,yaml]
-----------------------------------------------------
processors:
  - drop_fields:
      when:
        <condition>
      fields: ["field1", "field2", ...]
-----------------------------------------------------

See <<conditions>> for a list of supported conditions.

NOTE: If you define an empty list of fields under `drop_fields`, then no
fields are dropped.

[[include-fields]]
=== Keep fields from events

The `include_fields` processor specifies which fields to export if a certain
condition is fulfilled. The condition is optional. If it's missing, the
specified fields are always exported. The `@timestamp` and `type` fields are
always exported, even if they are not defined in the `include_fields` list.

[source,yaml]
-------
processors:
  - include_fields:
      when:
        <condition>
      fields: ["field1", "field2", ...]
-------

See <<conditions>> for a list of supported conditions.

You can specify multiple `include_fields` processors under the `processors`
section.

NOTE: If you define an empty list of fields under `include_fields`, then only
the required fields, `@timestamp` and `type`, are exported.
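
For example, a sketch that unconditionally keeps only two illustrative fields
(plus the always-exported `@timestamp` and `type`):

[source,yaml]
-------
processors:
  - include_fields:
      fields: ["http.response.code", "http.request.method"]   # illustrative
-------
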
[[rename-fields]]
=== Rename fields from events

The `rename` processor specifies a list of fields to rename. Under the
`fields` key, each entry contains a `from: old-key` and a `to: new-key` pair,
where `from` is the original field name and `to` is the new field name.

Renaming fields can be useful in cases where field names cause conflicts. For
example, if an event has two fields, `c` and `c.b`, that are both assigned
scalar values (e.g. `{"c": 1, "c.b": 2}`), this will result in an
Elasticsearch error at ingest time because the value of `c` cannot
simultaneously be a scalar and an object. To prevent this, the `rename`
processor can be used to rename `c` to `c.value`.

The `rename` processor cannot be used to overwrite fields. To overwrite a
field, either rename the target field first or use the `drop_fields` processor
to drop the field and then rename the field, as sketched at the end of this
section.

[source,yaml]
-------
processors:
  - rename:
      fields:
        - from: "a.g"
          to: "e.d"
      ignore_missing: false
      fail_on_error: true
-------

The `rename` processor has the following configuration settings:

`ignore_missing`:: (Optional) If set to true, no error is logged when a key
that should be renamed is missing. Default is `false`.

`fail_on_error`:: (Optional) If set to true, renaming of fields is stopped
when an error occurs and the original event is returned. If set to false,
renaming continues even if an error happens during renaming. Default is
`true`.

See <<conditions>> for a list of supported conditions.

You can specify multiple `rename` processors under the `processors` section.
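
For example, a sketch of the overwrite workaround described above, reusing the
illustrative `a.g` and `e.d` fields: drop the occupied target first, then
rename onto it.

[source,yaml]
-------
processors:
  - drop_fields:
      fields: ["e.d"]      # free the target field first
  - rename:
      fields:
        - from: "a.g"
          to: "e.d"        # now the rename cannot collide
-------
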
[[add-kubernetes-metadata]]
=== Add Kubernetes metadata

The `add_kubernetes_metadata` processor annotates each event with relevant
metadata based on which Kubernetes pod the event originated from. Each event
is annotated with:

* Pod Name
* Namespace
* Labels

The `add_kubernetes_metadata` processor has two basic building blocks:

* Indexers
* Matchers

Indexers take in a pod's metadata and build indices based on the pod metadata.
For example, the `ip_port` indexer can take a Kubernetes pod and index the pod
metadata based on all `pod_ip:container_port` combinations.

Matchers are used to construct lookup keys for querying indices. For example,
when the `fields` matcher takes `["metricset.host"]` as a lookup field, it
constructs a lookup key with the value of the field `metricset.host`.

Each Beat can define its own default indexers and matchers, which are enabled
by default. For example, Filebeat enables the `container` indexer, which
indexes pod metadata based on all container IDs, and a `logs_path` matcher,
which takes the `source` field, extracts the container ID, and uses it to
retrieve metadata.

The configuration below enables the processor when {beatname_lc} is run as a
pod in Kubernetes:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_kubernetes_metadata:
      in_cluster: true
-------------------------------------------------------------------------------

The configuration below enables the processor on a Beat running as a process
on the Kubernetes node:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_kubernetes_metadata:
      in_cluster: false
      host: <hostname>
      kube_config: ${HOME}/.kube/config
-------------------------------------------------------------------------------

The configuration below disables the default indexers and matchers and
enables only the ones the user is interested in:

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_kubernetes_metadata:
      in_cluster: false
      host: <hostname>
      kube_config: ~/.kube/config
      default_indexers.enabled: false
      default_matchers.enabled: false
      indexers:
        - ip_port:
      matchers:
        - fields:
            lookup_fields: ["metricset.host"]
-------------------------------------------------------------------------------

The `add_kubernetes_metadata` processor has the following configuration
settings:

`in_cluster`:: (Optional) Use in-cluster settings for the Kubernetes client.
`true` by default.
`host`:: (Optional) Identify the node where {beatname_lc} is running in case
it cannot be accurately detected, as when running {beatname_lc} in host
network mode.
`kube_config`:: (Optional) Use the given config file as configuration for the
Kubernetes client.
`default_indexers.enabled`:: (Optional) Enable/disable default pod indexers,
in case you want to specify your own.
`default_matchers.enabled`:: (Optional) Enable/disable default pod matchers,
in case you want to specify your own.

[[add-docker-metadata]]
=== Add Docker metadata

The `add_docker_metadata` processor annotates each event with relevant
metadata from Docker containers:

* Container ID
* Name
* Image
* Labels

[NOTE]
=====
When running {beatname_uc} in a container, you need to provide access to
Docker's unix socket in order for the `add_docker_metadata` processor to work.
You can do this by mounting the socket inside the container. For example:

`docker run -v /var/run/docker.sock:/var/run/docker.sock ...`

To avoid privilege issues, you may also need to add `--user=root` to the
`docker run` flags. Because the user must be part of the docker group in order
to access `/var/run/docker.sock`, root access is required if {beatname_uc} is
running as non-root inside the container.
=====

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_docker_metadata:
      host: "unix:///var/run/docker.sock"
      #match_fields: ["system.process.cgroup.id"]
      #match_pids: ["process.pid", "process.ppid"]
      #match_source: true
      #match_source_index: 4
      #match_short_id: true
      #cleanup_timeout: 60
      # To connect to Docker over TLS you must specify a client and CA
      # certificate.
      #ssl:
      #  certificate_authority: "/etc/pki/root/ca.pem"
      #  certificate: "/etc/pki/client/cert.pem"
      #  key: "/etc/pki/client/cert.key"
-------------------------------------------------------------------------------

It has the following settings:

`host`:: (Optional) Docker socket (UNIX or TCP socket). It uses
`unix:///var/run/docker.sock` by default.

`ssl`:: (Optional) SSL configuration to use when connecting to the Docker
socket.

`match_fields`:: (Optional) A list of fields to match a container ID. At
least one of them should hold a container ID to get the event enriched.

`match_pids`:: (Optional) A list of fields that contain process IDs. If the
process is running in Docker, the event will be enriched. The default value
is `["process.pid", "process.ppid"]`.

`match_source`:: (Optional) Match the container ID from a log path present in
the `source` field. Enabled by default.

`match_short_id`:: (Optional) Match the container short ID from a log path
present in the `source` field. Disabled by default.
This allows matching directory names that contain the first 12 characters of
the container ID. For example, `/var/log/containers/b7e3460e2b21/*.log`.

`match_source_index`:: (Optional) Index in the source path split by `/` to
look for the container ID. It defaults to 4 to match
`/var/lib/docker/containers/<container_id>/*.log`.

`cleanup_timeout`:: (Optional) Time of inactivity after which metadata for a
container is cleaned up and forgotten. 60s by default.

[[add-host-metadata]]
=== Add Host metadata

beta[]

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_host_metadata:
      netinfo.enabled: false
-------------------------------------------------------------------------------

It has the following settings:

`netinfo.enabled`:: (Optional) Default false. Include IP addresses and MAC
addresses as fields `host.ip` and `host.mac`.

The `add_host_metadata` processor annotates each event with relevant metadata
from the host machine. The fields added to the event look like the following:

[source,json]
-------------------------------------------------------------------------------
{
  "host": {
    "architecture": "x86_64",
    "name": "example-host",
    "id": "",
    "os": {
      "family": "darwin",
      "build": "16G1212",
      "platform": "darwin",
      "version": "10.12.6"
    },
    "ip": ["192.168.0.1", "10.0.0.1"],
    "mac": ["00:25:96:12:34:56", "72:00:06:ff:79:f1"]
  }
}
-------------------------------------------------------------------------------

NOTE: The host information is refreshed every 5 minutes.

[[dissect]]
=== Dissect strings

The `dissect` processor tokenizes incoming strings using defined patterns.

[source,yaml]
-------
processors:
  - dissect:
      tokenizer: "%{key1} %{key2}"
      field: "message"
      target_prefix: "dissect"
-------

The `dissect` processor has the following configuration settings:

`tokenizer`:: The pattern used to tokenize the field, where each `%{key}`
names a value to extract.

`field`:: (Optional) The event field to tokenize. Default is `message`.

`target_prefix`:: (Optional) The name of the field where the values will be
extracted. When an empty string is defined, the processor creates the keys at
the root of the event. Default is `dissect`. When the target key already
exists in the event, the processor won't replace it and will log an error;
you need to either drop or rename the key before using dissect.

For tokenization to be successful, all keys must be found and extracted; if
any of them cannot be found, an error is logged and the original event is not
modified.

NOTE: A key can contain any characters except the reserved suffix or prefix
modifiers: `/`, `&`, `+` and `?`.

See <<conditions>> for a list of supported conditions.
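
For instance, a sketch that splits an illustrative `message` such as
`1.2.3.4 GET /index.html` into `dissect.ip`, `dissect.method`, and
`dissect.path`:

[source,yaml]
-------
processors:
  - dissect:
      tokenizer: "%{ip} %{method} %{path}"   # keys are illustrative
      field: "message"
      target_prefix: "dissect"
-------
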
[[processor-dns]]
=== DNS Reverse Lookup

The `dns` processor performs reverse DNS lookups of IP addresses. It caches
the responses that it receives in accordance with the time-to-live (TTL)
value contained in the response. It also caches failures that occur during
lookups. Each instance of this processor maintains its own independent cache.

The processor uses its own DNS resolver to send requests to nameservers and
does not use the operating system's resolver. It does not read any values
contained in `/etc/hosts`.

This processor can significantly slow down your pipeline's throughput if you
have a high latency network or slow upstream nameserver. The cache will help
with performance, but if the addresses being resolved have a high cardinality,
the cache benefits will be diminished due to the high miss ratio.

By way of example, if each DNS lookup takes 2 milliseconds, the maximum
throughput you can achieve is 500 events per second (1000 milliseconds / 2
milliseconds). If you have a high cache hit ratio, your throughput can be
higher.

This is a minimal configuration example that resolves the IP addresses
contained in two fields:

[source,yaml]
----
processors:
  - dns:
      type: reverse
      fields:
        source.ip: source.hostname
        destination.ip: destination.hostname
----

Next is a configuration example showing all options:

[source,yaml]
----
processors:
  - dns:
      type: reverse
      action: append
      fields:
        server.ip: server.hostname
        client.ip: client.hostname
      success_cache:
        capacity.initial: 1000
        capacity.max: 10000
      failure_cache:
        capacity.initial: 1000
        capacity.max: 10000
        ttl: 1m
      nameservers: ['192.0.2.1', '203.0.113.1']
      timeout: 500ms
      tag_on_failure: [_dns_reverse_lookup_failed]
----

The `dns` processor has the following configuration settings:

`type`:: The type of DNS lookup to perform. The only supported type is
`reverse`, which queries for a PTR record.

`action`:: This defines the behavior of the processor when the target field
already exists in the event. The options are `append` (default) and `replace`.

`fields`:: This is a mapping of source field names to target field names. The
value of the source field will be used in the DNS query, and the result will
be written to the target field.

`success_cache.capacity.initial`:: The initial number of items that the
success cache will be allocated to hold. When initialized, the processor will
allocate the memory for this number of items. Default value is `1000`.

`success_cache.capacity.max`:: The maximum number of items that the success
cache can hold. When the maximum capacity is reached, a random item is
evicted. Default value is `10000`.

`failure_cache.capacity.initial`:: The initial number of items that the
failure cache will be allocated to hold. When initialized, the processor will
allocate the memory for this number of items. Default value is `1000`.

`failure_cache.capacity.max`:: The maximum number of items that the failure
cache can hold. When the maximum capacity is reached, a random item is
evicted. Default value is `10000`.

`failure_cache.ttl`:: The duration for which failures are cached. Valid time
units are "ns", "us" (or "µs"), "ms", "s", "m", "h". Default value is `1m`.

`nameservers`:: A list of nameservers to query. If there are multiple servers,
the resolver queries them in the order listed. If none are specified, it reads
the nameservers listed in `/etc/resolv.conf` once at initialization. On
Windows you must always supply at least one nameserver.

`timeout`:: The duration after which a DNS query times out. This timeout
applies to each DNS request, so if you have 2 nameservers the total timeout
will be 2 times this value. Valid time units are "ns", "us" (or "µs"), "ms",
"s", "m", "h". Default value is `500ms`.

`tag_on_failure`:: A list of tags to add to the event when any lookup fails.
The tags are only added once even if multiple lookups fail. By default no
tags are added upon failure.

[[add-process-metadata]]
=== Add process metadata

The `add_process_metadata` processor enriches events with information from
running processes, identified by their process ID (PID).

[source,yaml]
-------------------------------------------------------------------------------
processors:
  - add_process_metadata:
      match_pids: [system.process.ppid]
      target: system.process.parent
-------------------------------------------------------------------------------

The fields added to the event look as follows:

[source,json]
-------------------------------------------------------------------------------
"process": {
  "name": "systemd",
  "title": "/usr/lib/systemd/systemd --switched-root --system --deserialize 22",
  "exe": "/usr/lib/systemd/systemd",
  "args": ["/usr/lib/systemd/systemd", "--switched-root", "--system", "--deserialize", "22"],
  "pid": 1,
  "ppid": 0,
  "start_time": "2018-08-22T08:44:50.684Z"
}
-------------------------------------------------------------------------------

Optionally, the process environment can be included, too:

[source,json]
-------------------------------------------------------------------------------
...
"env": {
  "HOME": "/",
  "TERM": "linux",
  "BOOT_IMAGE": "/boot/vmlinuz-4.11.8-300.fc26.x86_64",
  "LANG": "en_US.UTF-8"
}
...
-------------------------------------------------------------------------------

It has the following settings:

`match_pids`:: List of fields to look up for a PID. The processor searches the
list sequentially until the field is found in the current event, and the PID
lookup is applied to the value of this field.

`target`:: (Optional) Destination prefix where the `process` object will be
created. The default is the event's root.

`include_fields`:: (Optional) List of fields to add. By default, the processor
adds all the available fields except `process.env`.

`ignore_missing`:: (Optional) When set to `false`, events that don't contain
any of the fields in `match_pids` will be discarded and an error will be
generated. By default, this condition is ignored.

`overwrite_keys`:: (Optional) By default, if a target field already exists, it
will not be overwritten and an error will be logged. If `overwrite_keys` is
set to `true`, this condition will be ignored.

`restricted_fields`:: (Optional) By default, the `process.env` field is not
output, to avoid leaking sensitive data. If `restricted_fields` is `true`, the
field will be present in the output.