[[defining-processors]]
=== Define processors

You can use processors to filter and enhance data before sending it to the
configured output. To define a processor, you specify the processor name, an
optional condition, and a set of parameters:

[source,yaml]
------
processors:
- <processor_name>:
    when:
      <condition>
    <parameters>

- <processor_name>:
    when:
      <condition>
    <parameters>

...
------

Where:

* `<processor_name>` specifies a <<processors,processor>> that performs some kind
of action, such as selecting the fields that are exported or adding metadata to
the event.
* `<condition>` specifies an optional <<conditions,condition>>. If the
condition is present, then the action is executed only if the condition is
fulfilled. If no condition is passed, then the action is always executed.
* `<parameters>` is the list of parameters to pass to the processor.
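
For illustration, the three parts can be combined as in the following sketch,
which uses the `drop_fields` processor and the `equals` condition described
later in this topic. The field name `http.request.headers` is a hypothetical
placeholder and might not exist in your events.

[source,yaml]
------
processors:
- drop_fields:
    # Condition: apply the processor only when the response code is 200.
    when:
      equals:
        http.response.code: 200
    # Parameters: fields to remove (hypothetical field name).
    fields: ["http.request.headers"]
------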


[[where-valid]]
==== Where are processors valid?

// TODO: ANY NEW BEATS THAT RE-USE THIS TOPIC NEED TO DEFINE processor-scope.

ifeval::["{beatname_lc}"=="filebeat"]
:processor-scope: input
endif::[]

ifeval::["{beatname_lc}"=="auditbeat" or "{beatname_lc}"=="metricbeat"]
:processor-scope: module
endif::[]

ifeval::["{beatname_lc}"=="packetbeat"]
:processor-scope: protocol
endif::[]

ifeval::["{beatname_lc}"=="heartbeat"]
:processor-scope: monitor
endif::[]

ifeval::["{beatname_lc}"=="winlogbeat"]
:processor-scope: event log shipper
endif::[]

Processors are valid:

* At the top-level in the configuration. The processor is applied to all data
collected by {beatname_uc}.
* Under a specific {processor-scope}. The processor is applied to the data
collected for that {processor-scope}.
ifeval::["{beatname_lc}"=="filebeat"]
For example:
+
[source,yaml]
------
- type: <input_type>
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>
...
------
+
Similarly, for {beatname_uc} modules, you can define processors under the
`input` section of the module definition. 
endif::[]
ifeval::["{beatname_lc}"=="metricbeat"]
For example:
+
[source,yaml]
----
- module: <module_name>
  metricsets: ["<metricset_name>"]
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters> 
----
endif::[]
ifeval::["{beatname_lc}"=="auditbeat"]
For example:
+
[source,yaml]
----
auditbeat.modules:
- module: <module_name>
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters> 
----
endif::[]
ifeval::["{beatname_lc}"=="packetbeat"]
For example:
+
[source,yaml]
----
packetbeat.protocols:
- type: <protocol_type>  
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>
----

* Under `packetbeat.flows`. The processor is applied to the data in
<<configuration-flows,network flows>>:
+
[source,yaml]
----
packetbeat.flows:
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="heartbeat"]
For example:
+
[source,yaml]
----
heartbeat.monitors:
- type: <monitor_type>
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>
----
endif::[]
ifeval::["{beatname_lc}"=="winlogbeat"]
For example:
+
[source,yaml]
----
winlogbeat.event_logs:
- name: <event_log_name>
  processors:
  - <processor_name>:
      when:
        <condition>
      <parameters>
----
endif::[]


[[processors]]
==== Processors

The supported processors are:

 * <<add-cloud-metadata,`add_cloud_metadata`>>
 * <<add-locale,`add_locale`>>
 * <<decode-json-fields,`decode_json_fields`>>
 * <<drop-event,`drop_event`>>
 * <<drop-fields,`drop_fields`>>
 * <<include-fields,`include_fields`>>
 * <<rename-fields,`rename`>>
 * <<add-kubernetes-metadata,`add_kubernetes_metadata`>>
 * <<add-docker-metadata,`add_docker_metadata`>>
 * <<add-host-metadata,`add_host_metadata`>>
 * <<dissect, `dissect`>>
 * <<processor-dns, `dns`>>
 * <<add-process-metadata,`add_process_metadata`>>

[[conditions]]
==== Conditions

Each condition receives a field to compare. You can specify multiple fields
under the same condition by using `AND` between the fields (for example,
`field1 AND field2`).

For each field, you can specify a simple field name or a nested map, for example
`dns.question.name`.
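
As a minimal sketch, assuming that fields listed under the same condition are
combined with `AND` as described above, the following condition is fulfilled
only when both fields match. The field names are borrowed from other examples
in this topic.

[source,yaml]
------
equals:
  # Simple field name.
  status: OK
  # Nested field, written in dotted notation.
  http.response.code: 200
------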

See <<exported-fields>> for a list of all the fields that are exported by
{beatname_uc}.

The supported conditions are:

* <<condition-equals,`equals`>>
* <<condition-contains,`contains`>>
* <<condition-regexp,`regexp`>>
* <<condition-range, `range`>>
* <<condition-has_fields, `has_fields`>>
* <<condition-or, `or`>>
* <<condition-and, `and`>>
* <<condition-not, `not`>>


[float]
[[condition-equals]]
===== `equals`

With the `equals` condition, you can check whether a field has a certain value.
The condition accepts only an integer or a string value.

For example, the following condition checks if the response code of the HTTP
transaction is 200:

[source,yaml]
-------
equals:
  http.response.code: 200
-------

[float]
[[condition-contains]]
===== `contains`

The `contains` condition checks if a value is part of a field. The field can be
a string or an array of strings. The condition accepts only a string value.

For example, the following condition checks if an error is part of the
transaction status:

[source,yaml]
------
contains:
  status: "Specific error"
------

[float]
[[condition-regexp]]
===== `regexp`

The `regexp` condition checks the field against a regular expression. The
condition accepts only strings.

For example, the following condition checks if the process name starts with
`foo`:

[source,yaml]
-----
regexp:
  system.process.name: "^foo.*"
-----

[float]
[[condition-range]]
===== `range`

The `range` condition checks if the field is in a certain range of values. The
condition supports `lt`, `lte`, `gt` and `gte`. The condition accepts only
integer or float values.

For example, the following condition checks for failed HTTP transactions by
testing whether the `http.response.code` field is greater than or equal to 400:


[source,yaml]
------
range:
    http.response.code:
        gte: 400
------

This can also be written as:

[source,yaml]
----
range:
    http.response.code.gte: 400
----

The following condition checks if the CPU usage in percentage has a value
between 0.5 and 0.8.

[source,yaml]
------
range:
    system.cpu.user.pct.gte: 0.5
    system.cpu.user.pct.lt: 0.8
------
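
A condition such as `range` is normally placed under the `when` key of a
processor. The following sketch, which assumes the `drop_event` processor
described later in this topic, drops events whose response code is at least 400
and lower than 500. The bounds are illustrative only.

[source,yaml]
------
processors:
- drop_event:
    when:
      range:
        http.response.code:
          gte: 400   # inclusive lower bound
          lt: 500    # exclusive upper bound
------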


[float]
[[condition-has_fields]]
===== `has_fields`

The `has_fields` condition checks if all the given fields exist in the
event. The condition accepts a list of string values denoting the field names.

For example, the following condition checks if the `http.response.code` field
is present in the event.


[source,yaml]
------
has_fields: ['http.response.code']
------
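
One possible use of `has_fields`, sketched under the assumption that it is
combined with the `not` operator and the `drop_event` processor described in
this topic, is to drop events that lack a field:

[source,yaml]
------
processors:
- drop_event:
    when:
      # Fulfilled when the field is absent from the event.
      not:
        has_fields: ['http.response.code']
------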


[float]
[[condition-or]]
===== `or`

The `or` operator receives a list of conditions.

[source,yaml]
-------
or:
  - <condition1>
  - <condition2>
  - <condition3>
  ...

-------

For example, to configure the condition
`http.response.code = 304 OR http.response.code = 404`:

[source,yaml]
------
or:
  - equals:
      http.response.code: 304
  - equals:
      http.response.code: 404
------

[float]
[[condition-and]]
===== `and`

The `and` operator receives a list of conditions.

[source,yaml]
-------
and:
  - <condition1>
  - <condition2>
  - <condition3>
  ...

-------

For example, to configure the condition
`http.response.code = 200 AND status = OK`:

[source,yaml]
------
and:
  - equals:
      http.response.code: 200
  - equals:
      status: OK
------

To configure a condition like `<condition1> OR <condition2> AND <condition3>`:

[source,yaml]
------
or:
 - <condition1>
 - and:
    - <condition2>
    - <condition3>

------

[float]
[[condition-not]]
===== `not`

The `not` operator receives the condition to negate.

[source,yaml]
-------
not:
  <condition>

-------

For example, to configure the condition `NOT status = OK`:

[source,yaml]
------
not:
  equals:
    status: OK
------

[[add-cloud-metadata]]
=== Add cloud metadata

The `add_cloud_metadata` processor enriches each event with instance metadata
from the machine's hosting provider. At startup it will detect the hosting
provider and cache the instance metadata.

The following cloud providers are supported:

- Amazon Elastic Compute Cloud (EC2)
- DigitalOcean
- Google Compute Engine (GCE)
- https://www.qcloud.com/?lang=en[Tencent Cloud] (QCloud)
- Alibaba Cloud (ECS)
- Azure Virtual Machine
- OpenStack Nova

The simple configuration below enables the processor.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_cloud_metadata: ~
-------------------------------------------------------------------------------

The `add_cloud_metadata` processor has one optional configuration setting named
`timeout` that specifies the maximum amount of time to wait for a successful
response when detecting the hosting provider. The default timeout value is
`3s`.
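
For example, the following sketch raises the timeout. The value `5s` is an
arbitrary illustration, not a recommendation.

[source,yaml]
------
processors:
- add_cloud_metadata:
    # Wait up to 5 seconds when detecting the hosting provider.
    timeout: 5s
------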

If a timeout occurs, no instance metadata is added to the events. This
makes it possible to enable this processor for all your deployments (in the
cloud or on-premises).

The metadata that is added to events varies by hosting provider. Below are
examples for each of the supported providers.

_EC2_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "us-east-1c",
      "instance_id": "i-4e123456",
      "machine_type": "t2.medium",
      "provider": "ec2",
      "region": "us-east-1"
    }
  }
}
-------------------------------------------------------------------------------

_DigitalOcean_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "instance_id": "1234567",
      "provider": "digitalocean",
      "region": "nyc2"
    }
  }
}
-------------------------------------------------------------------------------

_GCE_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "projects/1234567890/zones/us-east1-b",
      "instance_id": "1234556778987654321",
      "machine_type": "projects/1234567890/machineTypes/f1-micro",
      "project_id": "my-dev",
      "provider": "gce"
    }
  }
}
-------------------------------------------------------------------------------

_Tencent Cloud_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "gz-azone2",
      "instance_id": "ins-qcloudv5",
      "provider": "qcloud",
      "region": "china-south-gz"
    }
  }
}
-------------------------------------------------------------------------------

_Alibaba Cloud_

This metadata is only available when VPC is selected as the network type of the
ECS instance.

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "availability_zone": "cn-shenzhen",
      "instance_id": "i-wz9g2hqiikg0aliyun2b",
      "provider": "ecs",
      "region": "cn-shenzhen-a"
    }
  }
}
-------------------------------------------------------------------------------

_Azure Virtual Machine_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "provider": "az",
      "instance_id": "04ab04c3-63de-4709-a9f9-9ab8c0411d5e",
      "instance_name": "test-az-vm",
      "machine_type": "Standard_D3_v2",
      "region": "eastus2"
    }
  }
}
-------------------------------------------------------------------------------

_OpenStack Nova_

[source,json]
-------------------------------------------------------------------------------
{
  "meta": {
    "cloud": {
      "provider": "openstack",
      "instance_name": "test-998d932195.mycloud.tld",
      "availability_zone": "xxxx-az-c",
      "instance_id": "i-00011a84",
      "machine_type": "m2.large"
    }
  }
}
-------------------------------------------------------------------------------


[[add-locale]]
=== Add the local time zone

The `add_locale` processor enriches each event with the machine's time zone
offset from UTC or with the name of the time zone. It supports one configuration
option named `format` that controls whether an offset or time zone abbreviation
is added to the event. The default format is `offset`. The processor adds a
`beat.timezone` value to each event.

The configuration below enables the processor with the default settings.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_locale: ~
-------------------------------------------------------------------------------

This configuration enables the processor and configures it to add the time zone
abbreviation to events.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_locale:
    format: abbreviation
-------------------------------------------------------------------------------

NOTE: The `add_locale` processor differentiates between daylight saving
time (DST) and regular time. For example, `CEST` indicates DST and `CET` is
regular time.


[[decode-json-fields]]
=== Decode JSON fields

The `decode_json_fields` processor decodes fields containing JSON strings and
replaces the strings with valid JSON objects.

[source,yaml]
-----------------------------------------------------
processors:
 - decode_json_fields:
     fields: ["field1", "field2", ...]
     process_array: false
     max_depth: 1
     target: ""
     overwrite_keys: false
-----------------------------------------------------

The `decode_json_fields` processor has the following configuration settings:

`fields`:: The fields containing JSON strings to decode.
`process_array`:: (Optional) A boolean that specifies whether to process
arrays. The default is false.
`max_depth`:: (Optional) The maximum parsing depth. The default is 1.
`target`:: (Optional) The field under which the decoded JSON will be written. By
default the decoded JSON object replaces the string field from which it was
read. To merge the decoded JSON fields into the root of the event, specify
`target` with an empty string (`target: ""`). Note that the `null` value (`target:`)
is treated as if the field was not set at all.
`overwrite_keys`:: (Optional) A boolean that specifies whether keys that already
exist in the event are overwritten by keys from the decoded JSON object. The
default value is false.
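
As an illustration of these settings, the sketch below decodes a JSON string
stored in a field and merges the result into the root of the event. It assumes
the JSON string is in a field named `message`; substitute the field used in
your events.

[source,yaml]
------
processors:
 - decode_json_fields:
     # Field that holds the JSON string (assumed name).
     fields: ["message"]
     # Empty string: merge the decoded keys into the root of the event.
     target: ""
     # Allow decoded keys to replace keys that already exist in the event.
     overwrite_keys: true
------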

[[drop-event]]
=== Drop events

The `drop_event` processor drops the entire event if the associated condition
is fulfilled. The condition is mandatory, because without one, all the events
are dropped.

[source,yaml]
------
processors:
 - drop_event:
     when:
        <condition>
------
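
As a concrete sketch, the following configuration drops every event whose
process name starts with `foo`. The field name and pattern are taken from the
`regexp` example in <<conditions>>, with an explicit `^` anchor added here so
that the pattern matches only at the start of the value.

[source,yaml]
------
processors:
 - drop_event:
     when:
        regexp:
           system.process.name: "^foo.*"
------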

See <<conditions>> for a list of supported conditions.

[[drop-fields]]
=== Drop fields from events

The `drop_fields` processor specifies which fields to drop if a certain
condition is fulfilled. The condition is optional. If it's missing, the
specified fields are always dropped. The `@timestamp` and `type` fields cannot
be dropped, even if they show up in the `drop_fields` list.

[source,yaml]
-----------------------------------------------------
processors:
 - drop_fields:
     when:
        <condition>
     fields: ["field1", "field2", ...]
-----------------------------------------------------

See <<conditions>> for a list of supported conditions.

NOTE: If you define an empty list of fields under `drop_fields`, then no fields
are dropped.

[[include-fields]]
=== Keep fields from events

The `include_fields` processor specifies which fields to export if a certain
condition is fulfilled. The condition is optional. If it's missing, the
specified fields are always exported. The `@timestamp` and `type` fields are
always exported, even if they are not defined in the `include_fields` list.

[source,yaml]
-------
processors:
 - include_fields:
     when:
        <condition>
     fields: ["field1", "field2", ...]
-------
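
For instance, the following sketch keeps only two fields, in addition to the
always-exported `@timestamp` and `type`, for events with a response code of 400
or higher. The field names are borrowed from the condition examples in this
topic and might differ in your events.

[source,yaml]
------
processors:
 - include_fields:
     when:
        range:
           http.response.code.gte: 400
     fields: ["http.response.code", "status"]
------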

See <<conditions>> for a list of supported conditions.

You can specify multiple `include_fields` processors under the `processors`
section.

NOTE: If you define an empty list of fields under `include_fields`, then only
the required fields, `@timestamp` and `type`, are exported.

[[rename-fields]]
=== Rename fields from events

The `rename` processor specifies a list of fields to rename. Under the `fields`
key each entry contains a `from: old-key` and a `to: new-key` pair. `from` is
the origin and `to` the target name of the field.

Renaming fields can be useful in cases where field names cause conflicts. For
example, if an event has two fields, `c` and `c.b`, that are both assigned scalar
values (for example, `{"c": 1, "c.b": 2}`), this results in an Elasticsearch error at
ingest time, because the value of `c` cannot simultaneously be a scalar
and an object. To prevent this, you can use the `rename` processor to rename `c` to
`c.value`.

The `rename` processor cannot be used to overwrite fields. To overwrite a field, either
first rename the target field, or use the `drop_fields` processor to drop the
target field, and then rename the field.

[source,yaml]
-------
processors:
- rename:
    fields:
     - from: "a.g"
       to: "e.d"
    ignore_missing: false
    fail_on_error: true
-------

The `rename` processor has the following configuration settings:

`ignore_missing`:: (Optional) If set to true, no error is logged in case a key
which should be renamed is missing. Default is `false`.

`fail_on_error`:: (Optional) If set to true, in case of an error the renaming of
fields is stopped and the original event is returned. If set to false, renaming
continues also if an error happened during renaming. Default is `true`.
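
The two scenarios described earlier in this section can be sketched as follows.
The first processor resolves the conflict between `c` and `c.b`. The second and
third overwrite an existing `e.d` field by dropping it before the rename; this
assumes that processors run in the order in which they are listed.

[source,yaml]
------
processors:
# Move the scalar value of "c" so that "c" can hold an object.
- rename:
    fields:
     - from: "c"
       to: "c.value"
# Remove the existing target field, then rename "a.g" to "e.d".
- drop_fields:
    fields: ["e.d"]
- rename:
    fields:
     - from: "a.g"
       to: "e.d"
------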

See <<conditions>> for a list of supported conditions.

You can specify multiple `rename` processors under the `processors`
section.

[[add-kubernetes-metadata]]
=== Add Kubernetes metadata

The `add_kubernetes_metadata` processor annotates each event with relevant
metadata based on which Kubernetes pod the event originated from. Each event is
annotated with:

* Pod Name
* Namespace
* Labels

The `add_kubernetes_metadata` processor has two basic building blocks which are:

* Indexers
* Matchers

Indexers take in a pod's metadata and build indices based on the pod metadata.
For example, the `ip_port` indexer can take a Kubernetes pod and index the pod
metadata based on all `pod_ip:container_port` combinations.

Matchers are used to construct lookup keys for querying indices. For example,
when the `fields` matcher takes `["metricset.host"]` as a lookup field, it would
construct a lookup key with the value of the field `metricset.host`.

Each Beat can define its own default indexers and matchers which are enabled by
default. For example, Filebeat enables the `container` indexer, which indexes
pod metadata based on all container IDs, and a `logs_path` matcher, which takes
the `source` field, extracts the container ID, and uses it to retrieve metadata.

The configuration below enables the processor when {beatname_lc} is run as a pod in
Kubernetes.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    in_cluster: true
-------------------------------------------------------------------------------

The configuration below enables the processor on a Beat running as a process on
the Kubernetes node.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    in_cluster: false
    host: <hostname>
    kube_config: ${HOME}/.kube/config
-------------------------------------------------------------------------------

The configuration below has the default indexers and matchers disabled and
enables ones that the user is interested in.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    in_cluster: false
    host: <hostname>
    kube_config: ~/.kube/config
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - ip_port:
    matchers:
      - fields:
          lookup_fields: ["metricset.host"]
-------------------------------------------------------------------------------

The `add_kubernetes_metadata` processor has the following configuration settings:

`in_cluster`:: (Optional) Use in-cluster settings for the Kubernetes client. The
default is `true`.
`host`:: (Optional) Identify the node where {beatname_lc} is running in case it
cannot be accurately detected, as when running {beatname_lc} in host network
mode.
`kube_config`:: (Optional) Use given config file as configuration for Kubernetes
client.
`default_indexers.enabled`:: (Optional) Enable/Disable default pod indexers, in
case you want to specify your own.
`default_matchers.enabled`:: (Optional) Enable/Disable default pod matchers, in
case you want to specify your own.

[[add-docker-metadata]]
=== Add Docker metadata

The `add_docker_metadata` processor annotates each event with relevant metadata
from Docker containers:

* Container ID
* Name
* Image
* Labels

[NOTE]
=====
When running {beatname_uc} in a container, you need to provide access to
Docker’s unix socket in order for the `add_docker_metadata` processor to work.
You can do this by mounting the socket inside the container. For example:

`docker run -v /var/run/docker.sock:/var/run/docker.sock ...`

To avoid privilege issues, you may also need to add `--user=root` to the
`docker run` flags. Because the user must be part of the docker group in order
to access `/var/run/docker.sock`, root access is required if {beatname_uc} is
running as non-root inside the container. 
=====

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_docker_metadata:
    host: "unix:///var/run/docker.sock"
    #match_fields: ["system.process.cgroup.id"]
    #match_pids: ["process.pid", "process.ppid"]
    #match_source: true
    #match_source_index: 4
    #match_short_id: true
    #cleanup_timeout: 60
    # To connect to Docker over TLS you must specify a client and CA certificate.
    #ssl:
    #  certificate_authority: "/etc/pki/root/ca.pem"
    #  certificate:           "/etc/pki/client/cert.pem"
    #  key:                   "/etc/pki/client/cert.key"
-------------------------------------------------------------------------------

It has the following settings:

`host`:: (Optional) Docker socket (UNIX or TCP socket). It uses
`unix:///var/run/docker.sock` by default.

`ssl`:: (Optional) SSL configuration to use when connecting to the Docker
socket.

`match_fields`:: (Optional) A list of fields to match a container ID. At least
one of them should hold a container ID for the event to be enriched.

`match_pids`:: (Optional) A list of fields that contain process IDs. If the
process is running in Docker then the event will be enriched. The default value
is `["process.pid", "process.ppid"]`.

`match_source`:: (Optional) Match container ID from a log path present in the
`source` field. Enabled by default.

`match_short_id`:: (Optional) Match container short ID from a log path present
in the `source` field. Disabled by default.
This allows matching directory names that consist of the first 12 characters
of the container ID. For example, `/var/log/containers/b7e3460e2b21/*.log`.

`match_source_index`:: (Optional) Index in the source path split by `/` to look
for the container ID. It defaults to 4 to match
`/var/lib/docker/containers/<container_id>/*.log`.

`cleanup_timeout`:: (Optional) Time of inactivity after which the metadata for a
container is cleaned up and forgotten. The default is 60s.
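
As a hedged sketch, the following configuration connects to a Docker daemon
over TCP with TLS, using the `ssl` settings from the commented example above.
The address `tcp://192.0.2.10:2376` is a hypothetical placeholder, and the
certificate paths must point to your own files.

[source,yaml]
------
processors:
- add_docker_metadata:
    # Hypothetical remote Docker endpoint.
    host: "tcp://192.0.2.10:2376"
    ssl:
      certificate_authority: "/etc/pki/root/ca.pem"
      certificate:           "/etc/pki/client/cert.pem"
      key:                   "/etc/pki/client/cert.key"
------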

[[add-host-metadata]]
=== Add Host metadata

beta[]

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_host_metadata:
    netinfo.enabled: false
-------------------------------------------------------------------------------

It has the following settings:

`netinfo.enabled`:: (Optional) Default false. Include IP addresses and MAC addresses as the fields `host.ip` and `host.mac`.

The `add_host_metadata` processor annotates each event with relevant metadata from the host machine.
The fields added to the event look like the following:

[source,json]
-------------------------------------------------------------------------------
{
   "host":{
      "architecture":"x86_64",
      "name":"example-host",
      "id":"",
      "os":{
         "family":"darwin",
         "build":"16G1212",
         "platform":"darwin",
         "version":"10.12.6"
      },
      "ip": ["192.168.0.1", "10.0.0.1"],
      "mac": ["00:25:96:12:34:56", "72:00:06:ff:79:f1"]
   }
}
-------------------------------------------------------------------------------
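
To include the `ip` and `mac` values shown above, network information has to be
enabled, because the setting is off by default. A minimal sketch:

[source,yaml]
------
processors:
- add_host_metadata:
    # Add the host's IP and MAC addresses to each event.
    netinfo.enabled: true
------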

NOTE: The host information is refreshed every 5 minutes.

[[dissect]]
=== Dissect strings

The `dissect` processor tokenizes incoming strings using defined patterns.

[source,yaml]
-------
processors:
- dissect:
    tokenizer: "%{key1} %{key2}"
    field: "message"
    target_prefix: "dissect"
-------

The `dissect` processor has the following configuration settings:

`tokenizer`:: The pattern that defines how the field is split into keys, as in
the example above.

`field`:: (Optional) The event field to tokenize. Default is `message`.

`target_prefix`:: (Optional) The name of the field where the values will be extracted. When an empty
string is defined, the processor creates the keys at the root of the event. Default is
`dissect`. When the target key already exists in the event, the processor does not replace it and logs
an error; you need to either drop or rename the key before using dissect.

For tokenization to be successful, all keys must be found and extracted. If one of them cannot be
found, an error is logged and the original event is not modified.

NOTE: A key can contain any characters except reserved suffix or prefix modifiers: `/`, `&`, `+`
and `?`.
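
To make the behavior concrete, consider the following sketch. The input message
and the key names `method` and `path` are hypothetical, and the result shown in
the comments is what this configuration is expected to produce, not captured
output.

[source,yaml]
------
processors:
- dissect:
    # Hypothetical input event:  message: "GET /index.html"
    tokenizer: "%{method} %{path}"
    field: "message"
    target_prefix: "dissect"
    # Expected fields added to the event:
    #   dissect.method: "GET"
    #   dissect.path:   "/index.html"
------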

See <<conditions>> for a list of supported conditions.

[[processor-dns]]
=== DNS Reverse Lookup

The DNS processor performs reverse DNS lookups of IP addresses. It caches the
responses that it receives in accordance with the time-to-live (TTL) value
contained in the response. It also caches failures that occur during lookups.
Each instance of this processor maintains its own independent cache.

The processor uses its own DNS resolver to send requests to nameservers and does
not use the operating system's resolver. It does not read any values contained
in `/etc/hosts`.

This processor can significantly slow down your pipeline's throughput if you
have a high latency network or slow upstream nameserver. The cache will help
with performance, but if the addresses being resolved have a high cardinality
then the cache benefits will be diminished due to the high miss ratio.

By way of example, if each DNS lookup takes 2 milliseconds, the maximum
throughput you can achieve is 500 events per second (1000 milliseconds / 2
milliseconds). If you have a high cache hit ratio then your throughput can be
higher.

This is a minimal configuration example that resolves the IP addresses contained
in two fields.

[source,yaml]
----
processors:
- dns:
    type: reverse
    fields:
      source.ip: source.hostname
      destination.ip: destination.hostname
----

Next is a configuration example showing all options.

[source,yaml]
----
processors:
- dns:
    type: reverse
    action: append
    fields:
      server.ip: server.hostname
      client.ip: client.hostname
    success_cache:
      capacity.initial: 1000
      capacity.max: 10000
    failure_cache:
      capacity.initial: 1000
      capacity.max: 10000
      ttl: 1m
    nameservers: ['192.0.2.1', '203.0.113.1']
    timeout: 500ms
    tag_on_failure: [_dns_reverse_lookup_failed]
----

The `dns` processor has the following configuration settings:

`type`:: The type of DNS lookup to perform. The only supported type is
`reverse` which queries for a PTR record.

`action`:: This defines the behavior of the processor when the target field
already exists in the event. The options are `append` (default) and `replace`.

`fields`:: This is a mapping of source field names to target field names. The
value of the source field will be used in the DNS query and result will be
written to the target field.

`success_cache.capacity.initial`:: The initial number of items that the success
cache will be allocated to hold. When initialized the processor will allocate
the memory for this number of items. Default value is `1000`.

`success_cache.capacity.max`:: The maximum number of items that the success
cache can hold. When the maximum capacity is reached a random item is evicted.
Default value is `10000`.

`failure_cache.capacity.initial`:: The initial number of items that the failure
cache will be allocated to hold. When initialized the processor will allocate
the memory for this number of items. Default value is `1000`.

`failure_cache.capacity.max`:: The maximum number of items that the failure
cache can hold. When the maximum capacity is reached a random item is evicted.
Default value is `10000`.

`failure_cache.ttl`:: The duration for which failures are cached. Valid time
units are "ns", "us" (or "µs"), "ms", "s", "m", "h". Default value is `1m`.

`nameservers`:: A list of nameservers to query. If there are multiple servers,
the resolver queries them in the order listed. If none are specified then it
will read the nameservers listed in `/etc/resolv.conf` once at initialization.
On Windows you must always supply at least one nameserver.

`timeout`:: The duration after which a DNS query times out. This is the timeout
for each DNS request, so if you have 2 nameservers then the total timeout will be
2 times this value. Valid time units are "ns", "us" (or "µs"), "ms", "s", "m",
"h". Default value is `500ms`.

`tag_on_failure`:: A list of tags to add to the event when any lookup fails. The
tags are only added once even if multiple lookups fail. By default no tags are
added upon failure.
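
As a further sketch that combines several of these settings, the following
configuration replaces an existing target field, queries one explicitly
configured nameserver (as is required on Windows), and tags events for which a
lookup fails. The nameserver address is a documentation placeholder.

[source,yaml]
------
processors:
- dns:
    type: reverse
    # Overwrite the target field if it already exists.
    action: replace
    fields:
      source.ip: source.hostname
    # Placeholder address; use a nameserver reachable from your hosts.
    nameservers: ['192.0.2.1']
    tag_on_failure: [_dns_reverse_lookup_failed]
------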

[[add-process-metadata]]
=== Add process metadata

The `add_process_metadata` processor enriches events with information from running
processes, identified by their process ID (PID).

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_process_metadata:
    match_pids: [system.process.ppid]
    target: system.process.parent
-------------------------------------------------------------------------------

The fields added to the event look as follows:

[source,json]
-------------------------------------------------------------------------------
"process": {
  "name":  "systemd",
  "title": "/usr/lib/systemd/systemd --switched-root --system --deserialize 22",
  "exe":   "/usr/lib/systemd/systemd",
  "args":  ["/usr/lib/systemd/systemd", "--switched-root", "--system", "--deserialize", "22"],
  "pid":   1,
  "ppid":  0,
  "start_time": "2018-08-22T08:44:50.684Z"
}
-------------------------------------------------------------------------------

Optionally, the process environment can be included, too:

[source,json]
-------------------------------------------------------------------------------
  ...
  "env": {
    "HOME":       "/",
    "TERM":       "linux",
    "BOOT_IMAGE": "/boot/vmlinuz-4.11.8-300.fc26.x86_64",
    "LANG":       "en_US.UTF-8"
  }
  ...
-------------------------------------------------------------------------------

It has the following settings:

`match_pids`:: List of fields to search for a PID. The processor will
search the list sequentially until the field is found in the current event, and
the PID lookup will be applied to the value of this field.

`target`:: (Optional) Destination prefix where the `process` object will be
created. The default is the event's root.

`include_fields`:: (Optional) List of fields to add. By default, the processor
will add all the available fields except `process.env`.

`ignore_missing`:: (Optional) When set to `false`, events that don't contain any
of the fields in `match_pids` will be discarded and an error will be generated. By
default, this condition is ignored.

`overwrite_keys`:: (Optional) By default, if a target field already exists, it
will not be overwritten and an error will be logged. If `overwrite_keys` is
set to `true`, this condition will be ignored.

`restricted_fields`:: (Optional) By default, the `process.env` field is not
output, to avoid leaking sensitive data. If `restricted_fields` is `true`, the
field will be present in the output.
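
For example, the following sketch extends the configuration shown earlier so
that the process environment is included and existing target fields are
overwritten. Enable `restricted_fields` only if exposing environment variables
is acceptable in your deployment.

[source,yaml]
------
processors:
- add_process_metadata:
    match_pids: [system.process.ppid]
    target: system.process.parent
    # Include process.env in the output (may contain sensitive data).
    restricted_fields: true
    # Replace target fields that already exist instead of logging an error.
    overwrite_keys: true
------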