[[filebeat-modules-devguide]]
== Creating a New Filebeat Module

This guide will walk you through creating a new Filebeat module.

All Filebeat modules currently live in the main
https://github.com/elastic/beats[Beats] repository. To clone the repository and
build Filebeat (which you will need for testing), please follow the general
instructions in <<beats-contributing>>.

[float]
=== Overview

Each Filebeat module is composed of one or more "filesets". We usually create a
module for each service that we support (`nginx` for Nginx, `mysql` for MySQL,
and so on) and a fileset for each type of log that the service creates. For
example, the Nginx module has `access` and `error` filesets. You can contribute
a new module (with at least one fileset), or a new fileset for an existing
module.

NOTE: In this guide we use `{module}` and `{fileset}` as placeholders for the
module and fileset names. You need to replace these with the actual names you
entered when you created the module and fileset. Only use the characters `[a-z]`
and, if required, underscores (`_`). No other characters are allowed.

[float]
=== Creating a new module

Run the following command in the `filebeat` folder:

[source,bash]
----
make create-module MODULE={module}
----

After running the `make create-module` command, you'll find the module,
along with its generated files, under `module/{module}`. This
directory contains the following files:

[source,bash]
----
module/{module}
├── module.yml
└── _meta
    └── docs.asciidoc
    └── fields.yml
    └── kibana
----

Let's look at these files one by one.

[float]
==== module.yml

This file contains a list of all the dashboards available for the module. It is
used by the `export_dashboards.go` script to export the dashboards.
Each dashboard is defined by an id and the name of the JSON file where the
dashboard is saved locally (see the example in the `_meta/kibana` section
below).
When a new fileset is generated, this file is automatically updated with
"default" dashboard settings for the new fileset.
Please make sure that these settings are correct.

[float]
==== _meta/docs.asciidoc

This file contains module-specific documentation. You should include information
about which versions of the service were tested and the variables that are
defined in each fileset.
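
For example, a minimal `docs.asciidoc` for a hypothetical `mymodule` module
with a `log` fileset could look like this; the exact structure and wording are
up to you:

[source,asciidoc]
----
== MyModule module

This module parses the logs created by MyModule.

[float]
=== Compatibility

Tested with MyModule versions 1.2 and 1.3.

[float]
=== log fileset settings

*`var.paths`*:: An array of glob-based paths that specify where to look for the
log files.
----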

[float]
==== _meta/fields.yml

The module level `fields.yml` contains descriptions for the module-level fields.
Please review and update the title and the descriptions in this file. The title
is used as a title in the docs, so it's best to capitalize it.
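
For orientation, here is a rough sketch of what the module-level `fields.yml`
for a hypothetical `mymodule` module might contain. The generated file should
already provide a similar skeleton, so you mainly need to adjust the title and
descriptions:

[source,yaml]
----
- key: mymodule
  title: "MyModule"
  description: >
    Module for parsing the MyModule log files.
  fields:
    - name: mymodule
      type: group
      description: >
        Fields from the MyModule log files.
      fields:
----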

[float]
==== _meta/kibana

This folder contains the sample Kibana dashboards for this module. To create
them, you can build them visually in Kibana and then export them with `export_dashboards`.

The tool will export all of the dashboard dependencies (visualizations,
saved searches) automatically.

You can see various ways of using `export_dashboards` at <<export-dashboards>>.
The recommended way to export them is to list your dashboards in your module's
`module.yml` file:

[source,yaml]
----
dashboards:
- id: 69f5ae20-eb02-11e7-8f04-beef1daadb05
  file: mymodule-overview.json
- id: c0a7ce90-cafe-4242-8647-534bb4c21040
  file: mymodule-errors.json
----

Then run `export_dashboards` like this:

[source,shell]
----
$ cd dev-tools/cmd/dashboards
$ make # if export_dashboards is not built yet
$ ./export_dashboards -yml '../../../filebeat/module/{module}/module.yml'
----

New Filebeat modules might not be compatible with Kibana 5.x. To export dashboards
that are compatible with 5.x, run the following command inside the developer virtualenv:

[source,shell]
----
$ cd filebeat
$ make python-env
$ cd module/{module}/
$ python ../../../dev-tools/export_5x_dashboards.py --regex {module} --dir _meta/kibana/5.x
----

The `--regex` parameter should match the dashboard you want to export.

Please note that dashboards exported from Kibana 5.x are not compatible with Kibana 6.x.

You can find more details about the process of creating and exporting the Kibana
dashboards by reading {beatsdevguide}/new-dashboards.html[this guide].

[float]
=== Creating a new fileset

Run the following command in the `filebeat` folder:

[source,bash]
----
make create-fileset MODULE={module} FILESET={fileset}
----

After running the `make create-fileset` command, you'll find the fileset,
along with its generated files, under `module/{module}/{fileset}`. This
directory contains the following files:

[source,bash]
----
module/{module}/{fileset}
├── manifest.yml
├── config
│   └── {fileset}.yml
├── ingest
│   └── pipeline.json
├── _meta
│   └── fields.yml
│   └── kibana
│       └── default
└── test
----

Let's look at these files one by one.

[float]
==== manifest.yml

The `manifest.yml` is the control file for the module, where variables are
defined and the other files are referenced. It is a YAML file, but in many
places in the file, you can use built-in or defined variables by using the
`{{.variable}}` syntax.

The `var` section of the file defines the fileset variables and their default
values. The module variables can be referenced in other configuration files,
and their value can be overridden at runtime by the Filebeat configuration.

As the fileset creator, you can use any names for the variables you define. Each
variable must have a default value. So in its simplest form, this is how you
can define a new variable:

[source,yaml]
----
var:
  - name: pipeline
    default: with_plugins
----

Most filesets should have a `paths` variable defined, which sets the default
paths where the log files are located:

[source,yaml]
----
var:
  - name: paths
    default:
      - /example/test.log*
    os.darwin:
      - /usr/local/example/test.log*
      - /example/test.log*
    os.windows:
      - c:/programdata/example/logs/test.log*
----

There's quite a lot going on in this file, so let's break it down:

* The name of the variable is `paths` and the default value is an array with one
element: `"/example/test.log*"`.
* Note that variable values don't have to be strings.
They can also be numbers, objects, or, as shown in this example, arrays.
* We will use the `paths` variable to set the input `paths`
setting, so "glob" values can be used here.
* Besides the `default` value, the file defines values for particular
operating systems: a default for darwin/OS X/macOS systems and a default for
Windows systems. These are introduced via the `os.darwin` and `os.windows`
keywords. The values under these keys become the default for the variable if
Filebeat is executed on the respective OS.

Besides the variable definition, the `manifest.yml` file also contains
references to the ingest pipeline and input configuration to use (see next
sections):

[source,yaml]
----
ingest_pipeline: ingest/pipeline.json
input: config/testfileset.yml
----

These should point to the respective files from the fileset.

Note that when evaluating the contents of these files, the variables are
expanded, which enables you to select one file or the other depending on the
value of a variable. For example:

[source,yaml]
----
ingest_pipeline: ingest/{{.pipeline}}.json
----

This example selects the ingest pipeline file based on the value of the
`pipeline` variable. For the `pipeline` variable shown earlier, the path would
resolve to `ingest/with_plugins.json` (assuming the variable value isn't
overridden at runtime).
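
For context, this is roughly how a user would override such variables at
runtime from `filebeat.yml`. The variable names and values below are the
hypothetical ones from this example (`no_plugins` would simply point to another
pipeline file in the `ingest/` folder):

[source,yaml]
----
filebeat.modules:
- module: {module}
  {fileset}:
    enabled: true
    var.pipeline: no_plugins
    var.paths: ["/path/to/custom.log*"]
----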

[float]
==== config/*.yml

The `config/` folder contains template files that generate Filebeat input
configurations. The Filebeat inputs are primarily responsible for tailing
files, filtering, and multi-line stitching, so that's what you configure in the
template files.

A typical example looks like this:

[source,yaml]
----
type: log
paths:
{{ range $i, $path := .paths }}
 - {{$path}}
{{ end }}
exclude_files: [".gz$"]
----

You'll find this example in the template file that gets generated automatically
when you run `make create-fileset`. In this example, the `paths` variable is
used to construct the `paths` list for the input `paths` option.

Any template files that you add to the `config/` folder need to generate a valid
Filebeat input configuration in YAML format. The options accepted by the
input configuration are documented in the
{filebeat}/configuration-filebeat-options.html[Filebeat Inputs] section of
the Filebeat documentation.

The template files use the templating language defined by the
https://golang.org/pkg/text/template/[Go standard library].
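
Other constructs of the Go template language work as well. As a sketch, a
hypothetical boolean variable (say, `convert_timezone`) could toggle an extra
section of the generated input configuration:

[source,yaml]
----
type: log
paths:
{{ range $i, $path := .paths }}
 - {{$path}}
{{ end }}
exclude_files: [".gz$"]
{{ if .convert_timezone }}
processors:
- add_locale: ~
{{ end }}
----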

Here is another example that also configures multiline stitching:

[source,yaml]
----
type: log
paths:
{{ range $i, $path := .paths }}
 - {{$path}}
{{ end }}
exclude_files: [".gz$"]
multiline:
  pattern: "^# User@Host: "
  negate: true
  match: after
----

Although you can add multiple configuration files under the `config/` folder,
only the file indicated by the `manifest.yml` file will be loaded. You can use
variables to dynamically switch between configurations.
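
As a sketch, assuming a hypothetical `input_style` variable with the possible
values `plain` and `multiline`, and matching `config/plain.yml` and
`config/multiline.yml` templates in the fileset, the `manifest.yml` could pick
the template at load time:

[source,yaml]
----
var:
  - name: input_style
    default: plain

input: config/{{.input_style}}.yml
----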

[float]
==== ingest/*.json

The `ingest/` folder contains Elasticsearch
{elasticsearch}/ingest.html[Ingest Node] pipeline configurations. The Ingest
Node pipelines are responsible for parsing the log lines and doing other
manipulations on the data.

The files in this folder are JSON documents representing
{elasticsearch}/pipeline.html[pipeline definitions]. Just like with the `config/`
folder, you can define multiple pipelines, but a single one is loaded at runtime
based on the information from `manifest.yml`.

The generator creates a JSON object similar to this one:

[source,json]
----
{
  "description": "Pipeline for parsing {module} {fileset} logs",
  "processors": [
  ],
  "on_failure" : [{
    "set" : {
      "field" : "error.message",
      "value" : "{{ _ingest.on_failure_message }}"
    }
  }]
}
----

From here, you would typically add processors to the `processors` array to do
the actual parsing. For details on how to use ingest node processors, see the
{elasticsearch}/ingest-processors.html[ingest node documentation]. In
particular, you will likely find the
{elasticsearch}/grok-processor.html[Grok processor] to be useful for parsing.
Here is an example for parsing the Nginx access logs:

[source,json]
----
{
  "grok": {
    "field": "message",
    "patterns":[
      "%{IPORHOST:nginx.access.remote_ip} - %{DATA:nginx.access.user_name} \\[%{HTTPDATE:nginx.access.time}\\] \"%{WORD:nginx.access.method} %{DATA:nginx.access.url} HTTP/%{NUMBER:nginx.access.http_version}\" %{NUMBER:nginx.access.response_code} %{NUMBER:nginx.access.body_sent.bytes} \"%{DATA:nginx.access.referrer}\" \"%{DATA:nginx.access.agent}\""
    ],
    "ignore_missing": true
  }
}
----

Note that you should follow the convention of prefixing field names with the
module and fileset name: `{module}.{fileset}.field`, e.g.
`nginx.access.remote_ip`. Also, please review our <<event-conventions>>.

While developing the pipeline definition, we recommend making use of the
{elasticsearch}/simulate-pipeline-api.html[Simulate Pipeline API] for testing
and quick iteration.
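
For quick experiments, you can send a sample log line through a trimmed-down
copy of your pipeline in a single request. The sketch below assumes
Elasticsearch is reachable at `localhost:9200`; the grok pattern and sample
message are simplified stand-ins for the real ones:

[source,shell]
----
$ curl -X POST 'http://localhost:9200/_ingest/pipeline/_simulate?pretty' \
    -H 'Content-Type: application/json' -d '
{
  "pipeline": {
    "description": "Trimmed-down test pipeline",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{IPORHOST:nginx.access.remote_ip} - %{DATA:nginx.access.user_name} \\[%{HTTPDATE:nginx.access.time}\\]"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "203.0.113.10 - admin [05/Feb/2018:16:20:30 +0000]"
      }
    }
  ]
}'
----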

By default, Filebeat does not update ingest pipelines that have already been
loaded. If you want to force an update of your pipeline during development, use
the `./filebeat setup --pipelines` command. This uploads the pipelines even if
they are already available on the node.
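
Depending on your Filebeat version, you may also be able to restrict this to
the module you are working on; treat the exact flags below as an assumption and
check `./filebeat setup --help`:

[source,shell]
----
$ ./filebeat setup --pipelines --modules {module}
----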

[float]
==== _meta/fields.yml

The `fields.yml` file contains the top-level structure for the fields in your
fileset. It is used as the source of truth for:

* the generated Elasticsearch mapping template
* the generated Kibana index pattern
* the generated documentation for the exported fields

Besides the `fields.yml` file in the fileset, there is also a `fields.yml` file
at the module level, placed under `module/{module}/_meta/fields.yml`, which
should contain the fields defined at the module level, and the description of
the module itself. In most cases, you should add the fields at the fileset
level.
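
As an illustration only, a fileset-level `fields.yml` describing two of the
Nginx access fields used in the grok example above might look roughly like
this; the names, types, and descriptions are yours to choose:

[source,yaml]
----
- name: access
  type: group
  description: >
    Contains fields for the Nginx access logs.
  fields:
    - name: remote_ip
      type: keyword
      description: >
        Client IP address of the request.
    - name: user_name
      type: keyword
      description: >
        User name used for basic authentication, if any.
----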

After `pipeline.json` is created, it is possible to generate a base `fields.yml`:

[source,bash]
----
make create-fields MODULE={module} FILESET={fileset}
----

Please always check the generated file and make sure the fields are correct.
You must add the field documentation manually.

If the fields are correct, it is time to generate the documentation,
configuration, and Kibana index patterns:

[source,bash]
----
make update
----

[float]
==== test

In the `test/` directory, you should place sample log files generated by the
service. We have integration tests, automatically executed by CI, that will run
Filebeat on each of the log files under the `test/` folder and check that there
are no parsing errors and that all fields are documented.

In addition, assuming you have a `test.log` file, you can add a
`test.log-expected.json` file in the same directory that contains the expected
documents as they are found via an Elasticsearch search. In this case, the
integration tests will automatically check that the result is the same on each
run.
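
As a rough sketch of how these integration tests are usually run locally, the
following assumes a working Python test environment; the exact environment
variables and test runner can differ between Beats versions, so check the
current testing documentation:

[source,shell]
----
$ cd filebeat
$ make python-env
$ . build/python-env/bin/activate
$ INTEGRATION_TESTS=1 TESTING_FILEBEAT_MODULES={module} \
  TESTING_FILEBEAT_FILESETS={fileset} nosetests tests/system/test_modules.py
----

Setting `GENERATE=1` in the same command is typically used to (re)generate the
`-expected.json` files after a pipeline change.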
|