Filebeat将csv导入es尝试

news2025/12/21 18:28:12

一、安装

在docker中安装部署ELK+filebeat

二、主要配置

- type: log

# Change to true to enable this input configuration.

enabled: true

# Paths that should be crawled and fetched. Glob based paths.

paths:

- /home/centos/pip_v2.csv #源路径

#- c:\programdata\elasticsearch\logs\*

#exclude_lines: ["^Restaurant Name,"] #第一行为字段头以"Restaurant Name"开头，不要第一行

multiline:

pattern: ^\d{4}

#pattern: ',\d+,[^\",]+$'

negate: true

match: after

max_lines: 1000

timeout: 30s

三、关于elastic的pipline

https://hacpai.com/article/1512990272091

我简单介绍主流程，详情见上链接

1.开启数据预处理，node.ingest: true

2.向es提交pipline，并命名为my-pipeline-id

PUT _ingest/pipeline/my-pipeline-id
{
"description" : "describe pipeline",
"processors" : [
{
"set" : {
"field": "foo",
"value": "bar"
}
}
]
}

3.以上pipline的作用

若产生新的数据，会新增一个字段为foo:bar

4.curl的pipline即时测试

POST _ingest/pipeline/_simulate

是一个测试接口，提供pipline的规则和测试数据，返回结果数据

四、关于grok

是pipline中的正则匹配模式，以上规则的复杂版

POST _ingest/pipeline/_simulate

{

"pipeline": {

"description": "grok processor",

"processors" : [

{

"grok": {

"field": "message",

"patterns": ["%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"]

}

]

},

"docs": [

{

"_index": "index",

"_type": "type",

"_id": "id",

"_source": {

"message": "55.3.244.1 GET /index.html 15824 0.043"

}

]

}

五、使用pipline导入csv

utput.elasticsearch:

# Array of hosts to connect to.

hosts: ["localhost:9200"]

#index: "csvindex"

pipline: "my-pipeline-id"

# Protocol - either `http` (default) or `https`.

#protocol: "https"

测试结果pipline配置后，并没生效。

六、结论

1.filebeat 导入csv的资料很少，主要为pipline方式，测试几个失败。

2.J和数据组并没有filebaeat 导入csv的成功案例。J不太建议使用

结论：filebeat导csv并不方便，建议采用logstash。

一般日志收集可使用logstash，每行的信息会存到message中

本文来自互联网用户投稿，该文观点仅代表作者本人，不代表本站立场。本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如若转载，请注明出处：http://www.coloradmin.cn/o/1484275.html

如若内容造成侵权/违法违规/事实不符，请联系多彩编程网进行投诉反馈，一经查实，立即删除！