Current Situation

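The bucket that holds the device_traffic table already has a Retention Policy of 3d, yet its total disk usage still exceeds 20 GB. The TSM storage engine files are far larger than expected and are affecting other services. A quick analysis shows that the space is taken up mainly by the TSM files themselves, which means that too much data really is being stored.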

Troubleshooting Process

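The overall approach is to locate the storage engine directory, check which bucket takes up the most space, and then work out why.
In the storage directory, run the following commands: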

cd /usr/local/influxdb/engine/data
du -sh ./*

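This shows, directory by directory, which bucket takes up the most space. At the moment the "/" bucket is by far the largest. The "/" bucket mainly holds InfluxDB system-related data and does not affect business data.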
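When there are many buckets, it can help to sort the du listing so the largest directory comes first. The following is a minimal sketch; the function name bucket_sizes is made up for illustration, and it assumes every bucket is an immediate subdirectory of the engine data directory.

```shell
# List the immediate subdirectories of a directory, largest first.
# $1: directory to inspect (defaults to the current directory).
bucket_sizes() {
    dir=${1:-.}
    # -k reports sizes in KiB so that sort -n compares plain numbers.
    du -sk "$dir"/*/ 2>/dev/null | sort -rn
}

# Example:
#   bucket_sizes /usr/local/influxdb/engine/data | head -n 5
```

Sizes are printed in KiB rather than in the human-readable form of du -sh, because mixed units such as 912K and 3.2G do not sort correctly with a plain numeric sort.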

Open the management UI:
http://x.x.x.x:8086/
Data --> Buckets
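The directories under engine/data are named after bucket IDs, not bucket names, so the du output has to be matched against the bucket list shown in the UI. The sketch below does that matching on the command line. The function name label_bucket_dirs is hypothetical, and the function expects "ID name" pairs on standard input; one possible source is influx bucket list --hide-headers, assuming the CLI is already configured with a host and token and that ID and name are its first two columns.

```shell
# Print "<size in KiB>\t<bucket-id>\t<bucket-name>" for each bucket directory.
# $1: engine data directory; stdin: lines of "<bucket-id> <bucket-name>".
label_bucket_dirs() {
    data_dir=$1
    while read -r id name; do
        # Skip IDs that have no directory on this node.
        [ -d "$data_dir/$id" ] || continue
        size=$(du -sk "$data_dir/$id" | cut -f1)
        printf '%s\t%s\t%s\n' "$size" "$id" "$name"
    done
}

# Example (assumes ID and name are the first two columns of the CLI output):
#   influx bucket list --hide-headers | awk '{print $1, $2}' \
#       | label_bucket_dirs /usr/local/influxdb/engine/data | sort -rn
```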

While we are here, a note on the other two system buckets, _monitoring and _tasks.

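The _monitoring bucket is a system bucket that stores the InfluxDB data used for monitoring checks and for sending alert notifications. Its retention period is 7 days.
_monitoring schema: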

statuses (measurement)
    tags:
        _check_id: check ID
        _check_name: check name
        _level: level evaluated by the check (ok, info, warn, or crit)
        _source_measurement: original measurement queried by the check
        _type: check type (threshold or deadman)
        other tags inherited from queried data or added in the check configuration
    fields:
        _message: message generated by the check
        _source_timestamp: original timestamp of the queried data
        other fields inherited from queried data
notifications (measurement)
    tags:
        _check_id: check ID that triggered the notification
        _check_name: check name that triggered the notification
        _level: check-evaluated level that triggered the notification (ok, info, warn, or crit)
        _notification_endpoint_id: notification endpoint ID
        _notification_endpoint_name: notification endpoint name
        _notification_rule_id: notification rule ID
        _notification_rule_name: notification rule name
        _sent: sent status (true or false)
        _source_measurement: original measurement queried by the check
        _type: check type (threshold or deadman)
        other tags inherited from queried data or added in the check configuration
    fields:
        _message: message generated by the check
        _source_timestamp: original timestamp of the queried data
        _status_timestamp: timestamp when the status (_level) was evaluated
        other fields inherited from queried data

The _tasks bucket is also a system bucket. It stores InfluxDB task-related data, and its retention period is 3 days.
A task run is a single execution of a task.
_tasks schema:

runs (measurement)
    tags:
        status: task run status (success or failed)
        taskID: task ID
    fields:
        finishedAt: timestamp when the task run finished
        logs: log output from the task run
        requestedAt: timestamp when the task run was requested
        runID: task run ID
        scheduledFor: timestamp the task run was scheduled for
        startedAt: timestamp when the task run started

Solution

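The rest of this post focuses on the "/" bucket, which is mainly used for system metering. Judging by the directory sizes, it consumes a considerable share of the storage, so it is the starting point: the problem is solved by reconfiguring this bucket.
First, in the UI, set the bucket's retention policy to keep only 3 days of data. After roughly 30 minutes, InfluxDB automatically removes the older data.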
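The same change can probably be made from the command line instead of the UI. The sketch below wraps influx bucket update with its --id and --retention flags. The wrapper name set_bucket_retention and the DRY_RUN switch are made up for illustration, and the sketch assumes the influx CLI is installed and already configured with a host and a token that may modify the bucket.

```shell
# Set a bucket's retention period with the influx CLI.
# $1: bucket ID (the directory name under engine/data)
# $2: retention duration, for example 72h for 3 days
# With DRY_RUN=1 the command is printed instead of executed.
set_bucket_retention() {
    bucket_id=$1
    retention=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "influx bucket update --id $bucket_id --retention $retention"
    else
        influx bucket update --id "$bucket_id" --retention "$retention"
    fi
}

# Example (replace <bucket-id> with the ID of the bucket to change):
#   set_bucket_retention <bucket-id> 72h
```

Running it once with DRY_RUN=1 makes it easy to confirm the bucket ID before the retention period is actually shortened, since data older than the new period is deleted and cannot be recovered.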

Verify that the retention policy has taken effect:

root@iZ2ze4vgyc2dc22jup6dv7Z:/usr/local/influxdb/engine/data# du -sh ./*
912K    ./177f23fdbc5960a2
440K    ./2e5dc5bdb6c14e48
3.2G    ./da6927bb4181d5da
508K    ./ddefe35149fa8703

Tags: influxdb, time-series database