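About Monitoring

Octopus is built on sigs.k8s.io/controller-runtime, so some of its metrics relate to controller-runtime and client-go. In addition, github.com/prometheus/client_golang provides metrics for the Go runtime and for process state.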

Metrics Category

In the "Kind" column, the first letter stands for the metric type: G - Gauge, C - Counter, H - Histogram, S - Summary.
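
The kind letters correspond to the `# TYPE` comment lines of the Prometheus text exposition format. The following is a minimal sketch that turns such lines into the letter notation used in the tables below. The `kind_table` function name and the sample lines are illustrative assumptions, not actual Octopus output.

```shell
#!/bin/sh
# kind_table: read Prometheus text exposition format on stdin and print
# "<letter><TAB><metric name>" for every "# TYPE" line.
kind_table() {
  awk '
    $1 == "#" && $2 == "TYPE" {
      letter = "?"                       # untyped or unknown type
      if ($4 == "gauge")     letter = "G"
      if ($4 == "counter")   letter = "C"
      if ($4 == "histogram") letter = "H"
      if ($4 == "summary")   letter = "S"
      printf "%s\t%s\n", letter, $3
    }'
}

# Illustrative sample input (hypothetical values).
sample='# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# TYPE workqueue_adds_total counter
workqueue_adds_total{name="devicelink"} 7
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds_sum 0.01'

printf '%s\n' "$sample" | kind_table
```

Against a running instance, the same function could be fed from the metrics endpoint instead of the sample string, for example `curl -s http://localhost:8080/metrics | kind_table` (the `/metrics` path is assumed here).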

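Controller Runtime metrics family

Controller metrics

Kind	Name	Description
C	`controller_runtime_reconcile_total`	Total number of reconciliations per controller
C	`controller_runtime_reconcile_errors_total`	Total number of reconciliation errors per controller
H	`controller_runtime_reconcile_time_seconds`	Duration of each reconciliation per controller, in seconds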
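
Two figures are often derived from these three series: the reconcile error ratio, and the mean reconcile duration (the histogram's `_sum` divided by its `_count`). The sketch below computes both from a metrics dump in text exposition format, summing across all label sets. The `reconcile_summary` name and the sample values are made up for illustration, and the script assumes the samples carry no trailing timestamps.

```shell
#!/bin/sh
# reconcile_summary: read Prometheus text exposition format on stdin and
# print the reconcile error ratio and the mean reconcile duration.
reconcile_summary() {
  awk '
    /^#/ || NF == 0 { next }             # skip comments and blank lines
    {
      i = index($1, "{")                 # strip the label set, if any
      name = (i > 0) ? substr($1, 1, i - 1) : $1
      total[name] += $NF                 # last field is the sample value
    }
    END {
      n   = total["controller_runtime_reconcile_total"]
      err = total["controller_runtime_reconcile_errors_total"]
      sum = total["controller_runtime_reconcile_time_seconds_sum"]
      cnt = total["controller_runtime_reconcile_time_seconds_count"]
      if (n > 0)   printf "error_ratio %.3f\n", err / n
      if (cnt > 0) printf "mean_seconds %.3f\n", sum / cnt
    }'
}

# Illustrative sample input (hypothetical controller name and values).
sample='controller_runtime_reconcile_total{controller="devicelink"} 200
controller_runtime_reconcile_errors_total{controller="devicelink"} 5
controller_runtime_reconcile_time_seconds_sum{controller="devicelink"} 25
controller_runtime_reconcile_time_seconds_count{controller="devicelink"} 200'

printf '%s\n' "$sample" | reconcile_summary
```

Because counters only ever increase, these are lifetime averages since process start. For rates over a time window, the same ratios are usually computed in Prometheus itself.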

Webhook metrics

Kind	Name	Description
H	`controller_runtime_webhook_latency_seconds`	Histogram of the latency of processing requests

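Kubernetes client metrics family

Rest client metrics

Kind	Name	Description
C	`rest_client_requests_total`	Number of HTTP requests, partitioned by status code, method, and host
H	`rest_client_request_latency_seconds`	Request latency in seconds, broken down by verb and URL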

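Workqueue metrics

Kind	Name	Description
G	`workqueue_depth`	Current depth of the workqueue
G	`workqueue_unfinished_work_seconds`	Seconds of work that is in progress and has not yet been observed by `work_duration`. Large values indicate stuck threads; the number of stuck threads can be inferred from the rate at which this value increases.
G	`workqueue_longest_running_processor_seconds`	How many seconds the longest-running processor of the workqueue has been running
C	`workqueue_adds_total`	Total number of adds handled by the workqueue
C	`workqueue_retries_total`	Total number of retries handled by the workqueue
H	`workqueue_queue_duration_seconds`	How long, in seconds, an item stays in the workqueue before being requested
H	`workqueue_work_duration_seconds`	How long, in seconds, processing an item from the workqueue takes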

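Prometheus client metrics family

Go runtime metrics

Kind	Name	Description
G	`go_goroutines`	Number of goroutines that currently exist
G	`go_threads`	Number of OS threads created
G	`go_info`	Information about the Go environment
S	`go_gc_duration_seconds`	Summary of the pause duration of garbage collection cycles
G	`go_memstats_alloc_bytes`	Number of bytes allocated and still in use
C	`go_memstats_alloc_bytes_total`	Total number of bytes allocated, including bytes that have since been freed
G	`go_memstats_sys_bytes`	Number of bytes obtained from the system
C	`go_memstats_lookups_total`	Total number of pointer lookups
C	`go_memstats_mallocs_total`	Total number of mallocs (allocation operations)
C	`go_memstats_frees_total`	Total number of frees (deallocation operations)
G	`go_memstats_heap_alloc_bytes`	Number of heap bytes allocated and still in use
G	`go_memstats_heap_sys_bytes`	Number of heap bytes obtained from the system
G	`go_memstats_heap_idle_bytes`	Number of heap bytes waiting to be used
G	`go_memstats_heap_inuse_bytes`	Number of heap bytes that are in use
G	`go_memstats_heap_released_bytes`	Number of heap bytes released to the OS
G	`go_memstats_heap_objects`	Number of allocated objects
G	`go_memstats_stack_inuse_bytes`	Number of bytes in use by the stack allocator
G	`go_memstats_stack_sys_bytes`	Number of bytes obtained from the system for the stack allocator
G	`go_memstats_mspan_inuse_bytes`	Number of bytes in use by mspan structures
G	`go_memstats_mspan_sys_bytes`	Number of bytes obtained from the system for mspan structures
G	`go_memstats_mcache_inuse_bytes`	Number of bytes in use by mcache structures
G	`go_memstats_mcache_sys_bytes`	Number of bytes obtained from the system for mcache structures
G	`go_memstats_buck_hash_sys_bytes`	Number of bytes used by the profiling bucket hash table
G	`go_memstats_gc_sys_bytes`	Number of bytes used for garbage collection system metadata
G	`go_memstats_other_sys_bytes`	Number of bytes used for other system allocations
G	`go_memstats_next_gc_bytes`	Number of heap bytes at which the next garbage collection will take place
G	`go_memstats_last_gc_time_seconds`	Time of the last garbage collection, in seconds since 1970
G	`go_memstats_gc_cpu_fraction`	Fraction of this program's available CPU time used by the GC since the program started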

Running process metrics

Kind	Name	Description
C	`process_cpu_seconds_total`	Total user and system CPU time spent, in seconds
G	`process_open_fds`	Number of open file descriptors
G	`process_max_fds`	Maximum number of open file descriptors
G	`process_virtual_memory_bytes`	Virtual memory size in bytes
G	`process_virtual_memory_max_bytes`	Maximum amount of virtual memory available, in bytes
G	`process_resident_memory_bytes`	Resident memory size in bytes
G	`process_start_time_seconds`	Start time of the process, in seconds since the Unix epoch

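Octopus metrics family

Limb metrics

Kind	Name	Description
G	`limb_connect_connections`	Current number of connections to the adaptor
C	`limb_connect_errors_total`	Total number of errors when connecting to the adaptor
C	`limb_send_errors_total`	Total number of errors when sending the desired device to the adaptor
H	`limb_send_latency_seconds`	Histogram of the latency of sending the desired device to the adaptor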

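Monitor

By default, the metrics are exposed on port 8080 (see brain options and limb options). They can then be collected by Prometheus and visualized with Grafana. Octopus provides a ServiceMonitor definition YAML to integrate with Prometheus Operator, a tool for configuring and managing Prometheus instances.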
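
To give a feel for what a ServiceMonitor contains, the sketch below writes a hypothetical one to a temporary file for review. It is not the definition shipped with Octopus: the resource name, the `octopus-system` namespace, the Service label, the `metrics` port name, and the `/metrics` path are all assumptions that would need to be checked against the actual deployment.

```shell
#!/bin/sh
# Write a hypothetical ServiceMonitor manifest to a temp file for review.
manifest="${TMPDIR:-/tmp}/octopus-servicemonitor-sketch.yaml"
cat > "$manifest" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: octopus-brain-sketch            # hypothetical name
  namespace: octopus-monitoring
spec:
  namespaceSelector:
    matchNames:
      - octopus-system                  # assumed namespace of the Octopus Services
  selector:
    matchLabels:
      app.kubernetes.io/name: octopus   # assumed Service label
  endpoints:
    - port: metrics                     # assumed name of the Service port mapped to 8080
      path: /metrics                    # assumed metrics path
      interval: 30s
EOF
echo "wrote $manifest"
# To try it on a cluster that has the Prometheus Operator CRDs installed:
#   kubectl apply -f "$manifest"
```

The `selector` picks Services by label and `endpoints` names the Service port to scrape, so the Service that fronts port 8080 must carry a matching label and a named port.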

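Grafana Dashboard

For convenience, Octopus provides a Grafana dashboard to visualize the monitoring metrics.

Integrate with Prometheus Operator

Using the prometheus-operator Helm chart, you can easily set up Prometheus Operator to monitor Octopus. The following steps demonstrate how to run Prometheus Operator on a local Kubernetes cluster: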

1. Use cluster-k3d-spinup.sh to create a local Kubernetes cluster via k3d.
2. Follow the Helm installation guide to install the helm tool, then use `helm fetch --untar --untardir /tmp stable/prometheus-operator` to fetch the prometheus-operator chart into the local /tmp directory.
3. Generate the deployment YAML from the prometheus-operator chart as follows.

    helm template --namespace octopus-monitoring \
    --name octopus \
    --set defaultRules.create=false \
    --set global.rbac.pspEnabled=false \
    --set prometheusOperator.admissionWebhooks.patch.enabled=false \
    --set prometheusOperator.admissionWebhooks.enabled=false \
    --set prometheusOperator.kubeletService.enabled=false \
    --set prometheusOperator.tlsProxy.enabled=false \
    --set prometheusOperator.serviceMonitor.selfMonitor=false \
    --set alertmanager.enabled=false \
    --set grafana.defaultDashboardsEnabled=false \
    --set coreDns.enabled=false \
    --set kubeApiServer.enabled=false \
    --set kubeControllerManager.enabled=false \
    --set kubeEtcd.enabled=false \
    --set kubeProxy.enabled=false \
    --set kubeScheduler.enabled=false \
    --set kubeStateMetrics.enabled=false \
    --set kubelet.enabled=false \
    --set nodeExporter.enabled=false \
    --set prometheus.serviceMonitor.selfMonitor=false \
    --set prometheus.ingress.enabled=true \
    --set prometheus.ingress.hosts={localhost} \
    --set prometheus.ingress.paths={/prometheus} \
    --set prometheus.ingress.annotations.'traefik\.ingress\.kubernetes\.io\/rewrite-target'=/ \
    --set prometheus.prometheusSpec.externalUrl=http://localhost/prometheus \
    --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
    --set prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues=false \
    --set grafana.adminPassword=admin \
    --set grafana.rbac.pspUseAppArmor=false \
    --set grafana.rbac.pspEnabled=false \
    --set grafana.serviceMonitor.selfMonitor=false \
    --set grafana.testFramework.enabled=false \
    --set grafana.ingress.enabled=true \
    --set grafana.ingress.hosts={localhost} \
    --set grafana.ingress.path=/grafana \
    --set grafana.ingress.annotations.'traefik\.ingress\.kubernetes\.io\/rewrite-target'=/ \
    --set grafana.'grafana\.ini'.server.root_url=http://localhost/grafana \
    /tmp/prometheus-operator > /tmp/prometheus-operator_all_in_one.yaml

4. Create the octopus-monitoring namespace via `kubectl create ns octopus-monitoring`.
5. Apply the prometheus-operator all-in-one deployment to the local cluster via `kubectl apply -f /tmp/prometheus-operator_all_in_one.yaml`.
6. (Optional) Deploy Octopus via `kubectl apply -f https://raw.githubusercontent.com/cnrancher/octopus/master/deploy/e2e/all_in_one.yaml`.
   Note: users in mainland China can speed up the installation with the following mirror:
   `kubectl apply -f http://rancher-mirror.cnrancher.com/octopus/master/deploy/e2e/all_in_one.yaml`
7. Apply the monitoring integration to the local cluster via `kubectl apply -f https://raw.githubusercontent.com/cnrancher/octopus/master/deploy/e2e/integrate_with_prometheus_operator.yaml`.
   Note: users in mainland China can speed up the installation with the following mirror:
   `kubectl apply -f http://rancher-mirror.cnrancher.com/octopus/master/deploy/e2e/integrate_with_prometheus_operator.yaml`
8. Visit http://localhost/prometheus in a browser to view the Prometheus web console, or visit http://localhost/grafana to view the Grafana console (the administrator account is admin/admin).
9. (Optional) Import the Octopus overview dashboard from the Grafana console.
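
Before applying the manifest generated in the steps above, it can be useful to check which resource kinds it contains. The following is a small sketch for that purpose; the `summarize_kinds` helper is a hypothetical name, and it only counts top-level `kind:` lines, so items nested inside a `List` are not included.

```shell
#!/bin/sh
# summarize_kinds FILE: count the top-level "kind:" entries in a
# multi-document YAML file, most frequent kind first.
summarize_kinds() {
  grep -E '^kind:[[:space:]]' "$1" | awk '{ print $2 }' | sort | uniq -c | sort -rn
}

manifest=/tmp/prometheus-operator_all_in_one.yaml
if [ -f "$manifest" ]; then
  summarize_kinds "$manifest"
else
  echo "manifest not found: $manifest" >&2
fi
```

Each output line holds a count followed by a kind name, which makes it easy to spot, for example, whether any ServiceMonitor objects were rendered with the chosen `--set` flags.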
