you are better than you think

kubernetes node 相关指标梳理

7 Oct, 2022

这一篇我们梳理node相关的指标，话不多说，先上指标。

1.kubelet自身指标梳理

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
gc的时间统计（summary指标）

# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
goroutine 数量

# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
os的线程数量


# HELP kubelet_cgroup_manager_duration_seconds [ALPHA] Duration in seconds for cgroup manager operations. Broken down by method.
# TYPE kubelet_cgroup_manager_duration_seconds histogram
操作cgroup的时长分布，按照操作类型统计

# HELP kubelet_containers_per_pod_count [ALPHA] The number of containers per pod.
# TYPE kubelet_containers_per_pod_count histogram
pod中container数量的统计(spec.containers的数量)


# HELP kubelet_docker_operations_duration_seconds [ALPHA] Latency in seconds of Docker operations. Broken down by operation type.
# TYPE kubelet_docker_operations_duration_seconds histogram
操作docker的时长分布，按照操作类型统计

# HELP kubelet_docker_operations_errors_total [ALPHA] Cumulative number of Docker operation errors by operation type.
# TYPE kubelet_docker_operations_errors_total counter
操作docker的错误累计次数，按照操作类型统计

# HELP kubelet_docker_operations_timeout_total [ALPHA] Cumulative number of Docker operation timeout by operation type.
# TYPE kubelet_docker_operations_timeout_total counter
操作docker的超时统计，按照操作类型统计

# HELP kubelet_docker_operations_total [ALPHA] Cumulative number of Docker operations by operation type.
# TYPE kubelet_docker_operations_total counter
操作docker的累计次数，按照操作类型统计

# HELP kubelet_eviction_stats_age_seconds [ALPHA] Time between when stats are collected, and when pod is evicted based on those stats by eviction signal
# TYPE kubelet_eviction_stats_age_seconds histogram
驱逐操作的时间分布，按照驱逐信号(原因)分类统计

# HELP kubelet_evictions [ALPHA] Cumulative number of pod evictions by eviction signal
# TYPE kubelet_evictions counter
驱逐次数统计，按照驱逐信号（原因）统计

# HELP kubelet_http_inflight_requests [ALPHA] Number of the inflight http requests
# TYPE kubelet_http_inflight_requests gauge
请求kubelet的inflight请求数,按照method path server_type统计
注意与每秒的request数区别开

# HELP kubelet_http_requests_duration_seconds [ALPHA] Duration in seconds to serve http requests
# TYPE kubelet_http_requests_duration_seconds histogram
请求kubelet的请求时间统计,按照method path server_type统计

# HELP kubelet_http_requests_total [ALPHA] Number of the http requests received since the server started
# TYPE kubelet_http_requests_total counter
请求kubelet的请求数统计,按照method path server_type统计

# HELP kubelet_managed_ephemeral_containers [ALPHA] Current number of ephemeral containers in pods managed by this kubelet. Ephemeral containers will be ignored if disabled by the EphemeralContainers feature gate, and this number will be 0.
# TYPE kubelet_managed_ephemeral_containers gauge
当前kubelet管理的临时容器数量

# HELP kubelet_network_plugin_operations_duration_seconds [ALPHA] Latency in seconds of network plugin operations. Broken down by operation type.
# TYPE kubelet_network_plugin_operations_duration_seconds histogram
网络插件的操作耗时分布 ，按照操作类型(operation_type)统计
如果 --feature-gates=EphemeralContainers=false，否则一直为0

# HELP kubelet_network_plugin_operations_errors_total [ALPHA] Cumulative number of network plugin operation errors by operation type.
# TYPE kubelet_network_plugin_operations_errors_total counter
网络插件累计操作错误数统计，按照操作类型(operation_type)统计

# HELP kubelet_network_plugin_operations_total [ALPHA] Cumulative number of network plugin operations by operation type.
# TYPE kubelet_network_plugin_operations_total counter
网络插件累计操作数统计，按照操作类型(operation_type)统计

# HELP kubelet_node_name [ALPHA] The node's name. The count is always 1.
# TYPE kubelet_node_name gauge
node name

# HELP kubelet_pleg_discard_events [ALPHA] The number of discard events in PLEG.
# TYPE kubelet_pleg_discard_events counter
PLEG(pod lifecycle event generator) 丢弃的event数统计

# HELP kubelet_pleg_last_seen_seconds [ALPHA] Timestamp in seconds when PLEG was last seen active.
# TYPE kubelet_pleg_last_seen_seconds gauge
PLEG上次活跃的时间戳

# HELP kubelet_pleg_relist_duration_seconds [ALPHA] Duration in seconds for relisting pods in PLEG.
# TYPE kubelet_pleg_relist_duration_seconds histogram
PLEG relist pod时间分布

# HELP kubelet_pleg_relist_interval_seconds [ALPHA] Interval in seconds between relisting in PLEG.
# TYPE kubelet_pleg_relist_interval_seconds histogram
PLEG relist 间隔时间分布

# HELP kubelet_pod_start_duration_seconds [ALPHA] Duration in seconds for a single pod to go from pending to running.
# TYPE kubelet_pod_start_duration_seconds histogram
pod启动时间(从pending到running)分布
kubelet watch到pod时到pod中contianer都running后
（watch各种source channel的pod变更）

# HELP kubelet_pod_worker_duration_seconds [ALPHA] Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
# TYPE kubelet_pod_worker_duration_seconds histogram
pod状态变化的时间分布， 按照操作类型（create update sync）统计
worker就是kubelet中处理一个pod的逻辑工作单位

# HELP kubelet_pod_worker_start_duration_seconds [ALPHA] Duration in seconds from seeing a pod to starting a worker.
# TYPE kubelet_pod_worker_start_duration_seconds histogram
kubelet watch到pod到worker启动的时间分布

# HELP kubelet_run_podsandbox_duration_seconds [ALPHA] Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
# TYPE kubelet_run_podsandbox_duration_seconds histogram
启动sandbox的时间分布

# HELP kubelet_run_podsandbox_errors_total [ALPHA] Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
# TYPE kubelet_run_podsandbox_errors_total counter
启动sanbox出现error的总数

# HELP kubelet_running_containers [ALPHA] Number of containers currently running
# TYPE kubelet_running_containers gauge
当前containers运行状态的统计
按照container状态统计，created running exited

# HELP kubelet_running_pods [ALPHA] Number of pods that have a running pod sandbox
# TYPE kubelet_running_pods gauge
当前处于running状态pod数量

# HELP kubelet_runtime_operations_duration_seconds [ALPHA] Duration in seconds of runtime operations. Broken down by operation type.
# TYPE kubelet_runtime_operations_duration_seconds histogram
容器运行时的操作耗时
(container在create list exec remove stop等的耗时)

# HELP kubelet_runtime_operations_errors_total [ALPHA] Cumulative number of runtime operation errors by operation type.
# TYPE kubelet_runtime_operations_errors_total counter
容器运行时的操作错误数统计(按操作类型统计)

# HELP kubelet_runtime_operations_total [ALPHA] Cumulative number of runtime operations by operation type.
# TYPE kubelet_runtime_operations_total counter
容器运行时的操作总数统计(按操作类型统计)

# HELP kubelet_started_containers_errors_total [ALPHA] Cumulative number of errors when starting containers
# TYPE kubelet_started_containers_errors_total counter
kubelet启动容器错误总数统计(按code和container_type统计)
code包括ErrImagePull ErrImageInspect ErrImagePull ErrRegistryUnavailable ErrInvalidImageName等
container_type一般为"container" "podsandbox"

# HELP kubelet_started_containers_total [ALPHA] Cumulative number of containers started
# TYPE kubelet_started_containers_total counter
kubelet启动容器总数

# HELP kubelet_started_pods_errors_total [ALPHA] Cumulative number of errors when starting pods
# TYPE kubelet_started_pods_errors_total counter
kubelet启动pod遇到的错误总数(只有创建sandbox遇到错误才会统计）

# HELP kubelet_started_pods_total [ALPHA] Cumulative number of pods started
# TYPE kubelet_started_pods_total counter
kubelet启动的pod总数

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
统计cpu使用率

# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
允许进程打开的最大fd数

# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
当前打开的fd数量

# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
进程驻留内存大小

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
进程启动时间

# HELP rest_client_request_duration_seconds [ALPHA] Request latency in seconds. Broken down by verb and URL.
# TYPE rest_client_request_duration_seconds histogram
请求apiserver的耗时统计(按照url和请求类型统计verb)

# HELP rest_client_requests_total [ALPHA] Number of HTTP requests, partitioned by status code, method, and host.
# TYPE rest_client_requests_total counter
请求apiserver的总次数(按照返回码code和请求类型method统计)

# HELP storage_operation_duration_seconds [ALPHA] Storage operation duration
# TYPE storage_operation_duration_seconds histogram
存储操作耗时(按照存储plugin(configmap emptydir hostpath 等 )和operation_name分类统计)
# HELP volume_manager_total_volumes [ALPHA] Number of volumes in Volume Manager
# TYPE volume_manager_total_volumes gauge
本机挂载的volume数量统计(按照plugin_name和state统计
plugin_name包括"host-path" "empty-dir" "configmap" "projected")
state(desired_state_of_world期状态/actual_state_of_world实际状态）

kubernetes控制面指标梳理

4 Sep, 2022

kubernetes组件指标梳理

本文梳理指标对应的的kubernetes版本为1.23.1, etcd版本为3.5.1

kube-apiserver 指标

kuber-apiserver暴露了148个指标，梳理后比较重要的指标如下。

# HELP apiserver_request_duration_seconds [STABLE] Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.
# TYPE apiserver_request_duration_seconds histogram
apiserver响应的时间分布，按照url 和 verb 分类
一般按照instance和verb+时间 汇聚

# HELP apiserver_request_total [STABLE] Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
# TYPE apiserver_request_total counter
apiserver的请求总数，按照verb、 version、 group、resource、scope、component、 http返回码分类统计

# HELP apiserver_current_inflight_requests [STABLE] Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
# TYPE apiserver_current_inflight_requests gauge
当前最大请求数(利用channel大小限流), 按mutating(非get list watch的请求) 和 readOnly (get list watch)分别限制
超过max-requests-inflight(默认值400)  和 max-mutating-requests-inflight（默认200） 的请求会被限流
apiserver变更时要注意观察，也是反馈集群容量的一个重要指标

# HELP apiserver_response_sizes [STABLE] Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.
# TYPE apiserver_response_sizes histogram
apiserver 响应大小，单位byte, 按照verb、 version、 group、resource、scope、component分类统计

# HELP watch_cache_capacity [ALPHA] Total capacity of watch cache broken by resource type.
# TYPE watch_cache_capacity gauge
按照资源类型统计的watch缓存大小

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
每秒钟用户态和系统态cpu消耗时间, 计算apiserver进程的cpu的使用率

# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
apiserver的内存使用量（单位:Byte)

# HELP workqueue_adds_total [ALPHA] Total number of adds handled by workqueue
# TYPE workqueue_adds_total counter
apiserver中包含的controller的工作队列，已处理的任务总数

# HELP workqueue_depth [ALPHA] Current depth of workqueue
# TYPE workqueue_depth gauge
apiserver中包含的controller的工作队列深度，表示当前队列中要处理的任务的数量，数值越小越好 
例如APIServiceRegistrationController admission_quota_controller

kubernetes环境鉴权与自动发现

20 Aug, 2022

kubernetes环境鉴权与自动发现

概览文章中提到了k8s的鉴权模式，简单回顾下：

RBAC： Role-based access control 是基于角色的访问控制
ABAC： Atrribute-based access control 是基于属性的访问控制
Node Authorization：节点鉴权，专门用户kubelet发出的api请求进行鉴权
Webhook Authorization： webhook是一种http回调，kube-apiserver配置webhook时，会设置回调webhook的规则，这些规则中包含了调用的api group、version、operation、scope等信息。

有细心的小伙伴指出，RBAC的角色可以作为ABAC的属性来配置。感谢小伙伴指正，ABAC可以更细粒度的控制权限，相应配置起来也更复杂。

kubernetes 鉴权

选定RBAC模式后，关于角色，有Role和ClusterRole，对应对象的绑定分别为: RoleBinding 和 ClusterRoleBinding。 Role创建后归属于特定的namespace，一般与特定namespace的权限绑定，而ClusterRole 不属于任何namespace，通常与一组权限绑定。

ClusterRole通常用于 + 定义指定namespace资源的访问权限，并在某个namespace范围内授予访问权限； + 定义指定namespace资源的访问权限，并在跨namespace范围内授予访问权限； + 定义集群范围内的资源访问权限。

官方文档推荐，如果在单个namespace内定义角色则使用Role，如果是定义集群范围的角色，则使用ClusterRole。要监控kubernetes组件和集群范围内业务以及为了通用性，所以我们选择ClusterRole 和 ClusterRoleBinding。

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19