No one sees the real me

Virtualization platform infrastructure SE

[OpenShift] Installing metrics-server

What is metrics-server?

Before OpenShift 3.11, HPA (Horizontal Pod Autoscaler) and the oc adm top command relied on the heapster component in the openshift-infra project.
In 3.11, the Hawkular Metrics stack, including heapster, is deprecated, and metrics-server is provided in its place.

× openshift-infra (metrics-server does not run in this project)

Installing metrics-server

# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/metrics-server/config.yml -e openshift_metrics_server_install=true

PLAY RECAP **********************************************************************************************************************************************************************************************************************************
all-openshift              : ok=120  changed=15   unreachable=0    failed=0
localhost                  : ok=12   changed=0    unreachable=0    failed=0


INSTALLER STATUS ****************************************************************************************************************************************************************************************************************************
Initialization          : Complete (0:01:29)
metrics-server Install  : Complete (0:03:51)
[root@all-openshift ~]#

A pod named metrics-server-xxxx was created in a project named openshift-metrics-server.

[root@all-openshift ~]# oc get pods --all-namespaces
NAMESPACE                           NAME                                           READY     STATUS      RESTARTS   AGE
default                             docker-registry-1-vrktc                        1/1       Running     1          21h
default                             registry-console-1-5t6zw                       1/1       Running     1          21h
default                             router-1-7z8rb                                 1/1       Running     2          21h
devproject                          httpd-example-1-4vlcf                          1/1       Running     0          1h
devproject                          httpd-example-1-build                          0/1       Completed   0          1h
kube-service-catalog                apiserver-66d66                                1/1       Running     7          21h
kube-service-catalog                controller-manager-r25kc                       1/1       Running     12         21h
kube-system                         master-api-all-openshift                       1/1       Running     5          21h
kube-system                         master-controllers-all-openshift               1/1       Running     7          21h
kube-system                         master-etcd-all-openshift                      1/1       Running     1          21h
openshift-ansible-service-broker    asb-1-deploy                                   0/1       Error       0          21h
openshift-console                   console-54cf74f88b-l44ph                       1/1       Running     1          21h
openshift-metrics-server            metrics-server-7bff797c9c-phggx                1/1       Running     0          9m ★ this one only
openshift-monitoring                alertmanager-main-0                            3/3       Running     4          21h
openshift-monitoring                alertmanager-main-1                            3/3       Running     4          21h
openshift-monitoring                alertmanager-main-2                            3/3       Running     3          21h
openshift-monitoring                cluster-monitoring-operator-8578656f6f-tpswg   1/1       Running     2          21h
openshift-monitoring                grafana-6b9f85786f-w99k7                       2/2       Running     2          21h
openshift-monitoring                kube-state-metrics-c4f86b5f8-qh9m2             3/3       Running     6          21h
openshift-monitoring                node-exporter-qm82r                            2/2       Running     2          21h
openshift-monitoring                prometheus-k8s-0                               4/4       Running     5          21h
openshift-monitoring                prometheus-k8s-1                               4/4       Running     5          21h
openshift-monitoring                prometheus-operator-6644b8cd54-n69h9           1/1       Running     3          21h
openshift-node                      sync-vx8m6                                     1/1       Running     1          21h
openshift-sdn                       ovs-6ght9                                      1/1       Running     1          21h
openshift-sdn                       sdn-mzmd4                                      1/1       Running     1          21h
openshift-template-service-broker   apiserver-97sch                                1/1       Running     9          21h
openshift-web-console               webconsole-b5c499c98-5k878                     1/1       Running     4          21h
[root@all-openshift ~]#

rheb.hatenablog.com

varu3.hatenablog.com

akubicharm.hatenablog.com

Where I got stuck

CPU Utilization could not be retrieved...

The figure here is borrowed from an external source.
I assume it can be read with metrics-server in place of heapster.

f:id:naoki_1123:20200127215721p:plain


Management flows DeploymentConfig → ReplicationController → Pod, and metrics-server appears to collect its data from cAdvisor on each node (kubelets including cAdvisor).

https://blog.openshift.com/wp-content/uploads/OpenShift-Commons-Whats-New-in-OpenShift-Container-Platform-3.11.pdf

Creating the HPA itself went smoothly.

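In my case, I had not set the Resource Request, the Limit, and the threshold together on the DC,
which is probably why CPU utilization could not be retrieved.
I also got the DC's rollout history mixed up and applied the limits to the wrong revision...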

[root@all-openshift ~]# oc -n devproject autoscale dc/httpd-example --min=1 --max=5 --cpu-percent=10
horizontalpodautoscaler.autoscaling/httpd-example autoscaled
[root@all-openshift ~]#
[root@all-openshift ~]#
[root@all-openshift ~]# oc -n devproject get hpa
NAME            REFERENCE                        TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
httpd-example   DeploymentConfig/httpd-example   <unknown>/10%   1         5         1          2m
[root@all-openshift ~]#
[root@all-openshift ~]#
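
To spot this state from a script rather than by eye, the TARGETS column can be checked for <unknown>. A minimal sketch, assuming the default column order shown above (NAME, REFERENCE, TARGETS, ...); the function name hpa_unknown is hypothetical and not part of oc:

```shell
# hpa_unknown: reads `oc get hpa` output on stdin and prints the name of
# every HPA whose TARGETS column still starts with <unknown>.
# Exit status is 0 when at least one such HPA exists, 1 otherwise.
# Assumption: default column order NAME REFERENCE TARGETS ...
hpa_unknown() {
  awk 'NR > 1 && $3 ~ /^<unknown>/ { print $1; found = 1 }
       END { exit (found ? 0 : 1) }'
}

# Usage against a live cluster (not run here):
#   oc -n devproject get hpa | hpa_unknown && echo "metrics not available yet"
```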

※※※※※※※※※※※※※※※※※※※※
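※ (Important) Without a CPU request on the containers, TARGETS stays <unknown> and Utilization cannot be retrieved. Setting both the request and the limit, as done later in this post, avoids this.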
※※※※※※※※※※※※※※※※※※※※
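
Why the request matters: the target is "a percent of requested CPU" (the wording of the --cpu-percent help text), so with no request there is nothing to divide by. Below is a simplified sketch of the arithmetic in shell integer math. It ignores the tolerance band and the averaging across pods that the real controller applies, and the function names are made up for illustration:

```shell
# cpu_utilization USAGE_MILLICORES REQUEST_MILLICORES
# Prints usage as an integer percentage of the request. With no request
# (empty or 0) there is no denominator, so it prints <unknown> and fails.
cpu_utilization() {
  if [ -z "$2" ] || [ "$2" -eq 0 ]; then
    echo "<unknown>"
    return 1
  fi
  echo $(( $1 * 100 / $2 ))
}

# desired_replicas CURRENT_REPLICAS CURRENT_PERCENT TARGET_PERCENT MIN MAX
# Computes ceil(current * currentPercent / targetPercent), then clamps the
# result to the [MIN, MAX] range given to `oc autoscale`.
# Note: uses the global variable `want` as scratch space.
desired_replicas() {
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  [ "$want" -lt "$4" ] && want=$4
  [ "$want" -gt "$5" ] && want=$5
  echo "$want"
}
```

With the values used in this post, a pod pinned at a 50m limit with a 50m request is at 100%, so `desired_replicas 1 100 10 1 5` asks for 10 replicas and is capped at the maximum of 5.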

# The help output, for reference.
[root@all-openshift ~]# oc -n devproject autoscale -h
Autoscale a deployment config or replication controller.

Looks up a deployment config or replication controller by name and creates an autoscaler that uses this deployment
config or replication controller as a reference. An autoscaler can automatically increase or decrease number of pods
deployed within the system as needed.

Usage:
  oc autoscale (-f FILENAME | TYPE NAME | TYPE/NAME) [--min=MINPODS] --max=MAXPODS [--cpu-percent=CPU] [flags]

Examples:
  # Auto scale a deployment config "foo", with the number of pods between 2 to
  # 10, target CPU utilization at a default value that server applies:
  oc autoscale dc/foo --min=2 --max=10

  # Auto scale a replication controller "foo", with the number of pods between
  # 1 to 5, target CPU utilization at 80%
  oc autoscale rc/foo --max=5 --cpu-percent=80

Options:
      --allow-missing-template-keys=true: If true, ignore any errors in templates when a field or map key is missing in
the template. Only applies to golang and jsonpath output formats.
      --cpu-percent=-1: The target average CPU utilization (represented as a percent of requested CPU) over all the
pods. If it's not specified or negative, a default autoscaling policy will be used.
      --dry-run=false: If true, only print the object that would be sent, without sending it.
  -f, --filename=[]: Filename, directory, or URL to files identifying the resource to autoscale.
      --generator='horizontalpodautoscaler/v1': The name of the API generator to use. Currently there is only 1
generator.
      --max=-1: The upper limit for the number of pods that can be set by the autoscaler. Required.
      --min=-1: The lower limit for the number of pods that can be set by the autoscaler. If it's not specified or
negative, the server will apply a default value.
      --name='': The name for the newly created object. If not specified, the name of the input resource will be used.
  -o, --output='': Output format. One of:
json|yaml|name|templatefile|template|go-template|go-template-file|jsonpath|jsonpath-file.
      --record=false: Record current kubectl command in the resource annotation. If set to false, do not record the
command. If set to true, record the command. If not set, default to updating the existing annotation value only if one
already exists.
  -R, --recursive=false: Process the directory used in -f, --filename recursively. Useful when you want to manage
related manifests organized within the same directory.
      --save-config=false: If true, the configuration of current object will be saved in its annotation. Otherwise, the
annotation will be unchanged. This flag is useful when you want to perform kubectl apply on this object in the future.
      --template='': Template string or path to template file to use when -o=go-template, -o=go-template-file. The
template format is golang templates [http://golang.org/pkg/text/template/#pkg-overview].

Use "oc options" for a list of global command-line options (applies to all commands).

Next, let's look at the GUI.
It tells us to set Resource Requests and Limits.

f:id:naoki_1123:20200127221720p:plain

Let's enter some arbitrary values.
f:id:naoki_1123:20200127222046j:plain
And this is the pattern that does not work.

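Follow-up: after waiting about 10 minutes, the HPA did pick up the metrics. Sometimes it picks them up immediately. (I don't really understand why.)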

[root@all-openshift ~]# oc -n devproject get event
~
3m          9m           13        httpd-example.15edc07feba52da4   HorizontalPodAutoscaler   Warning   FailedGetResourceMetric        horizontal-pod-autoscaler   missing request for cpu on container httpd-example in pod devproject/httpd-example-1-4vlcf
3m          9m           13        httpd-example.15edc07fed041902   HorizontalPodAutoscaler   Warning   FailedComputeMetricsReplicas   horizontal-pod-autoscaler   failed to get cpu utilization: missing request for cpu on container httpd-example in pod devproject/httpd-example-1-4vlcf
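
Since the reason is spelled out in the event message, it can be pulled out of a long event list with a small filter. This is a sketch only, assuming the message wording shown above; hpa_missing_requests is a hypothetical helper name:

```shell
# hpa_missing_requests: reads `oc get event` output on stdin and prints one
# "NAMESPACE/POD CONTAINER" line for each pod/container pair that the HPA
# reported as having no CPU request. Relies on the message text
# "missing request for cpu on container X in pod NS/POD".
hpa_missing_requests() {
  sed -n 's/.*missing request for cpu on container \([^ ]*\) in pod \([^ ]*\).*/\2 \1/p' |
    sort -u
}

# Usage against a live cluster (not run here):
#   oc -n devproject get event | hpa_missing_requests
```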


So, if the HPA is added to a DC that already has Resource Limits / Requests (either set beforehand or already present in the DC), it seems to pick up the metrics with almost no wait.
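
That ordering can be enforced in a script by checking for the CPU request before creating the HPA. A minimal sketch: ensure_cpu_request is a hypothetical helper, the jsonpath assumes the first container is the one to check, and the 50m values are simply the ones used in this post:

```shell
# ensure_cpu_request VALUE
# VALUE is the CPU request read from the DC (may be empty).
# Succeeds and echoes the value when a request is set; otherwise prints a
# hint on stderr and returns 1 so the caller can stop before autoscaling.
ensure_cpu_request() {
  if [ -z "$1" ]; then
    echo "no CPU request set; add one before creating the HPA" >&2
    return 1
  fi
  echo "CPU request: $1"
}

# Usage against a live cluster (not run here):
#   req=$(oc -n devproject get dc httpd-example \
#         -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}')
#   ensure_cpu_request "$req" &&
#     oc -n devproject autoscale dc/httpd-example --min=1 --max=5 --cpu-percent=10
#
# If the request is missing, it can be added first, for example:
#   oc -n devproject set resources dc/httpd-example \
#     --requests=cpu=50m,memory=512Mi --limits=cpu=50m,memory=512Mi
```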


# Let's recreate it from scratch.

[root@all-openshift ~]# oc -n devproject delete hpa httpd-example
horizontalpodautoscaler.autoscaling "httpd-example" deleted
[root@all-openshift ~]#
[root@all-openshift ~]# oc -n devproject edit dc httpd-example -o yaml
Edit cancelled, no changes made.
[root@all-openshift ~]#
[root@all-openshift ~]# oc -n devproject get dc httpd-example -o yaml
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
~
        resources:
          limits:
            cpu: 50m
            memory: 512Mi
          requests:
            cpu: 50m
            memory: 512Mi
~
[root@all-openshift ~]#
[root@all-openshift ~]# oc -n devproject autoscale dc/httpd-example --min=1 --max=5 --cpu-percent=10
horizontalpodautoscaler.autoscaling/httpd-example autoscaled
[root@all-openshift ~]#
[root@all-openshift ~]#


# Let's watch it scale up automatically.

[root@all-openshift ~]# oc -n devproject get pods
NAME                    READY     STATUS      RESTARTS   AGE
httpd-example-1-4vlcf   1/1       Running     0          1h
httpd-example-1-build   0/1       Completed   0          1h
[root@all-openshift ~]#
[root@all-openshift ~]# oc -n devproject rsh httpd-example-1-4vlcf bash
bash-4.2$
bash-4.2$ yes > /dev/null &
[1] 99
bash-4.2$ top
bash-4.2$
[root@all-openshift ~]# oc -n devproject get pods
NAME                    READY     STATUS      RESTARTS   AGE
httpd-example-1-build   0/1       Completed   0          2h
httpd-example-2-bjrsp   1/1       Running     1          22m
httpd-example-2-k8m2d   1/1       Running     0          24m
httpd-example-2-s694z   1/1       Running     0          6m
httpd-example-2-tj2ch   1/1       Running     0          6m
httpd-example-2-wz6n4   1/1       Running     3          22m
[root@all-openshift ~]#


The autoscaler has increased the number of pods.
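
To follow the scale-up without rereading the table each time, the Running pods can be counted. A small sketch, assuming the default `oc get pods` columns (NAME READY STATUS ...); count_running is a hypothetical name:

```shell
# count_running: reads `oc get pods` output on stdin and prints how many
# rows have STATUS equal to Running (the header row is skipped).
# Completed build pods such as httpd-example-1-build are not counted.
count_running() {
  awk 'NR > 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (not run here), polling every 10 seconds:
#   while true; do
#     oc -n devproject get pods | count_running
#     sleep 10
#   done
```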

f:id:naoki_1123:20200127224047j:plain