A guide to setting up service level objectives (SLOs) on Kubernetes with Prometheus and Linkerd

Ahead of the start of the course "Infrastructure platform based on Kubernetes", we have prepared our traditional translation of a useful article.


Service level objectives (SLOs) are much easier to work with when you have a service mesh

This guide shows how to set up service level objectives (SLOs) on Kubernetes using Prometheus, an open source monitoring system, and Linkerd, an open source service mesh.

, , , SLO Kubernetes .

SLOs for Kubernetes

SLO, , . , Google SRE, SLO , , , .

, , , SLO : SLO , , .  Kubernetes, . , SLO , . (. SLO Kubernetes.)

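On Kubernetes, a service mesh makes SLOs much easier to get started with. Linkerd records golden metrics (success rate, request rate, and latency) for every meshed service, and these are a natural basis for SLOs.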

(, SLO, , , . , , , , SLO .)

SLOs with Linkerd and Prometheus

In this guide we will set up an SLO for a gRPC service running on Kubernetes.

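Linkerd instruments the HTTP and gRPC traffic of meshed pods and exports the resulting metrics to Prometheus. The Prometheus instance used here is the one installed with Linkerd.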

, , , Linkerd Prometheus, SLO.

Setup: Linkerd and Kubernetes

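You will need a Kubernetes cluster and a kubectl configured to talk to it. We will install the Linkerd CLI first, then Linkerd itself.

Install the Linkerd CLI: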

curl -sL https://run.linkerd.io/install | sh
export PATH=$PATH:$HOME/.linkerd2/bin

(Linkerd   Linkerd.)

Next, check that your Kubernetes cluster is ready for Linkerd, install Linkerd, and verify the installation:

linkerd check --pre
linkerd install | kubectl apply -f -
linkerd check

Finally, install the Emojivoto demo application with the Linkerd proxy injected:

curl -sL https://run.linkerd.io/emojivoto.yml \
  | linkerd inject - \
  | kubectl apply -f -

. SLO: .

 โ€” , , SLO. ?

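Say we want the success rate over the last 7 days to be at least 80%. That is our SLO. It has three parts: a service level indicator (SLI), which is the measurement; an objective, which is the threshold; and a time window. In our case:

SLI: success rate

Objective: 80%

Time window: 7 days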

This SLO allows up to 20% of requests to fail within the 7-day window. The error budget tracks how much of that 20% has been "consumed".

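If every request in the last 7 days succeeded, 100% of the error budget remains. If exactly 80% succeeded, 0% remains. If fewer than 80% succeeded, the budget is negative and the SLO is violated.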

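The formula is:

Error budget = 1 - [(1 - compliance) / (1 - objective)]

Here compliance is the SLI measured over the time window. In our case, the SLI is the success rate.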
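As a minimal sketch of this arithmetic, assuming the formula error budget = 1 - (1 - compliance) / (1 - objective) that the PromQL query later in this guide implements, the calculation fits in a few lines of Python. The function name `error_budget` and its parameters are illustrative only and are not part of Linkerd or Prometheus.

```python
def error_budget(compliance: float, objective: float) -> float:
    """Return the fraction of the error budget that remains.

    compliance: measured SLI over the window (e.g. 0.95 for a 95% success rate)
    objective:  target for the SLI (e.g. 0.80 for 80%)
    """
    if not 0.0 <= objective < 1.0:
        # An objective of 1.0 leaves no budget to divide by.
        raise ValueError("objective must be in [0, 1)")
    return 1.0 - (1.0 - compliance) / (1.0 - objective)


if __name__ == "__main__":
    # Walk through a few success rates against an 80% objective.
    for compliance in (1.0, 0.9, 0.8, 0.7):
        remaining = error_budget(compliance, 0.8)
        print(f"success rate {compliance:.0%}: budget remaining {remaining:+.0%}")
```

With an 80% objective, a 100% success rate leaves the whole budget, 80% leaves none, and anything below 80% produces a negative value.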

Prometheus

To query the Prometheus instance installed with Linkerd, first find its pod:

# Get the name of the prometheus pod
$ kubectl -n linkerd get pods
NAME                                      READY   STATUS    RESTARTS   AGE
..
linkerd-prometheus-54dd7dd977-zrgqw       2/2     Running   0          16h

Substitute your pod's name for PODNAME and forward the port:

kubectl -n linkerd port-forward linkerd-prometheus-PODNAME 9090:9090

Now open localhost:9090 to run PromQL queries in the Prometheus dashboard.
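The same queries can also be run from a script through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below assumes the port-forward above is active on localhost:9090. The helper names (`instant_query_url`, `parse_vector`, `run_query`) are hypothetical, and only the Python standard library is used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS = "http://localhost:9090"  # assumption: the port-forward from this guide


def instant_query_url(promql: str, base_url: str = PROMETHEUS) -> str:
    """Build the URL for an instant query against /api/v1/query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(payload: dict) -> list:
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return [
        # Each sample's value is a [timestamp, "numeric string"] pair.
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


def run_query(promql: str) -> list:
    """Execute the query; this needs a reachable Prometheus."""
    with urlopen(instant_query_url(promql)) as response:
        return parse_vector(json.load(response))
```

Calling `run_query('response_total{namespace="emojivoto"}')` while the port-forward is running should return one (labels, value) pair per time series.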

Prometheus dashboard

Prometheus

100 80 %  โ€” . , Prometheus. Emojivoto, emojivoto .

Start with the response counts for the voting deployment:

Query:

response_total{deployment="voting", direction="inbound", namespace="emojivoto"}

Result:

response_total{classification="success",deployment="voting",direction="inbound",namespace="emojivoto",..} 46499
response_total{classification="failure",deployment="voting",direction="inbound",namespace="emojivoto",..} 8652

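The two time series differ in a single label: classification. So far there have been 46,499 successful responses and 8,652 failed ones.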
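To make the relationship between the two counters concrete, the lifetime success ratio is the success count divided by the sum of both counts. The sketch below is illustrative only: the function name is hypothetical, and the two values are copied from the sample output above.

```python
def success_ratio(successes: float, failures: float) -> float:
    """Fraction of responses classified as successful."""
    total = successes + failures
    if total == 0:
        raise ValueError("no responses recorded")
    return successes / total


# Counter values taken from the sample output above.
ratio = success_ratio(46499, 8652)
print(f"lifetime success ratio: {ratio:.2%}")
```

This ratio covers the whole lifetime of the counters rather than a 7-day window, which is why the queries that follow add a range and use increase().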

To restrict this to successful responses over the last 7 days, add classification="success" and the [7d] range:

Query:

response_total{deployment="voting", classification="success", direction="inbound", namespace="emojivoto"}[7d]

To turn this into a single count, apply the PromQL functions increase() and sum(), grouping by the labels of interest:

Query:

sum(increase(response_total{deployment="voting", classification="success", direction="inbound", namespace="emojivoto"}[7d])) by (namespace, deployment, classification, tls)

Result:

{classification="success",deployment="voting",namespace="emojivoto",tls="true"} 26445.68142198795

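So over the last 7 days there were about 26,445 successful responses (the fractional value is a result of how increase() computes its estimate).

To get the success rate, divide this by the total number of responses, which is the same query without classification="success":

Query: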

sum(increase(response_total{deployment="voting", classification="success", direction="inbound", namespace="emojivoto"}[7d])) by (namespace, deployment, classification, tls) / ignoring(classification) sum(increase(response_total{deployment="voting", direction="inbound", namespace="emojivoto"}[7d])) by (namespace, deployment, tls)

Result:

{deployment="voting",namespace="emojivoto",tls="true"} 0.846113068695625

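So over the last 7 days, 84.61% of responses were successful.

With the success rate in hand, we can compute the error budget. Recall the formula:

Error budget = 1 - [(1 - compliance) / (1 - objective)]

Substituting our objective of 80% (0.8):

Query: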

1 - ((1 - (sum(increase(response_total{deployment="voting", classification="success", direction="inbound", namespace="emojivoto"}[7d])) by (namespace, deployment, classification, tls)) / ignoring(classification) sum(increase(response_total{deployment="voting", direction="inbound", namespace="emojivoto"}[7d])) by (namespace, deployment, tls)) / (1 - .80))

Result:

{deployment="voting",namespace="emojivoto",tls="true"} 0.2312188519042635

23.12% of the error budget remains.

Grafana

Numbers are useful, but what about graphs? Linkerd installs Grafana, and you can reach it from the Linkerd dashboard.

To open the Linkerd dashboard, run linkerd dashboard.

Click the Grafana icon for the emojivoto namespace to open the matching Grafana dashboard.

Linkerd dashboard with Grafana integration

 deploy/voting, : , . .

Linkerd in the Grafana dashboard

Add a new panel named 7-day error budget (success rate) and enter the PromQL query from the previous section.

Error budget in Grafana with Linkerd metrics

, , , PromQL, rate(), .

, -, . (Gauge) , , .

7-day error budget (success rate) as a Gauge.

To show the error budget for every service in emojivoto, remove the deployment="voting" selector from the query. The objective stays at 80%.
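The error budget query used in this guide can be generated for any namespace, deployment, objective, and window. The following is a sketch with a hypothetical helper, `error_budget_query`. It assumes the same Linkerd metric and labels as the query shown earlier. Leaving out the deployment drops that selector, so the result should contain one series per deployment in the namespace.

```python
from typing import Optional


def error_budget_query(namespace: str, objective: float,
                       deployment: Optional[str] = None,
                       window: str = "7d") -> str:
    """Build the PromQL error-budget expression used in this guide."""
    selectors = ['direction="inbound"', f'namespace="{namespace}"']
    if deployment is not None:
        # Restrict the query to a single deployment when one is given.
        selectors.insert(0, f'deployment="{deployment}"')
    common = ", ".join(selectors)

    # Successful responses over the window.
    successes = (
        f'sum(increase(response_total{{classification="success", {common}}}[{window}])) '
        "by (namespace, deployment, classification, tls)"
    )
    # All responses over the window.
    total = (
        f"sum(increase(response_total{{{common}}}[{window}])) "
        "by (namespace, deployment, tls)"
    )
    compliance = f"({successes}) / ignoring(classification) {total}"
    return f"1 - ((1 - {compliance}) / (1 - {objective}))"


if __name__ == "__main__":
    print(error_budget_query("emojivoto", 0.8, deployment="voting"))
    print(error_budget_query("emojivoto", 0.8))
```

The first call prints a query equivalent to the one for the voting deployment, and the second prints the namespace-wide variant.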

7-day error budget (success rate) for all services.

SLO

SLO Linkerd, Grafana. !

?

, , SLO. . , .  , . SLO .

Buoyant SLO, Kubernetes. ,   Dive, SLO . Dive Linkerd , , . Dive , ,  , SLO, .

Dive dashboard showing SLO compliance and the 7-day error budget.

,  โ€” Dive SLO Linkerd Prometheus Grafana, , โ€” SLO!


