Fast and flexible observability with canonical log lines

In posts on Habr, the topic of structured logging comes up often, but only in passing. So when I came across this detailed article by Brandur Leach of Stripe, I decided to translate it and share it with the community.





Badoo . — , id , — . , — , .





Brandur Leach , . — Stripe , , — ( ).





!






— . , «». .





— , - « » . , .





Stripe , , , (canonical log lines). : , , . .





, , -, (operational visibility) , . API, -, PCI- (PCI vault) Stripe Dashboard.





API -, . API :





[2019-03-18 22:48:32.990] Request started
[2019-03-18 22:48:32.991] User authenticated
[2019-03-18 22:48:32.992] Rate limiting ran
[2019-03-18 22:48:32.998] Charge created
[2019-03-18 22:48:32.999] Request finished
      
      



, . «» : JSON, , «-» ( logfmt). , .





:





[2019-03-18 22:48:32.990] Request started http_method=POST http_path=/v1/charges request_id=req_123
[2019-03-18 22:48:32.991] User authenticated auth_type=api_key key_id=mk_123 user_id=usr_123
[2019-03-18 22:48:32.992] Rate limiting ran rate_allowed=true rate_quota=100 rate_remaining=99
[2019-03-18 22:48:32.998] Charge created charge_id=ch_123 permissions_used=account_write team=acquiring
[2019-03-18 22:48:32.999] Request finished alloc_count=9123 database_queries=34 duration=0.009 http_status=200
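
Structured lines in the key=value style shown above can be produced with a small formatting helper. The following is a minimal Ruby sketch, not Stripe's implementation; the name `format_logfmt` and its quoting rule are assumptions made for illustration.

```ruby
# Hypothetical helper: formats a message plus key/value pairs as a
# logfmt-style line such as "Request started http_method=POST".
def format_logfmt(message, fields = {})
  pairs = fields.map do |key, value|
    text = value.to_s
    # Quote empty values and values containing whitespace, quotes, or '='
    # so that the line can still be split into tokens unambiguously.
    text = text.inspect if text.empty? || text.match?(/[\s"=]/)
    "#{key}=#{text}"
  end
  ([message] + pairs).join(" ")
end

puts format_logfmt("Request started",
                   { http_method: "POST", http_path: "/v1/charges", request_id: "req_123" })
```

Keeping the formatting in one place means every log call emits fields in the same shape, which is what makes them searchable later.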
      
      



( - , , ). , . 





, , API . Splunk :





"Request started" | head
      
      



, - API:





"Rate limiting ran" rate_allowed=false
      
      



API :





"Request finished" earliest=-1h | stats count p50(duration) p95(duration) p99(duration)
      
      



, Graphite StatsD, . , , , - . .





, — , . , , HTTP- :





"Request started" | stats count by http_path
      
      



API 500 ( ), , , - :





"Request finished" http_status=500 | stats count p50(duration) p95(duration) p99(duration)
      
      



, . , . , .





:

, , , . , (rate limiting) API, : « ?» - , .





, . - . — . , , .





. : ( ) , . :





[2019-03-18 22:48:32.999] canonical-log-line alloc_count=9123 auth_type=api_key database_queries=34 duration=0.009 http_method=POST http_path=/v1/charges http_status=200 key_id=mk_123 permissions_used=account_write rate_allowed=true rate_quota=100 rate_remaining=99 request_id=req_123 team=acquiring user_id=usr_123
      
      



, :





  • HTTP-, ;





  • , ( API, ), API-;





  • (rate limiters), ;





  • , ;





  • , .





, — , IETF URL.





. «» , , , . , , , . . , , , .





:





canonical-log-line rate_allowed=false | stats count by user_id
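
The same per-user count can be reproduced outside a log system, which may help clarify what the query computes. This is a rough Ruby sketch that assumes lines consist of simple unquoted key=value tokens; `parse_logfmt` and `count_by` are hypothetical helper names, not part of Splunk or any library.

```ruby
# Parse one logfmt-style line into a Hash of string keys to string values.
# Simplification: only bare (unquoted) key=value tokens are recognized;
# tokens without '=' such as the timestamp or the line marker are ignored.
def parse_logfmt(line)
  line.scan(/([A-Za-z_][\w.]*)=(\S+)/).to_h
end

# Count lines per value of `field`, keeping only lines that match `filters`.
def count_by(lines, field, filters = {})
  lines.map { |line| parse_logfmt(line) }
       .select { |row| filters.all? { |key, value| row[key] == value } }
       .group_by { |row| row[field] }
       .transform_values(&:size)
end

lines = [
  "canonical-log-line rate_allowed=false user_id=usr_123",
  "canonical-log-line rate_allowed=true user_id=usr_123",
  "canonical-log-line rate_allowed=false user_id=usr_456",
  "canonical-log-line rate_allowed=false user_id=usr_123",
]

# Rate-limited requests grouped by user.
p count_by(lines, "user_id", { "rate_allowed" => "false" })
```

Because every field lives on the same line, the filter and the grouping key can be any combination of fields without joining separate log entries.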
      
      







, , , . , , , .





. charges



, 4, . , , . :





canonical-log-line user_id=usr_123 http_method=POST http_path=/v1/charges http_status!=4* | timechart p50(duration) p95(duration) p99(duration)
      
      



API request duration at the 50th, 95th, and 99th percentiles (generated on the fly from log data)

middleware

, , .





API Stripe middleware . , , , middleware .





:





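class CanonicalLineLogger
  # Rack-style constructor: `app` is the next middleware or the core application
  def initialize(app)
    @app = app
  end

  def call(env)
    # Call into the core application and inner middleware
    status, headers, body = @app.call(env)

    # Emit the canonical line using response status and other
    # information embedded in the request environment
    # (log_canonical_line is assumed to be defined elsewhere)
    log_canonical_line(status, env)

    # Return results upstream
    [status, headers, body]
  end
end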
      
      







, . ensure ( finally Ruby ) , - . begin/rescue ( try/catch), . , ( ).
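
A variant of the middleware that emits the line from an ensure block, and guards the logging call itself with begin/rescue, might look like the sketch below. The `canonical.fields` env key, the injected logger, and the default status of 500 on failure are assumptions for illustration, not details confirmed by the article.

```ruby
require "logger"
require "stringio"

# Sketch of a Rack-style middleware that always emits one canonical line.
class CanonicalLineLogger
  def initialize(app, logger)
    @app = app
    @logger = logger
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Hypothetical env key: inner middleware adds its fields to this Hash.
    env["canonical.fields"] = {}
    status = 500 # assumed status when the app raises before responding
    status, headers, body = @app.call(env)
    [status, headers, body]
  ensure
    # Runs on success and on exceptions. The inner rescue keeps a logging
    # failure from masking the request's own result or error.
    begin
      fields = (env["canonical.fields"] || {}).merge(
        http_status: status,
        duration: (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started).round(3)
      )
      @logger.info("canonical-log-line " + fields.map { |k, v| "#{k}=#{v}" }.join(" "))
    rescue StandardError
      # Deliberately ignored: the response matters more than the log line.
    end
  end
end

# Usage: the inner app contributes a field, the middleware emits the line.
out = StringIO.new
app = lambda do |env|
  env["canonical.fields"][:user_id] = "usr_123"
  [200, {}, ["ok"]]
end
CanonicalLineLogger.new(app, Logger.new(out)).call({})
puts out.string
```

Setting the status before calling the application means a request that raises is still recorded, here with the assumed 500, instead of disappearing from the logs.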





. -, , — , . , , .





Stripe , . , , , . , Google Protocol Buffers.





API Kafka. , S3. Presto Redshift, , .





, . , Go, , API-:





Go version usage (data drawn from the archive of canonical log lines loaded into our data warehouse)

, SQL, , . 





:





SELECT
    DATE_TRUNC('week', created) AS week,
    REGEXP_SUBSTR(language_version, '\\d*\\.\\d*') AS major_minor,
    COUNT(DISTINCT user)
FROM events.canonical_log_lines
WHERE created > CURRENT_DATE - interval '2 months'
    AND language = 'go'
GROUP BY 1, 2
ORDER BY 1, 3 DESC
      
      



Google Protocol Buffers , Stripe . Developer Dashboard, API- .





The Developer Dashboard shows the number of successful API requests for a Stripe account (data generated from canonical log lines archived in S3)

. MapReduce , S3, . , Google Protocol Buffer, .





, . , .





. , .





, . Kubernetes Elasticsearch, GCP — Google Stackdriver Logging. AWS CloudWatch. Fluentd . , : , , .





, - . , , . Kafka , . - Redis. Redshift BigQuery. , .





, .





  • . .





  • , , .





  • Kafka , .





  • . Stripe Developer Dashboard.





— , , , . , , .











