[{"data":1,"prerenderedAt":711},["ShallowReactive",2],{"navigation":3,"\u002Fen\u002Fblog\u002Fkubernetes-monitoring-logs-metrics":294,"\u002Fen\u002Fblog\u002Fkubernetes-monitoring-logs-metrics-surround":706},[4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76,80,84,88,92,96,100,104,108,112,116,120,124,128,132,136,140,144,148,152,156,160,164,168,172,176,180,184,188,207,219,250,287],{"title":5,"path":6,"stem":7},"Build and Deploy a Modern Website in 5 Minutes","\u002Fen\u002Fblog\u002Fbuild-with-loveable","en\u002F3.blog\u002F1.build-with-loveable",{"title":9,"path":10,"stem":11},"The Vercel Alternative for the German Mittelstand: Sovereign Hosting on Hetzner with lowcloud","\u002Fen\u002Fblog\u002Fdigital-sovereignty-lowcloud-vs-vercel-b2b","en\u002F3.blog\u002F10.digital-sovereignty-lowcloud-vs-vercel-b2b",{"title":13,"path":14,"stem":15},"Cloud Sovereignty Framework: How the EU Is Finally Making Cloud Sovereignty Measurable","\u002Fen\u002Fblog\u002Fcloud-sovereignty-framework","en\u002F3.blog\u002F12.cloud-sovereignty-framework",{"title":17,"path":18,"stem":19},"Avoiding Cloud Vendor Lock-in: What Real Sovereignty Means Technically","\u002Fen\u002Fblog\u002Fcloud-vendor-lock-in","en\u002F3.blog\u002F13.cloud-vendor-lock-in",{"title":21,"path":22,"stem":23},"Digital Sovereignty with Kubernetes: When Is Open Source Truly Sovereign?","\u002Fen\u002Fblog\u002Fkubernetes-digital-sovereignty","en\u002F3.blog\u002F14.kubernetes-digital-sovereignty",{"title":25,"path":26,"stem":27},"What Is DevOps as a Service and When Does It Actually Make Sense?","\u002Fen\u002Fblog\u002Fdevops-as-a-service","en\u002F3.blog\u002F15.devops-as-a-service",{"title":29,"path":30,"stem":31},"Cloud Sovereignty Governance: Why This Topic Belongs in the Boardroom, Not the Server Room","\u002Fen\u002Fblog\u002Fcloud-sovereignty-governance","en\u002F3.blog\u002F16.cloud-sovereignty-governance",{"title":33,"path":34,"stem":35},"PaaS vs. DaaS: What","\u002Fen\u002Fblog\u002Fpaas-vs-daas","en\u002F3.blog\u002F17.paas-vs-daas",{"title":37,"path":38,"stem":39},"Sovereign Cloud: Can SaaS Really Maintain Control Over Your Data?","\u002Fen\u002Fblog\u002Fsovereign-cloud-saas-data-control","en\u002F3.blog\u002F18.sovereign-cloud-saas-data-control",{"title":41,"path":42,"stem":43},"DevOps vs. DevOps as a Service – Which One Fits Your Team?","\u002Fen\u002Fblog\u002Fdevops-vs-devops-as-a-service","en\u002F3.blog\u002F19.devops-vs-devops-as-a-service",{"title":45,"path":46,"stem":47},"Docker Fundamentals -  Understanding Container Virtualization","\u002Fen\u002Fblog\u002Fhow-docker-works","en\u002F3.blog\u002F2.how-docker-works",{"title":49,"path":50,"stem":51},"The 7 Biggest DevOps Problems in SMBs – And How to Fix Them","\u002Fen\u002Fblog\u002Fdevops-problems-smb","en\u002F3.blog\u002F20.devops-problems-smb",{"title":53,"path":54,"stem":55},"PostgreSQL Helm Chart: How to Deploy Postgres on Kubernetes","\u002Fen\u002Fblog\u002Fpostgresql-helm-chart-kubernetes","en\u002F3.blog\u002F21.postgresql-helm-chart-kubernetes",{"title":57,"path":58,"stem":59},"Platform Engineering vs. DevOps – What","\u002Fen\u002Fblog\u002Fplatform-engineering-vs-devops","en\u002F3.blog\u002F22.platform-engineering-vs-devops",{"title":61,"path":62,"stem":63},"Cloud Act vs. GDPR: The Risk for EU Businesses","\u002Fen\u002Fblog\u002Fcloud-act-vs-gdpr","en\u002F3.blog\u002F23.cloud-act-vs-gdpr",{"title":65,"path":66,"stem":67},"Cut IT Costs with Automation: The Biggest Lever","\u002Fen\u002Fblog\u002Freduce-it-costs-automation","en\u002F3.blog\u002F24.reduce-it-costs-automation",{"title":69,"path":70,"stem":71},"NIS2 Compliance for DevOps Teams: What You Need to Do","\u002Fen\u002Fblog\u002Fnis2-compliance-devops","en\u002F3.blog\u002F25.nis2-compliance-devops",{"title":73,"path":74,"stem":75},"Self-Hosted EU Alternatives: Host LibreOffice & More","\u002Fen\u002Fblog\u002Fself-hosted-eu-alternatives","en\u002F3.blog\u002F26.self-hosted-eu-alternatives",{"title":77,"path":78,"stem":79},"DORA Compliance for DevOps: What the EU Resilience Act Means","\u002Fen\u002Fblog\u002Fdora-compliance-devops","en\u002F3.blog\u002F27.dora-compliance-devops",{"title":81,"path":82,"stem":83},"Cloud TCO: Hidden Costs AWS, Azure & GCP Don't Show You","\u002Fen\u002Fblog\u002Fcloud-tco-hidden-costs","en\u002F3.blog\u002F28.cloud-tco-hidden-costs",{"title":85,"path":86,"stem":87},"Data Residency vs. Data Sovereignty: What Really Matters","\u002Fen\u002Fblog\u002Fdata-residency-vs-data-sovereignty","en\u002F3.blog\u002F29.data-residency-vs-data-sovereignty",{"title":89,"path":90,"stem":91},"Self-Host n8n on Hetzner: Complete Docker Setup Guide","\u002Fen\u002Fblog\u002Fself-hosted-n8n-on-hetzner","en\u002F3.blog\u002F3.self-hosted-n8n-on-hetzner",{"title":93,"path":94,"stem":95},"Manual Deployments: An Underestimated Risk for SMBs","\u002Fen\u002Fblog\u002Fmanual-deployment-risks","en\u002F3.blog\u002F30.manual-deployment-risks",{"title":97,"path":98,"stem":99},"DevOps Tool Sprawl: How It Happens and How to Stop It","\u002Fen\u002Fblog\u002Fdevops-tool-sprawl","en\u002F3.blog\u002F31.devops-tool-sprawl",{"title":101,"path":102,"stem":103},"Kubernetes Monitoring: Using Logs and Metrics Effectively","\u002Fen\u002Fblog\u002Fkubernetes-monitoring-logs-metrics","en\u002F3.blog\u002F32.kubernetes-monitoring-logs-metrics",{"title":105,"path":106,"stem":107},"OB7 Case Study: Website Deployment Without Infrastructure Overhead","\u002Fen\u002Fblog\u002Fob7-case-study-lowcloud-deployment","en\u002F3.blog\u002F33.ob7-case-study-lowcloud-deployment",{"title":109,"path":110,"stem":111},"DevOps in SMBs: Why Missing Roles Become a Real Risk","\u002Fen\u002Fblog\u002Fmissing-devops-roles-smb","en\u002F3.blog\u002F34.missing-devops-roles-smb",{"title":113,"path":114,"stem":115},"Simplify Kubernetes Configuration: The Path to Human-Readable Cloud","\u002Fen\u002Fblog\u002Fsimplify-kubernetes-configuration","en\u002F3.blog\u002F35.simplify-kubernetes-configuration",{"title":117,"path":118,"stem":119},"Collaborative DevOps: How Modern Teams Build Cloud Apps Together","\u002Fen\u002Fblog\u002Fcollaborative-devops-teams","en\u002F3.blog\u002F36.collaborative-devops-teams",{"title":121,"path":122,"stem":123},"Knowledge Documentation in DevOps Teams: How to Actually Reduce Your Bus Factor","\u002Fen\u002Fblog\u002Fdevops-knowledge-documentation-bus-factor","en\u002F3.blog\u002F37.devops-knowledge-documentation-bus-factor",{"title":125,"path":126,"stem":127},"What Is PaaS? Platform as a Service Explained","\u002Fen\u002Fblog\u002Fwhat-is-paas","en\u002F3.blog\u002F38.what-is-paas",{"title":129,"path":130,"stem":131},"EU AI Act Hosting: What Changes for AI Workload Operators","\u002Fen\u002Fblog\u002Feu-ai-act-hosting","en\u002F3.blog\u002F39.eu-ai-act-hosting",{"title":133,"path":134,"stem":135},"Docker Compose Tutorial: Managing Multi-Container Apps Made Easy","\u002Fen\u002Fblog\u002Fdocker-compose-for-beginners","en\u002F3.blog\u002F4.docker-compose-for-beginners",{"title":137,"path":138,"stem":139},"Full-Stack Developer Reality: What the Title Actually Means","\u002Fen\u002Fblog\u002Ffull-stack-developer-reality","en\u002F3.blog\u002F40.full-stack-developer-reality",{"title":141,"path":142,"stem":143},"Cloud Egress Fees Compared: AWS vs. Azure vs. GCP Pricing","\u002Fen\u002Fblog\u002Fcloud-egress-fees","en\u002F3.blog\u002F41.cloud-egress-fees",{"title":145,"path":146,"stem":147},"Bring Your Own Cloud: What the Model Means and Why It","\u002Fen\u002Fblog\u002Fbring-your-own-cloud","en\u002F3.blog\u002F42.bring-your-own-cloud",{"title":149,"path":150,"stem":151},"Zero-Config Kubernetes: Why Simplicity Wins","\u002Fen\u002Fblog\u002Fzero-config-kubernetes","en\u002F3.blog\u002F43.zero-config-kubernetes",{"title":153,"path":154,"stem":155},"Minimalist Cloud Architecture: Why Less Complexity Means More Stability","\u002Fen\u002Fblog\u002Fminimalist-cloud-architecture","en\u002F3.blog\u002F44.minimalist-cloud-architecture",{"title":157,"path":158,"stem":159},"Software Deployment for SMBs: How Small Teams Ship Faster","\u002Fen\u002Fblog\u002Fsmb-software-deployment","en\u002F3.blog\u002F45.smb-software-deployment",{"title":161,"path":162,"stem":163},"EU Data Act: What Businesses and DevOps Teams Need to Know","\u002Fen\u002Fblog\u002Feu-data-act-business-devops","en\u002F3.blog\u002F46.eu-data-act-business-devops",{"title":165,"path":166,"stem":167},"Data Governance Act: What SMBs and DevOps Teams Need to Know","\u002Fen\u002Fblog\u002Fdata-governance-act-devops-guide","en\u002F3.blog\u002F47.data-governance-act-devops-guide",{"title":169,"path":170,"stem":171},"Self-Host Docmost with Docker Compose and Traefik: Complete Guide","\u002Fen\u002Fblog\u002Fself-host-docmost-with-docker-and-traefik","en\u002F3.blog\u002F5.self-host-docmost-with-docker-and-traefik",{"title":173,"path":174,"stem":175},"What Is Kubernetes? A Practical Guide to Container Orchestration","\u002Fen\u002Fblog\u002Fwhat-is-kubernetes","en\u002F3.blog\u002F6.what-is-kubernetes",{"title":177,"path":178,"stem":179},"The Cloud Illusion: Why a Server Location in Germany Doesn’t Guarantee Digital Sovereignty","\u002Fen\u002Fblog\u002Fcloud-illusion-digital-sovereignty","en\u002F3.blog\u002F7.cloud-illusion-digital-sovereignty",{"title":181,"path":182,"stem":183},"S3-Compatible Object Storage: The Best Solutions at a Glance","\u002Fen\u002Fblog\u002Fs3-compatible-object-storage","en\u002F3.blog\u002F8.s3-compatible-object-storage",{"title":185,"path":186,"stem":187},"Deployment as a Bottleneck: When AI Codes Faster Than You Can Deploy","\u002Fen\u002Fblog\u002Fdeployment-bottleneck","en\u002F3.blog\u002F9.deployment-bottleneck",{"title":189,"path":190,"stem":191,"children":192,"icon":206},"Getting Started","\u002Fen\u002Fdocs\u002Fgetting-started","en\u002F1.docs\u002F1.getting-started\u002F1.index",[193,196,201],{"title":194,"path":190,"stem":191,"icon":195},"Introduction","i-lucide-house",{"title":197,"path":198,"stem":199,"icon":200},"Get Started","\u002Fen\u002Fdocs\u002Fgetting-started\u002Fget-started","en\u002F1.docs\u002F1.getting-started\u002F2.get-started","i-lucide-rocket",{"title":202,"path":203,"stem":204,"icon":205},"How It Works","\u002Fen\u002Fdocs\u002Fgetting-started\u002Fhow-it-works","en\u002F1.docs\u002F1.getting-started\u002F3.how-it-works","i-lucide-lightbulb",false,{"title":208,"path":209,"stem":210,"children":211,"icon":206},"Guides","\u002Fen\u002Fdocs\u002Fguides","en\u002F1.docs\u002F2.guides\u002F1.index",[212,214],{"title":208,"path":209,"stem":210,"icon":213},"i-lucide-book-open",{"title":215,"path":216,"stem":217,"icon":218},"Connect a Container Registry","\u002Fen\u002Fdocs\u002Fguides\u002Fcontainer-registries","en\u002F1.docs\u002F2.guides\u002F2.container-registries","i-lucide-container",{"title":220,"path":221,"stem":222,"children":223,"icon":206},"App Services","\u002Fen\u002Fdocs\u002Fapp-services","en\u002F1.docs\u002F3.app-services\u002F1.index",[224,225,230,235,240,245],{"title":220,"path":221,"stem":222,"icon":200},{"title":226,"path":227,"stem":228,"icon":229},"Build Settings","\u002Fen\u002Fdocs\u002Fapp-services\u002Fbuild-settings","en\u002F1.docs\u002F3.app-services\u002F2.build-settings","i-lucide-settings",{"title":231,"path":232,"stem":233,"icon":234},"Env Variables","\u002Fen\u002Fdocs\u002Fapp-services\u002Fenvironment-variables","en\u002F1.docs\u002F3.app-services\u002F3.environment-variables","i-lucide-key",{"title":236,"path":237,"stem":238,"icon":239},"Custom Domains","\u002Fen\u002Fdocs\u002Fapp-services\u002Fcustom-domains","en\u002F1.docs\u002F3.app-services\u002F4.custom-domains","i-lucide-globe",{"title":241,"path":242,"stem":243,"icon":244},"Health Checks","\u002Fen\u002Fdocs\u002Fapp-services\u002Fhealth-checks","en\u002F1.docs\u002F3.app-services\u002F5.health-checks","i-lucide-heart-pulse",{"title":246,"path":247,"stem":248,"icon":249},"Autoscaling","\u002Fen\u002Fdocs\u002Fapp-services\u002Fautoscaling","en\u002F1.docs\u002F3.app-services\u002F6.autoscaling","i-lucide-scaling",{"title":251,"path":252,"stem":253,"children":254,"icon":206},"Helm Releases","\u002Fen\u002Fdocs\u002Fhelm-releases","en\u002F1.docs\u002F4.helm-releases\u002F1.index",[255,257,262,267,272,277,282],{"title":251,"path":252,"stem":253,"icon":256},"i-lucide-package",{"title":258,"path":259,"stem":260,"icon":261},"Deploy PostgreSQL","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-postgresql","en\u002F1.docs\u002F4.helm-releases\u002F2.deploy-postgresql","i-lucide-database",{"title":263,"path":264,"stem":265,"icon":266},"Deploy Redis","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-redis","en\u002F1.docs\u002F4.helm-releases\u002F3.deploy-redis","i-lucide-zap",{"title":268,"path":269,"stem":270,"icon":271},"Deploy n8n","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-n8n","en\u002F1.docs\u002F4.helm-releases\u002F4.deploy-n8n","i-lucide-workflow",{"title":273,"path":274,"stem":275,"icon":276},"Deploy RustFS","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-rustfs","en\u002F1.docs\u002F4.helm-releases\u002F5.deploy-rustfs","i-lucide-hard-drive",{"title":278,"path":279,"stem":280,"icon":281},"Deploy OpenSearch","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-opensearch","en\u002F1.docs\u002F4.helm-releases\u002F6.deploy-opensearch","i-lucide-search",{"title":283,"path":284,"stem":285,"icon":286},"Deploy Keycloak","\u002Fen\u002Fdocs\u002Fhelm-releases\u002Fdeploy-keycloak","en\u002F1.docs\u002F4.helm-releases\u002F7.deploy-keycloak","i-lucide-shield-check",{"title":288,"path":289,"stem":290,"children":291,"icon":206},"Glossary","\u002Fen\u002Fdocs\u002Fglossary","en\u002F1.docs\u002F5.glossary\u002F1.index",[292],{"title":288,"path":289,"stem":290,"icon":293},"i-lucide-book-a",{"id":295,"title":101,"authors":296,"badge":302,"body":303,"date":697,"description":698,"extension":699,"image":700,"lastUpdated":302,"meta":702,"navigation":703,"path":102,"published":703,"seo":704,"stem":103,"tags":302,"__hash__":705},"posts\u002Fen\u002F3.blog\u002F32.kubernetes-monitoring-logs-metrics.md",[297],{"name":298,"to":299,"avatar":300},"Thomas Ens","\u002Fabout\u002Fthomasens",{"src":301},"\u002Fimages\u002Fblog\u002Fauthors\u002Fthomas.jpeg",null,{"type":304,"value":305,"toc":681},"minimark",[306,310,315,318,325,331,343,346,350,353,373,380,383,387,401,404,423,426,431,434,455,459,465,468,483,486,510,513,517,524,527,552,555,559,562,565,568,574,580,586,592,596,599,602,634,637,640,644,647,658,665,673,676],[307,308,309],"p",{},"A Kubernetes cluster is running, deployments are green, and yet problems arise that nobody catches early enough. The reason is usually not faulty code but missing or poorly configured Kubernetes monitoring. If you don't know what's happening inside your pods, you're flying blind — reacting to outages instead of predicting them. This article shows how logs and metrics work together, where they differ, and what a solid monitoring stack needs to deliver in practice.",[311,312,314],"h2",{"id":313},"logs-vs-metrics-two-tools-two-jobs","Logs vs. Metrics: Two Tools, Two Jobs",[307,316,317],{},"Logs and metrics are often mentioned in the same breath, but they solve different problems.",[307,319,320,324],{},[321,322,323],"strong",{},"Metrics"," are numerical time-series data: CPU utilization, memory consumption, request rate, error rate. They're well suited for spotting trends, monitoring thresholds, and detecting anomalies quickly. Metrics are compact and can be aggregated efficiently.",[307,326,327,330],{},[321,328,329],{},"Logs",", on the other hand, are event-based text messages. They contain the context that metrics lack: Which user triggered which request? What error occurred on which line? What exactly happened before the application crashed?",[307,332,333,334,338,339,342],{},"The simple rule of thumb: Metrics tell you ",[335,336,337],"em",{},"that"," something is wrong. Logs tell you ",[335,340,341],{},"why",".",[307,344,345],{},"If you only have metrics, you see the alert but not the cause. If you only have logs, you drown in text and can't find patterns. Both together paint the full picture.",[311,347,349],{"id":348},"the-three-pillars-of-observability","The Three Pillars of Observability",[307,351,352],{},"Observability is more than just monitoring. The term describes the ability to understand a system's internal state from its outputs. In practice, this rests on three pillars:",[354,355,356,362,367],"ol",{},[357,358,359,361],"li",{},[321,360,323],{}," — aggregated numbers over time",[357,363,364,366],{},[321,365,329],{}," — structured or unstructured event records",[357,368,369,372],{},[321,370,371],{},"Traces"," — distributed tracing across multiple services",[307,374,375,379],{},[376,377,378],"a",{"href":174},"Kubernetes"," monitoring primarily covers the first two pillars. Tracing comes into play once multiple microservices communicate with each other and you need to understand which service in a request chain took how long.",[307,381,382],{},"For most teams, the pragmatic starting point is: get metrics and logs under control first, add tracing later as microservice complexity grows.",[311,384,386],{"id":385},"kubernetes-monitoring-with-prometheus","Kubernetes Monitoring with Prometheus",[307,388,389,395,396,400],{},[376,390,394],{"href":391,"rel":392},"https:\u002F\u002Fprometheus.io\u002F",[393],"nofollow","Prometheus"," is the de facto standard for metrics in Kubernetes environments. The principle is simple: Prometheus scrapes HTTP endpoints (",[397,398,399],"code",{},"\u002Fmetrics",") at defined intervals and stores the data as time series in its own database.",[307,402,403],{},"Two components provide the bulk of Kubernetes metrics:",[405,406,407,415],"ul",{},[357,408,409,414],{},[321,410,411],{},[397,412,413],{},"node_exporter"," — hardware and OS metrics from the node: CPU, RAM, disk I\u002FO, network",[357,416,417,422],{},[321,418,419],{},[397,420,421],{},"kube-state-metrics"," — Kubernetes-specific metrics: pod status, deployment replicas, job successes, resource requests vs. limits",[307,424,425],{},"On top of that come application-specific metrics. If you're running a Go or Java application, you can expose custom metrics using a Prometheus library: request latencies, queue sizes, business metrics.",[427,428,430],"h3",{"id":429},"label-cardinality-the-underestimated-performance-problem","Label Cardinality: The Underestimated Performance Problem",[307,432,433],{},"Prometheus metrics are qualified by labels. This is powerful but can get expensive. If you use a user ID or session ID as a label, the number of time series explodes. Thousands or millions of distinct label values mean thousands or millions of separate time series — memory consumption and query performance suffer significantly.",[307,435,436,437,440,441,440,444,447,448,440,451,454],{},"The rule: labels should have a manageable, bounded set of possible values. Status code (",[397,438,439],{},"200",", ",[397,442,443],{},"404",[397,445,446],{},"500","), HTTP method (",[397,449,450],{},"GET",[397,452,453],{},"POST","), service name — these are sensible labels. User IDs or request IDs belong in logs, not in metrics.",[311,456,458],{"id":457},"log-aggregation-in-kubernetes","Log Aggregation in Kubernetes",[307,460,461,464],{},[397,462,463],{},"kubectl logs \u003Cpod>"," is fine for development. In production, it's a crutch.",[307,466,467],{},"Pods can be restarted at any time, and their old logs disappear with them. When a pod crashes and restarts, you lose exactly the logs you'd need for root cause analysis. On top of that, manual log queries don't scale across many pods.",[307,469,470,471,476,477,482],{},"The solution is log aggregation: all logs are collected by a log collector (e.g., ",[376,472,475],{"href":473,"rel":474},"https:\u002F\u002Ffluentbit.io\u002F",[393],"Fluent Bit"," or ",[376,478,481],{"href":479,"rel":480},"https:\u002F\u002Fwww.fluentd.org\u002F",[393],"Fluentd","), forwarded, and stored centrally.",[307,484,485],{},"For central storage, there are two common options:",[405,487,488,504],{},[357,489,490,497,498,503],{},[376,491,494],{"href":492,"rel":493},"https:\u002F\u002Fgrafana.com\u002Foss\u002Floki\u002F",[393],[321,495,496],{},"Loki"," (by ",[376,499,502],{"href":500,"rel":501},"https:\u002F\u002Fgrafana.com\u002F",[393],"Grafana"," Labs): lightweight, indexes only metadata (labels), stores log content compressed. Well integrated with Grafana, significantly cheaper to operate than Elasticsearch.",[357,505,506,509],{},[321,507,508],{},"Elasticsearch",": powerful full-text index, more complex queries possible, but more resource-intensive and operationally demanding.",[307,511,512],{},"For most Kubernetes teams already using Grafana, Loki is the natural choice. Elasticsearch makes sense when complex full-text search or advanced analytics are needed.",[427,514,516],{"id":515},"log-level-discipline-in-production","Log Level Discipline in Production",[307,518,519,520,523],{},"A frequent source of problems: applications running in production with log level ",[397,521,522],{},"DEBUG",". The result is gigabytes of logs per day that nobody reads, but that cost storage and make finding real errors harder.",[307,525,526],{},"Clear conventions:",[405,528,529,534,540,546],{},[357,530,531,533],{},[321,532,522],{}," — development only, or for targeted troubleshooting",[357,535,536,539],{},[321,537,538],{},"INFO"," — important events that document normal operations",[357,541,542,545],{},[321,543,544],{},"WARN"," — something unexpected happened, operations continue",[357,547,548,551],{},[321,549,550],{},"ERROR"," — a failure occurred that needs attention",[307,553,554],{},"And: structured logging is always preferable to unstructured logging. Writing logs as JSON makes them efficient to filter and query in Loki or Elasticsearch. Free-text logs are hard for machines to process.",[311,556,558],{"id":557},"alerting-what-actually-deserves-an-alert","Alerting: What Actually Deserves an Alert",[307,560,561],{},"An alert that nobody pays attention to anymore is worse than no alert at all. Alert fatigue is a real problem. Teams that receive dozens of notifications daily get used to them and eventually miss the critical one.",[307,563,564],{},"Prometheus Alertmanager is the standard tool for receiving, grouping, deduplicating, and routing alerts (Slack, PagerDuty, email, etc.).",[307,566,567],{},"Principles for effective alerting:",[307,569,570,573],{},[321,571,572],{},"Alert on symptoms, not causes."," An alert on \"high CPU\" is often useless. High CPU isn't a problem as long as the application is responding. Better: alert on response time > 2 seconds or error rate > 1%.",[307,575,576,579],{},[321,577,578],{},"Use the four golden signals"," (from the Google SRE book): latency, traffic, errors, saturation. These are the signals that actually indicate user-facing problems.",[307,581,582,585],{},[321,583,584],{},"Define alerting tiers."," Not every alert needs to wake someone up at 3 AM. Critical alerts go to PagerDuty, warnings go to a Slack channel.",[307,587,588,591],{},[321,589,590],{},"Test your alerts."," An alert that has never fired may have never had the chance — or it might be broken.",[311,593,595],{"id":594},"dashboards-with-grafana","Dashboards with Grafana",[307,597,598],{},"Grafana is the standard tool for visualizing Prometheus metrics and Loki logs. A good dashboard answers at a glance: Is everything okay?",[307,600,601],{},"What belongs on every team dashboard:",[405,603,604,610,616,622,628],{},[357,605,606,609],{},[321,607,608],{},"Request rate"," — how many requests is the service handling right now?",[357,611,612,615],{},[321,613,614],{},"Error rate"," — how many of those are failing?",[357,617,618,621],{},[321,619,620],{},"Latency"," — P50, P95, P99 response times (not just averages)",[357,623,624,627],{},[321,625,626],{},"Pod status"," — are all replicas running, are there restarts?",[357,629,630,633],{},[321,631,632],{},"Resource utilization"," — CPU and memory vs. defined limits",[307,635,636],{},"A common mistake: dashboards that show too much. When 40 graphs are on one screen, nobody sees anything anymore. Less is more — a focused overview dashboard, with links to detail dashboards for specific analysis.",[307,638,639],{},"Grafana also offers annotations: when was a new deployment rolled out? These markers on graphs are enormously helpful for correlating performance changes with deployments.",[311,641,643],{"id":642},"kubernetes-monitoring-on-a-devops-as-a-service-platform","Kubernetes Monitoring on a DevOps-as-a-Service Platform",[307,645,646],{},"Building and running your own monitoring stack is possible but labor-intensive. Prometheus, Alertmanager, Loki, Fluent Bit, Grafana: each component needs to be configured, secured, updated, and scaled. That's operational overhead that doesn't directly contribute to your product.",[307,648,649,650,653,654,657],{},"On a Kubernetes ",[376,651,652],{"href":26},"DaaS platform"," like ",[321,655,656],{},"lowcloud",", monitoring infrastructure is part of the platform. This means: metrics for all workloads are collected automatically, logs are aggregated and made searchable, basic alerts are preconfigured. Teams can focus on configuring their own metrics and alerts instead of running the stack themselves.",[307,659,660,661,664],{},"This is especially relevant for ",[376,662,663],{"href":50},"smaller teams that handle DevOps on the side",". Getting Kubernetes monitoring right takes time — time that can be invested in product development when the platform handles the fundamentals.",[307,666,667,668,342],{},"Teams running Kubernetes workloads on lowcloud get Prometheus and Grafana as integrated services — configurable, but without the initial infrastructure overhead. Learn more at ",[376,669,672],{"href":670,"rel":671},"http:\u002F\u002Flowcloud.de",[393],"lowcloud.de",[674,675],"hr",{},[307,677,678],{},[335,679,680],{},"Monitoring isn't a one-time project you check off a list. It's an ongoing process: sharpening alerts, adjusting dashboards, instrumenting new services. Those who start early and take a structured approach have a clear advantage when incidents happen — and sleep better at night.",{"title":682,"searchDepth":683,"depth":683,"links":684},"",2,[685,686,687,691,694,695,696],{"id":313,"depth":683,"text":314},{"id":348,"depth":683,"text":349},{"id":385,"depth":683,"text":386,"children":688},[689],{"id":429,"depth":690,"text":430},3,{"id":457,"depth":683,"text":458,"children":692},[693],{"id":515,"depth":690,"text":516},{"id":557,"depth":683,"text":558},{"id":594,"depth":683,"text":595},{"id":642,"depth":683,"text":643},"2026-03-20","How logs and metrics work together in Kubernetes, where they differ, and what a solid monitoring stack needs to deliver in practice.","md",{"src":701},"\u002Fimages\u002Fblog\u002Fkubernetes-monitoring-logs-metrics.jpg",{},true,{"title":101,"description":698},"BehRry_IPCfzDNELu5QVqMgi7Z90RjYqm5iA5a_fMWA",[707,709],{"title":97,"path":98,"stem":99,"description":708,"children":-1},"Tool sprawl costs more than licenses: cognitive load, slow onboarding, lost knowledge. Here is how to bring order to your DevOps setup.",{"title":105,"path":106,"stem":107,"description":710,"children":-1},"How OB7 deploys their new website with lowcloud – no server configuration, SSL setup, or provider management. A case study on managed container deployments.",1775388341411]