Recommended Azure Monitors

General

This document describes the recommended Azure monitors which can be implemented in Azure cloud application subscriptions.

SMT incident priority mapping

The priority “Blocker” is mostly used by Developers to prioritize their tasks and its not applicable for operations team.

0-CRITICALCritical<= 4 hrs
1-ERRORHigh<= 12hrs
2-WARNINGMedium<= 48hrs (2days)
3 - InformationalLow<= 96hrs (4days)
4 - VerboseNo TicketAction based on the notification and analysis

Recommended Azure Monitors

All ResourcesResource HealthResource HealthPrevious resource status=All, Current resource status=AllAlwaysCurrent status4 - VerboseMS teamsIncluded all future resource groups and future resourcesExcluding “Virtual machine instance from VMSS”
All ResourcesService HealthService HealthEvent types: Service issue, Planned maintenance , Health advisories, Security AdvisoriesAlwaysCurrent status4 - VerboseMS teamsRegions : North Europe, West EuropeServices: Alerts & Metrics, Activity Logs & Alerts and 21 more
Azure SQL DatabaseCPUMetricapp_cpu_percent > 805 mins1 hour2-WARNINGEmail
Azure SQL DatabaseCPUMetricapp_cpu_percent > 955 mins1 hour1-ERRORMS teams & Email
Azure SQL DatabaseMemoryMetricapp_memory_percent > 805 mins1 hour2-WARNINGEmail
Azure SQL DatabaseMemoryMetricapp_memory_percent > 955 mins1 hour1-ERRORMS teams & Email
Azure SQL DatabaseSpaceMetricallocated_data_storage greater or less than dynamic threshold15 mins1 hour2-WARNINGEmail
AKS - NodeNode CPUMetricnode_cpu_usage_percentage > 8015 mins1 hour2-WARNINGEmailName of the node Include True
AKS - NodeNode MemoryMetricnode_memory_working_set_percentage > 8015 mins1 hour2-WARNINGEmailName of the node Include True
AKS - NodeNode DiskMetricnode_disk_usage_percentage > 8015 mins1 hour2-WARNINGEmailName of the node Include True
AKS - NodeNode Status (NotReady,Unknown)Metrickube_node_status_condition > 05 mins15 mins2-WARNINGEmail
AKS - PodsPods phases (Failed,Unknown,Pending)Metrickube_pod_status_phase >= 15 mins30 mins2-WARNINGEmailPhase of the pod Include Failed,Unknown,Pending
AKS - PodsUnschedulable PodsMetricunschedulable > 115 mins1 hour2-WARNINGEmail
AKS - PodsPods ready state percentageMetricpodReadyPercentage(preview)2-WARNINGEmail
AKS - ContainersRestarting ContainersMetricrestarting container count(preview)2-WARNINGEmail
AKS - ContainersOOM killed containersMetricoomKilledContainerCount)preview)2-WARNINGEmail
AKS - ContainersCPU Exceeded PercentageMetriccpuExceededPercentage (preview)2-WARNINGEmail
AKS - ContainersMemory working set exceeded percentageMetricmemoryWorkingSetExceededPercentage(preview)2-WARNINGEmail
Application GatewayUnhealthy backend HostMetricUnhealthyHostCount > 01 min5 mins0-CRITICALMS teams & Email
Application GatewayFailed RequestsMetricFailedRequests > 1005 mins15 mins2-WARNINGEmail
Load balancerSNAT Connection Status CountMetricSnatConnectionCount >= 15 mins15 mins2-WARNINGEmailConnection State = Failed, Pending
Public IP AddressesUnder DDoS attack or notMetricIfUnderDDoSAttack > 01 min5 mins0-CRITICALMS teams & Email
Virtual machine scalesetCPU UsageMetricPercentage CPU > 9015 mins1 hour2-WARNINGEmail
Container RegistryStorage UsedMetricStorageUsed > 90% of Storage size included in the SKU15 mins1 hour3 - InformationalEmailReview this which SKU of ACR has this metric
LogicAppRunsFailedMetricRunsFailed>01 hour12 hours3 - InformationalEmail
Log Analytics WorkspaceContainer SIGKILL ErrorLogsTable rows Count > 015 mins15 mins2-WARNINGEmailSignal KILL error Expand source
Log Analytics WorkspaceWAF_Possible_DDoS_DetectedLogs Querycount_ > 100015 mins15 mins1 - ErrorMS teams & EmailWAF_Possible_DDoS_Detected Expand source
Log Analytics workspaceNode-restart-delayed triggered by KuredLogs Query2-WARNINGEmailNode-restart-delayed Expand source
Log Analytics workspaceNode-restart-successful-Kured ActionLogs QueryOBSOLETENode-restart-successful Expand source
Azure SQL Database / serverVulnerability Scan ReportVulnerability Scan Report
FailureFailure Anomalies - ETAS-BCP-PT-Forensic-Logic-App Failure Anomalies detected 3 - Informational etas-bcp-pt-forensic-logic-app Application Insights Smart detector

Requirements

ACRACR - To trigger alert when Create or Update Images from the ACR?
SQL DBSQL DB - Slow / Long running Queries?
Service Principal secret / certificate expiry?
AKSCheck if we can sent an alert if k8s is not able to scale in new workernode
VISUALIZATION KURED/AKS ALERTSCurrently we dont have a Dashboard / Vis for kured alertsA overview over time would be helpful to



Refer : https://learn.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-overview
Overview diagram of Container insights



https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview
Diagram that explains Azure Monitor alerts.

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.rhkb.cn/news/313709.html

如若内容造成侵权/违法违规/事实不符,请联系长河编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

Hive-Sql复杂面试题

参考链接&#xff1a;hive sql面试题及答案 - 知乎 有哪些好的题目都可以给我哦 我来汇总到一起 1、编写sql实现每个用户截止到每月为止的最大单月访问次数和累计到该月的总访问次数 数据&#xff1a; userid,month,visits A,2015-01,5 A,2015-01,15 B,2015-01,5 A,2015-01,…

MySQL面试——聚簇/非聚簇索引

存储引擎是针对表结构&#xff0c;不是数据库 引擎层&#xff1a;对数据层以何种方式进行组织 update&#xff1a;加索引&#xff1a;行级锁&#xff1b;不加索引&#xff1a;表级锁

Java 网络编程之TCP(三):基于NIO实现服务端,BIO实现客户端

前面的文章&#xff0c;我们讲述了BIO的概念&#xff0c;以及编程模型&#xff0c;由于BIO中服务器端的一些阻塞的点&#xff0c;导致服务端对于每一个客户端连接&#xff0c;都要开辟一个线程来处理&#xff0c;导致资源浪费&#xff0c;效率低。 为此&#xff0c;Linux 内核…

Stable Diffusion WebUI 使用 VAE 增加滤镜效果

本文收录于《AI绘画从入门到精通》专栏&#xff0c;专栏总目录&#xff1a;点这里&#xff0c;订阅后可阅读专栏内所有文章。 大家好&#xff0c;我是水滴~~ 本文主要介绍 VAE 模型&#xff0c;主要内容有&#xff1a;VAE 模型的概念、如果下载 VAE 模型、如何安装 VAE 模型、如…

520提升幸福感的好物有哪些?5款必备产品推荐!

520作为年度表白节日&#xff0c;提醒人们别忘了在日常中向所爱之人表达浪漫。从鲜花、美酒到护肤品&#xff0c;礼物成为表达爱意的重要方式。然而&#xff0c;如何选购适合对方的礼物成为人们的难题。过去&#xff0c;关于“硬核520礼物”等话题热度不减&#xff0c;各种送礼…

MQ面试题

为什么要使用消息队列&#xff1f; 优点&#xff1a;解耦、异步、流量削峰 缺点&#xff1a;可用性降低、复杂性提高、一致性问题 为什么选择了RabbitMQ而不是其它的MQ&#xff1f; kafka是以吞吐量高而闻名&#xff0c;不过其数据稳定性一般&#xff0c;而且无法保证消息有…

同旺科技 USB TO SPI / I2C适配器读写24LC256--页写

所需设备&#xff1a; 1、USB 转 SPI I2C 适配器&#xff1b;内附链接 2、24LC256芯片 适应于同旺科技 USB TO SPI / I2C适配器升级版、专业版&#xff1b; 从00地址开始写入64个字节&#xff0c;然后再将64个字节读回&#xff1b; 页写时序&#xff1a; 读时序&#xff1a…

C语言中整型与浮点型在内存中的存储

今天让我们来看看整型的数据和浮点型的数据在内存中是怎么存储的呢 整型数据在内存中的存储 整型数据在内存中存储的是二进制的补码 正数的话也没什么可说的&#xff0c;原码反码补码都相同 我们来看看负数&#xff1a; 以-5为例 原码&#xff1a;10000000 00000000 00000000 0…

Jenkins CI/CD 持续集成专题二 Jenkins 相关问题汇总

一 问题一 pod [!] Unknown command: package 1.1 如果没有安装过cocoapods-packager&#xff0c;安装cocoapods-packager&#xff0c;sudo gem install cocoapods-packager 1.2 如果已经安装cocoapods-packager&#xff0c;还是出现上面的错误&#xff0c;有可能是pod的安…

Spring Boost + Elasticsearch 实现检索查询

需求&#xff1a;对“昵称”进行“全文检索查询”&#xff0c;对“账号”进行“精确查询”。 认识 Elasticsearch 1. ES 的倒排索引 正向索引 对 id 进行检索速度很快。对其他字段即使加了索引&#xff0c;只能满足精确查询。模糊查询时&#xff0c;逐条数据扫描&#xff0c…

vscode ssh远程连接服务器,一直正在下载vscode服务器的解决办法

前言 为方便描述&#xff0c;在本教程中&#xff0c;发起远程连接的叫“主机”&#xff0c;被远程连接的叫“服务器”。 正文 如果主机是首次用vscode远程连接服务器&#xff0c;会在服务器上自动下载vscode服务器&#xff0c;但有时候因为网络问题&#xff0c;会卡在&#xff…

UE4 相机围绕某点旋转

关卡&#xff08;一个相机CameraActor&#xff0c;一个Cube(名叫Target)&#xff09;&#xff1a; 关卡蓝图里的逻辑(为了大家看得清楚&#xff0c;特意连得很紧凑&#xff0c;也比较乱&#xff0c;不然一张截图放不下)&#xff1a; 只对Yaw 只Pitch: 同样对Roll: 围绕任…

switch语句深讲

一。功能 1.选择&#xff0c;由case N:完成 2.switch语句本身没有分支功能&#xff0c;分支功能由break完成 二。注意 1.switch语句如果不加break&#xff0c;在一次判断成功后会执行下面全部语句并跳过判断 2.switch的参数必须是整形或者是计算结果为整形的表达式,浮点数会…

visionTransformer window平台下报错

错误&#xff1a; KeyError: Transformer/encoderblock_0/MlpBlock_3/Dense_0kernel is not a file in the archive解决方法&#xff1a; 修改这个函数即可&#xff0c;主要原因是Linux系统与window系统路径分隔符不一样导致 def load_from(self, weights, n_block):ROOT f&…

c++使用googletest进行单元测试

googletest进行单元测试 使用Google test进行测试一、单元测试二、使用gmock测试 使用Google test进行测试 使用场景&#xff1a; 在平时写代码中&#xff0c;我们需要测试某个函数是否正确时可以使用Google test使用&#xff0c;当然&#xff0c;我们也可以自己写函数进行验证…

云计算时代:SFP、SFP+、SFP28、QSFP+和QSFP28光纤模块详解

随着数据中心的快速发展和云计算的广泛应用&#xff0c;高速、高效率的光纤网络传输成为关键需求。在众多光纤模块中&#xff0c;SFP、SFP、SFP28、QSFP和QSFP28是最常见的几种类型。本文将为您详细解析这几种光纤模块之间的区别&#xff0c;帮助您更好地了解和选择适合自己需求…

网贷大数据黑名单要多久才能变正常?

网贷大数据黑名单是指个人在网贷平台申请贷款时&#xff0c;因为信用记录较差而被列入黑名单&#xff0c;无法获得贷款或者贷款额度受到限制的情况。网贷大数据黑名单的具体时间因个人信用状况、所属平台政策以及银行审核标准不同而异&#xff0c;一般来说&#xff0c;需要一定…

就业班 第三阶段(nginx) 2401--4.22 day1 nginx1 http+nginx初识+配置+虚拟主机

一、HTTP 介绍 HTTP协议是Hyper Text Transfer Protocol&#xff08;超文本传输协议&#xff09;的缩写,是用于从万维网&#xff08;WWW:World Wide Web &#xff09;服务器传输超文本到本地浏览器的传送协议。 HTTP是一个基于TCP/IP通信协议来传递数据&#xff08;HTML 文件…

Centos 5 的yum源

背景 有使用较老的Centos 5 系统内部安装软件无法正常报错&#xff0c;是由于系统叫老yum源存在问题 处理方法 更换下述yum源&#xff0c;可以将其他repo源文件备份移动到其他目录&#xff0c;添加下述源后重新测试 [C5.11-base] nameCentOS-5.11 baseurlhttp://vault.c…

微信小程序实现预约生成二维码

业务需求&#xff1a;点击预约按钮即可生成二维码凭码入校参观~ 一.创建页面 如下是博主自己写的wxml&#xff1a; <swiper indicator-dots indicator-color"white" indicator-active-color"blue" autoplay interval"2000" circular > &…