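Notes
Analysis, troubleshooting, and resolution of an application incident
Background
On the morning of the 9th we received a CPU alert. At the same time, the business side reported that the upstream services depending on this service were seeing very long response times.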
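Application alert - CPU usage: alert status change
[WARNING] Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-m47jz, POD instance CPU request usage >= 90.000000%, current value 138.4971051199925%
Occurred: 2024/10/09 11:17:33
Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-28pvs, POD instance CPU request usage >= 90.000000%, current value 157.7076205766934%, alert recovered
Occurred: 2024/10/09 11:06:33
Recovered: 2024/10/09 12:24:33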
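Service traffic
Peak QPS per instance is about 100.
Why look at QPS? Because 100 QPS should not consume this much CPU, and neither the request bodies nor the response bodies are large.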
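POD monitoring
POD quota
- CPU request 2 cores, CPU limit 3 cores
- Memory request 7GiB, memory limit 9GiB
From the monitoring chart:
- CPU load stayed high the whole time
- TCP connections and thread count climbed steeply starting at 11:40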
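To put the "100 QPS should not cost this much CPU" intuition into numbers, here is a rough back-of-the-envelope sketch. It assumes that the alert percentage is measured against the 2-core CPU request and that all CPU time goes into serving requests; both are assumptions, and the class and method names are made up for illustration. With the first alert's value this works out to roughly 27.7 ms of CPU per request, which is a lot for small JSON payloads.

```java
import java.util.Locale;

public class CpuPerRequestEstimate {

    /**
     * Estimates the CPU milliseconds consumed per request.
     * Assumes usagePercent is relative to the CPU request (in cores)
     * and that all CPU time is spent serving requests.
     */
    static double cpuMillisPerRequest(double requestCores, double usagePercent, double qps) {
        double coresInUse = requestCores * usagePercent / 100.0;
        double cpuMillisPerSecond = coresInUse * 1000.0;
        return cpuMillisPerSecond / qps;
    }

    public static void main(String[] args) {
        // Values taken from the alert and the POD quota: 2-core request, 138.497...% usage, ~100 QPS.
        double estimate = cpuMillisPerRequest(2.0, 138.4971051199925, 100.0);
        System.out.println(String.format(Locale.ROOT, "CPU per request: %.1f ms", estimate));
    }
}
```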
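ARMS
The trace monitoring shows that the time is mainly spent in customer calling external interfaces through Feign.
Temporary fix
Temporary fix: scale out the instances and increase the CPU allocation.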
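Root cause analysis
The process of checking the third-party interfaces and the open platform gateway is skipped here. The conclusion was that neither the third-party interfaces we depend on nor the open platform gateway had any problem.
We checked them first because, judging from the trace, the calls to the third-party interfaces appeared to have excessive response times.
From the ARMS chart:
- CPU time is concentrated in the Decoder and Encoder used by the Feign calls
- In both the Decoder and the Encoder, the time is concentrated in
- HttpMessageConverters#getDefaultConverters() =>
- WebMvcConfigurationSupport#addDefaultHttpMessageConverters =>
- … (see the summary below for the full call chain)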
feign.ReflectiveFeign$BuildTemplateByResolvingArgs.create(Object[]) (14.37%, 1.43 minutes)
feign.ReflectiveFeign$BuildEncodedTemplateFromArgs.resolve(Object[], RequestTemplate, Map) (14.37%, 1.43 minutes)
org.springframework.cloud.openfeign.support.SpringEncoder.encode(Object, Type, RequestTemplate) (14.28%, 1.42 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig$$Lambda$938.56729293.getObject() (13.98%, 1.39 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig.lambda$feignEncoder$2() (13.98%, 1.39 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(HttpMessageConverter[]) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(boolean, Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.getDefaultConverters() (12.02%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters$1.defaultMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.getMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.addDefaultHttpMessageConverters(List) (12.02%, 1
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.build() (5.93%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.configure(ObjectMapper) (5.91%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.registerWellKnownModulesIfAvailable(Map) (5.89%, 0.58 min
org.springframework.util.ClassUtils.forName(String, ClassLoader) (5.84%, 0.58 minutes)
java.lang.Class.forName(String, boolean, ClassLoader) (5.83%, 0.58 minutes)
java.lang.Class.forName0(String, boolean, ClassLoader, Class) (5.83%, 0.58 minutes)
......
Custom Encoder and Decoder
Encoder
Here is the Encoder in jiankunking.common.core.feign.FeignClientsConfig:

    public Encoder feignEncoder() {
        // SpringEncoder invokes this factory lambda on every encode call (see the stack summary),
        // so a new HttpMessageConverters, default converters included, is built per request.
        ObjectFactory<HttpMessageConverters> objectFactory =
                () -> new HttpMessageConverters(new RMappingJackson2HttpMessageConverter());
        return new SpringEncoder(objectFactory);
    }

    public class RMappingJackson2HttpMessageConverter extends MappingJackson2HttpMessageConverter {

        public RMappingJackson2HttpMessageConverter(ObjectMapper objectMapper) {
            super(objectMapper);
            List<MediaType> mediaTypes = new ArrayList<>();
            mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
            mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
            setSupportedMediaTypes(mediaTypes);
        }

        RMappingJackson2HttpMessageConverter() {
            List<MediaType> mediaTypes = new ArrayList<>();
            mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
            mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
            setSupportedMediaTypes(mediaTypes);
        }
    }

Decoder
Here is the Decoder in jiankunking.common.core.feign.FeignClientsConfig:

    public Decoder feignDecoder() {
        HttpMessageConverter jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
        // Same pattern as the Encoder: the single-argument constructor also adds the default converters.
        ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(jacksonConverter);
        return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
    }

    public ObjectMapper customObjectMapper() {
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.registerModule(new StringToDateModule());
        objectMapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true);
        objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
        objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        return objectMapper;
    }
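The stack summary shows SpringEncoder.encode calling the factory lambda, which then runs the HttpMessageConverters constructor. A minimal plain-Java sketch of that pattern follows. ConverterSet and PerCallEncoder are hypothetical stand-ins, not Spring classes; the sketch only illustrates that a factory consulted on every call rebuilds its product every time, whereas a factory that hands out a prebuilt instance builds it once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class FactoryPerCallDemo {

    static final AtomicInteger constructions = new AtomicInteger();

    /** Hypothetical stand-in for an expensive-to-build converter set; counts constructions. */
    static final class ConverterSet {
        ConverterSet() {
            constructions.incrementAndGet();
        }
    }

    /** Hypothetical stand-in for an encoder that asks its factory on every call. */
    static final class PerCallEncoder {
        private final Supplier<ConverterSet> factory;

        PerCallEncoder(Supplier<ConverterSet> factory) {
            this.factory = factory;
        }

        void encode(Object body) {
            factory.get(); // a real encoder would pick a converter here and write the body
        }
    }

    private static void runRequests(Supplier<ConverterSet> factory, int requests) {
        PerCallEncoder encoder = new PerCallEncoder(factory);
        for (int i = 0; i < requests; i++) {
            encoder.encode("payload-" + i);
        }
    }

    /** The factory builds a new ConverterSet on every call. */
    static int constructionsWithPerCallFactory(int requests) {
        constructions.set(0);
        runRequests(ConverterSet::new, requests);
        return constructions.get();
    }

    /** The factory returns one prebuilt ConverterSet. */
    static int constructionsWithSharedInstance(int requests) {
        constructions.set(0);
        ConverterSet shared = new ConverterSet();
        runRequests(() -> shared, requests);
        return constructions.get();
    }

    public static void main(String[] args) {
        System.out.println("per-call factory: " + constructionsWithPerCallFactory(100));
        System.out.println("shared instance:  " + constructionsWithSharedInstance(100));
    }
}
```

At 100 QPS with both an encode and a decode per call, the per-call variant repeats the expensive construction a couple of hundred times per second, which matches the shape of the profile.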
Googled 'spring feign encode jackson cpu usage high':
=> https://segmentfault.com/a/1190000043037032
=> https://mp.weixin.qq.com/s/RuqltkN9VdVQ1K3GKuJ-Gw
=> https://meantobe.github.io/2019/12/21/ClassLoader/
Source code analysis
Here is the code of registerWellKnownModulesIfAvailable:

    @SuppressWarnings("unchecked")
    private void registerWellKnownModulesIfAvailable(Map<Object, Module> modulesToRegister) {
        try {
            Class<? extends Module> jdk8ModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.datatype.jdk8.Jdk8Module", this.moduleClassLoader);
            Module jdk8Module = BeanUtils.instantiateClass(jdk8ModuleClass);
            modulesToRegister.put(jdk8Module.getTypeId(), jdk8Module);
        }
        catch (ClassNotFoundException ex) {
            // jackson-datatype-jdk8 not available
        }

        try {
            Class<? extends Module> javaTimeModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.datatype.jsr310.JavaTimeModule", this.moduleClassLoader);
            Module javaTimeModule = BeanUtils.instantiateClass(javaTimeModuleClass);
            modulesToRegister.put(javaTimeModule.getTypeId(), javaTimeModule);
        }
        catch (ClassNotFoundException ex) {
            // jackson-datatype-jsr310 not available
        }

        // Joda-Time present?
        if (ClassUtils.isPresent("org.joda.time.LocalDate", this.moduleClassLoader)) {
            try {
                Class<? extends Module> jodaModuleClass = (Class<? extends Module>)
                        ClassUtils.forName("com.fasterxml.jackson.datatype.joda.JodaModule", this.moduleClassLoader);
                Module jodaModule = BeanUtils.instantiateClass(jodaModuleClass);
                modulesToRegister.put(jodaModule.getTypeId(), jodaModule);
            }
            catch (ClassNotFoundException ex) {
                // jackson-datatype-joda not available
            }
        }

        // Kotlin present?
        if (KotlinDetector.isKotlinPresent()) {
            try {
                Class<? extends Module> kotlinModuleClass = (Class<? extends Module>)
                        ClassUtils.forName("com.fasterxml.jackson.module.kotlin.KotlinModule", this.moduleClassLoader);
                Module kotlinModule = BeanUtils.instantiateClass(kotlinModuleClass);
                modulesToRegister.put(kotlinModule.getTypeId(), kotlinModule);
            }
            catch (ClassNotFoundException ex) {
                if (!kotlinWarningLogged) {
                    kotlinWarningLogged = true;
                    logger.warn("For Jackson Kotlin classes support please add " +
                            "\"com.fasterxml.jackson.module:jackson-module-kotlin\" to the classpath");
                }
            }
        }
    }
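The logic is: if Joda-Time's LocalDate is on the classpath, Spring tries to load Jackson's corresponding JodaModule through the class loader (LaunchedURLClassLoader here). If joda-time is present but jackson-datatype-joda is not, that lookup ends in ClassNotFoundException every time, and because a failed lookup is not cached, every call repeats the full search.
Why not suspect the jdk8ModuleClass and javaTimeModuleClass lookups? Because the common package already depends on these two artifacts:

    compile "com.fasterxml.jackson.datatype:jackson-datatype-jdk8:${v.jacksonDatatype}"
    compile "com.fasterxml.jackson.datatype:jackson-datatype-jsr310:${v.jacksonDatatype}"

With that, the fix is clear.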
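The cost of looking up a class that is never found can be illustrated with a small plain-JDK sketch. CountingClassLoader and the class name com.example.missing.DoesNotExist are made up for the illustration, and the loader is far simpler than LaunchedURLClassLoader. It only shows that every failed Class.forName goes through the full lookup again, since nothing remembers the earlier miss.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MissingClassLookupDemo {

    /** Class loader that counts how often a lookup falls through to findClass. */
    static final class CountingClassLoader extends ClassLoader {
        final AtomicInteger findClassCalls = new AtomicInteger();

        CountingClassLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // A real loader would scan its jars here; this one only records the attempt.
            findClassCalls.incrementAndGet();
            throw new ClassNotFoundException(name);
        }
    }

    /** Tries to load className the given number of times and returns how many attempts failed. */
    static int countFailedLoads(ClassLoader loader, String className, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                Class.forName(className, false, loader);
            } catch (ClassNotFoundException ex) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        CountingClassLoader loader =
                new CountingClassLoader(MissingClassLookupDemo.class.getClassLoader());
        int failures = countFailedLoads(loader, "com.example.missing.DoesNotExist", 1000);
        System.out.println("failed loads:    " + failures);
        System.out.println("findClass calls: " + loader.findClassCalls.get());
    }
}
```

A class that does exist, such as java.util.ArrayList, is resolved by the parent loader and never reaches findClass in this sketch.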
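Solution
Avoid repeated loading by the ClassLoader
Add this dependency to the project. Once the class has been loaded, later calls get it through findLoadedClass, which cuts the resource cost of loading the class again and again.

    <dependency>
        <groupId>com.fasterxml.jackson.datatype</groupId>
        <artifactId>jackson-datatype-joda</artifactId>
        <version>x.x.x</version>
    </dependency>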
Avoid repeated initialization of HttpMessageConverters

    public Decoder feignDecoder() {
        HttpMessageConverter jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
        // First argument false: do not add the default converters.
        ObjectFactory<HttpMessageConverters> objectFactory =
                () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
        return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
    }

    public Encoder feignEncoder() {
        // The converter is now created once, outside the factory lambda.
        HttpMessageConverter jacksonConverter = new RMappingJackson2HttpMessageConverter(customObjectMapper());
        ObjectFactory<HttpMessageConverters> objectFactory =
                () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
        return new SpringEncoder(objectFactory);
    }
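Summary
When you customize Feign's encoder and decoder and use SpringEncoder / SpringDecoder, avoid initializing HttpMessageConverters repeatedly. If you do not need the default HttpMessageConverter instances, pass false as the first argument when constructing HttpMessageConverters so that the defaults are not initialized.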