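Background: our Unity project integrates Google Protobuf 3.21.12. While profiling deserialization I found several GC allocation hot spots and fixed the most severe ones. There are already some analyses of this online, so I won't go through everything here and will assume the reader is broadly familiar with the topic.
If you spot any problem below, please leave a comment and I will fix it!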
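GC hot spot 1
Every time a Message is deserialized, the Stream is handed to MessageParser.cs and then to MessageExtensions.cs, which calls new CodedInputStream() on every parse and therefore allocates garbage (see Figures 1 and 2 below).
The fix is to turn it into a singleton instance: every new is replaced by fetching the singleton and then calling Reset. Part of the code is shown below; replacing the call sites is omitted here (just search for references).
There is one easy mistake here. For the Stream-based overloads, the bufferPos and bufferSize passed to Reset must be (0, 0). I first passed (0, bytes.Length) and got errors. The stream-based CodedInputStream constructor itself also passes (0, 0): bufferSize is the number of valid bytes already in the buffer, and with a Stream the buffer starts out empty and is filled from the stream on demand.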
private static CodedInputStream _bytesInstance;

public static CodedInputStream GetBytesInstance(byte[] buffer, int bufferPos, int bufferSize)
{
    if (_bytesInstance == null)
    {
        _bytesInstance = new CodedInputStream(buffer, bufferPos, bufferSize);
    }
    else
    {
        _bytesInstance.Reset(buffer, bufferPos, bufferSize, true);
    }
    return _bytesInstance;
}

private static byte[] bytes = new byte[BufferSize];
private static CodedInputStream _streamInstance;

public static CodedInputStream GetSteamInstance(Stream input)
{
    if (_streamInstance == null)
    {
        _streamInstance = new CodedInputStream(input);
    }
    else
    {
        _streamInstance.Reset(bytes, 0, 0, false, ProtoPreconditions.CheckNotNull(input, "input"));
    }
    return _streamInstance;
}

private static CodedInputStream _streamBytesInstance;

public static CodedInputStream GetSteamBytesInstance(Stream input, byte[] buffer)
{
    if (_streamBytesInstance == null)
    {
        _streamBytesInstance = new CodedInputStream(input, buffer);
    }
    else
    {
        _streamBytesInstance.Reset(buffer, 0, 0, false, ProtoPreconditions.CheckNotNull(input, "input"));
    }
    return _streamBytesInstance;
}
...
...
...
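// Mirrors the body of the internal CodedInputStream constructor so that one instance can be reused.
// input, buffer and leaveOpen appear to be readonly in the stock source; if so, drop the modifier so Reset can assign them.
public void Reset(byte[] buffer, int bufferPos, int bufferSize, bool leaveOpen, Stream input = null)
{
    this.input = input;
    this.buffer = buffer;
    this.state = default;
    this.state.bufferPos = bufferPos;
    this.state.bufferSize = bufferSize;
    this.state.sizeLimit = DefaultSizeLimit;
    this.state.recursionLimit = DefaultRecursionLimit;
    SegmentedBufferHelper.Initialize(this, out this.state.segmentedBufferHelper);
    this.leaveOpen = leaveOpen;
    this.state.currentLimit = int.MaxValue;
}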
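To make the (0, 0) pitfall concrete, here is a small self-contained sketch. TinyReader and TinyReaderDemo are hypothetical stand-ins that I made up for illustration; they are not Protobuf's CodedInputStream. They only mimic the convention that bufferSize is the number of valid bytes currently in the buffer and that the stream is consulted once bufferPos reaches bufferSize.

```csharp
using System.IO;

// Hypothetical stand-in for a buffered stream reader (not the real CodedInputStream).
public sealed class TinyReader
{
    private byte[] buffer;
    private int bufferPos;   // index of the next byte to hand out
    private int bufferSize;  // number of valid bytes currently in buffer
    private Stream input;

    public void Reset(byte[] buffer, int bufferPos, int bufferSize, Stream input)
    {
        this.buffer = buffer;
        this.bufferPos = bufferPos;
        this.bufferSize = bufferSize;
        this.input = input;
    }

    // Returns the next byte, or -1 at end of input.
    public int ReadByte()
    {
        if (bufferPos == bufferSize)
        {
            // Buffer exhausted: refill it from the stream.
            int read = input == null ? 0 : input.Read(buffer, 0, buffer.Length);
            bufferPos = 0;
            bufferSize = read > 0 ? read : 0;
            if (bufferSize == 0) return -1;
        }
        return buffer[bufferPos++];
    }
}

public static class TinyReaderDemo
{
    // Resets a reader over a scratch buffer that still holds stale bytes,
    // then returns the first byte it produces for a fresh one-byte stream.
    public static int FirstByte(int bufferSize)
    {
        var scratch = new byte[] { 9, 9, 9, 9 };  // leftovers from a previous parse
        var reader = new TinyReader();
        reader.Reset(scratch, 0, bufferSize, new MemoryStream(new byte[] { 42 }));
        return reader.ReadByte();
    }
}
```

With bufferSize 0 the reader goes straight to the stream and returns the fresh byte; with bufferSize equal to the buffer length it treats the stale bytes as valid data first, which is the kind of corruption that would surface as a parse error.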
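GC hot spot 2
The C# code that protoc.exe generates for each proto message comes with a Parser for the business code, which uses it to deserialize the data stream (figure below).
Look closely at the generated code (figure below): _parser is static readonly, so it is constructed at initialization and stays in memory. The message itself, however, is created lazily: the lambda () => new ToyTrackingSurvivorData() is passed through to MessageParser.
Let's see what MessageParser does with it (figure below).
ParseFrom is what our business code calls, so every single deserialization invokes factory() once. That is clearly a GC hot spot. The problem is found; how do we fix it?
My first idea was to use a singleton here as well, so that each factory() call resets the instance and returns it. That failed: when a field in the .proto is repeated or a map, several objects from factory() must exist at the same time, so a singleton cannot work. The answer is an object pool.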
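The sketch below illustrates why a singleton factory breaks down for repeated fields. It is not the error I actually hit, just the underlying aliasing problem; Item and SingletonFactoryDemo are hypothetical names, and a plain List<T> stands in for a repeated field.

```csharp
using System.Collections.Generic;

public sealed class Item
{
    public int Id;
}

public static class SingletonFactoryDemo
{
    private static readonly Item Shared = new Item();

    // "Singleton factory": resets and returns the same instance on every call.
    private static Item SharedFactory()
    {
        Shared.Id = 0;
        return Shared;
    }

    // Fills a list the way a repeated field is filled: one factory call per element.
    public static List<Item> Fill(bool useSingleton)
    {
        var list = new List<Item>();
        for (int id = 1; id <= 3; id++)
        {
            Item item = useSingleton ? SharedFactory() : new Item();
            item.Id = id;
            list.Add(item);
        }
        return list;
    }
}
```

With the singleton, all three list entries are the same object and carry the last Id written; with real allocations (or a pool that hands out distinct objects) each element keeps its own value.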
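Thoughts on the object pool design:
- The Protobuf source needs a pool. Every object that factory() hands out must go back to the pool once the business code is done with it, and the next request is served from the pool first.
- Each time the Parser calls MergeFrom (think of this as the moment the business code takes an object out of the pool), the data members of the pooled object must be reset to default or cleared. Value-type fields are set to default; repeated and map fields are reference types and need Clear. Note that a .proto may nest repeated<message> inside repeated<message> inside repeated<int>, so the cleanup has to be recursive.
- The .cs file that contains the Parser is generated by protoc.exe, so the code generator itself has to change, i.e. the protoc.exe source.
- Design the recycling strategy for the business code, i.e. decide when an object counts as finished and is returned to the pool.
On the first point I fell into a small trap. Because every message has a different type, I built a Dictionary<className, MObjectPool> of pools. It worked, but I then noticed that the number of pools in it was always 1, and only then did I understand the design idea behind this line: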
private static readonly pb::MessageParser<ToyTrackingSurvivorData> _parser = new pb::MessageParser<ToyTrackingSurvivorData>(() => new ToyTrackingSurvivorData());
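Because MessageParser<T> is generic, every message class gets its own _parser, in a one-to-one mapping. No Dictionary is needed: each Parser simply carries its own MObjectPool, which makes the code much simpler.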
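The language rule behind this is that a static field in a generic class exists once per closed type. A minimal sketch, with hypothetical names (PerType, MsgA, MsgB) that are not part of Protobuf:

```csharp
public static class PerType<T>
{
    // One independent copy of this field exists for each closed type PerType<T>.
    public static int Created;
}

public sealed class MsgA { }
public sealed class MsgB { }

public static class PerTypeDemo
{
    // Writes a different value per message type and reads both back.
    public static int[] Run()
    {
        PerType<MsgA>.Created = 2;
        PerType<MsgB>.Created = 5;
        return new[] { PerType<MsgA>.Created, PerType<MsgB>.Created };
    }
}
```

The two writes do not overwrite each other, which is why a pool stored per MObjcetPoolMgr<T> or per MessageParser<T> is already keyed by message type.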
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text;

namespace Google.Protobuf
{
    internal interface IObjectPool
    {
        int countAll { get; }       // total number of objects
        int countActive { get; }    // objects currently in use
        int countInActive { get; }  // objects idle in the queue
    }

    internal class MObjectPool<T> : IObjectPool
    {
        private const int LimitNum = 1024;
        private readonly Queue<T> _queue = new Queue<T>(LimitNum);
        private readonly Func<T> _create;
        private readonly Action<T> _get;
        private readonly Action<T> _release;
        private readonly Action<T> _destroy;

        public int countAll { get; private set; }
        public int countActive { get { return countAll - countInActive; } }
        public int countInActive { get { return _queue.Count; } }

        public MObjectPool(Func<T> create, Action<T> get = null, Action<T> release = null, Action<T> destroy = null)
        {
            _create = create;
            _get = get;
            _release = release;
            _destroy = destroy;
        }

        public T Get()
        {
            T t;
            if (_queue.Count == 0)
            {
                // Pool is empty: create a new object.
                t = _create();
                countAll++;
            }
            else
            {
                t = _queue.Dequeue();
            }
            _get?.Invoke(t);
            return t;
        }

        public void Recycle(T t)
        {
            if (t == null) return;
            if (countInActive < LimitNum)
            {
                _queue.Enqueue(t);
            }
            else
            {
                // Pool is full: drop the object and let the GC collect it.
                countAll--;
            }
            _release?.Invoke(t);
        }

        public void Destroy()
        {
            if (_destroy != null)
            {
                while (_queue.Count > 0)
                {
                    _destroy(_queue.Dequeue());
                }
            }
            _queue.Clear();
            countAll = 0;
        }
    }

    public class MObjcetPoolMgr<T> where T : IMessage<T>
    {
        private MObjectPool<T> _pool;
        private static MObjcetPoolMgr<T> _instance;

        public static MObjcetPoolMgr<T> Instance
        {
            get
            {
                if (_instance == null)
                {
                    _instance = new MObjcetPoolMgr<T>();
                }
                return _instance;
            }
        }

        public T Get(Func<T> create, Action<T> get = null, Action<T> release = null, Action<T> clear = null)
        {
            if (_pool == null)
            {
                // The pool is created lazily on the first Get.
                _pool = new MObjectPool<T>(create, get, release, clear);
            }
            var t = _pool.Get();
            //log("Get");
            return t;
        }

        public void Recycle(T t)
        {
            // Null-guarded so that recycling before the first Get does not throw.
            _pool?.Recycle(t);
            //log("Recycle");
        }

        public void Destroy()
        {
            _pool?.Destroy();
            //log("Destroy");
        }

        private static StringBuilder str = new StringBuilder();

        private void log(string op)
        {
            str.Clear();
            str.Append($"[{nameof(MObjcetPoolMgr<T>)}][{op}] {typeof(T).Name} countAll:{_pool.countAll.ToString()} countActive:{_pool.countActive.ToString()} countInActive:{_pool.countInActive.ToString()}");
            UnityEngine.Debug.Log(str.ToString());
        }
    }
}
MessageParser then calls it as follows (abridged). With that, the pool that replaces factory() is done!
public new T ParseFrom(CodedInputStream input)
{
    //T message = factory();
    T message = _poolGet();
    MergeFrom(message, input);
    return message;
}

private T _poolGet()
{
    return MObjcetPoolMgr<T>.Instance.Get(factory);
}

public void PoolRecycle(T t)
{
    if (t == null) return;
    MObjcetPoolMgr<T>.Instance.Recycle(t);
}

public void PoolDestroy()
{
    MObjcetPoolMgr<T>.Instance.Destroy();
}
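Points 2 and 3 are really one problem: how to change the template code that protoc.exe generates. The references online are rather scattered, and I pieced this together from them. The idea is to add a MessageClear method to every message class that clears the data of a pooled object, and then to call MessageClear() each time the object is used. Here are my changes: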
First change in csharp_message.cc:
WriteGeneratedCodeAttributes(printer);
printer->Print("public void MessageClear()\n{\n");
for (int i = 0; i < descriptor_->field_count(); i++) {
  const FieldDescriptor* fieldDescriptor = descriptor_->field(i);
  std::string fieldName = UnderscoresToCamelCase(fieldDescriptor->name(), false);
  if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_MESSAGE ||
      fieldDescriptor->type() == FieldDescriptor::Type::TYPE_GROUP) {
    if (fieldDescriptor->is_repeated()) {
      if (fieldDescriptor->is_map()) {
        if (fieldDescriptor->message_type()->map_value()->type() == FieldDescriptor::Type::TYPE_MESSAGE ||
            fieldDescriptor->message_type()->map_value()->type() == FieldDescriptor::Type::TYPE_GROUP) {
          // map<K, Message>: MapField is indexed by key, not by position, so iterate over the
          // entries instead of using an integer index (the enumerator itself may allocate).
          printer->Print(" if($field_name$_ != null) { foreach (var entry in $field_name$_) { entry.Value.MessageClear(); } $field_name$_.Clear(); }\n", "field_name", fieldName);
        } else {
          // map with scalar values: clearing the map is enough.
          printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
        }
      } else {
        // repeated message: clear every element recursively, then the list.
        printer->Print(" if($field_name$_ != null) { for (int i = 0, size = $field_name$_.Count; i < size; i++) { $field_name$_[i].MessageClear(); } $field_name$_.Clear(); }\n", "field_name", fieldName);
      }
    } else {
      // singular message: clear it recursively.
      printer->Print(" if($field_name$_ != null) $field_name$_.MessageClear();\n", "field_name", fieldName);
    }
  } else {
    if (fieldDescriptor->is_repeated()) {
      // repeated scalar
      printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
    } else {
      if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_BYTES) {
        printer->Print(" if($field_name$_.Length != 0) $field_name$_ = pb::ByteString.Empty;\n", "field_name", fieldName);
      } else if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_ENUM) {
        printer->Print(" $field_name$_ = $field_type$.$default_value$;\n",
                       "field_type", GetClassName(fieldDescriptor->enum_type()),
                       "field_name", fieldName,
                       "default_value", GetEnumValueName(fieldDescriptor->default_value_enum()->type()->name(), fieldDescriptor->default_value_enum()->name()));
      } else if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_STRING) {
        printer->Print(" $field_name$_ = $default_value$;\n", "field_name", fieldName, "default_value", "\"\"");
      } else {
        // remaining scalar types
        printer->Print(" $field_name$_ = $default_value$;\n", "field_name", fieldName, "default_value", "default");
      }
    }
  }
}
printer->Print("}\n");
Second change in csharp_message.cc:
printer->Print("MessageClear();\n");
Third change in csharp_message.cc:
printer->Indent();
printer->Print("MessageClear();\n");
printer->Outdent();
With that, the code generated by protoc.exe is fixed, which takes care of points 2 and 3!
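For orientation, this is roughly the shape of MessageClear that such a generator would emit for a message with a scalar, a string, and a repeated nested message. It is a hand-written sketch, not actual protoc output: Inner and Outer are hypothetical messages, and List<T> stands in for pbc::RepeatedField<T> so that the snippet compiles on its own.

```csharp
using System.Collections.Generic;

// Hypothetical nested message with a repeated int field.
public sealed class Inner
{
    public readonly List<int> values_ = new List<int>();

    public void MessageClear()
    {
        if (values_ != null) values_.Clear();
    }
}

// Hypothetical outer message: scalar, string and repeated message fields.
public sealed class Outer
{
    public int id_;
    public string name_ = "";
    public readonly List<Inner> inners_ = new List<Inner>();

    public void MessageClear()
    {
        id_ = default;
        name_ = "";
        if (inners_ != null)
        {
            // Recurse into every element before dropping the list contents.
            for (int i = 0, size = inners_.Count; i < size; i++) { inners_[i].MessageClear(); }
            inners_.Clear();
        }
    }
}

public static class MessageClearDemo
{
    // Builds a message with data in every field, as if it came back from the pool.
    public static Outer MakeDirty()
    {
        var outer = new Outer { id_ = 7, name_ = "old" };
        var inner = new Inner();
        inner.values_.Add(1);
        inner.values_.Add(2);
        outer.inners_.Add(inner);
        return outer;
    }
}
```

Calling MessageClear() on the result of MakeDirty() leaves every field at its default, including the int list inside the nested element.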
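Next is point 4, the recycling strategy in business code. This depends heavily on the project and many places need manual changes, but fortunately there is a template to follow: we generate our protocol code with the ProtoGen.exe tool, so it is enough to recycle each protocol message into the pool once it has been used.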
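One possible shape for "recycle after use" is a borrow-and-return wrapper around the message handler. This is only a sketch of the pattern, not our ProtoGen.exe template: LoginReply, SimplePool and RecycleDemo are hypothetical, and Pool.Get() stands in for Parser.ParseFrom(...), while Pool.Recycle stands in for PoolRecycle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical protocol message.
public sealed class LoginReply
{
    public int Code;
}

// Hypothetical minimal pool standing in for MObjectPool<T>.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> idle = new Stack<T>();
    private readonly Func<T> create;

    public SimplePool(Func<T> create) { this.create = create; }

    public int IdleCount { get { return idle.Count; } }

    public T Get() { return idle.Count > 0 ? idle.Pop() : create(); }

    public void Recycle(T item) { if (item != null) idle.Push(item); }
}

public static class RecycleDemo
{
    public static readonly SimplePool<LoginReply> Pool =
        new SimplePool<LoginReply>(() => new LoginReply());

    // Borrows a message, lets the handler use it, and always returns it to the pool.
    public static void Handle(Action<LoginReply> handler)
    {
        LoginReply msg = Pool.Get();
        try
        {
            handler(msg);
        }
        finally
        {
            // Runs even if the handler throws, so the object is never leaked from the pool.
            Pool.Recycle(msg);
        }
    }
}
```

The catch with this pattern is that a handler must not keep a reference to the message after it returns, because the same object will be cleared and reused by the next parse.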