Learning LuaJIT (5): The string.buffer Library

Contents

    • Using the String Buffer Library
      • Buffer Objects
      • Buffer Method Overview
    • Buffer Creation and Management
      • `local buf = buffer.new([size [,options]])`, `local buf = buffer.new([options])`
      • `buf = buf:reset()`
      • `buf = buf:free()`
    • Buffer Writers
      • `buf = buf:put([str|num|obj] [,…])`
      • `buf = buf:putf(format, …)`
      • `buf = buf:putcdata(cdata, len)` (FFI)
      • `buf = buf:set(str)`
      • `buf = buf:set(cdata, len)` (FFI)
      • `ptr, len = buf:reserve(size)` (FFI)
      • `buf = buf:commit(used)` (FFI)
    • Buffer Readers
      • `len = #buf`
      • `res = str|num|buf .. str|num|buf […]`
      • `buf = buf:skip(len)`
      • `str, … = buf:get([len|nil] [,…])`
      • `str = buf:tostring()`
      • `str = tostring(buf)`
      • `ptr, len = buf:ref()` (FFI)
    • Serialization of Lua Objects
        • Example: serializing Lua objects
    • Error handling
    • FFI caveats
        • Explanation of the example

The string buffer library allows high-performance manipulation of string-like data.

Unlike Lua strings, which are constants, string buffers are mutable sequences of 8-bit (binary-transparent) characters. Data can be stored, formatted and encoded into a string buffer and later converted, extracted or decoded.

The convenient string buffer API simplifies common string manipulation tasks that would otherwise require creating many intermediate strings. String buffers improve performance by eliminating redundant memory copies, object creation, string interning and garbage collection overhead. In conjunction with the FFI library, they allow zero-copy operations.

The string buffer library also includes a high-performance serializer for Lua objects.

Using the String Buffer Library

The string buffer library is built into LuaJIT by default, but it’s not loaded by default. Add this to the start of every Lua file that needs one of its functions:

local buffer = require("string.buffer")

The convention for the syntax shown on this page is that buffer refers to the buffer library and buf refers to an individual buffer object.

Please note the difference between a Lua function call, e.g. buffer.new() (with a dot) and a Lua method call, e.g. buf:reset() (with a colon).

Buffer Objects

A buffer object is a garbage-collected Lua object. After creation with buffer.new(), it can (and should) be reused for many operations. When the last reference to a buffer object is gone, it will eventually be freed by the garbage collector, along with the allocated buffer space.

Buffers operate like a FIFO (first-in first-out) data structure. Data can be appended (written) to the end of the buffer and consumed (read) from the front of the buffer. These operations may be freely mixed.
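The FIFO behavior can be illustrated with a minimal sketch. It assumes a LuaJIT 2.1 build that ships string.buffer; the variable names are only illustrative.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
buf:put("hello", " ", "world")  -- append (write) to the end
local first = buf:get(5)         -- consume (read) 5 bytes from the front
buf:put("!")                     -- writes may follow reads
local rest = buf:get()           -- consume everything that is left

print(first)  -- hello
print(rest)   --  world!
print(#buf)   -- 0
```

Reads always take the oldest data, so the "!" written after the first read ends up at the end of the second one.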

The buffer space that holds the characters is managed automatically — it grows as needed and already consumed space is recycled. Use buffer.new(size) and buf:free(), if you need more control.

The maximum size of a single buffer is the same as the maximum size of a Lua string, which is slightly below two gigabytes. For huge data sizes, neither strings nor buffers are the right data structure — use the FFI library to directly map memory or files up to the virtual memory limit of your OS.

Buffer Method Overview

  • The buf:put*()-like methods append (write) characters to the end of the buffer.
  • The buf:get*()-like methods consume (read) characters from the front of the buffer.
  • Other methods, like buf:tostring(), only read the buffer contents and don’t change the buffer.
  • The buf:set() method allows zero-copy consumption of a string or an FFI cdata object as a buffer.
  • The FFI-specific methods allow zero-copy read/write-style operations or modifying the buffer contents in-place. Please check the FFI caveats below, too.
  • Methods that don’t need to return anything specific return the buffer object itself as a convenience. This allows method chaining, e.g.: buf:reset():encode(obj) or buf:skip(len):get()
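Method chaining might look like the following sketch. The "id=42;" payload is made up for illustration and is not part of the library.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()

-- reset() and put() both return the buffer, so the calls can be chained.
local text = buf:reset():put("id=", 42):put(";"):tostring()

-- skip() also returns the buffer: drop the 3-byte "id=" prefix, then read 2 bytes.
local value = buf:skip(3):get(2)

print(text)   -- id=42;
print(value)  -- 42
```

tostring() does not consume anything, which is why skip() and get() still see the full contents afterwards.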

Buffer Creation and Management

local buf = buffer.new([size [,options]])
local buf = buffer.new([options])

Creates a new buffer object.

The optional size argument ensures a minimum initial buffer size. This is strictly an optimization when the required buffer size is known beforehand. The buffer space will grow as needed, in any case.

The optional table options sets various serialization options.

buf = buf:reset()

Reset (empty) the buffer. The allocated buffer space is not freed and may be reused.

buf = buf:free()

The buffer space of the buffer object is freed. The object itself remains intact, empty and may be reused.

Note: you normally don’t need to use this method. The garbage collector automatically frees the buffer space when the buffer object is collected. Use this method if you need to free the associated memory immediately.
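A sketch of the lifecycle described above: one buffer is created with a size hint, reused through reset(), and finally released with free(). The size 4096 and the line contents are arbitrary choices for illustration.

```lua
local buffer = require("string.buffer")

-- The size argument is only a hint for the initial allocation.
local buf = buffer.new(4096)

local lines = {}
for i = 1, 3 do
  -- reset() empties the buffer but keeps the allocated space for reuse.
  buf:reset():put("line ", i)
  lines[i] = buf:tostring()
end

-- free() releases the buffer space right away; the object itself stays usable.
buf:free()
local len_after_free = #buf
buf:put("still usable")
local reused = buf:get()
```

Reusing one buffer this way avoids allocating a fresh buffer object per iteration.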

Buffer Writers

buf = buf:put([str|num|obj] [,…])

Appends a string str, a number num or any object obj with a __tostring metamethod to the buffer. Multiple arguments are appended in the given order.

Appending a buffer to a buffer is possible and short-circuited internally, but it still involves a copy. It is better to combine the buffer writes so that they go to a single buffer.
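As a small sketch, put() can mix strings, numbers and objects that define __tostring. The point table and its metatable are hypothetical and exist only for this example.

```lua
local buffer = require("string.buffer")

-- A hypothetical object with a __tostring metamethod.
local point = setmetatable({ x = 1, y = 2 }, {
  __tostring = function(p) return "(" .. p.x .. "," .. p.y .. ")" end,
})

local buf = buffer.new()
buf:put("p=", point, " n=", 3.5)  -- arguments are appended in order

local s = buf:tostring()
print(s)  -- p=(1,2) n=3.5
```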

buf = buf:putf(format, …)

Appends the formatted arguments to the buffer. The format string supports the same options as string.format().
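A brief sketch of putf() with a few common string.format() conversions; the format string and values are arbitrary.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
-- Left-aligned string, zero-padded integer, float with two decimals.
buf:putf("%-5s|%03d|%.2f", "ab", 7, 3.14159)

local s = buf:tostring()
print(s)  -- ab   |007|3.14
```

Compared with buf:put(string.format(...)), this avoids creating the intermediate formatted string.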

buf = buf:putcdata(cdata, len) (FFI)

Appends the given len number of bytes from the memory pointed to by the FFI cdata object to the buffer. The object needs to be convertible to a (constant) pointer.
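The following sketch appends bytes from an FFI array. The array contents are illustrative; only the first three of its four bytes are copied.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

-- Four bytes: "H", "i", "!" and a trailing zero byte.
local arr = ffi.new("uint8_t[4]", { 72, 105, 33, 0 })

local buf = buffer.new()
buf:putcdata(arr, 3)  -- copy only the first 3 bytes into the buffer

local s = buf:tostring()
print(s)  -- Hi!
```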

buf = buf:set(str)

buf = buf:set(cdata, len) (FFI)

This method allows zero-copy consumption of a string or an FFI cdata object as a buffer. It stores a reference to the passed string str or the FFI cdata object in the buffer. Any buffer space originally allocated is freed. This is not an append operation, unlike the buf:put*() methods.

After calling this method, the buffer behaves as if buf:free():put(str) or buf:free():putcdata(cdata, len) had been called. However, the data is only referenced and not copied, as long as the buffer is only consumed.

In case the buffer is written to later on, the referenced data is copied and the object reference is removed (copy-on-write semantics).

The stored reference is an anchor for the garbage collector and keeps the originally passed string or FFI cdata object alive.
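A sketch of buf:set(str) followed by a later write. The request-like payload is made up. Whether a copy happens is not observable from Lua, so the comments only restate the semantics described above.

```lua
local buffer = require("string.buffer")

local payload = "GET /index.html"
local buf = buffer.new()
buf:put("old contents")

-- set() replaces the previous contents and only references `payload`.
buf:set(payload)

local method = buf:get(3)  -- consuming does not require a copy
buf:skip(1)                -- drop the space after the method

-- Writing to the buffer triggers the copy-on-write step described above.
buf:put("?x=1")
local path = buf:tostring()

print(method)  -- GET
print(path)    -- /index.html?x=1
```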

ptr, len = buf:reserve(size) (FFI)

buf = buf:commit(used) (FFI)

The reserve method reserves at least size bytes of write space in the buffer. It returns a uint8_t * FFI cdata pointer ptr that points to this space.

The available length in bytes is returned in len. This is at least size bytes, but may be more to facilitate efficient buffer growth. You can either make use of the additional space or ignore len and only use size bytes.

The commit method appends the used bytes of the previously returned write space to the buffer data.

This pair of methods allows zero-copy use of C read-style APIs:

local MIN_SIZE = 65536
repeat
  local ptr, len = buf:reserve(MIN_SIZE)
  local n = C.read(fd, ptr, len)
  if n == 0 then break end -- EOF.
  if n < 0 then error("read error") end
  buf:commit(n)
until false

The reserved write space is not initialized. At least the used bytes must be written to before calling the commit method. There’s no need to call the commit method, if nothing is added to the buffer (e.g. on error).
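For a version that runs without a file descriptor, the sketch below fills the reserved space with ffi.copy() instead of a C read call. The source string "chunk" stands in for data a real producer would write.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local buf = buffer.new()
local src = "chunk"

-- Ask for at least #src bytes; len may be larger than requested.
local ptr, len = buf:reserve(#src)
local enough = len >= #src

-- Write into the reserved space, then commit only the bytes actually written.
ffi.copy(ptr, src, #src)
buf:commit(#src)

local s = buf:tostring()
print(s)  -- chunk
```

Only the committed bytes become part of the buffer data; any extra reserved space is simply left unused.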

Buffer Readers

len = #buf

Returns the current length of the buffer data in bytes.

res = str|num|buf .. str|num|buf […]

The Lua concatenation operator .. also accepts buffers, just like strings or numbers. It always returns a string and not a buffer.

Note that although this is supported for convenience, this thwarts one of the main reasons to use buffers, which is to avoid string allocations. Rewrite it with buf:put() and buf:get().

Mixing this with unrelated objects that have a __concat metamethod may not work, since these probably only expect strings.
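The sketch below contrasts the two styles. Both produce the same string. The first creates it through the concatenation operator, the second keeps all writes inside a buffer.

```lua
local buffer = require("string.buffer")

-- Convenience form: the .. operator accepts a buffer and returns a Lua string.
local buf = buffer.new():put("abc")
local joined = buf .. "def"

-- Buffer form: append everything with put() and extract once with get().
local out = buffer.new()
out:put("abc", "def")
local result = out:get()

print(type(joined), joined)  -- string	abcdef
print(result)                -- abcdef
```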

buf = buf:skip(len)

Skips (consumes) len bytes from the buffer up to the current length of the buffer data.

str, … = buf:get([len|nil] [,…])

Consumes the buffer data and returns one or more strings. If called without arguments, the whole buffer data is consumed. If called with a number, up to len bytes are consumed. A nil argument consumes the remaining buffer space (this only makes sense as the last argument). Multiple arguments consume the buffer data in the given order.

Note: a zero length or no remaining buffer data returns an empty string and not nil.
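The different argument forms of get() can be sketched as follows; the digit string is arbitrary test data.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("0123456789")

-- Two fixed-length reads, then nil to take whatever remains.
local a, b, rest = buf:get(2, 3, nil)

-- The buffer is now empty: get() returns an empty string, not nil.
local empty = buf:get()

-- Asking for more than is available returns only what is there.
local short = buffer.new():put("xy"):get(10)

print(a, b, rest)  -- 01	234	56789
```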

str = buf:tostring()

str = tostring(buf)

Creates a string from the buffer data, but doesn’t consume it. The buffer remains unchanged.

Buffer objects also define a __tostring metamethod. This means buffers can be passed to the global tostring() function and many other functions that accept this in place of strings. The important internal uses in functions like io.write() are short-circuited to avoid the creation of an intermediate string object.
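A short sketch of the difference between the non-consuming tostring forms and the consuming get():

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("data")

local s1 = buf:tostring()  -- method form
local s2 = tostring(buf)   -- goes through the __tostring metamethod
local len_before = #buf    -- still 4: nothing was consumed

local s3 = buf:get()       -- get() consumes the data
local len_after = #buf     -- 0
```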

ptr, len = buf:ref() (FFI)

Returns a uint8_t * FFI cdata pointer ptr that points to the buffer data. The length of the buffer data in bytes is returned in len.

The returned pointer can be directly passed to C functions that expect a buffer and a length. You can also do bytewise reads (local x = ptr[i]) or writes (ptr[i] = 0x40) of the buffer data.

In conjunction with the skip method, this allows zero-copy use of C write-style APIs:

repeat
  local ptr, len = buf:ref()
  if len == 0 then break end
  local n = C.write(fd, ptr, len)
  if n < 0 then error("write error") end
  buf:skip(n)
until n >= len

Unlike Lua strings, buffer data is not implicitly zero-terminated. It’s not safe to pass ptr to C functions that expect zero-terminated strings. If you’re not using len, then you’re doing something wrong.
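A sketch of bytewise access through ref() that needs no C function. It always passes len explicitly, as recommended above.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local buf = buffer.new():put("abc")
local ptr, len = buf:ref()

local first_byte = ptr[0]  -- 97, the byte value of "a"; valid indexes are 0 .. len-1
ptr[0] = 0x41              -- overwrite it in place with "A"

-- Always pass the length: the data is not zero-terminated.
local copy = ffi.string(ptr, len)
local s = buf:tostring()

print(copy, s)  -- Abc	Abc
```

The pointer is used here only while the buffer is left unmodified, which matches the validity rule for returned pointers.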

Serialization of Lua Objects

This section is skipped here.

Example: serializing Lua objects

local buffer = require("string.buffer")

-- Metatables that supply default values for missing keys.
local mt1 = { __index = function(t, k) return "default" end }
local mt2 = { __index = function(t, k) return "another default" end }

-- The tables to serialize.
local t1 = setmetatable({ key1 = "value1", key2 = "value2" }, mt1)
local t2 = setmetatable({ key1 = "value3", key2 = "value4" }, mt2)

-- Arrays of common table keys and of metatables, passed as serialization options.
local dict = { "key1", "key2" }
local metatable = { mt1, mt2 }
local buf = buffer.new({ dict = dict, metatable = metatable })

-- buf:encode() appends the serialized data to the buffer and returns the buffer itself.
buf:encode({ t1, t2 })

-- buf:decode() takes no argument: it consumes one serialized object from the buffer.
local decoded_data = buf:decode()

-- Access the decoded data.
for _, tbl in ipairs(decoded_data) do
  print(tbl.key1, tbl.key2)
end
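When no options are needed, the stand-alone functions buffer.encode() and buffer.decode() convert directly between Lua objects and strings. The following round trip is a minimal sketch; the table contents are made up.

```lua
local buffer = require("string.buffer")

local original = { name = "luajit", tags = { "jit", "ffi" }, count = 3 }

-- buffer.encode() returns the serialized form as a Lua string.
local str = buffer.encode(original)

-- buffer.decode() rebuilds a new, independent object from that string.
local copy = buffer.decode(str)

print(copy.name, copy.tags[2], copy.count)  -- luajit	ffi	3
```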

Error handling

Many of the buffer methods can throw an error. Out-of-memory or usage errors are best caught with an outer wrapper for larger parts of code. There’s not much one can do after that, anyway.

On the other hand, you may want to catch some errors individually. Buffer methods need to receive the buffer object as the first argument. The Lua colon-syntax obj:method() does that implicitly. But to wrap a method with pcall(), the arguments need to be passed like this:

local ok, err = pcall(buf.encode, buf, obj)
if not ok then
  -- Handle error in err.
end
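A runnable sketch of this pattern follows. It relies on the assumption that the serializer rejects function values, which is used here only to provoke an error.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()

-- Assumption: functions cannot be serialized, so this call raises an error.
local ok, err = pcall(buf.encode, buf, function() end)
if not ok then
  -- Discard anything that may have been written before the error.
  buf:reset()
end

-- The same buffer can be used again after the failed call.
local ok2 = pcall(buf.encode, buf, { answer = 42 })
local decoded = ok2 and buf:decode() or nil
```

Note that buf.encode (with a dot) only looks up the method; the buffer itself is passed explicitly as the first argument to pcall().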

FFI caveats

The string buffer library has been designed to work well together with the FFI library. But due to the low-level nature of the FFI library, some care needs to be taken:

First, please remember that FFI pointers are zero-indexed. The space returned by buf:reserve() and buf:ref() starts at the returned pointer and ends before len bytes after that.

I.e. the first valid index is ptr[0] and the last valid index is ptr[len-1]. If the returned length is zero, there’s no valid index at all. The returned pointer may even be NULL.

The space pointed to by the returned pointer is only valid as long as the buffer is not modified in any way (neither append, nor consume, nor reset, etc.). The pointer is also not a GC anchor for the buffer object itself.

Buffer data is only guaranteed to be byte-aligned. Casting the returned pointer to a data type with higher alignment may cause unaligned accesses. It depends on the CPU architecture whether this is allowed or not (it’s always OK on x86/x64 and mostly OK on other modern architectures).

FFI pointers or references do not count as GC anchors for an underlying object. E.g. an array allocated with ffi.new() is anchored by buf:set(array, len), but not by buf:set(array+offset, len). The addition of the offset creates a new pointer, even when the offset is zero. In this case, you need to make sure there’s still a reference to the original array as long as its contents are in use by the buffer.

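Explanation of the example:
  1. Normal reference: with buf:set(array, len), array is an array created through the FFI and passed directly to the buffer. The buffer stores a reference to it, so array is not collected as long as buf exists and holds that reference. The stored reference acts as the GC anchor for array.
  2. After adding an offset: array + offset (a reference to some element inside array) creates a new pointer object. This happens even when offset is zero. The garbage collector does not treat this new pointer as a reference to array.
    • Problem: if only array + offset is kept, the garbage collector may decide that the original array is no longer in use and free it, even though buf still depends on its contents. Accessing the freed memory is undefined behavior and may crash the program.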
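The anchoring rule can be sketched as follows. The array size, its contents and the offset 4 are arbitrary; the important part is that the original array stays referenced while the offset pointer is in use.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local array = ffi.new("uint8_t[8]", { 1, 2, 3, 4, 5, 6, 7, 8 })

-- Case 1: the array itself is passed, so the buffer anchors it.
local whole = buffer.new()
whole:set(array, 8)

-- Case 2: array + 4 is a new pointer that does not anchor `array`.
-- The `array` variable must stay referenced while `view` uses the memory.
local view = buffer.new()
view:set(array + 4, 4)
local tail = view:get()  -- bytes 5, 6, 7, 8

-- `array` is still referenced here, which keeps the memory alive until this point.
local first = array[0]
```

In real code the reference to the original array is usually kept in a table or an upvalue that lives at least as long as the buffer.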

Even though each LuaJIT VM instance is single-threaded (but you can create multiple VMs), FFI data structures can be accessed concurrently. Be careful when reading/writing FFI cdata from/to buffers to avoid concurrent accesses or modifications. In particular, the memory referenced by buf:set(cdata, len) must not be modified while buffer readers are working on it. Shared, but read-only memory mappings of files are OK, but only if the file does not change.
