Contents
- Problem description
- Versions
- Investigation: printing the largest Record
- Investigation: RefSubRecord
- Solution
- Code
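Problem description
When reading an .xls file with apache.poi, I got the error The content of an excel record cannot exceed 8224 bytes. The file being read had itself been written with apache.poi. My update procedure is to delete a sheet and then write a sheet again (the sheet name stays the same). A single update of this kind produces exactly the result I expect, but at some point the program failed to read the file with the error below, and the file could no longer be opened manually either.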
Exception in thread "main" org.apache.poi.util.RecordFormatException: The content of an excel record cannot exceed 8224 bytes
    at org.apache.poi.hssf.record.RecordInputStream.nextRecord(RecordInputStream.java:222)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:253)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:494)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:356)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:413)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:394)
    at com.mark.learning.bug.excel.ExcelXls.addSheet(ExcelXls.java:28)
    at com.mark.learning.bug.excel.ExcelXls.main(ExcelXls.java:84)
Versions
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.17</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.17</version>
</dependency>
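Investigation: printing the largest Record
Since the message says that some Record exceeds the limit, I printed the records to see what they contain. This eventually led me to the _list field of the ExternSheetRecord class.
Size of record #267: 6066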
[EXTERNSHEET]numOfRefs = 1010
refrec #0: extBook=0 firstSheet=-1 lastSheet=-1
refrec #1: extBook=0 firstSheet=-1 lastSheet=-1
refrec #2: extBook=0 firstSheet=-1 lastSheet=-1
refrec #3: extBook=0 firstSheet=-1 lastSheet=-1
refrec #4: extBook=0 firstSheet=-1 lastSheet=-1
refrec #5: extBook=0 firstSheet=-1 lastSheet=-1
public class ExternSheetRecord extends StandardRecord {
    public final static short sid = 0x0017;
    private final List<RefSubRecord> _list; // this list holds a very large number of entries
    // ...
}
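The printed size can be checked by hand. The sketch below assumes the usual BIFF8 layout for EXTERNSHEET: a 4-byte record header, a 2-byte reference count, and 6 bytes per reference (three 2-byte fields). It also assumes that the 8224-byte limit applies to the record data without the header. The class and method names are made up for illustration and are not part of POI.

```java
public class ExternSheetSize {
    static final int HEADER_SIZE = 4;             // assumed: 2-byte sid + 2-byte length
    static final int COUNT_FIELD_SIZE = 2;        // assumed: numOfRefs stored as a 16-bit value
    static final int REF_SIZE = 6;                // assumed: extBook, firstSheet, lastSheet (2 bytes each)
    static final int MAX_RECORD_DATA_SIZE = 8224; // limit quoted in the exception message

    /** Size of the record body for a given number of references. */
    public static int dataSize(int numOfRefs) {
        return COUNT_FIELD_SIZE + numOfRefs * REF_SIZE;
    }

    /** Size of the whole record, header included. */
    public static int recordSize(int numOfRefs) {
        return HEADER_SIZE + dataSize(numOfRefs);
    }

    /** Largest number of references whose body still fits within the limit. */
    public static int maxRefs() {
        return (MAX_RECORD_DATA_SIZE - COUNT_FIELD_SIZE) / REF_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(recordSize(1010));        // 6066, the size printed for record #267
        System.out.println(maxRefs());               // 1370
        System.out.println(dataSize(maxRefs() + 1)); // 8228, which is over the 8224 limit
    }
}
```

Under these assumptions, 1010 references give exactly the 6066 bytes shown in the output, and the record can no longer be read back once the list holds more than 1370 entries.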
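Investigation: RefSubRecord
What information does a RefSubRecord hold, and when is it created? I set a breakpoint in its constructor and found that a RefSubRecord entry is created whenever a sheet is deleted or added. The interesting part is that deleting a sheet sets the firstSheetIndex and lastSheetIndex of the matching entry to -1, while the later add tries to find an existing entry by those same two values. The invalidated entry can no longer match, so a new one is appended, and the _list field of the ExternSheetRecord class keeps growing for as long as the program runs!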
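The growth described above can be illustrated with a small stand-alone model. This is a simplified sketch of the behaviour as described in this post, not POI's actual code: the class ExternSheetGrowthModel and all of its methods are hypothetical, and each reference is reduced to a pair of sheet indexes.

```java
import java.util.ArrayList;
import java.util.List;

public class ExternSheetGrowthModel {
    // One entry per reference: {firstSheetIndex, lastSheetIndex}
    private final List<int[]> refs = new ArrayList<>();

    /** Models sheet creation: reuse a reference pointing at sheetIndex, otherwise append one. */
    public int createSheet(int sheetIndex) {
        for (int i = 0; i < refs.size(); i++) {
            int[] ref = refs.get(i);
            if (ref[0] == sheetIndex && ref[1] == sheetIndex) {
                return i;
            }
        }
        refs.add(new int[] {sheetIndex, sheetIndex});
        return refs.size() - 1;
    }

    /** Models sheet removal: matching references are set to -1 but stay in the list. */
    public void removeSheet(int sheetIndex) {
        for (int[] ref : refs) {
            if (ref[0] == sheetIndex) {
                ref[0] = -1;
            }
            if (ref[1] == sheetIndex) {
                ref[1] = -1;
            }
        }
    }

    public int refCount() {
        return refs.size();
    }

    /** Creates one sheet, then repeats remove + create the given number of times. */
    public static int refsAfterCycles(int cycles) {
        ExternSheetGrowthModel model = new ExternSheetGrowthModel();
        model.createSheet(0);
        for (int i = 0; i < cycles; i++) {
            model.removeSheet(0);
            model.createSheet(0);
        }
        return model.refCount();
    }

    public static void main(String[] args) {
        System.out.println(refsAfterCycles(5)); // 6: one live reference plus five dead ones
    }
}
```

In this model every remove and create cycle leaves one dead (-1, -1) entry behind, which is the same pattern as the refrec lines in the printed record.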
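Solution
1. Switch the file type from .xls to the newer .xlsx.
2. Change the apache.poi version. When I switched to 3.8 the problem no longer occurred, because in 3.8 removing a sheet does not modify the RefSubRecord entries. (Note that 3.8 is an older release than 3.17, so this is a downgrade rather than an upgrade.)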
Removal logic in 3.17
Removal logic in 3.8
(the part highlighted in the red box above is absent)
Code
Code that reproduces the ...exceed 8224 bytes error
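import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.junit.Test; // JUnit 4 assumed; the original does not say which JUnit version is used

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ExcelTest {
    private static int createSheetCnt = 0;
    private static final String path = "C:\\Users\\Desktop\\test2.xls";

    public void addSheet() {
        try {
            // Read the whole workbook into memory and close the input stream
            // before the same file is opened for writing.
            HSSFWorkbook workbook;
            try (FileInputStream in = new FileInputStream(path)) {
                workbook = new HSSFWorkbook(in);
            }

            String sheetName = "test";
            int sheetIndex = workbook.getSheetIndex(sheetName);
            if (sheetIndex >= 0) {
                // The sheet exists, so remove it first
                workbook.removeSheetAt(sheetIndex);
            }
            // Create a new sheet with the same name and write the file back
            workbook.createSheet(sheetName);
            try (FileOutputStream fileOut = new FileOutputStream(path)) {
                workbook.write(fileOut);
            }
            System.out.println("Sheets created so far: " + ++createSheetCnt);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Test
    public void test() {
        // Repeat the remove + create cycle until reading the file fails
        for (int i = 0; i < 10000; i++) {
            addSheet();
        }
    }
}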
Method that prints the record information
// Belongs in the ExcelTest class above. Additional imports needed:
// java.util.List, org.apache.poi.hssf.model.InternalWorkbook, org.apache.poi.hssf.record.Record
public void printlnRecords() {
    try {
        HSSFWorkbook workbook;
        try (FileInputStream in = new FileInputStream(path)) {
            workbook = new HSSFWorkbook(in);
        }

        InternalWorkbook internalWorkbook = workbook.getInternalWorkbook();
        List<Record> records = internalWorkbook.getRecords();
        System.out.println("records size:" + records.size());

        // Print every record and remember which one is the largest
        int maxIndex = 0;
        int maxRecordSize = 0;
        for (int i = 0; i < records.size(); i++) {
            Record record = records.get(i);
            int recordSize = record.getRecordSize();
            System.out.println("Size of record #" + i + ": " + recordSize);
            System.out.println(record);
            System.out.println();
            if (recordSize > maxRecordSize) {
                maxRecordSize = recordSize;
                maxIndex = i;
            }
        }
        System.out.println("Record #" + maxIndex + " has the largest size: " + maxRecordSize);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}