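Preface
Apache PDFBox is an open-source Java library for basic operations on PDF documents. In practice it covers reading, writing, merging, and splitting PDFs, adding text and images, applying watermarks, and even electronic signatures. This article walks through these operations one by one as a reference for later use.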
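1. Introduction to PDFBox
The Apache PDFBox library is an open-source Java tool built specifically for working with PDF documents. It lets you create new PDF files, edit existing ones, and extract content from them.
Its main features are:
- Creating and modifying PDF documents
- Extracting text and images
- Handling forms and annotations
- Merging and splitting PDF documents
- Filling in PDF forms
- Saving PDF pages as images such as PNG or JPEG
- Digitally signing PDF files, for example with an electronic seal
Official website: Apache PDFBox | Download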
2. Core PDFBox features
2.1. Creating a PDF document
// PDFBox 2.x API; try-with-resources closes the document and the stream
try (PDDocument document = new PDDocument()) {
    // Create a page and add it to the document
    PDPage page = new PDPage(PDRectangle.A4);
    document.addPage(page);
    // Create a content stream for the page
    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        // Set the font and font size
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        // Draw text on the page
        contentStream.beginText();
        contentStream.newLineAtOffset(100, 700);
        contentStream.showText("Hello, World!");
        contentStream.endText();
    }
    // Save the PDF document
    document.save("output.pdf");
} catch (IOException e) {
    e.printStackTrace();
}
2.2. Reading a PDF document
try (PDDocument document = PDDocument.load(new File("mysql.pdf"))) {
    // PDFTextStripper extracts the text of all pages
    PDFTextStripper pdfStripper = new PDFTextStripper();
    String text = pdfStripper.getText(document);
    System.out.println(text);
} catch (IOException e) {
    e.printStackTrace();
}
2.3. Modifying a PDF document
- Writing a single line of text
try (PDDocument document = PDDocument.load(new File("eclipse安装与设置.pdf"))) {
    PDPage page = document.getPage(0);
    // APPEND mode keeps the existing page content
    try (PDPageContentStream contentStream = new PDPageContentStream(
            document, page, PDPageContentStream.AppendMode.APPEND, true)) {
        contentStream.beginText();
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        // Position of the text, measured from the lower-left corner of the page
        contentStream.newLineAtOffset(100, 600);
        contentStream.showText("Added signature here");
        contentStream.endText();
    }
    document.save("eclipse安装与设置V2.pdf");
} catch (IOException e) {
    e.printStackTrace();
}
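The fixed offset (100, 600) above can be replaced by a computed position when the line should be centered. The sketch below shows only the arithmetic, using the same formula as the utility classes later in this article (string width in 1/1000 units multiplied by the font size). The class and method names are hypothetical, and the page width of 595 points is an assumed value for an A4 page.

```java
class TextCenteringSketch {

    /** Converts a width given in 1/1000 text-space units to points at the given font size. */
    static float widthInPoints(float widthInGlyphUnits, float fontSize) {
        return widthInGlyphUnits * fontSize / 1000f;
    }

    /** Returns the x offset that horizontally centers text of the given width on the page. */
    static float centeredX(float pageWidth, float textWidth) {
        return (pageWidth - textWidth) / 2f;
    }

    public static void main(String[] args) {
        float pageWidth = 595f;                         // assumed A4 width in points
        float textWidth = widthInPoints(2000f, 36f);    // assumed string width of 2000 units at 36 pt
        System.out.println("x = " + centeredX(pageWidth, textWidth));
    }
}
```

With PDFBox, the first argument of widthInPoints would typically come from font.getStringWidth(message), and the result of centeredX would be passed to newLineAtOffset.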
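- Writing several lines of text
// Assumes document and page were obtained as in the previous snippet.
// Without AppendMode this constructor overwrites the existing page content.
try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
    // Helvetica covers Latin text only; for Chinese text, embed a TrueType font
    // with PDType0Font.load(document, fontFile) instead
    contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
    // Start position of the text
    float startX = 50;
    float startY = page.getMediaBox().getHeight() - 50;
    // Line spacing
    float leading = 15;
    // Lines to write
    String[] lines = {"First line of text", "Second line of text", "Third line of text"};
    contentStream.beginText();
    contentStream.newLineAtOffset(startX, startY);
    for (String line : lines) {
        contentStream.showText(line);
        // Move down by one line; offsets are relative to the previous line start
        contentStream.newLineAtOffset(0, -leading);
    }
    contentStream.endText();
} catch (IOException e) {
    e.printStackTrace();
}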
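The loop above writes each array element as one line and does not wrap long text, so a long paragraph has to be split into lines before it is written. The following is a minimal sketch of a greedy word wrap, not part of PDFBox: the class name, the method name, and the widthOf parameter are assumptions made for illustration. The width function is pluggable so that, with PDFBox, it could be backed by font.getStringWidth(s) / 1000f * fontSize (with the checked IOException handled inside the lambda).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

class LineWrapSketch {

    /**
     * Greedily packs whitespace-separated words into lines no wider than maxWidth.
     * A single word wider than maxWidth is placed on a line of its own.
     */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                // The word does not fit: finish the current line and start a new one
                lines.add(current.toString());
                current = new StringBuilder(word);
            } else {
                current = new StringBuilder(candidate);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in metric for the demo: every character is 6 units wide
        ToDoubleFunction<String> fixedWidth = s -> s.length() * 6.0;
        for (String line : wrap("PDFBox does not wrap long text for you", 60, fixedWidth)) {
            System.out.println(line);
        }
    }
}
```

The resulting list can be fed to the same showText / newLineAtOffset loop in place of the hard-coded array.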
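2.4. Inserting an image
Use PDImageXObject.createFromFileByExtension() to load the image file and create a PDImageXObject.
// Assumes document and page already exist
PDImageXObject image = PDImageXObject.createFromFileByExtension(new File("D:\\path\\to\\image.jpg"), document);
// Width and height of the image in pixels
float imageWidth = image.getWidth();
float imageHeight = image.getHeight();
// Example position of the lower-left corner of the image; choose your own values
float x = 50f;
float y = 500f;
try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
    // x and y are the coordinates; imageWidth and imageHeight are the drawn width and height
    contentStream.drawImage(image, x, y, imageWidth, imageHeight);
}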
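2.5. Extracting images
// Needs org.apache.pdfbox.cos.COSName, org.apache.pdfbox.pdmodel.PDResources
// and org.apache.pdfbox.pdmodel.graphics.PDXObject
try (PDDocument document = PDDocument.load(new File("jdk8.pdf"))) {
    PDPage page = document.getPage(0);
    PDResources resources = page.getResources();
    int imageNo = 1;
    // getXObjectNames() returns names only; look up each XObject and keep the images
    for (COSName name : resources.getXObjectNames()) {
        PDXObject xObject = resources.getXObject(name);
        if (xObject instanceof PDImageXObject) {
            BufferedImage bImage = ((PDImageXObject) xObject).getImage();
            ImageIO.write(bImage, "png", new File("pdf_image" + imageNo + ".png"));
            imageNo++;
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}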
2.6. Merging PDF documents
try {
    PDFMergerUtility merger = new PDFMergerUtility();
    // Sources are appended in the order they are added
    merger.addSource("1.pdf");
    merger.addSource("2.pdf");
    merger.addSource("3.pdf");
    merger.addSource("5.pdf");
    merger.setDestinationFileName("merged.pdf");
    // Passing null uses main memory only for buffering
    merger.mergeDocuments(null);
} catch (IOException e) {
    e.printStackTrace();
}
2.7. Splitting a PDF document
try (PDDocument document = PDDocument.load(new File("example.pdf"))) {
    // By default the Splitter produces one document per page
    Splitter splitter = new Splitter();
    List<PDDocument> pages = splitter.split(document);
    int pageNo = 1;
    for (PDDocument page : pages) {
        // Close each single-page document even if saving fails
        try (PDDocument single = page) {
            single.save("split_page_" + pageNo + ".pdf");
        }
        pageNo++;
    }
} catch (IOException e) {
    e.printStackTrace();
}
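2.8. Adding a rectangle
// Assumes contentStream is open and pageHeight = page.getMediaBox().getHeight()
// Set the border color
contentStream.setStrokingColor(new Color(213, 213, 213));
// Set the border width to 1
contentStream.setLineWidth(1);
// Add a 100 x 100 rectangle to the content stream; (x, y) is its lower-left corner,
// so y = pageHeight - 150 places the top edge 50 points below the top of the page
contentStream.addRect(50, pageHeight - 150, 100, 100);
// Stroke the outline of the rectangle
contentStream.stroke();
// Restore the original stroking color so that later drawing is not affected
contentStream.setStrokingColor(Color.BLACK);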
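PDF page coordinates start at the lower-left corner with y growing upward, and addRect takes the lower-left corner of the rectangle, so a layout thought of as "50 points from the top" needs a small conversion. The sketch below shows that arithmetic only; the class and method names are hypothetical, and 842 points is an assumed height for an A4 page.

```java
class PageCoordinatesSketch {

    /**
     * Converts a distance from the top edge of the page into the y coordinate
     * of the rectangle's lower-left corner in PDF user space.
     */
    static float lowerLeftY(float pageHeight, float topOffset, float rectHeight) {
        return pageHeight - topOffset - rectHeight;
    }

    public static void main(String[] args) {
        float a4Height = 842f; // assumed A4 height in points
        // A 100-point-high box whose top edge sits 50 points below the top of the page
        System.out.println("y = " + lowerLeftY(a4Height, 50f, 100f));
    }
}
```

The returned value would be passed as the second argument of addRect, with the page height taken from page.getMediaBox().getHeight().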
2.9. Drawing paths
// PDFBox 3.x API: load the file with Loader
PDDocument pdDocument = Loader.loadPDF(new File("项目合同.pdf"));
// Get the first page
PDPage page = pdDocument.getPages().get(0);
PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, PDPageContentStream.AppendMode.APPEND, true);
// Build the path
contentStream.setLineWidth(0.5f);
contentStream.setStrokingColor(0f, 0f, 0f);
contentStream.moveTo(0, 0);
contentStream.lineTo(10, 0);
contentStream.curveTo(20, 30, 40, 50, 60, 70);
contentStream.closePath();
// A painting operator consumes the current path, so calling fill() and then stroke()
// would leave nothing to stroke; fillAndStroke() does both in one step
contentStream.fillAndStroke();
//contentStream.fill();   // fill only
//contentStream.stroke(); // stroke only
contentStream.close();
2.10. Setting transparency
PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, PDPageContentStream.AppendMode.APPEND, true);
// Set the transparency for fill (non-stroking) operations
PDExtendedGraphicsState graphicsState = new PDExtendedGraphicsState();
// 0.6f means 60% opaque; setting the value back to 1.0f resets to fully opaque
graphicsState.setNonStrokingAlphaConstant(0.6f);
contentStream.setGraphicsStateParameters(graphicsState);
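3. Using PDFBox 2.0
Taking a Spring Boot project as an example, add the dependency to pom.xml:
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.27</version>
</dependency>
Note: if the project also includes a JAR such as the OCR library tess4j, the PDFBox artifacts it brings in need to be excluded.
The utility class below collects the PDFBox 2.0 operations for reading, writing, merging, and splitting PDFs, adding text and images, and applying watermarks: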
import java.awt.Color;
import java.awt.Font;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.imageio.stream.FileImageOutputStream;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.text.PDFTextStripper;

import cn.hutool.core.img.ImgUtil;

public class Pdfbox2 {

    public static void main(String[] args) throws Exception {
        // Read the text of a PDF
        String content = readPdf("D:\\pdf\\target.pdf");
        System.out.println("Content:\n" + content);

        // Insert text into a PDF
        String inputFilePath = "D:\\pdf\\A.pdf";
        String outputFilePath = "D:\\pdf\\A2.pdf";
        // Latin text only: Helvetica has no Chinese glyphs, so Chinese text throws an error
        insertPageContent(inputFilePath, outputFilePath, 1, "testMessage");

        // Insert an image into a PDF (this overwrites A2.pdf written above)
        String imagePath = "D:\\pdf\\pic.jpg";
        insertImage(inputFilePath, imagePath, outputFilePath, 1);

        // Merge PDFs
        List<String> pathList = new ArrayList<String>();
        String targetPDFPath = "D:\\pdf\\target.pdf";
        pathList.add("D:\\pdf\\A.pdf");
        pathList.add("D:\\pdf\\B.pdf");
        pathList.add("D:\\pdf\\C.pdf");
        pathList.add("D:\\pdf\\D.pdf");
        pathList.add("D:\\pdf\\E.pdf");
        mergePdf(pathList, targetPDFPath);

        // Split a PDF
        String sourcePdfPath = "D:\\pdf\\target.pdf";
        String splitPath = "D:\\pdf";
        String splitFileName = "splitPDF";
        splitPdf(sourcePdfPath, splitPath, splitFileName);
    }

    /**
     * Reads all text from a PDF.
     * @param inputFile path of the PDF to read
     * @return the extracted text, or an empty string on failure
     */
    public static String readPdf(String inputFile) {
        String content = "";
        // try-with-resources closes the document even if extraction fails
        try (PDDocument doc = PDDocument.load(new File(inputFile))) {
            // PDFTextStripper extracts the text
            PDFTextStripper textStripper = new PDFTextStripper();
            content = textStripper.getText(doc);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }

    /**
     * Inserts a line of text, centered, on the given page.
     * @param inputFilePath source PDF
     * @param outputFilePath destination PDF
     * @param pageNum 1-based page number
     * @param message text to insert
     * @throws Exception
     */
    public static void insertPageContent(String inputFilePath, String outputFilePath, Integer pageNum, String message) throws Exception {
        File inputPDFFile = new File(inputFilePath);
        File outputPDFFile = new File(outputFilePath);
        try (PDDocument doc = PDDocument.load(inputPDFFile)) {
            PDPageTree allPages = doc.getDocumentCatalog().getPages();
            PDFont font = PDType1Font.HELVETICA_BOLD;
            // Font size
            float fontSize = 36.0f;
            PDPage page = allPages.get(pageNum - 1);
            PDRectangle pageSize = page.getMediaBox();
            // getStringWidth returns 1/1000 units, so scale by the font size
            float stringWidth = font.getStringWidth(message) * fontSize / 1000f;
            // Compute the center of the page, taking page rotation into account
            int rotation = page.getRotation();
            boolean rotate = rotation == 90 || rotation == 270;
            float pageWidth = rotate ? pageSize.getHeight() : pageSize.getWidth();
            float pageHeight = rotate ? pageSize.getWidth() : pageSize.getHeight();
            double centeredXPosition = rotate ? pageHeight / 2f : (pageWidth - stringWidth) / 2f;
            double centeredYPosition = rotate ? (pageWidth - stringWidth) / 2f : pageHeight / 2f;
            // Append to the existing stream. The boolean flags are appendContent, compress, resetContext;
            // this constructor, setTextRotation, setTextTranslation and drawString are deprecated in 2.0
            try (PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true, true)) {
                contentStream.beginText();
                // Set the font and font size
                contentStream.setFont(font, fontSize);
                // Set the color
                contentStream.setNonStrokingColor(255, 0, 0);
                if (rotate) {
                    // Rotate the text to follow the page
                    contentStream.setTextRotation(Math.PI / 2, centeredXPosition, centeredYPosition);
                } else {
                    contentStream.setTextTranslation(centeredXPosition, centeredYPosition);
                }
                contentStream.drawString(message);
                contentStream.endText();
            }
            doc.save(outputPDFFile);
        }
    }

    /**
     * Inserts an image into a PDF.
     * @param inputFilePath source PDF
     * @param imagePath image to insert
     * @param outputFilePath destination PDF
     * @param pageNum 1-based page number
     * @throws Exception
     */
    public static void insertImage(String inputFilePath, String imagePath, String outputFilePath, Integer pageNum) throws Exception {
        File inputPDFFile = new File(inputFilePath);
        File outputPDFFile = new File(outputFilePath);
        try (PDDocument doc = PDDocument.load(inputPDFFile)) {
            PDImageXObject pdImage = PDImageXObject.createFromFile(imagePath, doc);
            // Use the requested page (the page index is 0-based)
            PDPage page = doc.getPage(pageNum - 1);
            // new PDPageContentStream(doc, page) would overwrite the existing content;
            // the append form below keeps it
            try (PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true, true)) {
                contentStream.drawImage(pdImage, 70, 250);
            }
            doc.save(outputPDFFile);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Merges PDFs.
     * @param pathList paths of the PDFs to merge, in order
     * @param targetPDFPath path of the merged PDF
     * @throws Exception
     */
    public static void mergePdf(List<String> pathList, String targetPDFPath) throws Exception {
        PDFMergerUtility mergePdf = new PDFMergerUtility();
        for (String path : pathList) {
            File f = new File(path);
            if (f.exists() && f.isFile()) {
                // Add each existing file as a merge source
                mergePdf.addSource(f);
            }
        }
        // Set the name of the merged PDF
        mergePdf.setDestinationFileName(targetPDFPath);
        // Merge; the no-argument mergeDocuments() is deprecated in 2.0
        mergePdf.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
    }

    /**
     * Splits a PDF page by page.
     * @param sourcePdfPath PDF to split
     * @param splitPath output directory
     * @param splitFileName file name prefix of the output files
     * @throws Exception
     */
    public static void splitPdf(String sourcePdfPath, String splitPath, String splitFileName) throws Exception {
        int i = 1;
        String splitPdf = splitPath + File.separator + splitFileName + "_";
        // Load the PDF document
        try (PDDocument document = PDDocument.load(new File(sourcePdfPath))) {
            Splitter splitter = new Splitter();
            // One page per output file; only pages 1 to 5 are processed.
            // Remove setStartPage/setEndPage to split the whole document.
            splitter.setStartPage(1);
            splitter.setSplitAtPage(1);
            splitter.setEndPage(5);
            // Split the document and save each part
            List<PDDocument> pages = splitter.split(document);
            for (PDDocument pd : pages) {
                String pdfName = splitPdf + (i++) + ".pdf";
                try (PDDocument single = pd) {
                    single.save(pdfName);
                }
            }
        }
    }

    /**
     * Adds an image watermark to every page.
     * @throws Exception
     */
    public static void addWaterMark() throws Exception {
        File file = new File("D:\\pdf\\waterMark.png");
        // Render the watermark text ("水印" means "watermark") to an image with the SimSun font
        Font font = new Font("宋体", Font.PLAIN, 12);
        Color fontColor = new Color(100, 100, 100, 60);
        // Close the stream before the file is read again
        try (FileImageOutputStream fileImageOutputStream = new FileImageOutputStream(file)) {
            ImgUtil.createImage("水印", font, null, fontColor, fileImageOutputStream);
        }
        // Rotate the image so that the watermark is slanted
        ImgUtil.rotate(file, -20, file);

        // Load the original PDF
        try (PDDocument document = PDDocument.load(new File("D:\\pdf\\test.pdf"))) {
            // Load the image once and reuse it on every page
            PDImageXObject pdImageXObject = PDImageXObject.createFromFile(file.getAbsolutePath(), document);
            float width = pdImageXObject.getWidth();
            float height = pdImageXObject.getHeight();
            // Iterate over all pages
            for (int i = 0; i < document.getNumberOfPages(); i++) {
                PDPage page = document.getPage(i);
                float pageWidth = page.getMediaBox().getWidth();
                float pageHeight = page.getMediaBox().getHeight();
                // 4 columns of watermarks per page
                int xBegin = (int) (pageWidth / 4);
                // 8 rows of watermarks per page
                int yBegin = (int) (pageHeight / 8);
                try (PDPageContentStream contentStream = new PDPageContentStream(
                        document, page, PDPageContentStream.AppendMode.APPEND, true, true)) {
                    // The row index is advanced by the for statement only,
                    // so all 8 rows are drawn
                    for (int yIndex = 0; yIndex < 8; yIndex++) {
                        for (int xIndex = 0; xIndex < 4; xIndex++) {
                            contentStream.drawImage(pdImageXObject, (xBegin * xIndex) + 5, (yBegin * yIndex), width, height);
                        }
                    }
                }
            }
            // Save the modified PDF
            document.save(new File("D:\\pdf\\testout.pdf"));
        }
    }
}
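The watermark method tiles the page with 4 columns and 8 rows. Separating the position arithmetic from the drawing makes it easy to check that every cell is visited exactly once. The following is a small sketch of that calculation only, with the same integer step sizes and the same 5-point horizontal offset as the method above. The class and method names are hypothetical, and the 595 x 842 page size is an assumed A4 size.

```java
import java.util.ArrayList;
import java.util.List;

class WatermarkGridSketch {

    /** Returns the lower-left corner {x, y} of every tile, row by row from the bottom of the page. */
    static List<float[]> tileOrigins(float pageWidth, float pageHeight, int columns, int rows) {
        int xStep = (int) (pageWidth / columns);
        int yStep = (int) (pageHeight / rows);
        List<float[]> origins = new ArrayList<>();
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < columns; col++) {
                // Same offsets as the drawImage call: 5 points of left padding, none at the bottom
                origins.add(new float[] {xStep * col + 5, yStep * row});
            }
        }
        return origins;
    }

    public static void main(String[] args) {
        List<float[]> origins = tileOrigins(595f, 842f, 4, 8); // assumed A4 size in points
        System.out.println(origins.size() + " tiles");
        for (float[] origin : origins) {
            System.out.println(origin[0] + ", " + origin[1]);
        }
    }
}
```

Each returned pair would be passed as the x and y arguments of drawImage inside the page loop.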
4. Using PDFBox 3.0
Again taking a Spring Boot project as an example, add the dependency to pom.xml:
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>3.0.3</version>
</dependency>
Example code for common operations:
import java.awt.Color;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.pdmodel.graphics.state.PDExtendedGraphicsState;
import org.apache.pdfbox.pdmodel.graphics.state.RenderingMode;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.util.Matrix;

/**
 * Utility class for PDFBox 3.x.
 */
public class PdfboxUtil {

    public static void main(String[] args) throws Exception {
        // Read the text of a PDF
        String content = readPdf("D:\\pdf\\target.pdf");
        System.out.println("Content:\n" + content);

        // Insert text into a PDF
        String inputFilePath = "D:\\pdf\\A.pdf";
        String outputFilePath = "D:\\pdf\\A2.pdf";
        // Latin text only: Helvetica has no Chinese glyphs, so Chinese text throws an error
        insertPageContent(inputFilePath, outputFilePath, 1, "HelloKitty");

        // Insert an image into a PDF (this overwrites A2.pdf written above)
        String imagePath = "D:\\pdf\\pic.jpg";
        insertImage(inputFilePath, imagePath, outputFilePath, 1);

        // Merge PDFs
        List<String> pathList = new ArrayList<String>();
        String targetPDFPath = "D:\\pdf\\target.pdf";
        pathList.add("D:\\pdf\\A.pdf");
        pathList.add("D:\\pdf\\B.pdf");
        pathList.add("D:\\pdf\\C.pdf");
        pathList.add("D:\\pdf\\D.pdf");
        pathList.add("D:\\pdf\\E.pdf");
        mergePdf(pathList, targetPDFPath);

        // Split a PDF
        String sourcePdfPath = "D:\\pdf\\target.pdf";
        String splitPath = "D:\\pdf";
        String splitFileName = "splitPDF";
        splitPdf(sourcePdfPath, splitPath, splitFileName);

        // Watermark: path of the original file
        String sourcePath = "D:\\pdf\\test11.pdf";
        // Font path (copy the font to c://windows/font)
        String fontFilePath = "D:\\pdf\\simhei.ttf";
        String waterMarkText = "专属水印";
        // Path of the logo image
        String logoFilePath = "D:\\pdf\\logo.jpeg";
        // Output path of the watermarked PDF
        String pdfFile = "D:\\pdf\\" + System.currentTimeMillis() + ".pdf";
        addWaterRemark(sourcePath, pdfFile, fontFilePath, waterMarkText, logoFilePath);
    }

    /**
     * Reads all text from a PDF.
     * @param inputFile path of the PDF to read
     * @return the extracted text, or an empty string on failure
     */
    public static String readPdf(String inputFile) {
        String content = "";
        // try-with-resources closes the document even if extraction fails
        try (PDDocument doc = Loader.loadPDF(new File(inputFile))) {
            PDFTextStripper textStripper = new PDFTextStripper();
            content = textStripper.getText(doc);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }

    /**
     * Inserts a line of text, centered, on the given page.
     * @param inputFilePath source PDF
     * @param outputFilePath destination PDF
     * @param pageNum 1-based page number
     * @param message text to insert
     * @throws Exception
     */
    public static void insertPageContent(String inputFilePath, String outputFilePath, Integer pageNum, String message) throws Exception {
        File inputPDFFile = new File(inputFilePath);
        File outputPDFFile = new File(outputFilePath);
        try (PDDocument doc = Loader.loadPDF(inputPDFFile)) {
            PDPageTree allPages = doc.getDocumentCatalog().getPages();
            PDType1Font pdType1Font = new PDType1Font(Standard14Fonts.FontName.HELVETICA_BOLD);
            // Font size
            float fontSize = 36.0f;
            PDPage page = allPages.get(pageNum - 1);
            PDRectangle pageSize = page.getMediaBox();
            float stringWidth = pdType1Font.getStringWidth(message) * fontSize / 1000f;
            // Compute the center of the page, taking page rotation into account
            int rotation = page.getRotation();
            boolean rotate = rotation == 90 || rotation == 270;
            float pageWidth = rotate ? pageSize.getHeight() : pageSize.getWidth();
            float pageHeight = rotate ? pageSize.getWidth() : pageSize.getHeight();
            float centeredXPosition = rotate ? pageHeight / 2f : (pageWidth - stringWidth) / 2f;
            float centeredYPosition = rotate ? (pageWidth - stringWidth) / 2f : pageHeight / 2f;
            // Open a content stream that appends to the existing page content
            try (PDPageContentStream contentStream = new PDPageContentStream(
                    doc, page, PDPageContentStream.AppendMode.APPEND, true)) {
                // Optional transparency
                //PDExtendedGraphicsState graphicsState = new PDExtendedGraphicsState();
                //graphicsState.setNonStrokingAlphaConstant(0.5f);
                //contentStream.setGraphicsStateParameters(graphicsState);

                // Fill color, stroke color (RGB) and stroke width used for the bold effect
                contentStream.setNonStrokingColor(0f, 0f, 0f);
                contentStream.setStrokingColor(0f, 0f, 0f);
                contentStream.setLineWidth(0.1f);

                // Path operators are not allowed inside a text block,
                // so the demo path is drawn before beginText()
                contentStream.moveTo(0, 0);                     // move
                contentStream.lineTo(10, 0);                    // straight segment
                contentStream.curveTo(20, 30, 40, 50, 60, 70);  // Bezier curve
                contentStream.closePath();                      // close the path
                contentStream.stroke();                         // stroke
                //contentStream.fillAndStroke();                // fill and stroke

                // Text operations
                contentStream.beginText();
                // Set the font and font size
                contentStream.setFont(pdType1Font, fontSize);
                if (rotate) {
                    // The rotation angle is in radians: a quarter turn to follow the page
                    contentStream.setTextMatrix(Matrix.getRotateInstance(Math.PI / 2, centeredXPosition, centeredYPosition));
                } else {
                    contentStream.setTextMatrix(Matrix.getTranslateInstance(centeredXPosition, centeredYPosition));
                }
                // Fill and stroke the glyphs to simulate bold text
                contentStream.setRenderingMode(RenderingMode.FILL_STROKE);
                contentStream.showText(message); // write the text
                contentStream.endText();
            }
            doc.save(outputPDFFile);
        }
    }

    /**
     * Inserts an image into a PDF.
     * @param inputFilePath source PDF
     * @param imagePath image to insert
     * @param outputFilePath destination PDF
     * @param pageNum 1-based page number
     * @throws Exception
     */
    public static void insertImage(String inputFilePath, String imagePath, String outputFilePath, Integer pageNum) throws Exception {
        File inputPDFFile = new File(inputFilePath);
        File outputPDFFile = new File(outputFilePath);
        try (PDDocument doc = Loader.loadPDF(inputPDFFile)) {
            PDImageXObject pdImage = PDImageXObject.createFromFile(imagePath, doc);
            // Use the requested page (the page index is 0-based)
            PDPage page = doc.getPage(pageNum - 1);
            try (PDPageContentStream contentStream = new PDPageContentStream(
                    doc, page, PDPageContentStream.AppendMode.APPEND, true)) {
                contentStream.drawImage(pdImage, 70, 250);
            }
            doc.save(outputPDFFile);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Merges PDFs.
     * @param pathList paths of the PDFs to merge, in order
     * @param targetPDFPath path of the merged PDF
     * @throws Exception
     */
    public static void mergePdf(List<String> pathList, String targetPDFPath) throws Exception {
        PDFMergerUtility mergePdf = new PDFMergerUtility();
        for (String path : pathList) {
            File f = new File(path);
            if (f.exists() && f.isFile()) {
                // Add each existing file as a merge source
                mergePdf.addSource(f);
            }
        }
        // Set the name of the merged PDF
        mergePdf.setDestinationFileName(targetPDFPath);
        // Merge; passing null uses main memory only for buffering
        mergePdf.mergeDocuments(null);
    }

    /**
     * Splits a PDF page by page.
     * @param sourcePdfPath PDF to split
     * @param splitPath output directory
     * @param splitFileName file name prefix of the output files
     * @throws Exception
     */
    public static void splitPdf(String sourcePdfPath, String splitPath, String splitFileName) throws Exception {
        int i = 1;
        String splitPdf = splitPath + File.separator + splitFileName + "_";
        // Load the PDF document
        try (PDDocument document = Loader.loadPDF(new File(sourcePdfPath))) {
            Splitter splitter = new Splitter();
            // One page per output file; only pages 1 to 5 are processed.
            // Remove setStartPage/setEndPage to split the whole document.
            splitter.setStartPage(1);
            splitter.setSplitAtPage(1);
            splitter.setEndPage(5);
            // Split the document and save each part
            List<PDDocument> pages = splitter.split(document);
            for (PDDocument pd : pages) {
                String pdfName = splitPdf + (i++) + ".pdf";
                try (PDDocument single = pd) {
                    single.save(pdfName);
                }
            }
        }
    }

    /**
     * Adds a logo image and a tiled, rotated text watermark to every page.
     */
    public static void addWaterRemark(String sourcePath, String pdfFile, String fontFilePath, String waterMarkText, String logoFilePath) {
        // Load the original PDF
        try (PDDocument doc = Loader.loadPDF(new File(sourcePath))) {
            doc.setAllSecurityToBeRemoved(true);
            // The font must be a TTF font
            PDFont font = PDType0Font.load(doc, new File(fontFilePath));
            PDImageXObject logoImage = PDImageXObject.createFromFile(logoFilePath, doc);
            float logoImageWidth = logoImage.getWidth();
            float logoImageHeight = logoImage.getHeight();
            float xStepLength = 300f;
            float yStepLength = 200f;
            float startX = 100f;
            float startY = 100f;
            // Transparency
            PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
            r0.setNonStrokingAlphaConstant(0.2f);
            r0.setAlphaSourceFlag(true);
            for (PDPage page : doc.getPages()) {
                float maxX = page.getBBox().getUpperRightX();
                // Use the Y coordinate for the vertical limit (the X value was used here by mistake)
                float maxY = page.getBBox().getUpperRightY();
                try (PDPageContentStream cs = new PDPageContentStream(
                        doc, page, PDPageContentStream.AppendMode.APPEND, true, true)) {
                    cs.setGraphicsStateParameters(r0);
                    cs.setNonStrokingColor(new Color(200, 200, 200));
                    cs.setFont(font, 36f);
                    // Images cannot be drawn between beginText and endText
                    cs.drawImage(logoImage, (maxX - logoImageWidth) / 2, maxY / 2, logoImageWidth * 1.5f, logoImageHeight * 1.5f);
                    cs.beginText();
                    float tempX = startX;
                    while (tempX < maxX) {
                        float tempY = startY;
                        while (tempY < maxY) {
                            // getRotateInstance expects radians, so convert the 45 degree angle
                            cs.setTextMatrix(Matrix.getRotateInstance(Math.toRadians(45), tempX, tempY));
                            cs.showText(waterMarkText);
                            tempY += yStepLength;
                        }
                        tempX += xStepLength;
                    }
                    cs.endText();
                }
            }
            // Save the modified PDF
            doc.save(new File(pdfFile));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
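The rotation passed to Matrix.getRotateInstance is measured in radians, the same convention as java.awt.geom.AffineTransform, so an angle written in degrees has to be converted first. The sketch below uses AffineTransform from the JDK as a stand-in to show the effect of the conversion without needing PDFBox on the classpath. The class and method names are hypothetical.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class RotationAngleSketch {

    /** Rotates the point (x, y) around the origin by an angle given in degrees. */
    static Point2D rotateDegrees(double x, double y, double degrees) {
        // The transform expects radians, so convert explicitly
        AffineTransform rotation = AffineTransform.getRotateInstance(Math.toRadians(degrees));
        return rotation.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        // A point on the positive x axis, rotated by 45 degrees
        Point2D converted = rotateDegrees(1, 0, 45);
        System.out.println("45 degrees: " + converted.getX() + ", " + converted.getY());

        // Passing 45 directly would be read as 45 radians, a very different angle
        Point2D raw = AffineTransform.getRotateInstance(45).transform(new Point2D.Double(1, 0), null);
        System.out.println("45 radians: " + raw.getX() + ", " + raw.getY());
    }
}
```

In the watermark code, the corresponding call would use Math.toRadians(45) as the first argument of Matrix.getRotateInstance.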
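5. Changes between PDFBox 2.0 and 3.0
As the library evolved, and with 3.0 in particular, many methods were deprecated and are no longer used, for the sake of safety and consistency. For the classes, fields, and methods that changed, see the Deprecated List (PDFBox reactor 2.0.7 API) on the official site.
The official site lists the replacement for each deprecated method; a few commonly used ones are introduced here.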
5.1 Loading a PDF
The load methods have all been removed from PDDocument. A new class, org.apache.pdfbox.Loader, now loads PDF files and offers several methods for loading from different kinds of sources.
// 2.0
PDDocument pdDocument = PDDocument.load(new File("D:\\test.pdf"));
// 3.0
PDDocument pdDocument = Loader.loadPDF(new File("D:\\test.pdf"));
5.2 Using fonts
The static instances of the standard 14 fonts in org.apache.pdfbox.pdmodel.font.PDType1Font were removed, because the underlying COSDictionary is not supposed to be immutable. In their place, a new PDType1Font constructor creates a standard 14 font; it takes the new enum org.apache.pdfbox.pdmodel.font.Standard14Fonts.FontName, which defines the names of the standard 14 fonts, as its argument.
// 2.0
PDFont font = PDType1Font.HELVETICA_BOLD;
// 3.0
PDType1Font pdType1Font = new PDType1Font(Standard14Fonts.FontName.HELVETICA_BOLD);
5.3 Opening a content stream
// 2.0
PDDocument doc = PDDocument.load(inputPDFFile);
// Open the content stream; the boolean flags are appendContent, compress, resetContext
PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true, true);
// 3.0
PDDocument doc = Loader.loadPDF(inputPDFFile);
// Open the content stream; the append behavior is selected with AppendMode
PDPageContentStream contentStream = new PDPageContentStream(doc, page, PDPageContentStream.AppendMode.APPEND, true);
Note: PDFBox 3.0 requires at least JDK 8.
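6. Summary
Working with PDF documents is a common requirement in Java development. Apache PDFBox is a powerful open-source library for creating and manipulating PDF documents and for extracting their content. This article gave a brief introduction to the core features of Apache PDFBox and how to use them, to help you handle PDF documents more efficiently in your projects.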