PDF-to-Markdown tool: magic-pdf

1. Installing the magic-pdf environment

conda create -n MinerU python=3.10
conda activate MinerU
pip install "boto3>=1.28.43" -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install "magic-pdf[full]==0.7.0b1" --extra-index-url https://wheels.myhloli.com -i https://pypi.tuna.tsinghua.edu.cn/simple/
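
To confirm that the install landed in the active conda environment, the installed version can be read back from Python. This is a minimal sketch using only the standard library. The helper name installed_version is my own, and it assumes the package is registered under the distribution name magic-pdf, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "magic-pdf" is the distribution name used in the pip command above.
    version = installed_version("magic-pdf")
    print(version if version else "magic-pdf is not installed in this environment")
```

If this prints a version other than 0.7.0b1, the wrong environment is probably active.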

2. Downloading the model weights

sudo apt-get install git-lfs
git clone https://github.com/opendatalab/MinerU.git
cd MinerU/
git lfs install
git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit

Alternatively, download the weights with the ModelScope SDK:

pip install modelscope
# Use the following Python code to download the model using the ModelScope SDK:
from modelscope import snapshot_download
model_dir = snapshot_download('wanderkid/PDF-Extract-Kit')

3. Editing the configuration

In magic-pdf.template.json, set models-dir to the path where the models were downloaded:

{
    "bucket_info": {
        "bucket-name-1": ["ak", "sk", "endpoint"],
        "bucket-name-2": ["ak", "sk", "endpoint"]
    },
    "models-dir": "/home/adam/work/MinerU/PDF-Extract-Kit/models",
    "device-mode": "cpu",
    "table-config": {
        "is_table_recog_enable": false,
        "max_time": 400
    }
}

Rename magic-pdf.template.json to magic-pdf.json and place it in your user home directory. The default location differs by operating system:

Windows: C:\Users\YourUsername

Linux: /home/YourUsername

macOS: /Users/YourUsername
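
The rename-and-edit step can also be scripted so that the same code works on all three systems. The sketch below is one way to do it, not part of MinerU: the function name write_magic_pdf_config and its parameters are hypothetical, and it assumes the template is plain JSON with a models-dir key, as shown above. Path.home() resolves to the per-user directory listed for each operating system.

```python
import json
from pathlib import Path
from typing import Optional


def write_magic_pdf_config(template_path: Path, models_dir: str,
                           dest_dir: Optional[Path] = None) -> Path:
    """Write <dest_dir>/magic-pdf.json from the template with models-dir filled in."""
    # Default to the current user's home directory on any platform.
    dest_dir = Path.home() if dest_dir is None else dest_dir
    config = json.loads(template_path.read_text(encoding="utf-8"))
    config["models-dir"] = models_dir  # point at the downloaded weights
    target = dest_dir / "magic-pdf.json"
    target.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return target
```

For example, write_magic_pdf_config(Path("magic-pdf.template.json"), "/home/adam/work/MinerU/PDF-Extract-Kit/models") would write the file into the home directory and leave the template untouched.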

4. Command-line usage

magic-pdf --help
Usage: magic-pdf [OPTIONS]

Options:
  -v, --version                display the version and exit
  -p, --path PATH              local pdf filepath or directory  [required]
  -o, --output-dir TEXT        output local directory
  -m, --method [ocr|txt|auto]  the method for parsing pdf.
                               ocr: using ocr technique to extract information from pdf,
                               txt: suitable for the text-based pdf only and outperform ocr,
                               auto: automatically choose the best method for parsing pdf
                               from ocr and txt.
                               without method specified, auto will be used by default.
  --help                       Show this message and exit.

## show version
magic-pdf -v

## command line example
magic-pdf -p {some_pdf} -o {some_output_dir} -m auto
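
When the tool is called from a script, it can help to build the argument list in one place and validate the method before launching anything. This is a rough sketch that wraps the CLI through subprocess. The function names are my own, and it uses only the -p, -o and -m options from the help text above.

```python
import shlex
import subprocess
from typing import List

# The three values accepted by -m/--method according to the help text.
VALID_METHODS = ("ocr", "txt", "auto")


def build_magic_pdf_command(pdf_path: str, output_dir: str, method: str = "auto") -> List[str]:
    """Return the argument list for one magic-pdf invocation."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    return ["magic-pdf", "-p", pdf_path, "-o", output_dir, "-m", method]


def run_magic_pdf(pdf_path: str, output_dir: str, method: str = "auto") -> int:
    """Run magic-pdf and return its exit code (requires magic-pdf on PATH)."""
    return subprocess.run(build_magic_pdf_command(pdf_path, output_dir, method)).returncode


if __name__ == "__main__":
    # Only prints the command; call run_magic_pdf(...) to actually execute it.
    print(shlex.join(build_magic_pdf_command("GenZ-LLM.pdf", "./res/")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.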

{some_pdf} can be a single PDF file or a directory containing multiple PDFs. The results are saved in the {some_output_dir} directory. The output files are listed below:

├── some_pdf.md                 # markdown file
├── images                      # directory for storing images
├── layout.pdf                  # layout diagram
├── middle.json                 # MinerU intermediate processing result
├── model.json                  # model inference result
├── origin.pdf                  # original PDF file
└── spans.pdf                   # smallest granularity bbox position information diagram
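
After a run, a quick check that every expected artifact exists can catch a parse that stopped early. The sketch below assumes the file names in the listing above. The helper name missing_outputs is hypothetical, and the result directory is passed in explicitly because the exact subdirectory magic-pdf writes to is not assumed here.

```python
from pathlib import Path
from typing import List


def missing_outputs(result_dir: Path, pdf_stem: str) -> List[str]:
    """Return the expected output names that are absent from result_dir."""
    expected = [
        f"{pdf_stem}.md",  # markdown file
        "images",          # directory for storing images
        "layout.pdf",      # layout diagram
        "middle.json",     # intermediate processing result
        "model.json",      # model inference result
        "origin.pdf",      # original PDF file
        "spans.pdf",       # bbox position diagram
    ]
    return [name for name in expected if not (result_dir / name).exists()]
```

An empty list means all seven entries are present.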

5. Testing

magic-pdf -p GenZ-LLM.pdf -o ./res/ -m auto

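Results:

The test ran on CPU with 16 GB of RAM. Parsing a 3-page PDF took roughly 2 minutes (in the log below, model initialization took about 25 s and document analysis about 63 s), and PDFs with too many pages crashed the process. Some formulas seem to be parsed incorrectly, but overall the tool is usable.

Full log: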

2024-08-13 15:53:44.149 | INFO     | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 14962, cid_chars_radio: 0.0
INFO:datasets:PyTorch version 2.3.1 available.
2024-08-13 15:53:53.048 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:111 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: False, apply_table: False
2024-08-13 15:53:53.048 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:119 - using device: cpu
2024-08-13 15:53:53.048 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:121 - using models_dir: /home/long/work/MinerU/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[08/13 15:54:06 detectron2]: Rank of current process: 0. World size: 1
[08/13 15:54:07 detectron2]: Environment info:
-------------------------------  --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sys.platform                     linux
Python                           3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0]
numpy                            1.26.4
detectron2                       0.6 @/home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2
detectron2._C                    not built correctly: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so)
Compiler ($CXX)                  c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
DETECTRON2_ENV_MODULE            <not set>
PyTorch                          2.3.1+cu121 @/home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    No: torch.cuda.is_available() == False
Pillow                           10.4.0
torchvision                      0.18.1+cu121 @/home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/torchvision
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.6.0
-------------------------------  --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

[08/13 15:54:07 detectron2]: Command line arguments: {'config_file': '/home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/home/long/work/MinerU/PDF-Extract-Kit/models/Layout/model_final.pth']}
[08/13 15:54:07 detectron2]: Contents of args.config_file=/home/long/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml:
AUG:
  DETR: true
CACHE_DIR: ~/cache/huggingface
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: false
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - scihub_train
  TRAIN:
  - scihub_train
GLOBAL:
  HACK: 1.0
ICDAR_DATA_DIR_TEST: ''
ICDAR_DATA_DIR_TRAIN: ''
INPUT:
  CROP:
    ENABLED: true
    SIZE:
    - 384
    - 600
    TYPE: absolute_range
  FORMAT: RGB
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 480
  - 512
  - 544
  - 576
  - 608
  - 640
  - 672
  - 704
  - 736
  - 768
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
    - - 64
    - - 128
    - - 256
    - - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_vit_fpn_backbone
  CONFIG_PATH: ''
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES:
    - layer3
    - layer5
    - layer7
    - layer11
    NORM: ''
    OUT_CHANNELS: 256
  IMAGE_ONLY: true
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: true
  META_ARCHITECTURE: VLGeneralizedRCNN
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 127.5
  - 127.5
  - 127.5
  PIXEL_STD:
  - 127.5
  - 127.5
  - 127.5
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res4
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS:
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 10
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 10.0
    - 10.0
    - 5.0
    - 5.0
    CLS_AGNOSTIC_BBOX_REG: true
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: FastRCNNConvFCHead
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 2
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: CascadeROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 10
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 4
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    - p6
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 1000
    PRE_NMS_TOPK_TRAIN: 2000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 10
  VIT:
    DROP_PATH: 0.1
    IMG_SIZE:
    - 224
    - 224
    NAME: layoutlmv3_base
    OUT_FEATURES:
    - layer3
    - layer5
    - layer7
    - layer11
    POS_TYPE: abs
  WEIGHTS:
OUTPUT_DIR:
SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train
SEED: 42
SOLVER:
  AMP:
    ENABLED: true
  BACKBONE_MULTIPLIER: 1.0
  BASE_LR: 0.0002
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 2000
  CLIP_GRADIENTS:
    CLIP_TYPE: full_model
    CLIP_VALUE: 1.0
    ENABLED: true
    NORM_TYPE: 2.0
  GAMMA: 0.1
  GRADIENT_ACCUMULATION_STEPS: 1
  IMS_PER_BATCH: 32
  LR_SCHEDULER_NAME: WarmupCosineLR
  MAX_ITER: 20000
  MOMENTUM: 0.9
  NESTEROV: false
  OPTIMIZER: ADAMW
  REFERENCE_WORLD_SIZE: 0
  STEPS:
  - 10000
  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 333
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.05
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 1000
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0

[08/13 15:54:08 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /home/long/work/MinerU/PDF-Extract-Kit/models/Layout/model_final.pth ...
[08/13 15:54:08 fvcore.common.checkpoint]: [Checkpointer] Loading from /home/long/work/MinerU/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-08-13 15:54:09.334 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:148 - DocAnalysis init done!
2024-08-13 15:54:09.336 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:98 - model init cost: 25.18623661994934
2024-08-13 15:54:18.411 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:159 - layout detection cost: 8.96
0: 1888x1472 2 embeddings, 3839.2ms
Speed: 28.6ms preprocess, 3839.2ms inference, 0.9ms postprocess per image at shape (1, 3, 1888, 1472)
2024-08-13 15:54:25.349 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:189 - formula nums: 2, mfr time: 1.24
2024-08-13 15:54:34.577 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:159 - layout detection cost: 9.22
0: 1888x1472 25 embeddings, 4120.5ms
Speed: 15.3ms preprocess, 4120.5ms inference, 1.0ms postprocess per image at shape (1, 3, 1888, 1472)
2024-08-13 15:54:49.462 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:189 - formula nums: 25, mfr time: 10.67
2024-08-13 15:54:59.903 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:159 - layout detection cost: 10.44
0: 1888x1472 18 embeddings, 4241.8ms
Speed: 20.1ms preprocess, 4241.8ms inference, 0.9ms postprocess per image at shape (1, 3, 1888, 1472)
2024-08-13 15:55:12.180 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:189 - formula nums: 18, mfr time: 7.93
2024-08-13 15:55:12.184 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:124 - doc analyze cost: 62.73242211341858
2024-08-13 15:55:12.233 | INFO     | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 0, last_page_cost_time: 0.0
2024-08-13 15:55:12.305 | INFO     | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 1, last_page_cost_time: 0.07
2024-08-13 15:55:12.364 | INFO     | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 2, last_page_cost_time: 0.06
2024-08-13 15:55:12.743 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:140 - 发现了列表,列表行数:[(8, 9)], [[8, 9]]
2024-08-13 15:55:12.744 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:153 - 列表行的第8到第9行是列表
2024-08-13 15:55:12.750 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:140 - 发现了列表,列表行数:[(19, 20)], [[19]]
2024-08-13 15:55:12.750 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:153 - 列表行的第19到第20行是列表
2024-08-13 15:55:12.755 | INFO     | magic_pdf.para.para_split_v2:para_split:764 - 连接了第0页和第1页的段落
2024-08-13 15:55:13.239 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_markdown:48 - uni_pipe mk mm_markdown finished
2024-08-13 15:55:13.278 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_uni_format:43 - uni_pipe mk content list finished
2024-08-13 15:55:13.278 | INFO     | magic_pdf.tools.common:do_parse:119 - local output dir is ./res/GenZ-LLM-Analyzer/auto
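
To see where the time goes without reading the whole log, the two summary lines can be pulled out with a regular expression. This is a small sketch that assumes the "model init cost" and "doc analyze cost" messages keep the format shown in the log above. The helper name extract_costs is my own.

```python
import re
from typing import Dict

# Matches the two summary messages and captures the number of seconds.
COST_PATTERN = re.compile(r"(model init cost|doc analyze cost): ([0-9.]+)")


def extract_costs(log_text: str) -> Dict[str, float]:
    """Map each summary label found in the log to its cost in seconds."""
    return {label: float(value) for label, value in COST_PATTERN.findall(log_text)}


if __name__ == "__main__":
    # Two lines copied from the log above.
    sample = (
        "2024-08-13 15:54:09.336 | INFO | custom_model_init:98 - model init cost: 25.18623661994934\n"
        "2024-08-13 15:55:12.184 | INFO | doc_analyze:124 - doc analyze cost: 62.73242211341858\n"
    )
    costs = extract_costs(sample)
    print(f"{sum(costs.values()):.1f} s")  # prints: 87.9 s
```

For this run, model loading accounts for a bit more than a quarter of the measured time, so the per-page cost on CPU is roughly 20 s.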
