英伟达基于Mistral 7B开发新一代Embedding模型——NV-Embed-v2

在这里插入图片描述

我们介绍的 NV-Embed-v2 是一种通用嵌入模型,它在大规模文本嵌入基准(MTEB 基准)(截至 2024 年 8 月 30 日)的 56 项文本嵌入任务中以 72.31 的高分排名第一。此外,它还在检索子类别中排名第一(在 15 项任务中获得 62.65 分),这对 RAG 技术的发展至关重要。

NV-Embed-v2 采用了多项新设计,包括让 LLM 关注潜在向量,以获得更好的池化嵌入输出,并展示了一种两阶段指令调整方法,以提高检索和非检索任务的准确性。此外,NV-Embed-v2 还采用了一种新颖的硬阴性挖掘方法,该方法考虑了正相关性得分,能更好地去除假阴性。

有关更多技术细节,请参阅我们的论文: NV-Embed:将 LLM 训练为通用嵌入模型的改进技术。

型号详情

  • 仅用于解码器的基本 LLM:Mistral-7B-v0.1
  • 池类型: Latent-Attention
  • 嵌入尺寸: 4096

如何使用

所需软件包

如果遇到问题,请尝试安装以下 python 软件包

pip uninstall -y transformer-engine
pip install torch==2.2.0
pip install transformers==4.42.4
pip install flash-attn==2.2.0
pip install sentence-transformers==2.7.0

以下是如何使用 Huggingface-transformer 和 Sentence-transformer 对查询和段落进行编码的示例。

HuggingFace Transformers

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel# Each query needs to be accompanied by an corresponding instruction describing the task.
task_name_to_instruct = {"example": "Given a question, retrieve passages that answer the question",}query_prefix = "Instruct: "+task_name_to_instruct["example"]+"\nQuery: "
queries = ['are judo throws allowed in wrestling?', 'how to become a radiology technician in michigan?']# No instruction needed for retrieval passages
passage_prefix = ""
passages = ["Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.","Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan."
]# load model with tokenizer
model = AutoModel.from_pretrained('nvidia/NV-Embed-v2', trust_remote_code=True)# get the embeddings
max_length = 32768
query_embeddings = model.encode(queries, instruction=query_prefix, max_length=max_length)
passage_embeddings = model.encode(passages, instruction=passage_prefix, max_length=max_length)# normalize embeddings
query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
passage_embeddings = F.normalize(passage_embeddings, p=2, dim=1)# get the embeddings with DataLoader (spliting the datasets into multiple mini-batches)
# batch_size=2
# query_embeddings = model._do_encode(queries, batch_size=batch_size, instruction=query_prefix, max_length=max_length, num_workers=32, return_numpy=True)
# passage_embeddings = model._do_encode(passages, batch_size=batch_size, instruction=passage_prefix, max_length=max_length, num_workers=32, return_numpy=True)scores = (query_embeddings @ passage_embeddings.T) * 100
print(scores.tolist())
# [[87.42693328857422, 0.46283677220344543], [0.965264618396759, 86.03721618652344]]

Sentence-Transformers

import torch
from sentence_transformers import SentenceTransformer# Each query needs to be accompanied by an corresponding instruction describing the task.
task_name_to_instruct = {"example": "Given a question, retrieve passages that answer the question",}query_prefix = "Instruct: "+task_name_to_instruct["example"]+"\nQuery: "
queries = ['are judo throws allowed in wrestling?', 'how to become a radiology technician in michigan?']# No instruction needed for retrieval passages
passages = ["Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.","Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan."
]# load model with tokenizer
model = SentenceTransformer('nvidia/NV-Embed-v2', trust_remote_code=True)
model.max_seq_length = 32768
model.tokenizer.padding_side="right"def add_eos(input_examples):input_examples = [input_example + model.tokenizer.eos_token for input_example in input_examples]return input_examples# get the embeddings
batch_size = 2
query_embeddings = model.encode(add_eos(queries), batch_size=batch_size, prompt=query_prefix, normalize_embeddings=True)
passage_embeddings = model.encode(add_eos(passages), batch_size=batch_size, normalize_embeddings=True)scores = (query_embeddings @ passage_embeddings.T) * 100
print(scores.tolist())

MTEB 基准的指令模板

对于检索、STS 和摘要的 MTEB 子任务,请使用 instructions.json 中的指令前缀模板。 对于分类、聚类和重排,请使用 NV-Embed 论文表 7 中提供的说明。 7 中提供的说明。

instructions.json

{"ClimateFEVER":{"query": "Given a claim about climate change, retrieve documents that support or refute the claim","corpus": ""},"HotpotQA":{"query": "Given a multi-hop question, retrieve documents that can help answer the question","corpus": ""},"FEVER":{"query": "Given a claim, retrieve documents that support or refute the claim","corpus": ""},"MSMARCO":{"query": "Given a web search query, retrieve relevant passages that answer the query","corpus": ""},"DBPedia":{"query": "Given a query, retrieve relevant entity descriptions from DBPedia","corpus": ""},"NQ":{"query": "Given a question, retrieve passages that answer the question","corpus": ""},"QuoraRetrieval":{"query": "Given a question, retrieve questions that are semantically equivalent to the given question","corpus": "Given a question, retrieve questions that are semantically equivalent to the given question"},"SCIDOCS":{"query": "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper","corpus": ""},"TRECCOVID":{"query": "Given a query on COVID-19, retrieve documents that answer the query","corpus": ""},"Touche2020":{"query": "Given a question, retrieve passages that answer the question","corpus": ""},"SciFact":{"query": "Given a scientific claim, retrieve documents that support or refute the claim","corpus": ""},"NFCorpus":{"query": "Given a question, retrieve relevant documents that answer the question","corpus": ""},"ArguAna":{"query": "Given a claim, retrieve documents that support or refute the claim","corpus": ""},"FiQA2018":{"query": "Given a financial question, retrieve relevant passages that answer the query","corpus": ""},"STS":{"text": "Retrieve semantically similar text"},"SUMM":{"text": "Given a news summary, retrieve other semantically similar summaries"}
}

如何启用多 GPU(注意,这是 HuggingFace Transformers的情况)

from transformers import AutoModel
from torch.nn import DataParallelembedding_model = AutoModel.from_pretrained("nvidia/NV-Embed-v2")
for module_key, module in embedding_model._modules.items():embedding_model._modules[module_key] = DataParallel(module)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.rhkb.cn/news/472566.html

如若内容造成侵权/违法违规/事实不符,请联系长河编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

【计算机网络】TCP网络特点2

断开连接 四次挥手 原因 TCP 四次挥手是为了满足 TCP 连接的全双工特性:两个方向都可以自由传输 保证数据传输的完整性:两方都完成了数据发送和接收并且都同意断开连接 可靠地终止连接以及避免数据混淆和错误等需求:每个方向都需要单独确认导致四次挥手过程 这些…

Opengl光照测试

代码 #include "Model.h" #include "shader_m.h" #include "imgui.h" #include "imgui_impl_glfw.h" #include "imgui_impl_opengl3.h" //以上是放在同目录的头文件#include <glad/glad.h> #include <GLFW/glfw3.…

【MySQL】SQL语言

【MySQL】SQL语言 文章目录 【MySQL】SQL语言前言一、SQL的通用语法二、SQL的分类三、SQLDDLDMLDQLDCL 总结 前言 本篇文章将讲到SQL语言&#xff0c;包括SQL的通用语法,SQL的分类,以及SQL语言的DDL,DML,DQL,DCL。 一、SQL的通用语法 在学习具体的SQL语句之前&#xff0c;先来…

.netcore + postgis 保存地图围栏数据

一、数据库字段 字段类型选择(Type) 设置对象类型为&#xff1a;geometry 二、前端传递的Json格式转换 前端传递围栏的各个坐标点数据如下&#xff1a; {"AreaRange": [{"lat": 30.123456,"lng": 120.123456},{"lat": 30.123456…

T265相机双目鱼眼+imu联合标定(全记录)

最近工作用到t265&#xff0c;记录一遍标定过程 1.安装驱动 首先安装realsense驱动&#xff0c;因为笔者之前使用过d435i&#xff0c;装的librealsense版本为2.55.1&#xff0c;直接使用t265会出现找不到设备的问题&#xff0c;经查阅发现是因为realsense在2.53.1后就不再支持…

【C语言指南】C语言内存管理 深度解析

&#x1f493; 博客主页&#xff1a;倔强的石头的CSDN主页 &#x1f4dd;Gitee主页&#xff1a;倔强的石头的gitee主页 ⏩ 文章专栏&#xff1a;《C语言指南》 期待您的关注 引言 C语言是一种强大而灵活的编程语言&#xff0c;为程序员提供了对内存的直接控制能力。这种对内存…

Python学习从0到1 day26 第三阶段 Spark ④ 数据输出

半山腰太挤了&#xff0c;你该去山顶看看 —— 24.11.10 一、输出为python对象 1.collect算子 功能: 将RDD各个分区内的数据&#xff0c;统一收集到Driver中&#xff0c;形成一个List对象 语法&#xff1a; rdd.collect() 返回值是一个list列表 示例&#xff1a; from …

【机器学习】机器学习中用到的高等数学知识-1.线性代数 (Linear Algebra)

向量(Vector)和矩阵(Matrix)&#xff1a;用于表示数据集&#xff08;Dataset&#xff09;和特征&#xff08;Feature&#xff09;。矩阵运算&#xff1a;加法、乘法和逆矩阵(Inverse Matrix)等&#xff0c;用于计算模型参数。特征值(Eigenvalues)和特征向量(Eigenvectors)&…

java项目-jenkins任务的创建和执行

参考内容: jenkins的安装部署以及全局配置 1.编译任务的general 2.源码管理 3.构建里编译打包然后copy复制jar包到运行服务器的路径 clean install -DskipTests -Pdev 中的-Pdev这个参数用于激活 Maven 项目中的特定构建配置&#xff08;Profile&#xff09; 在 pom.xml 文件…

【数据库取证】快速从服务器镜像文件中获取后台隐藏数据

文章关键词&#xff1a;电子数据取证、数据库取证、电子物证、云取证、手机取证、计算机取证、服务器取证 小编最近做了很多鉴定案件和参加相关电子数据取证比武赛&#xff0c;经常涉及到服务器数据库分析。现在分享一下技术方案&#xff0c;供各位在工作中和取证赛事中取得好成…

__VUE_PROD_HYDRATION_MISMATCH_DETAILS__ is not explicitly defined

VUE_PROD_HYDRATION_MISMATCH_DETAILS 未明确定义。您正在运行 Vue 的 esm-bundler 构建&#xff0c;它期望这些编译时功能标志通过捆绑器配置全局注入&#xff0c;以便在生产捆绑包中获得更好的tree-shaking优化。 Vue.js应用程序正在使用ESM&#xff08;ECMAScript模块&#…

git撤销、回退某个commit的修改

文章目录 撤销某个特定的commit方法 1&#xff1a;使用 git revert方法 2&#xff1a;使用 git rebase -i方法 3&#xff1a;使用 git reset 撤销某个特定的commit 如果你要撤销某个很早之前的 commit&#xff0c;比如 7461f745cfd58496554bd672d52efa8b1ccf0b42&#xff0c;可…

Flume和kafka的整合

1、Kafka作为Source 【数据进入到kafka中&#xff0c;抽取出来】 在flume的conf文件夹下&#xff0c;有一个flumeconf 文件夹&#xff1a;这个文件夹是自己创建的 创建一个flume脚本文件&#xff1a; kafka-memory-logger.conf Flume 1.9用户手册中文版 — 可能是目前翻译最完…

JavaSE常用API-日期(计算两个日期时间差-高考倒计时)

计算两个日期时间差&#xff08;高考倒计时&#xff09; JDK8之前日期、时间 Date SimpleDateFormat Calender JDK8开始日期、时间 LocalDate/LocalTime/LocalDateTime ZoneId/ZoneDateTIme Instant-时间毫秒值 DateTimeFormatter Duration/Period

支持向量机SVM——基于分类问题的监督学习算法

支持向量机&#xff08;SVM&#xff0c;Support Vector Machine&#xff09;是一种常用于分类问题的监督学习算法&#xff0c;其核心思想是通过寻找一个最佳的超平面来将不同类别的数据点分开&#xff0c;从而实现分类。支持向量机广泛应用于模式识别、文本分类、图像识别等任务…

node对接ChatGpt的流式输出的配置

node对接ChatGpt的流式输出的配置 首先看一下效果 将数据用流的方式返回给客户端,这种技术需求在传统的管理项目中不多见,但是在媒体或者有实时消息等功能上就会用到,这个知识点对于前端还是很重要的。 即时你不写服务端,但是服务端如果给你这样的接口,你也得知道怎么去使用联…

SobarQube实现PDF报告导出

文章目录 前言一、插件配置二、使用步骤1.新生成一个Token2.将拷贝的Token加到上文中执行的命令中3.查看报告 三、友情提示总结 前言 这篇博文是承接此文 .Net项目在Windows中使用sonarqube进行代码质量扫描的详细操作配置 描述如何导出PDF报告 众所周知&#xff0c;导出PDF功…

[Codesys]常用功能块应用分享-BMOV功能块功能介绍及其使用实例说明

官方说明 功能说明 参数 类型 功能 pbyDataSrcPOINTER TO BYTE指向源数组指针uiSizeUINT要移动数据的BYTE数pbyDataDesPOINTER TO BYTE指向目标数组指针 实例应用-ST IF SYSTEM_CLOCK.AlwaysTrue THENCASE iAutoState OF0: //读写完成信号在下次读写信号的上升沿或复位信号…

sql注入之二次注入(sqlilabs-less24)

二阶注入&#xff08;Second-Order Injection&#xff09;是一种特殊的 SQL 注入攻击&#xff0c;通常发生在用户输入的数据首先被存储在数据库中&#xff0c;然后在后续的操作中被使用时&#xff0c;触发了注入漏洞。与传统的 SQL 注入&#xff08;直接注入&#xff09;不同&a…

springboot实现简单的数据查询接口(无实体类)

目录 前言&#xff1a;springboot整体架构 1、ZjGxbMapper.xml 2、ZjGxbMapper.java 3、ZjGxbService.java 4、ZjGxbController.java 5、调用接口测试数据是否正确 6、打包放到服务器即可 前言&#xff1a;springboot整体架构 文件架构&#xff0c;主要编写框选的这几类…