TruEra

Contents

- About TruEra
- About TruLens

About TruEra

TruEra Gen AI Observability and LLM Evaluation
Monitor, evaluate, and debug your LLM and Gen AI apps.
All part of Full Lifecycle AI Observability from TruEra.

Website: https://truera.com
github : https://github.com/truera
https://github.com/truera/truera-examples
trulens : https://www.trulens.org
https://github.com/truera/trulens/
Paper: https://arxiv.org/abs/1802.03788

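As you build and deploy ML models, TruEra plugs into your ML stack so you can test, debug, and monitor your projects, making sure every model does what it should and, if it does not, finding out why. From feature development that helps you refine your data, through efficient model training and evaluation, to validation of the final production model, TruEra supports each stage.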

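To learn how to create and ingest your first project in TruEra, use the SDK Quickstart.
If you want to explore a specific AI quality concept such as performance, drift, or fairness, start with the Starter Examples.
These notebooks come in two parts: they walk you through testing an ML model for a specific problem and then show how to improve the model along that axis. Finally, if you need to integrate with a particular framework or environment, see the Integrations and Extensions section.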

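TruEra, a company that provides software for comprehensive testing, debugging, and monitoring of machine learning models, has launched TruLens for LLM Applications, the first open-source testing software for applications built on large language models (such as GPT).
LLMs are becoming a key technology for many future applications, but concern about their use is also growing: hallucination, inaccuracy, toxicity, bias, security, and potential misuse have all drawn wide attention.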

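Anupam Datta, co-founder, president, and chief scientist of TruEra, explains: "TruLens feedback functions score an LLM application by analyzing the text and metadata it generates. By modeling this relationship, we can apply it programmatically to scale up model evaluation."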
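
To make the idea of a feedback function concrete, here is a minimal sketch in plain Python. It does not use the TruLens API: `Record`, `keyword_overlap`, and `latency_ok` are hypothetical names invented for illustration, and the scoring rules are toy stand-ins. It only shows the shape described in the quote, a function that takes the text and metadata of one app call and returns a score.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One app call: the prompt, the generated text, and any metadata (hypothetical type)."""
    prompt: str
    response: str
    metadata: dict = field(default_factory=dict)


def keyword_overlap(record: Record) -> float:
    """Toy text feedback: fraction of prompt words that reappear in the response."""
    prompt_words = set(record.prompt.lower().split())
    response_words = set(record.response.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & response_words) / len(prompt_words)


def latency_ok(record: Record, budget_s: float = 2.0) -> float:
    """Toy metadata feedback: 1.0 if the call finished within the budget, else 0.0."""
    latency = record.metadata.get("latency_s", float("inf"))
    return 1.0 if latency <= budget_s else 0.0


record = Record(
    prompt="what is model drift",
    response="model drift is a change in the data a model sees over time",
    metadata={"latency_s": 0.8},
)

# Apply every feedback function to the same record and collect the scores.
scores = {f.__name__: f(record) for f in (keyword_overlap, latency_ok)}
print(scores)
```

Because each feedback is just a function from a record to a number, the same set can be applied automatically to every call an app makes, which is what allows evaluation to scale.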

About TruLens

Evaluate and Track LLM Applications

trulens : https://www.trulens.org
github : https://github.com/truera/trulens/

TruLens provides a set of tools for developing and monitoring neural nets, including large language models.
This includes both tools for evaluation of LLMs and LLM-based applications with TruLens-Eval and deep learning explainability with TruLens-Explain.
TruLens-Eval and TruLens-Explain are housed in separate packages and can be used independently.


Where TruLens fits in the development workflow

Build your first prototype then connect instrumentation and logging with TruLens.
Decide what feedbacks you need, and specify them with TruLens to run alongside your app.
Then iterate and compare versions of your app in an easy-to-use user interface.
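
The three steps above (instrument and log, attach feedbacks, compare versions) can be sketched without any library. The code below is an assumption-laden illustration, not TruLens code: `app_v1`, `app_v2`, `record_app`, and `answer_mentions_question` are hypothetical, and the final dictionary stands in for the comparison that the TruLens user interface provides.

```python
from collections import defaultdict
from statistics import mean


def app_v1(question: str) -> str:
    # Hypothetical first prototype: always gives the same unhelpful answer.
    return "I am not sure."


def app_v2(question: str) -> str:
    # Hypothetical revision: refers back to the question in its answer.
    return f"Here is what I know about {question}."


def answer_mentions_question(question: str, answer: str) -> float:
    """Toy feedback: share of question words that appear in the answer."""
    q_words = set(question.lower().split())
    a_words = set(answer.lower().rstrip(".").split())
    return len(q_words & a_words) / len(q_words) if q_words else 0.0


def record_app(app, version: str, log: list):
    """Wrap an app so every call is logged with its version and feedback score."""
    def wrapped(question: str) -> str:
        answer = app(question)
        log.append({
            "version": version,
            "question": question,
            "answer": answer,
            "score": answer_mentions_question(question, answer),
        })
        return answer
    return wrapped


log = []
questions = ["model drift", "fairness metrics"]
for version, app in (("v1", app_v1), ("v2", app_v2)):
    recorded = record_app(app, version, log)
    for question in questions:
        recorded(question)

# Aggregate the logged scores per version to compare the two apps.
scores_by_version = defaultdict(list)
for row in log:
    scores_by_version[row["version"]].append(row["score"])
leaderboard = {version: mean(scores) for version, scores in scores_by_version.items()}
print(leaderboard)
```

Keeping the wrapper separate from the app means the prototype itself does not change when feedbacks are added or swapped, so different versions can be scored under identical conditions.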