A roundup of papers and projects across the research directions related to the Segment Anything Model (SAM)

Contents

  • Introduction
  • Anything Projects
    • AnyObject
    • AnyGeneration
    • Any3D
    • AnyModel
    • AnyTask
    • AnyX
  • Paper Summary
    • AnyObject
    • AnyGeneration
    • AnyModel
    • AnyTask

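Introduction

Mainstream tasks in the "Anything" family: 2D detection (AnyObject), 3D detection (Any3D), AI generation (AnyGeneration), AI model optimization (AnyModel), AI tasks (AnyTask), etc.

  • AnyObject - segmentation, detection, classification, medical imaging, OCR, pose estimation, etc.
  • AnyGeneration - text-to-image generation, editing, inpainting, style transfer, etc.
  • Any3D - 3D generation, segmentation, etc.
  • AnyModel - any pruning, any quantization, model reuse.
  • AnyTask - LLM controller + ModelZoo, generalized decoding, multi-task learning.
  • AnyX - other topics: captioning, etc.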

Anything Projects

AnyObject

Title & Authors | Intro | Useful Links

Segment Anything
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick
> Meta Research
> Preprint’23

[Segment Anything (Project)]
[Github]
[Page]
[Demo]

OVSeg: Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
> Meta Research
> Preprint’23

[OVSeg (Project)]
[Github]
[Page]

Learning to Segment Every Thing
Ronghang Hu, Piotr Dollar, Kaiming He, Trevor Darrell, Ross Girshick
> UC Berkeley, FAIR
> CVPR’18

[seg_every_thing (Project)]
[Github]
[Page]

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu and Zhaoyang Zeng and Tianhe Ren and Feng Li and Hao Zhang and Jie Yang and Chunyuan Li and Jianwei Yang and Hang Su and Jun Zhu and Lei Zhang
> IDEA-Research
> Preprint’23

[Grounded-SAM, GroundingDINO (Project)]
[Github]
[Demo]

SegGPT: Segmenting Everything In Context
Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang
> BAAI-Vision
> Preprint’23

[SegGPT (Project)]
[Github]

V3Det: Vast Vocabulary Visual Detection Dataset
Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin
> Shanghai AI Laboratory, CUHK
> Preprint’23

segment-anything-video (Project)
Kadir Nar
[Github]

Towards Segmenting Anything That Moves
Achal Dave, Pavel Tokmakov, Deva Ramanan
> ICCV’19 Workshop

[segment-any-moving (Project)]
[Github]

Semantic Segment Anything
Jiaqi Chen, Zeyu Yang, Li Zhang

[Semantic-Segment-Anything (Project)]
[Github]

Grounded Segment Anything: From Objects to Parts (Project)
Peize Sun and Shoufa Chen
[Github]

GroundedSAM-zero-shot-anomaly-detection (Project)
Yunkang Cao
[Github]

Segment Anything Labelling Tool (SALT) (Project)
Anurag Ghosh
[Github]

Prompt-Segment-Anything (Project)
Rockey
[Github]

SAM-RBox (Project)
Qingyun Li
[Github]

VISAM (Project)
Feng Yan, Weixin Luo, Yujie Zhong, Yiyang Gan, Lin Ma
[Github]

Segment Anything EO tools: Earth observation tools for Meta AI Segment Anything (Project)
Aliaksandr Hancharenka, Alexander Chichigin
[Github]

napari-segment-anything: Segment Anything Model (SAM) native Qt UI (Project)
Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni
[Github]

SAM-Medical-Imaging: Segment Anything Model (SAM) native Qt UI (Project)
Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni
[Github]

OCR-SAM: Combining MMOCR with Segment Anything & Stable Diffusion. (Project)
Zhenhua Yang, Qing Jiang
[Github]

segment-anything-u-specify: using sam+clip to segment any objs u specify with text prompts. (Project)
MaybeShewill-CV
[Github]

Segment Everything Everywhere All at Once
Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee

[SEEM (Project)]
[Github]

SegDrawer: Simple static web-based mask drawer (Project)
Harry
[Github]

Magic Copy: a Chrome extension (Project)
Harry
[Github]

Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng

[Track-Anything (Project)]
[Github]
[Demo]

Count Anything (Project)
Liqi Yan
[Github]

Segment-and-Track-Anything (Project)
Zongxin Yang
[Github]

Pose for Everything: Towards Category-Agnostic Pose Estimation
Lumin Xu*, Sheng Jin*, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
> CUHK, SenseTime
> ECCV’22 Oral

[Pose-for-Everything (Project)]
[Github]

Relate Anything Model (Project)
Zujin Guo*, Bo Li*, Jingkang Yang*, Zijian Zhou*, Ziwei Liu
> MMLab@NTU
> VisCom Lab, KCL/TongJi
[Github]

SegmentAnyRGBD (Project)
Jun Cen, Yizheng Wu, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong
> Visual Intelligence Lab@HKUST,
> HUST,
> MMLab@NTU,
> Smiles Lab@XJTU,
> NUS
[Github]



AnyGeneration

Title & Authors | Intro | Useful Links

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer
> LMU München, Runway ML
> CVPR’22

[Stable-Diffusion (Project)]
[Github]
[Page]
[Demo]

Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang, Maneesh Agrawala
> Stanford University
> Preprint’23

[ControlNet (Project)]
[Github]
[Demo]

GigaGAN: Large-scale GAN for Text-to-Image Synthesis
Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park
> POSTECH, Carnegie Mellon University, Adobe Research
> CVPR’23
[Page]

Inpaint-Anything: Segment Anything Meets Image Inpainting (Project)
Tao Yu
[Github]

IEA: Image Editing Anything (Project)
Zhengcong Fei
[Github]

EditAnything (Project)
Shanghua Gao, Pan Zhou
[Github]

Segment Anything for Stable Diffusion Webui (Project)
Chengsong Zhang
[Github]

Segment Anything with Clip (Project)
Jinwoo Park
[Github]

ShowAnything: Edit and Generate Anything In Image and Video (Project)
Showlab, NUS
[Github]

Transfer-Any-Style: About An interactive demo based on Segment-Anything for style transfer (Project)
LV-Lab, NUS
[Github]



Any3D

Title & Authors | Intro | Useful Links

Anything-3D: Segment-Anything + 3D, Let’s lift the anything to 3D (Project)
LV-Lab, NUS
[Github]

SAM 3D Selector: Utilizing segment-anything to help the region selection of 3D point cloud or mesh. (Project)
Nexuslrf
[Github]

3D-Box via Segment Anything. (Project)
dvlab-research
[Github]

Segment Anything 3D (Project)
Yunhan Yang, Xiaoyang Wu
[Github]



AnyModel

Title & Authors | Intro | Useful Links

DepGraph: Towards Any Structural Pruning
Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang
> Learning and Vision Lab @ NUS
> CVPR’23

[Torch-Pruning (Project)]
[Github]
[Demo]

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li and Mingzhu Shen and Jian Ma and Yan Ren and Mingxin Zhao and Qi Zhang and Ruihao Gong and Fengwei Yu and Junjie Yan
> SenseTime Research
> NeurIPS’21

[MQBench (Project)]
[Github]
[Page]

OTOv2: Automatic, Generic, User-Friendly
Tianyi Chen, Luming Liang, Tianyu Ding, Ilya Zharkov
> Microsoft
> ICLR’23

[Only Train Once (Project)]
[Github]

Deep Model Reassembly
Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang
> LV Lab, NUS
> NeurIPS’22

[Deep Model Reassembly (Project)]
[Github]
[Page]



AnyTask

Title & Authors | Intro | Useful Links

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang
> Zhejiang University, MSRA
> Preprint’23

[Jarvis (Project)]
[Github]
[Demo]

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan
> Microsoft
> Preprint’23
[Github]

Generalized Decoding for Pixel, Image and Language
Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao
> Microsoft
> CVPR’23

[X-Decoder (Project)]
[Github]
[Page]
[Demo]

Pre-Trained Image Processing Transformer
Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen
> Huawei-Noah
> CVPR’21

[Pretrained-IPT (Project)]
[Github]

OpenAGI: When LLM Meets Domain Experts
Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, Yongfeng Zhang
> Rutgers University
> Preprint’23

[OpenAGI (Project)]
[Github]



AnyX

Title & Authors | Intro | Useful Links

Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang, Jinrui Zhang, Junjie Fei, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao
> SUSTech VIP Lab
> Preprint’23

[Caption Anything (Project)]
[Github]
[Demo]

Image2Paragraph:Transform Image into Unique Paragraph (Project)
Jinpeng Wang
[Github]



Paper Summary

AnyObject

Paper | First Author | Venue | Topic
Segment Anything | Alexander Kirillov | Preprint’23 | Segmentation
Learning to Segment Every Thing | Ronghang Hu | CVPR’18 |
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu | Preprint’23 | Grounding + Detection
SegGPT: Segmenting Everything In Context | Xinlong Wang | Preprint’23 | Segmentation
V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang | Preprint’23 | Dataset
Pose for Everything: Towards Category-Agnostic Pose Estimation | Lumin Xu | ECCV’22 Oral | Pose

AnyGeneration

Paper | First Author | Venue | Topic
High-Resolution Image Synthesis with Latent Diffusion Models | Robin Rombach | CVPR’22 | Text-to-Image Generation
Adding Conditional Control to Text-to-Image Diffusion Models | Lvmin Zhang | Preprint’23 | Controllable Generation
GigaGAN: Large-scale GAN for Text-to-Image Synthesis | Minguk Kang | CVPR’23 | Large-scale GAN
Inpaint Anything: Segment Anything Meets Image Inpainting | Tao Yu | Preprint’23 | Inpainting

AnyModel

Paper | First Author | Venue | Topic
DepGraph: Towards Any Structural Pruning | Gongfan Fang | CVPR’23 | Network Pruning
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark | Yuhang Li | NeurIPS’21 | Network Quantization
OTOv2: Automatic, Generic, User-Friendly | Tianyi Chen | ICLR’23 | Network Pruning
Deep Model Reassembly | Xingyi Yang | NeurIPS’22 | Model Reuse

AnyTask

Paper | First Author | Venue | Topic
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace | Yongliang Shen | Preprint’23 | Modelzoo + LLM
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs | Yaobo Liang | Preprint’23 | Modelzoo + LLM
Generalized Decoding for Pixel, Image and Language | Xueyan Zou | CVPR’23 | Multi Tasking
Pre-Trained Image Processing Transformer | Hanting Chen | CVPR’21 | Low-level Vision
