Omnivore: a non-picky AI model that handles images, videos, and 3D data!
2022-07-18 22:11:00 【Zilliz】
Publisher: Towhee technical team
Tired of using a different model for each kind of data? Have you ever wondered whether one model could handle data of different modalities? In early 2022, Meta AI released Omnivore, a single model that handles multiple visual modalities and can classify images, videos, and 3D data. Omnivore is not only compatible with several data types; it also ranks among the best on the benchmark datasets for each task. It reaches 86.0% accuracy on the ImageNet image classification dataset, 84.1% on the Kinetics action recognition dataset, and 67.1% on the SUN RGB-D dataset for single-view 3D scene classification.

Omnivore: Multiple visual modalities
Omnivore converts data from different visual modalities into a common vector format, then exploits the flexibility of the Transformer architecture to train jointly on classification tasks across those modalities. Whether trained from scratch or fine-tuned from a pretrained model, Omnivore needs only off-the-shelf standard datasets to match or even exceed the performance of the corresponding single-modality models.
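To make the "common vector format" idea more concrete, here is a minimal sketch of how inputs from the three modalities could be laid out as patches on one shared grid. It is an illustration under stated assumptions, not the Omnivore implementation: the patch size, the input resolutions, and the names `PATCH`, `as_video_shape`, and `token_layout` are all hypothetical, and the sketch only computes shapes rather than running a model.

```python
from math import prod

# Hypothetical patch size (frames, height, width); chosen for illustration only.
PATCH = (2, 4, 4)


def as_video_shape(modality, shape):
    """Map each modality to a common (T, H, W, C) layout."""
    if modality == "image":   # (H, W, 3) is treated as a single-frame video
        h, w, c = shape
        return (1, h, w, c)
    if modality == "video":   # already (T, H, W, 3)
        return tuple(shape)
    if modality == "rgbd":    # (H, W, 4): RGB plus one depth channel
        h, w, c = shape
        return (1, h, w, c)
    raise ValueError(f"unknown modality: {modality}")


def token_layout(modality, shape, patch=PATCH):
    """Return (num_tokens, values_per_patch) after padding to whole patches."""
    t, h, w, c = as_video_shape(modality, shape)
    pt, ph, pw = patch
    # Ceiling division, so a single frame is padded up to one temporal patch.
    grid = (-(-t // pt), -(-h // ph), -(-w // pw))
    return prod(grid), pt * ph * pw * c


if __name__ == "__main__":
    for name, shape in [
        ("image", (224, 224, 3)),
        ("video", (32, 224, 224, 3)),
        ("rgbd", (224, 224, 4)),
    ]:
        print(name, token_layout(name, shape))
```

In this sketch every modality ends up as a sequence of patches; a learned linear projection (not shown) would then map each patch to the same embedding width, so one Transformer can consume all three kinds of input and be trained on them jointly.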
References:
Model use case: action-classification/omnivore
Paper: OMNIVORE: A Single Model for Many Visual Modalities
More information: Facebook AI introduces a "super model" that handles three classification tasks on images, videos, and 3D data, with performance on par with independent models
For more project updates and details, follow our project (https://github.com/towhee-io/towhee/blob/main/towhee/models/README_CN.md). Your support is what keeps this labor of love going. Stars, forks, and joining us on Slack are all welcome :)
