Vary-Toy

The Young's First "Large" Vision Language Model

Haoran Wei    Lingyu Kong    Jinyue Chen    Liang Zhao    Zheng Ge    En Yu    Jianjian Sun    Chunrui Han    Xiangyu Zhang   
MEGVII Technology   Equal Contribution
Multimodal Examples
*Vary-toy excels at many tasks, such as document OCR and object detection.
Input

[Image] Detect the ball in this image.

Output

[Image]

Input

[Image] Detect a zebra on the right.

Output

[Image]

Input

[Image] Detect all objects in this image.

Output

[Image]

Input

[Image] Written all the texts.

Output

This is Ana, and her dragon, Kepri," Griffith said.
Drake nodded. 'Hi." he said.
"We could use somebody new around here," Ana said with a grin.
"You have met the other Dragon Masters and their dragons. It is time for you to meet your dragon," Griffith said.
Drake's heart pounded. My family will never believe this! he thought. Just this morning I was digging onions. Now I have ridden a horse. I've met a wizard. I'Ve seen the Dragon Stone. And I'm going to have my very own dragon?"

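Input

[Image] Convert the document to markdown format.

Output

Activity 1
In our earlier studies we have already found the following calculation pattern:
\(15 \times 15=1 \times 2 \times 100+25=225\),
\(25 \times 25=2 \times 3 \times 100+25=625\) ,
\(35 \times 35=3 \times 4 \times 100+25=1225\) ,

Can you write down the general rule? Can you prove your conclusion using what you have learned in this chapter?
## Activity 2
(1) Compute the products of the following pairs of numbers (in each pair the tens digits are the same and the ones digits sum to 10). What pattern do you notice in the results?
\(53 \times 57,38 \times 32,84 \times 86,71 \times 79\).
(2) Can you explain this pattern using what you have learned in this chapter?
(3) Use the pattern you found to compute: \(58 \times 52,63 \times 67,75^{2}, 95^{2}\).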

Input

[Image] Describe this image in within 100 words.

Output

A cat sleeping on top of a pair of shoes.