Segmentation 101: The Largest List of Open and Searchable Segmentation Datasets

Segmentation 101 a.k.a. SEG 101 is an exhaustive list we have created to make it easier for you to search publicly available Image Segmentation datasets.

Segmentation is one of the most time-consuming annotation tasks. Before collecting your own dataset, you may want to experiment on a publicly available one.

So we created this list, which is searchable by class name, so you can quickly find the class you need. It covers Instance Segmentation, Semantic Part Segmentation, Motion Segmentation, Vessel Segmentation, and many other variants.

So, if you want a head start for your AI app or a hobby project that requires pixel-wise annotated data, make a quick stop at the datasets below and use them to build something great!

1. The COCO-Stuff Dataset

Dataset Characteristics:

Task Type: Instance Segmentation, Scene Understanding, Object Localization, Semantic Segmentation.

Image Count: 164K complex images from COCO 2017

Labeled Instances Count: 2.5 million

Categories: 172 classes: 80 things, 91 stuff, and 1 class unlabeled

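*The label IDs cover 91 thing classes (1-91) and 91 stuff classes (92-182), plus the unlabeled class (0). Descriptions are provided for the stuff labels. Furthermore, 11 of the thing classes have been removed from COCO and therefore lack a preview image, which leaves the 80 thing classes counted above.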

Id	Label name	Preview	Description
0	unlabeled		Pixels that do not belong to any of the other classes
1	person	(view)
2	bicycle	(view)
3	car	(view)
4	motorcycle	(view)
5	airplane	(view)
6	bus	(view)
7	train	(view)
8	truck	(view)
9	boat	(view)
10	traffic light	(view)
11	fire hydrant	(view)
12	street sign		Removed from COCO.
13	stop sign	(view)
14	parking meter	(view)
15	bench	(view)
16	bird	(view)
17	cat	(view)
18	dog	(view)
19	horse	(view)
20	sheep	(view)
21	cow	(view)
22	elephant	(view)
23	bear	(view)
24	zebra	(view)
25	giraffe	(view)
26	hat		Removed from COCO.
27	backpack	(view)
28	umbrella	(view)
29	shoe		Removed from COCO.
30	eye glasses		Removed from COCO.
31	handbag	(view)
32	tie	(view)
33	suitcase	(view)
34	frisbee	(view)
35	skis	(view)
36	snowboard	(view)
37	sports ball	(view)
38	kite	(view)
39	baseball bat	(view)
40	baseball glove	(view)
41	skateboard	(view)
42	surfboard	(view)
43	tennis racket	(view)
44	bottle	(view)
45	plate		Removed from COCO.
46	wine glass	(view)
47	cup	(view)
48	fork	(view)
49	knife	(view)
50	spoon	(view)
51	bowl	(view)
52	banana	(view)
53	apple	(view)
54	sandwich	(view)
55	orange	(view)
56	broccoli	(view)
57	carrot	(view)
58	hot dog	(view)
59	pizza	(view)
60	donut	(view)
61	cake	(view)
62	chair	(view)
63	couch	(view)
64	potted plant	(view)
65	bed	(view)
66	mirror		Removed from COCO.
67	dining table	(view)
68	window		Removed from COCO.
69	desk		Removed from COCO.
70	toilet	(view)
71	door		Removed from COCO.
72	tv	(view)
73	laptop	(view)
74	mouse	(view)
75	remote	(view)
76	keyboard	(view)
77	cell phone	(view)
78	microwave	(view)
79	oven	(view)
80	toaster	(view)
81	sink	(view)
82	refrigerator	(view)
83	blender		Removed from COCO.
84	book	(view)
85	clock	(view)
86	vase	(view)
87	scissors	(view)
88	teddy bear	(view)
89	hair drier	(view)
90	toothbrush	(view)
91	hair brush		Removed from COCO.
92	banner	(view)	Any large sign, especially if constructed of soft material or fabric, often seen in stadiums and advertising.
93	blanket	(view)	A loosely woven fabric, used for warmth while sleeping.
94	branch	(view)	The woody part of a tree or bush, arising from the trunk and usually dividing.
95	bridge	(view)	A manmade construction that spans a divide (incl. train bridge, river bridge).
96	building-other	(view)	Any other type of building or structures.
97	bush	(view)	A woody plant distinguished from a tree by its multiple stems and lower height (incl. hedge, scrub).
98	cabinet	(view)	A storage closet, often hanging on the wall.
99	cage	(view)	An enclosure made of bars, often seen in zoos.
100	cardboard	(view)	A wood-based material resembling heavy paper, used in the manufacture of boxes, cartons and signs.
101	carpet	(view)	A fabric used as a floor covering.
102	ceiling-other	(view)	Other types of ceilings (incl. industrial ceilings, painted ceilings).
103	ceiling-tile	(view)	A ceiling made of regularly-shaped slabs.
104	cloth	(view)	A piece of cloth used for a particular purpose. (incl. cleaning cloth).
105	clothes	(view)	Items of clothing or apparel, not currently worn by a person.
106	clouds	(view)	A visible mass of water droplets suspended in the air.
107	counter	(view)	A surface in the kitchen or bathroom, often built into a wall or above a cabinet, which holds the washbasin or surface to prepare food.
108	cupboard	(view)	A piece of furniture used for storing dishware or a wardrobe for clothes, sometimes hanging on the wall.
109	curtain	(view)	A piece of cloth covering a window, bed or shower to offer privacy and keep out light.
110	desk-stuff	(view)	A piece of furniture with a flat surface and typically with drawers, at which one can read, write, or do other work.
111	dirt	(view)	Soil or earth (incl. dirt path).
112	door-stuff	(view)	A portal of entry into a building, room or vehicle, consisting of a rigid plane movable on a hinge (incl. the frame, replaces door).
113	fence	(view)	A thin, human-constructed barrier which separates two pieces of land.
114	floor-marble	(view)	The supporting surface of a room or outside, made of marble.
115	floor-other	(view)	Any other type of floor (incl. rubber-based floor).
116	floor-stone	(view)	The supporting surface of a room or outside, made of stone (incl. brick floor).
117	floor-tile	(view)	The supporting surface of a room or outside, made of regularly-shaped slabs (incl. tiled stone floor, tiled marble floor).
118	floor-wood	(view)	The supporting surface of a room or outside, made of wood (incl. wooden tiles, parquet, laminate, wooden boards).
119	flower	(view)	The seed-bearing part of a plant (incl. the entire flower).
120	fog	(view)	A thick cloud of tiny water droplets suspended in the atmosphere near the earth’s surface.
121	food-other	(view)	Any other type of food.
122	fruit	(view)	The sweet and fleshy product of a tree or other plant.
123	furniture-other	(view)	Any other type of furniture (incl. oven).
124	grass	(view)	Vegetation consisting of typically short plants with long, narrow leaves (incl. lawn, pasture).
125	gravel	(view)	A loose aggregation of small water-worn or pounded stones.
126	ground-other	(view)	Any other type of ground found outside a building.
127	hill	(view)	A naturally raised area of land, not as high as a mountain, viewed at a distance and may be covered in trees, snow or grass.
128	house	(view)	A smaller size building for human habitation.
129	leaves	(view)	A structure of a higher plant, typically green and blade-like, that is attached to a stem or stalk.
130	light	(view)	A source of illumination, especially a lamp (incl. ceiling lights).
131	mat	(view)	A piece of coarse material placed on a floor for people to wipe their feet on.
132	metal	(view)	A raw metal material (incl. a pile of metal).
133	mirror-stuff	(view)	A glass coated surface which reflects a clear image (incl. the frame, replaces mirror).
134	moss	(view)	A small flowerless green plant which lacks true roots, growing in damp habitats.
135	mountain	(view)	A large natural elevation rising abruptly from the surrounding level, viewed at a distance and may be covered in trees, snow or grass.
136	mud	(view)	A soft, sticky matter resulting from the mixing of earth and water.
137	napkin	(view)	A piece of cloth or paper used at a meal to wipe the fingers or lips.
138	net	(view)	An open-meshed fabric twisted, knotted, or woven together at regular intervals.
139	paper	(view)	A material manufactured in thin sheets from the pulp of wood.
140	pavement	(view)	A typically raised paved path for pedestrians at the side of a road.
141	pillow	(view)	A rectangular cloth bag stuffed with soft materials to support the head.
142	plant-other	(view)	Any other type of plant.
143	plastic	(view)	Raw plastic material.
144	platform	(view)	A raised level surface on which people or things can stand (incl. railroad platform).
145	playingfield	(view)	A ground marked off for various games (incl. indoor and outdoor).
146	railing	(view)	A fence or barrier made of typically metal rails.
147	railroad	(view)	A track made of steel rails along which trains run (incl. the wooden beams).
148	river	(view)	A stream of flowing water.
149	road	(view)	A paved way leading from one place to another.
150	rock	(view)	The solid mineral material forming part of the surface of the earth.
151	roof	(view)	The structure forming the upper covering of a building.
152	rug	(view)	A floor covering of thick woven material, typically not extending over the entire floor.
153	salad	(view)	A cold dish of various mixtures of raw or cooked vegetables.
154	sand	(view)	A loose granular substance, typically pale yellowish brown, resulting from erosion (incl. beach).
155	sea	(view)	Expanse of water that covers most of the earth’s surface.
156	shelf	(view)	An open piece of furniture that provides a surface for the storage or display of objects.
157	sky-other	(view)	Any other type of sky (incl. blue sky).
158	skyscraper	(view)	A very tall building of many storeys.
159	snow	(view)	Atmospheric water vapour frozen into ice crystals, falling or lying on the ground.
160	solid-other	(view)	Any other type of solid material.
161	stairs	(view)	A set of steps leading from one floor to another (incl. stairs inside or outside a building).
162	stone	(view)	A piece of stone shaped for a purpose.
163	straw	(view)	Dried stalks of grain.
164	structural-other	(view)	Any other type of structural connection (incl. arcs, pillars).
165	table	(view)	A piece of furniture with a flat top and one or more legs.
166	tent	(view)	A portable shelter made of cloth.
167	textile-other	(view)	Any other type of textile.
168	towel	(view)	A piece of thick absorbent cloth used for drying oneself.
169	tree	(view)	A woody plant, typically having a single trunk growing to a considerable height and bearing lateral branches at some distance from the ground.
170	vegetable	(view)	A part of a plant used as food.
171	wall-brick	(view)	A building wall made of bricks of clay.
172	wall-concrete	(view)	A building wall made of concrete.
173	wall-other	(view)	Any other type of wall.
174	wall-panel	(view)	A panel that is attached to a wall.
175	wall-stone	(view)	A building wall made of stone.
176	wall-tile	(view)	A building wall made of tiles, such as used in bathrooms and kitchens.
177	wall-wood	(view)	A building wall made of wooden material.
178	water-other	(view)	Any other type of water (incl. lake).
179	waterdrops	(view)	Sprinkles or drops of water not connected to a larger body of water.
180	window-blind	(view)	Blinds and shutters that cover a window.
181	window-other	(view)	Any type of window that must be visible in the image (replaces window).
182	wood	(view)	Raw wood materials (incl. logs).
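
The ID layout in the table above (0 for unlabeled, 1-91 for things, 92-182 for stuff) can be turned into a small lookup. The following is only a sketch based on the ranges and the "Removed from COCO." rows listed here; the names `label_kind` and `REMOVED_THING_IDS` are illustrative and are not part of any official COCO-Stuff tooling.

```python
# Label ID ranges as given in the COCO-Stuff table above.
UNLABELED_ID = 0
THING_IDS = range(1, 92)    # IDs 1..91
STUFF_IDS = range(92, 183)  # IDs 92..182

# Thing IDs marked "Removed from COCO." in the table.
REMOVED_THING_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83, 91}


def label_kind(label_id: int) -> str:
    """Return 'unlabeled', 'removed', 'thing', or 'stuff' for a label ID."""
    if label_id == UNLABELED_ID:
        return "unlabeled"
    if label_id in REMOVED_THING_IDS:
        return "removed"
    if label_id in THING_IDS:
        return "thing"
    if label_id in STUFF_IDS:
        return "stuff"
    raise ValueError(f"label id {label_id} is outside 0..182")


# Thing classes that still carry annotations.
active_things = [i for i in THING_IDS if i not in REMOVED_THING_IDS]
print(len(active_things), len(STUFF_IDS))  # 80 91
```

The two printed counts, together with the unlabeled class, account for the 172 classes stated under Categories.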

Code/Model/Publication URL : COCO-Stuff: Thing and Stuff Classes in Context

Source URL : The COCO-Stuff dataset

Dataset Examples :

2. The DUTS Image Dataset

Dataset Characteristics:

Task Type : Saliency detection

Image Attribute : 10,553 training images and 5,019 test images

Code/Model/Publication URL : Learning to Detect Salient Objects with Image-level Supervision

Source URL : The DUTS Image Dataset

Dataset Examples :

3. The CMU-Cornell iCoseg Dataset

Dataset Characteristics:

Task Type : cosegmentation

Image Attribute : 38 challenging groups with 643 total images (∼17 images per group), with associated pixel-level ground truth.

Code/Model/Publication URL : iCoseg: Interactive Co-segmentation with Intelligent Scribble Guidance

Source URL : The CMU-Cornell iCoseg Dataset

Dataset Examples :

4. The PASCAL-S Dataset

Dataset Characteristics:

Task Type : fixation prediction, salient object segmentation

Image Attribute : 850 images from PASCAL VOC 2010

Object Instance : 1296

No of subjects : 12

Code/Model/Publication URL : The Secrets of Salient Object Segmentation

Source URL : The PASCAL-S Dataset

Dataset Examples :

Credits : The PASCAL-S Dataset

5. DAVIS: Densely Annotated Video Segmentation

Dataset Characteristics:

Task Type: video object segmentation

Video Attribute: 400 objects on 150 videos with ∼10k frames

Code/Model/Publication URL: A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation

Source URL: DAVIS

Dataset Examples :

6. NYU Depth Dataset V1

Dataset Characteristics:

Task Type : Indoor Scene segmentation, Multi-class segmentation

Image Attribute : 2347 unique frames covering 13 object classes, spread over 64 different indoor environments.

Scene class	Scenes
Bathroom	6
Bedroom	17
Bookstore	3
Cafe	1
Kitchen	10
Living Room	13
Office	14
Total	64

Object Class
Bed
Blind
Bookshelf
Cabinet
Ceiling
Floor
Picture
Sofa
Table
Television
Wall
Window
Background

Code/Model/Publication URL : Indoor Scene Segmentation using a Structured Light Sensor

Source URL : NYU Depth V1

Dataset Examples :

Credits : Indoor Scene Segmentation using a Structured Light Sensor

7. PASCAL-Part Dataset

Dataset Characteristics:

Task Type: Body parts segmentation

Image Attribute: 10,103 images

Label Attribute: 4,203

Categories: 20 with their individual body parts

Objects	Parts
`aeroplane`
	`body`
	`engine*`
	`left wing`
	`right wing`
	`stern`
	`tail`
	`wheel*`
`bicycle`
	`back wheel`
	`chain wheel`
	`front wheel`
	`handlebar`
	`headlight*`
	`saddle`
`bird`
	`beak`
	`head`
	`left eye`
	`left foot`
	`left leg`
	`left wing`
	`neck`
	`right eye`
	`right foot`
	`right leg`
	`right wing`
	`tail`
	`torso`
`boat`
`bottle`
	`body`
	`cap`
`bus`
	`back license plate`
	`back side`
	`door*`
	`front license plate`
	`front side`
	`headlight*`
	`left mirror`
	`left side`
	`right mirror`
	`right side`
	`roof side`
	`wheel*`
	`window*`
`car`
	`back license plate`
	`back side`
	`door*`
	`front license plate`
	`front side`
	`headlight*`
	`left mirror`
	`left side`
	`right mirror`
	`right side`
	`roof side`
	`wheel*`
	`window*`
`cat`
	`head`
	`left back leg`
	`left back paw`
	`left ear`
	`left eye`
	`left front leg`
	`left front paw`
	`neck`
	`nose`
	`right back leg`
	`right back paw`
	`right ear`
	`right eye`
	`right front leg`
	`right front paw`
	`tail`
	`torso`
`chair`
`cow`
	`head`
	`left back lower leg`
	`left back upper leg`
	`left ear`
	`left eye`
	`left front lower leg`
	`left front upper leg`
	`left horn`
	`muzzle`
	`neck`
	`right back lower leg`
	`right back upper leg`
	`right ear`
	`right eye`
	`right front lower leg`
	`right front upper leg`
	`right horn`
	`tail`
	`torso`
`diningtable`
`dog`
	`head`
	`left back leg`
	`left back paw`
	`left ear`
	`left eye`
	`left front leg`
	`left front paw`
	`muzzle`
	`neck`
	`nose`
	`right back leg`
	`right back paw`
	`right ear`
	`right eye`
	`right front leg`
	`right front paw`
	`tail`
	`torso`
`horse`
	`head`
	`left back hoof`
	`left back lower leg`
	`left back upper leg`
	`left ear`
	`left eye`
	`left front hoof`
	`left front lower leg`
	`left front upper leg`
	`muzzle`
	`neck`
	`right back hoof`
	`right back lower leg`
	`right back upper leg`
	`right ear`
	`right eye`
	`right front hoof`
	`right front lower leg`
	`right front upper leg`
	`tail`
	`torso`
`motorbike`
	`back wheel`
	`front wheel`
	`handlebar`
	`headlight*`
	`saddle`
`person`
	`hair`
	`head`
	`left ear`
	`left eye`
	`left eyebrow`
	`left foot`
	`left hand`
	`left lower arm`
	`left lower leg`
	`left upper arm`
	`left upper leg`
	`mouth`
	`neck`
	`nose`
	`right ear`
	`right eye`
	`right eyebrow`
	`right foot`
	`right hand`
	`right lower arm`
	`right lower leg`
	`right upper arm`
	`right upper leg`
	`torso`
`pottedplant`
	`plant`
	`pot`
`sheep`
	`head`
	`left back lower leg`
	`left back upper leg`
	`left ear`
	`left eye`
	`left front lower leg`
	`left front upper leg`
	`left horn`
	`muzzle`
	`neck`
	`right back lower leg`
	`right back upper leg`
	`right ear`
	`right eye`
	`right front lower leg`
	`right front upper leg`
	`right horn`
	`tail`
	`torso`
`sofa`
`train`
	`coach back side*`
	`coach front side*`
	`coach left side*`
	`coach right side*`
	`coach roof side*`
	`coach*`
	`head`
	`head back side`
	`head front side`
	`head left side`
	`head right side`
	`head roof side`
	`headlight*`
`tvmonitor`
	`screen`

(* denotes a part can appear multiple times in one object instance)
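
If you copy the table above into code, the `*` suffix is worth keeping as a flag, since a repeatable part needs a list of masks rather than a single one. Below is a minimal sketch of that idea; the helper `parse_part` and the `PARTS` excerpt are illustrative only and do not reflect the dataset's own file format.

```python
from typing import Tuple


def parse_part(entry: str) -> Tuple[str, bool]:
    """Split a table entry such as 'wheel*' into (name, repeatable)."""
    entry = entry.strip().strip("`")
    repeatable = entry.endswith("*")
    return entry.rstrip("*"), repeatable


# A few rows copied from the table above (not the full list).
PARTS = {
    "bicycle": ["back wheel", "chain wheel", "front wheel",
                "handlebar", "headlight*", "saddle"],
    "bottle": ["body", "cap"],
    "boat": [],  # listed in the table without any parts
}

parsed = {obj: [parse_part(p) for p in parts] for obj, parts in PARTS.items()}

for obj, parts in parsed.items():
    repeatable = [name for name, flag in parts if flag]
    print(obj, "-> repeatable parts:", repeatable)
```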

Code/Model/Publication URL : Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts

Source URL : PASCAL-Part Dataset

Dataset Examples :

Credits: PASCAL-Part Dataset

8. Matting Human Datasets

Dataset Characteristics:

Task Type: Human segmentation

Image Attribute: 34,427 images and corresponding matting results

Code/Model/Publication URL: matting human dataset

Source URL: matting_human_half

Dataset Examples :

9. Automatic Portrait Segmentation Dataset

Dataset Characteristics:

Task Type: Automatic Portrait Segmentation for styling images

Image Attribute: 1800

Code/Model/Publication URL : Automatic Portrait Segmentation for Image Stylization

Source URL: Automatic Portrait Segmentation

Dataset Examples :

Credits : Automatic Portrait Segmentation

10. ADE20K Dataset

Dataset Characteristics:

Task Type: image segmentation

Image Attribute: 22,210 images, 900 scene categories defined in the SUN database, more than 200 object classes.

Label attribute: 22,210 Fully annotated with objects and parts

Code/Model/Publication URL: Semantic Understanding of Scenes through ADE20K Dataset

Source URL: ADE20K Dataset

Dataset Examples :

11. Pedestrian Parsing Dataset

Dataset Characteristics:

Task Type: pedestrian detection, pose estimation, and body segmentation.

Image Attribute: 3,673 images from 171 videos of different surveillance scenes (PPSS), of which 2,064 images are occluded and 1,609 are not.

Code/Model/Publication URL: Pedestrian Parsing via Deep Decompositional Network

Source URL: Pedestrian Parsing Dataset

Dataset Examples :

12. iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images

Dataset Characteristics:

Task Type: instance-level object detection and pixel-level segmentation task on aerial images

Image Attribute: 2806 high-resolution aerial images

Label attribute: 655,451 object instances

Categories :

iSAID Dataset	Large-vehicle, Swimming pool, Helicopter, Bridge, Plane, Ship, Soccer-ball field, Storage tank, Basketball court, Ground track field, Small-vehicle, Harbor, Baseball diamond, Tennis court, Roundabout

Code/Model/Publication URL : iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images

Source URL: iSAID

Dataset Examples :

Credits: iSAID

13. Pascal VOC (2007-2012)

Dataset Characteristics:

Task Type: object segmentation, class segmentation

Image Count: 54k Images

Labeled Instances Count: 19,377

Categories :

  - Person: person
  - Animal: bird, cat, cow, dog, horse, sheep
  - Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  - Indoor: bottle, chair, dining table, potted plant, sofa, TV/monitor
  - Background, void

Code/Model/Publication URL: The 2005 PASCAL Visual Object Classes Challenge

Source URL: Pascal VOC Dataset

Dataset Examples :

14. Open Images Dataset V6

Dataset Characteristics:

Task Type: object detection, segmentation, visual relationship, local narratives

Image Count: ~9M images

Labeled Instances Count: segmentation masks for 2.8M object instances in 350 classes, 16M bounding boxes in 600 object classes

Categories : 19,957 classes

Code/Model/Publication URL : Open Images

Source URL: Open Images Dataset

Dataset Examples :

15. OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

Dataset Characteristics:

Task Type: Depth Estimation, Surface Normal Estimation, Fold & Occlusion detection, Planar Instance segmentation

Image Count : 140,000 images

Code/Model/Publication URL : OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

Source URL : OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

Dataset Examples :

Credits : OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

16. KAIST Salient Pedestrian Dataset

Dataset Characteristics :

Task Type: pedestrian detection in thermal images

Image Attribute: 1702 images (913 day images and 789 night images)

Label instance: 4170 instances of pedestrians

Code/Model/Publication URL: Pedestrian Detection from Thermal Images using Saliency Maps

Source URL: Salient-Pedestrian-Detection

Dataset Examples :

Credits: Pedestrian Detection from Thermal Images using Saliency Maps

17. Clothing Attributes Dataset

Dataset Characteristics :

Task Type: Dressing Style Analysis using Semantic Attributes

Image Attribute : 1856 images

Label instance : 283,107 labels

Categories : 26 attributes in total, including 23 binary-class attributes (6 for pattern, 11 for color and 6 miscellaneous attributes), 3 multi-class attributes (sleeve length, neckline shape and clothing category)

Describing Clothing by Semantic Attributes

Clothing pattern (Positive / Negative)	Solid (1052 / 441), Floral (69 / 1649), Spotted (101 / 1619), Plaid (105 / 1635), Striped (140 / 1534), Graphics (110 / 1668)
Major color (Positive / Negative)	Red (93 / 1651), Yellow (67 / 1677), Green (83 / 1661), Cyan (90 / 1654), Blue (150 / 1594), Purple (77 / 1667), Brown (168 / 1576), White (466 / 1278), Gray (345 / 1399), Black (620 / 1124), >2 Colors (203 / 1541)
Wearing necktie	Yes 211, No 1528
Collar presence	Yes 895, No 567
Gender	Male 762, Female 1032
Wearing scarf	Yes 234, No 1432
Skin exposure	High 193, Low 1497
Placket presence	Yes 1159, No 624
Sleeve length	No sleeve (188), Short sleeve (323), Long sleeve (1270)
Neckline shape	V-shape (626), Round (465), Others (223)
Clothing category	Shirt (134), Sweater (88), T-shirt (108), Outerwear (220), Suit (232), Tank Top (62), Dress (260)
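
The positive / negative counts in the table above give a quick read on class imbalance before you train on a binary attribute. As a rough sketch, the snippet below computes the share of positive labels for the clothing pattern row; the counts are copied from the table, while the helper name `positive_rate` is just an illustration.

```python
# (positive, negative) counts copied from the "Clothing pattern" row above.
PATTERN_COUNTS = {
    "Solid": (1052, 441),
    "Floral": (69, 1649),
    "Spotted": (101, 1619),
    "Plaid": (105, 1635),
    "Striped": (140, 1534),
    "Graphics": (110, 1668),
}


def positive_rate(positive: int, negative: int) -> float:
    """Fraction of labeled images where the attribute is present."""
    return positive / (positive + negative)


for name, (pos, neg) in PATTERN_COUNTS.items():
    rate = positive_rate(pos, neg)
    print(f"{name}: {rate:.1%} positive out of {pos + neg} labeled images")
```

Note that the labeled totals differ from attribute to attribute and stay below the 1856 images in the dataset, so rates are best computed per attribute rather than against the full image count.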

Code/Model/Publication URL : Describing Clothing by Semantic Attributes

Source URL : Clothing Attributes Dataset

Dataset Examples :

18. TACO: Trash Annotations in Context for Litter Detection

Dataset Characteristics :

Task Type : litter detection and segmentation

Image Attribute : 2285 images

Label instance : 7426 annotations

Categories: 60 categories which belong to 28 super (top) categories

Super category	Category	Notes
Aluminum foil	Aluminum foil	–
Battery	Battery	–
Blister pack	Aluminum blister pack	Containers used to store capsules (e.g. pills)
Blister pack	Carded blister pack	Paper-back package
Bottle	Clear plastic bottle	Water and soft drink bottles made of PET
Bottle	Glass bottle	Includes beer and wine bottles
Bottle	Other plastic bottle	Opaque or translucent. Generally made of HDPE. Includes detergent bottles
Bottle cap	Plastic bottle cap	–
Bottle cap	Metal bottle cap	–
Broken glass	Broken glass	–
Can	Aerosol	–
Can	Drink can	Aluminum soda can
Can	Food can	Steel can
Carton	Corrugated carton	Includes cardboard boxes
Carton	Drink carton	Tetrapak composites
Carton	Egg carton	–
Carton	Meal carton	Includes sandwich boxes, paper plates, take-out boxes
Carton	Pizza box	–
Carton	Toilet tube	–
Carton	Other carton	Paperboard boxes
Cigarette	Cigarette	Cigarette butts
Cup	Paper cup	–
Cup	Disposable plastic cup	Generally made of PET
Cup	Foam cup	Polystyrene Cup
Cup	Glass cup	–
Cup	Other plastic cup	Reusable plastic cups, thicker than disposable ones
Food waste	Food waste	–
Glass jar	Glass jar	–
Lid	Plastic lid	Includes cup lids
Lid	Metal lid	Generally glass jar lids
Paper	Normal paper	–
Paper	Tissues	–
Paper	Wrapping paper	–
Paper	Magazine paper	Plastified paper used in catalogues
Paper bag	Paper bag	Brown bag
Paper bag	Plastified paper bag	Bakery bags that come with transparent film
Plastic bag & wrapper	Garbage bag	–
Plastic bag & wrapper	Single-use carrier bag	–
Plastic bag & wrapper	Polypropylene bag	Reusable bags
Plastic bag & wrapper	Plastic Film	May be transparent or opaque. Includes bread bags, cereal bags and produce bags
Plastic bag & wrapper	Six pack rings	–
Plastic bag & wrapper	Crisp packet	So common that it needs its own category
Plastic bag & wrapper	Other plastic wrapper	Can be made of aluminium. Includes candy wrappers, retort pouches and yoghurt lids
Plastic container	Spread tub	Includes margarine tubs and yoghurt pots
Plastic container	Tupperware	HDPE microwavable tub
Plastic container	Disposable food container	Includes black trays and PET containers
Plastic container	Foam food container	Styrofoam takeaway boxes
Plastic container	Other plastic container	–
Plastic glooves	Plastic glooves	–
Plastic utensils	Plastic utensils	–
Pop tab	Pop tab	–
Rope	Rope	Includes fishing nets
Scrap metal	Scrap metal	Includes all metal except cans
Shoe	Shoe	–
Squeezable tube	Squeezable tube	Includes toothpaste and glue tubes
Straw	Plastic straw	–
Straw	Paper straw	–
Styrofoam piece	Styrofoam piece	–
Other plastic	Other plastic	Includes other objects or fragments made of plastic
Unlabeled litter	Unlabeled litter	Unknown object of unknown material. Any ambiguous object.
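
Because every TACO category belongs to exactly one super category, you can train on the coarser labels by grouping rows of the table above. The sketch below does this for a small excerpt of the table; `ROWS` and `group_by_super` are illustrative names, and the real annotation files may organize this information differently.

```python
from collections import defaultdict

# (super category, category) pairs copied from a few rows of the table above.
ROWS = [
    ("Bottle", "Clear plastic bottle"),
    ("Bottle", "Glass bottle"),
    ("Bottle", "Other plastic bottle"),
    ("Can", "Aerosol"),
    ("Can", "Drink can"),
    ("Can", "Food can"),
    ("Straw", "Plastic straw"),
    ("Straw", "Paper straw"),
]


def group_by_super(rows):
    """Map each super category to the list of categories it contains."""
    grouped = defaultdict(list)
    for super_category, category in rows:
        grouped[super_category].append(category)
    return dict(grouped)


# Reverse lookup: fine category -> super category, useful for relabeling.
SUPER_OF = {category: super_category for super_category, category in ROWS}

for super_category, categories in group_by_super(ROWS).items():
    print(f"{super_category}: {len(categories)} categories")
```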

Code/Model/Publication URL : TACO: Trash Annotations in Context for Litter Detection

Source URL: TACO

Dataset Examples :

19. Crowd Instance-level Human Parsing (CIHP) Dataset

Dataset Characteristics :

Task Type: Instance-level Human Parsing, Semantic Part Segmentation

Image Count: 38,280 multi-person images with pixel-wise annotations of 19 semantic parts and 20 categories at the instance level.

Categories: pixel-wise annotations on 20 categories.

Semantic Part Label
Hat	Pants
Hair	Gloves
Sunglasses	Scarf
Upper-clothes	Skirt
Dress	Torso-skin
Coat	Face
Socks	Right arm
Right leg	Left shoe
Right shoe	Left leg
Left arm

Code/Model/Publication URL: Instance-level Human Parsing via Part Grouping Network

Source URL: Crowd Instance-level Human Parsing (CIHP) Dataset

Dataset Examples :

Credits: Crowd Instance-level Human Parsing (CIHP) Dataset

20. ModaNet

Dataset Characteristics :

Task Type: semantic segmentation on street fashion images in detail.

Image Count: 55,176 street images fully annotated with polygons.

Categories: Each polygon (segmentation mask) annotation is assigned to one of the following labels:

Label	Description	Fine-Grained-categories
1	bag	bag
2	belt	belt
3	boots	boots
4	footwear	footwear
5	outer	coat/jacket/suit/blazers/cardigan/sweater/Jumpsuits/Rompers/vest
6	dress	dress/t-shirt dress
7	sunglasses	sunglasses
8	pants	pants/jeans/leggings
9	top	top/blouse/t-shirt/shirt
10	shorts	shorts
11	skirt	skirt
12	headwear	headwear
13	scarf & tie	scarf & tie
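
To search this table by a fine-grained clothing term, it helps to invert it so that each term points back to its label. Here is a minimal sketch built from the rows above; `MODANET_ROWS` and `FINE_TO_LABEL` are illustrative names and not identifiers from the ModaNet release.

```python
# (label id, description, fine-grained categories) copied from the table above.
MODANET_ROWS = [
    (1, "bag", "bag"),
    (2, "belt", "belt"),
    (3, "boots", "boots"),
    (4, "footwear", "footwear"),
    (5, "outer", "coat/jacket/suit/blazers/cardigan/sweater/Jumpsuits/Rompers/vest"),
    (6, "dress", "dress/t-shirt dress"),
    (7, "sunglasses", "sunglasses"),
    (8, "pants", "pants/jeans/leggings"),
    (9, "top", "top/blouse/t-shirt/shirt"),
    (10, "shorts", "shorts"),
    (11, "skirt", "skirt"),
    (12, "headwear", "headwear"),
    (13, "scarf & tie", "scarf & tie"),
]

# Inverted index: lower-cased fine-grained term -> (label id, description).
FINE_TO_LABEL = {
    fine.strip().lower(): (label_id, description)
    for label_id, description, fines in MODANET_ROWS
    for fine in fines.split("/")
}

print(FINE_TO_LABEL["jeans"])    # (8, 'pants')
print(FINE_TO_LABEL["rompers"])  # (5, 'outer')
```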

Code/Model/Publication URL : ModaNet: A Large-scale Street Fashion Dataset with Polygon Annotations

Source URL: modanet

Dataset Examples :

Credits : ModaNet: A Large-scale Street Fashion Dataset with Polygon Annotations

21. Embrapa Wine Grape Instance Segmentation Dataset

Dataset Characteristics :

Task Type: instance segmentation for image-based monitoring and field robotics in viticulture.

Image Count: 300 images

Label Count: 2,020 binary masks for instance segmentation

Categories: Each polygon (segmentation mask) annotation is assigned to one of the following labels:

Prefix	Variety
CDY	Chardonnay
CFR	Cabernet Franc
CSV	Cabernet Sauvignon
SVB	Sauvignon Blanc
SYH	Syrah
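
The prefix table above can be used to recover the grape variety for a file. The sketch below assumes, purely for illustration, that file names start with the prefix followed by an underscore (for example `CDY_2015.jpg`); check the dataset's own documentation for the actual naming scheme before relying on it.

```python
from pathlib import Path
from typing import Optional

# Prefix -> variety, copied from the table above.
VARIETY_BY_PREFIX = {
    "CDY": "Chardonnay",
    "CFR": "Cabernet Franc",
    "CSV": "Cabernet Sauvignon",
    "SVB": "Sauvignon Blanc",
    "SYH": "Syrah",
}


def variety_from_filename(filename: str) -> Optional[str]:
    """Return the variety for a file name, or None if the prefix is unknown.

    Assumes a hypothetical '<PREFIX>_<anything>' naming pattern.
    """
    prefix = Path(filename).name.split("_", 1)[0].upper()
    return VARIETY_BY_PREFIX.get(prefix)


print(variety_from_filename("data/CDY_2015.jpg"))  # Chardonnay
print(variety_from_filename("unknown_file.jpg"))   # None
```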

Code/Model/Publication URL: Grape detection, segmentation and tracking using deep neural networks and three-dimensional association

Source URL : Embrapa WGISD

Dataset Examples :

Credits: Grape detection, segmentation and tracking using deep neural networks and three-dimensional association

22. MinneApple: A Benchmark Dataset for Apple Detection and Segmentation

Dataset Characteristics :

Task Type: fruit segmentation

Image Count: 1671 images of Apples

Label Count: 41,000 annotated object instances

Code/Model/Publication URL : MinneApple: A Benchmark Dataset for Apple Detection and Segmentation

Source URL: MinneApple

Dataset Examples :

Credits : MinneApple: A Benchmark Dataset for Apple Detection and Segmentation

23. FSS-1000: A 1000 Class Dataset for Few-shot Segmentation

Dataset Characteristics :

Task Type: image segmentation

Image Count: 10,000 images with pixel-wise segmentation labels

Categories: 1,000, with instance segmentation labels in 758 out of the 1,000 classes

Hierarchy of FSS-1000

Code/Model/Publication URL : FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation

Source URL: FSS-1000

Dataset Examples :

Credits : FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation

24. KINS Dataset

Dataset Characteristics :

Task Type: Amodal instance segmentation

Image Count: 14,991 images from the KITTI dataset

Label Count: On average, each image has 12.53 labeled instances, and each object polygon consists of 33.70 points.

Categories :

Category	Subcategory
People	Pedestrian, Cyclist, person-siting
Vehicle	Car, Van, Tram, truck, misc

Code/Model/Publication URL: Amodal Instance Segmentation with KINS Dataset

Source URL: KINS Dataset

Dataset Examples :

Credits: Amodal Instance Segmentation with KINS Dataset

25. MSeg

Dataset Characteristics :

Task Type: semantic segmentation

Image Count : 80,000 images

Label Count: 220,000 object masks

Category: 194 categories.

Universal Class Name	Description
airplane	an aircraft that has a fixed wing and is powered by propellers or jets
animal-other	Any animal that is not a bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, or giraffe
apparel	clothing in general, not currently being worn by a human being
apple	fruit with red or yellow or green skin and sweet to tart crisp whitish flesh
arcade machine
armchair	chair with a support on each side for arms (that is not a swivel chair, seat, or sofa)
autorickshaw	in South Asia, a small motor vehicle with three wheels that is used as a taxi
awning	a sheet of canvas or other material stretched on a frame and used to keep the sun or rain off a storefront, window, doorway, or deck. Should be attached to a building.
backpack	a bag carried by a strap on your back or shoulder
bag
banana	Elongated crescent-shaped yellow fruit with soft sweet flesh
banner	long strip of cloth or paper used for decoration or advertising
barrel	a cylindrical container that holds liquids
base
baseball bat	an implement used in baseball by the batter
baseball glove	the handwear used by fielders in playing baseball
basket	a container used to hold or carry things, typically made from interwoven strips of cane or wire.
bathtub
bathroom counter	a long, flat, narrow surface in a bathroom at waist-height
bear	massive plantigrade carnivorous or omnivorous mammals with long shaggy coats and strong claws
bed
bench	a long seat for more than one person
bicycle	a wheeled vehicle that has two wheels and is moved by foot pedals
bicyclist	a person who rides a bicycle
bike rack	a rack for parking bicycles
billboard
bird	Warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings
blanket
boat-ship
book
bookshelf
bottle	a glass or plastic vessel used for storing drinks or other liquids; typically cylindrical without handles and with a narrow neck that can be plugged or capped
bowl
box
bridge	a structure that allows people or vehicles to cross an obstacle such as a river or canal or railway etc.
broccoli	plant with dense clusters of tight green flower buds
building	a structure that has a roof and walls and stands more or less permanently in one place. Includes houses, grandstands, booths, towers, and skyscrapers
bulletin board
bus	a large vehicle carrying many passengers; used for public transport. Typically has swing doors, more than 4 wheels, tires larger than a regular car, and an illuminated front sign
cabinet	A case or cupboard usually having doors and shelves. Can be found in a bathroom, kitchen, or other room. Cabinets are not for storing clothes.
cake	baked goods made from or based on a mixture of flour, sugar, eggs, and fat
car
carrot	deep orange edible root of the cultivated carrot plant
case	a glass container used to store and display items in a shop or museum or home
cat	feline mammal usually having thick soft fur and no ability to roar: domestic cats
cctv camera	camera that can produce images or recordings for surveillance or other private purposes
ceiling
cell phone	a phone with access to a cellular radio system so it can be used over a wide area, without a physical connection to a network; a mobile phone.
chair_other	Any chair that does NOT fall into the following categories: armchair, stool, seat, sofa. A chair is defined as a seat for one person, with a support for the back.
chandelier	branched lighting fixture; often ornate; hangs from the ceiling
chest of drawers	furniture with drawers for keeping clothes
clock	a timepiece that shows the time of day
column	(architecture) a tall vertical cylindrical structure standing upright and used to support a structure
conveyor belt	a moving belt that transports people or objects (as in a factory). Includes moving walk-ways and baggage carousels
couch	an upholstered seat for more than one person (also known as sofa)
counter-other	a long, flat, narrow surface used for making transactions in a store or in a home kitchen (fixed against a wall for preparing food).
cow	cattle that are reared for their meat
crib	baby bed with high sides made of slats
cup	a container for holding liquids while drinking; includes mugs, and could be made of plastic, glass, ceramic, paper, or styrofoam. Cups include drinking glasses without a stem; if the cup/glass has a stem, it is a wine glass
curtain-other	hanging cloth used as a blind (especially for a window). does not include shower curtains.
desk	A piece of furniture with a flat or sloped surface with drawers, at which one can read, write, or do other work. A table, frame, or case with a sloping or horizontal surface especially for writing and reading and often with drawers, compartments, and pigeon holes. Includes tables used as desks.
dishwasher	a machine for washing dishes
dog	a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds
donut	a small ring-shaped friedcake
door	a swinging or sliding barrier that will close the entrance to a room or building or vehicle
elephant	five-toed pachyderm
escalator	a stairway whose steps move continuously on a circulating belt
fan	a device for creating a current of air by movement of a surface or surfaces
fence
fire hydrant	an upright hydrant for drawing water to use in fighting a fire
fireplace	an open recess in a wall at the base of a chimney where a fire can be built
flag	emblem usually consisting of a rectangular piece of cloth of distinctive design
floor	The inside lower horizontal surface (as of a room, hallway, tent, or other structure). Must be indoors. This includes homogenous floor coverings that span an entire room, such as carpet, linoleum, or marble.
food other	any food that is not a banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake, or fruit
fork	cutlery used for serving and eating food
fountain	an ornamental structure in a pool or lake from which one or more jets of water are pumped into the air. Includes the jet of water.
frisbee	a light, plastic disk propelled with a flip of the wrist for recreation or competition
fruit other	any fruit, not an apple, orange, or banana (e.g. pineapple, melon, pear, kiwi, avocado)
giraffe	tallest living quadruped; having a spotted coat and small horns and very long neck and legs; of savannahs of tropical Africa
gravel	very small rock fragments and pebbles
guard_rail	a strong fence/railing at the side of a road or in the middle of an expressway, intended to reduce the risk of serious accidents. A guardrail sits low to the ground and is made of thick horizontal metal rails.
hair_dryer	a hand-held electric blower that can blow warm air onto the hair
horse	solid-hoofed herbivorous quadruped domesticated since prehistoric times
hot dog
junction box	A metal box outdoors, containing a junction of electric wires or cables
keyboard	a freestanding panel of keys that operate a computer or typewriter. Should not be part of a laptop (to prevent clashes). Connected with a cord to a computer.
kitchen-island	an unattached counter in a kitchen that permits access from all sides
kite	plaything consisting of a light frame covered with tissue paper; flown in wind at end of a string. Includes kites for wind-surfing.
knife	edge tool used as a cutting instrument; has a pointed blade with a sharp edge and a handle
lamp	a device for giving light that has a covering, but allows light to shine through or around. Could be freestanding on a floor and be covered with a shade, or could hang from the ceiling, or be placed on a table/desk/nightstand.
laptop	a portable computer small enough to use in your lap. Includes the laptop keyboard.
light-other	Any other type of light, e.g. fluorescent ceiling lights, bathroom lights embedded into ceiling, etc
mailbox	a private box for delivery of mail
microwave	kitchen appliance that cooks food by passing an electromagnetic wave through it
mirror	polished surface that forms images by reflecting light
motorcycle	a motor vehicle with two wheels and a strong frame
motorcyclist	a human relying on a motorcycle/moped for movement. Must be actively riding the motorcycle/moped (not standing nearby it).
mountain hill
mouse	a computer input device that controls an on-screen pointer
net	an open fabric of string or rope or wire woven together at regular intervals
night stand	a small low bedside table or stand, typically having drawers
orange	round yellow to orange fruit of any of several citrus trees
ottoman	a low upholstered seat or thick cushion used as a seat
oven
painting	graphic art consisting of an artistic composition made by applying paints to a surface
paper	a material made of cellulose pulp derived mainly from wood or rags or certain grasses. Includes loose paper, reams of paper, paper towels, and napkins
parking meter	a money-operated timer located next to a parking space; depositing money into it entitles you to park your car there for a specified length of time
person_nonrider	a human being who is not a motorcyclist, bicyclist, or rider other (could be a motor vehicle passenger)
pier_wharf	Either a pier or a wharf. A pier is a platform/low structure built out from the shore into the water and supported by piles; provides access to ships and boats. A wharf is a level quayside area to which a ship may be moored to load and unload at the edge of water.
pillow
pizza	Italian open pie made of thin bread dough spread with a spiced mixture of e.g. tomato sauce and cheese
plate
platform	area alongside a railway track providing convenient access to trains, or a ramp/raised surface at a skatepark
playingfield	a field used for outdoor team games. Includes tennis courts, baseball fields/diamonds, and fields for soccer, ultimate frisbee, and equestrian events.
plaything_other	a child’s toy, or something used like a toy, that is not a teddy bear.
pole	a long (usually round) rod of wood or metal or plastic. includes lamp posts (excluding the light) and streetlight poles (excluding the light)
pool table	game equipment consisting of a heavy table on which pool is played
poster
potted plant	a plant that is planted and grown in a container rather than in the ground
radiator	heater consisting of a series of pipes for circulating steam or hot water to heat rooms or buildings
railing-banister	a barrier consisting of a horizontal bar and supports; a railing at the side of a staircase or balcony to prevent people from falling
railroad	a line of track providing a runway for wheels
range_hood	exhaust hood over a kitchen range
refrigerator	a refrigerator in which the coolant is pumped around by an electric motor
remote	a device that can be used to control a machine or apparatus from a distance
rider_other	a human relying upon a segway, skateboard, electric scooter, lawn mower, rickshaw, wheelchair, etc. (any other device that is not a motorcycle, bicycle, or vehicle) for movement on land.
river_lake	River or lake. River: a large natural stream of water, with possible rapids. Lake: a body of (usually fresh) water surrounded by land. Lakes generally have very still water. Should appear naturally formed.
road	a long, narrow stretch with a smoothed or paved surface, made for traveling by motor vehicle, between two or more points; street or highway. Often bounded by curbs.
road_barrier	concrete safety barriers on roads and highways; also known as Jersey barrier
rock	a lump or mass of hard consolidated mineral matter
rug_floormat	a floor covering of thick woven material or animal skin of limited size that covers only a small section of the floor. Includes rugs and floormats (door mats, bath mats). Does not include carpet.
runway	a strip of level paved surface where planes can take off, land, or taxi from a terminal; or a narrow platform extending from the stage into the audience in a theater or nightclub etc.
sandwich	two (or more) slices of bread with a filling between them
scissors	an edge tool having two crossed pivoting blades
sconce	A decorative wall bracket/object for holding light bulbs, candles or other sources of light. Must be attached to the side of a wall.
sculpture	statue or a three-dimensional work of art
sea	A sea or ocean. Often has large amounts of waves/surf and a sandy beach. May have surfers on surfboards. Unlike a river or a lake, one generally cannot see the far bank of a sea/ocean.
seat	a space reserved for sitting (as in a theater, auditorium, stadium, on a train or airplane, or in a car) that is not a bench, chair, stool, sofa, or armchair
sheep	woolly usually horned ruminant mammal related to the goat
shelf
shower
shower_curtain	fabric found next to a bathtub or shower to keep water from spilling out into the bathroom
sidewalk_pavement	walk consisting of a paved area for pedestrians; usually beside a street or roadway (not a road)
sink	plumbing fixture consisting of a water basin fixed to a wall or floor and having a drain-pipe (usually in a bathroom or kitchen)
skateboard	a board with wheels that is ridden in a standing or crouching position and propelled by foot
skis	sports equipment for skiing on snow
sky	the atmosphere and outer space as viewed from the earth
slow wheeled object	any slow-moving wheeled object, such as strollers, wheelchairs, and carts
snow	a layer of snowflakes (white crystals of frozen water) covering the ground
snowboard	a board that resembles a broad ski or a small surfboard; used in a standing position to slide down snow-covered slopes
spoon	a piece of cutlery with a shallow bowl-shaped container and a handle; used to stir or serve or take up food
sports ball	round object that is hit or thrown or kicked in games
stage	a large platform on which people can stand and can be seen by an audience
stairs
stool	a simple seat without a back or arms; or a high seat at a bar without arms
storage_tank	a large (usually metallic) vessel for holding gases or liquids
stove
streetlight	a lamp or light supported on a lamppost/pole; for illuminating a street
suitcase	a portable rectangular container for carrying clothes
surfboard	a narrow buoyant board for riding surf
swimming_pool	pool that provides a facility for swimming
swivel_chair	a chair that swivels on its base. Can be turned around a central point to face in a different direction without moving the legs
table	A piece of furniture consisting of a smooth flat slab fixed on legs. A piece of furniture with a flat top and one or more legs, providing a level surface on which objects may be placed, and that can be used for such purposes as eating, writing, working, or playing games.
teddy_bear	Stuffed toy in the form of a bear. It is a type of plaything/child’s toy (usually plush and stuffed with soft materials).
television
tennis racket	a racket used to play tennis
tent	a portable shelter (usually of canvas stretched over supporting poles and fastened to the ground with ropes and pegs). For example, a camping tent.
terrain	dirt, sand, grass, or any kind of horizontally spreading vegetation
tie	neckwear consisting of a long narrow piece of material worn (mostly by men) under a collar and tied in knot at the front
toaster
toilet	a plumbing fixture for defecation and urination
toothbrush	small brush; has long handle; used to clean teeth
towel	a thick, rectangular piece of absorbent cloth for drying or wiping
traffic light	a visual signal to control the flow of traffic at intersections (includes stoplights, pedestrian crossing lights, etc)
traffic sign	a sign usually on the side of a street or highway bearing symbols or words of warning or direction to motorists or pedestrians
trailer
train	All vehicles that move on rails, e.g. trams, trains.
trash can	a bin that holds rubbish until it is collected
tray	a flat, shallow container with a raised rim, typically used for carrying food and drink, or for holding small items.
truck
tunnel	a passageway through or under something, usually underground (especially one for trains or cars)
umbrella	a lightweight collapsible canopy. Could be handheld or mounted above a table outdoors.
vase	an open jar of glass, porcelain, or metal used as an ornament or to hold flowers
vegetation	Trees, hedges, and all kinds of vertically growing vegetation (for example, must be higher than lawn grass)
wall	Either: one of the sides of a room or building connecting the floor and ceiling. Or: a masonry fence around a garden, park, or estate.
wardrobe	a tall piece of furniture that provides storage space for clothes; has a door and rails or hooks for hanging clothes
washer-dryer	a home appliance for washing or drying clothes and linens automatically
waterfall	a steep descent of the water of a river
water_other	all other kinds of water, e.g. puddles, ditchwater, aquariums, shallow artificially made ponds, zoo ponds, flooded areas, bathtub water, water coming out of a fire hydrant, water from a kitchen faucet, etc. Is not a lake, river, fountain, swimming pool, or sea.
whiteboard	a hard smooth white surface used for writing or drawing on with markers.
window	a framework of wood or metal that contains a glass window-pane and is built into a wall or roof to admit light or air
window-blind	A blind for privacy or to keep out light. Placed indoors on a window.
wine_glass	a glass that has a stem and in which wine is served
zebra	any of several fleet black-and-white striped African equines
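
Because this list is meant to be searched by class name, a table like the one above can be loaded into a dictionary and queried. The snippet below is only a minimal sketch: it assumes the rows have been saved as tab-separated text (class name, then an optional description), and the helper names `parse_class_table` and `search_classes` are hypothetical, not part of MSeg or its tooling.

```python
from typing import Dict, List

# A few rows copied from the table above. Class name and description are
# separated by a tab; the description may be missing (e.g. "billboard").
SAMPLE_ROWS = """\
bench\ta long seat for more than one person
bicycle\ta wheeled vehicle that has two wheels and is moved by foot pedals
bicyclist\ta person who rides a bicycle
bike rack\ta rack for parking bicycles
billboard
"""


def parse_class_table(text: str) -> Dict[str, str]:
    """Map each class name to its description ('' when none is given)."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, description = line.partition("\t")
        table[name.strip()] = description.strip()
    return table


def search_classes(table: Dict[str, str], query: str) -> List[str]:
    """Return the class names that contain the query, ignoring case."""
    needle = query.lower()
    return sorted(name for name in table if needle in name.lower())


if __name__ == "__main__":
    classes = parse_class_table(SAMPLE_ROWS)
    for name in search_classes(classes, "bi"):
        print(name, "->", classes[name] or "(no description)")
```

Substring matching is used here so that a query such as "bi" finds both "bicycle" and "bike rack"; exact matching would be a one-line change if stricter lookups are preferred.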

Code/Model/Publication URL : MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

Source URL: MSeg

Dataset Examples :

Credits : MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

26. PASCAL-Context Dataset

Dataset Characteristics :

Task Type: semantic segmentation, object segmentation

Image Count : 10,103

Label Count : 4,203

Categories : 540

Code/Model/Publication URL: The Role of Context for Object Detection and Semantic Segmentation in the Wild

Source URL: PASCAL-Context Dataset

Dataset Examples :

Credits : PASCAL-Context Dataset

27. LVIS: A Dataset for Large Vocabulary Instance Segmentation

Dataset Characteristics :

Task Type: Instance segmentation

Image Count: ~160k images

Label Count: ~2M instance annotations

Categories: 1203

Code/Model/Publication URL : LVIS: A Dataset for Large Vocabulary Instance Segmentation

Source URL: LVIS Dataset

Dataset Examples :

Credits: LVIS Dataset

28. Flower Datasets

Dataset Characteristics :

Task Type: flower segmentation

Image Count : Dataset 1 : 1360 images consisting of 17 flower species

Dataset 2 : 8,189 images consisting of 102 flower species

Categories : 119 categories

Dataset 1 :

Flowers Class

Buttercup

Colts’ Foot

Daffodil

Dandelion

Daisy

Fritillary

Iris

Pansy

Sunflower

Windflower

Snowdrop

Lily of the Valley

Bluebell

Crocus

Tigerlily

Tulip

Cowslip

Dataset 2 :

Flower	#images	Flower	#images	Flower	#images
alpine sea holly	43	buttercup	71	fire lily	40
anthurium	105	californian poppy	102	foxglove	162
artichoke	78	camellia	91	frangipani	166
azalea	96	canna lily	82	fritillary	91
ball moss	46	canterbury bells	40	garden phlox	45
balloon flower	49	cape flower	108	gaura	67
barbeton daisy	127	carnation	52	gazania	78
bearded iris	54	cautleya spicata	50	geranium	114
bee balm	66	clematis	112	giant white arum lily	56
bird of paradise	85	colt’s foot	87	globe thistle	45
bishop of llandaff	109	columbine	86	globe-flower	41
black-eyed susan	54	common dandelion	92	grape hyacinth	41
blackberry lily	48	corn poppy	41	great masterwort	56
blanket flower	49	cyclamen	154	hard-leaved pocket orchid	60
bolero deep blue	40	daffodil	59	hibiscus	131
bougainvillea	128	desert-rose	63	hippeastrum	76
bromelia	63	english marigold	65	japanese anemone	55
king protea	49	peruvian lily	82	stemless gentian	66
lenten rose	67	petunia	258	sunflower	61
lotus	137	pincushion flower	59	sweet pea	56
love in the mist	46	pink primrose	40	sweet william	85
magnolia	63	pink-yellow dahlia?	109	sword lily	130
mallow	66	poinsettia	93	thorn apple	120
marigold	67	primula	93	tiger lily	45
mexican aster	40	prince of wales feathers	40	toad lily	41
mexican petunia	82	purple coneflower	85	tree mallow	58
monkshood	46	red ginger	42	tree poppy	62
moon orchid	40	rose	171	trumpet creeper	58
morning glory	107	ruby-lipped cattleya	75	wallflower	196
orange dahlia	67	siam tulip	41	water lily	194
osteospermum	61	silverbush	52	watercress	184
oxeye daisy	49	snapdragon	87	wild pansy	85
passion flower	251	spear thistle	48	windflower	54
pelargonium	71	spring crocus	42	yellow iris	49
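
Each row of the table above packs three (species, image count) pairs side by side. As a rough sketch of how such a layout could be flattened into a single lookup, the snippet below assumes the rows are stored as tab-separated text; `parse_flower_counts` is a hypothetical helper written for this illustration and is not part of the dataset's own tooling.

```python
from typing import Dict

# The first three rows of the table above, tab-separated.
SAMPLE_ROWS = """\
alpine sea holly\t43\tbuttercup\t71\tfire lily\t40
anthurium\t105\tcalifornian poppy\t102\tfoxglove\t162
artichoke\t78\tcamellia\t91\tfrangipani\t166
"""


def parse_flower_counts(text: str) -> Dict[str, int]:
    """Flatten rows of repeated (name, count) pairs into one dictionary."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("\t")]
        # Cells alternate name, count, name, count, ...
        for name, count in zip(cells[0::2], cells[1::2]):
            if name and count.isdigit():  # ignores header or empty cells
                counts[name] = int(count)
    return counts


if __name__ == "__main__":
    counts = parse_flower_counts(SAMPLE_ROWS)
    print(len(counts), "species,", sum(counts.values()), "images in the sample")
    print("foxglove:", counts["foxglove"])
```

Run over the full table, the summed counts can be compared with the 8,189 images stated for Dataset 2 as a quick sanity check on the transcription.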

Credits : 102 Category Flower Dataset

Code/Model/Publication URL: Automated flower classification over a large number of classes

Source URL: Flower Dataset

Dataset Examples :

Credits: Automated flower classification over a large number of classes

29. NYU Depth Dataset V2

Dataset Characteristics :

Task Type: Dense Semantic Image Segmentation, Indoor segmentation

Image Attribute: 1449 densely labeled pairs of aligned RGB and depth images

464 different indoor scenes across 26 scene classes

35,064 distinct objects, spanning 894 different classes

Code/Model/Publication URL: Indoor Segmentation and Support Inference from RGBD Images

Source URL: NYU Depth Dataset V2

Dataset Examples :

Credits: NYU Depth Dataset V2

30. Fashionista

Dataset Characteristics :

Task Type: Clothing Estimation

Image Attribute : 158,235 fashion photos

Categories :

Garments
background(null)	top	cardigan
skin	skirt	blazer
hair	jacket	t-shirt
dress	coat	socks
bag	shirt	necklace
blouse	shoes	bracelet
purse	accessories	boots
sweater	jumper	cape
leggings	romper	vest
belt	tights	jeans
heels	wedges	stockings
hat	shorts

Code/Model/Publication URL: Parsing Clothing in Fashion Photographs

Source URL: Clothing Parsing

Dataset Examples :

Credits : Parsing Clothing in Fashion Photographs

31. UNIMIB Food Database

Dataset Characteristics :

Task Type: food segmentation

Image Attribute: 1,027 canteen trays for a total of 3,616 food instances belonging to 73 food classes.

Segmented images of the 73 food categories

Code/Model/Publication URL: Food recognition: a new dataset, experiments, and results

Source URL: UNIMIB Food Database

Dataset Examples :

Credits: Food recognition: a new dataset, experiments, and results

32. SYNSCAPES

Dataset Characteristics :

Task Type: Semantic segmentation

Image Attribute: 25,000 RGB images in PNG format at 1440×720 resolution

Categories: 19

Classes
Road	SideWalk
Building	Wall
Fence	Pole
Traffic Sign	Traffic Light
Vegetation	Terrain
Sky	Person
Rider	Car
Truck	Bus
Train	Motorcycle
Bicycle

Code/Model/Publication URL : Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing

Source URL: SYNSCAPES

Dataset Examples :

Credits : Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing

33. YouTube-VOS

Dataset Characteristics :

Task Type :

- Semi-supervised Video Object Segmentation
- Video Instance Segmentation

Video Attribute :

- 4000+ high-resolution YouTube videos
- 90+ semantic categories
- 7800+ unique objects
- 190k+ high-quality manual annotations
- 340+ minutes duration

Object categories in YouTube-VOS
person	cat	train	hedgehog	squirrel
table	ape	snake	owl	eagle
rope	camera	parrot	zebra	plant
snail	chameleon	watch	giant_panda	giraffe
airplane	toilet	box	stuffed_toy	tissue
sedan	bear	bus	camel	guitar
lizard	fox	shark	frisbee	kangaroo
microphone	duck	leopard	tiger	whale
cloth	cup	dog	elephant	surfboard
knife	bottle	shovel	skateboard	horse
earless_seal	tennis_racket	small_panda	flag	monkey
others	frog	crocodile	spider	mirror
sheep	deer	mouse	umbrella	ball
ring	fish	motorbike	boat	paddle
jellyfish	necklace	rabbit	turtle	snowboard
raccoon	eyeglasses	ant	hat	bird
penguin	parachute	backpack	cow	truck
lion	bucket	butterfly	hand	dolphin
sign	bike	handbag

Code/Model/Publication URL :

- YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark
- Video Instance Segmentation

Source URL : YouTube-VOS

Dataset Examples :

Credits : Video Instance Segmentation

34. The Lane Marker Dataset

Dataset Characteristics :

Task Type : binary marker segmentation, lane-dependent pixel-level segmentation

Image Attribute : 100,042 labeled lane marker images from about 350 km recorded drives

Code/Model/Publication URL : Unsupervised Labeled Lane Markers Using Maps

Source URL : The Lane Marker Dataset

Dataset Examples :

Credits : Unsupervised Labeled Lane Markers Using Maps

35. Total-Text

Dataset Characteristics :

Task Type : Scene Text Detection, Segmentation-based text detection

Image Attribute : 1,555 scene images containing 4,265 curved text instances and 9,330 annotated words in three text orientations: horizontal, multi-oriented, and curved.

Code/Model/Publication URL : Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition

Source URL : Total-Text

Dataset Examples :

Credits : Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition

36. MVTec D2S: Densely Segmented Supermarket Dataset

Dataset Characteristics :

Task Type : instance-aware semantic segmentation in an industrial domain

Image Attribute : 21,000 images in 700 different scenes with various backgrounds, clutter objects, and occlusion levels.

Categories : 60

adelholzener_alpenquelle_classic_075	carrot
adelholzener_classic_naturell_02	grapes_green_sugraone_seedless
apple_roter_boskoop	adelholzener_classic_bio_apfelschorle_02
banana_single	salad_iceberg
caona_kakaohaltiges_getraenkepulver	banana_bundle
clementine	lettuce
clementine_single	avocado
coca_cola_light_05	apple_golden_delicious
corny_nussvoll_single	augustiner_lagerbraeu_hell_05
douwe_egberts_professional_kaffee_gemahalen	corny_schoko_banana_single
ethiquable_gruener_tee_ceylon	cocoba_fruehstueckskakao_mit_honing
franken_tafelreinigar	adelholzener_gourmet_mineralwasser_02
gepa_bio_und_fair_fencheltee	grapes_sweet_celebration_seedless
gepa_bio_und_fair_pfefferminztee	coca_cola_05
koelln_muesli_schoko	tegernseer_hell_03
pasta_reggia_fusilli	koelln_muesli_fruechte
pasta_reggia_spaghetti	feldsalat
rispentomaten	gepa_bio_und_fair_kamillentee
roma_rispentomaten	dr_oetker_vitalis_knuspermuesli_klassisch
suntory_gokuri_limonade	cucumber
pear	orange_single
kilimanjaro_tea_earl_grey	kiwi
zucchini	apple_granny_smith
pelikan_tintenpatrone_canon	gepa_bio_caffe_crema
cafe_wunderbar_espresso	gepa_italienischer_bio_espresso
gepa_bio_und_fair_rooibostee	corny_nussvoll
augustiner_weissbier_05	pasta_reggia_elicoidali
corny_schoko_banana	rucola
adelholzener_alpenquelle_naturell_075	apple_braeburn_bundle
gepa_bio_und_fair_kraeuterteemischung	oranges

Code/Model/Publication URL : MVTec D2S: Densely Segmented Supermarket Dataset

Source URL : MVTec D2S Dataset

Dataset Examples :

Credits : MVTec D2S: Densely Segmented Supermarket Dataset

37. CORe50

Dataset Characteristics :

Task Type : image segmentation

Video Attribute : Videos of 50 domestic objects, collected in 11 distinct sessions (8 indoor and 3 outdoor) characterized by different backgrounds and lighting.

Categories : 10

plug adapters

mobile phones

scissors

light bulbs

cans

glasses

balls

markers

cups

remote controls

Code/Model/Publication URL : CORe50: a New Dataset and Benchmark for Continuous Object Recognition

Source URL : CORe50

Dataset Examples :

Credits : CORe50: a New Dataset and Benchmark for Continuous Object Recognition

38. Fashionpedia

Dataset Characteristics :

Task Type : Apparel instance segmentation

Image Attribute : 48,825 clothing images from daily life, street style, celebrity events, runway, and online shopping, with 46 apparel categories and 294 attributes

Code/Model/Publication URL : Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

Source URL : Fashionpedia

Dataset Examples :

Credits : Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

39. The Oxford-IIIT Pet Dataset

Dataset Characteristics :

Task Type : image segmentation

Image Attribute : 7349 images of cats and dogs

Categories :

Code/Model/Publication URL : Cats and Dogs

Source URL : The Oxford-IIIT Pet Dataset

Dataset Examples :

Credits : The Oxford-IIIT Pet Dataset

40. BDD100K: A Large-scale Diverse Driving Video Database

Dataset Characteristics :

Task Type : drivable area segmentation, semantic segmentation, instance segmentation, multi-object segmentation tracking

Video Attribute : 100K videos and 10 tasks

Code/Model/Publication URL : BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

Source URL : BDD100K

Dataset Examples :

Credits : BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

41. MOTS Dataset

Dataset Characteristics :

Task Type : Multi-Object Tracking and Segmentation

Video Attribute : 8 challenging video sequences (4 training, 4 test) in unconstrained environments filmed with both static and moving cameras

Code/Model/Publication URL : MOTS: Multi-Object Tracking and Segmentation

Source URL : MOTS

Dataset Examples :

Credits : MOTS: Multi-Object Tracking and Segmentation

42. KITTI MOTS Dataset

Dataset Characteristics :

Task Type : Multi-Object Tracking and Segmentation

Video Attribute : 21 training sequences and 29 test sequences

Code/Model/Publication URL : MOTS: Multi-Object Tracking and Segmentation

Source URL : KITTI-MOTS

Dataset Examples :

Credits : MOTS: Multi-Object Tracking and Segmentation

43. ApolloScape Dataset

Dataset Characteristics :

Task Type : Lanemark segmentation

Image Attribute : 100K image frames, 80K lidar point clouds, and 1,000 km of trajectories for urban traffic

Code/Model/Publication URL : DVI: Depth Guided Video Inpainting for Autonomous Driving

Source URL : ApolloScape

Dataset Examples :

Credits : Toolkit for ApolloScape Dataset

44. Camouflaged Object (CAMO) dataset

Dataset Characteristics :

Task Type : camouflaged object segmentation

Image Attribute : 1250 images under two categories, i.e., naturally camouflaged objects and artificially camouflaged objects.

Categories : Camouflaged animals consist of amphibians, birds, insects, mammals, reptiles, and underwater animals in various environments, i.e., ground, underwater, desert, forest, mountain, and snow.

Camouflaged humans fall into two groups: soldiers on battlefields and human body-painting art.

Code/Model/Publication URL : Anabranch Network for Camouflaged Object Segmentation

Source URL : Camouflaged Object Segmentation

Dataset Examples :

Credits : Anabranch Network for Camouflaged Object Segmentation

45. Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500)

Dataset Characteristics :

Task Type : image segmentation

Image Attribute : 300 images are used for training / validation and 200 fresh images, together with human annotations, are added for testing

Code/Model/Publication URL : Contour Detection and Hierarchical Image Segmentation

Source URL : Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500)

Dataset Examples :

Credits : Contour Detection and Hierarchical Image Segmentation

46. Microsoft Research Cambridge Object Recognition Image Database (MSRC v1 & v2)

Dataset Characteristics :

Task Type : image segmentation, semantic segmentation

Image Attribute : v1 : 240 images and 9 object classes with coarse pixel-wise labeled images.

v2 : 591 images, 23 object classes with accurate pixel-wise labeled images.

Code/Model/Publication URL : Microsoft Research Cambridge Object Recognition Image Database

Source URL : Microsoft Research Cambridge Object Recognition Image Database

Dataset Examples :

Credits : Microsoft Research Cambridge Object Recognition Image Database

47. Transmission Electron Microscopic (TEM) Cell Recordings Segmentation Dataset

Dataset Characteristics :

Task Type : automatic segmentations for mitochondria

Image Attribute : A minimum of 10,000 cells were recorded under five classes (background, cytoplasm, nucleus, mitochondria, and vesicles)

Code/Model/Publication URL : Semi-automatic procedure for the determination of the cell surface area used in systems immunology

Source URL : Transmission Electron Microscopic (TEM) Cell Recordings Segmentation Dataset

Dataset Examples :

Credits : Semi-automatic procedure for the determination of the cell surface area used in systems immunology

48. Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59)

Dataset Characteristics :

Task Type : Motion Segmentation

Video Attribute : 59 video sequences with 720 annotated frames

Code/Model/Publication URL : Segmentation of moving objects by long term video analysis

Source URL : Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59)

Dataset Examples :

Credits : Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59)

49. Georgia Tech Segmentation and Tracking Dataset

Dataset Characteristics :

Task Type : target segmentation and tracking in video

Video Attribute : 6 videos, 243 frames

parachute	girl
monkeydog	penguin
birdfall	cheetah

Code/Model/Publication URL : Motion Coherent Tracking Using Multi-label MRF Optimization

Source URL : GaTech SegTrack

Dataset Examples :

Credits : GaTech SegTrack

50. Video Segmentation Benchmark (VSB100) Dataset

Dataset Characteristics :

Task Type : Video Segmentation

Video Attribute : 100 HD quality videos with ground truth annotations

Code/Model/Publication URL : A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

Source URL : Video Segmentation Benchmark (VSB100) Dataset

Dataset Examples :

Credits : A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

51. Scene Flow Datasets

Dataset Characteristics :

Task Type : Object-level and material-level segmentation

Image Attribute : more than 39000 stereo frames in 960×540 pixel resolution

Code/Model/Publication URL : A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

Source URL : Scene Flow Datasets

Dataset Examples :

Credits : A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

52. FreiHAND Dataset

Dataset Characteristics :

Task Type : hand pose and shape estimation from single color image

Image Attribute : 130,240 training and 3,960 evaluation samples with hand segmentation masks (224×224 pixels)

Code/Model/Publication URL : FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images

Source URL : FreiHAND Dataset

Dataset Examples :

Credits : FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images

53. YouTube-Objects Dataset

Dataset Characteristics :

Task Type : video segmentation, foreground object segmentation

Video Attribute : 126 web videos with 10 object classes and more than 20,000 frames

aeroplane	bird
boat	car
cat	cow
dog	horse
motorbike	train

Code/Model/Publication URL : Supervoxel-Consistent Foreground Propagation in Video

Source URL : YouTube-Objects Segmentation Labels

Dataset Examples :

Credits : Supervoxel-Consistent Foreground Propagation in Video

54. SegTrack v2 Dataset

Dataset Characteristics :

Task Type : video segmentation

Video Attribute : 1,000 frames with pixel-level annotations for 14 categories

Girl	Birdfall
Parachute	Cheetah
Monkeydog	Penguin
Drifting Car	Hummingbird
Frog	Worm
Soldier	Monkey
Bird of Paradise	BMX

Code/Model/Publication URL : Video Segmentation by Tracking Many Figure-Ground Segments

Source URL : SegTrack v2 Dataset

Dataset Examples :

Credits : SegTrack v2 Dataset

55. Human semantic part segmentation Dataset

Dataset Characteristics :

Task Type : human body part segmentation for ground and aerial robots

Image Attribute : two hundred and one images of six different people in multiple viewpoints

Categories : 14 body parts

Head	Torso
Left Upper arm	Left Lower arm
Left hand	Right Upper arm
Right Lower arm	Right hand
Right Upper leg	Right Lower leg
Right foot	Left Upper leg
Left Lower leg	Left foot

Code/Model/Publication URL : Deep Learning for Human Part Discovery in Images

Source URL : Human semantic part segmentation Dataset

Dataset Examples :

Credits : Deep Learning for Human Part Discovery in Images

56. Cambridge-driving Labeled Video Database (CamVid)

Dataset Characteristics :

Task Type : semantic segmentation

Image Attribute : per-pixel semantic segmentation of over 700 images

Categories : 32 semantic classes

List of the 32 object class names and their corresponding colors used for labeling.

Code/Model/Publication URL : Semantic Object Classes in Video: A High-Definition Ground Truth Database

Source URL : Cambridge-driving Labeled Video Database (CamVid)

Dataset Examples :

Credits : Semantic Object Classes in Video: A High-Definition Ground Truth Database

57. Cityscapes Dataset

Dataset Characteristics :

Task Type : instance segmentation, semantic segmentation

Image Attribute : 25k Images

Group	Classes
flat	road, sidewalk, parking, rail track
human	person, rider
vehicle	car, truck, bus, on rails, motorcycle, bicycle, caravan, trailer
construction	building, wall, fence, guard rail, bridge, tunnel
object	pole, pole group, traffic sign, traffic light
nature	vegetation, terrain
sky	sky
void	ground, dynamic, static
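
Since this list is meant to be searched by class name, the group table above can be turned into a small lookup. The following is a minimal sketch, not part of any official Cityscapes tooling: the names `CITYSCAPES_GROUPS`, `CLASS_TO_GROUP`, and `find_group` are hypothetical, and the table contents are copied from the rows above.

```python
# Group-to-class table copied from the Cityscapes entry above.
CITYSCAPES_GROUPS = {
    "flat": ["road", "sidewalk", "parking", "rail track"],
    "human": ["person", "rider"],
    "vehicle": ["car", "truck", "bus", "on rails", "motorcycle",
                "bicycle", "caravan", "trailer"],
    "construction": ["building", "wall", "fence", "guard rail",
                     "bridge", "tunnel"],
    "object": ["pole", "pole group", "traffic sign", "traffic light"],
    "nature": ["vegetation", "terrain"],
    "sky": ["sky"],
    "void": ["ground", "dynamic", "static"],
}

# Reverse index: class name -> group name (hypothetical helper structure).
CLASS_TO_GROUP = {
    cls: group
    for group, classes in CITYSCAPES_GROUPS.items()
    for cls in classes
}


def find_group(class_name):
    """Return the group a class belongs to, or None if it is not listed."""
    return CLASS_TO_GROUP.get(class_name.strip().lower())


print(find_group("Rider"))          # a class in the "human" group
print(find_group("traffic light"))  # a class in the "object" group
print(find_group("zebra"))          # not a listed class
```

The same pattern could be extended to other entries in this list by adding one table per dataset.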

*requires institutional login

Code/Model/Publication URL : Semantic Understanding of Urban Street Scenes

Source URL : Cityscapes Dataset

Dataset Examples :

Credits : Cityscapes Dataset

58. Multi-Human Parsing (MHP) v1.0 Dataset

Dataset Characteristics :

Task Type : multi-human parsing

Image Attribute : 4,980 images containing 14,969 person instances with fine-grained pixel-level annotations

Categories : 18 different semantic labels

Face	Hair
Upper Clothes	Right Arm
Left Arm	Pants
Left Shoe	Right Shoe
Right Leg	Left Leg
Torso Skin	Dress
Hat	Skirt
Bag	Sun Glasses
Belt	Scarf

Code/Model/Publication URL : Multi-Human Parsing in the Wild

Source URL : Multi-Human Parsing (MHP) v1.0 Dataset

Dataset Examples :

Credits : Multi-Human Parsing in the Wild

59. Mut1ny head/face segmentation dataset

Dataset Characteristics :

Task Type : head/face segmentation

Image Attribute : 16.5k (16557) fully pixel-level labeled segmentation images

Categories : 11 facial labels

Lips	Eyes
Nose	Hair
Ears	Eyebrows
Teeth	General face
Facial hair	Specs/sunglasses
Background/undefined

Source URL : Mut1ny head/face segmentation dataset

Dataset Examples :

Credits : Mut1ny head/face segmentation dataset

60. Semantic Drone Dataset

Dataset Characteristics :

Task Type : semantic understanding of urban scenes for increasing the safety of autonomous drone flight and landing procedures

Image Attribute : The training set contains 400 publicly available images and the test set is made up of 200 private images.

Semantic classes of the Drone Dataset

tree
grass
other vegetation
dirt
gravel

rocks
water
paved area
pool
person

dog
car
bicycle
roof
wall

fence
fence-pole
window
door
obstacle

Source URL : Semantic Drone Dataset

Dataset Examples :

Credits : Semantic Drone Dataset

61. The SYNTHIA dataset

Dataset Characteristics :

Task Type : semantic segmentation

Image Attribute : more than 200,000 HD images from video streams and more than 20,000 HD images from independent snapshots

Categories : pixel-level semantic annotations for 13 classes

sky	building
road	sidewalk
fence	vegetation
lane-marking	pole
car	traffic signs
pedestrians	cyclists
miscellaneous

Code/Model/Publication URL : The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes

Source URL : The SYNTHIA dataset

Dataset Examples :

Credits : The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes

62. RailSem19

Dataset Characteristics :

Task Type : A Dataset for Semantic Rail Scene Understanding

Image Attribute : 8500 unique images with 110000 annotations, taken from the ego-perspective of a rail vehicle. Over 1000 examples contain railway crossings and 1200 show tram scenes.

Categories : annotations for the 21 labels listed below

buffer-stop	crossing
guardrail	train-car
platform	rail
switch-ind.	switch-left
switch-right	switch-unknown
switch-static	track-sign-front
track-signal-back	track-signal-front
person-group	truck
car	fence
person	pole
rail-occluder

Code/Model/Publication URL : RailSem19: A Dataset for Semantic Rail Scene Understanding

Source URL : RailSem19

Dataset Examples :

Credits : RailSem19: A Dataset for Semantic Rail Scene Understanding

63. DeepFashion: In-shop Clothes Retrieval Dataset

Dataset Characteristics :

Task Type : instance semantic segmentation

Image Attribute : 7,982 clothing items, 52,712 in-shop clothes images, and ~200,000 cross-pose/scale pairs

Code/Model/Publication URL : DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations

Source URL : DeepFashion: In-shop Clothes Retrieval

Dataset Examples :

Credits : DeepFashion: In-shop Clothes Retrieval

64. Indian Driving Dataset

Dataset Characteristics :

Task Type : road scene understanding in unstructured environments

Image Attribute : 10,004 images, finely annotated with 34 classes collected from 182 drive sequences on Indian roads

road	parking
drivable fallback	ground
sidewalk	rail track
non-drivable fallback	train
person	animal
rider	motorcycle
bicycle	autorickshaw
car	truck
bus	caravan
trailer	train
vehicle fallback	curb
wall	fence
guard rail	billboard
traffic sign	traffic light
pole	polegroup
obs-str-bar-fallback	building
bridge	tunnel
vegetation	sky
fallback background

Code/Model/Publication URL : IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments

Source URL : Indian Driving Dataset

Dataset Examples :

Credits : IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments

65. A2D2 Dataset

Dataset Characteristics :

Task Type : Instance Segmentation, Semantic Segmentation

Image Attribute : 41,277 camera images are semantically labelled.

Natural Object	Sky
RD normal street	Building
Car	SideWalk
Ego car	Truck
Grid Structure	Road Blocks
Drivable cobblestone	Curbstone
Solid line	Non-drivable street
Poles	Dashed lines
Irrelevant Signs	Traffic sign
Parking area	Obstacles/trash
RD restricted area	Traffic guide obj
Slow Drive area	Signal corpus
Pedestrian	Bicycle
Painted driv. instr	utility vehicle
traffic signal	Sidebars
Small vehicles	zebra crossing
Electronic traffic	Tractor
Blurred Area	Animals
Speed bumper	Rain dirt

Code/Model/Publication URL : A2D2: Audi Autonomous Driving Dataset

Source URL : A2D2 Dataset

Dataset Examples :

Credits : A2D2: Audi Autonomous Driving Dataset

66. Mapillary Vistas Dataset

Dataset Characteristics :

Task Type : semantic image segmentation and instance-specific image segmentation

Image Attribute : 25,000 high-resolution images annotated into 66 object categories with additional, instance-specific labels for 37 classes.

Bird	Ground Animal
Crosswalk-Plain	Person
Bicyclist	Motorcyclist
other rider	Lane Marking - Crosswalk
Banner	Bench
Bike Rack	Billboard
Catch Basin	CCTV Camera
Fire Hydrant	Junction Box
Mailbox	Manhole
Phone Booth	Street Light
Pole	Traffic Sign Frame
Utility Pole	Traffic Light
Traffic Sign (Back)	Traffic Sign (Front)
Trash Can	Bicycle
Boat	Bus
Car	Caravan
Motorcycle	Other Vehicle
Trailer	Truck
Wheeled Slow

*requires institutional login

Code/Model/Publication URL : The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes

Source URL : Mapillary Vistas Dataset

Dataset Examples :

Credits : The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes

67. eTRIMS Image Database

Dataset Characteristics :

Task Type : object segmentation and class segmentation of real world street scene.

Image Attribute : The database comprises two datasets:

The 4-Class eTRIMS Dataset, with 60 images and 534 annotated objects across 4 object classes.

sky
building
vegetation
pavement/road

The 8-Class eTRIMS Dataset, with 60 images and 1702 annotated objects across 8 object classes.

sky
building
vegetation
pavement
window
door
car
road

Code/Model/Publication URL : eTRIMS Image Database for Interpreting Images of Man-Made Scenes

Source URL : eTRIMS Image Database

Dataset Examples :

Credits : eTRIMS Image Database

68. Daimler Urban Segmentation Dataset

Dataset Characteristics :

Task Type : image segmentation of highly cluttered urban traffic scenes

Image Attribute : 5000 rectified stereo image pairs with a resolution of 1024×440 with pixel-level semantic class annotations into 5 classes: ground, building, vehicle, pedestrian, sky

Code/Model/Publication URL : Efficient Multi-Cue Scene Segmentation

Source URL : Daimler Urban Segmentation Dataset

Dataset Examples :

Credits : Stixmantics: A Medium-Level Model for Real-Time Semantic Scene Understanding

69. KITTI Motion Segmentation Dataset

Dataset Characteristics :

Task Type : Motion Segmentation

Image Attribute : 1950 frames in total: 1750 frames generated from six sequences of KITTI raw data, plus 200 frames from KITTI scene flow.

Code/Model/Publication URL : MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

Source URL : KITTI Motion Segmentation Dataset

Dataset Examples :

Credits : MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

70. CUHK DeepFashion2 Dataset

Dataset Characteristics :

Task Type : Cloth Segmentation

Image Attribute : 491K diverse images of 13 popular clothing categories, 801K clothing items, and 873K Commercial-Consumer clothes pairs

short sleeve top	long sleeve top
short sleeve outwear	long sleeve outwear
vest	sling
shorts	trousers
skirt	short sleeve dress
long sleeve dress	vest dress
sling dress

Code/Model/Publication URL : DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images

Source URL : DeepFashion2

Dataset Examples :

Credits : DeepFashion2

71. WildDash 2 Dataset

Dataset Characteristics :

Task Type : semantic and instance segmentation for the automotive domain

Image Attribute : 4256 public frames

Categories : pixel-level semantic annotations for 20 classes

Code/Model/Publication URL : WildDash – Creating Hazard-Aware Benchmarks

Source URL : WildDash 2 Dataset

Dataset Examples :

Credits : WildDash – Creating Hazard-Aware Benchmarks

72. CAD 120 affordance dataset

Dataset Characteristics :

Task Type: semantic image segmentation

Image Attribute: 3090 images containing 9916 object instances

The images contain the following object classes:
1. table
2. kettle
3. plate
4. bottle
5. thermal cup
6. knife
7. medicine box
8. can
9. microwave
10. paper box
11. bowl
12. mug

Affordances in the dataset:
1. openable
2. cuttable
3. pourable
4. containable
5. supportable
6. holdable

Code/Model/Publication URL: Weakly Supervised Affordance Detection

Source URL : CAD 120 Affordance Segmentation Dataset

Dataset Examples :

Credits : Weakly Supervised Affordance Detection

73. The Aberystwyth Leaf Evaluation Dataset

Dataset Characteristics :

Task Type : Leaf Segmentation

Image Attribute : 4 sets of 20 Arabidopsis thaliana plants were grown in trays. Images of each tray were taken in a 15-minute time-lapse sequence using a robotic greenhouse system.

There are 56 annotated ground truth images containing 916 hand-annotated individual Arabidopsis plants.

Code/Model/Publication URL : The Aberystwyth Leaf Evaluation Dataset: A plant growth visible light image dataset of Arabidopsis thaliana

Source URL : The Aberystwyth Leaf Evaluation Dataset

Dataset Examples :

Credits : The Aberystwyth Leaf Evaluation Dataset: A plant growth visible light image dataset of Arabidopsis thaliana

74. Shadow Detection/Texture Segmentation Computer Vision Dataset

Dataset Characteristics :

Task Type : Texture Segmentation

Image Attribute : 53 outdoor images with objects influenced by shadows

Code/Model/Publication URL : Shadow free segmentation in still images using local density measure

Source URL : Shadow Detection/Texture Segmentation Computer Vision Dataset

Dataset Examples :

Credits : Shadow free segmentation in still images using local density measure

75. CITY-OSM – ETH Zurich Dataset

Dataset Characteristics :

Task Type : Aerial Image Segmentation

Image Attribute : Four large datasets were downloaded from Google Maps and OSM for the cities of Chicago, Paris, Zurich, and Berlin. A somewhat smaller dataset was also downloaded for the city of Potsdam.

Code/Model/Publication URL : Learning Aerial Image Segmentation from Online Maps

Source URL : CITY-OSM – ETH Zurich Dataset

Dataset Examples :

Credits : Learning Aerial Image Segmentation from Online Maps

76. The TUD Crossing dataset

Dataset Characteristics :

Task Type : pedestrian instance segmentation

Image Attribute : 201 images containing 1008 highly overlapping pedestrians, with dense segmentation masks for 1216 pedestrian instances.

Code/Model/Publication URL : Hough Regions for Joining Instance Localization and Segmentation

Source URL : The TUD Crossing dataset

Dataset Examples :

Credits : Hough Regions for Joining Instance Localization and Segmentation

77. Inria Aerial Image Labeling Dataset

Dataset Characteristics :

Task Type : semantic segmentation on aerial images

Image Attribute : Coverage of 810 km² (405 km² for training and 405 km² for testing), with ground truth data for two semantic classes: building and not building
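
With only two classes, a ground truth label can be stored as a binary mask, and a prediction can be compared against it with intersection over union (IoU). The sketch below is illustrative only: the function name `building_iou` and the tiny 3×3 masks are made up, and using IoU as the score is an assumption here rather than something stated in this list.

```python
def building_iou(pred, truth):
    """IoU of the building class for two equally sized binary masks (1 = building)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                intersection += 1
            if p == 1 or t == 1:
                union += 1
    # Two empty masks agree perfectly, so return 1.0 when there is no building at all.
    return intersection / union if union else 1.0


# Toy masks (hypothetical data): 1 = building, 0 = not building.
truth = [[1, 1, 0],
         [0, 1, 0],
         [0, 0, 0]]
pred = [[1, 0, 0],
        [0, 1, 1],
        [0, 0, 0]]

# 2 pixels overlap and 4 pixels are building in either mask.
print(building_iou(pred, truth))  # 0.5
```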

Code/Model/Publication URL : CAN SEMANTIC LABELING METHODS GENERALIZE TO ANY CITY? THE INRIA AERIAL IMAGE LABELING BENCHMARK

Source URL : Inria Aerial Image Labeling Dataset

Dataset Examples :

Credits : CAN SEMANTIC LABELING METHODS GENERALIZE TO ANY CITY? THE INRIA AERIAL IMAGE LABELING BENCHMARK

78. Zurich Summer Dataset

Dataset Characteristics :

Task Type : Semantic segmentation of urban scene

Image Attribute : 20 multi-spectral VHR images acquired over the city of Zurich (Switzerland) by the QuickBird satellite in 2002. The average image size is 1000×1150 pixels (approximately 23 Mpixels in total)
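
The total pixel count quoted above follows directly from the image count and the average image size. A quick check, with variable names chosen only for illustration:

```python
# Figures taken from the Zurich Summer entry above.
NUM_IMAGES = 20
AVG_WIDTH = 1000   # pixels
AVG_HEIGHT = 1150  # pixels

total_pixels = NUM_IMAGES * AVG_WIDTH * AVG_HEIGHT
total_megapixels = total_pixels / 1_000_000

print(total_megapixels)  # 23.0
```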

Categories : 8 different urban and periurban classes : Roads, Buildings, Trees, Grass, Bare Soil, Water, Railways and Swimming pools

Code/Model/Publication URL : Semantic segmentation of urban scenes by learning local class interactions

Source URL : Zurich Summer Dataset

Dataset Examples :

Credits : Semantic segmentation of urban scenes by learning local class interactions

79. Pedestrian Color Naming Dataset

Dataset Characteristics :

Task Type : image segmentation

Image Attribute : 14,213 images, each hand-labeled with a color label for every pixel

Code/Model/Publication URL : Pedestrian Color Naming via Convolutional Neural Network

Source URL : Pedestrian Color Naming Dataset

Dataset Examples :

Credits : Pedestrian Color Naming via Convolutional Neural Network

80. Human Skin Segmentation Dataset

Dataset Characteristics :

Task Type : human skin segmentation

Image Attribute : 32 face photos and 46 family photos with ground truth labels

Code/Model/Publication URL : A Fusion Approach for Efficient Human Skin Detection

Source URL : Human Skin Segmentation Dataset

Dataset Examples :

Credits : A Fusion Approach for Efficient Human Skin Detection

81. 38-Cloud: A Cloud Segmentation Dataset

Dataset Characteristics :

Task Type : Cloud Segmentation

Image Attribute : 8400 patches for training and 9201 patches for testing extracted from 38 Landsat 8 remote sensing images

Code/Model/Publication URL : CLOUD-NET: AN END-TO-END CLOUD DETECTION ALGORITHM FOR LANDSAT 8 IMAGERY

Source URL : 38-Cloud-A-Cloud-Segmentation-Dataset

Dataset Examples :

Credits : 38-Cloud-A-Cloud-Segmentation-Dataset

82. Multi-Human Parsing (MHP) v2.0 Dataset

Dataset Characteristics :

Task Type : multi-human parsing

Image Attribute : 25,403 human images with pixel-wise annotations

Categories : 58 semantic categories

cap/hat	helmet
face	hair
left-arm	right-arm
left-hand	right-hand
protector	bikini/bra
jacket/windbreaker/hoodie	t-shirt
polo-shirt	sweater
singlet	torso-skin
pants	shorts/swim-shorts
skirt	stockings
socks	left-boot
right-boot	left-shoe
right-shoe	left-high heel
right-high heel	left-sandal
right-sandal	left-leg
right-leg	left-foot
right-foot	coat
dress	robe
jumpsuit	other-full-body-clothes
headwear	backpack
ball	bats
belt	bottle
carrybag	cases
sunglasses	eyewear
glove	scarf
umbrella	wallet/purse
watch	wristband
tie	other-accessory
other-upper-body-clothes	other-lower-body-clothes

Code/Model/Publication URL : Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing

Source URL : Multi-Human Parsing (MHP) v2.0 Dataset

Dataset Examples :

Credits : Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing

83. MSRA10K Salient Object Database

Dataset Characteristics :

Task Type : salient object detection and segmentation

Image Attribute : 10,000 images with pixel-level saliency labeling + Pixel accurate salient object labeling for 5000 images from MSRA-B dataset

Code/Model/Publication URL : Global Contrast based Salient Region Detection

Source URL : MSRA10K Salient Object Database

Dataset Examples :

Credits : MSRA10K Salient Object Database

84. Stanford background dataset

Dataset Characteristics :

Task Type : geometric and semantic scene understanding.

Image Attribute : 715 images of approximately 320-by-240 pixels each.

Categories :

sky
tree
road
grass
water
bldg
mntn
fg obj.
horz.
vert.

Code/Model/Publication URL : Decomposing a Scene into Geometric and Semantically Consistent Regions

Source URL : Stanford background dataset

Dataset Examples :

Credits : Decomposing a Scene into Geometric and Semantically Consistent Regions

85. Oakland 3-D Point Cloud Dataset

Dataset Characteristics :

Task Type : semantic segmentation

Image Attribute : 17 files, 1.6 million 3-D points, 44 labels

Code/Model/Publication URL : Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields

Source URL : Oakland 3-D Point Cloud Dataset

Dataset Examples :

Credits : Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields

86. Penn-Fudan Database for Pedestrian Detection and Segmentation

Dataset Characteristics :

Task Type : pedestrian segmentation

Image Attribute : 170 images with 345 labeled pedestrians, of which 96 images were taken around the University of Pennsylvania and the other 74 around Fudan University

Code/Model/Publication URL : Object Detection Combining Recognition and Segmentation

Source URL : Penn-Fudan Database for Pedestrian Detection and Segmentation

Dataset Examples :

Credits : Penn-Fudan Database for Pedestrian Detection and Segmentation

87. HOPKINS 155 DATASET

Dataset Characteristics :

Task Type : motion segmentation

Image Attribute : 155 motion sequences of checkerboard, traffic, and articulated scenes

Code/Model/Publication URL : A Benchmark for the Comparison of 3-D Motion Segmentation Algorithms

Source URL : HOPKINS 155 DATASET

Dataset Examples :

Credits : HOPKINS 155 DATASET

88. Segmentation evaluation database

Dataset Characteristics :

Task Type : image segmentation

Image Attribute : 200 gray level images along with ground truth segmentations

Code/Model/Publication URL : Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration

Source URL : Segmentation evaluation database

Dataset Examples :

Credits : Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration

89. CO-SKEL dataset

Dataset Characteristics :

Task Type : co-skeletonization, Co-segmentation

Image Attribute : It consists of 26 categories with a total of 353 images of animals, birds, flowers, and humans

bear	iris
camel	cat
cheetah	cormorant
cow	cranesbill
deer	desertrose
dog	egret
firepink	frog
geranium	horse
man	ostrich
panda	pigeon
seagull	seastar
sheep	snowowl
statue	woman

Code/Model/Publication URL : Object Co-skeletonization with Co-segmentation

Source URL : CO-SKEL dataset

Dataset Examples :

Credits : Object Co-skeletonization with Co-segmentation

90. EV-IMO dataset

Dataset Characteristics :

Task Type : indoor motion segmentation

Video Attribute :

Total recording time ~30 minutes
Up to 3 independently moving objects
Multiple types of scenes (varying backgrounds and motion speeds)

Code/Model/Publication URL : EV-IMO: Motion Segmentation Dataset and Learning Pipeline for Event Cameras

Source URL : EVIMO dataset

Dataset Examples :

Credits : EVIMO dataset

91. Materials in Context Database (MINC) dataset

Dataset Characteristics :

Task Type : material segmentation of images in the wild.

Image Attribute : It consists of 3M labeled point samples and 7061 labeled material segmentations in 23 material categories.

Brick	Carpet
Ceramic	Fabric
Foliage	Food
Glass	Hair
Leather	Mirror
Metal	Other
Painted	Paper
Plastic	Skin
Pol. stone	Sky
Stone	Tile
Wallpaper	Wood
Water

Code/Model/Publication URL : Material Recognition in the Wild with the Materials in Context Database

Source URL : MINC dataset

Dataset Examples :

Credits : Material Recognition in the Wild with the Materials in Context Database

92. Liver Tumor Segmentation (LiTS) dataset

Dataset Characteristics :

Task Type : Liver Tumor Segmentation

Image Attribute : the training data set contains 130 CT scans and the test data set 70 CT scans

Code/Model/Publication URL : The Liver Tumor Segmentation Benchmark (LiTS)

Source URL : (LiTS) dataset

Dataset Examples :

Credits : The Liver Tumor Segmentation Benchmark (LiTS)

93. Egocentric Dataset of the University of Barcelona – Segmentation (EDUB-Seg) dataset

Dataset Characteristics :

Task Type : egocentric event segmentation

Image Attribute : a total of 18,735 images captured by 7 different users over a total of 20 days.

Code/Model/Publication URL : R-Clustering for Egocentric Video Segmentation

Source URL : EDUB-Seg dataset

Dataset Examples :

Credits : R-Clustering for Egocentric Video Segmentation

94. Audio Visual Cues Dataset

Dataset Characteristics :

Task Type : semantic segmentation from audio-visual cues

Image Attribute : 9 long reconstruction sequences, containing on average 1600 individual 640×480 RGB-D frames. 600 sounds from 50 different objects and 9 material categories were collected.

Code/Model/Publication URL : Joint Object-Material Category Segmentation from Audio-Visual Cues

Source URL : Audio Visual Cues Dataset

Dataset Examples :

Credits : Joint Object-Material Category Segmentation from Audio-Visual Cues

95. OpenSurfaces Dataset

Dataset Characteristics :

Task Type : surfaces segmentation from consumer photographs of indoor scene

Image Attribute : 25K filtered images, with 58,928 surfaces annotated with a material name and 33,378 annotated with object names.

Code/Model/Publication URL : OPENSURFACES: A Richly Annotated Catalog of Surface Appearance

Source URL : OpenSurfaces Dataset

Dataset Examples :

Credits : OPENSURFACES: A Richly Annotated Catalog of Surface Appearance

96. Multi-species fruit flower detection

Dataset Characteristics :

Task Type : semantic segmentation

Image Attribute : 190 images shared among three different species: apple, peach, and pear

Code/Model/Publication URL : Multispecies Fruit Flower Detection Using a Refined Semantic Segmentation Network

Source URL : Data from: Multi-species fruit flower detection using a refined semantic segmentation network

Dataset Examples :

Credits : Multispecies Fruit Flower Detection Using a Refined Semantic Segmentation Network

97. SAIL-VOS

Dataset Characteristics :

Task Type : video object segmentation

Image Attribute : 201 video sequences and 111,654 frames consisting of 1,388,389 objects

Code/Model/Publication URL : SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation – A Synthetic Dataset and Baselines

Source URL : SAIL-VOS

Dataset Examples :

Credits : SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation – A Synthetic Dataset and Baselines

98. TB-roses-v1 & v2 dataset

Dataset Characteristics :

Task Type : rose stem segmentation

Image Attribute : 354 images of rose bushes

Code/Model/Publication URL : Brain-inspired robust delineation operator

Source URL : TB-roses-v1 & v2 dataset

Dataset Examples :

Credits : TB-roses-v1 & v2 dataset

99. IOSTAR Retinal Vessel Segmentation Dataset

Dataset Characteristics :

Task Type : Vessel Segmentation

Image Attribute : 30 images with a resolution of 1024×1024 pixels

Code/Model/Publication URL : Robust Retinal Vessel Segmentation via Locally Adaptive Derivative Frames in Orientation Scores

Source URL : IOSTAR Retinal Vessel Segmentation Dataset

*requires institutional login

Dataset Examples :

Credits : Robust Retinal Vessel Segmentation via Locally Adaptive Derivative Frames in Orientation Scores

100. PartNet Dataset

Dataset Characteristics :

Task Type : fine-grained semantic segmentation, hierarchical semantic segmentation, and instance segmentation

Image Attribute : 573,585 part instances over 26,671 3D models covering 24 object categories.

24 object categories in PartNet

Code/Model/Publication URL : PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding

Source URL : PartNet Dataset

Dataset Examples :

Credits : PartNet Dataset

101. Synthinel-1 dataset

Dataset Characteristics :

Task Type : Building footprint segmentation

Image Attribute : 2,108 synthetic images, each 572×572 pixels in size with a resolution of 0.3 m/pixel
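
The image size and ground resolution above imply how much ground each image covers. The following is a rough back-of-the-envelope sketch; the variable names are illustrative, and the total assumes the images do not overlap, which this list does not state.

```python
# Figures taken from the Synthinel-1 entry above.
IMAGE_COUNT = 2108
SIDE_PIXELS = 572
METERS_PER_PIXEL = 0.3

# Ground length covered by one image side, in meters.
side_m = SIDE_PIXELS * METERS_PER_PIXEL

# Ground area covered by one image, in square kilometers.
area_km2 = (side_m ** 2) / 1_000_000

# Upper bound on total coverage, assuming no overlap between images.
total_km2 = IMAGE_COUNT * area_km2

print(f"side: {side_m:.1f} m")
print(f"area per image: {area_km2:.4f} km^2")
print(f"total (no overlap assumed): {total_km2:.1f} km^2")
```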

The nine different virtual city styles used are:

(a) Red roof style

(b) Paris’ buildings style

(d) sci-fi city style

(e) Chinese palace style

(f) Damaged city style

(g) Austin city style

(h) Venice style

(i) modern city style

Code/Model/Publication URL : The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation

Source URL : Synthinel-1 dataset

Dataset Examples :

Credits: Synthinel-1 dataset

Do you have a custom dataset with several object classes for which you desire pixel-perfect annotations?

Well, you have reached the right place. Request a demo to learn how NeuralMarker can help you create your next pixel-perfect training data.

You can also find out which AI tools NeuralMarker, an end-to-end AI annotation platform, uses to segment pixels in images.
