Smuggle It: An Annotation Datasheet for Labeling Your Next Object Detection Dataset


Object detection, at its most fundamental, allows computer vision algorithms to answer one question: “what objects are where”.

Consequently, computer vision systems and deep learning models need accurately labeled datasets to answer that question consistently across a variety of object classes.

It is often seen that companies building computer vision systems move ahead of their competitors only because of highly accurate models, which are a by-product of precise annotations on the training dataset.

Hidden Holy Grail

One of the many reasons AI organisations are able to build highly accurate datasets is that their data scientists and machine learning engineers craft a hidden holy grail, a Data Annotations Instruction Sheet, for the AI workforce (the team of data labelers) to follow.

If you’re wondering how you, as a data scientist, can create a Data Annotations Instruction Sheet for your object detection dataset, at the organizational or the individual level, then this article is for you!

This blog is a “ready to use” guide that helps data scientists give annotators intuitive instructions for labeling and producing inch-perfect object detection datasets.

So buckle up, data specialists, it’s time to start the journey!

Best Practice:

To start, provide information about the complexity of the dataset upfront. The added clarity helps the team of labelers produce high-quality, consistent labels throughout the data labeling process.

An example of such practice is included below:

Vehicle Detection Dataset


The Vehicle Detection Dataset incorporates diverse vehicles moving across urban, desert, and wildlife areas in various parts of the world. The following sections describe the dataset's diversity and class definitions.

Diversity

  1. Various cities
  2. Seasons (summer, monsoon, winter)
  3. Time of day (daytime, evening, nighttime)
  4. Good/medium weather conditions

Class Definitions

The class definitions list the classes present in the Vehicle Detection Dataset.

  1. Bicycle: two-wheel bicycle, with or without a rider.
  2. Bus: double-decker bus, public transport bus, long-distance travel bus.
  3. Car: sports car, jeep, SUV, sedan, van, car-truck with continuous body shape.
  4. Motorcycle: sportbike, standard motorcycle, dual-sport motorcycle, cruiser motorcycle, with or without a rider.
  5. Scooter: classical and modern scooter, with or without a rider.
  6. Truck: box truck, pickup truck, freight truck, including their trailers.
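
The class definitions above can also be handed to annotators and tooling in machine-readable form. The following is a minimal sketch in Python, assuming a plain dictionary layout. The names `CLASS_DEFINITIONS` and `class_for_subtype` are illustrative and do not come from any particular labeling tool.

```python
# Hypothetical machine-readable version of the class definitions above.
# The dictionary layout and all names here are illustrative assumptions.
CLASS_DEFINITIONS = {
    "bicycle": ["two-wheel bicycle"],
    "bus": ["double-decker bus", "public transport bus", "long-distance travel bus"],
    "car": ["sports car", "jeep", "suv", "sedan", "van", "car-truck"],
    "motorcycle": ["sportbike", "standard motorcycle", "dual-sport motorcycle", "cruiser motorcycle"],
    "scooter": ["classical scooter", "modern scooter"],
    "truck": ["box truck", "pickup truck", "freight truck"],
}


def class_for_subtype(subtype):
    """Return the dataset class for a vehicle sub-type, or None if unknown."""
    wanted = subtype.strip().lower()
    for class_name, subtypes in CLASS_DEFINITIONS.items():
        if wanted in subtypes:
            return class_name
    return None


print(class_for_subtype("Pickup truck"))  # truck
print(class_for_subtype("tram"))          # None
```

A lookup like this lets a reviewer quickly check which class an unfamiliar vehicle type belongs to, and returns `None` for anything the sheet does not cover.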


Image Annotation Instructions for Object Detection Datasets

To maintain consistency, all the instructions in this section take the complexity and diversity of the Vehicle Detection Dataset above into consideration. However, as long as you need a bounding box around the object of interest, they apply just as well to any other object detection dataset.


1. Draw a tight bounding box that includes all visible parts of the object being annotated.

2. Annotate only the visible part of the object. Avoid covering parts that are occluded by surrounding objects.

3. Each bounding box should cover one category only.

4. Do not put multiple objects of the same category inside one bounding box. Cover each object separately.

5. One bounding box may overlap another when covering objects of the same or different categories.

6. Do not draw an annotation under the wrong category. Always assign the correct category.

7. Do not miss objects that appear very small in the image.

8. Avoid annotating objects that are not visible clearly enough to be distinguished in the image.

9. Annotate every object of interest present in an image.

10. Avoid annotating an image at all if:

  • you are not sure what the object is (due to poor illumination).
  • the object is very small (according to your judgment).
  • less than 20% of the object is visible.

For example, consider an image with two bicycles on the front leftmost side, of which only one is clearly visible.

The annotator should avoid annotating this image altogether. If the annotator annotates the image but leaves out the poorly visible bicycle, he or she is telling the machine learning model that only the annotated objects are bicycles and that the rest are not.
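
The skip rule in instruction 10 can be sketched as a small check. This is a rough illustration in Python, not a prescribed implementation. It assumes a reviewer has already estimated, for every object of interest in the image, how much of it is visible and whether it can be recognized. The `ObjectEstimate` type, its fields, and `should_skip_image` are hypothetical names; only the 20% threshold comes from the instruction itself.

```python
from dataclasses import dataclass

# Threshold from instruction 10: objects less than 20% visible trigger a skip.
MIN_VISIBLE_FRACTION = 0.20


@dataclass
class ObjectEstimate:
    """A reviewer's estimate for one object of interest (hypothetical type)."""
    category: str
    visible_fraction: float  # 0.0 (fully hidden) to 1.0 (fully visible)
    recognizable: bool       # False if poor illumination or tiny size hides what it is


def should_skip_image(objects):
    """Return True if any object of interest cannot be labeled reliably.

    Labeling only some objects would tell the model that the unlabeled
    ones are not objects of interest, so the whole image is skipped instead.
    """
    return any(
        (not obj.recognizable) or obj.visible_fraction < MIN_VISIBLE_FRACTION
        for obj in objects
    )


# Two bicycles, one barely visible: the image is skipped as a whole.
bicycles = [
    ObjectEstimate("bicycle", visible_fraction=0.9, recognizable=True),
    ObjectEstimate("bicycle", visible_fraction=0.1, recognizable=True),
]
print(should_skip_image(bicycles))  # True
```

The check works at the image level on purpose: a single unreliable object is enough to exclude the image, which mirrors the bicycle example.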

* An asterisk indicates that the instruction varies with each company's project requirements.

11.* Avoid annotating objects of interest if they appear blurry in the image.

Ideally, an annotator who comes across such images should DELETE them from the dataset.


12.* Annotate objects in images only if they are photorealistic.


13.* Annotate objects only in images that have a non-iconic rather than an iconic view.

Iconic view: natural images in which only a single object of interest exists.

Non-iconic view: natural images in which multiple objects of interest exist.


Publication: Microsoft COCO: Common Objects in Context
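
Following the iconic and non-iconic definitions above, an image can be sorted by how many objects of interest it contains. Here is a minimal sketch in Python, assuming the object count per image is already known. `view_type` is a hypothetical helper, and the handling of images with no objects is an assumption of this sketch, since the definitions do not cover that case.

```python
def view_type(num_objects_of_interest):
    """Classify an image using the definitions given in this article.

    One object of interest  -> "iconic"
    Two or more             -> "non-iconic"
    Zero objects is not covered by the definitions; returning None
    for that case is an assumption of this sketch.
    """
    if num_objects_of_interest < 0:
        raise ValueError("object count cannot be negative")
    if num_objects_of_interest == 0:
        return None
    return "iconic" if num_objects_of_interest == 1 else "non-iconic"


print(view_type(1))  # iconic
print(view_type(4))  # non-iconic
```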

Label Object Detection Datasets with NeuralMarker

Visit NeuralMarker to explore this training data annotation platform further, or request a demo to learn how its AI-powered smart labeling features can assist you in pre-labeling datasets for object detection.
