Three Types and Contents of Autonomous Driving Data Labeling [Illustration]

What is autonomous driving data annotation?

Autonomous driving data annotation is the process of marking cars, people, and other objects in images or videos with bounding boxes and defining their attributes. The annotated data teaches ML models to recognize traffic elements such as pedestrians, cars, and traffic signs, so that they can understand and identify the objects detected by the vehicle's sensors.

The basis for realizing autonomous driving is artificial intelligence. And what is the basis for realizing artificial intelligence? The answer is autonomous driving data labeling.

At present, autonomous driving urgently needs to solve four major problems: to see (positioning, obstacle avoidance), to hear (decision-making, control, execution), to speak (path planning, driving mode), and to have a brain (edge computing).

Label content:

1. Motorcycle; 2. Bicycle; 3. Motorcyclist/cyclist; 4. Front and rear wheel lines; 5. Tricycle; 6. Pedestrian; 7. Traffic lights; 8. Traffic signs; 9. Indifferent objects; 10. Car cover; 11. Negligible pole; 12. Horizon; 13. Other animals.

Label type

What types of labeling are used for autonomous driving data? The static and dynamic objects on the road are intricate. How should they be marked, and which labeling types apply to different objects and regions?

1. Draw-frame (bounding box) annotation

Draw-frame annotation mainly refers to marking the specified target object in the image with a 2D box, a 3D box, multiple boxes, etc. It is applied to the basic recognition of vehicles and pedestrians, that is, marking cyclists, pedestrians, and cars. It also includes 3D labeling of vehicles in a flat image, which is mainly used to train the driving system to judge the size of passing or overtaking vehicles and to build a complete picture of the pedestrians and vehicles around the car.
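As a rough illustration, a 2D box label can be stored as a class name plus the pixel coordinates of two corners. The sketch below assumes an (x_min, y_min, x_max, y_max) convention; the names Box2D and iou are hypothetical and not taken from any particular tool. Intersection-over-union is one common way to compare two boxes drawn around the same object, for example an annotator's box and a reviewer's box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box2D:
    """Hypothetical 2D box label in pixel coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: Box2D, b: Box2D) -> float:
    """Intersection-over-union of two boxes, a value in [0, 1]."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Two boxes drawn around the same pedestrian, shifted by 10 pixels.
annotator = Box2D("pedestrian", 100, 50, 140, 150)
reviewer = Box2D("pedestrian", 110, 50, 150, 150)
overlap = iou(annotator, reviewer)
print(f"IoU between the two boxes: {overlap:.2f}")
```

A low overlap between two people's boxes for the same object is a simple signal that the label may need another look.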


2. Semantic Segmentation

Semantic segmentation is a very important labeling task in computer vision. It segments and labels the different regions in a picture, such as pedestrians, vehicles, buildings, sky, and vegetation. For example, semantic segmentation can help self-driving vehicles recognize the drivable areas in a picture.

Among the annotation types common in autonomous driving, image semantic segmentation is widely used. Conceptually, it is an important branch of computer vision within artificial intelligence. It combines image classification, object detection, and image segmentation techniques, mainly to classify images at the pixel level.
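To make "pixel-level classification" concrete, a segmentation label can be thought of as a grid the same size as the image in which every pixel holds one class ID. The following is a minimal sketch with a made-up 4x6 mask and a hypothetical class map (real datasets define their own); it computes the share of pixels per class, including the drivable road area.

```python
from collections import Counter

# Hypothetical class IDs; real datasets define their own label maps.
CLASS_NAMES = {0: "road", 1: "vehicle", 2: "pedestrian", 3: "sky"}

# A tiny 4x6 "image": every pixel carries exactly one class ID.
mask = [
    [3, 3, 3, 3, 3, 3],
    [3, 3, 1, 1, 3, 3],
    [0, 0, 1, 1, 2, 0],
    [0, 0, 0, 0, 0, 0],
]


def class_fractions(mask):
    """Return the share of pixels assigned to each class name."""
    counts = Counter(class_id for row in mask for class_id in row)
    total = sum(counts.values())
    return {CLASS_NAMES[c]: n / total for c, n in counts.items()}


fractions = class_fractions(mask)
drivable_share = fractions["road"]
print(f"Share of pixels labeled as road: {drivable_share:.3f}")
```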


3. 3D point cloud annotation

A 3D point cloud is a kind of data that is very well suited to 3D scene understanding because it is very close to the raw sensor data: a lidar scan is itself a point cloud, and a depth image is a local part of a point cloud. Working on data this close to the raw measurements makes end-to-end deep learning possible.

Compared with 2D images, 3D point clouds provide more dimensional information, such as geometry, shape, and size. 3D point cloud annotation finds the target object in the point cloud collected by the lidar and marks it with a cuboid (3D box). This kind of data is not easily affected by changes in light intensity or by occlusion from other objects. Target objects include vehicles, pedestrians, advertising signs, trees, etc., and the labeled data is used to train artificial intelligence models for computer vision and autonomous driving.
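A cuboid label is often described by a center, a size, and a heading angle. The sketch below assumes that representation (the Cuboid class and its field names are hypothetical) and shows how one might check which lidar points fall inside a labeled box, which is a simple sanity check on a 3D annotation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cuboid:
    """Hypothetical 3D box label: center, size, and heading (yaw) about the z axis."""
    label: str
    cx: float
    cy: float
    cz: float
    length: float  # extent along the box's own x axis
    width: float   # extent along the box's own y axis
    height: float  # extent along z
    yaw: float     # heading in radians

    def contains(self, x: float, y: float, z: float) -> bool:
        # Shift to the box center, then rotate by -yaw into the box frame.
        dx, dy, dz = x - self.cx, y - self.cy, z - self.cz
        cos_y, sin_y = math.cos(self.yaw), math.sin(self.yaw)
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        return (abs(local_x) <= self.length / 2
                and abs(local_y) <= self.width / 2
                and abs(dz) <= self.height / 2)


# Made-up lidar points (x, y, z) in meters.
points = [(10.0, 2.0, 0.5), (11.5, 2.0, 0.5), (10.0, 4.0, 0.5), (30.0, 0.0, 0.0)]
car = Cuboid("car", cx=10.0, cy=2.0, cz=0.75, length=4.0, width=1.8, height=1.5, yaw=0.0)
inside = [p for p in points if car.contains(*p)]
print(f"{len(inside)} of {len(points)} points fall inside the car box")
```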


The principle of autonomous driving

For a self-driving car to get from point A to point B, it needs a perfect grasp of its surroundings. A typical use case for the driving functionality you want to implement in a car might require two identical sensor sets: one is the sensor set under test, and the other serves as the reference.

Case analysis:

Now let us assume that a car travels 300,000 kilometers at an average speed of 45 kilometers per hour under different driving conditions. At that speed, the vehicle needs about 6,700 hours to cover the distance. The car may also have multiple cameras and a LIDAR (Light Detection and Ranging) system, which would generate about 240,000,000 frames of data if we assume recording at 10 frames per second during those 6,700 hours. Assuming an average of 15 objects per frame, including other cars, traffic lights, pedestrians, and other objects, we end up with roughly 3.6 billion objects. All of these objects must be labeled.
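The arithmetic behind these figures can be checked in a few lines. The sketch uses only the numbers stated above and, as an assumption, treats the recording as a single stream at 10 frames per second; every additional sensor recording at that rate would multiply the frame count.

```python
distance_km = 300_000
avg_speed_kmh = 45
fps = 10
objects_per_frame = 15

exact_hours = distance_km / avg_speed_kmh   # about 6,667 hours
rounded_hours = 6_700                       # the rounded figure used in the text

frames = rounded_hours * 3600 * fps         # seconds of driving times frames per second
objects = frames * objects_per_frame

print(f"Driving time: {exact_hours:,.0f} h (rounded to {rounded_hours:,} h)")
print(f"Frames:  {frames:,}")
print(f"Objects: {objects:,}")
```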


Labeling alone is not enough; it also has to be accurate. Until the labels are accurate, we cannot make meaningful comparisons between the sensor sets on a vehicle. So what if we had to label each object manually?

It may not be enough to just place bounding boxes and assign general labels such as car, pedestrian, or stop sign. You also need the attributes that best describe each object: whether brake lights are on, whether the object is moving or stationary, whether it is an emergency vehicle, how its lights are classified, what warning lights an emergency vehicle has, and so on. This requires an exhaustive list of objects and their corresponding properties, where each property must be processed one at a time. That means we are talking about a lot of data.
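One way to picture such a list of objects and properties is a schema that maps each label to its allowed attributes and values. The example below is only a sketch: the labels, attribute names, and values are made up for illustration, and validate is a hypothetical helper rather than part of any specific labeling tool.

```python
# Hypothetical schema: each label lists its allowed attributes and values.
SCHEMA = {
    "car": {
        "motion": {"moving", "stationary"},
        "brake_lights": {"on", "off"},
    },
    "emergency_vehicle": {
        "motion": {"moving", "stationary"},
        "warning_lights": {"on", "off"},
    },
    "traffic_light": {
        "state": {"red", "yellow", "green", "off"},
    },
}


def validate(label, attributes):
    """Return a list of problems; an empty list means the annotation fits the schema."""
    if label not in SCHEMA:
        return [f"unknown label: {label}"]
    allowed = SCHEMA[label]
    problems = []
    for name, values in allowed.items():
        if name not in attributes:
            problems.append(f"missing attribute: {name}")
        elif attributes[name] not in values:
            problems.append(f"invalid value for {name}: {attributes[name]}")
    for name in attributes:
        if name not in allowed:
            problems.append(f"unexpected attribute: {name}")
    return problems


print(validate("car", {"motion": "moving", "brake_lights": "on"}))
print(validate("car", {"motion": "flying"}))
```

Checking each annotation against a schema like this catches missing or misspelled attributes before a human reviewer ever sees the label.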


Once that is done, you also need to make sure the labels are correct, so another person has to check the label data. This keeps the scope for errors minimal. If this activity were done manually, at an average of 60 seconds per object, the 3.6 billion objects discussed earlier would take 60 million hours, or about 6,849 calendar years. Manual labeling is therefore impractical.
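The manual-effort estimate follows directly from the stated inputs. This short check assumes a calendar year of 365 days of round-the-clock work, which is what the 6,849-year figure implies.

```python
objects = 3_600_000_000        # objects to label, from the case above
seconds_per_object = 60        # average manual labeling time

total_hours = objects * seconds_per_object / 3600
calendar_years = total_hours / (24 * 365)   # non-stop work, 365-day years

print(f"Total effort: {total_hours:,.0f} hours")
print(f"Calendar years of non-stop work: {calendar_years:,.0f}")
```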


AI labeling tools improve efficiency

The example above shows that labeling data entirely by hand is impractical. Various open source tools can help with this activity. Thanks to deep learning models, they can detect objects automatically despite different viewing angles, low resolution, or low-light conditions. When it comes to automation, the first step is to create an annotation task: start by naming the task, then specify the labels and the properties associated with them. With this done, you can add the data repositories that need to be labeled.


Apart from that, many additional properties can be added to tasks. Labeling can be done using polygons, boxes, and polylines, and different modes are available, such as interpolation, attribute labeling, and segmentation.
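Interpolation is a good example of how a tool saves work: the annotator draws a box on a few keyframes, and the frames in between are filled in automatically. The sketch below shows plain linear interpolation between two keyframes. It is a generic illustration under that assumption, not the algorithm of any particular tool, and interpolate_box is a hypothetical name.

```python
def interpolate_box(key_a, key_b, frame):
    """Linearly interpolate a box between two keyframes.

    key_a and key_b are (frame_index, (x_min, y_min, x_max, y_max)) tuples.
    """
    frame_a, box_a = key_a
    frame_b, box_b = key_b
    if frame_a == frame_b or not frame_a <= frame <= frame_b:
        raise ValueError("frame must lie between two distinct keyframes")
    t = (frame - frame_a) / (frame_b - frame_a)  # 0.0 at key_a, 1.0 at key_b
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


# A car labeled by hand on frames 0 and 10; frame 5 is filled in automatically.
first_key = (0, (100.0, 50.0, 140.0, 150.0))
second_key = (10, (200.0, 50.0, 240.0, 150.0))
middle = interpolate_box(first_key, second_key, 5)
print(middle)
```

With two hand-drawn keyframes covering eleven frames, nine boxes come for free, and the annotator only needs to correct frames where the object does not move in a straight line.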

Automation reduces the average time required to label data. Incorporating it can cut the effort and mental fatigue involved by at least 65%.