One way that autonomous car makers and the companies building self-driving technology improve their algorithms and simulations is with more data. Berkeley has announced that it has open-sourced the largest self-driving dataset ever gathered. The dataset contains 100,000 video sequences that are each about 40 seconds long.
Each of those 100,000 videos is in 720p quality. Berkeley says that the dataset is 800 times larger than the Baidu ApolloScape dataset. The new dataset is called BDD100K, and the videos also come with GPS information that was recorded via mobile phones. That GPS data shows the rough driving trajectory behind each video.
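To put those numbers in perspective, the total amount of footage can be estimated from the two figures quoted above: 100,000 clips at roughly 40 seconds each. The snippet below is only a back-of-the-envelope sketch; the constant and function names are made up for illustration, and the 40-second clip length is the approximate figure from the announcement, not an exact value.

```python
# Figures quoted in the article; the clip length is approximate ("about 40 seconds").
NUM_VIDEOS = 100_000
SECONDS_PER_VIDEO = 40


def total_footage_hours(num_videos: int, seconds_per_video: float) -> float:
    """Return the combined duration of all clips, in hours."""
    return num_videos * seconds_per_video / 3600


hours = total_footage_hours(NUM_VIDEOS, SECONDS_PER_VIDEO)
print(f"Roughly {hours:,.0f} hours of driving video")
```

Under those assumptions the dataset works out to a little over 1,100 hours of driving video.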
One of the big things that will make the data so useful is that it includes different weather conditions, ranging from sunny to overcast, rainy, and hazy. The data also offers a “good balance” of day and night driving scenarios. The annotated images separate lane markings into two types so that they are easy to tell apart.
Until now, the Baidu dataset, which was released back in March, was the largest. The Berkeley dataset is 800 times larger than what Baidu offered. Mapillary’s dataset is 4,800 times smaller than Berkeley’s, and the KITTI dataset is 8,000 times smaller.
The Berkeley BDD100K dataset has 1.2 million images and 100,000 sequences, and it covers multiple cities. The dataset is available for download now, and data scientists should be salivating.
SOURCE: Analytics Vidhya