Oxford Town Centre Dataset
775 views · Published by dataju on 2021-08-16
Dataset Overview

https://academictorrents.com/details/35e83806d9362a57be736f370c821960eb2f2a01

Abstract:

For a coarse gaze estimation system to be useful, many people must be tracked simultaneously in real time, in the presence of frequent occlusions and other distractions such as animals or vehicles. Two tracking systems were developed, both based on two important image measurements. The first measurement is the output of a head detector trained with Dalal and Triggs' HOG detection algorithm, which has become standard for pedestrian detection. Although HOG detection is generally slow, efficient GPU implementations have made it suitable for real-time use. The second measurement comes from sparse KLT tracking; despite its age, KLT corner tracking still provides an impressive amount of information for very little processing time.

The first tracking system was built around a Kalman filter; however, this proved susceptible to data-association errors when the HOG detector failed. The second, more recent approach uses Markov-Chain Monte-Carlo Data Association (MCMCDA) with an accurate error model. MCMCDA not only resolves ambiguities more efficiently, but also allows the tracking system to cope with temporary occlusions.
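The Kalman-filter side of the pipeline can be sketched generically. The following is a minimal one-dimensional constant-velocity filter in plain Python, not the authors' implementation (which tracks 2-D head positions and couples the filter with data association); the noise values `q` and `r` are assumed for illustration:

```python
# Minimal 1-D constant-velocity Kalman filter (illustrative sketch only;
# the dataset's tracker is 2-D and pairs filtering with data association).

def kalman_step(x, v, p, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update cycle.
    x, v : current position and velocity estimates
    p    : scalar position variance (simplified; a full filter keeps a covariance matrix)
    z    : noisy position measurement (e.g. a HOG head detection)
    q, r : process and measurement noise variances (assumed values)
    """
    # Predict: constant-velocity motion model
    x_pred = x + v * dt
    p_pred = p + q
    # Update: blend the prediction with the measurement via the Kalman gain
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    v_new = v + k * (z - x_pred) / dt  # crude velocity correction
    p_new = (1.0 - k) * p_pred
    return x_new, v_new, p_new

# Track a point moving at roughly 2 px/frame from noisy measurements
x, v, p = 0.0, 0.0, 1.0
for z in [2.1, 3.9, 6.2, 8.0, 10.1]:
    x, v, p = kalman_step(x, v, p, z)
```

When the detector fails for several frames, the predict step keeps extrapolating while the variance `p` grows, which is exactly where data-association errors creep in and why the MCMCDA formulation was adopted.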

The 'Town Centre' dataset was used to test tracking performance in both the CVPR 2011 and the BMVC 2009 papers.

TownCentreXVID.avi (342MB) - The video file

TownCentre-calibration.ci (<1K) - The camera calibration data. This is in a human-readable format. The ground plane is at z=0 in the world coordinates.

TownCentre-groundtruth.top (5.3MB) - The hand labelled ground truth data. See below for a description of the 'top' file format. Note that the full body regions were estimated based on the head regions using the camera calibration with approximate human dimensions, so may be inaccurate.
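The full-body regions above were derived from the head regions via the camera calibration. The exact layout of the calibration file is not described here, but the standard mechanism is a 3×4 projection matrix mapping world points (with the ground plane at z = 0) to pixels; a sketch with a made-up example matrix:

```python
# Sketch: projecting a world point on the ground plane (z = 0) into the
# image with a 3x4 camera projection matrix P. The calibration file's
# exact layout is not specified here, so P below is a made-up example.

def project(P, X, Y, Z=0.0):
    """Homogeneous pinhole projection: [u, v, w]^T = P @ [X, Y, Z, 1]^T."""
    u = P[0][0]*X + P[0][1]*Y + P[0][2]*Z + P[0][3]
    v = P[1][0]*X + P[1][1]*Y + P[1][2]*Z + P[1][3]
    w = P[2][0]*X + P[2][1]*Y + P[2][2]*Z + P[2][3]
    return u / w, v / w  # pixel coordinates

# Purely illustrative matrix (focal length 1000 px, principal point 960,540)
P = [[1000.0, 0.0, 960.0, 0.0],
     [0.0, 1000.0, 540.0, 0.0],
     [0.0, 0.0, 1.0, 5.0]]

px, py = project(P, 1.0, 2.0)  # a point on the ground plane (z = 0)
```

Projecting a head position down to the ground plane and back up with approximate human dimensions gives the estimated body boxes, which is why the listing warns they may be inaccurate.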

Tracker Output file format

The ground truth and tracking output are provided in the '.top' file format. This consists of rows in comma-separated variable (CSV) format:

personNumber, frameNumber, headValid, bodyValid, headLeft, headTop, headRight, headBottom, bodyLeft, bodyTop, bodyRight, bodyBottom

personNumber - A unique identifier for the individual person
frameNumber - The frame number (counted from 0)
headValid - 1 if the head region is valid, 0 otherwise
bodyValid - 1 if the body region is valid, 0 otherwise
headLeft, headTop, headRight, headBottom - The head bounding box in pixels
bodyLeft, bodyTop, bodyRight, bodyBottom - The body bounding box in pixels

For the purposes of tracking evaluation, full body regions were considered matched if they overlapped by at least 50%, and head regions were required to overlap by at least 25%.
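The format above is plain CSV, so parsing takes only a few lines. The sketch below (helper names are ours, not part of the dataset's tooling) reads '.top' rows into dicts and computes box overlap as intersection-over-union, which is one common reading of the 50%/25% criterion:

```python
import csv
from io import StringIO

# Field order as given in the '.top' format description
FIELDS = ["personNumber", "frameNumber", "headValid", "bodyValid",
          "headLeft", "headTop", "headRight", "headBottom",
          "bodyLeft", "bodyTop", "bodyRight", "bodyBottom"]

def parse_top(text):
    """Parse '.top' CSV rows into dicts keyed by the field names above."""
    rows = []
    for rec in csv.reader(StringIO(text)):
        if not rec:
            continue
        rows.append(dict(zip(FIELDS, (float(v) for v in rec))))
    return rows

def overlap(a, b):
    """Intersection-over-union of two (left, top, right, bottom) boxes
    (one common reading of the evaluation's 'overlap' criterion)."""
    il, it = max(a[0], b[0]), max(a[1], b[1])
    ir, ib = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ir - il) * max(0.0, ib - it)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

# Hypothetical example row in the documented field order
sample = "7,0,1,1,434.0,130.0,470.0,176.0,420.0,120.0,490.0,300.0\n"
row = parse_top(sample)[0]
head = (row["headLeft"], row["headTop"], row["headRight"], row["headBottom"])
```

A detection would then count as a head match if `overlap(head, detection) >= 0.25`, and a body match at `>= 0.5`, under the IoU interpretation assumed here.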

Papers:

  • Colour Invariant Head Pose Classification in Low Resolution Video
  • Guiding Visual Surveillance by Tracking Human Attention
  • Stable Multi-Target Tracking in Real-Time Surveillance Video



URL: https://web.archive.org/web/20190714174044/http://www.robots.ox.ac.uk:80/ActiveVision/Research/Projects/2009bbenfold_headpose/project.html
License: 

No license specified; the work may be protected by copyright.


Dataset Details
None
Dataset Metadata
None
Concept Hierarchy
Domain scenario: unspecified
Domain problem: unspecified
Domain application: unspecified
Application case: unspecified