Dataset: AVSpeech – Audio-Visual Speech Dataset | 聚数力 (Dataju) Platform | Hosting and Trading Platform for Big Data Application Assets

AVSpeech – Audio-Visual Speech Dataset

1379 views. Published by dataju on 2021-08-17.

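This content was contributed voluntarily by a user. The 聚数力 (Dataju) platform only provides the venue for sharing, trading, and hosting information used in big data applications. If this content involves your privacy or may infringe copyright, please notify us and we will remove it promptly.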

Dataset Overview

AVSpeech is a new, large-scale audio-visual dataset comprising speech video clips with no interfering background noise. The segments are 3-10 seconds long, and in each clip the audible sound in the soundtrack belongs to a single speaking person who is visible in the video. In total, the dataset contains roughly 4700 hours of video segments drawn from 290k YouTube videos, spanning a wide variety of people, languages, and face poses. For more details on how we created the dataset, see our paper.
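The figures quoted above imply a rough scale for the dataset. The following is a back-of-the-envelope sketch in Python that uses only the numbers stated in the overview (about 4700 hours, 290k source videos, segments of 3-10 seconds). The variable names are illustrative, and the results are bounds derived from those rounded figures, not official clip counts.

```python
# Rough scale estimate from the overview's stated (rounded) figures.
TOTAL_HOURS = 4700            # "roughly 4700 hours of video segments"
NUM_VIDEOS = 290_000          # "290k YouTube videos"
MIN_CLIP_S, MAX_CLIP_S = 3, 10  # "segments are 3-10 seconds long"

total_seconds = TOTAL_HOURS * 3600

# If every segment had the maximum length, the segment count is lowest;
# if every segment had the minimum length, it is highest.
min_clips = total_seconds // MAX_CLIP_S
max_clips = total_seconds // MIN_CLIP_S

# Average amount of retained footage per source video.
avg_seconds_per_video = total_seconds / NUM_VIDEOS

print(f"segments: between {min_clips:,} and {max_clips:,}")
print(f"average retained footage per video: {avg_seconds_per_video:.1f} s")
```

Under these assumptions the dataset holds somewhere between about 1.7 and 5.6 million segments, with a little under a minute of usable speech kept per source video on average.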

Dataset Details

None available

Dataset Metadata

None available

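Concept Hierarchy

Domain scenario:	Not specified
Domain problem:	Not specified
Domain application:	Not specified
Application case:	Not specified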