672 Hours of Multi-party Conference Multi-channel Recorded Speech Data
Meeting
The 672-hour Multi-person Meeting Multi-channel Speech Dataset covers meeting scenarios with 3-6 participants, collected in various conference room environments that mirror real-world meeting interactions. Recordings are transcribed with text content, speaker ID, gender, location, and other attributes. The dataset achieves a sentence accuracy rate of at least 97% and provides high-quality resources for speech recognition and speaker recognition research and applications. Quality has been tested by various AI companies.
This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.
Specifications
Far-field 16-microphone array
48kHz, 16bit, wav, 16 channels;
Far-field 8-microphone array
48kHz, 16bit, wav, 8 channels;
Far-field high-fidelity microphone
48kHz, 16bit, wav, mono channel;
Near-field mobile phone
16kHz, 16bit, wav, mono channel.
Recording Environment
Conference rooms of four different sizes, with three different rooms for each size.
Recording content
Simulated real meeting scenarios;
Demographics
984 Chinese participants;
Annotation
Individual sentences are extracted and annotated with their start and end timestamps, speaker identification, and spoken text content;
Device
16-microphone array, 8-microphone array, high-fidelity microphone, mobile phone;
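As a rough illustration of how the audio formats listed above might be checked in practice, the following Python sketch reads a WAV header with the standard-library wave module and compares it to the listed sample rate, bit depth, and channel count. It also shows how the audio for one annotated sentence could be cut out using its start and end timestamps. The device keys, the function names, and the assumption that timestamps are given in seconds are all hypothetical; the dataset's actual file layout and annotation format are not described here.

```python
import wave

# Expected formats copied from the specification list. The dictionary keys are
# illustrative names only, not identifiers used by the dataset itself.
EXPECTED_FORMATS = {
    "far_field_16_mic_array": {"sample_rate": 48000, "sample_width": 2, "channels": 16},
    "far_field_8_mic_array": {"sample_rate": 48000, "sample_width": 2, "channels": 8},
    "far_field_hifi_mic": {"sample_rate": 48000, "sample_width": 2, "channels": 1},
    "near_field_mobile_phone": {"sample_rate": 16000, "sample_width": 2, "channels": 1},
}


def read_wav_format(path):
    """Return sample rate (Hz), sample width (bytes), and channel count."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "sample_width": wav.getsampwidth(),  # 2 bytes == 16 bit
            "channels": wav.getnchannels(),
        }


def matches_device(path, device):
    """Check whether a WAV file has the format listed for the given device."""
    return read_wav_format(path) == EXPECTED_FORMATS[device]


def read_segment(path, start_s, end_s):
    """Return the raw PCM bytes between two timestamps.

    Assumes the timestamps are expressed in seconds from the start of the file.
    """
    if end_s < start_s:
        raise ValueError("end_s must not be earlier than start_s")
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        start_frame = int(start_s * rate)
        frame_count = int(end_s * rate) - start_frame
        wav.setpos(start_frame)  # seek to the first frame of the sentence
        return wav.readframes(frame_count)
```

For a 16kHz, 16bit, mono recording, read_segment(path, 0.25, 0.5) would return 4000 frames, or 8000 bytes. Multi-channel files are sometimes stored with an extensible WAV header, which older versions of the wave module may reject; in that case a third-party audio reader would be needed instead.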