Search - dct sound

[Other resource] SpeakerRecogntionBasedOnVQandGMM DL : 0: This paper studies several basic theoretical problems of lip-movement-based identity recognition and implements a complete system. As the experimental basis, we built a lip-movement identity recognition database (HITLUDB), which currently contains audio-visual recordings of 30 speakers, each uttering 20 Chinese words; work on expanding and refining the database is ongoing. For lip detection, we improved the adaptive chroma filtering model, making the algorithm more robust and achieving precise lip localization. Combining the respective strengths of the DCT and the K-L transform, we propose a feature extraction algorithm that captures the main information of the lip region with relatively few feature dimensions. Because lip movement carries both physiological and behavioral characteristics, we use hybrid static and dynamic modeling to describe each speaker's individual lip-movement traits precisely. For HMM training, we propose a feature normalization method that improves HMM performance in practical use. Finally, we describe the basic theory of identity identification and identity verification systems and complete the practical implementation of the system. Keywords: identity recognition, lip movement, feature extraction, hidden Markov model, K-L transform
Date : 2008-10-13 Size : 4.98mb User : 李岚熙
[Other] SpeakerRecogntionBasedOnVQandGMM DL : 0: This paper studies several basic theoretical problems of lip-movement-based identity recognition and implements a complete system. As the experimental basis, we built a lip-movement identity recognition database (HITLUDB), which currently contains audio-visual recordings of 30 speakers, each uttering 20 Chinese words; work on expanding and refining the database is ongoing. For lip detection, we improved the adaptive chroma filtering model, making the algorithm more robust and achieving precise lip localization. Combining the respective strengths of the DCT and the K-L transform, we propose a feature extraction algorithm that captures the main information of the lip region with relatively few feature dimensions. Because lip movement carries both physiological and behavioral characteristics, we use hybrid static and dynamic modeling to describe each speaker's individual lip-movement traits precisely. For HMM training, we propose a feature normalization method that improves HMM performance in practical use. Finally, we describe the basic theory of identity identification and identity verification systems and complete the practical implementation of the system. Keywords: identity recognition, lip movement, feature extraction, hidden Markov model, K-L transform
Date : 2025-10-25 Size : 4.98mb User : QHLee
[VHDL-FPGA-Verilog] DCT DL : 0: DCT encoding and decoding implemented in Verilog, with an accompanying description of the DCT.
Date : 2025-10-25 Size : 64kb User : 周韧研
[Multimedia Develop] fmodapi375ce_small DL : 0: FMOD supports MP2, MP3 and Ogg Vorbis in this release. For MPEG audio you can reduce CPU usage by using FSOUND_MPEGHALFRATE as a flag in FSOUND_Stream_OpenFile. This does an inverse DCT at only half the rate, producing, for example, a 22 kHz output for a 44 kHz stream. The difference is only slightly noticeable, as the 22 kHz output is re-interpolated back to the output rate, and if the output rate (from FSOUND_Init) is 22 kHz anyway, it shouldn't sound any different. The benefit is roughly a 20% speed increase.
Date : 2025-10-25 Size : 1.42mb User : Jewlong
[Graph program] watemarkr DL : 1: A DCT-based digital audio watermarking scheme: the MP3 music file is converted to a WAV file, the watermark information is embedded in the WAV file, and the WAV file is then converted back to an MP3 music file.
Date : 2025-10-25 Size : 34kb User : 帅俊
[Compress-Decompress algrithms] HSDCTcode DL : 0: DCT-domain transform compression coding of heart sound signals, using Lloyd optimal quantization and Huffman coding. Frames are segmented using the ECG R wave. A compression ratio above 6.0 is achieved with distortion below about 1%.
Date : 2025-10-25 Size : 781kb User : Sun