Details of a Researcher

Papers - LEE Akinobu


A study of a pruning method for word-specific nodes based on word confidence in large-vocabulary continuous speech recognition

小林大晃, 伊藤直晃, 李晃伸

Proceedings of the 2014 Spring Meeting of the Acoustical Society of Japan 2014.03

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

Response selection based on hypothesis generation that prioritizes neighborhood word of keywords in statistical spoken dialogue system

3-Q5-13 - 224 2014.03

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

CiNii Articles

researchmap

Low-latency speech interface using sequential early determination of hypotheses based on conditional random fields

伊神陽介, 李晃伸, 徳田恵一, 南角吉彦

Proceedings of the 2014 Spring Meeting of the Acoustical Society of Japan 2-4-7 2014.03

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap

A study of a concise dialogue description method based on finite-state transducers for user-generated spoken dialogue content

船谷内泰斗, 大浦圭一郎, 南角吉彦, 李晃伸, 徳田恵一

Proceedings of the Acoustical Society Meeting 223 - 224 2013.09

Language：Japanese Publishing type：Research paper (scientific journal)

researchmap

MMDAGENT - A FULLY OPEN-SOURCE TOOLKIT FOR VOICE INTERACTION SYSTEMS Reviewed International journal

Akinobu Lee, Keiichiro Oura, Keiichi Tokuda

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 8382 - 8385 2013.05

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

This paper describes the development of an open-source toolkit that makes it possible to explore a wide variety of aspects of speech interaction in spoken dialog systems and speech interfaces. The toolkit tightly incorporates recent speech recognition and synthesis technologies with a 3-D CG rendering module that can manipulate expressive embodied agent characters. The software and its interfaces are carefully designed to make it a fully open toolkit. Ongoing public demonstration experiments indicate that it is promoting related research and development of voice interaction systems in various settings.

DOI： 10.1109/ICASSP.2013.6639300

Web of Science

researchmap

Development of "Smart Mei-chan", a spoken dialogue 3D agent that runs standalone on a smartphone

山本大介, 大浦圭一郎, 李晃伸　他

Information Processing Society of Japan, Interaction 675 - 680 2013.03

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap

Development, installation, and operation of a user-participatory, interactive voice guidance digital signage system: a case study Invited

徳田恵一, 大浦圭一郎, 李晃伸, 山本大介, 打矢隆弘, 内匠逸

Proceedings of the 2013 Spring Meeting of the Acoustical Society of Japan 119 - 122 2013.03

Language：Japanese Publishing type：Research paper (scientific journal)

researchmap

On-Campus, User-Participatable, and Voice-Interactive Digital Signage(<Special Issue>Practical Issues of Spoken Dialogue Systems) Reviewed

Keiichiro Oura, Daisuke Yamamoto, Ichi Takumi, Akinobu Lee, Keiichi Tokuda

28 ( 1 ) 60 - 67 2013.01

Language：Japanese Publishing type：Research paper (scientific journal)

CiNii Articles

CiNii Books

researchmap

Other Link： http://id.nii.ac.jp/1004/00008160/

Technical Advances of Speech-Oriented Guidance System "Takemaru-kun" by 10 Years of Long-Term Operation(<Special Issue>Practical Issues of Spoken Dialogue Systems) Reviewed

Ryuichi Nishimura, Sunao Hara, Hiromichi Kawanami, Akinobu Lee, Kiyohiro Shikano

Journal of the Japanese Society for Artificial Intelligence 28 ( 1 ) 52 - 59 2013.01

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jjsai.28.1_52

CiNii Articles

CiNii Books

researchmap

Other Link： http://id.nii.ac.jp/1004/00008159/

Automatic estimation of character with respect to driver sociality

神沼充伸, 西崎友規子, ブエ・ステファン, 南角吉彦, 李晃伸

Human Interface 2012 Proceedings 2012.09

Language：Japanese Publishing type：Research paper (other academic)

researchmap

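Statistical spoken dialogue system based on tight coupling of the speech recognition and response selection modules using registered keywords and a general-purpose language model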

平野隆司, 加藤杏樹, 南角吉彦, 李晃伸, 徳田恵一

2012 Information Processing Society of Japan 2012-SLP-92 ( 3 ) 1 - 6 2012.07

Language：Japanese Publishing type：Research paper (scientific journal)

researchmap

On-campus event registration system for interactive voice digital signage

山本大介, 大浦圭一郎, 李晃伸, 打矢隆弘, 内匠逸, 徳田恵一, 松尾啓志

University ICT Promotion Council, FY2011 Annual Conference 2011.12

Language：Japanese Publishing type：Research paper (other academic)

researchmap

MMDAgent: an open-source toolkit for building attractive voice interaction systems

李晃伸, 大浦圭一郎, 徳田恵一

Technical Report of IEICE 1 - 6 2011.12

Language：Japanese Publishing type：Research paper (other academic)

researchmap

Speech recognition based on statistical models including multiple phonetic decision trees Reviewed

Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Acoustical Science and Technology 32 ( 6 ) 236 - 243 2011.11

Language：English Publishing type：Research paper (scientific journal)

Evaluation of a low-latency sequential hypothesis determination algorithm for continuous speech recognition

大野博之, 南角吉彦, 李晃伸, 徳田恵一

Proceedings of the 2011 Autumn Meeting of the Acoustical Society of Japan 45 - 46 2011.09

Language：Japanese Publishing type：Research paper (other academic)

researchmap

Evaluation of Tree-Trellis Based Decoding on Over-Million LVCSR Reviewed

Naoaki Ito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Proc. ISCA Interspeech2011 1937 - 1940 2011.08

Language：English Publishing type：Research paper (international conference proceedings)

researchmap

Bayesian Context Clustering Using Cross Validation for Speech Recognition Reviewed

Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E94-D ( 3 ) 668 - 678 2011.03

Language：English Publishing type：Research paper (scientific journal) Publisher：IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG

This paper proposes Bayesian context clustering using cross validation for hidden Markov model (HMM) based speech recognition. The Bayesian approach is a statistical technique for estimating reliable predictive distributions by treating model parameters as random variables. The variational Bayesian method, which is widely used as an efficient approximation of the Bayesian approach, has been applied to HMM-based speech recognition, and it shows good performance. Moreover, the Bayesian approach can select an appropriate model structure while taking account of the amount of training data. Since prior distributions which represent prior information about model parameters affect estimation of the posterior distributions and selection of model structure (e.g., decision tree based context clustering), the determination of prior distributions is an important problem. However, it has not been thoroughly investigated in speech recognition, and the determination technique of prior distributions has not performed well. The proposed method can determine reliable prior distributions without any tuning parameters and select an appropriate model structure while taking account of the amount of training data. Continuous phoneme recognition experiments show that the proposed method achieved a higher performance than the conventional methods.

DOI： 10.1587/transinf.E94.D.668

Web of Science

researchmap

Evaluation of Tree-trellis based Decoding in Over-million LVCSR Reviewed

Naoaki Ito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 1948 - 1951 2011

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA-INT SPEECH COMMUNICATION ASSOC

Very large vocabulary continuous speech recognition (CSR) that can recognize every sentence is one of the important goals in speech recognition. Several attempts have been made to achieve very large vocabulary CSR. However, very large vocabulary CSR using a tree-trellis based decoder has not been reported. We report the performance evaluation and improvement of the "Julius" tree-trellis based decoder in large vocabulary CSR (LVCSR) involving a vocabulary of more than one million words, referred to here as over-million LVCSR. Experiments indicated that Julius achieved a word accuracy of about 91% and a real time factor of about 2 in over-million LVCSR for Japanese newspaper speech transcription.

Web of Science

researchmap

Speech recognition based on statistical models including multiple phonetic decision trees Reviewed

Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Acoustical Science and Technology 32 ( 6 ) 236 - 243 2011

Language：English Publishing type：Research paper (scientific journal)

We propose a speech recognition technique using multiple model structures. In the use of context-dependent models, decision-tree-based context clustering is applied to find an appropriate parameter tying structure. However, context clustering is usually performed on the basis of unreliable statistics of hidden Markov model (HMM) state sequences because the estimation of reliable state sequences requires appropriate model structures, which cannot be obtained prior to context clustering. Therefore, context clustering and the estimation of state sequences essentially cannot be performed independently. To overcome this problem, we propose an optimization technique of state sequences based on an annealing process using multiple decision trees. In this technique, a new likelihood function is defined in order to treat multiple model structures, and the deterministic annealing expectation maximization algorithm is used as the training algorithm. Experimental continuous phoneme recognition results show that the proposed method of using only two decision trees achieved about an 11.1% relative error reduction over the conventional method. © 2011 The Acoustical Society of Japan.

DOI： 10.1250/ast.32.236

Scopus

researchmap

Voice activity detection based on conditional random fields using multiple features (co-authored) Reviewed

Akira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda

Proc. Conference of the International Speech Communication Association (INTERSPEECH) 2086 - 2089 2010.09

Language：English Publishing type：Research paper (international conference proceedings)
