Details of a Researcher

SAKO Shinji

写真a

Affiliation Department	情報工学科　メディア情報分野工学専攻　メディア情報プログラム Center for Research on Assistive Technology for Building a New Community
Title	Associate Professor
Other name(s) (e.g. nickname)	Sako Shinji
Contact information
Homepage	http://sakoweb.net/en
External Link

To the head of this page.▲

Degree

Ph.D. (Engineering) ( 2004.03 Nagoya Institute of Technology )

To the head of this page.▲

Research Interests

Music Signal Processing
Music Information Processing
Sign Language Recognition
Singing Voice Synthesis
Speech Synthesis

To the head of this page.▲

Research Areas

Life Science / Rehabilitation science
Informatics / Kansei informatics
Informatics / Perceptual information processing

To the head of this page.▲

From School

Nagoya Institute of Technology Faculty of Engineering Department of Intelligent Information Systems Graduated

1995.04 - 1999.03

　 More details

Country：Japan

To the head of this page.▲

From Graduate School

Nagoya Institute of Technology Graduate School, Division of Engineering Department of Electrical & Computer Engineering Doctor's Course Completed

2001.04 - 2004.03

　 More details

Country：Japan

To the head of this page.▲

External Career

Advanced Telecommunications Research Institute International

2003.04 - 2003.06

　 More details

Country：Japan
The University of Tokyo Graduate School of Information Science and Technology Research Assistant

2004.04 - 2007.03

　 More details

Country：Japan
AGH University of Science and Technology Faculty of Computer Science, Electronics and Telecommunications Guest Scientists

2014.07 - 2014.08

　 More details

Country：Poland
Technical University Munich Institute for Human-Machine Communication Guest Scientists

2012.06 - 2012.12

　 More details

Country：Germany
Technical University of Munich Institute for Human-Machine Communication JSPS Scientist for Joint lntemational Research

2016.07 - 2017.03

　 More details

Country：Japan

To the head of this page.▲

Professional Memberships

Japanese Association of Sign Linguistics

2010.06
Human Interface Society

2010.06
電気関係学会東海支部連合大会実行委員会

2009.04 - 2009.12
高度言語情報融合フォーラム

2008.07
The Institute of Image Information and Television Enginerrs

2007.10

display all >>

To the head of this page.▲

Qualification Acquired

Software Design & Development Engineer/Information Processing Engineer, Class 1

To the head of this page.▲

Papers

A study on visualization of music for scores and performances based on Chironomie Reviewed International journal

Shinji Sako, Kana Tatsumi, Ramirez Rafael

Proc. of 15th International Workshop on Machine Learning and Music 2024.09

　More details

Authorship：Lead author,　Corresponding author Language：English Publishing type：Research paper (international conference proceedings)

The aim of this study is to enhance the enjoyment of music for deaf and hard of hearing, and normal hearing people through the visual representation of music. To depict musical rhythm effectively and clearly, we focus on Chironomie, a conducting technique used in Gregorian chant. In general, Chironomie is represented by a curve that corresponds to the musical score, and this curve is determined by whether a short segment of the score represents one of two classes: Arsis or Thesis. In pursuit of our goal, our efforts encompass two essential facets: adapting Chironomie to Western tonal music to express intuitively perceivable musical features such as tension and relaxation, and evaluating whether Chironomie can effectively convey music visually. We present an automated method for estimating Arsis and Thesis within compound beats to draw Chironomie from both score and performance data.
Dynamic Hand Gesture Recognition for Human-Robot Collaborative Assembly Reviewed International journal

Bogdan Kwolek, Shinji Sako

ICAISC 2023: Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science 14125 112 - 121 2023.06

　More details

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

In this work, we propose a novel framework for gesture recognition for human-robot collaborative assembly. It permits recognition of dynamic hand gestures and their duration to automate planning the assembly or common human-robot workspaces according to Methods-Time-Measurement recommendations. In the proposed approach the common workspace of a worker and Franka-Emika robot is observed by an overhead RGB camera. A spatio-temporal graph convolutional neural network operating on 3D hand joints extracted by MediaPipe is used to recognize hand motions in manual assembly tasks. It predicts five motion sequences: grasp, move, position, release, and reach. We present experimental results of gesture recognition achieved by a spatio-temporal graph convolutional neural network on real RGB image sequences.

DOI： 10.1007/978-3-031-42505-9_10
3D Ego-Pose Lift-Up Robustness Study for Fisheye Camera Perturbations Reviewed International journal

Teppei Miura, Shinji Sako, Tsutomu Kimura

Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 4 600 - 606 2023.02

　More details

Language：English Publishing type：Research paper (international conference proceedings)

3D egocentric human pose estimations from a mounted fisheye camera have been developed following the advances in convolutional neural networks and synthetic data generations. The camera captures different images that are affected by the optical properties, the mounted position, and the camera perturbations caused by body motion. Therefore, data collecting and model training are main challenges to estimate 3D ego-pose from a mounted fisheye camera. Past works proposed synthetic data generations and two-step estimation model that consisted of 2D human pose estimation and subsequent 3D lift-up to overcome the tasks. However, the works insufficiently verify robustness for the camera perturbations. In this paper, we evaluate existing models for robustness using a synthetic dataset with the camera perturbations that increases in several steps. Our study provides useful knowledges to introduce 3D ego-pose estimation for a mounted fisheye camera in practical.

DOI： 10.5220/0011661000003417
Visualization of Affective Information in Music Using Chironomie Reviewed International journal

2022.09

　More details

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)
Simple yet effective 3D ego-pose lift-up based on vector and distance for a mounted omnidirectional camera Reviewed International journal

Teppei Miura, Shinji Sako

Applied Intelligence 2022.05

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Springer

Following the advances in convolutional neural networks and synthetic data generation, 3D egocentric body pose estimations from a mounted fisheye camera have been developed. Previous works estimated 3D joint positions from raw image pixels and intermediate supervision during the process. The mounted fisheye camera captures notably different images that are affected by the optical properties of the lens, angle of views, and setup positions. Therefore, 3D ego-pose estimation from a mounted fisheye camera must be trained for each set of camera optics and setup. We propose a 3D ego-pose estimation from a single mounted omnidirectional camera that captures the entire circumference by back-to-back dual fisheye cameras. The omnidirectional camera can capture the user’s body in the 360∘ field of view under a wide variety of motions. We also propose a simple feed-forward network model to estimate 3D joint positions from 2D joint locations. The lift-up model can be used in real time yet obtains accuracy comparable to those of previous works on our new dataset. Moreover, our model is trainable with the ground truth 3D joint positions and the unit vectors toward the 3D joint positions, which are easily generated from existing publicly available 3D mocap datasets. This advantage alleviates the data collection and training burden due to changes in the camera optics and setups, although it is limited to the effect after the 2D joint location estimation.

DOI： 10.1007/s10489-022-03417-3
3D skeleton motion generation of double bass from musical score Reviewed International journal

Takeru Shirai, Shinji Sako

15th International Symposium on Computer Music Multidisciplinary Research (CMMR) 41 - 46 2021.11

　More details

Language：English Publishing type：Research paper (international conference proceedings)

In this study, we propose a method for generating 3D skeleton motions of a double bass player from musical score information using a 2-layer LSTM network. Since there is no suitable dataset for this study, we have created a new motion dataset with actual double bass performance. The contribution of this paper is to show the effect of combining bowing and fingering information in the generation of performance motion, and to examine the effective model structure in performance generation. Both objective and subjective evaluations showed that the accuracy of generating performance motion for double bass can be improved using two types of additional information (bowing, fingering information) and improved by constructing a model that takes into account bowing and fingering.
SynSLaG: Synthetic Sign Language Generator Reviewed International journal

Teppei Miura, Shinji Sako

ASSETS '21: The 23rd International ACM SIGACCESS Conference on Computers and Accessibility ( 90 ) 1 - 4 2021.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Association for Computing Machinery

Machine learning techniques have the potential to play an important role in sign language recognition. However, sign language datasets lack the volume and variety necessary to work well. To enlarge these datasets, we introduce SynSLaG, a tool that synthetically generates sign language datasets from 3D motion capture data. SynSLaG generates realistic images of various body shapes with ground truth 2D/3D poses, depth maps, body-part segmentations, optical flows, and surface normals. The large synthetic datasets provide possibilities for advancing sign language recognition and analysis.

DOI： 10.1145/3441852.3476519
Recognition of JSL fingerspelling using Deep Convolutional Neural Networks Reviewed International journal

Bogdan Kwolek, Wojciech Baczynski, Shinji Sako

Neurocomputing 2021.06

　More details

Language：English Publishing type：Research paper (scientific journal)

In this paper, we present approach for recognition of static fingerspelling in Japanese Sign Language on RGB images. Two 3D articulated hand models have been developed to generate synthetic fingerspellings and to extend a dataset consisting of real hand gestures.In the first approach, advanced graphics techniques were employed to rasterize photorealistic gestures using a skinned hand model. In the second approach, gestures rendered using simpler lighting techniques were post-processed by a modified Generative Adversarial Network. In order to avoid generation of unrealistic fingerspellings a hand segmentation term has been added to the loss function of the GAN. The segmentation of the hand in images with complex background was done by proposed ResNet34-based segmentation network. The finger-spelled signs were recognized by an ensemble with both fine-tuned and trained from scratch neural networks. Experimental results demonstrate that owing to sufficient amount of training data a high recognition rate can be attained on RGB images. The JSL dataset with pixel-level hand segmentations is available for download.

DOI： 10.1016/j.neucom.2021.03.133
Fingerspelling recognition using synthetic images and deep transfer learning Reviewed

Nguyen Tu Nam, Shinji Sako, Bogdan Kwolek

2020 The 13th International Conference on Machine Vision (ICMV 2020) 2020.11

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Although gesture recognition has been intensely studied for decades, it is still a challenging research topic due to difficulties posed by background complexity, occlusion, viewpoint, lighting changes, the deformable and articulated nature of hands, etc. Numerous studies have shown that extending the training dataset with real images about synthetic images improves the recognition accuracy. However, little work is devoted to demonstrate what improvements in recognition can be achieved thanks to transferring the style onto synthetically generated images from the real gestures. In this paper, we propose a novel method for Japanese fingerspelling recognition using both real and synthetic images generated on the basis of a 3D hand model. We propose to employ a neural style transfer to include information from real images onto synthetically generated dataset. We demonstrate experimentally that neural style transfer and discriminative layer training applied to training deep neural models allow obtaining considerable gains in the recognition accuracy.
Study on Effective Combination of Features for Non-word Speech Recognition of Phonological Examination Reviewed

Toshiharu Tadano,Masahiko Nawate,Fumihito Ito,Shinji Sako

IPSJ Journal 61 ( 10 ) 1647 - 1657 2020.10

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：Information Processing Society of Japan

Developmental dyslexia is a main element of learning disability and its early detection is very important for intervention and reading treatment. A convenient screening test using PC has been published and the answer times in text reading, reversed reading of word and mora skip of word are automatically recorded in the test. However, the correctness determination must be done by tester. In order to automate those test, a speech recognition technology corresponding to a non-word that are non-meaningful words used in an examination is necessary, but in conventional speech recognition, recognition precision for non-words is low. Therefore, while reinforcing the function of conventional speech recognition, the accuracy for non-words to be improved to a level that can be practically used for phoneme examination. In this study, we have tried to improve the accuracy for non-words by incorporating a mechanism to determine non-word correctness into Julius, which is in the public domain and can be modified freely. In addition, six candidates are given as feature quantities of speech, and the trend of the accuracy by the combination is examined. As a result, depending on the target non-word, the accuracy was 75.0% to 95.0%, and the overall average value was 87.5%.

display all >>

To the head of this page.▲

Books and Other Publications

しゃべるヒト　ことばの不思議を科学する

菊澤律子・吉岡乾編著　ほか（ Role： Contributor , 言語認識装置の進化）

図書出版文理閣 2023.04 （ ISBN:9784892599248 ）
Speech Communication and People with Disabilities

Akira Ichikawa, Yuji Nagashima, Akira Okamoto, Naoto Kato, Shinji Sako, Tetsuya Takiguchi, Daisuke Hara, Michiru Makuuchi（ Role： Joint author , Chapter 2 Speech and Communication Disorders）

Corona Publishing 2021.07 （ ISBN:9784339013429 ）

　More details

Total pages：242 Language：jpn Book type：Scholarly book

Other Link： https://www.amazon.co.jp/%E9%9F%B3%E5%A3%B0%E3%82%B3%E3%83%9F%E3%83%A5%E3%83%8B%E3%82%B1%E3%83%BC%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%A8%E9%9A%9C%E3%81%8C%E3%81%84%E8%80%85-%E9%9F%B3%E9%9F%BF%E3%82%B5%E3%82%A4%E3%82%A8%E3%83%B3%E3%82%B9%E3%82%B7%E3%83%AA%E3%83%BC%E3%82%BA-22-%E5%B8%82%E5%B7%9D-%E7%86%B9/dp/4339013420/ref=sr_1_2?__mk_ja_JP=%E3%82%AB%E3%82%BF%E3%82%AB%E3%83%8A&dchild=1&keywords=%E9%9F%B3%E5%A3%B0%E3%82%B3%E3%83%9F%E3%83%A5%E3%83%8B%E3%82%B1%E3%83%BC%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%A8%E9%9A%9C%E3%81%8C%E3%81%84%E8%80%85&qid=1625213798&sr=8-2

To the head of this page.▲

Misc

Envisioning the Future: A Human Communication Research Perspective Invited

Sumaru Niida, Tomoyasu Komori, Shinji Sako, Akihiro Tanaka, Kiyohiko Nunokawa

Te Journal of Institute of Electronics, Information and Communication Engineers 107 ( 3 ) 237 - 243 2024.03

　More details

Authorship：Last author Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (scientific journal) Publisher：Institute of Electronics, Information and Communication Engineers

Other Link： https://www.journal.ieice.org/bin/pdf_link.php?fname=k107_3_237&lang=J&year=2024
Accessibility Guidelines for Papers and Presentations towards Realizing Inclusive Society Invited

Kiyohiko Nunokawa, Daisuke Wakatsuki, Shinji Sako

Te Journal of Institute of Electronics, Information and Communication Engineers 106 ( 12 ) 1108 - 1114 2023.12

　More details

Authorship：Last author Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (scientific journal) Publisher：Institute of Electronics, Information and Communication Engineers

In FY2023, the Accessibility Guidelines for Writing and Presenting Papers was revised to ver. 4.0. This paper introduces the background of the revision and explains the relationship between the Guidelines and the social background of the revision, namely, the Law on Elimination of Discrimination against Persons with Disabilities, which makes it mandatory to provide reasonable accommodation for persons with disabilities when they participate in academic societies and research groups.

Other Link： https://www.journal.ieice.org/bin/pdf_link.php?fname=k106_12_1108&lang=J&year=2023
ICF and Accessibility Guidelines for Papers and Presentations Invited

Kiyohiko Nunokawa, Daisuke Wakatsuki, Shinji Sako

Te Journal of Institute of Electronics, Information and Communication Engineers 106 ( 12 ) 1115 - 1119 2023.12

　More details

Authorship：Last author Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (scientific journal) Publisher：Institute of Electronics, Information and Communication Engineers

The Accessibility Guidelines for Writing and Publishing Papers was revised to ver 4.0 in FY2023. This paper introduces the International Classification of Functional Factors (ICF), a world-standard view of disability that considers disability as a negative aspect of life function, and explains its relationship to the guidelines.

Other Link： https://www.journal.ieice.org/bin/pdf_link.php?fname=k106_12_1115&lang=J&year=2023
HMM-based Automatic Sign Language Recognition using Phonemic Structure of Japanese Sign Language

Shinji Sako, Tadashi Kitamura

Journa of The Japan Society for Welfare Engineering 17 ( 2 ) 2 - 7 2015.11

　More details

Authorship：Lead author Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (international conference proceedings) Publisher：Japan Society for Welfare Engineering

CiNii Articles
Speech/Sound based Human Interfaces (1) Construction of Speech Synthesis Systems using HTS Reviewed

Keiichiro Oura, Heiga Zen, Shinji Sako, Keiichi Tokuda

Human interface 12 ( 1 ) 35 - 40 2010.02

　More details

Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (international conference proceedings) Publisher：Human interface Society

CiNii Articles
特集　音楽とOR―日本語歌詞からの自動作曲 Invited

嵯峨山茂樹,中妻啓,深山覚,酒向慎司,西本卓也

Operations research as a management science 54 ( 9 ) 546 - 553 2009.10

　More details

Language：Japanese Publishing type：Article, review, commentary, editorial, etc. (scientific journal) Publisher：日本オペレーションズ・リサーチ学会

本稿では,任意の日本語テキストの持つ韻律に基づき,歌唱曲を自動作曲する手法について解説する.文学作品や自作の詩,ニュースやメールなど,あらゆる日本語テキストをそのまま歌詞として旋律を生成し,歌唱曲として出力する自動作曲システムは,手軽な作曲のツール,音楽の専門知識を持たない人のための作曲補助ツールとして有用であろう.さらに著作権問題の回避としても用途があろう.歌唱曲は歌詞との関連性が求められる.特に高低アクセントを持つ日本語では,発話音声にピッチの高低が付くため,歌詞を朗読する際の韻律と旋律が一致することが重要とされる.筆者らはこの点に着目し,ユーザが選択した和声,リズム,伴奏音形を拘束条件として,旋律を音高間を遷移する経路とし,韻律の上下動の制限の下で最適経路となる旋律を動的計画法により探索する問題として旋律設計を捉えた.このモデルに基づき,任意の日本語歌詞に,その韻律に一致した旋律を付ける自動作曲手法により自動作曲システムOrpheusを作成したので紹介する.

CiNii Articles

To the head of this page.▲

Presentations

Study on the Personalization of Sign Language Using Video Data of Japanese Sign Language

Zixuan, Dai, Shinji Sako

IEICE Human Communication Group. HCG Symposium 2024 2024.12 Institute of Electronics, Information and Communication Engineers

　More details

Event date： 2024.12

Language：English Presentation type：Oral presentation (general)

Venue：THE KANAZAWA THEATRE (Kanazawa) Country：Japan

In recent years, there have been growing expectations for technology to automatically recognize signs and generate sign language CG. It is believed that individuality is expressed in sign languages as in spoken languages, but not many studies have been conducted on the individuality of sign languages. In this study, we examined whether the individuality of sign movement can be analyzed for video data of Japanese Sign Language, referring to previous studies on the analysis of individuality and anonymity of signs in motion-capture data of French Sign Language. Preliminary results using 2D data have demonstrated that kinematic features contain sufficient information to identify individual signers, supporting our hypothesis. In addition, we extended the analysis by incorporating pose estimation with 3D data for further validation.
Song review generation using acoustic information and lyrics International conference

25nd International Society for Music Information Retrieval Conference 2024.11 International Society for Music Information Retrieval

　More details

Event date： 2024.11

Language：English Presentation type：Poster presentation

Venue：San Francisco Country：United States
愛知芸大芸術講座メディア・クロス・トーク Invited

酒向慎司

愛知パーカッション・フェア 2024 ～「共鳴～Kyo-mei」が繋ぐさまざまなパーカッションの世界～ 2024.11 愛知県立芸術大学社会連携センター

　More details

Event date： 2024.11

Presentation type：Symposium, workshop panel (nominated)

Venue：愛知県立芸術大学芸術資料館地下演習室 Country：Japan
一人称視点映像によるボディトラッキング技術と手話認識への応用 Invited

酒向慎司

ろう者・難聴者がイキイキと働ける環境を目指して! 「スマートグラスやデジタルセンシングを使ったコミュニケーションを体験しよう」 2024.11 特定非営利活動法人ウェアラブルコンピュータ研究開発機構

　More details

Event date： 2024.11

Presentation type：Oral presentation (invited, special)

Venue：QUINTBRIDGE（NTT西日本） Country：Japan
Detection and analysis of breathing during sign language interaction using MoCap data

Shinji Sako, Kentaro Kasama

IEICE 126th Technical Committee on Well-being Information Technology (WIT) 2024.10 Institute of Electronics, Information and Communication Engineers

　More details

Event date： 2024.10

Language：Japanese Presentation type：Oral presentation (general)

Venue：Teikyo University Utsunomiya Campus Country：Japan

The objective of this study was to investigate a method for estimating the state of breathing during sign language, which is considered to be related to the temporal structure (rhythm) of sign language, from motion-capture data of sign language. It is challenging to obtain accurate measurements of motion capture data without inadvertently influencing the sign language itself. In this study, we sought to determine whether it is feasible to quantify the respiratory state by analysing the changes in chest expansion from three-dimensional motion data of signed gestures that have been precisely measured. We estimated respiration using three-dimensional data from an existing sign language database (KoSign) and validated the results.
音響情報と歌詞を用いた楽曲のレビュー文生成

川地奎多, 酒向慎司

情報処理学会第141回音楽情報科学研究会 2024.08 情報処理学会

　More details

Event date： 2024.08

Language：Japanese Presentation type：Oral presentation (general)

Venue：駒澤大学駒澤キャンパス

近年，音楽配信サービスの普及により，楽曲へのアクセス性が大幅に向上した．その一方で，音楽の聴取スタイルは受動的かつ BGM として消費する傾向が強まり，深く鑑賞する機会が減少しているのではないかと感じている．そこで，本研究では音楽を言語化して説明することがリスナーの音楽理解を助け，音楽体験の価値を向上させる手段の 1 つであると考えた．音楽の言語化は，音楽キャプションタスク（音楽に関する情報を自然言語の文章形式で記述するタスク）として近年盛んに研究されている．従来の研究では音響情報のみを用いて，楽曲に関する説明文を生成することに焦点が置かれていた．そこで，本研究では音響情報に加えて歌詞にも着目し，楽曲のレビュー文を生成することに試みた．具体的には音楽特徴抽出器と大規模言語モデル（LLM）を用いて音楽記述を生成する MU-LLaMA をベースラインモデルとし，LLaMA に事前に指示を与えるシステムプロンプトを設計することで，歌詞も考慮したレビュー文生成を実現した．さらに，3 つの評価実験を通じて，提案手法が従来手法よりも表現の多様性や楽曲のイメージ形成に有効であることを確認した．
A study of mousing detection in Japanese Sign Language video for annotation support

Kana Tatsumi, Shinji Sako

IEICE 125th Technical Committee on Well-being Information Technology (WIT) 2024.08 Institute of Electronics, Information and Communication Engineers

　More details

Event date： 2024.08

Language：Japanese Presentation type：Oral presentation (general)

Venue：Future University Hakodate

We studied the detection of mouthing, a type of mouth movement in Japanese Sign Language, to support sign language annotation. There is a need to develop a larger and more versatile corpus of Japanese Sign Language. However, annotation is difficult due to the complexity of sign language expressions. Therefore, we would like to automate the annotation of signs so that the corpus can be developed more efficiently. In the development of BOBSL, they used a technology that recognizes mouth shapes for automatic annotation. In this study, we use existing machine lip-reading technology to recognize mouth shapes in Japanese Sign Language. It then detects mouth movements, called mouthing, by matching the recognition results with mouthing words identified from the translated text. We also collected videos of Japanese Sign Language to create our dataset. At that time, we investigated the expression patterns of mousing and found variations. Finally, we considered improvement suggestions of the proposed method for future research.
Visualization of music corresponding to the score and performance based on Chironomie

Shinji Sako, Kana Tatsumi

Visualisation Information Symposium 2024 2024.07 The Visualization Society of Japan

　More details

Event date： 2024.07

Language：Japanese Presentation type：Oral presentation (general)

Venue：Okinawa Industry Support Center Country：Japan

To increase the opportunities for deaf and hard of hearing people to enjoy music, it is necessary to present music in a form that is accessible to them. This study focuses on the sense of sight as a common sense for those who have hearing difficulties but no visual difficulties and attempts to visualize music. Various methods have been proposed to visualize music, including color and graphics. In this study, we aim to visualize music through a clear and intuitive visual representation corresponding to a musical score or performance using chironomy, which includes the physical raising and lowering of the feet and hands, and the spatial up and down meanings.
Estimation of lighting color, brightness, and movement based on music audio signals to support stage lighting and its evaluation

Nano GATTO, Shinji Sako

IPSJ 140th Special Interest Group on MUSic and computer (SIGMUS) 2024.05 Information Processing Society of Japan

　More details

Event date： 2024.05

Language：Japanese Presentation type：Oral presentation (general)

Venue：Nihon University, College of Humanities and Sciences
Exploring Individuality in Sign Language using Japanese Sign Language Video Data

Zixuan DAI, Shinji SAKO

Technical Committee on Human Communication Science 2024.03 Institute of Electronics, Information and Communication Engineers

　More details

Event date： 2024.05

Language：English Presentation type：Oral presentation (general)

Venue：Okinawa Industry Support Center

In recent years, there have been growing expectations for technology to automatically recognize signs and generate sign language CG. It is believed that individuality is expressed in sign languages as in spoken languages, but not many studies have been conducted on the individuality of sign languages. In this study, we examined whether the individuality of sign movement can be analyzed for video data of Japanese Sign Language, referring to previous studies on the analysis of individuality and anonymity of signs in motion-capture data of French Sign Language.

display all >>

To the head of this page.▲

Industrial Property Rights

単語決定システム

青井基行,赤津舞子,三浦七瀬,酒向慎司

　More details

Applicant：株式会社ユニオンソフトウェアマネイジメント,国立大学法人名古屋工業大学

Application no：特願2018-048022 Date applied：2018.03

Country of applicant：Domestic Country of acquisition：Domestic
飲酒状態判定装置及び飲酒状態判定方法

岩田英三郎, 酒向慎司

　More details

Application no：PCT/JP2010/062776 Date applied：2010.07

Announcement no：特開2011-553634 Date announced：2012.06

Country of applicant：Domestic Country of acquisition：Domestic

本発明は、キーワードのような特定の言葉の利用を前提としない飲酒判定を可能とするものである。飲酒モデルは、飲酒者の音声の音響特徴による分類基準を用いた木構造を有する。この木構造におけるノードは、飲酒者の音素における音響特徴を示す。非飲酒モデルは、非飲酒者の音声の音響特徴による分類基準を用いた木構造を有する。この木構造におけるノードは、非飲酒者の音素における音響特徴を示す。まず、対象者の音声データを、飲酒モデルと非飲酒モデルのそれぞれの木構造に適用して、音素の音響特徴をノードに振り分ける。つぎに、対象者の音素の音響特徴と、各モデルにおける各ノードで特定された音響特徴との尤度を計算する。つぎに、算出された尤度の値を用いて、当該音声の音響特徴が、飲酒モデル及び非飲酒モデルのうちのどちらに近いかを判別する。

J-GLOBAL
音声合成方法及び装置

嵯峨山茂樹, 槐武也, 酒向慎司, 松本恭輔, 西本卓也

　More details

Application no：特願2005-304082 Date applied：2005.10

Announcement no：特開2007-114355 Date announced：2007.05

Country of applicant：Domestic Country of acquisition：Domestic

【課題】高品質の合成音声を提供すると共に、加工性に優れた音声合成手法を提供する。【解決手段】音声のスペクトル包絡を混合ガウス分布関数で近似することで少数のパラメータによって音声スペクトルを表現して分析パラメータを得る。そして、この混合ガウス分布関数の逆フーリエ変換であるGabor関数の重ね合わせを基本波形とし、それをピッチ周期ごとに配置して有声音を合成する。ピッチ周期をランダムにすれば無声音も合成できる。

J-GLOBAL
音声認識装置及びコンピュータプログラム

山口辰彦, 酒向慎司, 山本博史, 菊井玄一郎

　More details

Applicant：株式会社国際電気通信基礎技術研究所

Application no：特願2003-317559 Date applied：2003.09

Announcement no：特開2005-84436 Date announced：2005.03

Country of applicant：Domestic Country of acquisition：Domestic

課題】あるモデルによる音声認識の誤りを、他のモデルによる音声認識結果で置換する際に、最終的な音声認識の精度を高める。【解決手段】音声認識装置は、N−グラムモデルを用いて音声認識を行ない、N−グラム候補44及び信頼度尺度を出力する音声認識部40、音声認識部40からのN−グラム候補44に対し、正誤を判別するように最適化された予備判別部46、予備判別部46が誤りと判定した箇所について、用例文モデルを用いて音声認識を行ない、用例文候補52と信頼度を算出する用例候補選択部50、N−グラム候補44を用例文候補52で置換するか否かを判別し最終の音声認識結果28を出力する最終判別部54とを含み、予備判別部46は、学習により得られた判別基準より多くの誤りを検出するようにバイアスした判別基準を用いて判別する。

J-GLOBAL

To the head of this page.▲

Works

論文作成・発表アクセシビリティガイドライン（Ver.4.0）

井上正之, 苅田知則, 今野順, 坂本隆, 酒向慎司, 塩野目剛亮, 布川清彦, 南谷和範, 宮城愛美, 若月大輔

2023.04

　More details

Work type：Educational material

電子情報通信学会ヒューマンコミュニケーショングループ（HCG）は，福祉情報工学研究会（WIT）を中心に，2005年に障害のある人が学会や研究会などの研究活動に参加できることを目指して，聴覚障害，視覚障害のある方への情報保障を中心とした学会や研究会参加におけるバリア（手話通訳がない，資料が点字化されていないなど）を無くすための論文作成，プレゼンテーション資料作成，および発表時の情報保障に関するアクセシビリティガイドラインを公開しました．2005年の公開以降，社会も技術も大きく変化してきています．この変化に対応するため，今回，大幅な改訂を行いました．

2001年12月の国連総会で「障害者の権利及び尊厳を保護・促進するための包括的・総合的な国際条約」に関する決議案が採択されました．日本は2007年にこの条約に署名し，2012年に「障害者の日常生活及び社会生活を総合的に支援する法律」，2013年に「障害を理由とする差別の解消の推進に関する法律（障害者差別解消法）」と「障害者の雇用の促進等に関する法律」を改正して法的な整備を整え，2014年に日本国内で条約が発効されました．また，2001年に世界保健機関（WHO）が採択した国際障害分類（ICF）における構成要素間の相互作用を示す図では，障害はある特性を持った人とその人を取り巻く環境との関係から生じることを示しています．障害者差別解消法では，会社やお店，学会や研究会といった事業者に対して障害のある人に「合理的配慮」を提供するように求めています．合理的配慮の提供は環境側にあるバリアを無くすことを意味しています．アクセシビリティガイドラインは，この合理的配慮に関わるものです．そして，環境側にあるバリアを技術の面から無くすことはWITの使命です．

WITや他のHCGの先輩方のアクセシビリティガイドライン作成は，共生社会の実現・多様性への対応を先取りするものでした．この取り組みを継続し，アップデートしていくことは，後に続く者の仕事です．「全ての人が自分の志した道にチャレンジできる」WITは，これからもそのような社会を支える研究活動を続けて行きます．学会や研究会で利用されていたプレゼンテーションの方法は，現在では学会や研究会だけではなく，広く社会で利用されるようになってきています．このガイドラインを多くの学会・研究会や会社，お店などでお使いいただき，ご意見，コメント，改善案などをお寄せいただけないでしょうか．全ての人に情報を届けることができるように，皆様と一緒に改良を続けて行きたいと思います．
Kogakuin University Japanese Sign Language Multi-Dimensional Database (2nd Term)

Yuji Nagashima, Daisuke Hara, Yasuo Horiuchi, Shinji Sako

2022.10

　More details

Work type：Database science

The purpose of this dataset is to create a general-purpose sign language database that can be used in a variety of research fields. The data set contains as much high-definition and high-accuracy data as possible for more than 6,000 sign words and a few dialogues selected by the project. The subjects were two native Japanese signers (one male and one female) from native sign language families, and the filming was conducted at the motion capture studio of Toei Tokyo Studio from 2017 to 2019. In addition to sign language video data (original MXF and mp4 formats) from 4K or full HD cameras installed in front and left/right, 3D motion data (BVH, C3D, and FBX formats) from optical motion capture and depth data from Kinect sensors (Kinect v2 xef format of Kinect v2) are also included. As the second phase, 1,172 words and 7 dialogs are provided.
National Museum of Ethnology, Homō loquēns 'talking human' Wonders of Language and Languages

Yuji Nagashima, Daisuke Hara, Yasuo Horiuchi, Shinji Sako

2022.09 - 2022.11

　More details

Work type：Database science Location：National Museum of Ethnology

A technical exhibit introducing the high-precision sign language database KoSign was presented at the National Museum of Ethnology's special exhibition Homō loquēns "Talking Humans" Wonders of Language and Languages. The hand movements and facial expressions of sign language can be recorded as digital data precisely by using motion capturing system. The huge amount of data recorded of thousands of Japanese Sign Languages used in daily life enables the analysis of the sign language and the expression of the sign language by avatars.
Kogakuin University Japanese Sign Language Multi-Dimensional Database

Yuji Nagashima, Daisuke Hara, Yasuo Horiuchi, Shinji Sako

2021.06

　More details

Work type：Database science

The purpose of this dataset is to create a general-purpose sign language database that can be used in a variety of research fields. The data set contains as much high-definition and high-accuracy data as possible for more than 6,000 sign words and a few dialogues selected by the project. The subjects were two native Japanese signers (one male and one female) from native sign language families, and the filming was conducted at the motion capture studio of Toei Tokyo Studio from 2017 to 2019. In addition to sign language video data (original MXF and mp4 formats) from 4K or full HD cameras installed in front and left/right, 3D motion data (BVH, C3D, and FBX formats) from optical motion capture and depth data from Kinect sensors (Kinect v2 xef format of Kinect v2) are also included. The first phase of the project initially provide 3,701 words, 3 dialogues, and a dedicated analysis tool (Drawing and Annotation Support System). The total data size is approximately 3.6 TB.
NIT-3DHP-OMNI

Teppei Miura, Shinji Sako

2020.08

　More details

Work type：Database science

The dataset comprises of 7 subjects, covering the 16 sentences with 3-4 times per subject.
Archived dataset size is 1.52 GB.

The dataset-tree is comprised such as below:
NIT-3DHP-OMNI
+ A (personal ID for paper)
| + 011001001 (personal ID & sentence & times for each 3 digit)
| | + input
| | | + 0000000001.jpg (RGB image)
| | | + 0000000002.jpg
| | | + ...
| | |
| | + target
| | + 0000000001.txt (3D joint positions)
| | + 0000000002.txt
| | + ...
| |
| + 011001002 ...
|
+ B ...

The target text holds 3D joint positions data such as below order:
-------------------
Time Stamp
Head
Neck
Torso
Waist
Left Shoulder
Right Shoulder
Left Elbow
Right Elbow
Left Wrist
Right Wrist
Left Hand
Right Hand
-------------------
Pressivo: 旋律の演奏表情を考慮した自動伴奏生成システム

宮田佳奈, 酒向慎司, 北村正

2014.02

　More details

Work type：Software Location：インタラクション2014
A stochastic model of artistic deviation and its musical score for the elucidation of performance expression

K. Okumura,S. Sako,T. Kitamura

2013.08

　More details

Work type：Software Location：Stockholm, Sweden

http://smac2013.renconmusic.org/
Ryry: Automatic Accompaniment System Capable of Polyphonic Instruments

Ryuichi Yamamoto,Shinj Sako,Tadashi Kitamura

2013.03

　More details

Work type：Software
音楽印象データベース

酒向慎司,岩月靖典,西尾圭一郎,北村正

2013.03

　More details

Work type：Software
自動作曲システム Orpheus

嵯峨山茂樹,他

2013.01

　More details

Work type：Software

display all >>

To the head of this page.▲

Other research activities

研究用マルチモーダル音声データベース M2TINIT

2003.03

　More details

研究用マルチモーダル音声データベース M2TINIT (Multi-Modal Speech Database by Tokyo Institute of Technology and Nagoya Institute of Technology) は、マルチモーダル音声研究の推進のため、東京工業大学大学院院総合理工学研究科小林隆夫研究室および名古屋工業大学知能情報システム学科北村・徳田研究室が開発・公開する音声・唇動画像同時収録データベースです。これまでに音声・唇動画像の生成やバイモーダル音声認識の研究に利用されています。

To the head of this page.▲

Awards

Best Presentation Award: Best New Direction Category

2024.08 Information Processing Society of Japan Dynamics Restoration for "Loud" Popular Music

Keita Kawachi, Shinji Sako

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan

In recent years, the proliferation of music distribution services has markedly enhanced the accessibility of music. Conversely, music listening is often a passive activity, consumed as background music. This has led to a perceived decline in opportunities for deep appreciation of music. Accordingly, this study posits that verbalizing and explaining music is an effective method to facilitate listeners' comprehension of music and enhance the value of their musical experience. The verbalization of music has been the subject of considerable recent study as a music captioning task, defined as a task to describe information about music in the form of natural language sentences. Prior research has concentrated on the generation of musical descriptions based solely on acoustic data. In this study, we sought to generate review sentences of songs by focusing on lyrics in addition to acoustic information. Specifically, MU-LLaMA, which generates music descriptions using a musical feature extractor and a large-scale language model (LLM), was used as the baseline model. Additionally, review text generation that also takes lyrics into account was achieved by designing system prompts that provide instructions to LLaMA in advance. Furthermore, through three evaluation experiments, it was confirmed that the proposed method is more effective than conventional methods in terms of diversity of expression and image formation of music.
WIT Student Research Award

2023.12 The Institute of Electronics, Information and Communication Engineers, Well-begin Infromation Technology Proposal for Music Visualization Method Using Chironomie for Enhancing Musical Experience of the Hearing Impaired

Kana Tatsumi, Shinji Sako

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan

The aim of this study is to enable both hearing-impaired and normal-hearing to enjoy music together by visualizing it using Chironomie which is Gregorian chant conducting. To achieve this goal, the tasks include adapting Chironomie to Western music to express intuitively perceivable musical features like tension and relaxation. Additionally, evaluating whether Chironomie can effectively convey music visually. This report focuses on the investigation of a method for automatically estimating Arsis and Thesis for complex beats to generate Chironomie, and the presentation of the results of an evaluation experiment involving normal-hearing to assess the utility of Chironomie.
Best Presentation Award, The Tokai Chapter of Acoustical Society of Japan

2023.12 The Tokai Chapter of Acoustical Society of Japan A study on music visualization based on Chironomie

Kana Tatsumi

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
第27回東海地区音声関連研究室修士論文中間発表会総合3位

2023.08 静岡大学 Chironomieに準ずる旋律線による音楽の可視化

辰巳花菜

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
Best Presentation Award, The Tokai Chapter of Acoustical Society of Japan

2021.12 The Tokai Chapter of Acoustical Society of Japan Dynamics Restoration for "Loud" Popular Music

Hyuga Ozeki

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
Student Encouraging Award

2021.09 Information Processing Society of Japan Dynamics Restoration for "Loud" Popular Music

Hyuga Ozeki, Shinji Sako

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
Japan Society for Fuzzy Theory and Intelligent Informatics Best Paper Award

2017.09 Japan Society for Fuzzy Theory and Intelligent Informatics Automatic Performance Rendering Method for Keyboard Instruments based on Statistical Model that Associates Performance Expression and Musical Notation

Kenta Okumura, Shinji Sako, Tadashi Kitamura

　More details

Award type：Honored in official journal of a scientific society, scientific journal Country：Japan

This paper proposes a method for the automatic rendition of performances without losing any characteristics of the specific performer. In many of existing methods, users are required to input expertise such as possessed by the performer. Although they are useful in support of users'own performances, they are not suitable for the purpose of this proposal. The proposed method defines a model that associates the feature quantities of expression extracted from the case of actual performance with its directions that can be surely retrieved from musical score without using expertise. By classifying expressive tendency of the expression of the model for each case of performance using the criteria based on score directions, the rules that elucidate the causal relationship between the performer's specific performance expression and the score directions systematically can be structured. The candidates of the performance cases corresponding to the unseen score directions is obtained by tracing this structure. Dynamic programming is applied to solve the problem of searching the sequence of performance cases with the optimal expression from among these candidates. Objective evaluations indicated that the proposed method is able to efficiently render optimal performances. From subjective evaluations, the quality of rendered expression by the proposed method was confirmed. It was also shown that the characteristics of the performer could be reproduced even in various compositions. Furthermore, performances rendered via the proposed method have won the first prize in the autonomous section of a performance rendering contest for computer systems.
IPSJ Yamashita SIG Research Award

2016.03 Information Processing Society of Japan A study of comparative analysis of music performances based on the statistical model that associates expression and notation

Kenta Okumura, Shinji Sako, Tadashi Kitamura

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
78th National Convention of IPSJ, Student Encouragement Award

2016.03 Information Processing Society of Japan A case-based approach to the melody transformation for automatic jazz arrangement

Naoto Sato, Shinji Sako, Tadashi Kitamura

　More details

Award type：Award from Japanese society, conference, symposium, etc. Country：Japan
学会活動貢献賞

2014.03 日本音響学会東海支部

酒向慎司

　More details

Country：Japan

display all >>

To the head of this page.▲

Scientific Research Funds Acquisition Results

一人称視点映像を用いた手話対話の支援技術および記録技術基盤の構築

Grant number：23K11197 2023.04 - 2026.03

日本学術振興会科学研究費補助金基盤研究(C)

酒向慎司

　 More details

Authorship：Principal investigator Grant type：Competitive

我々は深層学習を用いた手話翻訳システムを開発しているが，これを実現するには手話認識や意味解析などが必要である．それらには手話コーパスの構築や教師あり学習による深層学習向けのラベル付きデータが大量に必要であるが，ラベル付けには手間がかかる．そこで本研究ではラベルがない手話動画に対して，ラベル付けを半自動的に行うシステムを開発・公開する．本研究では，このシステムを用いて作成したラベル付き手話データセットを手話言語学研究者や手話工学研究者らに提供し，手話の意味解析や手話認識に関する研究をサポートする．
自己教師あり学習手法による手話認識エンジンの開発

Grant number：23747929 2023.04 - 2025.03

日本学術振興会科学研究費補助金挑戦的萌芽研究

木村勉（研究代表）

　 More details

Authorship：Coinvestigator(s) Grant type：Competitive
手話コーパス，深層学習向けラベル付き手話データ半自動生成システムの開発

Grant number：22H00661 2022.04 - 2026.03

日本学術振興会科学研究費補助金基盤研究(B)

木村勉

　 More details

Authorship：Coinvestigator(s) Grant type：Competitive

我々は深層学習を用いた手話翻訳システムを開発しているが，これを実現するには手話認識や意味解析などが必要である．それらには手話コーパスの構築や教師あり学習による深層学習向けのラベル付きデータが大量に必要であるが，ラベル付けには手間がかかる．そこで本研究ではラベルがない手話動画に対して，ラベル付けを半自動的に行うシステムを開発・公開する．本研究では，このシステムを用いて作成したラベル付き手話データセットを手話言語学研究者や手話工学研究者らに提供し，手話の意味解析や手話認識に関する研究をサポートする．
音声認識手法を応用した自動作曲・自動作詞・自動伴奏の研究

Grant number：21H03462 2021.04 - 2024.03

日本学術振興会科学研究費補助金基盤研究(B)

嵯峨山茂樹

　 More details

Authorship：Coinvestigator(s) Grant type：Competitive

我々は深層学習を用いた手話翻訳システムを開発しているが，これを実現するには手話認識や意味解析などが必要である．それらには手話コーパスの構築や教師あり学習による深層学習向けのラベル付きデータが大量に必要であるが，ラベル付けには手間がかかる．そこで本研究ではラベルがない手話動画に対して，ラベル付けを半自動的に行うシステムを開発・公開する．本研究では，このシステムを用いて作成したラベル付き手話データセットを手話言語学研究者や手話工学研究者らに提供し，手話の意味解析や手話認識に関する研究をサポートする．
視覚障害者が能動的に白杖で叩くことによる音情報の作製と利用に関する基礎的研究

Grant number：18K18698 2018.04 - 2022.03

日本学術振興会科学研究費補助金挑戦的萌芽研究

布川清彦

　 More details

Authorship：Coinvestigator(s) Grant type：Competitive

display all >>

To the head of this page.▲

Past of Commissioned Research

繊維産業に於けるＡＩ自動検査システムの構築に関する研究開発

2022.10 - 2025.03

愛知県知の拠点あいち重点研究プロジェクトプロジェクトDX General Consignment Study

　 More details

Authorship：Coinvestigator(s) Grant type：Collaborative (industry/university)

本課題では繊維産業の自動化のために、画像処理を用いた繊維の検品工程の自動化と、音響処理技術を用いた織機の異常検知の自動
化を目指す。繊維産業を含む全ての製造産業において、製品のチェックを行う検品工程は、製品の信頼性を担保するため重要であ
る。しかし、繊維産業における検品はほぼ全て熟練者による目視で行われており、自動化による効率化を妨げている。また、製造機械
のメンテナンスも同様に製品の信頼性向上に不可欠であるが、こちらの故障検知についても同様に人の経験に基づくところが大きい。
そこで本課題では、繊維を観測した画像を画像処理技術により解析することで、検品を自動化する方法を目指す。同様に織機が発する
音を音響処理技術により解析することで、織機の異常を検知する方法の確立を目指す。以上のように、本課題ではAIに基づく画像処
理・音響処理技術を利用することで、繊維産業における検査工程を自動化することを目指す。
手話の自動翻訳を実現させる高精度な動作検出と動作のパターンマッチングの技術開発

2016.10 - 2019.03

経済産業省戦略的基盤技術高度化支援事業（サポイン） General Consignment Study

青井基行

　 More details

Authorship：Coinvestigator(s) Grant type：Competitive
心地よく人間に合わせる自動演奏システムの研究

2015.01 - 2015.12

科学技術振興機構研究成果最適展開支援事業（A-STEP）FSステージ General Consignment Study

酒向慎司

　 More details

Authorship：Principal investigator Grant type：Competitive

Grant amount：\2210000 （ Direct Cost: \1700000 、 Indirect Cost：\510000 ）

本研究では、自動演奏システムにおいて重要な要素技術である、演奏追跡技術の高精度化と、演奏追跡技術を応用した人間の演奏に同期するロボットの開発を行った。演奏追跡技術では、楽譜の情報を活用することで、テンポ変動を把握しやすい打楽器音とそれ以外の楽器種別を考慮した新たな演奏追跡モデルを提案し、演奏追跡精度の改善を確認した。演奏に追従するロボットの開発では、テンポ変動を含んだ演奏情報にリアルタイムで追従しロボットを制御するシステムを産業ロボットメーカーと共同で開発し、国際ロボット展に出展し実演した。
多様な利用形態に柔軟に対応する自動伴奏リハビリ支援システムの開発

2013.08 - 2014.03

科学技術振興機構研究成果最適展開支援事業（A-STEP）FSステージ General Consignment Study

酒向慎司

　 More details

Authorship：Principal investigator Grant type：Competitive

Grant amount：\2210000 （ Direct Cost: \1700000 、 Indirect Cost：\510000 ）

楽器の演奏は趣味として楽しむだけでなく、複雑な身体動作を伴うことから身体機能や脳機能のリハビリとしても期待できる。楽器演奏によるリハビリ支援で重要なポイントは、支援の度合いが人それぞれであり、利用者の要望や制約に柔軟に対処できることが重要となる。利用者を問わない楽器演奏によるリハビリ支援システムの構築を念頭に、楽器の違いに頑健なスペクトルテンプレートの自動適応手法の検討、テンポ推定精度の高度化を検討するほか、実際の演奏におけるテンポ推定誤りの影響などを調査した。また、計算量と性能の関係を調査するとともに、実時間処理に向けたアルゴリズムの改善を行った。
ユーザーの嗜好と利用シーンの変動に対応可能な統計モデルに基づいた楽曲からの感性推定モデルの研究

2011.08 - 2012.03

科学技術振興機構研究成果最適展開支援事業（A-STEP）FSステージ General Consignment Study

酒向慎司

　 More details

Authorship：Principal investigator Grant type：Competitive

Grant amount：\2210000 （ Direct Cost: \1700000 、 Indirect Cost：\510000 ）

音楽から受ける印象を楽曲の電子データから直接推定する印象推定システムにおいて、個人の嗜好や感性の違いに対応するため、性別や音楽経験などからなるプロフィールを利用する新たな手法を開発した。この手法の特徴として、印象推定モデルを学習するための音楽を聴いたときの印象データを事前に収集する必要がなく、他者の印象推定モデルから、特定の利用者に合った(類似した)モデルをプロフィールの情報に基づいて自動選択することができる。また、音楽を聴いた際の印象データを短期間で効率的に収集するため、Webブラウザを利用した楽曲提示と印象データ収集システムを構築し、様々な年代を含む120名の大規模な印象評価データを収集した。

To the head of this page.▲

Teaching Experience

大学院工学研究科博士前期課程　数理情報特論

2023.04 Institution：Nagoya Institute of Technology

　More details

Level：Postgraduate
第二部電子情報工学科　計算機基礎

2022.04 - 2024.03 Institution：Nagoya Institute of Technology

　More details

Level：Undergraduate (specialized)
情報学専攻研究科共通科目情報資源総論

2021.06 Institution：Shizuoka University

　More details

Level：Graduate (liberal arts)
大学院工学研究科博士前期課程　数理情報特論

2020.04 - 2021.03 Institution：Nagoya Institute of Technology

　More details

Level：Postgraduate
第二部電子情報工学科　計算機工学

2018.04 - 2021.03 Institution：Nagoya Institute of Technology

　More details

Level：Undergraduate (specialized)

display all >>

To the head of this page.▲

Committee Memberships

情報処理学会 FIT2025 第25回情報科学技術フォーラム研究会担当委員・プログラム委員

2024.11

　 More details

Committee type：Academic society
Information Processing Society of Japan Editorial Board Member

2024.11

　 More details

Committee type：Academic society
電子情報通信学会ヒューマンコミュニケーションシンポジウム2024運営委員

2024.06

　 More details

Committee type：Academic society
情報処理学会 FIT2024 第24回情報科学技術フォーラム研究会担当委員・プログラム委員

2023.11 - 2024.09

　 More details

Committee type：Academic society
日本音響学会第150回秋季研究発表会実行委員

2023.05 - 2023.09

　 More details

Committee type：Academic society
The Institute of Electronics, Information and Communication Engineers IEICE Well-being Information Technology (WIT), Vice Chairman

2023.04

　 More details

Committee type：Academic society
電子情報通信学会ヒューマンコミュニケーションシンポジウム2023運営委員

2023.04 - 2023.12

　 More details

Committee type：Academic society
電子情報通信学会 FIT2023 第22回情報科学技術フォーラム研究会担当委員・プログラム委員

2023.01 - 2023.09

　 More details

Committee type：Academic society
一般社団法人手話言語等の多文化共生社会協議会代議員

2022.10

　 More details

Committee type：Other
電子情報通信学会ヒューマンコミュニケーションシンポジウム2022運営委員

2022.10 - 2022.12

　 More details

Committee type：Academic society

display all >>

To the head of this page.▲

Social Activities

生産現場での動作音の異常検知・予知技術開発

Role(s)： Lecturer

尾張繊維技術センターオンライン (Zoom) 2022.10

　More details

Audience： Researchesrs

Type：Visiting lecture

To the head of this page.▲

Details of a Researcher

Personnel Information

Research Activity

Education Activity

Contribution to Society

Degree 【 display / non-display 】

Degree

Research Interests 【 display / non-display 】

Research Interests

Research Areas 【 display / non-display 】

Research Areas

From School 【 display / non-display 】

From School

From Graduate School 【 display / non-display 】

From Graduate School

External Career 【 display / non-display 】

External Career

Professional Memberships 【 display / non-display 】

Professional Memberships

Qualification Acquired 【 display / non-display 】

Qualification Acquired

Papers 【 display / non-display 】

Papers

Books and Other Publications 【 display / non-display 】

Books and Other Publications

Misc 【 display / non-display 】

Misc

Presentations 【 display / non-display 】

Presentations

Industrial Property Rights 【 display / non-display 】

Industrial Property Rights

Works 【 display / non-display 】

Works

Other research activities 【 display / non-display 】

Other research activities

Awards 【 display / non-display 】

Awards

Scientific Research Funds Acquisition Results 【 display / non-display 】

Scientific Research Funds Acquisition Results

Past of Commissioned Research 【 display / non-display 】

Past of Commissioned Research

Teaching Experience 【 display / non-display 】

Teaching Experience

Committee Memberships 【 display / non-display 】

Committee Memberships

Social Activities 【 display / non-display 】

Social Activities