Voice control is attracting considerable interest, particularly in smart homes, where it can support assistance, health and comfort. Since 2001, the GETALP team's work in this field has relied on a continuous back-and-forth between data collection, research, application development and experimental evaluation. From 2001 to 2019, this work has shown the need to tackle the hard problems of this application domain, such as continuous adaptation to the user, handling several speakers, operating in noisy conditions, and evaluating in ecological (real-life) settings. The team's approach relied on numerous experiments, which made it possible to record corpora that have been made available to the community. The work and evaluations carried out in smart homes show the need for a broad approach that treats the language act not only as linguistic information but also as situated information.
Keywords: smart home, automatic speech recognition, natural language understanding, human-computer interaction.
Michel L. Vacher; François Portet
Michel L. Vacher; François Portet. La commande vocale en habitat intelligent : 15 ans d’expérience dans l’équipe GETALP. Revue Ouverte d'Intelligence Artificielle, Volume 4 (2023) no. 1, pp. 77-105. doi : 10.5802/roia.51. https://roia.centre-mersenne.org/articles/10.5802/roia.51/