The 8th International Summer Workshop on Multimodal Interfaces
eNTERFACE 2012
July 2nd - July 27th, 2012

Projects and teams

This page lists all the project proposals to which applicants may apply. As a shortcut, you can directly access the PDF of each proposal below:

  • P1 - Speech, gaze and gesturing - multimodal conversational interaction with Nao robot PDF proposal,
  • P2 - Laugh Machine PDF proposal,
  • P3 - Human motion recognition based on videos PDF proposal,
  • P5 - M2M - Socially Aware Many-to-Machine Communication PDF proposal,
  • P6 - Is this guitar talking or what!? PDF proposal,
  • P7 - CITYGATE, The multimodal cooperative intercity Window PDF proposal,
  • P8 - Active Speech Modifications PDF proposal,
  • P10 - ArmBand: Inverse Reinforcement Learning for a BCI-driven robotic arm control PDF proposal.

The following projects have been discarded due to a lack of participants:

  • P4 - Wave Front Synthesis Holophony - Star Trek's Holodeck PDF proposal,
  • P9 - eyeTS: eye-tracking Tutoring Systems; Towards User-Adaptive Eye-Tracking Educational Technology PDF proposal.
List of projects
1 Speech, gaze and gesturing - multimodal conversational interaction with Nao robot

Project Description:

The full proposal is available in PDF.

Abstract

The general goal of this project is to learn more about multimodal interaction with the Nao robot, including speech, gaze and gesturing. As a starting point for the speech interaction, the project has a more specific goal: to implement on Nao a spoken dialogue system that supports open-domain conversations using Wikipedia as a knowledge source. A prototype of such an open-domain conversation system has already been developed using Python and the Pyrobot robotics simulator (Wilcock and Jokinen, 2011). Preliminary work has been done with the Nao Choregraphe software, which supports Python, but Choregraphe does not include Nao's speech recognition or speech synthesis components.
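
To make the idea more concrete, here is a minimal Python sketch (not the existing prototype code mentioned above) that fetches the introduction of a Wikipedia article through the public MediaWiki API and turns its first sentence into a spoken contribution; the topic string, the user-agent string and the very shallow sentence splitting are illustrative assumptions only.

import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def wikipedia_intro(topic):
    """Return the plain-text introduction of the Wikipedia article on `topic`."""
    params = urllib.parse.urlencode({
        "action": "query", "prop": "extracts", "exintro": 1,
        "explaintext": 1, "redirects": 1, "format": "json", "titles": topic,
    })
    request = urllib.request.Request(API + "?" + params,
                                     headers={"User-Agent": "eNTERFACE12-P1-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        pages = json.loads(response.read().decode("utf-8"))["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

def spoken_contribution(topic):
    """A very shallow dialogue move: present the first sentence about the topic."""
    intro = wikipedia_intro(topic)
    if not intro:
        return "I am sorry, I know nothing about %s." % topic
    first_sentence = intro.split(". ")[0].rstrip(".")
    return "Speaking of %s: %s." % (topic, first_sentence)

if __name__ == "__main__":
    print(spoken_contribution("Nao (robot)"))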

We are also interested in multimodal communication features for the robot, especially gaze-tracking and gesturing. These need to be integrated with the spoken conversation system. The robot needs to know whether or not the human is interested in the conversational topic, and the human's gaze is important for this. The robot should also combine suitable gestures and body language with its own speech turns during the conversation.

Objective

The main goal of the project is to extend the Nao robot's interaction capabilities by enabling Nao to make informative spoken contributions on a wide range of topics during conversation. The speech interaction will be combined with gaze-tracking and gesturing in order to explore natural communication possibilities between human users and robots. If eye-tracking equipment is available we will use it, but if not we will provide simulated gaze information so that the role of gaze-tracking can be integrated in the interaction management.

- To learn basic techniques for spoken dialogue modeling and conversational chatting, especially related to topic management and presentation of new information
- To learn complex issues related to synchrony and unification of multimodal communication models, especially speech and gaze
- To implement simple spoken conversational interactions with a robot agent
- To integrate some of the available speech, face, gaze, and gesture recognition technologies into the human-robot interaction
- To explore possibilities of natural, intuitive human-robot interaction
- To learn to implement and develop conversational models for the Nao robot
- To learn techniques and theories for useful future interactive applications

Members of the team
Project's leaders:

  • Kristiina Jokinen, Adjunct Professor and Project Manager at the University of Helsinki, leader of the 3I (Intelligent Interaction and Information Systems) Research Group, website
  • Graham Wilcock, Professor and Lecturer in Language Technology at the University of Helsinki, website
Participants :

  • Grizou Jonathan
  • Han Frank
  • Csapo Adam
  • Meena Raveesh
  • Anastasiou Dimitra
E-mail list address:

You can contact the project's members at enterface12p1@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

2 Laugh Machine

Project Description:

The full proposal is available in PDF.

Objective

Laughter is a significant feature of human communication, and machines acting in roles like companions or tutors should not be blind to it. In LAUGH MACHINE, we aim to increase the communicative repertoire of virtual agents by giving them the possibility to laugh naturally (i.e., to express different types of laughs according to the context). To achieve this, we will need analysis components that can detect particular events (some can be pre-defined, as the stimulus will be known in advance) as well as interpreters that will decide how the virtual agent should react to them. In some cases, the virtual agent will be instructed to laugh in a certain way, which will require other components able to synthesize audio-visual laughs.

The members of the ILHAIRE consortium will provide the core components of the system. Other LAUGH MACHINE project members may provide additional components that will be integrated during the workshop. The work during the eNTERFACE'12 LAUGH MACHINE project will focus on 1) integrating these core components into a full processing chain (from multimodal event analysis to audio-visual laughter synthesis); and 2) evaluating the system and gaining knowledge about the particular issues that interactions with a laughing agent might create.
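
The Python sketch below only illustrates the shape of such a processing chain - analysis events, a decision step, a synthesis request; the event types, fields and decision rules are hypothetical and do not correspond to the actual ILHAIRE component interfaces.

from dataclasses import dataclass

@dataclass
class LaughEvent:
    time: float        # seconds from the start of the stimulus
    kind: str          # e.g. "punchline", "audience_laughter", "silence"
    intensity: float   # estimated strength of the event, between 0 and 1

def decide_reaction(event, user_is_laughing):
    """Toy interpreter: map a detected event to a laugh specification (or None)."""
    if event.kind == "punchline" and user_is_laughing:
        return {"type": "hearty", "duration": 0.5 + 2.0 * event.intensity}
    if event.kind == "audience_laughter":
        return {"type": "social", "duration": 1.0}
    return None  # stay silent

def run_chain(events, user_laugh_flags, synthesize):
    """Feed analysis events through the decision step into the synthesis component."""
    for event, user_is_laughing in zip(events, user_laugh_flags):
        reaction = decide_reaction(event, user_is_laughing)
        if reaction is not None:
            synthesize(reaction)

if __name__ == "__main__":
    demo_events = [LaughEvent(12.3, "punchline", 0.8), LaughEvent(20.1, "silence", 0.0)]
    run_chain(demo_events, [True, False], synthesize=lambda spec: print("laugh:", spec))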

This last aspect is extremely important. In addition to measuring the possible benefits of such an avatar, user evaluations will be used in support of designing improved, more natural person-machine interaction. The goals will be to identify important features, to make sure that the agent's laughs are not misperceived by the user (e.g., as malicious or malevolent laughs), and to pay attention to the "uncanny valley": the discomfort caused by an automated agent that reproduces human behaviours very closely, but imperfectly (e.g., with bad timing or disproportionate responses).

In consequence, besides the technical considerations required to build a virtual agent capable of generating believable audio-visual laughs, particular attention will be given to designing an interactive scenario that will provide us with the best system evaluation possible. The design should allow for comparing different variations of social settings of interaction and agent behaviour patterns. Conditions will be designed on a continuum of laughter elicitation.

The LAUGH MACHINE project will deliver a full processing chain enabling an agent to laugh appropriately when interacting with a user, as well as deeper scientific insights into integrating laughter into avatars: when and how they should laugh, what challenges arise from a psychological perspective, etc.

Members of the team
Project's leaders:

  • Jérôme Urbain - PhD researcher at the University of Mons, Belgium; website
  • Radoslaw Niewiadomski - Postdoc researcher at Telecom ParisTech, France, website
  • Jenny Hofmann - PhD researcher at the University of Zurich, Switzerland, website
Participants:

  • Thierry Dutoit
  • Maurizio Mancini
  • Tracey Platt
  • Florian Lingenfelser
  • Johannes Wagner
  • Willibald Ruch
  • Gary McKeown
  • Nadia Berthouze
  • Harry Griffin
  • Bantegnie Emeline
  • Pammi Sathish Chandra
  • Sharma Abhishek
  • Cruz Richard Thomas
  • Baur Tobias
  • Miranda Miguel Cristobal
  • Volpe Gualtiero
  • Dupont Stephane
  • Cakmak Huseyin
  • Piot Bilal
  • Pietquin Olivier
E-mail list address:

You can contact the project's members at enterface12p2@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

3 Human motion recognition based on videos

Project Description:

The full proposal is available in PDF.

Abstract

Imagine a large video surveillance database or a large database of dance video clips. For statistical purposes, you want to know how many people cross a particular intersection in a specific direction over a period of time, or how many people fall on the escalators of a shopping center each day. As a choreographer, you are looking for typical footstep movements (body movements) or a dance trajectory plan. How can we help you?

In this project we would like to develop a system that searches surveillance videos and dance video clips for movements and trajectories with the help of human motion recognition.

Objective

The main objective of this project is to design and implement a system that searches surveillance videos and dance video clips for movements and trajectories by using human motion recognition.

In our scenario, the user will be able to query our system, via a web interface, for trajectories or movements using pre-defined lines, curves, etc. Later, we may let the user draw the trajectories or movements that they want to find in our video dataset. Most of the work will be done offline, such as segmentation of human bodies and key-pose detection for tracking trajectories and detecting movements. A data structure will be built to optimize the search, and some a priori models will be implemented to guide the tracking module. An ontology [Staab et al. (2004)] will help to identify which body movements occur in the videos. This will be formalized using the Web Ontology Language (OWL) and the ontology editor and knowledge-base framework Protégé.

The objectives of the project are the design and the implementation of the following modules:

* HMI - user interface
* Tracking - segmentation, key poses detection
* Database structure
* A priori model
* Ontology
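
As a rough illustration of the trajectory-search part of the tracking and database modules (not the project's actual code), the following Python sketch resamples stored trajectories to a fixed length and ranks them against a user-drawn query curve with a simple point-wise distance; the data layout and the distance measure are illustrative assumptions.

import numpy as np

def resample(points, n=32):
    """Resample a polyline (sequence of (x, y) points) to n points, uniformly by index."""
    pts = np.asarray(points, dtype=float)
    idx = np.linspace(0, len(pts) - 1, n)
    x = np.interp(idx, np.arange(len(pts)), pts[:, 0])
    y = np.interp(idx, np.arange(len(pts)), pts[:, 1])
    return np.stack([x, y], axis=1)

def distance(query, trajectory):
    """Mean point-wise Euclidean distance between two resampled curves."""
    return float(np.mean(np.linalg.norm(resample(query) - resample(trajectory), axis=1)))

def search(query, database, k=5):
    """Return the k stored trajectories closest to the user-drawn query."""
    return sorted(database.items(), key=lambda item: distance(query, item[1]))[:k]

if __name__ == "__main__":
    database = {"clip_001": [(0, 0), (1, 1), (2, 2)],
                "clip_002": [(0, 0), (2, 0), (4, 0)]}
    print(search([(0, 0), (1, 0.9), (2, 2.1)], database, k=1))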

Members of the team
Project's leaders:

  • De Beul Dominique - PhD student, website
E-mail list address:

You can contact the project's members at enterface12p3@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

5 M2M - Socially Aware Many-to-Machine Communication

Project Description:

The full proposal is available in PDF.

Abstract

The M2M project aims at a first step towards multi-user interaction with emotional virtual agents, targeting the speech input components. It will extend the speech input capabilities of the SEMAINE virtual agent to support hands-free input from multiple users, recognizing speech, personality and affect-related states of individual speakers. The M2M project will develop novel methods to combine speaker diarization with speaker trait and state classification in multi-source environments, and will improve detection of speech utterances directed to the system in a multi-user interaction scenario. The results of the project will be provided as source code and binaries for download. A database of realistic multi-user interaction with a virtual agent will be collected, partially annotated, and published on the workshop web page.

Objective

Social competence, i.e., the ability to permanently analyze and re-assess dialogue partners with respect to their traits (e.g., personality or age) and states (e.g., emotion or sleepiness), and to react accordingly (by adjusting the discourse strategy, or aligning to the dialogue partner), remains one key feature of human communication that is not found in most of today's technical systems. Hence, the SEMAINE project (Sustained Emotionally colored Machine-human Interaction using Nonverbal Expression) built the world's first fully automatic dialogue system with 'socio-emotional skills' realized through signal processing and machine learning techniques. It is capable of keeping up sustained conversations with the user, using very shallow language understanding - basically, reacting to emotional keywords and allowing simple dialogue acts - yet advanced techniques for the recognition of affect and non-linguistic vocalizations.

Still, the system is limited to interaction with a single user - however, in many real-world scenarios, human-computer interaction with multiple users, and hence, recognizing traits (e.g., personality) and affect-related states (e.g., interest) of the individuals and of the group as a whole, is desirable. Such scenarios include emotional agents incorporated into robots acting as museum guides, or information kiosks. Yet, the generalization from 1 to N system users comes with a variety of 'grand challenges' - the following is to be understood as a non-exhaustive list, reaching from front-end to back-end:

(i) Speech source localization. Among other applications, this is useful for feedback, such as the avatar / robot turning its head to the person speaking.
(ii) Technical robustness to non-stationary background noise (transient noise, background speakers) and reverberation in real-world hands-free application scenarios (such as trade fairs, museums etc.)
(iii) Speaker diarization. This is required for the character to access the interaction history with individual speakers. For instance, it can be used to detect that a person has not been speaking for a longer time; the main challenge is handling overlap between speakers.
(iv) Even in the case of perfect speech detection and the absence of overlap or background noise, speech may not be addressed to the virtual agent, but to other humans (side talk), or simply to the speakers themselves (self-directed talk). This can easily lead to erroneous actions taken by the system.
(v) Multi-talker recognition of affect and speech from cross-talk, i.e., in cases where system users are speaking simultaneously.
(vi) Appropriate strategies for dialogue management and adaptation of visual agent behavior, such as 'integrating' users showing a low level of interest while preserving high levels of interest of other users.

Clearly, addressing all these challenges and implementing solutions is beyond the scope of a four-week targeted research project. Hence, the M2M project will focus on some aspects of (ii) through (iv) in the above list: precisely, it will extend the capabilities of the SEMAINE system to cope with a hands-free scenario where multiple users interact with the system in the presence of background talkers, environmental noise and reverberation, yet assuming little to no overlap between the user utterances targeted to the system. As a result, detected keywords, speaker traits and affect-related states will be attributed to different users by means of speaker diarization and visualized appropriately. Utterances not addressed to the system will be rejected. The project's objectives will be verified through a dedicated evaluation work package using objective and subjective measures (cf. proposal p. 6). The M2M project will deliver tangible results in the shape of source code and reports (cf. proposal p. 8).
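
The following minimal Python sketch illustrates only the attribution step, assigning detected keywords to speakers on the basis of a hypothetical diarization output; the data layout is an assumption, and the real system will build on the existing SEMAINE components rather than on such toy code.

def attribute_keywords(keywords, diarization):
    """
    keywords:    list of (time_in_seconds, word) detected by the keyword spotter
    diarization: list of (speaker_id, start_sec, end_sec) speech segments
    Returns a dict speaker_id -> list of words, plus the words that fall outside
    any segment (e.g. cross-talk or background speech), which would be rejected.
    """
    per_speaker, unattributed = {}, []
    for t, word in keywords:
        for speaker, start, end in diarization:
            if start <= t < end:
                per_speaker.setdefault(speaker, []).append(word)
                break
        else:
            unattributed.append(word)
    return per_speaker, unattributed

if __name__ == "__main__":
    segments = [("speaker_A", 0.0, 4.2), ("speaker_B", 4.2, 9.0)]
    detections = [(1.3, "hello"), (5.0, "weather"), (12.0, "bye")]
    print(attribute_keywords(detections, segments))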

Members of the team
Project's leaders:

  • Dr. Bjorn Schuller - Institute for Human-Machine Communication (IMMC), Technische Universitaet Munich, website
Participants:

  • Cyril Joder,
  • Florian Eyben,
  • Felix Weninger,
  • Gilmartin Emer,
  • Munier Christian,
  • Stefanov Kalin,
  • Marchi Erik.
E-mail list address:

You can contact the project's members at enterface12p5@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

6 Is this guitar talking or what!?

Project Description:

The full proposal is available in PDF.

In this project we aim at developing a new framework for performative speech and singing synthesis, i.e. a realtime synthesis system where voice is directly produced by gestural control, with no reference to textual input. We want to address both phonetic and prosodic issues, with applications in speech and singing synthesis. The aim of this new system is to extend the context in which performative speech / singing synthesis is produced and to explore a new relationship: control by an electric guitar. Indeed, we have known for a long time that managing all the parameters of speech production is tough for a single performer. However, we want to explore the idea of using instrumental gestures, i.e. guitar playing techniques, since the subtlety and richness of instrumental technique is a good starting point for refined control of the synthesized speech / singing. We want to see how intelligibility, naturalness and even speaker identity can be addressed in a guitar performance, involving the player and the audience.

Objective 1 - Interactive Control of Voice Production

Voice synthesis can be split into various typical issues to be solved: articulation and coarticulation of phonemes, speech timing management, intonation modelling, voice quality dimensions, etc. Most of these tasks rely on significantly different representations of data, and the development of appropriate human-computer interaction (HCI) models for them has not been widely studied. We want to develop new interaction paradigms for voice synthesis based on an actual musical instrument.

Objective 2 - Second Release for the MAGE Platform

Most current voice synthesis architectures are designed like a giant script whose aim is to write a waveform onto the hard drive. If we consider realtime and interactive voice synthesizers, we find that their structure is quite monolithic and difficult to break down. MAGE, a platform for reactive HMM-based speech and singing synthesis, brought a first solution for designing voice synthesis reactively, with various components and controls being detachable on heterogeneous platforms - computers or mobile devices - while maintaining sound quality and low latency. However, the currently provided controls are rather limited. We want to provide various reactive context control modules embedded in the platform, easily accessible to the user / developer / performer. Additionally, we would like to explore the idea of reactive interpolation control between different speaking styles and voices over the currently synthesized voice. These new integrated control parameters can result in the release of a more complete, stable and flexible version of the MAGE platform. There is also a linguistic and sociological interest in questioning and validating several properties of voice (at various levels: intelligibility, naturalness and identity) when this voice is produced by a performer.
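
As a minimal illustration of this interpolation idea (this is the underlying principle only, not the MAGE API), the Python sketch below blends two hypothetical sets of synthesis parameters with a weight that could be changed reactively at run time.

import numpy as np

def interpolate_styles(params_a, params_b, alpha):
    """Blend two synthesis parameter vectors; alpha = 0 gives style A, alpha = 1 style B."""
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    return (1.0 - alpha) * a + alpha * b

if __name__ == "__main__":
    neutral = [5.2, 0.31, 110.0]   # hypothetical spectral / excitation / F0 statistics
    excited = [5.8, 0.45, 160.0]
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, interpolate_styles(neutral, excited, alpha))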

Objective 3 - Guitar Independent Algorithms for Playing Techniques to Control a Voice Synthesizer

In the Guitar As Controller project, algorithms to detect playing techniques have been developed. A nylon-string guitar was used to create the database on which the algorithms were built and tested. Other types of guitars / strings need to be tested in order to make the algorithms guitar-independent or to provide preset settings for each type of guitar.

Members of the team
Project's leaders:

  • Dr. Nicolas d'Alessandro, Research Associate at the Media and Graphics Interdisciplinary Centre, University of British Columbia, website
Participants:

  • Maria Astrinaki,
  • Loïc Reboursiere,
  • Thierry Dutoit,
  • Krichi Mohamed Khalil,
  • Moinet Alexis,
E-mail list address:

You can contact the project's members at enterface12p6@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

7 CITYGATE, The multimodal cooperative intercity Window

Project Description:

The full proposal is available in PDF.

Objective

Mons and Plzen will be the European capitals of culture in 2015. Their respective slogans will be "When Technology Meets Culture" and "Pilsen, Open Up!".

In order to prepare for this event, UMONS and UWB have started collaborating on digital art technology, starting with the KINACT project during eNTERFACE'11 in Plzen. One of the activities that could be organized between the cities as part of the 2015 event would rely on establishing creative interaction between citizens of both cities. This requires building a common infrastructure allowing real-time multimodal interaction. The main goal of the CITYGATE project will be to take a first step in this direction by developing the technology components required for such interaction.

More precisely, the project will allow:
- Audiovisual telepresence streaming,
- Interaction: Games, Dance and Music performances, VJing / DJing, ...,
- Cooperative multiplayer (social) games (like in KINACT),
- Digital art installation.

Members of the team
Project's leaders:

  • Milos Zelezny, Professor, University of West Bohemia (Czech Republic), website
  • Thierry Dutoit, Professor, University of Mons (Belgium), website
Participants:

  • Radhwan Ben Madhkour,
  • Francois Zajéga,
  • Marek Hruz,
  • Ambroise Moreau,
  • Jirik Miroslav,
  • Ryba Tomas,
  • Pirner Ivan,
  • Zimmermann Petr,
  • Dalla Rosa Pierluigi,
  • Vit Jakub.
E-mail list address:

You can contact the project's members at enterface12p7@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

8 Active Speech Modifications

Project Description:

The full proposal is available in PDF.

Objective

The purpose of this project is to use modern speech analysis and reconstruction algorithms to:
1. identify which acoustic-phonetic characteristics are prominent in each of 3 different styles of clear speech (e.g. babble-countering clear speech, vocoder-countering clear speech, L2-'countering' clear speech) and when in time they are realized.
2. model at least some of these aspects so that they can be applied automatically to speech (e.g. prosodic changes, changes in the amplitude spectrum, modulation frequencies, etc.).
3. run a series of 'proof of concept' perception experiments to see if the 'specifically-enhanced' speech is better perceived in the 'matched' adverse condition than other types of clear speech (there is evidence that this is the case with the naturally-enhanced speech).
For the purpose of the project we will use already developed relevant corpora (although we might need to create new ones).
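
As a purely illustrative sketch related to point 2 (not one of the modifications the project will actually evaluate), the Python code below applies one plausible clear-speech cue - a boost of the 1-3 kHz region - to a mono speech file; the filter design, the gain and the file names are assumptions.

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, lfilter

def boost_band(signal, fs, low=1000.0, high=3000.0, gain_db=6.0):
    """Add a band-passed copy of the signal to itself (a simple parallel band boost)."""
    b, a = butter(2, [low / (fs / 2.0), high / (fs / 2.0)], btype="band")
    band = lfilter(b, a, signal)
    gain = 10.0 ** (gain_db / 20.0) - 1.0
    out = signal + gain * band
    return out / max(1e-9, np.max(np.abs(out)))   # normalise to avoid clipping

if __name__ == "__main__":
    fs, x = wavfile.read("input.wav")        # hypothetical mono 16-bit input file
    if x.ndim > 1:
        x = x[:, 0]                          # keep one channel if the file is stereo
    x = x.astype(np.float64) / 32768.0
    y = boost_band(x, fs)
    wavfile.write("enhanced.wav", fs, (y * 32767.0).astype(np.int16))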

Members of the team
Project's leaders:

  • Yannis Stylianou, Professor at University of Crete, Department of Computer Science, website
  • Valerie Hazan, Professor in Speech Sciences, University College London, website
Participants:

  • Raitio Tuomo,
  • Koutsogiannaki Maria,
  • Tang Yan,
  • Jokinen Emma,
  • Aubanel Vincent,
  • Mowlaee Pejman,
  • Godoy Elizabeth,
  • Nicolao Mauro,
  • Granlund Sonia,
  • Sfakianaki Anna.
E-mail list address:

You can contact the project's members at enterface12p8@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

10 ArmBand: Inverse Reinforcement Learning for a BCI-driven robotic arm control

Project Description:

The full proposal is available in PDF.

Abstract

The goal of this project is to use inverse reinforcement learning to better control a JACO robotic arm, developed by Kinova, in a Brain-Computer Interface (BCI). Asynchronous BCIs, such as motor-imagery-based BCIs, allow the subject to give orders at any time to freely control a device. However, with this paradigm, even after long training, the accuracy of the classifier used to recognize the order is not 100%. While many studies try to improve the accuracy using a preprocessing stage that improves the feature extraction, we propose to work on a post-processing solution. The classifier used to recognize the mental commands will output a value for each command, such as the posterior probability. But the executed action will not depend only on this information. A decision process will also take into account the position of the robotic arm and previous trajectories. More precisely, the decision process will be obtained by applying inverse reinforcement learning to a subset of trajectories specified by an expert.
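
The following minimal Python sketch illustrates this post-processing idea only: the executed command maximizes a combination of the classifier posteriors and a state-dependent score, where a toy look-up table stands in for the quantity that inverse reinforcement learning would recover from expert trajectories; all names and values are illustrative assumptions.

import math

def choose_command(posteriors, arm_state, q_table, beta=1.0):
    """
    posteriors: dict command -> P(command | EEG), from the BCI classifier
    arm_state:  discretised arm position
    q_table:    dict (arm_state, command) -> score learned from expert trajectories
    Returns the command maximising log-posterior + beta * learned score.
    """
    def score(command):
        return math.log(max(posteriors[command], 1e-9)) \
               + beta * q_table.get((arm_state, command), 0.0)
    return max(posteriors, key=score)

if __name__ == "__main__":
    posteriors = {"left": 0.40, "right": 0.35, "grasp": 0.25}
    q_table = {("near_object", "grasp"): 2.0}   # expert data favours grasping here
    print(choose_command(posteriors, "near_object", q_table))   # prints 'grasp'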

Objective

Brain-Computer Interfaces (BCI) [Wolpaw et al. (2002)] interpret brain activity to produce commands for a computer or other devices such as a robotic arm (see figure 1 of the proposal). A BCI therefore allows its user, especially a person with severe mobility impairment, to interact with the environment using only brain activity.

Overcoming the variability of the mental command

A major difficulty in properly interpreting the mental command lies in the fact that brain activity is highly variable even when a particular task is reproduced identically. Beyond the noise introduced by the recording system, background brain activity, concentration, fatigue or medication of the subject are sources of this variability. This variability makes it difficult for the classifier to recognize the different mental commands. Specific preprocessing techniques such as common spatial pattern filters [Lotte et al. (2010)] help distinguish the mental commands. However, they are not always sufficient. It therefore becomes necessary to explore new solutions to address this variability.

Interest in reinforcement learning

Thus, it is now necessary to build decision systems able to deal with this variability. This is why some projects introduce reinforcement learning into their BCI systems, for example by modifying the classifier [Fruitet et al. (2011)]. We propose to use reinforcement learning in a broader context.

Proposed study

We plan to show in this project that reinforcement learning improves the control of a robotic arm. More precisely, the decision process will take into account a subset of trajectories specified by an expert and the position of the robotic arm, in addition to the usual outputs of the mental command classifier.

Members of the team
Project's leaders:

  • Laurent Bougrain, Associate Professor at the University of Lorraine (France), CORTEX research group, website
Confirmed participants:

  • Edouard Klein,
  • Duvinage Matthieu.
E-mail list address:

You can contact the project's members at enterface12p10@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

4 Wave Front Synthesis Holophony - Star Trek's Holodeck

Project Description:

The full proposal is available in PDF.

Abstract

Using a large number of loudspeakers, it is possible to reproduce complicated wave fields, that is to say, to simulate the acoustic waves produced by N, possibly moving, virtual sound sources. This is based on the Wave Field Synthesis (WFS) paradigm originally developed at the University of Delft in the early 90s. At Supélec and UMI 2958, two smart rooms have been equipped with 72 loudspeakers (the acoustic holograms can be efficiently placed in and outside the room, and almost all around the center of the room, from 0 to 20 meters) and 32 loudspeakers (200 degrees are efficiently covered), respectively. For a listener moving in one of these rooms, the perceived localization of the sources remains coherent.
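
As a strongly simplified illustration of the underlying geometry (real WFS driving functions also involve spectral pre-filtering and amplitude corrections that are omitted here), the Python sketch below computes per-loudspeaker delays and rough gains for a virtual point source; the positions and the array layout are illustrative assumptions.

import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def delays_and_gains(source_xy, speaker_xy):
    """
    source_xy:  (x, y) position of the virtual source, in metres
    speaker_xy: array of shape (N, 2) with the loudspeaker positions
    Returns per-loudspeaker delays (seconds, relative to the earliest one)
    and rough 1/r amplitude weights, normalised to a maximum of 1.
    """
    distances = np.linalg.norm(np.asarray(speaker_xy, float) - np.asarray(source_xy, float), axis=1)
    delays = distances / SPEED_OF_SOUND
    gains = 1.0 / np.maximum(distances, 0.1)
    return delays - delays.min(), gains / gains.max()

if __name__ == "__main__":
    line_array = np.stack([np.linspace(-2.0, 2.0, 8), np.zeros(8)], axis=1)  # 8 speakers on a line
    print(delays_and_gains((0.0, -3.0), line_array))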

Objective

This project is concerned with four main points:

- A portable and ergonomic graphical interface still needs to be built in order to control the holophonic system efficiently. This interface will be written in GTK/Glade.
- The two rooms are also equipped with HD screens and with several arrays of microphones. Some techniques have been developed for 2D and 3D source localization using these microphone arrays. They need to be extended and validated when several sources are present simultaneously, the number of sources possibly being larger than the number of microphones. Therefore, the development of acoustical antennas for efficient 2D/3D acoustic beamforming is required.
- As a follow-up, for instance from a video-conference perspective, it is envisioned to build a system able to detect the source positions in one of the rooms and to reproduce the corresponding wave field in the other room, and conversely, in order to strengthen the sensations of immersion and presence in situations involving the participation of the listener.
- The rendering of virtual speaking agents. From a cognitive point of view, focusing on auditory and visual spatial cognition (that is to say, on multisensory integration processes), subjects' performance in terms of localization should be studied.

Members of the team
Project's leaders:

  • Stéphane Rossignol - Professor Supélec, website
Confirmed participants:

  • Jean-Luc Collette, Professor Supélec
  • Jean-Baptiste Tavernier, Engineer Supélec
Available slots:

Anyone interested in the project.

E-mail list address:

You can contact the project's members at enterface12p4@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)

9 eyeTS: eye-tracking Tutoring Systems

Project Description:

The full proposal is available in PDF.

Abstract

The eyeTS project investigates the statistical properties of eye movements to information displayed to a learner by a foreign language learning tutoring system. In an eye-tracking experiment, the granularity of the information is manipulated. Two questions are asked in this context: (1) Can learners' actions be predicted based on their allocation of attention to the different types of information? and (2) Can the amount of attention allocated to coarse- vs. fine-grained information be used to predict their learning gain?

Technical implementation of the project involves integration of a Facelab 5 eye-tracker with a foreign language learning platform via Text 2.0, an open-source infrastructure for tracking eye movements to web-based content. The project comprises the technical implementation, conducting an eye-tracking experiment, and eye movement analyses in order to answer the two above-mentioned research questions and to make suggestions on the eye movement regularities which a user-adaptive language learning interface should consider.

Objective

Aiming at cognitively-motivated adaptivity for intelligent tutoring systems (ITS), we propose to use eye-tracking to collect data on learners' gaze behavior during interaction with a tutoring system, in order to predict their learning paths, on the one hand, and their learning success, on the other. Ultimately, eye-tracking data collected in real-time would be fed back to the tutoring system which would adapt to the learner based on a predictive model exploiting eye movement information. Hence eyeTS: an ITS which keeps track of the learner's eye movements, in other words, is eye gaze-aware.

In this project we will focus on computer-assisted language learning (CALL) as the tutoring domain while addressing two research questions:
(1) Is learners' gaze behavior while learning with an interactive CALL system predictive of learners' actions?
(2) Are learners' gaze patterns while inspecting a CALL system's feedback of different granularity predictive of learning gains?

More specifically, in (1) we hypothesize that learners' choice of action - e.g., continuing an exercise at the same level, moving to a more difficult level, or returning to an explanation of a language phenomenon - can be predicted based on the inspection time (and/or number of fixations) allocated to the different types of content displayed by the interface. The granularity of the information displayed to the learner will be manipulated to contain coarse-grained information (e.g., the overall score on an activity) or more fine-grained information (e.g., mistakes highlighted and corrected, or meta-linguistic explanations). In (2), we ask whether the amount of attention allocated to coarse- vs. fine-grained information can predict learning gains (as measured, for instance, by pre- and post-tests).
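
As a minimal illustration of research question (1) (not the analysis the project will actually run), the Python sketch below fits a logistic regression that predicts the learner's next action from toy gaze features such as inspection times and fixation counts; all feature names and values are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy features per trial: [inspection time on coarse feedback (s),
#                          inspection time on fine-grained feedback (s),
#                          number of fixations on the explanation area]
X = np.array([[1.2, 0.3, 1],
              [0.4, 2.5, 6],
              [1.5, 0.2, 0],
              [0.3, 3.1, 8]])
# Toy action labels: 0 = continue at the same level, 1 = return to the explanation
y = np.array([0, 1, 0, 1])

model = LogisticRegression().fit(X, y)
print(model.predict([[0.5, 2.0, 5]]))   # predicted action for a new trial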

Members of the team
Project's leaders:

  • Magdalena Wolska, Saarland University, website
  • Dr. Pirita Pyykkonen-Klauck, Saarland University, website
External consultant

  • Ralf Biedert, DFKI Kaiserslautern, founder of the Text 2.0 framework; eye-tracking & HCI researcher, website
Confirmed participants:

Available slots:

To join the project we invite 3-4 students and/or researchers interested in intelligent tutoring systems (especially computer-assisted language learning), human-computer interaction, and eye-tracking methodology.
Technical realization of the project requires fluent Java and C++ programming skills (at least 2 participants). Other technical skills needed include: JavaScript and CSS (for Text 2.0) and Perl CGI scripting (for the language learning platform). Familiarity with R for statistical analysis would be a plus.

E-mail list address:

You can contact the project's members at enterface12p9@metz.supelec.fr

(You may use enterface12pall@metz.supelec.fr to send e-mail to all participants. If you have any problems, please contact enterface12@metz.supelec.fr)