Projects and teams
This page lists all the project proposals to which applicants may apply. As a shortcut, you can directly access the PDF of each proposal below:
- P1 - Speech, gaze and gesturing - multimodal conversational interaction with Nao robot (PDF proposal),
- P2 - Laugh Machine (PDF proposal),
- P3 - Human motion recognition based on videos (PDF proposal),
- P5 - M2M - Socially Aware Many-to-Machine Communication (PDF proposal),
- P6 - Is this guitar talking or what!? (PDF proposal),
- P7 - CITYGATE, The multimodal cooperative intercity Window (PDF proposal),
- P8 - Active Speech Modifications (PDF proposal),
- P10 - ArmBand: Inverse Reinforcement Learning for a BCI-driven robotic arm control (PDF proposal).
The following projects have been discarded due to a lack of participants:
- P4 - Wave Front Synthesis Holophony - Star Trek's Holodeck (PDF proposal),
- P9 - eyeTS: eye-tracking Tutoring Systems; Towards User-Adaptive Eye-Tracking Educational Technology (PDF proposal).
List of projects
P1 - Speech, gaze and gesturing - multimodal conversational interaction with Nao robot
Project Description: The full proposal is available as a PDF.

Abstract
The general goal of this project is to learn more about multimodal interaction with the Nao
robot, including speech, gaze and gesturing. As a starting point for the speech interaction,
the project has a more specific goal: to implement on Nao a spoken dialogue system that
supports open-domain conversations using Wikipedia as a knowledge source. A prototype of
such an open-domain conversation system has already been developed using Python and
the Pyrobot robotics simulator (Wilcock and Jokinen, 2011). Preliminary work has been done
with the Nao Choregraphe software, which supports Python, but Choregraphe does not
include Nao's speech recognition or speech synthesis components.

Objective
The main goal of the project is to extend the Nao robot's interaction capabilities by enabling
Nao to make informative spoken contributions on a wide range of topics during conversation.
The speech interaction will be combined with gaze-tracking and gesturing in order to explore
natural communication possibilities between human users and robots. If eye-tracking
equipment is available we will use it; if not, we will provide simulated gaze information so
that the role of gaze-tracking can still be integrated into the interaction management.
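As an illustration of the intended pipeline, the following minimal Python sketch fetches a short Wikipedia summary for a topic and speaks it through Nao's text-to-speech module. It assumes the NAOqi Python SDK and the third-party wikipedia package; NAO_IP and the two-sentence summary length are placeholder choices, not project decisions.

# Minimal sketch (assumptions: NAOqi Python SDK and the third-party "wikipedia"
# package are installed; NAO_IP is a placeholder for the robot's network address).
import wikipedia
from naoqi import ALProxy

NAO_IP = "nao.local"  # placeholder robot address

def speak_about(topic):
    # Look up a short summary of the topic on Wikipedia.
    summary = wikipedia.summary(topic, sentences=2)
    # Hand the text to Nao's own speech synthesis module.
    tts = ALProxy("ALTextToSpeech", NAO_IP, 9559)
    tts.say(summary.encode("utf-8"))  # NAOqi expects a byte string in Python 2

if __name__ == "__main__":
    speak_about("Wikipedia")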
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p1
P2 - Laugh Machine
Project Description: The full proposal is available as a PDF.

Objective
Laughter is a significant feature of human communication, and machines acting in roles
like companions or tutors should not be blind to it. In LAUGH MACHINE, we aim to
increase the communicative repertoire of virtual agents by giving them
the possibility to laugh naturally (i.e., expressing different types
of laughter according to the context). To achieve this, we will need analysis components that can detect particular events
(some can be pre-defined, as the stimulus will be known in advance) as well as
interpreters that will decide how the virtual agent should react to them. In some cases,
the virtual agent will be instructed to laugh in a certain way, which will require additional
components able to synthesize audio-visual laughs.
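As a rough illustration of the interpreter stage (not a project deliverable), the following Python sketch maps hypothetical detected events to a laughter specification that a separate audio-visual synthesis component would render; the event names, laughter types and intensities are made up for the example.

# Hypothetical rule-based interpreter: detected event -> laughter specification.
# Event names, laughter types and intensities are illustrative placeholders.
LAUGH_RULES = {
    "joke_punchline": {"type": "hearty", "intensity": 0.8},
    "awkward_pause": {"type": "polite", "intensity": 0.3},
    "user_laughs": {"type": "mirroring", "intensity": 0.6},
}

def choose_laugh(detected_event):
    """Return a laughter specification for a detected event, or None to stay silent."""
    return LAUGH_RULES.get(detected_event)

print(choose_laugh("user_laughs"))  # {'type': 'mirroring', 'intensity': 0.6}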
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p2
P3 - Human motion recognition based on videos
Project Description: The full proposal is available as a PDF.

Abstract
Imagine a large video surveillance database or a large database of dance video clips. For
statistical reasons, you want to know how many people cross a particular crossroads in a
specific direction over a period of time, or how many people fall on the escalators of a
shopping center per day. As a choreographer, you are looking for typical footstep movements
(body movements) or a dance trajectory plan. How can we help you?

Objective
The main objective of this project is to design and implement a system that searches
video surveillance footage and dance video clips for specific movements and trajectories
using human motion recognition.
Members of the team
Project's leaders:
E-mail list address:
You can contact the project's members at enterface12p3
P5 - M2M - Socially Aware Many-to-Machine Communication
Project Description: The full proposal is available as a PDF.

Abstract
The M2M project aims at a first step towards multi-user interaction with emotional virtual agents,
targeting the speech input components. It will extend the speech input capabilities of the SEMAINE
virtual agent to support hands-free input from multiple users, recognizing speech, personality and
affect-related states of individual speakers. The M2M project will develop novel methods to combine
speaker diarization with speaker trait and state classification in multi-source environments, and will
improve detection of speech utterances directed to the system in a multi-user interaction scenario. The
results of the project will be provided as source code and binaries for download. A database of realistic
multi-user interaction with a virtual agent will be collected, partially annotated, and published on the
workshop web page.

Objective
Social competence, i.e., the ability to continuously analyze and re-assess dialogue partners
with respect to their traits (e.g., personality or age) and states (e.g., emotion or sleepiness),
and to react accordingly (by adjusting the discourse strategy, or aligning to the dialogue partner),
remains a key feature of human communication that is not found in most of today's technical
systems. Hence, the SEMAINE project (Sustained Emotionally colored Machine-human
Interaction using Nonverbal Expression) built the world's first fully automatic dialogue system
with 'socio-emotional skills' realized through signal processing and machine learning techniques.
It is capable of sustaining conversations with the user, using very shallow language
understanding - basically, reacting to emotional keywords and allowing simple dialogue acts -
yet advanced techniques for recognition of affect and non-linguistic vocalizations.
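To give a rough idea of how diarization output and per-segment state classification might be combined (the data structures below are placeholders, not the M2M components), here is a small Python sketch that averages classifier scores per diarized speaker:

# Illustrative sketch: aggregate per-segment classifier scores by diarized speaker.
from collections import defaultdict

# Each entry: (speaker label from diarization, classifier score for some state, e.g. arousal).
segments = [("spk1", 0.8), ("spk2", 0.1), ("spk1", 0.6), ("spk2", 0.3)]

scores = defaultdict(list)
for speaker, score in segments:
    scores[speaker].append(score)

# One averaged state estimate per speaker.
per_speaker = {spk: sum(vals) / len(vals) for spk, vals in scores.items()}
print(per_speaker)  # e.g. {'spk1': 0.7, 'spk2': 0.2}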
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p5
P6 - Is this guitar talking or what!?
Project Description: The full proposal is available as a PDF.

In this project we aim at developing a new framework for performative speech and singing synthesis, i.e. a real-time synthesis system where voice is directly produced by gestural control, with no reference to textual input. We want to address both phonetic and prosodic issues, with applications in speech and singing synthesis. The goal of this new system is to extend the context in which performative speech/singing synthesis is produced and to explore a new relationship: control by an electric guitar. Indeed, we have known for a long time that managing all the parameters of speech production is tough for a single performer. However, we want to explore the idea of using instrumental gestures, i.e. guitar playing techniques, since the subtlety and richness of the instrumental technique is a good starting point for refined control of the synthesized speech/singing. We want to see how intelligibility, naturalness and even speaker identity can be addressed as a guitar performance, involving the player and the audience.

Objective 1 - Interactive Control of Voice Production
Voice synthesis can be split into various typical issues to be solved: articulation and coarticulation of phonemes, speech timing management, intonation modelling, voice quality dimensions, etc. Most of these tasks refer to significantly different representations of data, and the development of appropriate human-computer interaction (HCI) models has not been widely studied for these tasks. We want to develop new interaction paradigms for voice synthesis based on an actual musical instrument.

Objective 2 - Second Release of the MAGE Platform
Most current voice synthesis architectures are designed like a giant script whose aim is to write a waveform onto the hard drive. If we consider real-time and interactive voice synthesizers, we find that their structure is quite monolithic and difficult to break down. MAGE, a platform for reactive HMM-based speech and singing synthesis, brought a first solution for reactively designing voice synthesis, with various components and controls being detachable on heterogeneous platforms - computers or mobile devices - while maintaining sound quality and low latency. However, the currently provided controls are rather limited. We want to provide various reactive context control modules embedded in the platform, easily accessible to the user / developer / performer. Additionally, we would like to explore the idea of reactive interpolation control between different speaking styles and voices over the currently synthesized voice. These newly integrated control parameters can lead to the release of a more complete, stable and flexible version of the MAGE platform. There is also a linguistic and sociological interest in questioning and validating several properties of voice (at various levels: intelligibility, naturalness and identity) when this voice is produced by a performer.

Objective 3 - Guitar-Independent Algorithms for Playing Techniques to Control a Voice Synthesizer
In the Guitar As Controller project, algorithms to detect playing techniques have been developed. A nylon-string guitar was used to create the database on which the algorithms were built and tested. Other types of guitar and string need to be tested in order to make the algorithms guitar-independent or to provide preset settings for each type of guitar.
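Purely as an illustration of the kind of mapping Objectives 1 and 3 call for (this is not the MAGE API, and the parameter names are invented), a detected guitar note and playing technique could be translated into voice-control parameters along these lines:

# Illustrative mapping from a detected guitar event to hypothetical voice controls.
def guitar_event_to_voice_controls(string_pitch_hz, technique):
    """Map a detected guitar note and playing technique to synthesis controls."""
    controls = {"f0_hz": string_pitch_hz}        # reuse the guitar pitch as voice pitch
    if technique == "palm_mute":
        controls["style_weight"] = 0.2           # e.g. lean toward a softer speaking style
    elif technique == "vibrato":
        controls["vibrato_depth"] = 0.5
    else:
        controls["style_weight"] = 0.8           # default: fully voiced style
    return controls

print(guitar_event_to_voice_controls(220.0, "vibrato"))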
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p6
P7 - CITYGATE, The multimodal cooperative intercity Window
Project Description: The full proposal is available as a PDF.

Objective
Mons and Plzen will be the European Capitals of Culture in 2015. Their
respective mottoes will be "When Technology Meets Culture" and "Pilsen,
Open Up!".
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p7
P8 - Active Speech Modifications
Project Description: The full proposal is available as a PDF.

Objective
The purpose of this project is to use modern speech analysis and reconstruction algorithms to:
Members of the team
Project's leaders:
Participants:
E-mail list address:
You can contact the project's members at enterface12p8
P10 - ArmBand: Inverse Reinforcement Learning for a BCI-driven robotic arm control
Project Description: The full proposal is available as a PDF.

Abstract
The goal of this project is to use inverse reinforcement learning to better control a JACO robotic arm, developed by Kinova, in a Brain-Computer Interface (BCI). Asynchronous BCIs, such as motor-imagery-based BCIs, allow the subject to give orders at any time to freely control a device. But with this paradigm, even after long training, the accuracy of the classifier used to recognize the order is not 100%. While many studies try to improve the accuracy with a preprocessing stage that improves feature extraction, we propose to work on a post-processing solution. The classifier used to recognize the mental commands will output a value for each command, such as a posterior probability. But the executed action will not depend on this information alone. A decision process will also take into account the position of the robotic arm and previous trajectories. More precisely, the decision process will be obtained by applying inverse reinforcement learning to a subset of trajectories specified by an expert.

Objective
Brain-Computer Interfaces (BCI) [Wolpaw et al. (2002)] interpret brain activity to produce commands for a computer or other devices such as a robotic arm (see figure 1 of the proposal). A BCI therefore allows its user, especially a person with severe mobility impairment, to interact with the environment using brain activity alone.
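As a sketch of the post-processing decision stage described in the abstract (assumptions: a placeholder command set, a trivial forward model, and a reward function that would in practice be learned by inverse reinforcement learning from expert trajectories), the combination of classifier posteriors and learned reward could look like this:

# Illustrative decision stage: combine BCI classifier posteriors with a learned reward.
import numpy as np

COMMANDS = ["left", "right", "forward", "grasp"]   # placeholder command set

def simulate_step(arm_state, command):
    # Placeholder forward model: in the real system this would predict the arm's
    # next position given the current state and the candidate command.
    return arm_state

def choose_action(posteriors, arm_state, learned_reward, tradeoff=1.0):
    """Pick the command maximizing classifier evidence plus the (IRL-learned)
    reward of the state the arm would reach by executing that command."""
    scores = [np.log(p + 1e-12) + tradeoff * learned_reward(simulate_step(arm_state, c))
              for c, p in zip(COMMANDS, posteriors)]
    return COMMANDS[int(np.argmax(scores))]

# Usage with dummy values (a flat reward leaves the classifier's best guess in charge):
print(choose_action([0.1, 0.2, 0.6, 0.1], arm_state=(0.0, 0.0, 0.0),
                    learned_reward=lambda s: 0.0))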
Members of the team
Project's leaders:
Confirmed participants:
E-mail list address:
You can contact the project's members at enterface12p10
P4 - Wave Front Synthesis Holophony - Star Trek's Holodeck
Project Description: The full proposal is available as a PDF.

Abstract
Using a large number of loudspeakers, it is possible to reproduce complex
wave fields, that is to say, to simulate the acoustic waves produced by N, possibly
moving, virtual sound sources. This is based on the Wave Field Synthesis (WFS)
paradigm originally developed at Delft University of Technology in the early 90s. At
Supélec and UMI 2958, two smart rooms have been equipped with 72 loudspeakers
(the acoustic holograms can be efficiently placed inside and outside the room, almost
all around the center of the room, from 0 to 20 meters) and 32 loudspeakers
(200 degrees are efficiently covered), respectively. For a listener moving in one of
these rooms, the perceived localization of the sources remains coherent.
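As a strongly simplified illustration of the principle (delay-and-attenuate rendering of a virtual point source over a loudspeaker array, not the actual WFS driving functions used in these rooms), consider the following Python sketch:

# Simplified holophony idea: each loudspeaker plays a delayed, attenuated copy of the
# source signal according to its distance from the virtual source position.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def drive_signals(source_signal, fs, source_pos, speaker_positions):
    """Return one delayed/attenuated copy of source_signal per loudspeaker."""
    outputs = []
    for spk in speaker_positions:
        dist = np.linalg.norm(np.asarray(spk, float) - np.asarray(source_pos, float))
        delay = int(round(fs * dist / SPEED_OF_SOUND))   # propagation delay in samples
        gain = 1.0 / max(dist, 0.1)                      # crude distance attenuation
        outputs.append(gain * np.concatenate([np.zeros(delay), source_signal]))
    return outputs

# Example: a 1 kHz tone rendered for two loudspeakers.
fs = 16000
tone = np.sin(2 * np.pi * 1000 * np.arange(fs) / fs)
signals = drive_signals(tone, fs, source_pos=(2.0, 5.0),
                        speaker_positions=[(0.0, 0.0), (1.0, 0.0)])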
Objective
This project is concerned with four main points:
Members of the team
Project's leaders:
Confirmed participants:
Available slots: Anyone interested in the project.
E-mail list address:
You can contact the project's members at enterface12p4
P9 - eyeTS: eye-tracking Tutoring Systems
Project Description: The full proposal is available as a PDF.

Abstract
The eyeTS project investigates the statistical properties of eye movements to information
displayed to a learner by a foreign language learning tutoring system. In an eye-tracking
experiment, the granularity of the information is manipulated. Two questions are asked in this
context: (1) Can learners' actions be predicted based on their allocation of attention to the
different types of information? and (2) Can the amount of attention allocation to coarse- vs.
fine-grain information be used to predict their learning gain?

Objective
Aiming at cognitively-motivated adaptivity for intelligent tutoring systems (ITS), we propose
to use eye-tracking to collect data on learners' gaze behavior during interaction with a tutoring
system, in order to predict their learning paths, on the one hand, and their learning success, on
the other. Ultimately, eye-tracking data collected in real-time would be fed back to the tutoring
system which would adapt to the learner based on a predictive model exploiting eye movement
information. Hence eyeTS: an ITS which keeps track of the learner's eye movements and is,
in other words, eye gaze-aware.
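As an illustrative sketch of the kind of predictive model the project has in mind (the gaze features, labels and the use of scikit-learn are assumptions for the example, not project specifications), one could fit a simple classifier on attention-allocation features:

# Illustrative only: predict a learner outcome from gaze-allocation features.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Dummy data: each row is one learner; columns are the share of fixation time on
# coarse-grain vs. fine-grain information; label 1 = learning gain above the median.
X = np.array([[0.7, 0.3], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
y = np.array([0, 1, 0, 1])

model = LogisticRegression().fit(X, y)
print(model.predict([[0.5, 0.5]]))  # predicted outcome for a new learner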
Members of the team
Project's leaders:
External consultant:
Confirmed participants:
Available slots:
To join the project we invite 3-4 students and/or researchers interested in intelligent tutoring
systems (especially computer-assisted language learning), human-computer interaction,
and eye-tracking methodology.
E-mail list address:
You can contact the project's members at enterface12p9 |