Music Informatics and Cognition Group (MusIC)

Date: 24 September 2014
Time: 12.45-14.00
Place: RDB14 3.429

The Music Informatics and Cognition Research Group (MusIC) designs, implements and evaluates algorithms for automatically carrying out tasks that, if performed by humans, would be considered to require musical expertise. The group shares the common goal of understanding, at the computational and algorithmic levels, the cognitive processes that operate when musical experts compose, analyse, improvise and perform music. Although unified by this common goal, the group’s members adopt quite different but complementary approaches to achieving it. In this presentation, Gissel Velarde will introduce her work on using wavelet-based methods to segment, classify and discover patterns in melodies. Olivier Lartillot will present a new simplified model of motivic analysis based on multiparametric closed pattern and cyclic sequence mining. Brian Bemman will report on his attempts to develop an algorithmic model of Milton Babbitt’s compositional procedures. David Meredith will present his most recent work on geometric pattern discovery in music, focusing on methods for automatically discovering maximal scalable patterns.

For more information: http://www.create.aau.dk/music

Guest lecture: Interacting at different ends of the scale: Massive interaction surfaces and minimal ambient & tangible interaction devices: The MiniOrb and CubIT projects – Markus Rittenbruch

Date: 12 June
Time: 13.00-14.00
Place: KAR6B-102

In this talk I will present two projects conducted at QUT’s Institute for Future Environments, which consider user interaction at significantly different scales.

The MiniOrb project addresses the question of how to design systems that help office inhabitants control their localised office environments and that support the process of negotiating shared preferences among co-located inhabitants. I will describe the design, use and evaluation of MiniOrb, a system that employs ambient and tangible interaction mechanisms to allow inhabitants of office environments to report on subjectively perceived office comfort levels. MiniOrb consists of a sensor device measuring localised environmental conditions and two input/output devices: an ambient and tangible interaction device, which allows users to maintain peripheral awareness of environmental factors and provides a tangible input mechanism for reporting personal preferences, and a mobile application, which displays measurements more explicitly and allows touch-based user input. Our aim was to explore the role of ubiquitous computing in the individual control of indoor climate and, specifically, to answer the question of to what extent ambient and tangible interaction mechanisms are suited to capturing individual comfort preferences in a non-obtrusive manner.
The CubIT project developed a large-scale multi-user presentation and collaboration platform running on QUT’s Cube facility. “The Cube” is a unique facility that combines 48 large multi-touch screens and very large-scale projection surfaces to form a very large interactive learning and engagement space. The CubIT system was specifically designed to allow QUT staff and students to utilise the capabilities of the Cube. CubIT’s primary purpose is to enable users to upload, interact with and share their own media content on the Cube’s display surfaces using a shared workspace approach. CubIT combines multiple interfaces (multi-touch, mobile & web), each of which plays a different role and supports different interaction mechanisms. Together, these interfaces support a range of collaborative features, including multi-user shared workspace interaction, drag-and-drop upload and sharing between users, session management, and dynamic state control between different parts of the system. I will briefly introduce the Cube facility and describe the design and implementation of the CubIT system.

Bio
Dr Markus Rittenbruch is a Senior Research Fellow with the Institute for Future Environments (IFE) at the Queensland University of Technology (QUT) and a member of QUT’s Urban Informatics Research Lab. He has over 19 years of research experience in the fields of Human-Computer Interaction (HCI), Computer Supported Cooperative Work (CSCW), and Ubiquitous Computing (UbiComp). Before joining QUT, he was invited to work at various research organisations in Germany and Australia, including the University of Bonn, the Distributed Systems Technology Centre (DSTC), the University of Queensland, the Australasian CRC for Interaction Design (ACID), and NICTA, Australia’s Information and Communications Technology Centre of Excellence.

Markus’ research focuses on solving problems in urban and organisational contexts through the design of engaging, innovative interaction technologies and approaches. His interests include the design of collaborative software, advanced models of awareness in groupware, in particular contextual and intentional awareness, social software, ambient, ubiquitous and physical computing, natural user interfaces and different ways of interfacing with sensors and sensor data. Markus has authored and co-authored over 50 publications in journals, edited books, and conference proceedings, including publications in the leading journals in HCI (Human-Computer Interaction Journal) and CSCW (Journal on Computer Supported Cooperative Work).

Fast Joint DOA and Pitch Estimation Using a Broadband MVDR Beamformer – Sam Karimian-Azari

Date:  28 August 2013
Time: 13.00-14.00
Place: SOF9.101

The harmonic model, i.e., a sum of sinusoids having frequencies that are integer multiples of the pitch, has been widely used for modeling voiced speech. In microphone arrays, the direction-of-arrival (DOA) provides an additional parameter that can help in obtaining a robust procedure for tracking non-stationary speech signals in noisy conditions. In this work, a joint DOA and pitch estimation (JDPE) method is proposed. The method is based on the minimum variance distortionless response (MVDR) beamformer in the frequency domain and is much faster than previous joint methods, as it only requires the computation of the optimal filters once per segment. To exploit the fact that both pitch and DOA evolve piecewise smoothly over time, we also extend a dynamic programming approach to joint smoothing of both parameters. Simulations show that the proposed method is much more robust than parallel and cascaded methods that combine existing DOA and pitch estimators.
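For orientation, the harmonic model and the MVDR-based joint search can be sketched as follows (the notation is assumed here and is not necessarily the paper’s). A voiced-speech segment is modelled as

x(n) = \sum_{l=1}^{L} a_l e^{j l \omega_0 n} + e(n),

where \omega_0 is the pitch, a_l are complex harmonic amplitudes, L is the number of harmonics, and e(n) is noise. Given an array steering vector \mathbf{d}(\theta, l\omega_0) for each harmonic, one plausible joint estimator picks the candidate pair (\theta, \omega_0) that maximises the summed MVDR output power,

(\hat{\theta}, \hat{\omega}_0) = \arg\max_{\theta, \omega_0} \sum_{l=1}^{L} \frac{1}{\mathbf{d}^H(\theta, l\omega_0)\, \hat{\mathbf{R}}^{-1}\, \mathbf{d}(\theta, l\omega_0)},

where \hat{\mathbf{R}} is the estimated spatial covariance matrix. Since the MVDR filters depend only on \hat{\mathbf{R}}, they need to be computed just once per segment, which is where the speed advantage mentioned above comes from.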

Bio
Sam Karimian-Azari received the B.Sc. degree in electrical engineering in 2001 from Isfahan University of Technology, Iran, and the M.Sc. degree in electrical engineering with emphasis on signal processing in 2012 from Blekinge Institute of Technology, Sweden. He is currently a Ph.D. fellow in electrical engineering at Aalborg University under the supervision of Mads Græsbøll Christensen. His research interests include speech signal processing and microphone array processing techniques.


Statistically Efficient Methods for Pitch and DOA Estimation – Jesper Rindom Jensen

Date:  06 March 2013
Time: 13.00-14.00
Place: tba

The direction-of-arrival (DOA) and the pitch of multichannel, periodic sources are key parameters in many signal processing methods for, e.g., tracking, separation, enhancement, and compression. Traditionally, the estimation of these parameters has been considered as two separate problems. Separate estimation may make it impossible to resolve sources with similar DOA or pitch, and it may decrease the estimation accuracy. Therefore, joint estimation of the DOA and pitch has recently been considered. In this talk, we present two novel methods for DOA and pitch estimation. Unlike state-of-the-art methods, both yield maximum-likelihood estimates in white Gaussian noise scenarios where the signal-to-noise ratio (SNR) may differ across channels. The first method is a joint estimator, whereas the second uses a cascaded approach with a much lower computational complexity. The simulation results confirm that the proposed methods outperform state-of-the-art methods in terms of estimation accuracy on both synthetic and real-life signals.
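As a point of reference, the multichannel harmonic signal model underlying such joint estimators can be written, for a uniform linear array (the notation here is assumed, not taken from the talk), as

x_k(n) = \sum_{l=1}^{L} a_l e^{j l \omega_0 (n - f_s \tau_k)} + e_k(n), \qquad \tau_k = \frac{(k-1)\, d \sin\theta}{c},

where x_k(n) is the signal at microphone k, \omega_0 is the pitch, \theta is the DOA, d is the microphone spacing, c is the speed of sound, f_s is the sampling rate, and e_k(n) is channel noise whose variance may differ across channels. A maximum-likelihood estimator fits (\omega_0, \theta) and the amplitudes a_l to all channels jointly, which is what allows sources that are close in either parameter to be resolved.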

Bio
Jesper Rindom Jensen received the M.Sc. and Ph.D. degrees from the Department of Electronic Systems at Aalborg University in 2009 and 2012, respectively. Currently, he is a postdoctoral researcher at the Department of Architecture, Design & Media Technology at Aalborg University. His research interests include spectral analysis, estimation theory and microphone array signal processing.

Text is not the Enemy: Multimodal interfaces for illiterate people – Hendrik Knoche

Date:  20 February 2013
Time: 13.00-14.00
Place: tba

Rainfed farming provides the bulk of the world’s food supply and has tremendous potential to increase its productivity. Most rainfed farms are operated by farmers from the bottom of the pyramid, who lack information about novel agricultural techniques and about what their peers are doing, and who in many developing areas have high illiteracy rates. Illiteracy does not feature in accessibility guidelines, and its implications for interaction design are poorly understood. Most of the research on illiterate users in the ICT for development (ICT4D) literature has hitherto focused on mobile phones relying on keypads for input. Research methods of human-computer interaction honed for lab settings need to be re-assessed and modified to conduct studies ‘in the wild’. In this talk I will reflect on the lessons learned from two applications for illiterate users designed within the scope of an ICT4D project, covering design implications, methodological pitfalls and the role text can play in multimodal interfaces for illiterate people.

Bio
Hendrik Knoche holds an MSc (University of Hamburg) and a PhD (University College London) in computer science. His research interests include human-centered design, design thinking, mediated experiences, proxemics, and ICT for development, along with methods for prototyping and evaluating applications and their user experiences “in the wild”. Since October 2012 he has been working at ADMT in Aalborg.

Sensory Perception from Multimodal Spike Inputs (sensory fusion and spatiotemporal perception) – Sam Karimian-Azari

Date:  05 December 2012
Time: 13.00-14.00
Place: tba

Information exchange in the biological nervous system is what animates a living body. Sensory neurons encode the spatiotemporal information of a stimulus into a sequential spike train. Each neural spike carries inter-spike interval (ISI) timing information as a message, and a statistical inference over the ISIs within a temporal window is taken as a perception. When a sensor input is missing, spike generation is predicted from the prior perception, and the uncertainty of this perception has to increase. Multimodal sensory fusion improves the perception by decreasing this uncertainty, and combining fusion with prediction improves the perception psychophysically. The belief propagation (BP) algorithm is used to perform sensory message passing in a spiking network. The most recent messages from asynchronous spikes of different modalities are preserved to form an association in a memory. A stationary perception can be constructed from the memorised messages when a sense is missing, but the perception has to be dynamic in order to predict the spatiotemporal information of a dynamic stimulus psychophysically. In this research, we investigate sensory perception through a simulated spiking visual sensor and address the attenuation of preserved information by a drift-diffusion process to obtain a dynamic perception. We examine how varying the statistical features optimises the spiking network for multimodal sensory perception over time and space.
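The claim that multimodal fusion decreases uncertainty follows the standard Bayesian cue-combination result; a minimal sketch in Python (illustrative only, all names and numbers hypothetical, not from the talk):

# Fuse two Gaussian estimates of the same stimulus by inverse-variance
# weighting; the fused variance is always smaller than either input's,
# which is the sense in which fusion "decreases the uncertainty".
def fuse(mu_a, var_a, mu_b, var_b):
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    var = 1.0 / (w_a + w_b)
    mu = var * (w_a * mu_a + w_b * mu_b)
    return mu, var

# Example: a visual estimate (mean 10.0, variance 4.0) and an auditory
# estimate (mean 12.0, variance 1.0) of the same stimulus position.
mu, var = fuse(10.0, 4.0, 12.0, 1.0)
print(mu, var)  # 11.6 0.8 -- pulled toward the more reliable cue

Message passing in a spiking network generalises this idea: each modality contributes its current message (an estimate plus an uncertainty), and the combined perception weights the messages by their reliability in essentially this way.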

Wavelet representation of melodic shape – Gissel Velarde

Date: 27 November 2012 (TUESDAY!)
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

The multi-scale aspect of wavelet analysis makes it particularly attractive as a method for analysing melodies. In this talk, I will present a methodology that I developed jointly with Tillman Weyde for representing and segmenting symbolic melodies. The wavelet coefficients obtained by the Haar continuous wavelet transform at one scale are not only used to represent melodies but also to find relevant segmentation points at the coefficients’ zero crossings. Classification experiments showed that the method is more descriptive in a melody-recognition model than other approaches.
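A minimal sketch of the idea in Python (illustrative only; the kernel construction and all names are assumptions, not the authors’ code):

import numpy as np

# Transform a pitch sequence with a Haar-shaped kernel at one scale and
# take zero crossings of the coefficients as candidate segment boundaries.
def haar_coefficients(pitches, scale):
    kernel = np.concatenate([np.ones(scale), -np.ones(scale)])
    return np.convolve(pitches, kernel, mode='same')

def zero_crossings(coeffs):
    # Indices where the coefficient sign changes.
    return np.where(np.diff(np.sign(coeffs)) != 0)[0] + 1

melody = np.array([60, 62, 64, 65, 64, 62, 60, 59, 60, 62], float)  # MIDI pitches
coeffs = haar_coefficients(melody, scale=2)
print(zero_crossings(coeffs))  # candidate segmentation points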

Bio
For over a year, Gissel Velarde has been working on wavelets applied to symbolic music. She is currently a PhD student at Aalborg University, working on a wavelet-based approach to the analysis of symbolic music under the supervision of David Meredith. Velarde holds a degree in Systems Engineering from the Bolivian Catholic University and a Master of Science in Electronic Systems and Engineering Management from the South Westphalia University of Applied Sciences. In addition, she studied piano at the Bolivian National Conservatory of Music and won, among other awards, first and second prizes at the National Piano Competition in Bolivia (in 1994 and 1997, respectively). From 2006 to 2008, she was a DAAD scholarship holder. In 2010, she received a Best Paper Award nomination at the Industrial Conference on Data Mining in Berlin.

The Harmonic Pattern Function: A Model For Audio/Visual Synthesis – Lance Putnam

Date: 21 November 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

In this talk, I give an overview of my dissertation research concerning the use of harmonics in audio/visual synthesis and composition. I introduce my main contribution to this area, the harmonic pattern function, a mathematical model capable of compactly describing a wide array of sound waveforms and visual patterns. The function is based on a rational function of inverse discrete Fourier transforms. In practice, however, sparse representations are more useful; for this purpose, a simplified notation for specifying the non-zero complex sinusoids comprising the patterns was developed. Additionally, the harmonic pattern function serves as a platform for formalizing relationships between myriad audio/visual pattern-making techniques, spanning from the 18th-century geometric pen to modern digital signal processing.
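To make “a rational function of inverse discrete Fourier transforms” concrete, the pattern function can be sketched (the notation here is an assumption, not Putnam’s exact formulation) as

p(\theta) = \frac{\sum_{n=0}^{N-1} a_n e^{i n \theta}}{\sum_{m=0}^{M-1} b_m e^{i m \theta}},

a ratio of two sums of complex sinusoids evaluated as \theta sweeps the unit circle. Sampling p(\theta) over time yields a sound waveform, while plotting its real part against its imaginary part traces a visual pattern; keeping only a few non-zero coefficients a_n, b_m gives the sparse, compact descriptions mentioned above.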

Bio
Lance Putnam is a composer and researcher with an interest in unified audio/visual synthesis, harmonic patterns, and perceptualization of dynamic systems. He holds a B.S. in Electrical and Computer Engineering from the University of Wisconsin-Madison and both an M.A. in Electronic Music and Sound Design and a Ph.D. in Media Arts and Technology from the University of California, Santa Barbara. In 2006, he was awarded a prestigious NSF IGERT fellowship in Interactive Digital Multimedia. He was selected as one of eight international students to present his research in media signal processing at the 2007 Emerging Leaders in Multimedia Workshop at the IBM T. J. Watson Research Center in New York. His work S Phase has been shown at the 2008 International Computer Music Conference in Belfast, Northern Ireland, and the 2009 Traiettorie Festival in Parma, Italy. From 2008 to 2012, he conducted research in audio/visual synthesis at the AlloSphere Research Facility in Santa Barbara, California.

Brainstorming: Field studies for situated learning – Søren Eskildsen

Date: 23 May 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

Between 2007 and 2009, around 60,000 immigrants came to Denmark each year (Grunnet 2010). The main integration activities for these new citizens are language courses run by the municipalities, which are offered free of charge or for a minimal fee. The goal of these courses is to put the newcomers in a position where they can participate in everyday Danish life. But although the courses establish a thorough theoretical understanding of the language, they often lack the means to motivate the students to apply this knowledge in real cultural settings in their everyday activities, and thus to participate pro-actively in their host society.

To cope with this applicability problem, we suggest an approach, based on our previous work on intercultural communication (e.g. Rehm et al. 2009), that puts the cultural and language learning task in the context of its actual use by embracing new and innovative training methods – such as situated and experience-based learning – that have been shown to be more effective in terms of cultural integration (Landis et al. 2004). Imagine, for example, a situation where the student stands in line at the train station to purchase a ticket. This is an ideal situation to trigger a Danish learning session on buying a train ticket: the student has some time for the session while waiting for his turn, he is in the right context for the knowledge being conveyed, and he is able to apply the knowledge shortly afterwards in a real situation.

In this talk I will present the idea and our approach to building a system/application that supports the claim that context-based learning can be more effective for cultural integration. I hope to gain some insights from the many “immigrants” who currently inhabit the department.

Bio
Søren Eskildsen graduated with an M.Sc. in Medialogy from AAU Aalborg in the summer of 2011, with a main focus on creating virtual environments for investigating learning possibilities through a more vivid approach to creating learning material. This idea paved the way for a 3½-week stay in Namibia, working on the preservation of indigenous knowledge. His master’s thesis focused on learning from a virtual environment and on measuring different activities, such as preferred media. He is currently employed as a research assistant at Medialogy Aalborg, where he is investigating a project on cultural support in a learning context. In addition, he teaches Animation and Graphic Design on the first semester and supervises on several semesters (2nd & 6th).

Software-Based Adjustment for Static Parallax Barriers for Autostereoscopic Mobile Displays – Martin Marko Paprock

Date: 02 May 2012
Time: 13.00-14.00
Place: NJ14 3-228

We show that the autostereoscopic display of stereoscopic images using a static parallax barrier can be improved by adapting the rendering to the angle at which the user is looking at a mobile display; thus, ghosting artifacts and depth reversals can often be avoided even if the user tilts the mobile device. Instead of moving the barrier itself to compensate for a misplacement of the device in relation to the user, shifting pixel columns in software can provide a similar compensation. This requires a parallax barrier in which each section covers two pixel columns at a time. The proposed method has been implemented using OpenGL shaders and a parallax barrier that was designed for a display of exactly half the resolution of the employed display. Technical tests showed good left/right image separation for viewing angles of up to 60 degrees, and preliminary user tests indicate an observable improvement in stereo experience.
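A minimal sketch of the software-side compensation in Python (illustrative only; the geometry, parameter names and numbers are assumptions, not the authors’ implementation):

import math

# Compute how many pixel columns the rendered views should be shifted
# so that, at the given viewing angle, each barrier slit still exposes
# the intended left/right columns.
def column_shift(view_angle_deg, barrier_gap_mm, pixel_pitch_mm):
    offset_mm = barrier_gap_mm * math.tan(math.radians(view_angle_deg))
    return round(offset_mm / pixel_pitch_mm)

# Interleave one row of the left/right images, with the column parity
# shifted to compensate for the viewing angle.
def interleave_row(left_row, right_row, shift):
    return [left_row[x] if (x + shift) % 2 == 0 else right_row[x]
            for x in range(len(left_row))]

# Example: a 15-degree tilt with a 0.5 mm barrier gap and 0.1 mm pixel
# pitch calls for roughly a one-column shift.
shift = column_shift(15, 0.5, 0.1)
print(shift, interleave_row(list('LLLL'), list('RRRR'), shift))

In the OpenGL implementation this per-column decision would presumably live in the fragment shader, but the arithmetic is the same.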