Statistically Efficient Methods for Pitch and DOA Estimation – Jesper Rindom Jensen

Date:  06 March 2013
Time: 13.00-14.00
Place: tba

The direction-of-arrival (DOA) and the pitch of multichannel, periodic sources are key parameters in many signal processing methods for, e.g., tracking, separation, enhancement, and compression. Traditionally, the estimation of these parameters has been treated as two separate problems. Separate estimation may make it impossible to resolve sources with similar DOAs or pitches, and it may decrease the estimation accuracy. Therefore, joint estimation of the DOA and pitch has recently been considered. In this talk, we present two novel methods for DOA and pitch estimation. Unlike state-of-the-art methods, both yield maximum-likelihood estimates in white Gaussian noise scenarios where the signal-to-noise ratio (SNR) may differ across channels. The first method is a joint estimator, whereas the second uses a cascaded approach with a much lower computational complexity. Simulation results confirm that the proposed methods outperform state-of-the-art methods in terms of estimation accuracy on both synthetic and real-life signals.
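As a rough sketch of the kind of signal model such estimators build on (my notation and assumptions, not necessarily the exact model of the talk): for a uniform linear array with sensor spacing d, propagation speed c, and sampling rate f_s, a periodic source with fundamental frequency \omega_0 impinging from DOA \theta gives, at sensor k,

    x_k(n) = \sum_{l=1}^{L} a_l \, e^{j l \omega_0 \left( n - k f_s d \sin(\theta)/c \right)} + v_k(n),
    \qquad v_k(n) \sim \mathcal{CN}(0, \sigma_k^2),

where the a_l are complex harmonic amplitudes and the per-channel noise variances \sigma_k^2 let the SNR differ across channels. Under such a white Gaussian noise model, maximum-likelihood estimation of (\omega_0, \theta) amounts to a variance-weighted nonlinear least-squares fit over both parameters, carried out either jointly or in a cascaded fashion.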

Bio
Jesper Rindom Jensen received the M.Sc. and Ph.D. degrees from the Department of Electronic Systems at Aalborg University in 2009 and 2012, respectively. Currently, he is a postdoctoral researcher at the Department of Architecture, Design & Media Technology at Aalborg University. His research interests include spectral analysis, estimation theory, and microphone array signal processing.

Text is not the Enemy: Multimodal interfaces for illiterate people – Hendrik Knoche

Date:  20 February 2013
Time: 13.00-14.00
Place: tba

Rainfed farming provides the bulk of the world’s food supply and has tremendous potential to increase its productivity. Most rainfed farms are operated by farmers from the bottom of the pyramid, who lack information about novel agricultural techniques and about what their peers are doing, and who in many developing areas face high illiteracy rates. Illiteracy does not feature in accessibility guidelines, and its implications for interaction design are poorly understood. Most research on illiterate users in the ICT for development (ICT4D) literature has hitherto focused on mobile phones relying on keypads for input. Research methods of human-computer interaction honed for lab settings need to be re-assessed and modified to conduct studies ‘in the wild’. In this talk I will reflect on the lessons learned from two applications for illiterate users designed within the scope of an ICT4D project, covering design implications, methodological pitfalls, and the role text can play in multimodal interfaces for illiterate people.

Bio
Hendrik Knoche holds an MSc (UoHamburg) and a PhD (UC London) in computer science. His research interests include human-centered design, design thinking, mediated experiences, proxemics, and ICT for development, along with methods for prototyping and evaluating applications and their user experiences “in the wild”. Since October 2012 he has been working at ADMT in Aalborg.

Sensory Perception from Multimodal Spike Inputs (sensory fusion and spatiotemporal perception) – Sam Karimian-Azari

Date:  05 December 2012
Time: 13.00-14.00
Place: tba

Information exchange in the biological nervous system is what animates a living body. Sensory neurons encode the spatiotemporal information of a stimulus into a sequential spike train. Each neural spike carries inter-spike-interval (ISI) timing information as a message, and statistical inference over the ISIs within a temporal window is taken as a percept. When a sensory input is missing, spike generation is predicted from the prior percept, and the uncertainty of that percept must increase accordingly. Multimodal sensory fusion improves perception by decreasing this uncertainty, and combining fusion with prediction improves perception psychophysically. The Belief Propagation (BP) algorithm is used to perform sensory message passing in a spiking network. The most recent messages from the asynchronous spikes of different modalities are preserved in a memory to form associations. A stationary percept can be constructed from the memorized messages when a sense is missing, but the percept has to be dynamic to predict the spatiotemporal information of a dynamic stimulus psychophysically. In this research, we investigate sensory perception through a stimulated spiking visual sensor and address the attenuation of preserved information by a drift-diffusion phenomenon to obtain a dynamic percept. We investigate experimentally how varying statistical features optimizes the spiking network for multimodal sensory perception over time and space.
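As a toy illustration of the fusion step (a minimal sketch of my own; the Gaussian message assumption, function names, and numbers are mine, not from the talk), precision-weighted combination of two ISI-based estimates always yields a percept with lower uncertainty than either modality alone:

    import numpy as np

    def isi(spike_times):
        # Inter-spike intervals of a (sorted) array of spike times.
        return np.diff(spike_times)

    def fuse(mu_a, var_a, mu_b, var_b):
        # Precision-weighted (inverse-variance) fusion of two Gaussian
        # estimates; the fused variance is smaller than either input's.
        w_a, w_b = 1.0 / var_a, 1.0 / var_b
        var = 1.0 / (w_a + w_b)
        mu = var * (w_a * mu_a + w_b * mu_b)
        return mu, var

    # Two modalities observing the same rhythmic stimulus (times in ms):
    visual = isi(np.array([0.0, 10.2, 19.8, 30.1]))
    audio = isi(np.array([0.0, 9.7, 20.3, 29.9]))
    mu, var = fuse(visual.mean(), visual.var(ddof=1),
                   audio.mean(), audio.var(ddof=1))
    print(mu, var)  # fused ISI estimate with reduced uncertainty

In a belief-propagation network the same combination appears as a product of Gaussian messages; when one modality goes silent, its last preserved message can stand in, with its uncertainty inflated over time to reflect the attenuation of the memorized information.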

Wavelet representation of melodic shape – Gissel Velarde

Date: 27 November 2012 (TUESDAY!)
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

The multi-scale nature of wavelet analysis makes it particularly attractive as a method for analysing melodies. In this talk, I will present a methodology, developed jointly with Tillman Weyde, for representing and segmenting symbolic melodies. The wavelet coefficients obtained by the Haar continuous wavelet transform at a single scale are used not only to represent melodies, but also to find relevant segmentation points at the coefficients’ zero crossings. Classification experiments showed that the method is more descriptive in a melody-recognition model than other approaches.
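A minimal sketch of the core computation as described above (the scale, pitch values, and function names here are placeholders of mine): convolve a symbolic melody with a single-scale Haar kernel and take the zero crossings of the coefficients as candidate segment boundaries.

    import numpy as np

    def haar_cwt(signal, scale):
        # Wavelet coefficients at a single scale: convolve the signal
        # with a Haar kernel (+1 over the first half of its support,
        # -1 over the second), normalized by 1/sqrt(scale).
        half = scale // 2
        kernel = np.concatenate([np.ones(half), -np.ones(half)]) / np.sqrt(scale)
        return np.convolve(signal, kernel, mode="same")

    def zero_crossings(coeffs):
        # Indices where the coefficient sign flips: candidate
        # segmentation points in the melody.
        return np.where(np.diff(np.sign(coeffs)) != 0)[0] + 1

    # Toy melody as MIDI pitch numbers (placeholder values):
    melody = np.array([60, 62, 64, 65, 64, 62, 60, 59, 60, 62], dtype=float)
    coeffs = haar_cwt(melody, scale=4)
    print(zero_crossings(coeffs))

The sign of a Haar coefficient roughly tracks whether the local melodic contour is rising or falling, so its zero crossings land where the contour changes direction.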

Bio
For over a year, Gissel Velarde has been working on wavelets applied to symbolic music. She is currently a PhD student at Aalborg University, working on a wavelet-based approach to the analysis of symbolic music under the supervision of David Meredith. Velarde holds a degree in Systems Engineering from the Bolivian Catholic University and a Master of Science in Electronic Systems and Engineering Management from the South Westphalia University of Applied Sciences. In addition, she studied piano at the Bolivian National Conservatory of Music and won, among other awards, first and second prizes at the National Piano Competition in Bolivia (in 1994 and 1997, respectively). From 2006 to 2008, she was a DAAD scholarship holder. In 2010, she received a Best Paper Award nomination at the Industrial Conference on Data Mining in Berlin.

The Harmonic Pattern Function: A Model For Audio/Visual Synthesis – Lance Putnam

Date: 21 November 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

In this talk, I give an overview of my dissertation research concerning the use of harmonics in audio/visual synthesis and composition. I introduce my main contribution to this area, the harmonic pattern function: a mathematical model capable of compactly describing a wide array of sound waveforms and visual patterns. The function is based on a rational function of inverse discrete Fourier transforms. In practice, however, sparse representations are more useful, so a simplified notation was developed for specifying the non-zero complex sinusoids comprising the patterns. Additionally, the harmonic pattern function serves as a platform for formalizing relationships between myriad audio/visual pattern-making techniques, spanning from the 18th-century geometric pen to modern digital signal processing.
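For a feel of the general shape of such a model, here is a rough sketch in my own notation (the precise definition is given in the dissertation): a ratio of two finite sums of complex sinusoids,

    p(\theta) = \frac{\sum_k a_k \, e^{j k \theta}}{\sum_m b_m \, e^{j m \theta}},
    \qquad \theta \in [0, 2\pi),

where each sum acts as an inverse discrete Fourier transform of a sparse set of coefficients a_k and b_m. Tracing the real and imaginary parts of p(\theta) in the plane produces a visual pattern, while sampling either part over time produces a sound waveform.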

Bio
Lance Putnam is a composer and researcher with an interest in unified audio/visual synthesis, harmonic patterns, and the perceptualization of dynamic systems. He holds a B.S. in Electrical and Computer Engineering from the University of Wisconsin, Madison and both an M.A. in Electronic Music and Sound Design and a Ph.D. in Media Arts and Technology from the University of California, Santa Barbara. In 2006, he was awarded a prestigious NSF IGERT fellowship in Interactive Digital Multimedia. He was selected as one of eight international students to present his research in media signal processing at the 2007 Emerging Leaders in Multimedia Workshop at the IBM T. J. Watson Research Center in New York. His work, S Phase, has been shown at the 2008 International Computer Music Conference in Belfast, Northern Ireland, and at the 2009 Traiettorie Festival in Parma, Italy. From 2008 to 2012, he conducted research in audio/visual synthesis at the AlloSphere Research Facility in Santa Barbara, California.