Music Informatics and Cognition Group (MusIC)


Date: 24 September 2014
Time: 12.45-14.00
Place: RDB14 3.429

The Music Informatics and Cognition Research Group (MusIC) designs, implements and evaluates algorithms for automatically carrying out tasks that, if performed by humans, would be considered to require musical expertise. The group shares the common goal of understanding, at the computational and algorithmic levels, the cognitive processes that operate when musical experts compose, analyse, improvise and perform music. Although unified by this common goal, the group’s members adopt quite different but complementary approaches to achieving it. In this presentation, Gissel Velarde will introduce her work on using wavelet-based methods to segment, classify and discover patterns in melodies. Olivier Lartillot will present a new simplified model of motivic analysis based on multiparametric closed pattern and cyclic sequence mining. Brian Bemman will report on his attempts to develop an algorithmic model of Milton Babbitt’s compositional procedures. David Meredith will present his most recent work on geometric pattern discovery in music, focusing on methods for automatically discovering maximal scalable patterns.
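To give a flavour of the geometric approach mentioned above: in this line of work a piece is represented as a set of (onset, pitch) points, and patterns are point subsets that recur under translation. The sketch below computes maximal translatable patterns in the style of Meredith's earlier SIA work; it is only an illustration of the representation, not the maximal *scalable* pattern method presented in the talk (which additionally allows scaling), and the toy note set is invented.

```python
from collections import defaultdict

def maximal_translatable_patterns(points):
    # Group points by the difference vector that maps them onto another
    # point of the set; each group is a maximal translatable pattern (MTP).
    pts = sorted(points)
    mtps = defaultdict(list)
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            v = (q[0] - p[0], q[1] - p[1])  # translation vector from p to q
            mtps[v].append(p)
    return mtps

notes = [(0, 60), (1, 62), (2, 64), (4, 65), (5, 67), (6, 69)]  # toy melody
for v, pattern in maximal_translatable_patterns(notes).items():
    if len(pattern) > 2:
        print(v, pattern)  # (4, 5): the opening three notes recur transposed
```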

For more information: http://www.create.aau.dk/music

Guest lecture: Interacting at different ends of the scale: Massive interaction surfaces and minimal ambient & tangible interaction devices: The MiniOrb and CubIT projects – Markus Rittenbruch


Date: 12 June
Time: 13.00-14.00
Place: KAR6B-102

In this talk I will present two projects conducted at QUT’s Institute for Future Environments, which consider user interaction at significantly different scales:

The MiniOrb project addresses the question of how to design systems that aid office inhabitants in controlling their localised office environments, as well as support the process of negotiating shared preferences amongst co-located inhabitants. I will describe the design, use and evaluation of MiniOrb, a system that employs ambient and tangible interaction mechanisms to allow inhabitants of office environments to report on subjectively perceived office comfort levels. MiniOrb consists of a sensor device measuring localised environmental conditions, and two input/output devices: an ambient and tangible interaction device that allows users to maintain peripheral awareness of environmental factors and provides a tangible input mechanism to report personal preferences, and a mobile application, which displays measurements in a more explicit manner and allows for touch-based user input. Our aim was to explore the role of ubiquitous computing in the individual control of indoor climate and, specifically, to answer the question of to what extent ambient and tangible interaction mechanisms are suited to the task of capturing individual comfort preferences in an unobtrusive manner.
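The abstract does not specify how reported preferences are combined; purely as a hypothetical sketch of the negotiation idea (aggregating individual comfort reports into a shared setting), one might do something like the following. The function name, the median rule and the damping factor are all assumptions for illustration, not MiniOrb's actual mechanism.

```python
from statistics import median

def negotiate_setpoint(preferences, current):
    # preferences: inhabitant -> preferred temperature, as reported via a
    # tangible input device; move the shared setpoint toward the median
    # preference with a damped step (a hypothetical negotiation rule).
    if not preferences:
        return current
    target = median(preferences.values())
    return current + 0.5 * (target - current)

prefs = {"alice": 21.0, "bob": 23.5, "carol": 22.0}
print(negotiate_setpoint(prefs, current=24.0))  # -> 23.0, nudged toward 22.0
```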
The CubIT project developed a large-scale multi-user presentation and collaboration platform running on QUT’s Cube facility. “The Cube” is a unique facility that combines 48 large multi-touch screens and very large-scale projection surfaces to form a very large interactive learning and engagement space. The CubIT system was specifically designed to allow QUT staff and students to utilise the capabilities of the Cube. CubIT’s primary purpose is to enable users to upload, interact with and share their own media content on the Cube’s display surfaces using a shared workspace approach. CubIT combines multiple interfaces (multi-touch, mobile and web), each of which plays a different role and supports different interaction mechanisms, supporting a range of collaborative features including multi-user shared workspace interaction, drag-and-drop upload and sharing between users, session management and dynamic state control between different parts of the system. I will briefly introduce the Cube facility and describe the design and implementation of the CubIT system.
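As a rough, hypothetical illustration of the shared-workspace idea (state changes made on any interface are broadcast to all connected clients), consider the following sketch; the class and method names are invented for illustration and are not CubIT's API.

```python
import json
from dataclasses import dataclass, field

@dataclass
class SharedWorkspace:
    # One session spanning wall, mobile and web clients (hypothetical model).
    items: dict = field(default_factory=dict)    # media_id -> item state
    clients: list = field(default_factory=list)  # connected client callbacks

    def upload(self, media_id, owner):
        # A user uploads media from any interface; everyone sees it appear.
        self.items[media_id] = {"owner": owner, "x": 0.0, "y": 0.0}
        self._broadcast({"op": "add", "id": media_id, "owner": owner})

    def move(self, media_id, x, y):
        # Dragging on the multi-touch wall updates state on all clients.
        self.items[media_id].update(x=x, y=y)
        self._broadcast({"op": "move", "id": media_id, "x": x, "y": y})

    def _broadcast(self, event):
        # Keep every interface in sync by replaying each state change.
        for send in self.clients:
            send(json.dumps(event))
```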

Bio
Dr Markus Rittenbruch is a Senior Research Fellow with the Institute for Future Environments (IFE) at the Queensland University of Technology (QUT) and a member of QUT’s Urban Informatics Research Lab. He has over 19 years of research experience in the fields of Human-Computer Interaction (HCI), Computer Supported Cooperative Work (CSCW), and Ubiquitous Computing (UbiComp). Before joining QUT, he was invited to work at various research organisations in Germany and Australia, including the University of Bonn, the Distributed Systems Technology Centre (DSTC), the University of Queensland, the Australasian CRC for Interaction Design (ACID), and NICTA, Australia’s Information and Communications Technology Centre of Excellence.

Markus’ research focuses on solving problems in urban and organisational contexts through the design of engaging, innovative interaction technologies and approaches. His interests include the design of collaborative software, advanced models of awareness in groupware, in particular contextual and intentional awareness, social software, ambient, ubiquitous and physical computing, natural user interfaces and different ways of interfacing with sensors and sensor data. Markus has authored and co-authored over 50 publications in journals, edited books, and conference proceedings, including publications in the leading journals in HCI (Human-Computer Interaction Journal) and CSCW (Journal on Computer Supported Cooperative Work).

Fast Joint DOA and Pitch Estimation Using a Broadband MVDR Beamformer – Sam Karimian-Azari


Date:  28 August 2013
Time: 13.00-14.00
Place: SOF9.101

The harmonic model, i.e., a sum of sinusoids having frequencies that are integer multiples of the pitch, has been widely used for modeling voiced speech. In microphone arrays, the direction-of-arrival (DOA) adds an additional parameter that can help in obtaining a robust procedure for tracking non-stationary speech signals in noisy conditions. In this work, a joint DOA and pitch estimation (JDPE) method is proposed. The method is based on the minimum variance distortionless response (MVDR) beamformer in the frequency domain and is much faster than previous joint methods, as it only requires the computation of the optimal filters once per segment. To exploit the fact that both pitch and DOA evolve piecewise smoothly over time, we also extend a dynamic programming approach to joint smoothing of both parameters. Simulations show that the proposed method is much more robust than parallel and cascaded methods combining existing DOA and pitch estimators.
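To make the joint estimation idea concrete, here is a minimal sketch that evaluates a summed MVDR pseudo-spectrum over a (pitch, DOA) grid for a uniform linear array and picks the maximum. The exhaustive grid search, the array geometry (mic spacing d, speed of sound c) and the precomputed per-bin covariance inverses are illustrative assumptions; this is a sketch of the general principle, not the authors' fast once-per-segment implementation.

```python
import numpy as np

def steering_vector(f, theta, n_mics, d=0.04, c=343.0):
    # Uniform linear array response at frequency f (Hz) and DOA theta (rad).
    m = np.arange(n_mics)
    return np.exp(-2j * np.pi * f * m * d * np.sin(theta) / c)

def mvdr_power(R_inv, a):
    # MVDR output power 1 / (a^H R^{-1} a) for one frequency bin.
    return 1.0 / np.real(a.conj() @ R_inv @ a)

def joint_doa_pitch(R_inv_per_bin, freqs, pitch_grid, theta_grid, n_harm, n_mics):
    # Sum the MVDR power over the harmonics l*f0 and maximise jointly
    # over pitch f0 and DOA theta.
    best_p, best_f0, best_theta = -np.inf, None, None
    for f0 in pitch_grid:
        for theta in theta_grid:
            p = 0.0
            for l in range(1, n_harm + 1):
                k = int(np.argmin(np.abs(freqs - l * f0)))  # nearest STFT bin
                a = steering_vector(freqs[k], theta, n_mics)
                p += mvdr_power(R_inv_per_bin[k], a)
            if p > best_p:
                best_p, best_f0, best_theta = p, f0, theta
    return best_f0, best_theta
```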

Bio
Sam Karimian-Azari received the B.Sc. degree in electrical engineering in 2001 from Isfahan University of Technology, Iran, and the M.Sc. degree in electrical engineering with emphasis on signal processing in 2012 from Blekinge Institute of Technology, Sweden. He is currently a Ph.D. fellow in electrical engineering at Aalborg University under the supervision of Mads Græsbøll Christensen. His research interests include speech signal processing and microphone array processing techniques.


Statistically Efficient Methods for Pitch and DOA Estimation – Jesper Rindom Jensen


Date:  06 March 2013
Time: 13.00-14.00
Place: tba

The direction-of-arrival (DOA) and the pitch of multichannel, periodic sources are key parameters in many signal processing methods for, e.g., tracking, separation, enhancement, and compression. Traditionally, the estimation of these parameters has been considered as two separate problems. Separate estimation may render the task of resolving sources with similar DOA or pitch impossible, and it may decrease the estimation accuracy. Therefore, joint estimation of the DOA and pitch has recently been considered. In this talk, we present two novel methods for DOA and pitch estimation. They both yield maximum-likelihood estimates in white Gaussian noise scenarios where the signal-to-noise ratio (SNR) may differ across channels, as opposed to state-of-the-art methods. The first method is a joint estimator, whereas the second uses a cascaded approach, but with a much lower computational complexity. The simulation results confirm that the proposed methods outperform state-of-the-art methods in terms of estimation accuracy in both synthetic and real-life signal scenarios.
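As a hedged sketch of the kind of criterion such estimators optimize: under a harmonic signal model with pitch ω0, DOA θ, and white Gaussian noise whose variance σm² differs across the M channels, maximum-likelihood estimation reduces to a weighted nonlinear least-squares fit. The notation below (Zm for a channel's delayed harmonic basis, αm for the complex amplitudes) is illustrative, not the authors' own:

```latex
(\hat{\theta}, \hat{\omega}_0)
  = \arg\min_{\theta,\,\omega_0}\;
    \sum_{m=1}^{M} \frac{1}{\sigma_m^2}
    \left\| \mathbf{x}_m - \mathbf{Z}_m(\theta,\omega_0)\,\boldsymbol{\alpha}_m \right\|_2^2
```

The per-channel weighting 1/σm² is what preserves statistical efficiency when the SNR varies across channels, which is exactly the scenario the talk's methods are designed for.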

Bio
Jesper Rindom Jensen received the M.Sc. and Ph.D. degrees from the Department of Electronic Systems at Aalborg University in 2009 and 2012, respectively. Currently, he is a postdoctoral researcher at the Department of Architecture, Design & Media Technology at Aalborg University. His research interests include spectral analysis, estimation theory and microphone array signal processing.

Text is not the Enemy: Multimodal interfaces for illiterate people – Hendrik Knoche


Date:  20 February 2013
Time: 13.00-14.00
Place: tba

Rainfed farming provides the bulk of the world’s food supply and has tremendous potential to increase its productivity. Most rainfed farms are operated by farmers from the bottom of the pyramid, who lack information about novel agricultural techniques and about what their peers are doing, and who, in many developing areas, have high illiteracy rates. Illiteracy does not feature in accessibility guidelines, and its implications for interaction design are poorly understood. Most of the research on illiterate users in the ICT for development (ICT4D) literature has hitherto focused on mobile phones relying on keypads for input. Research methods of human-computer interaction honed for lab settings need to be re-assessed and modified to conduct studies ‘in-the-wild’. In this talk I will reflect on the lessons learned from two applications for illiterate users designed within the scope of an ICT4D project, covering design implications, methodological pitfalls and the role text can play in multimodal interfaces for illiterate people.

Bio
Hendrik Knoche holds an MSc (University of Hamburg) and a PhD (University College London) in computer science. His research interests include human-centered design, design thinking, mediated experiences, proxemics, and ICT for development, along with methods for prototyping and evaluating applications and their user experiences “in the wild”. Since October 2012, he has been working at the Department of Architecture, Design & Media Technology (ADMT) in Aalborg.

Sensory Perception from Multimodal Spike Inputs (sensory fusion and spatiotemporal perception) – Sam Karimian-Azari


Date:  05 December 2012
Time: 13.00-14.00
Place: tba

Information processing in the biological nervous system animates a living body. Sensory neurons encode the spatiotemporal information of a stimulus into a sequential spike train. Each neural spike carries inter-spike-interval (ISI) timing information as a message, and a statistical inference over the ISIs within a temporal window is taken as a percept. When a sensory input is missing, spike generation is predicted from the prior percept, and the uncertainty of this percept has to increase. Multimodal sensory fusion improves the percept by decreasing this uncertainty, and combining fusion with prediction improves the percept in a psychophysically plausible way. The Belief Propagation (BP) algorithm is used to perform sensory message passing in a spiking network. The most recent messages from the asynchronous spikes of the different modalities are preserved to form an association in memory. A stationary percept can be constructed from the memorized messages when a sense is missing, but the percept has to be dynamic to predict the spatiotemporal information of a dynamic stimulus. In this research, we investigate sensory perception through a simulated spiking visual sensor and address the attenuation of preserved information via a drift-diffusion process to obtain a dynamic percept. We examine how varying the statistical features optimizes the spiking network for multimodal sensory perception over time and space.
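The claim that fusing modalities decreases uncertainty follows the standard cue-combination result: for independent Gaussian estimates, precision-weighted fusion always yields a smaller variance than either input. A minimal sketch of that result (illustrative only, not the talk's spiking-network implementation):

```python
def fuse(mu1, var1, mu2, var2):
    # Precision-weighted fusion of two independent Gaussian estimates.
    # The fused variance 1/(1/var1 + 1/var2) is smaller than both inputs.
    w1, w2 = 1.0 / var1, 1.0 / var2
    var = 1.0 / (w1 + w2)
    mu = var * (w1 * mu1 + w2 * mu2)
    return mu, var

# e.g. a visual and an auditory estimate of the same stimulus position
print(fuse(0.0, 4.0, 1.0, 1.0))  # -> (0.8, 0.8): pulled toward the reliable cue
```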

Wavelet representation of melodic shape – Gissel Velarde


Date: 27 November 2012 (TUESDAY!)
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

The multi-scale aspect of wavelet analysis makes it particularly attractive as a method for analysing melodies. In this talk, I will present a methodology that I developed jointly with Tillman Weyde to represent and segment symbolic melodies. The wavelet coefficients obtained by the Haar continuous wavelet transform at one scale are not only used to represent melodies, but also to find relevant segmentation points at the coefficients’ zero crossings. Classification experiments showed that the method is more descriptive in a melody-recognition model than other approaches.
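A minimal sketch of the idea, assuming the melody is given as a sampled MIDI pitch sequence: convolve it with a Haar wavelet at a single scale and take the zero crossings of the coefficient signal as candidate segment boundaries. The scale value and the toy melody are illustrative assumptions.

```python
import numpy as np

def haar_coefficients(pitches, scale):
    # Haar wavelet at one scale: +1 over the first half of its support,
    # -1 over the second half; convolve it with the pitch signal.
    w = np.concatenate([np.ones(scale), -np.ones(scale)]) / np.sqrt(2 * scale)
    return np.convolve(pitches, w, mode="same")

def zero_crossings(coeffs):
    # Sign changes in the coefficient signal -> candidate boundaries.
    return np.where(np.diff(np.sign(coeffs)) != 0)[0] + 1

melody = np.array([60, 62, 64, 65, 67, 65, 64, 62, 60, 62, 64, 66], float)
print(zero_crossings(haar_coefficients(melody, scale=2)))
```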

Bio
For over a year, Gissel Velarde has been working on wavelets applied to symbolic music. Currently she is a PhD student at Aalborg University, working on a wavelet-based approach for the analysis of symbolic music under the supervision of David Meredith. Velarde holds a degree in Systems Engineering from the Bolivian Catholic University and a Master of Science in Electronic Systems and Engineering Management from the South Westphalia University of Applied Sciences. In addition, she studied piano at the Bolivian National Conservatory of Music and won, among other prizes, first and second prizes at the National Piano Competition in Bolivia (1994 and 1997, respectively). From 2006 to 2008, she was a DAAD scholarship holder. In 2010, she received a Best Paper Award nomination at the Industrial Conference on Data Mining in Berlin.

The Harmonic Pattern Function: A Model For Audio/Visual Synthesis – Lance Putnam


Date: 21 November 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

In this talk, I give an overview of my dissertation research concerning the use of harmonics in audio/visual synthesis and composition. I introduce my main contribution to this area, the harmonic pattern function, a mathematical model capable of compactly describing a wide array of sound waveforms and visual patterns. The function is based on a rational function of inverse discrete Fourier transforms. In practice, however, sparse representations are more useful. For this purpose, a simplified notation for specifying the non-zero complex sinusoids comprising the patterns was developed. Additionally, the harmonic pattern function serves as a platform for formalizing relationships between myriad audio/visual pattern-making techniques, spanning from the 18th-century geometric pen to modern digital signal processing.
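As a rough illustration of the “rational function of inverse discrete Fourier transforms” idea, the sketch below evaluates a ratio of two sparse complex Fourier series over one period; its real part can be read as a waveform and its trace in the complex plane as a visual pattern. The coefficient dictionaries are invented examples, and the notation is not the dissertation's own.

```python
import numpy as np

def harmonic_pattern(num, den, n=2048):
    # Evaluate a ratio of sparse complex Fourier series over one period.
    # num and den map harmonic number k -> complex amplitude.
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    N = sum(a * np.exp(1j * k * t) for k, a in num.items())
    D = sum(b * np.exp(1j * k * t) for k, b in den.items())
    return N / D

# A circle modulated by a 5th harmonic (hypothetical coefficients).
z = harmonic_pattern({1: 1.0, 5: 0.3}, {0: 1.0})
x, y = z.real, z.imag  # plot (x, y) as a visual pattern; x alone is a waveform
```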

Bio
Lance Putnam is a composer and researcher with interests in unified audio/visual synthesis, harmonic patterns, and perceptualization of dynamic systems. He holds a B.S. in Electrical and Computer Engineering from the University of Wisconsin, Madison and both an M.A. in Electronic Music and Sound Design and a Ph.D. in Media Arts and Technology from the University of California, Santa Barbara. In 2006, he was awarded a prestigious NSF IGERT fellowship in Interactive Digital Multimedia. He was selected as one of eight international students to present his research in media signal processing at the 2007 Emerging Leaders in Multimedia Workshop at the IBM T. J. Watson Research Center in New York. His work, S Phase, has been shown at the 2008 International Computer Music Conference in Belfast, Northern Ireland and the 2009 Traiettorie Festival in Parma, Italy. From 2008 to 2012, he conducted research in audio/visual synthesis at the AlloSphere Research Facility in Santa Barbara, California.

Koldinghus Augmented: Memories of the Walls and The Castle Chapel – Jakob Madsen, Claus Madsen


Date: 14 November 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

Since June 2011 we have participated in two Computer Graphics-oriented projects centered on the dissemination of historical knowledge, with a focus on Koldinghus Castle in Kolding, Denmark. Koldinghus played a major role as a part-time residence for a succession of kings from circa 1200 to 1700. In 1808 the castle was destroyed in a fire. The castle is now partly restored and houses a museum.

We will present two different projects: 1) an iPad based Augmented Reality game for children, and 2) a Virtual Reality visual reconstruction of the castle chapel as it appeared in early 1600 during King Christian IV’s reign.

Memories of the Walls is an interactive Augmented Reality game in which children use an iPad handed out by the museum to go through an interactive narrative and solve puzzles at various stations scattered throughout the castle. In a longitudinal study, we have mined the logging data from the iPads, giving us a look into what the children actually do while playing the game. The experiences from a 3-month trial period are not entirely positive, and we discuss the things we have learnt from the project.

The Castle Chapel is a visual reconstruction of how the castle’s chapel looked 400 years ago. We have modeled the chapel based on all available historical material, as well as on 3D scans of surviving limestone carvings, etc. The visual reconstruction is displayed to visitors on a pair of “view finder” stands, allowing users to look around inside the chapel.

MEL: A Geometric Language for Representing Musical Structure – David Meredith


CANCELED, will be presented next semester

Date: 7 November 2012
Time: 13.00-14.00
Place: NJ14 3-228 (Las Vegas)

The work described here takes its starting point in the idea that the best ways of understanding a particular musical object are those that are represented by the shortest possible descriptions of that object. I also propose that the goals of both music analysis and music perception are to find the shortest possible descriptions of musical objects. Note that a musical “object” could be a short musical fragment, a song, a multi-movement work or even a whole corpus of pieces. The goal of the research presented in this talk is to design an encoding language capable of expressing parsimonious descriptions of musical objects. This language must be able to express the types of equivalence relations that occur between musical structures, since a description of an object can often be shortened (i.e., compressed) by taking advantage of such equivalences that exist between parts of the object. The most important type of equivalence in music is translational equivalence within pitch-time space. However, musical translation is different from Euclidean geometric translation because pitch-time space can be transformed by pitch alphabets (periodic subsets of the pitch dimension) and rhythms (periodic subsets of the time dimension). Common examples of pitch alphabets are the usual scales and chords used in tonal music. Both pitch alphabets and rhythms can be represented by periodic masks, organised into mask sequences. Examples will be given of parsimonious descriptions (encodings) of musical objects that employ masks and mask sequences and an algorithm will be introduced that attempts to find such encodings from in extenso descriptions of musical objects.
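To illustrate the role of pitch alphabets: translation within an alphabet’s own coordinates corresponds to diatonic transposition in ordinary pitch space, which is why such equivalences can compress descriptions that plain Euclidean translation cannot. A minimal sketch, assuming a C major alphabet encoded as a periodic mask of period 12; the encoding and function names are illustrative, not MEL’s actual syntax.

```python
# C major as a periodic subset (mask) of the pitch dimension, period 12.
MAJOR = [0, 2, 4, 5, 7, 9, 11]

def to_alphabet(pitch, alphabet=MAJOR, period=12):
    # Map a MIDI pitch (assumed to lie in the alphabet) onto the
    # alphabet's own integer coordinates.
    octave, pc = divmod(pitch, period)
    return octave * len(alphabet) + alphabet.index(pc)

def from_alphabet(idx, alphabet=MAJOR, period=12):
    octave, step = divmod(idx, len(alphabet))
    return octave * period + alphabet[step]

def translate(notes, dt, dp):
    # Translate (onset, pitch) points by dt in time and dp alphabet steps.
    return [(t + dt, from_alphabet(to_alphabet(p) + dp)) for t, p in notes]

theme = [(0, 60), (1, 64), (2, 67)]  # C-E-G
print(translate(theme, 4, 1))        # [(4, 62), (5, 65), (6, 69)]: D-F-A
```

Under this encoding, a diatonic sequence is literally a translation, so a repeated figure needs only one pattern description plus a list of translation vectors.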