I just finished teaching “Interactive Systems Programming”, one of the graduate courses in the Medialogy section of the Department of Architecture, Design and Media Technology at Aalborg University Copenhagen. No single set of lectures and exercises I could design would have interested most of the students, because the subject matter is so broad and spans everything from theory to practice. (I specialize in audio and music, so my presentation would have been extremely unbalanced too.) Instead, I had each student select one research paper in his or her field of interest that relates in some way to interactivity and to making human-computer interaction more natural. Over the duration of the course, I helped each student read his or her paper, write a critical annotation of it, and attempt to reproduce some aspect of it, whether by building an interface, running an experiment, or testing an algorithm. On the final day the students presented their papers and results. The selected papers are extremely varied, and give a good glimpse of state-of-the-art work (10 of the 15 papers are from 2008 or later). Since I spent time with each of the papers over the past month, I give below my one-line description of each.
P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. Pattern Anal. Machine Intell., vol. 19, no. 7, pp. 711-720, July 1997.
Class-specific features found with Fisher’s linear discriminant can improve automatic face recognition in poor lighting conditions much more than the class-agnostic features of eigenfaces.
B. O. Peters, G. Pfurtscheller, and H. Flyvbjerg, “Automatic differentiation of multichannel EEG signals,” IEEE Trans. Biomedical Eng., vol. 48, no. 1, pp. 111-116, Jan. 2001.
The authors take multichannel EEGs and test how well they can discriminate between two different physical movements using artificial neural networks and features from tenth-order autoregressive models, without targeting any particular placement on the scalp.
C. Loscos, D. Marchal, A. Meyer, “Intuitive Crowd Behaviour in Dense Urban Environments using Local Laws,” Proc. IEEE Theory Practice Computer Graphics, pp. 122-129, Birmingham, UK, June 2003.
In real time, the authors simulate 10,000 people walking (in goal-oriented ways, and in interacting groups) in an urban environment viewed from above.
O. Hilliges, P. Holzer, R. Klüber and A. Butz, “AudioRadar: A Metaphorical Visualization for the Navigation of Large Music Collections,” Lecture Notes in Computer Science, vol. 4073/2006, pp. 82-92, 2006.
We can visualize relationships and similarities in a collection of music by positioning pieces on a radar-like display, but sometimes the high-level descriptors used for placement are not meaningful.
B. Pardo, “Finding structure in audio for music information retrieval,” IEEE Signal Process. Mag., pp. 126-132, May 2006.
The author surveys the state of the art (as of 2006) in audio search and retrieval, covering query by humming, audio fingerprinting, and musical source separation.
R. Murray-Smith, J. Williamson, S. Hughes, T. Quaade, “Stane: Synthesized Surfaces for Tactile Input,” in Proc. SIGCHI Conf. Human Factors in Computing Systems, pp. 1299-1302, (Florence, Italy), Apr. 2008.
Scratching and rubbing this alien device creates vibrations that a small internal microphone senses, and these can then control music playback.
H. Danielsiek, R. Stuer, A. Thom, N. Beume and B. Naujoks, “Intelligent Moving of Groups in Real-Time Strategy Games,” Proc. IEEE Symp. Computational Intell. Games, pp. 71-78, Perth, Australia, Dec. 2008.
The authors combine flocking behavior with influence maps to create a better method for moving characters in real-time strategy games.
M. McGuire and D. Luebke, “Hardware-Accelerated Global Illumination by Image Space Photon Mapping,” ACM SIGGRAPH/EuroGraphics High Performance Graphics, 2009.
Here is an excellent solution to the high cost of ray tracing with complex light sources and caustic-producing materials in interactive environments; it is the intellectual property of NVIDIA.
S.-P. Chao, Y.-Y. Chen, and W.-C. Chen, “The cost-effective method to develop a real-time motion capture system,” Proc. Int. Conf. Computer Sciences Convergence Info. Tech., pp. 494-498, Seoul, Korea, Nov. 2009.
The authors create an effective real-time motion capture system using four web cams, a vector codebook approach to background subtraction, and a simple camera calibration method.
C. D. Ward and P. I. Cowling, “Monte Carlo Search Applied to Card Selection in Magic The Gathering,” IEEE Symp. Computational Intell. Games, pp. 9-16, Milano, Italy, Sep. 2009.
A Monte Carlo approach can create challenging opponents that play in real time for games whose search spaces are so enormous that other approaches are too difficult to implement.
H. Wang, O. Malik and A. Nareyek, “Multi-Unit Tactical Pathplanning,” IEEE Symp. Computational Intell. Games, pp. 349-354, Milano, Italy, Sep. 2009.
Planning paths for several entities in a game is more than going from point A to B; you might also want to avoid C while staying two steps behind D.
B. Moens, L. van Noorden, and M. Leman, “D-Jogger: Syncing Music with Walking,” Proc. Sound and Music Computing, Barcelona, Spain, July 2010.
Experiments show that people synchronize their walking speed with music tempo.
L. A. Ludovico, D. A. Mauro, and D. Pizzamiglio, “Head in Space: A Head-Tracking Based Binaural Spatialization System,” Proc. Sound and Music Computing, Barcelona, Spain, July 2010.
The authors combine head tracking with binaural spatialization to create an interactive and realistic experience of sound in an environment.
F. Kelly and N. Harte, “A Comparison of Auditory Features for Robust Speech Recognition,” Proc. European Signal Process. Conf., pp. 1968-1972, Aalborg, Denmark, Aug. 2010.
Speech features found using a gammatone filterbank with physiologically inspired nonlinear power thresholding make for a very robust automatic speech recognition system.
E. R. Miranda, “Plymouth brain-computer music interfacing project: from EEG audio mixers to composition informed by cognitive neuroscience,” Int. J. Arts and Tech., vol. 3, nos. 2/3, pp. 154-176, 2010.
We can extract information from BCIs to control musical devices such as audio mixers and to generate melodies, but several challenges still impede higher-level musical control.