MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval

Today, digital audio applications are part of our everyday lives. Popular examples include audio CDs, MP3 audio players, radio broadcasts, TV or video DVDs, video games, digital cameras with sound tracks, digital camcorders, telephones, telephone answering machines and telephone enquiries using speech or word recognition. Various new and advanced audiovisual applications and services become possible based on audio content analysis and description. Search engines or specific filters can use the extracted description to help users navigate or browse through large collections of data. Digital analysis may discriminate whether an audio file contains speech, music or other audio entities, how many speakers are contained in a speech segment, what gender they are and even which persons are speaking. Spoken content may be identified and converted to text.
Music may be classified into categories, such as jazz, rock, classics, etc. Often it is possible to identify a piece of music even when it is performed by different artists, or to identify an identical audio track even when it is distorted by coding artefacts. Finally, it may be possible to identify particular sounds, such as explosions, gunshots, etc.
We use the term audio to indicate all kinds of audio signals, such as speech, music as well as more general sound signals and their combinations. Our primary goal is to understand how meaningful information can be extracted from digital audio waveforms in order to compare and classify the data efficiently. When such information is extracted, it can often also be stored as a content description in a compact way.
These compact descriptors are of great use not only in audio storage and retrieval applications, but also for efficient content-based classification, recognition, browsing or filtering of data. A data descriptor is often called a feature vector or fingerprint and the process for extracting such feature vectors or fingerprints from audio is called audio feature extraction or audio fingerprinting.
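To make the idea of a feature vector concrete, here is a minimal sketch in Python that reduces one frame of audio samples to three numbers. The choice of features (RMS energy, zero-crossing rate, spectral centroid) and the helper names `frame_features` and `sine` are illustrative assumptions, not the MPEG-7 descriptor definitions or anything taken from the book.

```python
import math


def frame_features(samples, sample_rate):
    """Return [rms, zero_crossing_rate, spectral_centroid_hz] for one frame.

    Hypothetical helper for illustration only; these are generic features,
    not the MPEG-7 low-level descriptors.
    """
    n = len(samples)

    # Root-mean-square energy of the frame.
    rms = math.sqrt(sum(s * s for s in samples) / n)

    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (n - 1)

    # Naive DFT magnitudes for bins 0..n/2 (slow, but dependency-free).
    mags = []
    for k in range(n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n)
                 for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(math.hypot(re, im))

    # Magnitude-weighted mean frequency of the spectrum.
    total = sum(mags)
    if total > 0:
        centroid = sum(k * sample_rate / n * m
                       for k, m in enumerate(mags)) / total
    else:
        centroid = 0.0

    return [rms, zcr, centroid]


def sine(freq, sample_rate, n):
    """Generate n samples of a unit-amplitude sine tone (test signal)."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


# Two synthetic frames: a 500 Hz tone and a 2000 Hz tone at 8 kHz sampling.
low = frame_features(sine(500, 8000, 256), 8000)
high = frame_features(sine(2000, 8000, 256), 8000)
print("500 Hz tone :", low)
print("2000 Hz tone:", high)
```

Both tones have the same energy, so only the second and third entries of the vectors separate them. A classifier or retrieval system would compare such vectors rather than the raw waveforms.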
TABLE OF CONTENTS:
Chapter 1 - Introduction
Chapter 2 - Low Level Descriptors
Chapter 3 - Sound Classification and Similarity
Chapter 4 - Spoken Content
Chapter 5 - Music Description Tools
Chapter 6 - Fingerprinting and Audio Signal Quality
Chapter 7 - Application