WWW2010 Talk

Title (256 chars max)

Unlocking the potential of audio in the browser

Abstract (2000 chars max)

The HTML5 specification introduces the audio and video media elements, and with them the opportunity to dramatically change the way we integrate media on the web. The current API provides ways to play audio and video and to get limited information about them, but gives no way to programmatically access or create such media. We present a new API for these media elements, as well as a working Firefox prototype, which allows web developers to read and write raw audio data. Through numerous working demonstrations, we show how access to this data allows for improved web accessibility, opens new possibilities for real-time media manipulation, and extends the web in ways that have traditionally required proprietary plugins. For example, we look at how an audio data API can be used to create in-browser synthesizers, audio-based animated visualizations, digital signal processing, and text-to-speech. We will also explore the code web developers need to work with audio data, calculate FFTs, and implement other audio algorithms in JavaScript.
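
As a minimal sketch of the idea, the JavaScript below reads raw samples from an audio element and computes a magnitude spectrum. The event and property names (MozAudioAvailable, frameBuffer, mozChannels, mozSampleRate, mozFrameBufferLength) follow Mozilla's experimental audio data API as documented for the Firefox prototype and may differ between prototype builds; the element id "player" and the helper dftMagnitudes are invented for illustration, and a naive DFT stands in for a real FFT to keep the example short.

 var audio = document.getElementById("player");   // e.g. <audio id="player" src="song.ogg">
 var channels, sampleRate, frameBufferLength;

 audio.addEventListener("loadedmetadata", function () {
   channels = audio.mozChannels;                  // e.g. 2 for stereo
   sampleRate = audio.mozSampleRate;              // e.g. 44100 Hz
   frameBufferLength = audio.mozFrameBufferLength;
 }, false);

 audio.addEventListener("MozAudioAvailable", function (event) {
   // event.frameBuffer holds floats in [-1.0, 1.0], interleaved by channel
   var spectrum = dftMagnitudes(event.frameBuffer);
   // ...drive a visualization or further analysis from the spectrum here
 }, false);

 // Naive O(n^2) DFT magnitude spectrum; production code would use a
 // radix-2 FFT, but the arithmetic illustrates what runs in JavaScript.
 function dftMagnitudes(samples) {
   var N = samples.length, half = Math.floor(N / 2);
   var spectrum = new Array(half);
   for (var k = 0; k < half; k++) {
     var re = 0, im = 0;
     for (var n = 0; n < N; n++) {
       re += samples[n] * Math.cos(2 * Math.PI * k * n / N);
       im -= samples[n] * Math.sin(2 * Math.PI * k * n / N);
     }
     spectrum[k] = Math.sqrt(re * re + im * im);
   }
   return spectrum;
 }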

Granting direct access to the audio stream in the browser opens up the web as a powerful tool for accessibility, innovation, and creativity. There are myriad screen readers for the web, such as SuperNova, aimed at the visually impaired, but our work with people who have absolutely no vision reveals an obvious gap in usability. Someone who is partially sighted or color-blind can enlarge the screen and switch color palettes, but people with very little or no sight at all have to rely on text-to-speech. The problem with this method is simple to explain but difficult to solve. Have you ever tried to navigate a drop-down menu with your eyes closed, using only text-to-speech? This alone renders the web unusable in seconds. Imagine having absolutely no visual cues to help you know what to click on. Imagine having no idea whether you are in the navigation or in the content. What if the tab indexes have been neglected on a site you need to use to pay your bills or change personal information with a company? You would have to rely on other people to help you on the web, and you would miss out on all the benefits that the web can bring to your daily life.

This happens to be a wonderful use case for audio stream access in the browser. Audio cues can be used to assist the blind, allowing developers to create content that is navigable by a wider audience. But there are even deeper implications to using audio technology in the browser. Research and development of video-to-audio technology by physicist Peter Meijer has demonstrated that moving images can be converted into audio signals the human brain can use to navigate in 3D space, and can in some cases create artificial synesthesia, allowing users to "really see" with sound as the brain adapts to regular use, filtering the relevant audio signals and firing them at the visual cortex.
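
To make the idea of audio cues concrete, here is a hypothetical sketch of an "earcon": a short sine tone played when keyboard focus enters a page region, so a non-sighted user can hear whether they are in the navigation or the content. It assumes the prototype's output calls (mozSetup and mozWriteAudio) and invented element ids ("nav", "content"); it is an illustration under those assumptions, not the talk's actual demo code.

 // Synthesize and play a short sine tone at the given frequency (Hz)
 function playCue(frequency, seconds) {
   var sampleRate = 44100;
   var out = new Audio();
   out.mozSetup(1, sampleRate);                   // 1 channel (mono)
   var samples = new Float32Array(Math.floor(sampleRate * seconds));
   for (var i = 0; i < samples.length; i++) {
     samples[i] = 0.5 * Math.sin(2 * Math.PI * frequency * i / sampleRate);
   }
   out.mozWriteAudio(samples);                    // may write fewer samples if the buffer is full
 }

 // Focus events do not bubble, so listen in the capture phase on each region:
 // a high tone for navigation, a low tone for main content.
 document.getElementById("nav").addEventListener("focus", function () {
   playCue(880, 0.15);
 }, true);
 document.getElementById("content").addEventListener("focus", function () {
   playCue(440, 0.15);
 }, true);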

Authors

  • Corban Brook
  • David Humphrey
  • Al MacDonald
  • Thomas Saunders