
Implementing web-based real-time video chat using HTML5 WebSockets


If you want to go with HTML5 only, you will need a browser implementing the HTML Media Capture draft (available here) in order to access the raw data from the microphone.

Once you have this data in hand, you need to send it over the network. WebSockets would be the HTML5 option for fast enough round trips with the server (sending local audio data and receiving remote audio data at the same time).
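To give a rough idea of what "sending audio data" could look like on the wire, here is a small sketch that cuts a PCM buffer into fixed-duration frames, each of which could go out as one binary websocket message. The audio format (8 kHz, 16-bit, mono), the 20 ms frame length and the name split_pcm_frames are all assumptions made for illustration; the capture draft does not dictate any of them.

```python
from typing import Iterator


def split_pcm_frames(
    pcm: bytes,
    sample_rate: int = 8000,   # assumed: 8 kHz
    sample_width: int = 2,     # assumed: 16-bit samples
    channels: int = 1,         # assumed: mono
    frame_ms: int = 20,        # assumed: 20 ms of audio per message
) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM, each holding frame_ms of audio.

    The last chunk may be shorter if the buffer is not a whole number of frames.
    """
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of silence in the assumed format is 16000 bytes.
one_second = bytes(8000 * 2)
frames = list(split_pcm_frames(one_second))
print(len(frames), len(frames[0]))
```

Smaller frames mean lower latency but more messages per second, so the frame length is something to tune in a prototype.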

Since you mention Python, I would recommend looking at the Twisted implementation of websockets.

You can have all your clients "register" on the websocket server with a callerID, so the server knows where to find a given callerID.

Then your server will need an "invite" API where caller1 "invites" caller2.

Once the call is set up and each client starts sending its audio data, the server will be able to forward this audio data to the other party.
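The register / invite / forward flow described above can be sketched without committing to any websocket library. In this minimal sketch, CallRouter and its method names are hypothetical, and each client is represented only by a "send" callable; in a real server that callable would be whatever your websocket library gives you to write a binary message to a connection.

```python
from typing import Callable, Dict

Send = Callable[[bytes], None]  # hypothetical: writes bytes to one client's connection


class CallRouter:
    """In-memory registry of callerIDs, plus the pairing of parties in a call."""

    def __init__(self) -> None:
        self._clients: Dict[str, Send] = {}  # callerID -> send function
        self._peers: Dict[str, str] = {}     # callerID -> callerID of the other party

    def register(self, caller_id: str, send: Send) -> None:
        """Remember where to find a given callerID."""
        self._clients[caller_id] = send

    def unregister(self, caller_id: str) -> None:
        """Forget a client and tear down any call it was part of."""
        self._clients.pop(caller_id, None)
        peer = self._peers.pop(caller_id, None)
        if peer is not None:
            self._peers.pop(peer, None)

    def invite(self, caller: str, callee: str) -> bool:
        """Set up a call; returns False if a party is unknown or already in a call."""
        if caller == callee:
            return False
        if caller not in self._clients or callee not in self._clients:
            return False
        if caller in self._peers or callee in self._peers:
            return False
        self._peers[caller] = callee
        self._peers[callee] = caller
        return True

    def relay_audio(self, sender: str, chunk: bytes) -> bool:
        """Forward a chunk of audio to the other party, if the sender is in a call."""
        peer = self._peers.get(sender)
        if peer is None:
            return False
        self._clients[peer](chunk)
        return True


# Usage with plain lists standing in for the two connections.
inbox1, inbox2 = [], []
router = CallRouter()
router.register("caller1", inbox1.append)
router.register("caller2", inbox2.append)
router.invite("caller1", "caller2")
router.relay_audio("caller1", b"\x00\x01")
print(inbox2)
```

This leaves out everything a real call needs beyond routing (accepting or rejecting an invite, authentication, timeouts), but it shows how little state the server has to keep.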

Upon receiving audio data, the browser will need to play it on the speakers, probably using the HTML5 audio tag.

To do this, you may be forced to use a "trick": instead of having the websocket server forward the raw audio data to the client, you may need to simulate two "infinite" files:

  1. caller1.wav : sound captured on caller1 mic
  2. caller2.wav : sound captured on caller2 mic

Once the call is set up (caller1 would be informed of this event via the websocket), the caller1 browser would set the audio.src attribute to caller2.wav. If the Python server appends the raw audio data to caller2.wav as it receives it, the browser would hopefully start playing it.
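One way to simulate such an "infinite" file, sketched here under assumptions, is to serve a normal 44-byte WAV header whose length fields are set to the maximum value and then keep appending PCM chunks for as long as the call lasts. The function names and the default format (8 kHz, 16-bit, mono) are made up for this example, and whether a given browser actually starts playing a WAV of unknown length is something you would have to test.

```python
import struct
from typing import Iterable, Iterator


def streaming_wav_header(
    sample_rate: int = 8000, channels: int = 1, bits_per_sample: int = 16
) -> bytes:
    """Build a PCM WAV header for a stream whose total length is not known."""
    block_align = channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    unknown_size = 0xFFFFFFFF  # placeholder: real size is unknown while streaming
    return (
        b"RIFF" + struct.pack("<I", unknown_size) + b"WAVE"
        + b"fmt " + struct.pack(
            "<IHHIIHH",
            16,               # size of the fmt chunk body
            1,                # audio format 1 = uncompressed PCM
            channels,
            sample_rate,
            byte_rate,
            block_align,
            bits_per_sample,
        )
        + b"data" + struct.pack("<I", unknown_size)
    )


def wav_stream(pcm_chunks: Iterable[bytes], **fmt: int) -> Iterator[bytes]:
    """Yield the header once, then every PCM chunk as it arrives."""
    yield streaming_wav_header(**fmt)
    for chunk in pcm_chunks:
        yield chunk


header = streaming_wav_header()
print(len(header), header[:4], header[8:12])
```

The HTTP handler serving caller2.wav would write each yielded piece to the response as it becomes available, without sending a Content-Length.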

This sounds like a cool prototype you're going to hack up!

Good luck on your journey,

Jerome Wagner


Seems like Ericsson created the first HTML5 Video Conference App.

The technique they used:

  • Implemented the device element and the Stream API (device element GUI is currently written in JavaScript/CSS)
  • Added MediaStreamManager to map Stream URLs to the corresponding pipeline in the media backend
  • Added MediaStreamTransceiver to control the related media processing and transport
  • Added support for binary data in the WebSocket protocol

See: labs.ericsson.com


Video on YouTube: Beyond HTML5: Conversational Voice and Video demo | Ericsson Labs

Unfortunately Ericsson doesn't want to share device_dialog.js (yet).


WebRTC might be an answer: http://www.webrtc.org/running-the-demos (currently only Chrome Canary with the MediaStream flag enabled)

See the demo at https://apprtc.appspot.com (make sure you watch in a proper browser) and the code at http://code.google.com/p/webrtc-samples/source/browse/trunk/apprtc/


The reason I'm writing is that I got a really cheap Android tablet and cannot install Skype or Vtok, and Google Voice is not available outside the US. I need to find an HTML5-based solution, as I'm able to run Opera Mobile 12 and got http://html5demos.com/ working properly.