For the last several months, Entropy Wave has been making Theora work on the TI C64x+ DSP as a project for Mozilla Corp.

An Ogg/Theora video of Big Buck Bunny being played back on a Beagle Board via the C64x+ DSP coprocessor
The goal behind porting to the C64x+ is to run on the OMAP3 SoC from TI, which has an ARM Cortex A8 core and also has a C64x+ DSP coprocessor. This SoC (System on Chip) is best known as the basis of Nokia’s N series of mobiles (including the N900), the Motorola Droid, Palm Pre, and the Beagle Board. The DSP coprocessor is commonly used for audio and video processing, including video encoding and decoding, and TI makes codecs available for MPEG-4 video decoding, AAC decoding, etc. Having Theora decoded on the DSP fits into Mozilla’s Fennec project, making Firefox with video useful on a mobile platform.
One of the engineering reasons behind having a separate processor for media handling is that it separates real-time tasks (media decoding) from non-real-time tasks, such as running web browser software. From the standpoint of software running on the ARM, the video decoder looks and acts just like a hardware video codec. The DSP on the OMAP3 is even more compelling for video decoding because attached to the DSP are several units that accelerate motion vector copying, VLC decoding, and loop deblocking. Unfortunately, these pieces are not publicly documented by TI, so the current Theora port (which is open source) is unable to use them. A future Entropy Wave project will likely add support for these acceleration units which would allow the performance of the Theora decoder to be similar to TI’s MPEG-4 codec, which can do 800×480 playback (possibly more?). As it looks now, the resulting code would necessarily be closed source until such a time when TI wishes to make the specifications public.
As it currently stands, the Theora decoder plays 640×360 24fps at slightly more than 100% speed on average. This isn’t quite good enough to call it “real time”, since some frames take longer than the allotted time to decode, but it’s pretty close and the results are good. Additional speed improvements in libtheora would require internal changes, which would be a project in itself. One clear area for improvement is that the DSP spends a substantial part of its time idle, because the host code is serialized with the DSP processing. Fixing this is likely to put the above case firmly into the “real time” category. Given that 640×360 is larger than the iPhone display resolution and almost as large as the N900 resolution, it’s clearly good enough, even if it falls short of TI’s hardware-accelerated MPEG-4.
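To make the two points above concrete, here is a rough Python sketch. It is not taken from the Theora port: the function names and all timings are made up for illustration, and the overlap model assumes the host can always prepare the next frame while the DSP is busy. It shows how a decoder can average better than 100% speed at 24fps and still miss individual frame deadlines, and how overlapping host work with DSP work shortens the total time compared with running them back to back.

```python
FPS = 24
BUDGET_MS = 1000.0 / FPS  # about 41.7 ms available per frame at 24fps


def realtime_report(decode_ms, budget_ms=BUDGET_MS):
    """Return (average speed relative to real time, indices of late frames)."""
    avg = sum(decode_ms) / len(decode_ms)
    speed = budget_ms / avg  # > 1.0 means faster than real time on average
    late = [i for i, t in enumerate(decode_ms) if t > budget_ms]
    return speed, late


def serialized_ms(host_ms, dsp_ms):
    """Total time when the host and the DSP take turns (DSP idles during host work)."""
    return sum(h + d for h, d in zip(host_ms, dsp_ms))


def overlapped_ms(host_ms, dsp_ms):
    """Total time for a simple two-stage pipeline (assumption: host runs ahead freely)."""
    host_done = 0.0
    dsp_done = 0.0
    for h, d in zip(host_ms, dsp_ms):
        host_done += h                           # host finishes preparing this frame
        dsp_done = max(dsp_done, host_done) + d  # DSP starts once both are ready
    return dsp_done


# Hypothetical per-frame timings in milliseconds, not measurements.
frames = [30, 35, 60, 30, 40]
speed, late = realtime_report(frames)
print(f"average speed: {speed:.0%}, late frames: {late}")
print(serialized_ms([10] * 4, [30] * 4), overlapped_ms([10] * 4, [30] * 4))
```

With these made-up numbers the average comes out faster than real time, yet the 60 ms frame still blows its budget, which is the situation described above.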
On the Entropy Wave site is a page describing the demo, including where to download images and how to compile source code.
A big thanks to the people that laid the foundations for this work, especially Felipe Contreras.
[...] David Schleef’s theora optimisation for the dsp in the OMAP3 will really make open video a reality for the N900 and other OMAP3 devices like Palm Pre/Motorola Droid. Thank you David and Mozilla for the work. [...]
Actually, most HD 16:9 camcorders shoot at 60i, which means that when you de-interlace, you export at 30p. Therefore, I think it’s important for any hardware acceleration solution to be able to decode 640×360 at 30p.
“As it looks now, the resulting code would necessarily be closed source until such a time when TI wishes to make the specifications public.”
This looks like an unfortunate oversimplification, especially coming from a Free Software contributor.
A non-public specification does not necessarily imply a non-public implementation as the Linux Driver Project has shown.
Sad sad state of affairs.
First let me say a few things that will make my point of view a little more reasonable. I’ve been working at a small company that was basically doing video codecs for TI. The “DSP core” in that little TI chip can easily decode 1080p, at nowhere near 100% (it can do far more; in fact it can decode h264 and re-encode as, say, mpeg2 without getting to 100% usage). It can decode more than one 1080p stream (unless you have some lower-end model than we did). My point is – the hardware is really there – it’s far more complex than most people realize and it’s not at all easy to get that performance in the real world, but the software that can do it exists.
Now I’d like to say what (in my opinion) is the problem. The whole FOSS world is a 3rd world citizen in embedded. Everyone is using us, nobody really cares about us at all. We’re like little children in China, assembling the success stories of western companies. HOW CAN YOU BE HAPPY ABOUT DOING BARELY ADEQUATE PERFORMANCE. Without access to internal specification, lots of experience, time and money (to pay the developers involved) Theora will remain that “slower” codec that nobody in their right mind will be interested in using. So while you present it as a success story (don’t get me wrong – I’m sure it’s a monumental achievement without documentation) it’s still sad if you take a look at what the guys at the other side of the fence are doing.
I’m just angry with that, that’s all…
anonymous:
I think you’re confusing the OMAP3 with the TMS320DM646x or other DaVinci video processor, which is an entirely different class of device. The ‘6467, for example, uses 2.5 W for “typical” usage (I imagine transcoding 1080p is higher), whereas the OMAP3430 is rated for 540 mW for a video encoding application. I did not measure these numbers, so I can’t vouch for their accuracy, but roughly 5x the power (2.5 W versus 540 mW) for 5x the performance seems reasonable.
I understand about .005% of the article and the comments.
What is the relevance of this for me as a (soon to be) N900 user today?
Pretty neat! Thanks.
Can someone please confirm that the released N900 device from Nokia running Maemo is actually using this gst-dsp software stack? If not, can you please let me know what other gstreamer release it might be using?
[...] rest of Vorbis decoding and some work to get the video onto a fullscreen texture.) This is based on David Schleef’s Theora-on-DSP work and it’s showing real [...]
Thanks a lot for this!
Can you please also share with us how to run the video playback use case?
I believe the README located here: http://code.entropywave.com/git?p=ew-leonora-beagle-demo.git;a=blob_plain;f=README;hb=master
has instructions on prerequisites, the build process, and installation, but the command line used to run the actual use case is missing.
Web video controversy summarized…
This’ll be the last, definitive article from me on this subject for a while, I promise, but I wrote such a good summary on the Theora/H.264 controversy and the new Silverlight Theora player on Slashdot that I must put it up here as well (with some twe…