Archive for the ‘video’ Category
New Schrödinger Release
Monday, January 23rd, 2012
I recently added support for 10- and 16-bit encoding and decoding to Schrödinger, so I did a little release. Presenting Schrödinger-1.0.11. I also pushed changes to GStreamer to handle the new features. Although these changes have been in the works for some time, a little prompting from j-b caused me to finish this off, so this will probably appear in VLC soon, too.
This was the last piece needed to create a 10-bit master of Sintel, which I’ve been planning to do for some time.
GStreamer SDI Capture Plugins
Thursday, March 24th, 2011
I’m getting ready to push several commits to the gst-plugins-bad source repository that add plugins for capturing SDI and HD-SDI using cards from two different manufacturers: BlackMagic Design’s DeckLink and Linear Systems’ SDI Master capture card.
The Linear Systems cards are probably better known by their reseller, DVEO. Entropy Wave uses both of these cards in the E1000 Live Encoder appliance, and we’ve found that, aside from some motherboard incompatibilities with the DeckLink cards, they both work great in Linux. While we’re primarily interested in live capture at the moment, output has also been implemented.
We slightly prefer the Linear Systems cards, mainly because the drivers are open source, but also because the API allows lower-level access to the hardware, including SDI clocking and raw VANC and HANC data. It also allows subframe latency; this is not yet implemented in the GStreamer plugin, but it will be nice to use in the future.
In comparison, the DeckLink driver and SDK are not open source (which means I can’t fix any bugs), although they conveniently provide open source headers and shim code for interfacing with the SDK. This allows the GStreamer plugin to be completely open source and legally distributable separately from the SDK, but it will only work if the SDK libraries and driver are present. Optical fiber connections are only available in the DeckLink, and the DeckLink cards tend to be less expensive.
It will take a few weeks for these to be available as part of a GStreamer release; however, they are available in the Media SDK now.
(Reposted from my Entropy Wave blog.)
FLAC code mirror
Thursday, February 3rd, 2011
Since parts of SourceForge have been down for a while, I am making my git mirror of the FLAC CVS repository available. It is here. I’ve also made my fairly uninteresting patches available on the Entropy Wave Open Source site, here.
Open Video Conference
Friday, September 3rd, 2010
The Open Video Conference is happening again, and coming up in a few short weeks (October 1-2). It’s a great mixture of people from both sides of open video: open content and open technology. I’ll be giving a tutorial that ties the two sides together, explaining to content producers how to use the technology to its fullest.
In association with OVC is the Foundations of Open Media workshop (October 3-4), which is mostly a bunch of video and media hackers getting together to talk about our next steps in taking over the world.
Dirac Update
Thursday, March 4th, 2010
One of the difficulties of having a long release cycle for a small project is that a lot of activity might be taking place behind the scenes, but a casual observer might not notice. Of course, in the case of Schrödinger, it didn’t help that diracvideo.org was lacking CSS and important links for over a month, looking rather bit-rotten. So it’s not surprising that there have been people wandering into the IRC channel wondering if the project is dead. Um, no. We’re just quiet. And there’s a new release.
Part of the reason for the long release cycle is that it took more time than expected to merge several of the encoding tools from dirac-research. But now there are fewer loose threads, and development and releases can proceed at a more even pace. I’m hoping to do new releases at the pace of about one a month. (But I’ve said that before…)
Schrödinger now requires Orc to build. Switching from liboil to Orc has made decoding a lot faster, sometimes as much as 4 times faster.
Look forward to separate blog posts about some of the new features. Encoding quality has improved quite a bit for typical cases, and hugely in cases where there were bugs that were being ignored.
Theora on TI C64x+ DSP and OMAP3
Wednesday, November 11th, 2009
For the last several months, Entropy Wave has been making Theora work on the TI C64x+ DSP as a project for Mozilla Corp.

An Ogg/Theora video of Big Buck Bunny being played back on a Beagle Board via the C64x+ DSP coprocessor
The goal behind porting to the C64x+ is to run on the OMAP3 SoC from TI, which has an ARM Cortex A8 core and also has a C64x+ DSP coprocessor. This SoC (System on Chip) is best known as being the base behind Nokia’s N series of mobiles (including the N900), the Motorola Droid, Palm Pre, and the Beagle Board. The DSP coprocessor is commonly used for audio and video processing, including video encoding and decoding, and TI makes codecs available for MPEG-4 video decoding, AAC decoding, etc. Having Theora decoded on the DSP fits into Mozilla’s Fennec project, making Firefox with video useful on a mobile platform.
One of the engineering reasons behind having a separate processor for media handling is that it separates real-time tasks (media decoding) from non-real-time tasks, such as running web browser software. From the standpoint of software running on the ARM, the video decoder looks and acts just like a hardware video codec. The DSP on the OMAP3 is even more compelling for video decoding because attached to the DSP are several units that accelerate motion vector copying, VLC decoding, and loop deblocking. Unfortunately, these pieces are not publicly documented by TI, so the current Theora port (which is open source) is unable to use them. A future Entropy Wave project will likely add support for these acceleration units which would allow the performance of the Theora decoder to be similar to TI’s MPEG-4 codec, which can do 800×480 playback (possibly more?). As it looks now, the resulting code would necessarily be closed source until such a time when TI wishes to make the specifications public.
As it currently stands, the Theora decoder plays 640×360 24fps at slightly more than 100% speed on average. This isn’t quite good enough to call it “real time”, since some frames take longer than the allotted time to decode, but it’s pretty close and the results are good. Additional speed improvements in libtheora would require internal changes, which would be a project in itself. One clear area for improvement is that the DSP spends a substantial part of its time idle, because the host code is serialized with the DSP processing. Fixing this is likely to put the above case firmly into the “real time” category. Given that 640×360 is larger than the iPhone display resolution and almost as large as the N900 resolution, it’s clearly good enough, even if it is less than TI’s hardware accelerated MPEG-4.
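The gap between "more than 100% speed on average" and "real time" is easy to see with a little arithmetic. The sketch below uses invented per-frame decode times (they are not measurements from the DSP port), and the function name is purely illustrative. It shows how a decoder can average slightly faster than playback speed while individual frames still miss the 24fps deadline.

```python
# Illustrative only: the decode times below are invented, not measured.

def realtime_report(decode_ms, fps=24.0):
    """Summarize average speed and late frames for per-frame decode times."""
    budget_ms = 1000.0 / fps                        # time available per frame
    average_ms = sum(decode_ms) / len(decode_ms)
    speed_percent = 100.0 * budget_ms / average_ms  # >100 means faster than playback
    late = [i for i, t in enumerate(decode_ms) if t > budget_ms]
    return budget_ms, speed_percent, late


# Hypothetical times: mostly cheap frames plus two expensive ones.
times = [35.0, 36.0, 38.0, 60.0, 34.0, 37.0, 55.0, 33.0]
budget, speed, late_frames = realtime_report(times)

print(f"budget per frame: {budget:.2f} ms")
print(f"average speed:    {speed:.1f}%")
print(f"late frames:      {late_frames}")
```

With these made-up numbers the average is just above 100%, yet two of the eight frames overrun their slot, which is the situation described above.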
On the Entropy Wave site is a page describing the demo, including where to download images and how to compile source code.
A big thanks to the people that laid the foundations for this work, especially Felipe Contreras.
YCbCr Gamut Checking
Wednesday, October 7th, 2009
I recently added a pattern to GStreamer’s videotestsrc that can be used to check that YCbCr to RGB conversion is being done correctly as part of video output. It is the result of a clever hack: some YCbCr values, when converted to RGB, are out of range, so as part of the conversion process, they are clamped to the nearest RGB value. The pattern generator creates a checkerboard pattern of a color (say, red) and a YCbCr value that upon correct conversion will result in the same color. Thus the pattern should be invisible. Usefully, these out-of-gamut YCbCr values are preserved by video codecs, so I can present to you a Theora video demonstrating this:
Firefox does the conversion correctly, so it’s unlikely you’ll see the pattern. However, some video display drivers still get this wrong, so you might see the pattern when playing the video in a standalone program that uses XV. For those of you with working kit, I created a demonstration video that simulates a bad conversion:
Sometimes it’s possible to see the pattern very faintly due to rounding in even a correct conversion. This is unavoidable because the RGB->YCbCr->RGB round trip is lossy.
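To make the clamping idea concrete, here is a minimal sketch. It assumes BT.601 coefficients and 8-bit YCbCr with the usual 16..235 luma scaling. The sample pixel is a hypothetical value chosen for illustration, not necessarily one that videotestsrc emits, and the "wrap" function is just one hypothetical way a conversion could go wrong.

```python
# Sketch assuming BT.601 coefficients and 8-bit YCbCr with 16..235 luma.
# The sample value is hypothetical, not taken from videotestsrc.

def ycbcr_to_rgb_raw(y, cb, cr):
    """Convert one pixel to RGB without limiting the result to 0..255."""
    yy = 255.0 / 219.0 * (y - 16)
    r = yy + 1.596027 * (cr - 128)
    g = yy - 0.391762 * (cb - 128) - 0.812968 * (cr - 128)
    b = yy + 2.017232 * (cb - 128)
    return r, g, b


def clamp8(x):
    """Round, then clamp to 0..255 (the correct behaviour)."""
    return max(0, min(255, int(round(x))))


def wrap8(x):
    """Round, then wrap modulo 256 (one hypothetical way to get it wrong)."""
    return int(round(x)) % 256


out_of_gamut = (81, 60, 255)   # hypothetical YCbCr value outside the RGB cube
raw = ycbcr_to_rgb_raw(*out_of_gamut)
correct = tuple(clamp8(c) for c in raw)
broken = tuple(wrap8(c) for c in raw)

print("raw RGB:    ", tuple(round(c, 1) for c in raw))
print("clamped RGB:", correct)
print("wrapped RGB:", broken)
```

The raw red component lands above 255 and the blue component below 0, so a clamping conversion produces pure red, while the wrapping variant produces a visibly different color. That difference is what would make the checkerboard show up.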
Cog in gst-plugins-bad
Saturday, September 19th, 2009
I finally moved my collection of Orc-based GStreamer plugins (codename “Cog”) into gst-plugins-bad, since they’ve moved on from being an experiment. Orc is a runtime compiler for a simple cross-platform assembly-like language that specifically targets SIMD instructions for several processors. Orc is very effective inside its domain, which is small but growing.
One such application that is covered is chroma subsampling and color matrixing for video, semi-incorrectly referred to as “colorspace conversion” in GStreamer. There has been a colorspace element in Cog (cogcolorspace) for some time, but I never really bothered to do any speed comparisons between it and the default GStreamer colorspace element (ffmpegcolorspace), which is based on code copied from FFMpeg. However, recently I did, and was somewhat surprised (although I shouldn’t have been) that cogcolorspace is the same speed as, or much faster than, ffmpegcolorspace for almost all operations. (Please note that the FFMpeg code was forked a long time ago and heavily modified, so it does not reflect FFMpeg itself, only GStreamer’s ffmpegcolorspace.)
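For readers who have not met the two operations named above, the following sketch shows them separately. It is plain Python with hypothetical helper names, not code from Cog or GStreamer, and it assumes BT.601 coefficients for the example matrix. Chroma subsampling changes the resolution of the chroma planes, while color matrixing is a per-pixel linear transform.

```python
# Sketch with hypothetical helper names; not code from Cog or GStreamer.

def subsample_chroma_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    `plane` is a list of rows with even width and height.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[y]), 2):
            total = (plane[y][x] + plane[y][x + 1] +
                     plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)   # rounded integer average
        out.append(row)
    return out


def matrix_pixel(matrix, offsets, pixel):
    """Apply a 3x3 color matrix plus per-row offsets to one pixel."""
    return tuple(
        sum(m * p for m, p in zip(row, pixel)) + off
        for row, off in zip(matrix, offsets)
    )


# RGB -> YCbCr, assuming BT.601 coefficients and 8-bit studio range.
RGB_TO_YCBCR = [
    [0.257, 0.504, 0.098],
    [-0.148, -0.291, 0.439],
    [0.439, -0.368, -0.071],
]
OFFSETS = [16, 128, 128]

chroma = [[100, 102, 50, 50],
          [104, 106, 50, 50]]
small = subsample_chroma_420(chroma)
white = matrix_pixel(RGB_TO_YCBCR, OFFSETS, (255, 255, 255))

print(small)
print(tuple(round(c) for c in white))
```

A converter between two uncompressed formats typically needs one, the other, or both of these steps, which is part of why the number of combinations grows so quickly.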
This is a scatter plot of the run time (in ms) for converting 1000 frames of 320×240 video between a variety of uncompressed video formats:
The axes are execution time (in ms), with cogcolorspace on the horizontal axis and ffmpegcolorspace on the vertical axis. The green line represents equal execution time: for points below the line, ffmpegcolorspace was faster, and for those above, cogcolorspace was faster. Most of the points clustered around the green line are statistically the same as the green line, since my timing method is quite crude. Things to observe from this graph are that 1) many cases are very similar in speed, indicating that both ffmpegcolorspace and cogcolorspace are using similar code paths, 2) in some cases, cogcolorspace is a lot faster, probably indicating that there isn’t an assembly fast path in ffmpegcolorspace for that conversion, and 3) in a few cases (which, not coincidentally, are the most heavily used cases) ffmpegcolorspace is slightly faster than cogcolorspace.
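The way to read a point on such a plot can be written down as a small function. This is only an illustration: the timings are invented rather than read off the graph, and the 5% tolerance used to treat nearby points as a tie is an arbitrary assumption standing in for the crude timing method.

```python
# Illustrative only: the timings below are invented, not read off the plot.

def compare(cog_ms, ffmpeg_ms, tolerance=0.05):
    """Say which element was faster, treating small differences as a tie."""
    if abs(cog_ms - ffmpeg_ms) <= tolerance * max(cog_ms, ffmpeg_ms):
        return "same"
    # A point above the diagonal has ffmpeg_ms > cog_ms, so Cog was faster.
    return "cogcolorspace" if cog_ms < ffmpeg_ms else "ffmpegcolorspace"


# Hypothetical (cogcolorspace, ffmpegcolorspace) timings in ms.
for point in [(100.0, 400.0), (120.0, 100.0), (100.0, 102.0)]:
    print(point, "->", compare(*point))
```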
The conclusions to draw from this are that 1) by writing very generic code with Orc, you can get very similar results to hand-crafted assembly code, 2) a developer can cover a lot more cases with a small amount of work, and 3) there are a few cases where special-case Orc code would be beneficial.
This is only the low quality mode that cogcolorspace supports, which is similar or identical in quality to ffmpegcolorspace. Higher-quality conversion is also implemented in most cases, and is only slightly slower. This is the real advantage of Orc: it takes care of a huge number of combinations of options, and produces good SIMD code for all of them.

Orc-0.4.0
Sunday, May 31st, 2009
Lately, I’ve been working on a side project called Orc as a replacement for liboil. Liboil’s first major problem has always been that it doesn’t scale well: every software package that wanted to use liboil typically required several new liboil functions, and then someone would need to actually write assembly code for those functions on several architectures. My original plan was to develop a critical mass of functions, and then additions would be “simple”. This never happened. The second major problem is that liboil’s compilation is terribly fragile. Thousands of lines of inline assembly code that depend on specific compilers, compiler versions, libtool internals, and random snippets of code such as “if $user != msmith” do not lead to a maintainable project.
Orc is now to the point where it can not only reproduce about 90% of the code that is currently in liboil, but also generate 90% of the code that should be in liboil, but nobody ever wrote. At runtime. And the Orc language allows you to describe your own liboil-style functions. At runtime. Or, you can also use it like a normal compiler, converting Orc language source into N different assembly source files for every possible vector instruction set combination.
A large part of the decoding path in Schroedinger has been converted to optionally use Orc, where speed is either slightly faster or 20-30% faster than the previous liboil code. The real benefit is that it takes only a few minutes to convert code that took weeks to develop originally. A side project of mine, Cog, has turned into a showcase for Orc, with demonstrations of video processing GStreamer elements, such as format and colorspace conversion and scaling. I’ve found that since it is so easy and fast to create vectorized code, it now becomes possible to offer additional features to users, such as quality vs. speed tradeoffs.
Orc can generate code for MMX and SSE on x86 and x86_64, and Altivec on PowerPC, as well as NEON for ARM and c64x+DSP code. The NEON and c64x+ backends are not currently open source.
Entropy Wave
Monday, April 27th, 2009
I see Christian outed my new company, Entropy Wave. The mission of the new company is to create video post-production tools using open media technology for a wide range of users, including high-end studios, professional video editors, and hobbyists. Most of our products will be based on open-source code, including projects I’ve been heavily involved with such as GStreamer, Schroedinger, Orc, and various Xiph projects.
Existing and upcoming products include:
- A GStreamer-based Media SDK that allows developers to rapidly create and deploy applications on major platforms (Windows, Linux, OS/X)
- QuickTime plugins for DiracPro (SMPTE VC-2)
- A video encoder application geared toward content producers putting video on the web
- A capture application compatible with Numedia’s line of DiracPro hardware encoders
In addition, Entropy Wave can provide support and custom development services in a variety of areas including open media.

