Wednesday, October 21, 2009

How to Get Started with OpenGL ES 2.0

OpenGL ES 2.0 is coming to DTV. Like Java before it, learning OpenGL ES will become a necessary part of programming in the DTV world. However, it's not easy, even for old OpenGL programmers like me. As already noted in a previous article, it is very different in philosophy from even OpenGL ES 1.x. Secondly, it's not easy to write code for an embedded environment without having the device. Fortunately AMD have solved that problem. Here then are the steps to setting up and beginning to learn OpenGL ES 2.x.

Prerequisites
  1. A Windows-based PC with an AMD/ATI graphics card. Nvidia or Intel platforms may work but are untested.
  2. Knowing how to program in C or C++
  3. Experience with projects and IDEs
Installation
  1. Install the latest Catalyst drivers (or equivalent) for your card
  2. Install Visual Studio Express C++ from Microsoft
  3. Install the AMD OpenGL ES 2.0 Emulator. If you are not already an AMD registered developer you will need to register, but it's free and painless.
First OpenGL ES 2.0 Program
The emulator from AMD comes with a single sample that draws a textured triangle (and carefully avoids rotating it ;-). Simply go to the install directory (reachable from your Start menu) and click on the project file. Build the project and run. You should see something like this (click to enlarge):


Going Further

In a later article I'll cover getting started with OpenGL ES 1.x. Whilst phones and mobile devices are surging ahead into ES 2.x, it is possible that DTV will remain in the 1.x world for a while. The future is a little unclear and, unfortunately, though the languages are similar, they are conceptually different in approach. The vast majority of OpenGL ES devices in the world will run ES 2.x of course, but I have a suspicion that, due to the status of OpenGL implementations in DTV and the fact that some companies are beginning to use what's out there already, we will see OpenGL ES 1.x remain quite popular in DTV, at least for a couple of years.

Tuesday, October 20, 2009

WebGL brings 3D to Browsers (soon)


If it's not clear yet, then I'll state it clearly. Whatever 3D graphics technology we are using in DTV in ten years' time, it will be OpenGL. FLASH 10 will require OpenGL and render to it. The base of everything will be OpenGL, in one form or another.

It should be no surprise, then, that browsers too are jumping on the OpenGL bandwagon. The Khronos Group, the guardians of OpenGL and other related standards, has announced WebGL.

Wikipedia has this to say:

WebGL is a standard specification to display 3D graphics on web browsers. It enables hardware-accelerated rich 3D graphics in web pages without the need for special browser plug-ins, on any platform supporting OpenGL or OpenGL ES. Technically it is a binding for JavaScript to native OpenGL ES 2.0 implementation, to be built in to browsers.

So, essentially OpenGL ES 2.0 bindings in JavaScript. Not yet relevant in the DTV world, as we don't have OpenGL ES 2.0 hardware in most set tops or TVs yet, but it's coming. Browsers are part and parcel of the DTV experience now and are supported by GEM based languages too. WebKit based browsers already offer a version of WebGL and the full specification will be available in 2010. In the meantime, here is a demo video:

Sunday, October 11, 2009

A White Paper on Broadcom's Graphics Architecture

How I missed this before is beyond me, but here is a link to a white paper on Broadcom's 3D graphics architecture supporting OpenGL ES.

There are a few interesting things in this paper. Firstly texturing:
[The px3d] Reads 40 million unrotated texels per second. Rotations around the Y-axis and, to a larger extent, the Z-axis cause texture cache misses, with larger rotations causing more misses.
40 million texels per second is very, very few (a texel is a texture pixel). And that's a peak figure. Rotations make things worse, the worst case being rotations around Z, i.e. spinning an image. In fact, four texels are usually required to render a single pixel on the screen, as the hardware takes a weighted average of four texels to get the best colour.

The practical upshot of this is, as the paper says:
The 3D device cannot render at full screen at 60 fps when the display is 1920x1080, but can only do it at about 15 fps. Instead, the 3D device could render to a quarter the display size (960x540), and have the 2D device scale it vertically to 960x1080 with filtering.
And that's without "excessive" use of transparency, rotations or even scaling. In other words, only if we don't actually use much 3D.

The px3d quoted in the paper is used in the 7400 device but has been improved in later Broadcom chipsets. Nonetheless, the barriers still exist, as this is essentially the same architecture, and even on newer BCM devices, in my opinion, expert design is required to get 60fps out of the device in full screen 3D. For the time being, as the paper says, 3D rendering should be limited to small areas of the screen. The image below illustrates that:

The blitter is much faster at filling, more than ten times faster when scaling is used! However, with a more modest goal of 20-30fps, no transparency, only small amounts of rotation, rendering at 1/4 full HD and using the blitter to scale, full screen 3D can be achieved. I'm a purist from the graphics world, not DTV, and maintain that full screen 3D at 20fps is a painful experience, to be avoided. It never ceases to amaze me how happy some DTV folks are when their 3D user interfaces run at 10fps and give me a headache within minutes.

FLASH 10.1..everywhere, really?

In case you missed the big announcement... FLASH 10.1 is the future of everything. Only, no mention of set top boxes. Then, a few days later, Nvidia announce they have stopped making chips for Intel chip sets. Opinion seems split on FLASH 10 for DTV. Some believe it will be on the market in 2011, others (including one DTV chip manufacturer) think 2012.

What a mess.

Thursday, October 8, 2009

OpenGL ES 2.0 - What has been cut out


OpenGL is changing with each release, and for OpenGL programmers it's a busy time keeping up. OpenGL ES 1.1 and OpenGL ES 2.0 are virtually different languages: the approach programmers must take to getting 3D on the screen changes dramatically. The best known change is the addition of shaders (programmable parts of the graphics pipeline), but what's been left out is almost as dramatic:

  • Immediate mode (glBegin/End and glVertex)
  • Per primitive attributes like glColor, glNormal
  • Explicit image drawing (glDrawPixels, glCopyPixels etc.)
  • Antialiased points and lines
  • The matrix stack and commands for co-ordinate transforms (e.g. glRotate)
  • Non power of 2 textures with mip-mapping
  • 3D textures
  • Many data commands like glNormalPointer, glColorPointer, client state etc.
  • Material and light mode
  • Fog
  • Texture environment (no fixed function blending)
In summary, OpenGL coders must now:
  • Provide their own maths libraries
  • Use arrays for declaring 3D data
  • Write shaders and pass in their own data for transforms and lighting and materials
  • Handle images via textures
The language is streamlined, more in line with modern graphics thinking and better for the hardware to accelerate. However, it's also no longer an easy language to start with (which was one advantage over DX) and much more of the burden is on the programmer.

Microsoft Set Top Box

I'm not convinced by the arguments yet, but this article makes an interesting read nonetheless. Could it be that Microsoft plans a gaming set top box? The key patent contains the following claims:

1. A system for integrating media on a gaming console, comprising: a first subsystem for displaying a dashboard having a plurality of media selections, wherein said dashboard is native to a gaming console and wherein at least one of said media selections is a television selection; and a second subsystem for providing users with the option of switching back and forth between said television selection and other media selections in said plurality of media selections.

2. The system according to claim 1, wherein said television selection is branded with a logo of a service provider providing content to said gaming console.

3. The system according to claim 1, further comprising a third subsystem for providing users with the option of selecting starting of said gaming console as a set-top box.

4. The system according to claim 1, further comprising a third subsystem for providing users with the option of selecting remote starting of said gaming console as a set-top using a gaming controller.

The patent goes on to claim DVR features. So are we looking at a set top with DVR based on the Xbox 360 console?

Notice the magic wand trademark they have too. I previously blogged about such a device, which when used is very compelling and fun.

Tuesday, October 6, 2009

Graphics in MHP Tutorial

In case you are interested in MHP or GEM graphics, there is a great tutorial on the web already. I couldn't write better, so I'll just provide a link:

Introduction to MHP Graphics

My only caveat is that the part on colour is a little out of date now. Most receivers support far more than 256 colours these days.

I should add that the article comes from a book called Interactive TV Standards: A guide to MHP, OCAP and JavaTV.

Tutorial: Double buffering (and more)

Buffers are areas of memory specifically reserved for rendering graphics. The number of buffers affects what the graphics subsystem can do well, and also has a memory cost. In general, the more buffers, the better.

Single Buffering

One buffer is used for drawing and for display simultaneously.
There is usually a problem with flickering. This happens if rendering is only partially complete when frame flyback (the process of copying the buffer to the output screen) occurs. We see half of one frame and half of another and, over time, this looks like a flicker in a small area, or a band moving down the screen if the whole screen is being redrawn.

This is really only useful in software based, SD projects with little or no animation, so that very few pixels are rendered each frame and very little flicker is evident. If the drawing process can wait for frame flyback, render everything in the next 50th of a second and then wait again for frame flyback, no flicker will be seen.

Graphics libraries go to a lot of trouble to identify the minimal area of the screen that needs to be redrawn. Remember that moving one icon across an image requires both the icon and the image to be redrawn in many cases.

Double Buffering

Double buffering or better is used in all PC games, in simulations and anywhere that moving most of the visual scene is required. One buffer is used for display (the front buffer), whilst the other is drawn to (the back buffer). This means there is no flicker: simply put, the buffer being used for display is complete and is never drawn to. When drawing is finished, the buffers are swapped (usually a pointer exchange, not a copy). However, if this is done at an arbitrary time, tearing can still occur (as half of one buffer and then half of another is copied to the output). Usually, therefore, we lock the swapping of buffers to vsync. But then the problem of blocking occurs. Imagine that we must redisplay the graphics every 1/50th of a second, i.e. every 20ms. Now if we take 21ms to actually render our scene, we miss the 20ms deadline, and as we do not wish to risk tearing, we wait another 19ms before swapping! Our rendering rate thus drops from 50 fps to 25 fps, as we always take 40ms to render (and then block) each frame.

Triple Buffering

To avoid this drop in frame rate, a third buffer is used. Here,
one buffer is being displayed, one is ready to display, and one is being drawn to. The two buffers not being displayed can be swapped at any time with no visible effect. This allows our graphics process to render as fast as it can. On modern PC graphics cards, double buffering often really implements triple buffering in the background, without the user or coder knowing; the control dialogues that come with the graphics card may expose this setting, but they may not.

N-buffering
Of little interest in DTV, but just for completeness: triple buffering can be extended to cushion buffering, a logical extension of triple buffering that uses N buffers rather than just 3:
One buffer being displayed, N-2 buffers ready for display, one buffer being drawn to.
The idea is that some frames of animation are harder to draw than others and, by amortising across many buffers, the hard frames do not slow the animation down. The problem with this technique is lag, and it has not become popular.

On Memory
So the preferred technique is triple buffering. However, the cost is memory. Example: an HD 32-bit buffer (RGBA, 8 bits each) is 1920x1080x4 = 7.9 Meg. So triple buffering would be 7.9 x 3 = 24 Meg, just for graphics buffers. In a unified memory architecture (where the graphics chip uses main memory for rendering) that's 24 Meg of DRAM gone, just for the graphics architecture to operate.

This can be cut down by using a pre-multiplied alpha format (i.e. 24 bits per pixel), or even a limited 565 colour format (green usually gets more bits, as the human eye is more sensitive to green), and by reducing to double buffering:

So a double buffered, limited colour (16 bits per pixel) configuration could be roughly 1920x1080x2x2 = 8 Meg. Remember, this is just for the graphics architecture to render; it has nothing to do with user defined graphics yet.

That's without a z-buffer of course, but that's for another tutorial. Suffice to say in this context that a full z-buffer may require 1920x1080x3 bytes (a 24 bit z-buffer). Only one is required regardless of how many drawing buffers are used, so an additional 6 Meg. However, many of the architectures in DTV devices are smarter than this and use only small amounts of memory for z-buffering.

Conclusion
Triple buffering is the best for performance, but double buffering is a good performance/memory balance. Single buffering is only for low end, software rendering on SD devices. Unfortunately, these buffers use large amounts of memory and will increase the BOM of set top boxes accordingly.

Monday, October 5, 2009

Fonts for TV

If you are a designer, you take fonts very seriously. It's hard for most techies to understand this and get their heads around it. Don't believe me? Look here, here, or here to see passion about fonts. Or, more entertainingly:



I suspect that as we go HD and graphics become more important to our lives, fonts will too. Ten years of staring at Tiresias may be enough for anyone. Here it is, in all its glory:


Bitstream, who offer the Tiresias screen font, also offer other TV fonts.

Another company offering TV fonts is Ascender. Of particular note is their Mayberry(TM) font, which is a one-for-one replacement for Tiresias. The image below illustrates how close the fonts are visually:



The top font is Tiresias, the one below is Mayberry. The most obvious difference is how open the font is (compare, for example, the letter "S"). Open fonts render much better at lower resolutions, something important for TV. Ascender also offer fonts for all sorts of TV functions and regions, such as these teletext fonts.

This all said, a big part of the design of TV fonts has historically been geared to low resolution displays. In HDTV, I suspect a wider range of fonts are acceptable visually. So long as it's not Comic Sans, of course...

Thursday, October 1, 2009

Tutorial: Antialiasing for DTV

The point of this blog was always to help bring people who are experts in DTV up to speed in the emerging graphics technologies. This mini tutorial looks at antialiasing and aliasing.

There are many types of aliasing, but the most easily understood is that seen when drawing lines. The screen is composed of pixels. A perfect line cannot be drawn; instead one is approximated by filling in pixels. Close up it might look something like this:
This is aliasing. It's the result of sampling something continuous (the line equation) into discrete digital samples (the pixels). The effect ranges from unnoticeable to extremely annoying, especially if the line moves. When the line moves slowly, crawl occurs as first one set of pixels and then another is highlighted. The line is meant to be moving slowly and smoothly, but the pixels switch suddenly. Over many frames of animation this makes the line look like it's changing shape and crawling across the screen.

Antialiasing can be seen as an attempt to smooth out the digital sampling and produce less harsh edges. Here is an antialiased version:

Pixels around the line are measured for how close they are to the line, and their colour is chosen depending on that distance. It looks like the line is just blurred, but it isn't: blurring would not use distance from the line equation. The antialiasing makes all the difference visually.

Fortunately 3D chips have antialiasing built in. This is all very well, but TVs have digital filters of their own that process the output of any set top box. These filters blur or sharpen the image AFTER the graphics output. In experiments I've done, the filters within a single TV can make a huge difference, and the variation between different TVs can make any attempt to compensate useless.

On LCD televisions this situation is even more interesting, as each pixel is composed of three different coloured bulbs in a grid. Something like this for a single pixel row:
Now it's possible to use each coloured bulb and measure its distance from the mathematically perfect line. This is "sub-pixel" antialiasing. It means we get more antialiasing and, in effect, less blurring. The result, for black text on a white background, looks like this:

It's worth taking a moment to see that the left of the letter is more red and the right more blue. Looking at the pixel grid and distances you might think it would be the other way round, but this is drawing black on white, so things further from the line or letter are brighter, and so red is brighter on the left. There is a great article on this at java.net, from where I blatantly stole the images.

So in antialiasing we might deal with pixels and digital sampling, and with LCD bulb colours and layout. There is no easy way to deal with TV filters; the variation is simply too high. Yet there are other kinds of aliasing we can hope to deal with directly.

Temporal aliasing occurs when we sample a continuous animation. Imagine an image moving from one side of the screen to the other in one second. At 50 fps the image will be sampled at 50 locations as it moves; at 5 fps, only five. This is a form of aliasing, and it explains why high frame rates are critical for rendering smooth graphics.

We can do better. Antialiasing of motion is called motion blur. It attempts to add graphics in the direction of motion. Here is a photograph of a pool table. Because the shutter of the camera remains open for a short time, the balls move across the image and leave a motion trail:
A single image again looks blurred, but when seen as an animation it all makes much more visual sense. This is computationally very expensive to render in a user interface, though some approximations can be done, such as those provided by this Flash tool. Disney artists long ago used techniques to approximate motion blur (reminder: temporal antialiasing) and actually deformed objects as they animate:



Notice the shape deforming on the ball, particularly just before it hits the ground. This is a crude, but highly effective, version of motion blur. The same technique can be applied to images as they move across the screen: they can be stretched slightly during fast motion to suggest motion blur.

Image sampling can also suffer from aliasing, and there is considerable hardware built into modern 3D chips to avoid it. Mip-mapping and anisotropic texture filtering are used to avoid aliasing when scaling images. A blitter also uses a many-tap filter (taps being texture accesses per pixel) to draw nicely scaled images without aliasing. The idea in all cases is simple: when an image is scaled or rotated, one pixel on the screen does not correspond to a single pixel in the source image. The colours of many source pixels are needed to draw one pixel on the screen, and the source pixels are averaged depending on distance (compare with the line).

So, antialiasing is a critical technique for producing compelling graphics. It has spatial and temporal forms. Some spatial antialiasing is done for us by the 3D hardware and blitter, but is then often ruined by the filters in TVs. Temporal antialiasing is usually achieved by rendering more frames per second. However, it's possible to consider deforming objects in the direction of motion, as cartoon animators have done for nearly a century now.

Tuesday, September 29, 2009

Video of Intel CE3100 Running Flash 3.1



The first thing that strikes me is the quality of the H.264 video from YouTube. It's truly the best I've seen so far. The next thing is how damn slow the FLASH user interface is. The CE3100 is running at 800MHz which, according to Engadget, is a 1.2GHz Atom equivalent. There is PowerVR 3D hardware in the device. And this is FLASH Lite, not FLASH 10. It's running at 2-3 frames per second, I guess. This is on the edge of being simply unusable: for sure it would be at least annoying.

It begs the question: is FLASH a really great way to quickly design slow user interfaces? To date, I have not seen any FLASH demos that impress in terms of graphics and usability.

According to Digital Cable News:
Time Warner Cable Inc. (NYSE: TWC) has used Flash in some of its boxes for several years, but industry sources say the MSO is in the process of phasing it out. However, Time Warner Cable is said to still be interested in the potential of Flash, and could consider it as an execution engine in digital set-top boxes later on.
Meanwhile Digital Cable News reports Comcast are approaching FLASH cautiously:

"We do want to see this [Flash] ship on actual set-top boxes," Comcast senior vice president and chief software architect Sree Kotay tells Cable Digital News. But he envisions Comcast starting out with more "lightweight" apps that can be embedded with the IPG, such as email readers and weather widgets.

Getting even to that point will take a while. Comcast is busy in 2009 getting base tru2way architecture deployed in the first place. The addition of Flash could be as much as 24 months away, Kotay says.

It seems clear right now that adopting FLASH as the core application delivery strategy for a set top box project would be a bad idea. The power to deliver compelling user interfaces and applications simply isn't there today. A better approach seems to be to build on proven, faster software and integrate FLASH for secondary applications, as Comcast are doing.

This won't be free though. The effort of integrating FLASH and getting it working with an existing, proven, fast environment will cost someone money. Whereas Time Warner and Comcast can afford these kinds of progressive projects, not everyone can. Can operators really make money from integrating FLASH at this time? FLASH is at the peak of the Gartner hype cycle; the cost of implementation is hidden somewhere deep in the trough.


FLASH per se is a great idea, and it's easy to see why the idea of FLASH is so popular. At least for now though, the idea is much greater than the reality, in my opinion.

Monday, September 28, 2009

The Future of Television and HDTV - Featured Article by Digital Trends



A nice little article which, whilst not strictly about graphics, does look WAY into the future at what might be coming, technology-wise, for displays, which will ultimately affect graphics too. Couldn't resist the Star Wars reference, sorry.

Gametree TV Gaming Model



There are a few ways to deliver games to the set top box or iDTV. The traditional model is that a company provides the games and servers themselves. The games are developed in-house and remain casual in nature. All intellectual property is owned by the company delivering the game service. Operators can either host the game portal themselves or pay for it to be hosted on a server. Digiquest is an example of this type of model. Typically their games are simple, casual games that target many platforms.

Another way to deliver games is by streaming video. Here the games, usually full console or PC titles, actually run remotely on a server or gaming console, and video of the result is sent to the user. This allows any game at all to be run, but has considerable issues of lag and bandwidth to overcome. In addition, the server back end could be expensive to set up and maintain. Operators would not run this server service themselves, I guess. One company providing such a service is Playcast, which I have previously blogged about here.

Free, in the box games exist of course, but they are of little interest commercially speaking.



Now TransGaming Inc are attacking the TV gaming world. Known for their portability layer for the Mac platform, TransGaming have established a gaming service called Gametree. Here the company concentrates on servers, porting APIs and, crucially, APIs to support various business models for delivering the games, including advertising. The idea is that certain PC games will run on next generation set top boxes. TransGaming, rather than developing the games themselves, enable the porting and integration of existing games onto the platforms and their Gametree server and business model.

This seems like a good idea. So much so that Intel have invested. This is logical, as Intel perhaps make porting easiest with their Canmore and, in the future, Sodaville devices. It's not clear if the relationship is exclusive.

There is a video available from the Gametree site. It's highly marketing oriented, but it does push the ability of games developers to integrate APIs that allow advertising and various purchasing models for the games; something important until Gaming on Demand is well understood. Also, TransGaming plan to integrate gaming controllers into the platform. An absolute necessity, and one that again reaffirms my view that next generation set top boxes will no longer use the traditional RCU.

The biggest problem with this method of delivery is that the game must run on the set top box, which of course is not a PC. This means only a limited range of PC games are applicable. The website has this to say:

While the graphics performance isn't on the same level as high-end triple-A gaming, it is not intended to be. It is truly amazing the number of titles that are possible on GameTree.tv, including native Linux games and Flash-based games.
Nonetheless, the problem of which games to deliver is shifted from TransGaming to the games companies, where it should be. The server and supporting APIs are provided by TransGaming, where they should be. Business models are flexible, as they should be. It all sounds rather good.

Sunday, September 27, 2009

Intel CE4100 Atom Gives Double Graphics Clock Rate



A few days ago Intel officially announced its new CE4100 device. It's hard at this stage to penetrate buzz words like "connected", "BluRay" and "FLASH 10", and no date for shipments has been announced, but one number given is the clock rate of the graphics: it has been doubled.

Canmore already has the smoothest graphics of any device available today, and the CE4100 ("Sodaville") seems set to keep Intel ahead of the pack in 2010 at least. Doubling the clock rate may not double graphics performance: if, for example, the application is limited by memory access, no improvement will be seen. Nonetheless, I can't wait to see this in action.

Sony XMB



The set top box is an odd device. It's not a mobile device, nor does it have the power of a PC. When designing a new interface for it, it's not always clear which path to follow in terms of copying technology and learning from those two worlds. One additional source of inspiration might be consoles. The Sony XMB (cross media bar) is a UI design (seen above) that is meant to work on a range of devices; it currently works on both consoles and TVs.

The XMB is a mainly 2D interface, a cross of icons with associated text. The user moves both horizontally and vertically and can see, at all times, the complete menu path taken to reach the current choice. Text is only displayed where it could affect the user's decision making (i.e. it's contextual), keeping the screen free of clutter. The focus moves slowly and never far; rather, icons scroll underneath the user's focus. Unavailable options are faded out. Leaf menus may be displayed in almost any fashion, and differently on different devices; the XMB is mainly a way to navigate the menu structure. The mainly 2D approach does not demand advanced 3D hardware. Users note that it is elegant and fast. Here is an example on a TV set:



There are many great subtle features, such as the way the bars cross each other and icons skip a space to appear on the other side of the bar, and yet the user hardly notices. The animation itself, with rapid acceleration and deceleration, feels almost alive when navigating. Many of the subtle features work only because the interface renders at 50 or 60 frames per second. Such animations can't work at 10-20fps. Rendering speed matters.

Yet this generic, works-anywhere approach suffers from lowest common denominator problems. Even many Sony PlayStation fans complained that the Xbox had a more media oriented approach that simply looked better. Add to that the fact that the icons were very small and barely visible on anything but a large screen TV, and Sony had problems.

They responded and, through firmware updates, upgraded the initial interface substantially over time. The new version 3.0 was released recently:



So Sony listen and adapt. User interfaces are updated after delivery to the users. Media richness is an important feature; 3D per se, it would seem, is not. Performance allows for subtle animations that bring the interface to life.

Thursday, September 24, 2009

Seeing (Android) is not believing

At IBC 2009 Kaon were displaying an "Android set top box" on their stand. It was playing HD video with Android graphics overlaid. Apart from one sign, it was just one of dozens of set top boxes Kaon were displaying on their impressive stand. This was rather surprising, as you would think an Android set top would be big news.

Two other things were quite strange. The graphics seemed fuzzy, not clear at all, and the set top box itself was quite large: I had expected an Android set top box to be quite small, Android coming from the mobile space.

It turned out there was a good explanation. Although encased in a production set top box housing, the "set top box" was far from production inside, consisting of two hardware boards. The first was a normal set top box, running the video with its own central CPU, and the second was running Android on a different central CPU. My guess is that this was effectively a mobile phone HDK. The output resolution of the Android part was 800x480 (odd, but that's what I was told) and this was then overlaid on the true HD output of the set top box.

Wednesday, September 23, 2009

Patent: Remote touch screen user interface

Thomson hold a user interface patent described thus:

A method for control comprises a set top box receiving coordinates from a touch sensing screen. The coordinates are interpreted for controlling the set top box, and in accordance with the interpreted coordinates an action is performed. A further method for control comprises a set top box receiving a signal representative of displacement. A control function is determined from the displacement representative signal and the control function is activated. In accordance with the control function a signal is formed for communication.

It's a 2D/3D input system for a set top box from something like a touch sensitive RCU, or more likely a media device. The trick to the patent seems to be that the video from the set top box is used as the rendering of the user interface for the input device. That is, the touch sensitive input device does not render a user interface itself but takes a video stream and displays it. The video stream from the set top box already has the user interface rendered into it. The user then uses normal touch or stylus input and these co-ordinates are transmitted back to the set top box, where some smart software converts those co-ordinate inputs into commands for the set top box (which may also have a more traditional form of input).

Here is a diagram:



Nice.

3D User Interface Conference and Competition


The IEEE has run a conference on 3D user interfaces each year since 2006. In 2010 it is being held in Waltham, Massachusetts, USA in March. There is also a competition, open to any platform, using any hardware, for novel 3D interfaces. It's early days yet and the web site is incomplete. The conference looks small and intimate but it is linked to the IEEE Conference on VR and so will no doubt have a Virtual Reality slant to it.


Tuesday, September 22, 2009

Mini Tutorial: What is Deferred Shading

Over at Imagination Technologies, the PowerVR, found in Intel chipsets for set top boxes, is described in this way:

POWERVR graphics technology is based on a concept called Tile Based Deferred Rendering (TBDR). In contrast to Immediate Mode Rendering (IMR) used by most graphics engines in the PC and games console worlds, TBDR focuses on minimising the processing required to render an image as early in the processing of a scene as possible, so that only the pixels that actually will be seen by the end user consume processing resources. This approach minimises memory and power while improving processing throughput but it is more complex. Imagination Technologies has refined this challenging technology to the point where it dominates the mobile markets for 3D graphics rendering, backed up by an extensive patent portfolio.

Tile based rendering was covered in a previous tutorial. There I described the normal graphics process as having two distinct phases, a geometric transformation one and a drawing one:

for each triangle
  • transform it and light it
  • project to 2d
  • clip it to the window
for each pixel in the triangle
  • check visibility (usually done with z-buffer hardware) and if it's visible
  • interpolate lighting
  • fetch textures to draw on it
  • write the pixel with alpha blending if necessary
Notice that every pixel in every triangle is drawn. The problem with this technique comes from the fact that some pixels which are drawn are later overwritten by pixels from other triangles which are closer to the viewer and thus obscure the previously calculated pixel. The more triangles overlap in the scene, the more wasted effort occurs. This is explained in the image below:



This is a significant factor in the DTV world. We have relatively few triangles and huge numbers of pixels to deal with. This "overdraw" - calculating a pixel's colour more than once - is a major performance factor for us.
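The cost of overdraw is easy to make concrete with a toy software sketch. This illustrative Python (a 1D "screen" and triangles reduced to pixel spans, nothing like real hardware) z-buffers two overlapping surfaces and counts how many pixels are shaded only to be partly overwritten later:

```python
# Toy immediate-mode renderer: each "triangle" is (pixel span, depth, colour).
# Every covered pixel that passes the depth test is shaded immediately,
# even if a closer triangle later overwrites it - that is overdraw.
def render_immediate(triangles, width):
    zbuf = [float("inf")] * width   # z-buffer: smaller z = closer to viewer
    color = [None] * width
    shaded = 0                      # total shading work performed
    for pixels, z, col in triangles:
        for x in pixels:
            if z < zbuf[x]:         # visibility test
                shaded += 1         # "shade": lighting, texture fetch, blend
                zbuf[x] = z
                color[x] = col
    return color, shaded

# Far triangle drawn first; a near one then overwrites half of it.
tris = [(range(0, 8), 5.0, "far"), (range(4, 12), 1.0, "near")]
color, shaded = render_immediate(tris, 16)
print(shaded)   # 16 pixels shaded, but only 12 are visible in the end
```

Four of the sixteen shading operations were wasted work; with deeper overlap the waste grows.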

Deferred rendering/shading rewrites the graphics pipeline like this:

for each triangle
  • transform it and light it
  • project to 2d
  • clip it to the window
for each pixel in the triangle
  • check visibility (including overlap) and store reference if necessary
next triangle

for each pixel on the screen
  • interpolate lighting
  • fetch textures to draw on it
In other words, deferred rendering attempts to shade each pixel in the resulting image only once, by first identifying how triangles overlap and storing, then shading, only the closest one at each pixel on the screen. The overlap can be detected very quickly using a z-buffer, present on most graphics architectures. As there are many more pixels than triangles, shading each pixel as few times as possible is a huge win.
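The same toy scene sketched in deferred style (again illustrative Python only): the first pass resolves visibility and stores just a reference per pixel, and the second pass shades each visible pixel exactly once:

```python
# Toy deferred renderer: pass 1 resolves visibility with a cheap z-buffer
# test and stores a triangle reference per pixel; pass 2 shades each
# screen pixel at most once using that stored reference.
def render_deferred(triangles, width):
    zbuf = [float("inf")] * width
    ref = [None] * width            # reference to closest triangle per pixel
    for tri_id, (pixels, z, _col) in enumerate(triangles):
        for x in pixels:
            if z < zbuf[x]:
                zbuf[x] = z
                ref[x] = tri_id     # store reference only; no shading yet
    shaded = 0
    color = [None] * width
    for x in range(width):          # deferred shading pass
        if ref[x] is not None:
            shaded += 1             # shade once per visible pixel
            color[x] = triangles[ref[x]][2]
    return color, shaded

tris = [(range(0, 8), 5.0, "far"), (range(4, 12), 1.0, "near")]
color, shaded = render_deferred(tris, 16)
print(shaded)   # 12: exactly one shade per visible pixel
```

The final image is identical to the immediate-mode one, but no shading work is thrown away.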

The biggest problem for deferred shaders is alpha - transparency. When a triangle is transparent, the triangle behind it is also visible, negating the benefits of the technique and forcing the rendering pipeline to store several copies of pixels, one from each triangle, during the triangle transform stage. It gets even more complex when you consider that we could receive a transparent triangle, then another on top, then an opaque one on top of them - forcing the architecture to remove the first two entries and replace them with one reference to the opaque triangle for efficiency. In other words a collapsible stack, performing at incredibly high speed, is required in hardware. An area rich in patents, no doubt.

One last point. Deferred shading does not require a tile based architecture such as PowerVR. Indeed, many modern games use deferred shading and it is key to high rendering performance at high resolutions.

ARM Mali Explained and Future Direction of DTV Graphics

Over at ARM there is a "confidential" presentation, freely available. It makes very interesting reading about the current generation of ARM Mali processors, their target performances, availability and future roadmap.

For me there are two very interesting slides. The first is this one, you can click on it to enlarge it:
This slide shows that the Mali-200 will cope with FLASH Lite at low resolutions (SD I guess), but will struggle with FLASH 10 at lower HD resolution. Only the Mali-400 will cope with FLASH 10 easily. It's noticeable as well that video post processing - something shaders should be good at - will only be good with the Mali-400. This makes me wonder what the chip manufacturers will ship in 2010: the 400 or the 200? Even an HD TV GUI is borderline according to ARM. Perhaps it's ARM upselling the 400, but the diagram is a little worrying. Anyone from ARM care to comment?

The second slide is only academically interesting at this stage:
There is a real mixture here. Power consumption is mostly for mobile devices. Composition is of most interest to DTV and comes as a software layer it seems, whilst geometry shaders and OpenCL would be of main interest to gamers with advanced rendering engines and physics.

Love the gal with the crystal ball.

Monday, September 21, 2009

To 3D, or not to 3D, that is the question...

Shakespeare may well have written:
To 3D, or not to 3D, that is the question:
Whether 'tis nobler in the set top to suffer
The spins and zooms of outrageous interfaces,
Or to take arms against a sea of dimensions
And by opposing end them.
The first consumer PC graphics cards were introduced in 1995. Prior to that, for a good ten years, 3D was available on workstations such as those from IBM, Sun or, best known at the time, Silicon Graphics. By 1995 OpenGL was well established (being based on the existing IrisGL from SGI) and, with the exception of shaders, little has changed in the graphics world since.

Yet, 15 years later the user interface of Windows from Microsoft and, more telling perhaps, the user interface from Apple remain solidly 2D and we remain with interfaces like the one below:



In the early days of 3D, of course, there were many attempts to bring 3D interfaces to the desktop. PC shows looked very similar to set top box shows of today with a dozen 3D metaphors for licensing/sale. I particularly liked the messy bedroom metaphor. The idea was that your interface was a bedroom (or a house) and you left files, well, wherever, under the bed, next to the cat, on the TV. The claim was that you could more easily remember leaving your notes next to the cat than you could remember leaving them in /usr/ct/home/private/expenses/trips/florida. It never caught on.

One arguably successful user interface in 3D was seen briefly in the film Jurassic Park, running on an SGI. The young Lex sits down, pulls up FSN (File System Navigator) and declares that she knows this system, it's Unix. Here is a shot of FSN, which was shipped with every SGI machine (not every Unix machine ;-) ).



It was a file searching utility and genuinely useful. The user saw a landscape of files and could easily find large files or large collections of files or new files (colour) or combinations. It presents far more information than a traditional windowing system can and removes the need to navigate the tree of files, allowing the user to jump to interesting directories simply by clicking on something in the distance.

Useful it may be, but pretty, it isn't.

Another successful 3D interface in the sea of unsuccessful ones is CoolIris. Though many would claim it is barely 3D at all, it presents a wall of images which can be examined and scrolled quickly. The user can very quickly find, through visual cues, an image in the distance that would otherwise take many clicks to find in a 2D paged scheme. Here is an image:



Again, like FSN, the interface presents information in the distance that a 2D windowed interface would fail to present at all. Again it's not generally useful, but useful for a specific type of data. However, it is well designed. A video is available on the website, or you can download it and install it in Firefox.

This is key, I think: for the first time in history, with FLASH 10, designers of user interfaces have access to 3D technology and can experiment freely without recourse to programming. So perhaps the technology has not changed much, but the usability has. The power is coming into the right hands (after 15 years).

We will see some experimentation at first. Whitevoid, who created the Liquid user interface for Rovi, have an interesting 3D portfolio at their website. Meanwhile over at EcoZoo, it gets wilder. EcoZoo illustrates that a 3D interface based on a 3D world metaphor can work, but also that it is much slower to navigate and less intuitive. Such interfaces will fail because they do not add to the user experience but subtract from it. Again I'm reminded of the early 3D worlds for PCs and workstations. The real question is: can designers help create a generation of 3D interfaces that are genuinely useful?

Here are some golden rules that I made up for 3D interfaces to succeed. Thornborrow's golden rules of 3D interfaces:
  1. Use 3D only when it presents more information more quickly than 2D
  2. Simplify the 3D as much as possible without breaking rule 1
  3. Relevant (focused) information should be "close" to the user
  4. Present textual information in 2D, not 3D
  5. Only have supporting visuals such as images and visual cues in 3D
  6. Never force the user to navigate a 3D world freely; instead use 2D control metaphors
  7. Short highlight effects and transitions are ideal for 3D
  8. Make it as fast as possible: the performance of 3D needed for a good experience is high

Sunday, September 20, 2009

Tutorial: Drawing Trapezoids with a Blitter (and more)

3D without a 3D chip? At HD resolution? at 25 fps? Yes it is possible.

EGG from Osmosys and STMicroelectronics' graphics library for the blitter can both draw rotated images. Specifically, images rotated about the Y axis and the X axis (not Z). This results in trapezoidal shapes like those below:

These trapezoids can be useful for user interfaces such as in the example below:

However, a blitter, on which they both rely, can only copy a block of memory and can only draw a scaled, flat, square image (not quite true, but I will assume it for simplicity). How is it possible to get 3D effects from a blitter? The trick is simple: use more than one blit to draw the angled surface and vary the scaling. For example, in the case of a surface rotated about the Y-axis (as above), a number of thin vertical blits can be used to draw the resulting image. The scaling power of the blitter is used to simulate the decrease in size of the image as it gets further away. Thus a series of blits can be used to draw a single off-angle surface in perspective.

The blitter on modern chips, however, is very efficient at drawing pixels but quite slow at setting up individual blits. The key to performance, then, is to reduce the number of blits needed to draw such a surface.

So how many blits are required? The obvious technique (and one already patented at the time we created EGG) was one blit per line of the resulting image. This means drawing lots of one-line-wide blits. The scaling is calculated for each line and passed to the blitter. The drawing below illustrates this:



Incidentally, the part of the source image that fits into this destination line must also be calculated (to maintain correct perspective). However, the bottleneck is setting up the blitter and issuing a blit command. It should be obvious that reducing the number of blits is critical.
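The per-line scaling calculation can be sketched like this. This is illustrative Python only (real blitter setup is entirely hardware-specific, and the viewer distance of 500 is an arbitrary assumption); it iterates over source columns for simplicity, where a real implementation would walk destination columns and inverse-map into the source:

```python
import math

# One blit per vertical line: for an image rotated about the Y axis, each
# column gets its own scale factor from simple perspective division.
def column_blits(src_w, src_h, angle_deg, viewer_dist=500.0):
    a = math.radians(angle_deg)
    blits = []
    for x in range(src_w):
        # depth of this column after rotating the image plane about Y
        z = viewer_dist + (x - src_w / 2) * math.sin(a)
        scale = viewer_dist / z              # perspective foreshortening
        dest_h = max(1, round(src_h * scale))
        blits.append((x, dest_h))            # (source column, scaled height)
    return blits

blits = column_blits(src_w=256, src_h=128, angle_deg=30)
print(len(blits))   # 256 blits, one per column of the source image
```

Note the blit count equals the image width: for a wide image that is hundreds of blit setups, which is exactly the bottleneck described above.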

Osmosys have patented a more efficient technique, which is now public, so I can blog about it. Instead of blitting every vertical line with a new scaling factor, the technique identifies the maximum-size block of pixels with the same scaling factor that can be rendered using a single blit. In other words, the technique finds a series of rectangles in the destination image, often wider than one line. A diagram helps to explain this:


In the example above, the Osmosys technique uses 9 blits to draw the trapezoid, whereas the original technique uses 26. That's 3x more efficient in this case. At smaller angles and with big images the technique really wins, as only a few blits are needed even for a large trapezoid, compared with many hundreds for the traditional technique. The worst case uses the same number of blits, but in the general case the technique is of the order of a magnitude faster.
There are further optimisations possible but I'm not free to talk about those.
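The run-merging idea can be sketched in a few lines. This is illustrative Python under my own assumptions, not the patented algorithm itself: adjacent columns whose (quantised) destination heights agree are merged into a single wider blit:

```python
# Merge adjacent columns that share the same destination height into one
# wider rectangular blit. A sketch of the run-merging idea only.
def merge_columns(column_heights):
    runs = []                       # list of (start_x, width, dest_height)
    for x, h in enumerate(column_heights):
        if runs and runs[-1][2] == h:
            start, width, _ = runs[-1]
            runs[-1] = (start, width + 1, h)   # extend the current run
        else:
            runs.append((x, 1, h))             # start a new run
    return runs

# Heights from a gently angled surface: many neighbours share a height.
heights = [100, 100, 100, 99, 99, 98, 98, 98, 98, 97]
runs = merge_columns(heights)
print(len(runs))   # 4 blits instead of 10
```

The shallower the angle, the longer the runs and the fewer the blits, which matches the behaviour described above: big wins in the general case, and in the worst case (every column a different height) it degrades to one blit per line.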

Of course, rotating about the X axis is similar but the blits are drawn horizontally, not vertically.

The technique could be taken further to draw triangles and allow true 3D shapes in their libraries, or to rotate images about the Z axis. However, the number of blits could be very large and therefore probably not worth the effort. It's still remarkable, however, that a blitter, when used correctly, can draw many 3D surfaces at HD resolution at frame rates of 20fps or more.

The biggest problem with the technique is aliasing. The edge of the image is very clear and presents so-called "jaggies". At IBC this year Sagem used the ST library to show a 3D interface. To cover up the jaggies, the background was a very strange grey-black grid of lines - highly unsuited to the TV screen but a clever way to hide the problems.

Incidentally, even though the technique developed at Osmosys is, I believe, optimal, the patent appears to be badly written (i.e. not by a graphics expert) and thus easy to work around. But then again, I'm no lawyer.

Ekioh SVG Engine


Ekioh provide an SVG engine, described as:
(the) most advanced user interface engine, available for any embedded device such as a television, Set Top Box (STB), mobile phone or portable media player. Utilising the latest Web 2.0 standards from the W3C, the Ekioh UI Engine can provide a user experience second to none, with a completely customisable user interface and comprehensive integration with media controls.
Some details can be found at Ekioh's website. It appears to be the engine I saw in action on Dreampark's stand at IPTV World Forum 2009 and blogged about previously. Ekioh is a small company formed by ex-employees of ANT, but with some good momentum judging by their press releases.

SVG, technically, plays in the same space as FLASH, and Adobe creative tools can output SVG in some form. The hype surrounding SVG right now may die down once set tops have FLASH 10 and can play many of the games now available on the internet. In the meantime there is little to choose between FLASH and SVG. They are both vector formats that have been bent to make use of images through blitting and will also exploit OpenVG/GL-ES as it becomes mainstream.

Guessing right now you would have to say that FLASH has the edge in terms of momentum, future usefulness (games) and trained designers. It has heavyweight marketing and resources on its side and the chip vendors are supplying FLASH but have not been clear yet on SVG. SVG is more lightweight and seems to run a little faster.

The future has a way of eluding prediction, but the logic suggests SVG and FLASH may both have a place. SVG could be used with low end solutions, useful for the user interface but little else. FLASH meanwhile could be used at the higher end: faster platforms with higher performance 3D. The content available on the internet for FLASH would then be available on the higher end platforms. The picture probably will not clear up until we reach FLASH 10 for set tops... 2011?

Wednesday, September 16, 2009

3D TV is not 3D Set Top Box

It was hard to find a killer theme at IBC this year, but if there was one it was "3D". The hype surrounding 3D movies and television was there even if, in most cases, the products weren't. In many cases companies hijacked the hype and announced 3D Television - only to produce a pseudo-3D GUI. Over at ITVT there is rather an amusing article on this...

However, though we didn't join the hype pool, Alticast did show something interesting. Alticast presented a 3D Video on Demand user interface running at high frame rates and full HD resolution. It was powered by a Broadcom chipset (7400) but the important point is that it was a Java 3D application, to be precise a GEM application using JSR 239.

JSR 239, also known as JOGL by some, is simply Java bindings for OpenGL-ES. This means that the C-level calls of the OGL-ES drivers are duplicated at the Java level. This standard sits alongside GEM and allows GEM/MHP/tru2way/Blu-ray/ACAP applications to use any 3D graphics chip.

Tuesday, September 15, 2009

"Harry Potter" and Dolphin Remote Control for TVs

I'm growing more and more convinced that novel input devices take a side-by-side place with advanced graphics in the digital TV world. 3D is no good unless you can navigate it. A run-jump game needs, well, running and jumping, or two key presses simultaneously. Shooting games work much better with a pointer device. Simply put, the traditional RCU is not enough for next generation user interfaces. To this end I'll include remote control devices alongside the graphics on this blog.


The first device of interest is the Dolphin: a point and click device. It is based on the hardware development kit from Hillcrest Labs (see this blog entry) and gives a very positive mid-air feel (unlike some devices in this category). The Dolphin can be used as an air mouse or can have gesture recognition. There is a longer article here. The device will apparently ship with a Kodak picture viewer in the near future, and shipping in huge numbers is likely to bring the device cost down.



A second and incredibly fun device is the Kymera Magic Wand from the Magic Wand Company. It comes in a real dragon skin box (I am told) and allows the young at heart to control an IR device in the home using a flick and swish. In fact the device can recognise 13 different inputs (tap, tap side, rotate right, flick, rotate left, quick flick left or right, pull back, push forward and so on). The device is programmable with the input from your cumbersome remote control and can send any IR signal for any gesture, allowing such feats as rewinding, powering on from standby and volume control, all with Hermione Granger grace.

It works and can even control lights in the house (infinite fun) but does cost a whopping 50 UK pounds from the Magic Wand Company. I know I'm buying one for my little Hermione.

Adobe(R) FLASH(R) for the Digital Home

Finally, after months of trying, the FLASH story on set top boxes is clearing up. For the near future Adobe will be licensing Adobe Flash Lite 3.1 to silicon chip vendors. It will run on high end offerings only. Adobe offers two licensing models: free and paid. The free license must leave the FLASH engine open (upgradeable), whereas the paid license may enclose the engine in the stack at manufacturing time. This is an interesting conflict: most middleware and hardware vendors (who write drivers) would not want a third party upgrading the platform, I suspect. The testing and support costs could be large. There is no such thing as WHQL for set top boxes, so the latest FLASH engine can be a wild card. Secondly, it's not totally clear what "open" means. Perhaps it will mean more than upgrading in the future (delivering adverts on FLASH startup?). This is a new model for set top box vendors and cable operators. However, Adobe are most keen for the free model to be taken up and may price the paid license high in order to discourage its uptake.

Which version of FLASH? It will be FLASH Lite 3.1, which is a subset of FLASH 8 with some video support added from FLASH 9/10. The most interesting added video functionality is H.264. The functionality missing from FLASH 8 is:

  • Filters (blur, drop shadow, and so forth)
  • Blend modes (add, subtract, multiply, and so forth)
  • Enhanced strokes (miter, square, and so forth)
  • Text as Link
  • setTimeout
  • _target
  • Encoding per pixel alpha with video created with Flash 8 Professional (On2 VP6)
  • Bitmap caching
  • ActionScript objects or methods
  • Flash remoting
The roadmap is currently under NDA but will be available publicly on October 4th, so watch this space...

Wednesday, September 9, 2009

IBC 2009

Well, for once I'm excited by a trip to IBC. I'm looking forward to the 3D graphics innovations that should be there this year. If you read this blog and you are at IBC, pop over to my employers stand and say hello and we can discuss graphics until, as they say, the cows come home.
Maybe you think your company should be on this blog?
Maybe you think you shouldn't have been?

Alticast, stand 1.C35,BM2
Ask for Chris Thornborrow

See you there!

Tuesday, September 8, 2009

Intel Canmore

The Intel Canmore (CE 3100) is a system on a chip for DTV. It includes an 800MHz processor, security hardware, dual 300MHz DSP decoders for video, PCI Express, ATA, USB and Wi-Fi. Phew! There is an in-depth article over at EETimes.

The important thing for this blog is that the chip also includes a 3D core - the PowerVR from Imagination Technologies. It is OpenGL ES 2.0 and OpenVG compatible. Here is a video of Yahoo widgets running on a Canmore device.



The widgets are not entirely smooth, but perhaps this is an early demo. In the meantime, a real set top box was launched by Gigabyte back in June 2009 using the Canmore.



Here we see a video of the set top box running a game using two controllers!



And this month sees the launch of Metrological's Mediaconnect TV, also powered by Canmore. I hope to see this at IBC and do a more in-depth report on the device.

So Canmore is gaining some market traction in DTV at last, but it's fair to say these are high end devices and Canmore remains a high end chip.

Sunday, September 6, 2009

NXP announce chips using PowerVR for set top boxes


In a press release dated September 04 2009, NXP Semiconductors announce their latest range of processors for set top boxes, the catchily named NXP PNX847x/8x/9x. The interesting part for this blog is that the chips integrate the PowerVR architecture from Imagination Technologies.

The key may lie in the power management goal of NXP, who say in the press release that the architecture is specifically designed to maximize energy efficiency. This may explain the choice of PowerVR, which uses a region based architecture that helps to reduce power usage and has been very successful in the mobile sector.

Job: User Interface development for set top box

Synacor is advertising a number of roles, one of which is for a set top box developer with web development skills. You need to be based in America to apply.

Tuesday, September 1, 2009

SVG on set top boxes

SVG (Scalable Vector Graphics) is a family of specifications of an XML-based file format for describing two-dimensional vector graphics, with support for animation. It includes full and light versions (SVG Tiny), the latter more suited to the mobile environment. Recent versions of the standard include images and video as well as scripting.
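To give a flavour of how human readable the format is, here is a minimal, hand-written SVG file (illustrative only) that draws a filled circle and a line of text:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
  <!-- a filled circle with a dark outline -->
  <circle cx="60" cy="60" r="40" fill="red" stroke="black" stroke-width="2"/>
  <!-- text positioned by its baseline -->
  <text x="110" y="65" font-size="16">Hello DTV</text>
</svg>
```

Any text editor can produce this, and any SVG renderer will scale it to SD or HD without loss, which is the whole point of a vector format.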

SVG is an open standard, human readable, and appears to play head to head with FLASH. There are many comparisons of the technologies available on the web, written from different points of view, such as this one. However, that comparison states FLASH does not support filters, which FLASH 10 now does.

The famous tiger image above is a GIF image. However, it is rendered from a vector language called SVG. My blog provider does not support SVG files or I could have embedded that image as an SVG file directly into the web page.

SVG supports Java, and JSRs are available that give language bindings (e.g. JSR 287, language bindings for SVG Tiny 1.2) that allow for creating SVG files dynamically and rendering them.

One proponent of SVG in the set top box world is IPTV solution provider Dreampark. Dreampark, based in Sweden, offer a scalable user interface for IPTV. I was able to see the demo at IPTV World Forum 2009. The demo uses non-rotated (axis-aligned, if you like) boxes, images and text. Using scaling, some pseudo-3D was present. It was not clear from the demo whether the text scaled. So it appears the SVG from Dreampark is connected directly to the blitter and limited to the functionality it provides if a smooth, fast user experience is required. Of course over time, like FLASH, the vector functionality, such as lines and polygons, will be available as 3D hardware supporting OpenVG hits the market.

Ikivo also supply SVG for mobile devices and offer Enrich Mobile TV as a product. This is a user interface suite based on SVG for mobile TV. An example from their user interface is shown below.


Adobe, meantime, support SVG output from their Creative Suite tools. This means that whichever solution is chosen, Adobe still win, though of course FLASH offers Adobe more opportunity for additional license fees. And therein lies one of the real deciding factors between these two vector graphics languages: business. The licensing model for FLASH on set top boxes is not clear to me yet, and I wonder if it is clear to anyone. Designers will most likely pick FLASH if they can, but just how much will it add to the cost of an STB?

So in conclusion, FLASH and SVG offer similar functionality. FLASH is proprietary but has momentum in the press and with chip manufacturers. There are however a few companies using SVG for user interfaces for TV, Dreampark being the most significant. The license model is unclear for FLASH and once this is known the business choice between SVG and FLASH will become much more obvious.