Enabling the Fix

April 29th, 2013 | No Comments | Posted in Schubin Cafe

Sometimes cliches are true. Sometimes the check is in the mail. And sometimes you can fix it in post. Amazingly, the category of what you can fix might be getting a lot bigger.

At this month’s NAB show, there was the usual parade of new technology, from Sonic Notify’s near-ultrasonic smartphone signaling for extremely local advertising, on the order of two meters or so (palmable transducer shown at left), to Japan’s National Institute of Information and Communications Technology’s TV “white space” transmissions per IEEE 802.22. In shooting, for those who like the large-sensor image characteristics of the ARRI Alexa but need the “systemization” of a typical studio/field camera, there was the Ikegami HDK-97ARRI (right), with the front end of the former and the back end of the latter.

Even where items weren’t entirely new, there was great progress to be seen. Dolby’s autostereoscopic (no glasses) 3D demo (left) has come a long way in one year. So has the European Project FINE, which can create a virtual-camera viewpoint almost anywhere, based on just a few normally positioned cameras. Last year, there was a lot of processing time per frame; this year, the viewpoint repositioning was demonstrated in real time.

If you’re more interested in displays, consider what’s been going on in direct-view LED video. It started out in outdoor stadium displays, where long viewing distances would hide the individual LEDs. At NAB 2013, two companies, Leyard (right) and SiliconCore, showed systems with 1.9-mm pixel pitch, leaving the LED structure virtually invisible even at home viewing distances. Is “virtually” not good enough? SiliconCore also showed their new Magnolia panel, with a pitch of just 1.5 mm!

The Leyard display shown here (and at NAB) was so-called “4K,” with more than twice the number of pixels of so-called “Full HD” across the width of the picture. 4K also typically has 2160 active (picture-carrying) lines per frame, twice 1080, so it typically has four times the number of pixels of the highest-resolution form of HD.

4K was unquestionably the major unofficial theme on the NAB show floor, replacing the near-ubiquitous 3D of two years ago. There were 4K lenses, 4K cameras, 4K storage, 4K processing, 4K distribution, and 4K displays. Using a form of the new High Efficiency Video Coding (HEVC) standard, the Fraunhofer Institute was showing visually perfect 4K pictures with their bit rates reduced to just 5 Mbps; with the approval of the FCC, that means it could be possible to transmit multiple 4K programs simultaneously in a single U.S. broadcast TV channel. But some other things in the same booth seemed to be attracting more attention, including ordinary HD images, shot by INCA, a tiny, 2.5-ounce “intelligent” camera, worn by an eagle in flight. The eagle is shown above left, the camera, with lens, at right. The seemingly giant attached blue rod is a thin USB cable.

Throughout the show floor, wherever manufacturers were highlighting 4K, visitors seemed more interested in other items. The official theme of NAB 2013 was METAMORPHOSIS, with the “ME” intended to stand for media and entertainment, not pure self-interest. But most metamorphoses seemed to have happened before the show opened. Digital cinematography cameras aren’t new; neither are second-screen applications. Mobile DTV was introduced years ago. So was LED lighting.

There were some amazing new technologies discussed at NAB 2013 — perhaps worthy of the metamorphosis label.  But they weren’t necessarily on the show floor (at least not publicly exhibited). Attendees at the SMPTE Technology Summit on Cinema (TSC), for example, could watch large-screen bright images that came from a laser projector.

The NAB show was vast, and the associated conferences went on for more than a week. So I’m going to concentrate on just one hour, a panel session called “Advancing Cameras for Cinema,” in one room, the SMPTE TSC, and how it showed the metamorphosis of what might be fixed in post.

Consider the origin of post, the first edit, and it was a doozy! It occurred in 1895 (and technically wasn’t exactly an edit). At a time when movies depicted real scenes, The Execution of Mary, Queen of Scots, in its 27-foot length (perhaps 17 seconds), depicts a living person being led to the chopping block. Then the camera was stopped, a dummy replaced the person, the camera started again, and the head was chopped off. It’s hard to imagine what it must have been like to see it for the first time back then. And, since 1895, much more has been added to the editing tool kit.

It’s now possible to combine different images, generate new ones, “paint” out wires and other undesirable objects, change colors and contrast, and so on. It’s even possible to stabilize jerky images and to change framing at the sacrifice of some resolution. But what if there were no sacrifice involved?

The first panelist of the SMPTE TSC Advancing Cameras session was Takayuki Yamashita of the NHK Science & Technology Research Labs. He described their 8K 120-frame-per-second camera. 8K is to 4K approximately as 4K is to HD, and 120 fps is also four times the 1080i frame rate. This wasn’t a theoretical discussion; cameras were on the show floor. Hitachi showed an 8K camera in a familiar ENG/EFP form (left); Astrodesign showed one dramatically smaller (right).

If pictures are acquired at higher resolutions, they may be reframed in post with no loss of HD resolution. With 8K, four adjacent full-HD-resolution images can be extracted across the width of the 8K frame and four from top to bottom. A shakily captured image that bounces as much as 400% of the desired framing can be stabilized in post with no loss of HD resolution. And the higher spatial sampling rate also increases the contrast ratio of fine detail.
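
As a rough sketch of that arithmetic, the Python below counts how many HD windows fit across and down a larger frame and how much slack is left for sliding one window around. The frame sizes (7680 × 4320 for 8K, 1920 × 1080 for HD) are the commonly quoted ones and are my assumption here, as are the function names.

```python
# Frame sizes in pixels (width, height). These are the commonly quoted
# figures for 8K and full HD; they are assumptions, not session numbers.
FRAME_8K = (7680, 4320)
FRAME_HD = (1920, 1080)


def windows_per_axis(source, window):
    """Count non-overlapping windows that fit across and down the source."""
    return source[0] // window[0], source[1] // window[1]


def reframing_slack(source, window):
    """Pixels a single window can travel horizontally and vertically."""
    return source[0] - window[0], source[1] - window[1]


across, down = windows_per_axis(FRAME_8K, FRAME_HD)
slack_x, slack_y = reframing_slack(FRAME_8K, FRAME_HD)
print(f"{across} HD windows across, {down} down")
print(f"slack for reframing or stabilizing: {slack_x} x {slack_y} pixels")
```

With those assumed sizes, the count is four windows across and four down, which is where the room for reframing and stabilization comes from.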

Contrast ratio was just one of the topics in the presentation, “Computational Imaging,” of the second panelist, Peter Centen of Grass Valley. Above is an image he presented at the SMPTE summit. The only light source in the room is the lamp facing the camera lens, but every chip on the reflectance chart is clearly visible and so are the individual coils of the hot tungsten filament. It’s an extraordinarily high dynamic range (HDR); a contrast ratio of about ten million to one — more than 23 stops — was captured.
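
Stops are doublings of light, so converting a contrast ratio to stops is a base-2 logarithm. A minimal sketch of that conversion follows; the function names are mine, not anything from the presentation.

```python
import math


def contrast_ratio_to_stops(ratio):
    """Each stop is a doubling, so stops = log2(contrast ratio)."""
    return math.log2(ratio)


def stops_to_contrast_ratio(stops):
    """Inverse conversion: the ratio doubles with every stop."""
    return 2 ** stops


stops = contrast_ratio_to_stops(10_000_000)
print(f"10,000,000:1 is about {stops:.2f} stops")
print(f"23 stops is {stops_to_contrast_ratio(23):,}:1")
```

Ten million to one comes out a little above 23 stops, in line with the figure quoted above.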

Yes, that was an image he presented at the SMPTE summit — five years ago in 2008. This year he showed a different version of an HDR image. There’s nothing wrong with the technology, but bringing it to the market is a different matter.

At the 2013 TSC, Centen showed an even older development, one first presented by an MIT-based group at SIGGRAPH in 2007 <http://groups.csail.mit.edu/graphics/CodedAperture>, a so-called “coded aperture.” Consider a point just in front of a camera’s lens. The lens might zoom in or out and might focus on something in the foreground or background. Its aperture might be wide open for shallow depth of field or partially closed for greater depth of field. If it’s a special form of lens (or lenses), it might even deliver stereoscopic 3D. All of those things might happen after the light enters the lens, but all of those possibilities exist in the “lightfield” in front of the lens.

There have been many attempts to capture the whole lightfield. Holography is one. Another, used in the Lytro still camera, uses a fly’s-eye type of lens, which can cut into resolution (an NAB demonstration a few years ago had to use an 8K camera for a low-resolution image). A third was described by the third panelist (and shown in his booth on the show floor). The one Centen showed requires only the introduction of a disk with a pattern of holes into the aperture of any lens on any camera.

Here is just one possible effect on fixing things in post, with images from the MIT paper. It is conceivable to change focus distance and depth of field and derive stereoscopic 3D from any single camera and lens combo after it has been shot.

The moderator’s introduction to the panel showed a problem with higher resolutions: getting lenses that are good enough. He showed an example of a 4K lens (with just a 3:1 zoom ratio) costing five times as much as the professional 4K camera it can be mounted on. Centen offered possibilities of correcting both lens and sensor problems in post and of deriving 4K (or even 6K) from today’s HD sensors.

The third panelist, Siegfried Foessel of the Fraunhofer Institute, seemed to cover some of the same ground as did Centen (using computational imaging to derive higher resolution from lower-resolution image sensors, increasing dynamic range, and capturing a lightfield), but his versions used completely different technology. The higher resolution and HDR can come from masking the pixels of existing sensors. And the Fraunhofer lightfield capture uses an array of tiny cameras not much bigger than one ordinary one, as shown in their booth (right). Two advantages of the multicamera approach are that each camera’s image looks perfect (with no fly’s-eye resolution losses or coded-aperture light losses) and that the wider range of lens positions also allows some “camera repositioning” in post (without relying on Project FINE processing).

Foessel also discussed higher frame rates (as did many others at the 2013 TSC, including a professor of neuroscience and an anesthesiologist). He noted that capturing at a high frame rate allows “easy generation of different presentation frame rates.” He also speculated that future motion-image programming might use a frame rate varying as appropriate.

The last panelist was certainly not the least. He was Eric Fossum from Dartmouth’s Thayer School of Engineering, but he was introduced more simply, as the inventor of the modern CMOS sensor. His presentation was about a “quanta image sensor” (QIS) containing, instead of pixels, “jots.” The simplest description of a jot is as something like a photosensitive grain from film. A QIS sensor counts individual photons of light and knows their location and arrival time.

An 8K image sensor has more than 33 million pixels; a QIS might have 100 billion jots and might keep track of them a thousand times a second. The exposure curve seems very film-like. Fossum mentioned some other advantages, like motion compensation and “excellent low light performance,” although this is a “longer-term effort” and we “won’t see a camera for some time.”
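
To put those numbers side by side, here is a back-of-the-envelope sketch. The 7680 × 4320 frame size is my assumption for “8K”; the jot count and readout rate are the “might have” figures quoted above, not specifications of any real device.

```python
# Assumed 8K frame size; the text says only "more than 33 million pixels".
PIXELS_8K = 7680 * 4320

# Speculative figures quoted above for a quanta image sensor.
JOTS = 100_000_000_000        # 100 billion jots
READOUTS_PER_SECOND = 1000    # "a thousand times a second"

jots_per_8k_pixel = JOTS / PIXELS_8K
jot_samples_per_second = JOTS * READOUTS_PER_SECOND

print(f"8K pixels: {PIXELS_8K:,}")
print(f"jots per 8K pixel: about {jots_per_8k_pixel:,.0f}")
print(f"jot samples per second: {jot_samples_per_second:.0e}")
```

On those assumptions, each 8K pixel would correspond to roughly three thousand jots, which hints at why the grouping of jots can be decided after acquisition.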

The “convolution window size” (something like film grain size) can be changed after image acquisition.  In other words, even the “film speed” will be able to be changed in post.

4K* from 40,000 Feet, CCW, 4K Acquisition: The Possibilities & Challenges (Nov. 15, 2012)

November 20th, 2012 | No Comments | Posted in Download, Today's Special

4K* from 40,000 Feet (Nov. 15, 2012)

4K Acquisition: The Possibilities & Challenges session
Content Creation World
New York, NY

Video (10:29 TRT)

All You Can See

July 7th, 2012 | 2 Comments | Posted in Schubin Cafe

The equipment exhibitions at the annual convention of the National Association of Broadcasters (NAB) often seem to have themes. Two years ago, it was stereoscopic 3D. Before that, it was DSLRs. Long before HDTV became common, it was a theme at NAB conventions. And there was at least one convention at which the theme seemed to be teletext. At the 2012 NAB show, a theme seemed to be 4K.

What is 4K? That’s a good question without a simple answer. Nominally, 4K denotes a moving-image system with 4096 active (image-carrying) picture elements (pixels) per row. At one time, it was considered to have 2048 active rows; now 2160 — twice HDTV’s 1080 — is more common. But, if twice HDTV is appropriate vertically, why not horizontally, too? Sure enough, some call 3840 pixels across the screen 4K (others call it Quad HD, because twice the number horizontally and vertically results in four times the number of pixels of 1080-line HDTV).
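
The competing definitions are easier to compare as pixel counts. The sketch below tabulates the frame sizes named in this paragraph; the 1920-pixel width for 1080-line HDTV is my assumption (the paragraph gives only the line count), and the labels are mine.

```python
# Active pixels per row and active rows for each format discussed above.
# The 1920 width for HDTV is an assumption; the other numbers are from the text.
FORMATS = {
    "1080-line HDTV": (1920, 1080),
    "Quad HD (3840 across)": (3840, 2160),
    "4K (4096 x 2160)": (4096, 2160),
    "4K (4096 x 2048)": (4096, 2048),
}


def pixel_count(width, height):
    """Total active pixels in one frame."""
    return width * height


hd_pixels = pixel_count(*FORMATS["1080-line HDTV"])
for name, (width, height) in FORMATS.items():
    total = pixel_count(width, height)
    print(f"{name}: {total:,} pixels, {total / hd_pixels:.2f}x HDTV")
```

Only the 3840-across version is exactly four times HDTV; the 4096-across versions land slightly above that.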

Then there is color. There have been 4K cameras using a beam-splitting prism (right, diagram by Colin M. L. Burnett, http://en.wikipedia.org/wiki/File:Dichroic-prism.png) and three image-sensor chips, just like a typical studio or truck camera. Other 4K cameras have single chips overlaid with color filters (one version, the Bayer pattern, is shown below). There have also been four-chip cameras, with HD-resolution chips and an additional green one offset diagonally by half a pixel. Conceivably, as was done in HD cameras, a 4K camera could also use three HD-resolution chips with the green offset from the red and blue.

Some say a color-filtered chip with at least 4096 (or 3840) photosites per row is 4K; others say it is not. Consider optical low-pass filtering. In a three-chip camera, the optical low-pass can be designed to match any of the chips. In a filtered single-chip (left, also from Burnett, http://en.wikipedia.org/wiki/File:Bayer_pattern_on_sensor.svg) or four-chip camera, should it be optimized for the individual photosites (the “luma” or uncolored resolution), the green ones (which occur more frequently), or the other colors (which have filters spaced twice as far apart as the photosites)?

Then there are those who think it’s not necessary to go all the way to 4K (e.g., the “3.5K” of the popular ARRI Alexa at right) and those who think 4K is insufficient (e.g., proponents of “8K”). Just counting photosites, there have been “4K” cameras with anything from roughly 8.3 to roughly 38.2 million, and there have been other beyond-HDTV-resolution cameras shown and discussed with as few as 3.3 million and as many as 100 million. There’s even a group working on camera systems with a thousand times more pixels than even that high end (100 gigapixels http://www.disp.duke.edu/projects/AWARE/index.ptml).

There are also ways of increasing resolution without changing the number of photosites on an image sensor. One is compressive sampling (described by Siegfried Foessel of Germany’s Fraunhofer Institut at the HPA Tech Retreat in February in a system that increases resolution by covering portions of sensor photosites). There are also various forms of “super-resolution” (one version, which can take advantage of aliases that slip through filters, is shown below, original at left, enhanced at right, in a portion of an image from the Almalence PhotoAcute Studio web site: http://photoacute.com/studio/examples/mac_hdd/index.html).

As I noted in a previous post (“Y4K?” http://www.schubincafe.com/2011/08/31/y4k/), there are benefits to using a beyond-HD-resolution camera even if the distribution will be only HD. These include the possibilities of reframing in post, image stabilization without loss of resolution, one form of stereoscopic 3D shooting, and the delivery of images with perceptually increased sharpness. They’re not just theoretical benefits. Zaxel, for example, announced on July 1 the delivery of their 720CUT, a system that allows a 720p high-definition window to be smoothly moved around a 4K moving image in real time.

Although such issues as cost and storage might still keep users away from higher-resolution cameras, they clearly seem like a good idea. But what about delivering more resolution (not just more sharpness) to the viewer? How many pixels are enough?

Unfortunately, there’s no simple answer. Look again at the pictures above. They could clearly benefit from more detail, even the one on the right. But what if the whole picture were of something the size of a building? In that case, when zooming in so close (the pictures show the label of a hard drive), even a 100-gigapixel image might be insufficient. One benefit of delivering 4K to a home viewer, therefore, is the ability to zoom in to any desired HD frame from the larger 4K frame, as shown in the inner rectangle in the example at left, with a trimmed original image from HighDefWallpapers.Info (http://www.highdefwallpapers.info/amazing-sea-resort-high-definition-wallpapers/). Systems for doing such extraction at home have been shown at NAB conventions for years.

How about complete images? Again, there’s no simple answer. At right is a diagram from ARRI’s “4K+ Systems Theory Basics for Motion Picture Imaging” (http://www.efilm.com/publish/2008/05/19/4K%20plus.pdf). Based on 20/20 (or 6/6) vision, it shows visual-acuity limitations for movie viewers in different seats. Even at the rear of this auditorium, a viewer with 20/20 vision could perceive more than 50% more detail than 1080-line HD can deliver in any direction. In the front of the main section of seating, such a viewer could perceive 8K resolution, and, in the very front row, far more than even that extraordinary resolution.

There are, however, some problems with the above. For one thing, almost no one has 20/20 vision. The extra lines at the bottom of an eye chart (left) below the red line indicate that many people have visual acuity far better than 20/20. But the seven lines above the 20/20 line indicate that other people have poorer visual acuity.

Then there is the number 20; 20/20 means that the viewer can see at 20 feet what the “standard” viewer (one with 20/20 vision) can also see at 20 feet (in 6/6, the numbers are in meters). But why specify 20 feet? It’s because at that distance eye-lens focus plays almost no role, and aging viewers can have trouble with eye-lens focus.

In a cinema auditorium, that’s not much of an issue; the screen is likely to be at least 20 feet away. At home-TV viewing distances, it is an issue. So is lighting. Movies are viewed in dark rooms; TV is often viewed with the light on. A simple formula for contrast is the sum of the desired light and the undesired light, divided by the undesired light. Movie screens are typically much dimmer than TV screens, but cinema auditoriums are typically very much darker than TV-viewing rooms, so movies typically offer more contrast.
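
That contrast formula is simple enough to try with numbers. In the sketch below, the function name and all four light levels are hypothetical values chosen only to illustrate the point; they are not measurements of any real cinema or living room.

```python
def viewing_contrast(desired_light, undesired_light):
    """Contrast as (desired + undesired) / undesired, per the simple formula."""
    if undesired_light <= 0:
        raise ValueError("undesired light must be greater than zero")
    return (desired_light + undesired_light) / undesired_light


# Hypothetical light levels in arbitrary but consistent units.
cinema = viewing_contrast(desired_light=48, undesired_light=0.05)
living_room = viewing_contrast(desired_light=300, undesired_light=3)

print(f"dim screen, very dark room: about {cinema:.0f}:1")
print(f"bright screen, lit room:    about {living_room:.0f}:1")
```

Even with a screen several times dimmer, the darker room wins in this made-up example, because the undesired light sits in the denominator.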

The image above is called a contrast-resolution grating. Contrast increases from bottom to top; detail resolution increases from left to right. You probably see undifferentiated gray at the bottom left and right corners, but both between those corners and above them, you can probably make out vertical lines. The reason you can make out the lines between the corners is that the human visual system has a contrast-sensitivity function with a peak. So perception of resolution depends on contrast. And that’s not all.

If there is an ideal resolution for viewing, it is based on a compromise: Too much, and the system becomes overly expensive; too little, and, aside from any possibility that the viewer might find the pictures insufficiently detailed, the structure of the display becomes visible, theoretically preventing the viewer from seeing the image due to its visible pixels — in effect, not being able to see the forest for the trees. At left and right above are two different pixel structures of two different display panels.  Do they offer equivalent structure visibility for the same resolution?

Suppose everyone’s visual acuity is 20/20, and eye-lens-focus (accommodation), contrast, color, and pixel structure don’t matter. Then, with 20/20 defined as 30 cycles per degree, and assuming a white pixel and a black pixel constitute a cycle, as shown at right, it’s possible to use high-school trigonometry to calculate optimum viewing distances. For U.S. standard-definition television, which has about 480 active rows of pixels, that distance would be 7.15 times the height of the picture (7.15H); for 1080-line HDTV, it would be 3.16H; for 2160-line 4K, 1.54H; for 4320-line 8K, 0.69H. With a lot of rounding (of the same sort that allows 7680-across to be called 8K), these have been called 7, 3, 1.5, and 0.75 times the picture height.
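
Here is one way to sketch that trigonometry in Python. It is my reading of the calculation rather than a quoted method: at 30 cycles per degree and two pixels per cycle, each degree of visual angle holds 60 rows, so the picture height subtends rows/60 degrees, and the viewing distance follows from the tangent of half that angle. The function and constant names are mine.

```python
import math

CYCLES_PER_DEGREE = 30                      # the 20/20 criterion in the text
PIXELS_PER_DEGREE = 2 * CYCLES_PER_DEGREE   # one white plus one black pixel


def optimum_distance_in_heights(active_rows):
    """Viewing distance, in picture heights, at which rows are just resolved."""
    picture_angle_deg = active_rows / PIXELS_PER_DEGREE
    half_angle_rad = math.radians(picture_angle_deg / 2)
    # Half the picture height is the side opposite the half angle.
    return 0.5 / math.tan(half_angle_rad)


for rows in (480, 1080, 2160, 4320):
    print(f"{rows} rows: {optimum_distance_in_heights(rows):.2f}H")

# A 25-inch 4x3 set has a picture 15 inches high (3/5 of the diagonal).
distance_feet = optimum_distance_in_heights(480) * 15 / 12
print(f"25-inch 4x3 SD set: about {distance_feet:.1f} feet")
```

Under that reading, the results round to the 7.15H, 3.16H, 1.54H, and 0.69H figures given above, and the 25-inch standard-definition case lands at roughly nine feet.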

The “9 feet” in the image above happens to be the result of the calculation for an old 25-inch 4×3-shaped TV set, but it has another significance. It is the Lechner Distance. Named for then-RCA Laboratories researcher Bernard Lechner, it is the result of a survey conducted to see how far people sit from their TV screens.  Richard Jackson, a researcher at Philips Laboratories in Redhill, England, conducted his own survey and came up with a similar 3 meters. The distance is determined by room sizes and furniture.  It is not affected by screen sizes or resolutions, although flat-panel TV sets, lacking the depth required by a long-necked picture tube, would, in theory at least, increase the distance somewhat.

At right is a portion of Figure 3 of the paper “‘Super Hi-Vision’ Video Parameters for Next-Generation Television,” by Takayuki Yamashita, Kenichiro Masaoka, Kohei Ohmura, Masaki Emoto, Yukihiro Nishida, and Masayuki Sugawara of the NHK Science and Technology Research Laboratories. It shows that a viewer’s “sense of being there” increases as the viewing distance decreases, as might be expected; as the screen occupies more of the visual field, the viewer gets enveloped in the image. It also shows that “sense of realness” increases with greater viewing distance. That’s also as might be expected; from the top of a skyscraper, a viewer can’t tell the difference between a mannequin (fake) and a person (real) at street level.

Super Hi-Vision is being shown to the public at the 2012 Olympic Games in special, giant-screen viewing rooms, as has been the case when it was exhibited at such broadcast exhibitions as NAB and the International Broadcasting Convention. Viewers can see HD detail from just the segment of screen in front of them and glance elsewhere to see more HD-equivalent images forming the whole. I wrote previously of a system Canon has demonstrated with even more resolution (http://www.schubincafe.com/2010/09/07/whats-next/). In those special viewing venues, it’s easy to achieve a viewing distance of 0.75H; at home, at the Lechner distance, it would require a TV image 12-feet high.
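
The 12-foot figure follows from dividing the viewing distance by the distance expressed in picture heights. A small sketch, using the 9-foot Lechner Distance and the rounded multiples from earlier in this post (the function name is mine):

```python
LECHNER_DISTANCE_FEET = 9.0

# Rounded optimum viewing distances, in picture heights, from earlier in the post.
ROUNDED_DISTANCE_IN_HEIGHTS = {
    "standard definition": 7.0,
    "1080-line HD": 3.0,
    "2160-line 4K": 1.5,
    "4320-line 8K": 0.75,
}


def required_picture_height_feet(distance_in_heights,
                                 viewing_distance_feet=LECHNER_DISTANCE_FEET):
    """Picture height needed so the viewer sits at the optimum distance."""
    return viewing_distance_feet / distance_in_heights


for name, multiple in ROUNDED_DISTANCE_IN_HEIGHTS.items():
    height = required_picture_height_feet(multiple)
    print(f"{name}: picture about {height:.1f} feet high")
```

At nine feet, the 0.75H case needs a picture 12 feet high, while the 3H case needs only 3 feet.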

At the same London Games, however, the official host broadcaster is using the DVCPROHD codec, which reduces 1920-pixel-across 1080-line HDTV resolution by a substantial amount. HDCAM does something similar. Both have been acceptable because they retain most of the image sharpness, even though they greatly reduce its resolution, because they preserve most of the area under the modulation-transfer-function curve shown at right.

Perhaps it would be better to say that DVCPROHD and HDCAM have been acceptable. Today, some viewers seem willing to comment on the difference between the reduced resolution of those systems and “full HD.” That might be because some forms of perception are learned.

After Thomas Edison switched from phonograph cylinders to disks, he came up with a plan to demonstrate their quality.  He presented a series of “tone tests.” In small venues, as shown at left, listeners would be blindfolded. At larger ones, the lights would go out. In either case, the audience had to decide whether they’d heard the live singer or a pre-electronic phonograph disk.

These comments from a Pittsburgh Post reporter in 1919 were typical: “It did not seem difficult to determine in the dark when the singer sang and when she did not. The writer himself was pretty sure about it until the lights were turned on again and it was discovered that [the singer] was not on the stage at all and that the new Edison alone had been heard.” Today, we scoff at the idea that audiences couldn’t hear differences between those forms of sounds, but we’ve had years of high fidelity to let us know what sounds bad.

As with hearing, so, too, with vision. At right is the apparatus used in an old experiment conducted to see whether animals would cross a visual gap. When the gap was covered with a visually transparent material, they would not. When the transparent material was covered with visible stripes, they would. But animals raised from birth in an environment devoid of lines oriented in a particular direction treated stripes oriented that way on the transparent material as though they weren’t there and wouldn’t cross.

So, can viewers actually avail themselves of beyond-HD resolution at home? If they’d simply sit closer to their screens, the answer would be a definite yes.  If they continue to sit at the Lechner Distance, the answer is less obvious. On April 28, reporting on an 8K 145-inch television screen, PC World used the headline “Panasonic’s Newest TV Prototype Is Too Big for Your Living Room” <http://www.pcworld.com/article/254649/panasonics_newest_tv_prototype_is_too_big_for_your_living_room.html>.

Possibilities? Maybe we’ll sit closer. Maybe we’ll learn to see with greater acuity (NHK’s Super Hi-Vision research showed subjects already able to perceive differences in “realness” in detail more than five times finer than the 20/20 criterion). Maybe we’ll use virtual viewing systems unrestricted by rooms and furniture. Or maybe not.

Meanwhile, a little skepticism probably couldn’t hurt. Things aren’t always as they seem.

In a 1972 interview, Anna Case (left), one of the opera singers used in the Edison tone tests, admitted that she’d trained herself to sound like a phonograph recording. Oh, well.

Redefining High Definition

May 24th, 2012 | No Comments | Posted in Download, Today's Special

Redefining High Definition
May 21, 2012
The Cable Show (NCTA Convention)
Boston Convention Center


4K, HPA Tech Retreat 2012

March 11th, 2012 | No Comments | Posted in Download, Today's Special

4K
HPA Tech Retreat 2012

PowerPoint: Schubin-HPA2012-4K.ppt
9 slides (1440 x 1080) with audio, 3 MB, TRT: 6:57
YOU MUST BE IN SLIDESHOW MODE TO HEAR THE AUDIO

MP4: Schubin-HPA2012-4K.mp4
640 x 480, 7.2 MB, TRT: 6:55

Smellyvision and Associates

February 25th, 2012 | No Comments | Posted in 3D Courses, Schubin Cafe

What is reality? And is it something we want to get closer to? Take a look at the picture of a cat above, as printed on a package of Bell Rock Growers’ Pet Greens® Treats <http://www.bellrockgrowers.com/cattreats.html>. Does it look unreal? Distorted? Is it?

At this month’s HPA Tech Retreat in Indian Wells, California (shown above in a photo by Peter Putman as part of his coverage of the event <http://www.hdtvexpert.com/?p=1804>), there was much talk about getting closer to reality by using images with higher resolution, higher frame rate, greater dynamic range, larger color gamut, stereoscopic sensation, and even surround vision. The last was based on a demonstration from C360 Technologies. Another demo featured Barco’s Auro-3D enveloping sound technology. In the main program, vision scientist Jenny Read explained how stereoscopic 3D in a cinema auditorium can’t possibly work right and why we think it does. And then there were the quizzes.

All of them related to the introduction of image and sound technologies at various World’s Fairs. Although the dates ranged from 1851 to the late 20th century, more than one quiz related to technologies introduced at the 1900 Paris Exposition. It stands to reason.

At that one event, people could attend sync-sound movies and watch large-format high-resolution movies on a giant screen. They could also experience reality simulations: an “ocean voyage” on a motion platform with visual effects called the Mareorama (depicted at left), a “train trip” on the Trans-Siberian Railway using spatial motion parallax (with one image belt moving at 1000 feet per minute!), and a “flight above the city” in the surround-projection-based Cinéorama (shown below, with synchronized projectors under the audience). At the same fair, they could also hear sound broadcasting of music (with no radios required) and even try out the newly coined word television.

Well over a century later, we still have sound broadcasting (though receivers are now required), we still watch sync-sound movies, and we still use the word television. There are still large-format, large-screen, surround-vision, and moving-platform experiences, but they tend to be at, well, World’s Fairs, museums, and other special venues.

There was a time when at least 70-mm film was used as a selling point for some Hollywood movies and the theaters where they were shown. And then it wasn’t. The audience’s desire for quality didn’t seem to justify the additional cost. The digital-cinema era started at lower-than-home-HD resolution but is now moving towards “4K,” more than twice the linear resolution of the best HD (the 4K effects and workflows of The Girl with the Dragon Tattoo were discussed at the HPA Tech Retreat).

Back in the publicized 70-mm film era, special-effects wizard, inventor, and director Douglas Trumbull created a system for increasing temporal resolution in the same way that 70-mm offered greater spatial resolution than 35-mm film. It was called Showscan, with 60 frames per second (fps) instead of 24.

The results were stunning, with a much greater sensation of reality. But not everyone was convinced it should be used universally. In the August 1994 issue of American Cinematographer, Bob Fisher and Marji Rhea interviewed a director about his feelings about the process after viewing Trumbull’s 1989 short, Leonardo’s Dream.

“After that film was completed, I drew a very distinct conclusion that the Showscan process is too vivid and life-like for a traditional fiction film. It becomes invasive. I decided that, for conventional movies, it’s best to stay with 24 frames per second. It keeps the image under the proscenium arch. That’s important, because most of the audience wants to be non-participating voyeurs.”

Who was that mystery director who decided 24-fps is better for traditional movies than 60-fps? It was the director of the major features Brainstorm and Silent Running. It was Douglas Trumbull.

As perhaps the greatest proponent of high-frame-rate shooting today, Trumbull was more recently asked about his 1994 comments. He responded that a director might still seek a more-traditional look for storytelling, but by shooting at a higher frame rate that option will remain open, and the increased spatial detail offered by a higher frame rate will also be an option.

That increased spatial detail is shown at left in a BBC/EBU simulation of 50-fps (top) and 100-fps (bottom) images based on 300-fps shooting. Note that the tracks and ties are equally sharp in both images; only the moving train changes. The images may be found in the September 2008 BBC White Paper on “High Frame-Rate Television,” available here <http://downloads.bbc.co.uk/rd/pubs/whp/whp-pdf-files/WHP169.pdf>.

Trumbull is a fan of using higher frame rates, especially for stereoscopic 3D (his Leonardo’s Dream was stereoscopic). Such other directors as James Cameron and Peter Jackson have joined that approach. And at the SMPTE International Conference on Stereoscopic 3D in June, Martin Banks of UC-Berkeley’s Visual Space Perception Laboratory explained strobing effects that can occur in S3D viewing.

A hit of the 2012 HPA Tech Retreat, however, in both the main program and the demo area, was the Tessive Time Filter, a mechanism for eliminating (or at least greatly reducing) strobing effects without changing frame rate. It applies appropriate temporal filtering in front of the lens — essentially any lens. Because the filtering is temporal, it does not affect the sharpness of items that are stationary relative to the image sensor. Above right is an image illustrating a “compensator” plug-in for Apple’s Final Cut Pro “to achieve the best possible representation of time in your footage” (when the green word “Compensated” appears at the bottom right, the compensator is on <http://www.tessive.com/>).

That’s frame rate and resolution. Visual dynamic range (from brightest to darkest) and color gamut were also topics at the 2012 HPA Tech Retreat, primarily in Charles Poynton’s seminar on the physics of imaging displays and presentation on high-dynamic-range imaging, in a panel discussion on laser projection, and in Dolby’s high-dynamic-range monitoring demonstrations.

Poynton noted a conflict between displays that can “create” their own extended ranges and gamuts and the intentions of directors. He also noted that in medical imaging, where gray scale and color can be critical, there are standards that don’t exist in consumer television. But that doesn’t mean medical imaging is closer to reality. In fact, it might be nice for a tumor otherwise invisible to show up very obviously, like a clown’s red nose.

Above left is another scientific image, the National Oceanic and Atmospheric Administration’s satellite image of cloud cover over the U.S. this morning, at a very clear time. Rest assured that the air did not look green, yellow, and brown at the time. Sometimes reality is not desirable.

Consider the cat at the top of this post. Its unusual look is intentional, something to grab a shopper’s attention. But it’s actually not unrealistic.

Try holding your hand about a foot in front of your face and note its apparent size. Now move it two feet away. It looks smaller, but not half the size. Yet the “real” image of the hand on your retina is half the size.
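The geometry behind the hand experiment can be sketched in a few lines. This is only an illustration: the 7-inch hand length is a made-up figure, and the function name is mine. It computes the angle the hand subtends at the eye at one foot and at two feet.

```python
import math

HAND_LENGTH_IN = 7.0  # hypothetical hand length in inches, not from the text


def visual_angle_deg(size, distance):
    """Angle subtended at the eye by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


near = visual_angle_deg(HAND_LENGTH_IN, 12.0)  # one foot away
far = visual_angle_deg(HAND_LENGTH_IN, 24.0)   # two feet away

print(f"{near:.1f} degrees at one foot, {far:.1f} degrees at two feet")
print(f"ratio of far to near: {far / near:.2f}")
```

The ratio comes out very close to one half, which is the retinal-image change the paragraph describes, even though the hand does not look half as big.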

Reality is even more complex. We track different moving objects at different times, changing what looks sharp or blurry. We focus on objects at different depths in a scene, unlike a camera (regarding stereoscopic 3D perception, at the HPA Tech Retreat Read noted that, although a generation that grows up with S3D imagery might not experience today’s S3D viewing difficulties, neither might they find S3D exciting). We can see 360 degrees in any direction (by moving our heads and bodies, if necessary). We can also hear sounds coming from any direction. And then there are our other senses.

At the 2010 International Broadcasting Convention in Amsterdam, the Korean Electronics and Telecommunications Research Institute demonstrated what they called “4D TV” (diagram above). When there was a fire on screen, viewers felt heat. When there was the appearance of speed on screen, viewers felt the rush of air across their faces. During an episode reminiscent of a news event in which an athlete was struck, viewers felt a blow on their legs. And there were also scents.

“There may come a time when we shall have ‘smellyvision’ and ‘tastyvision’. When we are able to broadcast so that all the senses are catered for, we shall live in a world which no one has yet dreamt about.”

That quotation by Archibald Montgomery Low appeared in the “Radio Mirror” of the (London) Daily News on December 30, 1926. Much more recently (June 14 of last year), the Samsung Advanced Institute of Technology and the University of California – San Diego’s Jacobs School of Engineering jointly announced the development of something that might sit on the back of a TV set and generate “thousands of odors” on command. But that raises the reality issue, again. Do we really want to smell what the sign above left depicts?

Archibald Low was an interesting character. He was inducted posthumously into the International Space Hall of Fame as the “father of radio guidance systems” and was one of the founders and presidents of the British Interplanetary Society, but he was also (among many other posts and appointments) fellow and president of the British Institute of Radio Engineers, fellow of the Chemical Society, fellow of the Geographical Society, and chair of the Royal Automobile Club’s Motor Cycle Committee (he built and arranged the demonstration of a rocket-powered motorcycle, above right).

Besides that motorcycle, he also developed drawing tools, a well-selling whistling egg boiler, and what was probably the first drone aircraft not carrying a pilot. But two other aspects of Low’s long and varied career might be worth considering.

In 1914, he demonstrated, first to the Institute of Automobile Engineers and later at Selfridge’s Department Store, something he called “televista” but that is probably better described by the title of his presentation, “Seeing by Wireless.” And, in a 1937 book, he wrote, “The telephone may develop to a stage where it is unnecessary to enter a special call-box. We shall think no more of telephoning to our office from our cars or railway-carriages than we do today of telephoning from our homes.” So he wasn’t too bad at predictions.

“Smellyvision”? Who knows? But, if we’re lucky, it won’t bring us any closer to reality.


Update: Schubin Cafe: Beyond HD: Resolution, Frame-Rate, and Dynamic Range

February 9th, 2012 | No Comments | Posted in Download, Today's Special

You can download the PowerPoint presentation by clicking on the title:

SchubinCafe_Beyond_HD.ppt (7.76 MB)

 

You can download the mov file of the webinar by clicking on the title:

Schubin-Cafe-Webinar-2-9-12-1.mov

 


The Blind Leading

December 10th, 2011 | No Comments | Posted in Schubin Cafe

Once upon a time, people were prevented from getting married, in some jurisdictions, based on the shade of their skin colors. Once upon a time, a higher-definition image required more pixels on the image sensor and higher-quality optics.

Actually, we still seem to be living in the era indicated by the second sentence above. At the 2012 Hollywood Post Alliance (HPA) Tech Retreat, to be held February 14-17 (with a pre-retreat seminar on “The Physics of Image Displays” on the 13th) at the Hyatt Grand Champions in Indian Wells, California <http://bit.ly/slPf9v>, one of the earliest panels in the main program will be about 4K cameras, and representatives from ARRI, Canon, JVC, Red, Sony, and Vision Research will all talk about cameras with far more pixel sites on their image sensors than there are in typical HDTV cameras; Sony’s, shown at the left, has roughly ten times as many.

That’s by no means the limit. The prototypical ultra-high-definition television (UHDTV) camera shown at the right has three image sensors (from Forza Silicon), each one of which has about 65% more pixel sites than on Sony’s sensor. There is so much information being gathered that each sensor chip requires a 720-pin connection (and Sony’s image sensor is intended for use in just a single-sensor camera, so there are actually about five times more pixel sites).  But even that isn’t the limit! As I pointed out last year, Canon has already demonstrated a huge hyper-definition image sensor, with four times the number of pixels of even those Forza image sensors used in the camera at the right <http://www.schubincafe.com/2010/09/07/whats-next/>!

Having entered the video business at a time when picture editing was done with razor blades, iron-filing solutions to make tape tracks visible, and microscopes, and when video projectors utilized oil reservoirs and vacuum pumps, I’ve always had a fondness for the physical characteristics of equipment. Sensors will continue to increase in resolution, and I love that work. At the same time, I recognize some of the problems of an inexorable path towards higher definition.

The standard-definition camera that your computer or smart phone uses for video conferencing might have an image sensor with a resolution characterized as 640×480 or 0.3 Mpel (megapixels), even if that same smart phone has a much-higher-resolution image sensor pointing the other way for still pictures. That’s because video must make use of continually changing information. At 60 frames per second, that 0.3 Mpel camera delivers more pixels in one second than an 18 Mpel sensor shooting a still image.
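The arithmetic behind that comparison is easy to check. The sketch below uses only the figures given above (640×480, 60 frames per second, an 18 Mpel still); the variable names are mine.

```python
# Pixel throughput: low-resolution video vs. a single high-resolution still.
VIDEO_WIDTH, VIDEO_HEIGHT = 640, 480  # the "0.3 Mpel" sensor
FRAMES_PER_SECOND = 60
STILL_MPEL = 18.0                     # the still-image sensor

video_mpel_per_frame = VIDEO_WIDTH * VIDEO_HEIGHT / 1e6
video_mpel_per_second = video_mpel_per_frame * FRAMES_PER_SECOND

print(f"one video frame: {video_mpel_per_frame:.2f} Mpel")
print(f"one second of video: {video_mpel_per_second:.1f} Mpel")
print(f"more than the {STILL_MPEL:.0f} Mpel still: {video_mpel_per_second > STILL_MPEL}")
```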

Common 1080-line HDTV has about 2 Mpels. So-called “4K” has about 8 Mpels. It’s already tough to get a great HDTV lens; how will we deal with UHDTV’s 33-Mpel “8K”?

A frame rate of 60-fps delivers twice as much information as 30-fps; 120-fps is twice as much as 60-fps. How will we ever manage to process high-frame-rate UHDTV?
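To see how quickly resolution and frame rate multiply together, here is a small sketch. The raster dimensions are my assumptions (1920×1080, 3840×2160, and 7680×4320 are common figures for the three formats, but the text gives only the rounded megapixel counts).

```python
# Assumed raster sizes; the text gives only "about 2", "about 8", and "33" Mpels.
FORMATS = {
    "HD 1080": (1920, 1080),
    "4K": (3840, 2160),
    "8K UHDTV": (7680, 4320),
}


def mpel(width, height):
    """Pixels per frame, in millions."""
    return width * height / 1e6


def mpel_per_second(width, height, fps):
    """Pixels per second, in millions."""
    return mpel(width, height) * fps


for name, (width, height) in FORMATS.items():
    rates = ", ".join(
        f"{fps} fps: {mpel_per_second(width, height, fps):,.0f} Mpel/s"
        for fps in (30, 60, 120)
    )
    print(f"{name}: {mpel(width, height):.1f} Mpel per frame ({rates})")
```

Each doubling of frame rate doubles the pixel rate, and each step up in format roughly quadruples it, so 8K at 120 fps carries 64 times the pixels per second of 1080-line HD at 30 fps.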

Perhaps it’s worth consulting the academies. In U.S. entertainment media, the highest awards are granted by the Academy of Motion Picture Arts & Sciences (the Academy Award or Oscar), the Academies (there are two) of Television Arts & Sciences (the Emmy Award), and the Recording Academy (the Grammy Award). Win all three, and you are entitled to go on an EGO (Emmy-Grammy-Oscar) trip!

In the history of those awards, only 33 people have ever achieved an EGO trip. And only two of those also won awards from the Audio Engineering Society (AES), the Institute of Electrical and Electronics Engineers (IEEE), and the Society of Motion-Picture and Television Engineers (SMPTE). You’re probably familiar with the last name of at least one of those two, Ray Dolby, shown at left during his induction into the National Inventors Hall of Fame in 2004.

The other was Thomas Stockham. Some in the audio community might recognize his name.  He was at one time president of the AES, is credited with creating the first digital-audio recording company (Soundstream), and was one of the investigators of the 18½-minute gap in then-President Richard Nixon’s White House tapes regarding the Watergate break-in.

Those achievements appeal to my sense of appreciation of physical characteristics. The Soundstream recorder (right) was large and had many moving parts. And the famous “stretch” of Nixon’s secretary Rose Mary Woods (left), which would have been required to accidentally cause the gap in the recording, is a posture worthy of an advanced yogi (Stockham’s investigative group, unfortunately for that theory, found that there were multiple separate instances of erasure, which could not have been caused by any stretch). But what impressed (and still impresses) me most about Stockham’s work has no physical characteristics at all.  It’s pure mathematics.

On the last day of the HPA Tech Retreat, as on the first day, there will be a presentation on high-resolution imaging. But it will have a very different point of view. Siegfried Foessel of Germany’s Fraunhofer research institute will describe “Increasing Resolution by Covering the Image Sensor.” The idea is that, instead of using a higher-resolution sensor, which increases data-readout rates, it’s actually possible to use a much-lower-resolution image sensor, with the pixel sites covered in a strange pattern (a portion of which is shown at the right). Mathematical processing then yields a much-higher-resolution image — without increasing the information rate leaving the sensor.

In the HPA Tech Retreat demo room, there should be multiple demonstrations of the power of mathematical processing. Cube Vision and Image Essence, for example, are expected to be demonstrating ways of increasing apparent sharpness without even needing to place a mask over the sensor. Lightcraft Technology will show photorealistic scenes that never even existed except in a computer. And those are said to have gigapixel (thousand-megapixel) resolutions!

All of that mathematical processing, to the best of my knowledge, has no direct link to Stockham, but he did a lot of mathematical processing, too. In the realm of audio, his most famous effort was probably the removal of the recording artifacts of the acoustical horn into which the famous opera tenor Enrico Caruso sang in the era before microphone-based recording (shown at left in a drawing by the singer, himself).

As Caruso sang, the sound of his voice was convolved with the characteristics of the acoustic horn that funneled the sound to the recording mechanism. Recovering the original sound for the 1976 commercial release Caruso: A Legendary Performer required deconvolving the horn’s acoustic characteristics from the singer’s voice.  That’s tough enough even if you know everything there is to know about the horn. But Stockham didn’t, so he had to use “blind” deconvolution. It wasn’t the first time.

He was co-author of an invited paper that appeared in the Proceedings of the IEEE in August 1968. It was called “Nonlinear Filtering of Multiplied and Convolved Signals,” and, while some of it applied to audio signals, other parts applied to images. He followed up with a solo paper, “Image Processing in the Context of a Visual Model,” in the same journal in July 1972. Both papers have been cited many hundreds of times in more-recent image-processing work.

One image in both papers showed the outside of a building, shot on a bright day; the door was open, but the inside was little more than a black hole (a portion of the image is shown above left, including artifacts of scanning the print article with its half-tone images). After processing, all of the details of the equipment inside could readily be seen (a portion of the image is shown at right, again including scanning artifacts). Other images showed effective deblurring, and the blur could be caused by either lens defocus or camera instability.

Stockham later (in 1975) actually designed a real-time video contrast compressor that could achieve similar effects. I got to try it. I aimed a bright light up at some shelves so that each shelf cast a shadow on what it was supporting. Without the contrast compressor, virtually nothing on the shelves could be seen; with it, fine detail was visible. But the pictures were not really of entertainment quality.

That was, however, in 1975, and technology has marched — or sprinted — ahead since then. The Fraunhofer Institut presentation at the 2012 HPA Tech Retreat will show how math can increase image-sensor resolution. But what about the lens?

A lens convolves an image in the same way that an old recording horn convolved the sound of an acoustic gramophone recording. And, if the defects of one can be removed by blind deconvolution, so might those of the other. An added benefit is that the deconvolution need not be blind; the characteristics of the lens can be identified. Today’s simple chromatic-aberration corrections could extend to all of a lens’s aberrations, and even its focus and mount stability.
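For readers who want to see the non-blind case in miniature, here is a toy sketch. It is not Stockham's method and not any lens-correction product; it is a plain regularized inverse filter on a 16-sample, one-dimensional signal, with an invented three-tap blur standing in for the known characteristics of a lens. All names and values are hypothetical.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (fine for a 16-sample toy signal)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_convolve(signal, kernel):
    """Blur the signal with the kernel, wrapping around at the ends."""
    n = len(signal)
    return [sum(signal[(t - s) % n] * kernel[s] for s in range(n)) for t in range(n)]


def deconvolve(blurred, kernel, eps=1e-9):
    """Undo a known blur by dividing spectra; eps guards against division by zero."""
    blurred_spectrum = dft(blurred)
    kernel_spectrum = dft(kernel)
    restored_spectrum = [b * k.conjugate() / (abs(k) ** 2 + eps)
                         for b, k in zip(blurred_spectrum, kernel_spectrum)]
    return [value.real for value in inverse_dft(restored_spectrum)]


original = [0, 0, 1, 1, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]
kernel = [0.6, 0.3, 0.1] + [0.0] * 13  # hypothetical "lens" blur, known in advance

blurred = circular_convolve(original, kernel)
restored = deconvolve(blurred, kernel)
worst_error = max(abs(r - o) for r, o in zip(restored, original))

print("blurred: ", [round(v, 2) for v in blurred])
print("restored:", [round(v, 2) for v in restored])
print(f"worst restoration error: {worst_error:.1e}")
```

With the blur known and no noise, the original comes back almost exactly. Real images add noise and blurs whose spectra dip to zero, which is where the hard part (and the blind case) begins.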

Is it merely a dream? Perhaps. But, at one time, so was the repeal of so-called anti-miscegenation laws.


Y4K?

August 31st, 2011 | No Comments | Posted in 3D Courses, Schubin Cafe

 

What should come after HDTV? There’s certainly a lot of buzz about 3D TV. Such directors as James Cameron and Douglas Trumbull are pushing for higher frame rates. Several manufacturers have introduced TVs with a 21:9 (“CinemaScope”) aspect ratio instead of HDTV’s 16:9. Some think we should increase dynamic range (the range from dark to light). Some think it should be a greater range of colors. Japan’s Super Hi-Vision offers 22.2-channel surround sound. And then there’s 4K.

In simple terms, 4K has approximately twice as much detail as HDTV in both the horizontal and vertical directions. If the orange rectangle above is HDTV, the blue one is roughly 4K. It’s called 4K because there are 4096 picture elements (pixels) per line.

This post will not get much more involved with what 4K is. The definition of 4096 pixels per line says nothing about capture or display.  Even at lower resolutions, some cameras use a complete image sensor for each primary color; others use some sort of color filtering on a single image sensor. At left is Colin Burnett’s depiction of the popular Bayer filter design. Clearly, if such a filtered image sensor were shooting another Bayer filter offset by one color element, the result would be nothing like the original.

Optical filtering and “demosaicking” algorithms can reduce color problems, but the filtering also reduces resolution. Some say a single color-filtered image sensor with 4096 pixels per line is 4K; others say it isn’t. That’s an argument for a different post.  This one is about why 4K might be considered useful.

An obvious answer is for more detail resolution. But maybe that’s not quite as obvious as it seems at first glance. The history of video technology certainly shows ever-increasing resolutions, from eight scanning lines per frame in the 1920s to HDTV’s….

As can be seen above, in 1935, a British Parliamentary Report declared that HDTV should have no fewer than 240 lines per frame. Today’s HDTV has 720 or 1080 “active” (picture-carrying) lines per frame, and 4K has a nominal 2160, but even ordinary 525-line (~480 active) TV was considered HDTV when it was first introduced.

Human visual acuity is often measured with a common Snellen eye chart, as shown at left above. On the line for “normal” vision (20/20 in the U.S., 6/6 in other parts of the world), each portion of the “optotype” character occupies one arcminute (1′, a sixtieth of a degree) of retinal angle, so there are 30 “cycles” of black and white lines per degree.

Bernard Lechner, a researcher at RCA Laboratories at the time, studied television viewing distances in the U.S. and determined they were about nine feet (Richard Jackson, a researcher at Philips Laboratories in the UK at the same time, came up with a similar three meters). As shown above, a 25-inch 4:3 TV screen provides just about a perfect match to “normal” vision’s 30 cycles per degree when “525-line” television is viewed at the Lechner Distance — roughly seven times the picture height.
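The match between 480 active lines, seven-odd picture heights, and 30 cycles per degree can be checked with a little trigonometry. In this sketch the function name is mine, the 7.15 multiple is the figure used in this post's HDTV calculation, and the second call uses a 25-inch 4:3 screen (15 inches high) seen from nine feet.

```python
import math


def cycles_per_degree(active_lines, distance_in_picture_heights):
    """Line pairs per degree of visual angle, measured across the picture height."""
    # Vertical angle subtended by the full picture height, in degrees.
    picture_angle = 2 * math.degrees(math.atan(0.5 / distance_in_picture_heights))
    # One dark line plus one light line makes one cycle.
    return active_lines / picture_angle / 2


at_lechner_multiple = cycles_per_degree(480, 7.15)
at_nine_feet = cycles_per_degree(480, (9 * 12) / 15)  # 25-inch 4:3 screen is 15 inches high

print(f"480 lines at 7.15 picture heights: {at_lechner_multiple:.1f} cycles per degree")
print(f"25-inch 4:3 screen at nine feet: {at_nine_feet:.1f} cycles per degree")
```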

HDTV should, under the same theory, be viewed from a smaller multiple of the screen height (h). For 1080 active lines, it should be 7.15 x 480/1080, or about 3.2h. Looked at another way, at a nine-foot viewing distance, the height should be about 34 inches, which on a 16:9 screen is a width of about 60 inches and a diagonal of about 69 inches, and, indeed, HDTV screens of 60 inches and larger are not uncommon (and so are closer viewing distances).

For 4K (again, using the same theory), it should be a screen height of about 68 inches. Add a few inches for a screen bezel and stand, and mount it on a table, and suddenly the viewer needs a minimum ceiling height of nine feet!
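The same scaling works for any line count. The sketch below reuses the 7.15-picture-height reference for 480 active lines and the nine-foot viewing distance from the text; the function names are mine.

```python
REFERENCE_LINES = 480            # active lines of "525-line" television
REFERENCE_MULTIPLE = 7.15        # viewing distance in picture heights for those lines
VIEWING_DISTANCE_IN = 9 * 12     # nine feet, in inches


def distance_multiple(active_lines):
    """Viewing distance, in picture heights, that keeps the same detail per degree."""
    return REFERENCE_MULTIPLE * REFERENCE_LINES / active_lines


def screen_height_in(active_lines, viewing_distance_in=VIEWING_DISTANCE_IN):
    """Picture height, in inches, that matches a fixed viewing distance."""
    return viewing_distance_in / distance_multiple(active_lines)


for lines in (480, 1080, 2160):
    print(f"{lines} lines: {distance_multiple(lines):.1f}h, "
          f"{screen_height_in(lines):.0f}-inch picture height at nine feet")
```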

Of course, cinema auditoriums don’t have domestic ceiling heights. Above is an elevation of a typical old-style auditorium, courtesy of Warner Bros. Technical Operations. The scale is in picture heights. Back near the projection booth, standard-definition resolution seems adequate. Even in the fifth row, HD resolution seems adequate. Below, however, is a modern, stadium-seating cinema auditorium (courtesy of the same source).

This time, even a viewer with “normal” vision in the last row could see greater-than-HD detail, and 4K could well serve most of the auditorium. That’s one reason why there’s interest in 4K for cinema distribution.

Another is that the theory of “normal” vision is itself open to question. First of all, there are lines on the Snellen eye chart (which dates back to 1862) below the “normal” line, meaning some viewers can see more resolution.

Then there are the sharp lines of the optotypes. A wave cycle would have gently shaded transitions between white and black, which might make the optotype more difficult to identify on an eye chart. Adding in higher frequencies, as shown below, makes the edges sharper, and 4K offers higher frequencies than does HD.

Then there’s sharpness, which is different from resolution. Words that end in -ness (brightness, loudness, sharpness, etc.) tend to be human psychophysical sensations (psychological responses to physical stimuli) rather than simple machine-measurable characteristics (luminance, sound level, resolution, contrast, etc.). Another RCA Labs researcher, Otto Schade, showed that sharpness is proportional to the area under the square of a modulation-transfer function (MTF) curve, a curve plotting contrast ratio against resolution.

One of the factors affecting an MTF curve is the filtering inherent in sampling, as is done in imaging. An ideal filter might use a sine of x divided by x function, also called a SINC function. Above is a SINC function for an arbitrary image sensor and its filters. It might be called a 2K sensor, but the contrast ratio at 2K is zero, as shown by the red arrow at the left.

Above is the same SINC function. All that has changed is a doubling of the number of pixels (in each direction). Now the contrast ratio at 2K is 64%, a dramatic increase (again, as shown by the red arrow at the left). Of course, if the original sensor offered 64% at 2K, the improvement offered by 4K would be much less dramatic, a reason why the question of what 4K is is not trivial.
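The zero and the 64% both fall straight out of the sin(x)/x formula. In this sketch I assume the curve's first null sits at the sensor's own pixel count, as in the graphs described; the function name and the use of 2048 and 4096 as the "2K" and "4K" figures are my choices.

```python
import math


def sinc_mtf(detail, sensor_pixels):
    """Contrast ratio from a sin(x)/x response whose first null is at sensor_pixels."""
    x = math.pi * detail / sensor_pixels
    if x == 0:
        return 1.0
    return abs(math.sin(x) / x)


contrast_2k_sensor = sinc_mtf(2048, 2048)  # 2K detail on the "2K" sensor
contrast_4k_sensor = sinc_mtf(2048, 4096)  # same detail after doubling the pixels

print(f"2K detail on a 2K sensor: {contrast_2k_sensor:.0%}")
print(f"2K detail on a 4K sensor: {contrast_4k_sensor:.0%}")
```

The second figure is 2/π, which rounds to the 64% quoted above.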

Then there’s 3D.  Some of the issues associated with 3D shooting relate to the use of two cameras with different image sensors and processing. One camera might deliver different gray scale, color, or even geometry from the other.

Above is an alternative, two HD images (one for each eye’s view) on a single 4K image sensor. A Zepar stereoscopic lens system on a Vision Research Phantom 65 camera serves that purpose. It’s even available for rent.

There are other reasons one might want to shoot HD-sized images on a 4K sensor. One is image stabilization. The solid orange rectangle above represents an HD image that has been jiggled out of its appropriate position, the lighter orange rectangle behind it with the dotted border. There are many image-stabilization systems available that can straighten out a subject in the center, but they do so by trimming away what doesn’t fit, resulting in the smaller, green rectangle. If a 4K sensor is used, however, the complete image can be stabilized.

It’s not just stabilization. An HD-sized image shot on a 4K sensor can be reframed in post production. The image can be moved left or right, up or down, rotated, or even zoomed out.

So 4K offers much even to people not intending to display 4K. But it comes at a cost. Cameras and displays for 4K are more expensive, and an uncompressed 4K signal has more than four times as much data as HD. If the 1080p60 (1080 active lines, progressively scanned, at roughly 60 frames per second) version of HD uses 3G (three-gigabit-per-second) connections, 4K might require four of those.
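A rough count shows where "more than four times" and "four of those" come from. Everything beyond the text's own figures is an assumption here: 20 bits per pixel (10-bit 4:2:2 sampling), a nominal 2.97 Gb/s per 3G link, and active picture only, ignoring blanking and ancillary data.

```python
import math

BITS_PER_PIXEL = 20      # assumption: 10-bit 4:2:2 sampling
FRAMES_PER_SECOND = 60
LINK_GBPS = 2.97         # assumption: nominal 3G serial-link rate


def active_gbps(width, height):
    """Uncompressed active-picture data rate in gigabits per second."""
    return width * height * FRAMES_PER_SECOND * BITS_PER_PIXEL / 1e9


def links_needed(width, height):
    """Whole 3G links needed to carry the active picture."""
    return math.ceil(active_gbps(width, height) / LINK_GBPS)


hd_rate = active_gbps(1920, 1080)
for name, (width, height) in {"HD 1080p60": (1920, 1080),
                              "4K, 3840 wide": (3840, 2160),
                              "4K, 4096 wide": (4096, 2160)}.items():
    rate = active_gbps(width, height)
    print(f"{name}: {rate:.2f} Gb/s, {rate / hd_rate:.2f}x HD, "
          f"{links_needed(width, height)} link(s)")
```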

When getting 4K to cinemas or homes, however, compression is likely to be used, and, as can be seen by the MTF curves, the highest-resolution portion of the image has the lowest contrast ratio. It has been suggested that, in real-world images, it might take as little as an extra 5% of data rate to encode the extra detail of 4K over HD.

So, is 4K the future? The aforementioned Super Hi-Vision is already effectively 8K, and it’s scheduled to be used in next year’s Olympic Games.


NAB 2011 Wrapup, Washington, DC SMPTE Section, May 19, 2011

June 1st, 2011 | No Comments | Posted in Download, Today's Special

NAB 2011 Wrapup
Washington, DC SMPTE Section
May 19, 2011

PPT:
http://www.schubincafe.com/wp-content/uploads/2011/05/Schubin_NAB_2011.ppt
(38 slides / 43 minutes)

PLEASE VIEW IN SLIDE SHOW MODE TO ACTIVATE AUDIO
