Archive for the ‘Augmented Reality’ Category
So, it’s been a couple of years since I’ve felt compelled to post to this blog, but I think it’s high time for an update. I’m just going to quickly touch on a few of the things I’m excited about, having just attended Augmented Reality Event 2011.
Things in the Augmented Reality world have progressed rapidly, if not as rapidly as I might once have imagined they would. In one of my first posts, I closed with an idea about streaming one’s first-person POV to a giant Microsoft Photosynth system in the cloud. The Bing Maps team, under Blaise Aguera y Arcas and Avi Bar-Zeev, is doing exactly that. With Read / Write World, Microsoft is developing what I think will be the foundation of what Blaise called “Strong AR.” This is in contrast with the “weak,” strictly sensor-based AR applications that we’re seeing on mobile devices at the moment.
To clarify, there are two paradigms of current AR usage:
One of these two is local vision-based AR using marker or texture tracking to position virtual objects relative to a camera’s perspective. This is done by calculating the homography that describes the relationship between the captured image of the tracked pattern, and the original pattern. From this, one generates translation and orientation matrices for the placement of virtual content in the scene. This is Strong AR, but on a local scale and without a connection to a coordinate system linked to the world as a whole.
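To make the homography step concrete, here is a minimal sketch in Python with NumPy. It assumes the tracked pattern is planar and lies in its own z = 0 plane, and that the camera intrinsic matrix K is already known from calibration. The function name and the sample numbers are made up for illustration and don't come from any particular tracking library. Under those assumptions the homography factors as H ≈ K [r1 r2 t], so the rotation and translation can be read back out of it:

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover rotation R and translation t of a planar pattern (z = 0)
    from a homography H mapping pattern coordinates to image pixels.
    K is the 3x3 camera intrinsic matrix (assumed known)."""
    # Remove the intrinsics: the columns of B are r1, r2, t up to one common scale.
    B = np.linalg.inv(K) @ H
    # A rotation column has unit length, which fixes the magnitude of the scale.
    scale = 1.0 / np.linalg.norm(B[:, 0])
    # Fix the sign so the pattern sits in front of the camera (t_z > 0).
    if B[2, 2] < 0:
        scale = -scale
    r1 = B[:, 0] * scale
    r2 = B[:, 1] * scale
    t = B[:, 2] * scale
    # The third rotation axis is perpendicular to the first two.
    r3 = np.cross(r1, r2)
    R = np.column_stack((r1, r2, r3))
    # With noisy input R is only approximately orthonormal;
    # project it onto the nearest rotation matrix via SVD.
    U, _, Vt = np.linalg.svd(R)
    R = U @ Vt
    return R, t

# Illustrative round trip: build H from a made-up pose, then recover that pose.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.1, -0.05, 2.0])
H = 3.7 * (K @ np.column_stack((R_true[:, 0], R_true[:, 1], t_true)))
R, t = pose_from_homography(H, K)
print(np.allclose(R, R_true), np.allclose(t, t_true))
```

Note that the recovered pose is relative to the pattern only, which is exactly the "local scale" limitation described above. Real trackers typically refine this closed-form estimate further, which the sketch leaves out.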
The other is the AR found in most mobile apps like Layar and Wikitude. The information visualized through these apps is placed using a combination of geolocation and orientation derived from the sensors found in smartphones. These sensors are the components of a MARG array: triaxial magnetometric, accelerometric, and gyroscopic sensors. By knowing a user’s position and orientation, which are together referred to as a user’s pose, one nominally knows what a user is looking at, and inserts content into the scene. The problem with this method is one of resolution and accuracy, and this is what Blaise was referring to as “weak.” This method, however, provides an easy means by which to place data out in the broader world, if not with precise registration.
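For comparison, here is a rough sketch of how a sensor-only browser might decide where to draw a point of interest. This is my own illustration in plain Python, not code from Layar or Wikitude. The function names, the spherical-Earth radius, and the simple pinhole projection are all assumptions. It computes the compass bearing from the user's GPS fix to the point of interest, compares it with the heading reported by the sensors, and maps the difference to a horizontal pixel position:

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters and initial bearing in degrees
    (clockwise from north) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula for the distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    # Initial bearing along the great circle.
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing = math.degrees(math.atan2(y, x)) % 360.0
    return distance, bearing

def screen_x(bearing_deg, heading_deg, fov_deg, screen_width_px):
    """Horizontal pixel position for a point of interest, or None if it
    falls outside the camera's horizontal field of view."""
    # Signed angle from the view direction to the target, in [-180, 180).
    offset = (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2:
        return None
    # Pinhole projection: focal length in pixels follows from the field of view.
    half_width = screen_width_px / 2
    focal_px = half_width / math.tan(math.radians(fov_deg / 2))
    return half_width + focal_px * math.tan(math.radians(offset))

# Illustrative use with made-up coordinates and a made-up 60-degree camera.
dist, bearing = distance_and_bearing(40.7580, -73.9855, 40.7614, -73.9776)
print(round(dist), round(bearing), screen_x(bearing, 45.0, 60.0, 640))
```

The weakness is easy to see here: every degree of compass error feeds straight into `offset`, and GPS error shifts the bearing itself, so a label can land well away from the thing it describes.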
The future of Strong AR is the fusion of these two paradigms, and this is what Read / Write World is being developed for. The underlying language of the system is called RML, or Reality Markup Language. Already, if photographic data for a location exists in the system, and one uploads a new image with metadata placing it nearby, Read / Write World can return the homography matrix. According to Blaise’s statements during his Augmented Reality Event keynote, pose relative to the existing media is determined with accuracy down to the centimeter. And the new image becomes part of the database, so users will constantly be refining and updating the system’s knowledge of the world.
Anyhow, I think Read / Write World has the potential to be the foundation for everything that I, and so many others, have envisioned. That’s on the infrastructure side.
So what about the hardware?
In the last couple of years, mobile devices have really grown up, and are getting to, or have reached, the point where they pack enough processing power to be the core of a real Strong AR system. Qualcomm has positioned itself as one of the most important entities in Augmented Reality, providing an AR SDK optimized for their hardware, on which most Android and Windows Mobile platforms are based. In a surprising move, at ARE, they announced that they are bringing their AR SDK to the iOS platform as well.
With peripheral sensor support and video output, we’ve got almost everything we need to be able to connect a pair of see-through display glasses (more on those in a little bit) to one of these mobile devices for an AR experience. But the best that those connections can provide is a “weak” AR experience. Why? Because the connectors don’t support external cameras. True, there are devices like the Looxcie, but the resolution and framerate are paltry, and are a limitation of the Bluetooth connection. On top of that, the integrated cameras in mobile devices are wired at a low level to the graphics cores of their processors and dump the video feed directly into the framebuffers, facilitating the use of optimized processing methods, such as Qualcomm’s. What we need is the inclusion of digital video input in the device connectors, providing the same sort of low-level access to the video subsystems of the devices. This is absolutely vital to being able to use visual information from the camera(s) on a pair of glasses for their intended purpose of real-time pose estimation.
At ARE I got to try out a Vuzix prototype that finally delivers what I’d hoped to see with the AV920 Wrap. The new device is called the STAR 1200, for See-Through Augmented Reality. It looks a little funny in the picture, but don’t worry about the frame. The optical engine is removable and the final unit’s frame will probably look substantially different. It provides stereo 852×480 displays projected into optically see-through lenses and, let me tell you, it looks good. It is a great first step towards something suitable for mass adoption. The limited field of view coverage means that it won’t provide a truly immersive experience for gaming and the like, but again, it is a great first step. Now before I get your hopes up, this device will be priced for the professional and research markets, like the Wrap 920AR. Vuzix isn’t a big enough company to bust this market open on its own. But once apps are developed and the market grows, we’ll see this technology reaching consumer-accessible price points. I’m going to refrain from predictions of timeframe this time around, but I think that things are very much on track. Also, keep in mind that this is a different technology than the Raptyr, the prototype that Vuzix showed at CES this year. The Raptyr’s displays utilize holographic waveguides, while the STAR 1200 is built around more traditional optics. I did get to see another Vuzix prototype technology in private, and can’t say anything about it, but it is very promising.
One last development that has me very excited is Google’s new Open Android Accessory Development Kit. It’s based on the Arduino platform, making it instantly accessible to hundreds of thousands, if not millions, of existing experimenters, developers, and hardware hackers, including myself. This opens up all kinds of possibilities for custom human interface devices.
Okay. That’s it for today, but I’ll write again soon. I promise.
First, thank you to the awesome people, especially Sean White of Columbia University, who helped make it possible for me to be there.
Right now I’m just going to give you the beginning of my takeaway.
The paper that resonated most with my basic desire to see the big platform problems handled first was “Global Pose Estimation using Multi-Sensor Fusion for Outdoor Augmented Reality” by Gerhard Schall, Daniel Wagner, Gerhard Reitmayr, Elise Taichmann, Manfred Wieser, Dieter Schmalstieg, and Bernhard Hofmann-Wellenhof, all out of TU Graz, Austria, with the exception of Mr. Reitmayr, who is at Oxford. This is the kind of fusion work that I’ve been talking about since my first post, and it was really exciting to see people actually doing it seriously on the hardware side. The two XSens MTi OEM boards headed to the new lab for a non-AR project should have cleared customs by now. I’ll find out if they’re there on Tuesday. 🙂 I only mention it because it’s more-or-less the same device that was used for the inertial portion of this project, and I can’t wait to build them into something.
I also loved reading Mark Livingston’s paper on stereoscopy.
Incidentally, all of the papers, and video of all the sessions, should be getting posted soon to ISMARSociety.org. Serious props to the student volunteers who appeared to really keep things running smoothly, and who performed the awesome task of capturing all of the content on video. This, the first year of AR as a popular buzzword, is the time to share with the rest of the world just how much scientific effort is going into making real progress.
I’ve got lots to say about the HMDs, including Nokia Research Center’s cool eye-tracking see-through display sunglasses prototype, but I’m going to save it for tomorrow, or perhaps for another forum. For the moment, just enjoy this photograph of Dr. Feiner stylishly rockin’ the Nokia prototype.
Hell yeah, dude.
Though we were still notably lacking Tish Shute and Rouli, this pic has a pretty stacked roster of AR blogosphere heavy-hitters in it. And speaking of Tish, I think she may be onto something with the AR Wave initiative. The diagram in her most recent post makes a great deal of sense.
And sorry to flake on the daily updates. I did end up demoing some glove stuff, and I was just generally pretty wiped out by the time I got back to my hotel each evening. ISMAR was terribly exciting for me, and I have a ton more to recount.
I just had a very interesting conversation with Paul Travers, CEO of Vuzix.
Paul explained several things to me, including that it was a mistake on their part to keep a name so similar to the AV920 Wrap when creating the Wrap 920. The Wrap product line is distinct from those previously shown, including the AV920 Wrap. There is no denying that the pictures of the Wrap series now posted on the Vuzix website do suggest that, when it is released, it will be far and away the most attractive looking “video eyewear” device to be brought to market. Paul also confirmed that there will be a stereo camera pair, as well as other accessories, for the Wrap series devices. I’ve seen a picture, and I don’t think people will be disappointed with the approach that they’ve taken for attaching cameras to the device.
The most important part of the conversation was that in which Paul assured me that Vuzix will not be abandoning the optically transparent see-through display market, and that we have a great deal to look forward to. He reaffirmed their commitment to the Augmented Reality market, and told me that he was confident that their products would continue to be well ahead of the curve and offer features unheard of at their price point. Confirming what I’d heard from people like Joe Ludwig, Robert Rice and Ori Inbar, he told me that the AV920 Wrap was an imperfect device, and that he thought it better not to release a product that didn’t meet his company’s own high standards, rather than releasing something which he thinks would have let people down. He reiterated that it had been a mistake to keep a name so similar to that of the AV920 Wrap, said that he regretted having left the AV920 Wrap up on their website for as long as they did after having decided not to release it, and also admitted that there should have been more clarification when the Wrap line was re-envisioned as it was. Yes, it would’ve been nice to have been told.
Having had this reassuring conversation with Paul, I can tell you that I still expect to see great things come out of Vuzix (if anything, my expectations are now higher). Though I’m still quite disappointed by having to continue waiting for a true consumer-oriented see-through HMD, and though I do feel a little led-on, I expect that when we do see one from Vuzix, it will far exceed the expectations initially set by the AV920 Wrap prototype shown at the last CES. More than anything, I was reassured by the frankness with which Paul admitted the unintentional mistakes that had been made in handling the separation of the Wrap line from their ongoing optically transparent display research and development. They’ve been at this for a long time, and I’m convinced that Vuzix would never squander their hard-earned credibility by deliberately deceiving their customers.
Note: When you’re done reading this, please see my followup post.
Today we have received confirmation from Vuzix CEO Paul Travers that the highly anticipated Vuzix Wrap 920, previously known as the AV920 Wrap, will not, in fact, be a see-through head-mounted display (HMD). It will instead be a “see-around” model. This means that the LCD viewing elements will be opaque, as in previous models, but will be suspended behind a sunglasses-style lens without obstructing the peripheral view around the display. In previous HMD devices this wasn’t generally the case because one doesn’t view the LCD panel and light source directly as one does a typical computer or television monitor. Put simply, an HMD requires focal optics so that your eyes can focus on something so close without giving you a headache.
(See this previous post where I reported on being told by Robert Rice, and then Vuzix, that the AV920 Wrap would, in fact, be a true optical see-through HMD.)
Presumably Vuzix will still be offering a stereo pair camera accessory for the Wrap 920, as was supposed to be produced for the original AV920 Wrap, but it’s hard to know what to expect at this point.
So while this does represent an incremental step forward in Vuzix’s offerings, it isn’t the one we were promised. More importantly, it isn’t the one we’ve all been waiting for.
I am, of course, disappointed by this news. After Lumus Optical went back to the drawing board, as they told Ori Inbar they had done in this interview on his blog, Vuzix was the only company still promising a see-through head-mounted display for consumers any time soon.
Now? Well, we’re left waiting for:
- Somebody to get serious and invest some real VC money in Lumus
- Sony to produce something using their holographic waveguide technology
- Konica Minolta to further develop their Holographic Optical Element technology
- Microvision to show that they’re serious by showing something other than a Photoshopped concept illustration (Microvision has been subcontracted to develop a new see-through HMD for the military under the ULTRA-Vis program, but who knows when that might lead to development of a civilian device)
- Something unexpected to show up
I had been hoping to be able to use a see-through HMD in the ISMAR demo presentation on which I’m working with Seac02 using their awesome LinceoVR software. It looks like we’ll have to make do with the conventional HMDs already at our disposal.
Maybe we’ll still see released products using Vuzix’s touted “Quantum Optics” before we get our quantum computers.
Lots of big AR news these days. Where to start?
Well, there are two big ones today so far:
Robert Rice and Mobilizy are proposing an ARML Specification for mobile AR browsers to the newly formed AR Consortium. The Consortium, with its distinguished list of members, is big news in and of itself. I really, truly hope that Layar chooses to get on board with this. As the other widely recognized player in the mobile AR Browser game so far, I fear they may have the power to make or break this standard. Between the endorsement of Rice (and so, presumably, Neogence), and adoption by Layar and Mobilizy (maker of Wikitude), we could have a real functional standard. If, on the other hand, Layar fails to adopt the spec, it could go the way of VRML if no new competitive players arrive quickly and with support.
And today, Layar announced the upcoming addition of support for dynamic 3D models embedded in their content layers.
If the ARML Spec is made versatile enough to support Layar’s 3D strategy, we could see a real revolution in AR standardization, interoperability, etc. This all goes back to Tish Shute’s fantastic interview with Robert Rice on UGOTrade back in January. Interoperability, standardization, and shared content are the keys here.
It’ll also be interesting to see if Total Immersion and Int13’s upcoming mobile framework will support ARML. Depending on what they produce, that could establish the standard even without adoption by Layar.
Also, as Sergey Ten was quick to point out to me on Twitter, “ARML should include geometry/models and points descriptors/patches so that locations could be recognized by camera.” Given Layar’s 3D announcement, this would be key to their ability to get on board. (Come to think of it, Layar’s announcement may have been prompted by the prospect of Total Immersion and Int13’s entry into the mobile AR Browser fray and what they would bring to it… but that’s tangential and speculative, so I’ll let that notion sit.)
Also, I hear that Mr. Rice’s Neogence has licensed a certain very impressive markerless tracking algorithm. If this is, in fact, the case, then I’m sure he wouldn’t be opposed to the inclusion of optical data-point sets that could be downloaded, based on proximity, and used to register with views of the real world. I myself have been toying with (conceptually only, mind you) the idea of using Google Earth 3D model textures and StreetView imagery as tiles, generated and retrieved based on GPS proximity and heading, to produce more accurate registration. The plausibility of this approach was only reinforced in my head after watching this sweet piece of work by Lee Felaraca today. (See addendum at bottom of post.)
Keep the augmentation coming, folks! I can’t wait to see you all at ISMAR!
I’ll leave you with this, in case you haven’t seen it yet:
The reason, incidentally, that I was encouraged by Mr. Felaraca’s work is that a similar technique might be used for generating trackable textures from camera input. On reflection, I’m not exactly sure how that would aid the process of pinpoint registration. My thought is to generate the tiles from previously gathered data and match them against the camera input, as with previously implemented tracking methods. Regardless, the Texture Extraction Experiment is awesome, and would provide an excellent tool for gathering the data used for said tile generation, as well as on-the-fly creation of virtual objects for use in augmented environments.
via @Pogue on Twitter