JavaScript Parser for Depth Photo

Android on Intel

Rate me:

5.00/5 (2 votes)

14 Jan 2016CPOL7 min read

9.8K

The JavaScript parser for depth photo parses eXtensible Device Metadata (XDM) image files [1] and extracts metadata embedded in image files to generate XML files.

This article is for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers

Intel® Developer Zone offers tools and how-to information for cross-platform app development, platform and technology information, code samples, and peer expertise to help developers innovate and succeed. Join our communities for Android, Internet of Things, Intel® RealSense™ Technology, and Windows to download tools, access dev kits, share ideas with like-minded developers, and participate in hackathon’s, contests, roadshows, and local events.

Abstract

The JavaScript* parser for depth photo parses eXtensible Device Metadata (XDM) image files [1] and extracts metadata embedded in image files to generate XML files. In addition, this app analyzes XML files to extract color image data and depth map data. It is a fundamental building block for depth photography use cases, like the image viewer, refocus feature, parallax feature, and measurement feature. We have deployed the JavaScript* parser in an open source project named Depthy [2] and proved its correctness and efficiency.

The input to this app is an XDM image file and outputs include XML file(s), color image file(s), and depth map file(s).

XDM

First, we describe the input to this app, XDM image files. XDM is a standard for storing metadata in a container image while maintaining the compatibility with existing image viewers. It is designed for Intel® RealSense™ Technology [3]. Metadata includes device-related information, like depth map, device and camera pose, lens perspective model, vendor information, and point cloud. The following figure shows an example where the XDM file stores the depth map (right) as metadata with the color image (left).

Adobe XMP Standard

Currently, the XDM specification supports four types of container image formats: JPEG, PNG, TIFF, and GIF. XDM metadata is serialized and embedded inside a container image file, and its storage format is based on the Adobe Extensible Metadata Platform (XMP) standard [4]. This app is specifically developed for JPEG format. Next we briefly describe how XMP metadata is embedded in JPEG image files and how the parser parses XMP packets.

In the XMP standard, 2-byte markers are interspersed among the data. The marker types 0xFFE0–0xFFEF are generally used for application data, named APPn. By convention, an APPn marker begins with a string identifying the usage, called a namespace or signature string. An APP1 marker identifies Exif and TIFF metadata; an APP13 marker designates Photoshop Image Resources that contains IPTC metadata, another or multiple APP1 marker designate the location of the XMP packet(s).

The following table shows an entry format for the StandardXMP section in the JPEG, including:

2-byte APP1 marker 0xFFE1
Length of this XMP packet, 2-bytes long
StandardXMP namespace, http://ns.adobe.com/xap/1.0/, 29-bytes long
XMP packet, less than 65,503 bytes

If the serialized XMP packet becomes larger than 64 KB, it can be divided into a main portion (StandardXMP) and an extended portion (ExtendedXMP), stored in multiple JPEG marker segments. The entry format for the ExtendedXMP section is similar to that for StandardXMP except that the namespace is http://ns.adobe.com/xmp/extension/.

The following image shows how StandardXMP and ExtendedXMP are embedded in a JPEG image file.

The following code snippet shows three functions:

findMarker. Parse the JPEG file (that is, buffer) from the specified location (that is, position) and search 0xFFE1 marker. If it is found, return the marker position; otherwise, return -1.
findHeader. Look for StandardXMP namespace (http://ns.adobe.com/xap/1.0/) and ExtendedXMP namespace (http://ns.adobe.com/xmp/extension/) in the JPEG file (that is, buffer) from the specified location (that is, position). If found, return corresponding namespace; otherwise, return an empty string.
findGUID.Look for GUID which is stored in xmpNote:HasExtendedXMP in the JPEG file (that is, buffer) from the start location (that is, position) to the end location (that is, position+size-1) and return its location.

JavaScript

// Return buffer index that contains marker 0xFFE1 from buffer[position]
// If not found, return -1
function findMarker(buffer, position) {
    var index;
    for (index = position; index < buffer.length; index++) {
        if ((buffer[index] == marker1) && (buffer[index + 1] == marker2))
            return index;
    }
    return -1;
}

// Return header/namespace if found; return "" if not found
function findHeader(buffer, position) {
    var string1 = buffer.toString('ascii', position + 4, position + 4 + header1.length);
    var string2 = buffer.toString('ascii', position + 4, position + 4 + header2.length);
    if (string1 == header1)
        return header1;
    else if (string2 == header2)
        return header2;
    else
        return noHeader;
}

// Return GUID position
function findGUID(buffer, position, size) {
    var string = buffer.toString('ascii', position, position + size - 1);
    var xmpNoteString = "xmpNote:HasExtendedXMP=";
    var GUIDPosition = string.search(xmpNoteString);
    var returnPos = GUIDPosition + position + xmpNoteString.length + 1;
    return returnPos;
}

A 128-bit GUID stored as a 32-byte ASCII hex string is stored in each ExtendedXMP segment following the ExtendedXMP namespace. It is also stored in the StandardXMP segment as the value of xmpNote:HasExtendedXMP property. This way, we can detect a mismatched or modified ExtendedXMP.

XML

XMP metadata can be directly embedded within an XML document [5]. According to the XDM specification, the XML data structure is defined as follows:

The image file contains the following items as shown in the above table, formatted as RDF/XML. This describes the general structure:

Container image. The image external to the XDM, visible to normal non-XDM apps.
DeviceThe root object of the RDF/XML document as in the Adobe XMP standard.
- Revision - Revision of XDM specification
- VendorInfo - Vendor-related information for the device
- DevicePose - Device pose with respect to the world
- Cameras - RDF sequence of one or more camera entities
  - Camera - All the information for a given camera. There must be a camera for any image. The container image is associated with the first camera, which is considered the primary camera for the image.
    - VendorInfo - Vendor-related information for the camera
    - CameraPose - Camera pose relative to the device
    - Image - Image provided by the camera
    - ImagingModel - Imaging (lens) model
    - Depthmap - Depth-related information including the depth map and noise model
      - NoiseModel - Noise properties for the sensor
    - PointCloud - Point-cloud data

The following code snippet is the main function of this app, which parses the input JPEG file by searching APP1 marker 0xFFE1. If it is found, search the StandardXMP namespace string and ExtendedXMP namespace string. If the former, calculate metadata size and starting point, extract the metadata, and form the StandardXMP XML file. If the latter, calculate metadata size and starting point, extract the metadata, and form the ExtendedXMP XML file. The app’s outputs are two XML files.

JavaScript

// Main function to parse XDM file
function xdmParser(xdmFilePath) {
 try {
     //Get JPEG file size in bytes
     var fileStats = fs.statSync(xdmFilePath);
     var fileSizeInBytes = fileStats["size"];

     var fileBuffer = new Buffer(fileSizeInBytes);

        //Get JPEG file descriptor
     var xdmFileFD = fs.openSync(xdmFilePath, 'r');

     //Read JPEG file into a buffer (binary)
     fs.readSync(xdmFileFD, fileBuffer, 0, fileSizeInBytes, 0);

     var bufferIndex, segIndex = 0, segDataTotalLength = 0, XMLTotalLength = 0;
     for (bufferIndex = 0; bufferIndex < fileBuffer.length; bufferIndex++) {
         var markerIndex = findMarker(fileBuffer, bufferIndex);
         if (markerIndex != -1) {
                // 0xFFE1 marker is found
             var segHeader = findHeader(fileBuffer, markerIndex);
             if (segHeader) {
                 // Header is found
                 // If no header is found, go find the next 0xFFE1 marker and skip this one
                    // segIndex starts from 0, NOT 1
                 var segSize = fileBuffer[markerIndex + 2] * 16 * 16 + fileBuffer[markerIndex + 3];
                 var segDataStart;

                 // 2-->segSize is 2-byte long
                    // 1-->account for the last 0 at the end of header, one byte
                 segSize -= (segHeader.length + 2 + 1);
                 // 2-->0xFFE1 is 2-byte long
                 // 2-->segSize is 2-byte long
                 // 1-->account for the last 0 at the end of header, one byte
                 segDataStart = markerIndex + segHeader.length + 2 + 2 + 1;
                
                 if (segHeader == header1) {
                        // StandardXMP
                     var GUIDPos = findGUID(fileBuffer, segDataStart, segSize);
                     var GUID = fileBuffer.toString('ascii', GUIDPos, GUIDPos + 32);
                     var segData_xap = new Buffer(segSize - 54);
                     fileBuffer.copy(segData_xap, 0, segDataStart + 54, segDataStart + segSize);
                     fs.appendFileSync(outputXAPFile, segData_xap);
                 }
                 else if (segHeader == header2) {
                        // ExtendedXMP
                     var segData = new Buffer(segSize - 40);
                     fileBuffer.copy(segData, 0, segDataStart + 40, segDataStart + segSize);
                     XMLTotalLength += (segSize - 40);
                     fs.appendFileSync(outputXMPFile, segData);
                 }
                 bufferIndex = markerIndex + segSize;
                 segIndex++;
                 segDataTotalLength += segSize;
             }
         }
         else {
                // No more marker can be found. Stop the loop
             break;
         };
     }
 } catch(ex) {
  console.log("Something bad happened! " + ex);
 }
}

The following code snippet parses the XML file and extracts the color image and depth map for depth photography purposes. It is very straightforward. The function xmpMetadataParser() searches the attribute named IMAGE:DATA and extracts the corresponding data into a JPEG file which is the color image. If multiples are found, multiple JPEG files are created. The function also searches the attribute named DEPTHMAP:DATA and extracts the corresponding data into a PNG file which is the depth map. If multiples are found, multiple PNG files are created, too. The app’s outputs are JPEG file(s) and PNG file(s).

JavaScript

// Parse XMP metadata and search attribute names for color image and depth map
function xmpMetadataParser() {
    var imageIndex = 0, depthImageIndex = 0, outputPath = "";
    parser = sax.parser();

    // Extract data when specific data attributes are encountered
    parser.onattribute = function (attr) {
        if ((attr.name == "IMAGE:DATA") || (attr.name == "GIMAGE:DATA")) {
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_" + imageIndex + ".jpg";
            var atob = require('atob'), b64 = attr.value, bin = atob(b64);
            fs.writeFileSync(outputPath, bin, 'binary');
            imageIndex++;
        } else if ((attr.name == "DEPTHMAP:DATA") || (attr.name == "GDEPTH:DATA")) {
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_depth_" + depthImageIndex + ".png";
            var atob = require('atob'), b64 = attr.value, bin = atob(b64);
            fs.writeFileSync(outputPath, bin, 'binary');
            depthImageIndex++;
        }
    };

    parser.onend = function () {
        console.log("All done!")
    }
}

// Process XMP metadata
function processXmpData(filePath) {
    try {
        var file_buf = fs.readFileSync(filePath);
        parser.write(file_buf.toString('utf8')).close();
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}

Conclusion

This white paper described the XDM file format, Adobe XMP standard, and XML data structure. The JavaScript parser app for the depth photo parses the XDM image file and output StandardXMP XML file and ExtendedXMP XML file. Then it parses the XML files to extract color image file(s) and depth map file(s). This app does not depend on any other programs. It is a basic building block for any depth photography use cases.

References

[1] “The eXtensible Device Metadata (XDM) specification, version 1.0,” https://software.intel.com/en-us/articles/the-extensible-device-metadata-xdm-specification-version-10

[2] Open source project Depthy. http://depthy.me/#/

[3] Intel® RealSense™ Technology: http://www.intel.com/content/www/us/en/architecture-and-technology/realsense-overview.html

[4] Adobe XMP Developer Center. http://www.adobe.com/devnet/xmp.html

[5] “XML 1.0 Specification,” World Wide Web Consortium. Retrieved 2010-08-22.

Attribution and Thanks

This project uses the following environment and packages:

[1] Node.js environment (https://nodejs.org/en/) is used in this project. Its license can be found at the following link: https://raw.githubusercontent.com/nodejs/node/master/LICENSE

[2] sax package (https://www.npmjs.com/package/sax) is used to parse the XML file in this project. Its ISC license can be found at the following link: http://opensource.org/licenses/ISC

[3] atob package (https://www.npmjs.com/package/atob) is used to do BASE 64 data decoding in this project. Its Apache 2.0 license can be found at the following link: http://opensource.org/licenses/Apache-2.0

About The Author

Yu Bai is an application engineer in Intel® Software and Services Group (SSG), working with external ISVs to ensure their applications run well on Intel® platforms. Before joining SSG, she worked for Rudolph Technologies as a senior software engineer, developing applications used in the operation of precision photolithograph equipment for the semiconductor capital equipment industry. Prior to Rudolph, she worked for Marvell Semiconductor as a staff engineer working on power analysis and power modeling for the company's application processors. She joined Marvell through the company's acquisition of Intel® XScale technology in 2006.

Yu received her master and doctorate degrees in Electrical Science and Computer Engineering from Brown University. Her graduate research focused on high-performance and low-power computer architecture design. Yu holds six U.S. patents and has published 10+ journal and international conference papers on power/performance management and optimization.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Android on Intel

United States

Intel is inside more and more Android devices, and we have tools and resources to make your app development faster and easier.

JavaScript Parser for Depth Photo

Abstract

XDM

Adobe XMP Standard

XML

Conclusion

References

Attribution and Thanks

About The Author

License

Comments and Discussions