
How to Build Custom Information Management Systems in the Cloud Quickly

20 Jul 2020 | CPOL | 6 min read
Getting Started with the yuuvis® Ultimate Object Store for Document Management
In this tutorial, we'll show you how you can use yuuvis® Ultimate to build custom content and document management systems (DMS) with intelligent document processing.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Introduction

Document databases and content management systems (CMS) are familiar to any developer who’s had to create systems around the storage and management of files, images, and other content-focused data. Typically, these have been on-premises products with proprietary data formats. yuuvis® Ultimate is a cloud-based document management system (DMS) with a RESTful API that makes it easy to integrate into modern cloud or hybrid application environments.

yuuvis® can help you build scalable, document-rich applications with advanced features like streaming, full-text search, and format conversion. Use the API to store, retrieve, and search up to billions of your documents in seconds. You can define your own custom schema to catalog documents, enforce policies, and manage retention. And because it can handle any type of data, some customers are using it to go beyond traditional document management. For example, the SoundSearch platform is employing yuuvis® Ultimate to let users run full-text searches on audio and video files.

Set Up Your Project

We'll use JavaScript for the examples because it is widely used and runs almost everywhere: in browsers, on servers, in containers, and in cloud functions. That makes it a good fit for microservice development.

To follow this tutorial, you’ll need a browser or Node.js. You can also use an online service like JSFiddle, or try the API test service on the developer portal.

You'll also need an API key, which you can get for free by signing up on developer.yuuvis.com and activating a subscription. Once subscribed, get your key by clicking My Account in the top-right corner and then Subscriptions.

To set up the project, your JavaScript file should contain something like the following variables to specify the API URL and key:

JavaScript
const baseUrl = "https://api.yuuvis.io";
const apiKey = "YOUR YUUVIS API KEY";

When working with Node.js, install the node-fetch and form-data modules. The examples load them with require, so install version 2 of node-fetch (version 3 is published as an ES module only):

Shell
npm install node-fetch@2 form-data

Then, import the modules:

JavaScript
const fetch = require("node-fetch");
const formData = require("form-data");
const fs = require("fs");

The yuuvis® Ultimate API is divided into three areas, each with its own endpoints:

  • dms-core (YADB / Yet Another Database) lets you send, update, retrieve, and search documents
  • dms-view (MultiView) converts stored documents into various formats
  • admin (Admin) manages types and schemas

We'll see some examples of using these endpoints in the following sections.

Define a Document Schema

Every document in yuuvis® Ultimate has a type. The default type is document.

You can define new document types by using custom schemas. Schemas also let you create folders to organize the documents better.

Whether you use a default document schema or a custom one, they all derive from the same simple XML definition, and yuuvis® Ultimate includes APIs to create, validate, and update the schemas used for your data. Let's take a look at a schema.

First, prepare an XML file with the schema definition. Schemas can enforce data integrity, declare properties, and define whether they are mandatory or optional.

For instance, the following code defines the document type book:

XML
<schema xmlns="http://optimal-systems.org/ns/dmscloud/schema/v5.0/" 
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
        xsi:schemaLocation="http://optimal-systems.org/ns/dmscloud/schema/v5.0/ dmsCloud-schema.xsd">
        <propertyStringDefinition>
            <id>title</id>
            <propertyType>string</propertyType>
            <cardinality>single</cardinality>
            <required>true</required>
        </propertyStringDefinition>
        <propertyStringDefinition>
            <id>authors</id>
            <propertyType>string</propertyType>
            <cardinality>multi</cardinality>
            <required>false</required>
        </propertyStringDefinition>
        <propertyIntegerDefinition>
            <id>isbn</id>
            <propertyType>integer</propertyType>
            <cardinality>single</cardinality>
            <required>false</required>
        </propertyIntegerDefinition>
        <propertyIntegerDefinition>
            <id>numberOfPages</id>
            <propertyType>integer</propertyType>
            <cardinality>single</cardinality>
            <required>false</required>
        </propertyIntegerDefinition>
        <typeDocumentDefinition>
            <id>book</id>
            <baseId>system:document</baseId>
            <propertyReference>title</propertyReference>
            <propertyReference>authors</propertyReference>
            <propertyReference>isbn</propertyReference>
            <propertyReference>numberOfPages</propertyReference>
            <contentStreamAllowed>allowed</contentStreamAllowed>
        </typeDocumentDefinition>
</schema>

With the schema in hand, you can send a POST request to /admin/schema/validate to validate it:

JavaScript
const requestOptions = {
    method: "POST",
    headers: {
        "Ocp-Apim-Subscription-Key": apiKey,
        "Content-Type": "application/xml",
        "Accept": "application/json"
    },
    body: fs.readFileSync("schema.xml"), // a Buffer (unlike a stream) can be sent again in a later request
};

As shown in the above example, every request must include the Ocp-Apim-Subscription-Key header, with your API key as its value.
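
Since the same header travels with every call, one option is to build the headers in a single place. The following is only a sketch: the helper name yuuvisHeaders and the YUUVIS_API_KEY environment variable are assumptions made for this example and are not part of the yuuvis® Ultimate API.

```javascript
// Hypothetical helper (not provided by yuuvis® Ultimate): builds the headers
// that every request needs and merges in any request-specific ones.
function yuuvisHeaders(apiKey, extraHeaders = {}) {
    if (!apiKey) {
        throw new Error("Missing API key");
    }
    return {
        "Ocp-Apim-Subscription-Key": apiKey,
        "Accept": "application/json",
        ...extraHeaders
    };
}

// Assumption: under Node.js, the key is kept in an environment variable
// named YUUVIS_API_KEY instead of being hard-coded in the source file.
const exampleKey = process.env.YUUVIS_API_KEY || "YOUR YUUVIS API KEY";

// Headers for a schema upload: the defaults plus an XML content type.
const schemaHeaders = yuuvisHeaders(exampleKey, { "Content-Type": "application/xml" });
console.log(Object.keys(schemaHeaders));
```

Request-specific headers, such as Content-Type, go in the second argument and override the defaults when the names match.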

Send the request to check whether your schema is valid:

JavaScript
fetch(baseUrl + "/admin/schema/validate", requestOptions)
    .then(response => {
        if(response.status === 200) {
            console.log("Schema is valid.")
        } else {
            console.error("Schema has errors.")
        }
    })
    .catch(console.error);

After your schema has been successfully validated, send the same request to /admin/schema to apply the schema:

JavaScript
fetch(baseUrl + "/admin/schema", requestOptions)
    .then(response => {
        if(response.status === 200) {
            console.log("Schema was applied.")
        } else {
            console.error("Schema was NOT applied.")
        }
    })
    .catch(console.error);

You can then use the new book type to store and retrieve books on your DMS using the yuuvis® Ultimate API.

Send Documents to the yuuvis® Object Storage

The document content can be any binary file, such as a Microsoft Office document, video and audio, PDF, or a compressed folder.

Every document stored in yuuvis® Ultimate must have metadata, which describes that document’s properties in JSON format. For example:

{
    "objects": [{
        "properties": {
            "system:objectTypeId": {
                "value": "book"
            },
            "title": {
                "value": "The Great Gatsby"
            },
            "authors": {
                "value": ["Scott Fitzgerald"]
            },
            "isbn": {
                "value": 9781433210471
            },
            "numberOfPages": {
                "value": 218
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    }]
}
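
Writing this structure by hand gets repetitive, so a small helper can wrap plain values in the value objects for you. This is a minimal sketch; buildMetadata is a hypothetical helper name, not a function provided by yuuvis® Ultimate.

```javascript
// Hypothetical helper: wraps plain property values in the
// { "value": ... } objects used by the metadata format shown above.
function buildMetadata(objectTypeId, properties, cid) {
    const wrapped = { "system:objectTypeId": { value: objectTypeId } };
    for (const [name, value] of Object.entries(properties)) {
        wrapped[name] = { value };
    }
    const object = { properties: wrapped };
    // Reference a content stream only when a content ID is supplied.
    if (cid) {
        object.contentStreams = [{ cid }];
    }
    return { objects: [object] };
}

const metadata = buildMetadata("book", {
    title: "The Great Gatsby",
    authors: ["Scott Fitzgerald"],
    isbn: 9781433210471,
    numberOfPages: 218
}, "cid_63apple");

console.log(JSON.stringify(metadata, null, 4));
```

Leaving out the third argument produces metadata without a contentStreams section.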

To upload both the metadata and the content in a single operation, you send a multipart POST request to /dms-core/objects.

The following function uploads the metadata and the content to the platform:

JavaScript
function sendMetadataAndContents(metadata, contents) {
    const data = new formData();
    data.append("data", metadata);
    data.append("cid_63apple", contents);

    const requestOptions = {
        method: "POST",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json"
        },
        body: data
    };

    fetch(baseUrl + "/dms-core/objects", requestOptions)
        .then(response => { return response.json() })
        .then(responseJson => { 
            const objectId = responseJson.objects[0].properties[
                "system:objectId"].value;
            console.log("Created document with objectID: "+objectId);
        })
        .catch(console.error);
}

Every document created in yuuvis® Ultimate gets a unique objectId.

You can send multiple documents in a single request. The following metadata, for example, describes two books:

{
    "objects": [{
        "properties": {
            "system:objectTypeId": {
                "value": "book"
            },
            "title": {
                "value": "Tender is the Night"
            },
            "authors": {
                "value": ["Scott Fitzgerald"]
            },
            "isbn": {
                "value": 9781560549697
            },
            "numberOfPages": {
                "value": 320
            }
        },
        "contentStreams": [{
            "cid": "cid_63apple"
        }]
    },
    {
        "properties": {
            "system:objectTypeId": {
                "value": "book"
            },
            "title": {
                "value": "The Last Tycoon"
            },
            "authors": {
                "value": ["Scott Fitzgerald"]
            },
            "isbn": {
                "value": 9781543617870
            },
            "numberOfPages": {
                "value": 168
            }
        },
        "contentStreams": [{
            "cid": "cid_60apple"
        }]
    }]
}

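To send the two books described above, append both files to the form data, using the same content IDs as in the metadata, and pass the resulting options to the same fetch call as before:

JavaScript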
const data = new formData();
data.append("data", fs.createReadStream("metadata.json"));
data.append("cid_63apple", fs.createReadStream("book1.pdf"));
data.append("cid_60apple", fs.createReadStream("book2.pdf"));

const documentOptions = {
  method: "POST",
  headers: {
    "Ocp-Apim-Subscription-Key": apiKey,
    "Accept": "application/json"
  },
  body: data
};

yuuvis® Ultimate responds with the objectId of every document created.
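
Here is a sketch of how those IDs might be collected, assuming every entry in the response has the same shape as the single-document response handled in sendMetadataAndContents. The function name extractObjectIds and the sample IDs are made up for illustration.

```javascript
// Collects the system:objectId of every object in a creation response.
// Assumption: each entry looks like the one read in sendMetadataAndContents.
function extractObjectIds(responseJson) {
    return (responseJson.objects || []).map(
        object => object.properties["system:objectId"].value
    );
}

// Hand-written sample response with made-up IDs, for illustration only.
const sampleResponse = {
    objects: [
        { properties: { "system:objectId": { value: "id-1" } } },
        { properties: { "system:objectId": { value: "id-2" } } }
    ]
};

console.log(extractObjectIds(sampleResponse));
```

Storing the returned IDs lets you update, retrieve, or delete each document later.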

You can upload metadata with no content. This option enables you to add content later. In that case, the metadata doesn’t need to include the contentStreams section:

JavaScript
{
    "objects": [{
        "properties": {
            "system:objectTypeId": {
                "value": "book"
            },
            "title": {
                "value": "This side of paradise"
            },
            "authors": {
                "value": ["Scott Fitzgerald"]
            },
            "isbn": {
                "value": 9781455128259
            },
            "numberOfPages": {
                "value": 305
            }
        }
    }]
}

Use the following function to create a metadata-only document:

JavaScript
function sendMetadata(metadata) {
    const requestOptions = {
        method: "POST",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json",
            "Content-Type": "application/json"
        },
        body: metadata
    };

    fetch(baseUrl + "/dms-core/objects", requestOptions)
        .then(response => {
            if(response.status === 200) {
                console.log("Metadata-only document created");
            } else {
                console.error("Document was NOT created");
            }
        })
        .catch(console.error);
}

Update Document Metadata and Content

Knowing a document’s objectId, you can update that document’s metadata with a POST request to /dms-core/objects/<objectId>:

JavaScript
function updateMetadata(metadata, objectId) {
    const documentOptions = {
        method: "POST",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json",
            "Content-Type": "application/json"
        },
        body: metadata
    };

    fetch(baseUrl + "/dms-core/objects/" + objectId , documentOptions)
        .then(response => {
            if(response.status === 200) {
                console.log("Metadata updated");
            } else {
                console.error("Metadata was NOT updated");
            }
        })
        .catch(console.error);
}

To change only some of the properties (instead of overwriting all of them), send a PATCH request that contains only the properties to be changed:

JavaScript
{
    "objects": [{
        "properties": {
            "title": {
                "value": "The Curious Case of Benjamin Button"
            }
        }
    }]
}
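
The partial update could be prepared as sketched below. The article does not state the PATCH path, so this example assumes it is the same /dms-core/objects/<objectId> path used for the full update; buildPatchRequest is a hypothetical helper name, and the example only builds the request without sending it.

```javascript
const baseUrl = "https://api.yuuvis.io";

// Builds the URL and fetch options for a partial metadata update.
// Assumption: PATCH targets the same path as the full update (POST).
function buildPatchRequest(apiKey, objectId, changes) {
    const properties = {};
    for (const [name, value] of Object.entries(changes)) {
        properties[name] = { value };
    }
    return {
        url: baseUrl + "/dms-core/objects/" + encodeURIComponent(objectId),
        options: {
            method: "PATCH",
            headers: {
                "Ocp-Apim-Subscription-Key": apiKey,
                "Accept": "application/json",
                "Content-Type": "application/json"
            },
            body: JSON.stringify({ objects: [{ properties }] })
        }
    };
}

// OBJECT_ID is a placeholder for a real objectId.
const patch = buildPatchRequest("YOUR YUUVIS API KEY", "OBJECT_ID", {
    title: "The Curious Case of Benjamin Button"
});
// To send it: fetch(patch.url, patch.options)
console.log(patch.options.body);
```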

To upload or overwrite content, make a similar POST request to /dms-core/objects/<objectId>/contents/file.
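
A rough sketch of such a request follows. The path comes from the text above, but the article does not list the headers, so the use of Content-Type and Content-Disposition to describe the file is an assumption, as is the helper name buildContentUpdateRequest. The example only builds the request; it does not send it.

```javascript
const baseUrl = "https://api.yuuvis.io";

// Builds the URL and fetch options for uploading or overwriting content.
// body can be a Buffer or a readable stream, e.g. fs.createReadStream(path).
function buildContentUpdateRequest(apiKey, objectId, body, fileName, mimeType) {
    return {
        url: baseUrl + "/dms-core/objects/" +
            encodeURIComponent(objectId) + "/contents/file",
        options: {
            method: "POST",
            headers: {
                "Ocp-Apim-Subscription-Key": apiKey,
                "Accept": "application/json",
                // Assumption: the service takes the MIME type and file name
                // from these standard HTTP headers.
                "Content-Type": mimeType,
                "Content-Disposition": 'attachment; filename="' + fileName + '"'
            },
            body: body
        }
    };
}

// OBJECT_ID and the file content are placeholders.
const upload = buildContentUpdateRequest(
    "YOUR YUUVIS API KEY", "OBJECT_ID", "file content", "book1.pdf", "application/pdf"
);
// To send it: fetch(upload.url, upload.options)
console.log(upload.url);
```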

Search for Documents

The benefit of having defined a custom schema is that you can now query the data store to search by specialized fields defined in the schema. yuuvis® Ultimate supports a subset of the SQL language that maps document types to virtual tables and makes properties available as columns. As a result, you get a relational view of unstructured data.

First, define the search query using a SQL statement in the query JSON structure. The following sample code finds books written by a particular author:

JavaScript
{
    "query": {
        "statement": "SELECT * FROM book WHERE authors = 'Scott Fitzgerald'",
        "skipCount": 0,
        "maxItems": 50
    }
}

Use maxItems and skipCount to paginate the results.
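
As a rough illustration, the helper below derives skipCount from a zero-based page number. buildSearchBody is a hypothetical name, and the example only builds the request body; it does not contact the service.

```javascript
// Builds the search request body for one page of results.
// pageNumber is zero-based; pageSize is sent as maxItems.
function buildSearchBody(statement, pageNumber, pageSize) {
    return {
        query: {
            statement: statement,
            skipCount: pageNumber * pageSize,
            maxItems: pageSize
        }
    };
}

const statement = "SELECT * FROM book WHERE authors = 'Scott Fitzgerald'";

// Third page of 50 results: skips the first 100 matches.
const thirdPage = buildSearchBody(statement, 2, 50);
console.log(JSON.stringify(thirdPage, null, 4));
```

Pass JSON.stringify(thirdPage) as the request body to fetch that page.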

To run your search, send the query to /dms-core/objects/search.

JavaScript
function searchByAuthor(author) {
    const requestOptions = {
        method: "POST",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json",
            "Content-Type": "application/json"
        },
        body: JSON.stringify({
                "query": {
                    // Use the author argument; this assumes it contains no single quotes.
                    "statement": "SELECT * FROM book WHERE authors = '" + author + "'",
                    "skipCount": 0,
                    "maxItems": 50
                }
            })
    };

    fetch(baseUrl + "/dms-core/objects/search", requestOptions)
        .then(response => { return response.json() } )
        .then(responseJson => { console.log(JSON.stringify(responseJson, null, 4)) } )
        .catch(console.error);
}

The endpoint replies with a list containing metadata for every match.

Retrieve Documents

To retrieve metadata, make a GET request to /dms-core/objects/<objectId>:

JavaScript
function getDocumentMetadata(objectId) {
    const requestOptions = {
        method: "GET",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json"
        }
    };

    fetch(baseUrl + "/dms-core/objects/" + objectId, requestOptions)
        .then(response => { return response.json() } )
        .then(responseJson => { console.log(JSON.stringify(responseJson, null, 4)) } )
        .catch(console.error);
}

The same approach works for retrieving the document content. You only need to append /contents/file to the path:

JavaScript
function getDocumentContent(objectId) {
    const requestOptions = {
        method: "GET",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey
        }
    };

    fetch(baseUrl + "/dms-core/objects/" + objectId + "/contents/file", requestOptions)
        .then(response => {
            // process content here
        })
        .catch(console.error); 
}

On each update, yuuvis® Ultimate generates a new version of the document. You can access a particular version by adding /versions/<versionNumber> to the request.
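
The sketch below shows one way to build such URLs. The article only says that the version segment is added to the request, so placing it directly after the objectId (and before /contents/file) is an assumption; objectUrl is a hypothetical helper name.

```javascript
const baseUrl = "https://api.yuuvis.io";

// Builds the URL for a document's metadata or content, optionally pinned
// to one version. Assumption: "/versions/<n>" directly follows the objectId.
function objectUrl(objectId, { version, content = false } = {}) {
    let url = baseUrl + "/dms-core/objects/" + encodeURIComponent(objectId);
    if (version !== undefined) {
        url += "/versions/" + version;
    }
    if (content) {
        url += "/contents/file";
    }
    return url;
}

// OBJECT_ID is a placeholder for a real objectId.
console.log(objectUrl("OBJECT_ID"));
console.log(objectUrl("OBJECT_ID", { version: 2 }));
console.log(objectUrl("OBJECT_ID", { version: 2, content: true }));
```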

Delete Documents

To delete a document, send a DELETE request using that document’s objectId:

JavaScript
function deleteDocument(objectId) {
    const requestOptions = {
        method: "DELETE",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey
        }
    };

    fetch(baseUrl + "/dms-core/objects/" + objectId, requestOptions) 
        .then(response => {
            if(response.status === 200) {
                console.log("Document deleted");
            } else {
                console.error("Failed to delete document");
            }
        })
        .catch(console.error);
}

Deletion of a document removes all its versions.

Alternatively, you can delete a specific version by appending /versions/<versionNumber> to the request.

Get the Document History

yuuvis® Ultimate is certified as a cloud information management system. As such, it meets strict international standards. The platform maintains an audit trail for all your documents, where every read, modification, and deletion is logged.

You can get a complete history for a given document with a GET request to /dms-core/objects/<objectId>/history:

JavaScript
function getDocumentHistory(objectId) {
    const requestOptions = {
        method: "GET",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "application/json",
        }
    };

    fetch(baseUrl + "/dms-core/objects/" + objectId + "/history", requestOptions)
        .then(response => { return response.json(); })
        .then(responseJson => { console.log(JSON.stringify(responseJson, null, 4)); })
        .catch(console.error);
}

Generate Renditions

yuuvis® Ultimate can render documents in various formats. To convert a document to a different format, make a GET request to /dms-view/objects/<objectId>/contents/renditions/<type>. For example, the following function generates a thumbnail image:

JavaScript
function getDocumentImage(objectId) {
    const requestOptions = {
        method: "GET",
        headers: {
            "Ocp-Apim-Subscription-Key": apiKey,
            "Accept": "image/png"
        }
    };

    fetch(baseUrl + "/dms-view/objects/" + objectId + "/contents/renditions/slide", requestOptions)
        .then(response => { 
            // process image here
        })
        .catch(console.error);
}

The following renditions are available:

  • slide — a PNG file suitable for thumbnails
  • pdf — a PDF version of the document
  • text — plain text extracted from the document
  • extract — format-specific metadata extracted from the document (for example, EXIF data for images)
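
To request any of these renditions with one function, the type can be passed as a parameter, as in the sketch below. The image/png value comes from the thumbnail example; the Accept values chosen for pdf, text, and extract are assumptions based on each rendition's description, and buildRenditionRequest is a hypothetical helper name. The example only builds the request.

```javascript
const baseUrl = "https://api.yuuvis.io";

// Rendition types from the list above, each mapped to an Accept header.
// Only "image/png" is taken from the article; the rest are assumptions.
const renditionAccept = {
    slide: "image/png",
    pdf: "application/pdf",
    text: "text/plain",
    extract: "application/json"
};

// Builds the URL and fetch options for one rendition of a document.
function buildRenditionRequest(apiKey, objectId, type) {
    if (!Object.prototype.hasOwnProperty.call(renditionAccept, type)) {
        throw new Error("Unknown rendition type: " + type);
    }
    return {
        url: baseUrl + "/dms-view/objects/" + encodeURIComponent(objectId) +
            "/contents/renditions/" + type,
        options: {
            method: "GET",
            headers: {
                "Ocp-Apim-Subscription-Key": apiKey,
                "Accept": renditionAccept[type]
            }
        }
    };
}

// OBJECT_ID is a placeholder for a real objectId.
const pdfRequest = buildRenditionRequest("YOUR YUUVIS API KEY", "OBJECT_ID", "pdf");
// To send it: fetch(pdfRequest.url, pdfRequest.options)
console.log(pdfRequest.url);
```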

Additional Resources

To continue learning how to build advanced content and document management systems, check out these additional sources:

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
Tomas started his career as a PHP developer. After graduating, he worked at British Telecom as head of the Web Services department in Argentina. After that, he went to IBM, where he wore many technical hats: DBA, Sysadmin, and DevOps. He's now an independent consultant and writer. He loves to learn and to teach about technology. In his free time, he likes reading, sailing, and board gaming.
