Tutorial: Demo Application

Follow these steps to create a new capture-enabled web project. Topics include adding the document viewer and scanning controls to your web page, and handling uploaded content on the server.

This guide is intended to be followed exactly, but it is not intended to give you a solution that is ready to deploy. Once you have successfully built the example project, you can begin modifying it to fit your organization.

Also see Demo Application (ASP.NET Core).

Set up a new project

A capture-enabled web application requires these basic elements:

  • A client-side ASPX page containing the web scanning controls and document viewer.
  • A server-side ASHX handler for the Web Document Viewer.
  • A server-side ASHX handler for the Web Capture back end.
  • WebCapture and WebDocumentViewer resource files.
  • An upload location for scanned documents.

Start by creating a new ASP.NET Web Application in Visual Studio.

Note: In the following instructions, the project is called BasicWebCapture.

Visual Studio automatically gives you Default.aspx as a page, which we will use for placing the web scanning controls and viewer.

Add assembly references

Add the following DotImage assemblies to your project:

  • Atalasoft.dotImage.WebControls
  • Atalasoft.Shared

In a default installation, these assemblies can be found in C:\Program Files (x86)\Atalasoft\DotImage 11.1\bin\3.5\x86.

There may be further dependencies on any of the remaining DotImage assemblies. Include all DotImage assemblies in your project if there are problems resolving them.

Copy web resources

WingScan comes with two sets of web resources: WebCapture and WebDocumentViewer. Copy the WebCapture and WebDocumentViewer directories into the root of your project.

Create the upload location

Create a new directory in the root of your project called atala-capture-upload. This is the default path that will be used for storing images uploaded by the web scanning controls.

If you need to change the upload path (for example, to place it outside of your document root), you can set an atala_uploadpath value in the appSettings section of your web.config.

<appSettings>
    <add key="atala_uploadpath" value="c:\path\to\location"/>
</appSettings>

Add the Web Document Viewer handler

The Web Document Viewer handler is responsible for communicating with the Web Document Viewer embedded in your page, and is separate from the capture handler.

Add a new Generic Handler to your project. For the purposes of this guide, it is assumed this file will be called WebDocViewerHandler. Change the class definition to extend WebDocumentRequestHandler (part of ). Your handler should resemble the following example.

using Atalasoft.Imaging.WebControls;
namespace BasicWebCapture
{
    public class WebDocViewerHandler : WebDocumentRequestHandler
    { 
    }
}

Further modification of your handler is optional; it lets you change the default behavior to suit your application logic.

Add the Web Capture handler

The Web Capture handler is responsible for handling file uploads from the scanning controls embedded in your page, and routing them to their next destination along with any necessary metadata. It is also responsible for supplying the scanning controls with the available content and document types, and status information.

Note: The WebCaptureRequestHandler customizations described below are required only if your application uses document capture functionality. For regular scanning, WebCaptureRequestHandler is still required for licensing and for the default mechanism that uploads scanned files, but it does not require customization.

For this guide, we will create a custom handler that provides a few static content and document types, and saves uploaded files to another location. Using this baseline, you can continue modifying the handler to suit your own document handling needs.

If your organization uses Kofax Import Connector (KIC) or Microsoft SharePoint for document management, DotImage ships with handlers to connect to these services. Check their respective topics for more information on how to use these handlers.

Create a handler

Add a new Generic Handler to your project. For the purposes of this guide, it is assumed this file will be called .

The handler should be modified to extend from WebCaptureRequestHandler, and should not implement the IHttpHandler interface, as is done when a generic handler is first created. Instead your handler will need to override several methods of WebCaptureRequestHandler. Your handler should resemble the following example.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq; // needed for the Select/ToList calls in the helper method shown later
using System.Web;
using Atalasoft.Imaging.WebControls.Capture;
namespace BasicWebCapture
{
    public class WebCaptureHandler : WebCaptureRequestHandler
    {
        protected override List<string> GetContentTypeList(HttpContext context)
        {
            // ...
        }
        protected override List<Dictionary<string, string>> GetContentTypeDescription(
            HttpContext context, String contentType)
        {
            // ...
        }
        protected override Dictionary<string, string> ImportDocument(
            HttpContext context, string filename, string contentType,
            string contentTypeDocumentClass, string contentTypeDescription)
        {
            // ...
        }
    }
}

The three stubs represent the minimum number of methods that must be implemented for basic functionality, but there are other methods available in the public API that can also have their behavior overridden, such as methods to generate IDs or query the status of documents. Refer to the accompanying object reference for the complete WebCaptureRequestHandler API.

GetContentTypeList

This method returns the collection of available content types that can be used to organize scanned and uploaded documents. Content types are the top-level organizational unit, and each one has its own collection of document types (also called document classes) below it.

For this example, GetContentTypeList will be implemented to return a fixed list of two types: Accounts and HR. In a real system, this would probably query a database or other data source instead. In the KIC and SharePoint handlers, this method queries the system for these values.

protected override List<string> GetContentTypeList(HttpContext context)
{
    return new List<string>() { "Accounts", "HR" };
}

GetContentTypeDescription

This method returns a collection of data describing all the document types under a single content type. The return data is a list of dictionaries, where each dictionary contains a set of properties describing a single document type. In this example, the only property returned for a document type is its documentClass, which serves as its name.

protected override List<Dictionary<string, string>> GetContentTypeDescription(HttpContext context, String contentType)
{
    switch (contentType)
    {
        case "Accounts":
            return CreateDocumentClassDictionaryList(new string[] { "Invoices", "Purchase Orders" });
        case "HR":
            return CreateDocumentClassDictionaryList(new string[] { "Resumes" });
        default:
            return base.GetContentTypeDescription(context, contentType);
    }
}

private List<Dictionary<String, String>> CreateDocumentClassDictionaryList(String[] docList)
{
    return docList.Select(doc => new Dictionary<String, String> {{"documentClass", doc}}).ToList();
}

A helper method is provided to produce the actual list of document types, while GetContentTypeDescription switches on a given content type to determine what document types should be included in the list. As with content types, it is expected that this data will originate from another data source, instead of being hard-coded.

ImportDocument

This method is responsible for actually moving a document and its metadata to its real destination, which could be a directory, database, or system like KIC and SharePoint.

protected override Dictionary<string, string> ImportDocument(HttpContext context, string filename, string contentType, string contentTypeDocumentClass, string contentTypeDescription)
{
    // A new GUID serves as both the document ID and the stored file name.
    string docId = Guid.NewGuid().ToString();

    // Build C:\DocumentStore\<contentType>\<documentClass> and make sure it exists,
    // because File.Copy does not create missing directories.
    string importDir = @"C:\DocumentStore";
    importDir = Path.Combine(importDir, contentType);
    importDir = Path.Combine(importDir, contentTypeDocumentClass);
    Directory.CreateDirectory(importDir);

    // Path.GetExtension already includes the leading period (for example, ".tif").
    string importPath = Path.Combine(importDir, docId + Path.GetExtension(filename));
    string uploadPath = Path.Combine(UploadPath, filename);
    File.Copy(uploadPath, importPath);

    return new Dictionary<string, string>()
    {
        {"success", "true"},
        {"id", docId},
        {"status", "Import succeeded"},
    };
}

In this example, imported documents are copied into a directory tree rooted at C:\DocumentStore, using the content type and document class as subdirectories for organizing files. The imported file is copied and given a new name based on a GUID, which is also passed back to the client in the "id" field of a dictionary. The id could be used by the client to query the handler at a future time for the status of the imported document, but this functionality is not included in the guide.

Set up the scanning controls and viewer

The setup for web scanning just requires placing some JavaScript, CSS, and HTML into your page. For this guide however, we will update the document , which was originally included in the new web project.

Include the web resources

Include the following script and link tags in your page's head section to include the necessary Web Document Viewer and Web Capture code and dependencies.

<!-- Script includes for Web Viewing -->
<script src="jquery-3.3.1.min.js" type="text/javascript"></script>
<script src="atalaWebDocumentViewer.js" type="text/javascript"></script>
<!-- Style for Web Viewer -->
<link href="jquery-ui-1.12.1.min.css" rel="Stylesheet" type="text/css" />
<link href="atalaWebDocumentViewer.css" rel="Stylesheet" type="text/css" />
<!-- Script includes for Web Capture -->
<script src="atalaWebCapture.js" type="text/javascript"></script>

Configure the controls

The web scanning and web viewing controls need to be initialized and configured to set up connections to the right handlers, specify behavior for events, and so forth. This can be done with another block of JavaScript, either included or pasted directly within your page's head somewhere below the included dependencies.

<script type="text/javascript">
// Initialize Web Scanning and Web Viewing
$(function() {
    try {
        var viewer = new Atalasoft.Controls.WebDocumentViewer({
            parent: $('.atala-document-container'),
            toolbarparent: $('.atala-document-toolbar'),
            serverurl: 'WebDocViewerHandler.ashx'
        });
        Atalasoft.Controls.Capture.WebScanning.initialize({
            handlerUrl: 'WebCaptureHandler.ashx',
            onUploadCompleted: function(eventName, eventObj) {
                if (eventObj.success) {
                    viewer.OpenUrl("atala-capture-upload/" + eventObj.documentFilename);
                    Atalasoft.Controls.Capture.CaptureService.documentFilename =
                    eventObj.documentFilename;
                }
            },
            scanningOptions: { pixelType: 0 }
        });
        Atalasoft.Controls.Capture.CaptureService.initialize({
            handlerUrl: 'WebCaptureHandler.ashx'
        });
    }
    catch (error) {
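        // error.message is the standard property; error.description exists only in older Internet Explorer.
        alert('Thrown error: ' + (error.message || error.description));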
    }
});
</script>

Note that the URL for the WebDocViewer handler is specified once and the URL for the WebCapture handler is specified twice, since two capture services must be initialized.

There are several additional options and handlers that can be specified in the initialization routines for web scanning and viewing. This example represents the minimal configuration necessary for web scanning with an integrated document viewer.

Add the UI

Add the following HTML to your project to create a basic viewer UI. This includes the Web Document Viewer, drop-down boxes to choose scanners, content types, and document types, and buttons to drive the UI. The web scanning demos included with DotImage also include more complete examples.

<p>
    Select Scanner:
    <select class="atala-scanner-list" disabled="disabled" name="scannerList" style="width: 22em">
        <option selected="selected">(no scanners available)</option>
    </select>
</p>
<p>
    Content Type:
    <select class="atala-content-type-list" style="width: 30em"></select>
</p>
<p>
    Document Type:
    <select class="atala-content-type-document-list" style="width: 30em"></select>
</p>
<p>
    <input type="button" class="atala-scan-button" value="Scan" />
    <input type="button" class="atala-import-button" value="Import" />
</p>
<div>
    <div class="atala-document-toolbar" style="width: 670px;"></div>
    <div class="atala-document-container" style="width: 670px; height: 500px;"></div>
</div>

Wrap-up

Your project should be ready to deploy to IIS. It is also ready to run from your development environment for testing purposes.

Web server upload size limits

By default, IIS limits uploads to 30MB. If your application may sometimes generate larger uploads, you will need to adjust this limit for the server, or at least for your web application.

Estimating Upload Sizes

The size of an upload is approximately the sum of the compressed sizes of the uploaded images x 4/3 (1.333). The calculations below are for images. Remember that duplex scanning generates two images per page, minus any blank sides discarded by setting discardBlankPages:true.

Raw Uncompressed Image Size

Uncompressed image size in bytes is calculated according to the following formula:

(width x DPI x height x DPI x depth) / 8

Where: depth is 24 for color, 8 for grayscale, and 1 for B&W images. For example, an 8.5" x 11" color page, scanned at 200 DPI: (8.5 x 200 x 11 x 200 x 24) / 8 = 11,220,000 bytes (~11MB).
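
The formula can be checked with a few lines of JavaScript. This is only a sketch for verifying the arithmetic; the function name rawImageBytes is hypothetical and is not part of the DotImage API.

```javascript
// Hypothetical helper (not part of DotImage): raw, uncompressed size of one
// scanned image in bytes, using (width x DPI x height x DPI x depth) / 8.
function rawImageBytes(widthInches, heightInches, dpi, bitsPerPixel) {
    var widthPixels = widthInches * dpi;
    var heightPixels = heightInches * dpi;
    return (widthPixels * heightPixels * bitsPerPixel) / 8;
}

// An 8.5" x 11" page scanned at 200 DPI in each of the three depths.
var colorBytes = rawImageBytes(8.5, 11, 200, 24); // 11220000
var grayBytes = rawImageBytes(8.5, 11, 200, 8);   // 3740000
var bwBytes = rawImageBytes(8.5, 11, 200, 1);     // 467500
console.log(colorBytes, grayBytes, bwBytes);
```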

Compression Ratios

Typical B&W office documents compress by about 10X. White space increases the compression ratio; dense text or detailed graphics of any kind decrease it. 50KB per compressed B&W image is a reasonable average, and 70KB is a conservative estimate. Grayscale and color images compress by 20X-30X, sometimes more. As with B&W, blank paper compresses more and detailed content compresses less. For our example 8.5" x 11" color page scanned at 200 DPI, with a raw size of about 11MB, we estimate a compressed size in the range 374KB - 560KB.

Factor In Base64 Encoding

We multiply by 4/3 (approximately 1.333) because uploads are encoded in Base64, which encodes every 3 binary bytes as 4 text characters.
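
Putting the steps together (raw size, compression, Base64 overhead), a rough upload estimate for a batch of identical pages might be sketched as follows. The function and option names are hypothetical, and the compression ratio is an assumption you supply from the ranges above, not a value reported by the library.

```javascript
// Hypothetical estimator; none of these names come from the DotImage API.
function estimateUploadBytes(options) {
    // Raw size of one image: (width x DPI x height x DPI x depth) / 8
    var rawBytes = (options.widthInches * options.dpi *
                    options.heightInches * options.dpi *
                    options.bitsPerPixel) / 8;
    // Compressed size, using an assumed ratio (for example, 10 for B&W).
    var compressedBytes = rawBytes / options.compressionRatio;
    // Duplex scanning produces two images per sheet, before blank sides are discarded.
    var imageCount = options.sheets * (options.duplex ? 2 : 1);
    // Base64 encodes every 3 bytes as 4 characters.
    return Math.ceil(imageCount * compressedBytes * 4 / 3);
}

// Ten duplex color sheets, 8.5" x 11" at 200 DPI, assuming 20X compression.
var estimate = estimateUploadBytes({
    widthInches: 8.5, heightInches: 11, dpi: 200,
    bitsPerPixel: 24, compressionRatio: 20, sheets: 10, duplex: true
});
console.log(estimate); // 14960000 bytes, roughly 15MB
```

Under these assumptions the batch fits within the 30MB IIS default but exceeds the 4MB ASP.NET default, so the request size limits would need to be raised.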

Adjusting The ASP.NET and IIS Upload Limits

By default, IIS limits any single upload to 30MB, and ASP.NET has an even stricter default limit of 4MB for an incoming request. If you expect to upload larger files, you will need to increase both limits by editing web.config:

<configuration>
  <system.web>
    <!-- Sets the maximum request size, measured in kilobytes. The default value is 4096 kilobytes. -->
    <!-- This setting is required by ASP.NET. -->
    <httpRuntime maxRequestLength="512000" />
  </system.web>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- Sets the maximum length of content in a request, measured in bytes. The default value is 30000000 bytes. -->
        <!-- This setting is required by IIS. -->
        <requestLimits maxAllowedContentLength="524288000"/>
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
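
Because the two settings use different units (kilobytes for maxRequestLength, bytes for maxAllowedContentLength), it is easy to set them inconsistently. The sketch below derives both values from a single limit in megabytes, taking 1MB as 1024KB. The function name is hypothetical; only the two attribute names come from the configuration above.

```javascript
// Hypothetical helper: derive both web.config values from one limit in megabytes.
function uploadLimitSettings(limitMegabytes) {
    var kilobytes = limitMegabytes * 1024;
    return {
        maxRequestLength: kilobytes,               // ASP.NET setting, in kilobytes
        maxAllowedContentLength: kilobytes * 1024  // IIS setting, in bytes
    };
}

// A 500MB limit yields the values used in the example configuration.
var settings = uploadLimitSettings(500);
console.log(settings.maxRequestLength);        // 512000
console.log(settings.maxAllowedContentLength); // 524288000
```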

Alternatively, the IIS limit can also be changed interactively:

  1. Open the IIS 7 snap-in.
  2. Select the website that you want to accept large file uploads.
  3. In the main window, double-click Request Filtering. Once the window opens, you may see a list of tabs along the top, such as file name extensions, rules, and hidden segments.
  4. Regardless of which tab is selected, right-click in the main window, select Edit Feature Settings, and modify the "Maximum allowed content length (bytes)" value in "Memory Limitation".