Handout, ECPGR Workshop ‘On-line Central Crop Databases’, 17-22 February 1998, Alnarp, Sweden.

PC Databases on the Internet

Herman Nijland

Centre for Genetic Resources the Netherlands (CGN), Centre for Plant Breeding and Reproduction Research, P.O. Box 16, 6700 AA Wageningen, The Netherlands

 

Overview

In this article the focus will be on connecting MS Access databases to the Internet, using standard software on a standard platform. Several options will be discussed for the choice of the middleware, the software that interfaces between the Internet server and the database.

The proposed architecture is the same as CGN uses at the moment to maintain an on-line searchable database.

The global picture is like this:

    1. A user formulates a query using a search form, and sends it to the Internet server
    2. The Internet server passes the query on to the database interface software (middleware)
    3. The middleware passes the query on to the database, using ODBC (Open DataBase Connectivity; see below)
    4. The query is executed on the database, and the results are sent back to the middleware
    5. The results are laid out in an HTML file by the middleware
    6. The Internet server sends the file back to the user.

One of CGN’s main issues is browser independence. Every user should be able to access our databases, no matter what browser software he or she is using. All search actions are based on server-side scripts. An example of a platform is the one CGN is using:

Hardware/OS: Pentium-based server, 64 Mb RAM, good and fast connection to the Internet

Running on Microsoft Windows NT 4.0 Server

Internet server: Microsoft Internet Information Server (IIS in short) version 3.0

(package deal with Microsoft Windows NT 4.0)

Middleware: The architecture as outlined above leaves a lot of options open. The next paragraphs

will handle a couple of readily available middleware solutions.

Database: Microsoft Access, via ODBC (Open DataBase Connectivity)

 

ODBC: uniform access to databases

In order to make the database ‘reachable’ by the middleware a layer is needed in between. ODBC does just that. It provides a uniform way to access databases by defining a structure where a descriptive name (like CGN-Brassica for instance) corresponds with a database using the appropriate driver.

This way, the Internet Server sends queries to a datasource name, without bothering which format the database is in. This works for all ODBC-compliant databases (like Dbase, Foxpro, Access, Oracle, virtually any major database product.)

 

HTML: building pages and forms

HTML (HyperText Markup Language) is the language which defines the layout of Internet pages.

HTML uses (among many others) a layout method called a FORM, which basically does the following things:

One of the key attributes of the FORM tag is the ‘action-attribute’, which defines to which programme (on which computer) the variables have to be sent.

Middleware

Middleware is the real workhorse in architectures like this. While the Internet server generally only sends out HTML pages, the difficult jobs are handed over to the middleware. The options to be discussed can be part of the Internet Information Server (IDC/HTX and Active Server Pages) or can be a completely different programme altogether.

A few middleware options are:

    1. Internet Database Connector (IDC/HTX)
    2. Active Server Pages (ASP)
    3. ActiveX Data Objects (ADO)
    4. Third party middleware

Option 1: The Internet Database Connector (IDC/HTX)

IDC/HTC is an integral part of Internet Information server, and thus only works on this platform. No extra software is needed. The command- (IDC) and template (HTX) files can either be created by hand (using a text-editor) or by using the Internet Assistant for Access. Internet Assistant is a free add-on for Access ’95, and built-in in Access ’97, which allows you to automatically generate ‘Internet scriptfiles’, like IDC/HTX or ASP (see option 3). Although Internet Assistant generates a working script, you’ll need to adapt the generated code manually to suit your own wishes concerning layout and the like. The drawback of this technique is that it is hardly documented, so you’ll have to find your own way in it.

The technique is based on two types of files, an IDC file containing the query to the database and an HTX file, which is a template to layout the results from the query.

The IDC/HTX technique consists of two parts. The header, containing the name of the datasource, the name of the template and, preferably, a maxrecords option limiting the number of results to an acceptable amount. The rest of the file consists of queries, with a maximum of 40. As stated, the results of these queries are displayed in the template file, and each query needs its own space to do that. This space is assigned by placing certain tags in the template file. To control placement of records a bare-bones control language is available. This solution is now being used by CGN to search the Brassica CCDB (http://www.cpro.dlo.nl/cgn/brasedb/).

Information sources:

Microsoft site: Search in Support/Knowledge Base on "Internet Database Connector"

http://www.microsoft.com/

Web-Enable Apps with the Internet Database Connector:

http://premium.microsoft.com/msdn/library/periodic/d1/sa1096.htm

Jobforum: and example IDC application (Microsoft)

http://www.microsoft.com/accessdev/articles/jobforpa.htm

 

Option 2: Active Server Pages (ASP)

Active Server Pages are recognizable by their extension (.asp). The difference between ‘ordinary’ HTML pages and ASP pages is that HTML files are just sent from the server to the client as they are. ASP pages on the other hand are executed on the server (active server!), resulting in an HTML file which is then sent to the client. Since all actions occur on the server, it is browser independent.

The layout of an ASP file is a combination of HTML and a scripting language, typically VBScript. VBScript is a language derived from Visual Basic. This technique is more flexible than the IDC/HTX technique, but requires a lot more programming skills. ASP can access databases via ODBC or other techniques (see also option 3).

Information sources:

Microsoft site: Search in Support/Knowledge Base on "Active Server Pages"

http://www.microsoft.com/

Site Builder Workshop, server area

http://www.microsoft.com/workshop/server/default.asp

 

Option 3: ActiveX Data Objects (ADO)

The basis of the ADO technique lies in the old OLE (Object Linking and Embedding) technique. This was used mainly to combine (for example) spreadsheet data in a wordprocessor document, but has grown to be much more than that. In short, ADO is OLE over the Internet. This means that information from any object - a database, spreadsheet, or virtually anything - can be embedded in an HTML file. This object is controlled by an ActiveX control, which is executed at the client’s PC. This requires an ActiveX compatible browser on the client side (i.e. Internet Explorer). ADO can also be used in cooperation with ASP pages.

Information sources:

Microsoft Universal Data Access: http://www.microsoft.com/data/

Option 4: Third Party software

Finally a short look at alternatives for the Microsoft solutions. A number of companies offer integrated packages to publish database information on the Internet. What they have in common is that most of these programmes offer a database server connecting to the database(s), a kind of scripting language for manipulating and layout of retrieved data sets and lots of automated procedures to make life easier. What they also have in common is a more than decent price tag.

Some examples:

Cold Fusion by Allaire

Webbase by Expertelligence

AskSam by AskSam Systems

BASIS (used by ZADI)

and many others.