Exchanging Data between Globally Accepted Technologies

Today's decentralized, distributed environment for exchanging information can be confusing, yet it contains well-standardized, globally accepted technologies. Each of these technologies serves its own domain very well. But when data has to be exchanged between these technologies, many difficulties arise.

For instance, CORBA achieves interoperability between languages, hardware platforms, and operating systems quite well. But running CORBA over the Internet is complicated, so a CORBA-based system cannot easily be deployed there. Almost every technology has its pros and cons, and developers have to examine them and close the gaps according to their requirements.

Interoperability between Languages, Hardware Platforms and OS

Handling Different Datatypes


In this article we focus on a web interface for CORBA, keeping the interoperability issue in mind, and on how to make it work over the Internet. Consider the hardware platform first: processors store data in either little-endian or big-endian format. The char datatype takes one byte of memory, so the question of little-endian or big-endian storage does not arise. A 16-bit integer, on the other hand, is made up of two bytes, which permits two different ways of storing these bytes in memory.

 
 
 
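little-endian byte order: address A+1 holds the high-order byte, address A holds the low-order byte
big-endian byte order: address A holds the high-order byte, address A+1 holds the low-order byte

Figure: Little-endian byte order and big-endian byte order for a 16-bit integer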

 

It is up to the processor how this data is written into memory. Some processors store the low-order byte at the starting address, known as little-endian: the low-order byte is at address (x) and the high-order byte at address (x+1). Other processors store the high-order byte at the starting address, known as big-endian: the high-order byte is at address (x) and the low-order byte at address (x+1). There is no agreement on which byte ordering to use. If you exchange information between systems that use different byte orderings without taking care of the conversion, the receiver gets wrong data.
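The effect of mismatched byte ordering can be illustrated with a minimal sketch using Python's standard struct module. The value 0x0102 is an arbitrary choice for illustration; the format codes "<H" and ">H" select little-endian and big-endian unsigned 16-bit integers.

```python
import struct

value = 0x0102  # 16-bit integer: high-order byte 0x01, low-order byte 0x02

# "<H" = little-endian unsigned 16-bit, ">H" = big-endian unsigned 16-bit
little = struct.pack("<H", value)  # low-order byte stored first
big = struct.pack(">H", value)     # high-order byte stored first

# A big-endian receiver that reads little-endian bytes without conversion
# reconstructs a different number from the same two bytes.
misread = struct.unpack(">H", little)[0]

print(little.hex(), big.hex(), hex(misread))  # 0201 0102 0x201
```

The same two bytes yield 0x0102 or 0x0201 depending on the order in which the reader assembles them, which is why both sides must agree on a byte ordering.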

The XML specification defines a character as follows: A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646] (see also [ISO/IEC 10646-2000]). Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646. The versions of these standards cited in A.1 Normative References were current at the time this document was prepared. New characters may be added to these standards by amendments or new editions. The use of "compatibility characters", as defined in section 6.8 of [Unicode] (see also D21 in section 3.6 of [Unicode3]), is discouraged.

If the exchange of data involves byte-encoded (character) transfer only, there is no problem in communicating between processors that use different endian schemes. Some technologies revolve around the character set as the basic medium of data transfer.
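As a rough sketch of why character-based transfer sidesteps the problem, the example below sends the same number once as ASCII text and once as a binary 16-bit integer. The value 258 is arbitrary and chosen only for illustration.

```python
import struct

value = 258

# Character-based encoding: each character is a single byte,
# so there is no multi-byte quantity whose order could differ.
as_text = str(value).encode("ascii")  # b"258" on every platform

# Binary encoding: the byte sequence depends on the byte order chosen.
as_little = struct.pack("<H", value)
as_big = struct.pack(">H", value)

# The receiver recovers the number from the text without knowing
# anything about the sender's processor.
decoded = int(as_text.decode("ascii"))

print(as_text, as_little.hex(), as_big.hex(), decoded)
```

The text form is longer and must be parsed, which is the price paid for independence from the processor's byte ordering.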

But the world is not made of characters only. There are also primitive datatypes, such as int, short, and long in C++, which are language constructs, as well as user-defined types. There must be some way of handling these datatypes too. Languages such as C++ and Java, and CORBA through its IDL, take care of these constructs. Other technologies, such as XML and HTML, do not.

XML for CORBA


XML (Extensible Markup Language) is a streamlined subset of SGML. It deals with structure rather than infrastructure, which is CORBA's concern. XML is powerful because it lets developers create their own markup languages; it is the key to creating markup that can be used by any number of applications beyond the Web browser.

The definition of markup as defined in the XML Specification:

Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, CDATA section delimiters, document type declarations, processing instructions, XML declarations, text declarations, and any white space that is at the top level of the document entity (that is, outside the document element and not inside any other markup).

All text that is not markup constitutes the character data of the document. Choosing XML for CORBA brings an added benefit: you can define markup for each of your IDL definitions.

The IDL definition:

    module modtest
    {
      interface inttest
      {
        void methtest();
      };
    };

Associated XML description:

    <module name="modtest">
      <interface name="inttest">
        <method>methtest</method>
      </interface>
    </module>
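The mapping above can be produced mechanically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name describe_interface is hypothetical, and the element names (module, interface, method) simply follow the example above rather than any standardized IDL-to-XML mapping.

```python
import xml.etree.ElementTree as ET


def describe_interface(module_name, interface_name, method_names):
    """Build an XML description for one IDL module with one interface."""
    module = ET.Element("module", {"name": module_name})
    interface = ET.SubElement(module, "interface", {"name": interface_name})
    for method_name in method_names:
        # Each IDL operation becomes a <method> element holding its name.
        method = ET.SubElement(interface, "method")
        method.text = method_name
    return module


doc = describe_interface("modtest", "inttest", ["methtest"])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

A fuller mapping would also have to describe parameters, return types, and exceptions, which this sketch leaves out.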


HTTP and the WWW


The Hypertext Transfer Protocol (HTTP) is a well-known protocol used over the Internet for data transfer and has been in use since 1990. It is an application-level protocol layered above TCP/IP, with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless protocol that can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A notable feature of HTTP is the 'typing of data representation', which allows systems to be built independently of the data being transferred.


How HTTP, XML and CORBA can Work Together


It is therefore possible to combine the features of HTTP, XML, and CORBA to achieve overall interoperability while still meeting the speed requirement.


HTTP has a set of common methods such as GET and POST. These methods are flexible: they can be extended to carry additional information that is used during method processing. The POST method can send a block of data, such as the result of submitting a form, to a data-handling process, so it can pass a request from the client to the server. This feature of HTTP can be used to carry the CORBA request/response model.
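As an illustration of POST carrying a block of data, the sketch below prepares (but does not send) a request with Python's standard urllib.request module. The URL and the XML payload are hypothetical placeholders, not part of any real service.

```python
import urllib.request

# Hypothetical endpoint and payload, for illustration only.
url = "http://example.com/corba"
body = b"<request><method>methtest</method></request>"

request = urllib.request.Request(
    url,
    data=body,                             # the block of data to deliver
    method="POST",
    headers={"Content-Type": "text/xml"},  # tells the server how the data is typed
)

# urllib.request.urlopen(request) would transmit it; the call is omitted
# here because the endpoint above does not exist.
print(request.get_method(), request.full_url, len(request.data))
```

The Content-Type header is one place where HTTP's typing of data representation becomes visible to the application.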


The CORBA client request can be encoded into an XML document (the request). The complete request can be passed to the HTTP server using the POST method. Upon receiving the request, the HTTP server can treat it differently from a normal HTTP request, based either on a special CORBA flag sent together with the request or on the URL the request was sent to. The server then passes the request to the CORBA layer. Before the method is invoked, the actual CORBA request is reconstructed on the server side by parsing the XML document. The result of the requested method is encoded into an XML document, and the complete response is sent back to the client. This scenario looks simple, but a lot of innovative effort is required to accomplish it.
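The round trip described above can be sketched without a network or an ORB. Everything in this example is an assumption made for illustration: the element names (request, arg, response, result), the "long" type label, the CalculatorServant class, and its add operation stand in for a real CORBA object and a real encoding.

```python
import xml.etree.ElementTree as ET


class CalculatorServant:
    """Hypothetical stand-in for a CORBA object implementation."""

    def add(self, a, b):
        return a + b


# Only operations listed here may be invoked from a decoded request.
EXPOSED_METHODS = {"add"}


def encode_request(method, args):
    """Client side: turn a method call into an XML request document."""
    root = ET.Element("request", {"method": method})
    for arg in args:
        ET.SubElement(root, "arg", {"type": "long"}).text = str(arg)
    return ET.tostring(root, encoding="utf-8")


def handle_request(body, servant):
    """Server side: parse the XML, invoke the method, encode the result."""
    root = ET.fromstring(body)
    method = root.get("method")
    if method not in EXPOSED_METHODS:
        raise ValueError("unknown method: %r" % method)
    args = [int(arg.text) for arg in root.findall("arg")]
    result = getattr(servant, method)(*args)
    response = ET.Element("response")
    ET.SubElement(response, "result", {"type": "long"}).text = str(result)
    return ET.tostring(response, encoding="utf-8")


def decode_response(body):
    """Client side: extract the result from the XML response document."""
    return int(ET.fromstring(body).find("result").text)


request_body = encode_request("add", [2, 3])
response_body = handle_request(request_body, CalculatorServant())
print(decode_response(response_body))  # 5
```

In a deployment, request_body would travel in the body of an HTTP POST, and handle_request would sit behind the web server and forward the call to the ORB instead of a local Python object.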


One of the problems encountered is encoding the primitive datatypes and the user-defined types into the XML request/response document. The XML Schema Working Group is currently working on this issue.

XML files are of two types:

Document-oriented, which takes care of messages
Data-oriented, which takes care of real datatypes

What do we understand by datatypes?
A datatype is a 3-tuple, consisting of:
A set of distinct values, called its value space
A set of lexical representations, called its lexical space
A set of facets that characterize the properties of the value space, individual values or lexical items.
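The three parts of a datatype can be modeled directly, as in this minimal sketch. The Datatype class and its field names are hypothetical, the byte range of -128 to 127 is an assumption, and the facet names are chosen only to illustrate properties of the value space.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Datatype:
    name: str
    in_value_space: Callable[[int], bool]  # membership test for the value space
    parse: Callable[[str], int]            # lexical representation -> value
    facets: Dict[str, object]              # properties of the value space


byte = Datatype(
    name="byte",
    in_value_space=lambda v: -128 <= v <= 127,
    parse=int,
    facets={"ordered": True, "bounded": True, "numeric": True},
)

# Several lexical representations can denote the same value.
values = {byte.parse(text) for text in ("1", "01", "+1")}
print(values)                    # {1}
print(byte.in_value_space(200))  # False: 200 lies outside the value space
```

Separating the lexical space from the value space is what lets "01" and "+1" be compared as equal numbers rather than as different strings.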

The requirements for the XML Schema language are:
Provide for primitive data typing, including byte, date, integer, sequence, and SQL and Java primitive datatypes
Define a type system that is adequate for import/export from database systems (e.g., relational, object, OLAP)
Distinguish requirements relating to lexical data representation from those governing an underlying information set
Allow creation of user-defined datatypes, such as datatypes that are derived from existing datatypes and which may constrain certain of their properties (e.g., range, precision, length, format)
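The last requirement, deriving a user-defined datatype by constraining an existing one, can be sketched as follows. The IntegerType class, its restrict method, and the percentage type are all hypothetical names used for illustration; only the range property is modeled.

```python
class IntegerType:
    """An integer datatype constrained to an inclusive range."""

    def __init__(self, name, minimum, maximum):
        self.name = name
        self.minimum = minimum
        self.maximum = maximum

    def restrict(self, name, minimum=None, maximum=None):
        """Derive a new type; it can only narrow the base range, never widen it."""
        new_min = self.minimum if minimum is None else max(self.minimum, minimum)
        new_max = self.maximum if maximum is None else min(self.maximum, maximum)
        return IntegerType(name, new_min, new_max)

    def validate(self, lexical):
        """Parse a lexical representation and check it against the range."""
        value = int(lexical)
        if not self.minimum <= value <= self.maximum:
            raise ValueError("%s: %d is out of range" % (self.name, value))
        return value


short = IntegerType("short", -32768, 32767)
percentage = short.restrict("percentage", minimum=0, maximum=100)

print(percentage.validate("42"))  # 42
```

Because restrict clamps the requested bounds to the base type's bounds, every value of the derived type is also a valid value of the type it was derived from.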

Using HTTP with CORBA has the added advantage that HTTP works on a well-known port, even across firewalls and proxies. The second issue to be solved is the kind of adapter that takes the HTTP request carrying the CORBA flag and converts the HTTP request/response into a CORBA request/response. People are working in this direction as well. One current effort is the protocol adapter XORBA, which acts as a web server adapter that automatically handles requests for CORBA services. These requests and responses use SOAP (Simple Object Access Protocol), an object-access mechanism that uses HTTP as the transport and XML as the method for encoding information.
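To give a feel for the XML that SOAP places in the HTTP body, here is a minimal sketch that builds a SOAP 1.1 envelope around a method call. It is not XORBA's actual message format, which the article does not describe; the service namespace urn:example:inttest and the function name build_envelope are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:inttest"                      # hypothetical service namespace

# Use the conventional prefix instead of an auto-generated one.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_envelope(method, params):
    """Wrap a method call and its parameters in an Envelope/Body pair."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (SERVICE_NS, method))
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


soap_text = build_envelope("methtest", {})
print(soap_text)
```

The resulting text would be sent as the body of an HTTP POST, which is how SOAP combines HTTP as the transport with XML as the encoding.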

Conclusion:
We have seen how powerful it is to integrate the features of HTTP, CORBA, and XML. The 'typing of data representation', a notable feature of HTTP, allows systems to be built independently of the data being transferred. Similarly, the use of XML as the data exchange format between systems will grow with time, and we can look forward to many more ways of using XML and CORBA together that have yet to be invented.

- Rupak Kumar, Software Engineer, iCMG.

 
     
Copyright © 2006 iCMG. All rights reserved.