Problems Associated with Collecting International Data over the Internet

By Graham Rhind, Owner, GRC Database Information

Email: graham@grcdi.nl
Web: http://www.grcdi.nl

Graham Rhind is an acknowledged expert in the field of data management. He runs his own consultancy company, GRC Database Information, based in The Netherlands, where he researches postal code and addressing systems, collates international data, runs a busy postal link website and writes data management software. Graham speaks regularly on the subject and is the author of three books on the topic of international data management.


Though a web site can be accessed from almost any part of the world, most companies fail to use the opportunities this electronic medium offers to optimise the quality of the data they collect, and instead resort to spending vast amounts of money cleaning data after collection.

It is interesting to me that, however new and exciting a marketing channel is, most businesses, and the individuals within them, continue to operate in deep ruts of their own making and fail to widen their understanding to encompass the new medium.

One of the deepest ruts that I see in my chosen field is how, when it comes to collecting data from a visitor to a web site, companies treat the Internet as though it were as inflexible as a form printed on paper.

Many companies spend large amounts of money on knowledge intended to make their web sites culturally pleasing to a chosen audience. They will work on localisation issues, but in far too many cases seem to think that the concept of “localisation” stops where translation ends.

Forms for collecting visitor information are perfect examples of this. Even the sites of companies specialising in globalisation, internationalisation and/or localisation present a single input form to every visitor, regardless of their location, cultural background or personal needs. These forms are, without exception, culturally biased and in almost all cases totally unsuitable for collecting data from most places outside the country in which the website owner is located.

There are millions of websites on the Internet, and I’ve only visited a fraction, but I’ve never yet come across a site whose owner has given enough thought to the world around them, and to their own internal processes, to remember that the Internet is a highly flexible electronic medium and that you are not limited to a single web form when collecting data, as you might be if your form were printed on paper.

Remember that a data input form is the point of your website where interaction with your customer is likely to be at its greatest. It deserves to be given greater thought in its design.

The personal information that you will usually want to collect, and which usually varies the most between cultures, is the person’s name and address. There are around 130 different address formats used in the world today, and about 35 different personal name formats. If you want to truly open your business and site to an international audience, this issue has to be addressed.

Unfortunately, companies used to paper-based data collection forms forget the flexibility of the electronic medium and present a single form to collect name and address data on their websites. In far too many cases this means that the customer is unable to enter their details correctly, in full and/or in the right place. At the same time, your customers become frustrated and irritated and your database becomes polluted.

I am sure most of us have visited web sites that require a “state” field to be filled in, even though most countries in the world do not have states. Another example is American forms which use the field labels Prefix, First name, Last name and Suffix when collecting personal name data. As you can see, each of these labels suggests the position of one component of a person’s name relative to the others. Whilst such a form intends to collect the given name in the first name field and the family name in the last name field, most of the world’s population actually write their names with the family name first. Furthermore, whilst Americans write their form of address (Mr, Mrs etc.) as a prefix and their seniority (Sr, Jr, III etc.) or academic title (Ph.D. etc.) as a suffix, Germans, for example, write both their form of address and their academic titles as a prefix, and the Japanese write their form of address as a suffix.
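To make the point about field labels concrete, here is a minimal sketch in Python of one way to store name components under position-neutral labels and apply the ordering only when the name is written out. The PersonName class, the NAME_ORDER table and the format_name function are hypothetical names invented for this illustration. The three ordering rules are simplified from the examples above and are assumptions, not a complete or authoritative description of any country's conventions.

```python
from dataclasses import dataclass


@dataclass
class PersonName:
    """Name components stored by meaning, not by position."""
    given_name: str
    family_name: str
    form_of_address: str = ""   # e.g. Mr, Frau
    academic_title: str = ""    # e.g. Dr., Ph.D.
    seniority: str = ""         # e.g. Jr, III


# Hypothetical, simplified ordering rules keyed by country code.
# Real conventions are more varied than these three examples suggest.
NAME_ORDER = {
    "US": ["form_of_address", "given_name", "family_name",
           "seniority", "academic_title"],
    "DE": ["form_of_address", "academic_title", "given_name", "family_name"],
    "JP": ["family_name", "given_name", "form_of_address"],
}


def format_name(name: PersonName, country: str) -> str:
    """Join the non-empty components in the order used for the country.

    Raises KeyError for a country that has no rule in NAME_ORDER.
    """
    order = NAME_ORDER[country]
    parts = [getattr(name, component) for component in order]
    return " ".join(part for part in parts if part)


german = PersonName(given_name="Anna", family_name="Schmidt",
                    form_of_address="Frau", academic_title="Dr.")
print(format_name(german, "DE"))  # Frau Dr. Anna Schmidt
```

Because the stored fields are called given_name and family_name rather than First name and Last name, the same record can be rendered in family-name-first order without anyone having to guess which field held which component.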

The result is highly polluted and, in many cases, useless data. Companies seem to perceive it as acceptable to spend large amounts of money cleaning and validating data collected via an Internet form, but don’t appreciate that a smaller amount of money spent on optimising input forms for their international visitors will save them far larger sums of money further down the line.

Software components do exist which can resolve this problem. For example, first asking visitors for their country of residence and the language in which they would like to see the form allows a web page input form to be built which mirrors their requirements in terms of the fields presented, the field order and the field labels. This produces better data for the website owner and a happier customer, which in turn can translate to commercial success.
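The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any existing component: the FIELD_ORDER, FALLBACK_FIELDS and LABELS tables and the build_form function are hypothetical, and the two country layouts and two label sets are small samples chosen for the example.

```python
# Hypothetical lookup tables: which fields to present, and in what order,
# for a given country of residence. Only two sample countries are shown.
FIELD_ORDER = {
    "US": ["given_name", "family_name", "street", "city", "state", "postal_code"],
    "NL": ["given_name", "family_name", "street", "postal_code", "city"],
}

# Assumed fallback for countries without a rule: free-text fields that
# impose no structure on the visitor.
FALLBACK_FIELDS = ["full_name", "full_address"]

# Hypothetical field labels keyed by the language chosen by the visitor.
LABELS = {
    "en": {
        "given_name": "Given name", "family_name": "Family name",
        "street": "Street address", "city": "City", "state": "State",
        "postal_code": "Postal code",
        "full_name": "Full name", "full_address": "Full address",
    },
    "nl": {
        "given_name": "Voornaam", "family_name": "Achternaam",
        "street": "Straat en huisnummer", "city": "Plaats", "state": "Staat",
        "postal_code": "Postcode",
        "full_name": "Volledige naam", "full_address": "Volledig adres",
    },
}


def build_form(country: str, language: str) -> list[tuple[str, str]]:
    """Return (field key, label) pairs in the order they should be shown."""
    fields = FIELD_ORDER.get(country, FALLBACK_FIELDS)
    labels = LABELS.get(language, LABELS["en"])  # fall back to English labels
    return [(field, labels.get(field, field)) for field in fields]


for field, label in build_form("NL", "nl"):
    print(f"{label}: [{field}]")
```

A visitor from the Netherlands is never shown a “state” field, and one from a country with no rule yet gets free-text fields rather than a structure borrowed from another country.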

Alas, sales of these components are known to be minimal. Clearly, we still have a long way to go before we are able to pull companies out of this particular rut.