Capturing Customer Data on the Web

Data. For some reason, when I hear the term, I still tend to think of the almost, but not quite, human android from Star Trek: The Next Generation. As cool as good old, white-skinned Data might have been, though, in the real world data is a much more immediate entity that drives much of what goes on in our everyday lives.

In fact, this whole thing that we call the internet has been constructed on data of some type. HTML is a type of content data. CSS is data that is parsed by the browser to control the way a page is actually rendered. A MySQL database is composed of tables, which essentially organize bits of data into manageable and accessible units. As such, if you are designing or (more specifically) developing for the web, you’re probably dealing with data of some sort all the time.

Beyond the web itself, however, data also plays a significant role in a number of different types of marketing, providing companies and organizations with the information they need to make informed decisions about their continuing marketing and advertising efforts.

For example, a company might launch a nationwide marketing campaign to promote a new product. Then, several weeks into the campaign, they could analyze response rates from different regions and adjust their efforts in areas where the overall response was not as high. In order to do this, however, they need good, reliable data about their responders.

In these days of the great wide web, many of these customer responses come through the very websites that we are designing for our clients. So, the question is – what are we doing to help ensure the highest possible integrity and usability of the data collected through the sites we build?

In response to this question, I would like to go over a few areas that you might want to consider when capturing customer data on the web.

A Little Background

Most people reading here probably know me for my work in design and development, both here and over on the Echo Enduring Blog. However, I am also currently the Creative Director for Highland Marketing, a direct marketing company with a strong technical side. While my role has primarily focused on design and web development, I’ve also had the chance to work with quite a bit of customer data in various capacities. I’ve also been involved in a number of custom data entry projects, in which information entered from physical cards and surveys had to be married with data streaming in from a website.

Ultimately, I’ve found that all of this has given me a pretty strong grasp on the role of data in marketing (and specifically direct marketing). It has also afforded me the opportunity to see just how frequently data coming in from websites simply does not meet the levels of quality that it should.

So let’s address that issue.



When a website is collecting customer data – through surveys, online purchases or some other type of form – it is rarely doing so just for the sake of capturing data.

The information that is gathered is intended for a specific purpose, which will usually have some kind of bearing on marketing and/or ongoing customer relationships.

This also means that the data is likely intended to stream off of the website for inclusion in some form of larger system. For instance, suppose a company is receiving requests for quotations from their website. These requests will likely contain a variety of important information, such as a contact name, address, telephone number, email and all of the details needed to prepare the quotation itself. If there are only a couple of quotations coming in per day, someone might be able to process these manually.

But what if there are dozens of requests coming in every day, or even every hour? Instead of processing these all manually, wouldn’t it be easier if there were some sort of process that would allow the company to download all of the recent requests and load them into their own internal systems for processing?

As a web developer, you really shouldn’t be expected to provide the entire solution for that process, but you will definitely need to be involved. In order for it to run smoothly, that data will have to be exported in the proper format (XML, CSV, etc.). It should also be broken out into the proper fields. If the client’s system uses separate fields for the first and last names (which we will talk about below), then be sure to provide the name data broken out accordingly.
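To make that a little more concrete, here is a minimal Python sketch of a CSV export with one column per field. The field names and the sample request are my own assumptions for illustration; a real export would mirror whatever layout the client's system expects.

```python
import csv
import io

# Hypothetical field layout; a real export would mirror the client's own columns.
FIELDS = ["first_name", "last_name", "email", "phone", "details"]


def export_requests_csv(requests):
    """Return the quotation requests as CSV text, one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for request in requests:
        # Missing values become empty cells so every row keeps the same shape.
        writer.writerow({name: request.get(name, "") for name in FIELDS})
    return buffer.getvalue()


# Made-up sample data for illustration only.
sample = [
    {
        "first_name": "Don",
        "last_name": "Davis",
        "email": "don@example.com",
        "phone": "555-0100",
        "details": "Quote for 500 brochures, full colour",
    },
]
csv_text = export_requests_csv(sample)
```

Letting the csv module do the writing means that values containing commas or quotation marks are escaped properly, which is easy to get wrong when building the lines by hand.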

Similarly, try to use values that are consistent with the client’s existing data structure. I’ve seen instances where data streaming off of a website is represented differently from the same type of data on the main system. In order for the two to be properly married, custom routines had to be written to interpret and convert the web data. Try saving your client these kinds of headaches by formatting data to match their existing schema.
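One simple way to do this is to translate the web values at the point of capture. The Python sketch below assumes, purely for illustration, that the form collects province names while the client's system stores two-letter codes; the mapping and function name are hypothetical.

```python
# Hypothetical mapping from the labels a web form shows to the codes
# a client's system might store. Extend it to match the real schema.
PROVINCE_CODES = {
    "ontario": "ON",
    "quebec": "QC",
    "british columbia": "BC",
}


def to_client_code(web_value, mapping):
    """Translate a web form value into the client's stored code."""
    key = web_value.strip().lower()
    if key not in mapping:
        # Fail loudly rather than pass an unknown value into the client's system.
        raise ValueError(f"No client code for {web_value!r}")
    return mapping[key]
```

Raising an error for unknown values is a design choice: it surfaces mismatches early instead of letting them pile up in the client's database.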

It’s all about improving the overall interchangeability of data from the web side, making things easier for the client and helping maintain the overall integrity of the data.

Time Stamp

Always, always, always include a date and time stamp on your records. This helps to place the data in the context of time. For instance, if a company is running a marketing campaign with the primary response vehicle being their website, the time stamp will indicate exactly when each customer responded within the course of that campaign. When analyzing a cross-section of different customers, this information can also help identify trends, which may in turn indicate which marketing efforts were the most successful.

Time stamps are also critical on fulfillment-based campaigns. I’ve worked on a number of projects where consumers can request samples of a product from a website. In these instances, the time stamps let us track the amount of time that passes between the request and the fulfillment, helping to ensure that everything moves along in a timely and orderly fashion.
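As a rough Python sketch, stamping a record and measuring the gap between request and fulfillment might look like the following. The `submitted_at` field name and both function names are assumptions of mine, and storing the stamp in UTC is a suggestion rather than a requirement.

```python
from datetime import datetime, timezone


def stamp_record(record, now=None):
    """Return a copy of the record with a UTC time stamp in ISO 8601 form."""
    moment = now or datetime.now(timezone.utc)
    return {**record, "submitted_at": moment.isoformat()}


def days_to_fulfill(requested_at, fulfilled_at):
    """Whole days between the request and its fulfillment."""
    start = datetime.fromisoformat(requested_at)
    end = datetime.fromisoformat(fulfilled_at)
    return (end - start).days
```

Keeping the stamp in one consistent time zone and format makes it much easier to compare records later, whatever system they end up in.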

Edits & Validation

As wonderful and marvelous as they are, I think we can all agree that computers are stupid. Actually, they are less than stupid – they are entirely devoid of intelligence. A computer can only interpret data in the manner that it is told to. That interpretation can certainly be very complex, but the computer cannot deviate from its programmed course (though it can appear to through nasty little bugs).

So, when you go to all the trouble of capturing information from users online, the last thing that you want is for them to be sabotaging the data (intentionally or not) with improper or invalid values. This kind of bad data can render records partially or entirely useless to your clients.

To avoid these kinds of shenanigans, I strongly recommend using some form of validation on your forms. If you have an email address field, validate the input to make sure that it is at least structured like an email address, rather than just a string of nonsense. The same sort of thing can be done for any piece of data that requires a standardized format, like a telephone number or a postal code.
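Here is a minimal Python sketch of this kind of structural check. The patterns are deliberately loose and are my own assumptions: they only confirm that a value has the right shape, not that the address or number actually exists. The postal code pattern assumes the Canadian format, and the phone check assumes ten-digit North American numbers.

```python
import re

# Loose structural checks only: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Canadian postal code shape: letter digit letter, optional space, digit letter digit.
POSTAL_RE = re.compile(r"^[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d$")


def looks_like_email(value):
    """True if the value is at least structured like an email address."""
    return bool(EMAIL_RE.match(value.strip()))


def looks_like_postal_code(value):
    """True if the value has the shape of a Canadian postal code."""
    return bool(POSTAL_RE.match(value.strip()))


def normalize_phone(value):
    """Return the ten digits of a North American number, or None if the shape is wrong."""
    digits = re.sub(r"\D", "", value)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits if len(digits) == 10 else None
```

Normalizing the phone number to bare digits, rather than just accepting or rejecting it, also helps with interchangeability, since every record ends up in the same format.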

Another important form of validation simply involves checking to make sure that data is actually present. If a user is submitting a form that contains certain required fields, a good validation script will check the form inputs to make sure that there is actually data in all the right fields. This helps prevent your clients from building a collection of empty and entirely useless records.
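A presence check can be very small. In this Python sketch, the list of required fields is hypothetical, and a field that contains only spaces is treated as empty.

```python
# Hypothetical list of required fields for illustration.
REQUIRED_FIELDS = ["first_name", "last_name", "email"]


def missing_fields(form, required=REQUIRED_FIELDS):
    """List the required fields that are absent or contain only whitespace."""
    return [name for name in required if not str(form.get(name) or "").strip()]
```

Returning the list of missing fields, rather than a simple yes or no, lets the form tell the user exactly what still needs to be filled in.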

You can also help maintain the integrity of data by simply providing users a choice of options to select from. Free-form text fields are great for unique data and comments, but they are very unpredictable. By using drop-downs, radio buttons and check boxes, you can enforce very strict control on both the data itself and its format (for interchangeability purposes).
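It is worth checking submitted choices on the server as well, since a posted value does not have to come from the options the page displayed. A minimal Python sketch, with a made-up option list and function name:

```python
# Hypothetical option list, the same one a drop-down would show.
CONTACT_METHODS = ("email", "phone", "mail")


def clean_choice(value, allowed):
    """Return the submitted value if it is one of the allowed options, else None."""
    candidate = value.strip().lower()
    return candidate if candidate in allowed else None
```

Using the same list to build the drop-down and to check the submission keeps the two from drifting apart.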

Data Segmentation

Have you ever noticed how some web forms ask you for your first name and last name in separate fields? Or how address data is very carefully broken out, rather than all mashed together in a single line? This has to do with the importance of data segmentation. It’s a whole lot easier to stick two pieces of data back together than it is to break one piece of data into multiple parts – especially if it doesn’t have any kind of standardized format or structure.

Take these names, fictitiously entered into the name field of a web form, as an example:

Don Davis
Ms. Kathy Lauper
Johan van Driel
Dr. Mary Anne Smith

There is no simple way to break all of these names down into their separate parts. However, if the individual parts were all captured separately, everything would be much easier:

Title | First name | Last name
(none) | Don | Davis
Ms. | Kathy | Lauper
(none) | Johan | van Driel
Dr. | Mary Anne | Smith

To recreate the full names, all we have to do is string the individual bits of data together. At the same time, if we want to access just the first name, we can do that too, through the first name field. It makes everything much cleaner and much simpler to manage. It also increases the overall flexibility of the data.

But why would we want to be able to access just the first name? Personalization of marketing materials is the one big reason that comes to mind (check out these personalization demos that I helped put together). Using this technique (in either direct mail or email) generally involves inserting a name into a greeting line, or even into the body copy itself, depending on how elaborate you want to get. Trust me when I say that having the name information broken out will make the setup process much easier for you or your client.
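As a small Python sketch of both ideas, here is how separately captured name parts could be strung back together, and how the first name alone could feed a greeting line. The function names, the fallback greeting, and the assumption that "Mary Anne" was entered as a first name are all mine.

```python
def full_name(title, first_name, last_name):
    """Join the non-empty name parts with single spaces."""
    return " ".join(part for part in (title, first_name, last_name) if part)


def greeting(first_name):
    """A simple greeting line, with a fallback when no first name was captured."""
    return f"Dear {first_name}," if first_name else "Dear Customer,"
```

Going in this direction is trivial. Going the other way, from "Dr. Mary Anne Smith" back to its parts, would mean guessing where the first name ends.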


There is no doubt that web designers already wear a lot of different hats. I know I do. So, I’m certainly not suggesting that you run out and become an expert on information architecture, statistics, database analysis and the like. However, if you do find that the sites you are working on need to capture some form of data about your client’s customers, then considering the four areas cited above should help you record and maintain clean, usable data.

You might also want to consider checking out the Address Accuracy & Database Primer, a free eBook (PDF format) that I helped put together a few years ago. Its primary focus is on addressing for delivery through the Canadian postal system, but there is also an entire chapter just about data and databases that you might find useful.

Of course, you also need to work towards maintaining usability in your sites, so I would recommend a certain level of caution when it comes to validation and data segmentation. They are important, but if you use too much, you may be in danger of sacrificing the usability of the site. It’s a fine balance that needs to be maintained, so be sure to pay careful attention to it!

Your Turn to Talk

What do you do to help maintain the quality and integrity of the data that is being captured on the websites you create? Are you going to do anything differently now? Do you do anything other than what we covered here? Have your say!


