Internet cookies are incredibly simple, but they are one of those things that have taken on a life of their own. Cookies started receiving tremendous media attention back in February 2000 because of Internet privacy concerns, and the debate still rages.

On the other hand, cookies provide capabilities that make the Web much easier to navigate. The designers of almost every major site use them because they provide a better user experience and make it much easier to gather accurate information about the site's visitors.

In this article, we will look at the basic technology behind cookies, as well as some of the features they enable. You will also have the opportunity to see a real-world example of what cookies can and cannot do using a sample page that we developed here at stuff.dewsoftoverseas.com.

Cookie Basics
In April of 2000 I read an in-depth article on Internet privacy in a large, respected newspaper, and that article contained a definition of cookies. Paraphrasing, the definition went like this:

    Cookies are programs that Web sites put on your hard disk. They sit on your computer gathering information about you and everything you do on the Internet, and whenever the Web site wants to it can download all of the information the cookie has collected. [Wrong]

Definitions like that are fairly common in the press. The problem is, none of that information is correct. Cookies are not programs, and they cannot run like programs do. Therefore, they cannot gather any information on their own. Nor can they collect any personal information about you from your machine.

Here is a valid definition of a cookie:

    A cookie is a piece of text that a Web server can store on a user's hard disk. Cookies allow a Web site to store information on a user's machine and later retrieve it. The pieces of information are stored as name-value pairs. [Correct]

For example, a Web site might generate a unique ID number for each visitor and store the ID number on each user's machine using a cookie file.
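
As a rough illustration of that idea, here is a minimal Python sketch of how a server might create such an ID and phrase it as a cookie. It uses only the standard library. The cookie name UserID, the 16-hex-digit ID format and the function name new_visitor_cookie are assumptions made for this example, not a description of how any particular site works.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Tuple


def new_visitor_cookie(name: str = "UserID") -> Tuple[str, str]:
    """Return (visitor_id, header_line) for a first-time visitor."""
    # 8 random bytes rendered as 16 uppercase hex digits; the format is an
    # assumption chosen for illustration.
    visitor_id = secrets.token_hex(8).upper()
    cookie = SimpleCookie()
    cookie[name] = visitor_id  # one name-value pair
    # output() renders the pair as a "Set-Cookie: name=value" header line.
    return visitor_id, cookie.output()


visitor_id, header = new_visitor_cookie()
print(header)  # e.g. Set-Cookie: UserID=<16 hex digits>
```

The point of the sketch is that the cookie is nothing more than a short piece of text: a name, a value, and the header that carries them.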

If you use Microsoft's Internet Explorer to browse the Web, you can see all of the cookies that are stored on your machine. The most common place for them to reside is in a directory called c:\windows\cookies. When I look in that directory on my machine, I find 165 files. Each file is a text file that contains name-value pairs, and there is one file for each Web site that has placed cookies on my machine.

You can see in the directory that each of these files is a simple, normal text file. You can see which Web site placed the file on your machine by looking at the file name (the information is also stored inside the file). You can open each file by clicking on it.

For example, I have visited goto.com, and the site has placed a cookie on my machine. The cookie file for goto.com contains the following information:

    UserID    A9A3BECE0563982D    www.goto.com/

Goto.com has stored on my machine a single name-value pair. The name of the pair is UserID, and the value is A9A3BECE0563982D. The first time I visited goto.com, the site assigned me a unique ID value and stored it on my machine.

(Note that there probably are several other values stored in the file after the three shown above. That is housekeeping information for the browser.)

Amazon.com stores a bit more information on my machine. When I look at the cookie file Amazon has created on my machine, it contains the following:

 session-id-time  954242000  amazon.com/
 session-id  002-4135256-7625846  amazon.com/
 x-main  eKQIfwnxuF7qtmX52x6VWAXh@Ih6Uo5H  amazon.com/
 ubid-main  077-9263437-9645324  amazon.com/

It appears that Amazon stores a main user ID, an ID for each session, and the time the session started on my machine (as well as an x-main value, which could be anything).
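
To make the "name-value pair" idea concrete, the following Python sketch reads the three-column listing shown above into a dictionary. It assumes the simplified layout used in this article (name, value, site, separated by whitespace); it is not a parser for the actual on-disk format of any browser, and parse_cookie_listing is a hypothetical helper name.

```python
from typing import Dict


def parse_cookie_listing(text: str) -> Dict[str, Dict[str, str]]:
    """Group name-value pairs by the site that set them."""
    by_site: Dict[str, Dict[str, str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, value, site = fields
        by_site.setdefault(site, {})[name] = value
    return by_site


listing = """
 session-id-time  954242000  amazon.com/
 session-id  002-4135256-7625846  amazon.com/
 x-main  eKQIfwnxuF7qtmX52x6VWAXh@Ih6Uo5H  amazon.com/
 ubid-main  077-9263437-9645324  amazon.com/
"""
pairs = parse_cookie_listing(listing)
print(sorted(pairs["amazon.com/"]))
```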

The vast majority of sites store just one piece of information -- a user ID -- on your machine. But there really is no limit -- a site can store as many name-value pairs as it likes.

A name-value pair is simply a named piece of data. It is not a program, and it cannot "do" anything. A Web site can retrieve only the information that it has placed on your machine. It cannot retrieve information from other cookie files, nor any other information from your machine.
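
One way to picture this isolation is a cookie store keyed by site, sketched below in Python. The exact-match lookup is a deliberate simplification (real browsers apply more detailed domain and path matching rules), and the names jar and cookies_for are hypothetical.

```python
from typing import Dict

# site -> {name: value}; each site gets its own separate set of pairs
jar: Dict[str, Dict[str, str]] = {
    "www.goto.com/": {"UserID": "A9A3BECE0563982D"},
    "amazon.com/": {"session-id": "002-4135256-7625846"},
}


def cookies_for(jar: Dict[str, Dict[str, str]], site: str) -> Dict[str, str]:
    """Return a copy of only the pairs that `site` itself stored."""
    return dict(jar.get(site, {}))


goto_pairs = cookies_for(jar, "www.goto.com/")        # goto.com's own pair only
unknown_pairs = cookies_for(jar, "www.example.com/")  # nothing stored, nothing sent
print(goto_pairs, unknown_pairs)
```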

How Does Cookie Data Move?
As you saw in the previous section, cookie data is simply name-value pairs stored on your hard disk by a Web site. That is all cookie data is. The Web site stores the data, and later it receives it back. A Web site can only receive the data it has stored on your machine. It cannot look at any other cookie, nor anything else on your machine.

The data moves in the following manner:

  • If you type the URL of a Web site into your browser, your browser sends a request to the Web site for the page (see How Web Servers and the Internet Work for a discussion). For example, if you type the URL http://www.amazon.com into your browser, your browser will contact Amazon's server and request its home page.

  • When the browser does this, it will look on your machine for a cookie file that Amazon has set. If it finds an Amazon cookie file, your browser will send all of the name-value pairs in the file to Amazon's server along with the URL. If it finds no cookie file, it will send no cookie data.

  • Amazon's Web server receives the cookie data and the request for a page. If name-value pairs are received, Amazon can use them.

  • If no name-value pairs are received, Amazon knows that you have not visited before. The server creates a new ID for you in Amazon's database and then sends name-value pairs to your machine in the header for the Web page it sends. Your machine stores the name-value pairs on your hard disk.

  • The Web server can change name-value pairs or add new pairs whenever you visit the site and request a page.

There are other pieces of information that the server can send with the name-value pair. One of these is an expiration date. Another is a path (so that the site can associate different cookie values with different parts of the site).
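
The exchange described in the steps above can be sketched in a few lines of Python. This is a simulation under stated assumptions, not a real Web server: the cookie name UserID, the visitor-N ID format, the one-year lifetime and the function handle_request are all invented for illustration. The standard http.cookies module does the formatting and parsing.

```python
import time
from email.utils import formatdate
from http.cookies import SimpleCookie
from itertools import count
from typing import List, Optional, Tuple

ONE_YEAR = 365 * 24 * 3600
_ids = count(1)  # stand-in for IDs handed out by the site's database


def handle_request(cookie_header: Optional[str]) -> Tuple[str, List[str]]:
    """Return (visitor_id, Set-Cookie values) for one page request."""
    received = SimpleCookie(cookie_header or "")
    if "UserID" in received:
        # Returning visitor: the browser sent back the pair stored earlier.
        return received["UserID"].value, []
    # First visit: create an ID and ask the browser to store it.
    visitor_id = f"visitor-{next(_ids)}"
    reply = SimpleCookie()
    reply["UserID"] = visitor_id
    reply["UserID"]["path"] = "/"
    reply["UserID"]["expires"] = formatdate(time.time() + ONE_YEAR, usegmt=True)
    return visitor_id, [reply["UserID"].OutputString()]


# Simulated browser: the first request carries no cookie data.
first_id, set_cookie_values = handle_request(None)

# The browser stores what the server sent...
stored = SimpleCookie()
stored.load(set_cookie_values[0])

# ...and sends the name-value pairs back with the next request.
cookie_header = "; ".join(f"{m.key}={m.value}" for m in stored.values())
second_id, nothing_new = handle_request(cookie_header)
print(first_id, second_id, nothing_new)
```

On the second request the server recognizes the ID and has nothing new to send, which is the whole mechanism: the site only ever gets back what it stored.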

You have control over this process. You can set an option in your browser so that the browser informs you every time a site sends name-value pairs to you. You can then accept or deny the values.
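
A sketch of that accept-or-deny step, in Python: the store only writes a pair if a user-supplied policy function says yes. Everything here (the offer_cookie function, the trusted set, the example host names) is hypothetical and stands in for whatever prompt or setting a given browser actually provides.

```python
from typing import Callable, Dict

Jar = Dict[str, Dict[str, str]]  # site -> {name: value}


def offer_cookie(jar: Jar, site: str, name: str, value: str,
                 ask: Callable[[str, str, str], bool]) -> bool:
    """Store the pair only if the user's policy accepts it."""
    if not ask(site, name, value):
        return False  # denied: nothing is written
    jar.setdefault(site, {})[name] = value
    return True


# Hypothetical policy: accept cookies only from sites the user has approved.
trusted = {"www.example.com"}


def ask_user(site: str, name: str, value: str) -> bool:
    return site in trusted


jar: Jar = {}
accepted = offer_cookie(jar, "www.example.com", "UserID", "42", ask_user)
rejected = offer_cookie(jar, "other.example.net", "UserID", "99", ask_user)
print(accepted, rejected, jar)
```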

How Do Web Sites Use Cookies?
Cookies evolved because they solve a big problem for the people who implement Web sites. In the broadest sense, a cookie allows a site to store state information on your machine. This information lets a Web site remember what state your browser is in. An ID is one simple piece of state information -- if an ID exists on your machine, the site knows that you have visited before. The state is, "Your browser has visited the site at least one time," and the site knows your ID from that visit.

Web sites use cookies in many different ways. Here are some of the most common examples:

  • Sites can accurately determine how many people actually visit the site. It turns out that because of proxy servers, caching, concentrators and so on, the only way for a site to accurately count visitors is to set a cookie with a unique ID for each visitor. Using cookies, sites can determine:
    • How many visitors arrive
    • How many are new vs. repeat visitors
    • How often a visitor has visited

    The way the site does this is by using a database. The first time a visitor arrives, the site creates a new ID in the database and sends the ID as a cookie. The next time the user comes back, the site can increment a counter associated with that ID in the database and know how many times that visitor returns.

  • Sites can store user preferences so that the site can look different for each visitor (often referred to as customization). For example, if you visit msn.com, it offers you the ability to "change content/layout/color." It also allows you to enter your zip code and get customized weather information. When you enter your zip code, the following name-value pair gets added to MSN's cookie file:

     WEAT  CC=NC%5FRaleigh%2DDurham&REGION=  www.msn.com/

    Since I live in Raleigh, NC, this makes sense.

    Most sites seem to store preferences like this in the site's database and store nothing but an ID as a cookie, but storing the actual values in name-value pairs is another way to do it (we'll discuss later why this approach has lost favor).

  • E-commerce sites can implement things like shopping carts and "quick checkout" options. The cookie contains an ID and lets the site keep track of you as you add different things to your cart. Each item you add to your shopping cart is stored in the site's database along with your ID value. When you check out, the site knows what is in your cart by retrieving all of your selections from the database. It would be impossible to implement a convenient shopping mechanism without cookies or something like them.

In all of these examples, note that what the database stores is what you have selected on the site, the pages you have viewed there, the information you have given the site in online forms, and so on. All of that information lives in the site's database, and in most cases, a cookie containing your unique ID is all that is stored on your computer.
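
The division of labor between cookie and database can be sketched as follows in Python. Two in-memory dictionaries stand in for the site's database, and the function names and ID format are assumptions for illustration. Note that the only thing the visitor's machine would hold is the ID string.

```python
import secrets
from collections import defaultdict
from typing import Dict, List, Optional

visits: Dict[str, int] = defaultdict(int)        # ID -> number of visits
carts: Dict[str, List[str]] = defaultdict(list)  # ID -> items in the cart


def record_visit(cookie_id: Optional[str]) -> str:
    """Count a visit; issue a new ID when no cookie was sent."""
    visitor_id = cookie_id or secrets.token_hex(8)
    visits[visitor_id] += 1
    return visitor_id  # this value is what goes into the cookie


def add_to_cart(visitor_id: str, item: str) -> None:
    """Store the selection in the database, keyed by the cookie's ID."""
    carts[visitor_id].append(item)


def site_statistics() -> Dict[str, int]:
    repeat = sum(1 for n in visits.values() if n > 1)
    return {"visitors": len(visits), "new": len(visits) - repeat, "repeat": repeat}


visitor_a = record_visit(None)   # first visit, no cookie yet
record_visit(visitor_a)          # comes back with the cookie
visitor_b = record_visit(None)   # a different first-time visitor
add_to_cart(visitor_a, "book")
add_to_cart(visitor_a, "lamp")
stats = site_statistics()
print(stats, carts[visitor_a])
```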

An Example
To give you a simple example of what cookies and a database can do, we have created a simple history and statistics system for this article. This system runs on the stuff.dewsoftoverseas.com servers and lets you view your activity on the stuff.dewsoftoverseas.com site. Here's how it works:

Try the URL for the history page now: Then go view a couple of other pages on stuff.dewsoftoverseas.com and try it again. You will see that the statistics change and so does the list of files. (Also note that the stuff.dewsoftoverseas.com Registration System allows you to reset your history list whenever you like.)

Problems with Cookies
Cookies are not a perfect state mechanism, but they certainly make a lot of things possible that would be impossible otherwise. Here are several of the things that make cookies imperfect.

There are probably not any easy solutions to these problems, except asking users to register and storing everything in a central database.

When you register with the stuff.dewsoftoverseas.com registration system, the problem is solved in the following way: The site remembers your cookie value and stores it with your registration information. If you take the time to log in from any other machine (or a machine that has lost its cookie files), then the server will modify the cookie file on that machine to contain the ID associated with your registration information. You can therefore have multiple machines with the same ID value.
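
A minimal Python sketch of that registration idea follows. It is an illustration only, not the site's actual code: the accounts dictionary stands in for the registration database, password checking is omitted, and the user name and ID strings are made up.

```python
from typing import Dict, Optional

accounts: Dict[str, str] = {}  # username -> ID stored at registration


def register(username: str, cookie_id: str) -> None:
    """Remember the visitor's current cookie ID with the registration."""
    accounts[username] = cookie_id


def login(username: str, cookie_id_on_this_machine: Optional[str]) -> str:
    """Return the ID this machine's cookie should hold after login.

    Whatever ID the machine had (or none at all), it is replaced by the
    ID tied to the registration.
    """
    return accounts[username]


register("pat", "id-from-home-machine")
work_cookie = login("pat", "id-from-work-machine")  # machine with a different ID
fresh_cookie = login("pat", None)                   # machine with no cookie file
print(work_cookie, fresh_cookie)
```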

Why the Fury Around Cookies?
If you have read the article to this point, you may be wondering why there has been such an uproar in the media about cookies and Internet privacy. You have seen in this article that cookies are benign text files, and you have also seen that they provide lots of useful capabilities on the Web.

There are two things that have caused the strong reaction around cookies:
