Modeling and Designing Databases - The Entity Relationship Model
(Page 3 of 4 )
At a basic level, databases store information about distinct objects, or entities, and the associations, or relationships, between these entities. For example, a university database might store information about students, courses, and enrollment. A student and a course are entities, while an enrollment is a relationship between a student and a course. Similarly, an inventory and sales database might store information about products, customers, and sales. A product and a customer are entities, while a sale is a relationship between a customer and a product.
A popular approach to conceptual design uses the Entity Relationship (ER) model, which helps transform the requirements into a formal description of the entities and relationships that appear in the database. We’ll start by looking at how the Entity Relationship modeling process itself works, then apply it in “Entity Relationship Modeling Examples” for three sample databases.
Representing Entities To help visualize the design, the Entity Relationship Modeling approach involves drawing an Entity Relationship (ER) diagram. In the ER diagram, an entity set is represented by a rectangle containing the entity name. For our sales database example, the product and customer entity sets would be shown as in Figure 4-1.
We typically use the database to store certain characteristics, or attributes, of the entities. In a sales database, we could store the name, email address, postal address, and telephone number for each customer. In a more elaborate customer relationship managment (CRM) application, we could also store the names of the customer’s spouse and children, the languages the customer speaks, the customer’s history of interaction with our company, and so on. Attributes describe the entity they belong to.
An attribute may be formed from smaller parts; for example, a postal address is composed of a street number, city, ZIP code, and country. We classify attributes as composite if they’re composed of smaller parts in this way, and as simple otherwise.
Some attributes can have multiple values for a given entity. For example, a customer could provide several telephone numbers, so the telephone number attribute is multivalued.
Attributes help distinguish one entity from other entities of the same type. We could use the name attribute to distinguish between customers, but this could be an inadequate solution because several customers could have identical names. To be able to tell them apart, we need an attribute (or a minimal combination of attributes) guaranteed to be unique to each individual customer. The identifying attribute or attributes form a key.
In our example, we can assume that no two customers have the same email address, so the email address can be the key. However, we need to think carefully about the implications of our choices. For example, if we decide to identify customers by their email address, it would be hard to allow a customer to have multiple email addresses. Any applications we build to use this database might treat each email address as a separate person, and it might be hard to adapt everything to allow people to have multiple email addresses. Using the email address as the key also means that every customer must have an email address; otherwise, we wouldn’t be able to distinguish between customers who don’t have one.
Looking at the other attributes for one that can serve as an alternative key, we see that while it’s possible that two customers would have the same telephone number (and so we cannot use the telephone number as a key), it’s likely that people who have the same telephone number never have the same name, so we can use the combination of the telephone number and the name as a composite key.
Clearly, there may be several possible keys that could be used to identify an entity; we choose one of the alternative, or candidate, keys to be our main, or primary, key. You usually make this choice based on how confident you are that the attribute will be non-empty and unique for each individual entity, and on how small the key is (shorter keys are faster to maintain and use).
In the ER diagram, attributes are represented as labeled ovals and are connected to their owning entity, as shown in Figure 4-2 . Attributes comprising the primary key are shown underlined. The parts of any composite attributes are drawn connected to the oval of the composite attribute, and multivalued attributes are shown as double-lined ovals.
Attribute values are chosen from a domain of legal values; for example, we could specify that a customer’s given names and surname attributes can each be a string of up to 100 characters, while a telephone number can be a string of up to 40 characters. Similarly, a product price could be a positive rational number.
Attributes can be empty; for example, some customers may not provide their telephone numbers. The primary key of an entity (including the components of a multiattribute primary key) must never be unknown (technically, it must be NOT NULL); for example,

Figure 4-2. The ER diagram representation of the customer entity
if it’s possible for a customer to not provide an email address, we cannot use the email address as the key.
You should think carefully when classifying an attribute as multivalued: are all the values equivalent, or do they in fact represent different things? For example, when listing multiple telephone numbers for a customer, would they be more usefully labeled separately as the customer’s business phone number, home phone number, cell phone number, and so on?
Let’s look at another example. The sales database requirements may specify that a product has a name and a price. We can see that the product is an entity because it’s a distinct object. However, the product’s name and price aren’t distinct objects; they’re attributes that describe the product entity. Note that if we want to have different prices for different markets, then the price is no longer just related to the product entity, and we’d need to model it differently.
For some applications, no combination of attributes can uniquely identify an entity (or it would be too unwieldy to use a large composite key), so we create an artificial attribute that’s defined to be unique and can therefore be used as a key: student numbers, Social Security numbers, driver’s license numbers, and library card numbers are examples of unique attributes created for various applications. In our inventory and sales application, it’s possible that we could stock different products with the same name and price. For example, we could sell two models of “Four-port USB 2.0 Hub,” both at $4.95 each. To distinguish between products, we can assign a unique product ID number to

Figure 4-3. The ER diagram representation of the product entity
each item we stock; this would be the primary key. Each product entity would have name, price, and product ID attributes. This is shown in the ER diagram in Figure 4-3 .
Next: Representing Relationships >>
More Database Code Articles
More By O'Reilly Media
|
This article is excerpted from chapter 4 of the book Learning MySQL, written by Seyed M.M. "Saied" Tahaghoghi and Hugh E. Williams (O'Reilly, 2006; ISBN: 0596008643). Check it out today at your favorite bookstore. Buy this book now.
|
|