Clusters and Other Data Structures for Oracle - Data Design
Tables and columns present a logical view of the data in a relational database. The flexibility of a relational database gives you many options for grouping the individual pieces of data, represented by the columns, into a set of tables. To use Oracle most effectively, you should understand and follow some firmly established principles of database design.
The topic of database design is vast and deep: we won’t even pretend to offer more than a cursory overview. For more information, we recommend the book Oracle Design by Dave Ensor and Ian Stevenson (O’Reilly; see Appendix B for details).
When E. F. Codd created the concept of a relational database in the 1960s, he also began work on the concept of normalized data design. The theory behind normalized data design is pretty straightforward: a table should contain only the information that is directly related to the key value of the table. The process of assembling these logical units of information is called normalization of the database design.
Normalized Forms
In fact, there is more than one type of normalization. Each step in the normalization process ends with a specific result called a normalized form. There are five standard normalized forms, which are referred to as first normal form (1NF), second normal form (2NF), and so on. The normalization process that we describe briefly in this section results in third normal form (3NF), the most common type of normalization.
Explaining the complete concepts that lie behind the different normal forms is beyond the scope of this chapter and book.
The concept of normalized table design was tailored to the capabilities of the relational database. Because you could join data from different tables together in a query, there was no need to keep all the information associated with a particular object together in a single record. You could decompose the information into associated units and simply join the appropriate units together when you needed information that crossed table boundaries.
There are many different methodologies for normalizing data. The following is one example:
Identify the objects your application needs to know (the entities). Examples of entities, as shown in Figure 4-3, include employees, locations, and jobs.
Identify the individual pieces of data, referred to by data modelers as attributes, for these entities. In Figure 4-3, employee name and salary are attributes. Typically, entities correspond to tables and attributes correspond to columns.
As a potential last step in the process, identify relationships between the entities based on your business. These relationships are implemented in the database schema through the use of a column, or combination of columns, known as a foreign key. For example, the primary key of the DEPARTMENT NUMBER table would be a foreign key column in the EMPLOYEE NAME table, used to identify the department in which an employee works. A foreign key is a type of constraint; constraints are discussed later in this chapter.
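The three steps above can be sketched in code. The following is a minimal illustration, not Oracle DDL: it uses Python's built-in sqlite3 module as a stand-in for an Oracle database, and the table names, column names, and sample rows are assumptions chosen to echo the entities and attributes mentioned in the text.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Entities become tables, attributes become columns, and the relationship
# between them becomes a foreign key. All names are hypothetical.
conn.executescript("""
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    employee_name     TEXT NOT NULL,
    salary            NUMERIC,
    department_number INTEGER NOT NULL
        REFERENCES department (department_number)
);
""")

conn.execute("INSERT INTO department VALUES (10, 'Accounting')")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Smith", 50000, 10), (2, "Jones", 62000, 10)],
)

# A join reassembles information that crosses table boundaries.
rows = conn.execute("""
    SELECT e.employee_name, d.department_name
    FROM employee e
    JOIN department d ON d.department_number = e.department_number
    ORDER BY e.employee_id
""").fetchall()
print(rows)
```

The department name is stored once, in the department table, and each employee row carries only the department number that points to it.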
Normalization provides benefits by avoiding storage of redundant data. Storing the department in every employee record not only would waste space but also would lead to a data maintenance issue. If the department name changed, you would have to update every employee record, even though no employees had actually changed departments. By normalizing the department data into a table and simply pointing to the appropriate row from the employee rows, you avoid both duplication of data and this type of problem.
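The maintenance problem described above can be made concrete by counting the rows touched when a department is renamed. This is a rough sketch that again uses Python's sqlite3 module in place of Oracle; the employee_flat table, the other table and column names, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an Oracle database (assumption)
conn.executescript("""
-- Denormalized: the department name is repeated in every employee row.
CREATE TABLE employee_flat (employee_name TEXT, department_name TEXT);

-- Normalized: the department name is stored once and referenced by number.
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT
);
CREATE TABLE employee (
    employee_name     TEXT,
    department_number INTEGER REFERENCES department (department_number)
);
""")

names = ["Smith", "Jones", "Lee"]  # hypothetical employees
conn.executemany(
    "INSERT INTO employee_flat VALUES (?, 'Accounting')", [(n,) for n in names]
)
conn.execute("INSERT INTO department VALUES (10, 'Accounting')")
conn.executemany("INSERT INTO employee VALUES (?, 10)", [(n,) for n in names])

# Rename the department in each design and count the rows that had to change.
flat_updated = conn.execute(
    "UPDATE employee_flat SET department_name = 'Finance' "
    "WHERE department_name = 'Accounting'"
).rowcount
norm_updated = conn.execute(
    "UPDATE department SET department_name = 'Finance' "
    "WHERE department_number = 10"
).rowcount
print(flat_updated, norm_updated)
```

In the denormalized design every employee row is rewritten, while the normalized design changes a single department row regardless of how many employees work there.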
Normalization also reduces the amount of data that any one row in a table contains. The less data in a row, the less I/O is needed to retrieve it, which helps to keep I/O from becoming a performance bottleneck. In addition, the smaller the row, the more rows are retrieved per data block, which increases the likelihood that more than one desired row will be retrieved in a single I/O operation. And the smaller the row, the more rows will be kept in Oracle’s system buffers, which also increases the likelihood that a row will be available in memory when it’s needed, thereby avoiding the need for any disk I/O at all.
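A back-of-the-envelope calculation shows the effect of row size on the number of blocks read. The figures below are illustrative assumptions only: an 8,192-byte block, average row sizes of 400 and 100 bytes, and no allowance for block headers or free space, so the real numbers in an Oracle database would differ.

```python
BLOCK_SIZE = 8192  # assumed block size in bytes; ignores block overhead


def rows_per_block(row_bytes, block_bytes=BLOCK_SIZE):
    """Whole rows that fit in one block under the simplifying assumptions."""
    return block_bytes // row_bytes


def blocks_needed(row_count, row_bytes, block_bytes=BLOCK_SIZE):
    """Blocks required to hold row_count rows, rounded up."""
    per_block = rows_per_block(row_bytes, block_bytes)
    return -(-row_count // per_block)  # ceiling division


# Hypothetical table of 10,000 rows, before and after trimming the row size.
wide = blocks_needed(10_000, 400)
narrow = blocks_needed(10_000, 100)
print(rows_per_block(400), wide)
print(rows_per_block(100), narrow)
```

Under these assumptions, reducing the row to a quarter of its size cuts the blocks needed for a full scan to roughly a quarter, and the same buffer space holds about four times as many rows.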
Finally, the process of normalization includes the creation of foreign key relationships and other data constraints. These relationships build a level of data integrity directly into your database design.
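As a small illustration of integrity built into the design, the sketch below tries to insert an employee who refers to a department that does not exist. It uses Python's sqlite3 module rather than Oracle, and the table names, column names, and values are hypothetical; Oracle would report a different error, but the principle of the database rejecting the row is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an Oracle database (assumption)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
CREATE TABLE employee (
    employee_name     TEXT NOT NULL,
    department_number INTEGER NOT NULL
        REFERENCES department (department_number)
);
""")
conn.execute("INSERT INTO department VALUES (10, 'Accounting')")

# Department 99 does not exist, so the foreign key constraint rejects the row.
try:
    conn.execute("INSERT INTO employee VALUES ('Smith', 99)")
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("insert rejected:", exc)

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(employee_count)
```

Because the check lives in the schema, every application that writes to the table gets the same protection without having to code it separately.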
Figure 4-3 shows a simple list of attributes grouped into entities and linked by a foreign key relationship.

Figure 4-3. The normalization process
However, there is an even more important reason to go through the process of designing a normalized database. You can benefit from normalization because of the planning process that normalizing a data design entails. By really thinking about the way the intended applications use data, you get a much clearer picture of the needs the system is designed to serve. This understanding leads to a much more focused database and application.
Gaining a deep understanding of the way your data will be used also helps with your other design tasks. For instance, once you’ve completed an optimal logical database design, you must go back and consider what indexes you should add to improve the anticipated performance of the database and whether you should designate any tables as part of a cluster or hash cluster.
Since adding these types of performance-enhancing data structures doesn’t affect the logical representation of the database, you can always make these types of modifications later when you see the way an application uses the database in test mode or in production.
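The point that such structures leave the logical view unchanged can be checked directly: the same query returns the same rows before and after an index is added. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with hypothetical table, index, and column names; SQLite has no equivalent of Oracle clusters or hash clusters, so only an index is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an Oracle database (assumption)
conn.execute(
    "CREATE TABLE employee (employee_name TEXT, department_number INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Smith", 10), ("Jones", 20), ("Lee", 10)],  # hypothetical rows
)

query = (
    "SELECT employee_name FROM employee "
    "WHERE department_number = 10 ORDER BY employee_name"
)

before = conn.execute(query).fetchall()

# Add the index after the fact; the query text does not change.
conn.execute("CREATE INDEX employee_dept_idx ON employee (department_number)")

after = conn.execute(query).fetchall()
print(before == after)
```

Only the access path available to the database changes, which is why this kind of tuning can safely wait until the application's real usage is visible.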
This article is excerpted from chapter four of the book Oracle Essentials, Fourth Edition: Oracle Database 11g, written by Rick Greenwald, Robert Stackowiak, and Jonathan Stern (O'Reilly, 2007; ISBN: 0596514549).