An Entity-Relationship diagram (ERD) typically serves as the main deliverable of a conceptual data model. While newer approaches to E-R modeling have developed, the E-R approach is still cited by some professionals as “the premier model for conceptual database design”. An ERD is a logical representation of an organization’s data, and consists of three primary components:
- Entities – Major categories of data, represented by rectangles
- Attributes – Characteristics of entities, listed within entity rectangles
- Relationships – Business relationships between entities, represented by lines
An Entity is a person, place, object, event, or concept that an organization wants to maintain data on. Each entity has a unique identity that differentiates it from other entities. A point of distinction must be made between entity types and entity instances. An entity type is a collection of entities that share common properties. Entity types are also known as entity classes. An entity instance is an individual occurrence of an entity type. A data model describes an entity type only once; however, there may be numerous instances of that type within a database.
Entity type definition is important to requirements determination and structuring. Type definitions should (1) include the unique characteristics for the type, (2) clarify what instances are included and excluded in the type, (3) identify when an instance is created, deleted, or changed into another entity type, and (4) specify what history needs to be kept about entity instances. A sound understanding of the business should help the conceptual data modeler when defining entity types.
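The distinction between an entity type and its instances can be sketched in code. The following is only an illustrative sketch: the `Customer` type, its attributes, and the sample values are hypothetical and do not come from any particular model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Entity type: described once in the data model (hypothetical example)."""
    customer_id: int  # identity that distinguishes one instance from another
    name: str
    city: str


# Entity instances: individual occurrences of the Customer entity type.
customers = [
    Customer(customer_id=1, name="Acme Supply", city="Dayton"),
    Customer(customer_id=2, name="Birch Lumber", city="Akron"),
    Customer(customer_id=3, name="Acme Supply", city="Toledo"),
]

# One type, many instances; two customers may share a name,
# but each instance keeps its own identity.
distinct_ids = {c.customer_id for c in customers}
print(len(customers), len(distinct_ids))
```

The class is written once, while the list may grow to any number of instances, which mirrors how a data model describes a type only once.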
An Attribute is a characteristic of an entity that is relevant to the organization. When defining an attribute, an analyst should state why the attribute is important, what is included in the attribute’s value, the source of the value, and whether or not that value can change. Again, a sound understanding of an organization’s business should assist the analyst in compiling relevant attributes.
Candidate keys are attributes that uniquely identify each instance of an entity type. An analyst must select one candidate key as an entity type’s identifier. This identifier will then be used to uniquely identify each instance of that entity type. It is critical that the identifier’s value remain permanent over time, and never contain a null value.
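As a rough sketch of how an analyst might screen sample data for possible identifiers, the function below tests whether a set of attributes is unique and never null across a list of records. The function name, the `employees` records, and their attribute names are assumptions made for illustration. Sample data can only rule a candidate out; whether an attribute is truly unique and permanent is a business question.

```python
def uniquely_identifies(rows, attrs):
    """Return True if attrs are never null and never repeat across rows."""
    seen = set()
    for row in rows:
        value = tuple(row.get(a) for a in attrs)
        if any(v is None for v in value):
            return False  # an identifier may never contain a null value
        if value in seen:
            return False  # two instances share the same value
        seen.add(value)
    return True


# Hypothetical sample of Employee instances.
employees = [
    {"employee_id": 101, "badge_number": "B-7", "name": "Lee", "phone": None},
    {"employee_id": 102, "badge_number": "B-9", "name": "Lee", "phone": "555-0100"},
    {"employee_id": 103, "badge_number": "B-4", "name": "Kim", "phone": "555-0101"},
]

candidates = [a for a in employees[0] if uniquely_identifies(employees, [a])]
print(candidates)  # ['employee_id', 'badge_number']
```

Here `name` fails because two instances share a value and `phone` fails because it contains a null. The analyst would still need to choose one of the surviving candidates as the identifier, preferring the one whose values are least likely to change.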
Other types of attributes that analysts should be aware of when creating E-R diagrams include:
- Multivalued Attributes – attributes that may contain multiple values
- Required Attributes – attributes that must contain a value for each instance
- Optional Attributes – attributes that do not need a value for each instance
- Composite Attributes – attributes that contain meaningful component parts
- Derived Attributes – attributes whose values are computed from other data in the database
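The attribute types listed above can be illustrated with a small sketch. The `Employee` and `Address` types and all of their attributes are hypothetical; the point is only to show how each kind of attribute might look once it is written down.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Address:
    """Composite attribute: made up of meaningful component parts."""
    street: str
    city: str
    postal_code: str


@dataclass
class Employee:
    employee_id: int                                 # required (identifier)
    name: str                                        # required
    date_hired: date                                 # required
    address: Address                                 # composite
    skills: List[str] = field(default_factory=list)  # multivalued
    middle_name: Optional[str] = None                # optional

    def years_employed(self, today: date) -> int:
        """Derived attribute: computed from date_hired rather than stored."""
        years = today.year - self.date_hired.year
        if (today.month, today.day) < (self.date_hired.month, self.date_hired.day):
            years -= 1  # anniversary has not yet occurred this year
        return years


employee = Employee(
    employee_id=1,
    name="Dana Ruiz",
    date_hired=date(2015, 6, 15),
    address=Address("1 Main St", "Dayton", "45402"),
    skills=["SQL", "Python"],
)
print(employee.years_employed(date(2020, 6, 14)))  # 4
```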
Relationships link the various components in an E-R diagram together. It is usually best to think of relationships as verbs and entities as nouns, which together form a complete sentence. For example, "an Invoice Was_Sent_To a Customer". Relationships depict either some kind of event occurring or a natural link between entity instances. A relationship’s degree indicates the number of entity types that participate in a given relationship. The three most common relationship degrees are: Unary (between instances of one entity type), Binary (between instances of two entity types), and Ternary (between instances of three entity types). Below, illustrations are listed for each of the relationship degree types:
Cardinalities in a relationship identify the number of instances of one entity type that can (or, in some cases, must) be associated with each instance of another entity type. Maximum cardinality determines whether a relationship is (1) one-to-one, (2) one-to-many, or (3) many-to-many, while minimum cardinality determines whether participation is optional or mandatory.
When defining relationships, analysts should use concrete verb names (such as Assigned_To), since relationships represent actions. Definitions should also state why the relationship is important and provide examples. Additionally, analysts should explain optional participation or any maximum cardinality (where relevant). An understanding of an organization’s business will most likely help the analyst in clarifying these relationship definitions.
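To make the link between cardinalities and relationship type concrete, the sketch below records a minimum and maximum cardinality for each side of a binary relationship and derives the type from the two maximums. The `Cardinality` class, the `relationship_type` function, and the Customer and Order example are illustrative assumptions rather than a standard notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    minimum: int            # 0 = optional participation, 1 = mandatory
    maximum: Optional[int]  # None stands for "many" (no upper bound)

    @property
    def is_many(self) -> bool:
        return self.maximum is None or self.maximum > 1


def relationship_type(left: Cardinality, right: Cardinality) -> str:
    """Derive the relationship type from the two maximum cardinalities."""
    if left.is_many and right.is_many:
        return "many-to-many"
    if left.is_many or right.is_many:
        return "one-to-many"
    return "one-to-one"


# Hypothetical rule: each Order is placed by exactly one Customer,
# and each Customer may place zero or many Orders.
customers_per_order = Cardinality(minimum=1, maximum=1)
orders_per_customer = Cardinality(minimum=0, maximum=None)

print(relationship_type(customers_per_order, orders_per_customer))  # one-to-many
```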
Associative Entities are entity types that link the instances of various entity types. Associative entities contain attributes that are unique to the relationship between those entity instances. In essence, this is simply a relationship that the analyst has modeled as an entity type. In many cases, the reason for creating the associative entity is to resolve a many-to-many relationship into simpler one-to-many relationships.
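A minimal sketch of an associative entity follows, assuming a hypothetical many-to-many relationship between students and courses. The `Enrollment` type and its `term` and `grade` attributes are invented for illustration; they belong to the pairing of a student and a course rather than to either entity alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: int
    name: str


@dataclass(frozen=True)
class Course:
    course_id: str
    title: str


@dataclass(frozen=True)
class Enrollment:
    """Associative entity: one instance per Student-Course pairing."""
    student_id: int  # refers to a Student instance
    course_id: str   # refers to a Course instance
    term: str        # attribute of the relationship itself
    grade: str       # attribute of the relationship itself


enrollments = [
    Enrollment(1, "DB101", "Fall", "A"),
    Enrollment(1, "SA200", "Fall", "B"),
    Enrollment(2, "DB101", "Fall", "B"),
]

# The many-to-many link is navigated through the associative entity.
courses_for_student_1 = sorted(e.course_id for e in enrollments if e.student_id == 1)
students_in_db101 = sorted(e.student_id for e in enrollments if e.course_id == "DB101")
print(courses_for_student_1, students_in_db101)
```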
Supertypes and Subtypes serve as a way for analysts to account for entity types that share many similarities with only a few differences. A subtype is a sub-grouping of entities within a general entity type that share unique characteristics which differentiate the grouping from the original entity. A supertype is a generic entity that has a relationship with at least one subtype. There are four key business rules which govern supertypes and subtypes:
- Total Specialization – each entity instance must be a member of some subtype
- Partial Specialization – an entity instance is not required to be a member of a subtype
- Disjoint – each entity instance may be a member of a maximum of one subtype
- Overlap – an entity instance may be a member of multiple subtypes
Subtypes benefit from attribute inheritance, which means that subtypes inherit all attributes (and their values) of their supertypes.
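The sketch below illustrates attribute inheritance and a simple check of the specialization rules. The `Employee` supertype, its two subtypes, and the `check_specialization` helper are hypothetical. Subtype membership is passed in as plain sets so that overlapping membership can be represented.

```python
from dataclasses import dataclass


@dataclass
class Employee:                     # supertype
    employee_id: int
    name: str


@dataclass
class HourlyEmployee(Employee):     # subtype: inherits employee_id and name
    hourly_rate: float


@dataclass
class SalariedEmployee(Employee):   # subtype: inherits employee_id and name
    annual_salary: float


def check_specialization(memberships, total, disjoint):
    """List rule violations.

    memberships maps each supertype instance id to the set of subtype
    names that the instance belongs to.
    """
    violations = []
    for instance_id, subtypes in memberships.items():
        if total and not subtypes:
            violations.append((instance_id, "must belong to some subtype"))
        if disjoint and len(subtypes) > 1:
            violations.append((instance_id, "may belong to at most one subtype"))
    return violations


hourly = HourlyEmployee(employee_id=1, name="Pat", hourly_rate=21.5)
memberships = {1: {"Hourly"}, 2: set(), 3: {"Hourly", "Salaried"}}
print(check_specialization(memberships, total=True, disjoint=True))
```

Under partial specialization with overlap (`total=False, disjoint=False`) the same memberships produce no violations, which shows how the choice of rules changes what the model permits.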
The final components of E-R Diagrams are business rules. Business rules are used to preserve the integrity of the data model. The four primary types of business rules are:
- Entity Integrity – Each instance of an entity must contain a unique identifier (see above).
- Referential Integrity – These are constraints on the relationships between entity types (such as foreign key constraints).
- Domains – These define the possible data types and ranges of values that an attribute may assume.
- Triggers – These are constraints on data manipulation operations (such as Inserts, Updates, and Deletes).
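Although business rules are defined at the conceptual level, it can help to see how they might eventually be enforced. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customer` and `invoice` tables, their columns, and the rule that invoices may not be deleted are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- entity integrity: unique identifier
    name        TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customer(customer_id),  -- referential integrity
    amount      REAL NOT NULL CHECK (amount > 0)   -- domain: positive amounts only
);
CREATE TRIGGER invoice_no_delete       -- trigger: constraint on a Delete operation
BEFORE DELETE ON invoice
BEGIN
    SELECT RAISE(ABORT, 'invoices may not be deleted');
END;
""")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.DatabaseError:
        return True


rejected("INSERT INTO customer VALUES (1, 'Acme Supply')")
rejected("INSERT INTO invoice VALUES (10, 1, 250.0)")

results = {
    "duplicate identifier": rejected("INSERT INTO customer VALUES (1, 'Duplicate')"),
    "unknown customer": rejected("INSERT INTO invoice VALUES (11, 99, 50.0)"),
    "negative amount": rejected("INSERT INTO invoice VALUES (12, 1, -5.0)"),
    "delete invoice": rejected("DELETE FROM invoice WHERE invoice_id = 10"),
}
invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
print(results, invoice_count)
```

Each of the four statements violates one of the rule types listed above, so all four are expected to be refused while the original invoice remains in place.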
What Makes A Good E-R Diagram?
When asking what makes a “good” E-R diagram, analysts are really asking, “How well does this model support a sound overall system design that meets the business requirements?” Below are several criteria that analysts should consider when ascertaining the effectiveness of any E-R diagram:
- Completeness – Does the model support all the necessary data?
- Data Reusability – Can the data be made available to support additional information requirements?
- Stability – If the organization’s business requirements change, can the model remain intact?
- Flexibility – Can the model be readily extended to support new business requirements?
- Elegance – Does the model provide a neat and simple classification of the data?
- Communication – Does the model represent concepts that users and programmers will understand?
- Integration – Does the proposed model fit with the organization’s existing databases?
- Conflicting Objectives – Many of the above objectives will conflict with one another. For example, a model may be simple and elegant but do a poor job of capturing all of the necessary business requirements. Does the model provide the best balance among conflicting objectives?
Obviously, a sound understanding of an organization’s business will aid the analyst in trying to satisfy the above criteria. Solid business understanding is more critical to some criteria than to others. For instance, an analyst’s understanding of the organization’s business will most likely have a critical impact on her ability to create a stable and flexible model while facilitating effective communication between users and programmers. Analyzing the integration potential between current and proposed systems, by contrast, may require a greater proportion of information systems expertise.
As stated in the introduction, it can be a challenge to create an effective data model that satisfies current requirements and possesses the ability to grow with an organization over time. Depending on the complexity of the organization and model requirements, this can consume a great amount of time and resources. For this reason, many organizations scale back on this portion of systems analysis (i.e., both Requirements Determination and Structuring). However, many organizations would benefit from a greater commitment to data modeling. The next section of this paper discusses the benefits of an effective data model (as well as the consequences of improper modeling).