Join | Renew | Donate
Stay on top of all DAMA-RMC news and announcements here.
A foreign key is used in physical and sometimes logical relational data modeling schemes to represent a relationship. A foreign key may be created implicitly when a relationship is defined between two entities, depending on the database technology or data modeling tool, and whether the two entities involved have mutual dependencies.
In the example shown in the Foreign Keys figure, Registration contains two foreign keys, Student Number from Student and Course Code from Course. Foreign keys appear in the entity on the many side of the relationship, often called the child entity. Student and Course are parent entities and Registration is the child entity.
An attribute is a property that identifies, describes or measure an entity. Attributes may have domains. The physical correspondent of an attribute in an entity is a column, field, tag, or node in a table, view, document, graph or file.
In data models, attributes are generally depicted as a list within the entity rectangle, as shown in the Attributes figure, where the attributes of the entity Student include Student Number, Student First Name, Student Last Name and Student Birth Date.
January 2024 Newsletter.pdf
An arity of two is also known as binary. A binary relationship, the most common on a traditional data model diagram, involves two entities. The Binary Relationship figure, a UML class diagram, shows that both Student and Course are entities participating in a binary relationship.
An arity of three, known as ternary, is a relationship that includes three entities. An example in fact-based modeling (object -role notation) appears in the Ternary Relationship figure. Here Student can register for a particular Course in a given Semester.
A unary (also known as a recursive or self-referencing) relationship involves only one entity. A one-to-many recursive relationship describes a hierarchy, whereas a many-to-many relationship describes a network or graph. In a hierarchy, an entity instance has at most one parent (or higher-level entity). In relational modeling, child entities are on the many side of the relationship, with parent entities on the one side of the relationship. In a network, an entity instance can have more than one parent.
For example, a Course can require prerequisites. If, in order to take the Biology Workshop, one would first need to complete the Biology Lecture, the Biology Lecture is the prerequisite for the Biology Workshop. In the following relational data models, which use information engineering notation, one can model this recursive relationship as either a hierarchy or network.
The first example (Figure 32) is a hierarchy and the second (Figure 33) is a network. In the first example, the Biology Workshop requires first taking the Biology Lecture and the Chemistry Lecture. Once the Biology Lecture is chosen as the prerequisite for the Biology Workshop, the Biology Lecture cannot be the prerequisite for any other courses. The second example allows the Biology Lecture to be the prerequisite for other courses as well.
In a relationship between two entities, cardinality captures how many of one entity (entity instances) participates in the relationship with how many of the other entity. Cardinality is represented by the symbols that appear on both ends of a relationship line. Data rules are specified and enforced through cardinality. Without cardinality, the most one can say about a relationship is that two entities are connected in some way.
For cardinality, the choices are simple: zero, one or many. Each side of a relationship can have any combination of zero, one or many ('many' means could be more than 'one'). Specifying zero or one allows us to capture whether or not an entity instance is required in a relationship. Specifying one or many allows us to capture how many of a particular instance participates in a given relationship.
These cardinality symbols are illustrated in the following information engineering example of Student and Course.
The business rules are:
Each Student may attend one or many Courses.
Each Course may be attended by one or many Students.
December 2023 Newsletter.pdf
Relationship
A relationship is an association between entities (Chen, 1976). A relationship captures the high-level interactions between conceptual entities, the detailed interactions between logical entities, and the constraints between physical entities.
Relationship Aliases
The term relationship can go by other names. Relationship aliases can vary based on scheme. In relational schemes the term relationship is often used, dimensional schemes the term navigation path is often used, and in NoSQL schemes terms such as edge or link are used, for example. Relationship aliases can also vary based on level of detail. A relationship at the conceptual and logical levels is called a relationship, but a relationship at the physical level may be called by other names, such as constraint or reference, depending upon the database technology.
Graphic Representation of Relationships
Relationships are shown as lines on the data modeling diagram. This figure is an Information Engineering example.
In this example, the relationship between Student and Course captures the rule that a Student may attend Courses. The relationship between Instructor and Course captures the rule that an Instructor may teach Courses. The symbols on the line (called cardinality) capture the rules in a precise syntax. (These will be explained next week). A relationship is represented through foreign keys in a relational database and through alternative methods for NoSQL databases such as though edges or links.
Graphic Representation of Entities
In data models, entities are generally depicted as rectangles (or rectangles with rounded edges) with their names inside, such as in this figure, where there are 3 entities: Student, Course and Instructor.
Definition of Entities
Entity definitions are essential contributors to the business value of any data model. They are core Metadata. High quality definitions clarify the meaning of business vocabulary and provide rigor to the business rules governing entity relationships. They assist business and IT professionals in making intelligent and application design decisions. High quality data definitions exhibit three essential characteristics:
Clarity: The definition should be easy to read and grasp. Simple, well-written sentences without obscure acronyms or unexplained ambiguous terms such as sometimes or normally.
Accuracy: The definition is a precise and correct description of the entity. Definitions should be reviewed by experts in the relevant business areas to ensure that they are accurate.
Completeness: All of the parts of the definition are present. For example, in defining a code, examples of the code values are included. In defining an identifier, the scope of uniqueness is included in the definition.
Data Modeling is the process of discovering, analyzing, and scoping data requirements, and then representing and communicating these data requirements in a precise form called the data model. Data modeling is a critical component of data management. The modeling process requires that organizations discover and document how their data fits together. The modeling process itself designs how data fits together (Simsion, 2013). Data models depict and enable an organization to understand its data assets.
There are a number of different schemes used to represent data. The six most commonly used schemes are: Relational, Dimensional, Object-Oriented, Fact-Based, Time-Based, and NoSQL. Models of these schemes exist at three levels of detail: conceptual, logical, and physical. Each model contains a set of components. Examples of components are entities, relationships, facts, keys, and attributes. Once a model is built, it needs to be reviewed and once approved, maintained.
Data models comprise and contain Metadata essential to data consumers. Much of this Metadata uncovered during the data modeling process is essential to other data management functions. For example, definitions for data governance and lineage for data warehousing and analytics.
Develop a Roadmap
If an enterprise were developed from scratch (free from dependencies on existing processes), and optimal architecture would be based solely on the data required to run the enterprise, priorities would be set by business strategy, and decisions could be made unencumbered by the past. Very few organizations are ever in this state. Even in an ideal situation, data dependencies would quickly arise and need to be managed. A roadmap provides a means to manage these dependencies and make forward-looking decisions. A roadmap helps an organization see trade-offs and formulate a pragmatic plan, aligned with business needs and opportunities, external requirements, and available resources.
A roadmap for Enterprise Data Architecture describes the architecture's 3 - 5 year development path. Together with the business requirements, consideration of actual conditions, and technical assessments, the roadmap describes how the target architecture will become reality. The Enterprise Data Architecture roadmap must be integrated into an overall enterprise architecture roadmap that includes high-level milestones, resources needed, and costs estimations, divided in business capability work streams. The roadmap should be guided by a data management maturity assessment.
Most business capabilities require data as an input; others also produce data on which other business capabilities are dependent. The enterprise architecture and the Enterprise Data Architecture can be formed coherently by resolving this data flow in a chain of dependencies between business capabilities.
A business-data-driven roadmap starts with the business capabilities that are most independent (i.e., have the least dependency from other activities), and ends with those who are most dependent on others. Dealing with each business capability in sequence will follow an overall business data origination order. This figure shows an example chain of dependency, with the lowest dependency at the top. Product Management and Customer Management do not depend on anything else and thus constitute Master Data. The highest dependency items are on the bottom where Customer's Invoice Management depends on Customer Management and Sales Order Management, which in turn depends on two others.
Therefore, the roadmap would ideally advise starting at Product Management and Customer Management capabilities and then resolve each dependency in steps from top to bottom.
Featured articles coming soon!
About us| Events | Learn | Join DAMA-RMC| Contacts
© DAMA-RMC 2022