What is data modeling?
Data modeling is an essential first step in the design of any database. The database designer or database administrator creates models to represent the data objects to be stored, their attributes, and the relationships between them. The diagrams or documents created act as blueprints for the final result and keeps the project on track, enhances visibility to all stakeholders, ensures consistency, and helps the developer avoid errors that might be difficult or costly to correct at later stages.
What’s the purpose of physical data modeling?
There are three stages of data modeling: conceptual, logical, and physical. Conceptual data models focus on the general structure of the system, the entities to be included, the business requirements of the database to be built, and the types of data to be stored. Logical models require more care and at this point the entities are fleshed out with detailed attributes and relationships. No attention is paid to the technical aspects of the database in either of these stages of data modeling.
Physical models are all about building the database. Primary, secondary, and foreign keys are worked out, along with restraints and the actual tables and columns to be used. Database management and optimization come into play, as does data normalization. Hardware is now also considered, along with the specific Database Management System (DBMS), such as MySQL, MongoDB, or Azure SQL Database, that will best suit the project. The database schema is fully worked out and any necessary changes are made to the model created at the logical stage.
What are the steps of physical data modeling?
Because the physical data model is the final stage of modeling before building the database, it can require some steps to transform the more abstract models created in earlier stages. Conceptual and logical models are concerned with entities, attributes, and relationships. Physical data models are concerned with tables, columns, and keys.
The physical data model also has to take into account the physical needs and constraints of the system. These are dictated by the DBMS, the available hardware, or speed requirements. Changes will be made to improve overall performance, usability, and perhaps reduce both the size of the database and how rapidly information can be accessed.
The first step when moving from the logical to the physical model is to convert entity types into tables. Once the tables exist, the attributes can be converted into columns within the tables. Next, the relationships determine which foreign keys link these tables.
While the logical model might have spent time on data normalization, the physical model might take a step back from this and indulge in denormalization. This reverses the process of removing redundancy and introducing the duplication of data to speed up read and write access in real-world databases. The denormalization might be a built-in part of the DBMS, or it might be a selective process carried out by the administrator based on knowledge of the project and the types of data being stored.
These kinds of trade-offs and compromises, where the abstract models of the earlier stages must confront the constraints of the actual database, are an integral step of the physical data model. Only after the database has gone through this final, practical filter will it be ready to be implemented.
The advantages of physical data modeling
One of the main advantages of creating a physical data model is that the designer greatly reduces the risk of creating a flawed database design. Even an experienced database designer can make mistakes when dealing with a new project. Going through all stages of data modeling provides significant protection against errors, oversights, and costly revisions.
Even at the physical model stage, there will still be stakeholders interested in making sure that the database does exactly what they need. Being able to show them a detailed blueprint for the database not only reassures them that the design is going according to plan, but also gives them the opportunity to point out missed opportunities or potential improvements.
Each of the stages of data modeling builds upon the last to create a fully worked out design that ultimately can be transformed into a real-world database schema. The physical data model is the final, vital stage in the design process, and it is the model created at this point that will be closest to the actual database.
Example of a data model
Let’s look at a simple example of a physical database model. We’ll use our own Gleek diagramming tool to create the model. Gleek has a unique syntax that enables designers to rapidly create diagrams without using a mouse or fiddling with drawing tools. It’s especially suited to creating informal conceptual models, but it can just as easily be used for physical data models.
To keep things simple, we’ll only model a small part of a larger database designed to manage appointments in a medical clinic. The database needs to keep track of doctors, patients, and appointments. The Gleek templates already include an example of an entity-relationship diagram showing this system, so you might like to check that out first.
First, we need a table to represent the patient. To keep things simple, we’ll only include the bare minimum of information needed to differentiate patients in the clinic. We need an id number, which acts as the primary key, first and last names, phone number, and email address.
Next, our clinic needs doctors. Again, we’ll strip back the kind of data stored so that we identify doctors by only their id, name, and phone number.
We’ll keep things even simpler for the room in the clinic where the appointment will take place. In real life, the database would need a more complex time tracking system to avoid double booking and generate a calendar of availability. We’ll stick to just room id and a single Y/N for availability.
Now comes the most important element – the appointment itself. This ties together all the previous elements and pulls in their id numbers as foreign keys. It also includes an id number for the appointment and a time for it to start.
Finally, we tie it all together with the relationships between the tables. We’ll use Crow’s Foot Notation for this. Both a patient and a doctor can have none or many appointments, while an appointment must be associated with one and only one room.
That gives us a simple example of an appointment system in diagram form. If you want to get started with data modeling, you might also like to check out our article on data modeling techniques.