Data Modeling – A Comprehensive Overview In 5 Points

Introduction

Before a house is built, its blueprint is drawn. Before a hotel opens, the owner plans the parking space around it for the convenience of the guests. A data model plays the same role for data: it is the plan made before anything is built. Without a proper data model, small details may be left out, and those omissions can lead to bigger problems in the future.

  1. What is data modeling?
  2. Types of Data Models
  3. Data Modeling Process
  4. Types of Data Modeling
  5. Benefits Of Data Modeling

1) What is data modeling?

Data modeling is a technique for representing the nature of the data that an application works with. The resulting data model is built to the client’s requirements. It is not completed in a single day. It is a continuous, multi-step process that begins only after the client’s requirements have been analyzed and understood.

It is also called database modeling, since each data model is eventually implemented in a database. The model also serves as a communication tool between the business people who need it and the technical experts who build it to their requirements. It shows the details of the data, which may be numerous, clearly.

Let us continue the house example from the introduction. The owner of the land commissions a house. The architect draws up the blueprint (the data model) and hands it to the engineer (the technical expert who constructs the building). The finished building corresponds to the final output, which in data modeling is the data warehouse.

After the model is prepared, it must be reviewed with the client to confirm that their requirements are met. That completes the modeling of the house. The model is thus a representation of a real-world object. Such models are long-term in nature: developers and modelers may change, but the company may keep using the same model for a long time.

Data modeling involves normalization, a process first proposed by Edgar F. Codd (the computer scientist who introduced the relational model). Normalization structures the data so that each table focuses on a single topic or theme, which eliminates redundancy and avoids the irregularities that redundancy causes. A client who opts for data modeling wants, first of all, data that are secure and reliable, and needs to be able to trust the technical expert who builds the model. Data integrity is therefore a foremost priority when selecting a model. Two rules are important in maintaining data integrity:

Entity Integrity – every row within a single entity or table can be reliably identified. A primary key, whose value is unique and never null, is essential to ensure this.

Referential Integrity – references between two entities or tables stay reliable. A foreign key is essential here: its value must match an existing primary key in the referenced table.
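
Normalization and the two integrity rules can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database. The table names (customer, customer_order), the column names and the sample rows are assumptions made up for this illustration. Each customer's name is stored once in its own table, the primary key rejects a duplicate row, and the foreign key rejects an order that points to a customer who does not exist.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two normalized tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Customer details live in one place only (normalization).
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    # Orders refer to customers through a foreign key instead of repeating the name.
    conn.execute(
        "CREATE TABLE customer_order ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
        " amount REAL NOT NULL)"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
    conn.execute("INSERT INTO customer_order VALUES (100, 1, 250.0)")
    return conn

def try_insert(conn, sql):
    """Return True if the insert succeeds, False if an integrity rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_demo_db()
# Entity integrity: a second customer with primary key 1 is rejected.
duplicate_key_ok = try_insert(conn, "INSERT INTO customer VALUES (1, 'Ravi')")
# Referential integrity: customer 99 does not exist, so this order is rejected.
orphan_order_ok = try_insert(conn, "INSERT INTO customer_order VALUES (101, 99, 80.0)")
# An order for an existing customer is accepted.
valid_order_ok = try_insert(conn, "INSERT INTO customer_order VALUES (102, 1, 80.0)")
print(duplicate_key_ok, orphan_order_ok, valid_order_ok)  # False False True
```

Other database products enforce these rules in much the same way, although the exact syntax and defaults differ from product to product.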

2) Types of Data Models:

There are three levels of data model, each meant for a different audience. They are discussed below.

  • Conceptual Data Model – It gives clients the basic idea of the final product (the data warehouse) without the technical details. There is little need to consider the actual data at this level. Entities, relationships and attributes are defined here. This model is generally aimed at business users.
  • Logical Data Model – This model describes the data in detail, in terms of tables, columns or object-oriented classes. It extends the conceptual data model: entities, relationships and attributes are defined along with primary keys and foreign keys. Both the logical and the conceptual data models are independent of the database that will be used to implement them. Like the conceptual data model, it is also aimed at business users.
  • Physical Data Model – This model answers the question “How will the actual data be stored?”. It deals with physical storage details such as partitions, CPU and storage space, and with database objects such as views and procedures. It is built differently for each database, as requirements dictate. The correct data type must always be chosen, since a wrong data type may waste memory. The model is then transformed into SQL DDL statements, which are used to create the tables and relationships. Primary keys, foreign keys, column names and data types are all specified in this model. The target users are generally technical experts rather than business users.
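
The step from a model to SQL DDL statements can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular modeling tool works: the Column and Table classes, the to_ddl helper, and the hotel and parking_slot tables (borrowed from the hotel example in the introduction) are all hypothetical. The generated statements are run against an in-memory SQLite database to show that they really create the tables.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    data_type: str            # physical type, e.g. INTEGER or TEXT
    primary_key: bool = False
    references: str = ""      # "table(column)" for a foreign key, else empty

@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)

def to_ddl(table):
    """Render one table of the model as a CREATE TABLE statement."""
    parts = []
    for col in table.columns:
        text = f"{col.name} {col.data_type}"
        if col.primary_key:
            text += " PRIMARY KEY"
        if col.references:
            text += f" REFERENCES {col.references}"
        parts.append(text)
    return f"CREATE TABLE {table.name} ({', '.join(parts)})"

# A tiny model: a hotel and the parking slots that belong to it.
hotel = Table("hotel", [
    Column("hotel_id", "INTEGER", primary_key=True),
    Column("name", "TEXT"),
])
parking_slot = Table("parking_slot", [
    Column("slot_id", "INTEGER", primary_key=True),
    Column("hotel_id", "INTEGER", references="hotel(hotel_id)"),
])

ddl_statements = [to_ddl(hotel), to_ddl(parking_slot)]

# Run the generated DDL to create the actual tables.
conn = sqlite3.connect(":memory:")
for statement in ddl_statements:
    conn.execute(statement)

created = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(ddl_statements[1])
print(created)  # ['hotel', 'parking_slot']
```

A real physical model would also carry the database-specific choices mentioned above, such as partitions, which this sketch leaves out.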

3) Data Modeling Process:

Data modeling, as stated above, consists of many steps, but it can be summed up in the following 5 steps:

Step 1 – Gain a basic understanding of the application and how it works.

Step 2 – If required, model the queries that the application needs.

Step 3 – Design the tables.

Step 4 – Determine the primary keys and make any necessary changes.

Step 5 – Finally, choose the right data types and use them effectively.

4) Types of Data Modeling:

ER Diagrams – A data model deals with multiple real-world objects, so the relationships between them must be captured. An entity-relationship (ER) diagram serves this purpose. It is a data modeling technique that shows the relationships between entities and describes the structure of the database in diagram form. Drawing it is the first step once the experts have gathered the client’s requirements. The ER diagram also describes how each component relates to the others.

The name combines two terms: entity and relationship. An entity can be a source or destination of data. It has an independent existence and can represent either an animate or an inanimate object. An entity set is a collection of similar entities, and an attribute describes a detail of an entity.

Relationships, represented by diamonds, define the associations among entities. There are several types: one-to-one, one-to-many, many-to-one and many-to-many.

The connections between entities reflect the relationships among them, and those relationships in turn reflect the business rules.
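
As an illustration of how a relationship from an ER diagram ends up in a database, the sketch below models a many-to-many relationship. The guest, room and booking tables and their rows are hypothetical. The business rule assumed here is that one guest can book many rooms and one room can be booked by many guests, so a third table (often called a junction table) holds one row per guest-room pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE guest (guest_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE room  (room_id  INTEGER PRIMARY KEY, label TEXT NOT NULL);
-- Junction table: each row links one guest to one room.
CREATE TABLE booking (
    guest_id INTEGER NOT NULL REFERENCES guest(guest_id),
    room_id  INTEGER NOT NULL REFERENCES room(room_id),
    PRIMARY KEY (guest_id, room_id)
);
INSERT INTO guest VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO room  VALUES (10, 'Deluxe'), (11, 'Suite');
INSERT INTO booking VALUES (1, 10), (1, 11), (2, 10);
""")

# One guest, many rooms.
rooms_for_asha = [row[0] for row in conn.execute(
    "SELECT room.label FROM booking"
    " JOIN room ON room.room_id = booking.room_id"
    " WHERE booking.guest_id = 1 ORDER BY room.label")]

# One room, many guests.
guests_in_deluxe = [row[0] for row in conn.execute(
    "SELECT guest.name FROM booking"
    " JOIN guest ON guest.guest_id = booking.guest_id"
    " WHERE booking.room_id = 10 ORDER BY guest.name")]

print(rooms_for_asha)    # ['Deluxe', 'Suite']
print(guests_in_deluxe)  # ['Asha', 'Ravi']
```

A one-to-many relationship needs no junction table: a single foreign key column on the "many" side is enough.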

Generic Data Model – There are two kinds of language: natural languages, which evolved naturally among human beings, and artificial languages, which are designed and coded into a computer. A generic data model behaves like a natural language, and it must contain generic entity types.

Semantic Data Model – This model describes the meaning of the data. It is a conceptual model, and it can be used for planning data resources, building shareable databases, and similar tasks.

5) Benefits Of Data Modeling

A data model provides a road map for the future: it gives a vision of the end result and helps reveal any weaknesses in the plan. Without a model, some entities may be missed while creating the data warehouse (the final output), which can cause heavy losses, especially in big companies.

Conclusion

In conclusion, data modeling is an unavoidable part of daily life and of operating any business. It needs special attention from both IT and business stakeholders, as it benefits both.

