When developers design a new enterprise database, they often treat the three normal forms as a silver bullet and assume that normalization is the only way to design. With this mindset, they sometimes run into hurdles as the project moves forward. Normalization rules are important, but if you treat them as set in stone, you are likely to run into trouble. Here, we discuss some rules from experts that will help with database design.
Database design rules by experts
Rule #1: Consider the nature of your application
Is your application OLAP or OLTP? When you start designing a database, the first thing to analyze is the nature of your application: is it analytical or transactional? Many developers apply normalization by default without thinking about the nature of the application, and later run into performance and customization issues. The two types of application are as follows:
- Transactional: In this kind of application, the end-user is mostly interested in CRUD: creating, reading, updating, and deleting data records. Such a database is known as an OLTP database.
- Analytical: In this kind of application, the end-user is mostly interested in analysis, reporting, forecasting, and so on. Such databases see fewer inserts and updates. Their primary purpose is to analyze data quickly and effectively. These databases are called OLAP databases.
In other words, if inserts, updates, and deletes dominate your database operations, consider a normalized table design; otherwise, consider a flat, denormalized data structure.
Rule #2: Breaking data into logical pieces
This rule follows from the first normal form. A sign that it is being violated is that your queries use many string-parsing functions such as CHARINDEX and SUBSTRING. In such cases, the rule needs to be applied. For example, in a table of students, you may want to query by part of the student name, such as the last name. If the full name is stored in a single field, every such query has to parse it. The better approach is to break the Student Name field into logical pieces, such as first name and last name, so that queries stay clean and efficient.
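As a minimal sketch of this rule, the snippet below compares the two designs using Python's built-in `sqlite3` module. The table names, column names, and sample students are assumptions made up for illustration, and SQLite's `instr` and `substr` stand in for the string-parsing functions your own DBMS provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: one field forces string parsing in every query (hypothetical table).
    CREATE TABLE student_flat (student_name TEXT);
    INSERT INTO student_flat VALUES ('Asha Rao'), ('Ben Cole');

    -- After: the name is split into logical pieces.
    CREATE TABLE student (first_name TEXT, last_name TEXT);
    INSERT INTO student VALUES ('Asha', 'Rao'), ('Ben', 'Cole');
""")

# Parsing approach: find the space, then cut out everything after it.
parsed = conn.execute(
    "SELECT student_name FROM student_flat "
    "WHERE substr(student_name, instr(student_name, ' ') + 1) = 'Rao'"
).fetchall()

# Decomposed approach: a plain comparison on a dedicated column.
clean = conn.execute(
    "SELECT first_name, last_name FROM student WHERE last_name = 'Rao'"
).fetchall()

print(parsed)
print(clean)
```

Both queries find the same student, but only the second one can be written without parsing, and only a dedicated column can be indexed directly.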
Rule #3: Do not overdo rule #2
Developers sometimes tend to overdo things: once told to do something a certain way, they keep doing it everywhere. Overdoing the decomposition described in rule #2 can have unwanted consequences. Before you decompose a field, pause and ask yourself whether it is really needed. As discussed, the decomposition should be logical.
For example, take the phone number field in the student table mentioned above. You will rarely need to work with the international dialing code separately unless the application demands it. So it may be wise to leave the phone number as a single field, because splitting it may only complicate the table. If you are in doubt about such decisions, you can seek assistance from the DBMS experts at RemoteDBA.com.
Rule #4: Treating non-uniform duplicate data as an enemy
The primary concern with duplicate data is not the disk space it takes up but the confusion it creates. For example, the student table may contain entries such as “5th Standard” and “Fifth standard,” which mean the same thing. Such non-uniform values are usually the result of manual data entry and poor validation. If you generate a report using this field as a criterion, these entries will show up as different entities, which confuses end-users. The usual solution is to move such data into a separate master table and refer to it using foreign keys.
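The master-table idea can be sketched as follows, again with Python's `sqlite3` module. The `standard` and `student` tables, their columns, and the sample rows are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
    -- Master table: each standard is spelled exactly once.
    CREATE TABLE standard (
        standard_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    -- Students refer to the master row instead of typing the name.
    CREATE TABLE student (
        student_id  INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        standard_id INTEGER NOT NULL REFERENCES standard(standard_id)
    );
    INSERT INTO standard VALUES (5, '5th Standard');
    INSERT INTO student VALUES (1, 'Asha', 5), (2, 'Ben', 5);
""")

# The report groups on one canonical spelling.
report = conn.execute(
    "SELECT s.name, COUNT(*) FROM student st "
    "JOIN standard s ON s.standard_id = st.standard_id "
    "GROUP BY s.name"
).fetchall()

# A row pointing at a standard that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (3, 'Cara', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(report, rejected)
```

Because the free-text spelling lives in one place, a variant such as “Fifth standard” cannot creep into the student rows.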
Rule #5: Watch for data separated by separators
The second rule of the first normal form says to avoid repeating groups. You may see fields stuffed with too much data, for example several values packed into one column with separators. These are called repeating groups. Queries that manipulate such data become complex, and their performance is doubtful. Columns stuffed with separator-delimited data need special attention. The best approach is to move those values to another table and link them using keys for easier management. In many cases this means creating a separate table and a many-to-many relationship with the primary table.
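One possible way to unpack such a column is sketched below with Python's `sqlite3` module. The semicolon-separated subject list, the table names, and the sample rows are all assumptions for illustration. The values move into a `subject` table, and a `student_subject` link table models the many-to-many relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Link table: one row per (student, subject) pair.
    CREATE TABLE student_subject (
        student_id INTEGER REFERENCES student(student_id),
        subject_id INTEGER REFERENCES subject(subject_id),
        PRIMARY KEY (student_id, subject_id)
    );
""")

# Hypothetical legacy rows with subjects stuffed into one field.
legacy = [(1, "Asha", "Maths;Physics"), (2, "Ben", "Physics;History")]

for student_id, name, stuffed in legacy:
    conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
    for subject in stuffed.split(";"):
        # Create the subject once; later duplicates are ignored.
        conn.execute("INSERT OR IGNORE INTO subject (name) VALUES (?)", (subject,))
        conn.execute(
            "INSERT INTO student_subject "
            "SELECT ?, subject_id FROM subject WHERE name = ?",
            (student_id, subject),
        )

# No string parsing is needed to find everyone who takes Physics.
physics_students = [row[0] for row in conn.execute(
    "SELECT st.name FROM student st "
    "JOIN student_subject ss ON ss.student_id = st.student_id "
    "JOIN subject su ON su.subject_id = ss.subject_id "
    "WHERE su.name = 'Physics' ORDER BY st.name"
)]
print(physics_students)
```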
Rule #6: Watch for any partial dependencies
Watch for fields that depend on only part of the primary key. In many tables, the primary key is composed in a way that is not logical for every field in the table. Move such partially dependent fields out and associate them with the table they properly belong to. This is the second normal form’s rule: every non-key field must depend fully on the primary key.
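To make the idea concrete, here is a small sketch using Python's `sqlite3` module. It assumes a hypothetical score table keyed on roll number and standard, where the syllabus depends on the standard alone. These names do not come from any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- syllabus depends on standard alone, so it gets its own table.
    CREATE TABLE standard_syllabus (
        standard TEXT PRIMARY KEY,
        syllabus TEXT NOT NULL
    );
    -- score depends on the whole key (roll_no, standard).
    CREATE TABLE exam_score (
        roll_no  INTEGER,
        standard TEXT REFERENCES standard_syllabus(standard),
        score    INTEGER,
        PRIMARY KEY (roll_no, standard)
    );
    INSERT INTO standard_syllabus VALUES ('5th', 'Basic Maths');
    INSERT INTO exam_score VALUES (1, '5th', 82), (2, '5th', 74);
""")

# One UPDATE changes the syllabus for every student in the standard.
conn.execute(
    "UPDATE standard_syllabus SET syllabus = 'Maths and Science' "
    "WHERE standard = '5th'"
)

rows = conn.execute(
    "SELECT e.roll_no, e.score, s.syllabus FROM exam_score e "
    "JOIN standard_syllabus s ON s.standard = e.standard "
    "ORDER BY e.roll_no"
).fetchall()
print(rows)
```

Had the syllabus stayed in the score table, the same change would have touched one row per student and could have left the rows inconsistent.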
Rule #7: Choose derived columns carefully
If you are building an OLTP application, getting rid of derived columns is usually a good idea unless there is a pressing performance reason to keep them. In OLAP databases, where many summations and calculations are performed, such columns may be necessary for performance.
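For the OLTP case, one way to avoid storing derived values is to compute them on read, for example through a view. The sketch below uses Python's `sqlite3` module. The `marks` table, the `student_total` view, and the sample scores are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (student_id INTEGER, subject TEXT, score INTEGER);
    INSERT INTO marks VALUES
        (1, 'Maths', 80), (1, 'Physics', 70), (2, 'Maths', 90);

    -- Totals and averages are derived at query time, never stored.
    CREATE VIEW student_total AS
        SELECT student_id, SUM(score) AS total, AVG(score) AS average
        FROM marks
        GROUP BY student_id;
""")

# Updating a base row needs no extra step to keep the totals correct.
conn.execute(
    "UPDATE marks SET score = 75 WHERE student_id = 1 AND subject = 'Physics'"
)

totals = conn.execute(
    "SELECT student_id, total, average FROM student_total ORDER BY student_id"
).fetchall()
print(totals)
```

In an OLAP setting the trade-off may go the other way: storing the totals avoids recomputing them for every report, at the cost of keeping them in sync.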
Rule #8: If performance is the objective, do not be so strict on avoiding redundancy
Do not make avoiding redundancy a hard and fast rule. If there is a pressing need for performance, consider denormalization. A normalized design may require joins across many tables, whereas a denormalized design needs fewer joins, which can improve read performance.
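The trade-off can be illustrated with a small sketch in Python's `sqlite3` module. The normalized tables and the flat `student_report` table are hypothetical, and the example only shows the shape of the two queries; it does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized tables.
    CREATE TABLE standard (standard_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY, name TEXT, standard_id INTEGER
    );
    INSERT INTO standard VALUES (5, '5th Standard'), (6, '6th Standard');
    INSERT INTO student VALUES (1, 'Asha', 5), (2, 'Ben', 6);

    -- Denormalized copy: standard_name is stored redundantly per student.
    CREATE TABLE student_report AS
        SELECT st.student_id,
               st.name AS student_name,
               s.name  AS standard_name
        FROM student st
        JOIN standard s ON s.standard_id = st.standard_id;
""")

# Normalized read: needs a join.
joined = conn.execute(
    "SELECT st.name, s.name FROM student st "
    "JOIN standard s ON s.standard_id = st.standard_id "
    "ORDER BY st.student_id"
).fetchall()

# Denormalized read: a single-table scan, no join.
flat = conn.execute(
    "SELECT student_name, standard_name FROM student_report "
    "ORDER BY student_id"
).fetchall()
print(joined == flat)
```

The price of the flat table is that it must be refreshed whenever the source tables change, which is why this approach tends to suit read-heavy workloads.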
We hope you find these rules relevant if you plan to build a new database to meet your enterprise DBMS needs.