Jypoly - Database Design 2007 - JAMK

Transcription

Database DesignIIO30100 Tietokantojen suunnitteluMichal ZabovskyDepartment of InformaticsFaculty of Management Science and InformaticsUniversity of Zilina Slovak RepublicPresentation overview Software developmentDatabase systemsDatabase design and modelingIntroduction to data modeling using CASE toolsLogical modelPhysical modelReverse engineeringPractical issuesMarch 2007Database Design 200721

SW developmentSoftware development process requirements (http://www.epri.com):Concept development1.2.3.4.5.6.Defining requirementsDesignImplementationAlpha and beta testsFinal acceptance testSupport and maintenanceOther development requirements: Roles Process variations Topics to consider Background materialMarch 2007Database Design 20073Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfacesPerformanceDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 200742

Functional specificationTypical section for the functional specification are: Introduction – describes document Purpose Scope Organization Software overviewSpecific function descriptionExternal interfacesPerformanceDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 20075Functional specificationTypical section for the functional specification are: IntroductionSoftware overview Product descriptionProduct functional capabilitiesUser characteristicsUser operations and practicesGeneral constraintsAssumptionsOther softwareSpecific function descriptionExternal interfacesPerformanceDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 200763

Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function description DescriptionInputsProcessingOutputsExternal interfacesPerformanceDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 20077Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfaces User interfacesHardware interfacesSoftware interfacesCommunication InterfacesPerformancesDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 200784

Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfacesPerformance Response timesThroughputData volumeNumber of concurrent usersPeak loadDesign constraintsAttributesAdditional requirementsMarch 2007Database Design 20079Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfacesPerformanceDesign constraints RAMHard disk spaceHW limitationsCommunication standardsAttributesAdditional requirementsMarch 2007Database Design 2007105

Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfacesPerformanceDesign constraintsAttributes SecurityReliability, availability, MaintainabilityConfiguration and compatibilityInstallationUsabilityAdditional requirementsMarch 2007Database Design 200711Functional specificationTypical section for the functional specification are: IntroductionSoftware overviewSpecific function descriptionExternal interfacesPerformanceDesign constraintsAttributesAdditional requirements March 2007Data storage / databaseAdministrationUser documentationOther requirementsDatabase Design 2007126

Database systemDatabase system is essentially nothing more than a computerizedrecord-keeping system. The database itself can be regarded as akind of electronic filing cabinet.The user of the system will be given facilities to perform variety ofoperations on such files, including the following among others: Adding new, empty files to the databaseInserting new data into existing filesRetrieving data from existing filesUpdating data in existing filesDeleting data from existing filesRemoving existing files, empty or otherwise, from the databaseMarch 2007Database Design 200713Database system componentsDatabase system involves four major components (by C. J. Date) Data Integrated - Data, held in the database, can be placed in several otherwisedistinct data files. Main purpose is redundancy elimination. Shared – Each object in the database can be shared by several differentusers. HardwareSoftwareUsers End users Application programmers Database administratorMarch 2007Database Design 2007147

Database requirementsDatabase system must ensure following requirements: Data definitions and data manipulation commands separation.Data independence. Logical – the change in the data model could not affect application. Physical – the change in the data storage should not affect application. Procedural and non-procedural interface.Minimization of data redundancy.Data sharing.Secure data access.Data consistency and integrity.March 2007Database Design 200715DataData - values actually stored in the database.Information – the meaning of data.Persistent data – permanently stored in the database (in theopposite of transient data).Input data - refers to information entering the system.Output data – refers to messages and results emanating from thesystem. It might be transformed or derived from persistent data.Database consists of persistent data.March 2007Database Design 2007168

Data representationOverall efficiency of RDBMS is the essential requirement ofdatabase design. Hence decision, which data type would be usedfor data representation is critical and should be retained on thedatabase designer consideration.We should consider about: Numerical data representation (often is necessary to define accuracy orrange) Character data representation (different encoding can be used) Data encoding (e.g. 1 Red, 2 Green etc.) Units for numeric data Data materialization Structure of stored records Structure of stored filesMarch 2007Database Design 200717Database administratorMain tasks of DBA are: Defining conceptual scheme (database design)Defining internal scheme (data representation)Defining security and integrity rulesDefining backup and recovery proceduresLiaising with usersMonitoring performance and responding to changing requirementsMarch 2007Database Design 2007189

Database designTo build good application systems, design must proceed in a set oforderly steps using proper techniques at each step. Suchtechniques must be chosen so that the output of one step can beused as the input for the following step.Design methodologies are collections of techniques organized intoset of steps that supports an orderly way for designing database.Design methodologies also provide the documentation necessary torecord any design decisions. Many methodologies now providesoftware support or CASE (Computer Assisted SoftwareEngineering) tools to support designers and maintain thedocumentation.March 2007Database Design 200719Toad Data Modeler (former Case Studio )Case Studio is now called Toad DataModeler and is FREEWARE!http://www.casestudio.czMarch 2007Database Design 20072010

Data modelingData modeling is the act of exploring data-oriented structures. Likeother modeling artifacts data models can be used for a variety ofpurposes, from high-level conceptual models to physical datamodels.Data models used in practice: Conceptual data models (CDM)Logical data models (LDM)Physical data models (PDM)March 2007Database Design 200721Conceptual modelOne of the most difficult problem during the design phase istransformation of data to information stored in the database. Theproblem grows with the very weak process formalization. Data itselfprovides just partial information, hence additional description mustbe created by the conceptual model. This description must followconceptual view of users to the particular real world part.Conceptual design works with terms like entity, object, relation,attribute, property etc. Since the terms are close to real world themodels are sometimes called object-oriented models.Each conceptual model solves three problems: Data structureData manipulationData integrityMarch 2007Database Design 20072211

Logical modelLogical model is used to explore: Domain conceptsRelationshipsThis could be done for the scope of a single project or for entireenterprise.Logical model identifies: Logical entity types, typically referred to simply as entity typesData attributes describing those entitiesRelationships between the entitiesMarch 2007Database Design 200723Physical modelPhysical model is used to design the internal schema of database.Physical model depicts: Data tablesColumnsRelationshipsPhysical data model shows greater detail, including an associativetable required to implement the association as well as the keysneeded to maintain the relationships.PDM should also reflect database naming standards and indicatethe data types for the columns, such as integer and char(5).March 2007Database Design 20072412

LDM and PDMMarch 2007Database Design 200725Entity-relationship modelEntity-relationship model (E-R model) helps to describe user’sapplication on the conceptual level - defines top-level datasemantics. It identifies the most important entities and relationshipsin the data.The next step is to construct a set of normalized relations. This setof relations removes any unnecessary redundancy. Then the logicaldefinition is created.The form of the definition depends on the database managementsoftware that is to be used to implement the database. It defines thedatabase schema using the DBMS data definition language.March 2007Database Design 20072613

How to model data1.2.3.4.5.6.7.8.Identify entity typesIdentify attributesApply naming conventionsIdentify relationshipsApply data model patternsAssign keysNormalize to reduce data redundancyDenormalize to improve performanceMarch 2007Database Design 200727MS Visio – New modelMarch 2007Database Design 20072814

MS Visio – New modelMarch 2007Database Design 200729Toad Data Modeler – New modelMarch 2007Database Design 20073015

E-R model elementsEntity is the real life object instance that is considered to beimportant in a system.Example: “student Peter Novák, born number 123456/7890“Entity set is a collection of objects with the same properties.Relationship models connection between one or more entities (way,how entities interact with each other).Example: entity „student Peter Novák, born number 123456/7890“interacts (“attend”) with entity “course Database Systems“Attributes describes the properties of entities and relationships.An entity identifier is and attribute (or set of attributes) whose valuesidentify a unique entity. Identifiers are either unique (single attribute)or composite (made up of more than one attribute).March 2007Database Design 200731MS Visio – New entityentityMarch 2007Database Design 20073216

Toad Data Modeler – New entityentityMarch 2007Database Design 200733IdentifiersSo far, we have made a distinction between entities and entity sets.E-R diagrams only include sets but not individual entities. However,one requirement of all systems is to be able to identify individualentities without necessarily showing them on the E-R diagram. Suchidentification is needed to determine the activities carried out by theentities. The usual practice is to identify entities by an entityidentifier.An entity identifier is an attribute (or set of attributes) whose valuesidentify a unique entity.Since entity can have more then one candidate for identifier, isnecessary to choose such identifier which is the best for efficiency ormemory consumption.March 2007Database Design 20073417

MS Visio – Primary keyMarch 2007Database Design 200735Toad Data Modeler – Primary keyMarch 2007Database Design 20073618

AttributesAttributes describe the properties of entities and relationships. Eachsuch attribute is given a name which refer to a particular property.Each attribute is characterized by: NameTypeFlag, whether the attribute is part of identifierFlag, whether the value could be undefined (NULL)Flag, which specifies the unique value for attributeMarch 2007Database Design 200737MS Visio - AttributesMarch 2007Database Design 20073819

Toad Data Modeler - AttributesMarch 2007Database Design 200739Cardinality of relationshipsCardinality shows the number of relationships in which one entitycan appear.Cardinality 1:1 - entity can participate in only one relationship. In thecase that both entities are of the same type relationship is calledrecursive.Example: person -drives- carCardinality 1:N – entity from one set can participate in more thanone relationship.Example: person –use - carCardinality M:N – each entity from one set can participate in morethan one relationship.Example: teacher - teaching - subjectMarch 2007Database Design 20074020

ParticipationParticipation is some additional information that is often stored aboutrelationship. This describes whether each entity in a set must participate ina relationship. Participation can be either mandatory or optional.Department must have teacher and each teacher must be assigned todepartment.teacher - has - departmentTeacher must have department and department need not to have teacher.teacher - has -O departmentDepartment need not to have teacher and teacher need not to havedepartment.teacher O- has -O departmentMarch 2007Database Design 200741MS Visio - RelationshipsMarch 2007Database Design 20074221

Toad Data Modeler - RelationshipsMarch 2007Database Design 200743Identifying relationshipPrimary key migrates from the parent entity to the child entity and therebecomes a part of the primary key. It is used when the primary key of thechild entity is unable to provide definite identification. An entity, connectedwith a parent entity through an identifying relationship, is called a dependententity and is shown in a model with rounded corners.March 2007Database Design 20074422

Toad Data Modeler – Identifying rel.March 2007Database Design 200745Nonidentifying relationshipPrimary key migrates from the parent entity to the child entity and does notbecome a part of the primary key. This kind of relationship is often calledmaster-detail relationship.March 2007Database Design 20074623

Toad Data Modeler – Nonidentifying rel.rel.March 2007Database Design 200747Relationship M:NMost of the RDBMS is not able to represent relationship M:N directly. Alsoother reasons lead to decomposition to the two relationships of 1:N type.March 2007Database Design 20074824

Informative relationshipPrimary key does not migrate at all, no referential integrity or triggers aregenerated, it is only information about the existence of some relationship.March 2007Database Design 200749Toad Data Modeler - Self relationshipTo create a self relationship, please right click on your entity and select AddSelf-Relationship.In most cases this kind of relationship is used to model hierarchy – e.g.relationship children – parents.March 2007Database Design 20075025

Toad Data Modeler– Display leveldisplay levelMarch 2007Database Design 200751Exercise - objectivesCreate data model for the information system of faculty. In the IS wewant to store necessary information about: studentsteachersassigned coursesspecializationsMarch 2007Database Design 20075226

Exercise - entitiesMarch 2007Database Design 200753Exercise – entities descriptionsMarch 2007Database Design 20075427

Exercise – relationshipsMarch 2007Database Design 200755Exercise – domains definitionMarch 2007Database Design 20075628

DenormalizationDenormalization is process, which is from the theoretical point ofview unacceptable. Common argument is that price fordenormalization is very high (especially risks attended to thisprocess). But in the real life is process of denormalization usedsince significant overall system efficiency improvement.March 2007Database Design 200757Database implementationWhen the logical and consequently physical model is designed,whole model can be implemented on particular database system.Implementation consists of following steps: Database system installationDatabase configuration and tuning based on particular usageDatabase and relations creationData import or creationUser and access configurationArchiving and monitoring system configurationMarch 2007Database Design 20075829

agiledata.org/essays/dataModeling101.htmlThank youMichal Zábovský, michal.zabovsky@fri.utc.skDepartment of InformaticsFaculty of Management Science and InformaticsUniversity of ZilinaSlovak RepublicMarch 2007Database Design 20075930

Toad Data Modeler - New entity entity March 2007 Database Design 2007 34 Identifiers So far, we have made a distinction between entities and entity sets. E-R diagrams only include sets but not individual entities. However, one requirement of all systems is to be able to identify individual entities without necessarily showing them on the E-R .