Database Fundamentals - ESP

Transcription

DatabaseFundamentalsRobert J. RobbinsJohns Hopkins Universityrrobbins@gdb.orgFile: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 1

What is a Database?General: A database is any collection of related data.Restrictive: A database is a persistent, logically coherentcollection of inherently meaningful data, relevantto some aspects of the real world.The portion of the real world relevant to the database is sometimes referredto as the universe of discourse or as the database miniworld. Whatever itis called, it must be well understood by the designers of the database.File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 2

What is a Database Management System?A database management system (DBMS) is acollection of programs that enables users to createand maintain a database.According to theANSI/SPARC DBMS Report (1977), a DBMSshould be envisioned as a multi-layered system:External Level(individual user views)Conceptual Level(Enterprise-wide view\)Internal Level(storage view)Storage Level(physical storage)File: N drive:\jhu\class\1995\db-fund.pptExternalView 1 ExternalView nConceptualSchemaInternalSchemaPhysicalDatabase 1994, 1995 Robert RobbinsDatabase Fundamentals: 3

What Does a DBMS Do?Database management systems provide severalfunctions in addition to simple file management: allow concurrency control security maintain data integrity provide for backup and recovery control redundancy allow data independence provide non-procedural query language perform automatic query optimizationFile: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 4

Who Interacts with a DBMS?Many different individuals are involved with adatabase management system over its life: systems analysts database designers database administrators application developers usersFile: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 5

Components of a Database SystemDirect ablesFileManagerSystemCatalogFile: N abase 1994, 1995 Robert RobbinsMetadataDatabaseDatabase Fundamentals: 6

Relational Database ModelWhat is a relational database? a database that treats all of its data as acollection of relationsWhat is a relation? a kind of set a subset of a Cartesian product an unordered set of ordered tuplesFile: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 7

Basic Set ATIONexampleNote:any collection of distinct entities of anysort.A { 1,2,3,4,5,6 }B { H,T }C { R,B }D { Grant, Sherman, Lee }a set of ordered pairs, produced bycombining each element of one set witheach element of another set.B x C { H,R , H,B , T,R , T,B }Cartesian products may be generated bymultiplying any number of sets together.The actual number of sets involved in aparticular case is said to be the “degree”or “arity” of that Cartesian product.a subset of a Cartesian productQ { H,R , H,B }Relations may be of any degree (arity).File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 8

Basic Set ConceptsA set is usually indicated by including a commadelimited list of the names its members within apair of wavy brackets:R { 1,2,3,4,5,6 }G { Marshall, Eisenhower, Bradley }The members of a set are unordered. Two setsare considered equivalent if and only if theycontain exactly the same members, without regardfor the order in which the members are listed.R { 1,2,3,4,5,6 } { 3,2,1,6,4,5 }G { Marshall, Eisenhower, Bradley } { Bradley, Marshall, Eisenhower }File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 9

Basic Set ConceptsAn ordered double (or triple or quadruple or ntuple) is usually indicated by including a commadelimited list of the names its members within apair of pointed brackets:S 2,4 C Marshall, Eisenhower, Bradley Order must be maintained in ordered n-tuples.Two tuples are considered different if they containthe same members in a different order.S 2,4 4,2 C Marshall, Eisenhower, Bradley Bradley, Eisenhower, Marshall A set may consist of an unordered collection ofordered tuples. For example, we could imaginethe set of all ordered pairs of integers, such thatthe first element is the square root of the secondelement.R { 1,1 , 2,4 , 3,9 . }As this ellipsis indicates, sets can beinfinite in size. However, sets thatare actually represented in a databasemust be finite.File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 10

Basic Set ConceptsLETR be the set of possible outcomes when rollinga single red die.R { 1,2,3,4,5,6 }LETB be the set of possible outcomes when rollinga single blue die.B { 1,2,3,4,5,6 }THENThe Cartesian product R x B gives the set ofoutcomes when the two dice are rolledtogether:R x B: {File: N drive:\jhu\class\1995\db-fund.ppt 1,1 1,2 1,3 1,4 1,5 1,6 3,1 3,2 3,3 3,4 3,5 3,6 5,1 5,2 5,3 5,4 5,5 5,6 2,1 2,2 2,3 2,4 2,5 2,6 4,1 4,2 4,3 4,4 4,5 4,6 6,1 6,2 6,3 6,4 6,5 6,6 } 1994, 1995 Robert RobbinsDatabase Fundamentals: 11

Relation: Subset of a Cartesian ProductSet RSet B112233445566A Cartesian product of two setscan be generated by combiningevery member of one set withevery member of the other set.This results in a complete set ofordered pairs, consisting ofevery possible combination ofone member of the first setcombined with one member ofthe second set. The number ofelements in a Cartesian productis equal to M x N, where M andN give the number of membersin each set.Starting two sets. 1,1 1,2 1,3 1,4 1,5 1,6 3,1 3,2 3,3 3,4 3,5 3,6 5,1 5,2 5,3 5,4 5,5 5,6 2,1 2,2 2,3 2,4 2,5 2,6 4,1 4,2 4,3 4,4 4,5 4,6 6,1 6,2 6,3 6,4 6,5 6,6 A Cartesian product of two sets,shown as a list of ordered pairs.File: N drive:\jhu\class\1995\db-fund.ppt112233445566A Cartesian product of two sets,shown as a connection diagram,with each member of the first setconnected to each member of theother set. 1994, 1995 Robert RobbinsDatabase Fundamentals: 12

Relation: Subset of a Cartesian Product 1,1 1,2 1,3 1,4 1,5 1,6 A Cartesian product pairs everymember of the first set with everymember of the second set. 2,1 2,2 2,3 2,4 2,5 2,6 A relation pairs somemembers of the first setwith some members ofthe second set. 3,1 3,2 3,3 3,4 3,5 3,6 1,1 2,2 3,3 4,4 4,1 4,2 4,3 4,4 4,5 4,6 5,1 5,2 5,3 5,4 5,5 5,6 6,1 6,2 6,3 6,4 6,5 6,6 File: N drive:\jhu\class\1995\db-fund.ppt 5,5 6,6 A relation, therefore, must alwaysbe representable as a subset of someCartesian product. 1994, 1995 Robert RobbinsDatabase Fundamentals: 13

Relation: Set of Ordered TuplesA binary relation is a set of ordered doubles, with one element a member of thefirst set and one element a member of the second set. Generally, we couldrepresent a set of ordered doubles as below. S1 is the first set and S2 the second.S1xS2 By adding sets, relations can be extended to include ordered triples, orderedquadruples or, in general, any ordered n-tuple, as below. A relation with nparticipating sets is said to be of degree n or to possess arity n.S1 xS2xS3xx Sn File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 14

Relations as a DatabaseAn n-ary relation (i.e., a subset of a Cartesian product of n sets) could be berepresented in a computer system as an n-column tabular file, with one memberfrom the first set named in the first column of each record and one member ofthe second set in the second column, etc.S1 xS2xS3xxSn Codd recognized that many of the files used in computerized informationsystems were very similar in structured to tabularized relations.SmithRobertL.1154 Elm StreetGlendaleMD21200SmithJudyF.1154 Elm StreetGlendaleMD21200JonesGregG.765 Cedar LaneTowsonMD21232HarrisLloydK.2323 Maple DrTowsonMD21232 Ziegler FredFile: N drive:\jhu\class\1995\db-fund.ppt K. 7272 Cherry Ln. 1994, 1995 Robert Robbins Baltimore MD 21208Database Fundamentals: 15

Relations as a DatabaseThe business data file resembles a relation in a number of ways. The tabularfile itself corresponds to a relation. Each column, or attribute, in the filecorresponds to a particular set and all of the values from a particular columncome from the same domain, or set. Each row, or record, in the filecorresponds to a pSmithRobertL.1154 Elm StreetGlendaleMD21200SmithJudyF.1154 Elm StreetGlendaleMD21200JonesGregG.765 Cedar LaneTowsonMD21232HarrisLloydK.2323 Maple DrTowsonMD21232 Ziegler Fred K. 7272 Cherry Ln. Baltimore MD 21208If such a file is to be genuinely interchangeable with a relation, certaincontraints must be met: every tuple must be unique every attribute within a tuple must be single-valued in in all tuples, the values for the same attribute must come from thesame domain or set no attributes should be nullFile: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 16

Relations as a DatabaseAn essential attribute of a relation is that every tuple must be unique. Thismeans that the values present in some individual attribute (or set of attributes)must always provide enough information to allow a unique identification ofevery tuple in the relation. In a relational database, these identifying valuesare known as key values or just as the key.Sometimes more than one key could be defined for given table. Forexample, in the table below (which represents, perhaps, a patient record file),several columns might serve as a key. Either patient number (assigned bythe hospital) or social security number (brought with the patient) arepossibilities. In addition, one might argue that the combination of last name,address, and birth date could collectively serve as a key.Any attribute or set of attributes that might possibly serve as a key is knownas a candidate key. Keys that involve only one attribute are known assimple keys. Keys that involve more than one attribute are composite keys.patient #SS #Last Nameaddressbirth dateP-64122123-45-6789Smith123 Main Street10 MAY 44P-75642001-32-6873Pedersen1700 Cedar Barn Way31 MAR 59P-70875444-44-5555Wilson1321 North South St7 AUG 90P-79543555-12-1212Grant808 Farragut Avenue1 DEC 66 P-71536 888-88-8888 MacPherson 1617 Pennsylvania Ave11 APR 60In designing a database, one of the candidate keys for each relation must bechosen to be the primary key for that table. Choosing primary keys is acrucial task in database design. If keys need to be redesignated, the entiresystem may have to be redone. Primary keys can never be null and shouldnever be changed. Sometimes none of the candidate keys for a relation arelikely to remain stable over time. Then, an arbitrary identifier might be createdto serve as a primary key. Such arbitrary keys are also known as surrogatekeys.File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 17

Relations as a DatabaseA binary relation (i.e., a subset of a Cartesian product of two sets) could be berepresented in a computer system as two-column tabular file, with one memberfrom the first set named in the first column of each record and one member ofthe second set in the second column. For example, a binary relation could beused to provide unique three-letter identifiers for academic departments.Additional relations could be used to give more information about individualdepartments or individual faculty members.ZOLZoologyPSDPolitical ScienceCPSComputer ScienceHISHistory ACC AccountingZOLZoologyRoom 203Natural Science Bldg355 4640CPSComputer ScienceRoom 714AWells Hall355 5210BSPBiological ScienceRoom 141Natural Science Bldg353 4610CEMChemistryRoom 320Chemistry Bldg355 9175 PSDRoom 303Political Science 355 6590South Kedzie Hall999-99-9999JohnsonWilliamF.1533 Raleigh Dr.BaltimoreMD21211888-88-8888JohnsonWilliamF.2842 Colony Ave.BaltimoreMD21201777-77-7777BrownJamesG.99 W. East St.TowsonMD21232666-66-6666BrownGwenK.99 W. East St.TowsonMD21232 111-11-1111 ZieglerFile: N drive:\jhu\class\1995\db-fund.ppt Samual L. 7272 Cherry Ln. 1994, 1995 Robert Robbins Baltimore MD 21208Database Fundamentals: 18

Relations as a DatabaseYet another relation could be used to show what faculty were members of whatdepartments. Notice that faculty member 999-99-9999 is a member of morethan one department and that, even on this short list, the department of zoologyhas two members 66-66-6666ZOL 999-99-9999BSPRelations of this sort, that combine identifiers from two other relations, providethe “glue” that holds a relational database together. other fieldsMember-of RelationèSS NumberSS NumberDepartments RelationèçFaculty RelationDept CodeDept Codeother fields Whenever the values in an attribute column in one table “point to” primary keysin another (or the same) table, the attribute column is said to be a foreign key.Columns containing foreign keys are subject to an integrity constraint: anyvalue present as a foreign key must also be present as a primary key.File: N drive:\jhu\class\1995\db-fund.ppt 1994, 1995 Robert RobbinsDatabase Fundamentals: 19

Relational Database OperatorsData models consist of data structures andpermitted operations on those data structures.Part of Codd’s genius was to recognize thatmany of the

A database is any collection of related data. Restrictive: A database is a persistent, logically coherent collection of inherently meaningful data, relevant to some aspects of the real world. The portion of the real world relevant to the database is sometimes referred to as the universe of discourse or as the database miniworld. Whatever itFile Size: 214KBPage Count: 31