Database Management Systems: An Architectural View

Transcription

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyDatabase Management Systems:An Architectural ViewLecture 1Database Management Systems: An Architectural ViewSeptember 7, 20201

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyOutline1.2.3.4.What is a Database? A DBMS?Why use a DBMS?Databases in ContextDesign and Implementation ProcessDatabase Management Systems: An Architectural ViewSeptember 7, 20202

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyWhat is a Database?Database Management Systems: An Architectural ViewSeptember 7, 20203

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyWhat is a Database?Database Management Systems: An Architectural ViewSeptember 7, 20204

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyWhat is a Database?A collection of related data, most often – reflects some aspect of the real world– logically coherent with inherent meaning– designed, built, and populated with data for aspecific purpose intended group of users some preconceived applications with which theseusers are interested application requirements in terms of performance,security, redundancy, concurrency, etc.Database Management Systems: An Architectural ViewSeptember 7, 20205

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyDatabase Management SystemDBMSA collection of programs that enables usersto create and maintain a database– Supports specifying the data types,structures, and constraints of the data– Stores the data on some medium undercontrol of the DBMS– Supports querying and updating thedatabase– Protects data against malfunction andunauthorized accessDatabase Management Systems: An Architectural ViewSeptember 7, 20206

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyWhy use a DBMS?Common tradeoff in CS:A. Code from scratch Pros: you know your problem best (so fast, customized) Cons: slow, labor intensive, need to add/change features?B. Find a library/tool that solves [part of] your problem Pros: fast via bootstrapping, better designed? Cons: understand the tool, may not be efficient, support?DBMSs adopt some set of limiting assumptions inorder to efficiently support a useful feature set overa wide class of possible databasesDatabase Management Systems: An Architectural ViewSeptember 7, 20207

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyExample: Student Records Given a school with MANY students (NEU: 25k, UM: 45k), eachwith some data (name, ID, DOB, classes) Write a program that can efficiently – Retrieve a random student– Retrieve the first/last student, according to Last name DOB– Retrieve a student by ID Name (with *’s)– Retrieve a class roster (all students in class X)– Handles adding/removing/editing students/classes– Handles multiple simultaneous reads/writes– Provides differing access rights– Handles OS faults/power outages Database Management Systems: An Architectural ViewSeptember 7, 20208

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyMany Kinds of DBMSs (1) Graph databases– Create nodes, edges, labels– Query about relationships and paths Find your friends Find someone that can help you learn databases Spatial databases– Objects in 2D/3D– Query locations, relations Collision detectionDatabase Management Systems: An Architectural ViewSeptember 7, 20209

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyMany Kinds of DBMSs (2) Document stores– Create dynamic documents– Query about contents Find by author, title, content, etc. patterns Key-Value stores– Associative array– Scalable, fault-tolerant– QueryDatabase Management Systems: An Architectural ViewSeptember 7, 202010

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMSWe focus on relational databasesBased on the relational data model– Researched 45 years, widely used Free/paid implementations for personal use,embedded systems, small/large enterpriseDatabase Management Systems: An Architectural ViewSeptember 7, 202011

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational Databases (1)Table or “Relation”Table neDormAgeGPABen Bayer305-61-2435555-12341193.21Chung-cha Kim422-11-2320555-98762253.53Barbara Benson533-69-1238555-67581193.25ConstraintsHas to beuniqueMust be aninteger 0Must be anumber [0,4]Database Management Systems: An Architectural ViewSeptember 7, 202012

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational Databases (2)More Tables!STUDENTNameSSNPhoneDormAgeGPABen Bayer305-61-2435555-12341193.21Chung-cha Kim422-11-2320555-98762253.53Barbara Benson533-69-1238555-67581193.25CLASSSSNDORMValues in one table can beforced to come fromanother(“Referential Integrity”)Class305-61-2435 COMP355IDName422-11-2320 COMP3551555 Huntington533-69-1238 MATH6502Baker305-61-2435 MATH650422-11-2320BIOL110Database Management Systems: An Architectural ViewSeptember 7, 202013

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational Databases (3)Queries!STUDENTNameSSNPhoneDormAgeGPABen Bayer305-61-2435555-12341193.21ResultChung-cha Kim422-11-2320555-98762253.533.23Barbara Benson533-69-1238555-67581193.25CLASSWhat is the average GPAof students in MATH650?1. Find all SSN in tableClass whereClass MATH6502. Find all GPA in tableStudent whereSSN #13. Average GPA in #2SSNDORMClass305-61-2435 COMP355IDName422-11-2320 COMP3551555 Huntington533-69-1238 MATH6502Baker305-61-2435 MATH650422-11-2320BIOL110Database Management Systems: An Architectural ViewSeptember 7, 202014

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational Databases (3)Queries!STUDENTNameSSNPhoneDormAgeGPABen Bayer305-61-2435555-12341193.21ResultChung-cha Kim422-11-2320555-98762253.533.23Barbara Benson533-69-1238555-67581193.25CLASSSSNDORMSELECT AVG(STUDENT.GPA)FROMSTUDENT INNER JOIN CLASSON STUDENT.SSN CLASS.SSNWHERE CLASS.Class 'MATH650';Class305-61-2435 COMP355IDName422-11-2320 COMP3551555 Huntington533-69-1238 MATH6502Baker305-61-2435 MATH650422-11-2320BIOL110Database Management Systems: An Architectural ViewSeptember 7, 202015

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational Databases (4)Users!DBMSApplication ServerClientsDatabase Management Systems: An Architectural ViewSeptember 7, 202016

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyDatabases in ContextThree-Tier ArchitectureDatabase Management Systems: An Architectural ViewSeptember 7, 202017

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMS Features (1) Data independence via data models– Conceptual representation independent of underlyingstorage or operation implementationER DiagramsRelationsDatabase Management Systems: An Architectural ViewSeptember 7, 202018

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMS Features (2) Operation abstraction via – Declarative languages Structured Query Language (SQL)– Data definition, manipulation, query– Programmatic APIs Function libraries (focus), embedded languages,stored procedures, etc.Database Management Systems: An Architectural ViewSeptember 7, 202019

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMS Features (3) Reliable concurrent transactions– (A)tomicity: “all or nothing”– (C)onsistency: valid - valid’– (I)solation: parallel execution, serial result– (D)urability: once it is written, it is so High performance– Buffering, caching, locking (like a mini OS)– Query optimization, redundant datastructures (e.g. indexes, materialized views)Database Management Systems: An Architectural ViewSeptember 7, 202020

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMS Features (4) Authentication and authorization– Discussed in context of other securityconcerns/techniques Backup and recovery– Logging, replication, migrationDatabase Management Systems: An Architectural ViewSeptember 7, 202021

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyWhy NOT to use a DBMSYour application involves a single user has simple/well-defined data/operations– DBMS may be overkillHowever, DBMS techniques may be useful– We will discuss useful and scalable indexingstructures and processesDatabase Management Systems: An Architectural ViewSeptember 7, 202022

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyDatabases in ContextPeople1. Database designers2. System analysts & applicationprogrammers3. Database administrators4. End users5. Back-enda. DBMS designer/implementerb. Tool developersc. SysAdminsDatabase Management Systems: An Architectural ViewSeptember 7, 202023

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRelational DBMSDatabase Management Systems: An Architectural ViewSeptember 7, 202024

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyDatabase Design and Implementation ProcessDatabase Management Systems: An Architectural ViewSeptember 7, 202025

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyRequirements Collection & Analysis Data/Constraints“The company is organizedinto departments. Eachdepartment has a uniquename, number, and aparticular employee whomanages the department. Wekeep track ” Functional Needs– Operations/queries/reports Frequency– Performance, security, etc.Database Management Systems: An Architectural ViewSeptember 7, 202026

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyConceptual DesignDataApplication Software– UML– Form design Database– Transaction design– Report designDatabase Management Systems: An Architectural ViewSeptember 7, 202027

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyLogical DesignDataApplication Supporting code (thatdoes not depend upondatabase)– Possibly using techniquesfrom databases (e.g.indexing) NormalizationDatabase Management Systems: An Architectural ViewSeptember 7, 202028

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyPhysical DesignDataApplication Index, materialized viewselection and analysis Implementing operationsas queries Implementing constraintsas keys, triggers, views Implementing multi-usersecurity as grantsDatabase Management Systems: An Architectural ViewSeptember 7, 202029

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskyImplementation and TuningDataApplication DDL statements De-normalization,updating indexes/materialized views Query integration Profilingqueries/operations Security, concurrency,performance, etc.analysisDatabase Management Systems: An Architectural ViewSeptember 7, 202030

CS3200 – Database Design・ ・ ・ Fall 2020・ ・ ・ DerbinskySummary A database is a collection of related data that reflects some aspectof the real world; is logically coherent with inherent meaning; and isdesigned, built, and populated with data for a specific purpose A database management system (DBMS) is a collection ofprograms that enables users to create and maintain a database There are many types – we will focus on relational databases(RDBMS) The typical database design process is an iterative process ofrequirements collection/analysis, conceptual design, logical design,physical design, and system implementation/tuningDatabase Management Systems: An Architectural ViewSeptember 7, 202031

Database Management Systems: An Architectural View 13 Name SSN Phone Dorm Age GPA Ben Bayer 305-61-2435 555-1234 1 19 3.21 Chung-chaKim 422-11-2320 555-9876 2 25 3.53 Barbara Benson 533-69-1238 555-6758 1 19 3.25 STUDENT ID Name 1 555 Huntington 2 Baker DORM SSN Class 305 -612435COMP355