Database Development Process CS263 Lecture 2

Database Development Process
CS263 Lecture 2
Enterprise data modelling
• Is the first step in database development, in which the
scope and general contents of organisational databases are
specified
• The ‘enterprise’ is the complete entity being modelled, be
it a corner shop or a whole corporation. In practice,
creating an enterprise-wide data model is a fairly rare
occurrence – only likely to occur when there is a major
change in direction for the organisation (such as complete
systems re-design or business take-over)
• However, it is vital that some form of enterprise data
model is in place
•
Enterprise data modelling
• Ideally, the model should encompass both current and
historical data (often as two or more discrete models with
links between them)
• The model should reflect both the data and the processes of
the organisation, plus the views of these components
required by the various sections of the organisation. Thus
sales data (invoices etc.) will be viewed from a monetary
aspect by the accountant, from a quantitative aspect by the
store staff and from both these aspects by the marketing
people
• These various views of the data will then become
information meaningful to the recipient.
Segment from enterprise data model (Pine Valley
Furniture Company)
Information Engineering
• “A data-oriented methodology to create and maintain
information systems.”
• Top-down planning approach.
• Four steps:
– Planning (Results in an Information Systems
Architecture)
– Analysis – meet the various members of the
organisation (from top to bottom) to discuss and agree
on the data requirements and the processes involved
– Design
– Implementation
Information systems architecture
• A high level data model such as previous Fig. is only one
part of an overall Information Systems Architecture (ISA)
for an organisation. ISA has 6 components
• 1. Data
• 2. Processes that manipulate that data ( represented by
DFDs, Object Models)
• 3. Network – transports data around the organisation and
between the organisation and its key business partners
• 4. People – who perform processes and are the sources and
receivers of data and information
Information systems architecture
• 5. Events and points in time when processes are performed
• 6. Reasons for events and rules that govern the processing
of data
Information systems planning
• Goal is to “Align IT with the business strategies of the
organisation” Has three steps:
• 1. Identify strategic planning factors
• Organization goals (e.g. maintain 10% yearly
growth rate)
• Critical success factors (e.g. make high quality
products)
• Problem areas (e.g. increasing competition)
Information systems planning
2. Identify corporate planning objects:
• Organizational units (departments of the
organisation), Organizational locations (places
where business operations occur), Business
functions (related groups of business processes
supporting the mission of the organisation), and
Entity types (major categories of data about the
people, places and things managed by the
organisation) and Information (application) systems
(the application software and supporting procedures
for handling data
Information systems planning
3. Develop enterprise model, which consists of:
• Decomposition of business functions (break down
functions of an organisation into progressively
greater levels of detail, generally in a hierarchical
form)
• Data decomposition (process of breaking down
organisational data requirements into identifiable
components, generally in hierarchical form)
• Enterprise data model (see later)
• Planning matrices (see later)
Example of process decomposition of an order
fulfillment function
Enterprise data model
• “Sets the range and general contents of
organizational databases.”
• “Results in a total picture or explanation of
organizational data, not in the design for a
particular database.”
• Entity-relationship diagram.
• Descriptions of entity types.
• Relationships between entities.
• Business rules.
Planning matrices
• Provide an explicit approach for describing business
requirements because they provide an explicit approach for
describing business requirements without requiring that the
database be explicitly modelled. Function to data entity
• Location to function (which business function is being
performed at which location)
• Unit to function (which business functions are the
responsibility of which business units)
• IS application to data entity (explains how each
information system interacts with each data entity)
Planning matrices
• Supporting function to data entity (which data are
captured, used, updated, deleted within each function)
• IS application to business objective (shows which
information systems support each business objective)
Following Fig. illustrates a possible function-to-data entity
matrix. Can be used for many purposes, including the
following 3:
• Identify orphans – indicate which data entities are not used
by any function, or which function do not use any entities
• Spot missing entities – Employees involved with each
function who examined the matrix can identify any
entities that may have been missed
• Prioritise development – If a given function has a high
priority for systems development (perhaps because it is
related to important organisational objectives) then the
entities used by that area also have a high priority in
database development
Data Entity
Types
Business
Function
Business Planning
Product Development
Materials Management
Order Fulfillment
Order Shipment
Sales Summarization
Production Operations
Finance and Accounting
Customer
Product
Raw Material
Order
Work Center
Work Order
Invoice
Equipment
Employee
Example business function-todata entry matrix
X X
X
X
X X
X X
X X
X
X X
X X
X
X
X
X X X X
X
X X X X X X X
X X
X
X
X
X
X
X X X X
X X
X X X
X X X
Systems development life cycle
(SDLC)
• Is a complete set of steps that a team of IS professionals
follow to specify, develop, maintain and replace
information systems
• Process is often viewed as a cascade of steps (see
following figure)
• Cascade or ‘waterfall’ approach as each step flows into the
next – though steps can overlap in time and it is possible to
backtrack when prior decisions need to be reconsidered
Database development activities during the
SDLC
Project Identification
and Selection
Project Initiation
and Planning
Analysis
Logical Design
Physical Design
Implementation
Maintenance
Alternative IS development
approaches
• The previous approach is methodical and highly structured,
which includes many checks and balances at each step.
Often criticised for the length of time needed until a
working system produced
• Increasingly, organisations use more Rapid Application
Development (RAD) methods which follow an iterative
process of rapidly repeating analysis, design and
implementation steps until convergence on the system the
user wants
Alternative IS development
approaches
• One of the most popular RAD methods is Prototyping (Fig.
2-6)
• An iterative process in which requirements are converted
to a working system that is continually revised through
close work between analysts and users
• With increasing popularity of visual RAD tools (VB,
Delphi, Visual C++, Java) prototyping is becoming the IS
development methodology of choice.
The prototyping methodology and database
development process
Managing the people
• To determine whether a project is
progressing on time and within budget, the
project leader develops detailed schedules
of project activities
• These are often depicted in graphical form,
such as the Gannt and Pert charts in the
following Figs.
Figure 2-7a Gantt Chart
Figure 2-7b PERT chart
Three schema architecture for
database development
• Conceptual Schema (during the Analysis phase) is a
detailed specification of the overall structure of
organisational data that is independent of any database
management technology) – depicted in ER or objectmodelling notations (a data model). Specifications stored
as metadata in repository or data dictionary
• External Schema (or user view) is some portion of the
database that is required for a user to perform some task.
Also independent of database technology but typically
contains a subset of the associated conceptual schema.
Often the original description of a user view is is a
computer screen display, business transaction or report.
Three schema architecture
• Logical version of a user view can be represented by ERDiagram, object diagram or as relations.
• Internal Schema (physical schema) contains specifications
for how data from a conceptual schema are stored in a
computer’s secondary memory.
Three-schema database architecture
Three tiered architecture
• Typically develop conceptual and external schemas
iteratively (see Fig.).
• Often, a first go at the conceptual schema is developed
based on the organisations enterprise data model and a
general understanding of the database requirements
• Then external schemas for each transaction, report and
screen display are developed
• Often, further analysis of external schemas will yield new
attributes and possibly entities and relationships not shown
in the original conceptual schema
• So then the conceptual schema is modified and augmented
Process of developing three-schema architecture for a
database project
Three-tiered database location
architecture
• The data for a given information system may reside in
multiple locations or tiers of computers, in order to balance
various technical and organisational factors
• Four tiers are possible for data to be on – on a client server,
an application server, a Web server and a database server.
However, three tiers are more commonly considered (Fig.
2-10):
• Client tier – typically a desktop or laptop computer, which
concentrates on managing the user-system interface and
localised data (also called presentation tier). Web scripting
tasks may be executed here.
Three-tiered
• Application/Web server tier – processes HTTP protocols,
scripting tasks, performs calculations and provides access
to data (also called process services tier).
• Enterprise server (minicomputer or mainframe tier) –
performs sophisticated calculations and manages the
merging of data from multiple sources across the
organisation (also called the data services tier)
Three-tiered client/server database architecture
Reasons for using a client/server
architecture
• It allows for simultaneous processing on multiple
processors for the same application, thus improving
application response time and data processing speed
• It is possible to take advantage of the best data processing
features of each computer platform (e.g. the advanced user
interface capabilities of PCs versus the speed of
minicomputers and mainframes)
• Can mix client technologies (by different companies such
as Intel, Sun, Motorola) and yet share common data
• Can change technologies in any tier with limited impact on
the other tiers
Reasons for using a client/server
architecture
• Processing can be performed close to the source of
processed data, improving response times and reducing
network traffic
• Allows for and encourages open system standards
• Ease of separating the development of the database and the
modules that contain the database from the information
system modules that present the contents to end users.
These are typically developed in Powerbuilder, VB or
Delphi and interact through middleware to the routines that
access and analyse the data
Using and defining views
• Views provide users controlled access to tables.
Advantages of dynamic views:
– Simplify query commands
– Help provide data security and confidentiality
– Enhance programming productivity
– Contain most current base table data
– Use little storage space
– Provide a customised view for a user
– Establish physical database independence
• CREATE VIEW command
Using and defining views
• Disadvantages of views:
• Use processing time re-creating view each time it is
referenced
• May or may not be directly updateable
• Some disadvantages can be overcome by using
materialised views which are stored physically on disk and
refreshed at appropriate intervals
The SELECT clause
• Specifies (or projects) what data elements (columns) are to
be included in the view table
• The FROM clause lists the tables and views involved in
the view development
• The WHERE clause specifies the names of the common
columns used to join the tables
• Because a view is a table its rows may not be sorted
Views
• e.g., build a query to generate an invoice for order number
1004 - where rather than having to specify the joining of
four tables, the query can include all relevant data elements
from the view table INVOICE_V:
• CREATE VIEW INVOICE_V AS
• SELECT CUSTOMER_ID, CUSTOMER_ADDRESS,
PRODUCT_ID, QUANTITY etc.
• FROM INVOICE_V
•
WHERE ORDER_ID = 1004;
Views
• A view may join multiple tables or views together and may
contain derived (or virtual) columns
• e.g., if a user only wants to know the total value of the
orders placed for each furniture product, a view can be
created from the previous view (INVOICE_V)
• We can assign a different name (alias) to a view column
than the associated base table or expression column name
• Here PRODUCT is a renaming of PRODUCT_ID, local to
only this view
Views
• TOTAL is the column name given to the expression for
total sales of each product
• The expression can now be referenced via this view in
subsequent queries as if it were a column (rather than a
derived expression)
• CREATE VIEW ORDER_TOTALS_V AS
• SELECT PRODUCT_ID PRODUCT,
SUM(UNIT_PRICE*QUANTITY) TOTAL
• FROM INVOICE_V
•
GROUP BY PRODUCT_ID;
Views
• Views can establish security because tables and columns
that are not included will not be obvious to users of the
view
• Restricting access to a view with GRANT and REVOKE
statements adds another layer of security
• Some people advocate the creation of a view for every
single base table, even if that view is identical to the base
table
• This can create to greater programming productivity as
databases evolve, through greater flexibility when base
tables are changed
Views
• The following examples are based on the following views:
• CREATE VIEW CUSTOMER_V AS SELECT * FROM
CUSTOMER_T;
• CREATE VIEW ORDER_V AS SELECT * FROM
ORDER_T;
• CREATE VIEW ORDER_LINE_V AS SELECT * FROM
ORDER_LINE_T;
• CREATE VIEW PRODUCT_V AS SELECT * FROM
PRODUCT_T;
Views
• Updating data directly from a view rather than from base
tables is possible under certain limitations
• Usually permitted as long as the update is unambiguous in
terms of data modification in the base table
• When the CREATE VIEW statement contains any of the
following five situations, that view may not be updated
directly:
• 1. The SELECT clause includes the keyword DISTINCT
(see later)
• 2. The SELECT clause contains expressions, including
derived columns, aggregates, statistical functions etc.
Views
• 3. The FROM clause, or a subquery, or a UNION clause
references more than one table
• 4. The FROM clause or a subquery references another
view that is not updateable
• 5. The CREATE VIEW command contains a GROUP BY
or HAVING clause
• It could happen that an update to an instance would result
in the instance disappearing from the view.
• A view EXPENSIVE_STUFF_V lists all furniture
products where the unit price is > $300.
Views
• That view will include PRODUCT_ID 5, a writers desk =
$325 dollars. If we update this view and reduce writer’s
desk to $295, the desk will no longer appear in the view
because its unit price is now less than $300.
• If it is desired to track everything with an original price
over $300, we must include a WITH CHECK OPTION
clause after the SELECT clause in the CREATE VIEW
COMMAND.
• This will cause UPDATE or INSERT statements to be
rejected when they would cause rows to be removed from
the view
Sample CREATE VIEW
• CREATE VIEW EXPENSIVE_STUFF_V AS
• SELECT PRODUCT_ID, PRODUCT_NAME, UNIT_PRICE
•
FROM PRODUCT_T
•
WHERE UNIT_PRICE >300
•
WITH CHECK_OPTION;
•CHECK_OPTION works only for updateable
views and prevents updates that would create
rows not included in the view, or cause updated
or inserted rows to be removed from the view