Lecture 20 - Data Validation

26/3/15
Threats to the Database
 User errors
 Software errors
 Hardware failures
 Malicious damage
 Breach of confidentiality
 Concurrency errors
Data Integrity
 Entity Integrity
 Referential Integrity
Data Validation
 Defence against erroneous data entry to the database
 Example
You can define rules to validate certain fields against
specific values, a range of values, patterns, and length.
You can define multiple rules, which are listed in the
ruleList element. Each rule you define is contained
within a rule element, and each rule element defines
only one rule (though a rule can be complex and
include conditional rules and operators).
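The kinds of rule described above (specific values, ranges, patterns, length) can be sketched in code. The function and rule names below are hypothetical, chosen only to illustrate the idea of a rule list where each rule validates one field:

```python
import re

def validate_field(value, allowed=None, rng=None, pattern=None, max_len=None):
    """Check one field against allowed values, a range, a regex pattern, and a length."""
    if allowed is not None and value not in allowed:
        return False
    if rng is not None and not (rng[0] <= value <= rng[1]):
        return False
    if pattern is not None and not re.fullmatch(pattern, str(value)):
        return False
    if max_len is not None and len(str(value)) > max_len:
        return False
    return True

# A rule list: each entry defines one rule for one field.
rules = {
    "grade": lambda v: validate_field(v, allowed={"A", "B", "C", "D", "F"}),
    "age":   lambda v: validate_field(v, rng=(0, 120)),
    "phone": lambda v: validate_field(v, pattern=r"\d{3}-\d{4}", max_len=8),
}

record = {"grade": "B", "age": 34, "phone": "555-0199"}
ok = all(rules[field](value) for field, value in record.items())
```

A record is accepted only if every field passes its rule; a single failing field rejects the whole entry.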
Type Checking
 Range of acceptable input values
 Admissible operations on the data
 How the data is handled on output
Validation Techniques
 Constraints definable with the DBMS
 Field level validation
 Input mask
 Validation rule, validation text, and the Required property
 Record level validation
 Control validation
 Form-level validation
 SQL constraints
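The DBMS-level constraints listed above can be demonstrated with a small sketch. SQLite is used here only as a convenient stand-in for any SQL DBMS; the table and column names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("""
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,                    -- entity integrity
        name  TEXT NOT NULL,                          -- required field
        mark  INTEGER CHECK (mark BETWEEN 0 AND 100)  -- range validation
    )""")
conn.execute("""
    CREATE TABLE enrolment (
        student_id INTEGER REFERENCES student(id)     -- referential integrity
    )""")

conn.execute("INSERT INTO student VALUES (1, 'Joe', 72)")        # accepted
try:
    conn.execute("INSERT INTO student VALUES (2, 'Mary', 150)")  # violates CHECK
except sqlite3.IntegrityError as e:
    rejected = str(e)   # erroneous data never reaches the table
```

Because the constraint lives in the DBMS rather than the application, every route into the table is validated, not just one form or program.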
More validation…
 Assertions
 General controls on content of the database
 Triggers
 General validation mechanism enabling execution of units of
code when certain related events occur
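A trigger of the kind described above can be sketched as follows. Again SQLite stands in for the DBMS, and the table and trigger names are illustrative only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER);
    CREATE TABLE audit  (order_id INTEGER, note TEXT);

    -- The trigger is a unit of code executed when a related event
    -- (here, an INSERT on orders) occurs.
    CREATE TRIGGER log_order AFTER INSERT ON orders
    BEGIN
        INSERT INTO audit VALUES (NEW.id, 'order created');
    END;
""")
conn.execute("INSERT INTO orders VALUES (1, 5)")
audit_rows = conn.execute("SELECT * FROM audit").fetchall()
```

The audit row appears without any application code writing it: the DBMS fires the trigger automatically when the event occurs.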
Back-up and Recovery
 Strategy for protecting against loss of database data
 Periodic copying of the data to some other storage
unit, placing the copy in a secure location
 Typically taken when the system is inactive
 Frequency dictated by:
 The rate of transactions applied to the database
 The level of availability of service demanded by the application
 The balance of time required to perform the backup compared
with the potential delay in recovery
Transaction Logs
 Journal recording the effect of all changes made to
the database by transactions run by application systems
 In the event of systems failure, it is possible to
recover the database by restoring the most recent
backup and then re-executing the transactions
recorded in the transaction log
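The restore-then-redo recovery described above can be illustrated with a toy sketch (this is not a real DBMS implementation; the log format with before- and after-images is a simplification):

```python
# Last periodic backup of the database, plus the journal of changes made since.
backup = {"Joe": 100, "Mary": 200}
log = [
    # (key, before-image, after-image) for each committed change
    ("Joe", 100, 80),
    ("Mary", 200, 220),
]

def recover(backup, log):
    db = dict(backup)                 # 1. restore the most recent backup
    for key, _before, after in log:   # 2. re-apply logged transactions in order
        db[key] = after
    return db

db = recover(backup, log)             # database state as of the last logged change
```

The before-images in the log are unused here, but a real DBMS keeps them so it can also undo transactions that were in flight at the time of the failure.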
Checkpoints
 Transaction status records taken at intervals during
normal processing of transactions
 At each checkpoint:
 Initiation of new transactions is temporarily suspended
 All memory buffers are flushed to disk (ensures that all
committed transactions have been actioned on the physical
database)
 All currently active transactions are noted and recorded in the
transaction log
Data Privileges and Permissions
 Multi-user databases require control over access to
and modification of the data
 Necessary to determine who can read or change the data
 Who has permission to:
 Create, change, or drop a table
 Grant privileges to others
Facilities in SQL
 Use the commands:
 GRANT
 REVOKE
 E.g.
GRANT SELECT, UPDATE ON Order TO Joe, Mary;
REVOKE UPDATE ON Order FROM Joe;
Facilities in MS Access
 The Administer permission allows a user or group to
change anyone's permissions
 Permissions are assigned per user/group, per object
 The Administer permission should only be granted
where necessary
For consideration…
 Describe the techniques that can be employed to
ensure data recovery in the event of a system failure
 Consider the techniques that can be employed to
ensure that data maintains its accuracy in the
database
NOSQL
 NoSQL encompasses a wide
variety of different database
technologies and were developed
in response to a rise in the
volume of data stored about
users, objects and products, the
frequency in which this data is
accessed, and performance and
processing needs.
 Relational databases, on the
other hand, were not designed to
cope with the scale and agility
challenges that face modern
applications, nor were they built
to take advantage of the cheap
storage and processing power
available today.
 Examples: MongoDB, CouchDB
 http://www.mongodb.com/learn/nosql
Why NoSQL
 NoSQL technology was pioneered by leading internet
companies, including Google, Facebook, Amazon,
and LinkedIn, to overcome the limitations of
relational database technology
 Enterprises are adopting NoSQL for a growing
number of use cases, a choice driven by four
megatrends: Big Users, Big Data, the Internet of
Things, and Cloud Computing
Revisiting Locks
 Data concurrency means that many users can
access data at the same time.
 Data consistency means that each user sees a
consistent view of the data, including visible changes
made by the user's own transactions and
transactions of other users.
Continued…
 Locks are mechanisms that prevent destructive
interaction between transactions accessing the same
resource.
Exclusive Locking
 A resource under an exclusive lock can be accessed
only by the operation holding the lock; all other
operations are rejected.
 An exclusive lock cannot be acquired while the object
holds other locks, and it is released only at the end
of the transaction.
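Exclusive locking can be observed directly with SQLite, used here as a stand-in DBMS: `BEGIN EXCLUSIVE` takes an exclusive lock on the database file, and a second connection is refused until the first transaction ends. (Setting `timeout=0` makes the refusal immediate rather than waiting; this demo assumes SQLite's default rollback-journal mode.)

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
# isolation_level=None puts the connection in autocommit mode,
# so transactions are controlled by explicit BEGIN/COMMIT statements.
a = sqlite3.connect(path, timeout=0, isolation_level=None)
b = sqlite3.connect(path, timeout=0, isolation_level=None)
a.execute("CREATE TABLE t (x INTEGER)")

a.execute("BEGIN EXCLUSIVE")              # connection A locks the database
a.execute("INSERT INTO t VALUES (1)")
try:
    b.execute("BEGIN EXCLUSIVE")          # connection B is rejected
    blocked = False
except sqlite3.OperationalError:          # "database is locked"
    blocked = True
a.execute("COMMIT")                       # lock released at end of transaction

b.execute("BEGIN EXCLUSIVE")              # now B can acquire the lock
b.execute("COMMIT")
```

Note that B succeeds only after A commits, matching the rule above that the exclusive lock is released at the end of the transaction.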
More on locking
 Shared locking: a locked resource can be read by other users,
but not modified.
 Update locking: the update lock exists to avoid deadlock; the data
is locked with an update lock first, and while it is held the data
can be read but not modified.
Locking
 Optimistic locking assumes that multiple
transactions can complete without affecting each
other, and that therefore transactions can proceed
without locking the data resources that they affect.
 Before committing, each transaction verifies that no
other transaction has modified its data. If the check
reveals conflicting modifications, the committing
transaction rolls back.
Optimistic locking
 When your application uses long transactions or
conversations that span several database
transactions, you can store versioning data, so that if
the same entity is updated by two conversations, the
last to commit changes is informed of the conflict,
and does not override the other conversation's work.
 This approach guarantees some isolation, but scales
well and works particularly well in Read-Often
Write-Sometimes situations.
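The versioning approach above is commonly implemented with a version column: an update succeeds only if the version is unchanged since the row was read. A sketch, again using SQLite as a stand-in and invented table/function names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")

def update_balance(conn, acc_id, new_balance, read_version):
    """Optimistic update: apply the change only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, acc_id, read_version))
    return cur.rowcount == 1   # False => conflict: another transaction got there first

first  = update_balance(conn, 1, 120, read_version=0)  # succeeds, version becomes 1
second = update_balance(conn, 1, 150, read_version=0)  # stale version: detected conflict
```

The second caller is informed of the conflict (the update matches no row) and can re-read and retry, rather than silently overwriting the first caller's work.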
Pessimistic locking
 Assumes that concurrent transactions will conflict
with each other, and requires resources to be locked
after they are read and only unlocked after the
application has finished using the data.
Database Deadlock
 One of the classic causes of a database deadlock is two
transactions inserting/updating tables in a different order
 E.g. Transaction A inserts into Table A then Table B
 while Transaction B inserts into Table B followed by Table A
 Such a scenario is always at risk of a database
deadlock (assuming you are not using the serializable
isolation level)
 With a lock-based concurrency control DBMS implementation, serializability
requires read and write locks (acquired on selected data) to be released at the
end of the transaction.
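The standard remedy for the ordering problem above is to make every transaction acquire its locks in one agreed global order, which removes the circular wait. A minimal sketch using Python threading locks to model table locks (the names are illustrative):

```python
import threading

# Two "table" locks, and a fixed global ordering over them (sorted by name).
lock_a, lock_b = threading.Lock(), threading.Lock()
TABLES = {"A": lock_a, "B": lock_b}
results = []

def transaction(name, tables_needed):
    # Every transaction sorts the tables it needs BEFORE locking them,
    # so no two transactions can ever hold locks in conflicting orders.
    for t in sorted(tables_needed):
        TABLES[t].acquire()
    results.append(name)                      # critical section: do the work
    for t in sorted(tables_needed, reverse=True):
        TABLES[t].release()

# T1 wants A then B; T2 wants B then A. Without the sort this pair
# can deadlock; with the sort both acquire A before B and cannot.
t1 = threading.Thread(target=transaction, args=("T1", ["A", "B"]))
t2 = threading.Thread(target=transaction, args=("T2", ["B", "A"]))
t1.start(); t2.start(); t1.join(); t2.join()
```

Both transactions run to completion because neither can hold B while waiting for A, which breaks the circular-wait condition required for deadlock.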
Dealing with Deadlocks
A deadlock occurs when two or more sessions are
waiting for data locked by each other, resulting in all
the sessions being blocked.
Oracle automatically detects and resolves deadlocks
by rolling back the statement associated with the
transaction that detects the deadlock.
Typically, deadlocks are caused by poorly
implemented locking in application code.
 Prevention
 Detection