Various Database Structures | CompTIA IT Fundamentals FC0-U61 | 5.2

In this video you will learn about various database structures such as:  structured vs. semi-structured vs. non-structured databases, relational databases, and non-relational databases.

Structured vs. Semi-Structured vs. Non-Structured

There are three different types of data structures:  structured, semi-structured, and non-structured.

Structured Data

A structured database is divided into rows and columns.  Think of data in a Microsoft Excel spreadsheet.

Data in a Microsoft Excel spreadsheet

The header row is the equivalent of the fields in a structured database, and each row is equivalent to a record.  Data mining is used on structured databases to find patterns, establish relationships, and forecast future trends.  It provides the information needed for data analysis. Although structured data is an excellent way to store some types of information, such as names, phone numbers, addresses, etc.; it is not suitable for everything.

Semi-Structured Data

Semi-structured data is a form of structured data that does not obey the formal structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate elements and enforce hierarchies of records and fields within the data. Instead of using rows and columns, semi-structured data uses elements and hierarchies. By using tags, it’s easier to store nested lists of varying lengths (i.e., vehicles owned) than it would be with a relational database.  Two common types of semi-structured data include XML (Extensible Markup Language) and JSON (JavaScript Object Notation).

JSON is often used for transfers between web servers and apps.  XML can be easily converted to JSON, and JSON is supported by a number of so-called NoSQL non-relational database apps as well as recent versions of relational databases.  One of the reasons that JSON is becoming popular is that it is more flexible than structured data. With structured data, changes in the database schema (the design) or data stored in the database itself can take a lot of time to make.  JSON enables quick changes in data and how it is stored.

JSON Viewer example

Non-Structured Data

Non-structured data, also known as unstructured data, is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Non-structured data is typically text-heavy, but may contain information such as email, word processing documents, presentations, instant messages, and others that are not stored in a data structure.  Techniques such as machine learning and semantic technology are used by data analytics apps to analyze the text in unstructured data for meaning and information.

Relational Databases

A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are typically accessed using SQL (Structured Query Language). The essential elements of a relational database include: schema, tables, fields/columns, and constraints.

Schema

The database schema of a database is its structure described in a formal language supported by the database management system (DBMS). The term schema usually refers to the structure of a database as described either in text or graphically.  The logical database schema defines the database structure (tables, views, and integrity constraints), whereas the physical schema deals with how the data is stored and indexed.  The schema is normally stored in a data directory. The database schema must be carefully planned before you begin to import or enter data into the database. Changing the schema after data is present requires a number of steps to avoid data loss.

Tables

A table is a collection of related data held in a table format within a database. It consists of columns (vertical) and rows (horizontal). Each column has a label. Any specific table has a specific number of columns, but the number of rows varies according to the amount of data in the table.

Rows/Records

Each row in a table is a record in the table.

Fields/Columns

In a standard table view, each column is a field.  Some columns are key fields.

Primary Key

A primary key contains a unique value for each row, such as user ID.  Each table must have a primary key.

Foreign Key

A foreign key relates (links) two database tables together.  A foreign key can be more than one column and it must be present in both database tables you are trying to relate.  The foreign key in one table must be the primary key or candidate key in another table. A candidate key is one or more columns that contain information unique to each record and doesn’t refer to any other data.  If a key is two or more columns, it is known as a composite key.

Constraints

Constraints are properties assigned to a column or set of columns that define what type of data can be placed in the column.  Constraints are used to prevent inconsistent data and to enforce data integrity. Constraints are not the same as data integrity settings but are set up at the same time when an SQL database is being created.  The following is a partial list of SQL constraints:

  • CHECK:  Determines that all values in a column satisfy specified conditions.
  • DEFAULT:  Uses the provided default value for a column when no value is specified in a record.
  • FOREIGN KEY:  Unique identifier for a row/record in another database table.
  • INDEX:  Uses a specified field for faster search and retrieval.
  • NOT NULL:  Prevents a column from having a NULL value.
  • PRIMARY KEY:  Unique identifier for each row/record in a database table.
  • UNIQUE:  Requires that all the values in a column be different.

Non-Relational Databases

A non-relational database is a database that does not incorporate the table/key model that relational database management systems (RDBMS) promote.  These kinds of databases require data manipulation techniques and processes designed to provide solutions to big data problems that big companies face.  The most popular emerging non-relational database is called NoSQL (also known as a flat file database). This type of database stores information in a single database, which eliminates the need to relate multiple tables together.  However, this type of database requires the designer to manually program the steps in a transaction (transactions are easy to do in an SQL database). There are two types of non-relational databases: key/value and document.

Key/Value Databases

A key/value database, or key/value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, and a data structure more commonly known today as a dictionary or hash table. Dictionaries contain a collection of objects, or records, which in turn have many different fields within them, each containing data. These records are stored and retrieved using a key that uniquely identifies the record, and is used to find the data within the database. Key/value databases support nested values as well.

Document Databases

A document database (also known as a document store or document-oriented database) is a computer program designed for storing, retrieving and managing document-oriented information, also known as semi-structured data such as XML or JSON.  A document database is a collection of documents and a way to query the contents of the document database. Along with key/value stores, a document database is one of the major types of NoSQL non-relational databases. A document database can store many types of information about the documents in the database, such as title, author, keywords, etc.