Oracle Database 21c review: The old RDBMS is new again

Major upgrade to the ever-evolving Oracle Database brings JavaScript support, graph optimizations, in-memory enhancements, and dramatic improvements to JSON operations and in-database machine learning.

Oracle Database 21c review: The old RDBMS is new again
Thinkstock
At a Glance

Oracle Database 21c, the new release of the longtime industry leading RDBMS, is currently available in the Oracle Cloud, where it can be deployed as a Virtual Machine DB System (for clusters and single instance) or a Bare Metal DB System (single instance). It’s also available in the Always Free Autonomous Database Service, with limited storage and CPU capacity. It will be available more broadly later this year, including on-premises.

While Oracle Database 20c was scheduled for release in 2020, that release was cancelled because of the COVID-19 pandemic and its effect on customer IT departments. All of the features planned for 20c were rolled into 21c.

There are several notable points to be made about Oracle Database 21c. It can serve as a transactional, analysis, or enterprise database, and tune itself for its usage. It supports relational, time series, graph, spatial, text, OLAP, XML, and JSON data. It has in-database training of machine learning models that includes AutoML, and that training can take advantage of the extra CPUs in storage servers on Exadata hardware. Speaking of Exadata hardware, Oracle Database 21c takes full advantage of its Intel Optane persistent memory, and the Autonomous Database supports automatic index creation, which is also available for Oracle Databases running on Exadata, in the cloud or on-premises.

For analysis, Oracle Database 21c can automatically configure which columns to place in memory, can perform SIMD vectorized hash joins, and can complete hybrid scans against in-memory columns and on-disk row stores. For in-database programming, Oracle Database 21c adds support for JavaScript, along with interfaces between JavaScript and Oracle Database types, SQL, and PL/SQL stored procedures.

Oracle claims high performance for all of its converged data types, even compared against databases specialized for JSON documents or graphs. The hard benchmark numbers for Oracle Database 21c are not yet available, however.

Oracle Database history

Oracle Database 21c is the result of four decades of development. Oracle Database was the first commercial SQL-based relational database (1979), implemented based on a published IBM specification. (IBM held off releasing its own relational database, DB2, to avoid cannibalizing sales of its hierarchical database, IMS.)

Briefly, Oracle Version 3 (1983) was a rewrite in portable C, to run on mainframes, minicomputers, and personal computers. Version 6 had enhancements to disk I/O, row locking, scalability, and backup and recovery. Oracle7 (1992) introduced PL/SQL stored procedures and triggers. Oracle8 had objects and table partitioning. Oracle8i (1999) provided native support for internet protocols and server-side support for Java. Oracle9i introduced Oracle Real Application Clusters (RAC) in 2001; RAC is a shared disk, multiple instance architecture. Oracle Database 10g introduced grid computing in 2003.

Oracle Database 11g (2007) improved manageability, diagnosability, and availability. Oracle Database 12c (2013) was designed for the cloud, featuring a new multitenant architecture, in-memory column store, and support for JSON documents. And Oracle Database 21c improves the developer experience with features such as Oracle Blockchain Tables and native JSON data types. Enhancements to Automatic In-Memory make the in-memory column store largely self-managing.

The Oracle Database is only one of Oracle’s many product lines. Oracle also offers nearly a dozen application development tools for Oracle Database, four data integration tools, nine other databases including MySQL and Berkeley DB, and half a dozen big data products — and that’s without going into Java, storage, operating systems, applications, or the Oracle Cloud.

Oracle Database overview

The Oracle Database 21c Technical Architecture document runs to 40 pages with block diagrams on almost every page (see the first two below). I don’t have room for that, nor would you want to read it here. Instead, let’s hit the high points.

First of all, Oracle is a converged database engine. It offers many data types and data models, several workload types, and a bunch of ancillary capabilities for developers and analysts.

Oracle Database supports relational, time series, graph, spatial, text, OLAP, XML, and JSON data — all at once. It supports transactional (OLTP), analytic (OLAP), machine learning (ML), IoT, streaming, and blockchain workloads. It has integrated microservices, events, REST services, in-database ML training, CI/CD, and low-code application development. Again, that’s all in one database, rather than requiring an assortment of special-purpose databases. Oracle contends that a converged database engine saves customers money, reduces data transport and transformation, and simplifies analytics and applications.

Oracle offers database availability on-premises, in its public cloud, and in an extension of its cloud running on-prem. Oracle Database can be hosted, co-managed, or fully managed (autonomous).

Autonomous Database can perform automated scaling, automated tuning, automated provisioning, automated fault management, and automated patching. Automated tuning means that the execution of your Oracle Databases is continually analyzed and tuned to optimize database performance for application development, production transaction processing, and data warehousing workloads that change over time. That includes creating indexes based on usage, which is especially important for data warehouses.

Exadata hardware and software underlies Autonomous Database. More explicitly, Exadata powers Oracle Exadata Database Machine (dedicated on-prem), Oracle Exadata Cloud Service, and Oracle Exadata Cloud@Customer (cloud on-prem). Exadata Database Machine X8M (the latest model) has a high-availability, scale-out design, separate database and storage servers, uses Intel Optane persistent memory (PMEM) as part of a layered storage system, and uses in-memory columnar hybrid compressed format with vector processing for analytics and reporting. X8M takes advantage of remote direct memory access (RDMA) over converged 100Gbps Ethernet (RoCE) for internal interconnects, which completely bypasses the normal operating system stacks.

An Oracle database is a multi-tenant container database (CDB) that holds one system seed pluggable database and any number of user-created pluggable databases (PDBs). (See the third diagram below.) Users interact only with the PDBs, and a user may never realize that the PDB they are using isn’t a standalone database. In the past, Oracle also supported non-CDB database instances, but that option is no longer supported in Oracle Database 21c.

An Oracle Real Application Clusters (RAC) database architecture consists of multiple instances that run on separate server machines, all sharing a single database. The cluster presents a single end point to users and applications, through a listener process. RAC offers even higher availability than single Exadata machines.

Traditional Oracle databases were strictly disk-based, which limited their performance. In recent years, Oracle added an in-memory option for “hot” data. Oracle Database In-Memory is a suite of features that greatly improves performance for real-time analytics and mixed workloads. The in-memory column store is the key feature of Database In-Memory. Transactional queries always go to the row store; the in-memory column store is only for analytic and reporting queries.

oracle database 21c 01 Oracle

Oracle Database 21c server block diagram. This is the highest-level diagram. Note that the Listener process often runs outside the server, especially in clustered environments.

oracle database 21c 02 Oracle

Oracle Database instance diagram. The instance handles memory and processes.

oracle database 21c 03 Oracle

An Oracle Database is a multi-tenant container database (CDB) with multiple user-created pluggable databases (PDB).

What’s new in Oracle Database 21c

According to Oracle, the 200-plus new features in Oracle Database 21c fall into six categories: application development; big data and data warehousing solutions; database upgrade and utilities; management solutions; performance and high availability options; and security solutions.

Reading through all the improvements, it is clear to me that Oracle Database 21c is a mature database being improved by smart people in multiple ways to make it work better, faster, and more reliably, especially in the cloud. There are way too many new features to cover them all. I’ll pick out a few of the high-level points, mostly following William Hardie’s overview blog post.

Blockchain tables

While I’m something of a blockchain skeptic, Oracle’s addition of a blockchain ledger to its converged database will probably be useful for financial institutions. Basically, a blockchain is an immutable table that chains rows together cryptographically. The Oracle implementation uses an SHA2-512 computation over the current and previous rows as a hash.

There are a few deviations from immutability in the implementation, however. For one, rows can be set to NO DELETE UNTIL n DAYS AFTER INSERT, so that, for example, the blockchain table can conform to GDPR’s “right to delete” requirement when used for an external user table, after the retention period has expired.

Blockchains are normally supposed to be distributed; an Oracle database is typically centralized. Oracle does have a separate distributed Blockchain Platform. Oracle suggests using the integrated blockchain table in Oracle Database 21c when you don’t really need a distributed ledger, for example when the blockchain is used internally by an enterprise for financials or compliance data.

Native JSON type

Oracle Database has had JSON support since version 12c, but only within other data types (VARCHAR2, CLOB, or BLOB). Now it also has a native JSON type, which is stored in binary format, similar to the JSONB used by MongoDB. As a native type, JSON parsing happens only on inserts. This can result in read and update operations being four or five times faster than text-based JSON, and updates to very large JSON documents being 20 to 30 times faster.

In-database JavaScript execution

While the Oracle Database has supported running programming languages within the server for decades (PL/SQL since 1992, Java since 1999, and so on), Oracle Database 21c adds support for JavaScript via the Oracle Multilingual Engine (MLE), powered by GraalVM. The MLE automatically maps JavaScript data types to Oracle Database data types and vice versa. The JavaScript code itself can execute PL/SQL (stored procedures) and SQL through a built-in JavaScript module.

SQL macros

Oracle developers have long used PL/SQL stored procedures and table functions to simplify SQL queries. Unfortunately, the Oracle PL/SQL and SQL engines operate in different contexts, and calling PL/SQL from SQL causes an expensive context switch. To avoid this, Oracle Database 21c adds a SQL macro capability with the new SQL_MACRO keyword, which causes functions to run in the SQL engine and avoid the context switch.

In-memory database enhancements

Database In-Memory used to require considerable management from users or DBAs. Now you can set INMEMORY_AUTOMATIC_LEVEL to HIGH, and all columns will be considered for in-memory analysis.

In addition, in-memory hash joins have been optimized using SIMD vectorization, resulting in up to 10x speedups. Also, columnar scans have been sped up for the case where not every column is in memory. When that happens, the optimizer can now perform a hybrid scan and fetch projected column values from the row store if needed, instead of needing to perform the scan entirely on the row store.

Optimized graph models

Graph databases can have millions or even billions of edges and vertices. In Oracle Database 21c, the graph representation has been optimized, allowing the analysis of larger graphs using less memory than previous versions. This version also allows developers to create or extend graph algorithms in Java. These user-defined graph algorithms can run as quickly as native algorithms.

Persistent memory (PMEM) support

Oracle Database 21c added support for a Persistent Memory (PMEM) store on single instance databases. This allows for storage of data and redo in local PMEM on commodity hardware. Oracle Database 21c runs SQL directly on data stored in the mapped PMEM file system, eliminating the need for a large buffer cache and reduces the code path. And it provides much faster transaction durability and near-instant recovery. This feature is especially useful for workloads that tend to be I/O bound.

Oracle Machine Learning

While Oracle performed in-database machine learning in previous database versions, Oracle Database 21c offers a new AutoML option and new machine learning algorithms. Oracle AutoML does automatic algorithm selection, automatic feature engineering, and automatic model tuning.

At a Glance
  • Oracle Database 21c is the newest in a long line of relational databases going back to 1979. It is now a converged database, meaning that it supports relational, time series, JSON, graph, spatial, text, OLAP, and XML data all at once in the same database, and supports transactional, analytic, machine learning, IoT, streaming, and blockchain workloads.

    Pros

    • Supports relational, time series, JSON, graph, spatial, text, OLAP, and XML data
    • Supports transactional, analytic, machine learning, IoT, streaming, and blockchain workloads
    • Cloud pricing is competitive
    • Machine learning and AutoML can run in-database
    • Multiple languages can run in-database including JavaScript

    Cons

    • Limited availability, currently only in Oracle Cloud
    • On-premises pricing has historically been viewed as high by customers
1 2 Page 1
Page 1 of 2