What Is a Vertical Database?
By David Dunning
A vertical database is one in which the physical layout of the data is column-by-column rather than row-by-row. Rather than being arranged in horizontal record structures and processed vertically, data in a vertical database is arranged in vertical structures, known as predicate trees, or P-trees, and processed horizontally.
Horizontal databases are suitable for applications where the requested result is a set of horizontal records, but less so for applications such as data mining, where researchers are typically interested in results that can be expressed succinctly, such as counts or patterns, rather than in long lists of records. P-trees, on the other hand, are well suited to data mining. P-trees are usually created by decomposing each attribute, or column, of a table of horizontal records into separate bit vectors, compact arrays that hold one bit per record. P-trees can be one-dimensional, two-dimensional or multi-dimensional; if the data to be stored in the database has natural dimensions, as geospatial data or geographic information does, the dimensions of the P-tree are matched to those of the data.
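The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not a P-tree implementation: it covers only the first step, splitting one column into bit vectors, and leaves out the tree structure built on top of them. The function names, the sample column and the three-bit width are all assumptions made for the example.

```python
def bit_slices(column, width):
    """Split a column of non-negative integers into `width` bit vectors.

    slices[0] holds the most significant bit of every value,
    slices[-1] holds the least significant bit.
    """
    return [[(value >> bit) & 1 for value in column]
            for bit in range(width - 1, -1, -1)]


def rebuild_column(slices):
    """Recombine bit vectors into the original integer column."""
    column = [0] * len(slices[0])
    for bit_vector in slices:
        # Shift what we have so far and append the next bit of each value.
        column = [(value << 1) | bit for value, bit in zip(column, bit_vector)]
    return column


# Hypothetical three-bit attribute: 5 = 101, 3 = 011, 6 = 110, 1 = 001.
ages = [5, 3, 6, 1]
slices = bit_slices(ages, width=3)

print(slices[0])  # most significant bits:  [1, 0, 1, 0]
print(slices[1])  # middle bits:            [0, 1, 1, 0]
print(slices[2])  # least significant bits: [1, 1, 0, 1]
print(rebuild_column(slices) == ages)  # True
```

Each bit vector has one entry per record, so the original column can always be rebuilt from its slices, as the last line checks.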
Data in a vertical database is processed through fast logical operators, such as AND, OR, exclusive OR and complement. Furthermore, because data is arranged column-wise rather than row-wise, queries, or searches, can run without accessing pages on a hard disk that aren’t relevant to the query, which speeds up data retrieval. This is an important consideration when data mining in very large data repositories.
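To make the logical operators concrete, the sketch below packs two bit vectors into Python integers and combines them with AND, OR, exclusive OR and complement. The column names (`is_urban`, `is_wet`), the six sample rows and the helper functions are hypothetical; a real vertical database would apply the same operators to far larger, compressed structures.

```python
def to_bitmap(flags):
    """Pack a list of 0/1 flags into one integer; row i is stored in bit i."""
    bitmap = 0
    for i, flag in enumerate(flags):
        if flag:
            bitmap |= 1 << i
    return bitmap


def matching_rows(bitmap, n_rows):
    """List the row numbers whose bit is set."""
    return [i for i in range(n_rows) if (bitmap >> i) & 1]


def count_rows(bitmap):
    """Count the set bits, i.e. the number of matching rows."""
    return bin(bitmap).count("1")


N_ROWS = 6
ALL_ROWS = (1 << N_ROWS) - 1  # mask that keeps the complement within the table

# Two hypothetical one-bit columns over the same six rows.
is_urban = to_bitmap([1, 1, 0, 1, 0, 0])
is_wet = to_bitmap([1, 0, 0, 1, 1, 0])

both = is_urban & is_wet            # AND
either = is_urban | is_wet          # OR
exactly_one = is_urban ^ is_wet     # exclusive OR
not_urban = ~is_urban & ALL_ROWS    # complement

print(matching_rows(both, N_ROWS))         # [0, 3]
print(matching_rows(either, N_ROWS))       # [0, 1, 3, 4]
print(matching_rows(exactly_one, N_ROWS))  # [1, 4]
print(matching_rows(not_urban, N_ROWS))    # [2, 4, 5]
print(count_rows(both))                    # 2
```

Only the two columns named in the query are touched; any other attribute of the table is never read, which is the point made above about skipping irrelevant pages.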
Another advantage of vertical databases is that they allow data to be stored in large pages. A large page size means that a large number of relevant data items can be retrieved in a single read operation. By contrast, a single read operation on a horizontal database retrieves not only relevant data items, but also attributes, or columns, that aren’t relevant to the query in question, which is why horizontal layouts favor small page sizes.
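A back-of-the-envelope calculation shows the size of the effect. The model below is a deliberate simplification with assumed figures (one million rows, 20 fixed-width columns of 4 bytes each, 8 KB pages, a query that needs a single column); it ignores compression, indexes, caching and page headers, so the numbers are illustrative only.

```python
def pages_needed(total_bytes, page_bytes):
    """Number of pages required to hold `total_bytes` (integer ceiling)."""
    return -(-total_bytes // page_bytes)


# Assumed figures for illustration only.
n_rows = 1_000_000
n_columns = 20
value_bytes = 4
page_bytes = 8192

# Row layout: a scan of one column still reads every page of whole records.
row_layout_pages = pages_needed(n_rows * n_columns * value_bytes, page_bytes)

# Column layout: the scan reads only the pages that hold the requested column.
column_layout_pages = pages_needed(n_rows * value_bytes, page_bytes)

print(row_layout_pages)     # 9766
print(column_layout_pages)  # 489
```

Under these assumptions the column layout reads roughly one twentieth of the pages, because every byte on each page it fetches belongs to the requested column.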
Vertical databases have received renewed interest from the scientific community in recent years. The number of simultaneous users in scientific database applications is typically much smaller than in commercial applications, but users tend to submit more complex, unforeseen queries. In addition, scientific database applications must typically provide a more automated response to complex queries because of the absence of database and systems support staff. Scientific users typically prefer to work with dedicated, in-house computer systems, so scientific database applications need to be portable between different models of computer. On all these counts, vertical databases are better suited than their horizontal counterparts.
A full-time writer since 2006, David Dunning is a professional freelancer specializing in creative non-fiction. His work has appeared in "Golf Monthly," "Celtic Heritage," "Best of British" and numerous other magazines, as well as in the book "Defining Moments in History." Dunning has a Master of Science in computer science from the University of Kent.