Traverse

OrientDB is a graph database. This means that the focal point is on relationships (links) and how they are managed. The standard SQL language is not enough to work with trees or graphs because it lacks the recursion concept. This is the reason why OrientDB provides a new command to traverse trees and graphs: TRAVERSE. Traversing is the operation that crosses relationships between records (documents, vertexes, nodes, etc). This operation is much much faster than executing a JOIN in a Relational database.

The main concepts of Traversal are:

  • target, as the starting point where to traverse records. Can be:
  • fields, the fields to traverse. Use *, any() or all() to traverse all fields in a document
  • limit, the maximum number of records to retrieve
  • predicate, as the predicate to execute against each traversed document. If the predicate returns true, the document is returned, otherwise it is skipped
  • strategy, indicates how the graph traversed:

Traversing strategies

DEPTH_FIRST strategy

This is the default strategy used by OrientDB for traversal. It explores as far as possible along each branch before backtracking. It's implemented using recursion. To know more look at Depth-First algorithm. Below the ordered steps executed while traversing the graph using DEPTH_FIRST strategy:

Depth-first-tree

BREADTH_FIRST strategy

It inspects all the neighboring nodes, then for each of those neighbor nodes in turn, it inspects their neighbor nodes which were unvisited, and so on. Compare BREADTH_FIRST with the equivalent, but more memory-efficient iterative deepening DEPTH_FIRST search and contrast with DEPTH_FIRST search. To know more look at Breadth-First algorithm. Below the ordered steps executed while traversing the graph using BREADTH_FIRST strategy:

Breadth-first-tree

Context variables

During traversal some context variables are managed and can be used by the traverse condition:

  • $depth, as an integer that contain the depth level of nesting in traversal. First level is 0
  • $path, as a string representation of the current position as the sum of traversed nodes
  • $stack, as the stack current node traversed
  • $history, as the entire collection of visited nodes

The following sections describe various traversal methods.

SQL Traverse

The simplest available way to execute a traversal is by using the SQL Traverse command. For instance, to retrieve all records connected from and to Movie records up to the 5th level of depth:

for (OIdentifiable id : new OSQLSynchQuery<ODocument>("traverse in, out from Movie while $depth <= 5")) {
  System.out.println(id);
}

Look at the command syntax for more information.

Native Fluent API

Native API supports fluent execution guaranteeing compact and readable syntax. The main class is OTraverse:

  • target(<iter:Iterable<OIdentifiable>>), to specify the target as any iterable object like collections or arrays of OIdentifiable objects.
  • target(<iter:Iterator<OIdentifiable>>), to specify the target as any iterator object. To specify a class use database.browseClass(<class-name>).iterator()
  • target(<record:OIdentifiable>, <record:OIdentifiable>, ... ), to specify the target as a var ars of OIterable objects
  • field(<field-name:string>), to specify the document's field to traverse. To add multiple field call this method in chain. Example: .field("in").field("out")
  • fields(<field-name:string>, <field-name:string>, ...), to specify multiple fields in one call passing a var args of Strings
  • fields(Collection<field-name:string>), to specify multiple fields in one call passing a collection of String
  • limit(<max:int>), as the maximum number of record returned
  • predicate(<predicate:OCommandPredicate>), to specify a predicate to execute against each traversed record. If the predicate returns true, then the record is returned as result, otherwise false. it's common to create an anonymous class specifying the predicate at the fly
  • predicate(<predicate:OSQLPredicate>), to specify the predicate using the SQL syntax.

In the traverse command context iContext you can read/put any variable. Traverse command updates these variables:

  • depth, as the current depth of nesting
  • path, as the string representation of the current path. You can also display it. Example: select $path from (traverse * from V)
  • stack, as the List of operation in the stack. Use it to access to the history of the traversal. It's a List<OTraverseAbstractProcess<?>> where process implementations are:
    • OTraverseRecordSetProcess, usually the first one it's the base target of traverse
    • OTraverseRecordProcess, represent a traversed record
    • OTraverseFieldProcess, represent a traversal through a record's field
    • OTraverseMultiValueProcess, use on fields that are multivalue: arrays, collections and maps
  • history, as the set of records traversed as a Set<ORID>.

Example using an anonymous OCommandPredicate as predicate

for (OIdentifiable id : new OTraverse()
              .field("in").field("out")
              .target( database.browseClass("Movie").iterator() )
              .predicate(new OCommandPredicate() {

    public Object evaluate(ORecord iRecord, ODocument iCurrentResult, OCommandContext iContext) {
      return ((Integer) iContext.getVariable("depth")) <= 5;
    }
  })) {

  System.out.println(id);
}

Example using the OSQLPredicate as predicate

for (OIdentifiable id : new OTraverse()
              .field("in").field("out")
              .target(database.browseClass("Movie").iterator())
              .predicate( new OSQLPredicate("$depth <= 5"))) {

  System.out.println(id);
}

Other examples

OTraverse gets any Iterable, Iterator and Single/Multi OIdentifiable. There's also the limit() clause. To specify multiple fields use fields(). Full example:

for (OIdentifiable id : new OTraverse()
              .target(new ORecordId("#6:0"), new ORecordId("#6:1"))
              .fields("out", "int")
              .limit(100)
              .predicate( new OSQLPredicate("$depth <= 10"))) {

  System.out.println( id);
}