1. Summary
A vertex condition is a set of one or more constraints applied to vertices when evaluating graph queries, expressed as key/value pairs within a dictionary.
1.1. Syntax
The general syntax of <vertex_condition> is a Python dictionary:
<vertex_condition> ::= {
<constraint_1>,
<constraint_2>,
...
}
where <constraint_n> ::= <constraint_name> : <constraint_value>
As a shorthand, when the only constraint is vertex name <vertex_condition> may be a string:
<vertex_condition> :: = "<str>" | "<prefix>*"
1.2. Constraints
The following constraints are available for use in a vertex condition. Multiple constraints are evaluated using AND-logic, i.e. all must evaluate to True for vertex condition to match.
| Name | Syntax | Description |
|---|---|---|
|
Controls whether to restrict vertex matching to vertices of a particular manifestation. |
|
|
Restrict vertex matches to those of the specified type. |
|
|
Vertices must have a certain number of incident arcs. |
|
|
Vertices must have a certain number of inarcs. |
|
|
Vertices must have a certain number of outarcs. |
|
|
Restrict on vertex identifier. When a string is used it may contain a wildcard '*' at the end to specify prefix match. A list can also be used to specify individual ids (as vertex instances or non-wildcard strings.) |
|
|
Match vertex timestamps relative to an absolute time constraint. |
|
|
Match vertex timestamps relative to the current time. |
|
|
Restrict vertex matches to those vertices with a vector similar to a probe vector. |
|
|
Restrict vertex matches to those vertices containing the specified property or properties. |
|
|
Restrict vertex matches to those for which <expression> evaluates to a positive numeric value. The expression is evaluated before |
|
|
Restrict vertex matches to those vertices which have at least one neighbor matching the |
|
|
Same behavior as |
|
|
Restrict vertex matches to those for which <expression> evaluates to a positive numeric value. The expression is evaluated after |
2. Evaluation Order
Constraints are evaluated in the following order.
Order matters for execution performance because evaluation is terminated (producing a vertex condition miss) as soon as a non-matching constraint is encountered. Writing smart vertex conditions that eliminate as many matches as possible as early as possible will result in more efficient algorithms.
-
degree(when value condition is simple numeric) -
degree(when degree condition is a value range or includes arc filter) -
abstimeandreltime(abstime and reltime conditions are merged) -
similarity : hamdist(hamming distance) -
similarity : score(cosine / jaccard) -
adjacent : neighbor(exact ID check if included in neighbor constraint) -
adjacent : neighbor(may be recursive) -
traverse : neighbor(exact ID check if include in neighbor constraint) -
traverse : neighbor(may be recursive)
3. Vertex Filters
3.1. virtual - Vertex Manifestation Constraint
The virtual constraint controls whether to restrict vertex matching to vertices of a particular manifestation.
{ 'virtual' : <boolean_condition> }
3.1.1. Remarks
If <boolean_condition> is False only REAL vertices will be matched.
If <boolean_condition> is True only VIRTUAL vertices will be matched.
If the virtual constraint is not included in a vertex condition both VIRTUAL and REAL vertices will be matched.
3.2. type - Vertex Type Name
The type constraint will restrict vertex matches to those of the specified type.
3.2.1. Syntax
{ 'type' : <type_name> }
3.2.2. Remarks
The type constraint can be used to restrict matches to vertices of the specified type. When <type_name> is a string, this must match the vertex type exactly. Wildcards are not supported for type matching, with the exception of full wildcard * which will match any vertex type. When <type_name> is None, only typeless vertices are matched.
3.2.3. Examples
# Match only vertices of type "person"
{ 'type' : "person" }
# Match all vertices (same as not specifying type filter)
{ 'type' : "*" }
# Special case, match only typeless vertices
{ 'type' : None }
3.3. degree - Number of Vertex Arcs
The degree constraint will match vertices with a certain number of incident arcs, optionally restricted by arc direction or by other arc conditions.
A shorthand for restricting arc direction is provided by the indegree and outdegree constraints.
3.3.1. Syntax
{ 'degree' : <degree_condition> }
{ 'indegree' : <degree_condition> } # D_IN implied
{ 'outdegree' : <degree_condition> } # D_OUT implied
The <degree_condition> is one of:
# Apply <value_condition> to a vertex degree attribute # (degree, indegree, or outdegree) <value_condition> # Apply <value_condition> to the number of arcs incident # on vertex matching <arc_filter> ( <arc_filter>, <value_condition> )
3.3.2. Remarks
When <degree_condition> is a <value condition> it is applied directly to the vertex degree attribute. When <degree_condition> is ( <arc_condition>, <value_condition> ) the value condition is applied to the count obtained by evaluating the <arc_condition> for all arcs incident on the vertex.
The simple value condition is faster to evaluate because the value to compare is readily available as a vertex attribute.
The arc-conditional evaluation requires recursive processing at each vertex to compute the number of arcs matching the arc condition.
3.3.3. Examples
# Vertex degree must be exactly 10
neighbor = { 'degree' : 10 }
# Vertex degree must be > 10
neighbor = { 'degree' : (V_GT,10) }
# Vertex indegree must be between 5 and 15
neighbor = { 'indegree' : (V_RANGE, (5,15)) }
# Vertex must have exactly 20 inarcs with the "likes" relationship
neighbor = { 'indegree' : ( "likes", 20 ) }
# Vertex must have exactly 20 inarcs with the "likes" relationship
# having a counter value greater than 5.
neighbor = { 'degree' : ( ("likes",D_IN,M_CNT,V_GT,5), 20 ) }
# Vertex must have < 20 inarcs with the "likes" relationship
# having a counter value greater than 5.
neighbor = { 'degree' : ( ("likes",D_IN,M_CNT,V_GT,5), (V_LT,20) ) }
# Vertex must have between 15 and 25 inarcs with the "likes"
# relationship having a counter value between 3 and 7.
neighbor = {
'degree' : (
( "likes", D_IN, M_CNT, V_RANGE, (3,7) ),
(V_RANGE, (15,25))
)
}
3.4. indegree - Number of Vertex Inarcs
3.5. outdegree - Number of Vertex Outarcs
3.6. id - Vertex Identifier Name
The id constraint will restrict vertex matches to those with specific identifier(s).
3.6.1. Syntax
{ 'id' : <exact> }
{ 'id' : <prefix> }
{ 'id' : <instance> }
{ 'id' : [<exact1>, <exact2>, ...] }
3.6.2. Remarks
The id constraint can be used to restrict matches to vertices with specific name(s). Four types of constraint values are supported:
-
<exact>(string) : A vertex with this exact name will match -
<prefix>(string ending with*) : Vertices with names starting with this prefix will match -
<instance>(pyvgx.Vertex) : Only this exact vertex instance will match -
[<exact1>, <exact2>, …]: Vertices with any of these exact names will match
3.6.3. Examples
# Match only the vertex "Alice"
{ 'id' : "Alice" }
# Match vertices starting with "Ali", including "Ali", "Alice", "Alien"
{ 'id' : "Ali*" }
# No id constraint
{ 'id' : "*" }
# Assuming A=graph.OpenVertex( "Alice" ), this will match "Alice"
{ 'id' : A }
# Match vertices "Alice", "Bob" or "Charlie"
{ 'id' : ["Alice", "Bob", "Charlie"] }
3.7. abstime - Absolute Vertex Timestamps
This constraint specifies timestamp conditions that are evaluated relative to the absolute timestamp, i.e. seconds since 1970.
3.7.1. Syntax
{
'abstime' : {
'created' : <value_condition>,
'modified' : <value_condition>,
'expires' : <value_condition>
}
}
3.7.2. Remarks
The timestamp conditions 'created', 'modified' and 'expires' are all optional and may be specified at the same time. When more than one of these are specified they must all match, i.e. AND logic applies.
A timestamp <value_condition> is specified as absolute time, i.e. seconds since 1970.
3.8. reltime - Relative Vertex Timestamps
This constraint specifies timestamp conditions that are evaluated relative to the current time.
3.8.1. Syntax
{
'reltime' : {
'created' : <value_condition>,
'modified' : <value_condition>,
'expires' : <value_condition>
}
}
3.8.2. Remarks
The timestamp conditions 'created', 'modified' and 'expires' are all optional and may be specified at the same time. When more than one of these are specified they must all match, i.e. AND logic applies.
A timestamp <value_condition> is specified relative to the current time. Positive values indicate the number of seconds into the future relative to current time. Negative values indicate a number of seconds in the past relative to current time.
3.9. similarity - Vertex Vector Similarity
The similarity constraint is used to restrict vertex matches to those vertices with a vector similar to a probe vector.
Similarity of two vectors can be expressed using the similarity score or the hamming distance of the vectors' fingerprints.
3.9.1. Syntax
# Similarity score, e.g. Cosine or Jaccard
{
'similarity': {
'vector' : <probe_vector>,
'score' : <value_condition>
}
}
# Hamming distance
{
'similarity': {
'vector' : <probe_vector>,
'hamdist' : <value_condition>
}
}
# Cosine/Jaccard AND Hamming distance
{
'similarity': {
'vector' : <probe_vector>,
'score' : <value_condition>,
'hamdist' : <value_condition>
}
}
3.9.2. Probe Vector Syntax
There are four ways to express the <probe_vector>:
-
A pyvgx.Vertex instance:
{ 'vector' : <pyvgx.Vertex> } -
A Python list:
{ 'vector' : [ ( <term1> , <weight1> ), ( <term2> , <weight2> ), ... ] } -
A pyvgx.Vertex instance:
# use the vector of this vertex instance { 'vector' : <pyvgx.Vertex> } -
The literal string
"tail"(applicable for neighborhood queries):# use the vector of the anchor vertex { 'vector' : "tail" }
3.9.3. Similarity Measures
When 'score' is used, a similarity score from 0.0 - 1.0 is computed using the currently active similarity parameters for the graph, typically a combination of Cosine and Jaccard computed between the vertex vector and the <probe_vector>. The <value_condition> must express matching for a value in the appropriate 0.0 - 1.0 range.
When 'hamdist' is used, the hamming distance of the fingerprints for the vertex vector and the <probe_vector> is computed. Fingerprints are 64-bit integers. The <value_condition> must express matching for a value in the range 0 - 64.
3.10. property - Vertex Property Constraint
This constraint is used to match vertices based their property names and property values.
3.10.1. Syntax
{
'property' : {
<prop1> : <value_condition>,
<prop2> : <value_condition>,
...
}
}
3.10.2. Remarks
The property constraint may contain any number of conditions. When multiple <prop> : <value_condition> entries are given they must all match, i.e. AND logic applies.
To test for the existence of a property regardless of its value specify None for the <value_condition>.
Property values can be numeric (integer or float) or strings. Prefix matching on strings is supported, i.e. a wildcard * is permitted at the end of the string condition.
3.10.3. Examples
# Only match vertices that:
# have a numeric property 'price' with value less than 100.0
# AND
# have a string property 'color' with value equal to 'black'
# AND
# have a string property 'manufacturer' with a value starting
# with 'Mic'
# AND
# have a property called 'screen_size' regardless of
# value / value type
{
'property' : {
'price' : (V_LT, 100.0),
'color' : 'black'
'manufacturer' : 'Mic*',
'screen_size' : None
}
}
3.11. filter - General Filter Expression
This constraint uses the general VGX Expression Language to express arbitrarily complex filters.
3.11.1. Syntax
{
'filter' : <expression>
}
3.11.2. Remarks
This constraint is a match if <expression> evaluates to a positive numeric value, or produces a string, vertex, or other non-NULL object.
The expression must be a local evaluator, i.e. it cannot reference any of the next* arc attributes or head vertex attributes/properties, or perform any stage*() or collect*() operations. (Use adjacent or traverse filters for non-local evaluators.)
| Due to the no short-circuiting policy in the expression language it is important to take advantage of the overall vertex condition’s other constraints that can be applied before the filter expression is evaluated. Although it may be tempting to pack all constraints into a single evaluator expression this is usually not the most efficient way to write vertex conditions. |
3.11.3. Examples
{
'filter' : "vertex['score'] in range( 2.5, 7.5 ) && .id == '*s'"
}
3.12. adjacent - Neighborhood Adjacency Constraint
This constraint requires a vertex to be adjacent to at least one other vertex in the graph according to a set of criteria, all of which are optional.
3.12.1. Syntax
{
'adjacent' : {
'arc' : <arc_condition>,
'filter' : <expression>,
'neighbor' : <vertex_condition>,
'assert' : <bool>
}
}
3.12.2. adjacent : arc
The <arc condition> matches arcs between a vertex and its neighbors. The default arc condition is to accept all arcs in any direction, i.e. ("*", pyvgx.D_ANY). This condition is evaluated first.
3.12.3. adjacent : filter
The filter expression is evaluated for each matching arc. It must evaluate to a positive numeric value in order to match, which is the default.
3.12.4. adjacent : neighbor
Each matching arc which also passes the filter test is traversed to the neighbor vertex, where a new vertex condition is evaluated recursively. The default vertex condition is * which means no traversal occurs. If this condition is a match the overall adjacency constraint is a match.
3.12.5. adjacent : assert
The adjacent constraint evaluates to True or False depending on whether all specified criteria are met. However, by setting 'assert' : True or 'assert' : False it is possible to explicitly control the effect the adjacent constraint has on the containing vertex condition. It may be desirable to continue the query even if the adjacent constraint is not met, which, for instance, would be the case when using adjacent to modify a running score.
3.12.6. Remarks
The adjacent constraint is non-greedy. It will traverse arcs and recursively evaluate neighbor vertex conditions only until the first match is found. This means not all arcs are necessarily traversed, thus speeding up execution in cases where a match is found.
Remember that a vertex can be connected to another vertex via a multiple arc, i.e. several individual arcs with different relationship types and/or modifiers. In this case traversal will potentially occur multiple times from a vertex to its head vertex. Every arc, including each individual arc within a multiple arc, is evaluated and traversed independently.
The neighbor <vertex condition> is applied recursively. This enables powerful, complex graph traversal specifications to be expressed in a single statement.
It should be noted that nesting neighbor conditions will increase query execution time due to the
recursive traversal patterns that must be followed during evaluation. Query execution time can range from microseconds
to years depending on the graph structure and the amount of neighbor recursion used.
|
3.12.7. Examples
{
'adjacent' : {
'arc' : ('likes', D_OUT, M_FLT, V_GT, 2.5)
}
}
{
'adjacent' : {
'arc' : ('knows', D_OUT),
'filter' : "vertex.type == 'person' && next['age'] in {80,90,100}"
}
}
{
'type' : 'person',
'adjacent' : {
'arc' : ('knows', D_OUT),
'filter' : "next['age'] in {80,90,100}"
}
}
{
'type' : 'person',
'adjacent' : {
'arc' : ('knows', D_OUT),
'neighbor' : "Bob",
'filter' : "mul( R1, 100 )",
'assert' : True
}
}
{
'adjacent' : {
'arc' : ('likes', D_OUT, M_FLT, V_GT, 2.5),
'neighbor': {
'adjacent' : {
'arc' : ('knows', D_OUT),
'neighbor': {
'id' : 'Bob'
}
}
}
}
}
{
'type' : 'person'
'adjacent' : {
'arc' : ("called", D_IN)
'neighbor' : {
'type' : 'person',
'id' : 'Bob',
'adjacent' : {
'arc' : ("purchased", D_OUT),
'neighbor' : {
'type' : 'product'
'adjacent' : {
'arc' : ("makes", D_IN)
'neighbor' : {
'type' : 'company',
'id' : 'ACME'
}
}
}
}
}
}
}
{
'adjacent' : {
'neighbor': {
'type' : 'person'
}
}
}
{
'adjacent' : {
'neighbor': {
'type' : 'person',
'adjacent' : {
'arc' : ("visited", D_OUT),
'neighbor' : {
'type' : 'country',
'id' : 'Italy'
}
}
}
}
}
{
'type' : 'person',
'adjacent' : {
'arc' : ('friend', D_OUT),
'neighbor': {
'type' : 'person',
'adjacent' : {
'arc' : ('purchased', D_OUT),
'neighbor': {
'type' : 'product',
'price' : (V_GT, 1000.0),
'adjacent' : {
'arc' : ('viewed', D_IN)
'neighbor': {
'id' : 'Bob'
}
}
}
}
}
}
}
3.13. traverse - Neighborhood Adjacency Constraint with Arc Collection
This constraint requires a vertex to be adjacent to at least one other vertex in the graph according to a set of criteria, all of which are optional. Matched arcs can be collected into the query result set. If no collection is specified traverse and adjacent have equivalent behavior.
3.13.1. Syntax
{
'traverse' : {
'arc' : <arc_condition>,
'filter' : <expression>,
'neighbor' : <vertex_condition>,
'assert' : <bool>,
'collect' : C_NONE | C_COLLECT | C_SCAN | <arc_condition>
}
}
3.13.2. traverse : arc
The <arc condition> matches arcs between a vertex and its neighbors. The default arc condition is to accept all arcs in any direction, i.e. ("*", pyvgx.D_ANY). This condition is evaluated first.
3.13.3. traverse : filter
The filter expression is evaluated for each matching arc. It must evaluate to a positive numeric value in order to match, which is the default.
3.13.4. traverse : neighbor
Each matching arc which also passes the filter test is traversed to the neighbor vertex, where a new vertex condition is evaluated recursively. The default vertex condition is * which means no traversal occurs. If this condition is a match the overall traverse constraint is a match.
3.13.5. adjacent : assert
The traverse constraint evaluates to True or False depending on whether all specified criteria are met. However, by setting 'assert' : True or 'assert' : False it is possible to explicitly control the effect the traverse constraint has on the containing vertex condition. It may be desirable to continue the query even if the traverse constraint results in no traversal, which, for instance, would be the case when using traverse to modify a running score.
3.13.6. traverse : collect
Arcs which satisfy all of arc, filter and neighbor can be collected into the result set. Possible settings for collect are:
| Collect Value | Description |
|---|---|
|
All matching arcs are collected into the result set. |
|
Individual arcs of a multiple arc that has at least one match can be collected using a separate arc condition. |
|
This turns |
|
This is the default when |
3.13.7. Remarks
When no collect clause is given traverse and adjacent have the same behavior. When a collect clause other than C_NONE is specified traverse becomes a greedy version of adjacent. The overall match outcome of a traverse constraint is independent of the collect clause.
Both adjacent and traverse can be used at the same time in a vertex condition. This allows a fork to be created, enabling completely independent neighborhood constraints to be specified. However, adjacent must produce a match in order for traverse to be evaluated.
If adjacent includes a recursive vertex condition it is possible to include both adjacent and traverse in the nested vertex condition. A recursive bifurcating query structure can thus be created, like this:
{
'adjacent' : {
'arc' : ...
'filter' : ...
'neighbor' : {
'adjacent' : {
'neighbor' : {
'adjacent' : ...
'traverse' : ...
}
},
'traverse' : {
'neighbor' : {
'adjacent' : ...
'traverse' : ...
},
'collect' : ...
}
}
},
'traverse' : {
'arc' : ...
'filter' : ...
'neighbor' : {
'adjacent' : {
'neighbor' : {
'adjacent' : ...
'traverse' : ...
}
},
'traverse' : {
'neighbor' : {
'adjacent' : ...
'traverse' : ...
},
'collect' : ...
}
},
'collect' : ...
}
}
3.13.8. Examples
{
'traverse' : {
'arc' : ("called", D_OUT),
'collect' : C_COLLECT
}
}
{
'type' : 'person',
'traverse' : {
'arc' : ("friend", D_OUT),
'collect' : ("visited", D_OUT, M_CNT, V_GTE, 2),
'neighbor': {
'type' : 'person'
}
}
}
{
'type' : 'person',
'adjacent' : {
'arc' : ("friend", D_OUT),
'neighbor' : {
'type' : 'person',
'adjacent' : {
'arc' : ("visited", D_OUT),
'neighbor' : {
'type' : 'country',
'id' : 'Japan'
}
}
}
},
'traverse' : {
'arc' : ("visited", D_OUT),
'collect' : C_COLLECT,
'neighbor' : {
'type' : 'country'
}
}
}
3.14. post - General Filter Expression - Post Traversal
This constraint uses the general VGX Expression Language to express arbitrarily complex filters applied after any adjacency/traversal constraints.
3.14.1. Syntax
{
'post' : <expression>
}
3.14.2. Remarks
This constraint is a match if <expression> evaluates to a positive numeric value, or produces a string, vertex, or other non-NULL object.
|
The |
The expression must be a local evaluator, i.e. it cannot reference any of the next* arc attributes or head vertex attributes/properties, or perform any stage*() or collect*() operations.
3.14.3. Examples
graph.Neighborhood(
"A",
hits = 10,
neighbor = {
'filter' : "do( store( R1, 0 ) )",
'adjacent' : {
'arc' : ( "to", D_OUT ),
'filter' : """
void(
storeif(
stageif( next.arc.value > load( R1 ) ),
R1,
next.arc.value
)
)
"""
},
'post' : "commit()"
},
sortby = S_VAL
)
This query uses the stage/commit technique to pick a single arc from the second level neighborhood. We use 'filter' to reset a threshold level to zero before each inner traversal. Then we use 'adjacent' to iterate over all second level neighbors. The use of void(…) ensures exhaustive scan since it produces a miss for each arc traversed. When an arc with a larger value than the current threshold is encountered it is staged and the threshold is updated with the new value. (Staging a new arc will replace any previously staged arc.)
At the end of each inner traversal (when 'adjacent' completes) the 'post' expression is evaluated. Here we commit the staged arc, which collects it into the result set.
Note that 'post' is executed even through 'adjacent' always returns False (due to void() in its filter. Also note that no first-level arcs are collected since the neighbor={…} condition is always False (again due to void() in the inner filter.)
