PyVGX 3.8 Reference

This is a complete pyvgx reference. A compact reference is also available.

1. pyvgx: Python API for VGX

VGX is a general purpose graph and vector search platform. With its open ended architecture you can build search engines, vector databases, graph databases, reinforcement learning models, approximate nearest neighbor analysis, recommendation engines, and many other applications.

The PyVGX Python module exposes the VGX platform’s SDK to enable application development via Python plugins. You write plugin code to implement your application’s functionality and to define its service API endpoints.

You first define a graph model to represent your data, then implement plugin algorithms to populate and query your model. All data is kept in memory for maximum performance. Graph updates are made in real-time with no indexing delay.

VGX and pyvgx are both implemented in C. The latter is an extension module that can be imported and used to build applications in Python.

import pyvgx
pyvgx.system.Initialize()
g = pyvgx.Graph( "mygraph" )
g.Connect( "A", "to", "B" )
g.Connect( "A", "to", "C" )
g.Neighborhood( "A" ) # -> ['B', 'C']

When implementing plugins it is useful to think of the Python interpreter as the orchestration layer and VGX as the execution engine. Pyvgx acts as a bridge between the two. An optimal plugin implementation, in terms of computational efficiency, spends a minimal amount of time in the Python interpreter. You should strive to avoid loops in pure Python and instead find ways to use the pyvgx API to perform iterative operations for you. Performing as much work as possible within the VGX core has the dual benefit of 1) more efficient (native) execution and 2) bypassing the Python global interpreter lock, allowing full parallel utilization of multi-core CPU hardware.

Multiple VGX instances can be connected together to form a cluster. The HTTP Server can dispatch requests to multiple partitions and automatically handle load balancing or failover among multiple replicas. Data input per partition is fed to each partition’s master/provider instance, which attaches to one or more subscriber instances mirroring graph data in the master instance. A special dispatcher for feeding data can be set up, distributing input data among partitions according to plugin-defined rules.

When adding a plugin, pyvgx automatically generates an endpoint in the VGX built-in HTTP server. A minimal application will usually have one plugin endpoint for feeding data and another for serving search requests.

The standard anatomy of a VGX service built on pyvgx is a Python program that bootstraps and configures the service, registers the application’s plugins, and then runs in the background until shutdown. Aside from process startup and shutdown activities, the Python interpreter is only invoked as part of plugin execution triggered via service endpoints.

Example.py

from pyvgx import *
system.Initialize( "service" )

def sqrt( request, x:float ) -> float:
    return x ** 0.5

system.AddPlugin( sqrt )
system.StartHTTP( 9000 )
system.RunServer( "Square Root Server" )

When you run the above Example.py you have a server that can compute square roots. The request http://localhost:9000/vgx/plugin/sqrt?x=9 returns:

{
    "status": "OK",
    "response": 3.0,
    "port": [9000, 1],
    "exec_id": 1,
    "exec_ms": 0.548
}
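
A client can build the request URL and unpack the response envelope shown above using only the standard library. This is a minimal sketch, assuming the /vgx/plugin/<name> path pattern seen in the example URL applies to other plugin names as well. The helper names plugin_url and plugin_result are hypothetical and not part of pyvgx.

```python
import json
from urllib.parse import urlencode

def plugin_url(host, port, plugin, params):
    """Build a plugin endpoint URL following the pattern of the example URL."""
    return f"http://{host}:{port}/vgx/plugin/{plugin}?{urlencode(params)}"

def plugin_result(body):
    """Return the "response" field of a JSON reply, or raise if status is not OK."""
    reply = json.loads(body)
    if reply.get("status") != "OK":
        raise RuntimeError(f"plugin request failed: {reply}")
    return reply["response"]

url = plugin_url("localhost", 9000, "sqrt", {"x": 9})

# The sample reply from the text, as a client would receive it.
sample = '{"status": "OK", "response": 3.0, "port": [9000, 1], "exec_id": 1, "exec_ms": 0.548}'
value = plugin_result(sample)
```

With the server from Example.py running, the body could be fetched with urllib.request.urlopen( url ).read() and passed to plugin_result().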

For multi-node systems the pyvgx.VGXInstance helper class can be used for implementing services.

1.1. Summary of Features

1.1.1. Graph Structure

Graphs are built from two basic object types: vertices (nodes) and arcs (directed edges). Vertices are connected explicitly via arcs or implicitly via their similarity (represented by a vector metric) to other vertices.

Explicit connections are realized internally as adjacency vectors directly associating vertex pairs. Index-free adjacency allows efficient movement from node to node by following direct links. This network of vertices connected by arcs is the graph.

Implicit connections are expressed via similarity vectors. Vertices with such vectors assigned to them become indirectly associated by a chosen vector metric such as Euclidean distance or Cosine similarity. A proximity graph can be constructed to facilitate efficient vector search, using the explicit network to form navigable paths between implicitly connected items.

Two vertices may be connected by a simple arc (one connection) or a multiple arc (many connections). A vertex may be connected to any number of other vertices.

Two explicitly connected vertices are neighbors. The set of arcs between a vertex and all its neighbors is the vertex neighborhood.

The total number of explicit connections a vertex has is the vertex degree. Outbound arcs and inbound arcs are counted as outdegree and indegree, respectively.

The graph’s size is the sum of all degrees for all vertices in the graph, i.e. the total number of explicit connections. The total number of vertices in a graph is the graph order.
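
The terms degree, outdegree, indegree, size, and order can be made concrete with a small pure-Python model. This sketch does not use pyvgx. It represents arcs as plain tuples and assumes that size counts each arc once, which equals the sum of all outdegrees.

```python
from collections import Counter

# Arcs as (initial, relationship, terminal) triples, mirroring the
# g.Connect( "A", "to", "B" ) calls in the introduction.
arcs = [("A", "to", "B"), ("A", "to", "C"), ("B", "to", "C")]

outdegree = Counter(init for init, _, _ in arcs)   # outbound arcs per vertex
indegree = Counter(term for _, _, term in arcs)    # inbound arcs per vertex
vertices = set(outdegree) | set(indegree)

# Degree is the total number of explicit connections of a vertex.
degree = {v: outdegree[v] + indegree[v] for v in vertices}

order = len(vertices)   # total number of vertices
size = len(arcs)        # total number of explicit connections (assumption: one per arc)
```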

Both vertices and arcs may have properties. Vertex properties are freely assignable key-value pairs. Arc properties include a relationship type, modifier (value type), and a numeric value. Queries use these properties for filtering, scoring, sorting and display of search results.

1.1.2. Graph Search

Graph search can be global or local. Global queries scan the entire graph to return any matching vertices or arcs. Local queries scan the neighborhood around a selected anchor vertex.

Neighborhood queries can be nested to extend several levels beyond the immediate neighbors of the anchor. Independent filters and collection parameters can be specified for each level. Results can be sorted by any attribute or property, or according to a user-definable rank score. Traversal filters, ranking formulas, and result field selectors all use a common expression language which allows mathematical formulas of any complexity to be executed.

Neighborhood navigation queries can be used to automatically explore the entire graph from any starting vertex. This is especially useful for traversing proximity graphs where nodes have been connected based on their mutual vector similarity. Traversal of such graphs forms the basis of highly efficient vector search, and can be combined with any other filtering to implement hybrid semantic search.

1.1.3. Implicit Connections

A vertex becomes implicitly connected to other vertices by setting the vertex vector. Vectors are numeric arrays or lists of weighted features.

The implicit connection strength between two vertices with vectors is determined by their vector similarity. When configured in Euclidean Vector mode, similarity can be computed either as the Euclidean distance or the Cosine similarity between vectors. When configured in Feature Vector mode, similarity can be computed as the Cosine similarity or the Jaccard index, or a combination of the two.
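
For orientation, the measures named above have the following textbook definitions. This is a pure-Python sketch that does not call pyvgx. The way VGX computes, scales, or combines these values internally is not described here and may differ. The Jaccard index is taken over the sets of dimensions of two feature maps, which is an assumption.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two numeric arrays of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two numeric arrays (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def jaccard_index(f, g):
    """Jaccard index over the dimension sets of two {dimension: weight} maps."""
    keys_f, keys_g = set(f), set(g)
    union = keys_f | keys_g
    return len(keys_f & keys_g) / len(union) if union else 0.0

def feature_cosine(f, g):
    """Cosine similarity of two {dimension: weight} maps."""
    dot = sum(w * g[k] for k, w in f.items() if k in g)
    norm = (math.sqrt(sum(w * w for w in f.values()))
            * math.sqrt(sum(w * w for w in g.values())))
    return dot / norm if norm else 0.0
```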

A global similarity query will return all vertices implicitly connected to a probe vertex using a globally configured similarity threshold to determine whether a sufficiently strong connection exists. Vector similarity can also be combined with explicit connection filters in neighborhood queries.

For efficient discovery of implicit connections (i.e. vector search) a proximity graph can be built and searched using neighborhood navigation queries.

1.1.4. Capacity

All graph data is held in memory, ensuring low latency and predictable performance (soft realtime) for all operations. The system may hold any number of independent graphs of any size and order at the same time, limited only by the available memory on your machine. Graphs may be saved to disk and restored later.

1.1.5. Object Expiration

Vertices and arcs may be assigned expiration timestamps (time-to-live or TTL) for automatic deletion at a future point in time. Expiration times are independently assignable for all objects. TTL has a resolution of one second for any expiration timestamp ranging from imminent to infinity.

1.1.6. Concurrency

Graphs may be accessed simultaneously by multiple threads. Safe, concurrent access is governed by various internal locking mechanisms that appear seamless at the API level. Optional timeouts may be specified for most operations. Python’s global interpreter lock (GIL) is released during execution of graph operations, thus allowing parallel execution across multiple CPU cores within the same Python process.

1.1.7. Plugins and VGX Server

A plugin framework allows remote execution of Python functions that you define and register with the system. Plugins are invoked by sending HTTP requests to VGX Server, a fully asynchronous, multithreaded request engine running within the VGX core. This server is independent from the Python interpreter.

1.2. API Components

pyvgx module

Common functions, constants, exceptions, and classes:

pyvgx.system
pyvgx.op
pyvgx

pyvgx.Graph Objects

Create graph structure, run graph traversal queries, and manage graph lifecycle.

Section 3.2, “Graph Attributes”
Section 3.3, “Graph Methods”

pyvgx.Vertex Objects

Inspect and modify vertices.

Section 4.2, “Vertex Attributes”
Section 4.3, “Vertex Methods”

pyvgx.Memory Objects

Create and manipulate raw memory arrays for use with expression evaluators.

pyvgx.Vector Objects

Similarity vectors.

pyvgx.Similarity Objects

Compute vector similarity and manage similarity parameters.

pyvgx.Query Objects

Reusable graph queries with pre-defined parameters.

VGX Server - Multi-node System Framework with Plugin Support

Define application plugins and expose functions as HTTP endpoints.

VGX Transaction Protocol

Protocol used by VGX Transaction Interconnect to send graph data to subscribers.

2. pyvgx module

The pyvgx module API provides functions for initializing the system, managing graph objects, printing log messages, and a few other convenience functions. The module also defines object types, exception types, and several constants that are used in method calls throughout the API.

2.1. Functions

2.1.1. `pyvgx.system` Namespace

2.1.1.1. AddPlugin

pyvgx.system.AddPlugin( [plugin[, name[, graph[, engine[, pre[, post]]]]]] ): See VGX Server system.AddPlugin()

2.1.1.2. CancelSync

pyvgx.system.CancelSync()

Stop any synchronization that was started with pyvgx.system.Sync().

Subscribers are generally left in an inconsistent state if synchronization is terminated before it completes.

2.1.1.3. ClearReadonly

pyvgx.system.ClearReadonly(): Exit readonly mode for all loaded graphs. (Make all graphs writable.)

2.1.1.4. CountReadonly

pyvgx.system.CountReadonly(): Return the number of readonly graphs.

2.1.1.5. DeleteGraph

pyvgx.system.DeleteGraph( name[, path[, timeout]] ): Remove a graph identified by name and optionally path from the graph registry. If the graph is currently in use and cannot be freed within the given timeout (in milliseconds) an exception is raised. The default behavior is nonblocking, i.e. timeout=0.

This will only remove the graph from memory and the graph registry file on disk, not the actual graph data on disk. Future initializations will not automatically load this graph into memory.

Return True if graph was deleted, False otherwise.

2.1.1.6. DispatcherConfig

pyvgx.system.DispatcherConfig(): See VGX Server system.DispatcherConfig()

2.1.1.7. DurabilityPoint

pyvgx.system.DurabilityPoint(): Return a 3-tuple (txid, serial, ts) representing the most recent durable transaction received from an attached provider. A durable transaction is a transaction whose effects are included in a graph snapshot on disk, i.e. a transaction applied prior to a call to pyvgx.system.Persist(). The durability point is thus the last transaction applied before the snapshot was created.

2.1.1.8. EventsResumable

pyvgx.system.EventsResumable(): Return True if any loaded graph has its TTL event processor suspended, otherwise return False.

2.1.1.9. ExitRunServer

pyvgx.system.ExitRunServer(): Causes RunServer() to exit.

Calling this function has no effect unless RunServer() is currently blocking.

2.1.1.10. Fingerprint

pyvgx.system.Fingerprint(): Return a digest string representative of the current state of all loaded graphs. Two VGX instances will have the same fingerprint only when their loaded graphs are identical.

2.1.1.11. GetBuiltins

pyvgx.system.GetBuiltins(): See VGX Server system.GetBuiltins()

2.1.1.12. GetGraph

pyvgx.system.GetGraph( name ): Return an already open graph instance. This is useful for temporarily accessing graphs currently owned by another thread.

2.1.1.13. GetPlugins

pyvgx.system.GetPlugins(): See VGX Server system.GetPlugins()

2.1.1.14. GetProperties

pyvgx.system.GetProperties( [ timeout ] ): Return a dict of all system properties. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.15. GetProperty

pyvgx.system.GetProperty( key[, default[, timeout ]] ): Return system property key if it exists, otherwise return default if provided or raise LookupError if no default provided. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.
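
The lookup rule (value if present, else the default if one was given, else LookupError) can be modeled in plain Python. This sketch uses an ordinary dict as a stand-in for the system property store and does not call pyvgx. The timeout and AccessError behavior are not modeled, and the name get_property is hypothetical.

```python
_MISSING = object()  # sentinel that distinguishes "no default given" from default=None

def get_property(properties, key, default=_MISSING):
    """Model of the lookup rule: value, else default, else LookupError."""
    if key in properties:
        return properties[key]
    if default is not _MISSING:
        return default
    raise LookupError(key)

# Hypothetical property store used only for illustration.
props = {"owner": "search-team"}
```

The sentinel matters because None is itself a legitimate default: get_property( props, "missing", None ) returns None, while get_property( props, "missing" ) raises.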

2.1.1.16. HasProperty

pyvgx.system.HasProperty( key[, timeout ] ): Return True if system property key exists, otherwise return False. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.17. Initialize

pyvgx.system.Initialize( [vgxroot[, http[, attach[, bind[, durable[, events[, idle[, readonly[, euclidean ]]]]]]]]] )

Initialize the library, with one or more optional configuration parameters.

vgxroot: Specify disk storage location for graph data by supplying a root directory in the optional vgxroot parameter. The default root directory is the current working directory.

http: Start HTTP service on port http if > 0. The service enables plugin execution via HTTP requests. Use system.AddPlugin() to make a plugin available in the HTTP service.

attach: Stream VGX Transactions to remote VGX destination(s) and/or local file(s) by specifying a URI string (or list of URI strings) in the optional attach parameter. When attached, the local VGX instance becomes a VGX operation source. In a multi-node setup this instance takes on the Data Provider role.

bind: Start VGX Transaction service on port bind if > 0. When this service is running, the local VGX instance becomes a VGX operation destination. At most one VGX source can be attached at any given time. In a multi-node setup this instance takes on the Data Subscriber role.

durable: When bind is specified and durable is True this instance supports durable writes for transactional input data provided by an attached VGX operation source. See pyvgx.op.Bind() for details.

events: Globally disable event processing (TTL) for all graphs by setting optional events parameter to False. Default is True, i.e. TTL event processing is enabled.

It is highly recommended to disable TTL events for a VGX instance that will inhabit the role of Data Subscriber. I.e., if the bind parameter is given one should also set events to False. Deletion of graph data is thus completely driven by the Data Provider, which guarantees data consistency across all VGX instances.

idle: When idle parameter is True load all graphs with event processing (TTL) temporarily paused. Resume event processing per graph as needed using pyvgx.Graph.EventEnable(). Default is False, i.e. TTL event processing is running (unless globally disabled with events=False.)

readonly: When readonly parameter is True load all graphs in readonly mode. Default is False, i.e. readonly/writable state is restored from persisted state.

euclidean: When euclidean parameter is True (default) VGX is initialized in Euclidean Vector mode, i.e. pyvgx.Vector objects are numeric arrays [c₁, c₂, …, c_n]. When euclidean parameter is False VGX is initialized in Feature Vector mode, i.e. vectors are maps of (dimension, weight) key-value pairs.

Graph registry

Existing graph data in the selected root directory will be loaded into memory for all graphs in the graph registry. The graph registry is stored in the .registry directory within the root directory and contains references to all active graphs. Inactive graphs will not be loaded automatically. Inactive graphs are made active and loaded into memory by calling the Graph() constructor with the graph name of an existing, inactive graph. Active graphs are made inactive and deleted from memory (but not from disk) by calling system.DeleteGraph() with the graph name of an active graph.

It is possible to make all graphs under the root directory inactive by manually removing the graph registry before calling system.Initialize().

It is not possible to remove graphs from disk using this API. Graphs can be made inactive in the registry, but their data will still exist on disk. Fully removing graph data from disk can be performed by calling pyvgx.Graph.Erase() on a graph instance.

This function should only be called when the system is uninitialized. Repeated calls will raise an exception. To reset the system to the uninitialized state, use system.Unload().

2.1.1.18. IsInitialized

pyvgx.system.IsInitialized(): Returns True if the system has been initialized, otherwise False.

2.1.1.19. Meminfo

pyvgx.system.Meminfo(): Return a tuple (total, process) where total is the amount of memory installed on the host machine and process is the amount of memory currently used by the Python process. Both numbers are in bytes.

2.1.1.20. NumProperties

pyvgx.system.NumProperties( [ timeout ] ): Return the number of system properties. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.21. Persist

pyvgx.system.Persist( [timeout[, force[, remote]]] ): Persist all loaded graphs and system graph to disk. This is a shorthand for calling pyvgx.Graph.Save() on each graph individually, and the optional parameters apply as they do for Save(), except for the system graph where remote is ignored (local only.)

2.1.1.22. Registry

pyvgx.system.Registry(): Return a dictionary { <graph_name> : (<size>, <order>) } of all graph names and their size and order that currently exist in the graph registry, i.e. all graph data currently residing in memory.

2.1.1.23. RemovePlugin

pyvgx.system.RemovePlugin( name ): See VGX Server system.RemovePlugin()

2.1.1.24. RemoveProperties

pyvgx.system.RemoveProperties( [ timeout ] ): Remove all system properties and return the number of properties removed. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.25. RemoveProperty

pyvgx.system.RemoveProperty( key[, timeout ] ): Remove system property key if it exists, otherwise raise LookupError. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.26. RequestHTTP

pyvgx.system.RequestHTTP( address[, path[, query[, content[, headers[, timeout]]]]] )

Send HTTP request. This function returns the raw response content returned by the server.

address is a tuple (host, port)
path is the URI path, defaults to /
query is a URI-encoded string, or a list of key-val tuples, or a dict
content is a string, and implies HTTP POST
headers is a dict of HTTP headers to include in the request
timeout is in milliseconds, defaults to 5000
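
The three accepted forms of query carry the same information. The sketch below shows how each form maps to a query string using the standard library. It illustrates the equivalence only and does not call pyvgx. How RequestHTTP itself encodes characters such as spaces is not specified here, and the helper name normalize_query is hypothetical.

```python
from urllib.parse import urlencode

def normalize_query(query):
    """Return a URI-encoded query string for any of the three accepted forms."""
    if query is None:
        return ""
    if isinstance(query, str):
        return query                           # assumed to be URI-encoded already
    if isinstance(query, dict):
        return urlencode(list(query.items()))  # dict of key-value pairs
    return urlencode(list(query))              # list of (key, value) tuples

as_string = normalize_query("x=9")
as_tuples = normalize_query([("x", 9), ("name", "a b")])
as_dict = normalize_query({"q": "a&b"})
```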

2.1.1.30. ResumeEvents

pyvgx.system.ResumeEvents(): Resume the TTL event processor for all loaded graphs.

2.1.1.31. Root

pyvgx.system.Root(): Return the graph root directory.

2.1.1.32. RunServer

pyvgx.system.RunServer( [ name[, watchdog[, interval[, logpath ]]]] )

Block forever until interrupted by signal.SIGINT or by another thread calling ExitRunServer().

An optional name string may be used to identify a VGX instance when accessed via the HTTP Service.

A parameterless function may be supplied to watchdog, and will be called regularly as defined by interval (in milliseconds). Default interval is 5000 ms.

When logpath directory is specified, all output normally written to stdout and stderr is redirected to files under the logpath directory. Log files are rotated once per hour and automatically deleted after 30 days. Two sets of logs are produced:

logpath/vgx.YYYY-mm-dd-HHMMSS (LogInfo(), LogWarning(), LogError())
logpath/access.YYYY-mm-dd-HHMMSS (LogTimestamp())
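
When locating log files from a script, the name pattern above can be reproduced with strftime. This is a sketch under the assumption that YYYY-mm-dd-HHMMSS corresponds to the strftime format %Y-%m-%d-%H%M%S. Which instant the timestamp refers to, and its time zone, are not stated here. The helper name log_file_names and the example directory are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath

def log_file_names(logpath, when):
    """Return the (vgx, access) log file paths for a given timestamp."""
    stamp = when.strftime("%Y-%m-%d-%H%M%S")
    base = PurePosixPath(logpath)
    return str(base / f"vgx.{stamp}"), str(base / f"access.{stamp}")

names = log_file_names("/var/log/myservice", datetime(2024, 3, 5, 14, 0, 0))
```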

2.1.1.33. ServerAdminIP

pyvgx.system.ServerAdminIP(): See VGX Server system.ServerAdminIP()

2.1.1.34. ServerHost

pyvgx.system.ServerHost(): See VGX Server system.ServerHost()

2.1.1.35. ServerMetrics

pyvgx.system.ServerMetrics( percentiles ): See VGX Server system.ServerMetrics()

2.1.1.36. ServerPorts

pyvgx.system.ServerPorts(): See VGX Server system.ServerPorts()

2.1.1.37. ServerPrefix

pyvgx.system.ServerPrefix(): See VGX Server system.ServerPrefix()

2.1.1.38. ServiceInHTTP

pyvgx.system.ServiceInHTTP( [service_in] ): See VGX Server system.ServiceInHTTP()

2.1.1.39. SetProperties

pyvgx.system.SetProperties( properties[, timeout ] ): Set multiple system properties from key-value pairs in dict properties. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.40. SetProperty

pyvgx.system.SetProperty( key, value[, timeout ] ): Assign value to system property key. Raise AccessError if the operation cannot complete within the given timeout in milliseconds which defaults to 1000 ms.

2.1.1.41. SetReadonly

pyvgx.system.SetReadonly(): Enter readonly mode for all loaded graphs. (Make all graphs readonly.)

The system graph is not affected by this, and is always writable.

2.1.1.42. StartHTTP

pyvgx.system.StartHTTP( port[, ip[, prefix[, servicein[, dispatcher]]]] ): See VGX Server system.StartHTTP()

2.1.1.43. Status

pyvgx.system.Status( [graph[, simple]] ): Return a dictionary of various system information, counters and resource usage. Alternatively request information about a specific graph. In both cases a smaller set of essential information is returned if simple is True.

2.1.1.44. StopHTTP

pyvgx.system.StopHTTP(): See VGX Server system.StopHTTP()

2.1.1.45. SuspendEvents

pyvgx.system.SuspendEvents(): Suspend the TTL event processor for all loaded graphs.

2.1.1.46. Sync

pyvgx.system.Sync( [hard[, timeout]] )

Transfer the entire contents of this graph instance to any attached VGX destinations. When hard is True (default is False) this operation implies pyvgx.Graph.Truncate() at all destinations before loading the transferred graph data.

Synchronizing data this way temporarily suspends normal operation of the VGX source instance and may take a long time to complete.

2.1.1.47. System

pyvgx.system.System(): Return the system graph instance. The system graph is automatically created when VGX is initialized. This function raises EnvironmentError if the system graph cannot be acquired within 5 seconds.

Do not modify the system graph. Doing so may destabilize the system and lead to undefined behavior.

2.1.1.48. Unload

pyvgx.system.Unload( [persist] ): Reset the system to the uninitialized state. This will remove all graph data from memory. After calling this function it is possible to call system.Initialize() again. An exception is raised if any system components are currently in use.

The optional persist parameter takes a boolean value and controls whether any unsaved graph data in memory will be persisted to disk. By default data is not persisted. Pass True to persist all data.

2.1.1.49. WritableVertices

pyvgx.system.WritableVertices(): Return the total number of vertices currently opened writable for all loaded graphs.

2.1.2. `pyvgx.op` Namespace

It is possible to capture API operations in a VGX source instance and transfer them as VGX Transactions to another VGX destination instance. This is achieved by attaching the source instance (provider) to the destination instance (subscriber.) A source can attach to one or more destinations. When attached, a socket connection is established between source and destination.

Operations performed on graphs in the VGX source instance will be captured and streamed over the connected sockets to all VGX destination instances. A destination instance consumes the incoming operation stream to maintain an identical copy of the source graphs.

Below are the API functions used to manage production, transfer and consumption of the operation data stream.

2.1.2.1. Allow

pyvgx.op.Allow( opcode_filter ): Enable execution by VGX Transaction service or pyvgx.op.Consume() of any opcode(s) specified by opcode_filter, which may be an individual operator opcode or an operator group. Filters are cumulative, i.e. repeated calls to this function will add to the set of allowed opcodes. Filters are exclusive, i.e. an allowed opcode is not denied.

2.1.2.2. Assert

pyvgx.op.Assert(): Transmit a state-assert operation for all graphs in the system to any attached destination(s). This communicates expected object counts as extracted from local graph(s), which the remote graphs can use to validate against. Remote graphs may then choose the appropriate action depending on the degree of mismatch, if any.

Implicit calls to op.Fence() are made before and after the local object counts are gathered. The transactions produced by state-assert itself will be sent in between the two fence operations.

If the assert operation cannot complete within one minute OperationTimeout is raised.

Any other error condition will raise InternalError.

2.1.2.3. Attach

pyvgx.op.Attach( [URIs[, handshake[, timeout]]] ): Establish producer-consumer relationships between the local VGX source/producer instance and one or more VGX destination/consumer instances. Destinations will receive a stream of graph modification operations (VGX Transactions) in real time as they are applied to the local VGX instance.

When attached, the local VGX instance becomes a VGX operation source. (The source is sometimes also referred to as producer or provider.)

The URIs parameter can be a string (single destination) or a list of strings (multiple destinations.) If the URIs parameter is omitted any URIs registered with a previous call to op.SetDefaultURIs() will be used by default.

URI scheme vgx is used for streaming operations to a remote VGX instance. Attaching to a remote instance implies opening a TCP/IP socket connection. The state of being attached is independent of the state of any TCP/IP socket connection. The connection may break due to network issues or other temporary conditions, but the source is still considered attached to the destination. A broken connection will be re-established automatically without loss of data.

URI scheme file is used to write operations to a local file. This can be useful for testing or development purposes.

The handshake parameter has a default value of True and implies verification of protocol version and other internal information exchange when attaching to a destination.

The default attach timeout is 30000 (in milliseconds.)

2.1.2.4. Attached

pyvgx.op.Attached(): Return a list of URI strings identifying this VGX instance’s currently attached subscribers.

2.1.2.5. Bind

pyvgx.op.Bind( port[, durable[, snapshot_threshold]] )

Start VGX Transaction service on the given port.

When bound, the local VGX instance becomes a VGX operation destination (also referred to as consumer or subscriber.)

A VGX operation source may attach to the local VGX instance and stream its operations over the connection into the local instance. Operations are streamed as VGX transactions. The local instance applies all incoming VGX transactions as they arrive, thus re-creating exact replicas of graphs in the source instance.

At most one VGX operation source may be attached at a time. Attempting to attach a second source will disconnect the first source. Since all attached sources automatically try to re-connect if connection is lost there will be a constant fight between sources, most likely resulting in errors and data inconsistencies. Do not attach more than one source to any destination.

When durable is True (default is False) the VGX instance will maintain a durable transaction log on disk.

When bound in durable mode all inbound transactions are persisted to disk in real time. The operation source does not consider the transaction complete until the operation destination has confirmed the durable write.

Inbound transactions are persisted to transaction logs on disk. Transaction logs are kept until a new snapshot is created. When a graph snapshot is created using Save(), all transactions applied prior to this durability point are no longer needed by that particular graph. The earliest durability point of all loaded graphs determines the cutoff at which old transactions are discarded. Transactions older than the earliest durability point are automatically removed from disk.
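
The cutoff rule can be illustrated with a small model. This sketch does not use pyvgx. It simplifies each transaction and each durability point to a single integer serial number, and the function name discardable is hypothetical. It follows the wording above in treating only transactions strictly older than the earliest durability point as removable.

```python
def discardable(transactions, durability_points):
    """Return serials of logged transactions older than the earliest durability point.

    transactions: iterable of transaction serial numbers present in the log
    durability_points: {graph_name: serial of the last transaction in its snapshot}
    """
    if not durability_points:
        return []                               # no snapshot yet, keep everything
    cutoff = min(durability_points.values())    # earliest durability point of all graphs
    return [serial for serial in transactions if serial < cutoff]

# Graph g1 was saved after transaction 7, graph g2 after transaction 4.
removed = discardable(range(1, 11), {"g1": 7, "g2": 4})
```

In this model the graph with the oldest snapshot holds back log cleanup for all graphs, which is why saving every graph (for example with system.Persist()) lets the log shrink the most.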

Enable automatic snapshots by setting snapshot_threshold to the maximum number of transaction log bytes allowed on disk. When this limit is exceeded the system will persist all graphs and clear the transaction log.

2.1.2.6. Bound

pyvgx.op.Bound(): Return a tuple (port, durable) indicating the running state of VGX Transaction Service, where port and durable reflect corresponding parameter values provided to op.Bind().

If no VGX Transaction service is running (0, False) is returned.

2.1.2.7. Consume

pyvgx.op.Consume( data[, timeout ] )

Submit operation data for execution by the local VGX instance. The data parameter is either a single string (str or bytes) or a list of strings. If timeout is 0 (the default) this function returns immediately after queuing operations for asynchronous execution. Specify a positive timeout (in milliseconds) to block while waiting for operations to be executed, or otherwise raise OperationTimeout.

Data submitted via this function will incrementally modify the local instance. The only requirement for data is that no partial tokens are allowed, i.e. the data string (or each string in list of strings) must start and end on complete token boundaries. Operation tokens are separated by whitespace characters (" ", "\t" or "\n".)
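
When a large operation stream must be submitted in pieces, each piece has to respect token boundaries. The sketch below splits a string into chunks that never cut a token in half. It is plain Python and does not call pyvgx. It assumes that any run of whitespace is an equivalent separator, so tokens are re-joined with a single space. The function name chunk_tokens and the placeholder tokens are hypothetical.

```python
def chunk_tokens(data, max_len):
    """Split data into strings that each start and end on token boundaries.

    A single token longer than max_len is emitted as its own oversized chunk.
    """
    chunks, current, length = [], [], 0
    for token in data.split():                 # split on any whitespace run
        extra = len(token) + (1 if current else 0)
        if current and length + extra > max_len:
            chunks.append(" ".join(current))   # close the chunk before this token
            current, length = [], 0
            extra = len(token)
        current.append(token)
        length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks

# Placeholder tokens, not real operation data.
data = "aa bb\tcc\ndd eee"
chunks = chunk_tokens(data, 5)
```

The resulting list could then be passed as the data argument, since a list of strings is accepted as long as each string holds only complete tokens.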

If the pyvgx instance is populated exclusively via this call it is recommended to initialize the system with event processing globally disabled using pyvgx.system.Initialize( events=False ). Expiration events (TTL) are managed by the source instance and transferred via the operation stream.

Instead of manually calling op.Consume(), use op.Bind() in a VGX destination instance to start the VGX Transaction service, which allows a VGX source instance to attach using op.Attach() and have its data automatically transferred to the destination.

2.1.2.8. Counters

pyvgx.op.Counters(): Return a dict of various internal counters related to operation I/O. Mainly useful for debugging and performance reporting.

2.1.2.9. DataCRC32c

pyvgx.op.DataCRC32c( data ): Return the CRC32-c checksum of data (str or bytes). This differs from standard CRC32-c computation in that the checksum is computed over all tokens in data, not the raw string.

2.1.2.10. Deny

pyvgx.op.Deny( opcode_filter ): Prevent execution by VGX Transaction service or pyvgx.op.Consume() of any opcode(s) specified by opcode_filter, which may be an individual operator opcode or an operator group. Filters are cumulative, i.e. repeated calls to this function will add to the set of denied opcodes. Filters are exclusive, i.e. a denied opcode is not allowed.
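
The cumulative and exclusive behavior of Allow() and Deny() can be pictured with a toy model in which every opcode is in exactly one of two states. This sketch does not use pyvgx. The opcode names are placeholders, and the initial state (everything allowed) is an assumption made for the example.

```python
class OpcodeFilter:
    """Toy model: each opcode is either allowed or denied, never both."""

    def __init__(self, opcodes, allowed=True):
        self._allowed = {op: allowed for op in opcodes}

    def allow(self, *opcodes):
        for op in opcodes:
            self._allowed[op] = True    # exclusive: allowing removes any denial

    def deny(self, *opcodes):
        for op in opcodes:
            self._allowed[op] = False   # exclusive: denying removes any allowance

    def permits(self, opcode):
        return self._allowed[opcode]

flt = OpcodeFilter(["op_a", "op_b", "op_c"])
flt.deny("op_a")
flt.deny("op_b")     # cumulative: op_a stays denied
flt.allow("op_a")    # op_a is allowed again, op_b is still denied
```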

2.1.2.11. Detach

pyvgx.op.Detach( [URI[, force]] )

Disconnect the operation stream from all attached destinations, or from a specific destination if URI is specified.

Normally any attached destination will remain attached even if the socket loses its connection. A new connection will be attempted periodically, while transactions are queued internally until such connection can be made. To override this behavior set force=True. A forced detach will erase the pending backlog of transactions and immediately detach destinations that are currently without a socket connection.

All vertices acquired writable by any thread at the time of op.Detach() will be forcibly released, even if no operation stream was attached. Calling op.Detach() thus has a side effect similar to calling pyvgx.Graph.CloseAll() once for each running thread.

2.1.2.12. Fence

pyvgx.op.Fence( [timeout] ): Flush all pending transactions, then return None. This has the effect of enforcing transaction order in such a way that any graph modifications made prior to the call of this function are guaranteed to have arrived at and been accepted by any attached destination(s) before any graph modifications made after the call of this function. It is the equivalent of calling op.Suspend() immediately followed by op.Resume().

If the system cannot flush all pending transactions within the given timeout (in milliseconds) OperationTimeout is raised. The default timeout is 60000 (one minute.)

Any other error condition will raise InternalError.

2.1.2.13. GetDefaultURIs

pyvgx.op.GetDefaultURIs(): Return a list of the current default URIs.

2.1.2.14. Heartbeat

pyvgx.op.Heartbeat( tick ): Enable (tick=True) or disable (tick=False) transmission of the TIC opcode to any attached VGX destinations. It is enabled by default.

A heartbeat tick is sent after 4 seconds of inactivity and will continue to be sent every 4 seconds for as long as no other operations are sent.

2.1.2.15. Pending

pyvgx.op.Pending(): Return the amount of data (in bytes) waiting to be processed asynchronously following a non-blocking call to pyvgx.op.Consume(), or currently waiting to be processed by a running VGX Transaction service.
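Because op.Pending() reports the backlog in bytes, it can be polled to wait for asynchronous processing to finish. The following is a minimal sketch. The helper wait_until_drained and its parameters are hypothetical. The op argument is assumed to be the pyvgx.op namespace and is passed in explicitly so the helper can be exercised without a running VGX instance.

```python
import time

def wait_until_drained(op, timeout_s=60.0, poll_s=0.05):
    """Poll op.Pending() until it reports 0 bytes (hypothetical helper).

    Returns True when the backlog is empty, False if timeout_s elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while op.Pending() > 0:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

A caller would typically invoke wait_until_drained( pyvgx.op ) after a non-blocking call to pyvgx.op.Consume().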

2.1.2.16. ProduceComment

pyvgx.op.ProduceComment( string ): Inject message string (str or bytes) into the operation output stream. This has no functional side-effects and is provided for logging, testing, debugging, etc.

2.1.2.17. ProduceRaw

pyvgx.op.ProduceRaw( data ): Inject arbitrary data (str or bytes) into the operation stream. This currently has no functional side-effects and is reserved for future implementations where transferring arbitrary data payloads to remote VGX instances may be required.

2.1.2.18. Profile

pyvgx.op.Profile( profile ): Apply an opcode execution profile, which is a pre-defined set of allowed and denied opcodes. Allowed opcodes will be executed by VGX Transaction service or calls to pyvgx.op.Consume(), and denied opcodes will be ignored. Applying a profile is destructive, i.e. any existing opcode filters will be overridden by the profile. However, after applying a profile it is possible to modify the filters in effect with additional calls to pyvgx.op.Deny() or pyvgx.op.Allow().

2.1.2.19. Reset

pyvgx.op.Reset()

Reset the operation parser used by pyvgx.op.Consume() to its initial state. This will permit already consumed transactions to be processed again.

This function is useful for testing and debugging. Re-processing already processed transactions may have unintended side-effects and should never be performed in a live system unless part of a carefully designed recovery protocol.

2.1.2.20. Resume

pyvgx.op.Resume(): Resume the production of operation output and return None on success. After this function returns all graphs will start producing output transactions for committed changes, including any modifications that were queued while in a suspended state. Transaction order is guaranteed.

OperationTimeout is raised immediately if production of transaction data cannot be resumed.

Any other error condition will raise InternalError.

2.1.2.21. ResumeTxInput

pyvgx.op.ResumeTxInput( [timeout_ms] ): Resume subscriber’s internal processing of transactions received when VGX Transaction service is running. Previously queued data will start modifying the state of graphs, while new incoming data is appended to the back of this queue, until the entire backlog has been processed. Depending on how long transaction input was suspended it may take some time for the subscriber’s state to reflect provider’s state in real time.

OperationTimeout is raised if internal processing cannot be resumed by the given timeout (default 60000.)

2.1.2.22. ResumeTxOutput

pyvgx.op.ResumeTxOutput( [timeout_ms] ): Resume provider’s output of transaction data to attached subscribers. All buffered data will be flushed to attached outputs as fast as possible until drained.

OperationTimeout is raised if transaction output cannot be resumed by the given timeout (default 60000.)

2.1.2.23. SetDefaultURIs

pyvgx.op.SetDefaultURIs( URIs ): Register one or more URIs to use as default when calling system.Initialize() or op.Attach() without the URI parameter. The URIs parameter is a string or a list of strings. Pass [] or None to clear the default URIs.

2.1.2.24. StrictSerial

pyvgx.op.StrictSerial( strict ): Enable (strict=True) or disable (strict=False) check for strictly increasing transaction serial numbers by the operation parser. This check is enabled by default.

When enabled all data received by VGX Transaction service or manually submitted via pyvgx.op.Consume() is required to contain sequential transactions with serial numbers increasing by exactly one for each new transaction.

If a transaction’s serial number is less than the previous serial number the transaction data is ignored and a regression error is logged.

If a transaction’s serial number is greater than the previous serial number plus one the transaction is accepted and a gap warning is logged.
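The serial number rules above can be modeled in a few lines of pure Python. This is only an illustration of the documented outcomes, not the parser's implementation. The function classify_serials is hypothetical. Two points are assumptions: the first transaction seen is accepted unconditionally, and a serial number equal to the previous one is labeled "unspecified" because the text does not cover that case.

```python
def classify_serials(serials):
    """Label each serial number with the outcome described for strict mode."""
    previous = None
    outcome = []
    for sn in serials:
        if previous is None or sn == previous + 1:
            outcome.append((sn, "accepted"))
            previous = sn
        elif sn < previous:
            # Regression: transaction data is ignored, parser state unchanged
            outcome.append((sn, "ignored: regression"))
        elif sn > previous + 1:
            # Gap: transaction is accepted and a warning is logged
            outcome.append((sn, "accepted: gap warning"))
            previous = sn
        else:
            # sn == previous is not covered by the text (assumption)
            outcome.append((sn, "unspecified"))
    return outcome
```

For the input [1, 2, 3, 2, 6, 7] the model reports a regression for the second occurrence of 2 and a gap warning for 6, while all other transactions are accepted normally.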

2.1.2.25. Subscribe

pyvgx.op.Subscribe( address[, hardsync[, timeout]] )

Send a request to the VGX provider instance located at address to become a subscriber of that instance’s graph data updates. If the provider is able to accept the request it will attach to the local VGX Transaction service, which must already be running. If the local VGX Transaction service is not running this function will raise an exception.

The provider address is given as a tuple (host, port), where port is the HTTP Service running on provider host.

When subscription is successfully established, the provider will transfer all its data to the subscriber. Assuming the subscriber is empty (no graph data) at the time of subscription the subscriber will be fully synchronized with the provider once that transfer completes.

If hardsync is True, any pre-existing subscriber data will be truncated before the provider transfers its data.

An optional timeout in milliseconds may be specified. This only relates to the time it takes to establish a connection between the provider and subscriber, NOT to the time it takes to synchronize all data.

This operation may take a long time to complete. Any other subscribers will not receive updates from the provider during the time it takes to synchronize the new subscriber.

2.1.2.26. Suspend

pyvgx.op.Suspend( [timeout] ): Suspend the production of operation output and return None on success. After this function returns it is guaranteed that all committed changes to all graphs have been sent to and confirmed by any attached destination(s). No further transactions will be produced while in the suspended state. Local graph modifications are still allowed and will be queued up for transmission once the system is no longer in a suspended state. The system will remain in a suspended state until op.Resume() is called.

OperationTimeout is raised if production of transaction data cannot be suspended by the given timeout (default 60000.)

Any other error condition will raise InternalError.
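Since the system stays suspended until op.Resume() is called, it can be convenient to pair the two calls so that output is always resumed, even when an exception occurs in between. The following is a minimal sketch using a context manager. The name output_suspended is hypothetical, and op is assumed to be the pyvgx.op namespace, passed in explicitly.

```python
from contextlib import contextmanager

@contextmanager
def output_suspended(op, timeout_ms=60000):
    """Suspend operation output for the duration of a with-block (sketch)."""
    op.Suspend(timeout_ms)   # may raise OperationTimeout or InternalError
    try:
        yield
    finally:
        op.Resume()          # queued modifications are then transmitted in order
```

Graph modifications made inside the with-block are still applied locally and are queued for transmission, as described above.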

2.1.2.27. SuspendTxInput

pyvgx.op.SuspendTxInput( [timeout_ms] ): Halt subscriber’s internal processing of transactions received when VGX Transaction service is running. The attached provider may continue to send data but it will be queued in the subscriber’s memory (and disk if VGX Transaction service is configured durable) while transaction input is suspended. Incoming data will not modify the state of any graphs while suspended.

OperationTimeout is raised if internal processing cannot be suspended by the given timeout (default 60000.)

2.1.2.28. SuspendTxOutput

pyvgx.op.SuspendTxOutput( [timeout_ms] ): Halt provider’s output of transaction data to attached subscribers. Internal graph operations continue to produce transactions, but these are queued in buffers that continue to grow for as long as output is suspended.

OperationTimeout is raised if transaction output cannot be suspended by the given timeout (default 60000.)

2.1.2.29. Throttle

pyvgx.op.Throttle( [rate[, unit ]]): Configure global data consumption rate limits. Operation data streamed to a running VGX Transaction service or submitted manually via pyvgx.op.Consume() will be processed internally at a maximum rate specified in units per second.

Available units are "transactions", "operations", "opcodes" and "bytes". Multiple rate limits can be set by calling op.Throttle() more than once with different unit parameters. Throttling occurs when any configured limit is reached.

Throttle rate must be a non-negative number or None. When rate is numeric a data processing speed limit is set for the selected unit measured in such units per second. The default unit is "bytes". When rate is None or 0.0 and unit is given any limit is removed for that unit. When rate is None or 0.0 and unit is not given all limits are removed and no throttling occurs.

This method returns a dictionary of configured rate limits.
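Because throttling occurs when any configured limit is reached, the slowest limit determines the overall pace. The following back-of-envelope sketch estimates a lower bound on processing time for a given workload. The function min_processing_seconds and the example numbers are hypothetical. The model ignores burst behavior and any internal scheduling details, which are not described here.

```python
def min_processing_seconds(workload, limits):
    """Lower bound on processing time under per-unit rate limits (sketch).

    workload: dict mapping unit name to total amount to process
    limits:   dict mapping unit name to rate per second (None or 0 = no limit)
    """
    bounds = [
        workload[unit] / rate
        for unit, rate in limits.items()
        if rate and unit in workload
    ]
    # The most restrictive limit dominates; no limits means no lower bound
    return max(bounds, default=0.0)

estimate = min_processing_seconds(
    {"bytes": 5_000_000, "transactions": 200},
    {"bytes": 1_000_000, "transactions": 100},
)
# estimate is 5.0: the bytes limit (5 s) dominates the transactions limit (2 s)
```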

2.1.2.30. URI

pyvgx.op.URI( uri_string ): Parse uri_string and return a 7-tuple containing the extracted URI elements: ( scheme, user, host, port, path, query, fragment ).
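For readers without a VGX instance at hand, the shape of the 7-tuple can be illustrated with the Python standard library. This sketch uses urllib.parse and is not pyvgx's parser. Details such as the type of the port element or the value used for missing elements may differ in pyvgx. The function split_uri and the example URI are hypothetical.

```python
from urllib.parse import urlsplit

def split_uri(uri_string):
    """Return ( scheme, user, host, port, path, query, fragment ) using urllib."""
    parts = urlsplit(uri_string)
    return (parts.scheme, parts.username, parts.hostname, parts.port,
            parts.path, parts.query, parts.fragment)

elements = split_uri("http://alice@example.com:8080/data?x=1#top")
# elements == ("http", "alice", "example.com", 8080, "/data", "x=1", "top")
```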

2.1.2.31. Unbind

pyvgx.op.Unbind(): Stop a running VGX Transaction service.

2.1.2.32. Unsubscribe

pyvgx.op.Unsubscribe(): Request an already attached VGX provider instance to detach itself. No further graph data updates will be received once the provider has detached.

2.1.2.33. VerifyCRC

pyvgx.op.VerifyCRC( verify ): Enable (verify=True) or disable (verify=False) CRC validation in the operation parser. CRC validation is enabled by default. When enabled the operation data parser performs CRC validation of all data received by VGX Transaction service or manually submitted via pyvgx.op.Consume().

2.1.3. `pyvgx` Namespace

2.1.3.1. AutoArcTimestamps

pyvgx.AutoArcTimestamps( [enable] ): Enable (enable=True) or disable (enable=False) automatic inclusion of timestamp arcs for new relationships. Disabled by default.

2.1.3.2. avxbuild

pyvgx.avxbuild(): Return the AVX version used when pyvgx was built.

2.1.3.3. compress

pyvgx.compress( obj ): Return bytes representing a compressed, serialized Python object obj.

2.1.3.4. cpuid

pyvgx.cpuid( leaf[, subleaf[, obj ]] ): Execute the cpuid instruction to return information about your machine’s processor. This maps directly to the CPUID instruction and requires knowledge of the CPU vendor’s documentation to supply the appropriate values in leaf and subleaf, and to interpret the return value.

By default a 4-tuple (EAX, EBX, ECX, EDX) is returned. If obj is non-zero a more verbose dict is returned.

2.1.3.5. crc32c

pyvgx.crc32c( x ): Return the CRC32-c checksum of string x.
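For reference, standard CRC-32C (Castagnoli) can be computed bit by bit in pure Python as sketched below. pyvgx.crc32c() is implemented natively; whether it uses the same initial value and final inversion, and how it encodes string input, is not stated here, so agreement between the two is an assumption. The function name crc32c_reference is hypothetical.

```python
def crc32c_reference(data: bytes) -> int:
    """Bitwise CRC-32C using the reflected Castagnoli polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial when the low bit was set
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

check = crc32c_reference(b"123456789")
# check == 0xE3069283, the standard CRC-32C check value
```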

2.1.3.6. decode_utf8

pyvgx.decode_utf8( utf8 ): Decode bytes utf8 to string.

2.1.3.7. decompress

pyvgx.decompress( bytes ): Return Python object represented by bytes previously generated by pyvgx.compress().

2.1.3.8. deserialize

pyvgx.deserialize( bytes ): Return Python object represented by bytes previously generated by pyvgx.serialize().

2.1.3.9. enable_selftest

pyvgx.enable_selftest( enable ): Specify enable True or False to globally enable or disable selftest() and selftest_all(). The selftest functions are enabled by default.

2.1.3.10. encode_utf8

pyvgx.encode_utf8( string ): Encode string to utf8 bytes.

2.1.3.11. ihash128

pyvgx.ihash128( x ): Compute a 128-bit hash of integer x and return the result as a string of 32 hex digits.

2.1.3.12. ihash64

pyvgx.ihash64( x ): Return a 64-bit integer hash of integer x.

2.1.3.13. initadmin

pyvgx.initadmin(): Initialize basic administrative plugins, without the need for system.Initialize().

2.1.3.14. LogDebug

pyvgx.LogDebug( message ): Output a log message with the DEBUG label. This may have no effect if debug messages are disabled in the core as is typical in release builds of the software. For debug messages to be emitted a debug build of pyvgx may be necessary.

2.1.3.15. LogError

pyvgx.LogError( message ): Output a log message with the ERROR label.

2.1.3.16. LogInfo

pyvgx.LogInfo( message ): Output a log message with the INFO label.

2.1.3.17. LogTimestamp

pyvgx.LogTimestamp( obj[, ts[, clf]] )

Write the contents of obj to the log output file specified by a previous call to OpenAccessLog(). If no output is open the call to LogTimestamp() is silently ignored.

Supported types for obj are: string, PluginRequest and PluginResponse.

When timestamp ts is not provided the current system time is implied. When a specific timestamp is provided it is interpreted according to type. An integer is interpreted as nanoseconds since 1970. A float is interpreted as seconds since 1970.
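The type-dependent interpretation of ts can be illustrated with a small helper that maps both forms to the same point in time. The function interpret_ts is hypothetical and only mirrors the rule stated above: an integer is nanoseconds since 1970 and a float is seconds since 1970.

```python
from datetime import datetime, timezone

def interpret_ts(ts):
    """Convert a LogTimestamp-style ts value to an aware UTC datetime (sketch)."""
    if isinstance(ts, float):
        seconds = ts                      # float: seconds since 1970
    elif isinstance(ts, int):
        seconds = ts / 1_000_000_000      # int: nanoseconds since 1970
    else:
        raise TypeError("ts must be int (nanoseconds) or float (seconds)")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With this rule, the integer 1700000000000000000 and the float 1700000000.0 denote the same instant. Passing an integer number of seconds by mistake would be read as nanoseconds and land within seconds of 1970.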

When obj is a PluginRequest or PluginResponse the log entry may be written in the Common Log Format when clf is True. When obj is a string clf is ignored.

Common Log Format (clf=1)

127.0.0.1 - - [11/Jun/2025:16:43:41 -0400] "GET /vgx/plugin/myservice HTTP/1.1" 200 -

Plain Output (clf=0 default)

[2025-06-11 16:25:28.984]  <string_value_if_obj_is_string>
[2025-06-11 16:43:37.640]  GET /vgx/plugin/myservice <request_serial_number>

2.1.3.18. LogWarning

pyvgx.LogWarning( message ): Output a log message with the WARNING label.

2.1.3.19. meminfo

pyvgx.meminfo(): Return a 2-tuple (total, used) where total is the total system memory (bytes) and used is the memory (bytes) currently in use by this VGX instance.

2.1.3.20. OpenAccessLog

pyvgx.OpenAccessLog( filepath ): Enable asynchronous log output to filepath. Use LogTimestamp() to write log entries to the file.

Passing filepath=None closes the file and disables further logging.

2.1.3.21. popcnt

pyvgx.popcnt( x ): Return the number of bits set to one in integer x.

2.1.3.22. profile

pyvgx.profile(): Execute a basic system performance benchmark and print results to stdout.

2.1.3.23. rstr

pyvgx.rstr( n ): Return a random string value with n characters [a-z].

2.1.3.24. selftest

pyvgx.selftest( testroot[, library[, names[, force]]] ): Run selected internal test(s) of the core libraries, using directory testroot to store temporary data on disk. Allowed library values are vgx, comlib, framehash. Supply list of strings in names to specify individual tests defined for the library. These names are subject to change and not documented here. Use force=True to delete existing output files generated by previous tests.

2.1.3.25. selftest_all

pyvgx.selftest_all( testroot ): Run all internal tests of the core libraries, using directory testroot to store temporary data on disk.

2.1.3.26. serialize

pyvgx.serialize( obj ): Return bytes representing a serialized Python object obj.

2.1.3.27. SetOutputStream

pyvgx.SetOutputStream( filepath ): Redirect all log output to the file specified by filepath. Set filepath=None to redirect all log output to stderr. This is the default.

2.1.3.28. sha256

pyvgx.sha256( x ): Compute sha256 of string x and return the result as a string of 64 hex digits.

2.1.3.29. strhash128

pyvgx.strhash128( x ): Return a 128-bit (hex string) hash of string x.

2.1.3.30. strhash64

pyvgx.strhash64( x ): Return a 64-bit integer hash of string x.

2.1.3.31. threadid

pyvgx.threadid(): Return the integer ID of the current thread.

2.1.3.32. threadinit

pyvgx.threadinit( [seed] ): Seed the internal random generators with unique starting points for the current thread.

2.1.3.33. threadlabel

pyvgx.threadlabel( label ): Assign a string label (up to 15 characters) identifying the calling thread. This can be changed at any time and as many times as you need. It can be useful in certain debugging scenarios as it allows certain error messages to be more easily identified and interpreted.

2.1.3.34. timestamp

pyvgx.timestamp(): Return the number of seconds since computer system boot time.

2.1.3.35. tokenize

pyvgx.tokenize( text ): Return a list of all tokens in UTF-8 encoded string text using simple, pre-defined tokenization rules. This tokenizer preserves case and accents. Tokens are sequences of word characters unbroken by punctuation or whitespace. Punctuation characters become individual tokens. A partial definition of word characters and punctuation has been implemented for the Unicode Basic Multilingual Plane, from ASCII through Cyrillic (codepoints 0 - 0x4FF).
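The tokenization rules described above can be approximated with a regular expression. This is a rough sketch and not the pyvgx tokenizer: Python's \w class also matches digits and the underscore and covers all of Unicode, so results may differ from pyvgx.tokenize() for some inputs. The function tokenize_sketch is hypothetical.

```python
import re

# A run of word characters, or a single character that is neither
# a word character nor whitespace (treated as punctuation)
_TOKEN = re.compile(r"\w+|[^\w\s]")

def tokenize_sketch(text):
    """Approximate word and punctuation tokenization, preserving case and accents."""
    return _TOKEN.findall(text)

tokens = tokenize_sketch("Héllo, World!")
# tokens == ["Héllo", ",", "World", "!"]
```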

2.1.3.36. tokenize_min

pyvgx.tokenize_min( text ): Return a list of normalized and lowercased non-punctuation tokens in UTF-8 encoded string text.

2.1.3.37. version

pyvgx.version( [verbosity] ): Return a string containing version information for various system components. Pass an integer to the optional verbosity parameter. Higher verbosity produces more version information.

2.1.3.38. vgxrpndefs

pyvgx.vgxrpndefs(): Return a dict containing all functions and constants supported by the VGX Expression Language.

2.2. Constants

Several types of integer constants are available for use with API methods. The first letter of the constant name indicates the type of constant:

Arc Direction Constants (D_) are used to specify arc direction in queries that perform arc filtering and traversal.
Arc Modifier Constants (M_) specify the value modifiers for arc relationships. They may be used when creating arcs and in query filters.
Value Condition Constants (V_) are used in query filters to specify a value condition when matching arcs or vertex properties.
Collect Constants (C_) are used in neighborhood queries to control how results are produced.
Result Field Constants (F_) are used in queries to specify which attributes to return in search results.
Result List Entry Constants (R_) are used in queries to specify the general format of search results.
Sort Specification Constants (S_) are used in queries to specify how search results are sorted.
Timestamp Constants (T_) represent various internal timestamp limits.
Opcode Constants (op.OP_) represent operator opcodes, filters and profiles.

Additionally, the following constants can be used to reference Evaluator Memory Register slots.

pyvgx.Rn: Memory register Rn index = -n where n is 1, 2, 3 or 4.

2.3. Objects

pyvgx.DefaultSimilarity: A built-in Similarity Object used for creating and comparing stand-alone Vector Objects.

2.4. Exceptions

A number of specialized pyvgx exceptions are defined. They are summarized below.

2.4.1. AccessError

exception pyvgx.AccessError: Sufficient access to a vertex, graph, or other object could not be obtained.

2.4.2. ArcError

exception pyvgx.ArcError: An invalid arc specification was used, or a delete operation could not complete due to locked terminal(s).

2.4.3. DataError

exception pyvgx.DataError: Bad configuration or corrupted data found.

2.4.4. EnumerationError

exception pyvgx.EnumerationError: The core system was unable to translate between external and internal representations of data.

2.4.5. InternalError

exception pyvgx.InternalError: An unknown, internal error has occurred.

2.4.6. OperationTimeout

exception pyvgx.OperationTimeout: A system management operation could not be completed.

2.4.7. PluginError

exception pyvgx.PluginError: Internal error encountered during plugin execution.

2.4.8. QueryError

exception pyvgx.QueryError: A graph query is invalid.

2.4.9. RequestError

exception pyvgx.RequestError: Incorrect parameters or invalid data encountered when processing a PluginRequest object.

2.4.10. ResponseError

exception pyvgx.ResponseError: Incorrect parameters or invalid data encountered when processing a PluginResponse object.

2.4.11. ResultError

exception pyvgx.ResultError: A result could not be generated after query completion.

2.4.12. SearchError

exception pyvgx.SearchError: The execution of a graph query failed due to an internal error.

2.4.13. VertexError

exception pyvgx.VertexError: A vertex was assigned an invalid name, type, or other attribute.

3. Graph Objects

The pyvgx.Graph type represents graph objects.

3.1. Graph Class

class pyvgx.Graph( name[, path[, local[, timeout ]]] ): Create a Python wrapper object for VGX graph name, which is created if it does not already exist. Optionally store data on disk in directory path under vgxroot.

Graph operations are normally broadcast to remote instances when attached but this can be overridden by setting local=True. Local graphs are private to the local VGX instance.

Allow blocking for timeout milliseconds if VGX graph is not immediately available for ownership by current thread.

3.2. Graph Attributes

name: The graph’s name as defined by the graph constructor’s name parameter.

path: The graph’s full path as defined by a combination of vgxroot and the graph constructor’s (optional) path parameter.

size: The graph size is the number of explicit connections (arcs) in the graph.

order: The graph order is the number of vertices in the graph.

objcnt: The graph objcnt is a dict of object counters:
{'order': o, 'size': s, 'properties': p, 'vectors': v}

ts: The current timestamp (floating point) of the graph, measured in seconds since 1970.

sim: Return the graph’s pyvgx.Similarity object used for configuring vector similarity computation.

3.3. Graph Methods

3.3.1. Accumulate

Accumulate( initial, relationship, terminal[, delta[, timeout ]] ): Accumulate the floating point value for M_ACC arc of type relationship from initial to terminal (names or writable vertex objects.) The accumulation value is specified by delta parameter, which defaults to 1.0. Negative delta is allowed. If the arc does not exist it is created with a value equal to delta. An optional timeout (in milliseconds) allows blocking while waiting for vertex access. The default is nonblocking. Returns the new floating point value of the arc after accumulation has been applied.

3.3.2. Adjacent

Adjacent( id[, arc[, neighbor[, filter[, memory[, timeout[, limexec ]]]]]]): Return True if the vertex id (name or readable vertex object) has a neighbor matching arc, neighbor and filter query conditions, otherwise return False. Evaluation of filter may optionally use the specified memory object (advanced use case.) By default this method will not block. Optional timeout (in milliseconds) and hard execution limit limexec can be specified. See Adjacent() for details.

3.3.3. Aggregate

Aggregate( id[, … ] ): Perform aggregation of property values in the neighborhood of id (name or readable vertex object.) See Aggregate() for usage.

3.3.4. ArcValue

ArcValue( initial, arc, terminal[, timeout[, limexec ]] ): Return the value of an explicit connection from initial to terminal (names or readable vertex objects) matching the specified arc filter. The optional timeout (in milliseconds) allows blocking while waiting for vertex access, or sets an upper execution time limit when limexec is true. The default is nonblocking without execution limit. See ArcValue() for details.

3.3.5. Arcs

Arcs( [ … ] ): Perform a global arc search in the graph. See Arcs() for usage.

3.3.6. ClearGraphReadonly

ClearGraphReadonly(): Make a readonly graph writable. OperationTimeout is raised if the graph cannot become writable at this time.

3.3.7. Close

Close(): Close graph instance. This has the same effect as using the del operator on the Python graph instance.

3.3.8. CloseAll

CloseAll()

Close all vertices opened by the current thread.

ADVICE: Do not re-open any vertices in the same scope after CloseAll() has been called.

This method releases all vertex locks directly in the VGX core without notifying the Python objects (i.e. pyvgx.Vertex wrappers) referencing those core vertices. If access is later attempted via a pyvgx.Vertex instance whose core vertex was closed an exception will be raised.

Furthermore, and more importantly, if the core vertex is later re-opened in another part of the program while the original pyvgx.Vertex instance has not gone out of scope, that stale instance regains access to the core vertex, including the ability to close it. This will almost certainly have unintended consequences for the part of the program that is the legitimate owner of the core vertex.

3.3.9. CloseVertex

CloseVertex( vertex_object )

Release the vertex_object access lock. Returns True if lock was released, False otherwise. Although the Python vertex_object object still exists it can no longer be used to access the graph. Deleting vertex_object using Python’s del operator, or waiting for vertex_object to go out of scope implicitly releases the vertex lock.

Implicit vertex release depends on Python’s garbage collection. The only way to guarantee immediate release is to call CloseVertex().
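One way to guarantee the explicit release is to wrap acquisition and CloseVertex() in a context manager. The following is a minimal sketch using only GetVertex() and CloseVertex() as documented in this section. The helper readonly_vertex is hypothetical, and graph is assumed to be a pyvgx.Graph instance.

```python
from contextlib import contextmanager

@contextmanager
def readonly_vertex(graph, vertex_id):
    """Open a vertex readonly and always release its lock on exit (sketch)."""
    vertex = graph.GetVertex(vertex_id)
    try:
        yield vertex
    finally:
        # Explicit release, independent of Python garbage collection
        graph.CloseVertex(vertex)
```

Code using the helper should not keep a reference to the vertex object after the with-block ends, since the object can no longer be used to access the graph.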

3.3.10. CloseVertices

CloseVertices( vertex_objects ): Release one lock for each vertex object in vertex_objects list. Returns the number of vertices for which a lock was released, which may be less than the length of vertex_objects if any of those vertices were already released.

3.3.11. CommitAll

CommitAll(): Commit modifications to all write-locked vertices owned by the current thread, and keep vertices open. Return the number of write-locked vertices.

3.3.12. Connect

Connect( initial, arc, terminals[, lifespan[, condition[, timeout ]]] )

Create explicit connection(s) between vertices by inserting an arc from the initial vertex to each vertex in terminals, which may be a list or a singleton. A vertex may be given as a string (vertex ID) or a writable vertex object.

The arc encapsulates a relationship type, modifier and value. Specify arc parameter as a tuple of zero to three elements using arc insertion syntax:

arc ::= ( [relationship[, modifier[, value]]] )

Omitted arc elements default to "__related__", M_STAT, and 1, respectively.

Optionally specify arc lifespan in seconds, after which the relationship is automatically deleted.

Specifying lifespan implicitly creates (or updates) three additional timestamp arcs for the relationship with modifiers M_TMC (creation time), M_TMM (modification time), and M_TMX (expiration time). At or after the expiration time all arcs from initial to terminal sharing the same relationship are automatically deleted.

Save memory by using M_FWDONLY modifier bitmask to prevent implicit creation of reverse arc from terminal back to initial.

When the optional condition parameter is supplied the arc is created or updated only if the condition is met. The condition references an already existing arc from initial to terminal, and it may reference an arc with any relationship or modifier. The condition parameter uses arc filter syntax:

condition ::= ( [relationship[, direction[, modifier[, condition, value]]]] )

An optional timeout (in milliseconds) allows blocking while waiting for vertex access. The default is nonblocking. Returns 1 if a new arc was created, 0 otherwise.
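The defaulting rule for the arc insertion syntax can be illustrated with a small pure-Python model. The function normalize_arc is hypothetical and not part of pyvgx. Modifiers are represented by their names as strings (for example "M_STAT") purely for illustration; real code would pass the pyvgx modifier constants. A plain string is treated as a relationship name, as in the introductory example g.Connect( "A", "to", "B" ).

```python
def normalize_arc(arc=()):
    """Fill omitted arc elements with the documented defaults (sketch)."""
    if isinstance(arc, str):
        arc = (arc,)                      # relationship name only
    defaults = ("__related__", "M_STAT", 1)
    if len(arc) > len(defaults):
        raise ValueError("arc takes at most three elements")
    # Keep the supplied elements and append defaults for the rest
    return tuple(arc) + defaults[len(arc):]
```

For example, normalize_arc( ("likes",) ) yields ("likes", "M_STAT", 1), while a fully specified arc such as ("visits", "M_CNT", 3) is returned unchanged.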

3.3.13. Count

Count( initial, relationship, terminal[, delta[, timeout ]] ): Increment the integer value for M_CNT arc of type relationship from initial to terminal (names or writable vertex objects.) The increment value is specified by delta parameter, which defaults to 1. Negative delta is allowed. If the arc does not exist it is created with a value equal to delta. An optional timeout (in milliseconds) allows blocking while waiting for vertex access. The default is nonblocking. Returns the new integer value of the arc after increment has been applied.

3.3.14. CountDefinitions

CountDefinitions(): Return the number of pre-defined expressions (See Define().)

3.3.15. CreateVertex

CreateVertex( id[, type[, lifespan[, properties ]]] ): Create a new vertex identified by id and with optional type. An optional lifespan (in seconds) may be specified to trigger automatic vertex deletion when the vertex reaches the maximum age (which is infinite by default.) A dict of { str : object } pairs may be passed in properties to initialize or update the vertex properties. Returns 1 if the vertex is created, 0 if the vertex already exists.

3.3.16. DebugCheckAllocators

DebugCheckAllocators( [ name ] ): Perform allocator consistency checks. By default all allocators are checked. Select a specific allocator with name. This method returns None. If internal allocator problems are detected an exception is raised.

3.3.17. DebugDumpGraph

DebugDumpGraph(): Print internal graph information.

3.3.18. DebugFindObjectByIdentifier

DebugFindObjectByIdentifier( identifier ): Return a representation of an internal object identified by string identifier.

3.3.19. DebugGetObjectByAddress

DebugGetObjectByAddress( address ): Return a representation of an internal object located at memory address.

3.3.20. DebugPrintAllocators

DebugPrintAllocators( [ name ] ): Print internal allocator information to stdout. By default all allocators are dumped. Select a specific allocator with name.

3.3.21. Define

Define( expression ): Create a new function formula that can be used by queries for filtering and ranking. The expression parameter is a string of the form <name> := <formula>. See Define() for details.

3.3.22. Degree

Degree( id[, arc[, filter[, timeout[, limexec ]]]] ): Return the number of arcs incident on id (name or readable vertex object) matching the optional arc and filter conditions. The optional timeout (in milliseconds) allows blocking while waiting for vertex access, or sets an upper execution time limit when limexec is true. The default is nonblocking without execution limit. See Degree() for details.

3.3.23. DeleteVertex

DeleteVertex( id[, timeout ] ): Remove the vertex identified by id from the graph. An optional timeout (in milliseconds) can be specified to wait for vertex access in case it may be in use by other threads.

3.3.24. Dimension

Dimension( code ): Feature vectors: Return vector dimension associated with enumeration code if mapping exists, otherwise raise EnumerationError.

Euclidean vectors: N/A

3.3.25. Disconnect

Disconnect( id[, arc[, neighbor[, timeout ]]] ): Remove one or more explicit connections incident on the specified id (name or writable object). The optional arc and neighbor filters are used to limit arc removal to only those arcs matching the filter criteria. An optional timeout (in milliseconds) allows blocking while waiting for necessary graph access to be obtained. Returns the number of arcs removed.

3.3.26. EnumDimension

EnumDimension( dim ): Feature vectors: Return enumeration code for vector dimension dim after defining mapping if it does not already exist.

Euclidean vectors: N/A

3.3.27. EnumKey

EnumKey( key ): Return enumeration code for property key after defining mapping if it does not already exist.

3.3.28. EnumRelationship

EnumRelationship( rel ): Return enumeration code for relationship type rel after defining mapping if it does not already exist.

3.3.29. EnumValue

EnumValue( string ): Return enumeration code for property value string after defining mapping if it does not already exist.

3.3.30. EnumVertexType

EnumVertexType( vtype ): Return enumeration code for vertex type vtype after defining mapping if it does not already exist.

3.3.31. Erase

Erase(): Remove all graph data from memory and disk. Only a basic file structure for the graph will remain.

3.3.32. EscalateVertex

EscalateVertex( readonly_vertex_object[, timeout ] ): Promote acquisition status from readonly to writable for readonly_vertex_object without intermediate release. An optional timeout (in milliseconds) allows blocking while waiting for writable access to be obtained. The default is nonblocking. If the vertex is already writable or the readonly vertex cannot be acquired writable AccessError is raised. This method returns nothing.

3.3.33. Evaluate

Evaluate( expression[, tail[, arc[, head[, vector[, memory ]]]]] ): Execute the expression, which is a string defining a new formula, or referencing a pre-defined formula. The optional parameters are used to supply information that may be referenced in the formula. See Evaluate() for details.

3.3.34. EventBacklog

EventBacklog(): Return a string representing the current state of the internal event processor.

3.3.35. EventDisable

EventDisable(): Disable the internal event processor. This halts the time-to-live processing of vertices and arcs, i.e. TTL will be disabled.

3.3.36. EventEnable

EventEnable(): Start the internal event processor. This resumes the time-to-live processing of vertices and arcs, i.e. TTL will be enabled.

3.3.37. EventFlush

EventFlush(): Manually move all pending internal events from queues to their respective schedules. The method is provided for debugging purposes.

3.3.38. EventParam

EventParam(): Return a dictionary of parameters currently in effect for the event processor.

3.3.39. GetDefinition

GetDefinition( name ): Return the expression string of previously defined formula name.

3.3.40. GetDefinitions

GetDefinitions(): Return a list of all previously defined formulas.

3.3.41. GetMemoryUsage

GetMemoryUsage( [ metric ] ): Return current memory usage information for graph.

3.3.42. GetOpenVertices

GetOpenVertices( [ threadid ] ): Return a list of all of this graph’s vertices currently acquired by all threads (the default), or by a single thread specified by threadid.

3.3.43. GetVertex

GetVertex( id ): Return a pyvgx.Vertex object identified by id in readonly mode.

3.3.44. GetVertexID

GetVertexID( [ offset ] ): Return the identifier string for the vertex specified by its internal integer offset. If offset is not specified return the identifier string of a random vertex in the graph. Vertices are enumerated by offsets in the range [0, order-1]. A negative offset counts from the end. Out of range offsets raise IndexError.
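
The offset rules can be modeled in plain Python. The sketch below does not call pyvgx: the helper name resolve_offset is hypothetical, and treating -1 as the last vertex (as in Python sequence indexing) is an assumption about how negative offsets are counted.

```python
def resolve_offset(offset: int, order: int) -> int:
    """Map a possibly negative offset to a position in [0, order-1].

    Hypothetical model of the documented rule, not a pyvgx function.
    """
    if not -order <= offset < order:
        raise IndexError("vertex offset out of range")
    # Negative offsets count from the end (assumed: -1 is the last vertex)
    return offset + order if offset < 0 else offset
```

For a graph of order 5, offset -1 resolves to position 4, while offsets 5 and -6 raise IndexError.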

3.3.45. HasVertex

HasVertex( id ): Return True if vertex identified by id exists in this graph, otherwise return False.

3.3.46. Inarcs

Inarcs( id[, hits[, timeout[, limexec ]]] ): Return a list of all inarcs of id (name or readable object), up to a maximum of hits entries. The default is to return all inarcs. The optional timeout (in milliseconds) allows blocking while waiting for vertex access, or sets an upper execution time limit when limexec is true. The default is nonblocking without execution limit.

3.3.47. Initials

Initials( id[, … ] ): Return a list of vertex names of all vertices with explicit connection(s) to the vertex id (name or readable object), optionally filtered and sorted as specified by other parameters. See Initials() for details.

3.3.48. IsDefined

IsDefined( name ): Return True if the named expression has been defined with Define( "name := …" ), otherwise return False.

3.3.49. IsGraphLocal

IsGraphLocal(): Return True if the graph is local only, otherwise False.

3.3.50. IsGraphReadonly

IsGraphReadonly(): Return True if the graph is currently readonly, otherwise False.

3.3.51. Key

Key( code ): Return property key associated with enumeration code if mapping exists, otherwise raise EnumerationError.

3.3.52. Lock

Lock( [ id[, linger[, timeout ]]] ): Acquire and return a mutex lock object. All arguments are optional.

3.3.53. Memory

Memory( [initializer] ): Return a new pyvgx.Memory object associated with this graph. See Evaluator Memory for usage details.

3.3.54. Neighborhood

Neighborhood( id[, … ] ): Perform a neighborhood search in the graph starting at id (name or readable vertex object.) See Neighborhood() for usage.

3.3.55. NewAdjacencyQuery

NewAdjacencyQuery( [ … ] ): Create a new reusable pyvgx.Query object for performing adjacency tests in the graph. This method takes the same arguments as pyvgx.Graph.Adjacent(), except for timeout and limexec which are instead specified as arguments to pyvgx.Query.Execute(). It is possible to omit the id argument (anchor vertex) when creating the query object, but pyvgx.Query.id needs to be assigned before calling pyvgx.Query.Execute().

3.3.56. NewAggregatorQuery

NewAggregatorQuery( [ … ] ): Create a new reusable pyvgx.Query object for performing aggregations in the graph. This method takes the same arguments as pyvgx.Graph.Aggregate(), except for timeout and limexec which are instead specified as arguments to pyvgx.Query.Execute(). It is possible to omit the id argument (anchor vertex) when creating the query object, but pyvgx.Query.id needs to be assigned before calling pyvgx.Query.Execute().

3.3.57. NewArcsQuery

NewArcsQuery( [ … ] ): Create a new reusable pyvgx.Query object for performing global arc searches in the graph. This method takes the same arguments as pyvgx.Graph.Arcs(), except for hits, offset, timeout, and limexec which are instead specified as arguments to pyvgx.Query.Execute().

3.3.58. NewNeighborhoodQuery

NewNeighborhoodQuery( [ … ] ): Create a new reusable pyvgx.Query object for performing neighborhood searches in the graph. This method takes the same arguments as pyvgx.Graph.Neighborhood(), except for hits, offset, timeout, and limexec which are instead specified as arguments to pyvgx.Query.Execute(). It is possible to omit the id argument (anchor vertex) when creating the query object, but pyvgx.Query.id needs to be assigned before calling pyvgx.Query.Execute().

3.3.59. NewVertex

NewVertex( id[, type[, lifespan[, properties[, timeout ]]]] ): Return a pyvgx.Vertex object identified by id. An optional type may be specified. An optional lifespan (in seconds) may be specified to trigger automatic vertex deletion when the vertex reaches the maximum age (which is infinite by default.) A dict of { str : object } pairs may be passed in properties to initialize or update the vertex properties. If the vertex does not already exist it is created and committed. The returned vertex is acquired in the writable state. By default this method will not block. An optional timeout (in milliseconds) can be specified.

3.3.60. NewVerticesQuery

NewVerticesQuery( [ … ] ): Create a new reusable pyvgx.Query object for performing global vertex searches in the graph. This method takes the same arguments as pyvgx.Graph.Vertices(), except for hits, offset, timeout, and limexec which are instead specified as arguments to pyvgx.Query.Execute().

3.3.61. OpenNeighbor

OpenNeighbor( id[, arc[, mode[, timeout]]] ): Acquire and return one of the neighbors of vertex id, optionally filtered by arc. The returned pyvgx.Vertex object will be acquired according to mode, which may be 'r' (the default) or 'a'. The optional timeout (in milliseconds, default=0) allows blocking while waiting for vertex access.

3.3.62. OpenVertex

OpenVertex( id[, mode[, timeout ]] ): Acquire and return a pyvgx.Vertex object identified by id, which may be a string or a memory address. Use the optional mode parameter to control how the vertex is acquired. Acquisition is recursive. Mode 'w' acquires the vertex writable after implicitly creating and committing the vertex if it does not already exist. Mode 'r' acquires an already existing vertex readonly. Mode 'a' (the default) acquires the vertex writable without creating. Supplying an integer id (memory address) is only supported for modes 'r' and 'a'.

3.3.63. OpenVertices

OpenVertices( idlist[, mode[, timeout ]] )

Acquire and return multiple pyvgx.Vertex objects identified by strings or vertex instances in idlist. Use the optional mode parameter to control how the vertices are acquired. Acquisition is recursive. Mode 'r' acquires vertices readonly. Mode 'a' (the default) acquires vertices writable. All vertices must be acquired for this method to succeed. If one or more vertices cannot be acquired AccessError is raised.

When VGX is attached to an output stream all output is halted while one or more vertices are acquired writable via this method. All graph operations performed while at least one writable acquisition by this method is in effect will be queued internally and emitted in bulk once all such acquired vertices have been released.

3.3.64. Order

Order( [ type ] ): Return the number of vertices in the graph, optionally restricted to vertices of the specified type.

3.3.65. Outarcs

Outarcs( id[, hits[, timeout[, limexec ]]] ): Return a list of all outarcs of id (name or readable object), up to a maximum of hits entries. The default is to return all outarcs. The optional timeout (in milliseconds) allows blocking while waiting for vertex access, or sets an upper execution time limit when limexec is true. The default is nonblocking without execution limit.

3.3.66. PropertyKeys

PropertyKeys(): Return dict of all existing property key mappings.

3.3.67. PropertyStringValues

PropertyStringValues(): Return dict of all existing property string value mappings.

3.3.68. Relationship

Relationship( code ): Return relationship type associated with enumeration code if mapping exists, otherwise raise EnumerationError.

3.3.69. Relationships

Relationships(): Return dict of all existing relationship mappings.

3.3.70. RelaxVertex

RelaxVertex( writable_vertex_object ): Relax acquisition status from writable to readonly for writable_vertex_object without intermediate release, then return True. If the vertex was acquired writable more than once, this method has the same effect as CloseVertex() and False is returned. This method will never block.

3.3.71. ResetCounters

ResetCounters(): Reset query counters.

3.3.72. ResetSerial

ResetSerial( [ sn ] ): Force the graph input serial number to 0, or to sn if provided. This allows already consumed operation data to be re-submitted.

3.3.73. Save

Save( [ timeout[, force[, remote ]]] ): Persist the graph to disk. An optional timeout (in milliseconds) allows blocking while waiting for the entire graph to become idle in order for the operation to proceed. The default is 1000. Data is normally saved incrementally, i.e. only modified structures are written to disk. To perform a complete serialization set force to True. Remote destinations will be notified about the local persist only if remote is set to True (default is False.)

3.3.74. Search

Search( [ … ] ): Perform a generic graph search and print the results in human readable form to stdout. See Search() for usage.

3.3.75. SetGraphReadonly

SetGraphReadonly( [ timeout ] ): Make the graph readonly. An optional timeout (in milliseconds) allows blocking while waiting for graph to become idle. The default is nonblocking. OperationTimeout is raised if the graph cannot enter readonly mode. No vertices can be acquired writable when a graph is readonly. However, queries are still allowed.

3.3.76. ShowOpenVertices

ShowOpenVertices(): Print a summary of all of this graph’s vertices currently acquired by all threads.

3.3.77. ShowVertex

ShowVertex( id [, timeout ] ): Print a representation of internal data for the vertex identified by id. This method does not block by default. An optional timeout may be specified.

3.3.78. Size

Size(): Return the number of explicit connections in the graph.

3.3.79. Status

Status(): Return a dictionary of various information, counters, and resource usage for the graph.

3.3.80. Sync

Sync( [ hard[, timeout ]] ): Update data on all attached destinations to match this VGX source instance. When the optional parameter hard (bool) is True the destination VGX instances are truncated before they are re-populated with data from the source. The default is hard=False, which means the destinations are not truncated prior to receiving the source data.

The optional timeout (in milliseconds) has a default of 30000, and is applied at multiple stages internally when switching states.

3.3.81. Synchronized

Synchronized( function, *args, **kwds ): Call function( *args, **kwds ) in a synchronized context and return its returned value. Only one thread of execution will be allowed to perform a synchronized call at a time, even if the called functions are different.
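
The calling convention can be modeled with a single process-wide lock. This is a plain-Python sketch of the described behavior, not the pyvgx implementation: the function name synchronized and the module-level lock are hypothetical, and the use of a reentrant lock (so that a synchronized function may itself make a synchronized call) is an assumption.

```python
import threading

# One lock shared by every synchronized call, regardless of which function runs
_sync_lock = threading.RLock()

def synchronized(function, *args, **kwds):
    """Call function(*args, **kwds) while holding the shared lock."""
    with _sync_lock:
        return function(*args, **kwds)
```

Because all calls share one lock, two different functions passed to synchronized() never run at the same time.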

3.3.82. Terminals

Terminals( id[, … ] ): Return a list of vertex names of all vertices with explicit connection(s) from the vertex id (name or readable object), optionally filtered and sorted as specified by other parameters. See Terminals() for details.

3.3.83. Truncate

Truncate( [ type ] ): Delete all graph data, or if type is specified erase all vertices of that type. All arcs incident on the removed vertices will also be removed.

3.3.84. Value

Value( code ): Return property string value associated with enumeration code if mapping exists, otherwise raise EnumerationError.

3.3.85. VertexDescriptor

VertexDescriptor( id ): Return a string representing internal descriptor data for the vertex identified by id.

3.3.86. VertexIdByAddress

VertexIdByAddress( address ): Return the identifier of vertex at memory address.

3.3.87. VertexType

VertexType( code ): Return vertex type associated with enumeration code if mapping exists, otherwise raise EnumerationError.

3.3.88. VertexTypes

VertexTypes(): Return dict of all existing vertex type mappings.

3.3.89. Vertices

Vertices( [ … ] ): Perform a global vertex search in the graph. See Vertices() for usage.

3.3.90. VerticesType

VerticesType( type ): Return a list of names of all vertices of the given type in this graph.

4. Vertex Objects

Vertices are exposed via the Python layer as pyvgx.Vertex objects that allow access to vertices within a graph. Vertex objects are created using a suitable vertex access method of the pyvgx.Graph object. See the wrapper object description for more details around the interaction between Python objects and the core graph.

To modify a vertex (e.g. set a property) it has to be acquired writable from the graph. Query operations can be performed with either writable or readonly access. Only one thread can hold writable access to a vertex at a time. Multiple threads can hold readonly access simultaneously. Writable access and readonly access are mutually exclusive.

4.1. Vertex Class

class pyvgx.Vertex( graph, id[, type_[, mode[, timeout ]]] ): Create a Python wrapper object for VGX vertex id in graph, optionally specifying a vertex type if a new vertex is created and controlling writable or readonly access via mode. Optionally block for timeout milliseconds while waiting for vertex to become accessible to the calling thread.

4.2. Vertex Attributes

id: The vertex name
internalid: Internal 128-bit hash of vertex name
type: The vertex type name
isolated: True when vertex degree is zero, False otherwise
deg: The total number of arcs incident on this vertex
ideg: The number of inbound arcs, i.e. the number of arcs for which this vertex is a terminal vertex
odeg: The number of outbound arcs, i.e. the number of arcs for which this vertex is an initial vertex
vector: Similarity vector
properties: Dictionary of vertex properties
tmc: Vertex creation timestamp, in seconds since 1970
tmm: Vertex modification timestamp, in seconds since 1970
tmx: Vertex expiration timestamp, in seconds since 1970
rtx: Vertex remaining time to live (seconds) until expiration
c1: Dynamic rank 1st-order coefficient c1. This attribute is writable
c0: Dynamic rank 0th-order coefficient c0. This attribute is writable
b1: Special internal use: ANN seed number. This attribute is readonly
b0: Special internal use: ANN arc LSH rotate amount. This attribute is readonly
virtual: True if the vertex is virtual, False if the vertex is real
address: Vertex memory address
index: Vertex object offset in memory
bitindex: Vertex bitvector quadword offset
bitvector: Vertex bitvector quadword
op: Operation id of the last modifying graph operation for this vertex
refc: Vertex object reference count (for diagnostic purposes)
bidx: Vertex object allocator block index (for diagnostic purposes)
oidx: Vertex object allocator block offset (for diagnostic purposes)
handle: Numeric (long) vertex identifier (process independent, unlike address)
enum: Numeric (int) vertex identifier (process independent, unlike address), usable in graphs with < 2 billion vertices. (May return -1 in larger graphs, if so use handle)
descriptor: A numeric value representing various internal attributes (for diagnostic purposes)
readers: Number of readonly acquisitions for this vertex
owner: ID of thread holding one or more write-locks for this vertex. Positive integer when write-locked, 0 when no thread owns write-lock
xrecursion: Number of write-locks held by vertex owner

4.3. Vertex Methods

4.3.1. Adjacent

Adjacent( [ … ]): Shorthand for pyvgx.Graph.Adjacent( id, … ) where id is implied.

4.3.2. Aggregate

Aggregate( [ … ]): Shorthand for pyvgx.Graph.Aggregate( id, … ) where id is implied.

4.3.3. ArcLSH

ArcLSH(): Return 32 LSH bits from a region of the 64-bit LSH that does not overlap with the LSH segment used to generate a projection key.

4.3.4. ArcValue

ArcValue( [ … ]): Shorthand for pyvgx.Graph.ArcValue( initial, … ) where initial is implied.

4.3.5. AsDict

AsDict(): Return a dictionary representation of the vertex.

4.3.6. ClearExpiration

ClearExpiration(): Remove any TTL expiration for this vertex.

4.3.7. Close

Close(): Release vertex access lock.

4.3.8. Commit

Commit(): Commit all vertex modifications and mark vertex object as dirty. This has the same effect as CloseVertex() but without releasing the lock. Returns the graph operation id.

4.3.9. Debug

Debug(): Print various internal vertex object information.

4.3.10. Degree

Degree( [ … ]): Shorthand for pyvgx.Graph.Degree( id, … ) where id is implied.

4.3.11. Descriptor

Descriptor(): Return a string representation of the vertex descriptor.

4.3.12. Escalate

Escalate( [ timeout ] ): Promote vertex acquisition status from readonly to writable. An optional timeout (in milliseconds) allows blocking while waiting for writable access to be obtained. The default is nonblocking. If the vertex is already writable or the readonly vertex cannot be acquired writable AccessError is raised.

4.3.13. GetExpiration

GetExpiration(): Return the vertex expiration time in seconds since 1970.

4.3.14. GetProperties

GetProperties(): Return all vertex properties as a dictionary.

4.3.15. GetProperty

GetProperty( name[, default ] ): Return the value of vertex property name. If the property does not exist the default value is returned. If no default is specified, None is returned.

Python dictionary syntax is also supported: vertex[ name ]. However, in this case LookupError is raised if the property does not exist.

4.3.16. GetRank

GetRank(): Return the currently assigned ranking coefficients.

4.3.17. GetType

GetType(): Return the vertex type.

4.3.18. GetTypeEnum

GetTypeEnum(): Return the vertex type enumeration code.

4.3.19. GetVector

GetVector(): Return a pyvgx.Vector object representing the similarity vector assigned to vertex. If vertex has no similarity vector a null-vector is returned.

4.3.20. HasProperties

HasProperties(): Return True if vertex has any properties, False otherwise.

4.3.21. HasProperty

HasProperty( name[, value_filter ] ): Return True if the vertex has a property called name optionally matching value_filter, otherwise False.

Python dictionary syntax is also supported: name in vertex

4.3.22. HasVector

HasVector(): Return True if the vertex has a similarity vector, otherwise False.

4.3.23. Inarcs

Inarcs( [ … ]): Shorthand for pyvgx.Graph.Inarcs( id, … ) where id is implied.

4.3.24. IncProperty

IncProperty( name[, increment ] ): Increment the vertex property name by numeric increment value, which defaults to 1. Positive and negative numbers are allowed. If the property does not exist it is created and initialized to increment. The resulting value of property name after increment is returned.
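
The create-or-increment rule can be modeled with a plain dict standing in for the vertex properties. This sketch does not call pyvgx, and the helper name inc_property is hypothetical.

```python
def inc_property(properties: dict, name: str, increment=1):
    """Model of the documented rule on a plain dict (not a pyvgx call)."""
    # A missing property behaves as if it started at zero, so it ends up
    # initialized to the increment itself
    properties[name] = properties.get(name, 0) + increment
    return properties[name]
```

Starting from an empty dict, inc_property( props, "hits" ) returns 1, and a following inc_property( props, "hits", -3 ) returns -2.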

4.3.25. Initials

Initials( [ … ]): Shorthand for pyvgx.Graph.Initials( id, … ) where id is implied.

4.3.26. IsExpired

IsExpired(): Return True if this vertex has an expiration timestamp in the past, otherwise False. It is possible for the vertex to be expired and not deleted if it is currently in use.

4.3.27. IsVirtual

IsVirtual(): Return True if the vertex is virtual. Return False if the vertex is real.

4.3.28. items

items(): Return a list of all key-value pairs for all properties of this vertex.

4.3.29. keys

keys(): Return a list of all property names of this vertex.

4.3.30. Neighborhood

Neighborhood( [ … ]): Shorthand for pyvgx.Graph.Neighborhood( id, … ) where id is implied.

4.3.31. Neighbors

Neighbors( [ … ]): Shorthand for combining the results of Initials( … ) and Terminals( … ) into one list.

4.3.32. NumProperties

NumProperties(): Return the number of properties for this vertex.

4.3.33. OpenNeighbors

OpenNeighbors( [ … ]): Shorthand for pyvgx.Graph.OpenNeighbors( id, … ) where id is implied.

4.3.34. Outarcs

Outarcs( [ … ]): Shorthand for pyvgx.Graph.Outarcs( id, … ) where id is implied.

4.3.35. Readable

Readable(): Return True if vertex object has access to the graph (writable or readonly), otherwise False.

4.3.36. Readonly

Readonly(): Return True if vertex object is readonly, otherwise False.

4.3.37. Relax

Relax(): Relax vertex acquisition status from writable to readonly. Return True if vertex was relaxed. Return False if vertex is still writable.

4.3.38. RemoveProperties

RemoveProperties(): Remove all properties for this vertex.

4.3.39. RemoveProperty

RemoveProperty( name ): Delete the vertex property name. LookupError is raised if the property does not exist.

Python dictionary syntax is also supported: del vertex[ name ]

4.3.40. RemoveVector

RemoveVector(): Remove the similarity vector from vertex if one exists.

4.3.41. SetExpiration

SetExpiration( expires[, relative ] ): Schedule the vertex for automatic deletion (time-to-live). If relative is False (the default), expires is an absolute timestamp in seconds since 1970. If relative is True, expires is a number of seconds into the future relative to the current time.
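
The two interpretations of expires can be illustrated with a small helper that computes the resulting absolute timestamp. This is a plain-Python sketch, not a pyvgx call: the name expiration_timestamp and the now parameter (added so the example is deterministic) are hypothetical.

```python
import time

def expiration_timestamp(expires, relative=False, now=None):
    """Return the absolute expiration time in seconds since 1970."""
    if now is None:
        now = time.time()
    # relative=True: expires is a number of seconds into the future
    # relative=False: expires is already an absolute timestamp
    return now + expires if relative else expires
```

With now=1000, expiration_timestamp( 60, relative=True, now=1000 ) gives 1060, while expiration_timestamp( 60, now=1000 ) gives 60, a timestamp in the past.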

4.3.42. SetProperties

SetProperties( dict ): Set multiple vertex properties provided in dictionary dict.

4.3.43. SetProperty

SetProperty( name[, value[, virtual ]] ): Assign a property name to the vertex. If no value is supplied it defaults to None. Acceptable values are numbers, strings, lists of numeric values, dicts of {int:float} items, or any Python object supporting the Pickle protocol. This method does not return anything.

Properties are stored in memory by default. To store properties on disk instead set virtual=True or prefix name with asterisk (*).

Python dictionary syntax is also supported: vertex[ name ] = value

4.3.44. SetRank

SetRank( [ c1[, c0 ]] ): Assign dynamic rank coefficients to vertex for use in ranking expressions. The normal use of these coefficients is for evaluating the linear ranking function rank() = c1 ⋅ ∑x + c0, but can also be used for other ranking functions where they take on different meanings depending on context (such as c1=latitude, c0=longitude for the georank() function). It is also possible to use c1 and c0 as light-weight general purpose numeric properties.
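
The linear ranking function quoted above can be written out as follows. This is a plain-Python sketch, not the VGX evaluator: linear_rank is a hypothetical helper, and xs stands in for whatever values ∑x sums in a ranking expression, which this section does not define.

```python
def linear_rank(c1, c0, xs):
    """Evaluate c1 * sum(xs) + c0 for the given coefficients."""
    return c1 * sum(xs) + c0
```

With c1=2.0 and c0=1.0, the inputs [1, 2, 3] give 2.0 * 6 + 1.0 = 13.0.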

4.3.45. SetType

SetType( type): Change the vertex type.

4.3.46. SetVector

SetVector( elements ): Assign a similarity vector to this vertex. The vector is generated from dimension-weight pairs (feature vector) or floating point values (euclidean vector) in elements. Any previous vector is removed and replaced by a new vector.

4.3.47. Terminals

Terminals( [ … ]): Shorthand for pyvgx.Graph.Terminals( id, … ) where id is implied.

4.3.48. values

values(): Return a list of all property values of this vertex.

4.3.49. Writable

Writable(): Return True if vertex object is writable, otherwise False.

5. Memory Objects

The pyvgx.Memory type represents arrays of numeric data which can be used with expression evaluators in queries.

class pyvgx.Memory( graph_instance[, initializer] ): Create a memory object associated with graph_instance, optionally initialized according to initializer. The memory object can only be used in queries against the associated graph instance, and only by the same thread which constructed the object. Attempting to use the memory object with another graph instance or from another thread will raise AccessError.

The memory object initializer can be an integer (default=8) specifying the memory capacity (number of elements), or a list of numbers used to initialize the memory array. In both cases the element capacity will be a power of two, large enough to accommodate the number of elements in the initializer. The size of the memory object is therefore greater than or equal to the requested capacity or the length of the list. If an integer is specified all elements will be initialized to zero. If a list is specified all elements will be set accordingly, and if the length of the list is not a power of two any remaining elements will be set to zero. Any non-numeric elements in a list initializer will be ignored and receive a default value of zero.
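
The sizing rules can be modeled in plain Python. The sketch below does not construct a pyvgx.Memory: model_memory and model_order are hypothetical helpers, and choosing the smallest power of two that fits is an assumption (the text only says the capacity is a power of two that is large enough).

```python
def model_memory(initializer=8):
    """Return a list modeling the initial contents of a memory object."""
    if isinstance(initializer, int):
        requested, values = initializer, []
    else:
        # Non-numeric list elements are replaced by zero
        values = [x if isinstance(x, (int, float)) else 0 for x in initializer]
        requested = len(values)
    capacity = 1
    while capacity < requested:
        capacity *= 2  # assumed: smallest power of two that fits
    # Elements beyond the initializer are zero
    return values + [0] * (capacity - len(values))

def model_order(memory):
    """The order is log2 of the (power of two) size."""
    return len(memory).bit_length() - 1
```

Under these assumptions a three-element list yields four elements with the last one zero, and an integer initializer of 10 yields 16 zeros with order 4.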

Memory objects can be passed as arguments to queries, which may read or write the elements as part of executing filters and ranking formulas. The same memory object can be used multiple times with queries, making it possible to pass information from one query to the next.

Memory objects support sequence lookup, assignment and length methods, i.e. memobj[ n ], memobj[ n ] = x, and len( memobj ). Slice syntax is also supported, i.e. memobj[ a:b ], memobj[ a:b ] = […].

5.1. Memory Attributes

counters: Navigation progress counters, returns tuple: (score_evaluations, threshold_contributions, accepted_into_frontier, accepted_into_result, recursion_depth, node_expansions, node_visits, node_revisits_avoided).
order: The memory order (readonly) is the log2 of its size. The memory object size is always a power of two and will be returned by len( memobj ).
Rn: Memory register Rn (read and write), where n is 1, 2, 3, or 4.
vector: Memory probe vector (read and write). Must be a pyvgx.Vector. Accessible as M.vector in the expression language.

5.2. Memory Methods

5.2.1. AsList

AsList(): Return a new list object from the memory object’s data. Other ways to achieve the same are list( memobj ) and memobj[:].

5.2.2. VSetAdd

VSetAdd(): Add a vertex (address) to vset in the memory object.

5.2.3. VSetClear

VSetClear(): Clear the contents of any vset or iset in the memory object.

5.2.4. DualInt

DualInt( a[, b] )

Convert between packed and unpacked 2-tuple of unsigned 32-bit integer values as follows:

When both integers a and b are given return integer (a << 32) | b
When a is a 2-tuple (and b omitted) return integer (a[0] << 32) | a[1]
When a is an integer (and b omitted) return tuple ( (a >> 32), (a & 0xffffffff) )
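
The three cases can be reproduced in plain Python. The function below is a hypothetical stand-alone model named dual_int that follows the formulas above literally; it does not call pyvgx and, as an assumption, performs no range checking on its inputs.

```python
def dual_int(a, b=None):
    """Pack two unsigned 32-bit integers into one integer, or unpack one."""
    if b is not None:
        return (a << 32) | b            # two integers -> packed integer
    if isinstance(a, tuple):
        return (a[0] << 32) | a[1]      # 2-tuple -> packed integer
    return (a >> 32, a & 0xFFFFFFFF)    # packed integer -> 2-tuple
```

Packing and unpacking are inverse operations: dual_int( dual_int( 1, 2 ) ) returns (1, 2).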

5.2.5. Reset

Reset( value[, increment ] ): Initialize all elements of the memory array to value, optionally incrementing the value of each successive element by increment. Both parameters must be numeric (integer or floating point.) This also resets the stack pointer and cleans up all previously assigned string objects. Note that all elements of the memory array are affected, including registers R1 - R4.
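
The resulting element values can be modeled as an arithmetic sequence. This plain-Python sketch does not touch a pyvgx.Memory: model_reset is a hypothetical helper, the default increment of 0 is an assumption, and it only models element values, not the stack pointer or string cleanup.

```python
def model_reset(size, value, increment=0):
    """Element i receives value + i * increment (assumed reading of the rule)."""
    return [value + i * increment for i in range(size)]
```

For a memory of four elements, a reset with value 10 and increment 2 would give [10, 12, 14, 16] under this reading.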

5.2.6. Sort

Sort( [ start[, end[, reverse ] ] ] ): Sort the elements in the specified range [start, end-1] using an element comparator automatically chosen according to the type of the element at index start. By default, start = 0, end = R4, reverse = False.

5.2.7. Stack

Stack(): Return a new list object containing the values currently pushed on the memory stack. (The stack pointer can only be manipulated in evaluator expressions using push/pop operations.)

6. Vector Objects

The pyvgx.Vector type represents vectors that can be used for similarity matching. Two distinct and mutually exclusive vector modes are supported: feature vectors and Euclidean vectors. Vector mode is selected at system initialization and applies globally to all graphs in the VGX instance. Once selected at first initialization, the vector mode cannot be changed.

class pyvgx.Vector( [ data ] ): Create a new vector using the default similarity context. If no data is supplied a null-vector is created. If data is a vector object a copy is created. Otherwise the supplied data must be compatible with the selected vector mode.

Feature vector mode: data has the form [(<dimension>, <weight>), (<dimension>, <weight>), …] where <dimension> is a string and <weight> is a number. The maximum length of <dimension> is 27 bytes. The range of <weight> is [0.0078125, 1.875], internally quantized into 64 discrete buckets. The maximum number of vector features is 48. See similarity/vector.html for details.

Euclidean vector mode: data has the form [c₁, c₂, …, c_n] where c_i is a number, and n is a multiple of 32 or 64. (The required multiple is 64 if pyvgx.avxbuild() returns 512, otherwise the multiple is 32.) The maximum number of vector components is 65472. For performance reasons vector components are quantized to eight bits internally (7-bit resolution plus sign.) The component with the largest absolute value will be most accurately stored, while the component with the smallest absolute value will be least accurately stored.
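
The effect of eight-bit quantization on accuracy can be illustrated with a simple linear scheme. The encoding actually used by VGX is not described here, so everything in this sketch is an assumption: components are scaled so the largest absolute value maps to 127 and are then rounded to signed integers. The helper names are hypothetical.

```python
def quantize_components(components):
    """Scale to signed integers in [-127, 127] (assumed linear scheme)."""
    peak = max((abs(c) for c in components), default=0.0)
    if peak == 0:
        return [0] * len(components), 0.0
    codes = [round(c * 127 / peak) for c in components]
    return codes, peak / 127

def dequantize_components(codes, scale):
    """Recover approximate component values from the integer codes."""
    return [q * scale for q in codes]
```

In this model the vector [1.0, -0.25, 0.004] is stored as codes [127, -32, 1]. The largest component is recovered almost exactly, while the smallest comes back as roughly 0.0079 instead of 0.004, which matches the accuracy behavior described above.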

6.1. Vector Attributes

length: The number of vector components
magnitude: The vector magnitude
fingerprint: A 64-bit dimensionality-reduced representation of the vector. This is a pre-computed fingerprint equal to pyvgx.Vector.Fingerprint(0).
external: Feature vectors: The original list of dimension-weight pairs

Euclidean vectors: A list of (quantized) floating point values
internal: Feature vectors: The internal enumerated version of the vector

Euclidean vectors: A bytearray of internally encoded vector components

6.2. Vector Methods

6.2.1. AsDict

AsDict(): Return a dictionary representation of the vector.

6.2.2. Debug

Debug(): Print an internal representation of the vector to stdout.

6.2.3. Fingerprint

Fingerprint( [seed] ): Return a 64-bit fingerprint representing the vector. Different values for seed (default 0) will result in alternative projections of the n-dimensional vector onto 64-dimensional binary space. Fingerprints are proxies for vector direction only, and are thus insensitive to vector magnitude.
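
Why a fingerprint of this kind ignores magnitude can be shown with random hyperplane hashing, a well-known technique swapped in here for illustration. It is not claimed to be the algorithm VGX uses, and its output will not match pyvgx fingerprints. The function name sketch_fingerprint is hypothetical.

```python
import random

def sketch_fingerprint(components, seed=0):
    """Return a 64-bit integer built from 64 random hyperplane sign tests."""
    rng = random.Random(seed)  # the seed selects the set of hyperplanes
    bits = 0
    for bit in range(64):
        # Project the vector onto one random direction per output bit
        dot = sum(c * rng.gauss(0.0, 1.0) for c in components)
        if dot >= 0:
            bits |= 1 << bit  # keep only the sign of the projection
    return bits
```

Only the sign of each projection is kept, so scaling a vector by a positive factor leaves every bit unchanged, while a different seed selects different hyperplanes and therefore an alternative projection.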

6.2.4. Projections

Projections( seed[, lsh[, lcm[, reduce[, expand]]]] ): Return the vector’s LSH, "low confidence mask", and index projections for a given seed.

7. Similarity Objects

The pyvgx.Similarity type represents a context within which similarity vectors can be created, stored and compared.

class pyvgx.Similarity( [ graph ] ): Create a new similarity object, optionally associated with graph instance. If no graph instance is specified a stand-alone similarity object is created. When graph is specified the new similarity object will replace the previous similarity object associated with that graph.

A similarity object can be configured by modifying its attributes as outlined below.

All graph instances contain their own unique similarity contexts which can be configured independently per graph. When creating a new graph it will be assigned a default similarity object automatically, which can be configured later. If a new similarity object is explicitly created and then assigned to the graph, it will replace the default similarity object.

7.1. Similarity Attributes

hamming_threshold: Maximum number of fingerprint bits that may differ for two vectors to be considered similar.

sim_threshold: The lowest similarity measure between two vectors for those vectors to be considered similar.

cosine_exp: A number between 0.0 and 1.0 determining how strongly Cosine similarity contributes to vector similarity score.

jaccard_exp: Feature Vector mode: A number between 0.0 and 1.0 determining how strongly Jaccard index contributes to vector similarity score.

Euclidean Vector mode: Not applicable. Must be set to 0.0.

min_cosine: The minimum Cosine similarity allowed between two vectors for similarity score to be nonzero.

min_jaccard: Feature Vector mode: The minimum Jaccard index allowed for two vectors for similarity score to be nonzero.

Euclidean Vector mode: Ignored.

min_isect: Feature Vector mode: The minimum number of shared dimensions allowed for two vectors for similarity score to be nonzero.

Euclidean Vector mode: Must be set to 1.

max_vector_size: The maximum number of dimensions allowed in vectors.

seeds: Feature Vector mode: Ignored.

Euclidean Vector mode: [0, 1087, 1381, 1663, 1993, 2293, 2621, 2909]

nsegm: Currently not used.

nsign: Currently not used.

7.2. Similarity Methods

7.2.1. AsDict

AsDict(): Return a dictionary containing all the configuration attributes of the similarity object.

7.2.2. Cosine

Cosine( v1, v2 ): Return the cosine similarity of two vectors v1 and v2.

Feature vectors: The cosine similarity measures how well aligned two feature vectors are. Only the directions of the vectors are compared, so the score is 1.0 when they are fully aligned (regardless of magnitude) and 0.0 when they are orthogonal (no shared dimensions).

Euclidean vectors: The cosine similarity of two euclidean vectors equals cos( θ ) where θ is the angle between the vectors, and ranges from -1 to 1 since vector components may be negative or positive.
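To make the Euclidean case concrete, the following plain-Python sketch computes cos( θ ) directly from its definition. It does not call pyvgx: the helper name cosine is hypothetical, vectors are assumed to be plain lists of floats, and returning 0.0 for a zero-length vector is an assumption of this sketch.

```python
import math

def cosine(v1, v2):
    """Cosine of the angle between two equal-length vectors (illustrative only)."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # assumption: a zero vector is treated as not similar to anything
    return dot / (norm1 * norm2)

aligned = cosine([1.0, 2.0], [2.0, 4.0])       # same direction, different magnitude
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])    # perpendicular vectors
opposite = cosine([1.0, 2.0], [-1.0, -2.0])    # opposite directions
```

The three cases land at approximately 1, exactly 0 and approximately -1, which matches the range described above for vectors whose components may be negative.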

7.2.3. CreateProjectionSets

CreateProjectionSets( nseeds, ksize ): Initialize a graph structure used for approximate nearest neighbor (ANN) vector search. Parameter nseeds specifies the number of sets, and ksize specifies the number of bits per projection key.

7.2.4. DeleteProjectionSets

DeleteProjectionSets(): Destroy the graph structure previously created by CreateProjectionSets().

7.2.5. EuclideanDistance

EuclideanDistance( v1, v2 )

Return the Euclidean distance between two vectors v1 and v2.

This method is only relevant in Euclidean vector mode.
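For reference, the quantity being returned can be sketched in plain Python as the square root of the summed squared component differences. The helper name euclidean_distance is hypothetical and the vectors are assumed to be equal-length lists of floats; this is not the pyvgx implementation.

```python
import math

def euclidean_distance(v1, v2):
    """Straight-line distance between two equal-length vectors (illustrative only)."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

d = euclidean_distance([0.0, 0.0], [3.0, 4.0])  # the classic 3-4-5 triangle
```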

7.2.6. Fingerprint

Fingerprint( vector[, seed] ): Return a 64-bit fingerprint for vector (instance or element list.) The default seed = 0. Different values for seed will result in alternative projections of the n-dimensional vector onto 64-dimensional binary space.
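The idea of projecting an n-dimensional vector onto 64 binary dimensions can be illustrated with a sign-random-projection sketch. This is a swapped-in, generic technique chosen for illustration, not the algorithm pyvgx uses: the function name sketch_fingerprint, the Gaussian hyperplanes and the use of Python's random module are all assumptions.

```python
import random

def sketch_fingerprint(vector, seed=0, nbits=64):
    """Sign-random-projection fingerprint (illustrative; not pyvgx's algorithm)."""
    rng = random.Random(seed)  # the seed selects which set of hyperplanes is used
    fp = 0
    for bit in range(nbits):
        # One random hyperplane per output bit.
        plane = [rng.gauss(0.0, 1.0) for _ in vector]
        dot = sum(p * x for p, x in zip(plane, vector))
        if dot >= 0.0:
            fp |= 1 << bit  # the bit records which side of the hyperplane we are on
    return fp

v = [0.3, -1.2, 0.8, 0.05]
fp_seed0 = sketch_fingerprint(v, seed=0)
fp_seed0_again = sketch_fingerprint(v, seed=0)          # same seed: same fingerprint
fp_seed1 = sketch_fingerprint(v, seed=1)                # different set of hyperplanes
fp_scaled = sketch_fingerprint([2 * x for x in v], 0)   # same direction as v
```

In this sketch the fingerprint depends only on the direction of the vector, so scaling v leaves it unchanged, while a different seed draws different hyperplanes and therefore yields an alternative projection.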

7.2.7. HammingDistance

HammingDistance( v1, v2 )

Return the number of differing bits (0 - 64) in the fingerprints of vectors v1 and v2. The vector fingerprint is a 64-bit integer representing the vector, i.e. a dimensionality reduced representation of the vector. Similar vectors will generally have similar fingerprints and the number of differing bits will be small.

Hamming distance is an effective similarity measure for vectors with many dimensions (>10), since each dimension has a limited contribution to the fingerprint. Vectors with fewer dimensions are more sensitive to changes, and for such vectors the fingerprints are less reliable indicators of similarity. For large vectors, where fingerprints are reasonably reliable representations of the vectors, similarity comparison using Hamming distance is computationally orders of magnitude more efficient than Cosine or Jaccard.
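The efficiency claim follows from how little work a fingerprint comparison involves: one XOR and one bit count. A minimal plain-Python sketch, assuming fingerprints are available as 64-bit integers (the helper name hamming_distance is hypothetical and this is not the pyvgx implementation):

```python
MASK64 = (1 << 64) - 1

def hamming_distance(fp1, fp2):
    """Number of differing bits between two 64-bit fingerprints (illustrative only)."""
    return bin((fp1 ^ fp2) & MASK64).count("1")

near = hamming_distance(0b1011, 0b0010)   # bits 0 and 3 differ
far = hamming_distance(0, MASK64)         # every one of the 64 bits differs
```

Here near is 2 and far is 64, the two ends of the 0 - 64 range described above.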

7.2.8. Jaccard

Jaccard( v1, v2 )

Return the weighted Jaccard similarity (0.0 - 1.0) of two vectors v1 and v2. The Jaccard similarity measures the relative amount of overlap between the two vectors. Vectors are interpreted as sets rather than geometric objects in this context, i.e. the Jaccard similarity can be much less than 1.0 even if the vectors are completely aligned, if their magnitudes are different.

This method is only relevant in Feature vector mode.
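The effect of magnitude on the score can be seen in a small plain-Python sketch. It assumes the common sum-of-minima over sum-of-maxima definition of weighted Jaccard, non-negative weights, and feature vectors represented as dicts mapping dimension names to weights. The exact formula used by pyvgx is not stated in this section, so treat all of this as an assumption; the helper name weighted_jaccard is hypothetical.

```python
def weighted_jaccard(v1, v2):
    """Weighted Jaccard: sum of minima over sum of maxima (assumed definition)."""
    dims = set(v1) | set(v2)
    overlap = sum(min(v1.get(d, 0.0), v2.get(d, 0.0)) for d in dims)
    total = sum(max(v1.get(d, 0.0), v2.get(d, 0.0)) for d in dims)
    return overlap / total if total > 0.0 else 0.0

a = {"red": 1.0, "round": 2.0}
b = {"red": 2.0, "round": 4.0}        # same direction as a, twice the magnitude
c = {"blue": 1.0}                     # shares no dimension with a

j_scaled = weighted_jaccard(a, b)     # (1 + 2) / (2 + 4)
j_same = weighted_jaccard(a, a)
j_disjoint = weighted_jaccard(a, c)
```

Vectors a and b are perfectly aligned, so their cosine similarity would be 1.0, yet under this definition their weighted Jaccard similarity is only 0.5.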

7.2.9. NewCentroid

NewCentroid( vector_list ): Return a new centroid vector computed from vectors in vector_list.

Feature vectors: The centroid represents the common dimensions of all vectors in vector_list and can be used to extract the most prominent features from a set of vectors.

Euclidean vectors: The centroid represents an average of the set of vectors.
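For Euclidean vectors, an average can be sketched as a component-wise mean. This is a plain-Python illustration under the assumption that no further normalization or scaling is applied, which this section does not specify for NewCentroid; the helper name mean_vector is hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of equal-length vectors (illustrative only)."""
    if not vectors:
        raise ValueError("at least one vector is required")
    ndim = len(vectors[0])
    if any(len(v) != ndim for v in vectors):
        raise ValueError("all vectors must have the same number of dimensions")
    # zip(*vectors) groups the i-th component of every vector together.
    return [sum(column) / len(vectors) for column in zip(*vectors)]

centroid = mean_vector([[1.0, 2.0], [3.0, 6.0]])
```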

7.2.10. NewVector

NewVector( elements_list ): Return a new vector associated with this similarity object, setting the vector elements according to elements_list.

7.2.11. Projections

Projections( seed ): Return list of projection names [b'_xxxx|xxx', …].

7.2.12. rvec

rvec( n ): Return a new, random Euclidean vector with n dimensions.

7.2.13. Similarity

Similarity( v1, v2 )

Return the similarity score for two vectors v1 and v2. The similarity score is computed as cosine(v1, v2)^cosine_exp * jaccard(v1,v2)^jaccard_exp. The returned value is in the range 0.0 - 1.0.

This method is only relevant in Feature vector mode.
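The role of the two exponents in the formula above can be sketched in plain Python. The function name similarity_score is hypothetical, the cosine and Jaccard inputs are assumed to lie in the range 0.0 - 1.0 as they do for feature vectors, and the min_cosine, min_jaccard and min_isect cutoffs are deliberately left out of this sketch.

```python
def similarity_score(cosine, jaccard, cosine_exp, jaccard_exp):
    """cosine^cosine_exp * jaccard^jaccard_exp, as given in the formula above."""
    return (cosine ** cosine_exp) * (jaccard ** jaccard_exp)

# An exponent of 0.0 removes that component from the score entirely.
cosine_only = similarity_score(0.8, 0.5, cosine_exp=1.0, jaccard_exp=0.0)
jaccard_only = similarity_score(0.8, 0.5, cosine_exp=0.0, jaccard_exp=1.0)

# Fractional exponents soften the influence of each component.
blended = similarity_score(0.81, 0.25, cosine_exp=0.5, jaccard_exp=0.5)
```

With both exponents at 0.5 the score is the geometric mean of the two measures, here about 0.45.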

8. Query Objects

The pyvgx.Query type represents reusable queries that are defined once and may be executed multiple times.

There are five concrete "pseudo subtypes" of this type that behave differently:

pyvgx.ArcsQuery
pyvgx.VerticesQuery
pyvgx.NeighborhoodQuery
pyvgx.AdjacencyQuery
pyvgx.AggregatorQuery

The behavior is dictated by internal object properties that cannot be modified once the object instance has been created.

8.1. Query Object Factory Functions

It is not possible to instantiate a pyvgx.Query object directly. Instead, call one of the following query object factory methods of the graph, which returns an appropriately configured pyvgx.Query object:

NewArcsQuery( [ … ] )
NewVerticesQuery( [ … ] )
NewNeighborhoodQuery( [ … ] )
NewAdjacencyQuery( [ … ] )
NewAggregatorQuery( [ … ] )

8.2. Executing Query Objects

Once the desired pyvgx.Query object has been created it can be executed by calling its Execute() method. The query may be executed more than once, possibly generating different result sets each time depending on the state of the graph at execution time or the presence of any random or cumulative elements in the query itself.

8.3. Query Attributes

type: Return the name of query pseudo subtype as a string.
id: Return or assign query anchor vertex identifier. (Not available for global queries.)
opid: Return the (integer) value of the graph’s internal operation counter as it was at the end of the most recent call to Execute().
texec: Return the total execution time in seconds for the most recent call to Execute().
error: Return the error message as a string for the most recent call to Execute() if query execution failed. Return None if query execution was successful.
reason: Return the error reason integer code for the most recent call to Execute() if query execution failed. Return 0 if query execution was successful.

8.4. Query Methods

8.4.1. Execute

Execute( [ hits[, offset[, timeout[, limexec[, cache ]]]]] )

hits : (int) Maximum number of search hits to return. (default=-1, all results)

offset : (int) Search offset. Must be 0 or greater. (default=0)

timeout : (int) Query timeout specification in milliseconds. (default=0, non-blocking)

limexec : (bool) When True, limit query execution in accordance with timeout even when not blocked on vertex acquisition. When False, any timeout is applied to vertex acquisition only. (default=False)

cache : (bool) When True, enable returning of previous result set (without internal re-execution of search) if the graph state remains unchanged since previous execution and none of the query parameters have changed since previous execution. (default=False)
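As one possible use of hits and offset together, the sketch below pages through a result set by calling Execute() repeatedly. It is written against any object that exposes Execute() with the positional parameter order documented above. The helper name fetch_pages is hypothetical, and the assumption that Execute() returns a list-like result is not confirmed by this section.

```python
def fetch_pages(query, page_size, max_pages, timeout_ms=1000):
    """Collect results page by page from a query-like object (illustrative only)."""
    results = []
    for page in range(max_pages):
        offset = page * page_size
        # Positional arguments follow the documented order: hits, offset, timeout.
        batch = query.Execute(page_size, offset, timeout_ms)
        results.extend(batch)  # assumption: the result behaves like a list
        if len(batch) < page_size:
            break  # a short page means there is nothing further to fetch
    return results
```

In practice query would be an object returned by one of the factory methods in section 8.1. Because each call re-executes the search, pages may be inconsistent with each other if the graph changes between calls.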
