Interface QueryApi


public interface QueryApi

QueryAPI

Query API is a Groovy API available in all logics to query the data from Pricefx.

Its goal is to provide unified access to data in Pricefx and long term wise it should be the only API for querying data.

The main entrance point for use of QueryAPI is SandboxAPI.queryApi() that returns a QueryApi interface. This interface provides access to a set of methods which you will need for querying. Let’s look at this query which reads a single row with sku = “MB-0001” from the Product master table:


 def qapi = api.queryApi()
 def p = qapi.tables().products()
 return qapi.source(p, p.sku().equal("MB-0001"))
     .stream { it.toList() }
 

Pipeline based data queries

Query api provides a pipeline based data query approach. Contrary to SQL which provides a declarative language that describe the data query in a single (but sometime complex) statement, pipeline based data queries gives the user the ability describe his query as a sequence of data transformation stages.

Stages can be seen as a data transformation node that takes a table as input and provides a table as output.

Like regular database tables, a table is a concrete or virtual (table view) data set structured by named columns and composed of data rows.

There exists two kind of stages

  • Source stages: which are the starting stage of every pipelined query, their source is a concrete database table. Their output is a subset of theirs rows and columns provided as a table view
  • Transformation stages: which are transforming the output of a previous stage, getting optionally a concrete databases table as additional input and providing a table view as output

Here is an example of representation of a pipelined query with data example:

 [ SourceStage: table=Products, columns=[Product.sku, Product.label]] (output) ->
       ------------------
      | sku   | label  |
      ------------------
      | S1    | foo    |
   -> | S2    | bar    | ->
      | S3    | egg    |
      | S4    | spam   |
      ------------------
 -> (input) [ InnerJoinStage: table=Costs, columns=[Costs.sku], criteria=(Costs.sku = input.sku) ] (output) ->
      -------------------------
      | sku   | label  | Cost |
      -------------------------
      | S1    | foo    | 1O   |
   -> | S2    | bar    | 15   | ->
      | S3    | egg    | 5    |
      | S4    | spam   | 20   |
      -------------------------
 -> (input) [ FilterStage: filter=input.Cost > 10 ] (output) ->*
      -------------------------
      | sku   | label  | Cost |
      -------------------------
   -> | S2    | bar    | 15   |
      | S4    | spam   | 20   |
      -------------------------
 -> (input) [ SortByStage: order=descending(input.Cost) ] (output) ->
      -------------------------
      | sku   | label  | Cost |
      -------------------------
   -> | S4    | spam   | 20   |
      | S2    | bar    | 15   |
      -------------------------
 

As we can see each stage can only refer to its input columns (which is the previous stage output) and if it is available a concrete source table (used here in SourceStage and JoinStage).

The above example is implemented with QueryApi as follows:


 def qapi = api.queryApi()
 def p = qapi.tables().products()
 def c = qapi.tables().productExtensions("Costs")
 def query = qapi.source(p, [p.sku(), p.label()])
     .innerJoin(c, { input -> [c.Cost] }, { input -> input.sku.equal(c.sku()) })
     .filter { input -> input.Cost.greaterThan(10) }
     .sortBy { input -> [qapi.orders().descendingNullsLast(input.Cost)] }
 
As shown in this code snippet, each stage can refer to columns
  • of a concrete table when it is belonging to its arguments. This is the case of
    • the source stage which uses p to refer to the concrete product table columns [...]source(p, [p.sku(), p.label()])
    • the join stage which uses c to refer to the concrete costs table columns [...]innerJoin(c, { input -> [c.Cost] }, { input -> input.sku.equal(c.sku()) })
  • of the table view it gets as input. To refer to these columns the client code should provide a function getting a Tables.Columns object which give access to this input columns. In the above example this is the case for the last three stages of the query pipeline.

Tables.Columns objects are immutable map associating columns names to their reference. As groovy supports the . operator to access the entry of a map, columns can be accessed directly with it, like {input -> input.MyColumnName}

By default, columns defined from a concrete source table have the same name as the accessor used to get it. For example, qapi.tables().products().sku() will lead to a column named "sku". This name can be anyway changed using Expression.as(String).

Executing the query and fetching the data

Once a data pipeline has been defined, the user can run the corresponding query and fetch its results using PipelineStage.stream(Function). This method gets a function as argument which will be in charge of consuming the result stream. This PipelineStage.ResultStream is an Iterable of PipelineStage.ResultStream.ResultRow which are giving access to the result rows data in the same way columns are provided when adding pipeline stages, i.e. using an immutable map view indexed by column names.

 def qapi = api.queryApi()
 def p = qapi.tables().products()
 def allSkuSize = 0
 return qapi.source(p, [p.sku()])
         .stream { it.sum {
             row -> row.sku.size()
         }}*
 
Since:
14.0 - Caribou Lou