Using the Filter Operator to Conditionally Split or Filter Data

This topic explains how the Filter operator works and the settings you can configure in its Properties view.

Introduction

The Filter operator accepts a single input stream and applies one or more predicates (tests) to each arriving tuple. Each predicate evaluates to TRUE or FALSE. When a predicate evaluates to TRUE for a tuple, the tuple is sent to the output stream designated for that predicate. You can configure a Filter operator so that tuples that match none of the predicates are either dropped or sent to a separate output stream.

Use the Filter operator when you want to process data differently depending on some characteristic of the data. For example, to raise an alert when a stock trade exceeds a specified number of shares, use a predicate such as numShares > threshold to send the matching tuples to a stream that handles the alert.
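The alert example can be modeled outside StreamBase to make the behavior concrete. The following Python sketch is not StreamBase code: the THRESHOLD value, the sample trades, and the dictionary representation of a tuple are all assumptions chosen for illustration. It shows only the idea of a predicate selecting which tuples reach the alert stream.

```python
# Hypothetical threshold; the text above does not specify a value.
THRESHOLD = 10_000


def exceeds_threshold(trade):
    """Predicate: True when the trade's share count is above the threshold."""
    return trade["numShares"] > THRESHOLD


# Hypothetical input tuples, modeled as dictionaries.
trades = [
    {"symbol": "ABC", "numShares": 500},
    {"symbol": "XYZ", "numShares": 25_000},
]

# Tuples that satisfy the predicate go to the alert stream.
# In this sketch, tuples that do not match are simply dropped.
alert_stream = [t for t in trades if exceeds_threshold(t)]
```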

General Tab

Name: Every application component must have a unique name. Use this field to specify or change the component's name. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.

Enable Error Output Port: Check this box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports and Error Streams to learn about Error Ports.

Description: Optionally, enter a description to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Predicate Settings Tab

The Predicate Settings tab allows you to specify one or more predicates (tests) to apply to the arriving tuples. Each predicate evaluates to TRUE or FALSE, and a tuple for which a predicate evaluates to TRUE is sent to the output stream designated for that predicate.

Each predicate is a test to be performed on the input tuple. Predicates are evaluated in the order in which they appear. If a predicate evaluates to TRUE, the tuple is sent to the corresponding output stream; if not, the next predicate is evaluated. A tuple is sent only to the stream of the first predicate it matches, even if later predicates would also evaluate to TRUE. If the Create Output Port for Non-matching Tuples option is checked and no predicate evaluates to TRUE, the tuple is sent to an additional output port. If this option is not checked, a tuple that matches no predicate is dropped and nothing is emitted.
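The evaluation order described above can be sketched as a small routing function. This is a Python model of the semantics, not the operator's implementation: the function name route, the emit_non_matching flag (standing in for the Create Output Port for Non-matching Tuples option), the zero-based port numbering, and the sample predicates are all hypothetical.

```python
def route(tup, predicates, emit_non_matching=False):
    """Return the output port index for tup, or None if the tuple is dropped.

    Ports 0 .. len(predicates) - 1 correspond to the predicates in order.
    Port len(predicates) is the extra port for non-matching tuples.
    (Zero-based numbering is a convention of this sketch only.)
    """
    for port, predicate in enumerate(predicates):
        if predicate(tup):
            # First match wins; later predicates are not evaluated.
            return port
    # No predicate matched: use the extra port if enabled, else drop.
    return len(predicates) if emit_non_matching else None


# Hypothetical predicates; the second is broader than the first.
predicates = [
    lambda t: t["numShares"] > 10_000,
    lambda t: t["numShares"] > 1_000,
]
```

A tuple with numShares of 50,000 satisfies both predicates but is routed only to port 0, because the first match stops evaluation. Placing the narrower predicate first is therefore what keeps large trades off the second port.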

Dynamic Variables Tab

The Dynamic Variables tab allows you to define variables for this operator that can then be used in one of its expressions. A dynamic variable can be updated by any input stream or output stream in your application. For more information, see Using Dynamic Variables.

Concurrency Tab

Run this component in a separate thread

This option causes the server to process the component's requests concurrently with other processing in the application. On an SMP machine, the processing of the threads can be distributed automatically across multiple processors.

If this is a compute-intensive component and you know that it can run without data dependencies on other components in the StreamBase application, you may be able to improve performance by enabling this option.

Caution

These features are not suitable for every application. For details, see Execution Order, Concurrency, and Parallelism, which includes important guidelines for using these features.

Run in parallel threads

If you checked the first option, you can also choose this option, which causes the server to run multiple instances of this component, each in its own thread. At run time, tuples are dispatched to particular instances based on the Key Expression value (which must evaluate to an int).
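To illustrate key-based dispatch, the following Python sketch maps an integer key to one of several instances. The text above does not say how a key value is mapped to an instance, so the modulo mapping here is purely an assumption for illustration, as are the function name pick_instance and the instance count. The property it demonstrates is that tuples with the same key value always reach the same instance.

```python
def pick_instance(key_value, instance_count):
    """Map an integer key to an instance index in range(instance_count).

    ASSUMPTION: the modulo mapping is illustrative only; it is not
    taken from the StreamBase documentation.
    """
    # The key must be an int (bool is excluded even though it subclasses int).
    if isinstance(key_value, bool) or not isinstance(key_value, int):
        raise TypeError("key expression must evaluate to an int")
    # Python's % returns a non-negative result for a positive modulus,
    # so negative keys still map to a valid index.
    return key_value % instance_count


# Hypothetical run with four parallel instances.
assignments = {key: pick_instance(key, 4) for key in (7, 8, -1)}
```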

Null Values

  • In an operation that performs sorting, any tuple with a null value in the ordering field or in a Boolean expression is ignored.

  • If the evaluation of a predicate results in a NullValueException error, the tuple will be dropped.

  • If this component contains a Group Options tab, any null value in a Group By expression will be grouped.

    For more information, see Using Nulls in StreamBase Applications.
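The second item in the list above, where a predicate that fails on a null value causes the tuple to be dropped, can be modeled as follows. This is a Python sketch, not StreamBase code: NullValueError is a stand-in for the NullValueException named above, None stands in for a null field, and the helper names are hypothetical.

```python
class NullValueError(Exception):
    """Stand-in for the NullValueException named in the list above."""


def num_shares_above(threshold):
    """Build a predicate that fails on a null (None) numShares field."""
    def predicate(tup):
        value = tup.get("numShares")
        if value is None:
            raise NullValueError("numShares is null")
        return value > threshold
    return predicate


def filter_tuples(tuples, predicate):
    """Keep tuples whose predicate is True; drop tuples whose predicate fails on null."""
    passed = []
    for tup in tuples:
        try:
            if predicate(tup):
                passed.append(tup)
        except NullValueError:
            # The tuple is dropped rather than routed to any output.
            continue
    return passed


# Hypothetical input: the middle tuple has a null numShares field.
sample = [{"numShares": 5}, {"numShares": None}, {"numShares": 50}]
kept = filter_tuples(sample, num_shares_above(10))
```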