This topic describes the syntax and use of each function in the StreamBase expression language. See StreamBase Expression Language Features for information on using the expression language in general. This page is organized into the following sections:
The following table is a complete, alphabetized index into the functions provided in the StreamBase expression language, with links to the description for each function.
The majority of functions in the StreamBase expression language are simple functions, which operate on a single tuple field at a time. You can use simple functions in expressions for any StreamBase operator (except the Heartbeat, Metronome, and Union operators, which do not accept expressions).
The expression language also supports aggregate functions, which are described in Aggregate Functions Overview.
Simple functions are organized into the following categories:
This category includes the following functions:
Function syntax:
T catchexception(Targ1[, ...[, Targn]])
Attempts to evaluate all arguments in order, returning the first one that evaluates without an error, or null if all arguments evaluate to an error. Accepts all data types, but all arguments must have the same type. The returned value has the same data type, T, as the arguments.
For example, the following example attempts to parse str as an interval, but returns a null timestamp if parsing fails:
catchexception(interval(str))
The following example attempts to divide a by
b, but returns -1 if the
division fails:
catchexception(a/b, -1)
Function syntax:
string error(string message)
Throws an error with message text message
that can be caught using the StreamBase error streams mechanism. In
the default configuration, this produces an error message on the console, and the
server continues processing.
Function syntax:
T throw(string message)
Throws an error with message text message
that can be caught using the StreamBase error streams mechanism. In
the default configuration, this produces an error message on the console, and the
server continues processing. For the purposes of typechecking, the return of this
function must be cast with one of the type casting functions. For example, you can
use:
bool(throw("Unmatched filter
option"))
as the last expression of a filter to produce an error message when none of the earlier conditions of the filter was met.
This category includes the following functions:
Function syntax:
T callcpp(stringclass, [arg1, ...,argn])
As an alternative to using callcpp(), you can define
an alias for your custom functions. See Using Function Aliases.
Use an alias or callcpp() to run a custom C++
function directly from a StreamBase operator that uses an expression.
The return type, T, of callcpp() is the same as the
return type of the called function. Custom C++ functions are functions you build
with the StreamBase C++ Client API. You can use the simple form of
callcpp() in any expression except aggregate
expressions. (To use callcpp() in aggregate
expressions, refer to the
aggregate callcpp().)
For callcpp() to locate the function being called,
you must specify the location of the containing DLL or .so file, and must register the function with a custom-function element in the StreamBase
Server configuration file. See C++ Function Overview for
details.
To learn about coding custom C++ functions, refer to Creating Custom C++ Functions in the API Guide.
Function syntax:
T calljava(stringclass, stringmethod, [arg1, ...,argn])
As an alternative to using , you can define an alias for your custom
functions. See Using Function Aliases.
calljava
Use an alias or calljava() to run a custom Java
function directly from a StreamBase operator that uses an expression.
The return type, T, of calljava() is the same as the
return type of the called function. Custom Java functions are functions you build
with the StreamBase Java Client API. You can use the simple form of
calljava() in any expression except aggregate
expressions. (To use calljava() in aggregate
expressions, refer to the
aggregate calljava().)
You can also use calljava() (or an alias) to run any
public function in any class on StreamBase Server's classpath,
including standard library functions.
The calljava() function distinguishes simple from
aggregate functions by the number of arguments. For custom simple functions, you
must specify both class and method names. For custom aggregate functions, you
specify only the class name.
To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide. For information on the classpath requirements for custom Java functions, see Java Function Overview.
The function you write to be called with calljava()
can have any number of arguments, including none. Each calljava() argument arg1 through
argn associates a StreamBase data type
with one of the following primitive or Java object types:
| StreamBase Data Type | Java Primitive | Java Object |
|---|---|---|
| blob | — | com.streambase.sb.ByteArrayView |
| bool | boolean | java.lang.Boolean |
| double | double | java.lang.Double |
| int | int | java.lang.Integer |
| long | long | java.lang.Long |
| list | — | java.util.List |
| string | byte[] | java.lang.String |
| timestamp | — | com.streambase.sb.Timestamp |
| tuple | — | com.streambase.sb.Tuple |
In writing your Java function, use the Java primitive if the execution speed of
your function is paramount. However, if you pass a null for any argument that
resolves in your function to a Java primitive, calljava() returns an error and does not call your function.
Only use Java primitives if your application otherwise insures that null values
cannot be passed to your function through calljava().
Use the Java object when your application might pass null to your function
through calljava(), and when no Java primitive is
available.
This category includes the following functions:
Function syntax:
double black_scholes(string type, double underlying, double strike,
double dividendYield, double riskFreeInterestRate,
double Volatility, timestamp exerciseDate, [timestamp dealDate])
The black_scholes function calculates fair-value and
risk statistics (delta, gamma, vega) for European style options on securities with
continuous dividend yields. This is known as the Black-Scholes Generalized model.
It also calculates implied volatility.
The black_scholes function takes eight input
arguments; the first seven arguments are required, while the eighth argument,
dealDate, is optional.
The arguments are:
-
type: Required. Should be set to one of the following string values:-
Call (The price of an option.)
-
Put (The price of an option.)
-
DeltaCall (The sensitivity of the price of an option to changes in the price of the stock.)
-
DeltaPut (The sensitivity of the price of an option to changes in the price of the stock.)
-
ThetaCall (Theta measures how the price of an option changes with time.)
-
ThetaPut (Theta measures how the price of an option changes with time.)
-
Gamma (The sensitivity of a stock's delta to the stock price.)
-
Vega (The rate of change of the value of an option with respect to the volatility of the stock's price.)
-
RhoCall (The sensitivity of the price of an option with respect to the risk-free interest rate.)
-
RhoPut (The sensitivity of the price of an option with respect to the risk-free interest rate.)
-
ImpliedVolatilityCall (Implied volatility of the underlying stock for a given price.)
-
ImpliedVolatilityPut (Implied volatility of the underlying stock for a given price.)
-
-
underlying: Required. A double. The price of the underlying stock. -
strike: Required. A double. The strike price of the stock on the exercise date. -
dividendYield: Required. A double. For example, 0.03 for 3.0%. -
riskFreeInterestRate: Required. A double. For example, 0.05 for 5.0%. -
Volatility(or value): Required. A double. This input argument is overloaded. When computing anything except the implied volatility for a given call or put price this argument is the volatility of the stock. For example, 0.2 for 20%. When computing implied volatility this argument is the option price. -
exerciseDate: Required. A timestamp. The option exercise date. -
dealDate: Optional. A timestamp representing the option deal date. If not provided, the current date is the default.
Function syntax:
double compound_interest(double principleValue, double matureValue,
double numberOfPeriods, [double guess])
Returns the compound interest given the principle value, mature value, and number
of periods. An optional fourth argument, guess, can be
used to set the initial compound interest used by the function. The compound_interest function uses the Newton-Raphson algorithm to
compute compound interest rate. This algorithm performs the same calculations
repetitively. Each iteration results in a compound interest that is closer to the
final result. The number of iterations required to get the results within an
acceptable inaccuracy (1.0e-4) depends on the number
of periods and the initial compound interest used to start the calculation.
If the guess argument is not provided, the default
initial value is 0.01 (1%). If the function returns a
NaN (not a number), you can provide the initial compound interest in this fourth
argument. If the initial value is closer to the actual rate of the computation, it
can be sped up significantly.
This category includes the following functions:
Function syntax:
long inet_aton(string address)
Takes a string containing an IPv4 address in dotted-quad notation ("nnn.nnn.nnn.nnn") and returns a long containing that address encoded in network byte order.
Function syntax:
string inet_ntoa(intaddress) string inet_ntoa(longaddress)
Takes an int or long containing an IPv4 address encoded in network byte order and returns a string containing that address in dotted-quad notation ("nnn.nnn.nnn.nnn").
A list is an ordered collection of values (called elements), each of which is of the same StreamBase data type, called the list’s element type. The element type can be any StreamBase data type, such as an int, a tuple, or even a list (thus allowing for constructions such as a list of list of int).
Individual elements in a list can be accessed using their zero-based integer position
(their index) in the list. In any expression
in an EventFlow or StreamSQL program, use brackets to address individual elements of
a list. Thus, for a list field named L, use L[0] to address the first element in the list,
L[1] for the second element, and
L[length(L)] to address the last
element.
In most list-related functions that take an index, you can also use a negative index
to count backward from the end of the list. Thus, for a list L, L[-1] is equivalent to L[length(L) - 1].
The number of elements in a list is determined at application run-time. A null list is not the same as a list with zero elements (an empty list), which is not null.
*Functions marked with an asterisk work with both lists and strings, and are found in the Simple Functions: Utilities section of this page.
**The list() function is found in the Simple Functions: Type Conversions section of this page.
Function syntax:
list(T) append(list(T)L, Te1, Te2..., Ten)
Returns its argument list, L, with
elements e1 through en added to the end of the list in the order
specified. The data types of the appended elements e and the element type of list L must be the same, except that element types int
and long can be coerced to long and double following the rules in Data Type
Coercion. The element type of the returned list is the same as the element type
of the argument list and of the appended elements.
The function takes any number of element arguments, including zero. That is,
append(L) returns list L unchanged.
If any list element is null, a null element is appended to the list in the position
specified.
Function syntax:
double avg(list(double)L) int avg(list(int)L) long avg(list(long)L) timestamp avg(list(timestamp)L)
Given a list, L, of doubles, ints, longs,
or timestamps, returns the average of all members of the list. For a returned
average of timestamp values to make sense, the list should contain all interval
timestamps or all absolute timestamps, but not both. The data type of the returned
value is the same as the element type of the argument list.
See also the aggregate version of avg().
Function syntax:
list(T) concat(list(T)L1, list(T)L2[, ..., list(T)Ln])
Returns a list produced by concatenating the elements of its argument lists in the order specified. Takes two or more list arguments, which must have the same element type. The element type, T, of the returned list is the same as the element types of the argument lists.
See also the aggregate version of concat().
Function syntax:
bool contains(list(T)L, Titem)
Takes two arguments: a list with element type T, and a value, item, of the same type. Returns true if item is a member
of the specified list, or false if not.
If the entire list is null, returns null. If a member
of the list is null, and item is null,
returns true. If no member of the list is null, and
item is null, returns false.
Function syntax:
int count_distinct_elements(list(T) L)
Returns the number of unique elements in the argument list, L, with any element type, T.
For example,
count_distinct_elements(list(1, 2, 3, 3, 4, 4, 5,
6, 6, 7))
returns 7.
See also the related function, unique() and the aggregate function count_distinct().
Function syntax:
T dotproduct(list(T)L1, list(T)L2, ...)
Returns the sum of the element-wise product of its argument lists, whose element types must be int, double, or long. Accepts any number of lists, but is typically used with two list arguments. The data type of the returned value is the same as the element type of the argument lists. If the element type of the argument lists do not match, they are promoted using the rules in Data Type Coercion.
This function produces the dot product, also known as the scalar product, of two or more numeric lists. The function multiplies the elements of each list in element order, L1.first times L2.first, then L2.second times L2.second, and so on. The resulting list of element products is summed and returned. If an argument list is null, the result is null. If any element in any argument list is null, the result is null. If an argument list is shorter than another argument list, the extra positions are filled in with 1 (or 1.0 or 1L), which preserves the extra position elements of the longer list unchanged.
Function syntax:
list(T) emptylist(T x)
Returns an empty list with the same element type as the argument. The argument
x's value is ignored, and only its type
is considered. This function is typically used like these examples:
emptylist(int())
emptylist(string())
You can also use an argument with a value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:
emptylist(17)
emptylist("IBM")
Function syntax:
list(T) filternull(list(T) L)
Returns a list composed of the elements of the argument list, L, with any null elements removed. The element
type, T, of the returned list is the same as the element type of the argument list.
Function syntax:
T firstelement(list(T) L)
Returns the first element of argument list L. The data type, T, of the returned value is the
same as the element type of the argument list.
The usage firstelement(L) is the
equivalent of L[0].
Function syntax:
list(T) insertelement(list(T)L, intindex, Te)
Returns a list composed of the elements of the argument list, L, with element e inserted before index position index in the argument list. The element type, T, of
the returned list is the same as the element type of the argument list and of the
inserted element, e.
The usage insertelement( is the equivalent of append(). The
usage L, length(L), e)insertelement( is the equivalent of prepend().
L, 0, e)
Function syntax:
T lastelement(list(T) L)
Returns the last element of argument list L. The data type of the returned value is the same
as the element type of the argument list.
The usage lastelement(L) is the
equivalent of L[-1].
Function syntax:
T maxelement(list(T) L)
This function returns the maximum non-null value using the greater-than relational
operator across all elements in the argument list, L. The data type of the returned value is the same
as the element type of the argument list.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the simple version of max() and the aggregate version of max().
Function syntax:
double median(list(double)L) double median(list(int)L) double median(list(long)L)
Given a list, L, of doubles, ints, or
longs, returns a double, the median of the elements of L. If the list has an even number of elements,
returns the average of the middle two.
See also the aggregate version of median().
Function syntax:
T minelement(list(T) L)
This function returns the minimum non-null value using the less-than relational
operator across all elements in the argument list, L. The data type of the returned value is the same
as the element type of the argument list.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the simple version of min() and the aggregate version of min().
Function syntax:
list(T) nulllist(T x)
Notice that the name of the function has three lowercase letter L's. Returns a null
list with the same element type as the argument. The argument x's value is ignored, and only its type is
considered. This function is typically used like these examples:
nulllist(double())
nulllist(timestamp())
You can also use an argument with a value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:
nulllist(17)
nulllist("IBM")
Function syntax:
list(T) prepend(list(T)L, Te)
Returns a list containing element e at
the start, then all of list L. The data
type of element e and the element type of
list L must be the same, except that
argument types int and long can be coerced to long and double following the rules
in Data Type Coercion. The element type of the returned list is the same
as the element type of the argument list and of the prepended element, e.
Function syntax:
double product(list(double)L) int product(list(int)L) long product(list(long)L)
Given a list, L, of doubles, ints, or
longs, returns the product of each member of the list. The data type of the
returned value is the same as the element type of the argument list.
See also the aggregate version of product().
Function syntax:
list(int) range(intstart, intend[, intstep])
Returns a list of ints with starting value start and ending value end, inclusive of the first item in the list and
exclusive of the last item. Thus, range(1,
10) returns a list with nine elements, not ten:
list(int) [1, 2, 3, 4, 5, 6, 7, 8, 9]
The default value for step is 1, and it
cannot be zero. You can supply an alternate step value to produce a list with skipped elements.
For example, a step of 2 returns every
other number:
range(0, 8, 2)
returns
list(int) [0, 2, 4, 6]
You can use a negative value for step,
which generates the list backwards. For example:
range(3, 0, -1)
returns
list(int) [3, 2, 1]
while
range(6, 1, -2)
returns
list(int) [6, 4, 2]
Function syntax:
list(string) regexsplit(stringinput_str, stringregex_delimiter[, intlimit])
Use the regexsplit() function in the same ways as the split() function, except that you can specify a regular expression as the field delimiter. See Sun's Pattern class documentation for the supported regular expression syntax.
The following example parses input_str
into list elements using either colon, semicolon, or a space as the field
delimiter:
regexsplit("aaa:bbb;ccc:ddd;eee fff", "[:;
]")
returns
list(string) [aaa, bbb, ccc, ddd, eee, fff]
See split() for further details and for instructions on using the
optional limit argument.
Function syntax:
list(T) removeelement(list(T)L, intindex)
Returns a list composed of the elements of the argument list, L, with the element at index position index removed. The element type, T, of the returned
list is the same as the element type of the argument list.
Function syntax:
list(T) replaceelement(list(T)L, intindex, Te)
Returns a list composed of the elements of the argument list, L, with the element at index position index replaced by element e. The element type of the returned list is the
same as the element type of the argument list.
Function syntax:
list(T) reverse(list(T) L)
Returns its argument list, L, in reverse
order. The element type, T, of the returned list is the same as the element type of
the argument list.
Function syntax:
list(T) sort(list(T) L)
Returns its argument list, L, sorted.
Sorting occurs in the same manner as with the less-than operator. The element type,
T, of the returned list is the same as the element type of the argument list.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
Function syntax:
list(string) split(stringinput_str, stringdelimiter[, intlimit])
Returns a list of strings parsed from input_str, splitting the input string into pieces
delimited by the character or phrase specified as delimiter. The delimiter character or phrase itself is not
preserved in the resulting list when a match is found. For example:
split("aaa:bbb:ccc:ddd:eee", ":")
returns
list(string) [aaa, bbb, ccc, ddd, eee]
If the delimiter character or phrase is not found in input_str, then input_str is returned verbatim as the only element
of the resulting list. If the delimiter character or phrase occurs adjacent to
another instance of the delimiter, the result is an empty list element. For
example, split("aaa::bbb", ":")
results in the following three-element list: [aaa, ,bbb].
You can specify an optional third argument, limit, an integer specifying the number of list
elements in the result list. Use the limit argument in the following ways:
-
If
limitis any negative number,input_stris split into as many list elements as necessary, with no limit on the length of the list. All empty list elements are preserved, including any trailing empty elements. For example,split("aaa::bbb::", ":", -1)results in a five-element list: [aaa, ,bbb, , ]. -
If
limitis zero or unspecified,input_stris split into as many list elements as necessary, but any trailing empty elements are discarded. For example,split("aaa::bbb::", ":", 0)andsplit("aaa:bbb::", ":")both result in a three-element list: [aaa, ,bbb]. -
If
limitis positive,input_stris split into limit-1 elements, plus one more element containing the remainder ofinput_str. For example,split("aaa:bbb:ccc:ddd:eee", ":", 3)results in a three-element list: [aaa, bbb, ccc:ddd:eee].Use this feature to split a string into the components you want to process, following by a catch-all element containing the remainder.
See also the regexsplit() function, a variation of split() that allows you to use a regular expression to specify the delimiter.
Function syntax:
double stdev(list(double)L) double stdev(list(int)L) double stdev(list(long)L) timestamp stdev(list(timestamp)L)
Given a list, L, of doubles, ints, longs,
or timestamps, returns the standard deviation for all members of the list. If the
argument list is a list of doubles, ints, or longs, returns a double. If the
argument list is a list of timestamps, returns an interval timestamp. For timestamp
values, the list should contain all interval timestamps or all absolute timestamps,
but not both.
See also the aggregate version of stdev().
Function syntax:
double stdevp(list(double)L) double stdevp(list(int)L) double stdevp(list(long)L) timestamp stdevp(list(timestamp)L)
Given a list, L, of doubles, ints, longs,
or timestamps, returns the standard deviation for all members of the list. If the
argument list is a list of doubles, ints, or longs, returns a double. If the
argument list is a list of timestamps, returns an interval timestamp. For timestamp
values, the list should contain all interval timestamps or all absolute timestamps,
but not both.
With stdevp() the data provided is the entire
population, while with stdev(), the data provided is
treated as a random sample. The stdevp() function is
calculated using the biased (or n) method. The stdev() function is
calculated using the unbiased (or n-1) method.
See also the aggregate version of stdevp().
Function syntax:
list(T) sublist(list(T)L, intstart[, intend[, intstep]])
Returns a list composed of a section of the argument list, L, starting with index start, which is included in the result list. If
end is specified, it is not included in
the result list. The default value for step is 1. You can supply an alternate step value to skip elements in the argument list.
For example, a step of 2 returns every
other value in the list. You must specify an end value in order to specify a step value. The element type, T, of the returned
list is the same as the element type of the argument list.
In contrast with the range() function, step
cannot be negative.
Function syntax:
double sum(list(double)L) int sum(list(int)L) long sum(list(long)L) timestamp sum(list(timestamp)L)
Given a list, L, of doubles, ints, longs,
or timestamps, returns the sum of all members of the list. The element type of the
returned list is the same as the element type of the argument list.
When the list element type is timestamp, summing elements of the list follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum a list of two or more absolute timestamps. However, you can sum a list composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.
See also the aggregate version of sum().
Function syntax:
list(T) unique(list(T) L)
Returns a list composed of the unique elements in the argument list, L. The returned list is not reordered in any
sequence: the function only removes duplicates of existing elements, and returns
the list in the order of the argument list. The element type, T, of the returned
list is the same as the element type of the argument list.
For example:
unique(list(1, 2, 3, 3, 4, 4, 5, 6, 6,
7))
returns:
list(int) [1, 2, 3, 4, 5, 6, 7]
See also the related function, count_distinct_elements().
Function syntax:
double variance(list(double)L) double variance(list(int)L) double variance(list(long)L)
Given a list, L, of doubles, ints, or
longs, returns a double, the variance for all members of the list. Variance is a
measure of the dispersion of a set of data points around their mean value.
See the aggregate version of variance() for more on variance.
Function syntax:
double variancep(list(double)L) double variancep(list(int)L) double variancep(list(long)L)
Given a list, L, of doubles, ints, or
longs, returns a double, the variance for all members of the list. Variance is a
measure of the dispersion of a set of data points around their mean value.
The variancep() function is similar to the
variance() function: with variancep() the data provided is the entire population, while
with variance(), the data provided is treated as a
random sample. The variancep() function is calculated
using the biased (or n)
method. The variance() function is calculated using
the unbiased (or n-1) method.
See also the aggregate version of variancep().
This category includes the following functions:
In addition, the following functions return the maximum and minimum values for numeric data types:
Function syntax:
int abs(inte) double abs(doublee) long abs(longe)
Returns the absolute value of an int, double, or long expression e. The return type is the same as the expression's
type.
Function syntax:
double acos(double x)
Returns in radians the arc cosine of x,
which is the value whose cosine is x.
Undefined outside of the range –1 to 1.
Function syntax:
double asin(int x)
double asin(double x)
Returns in radians the arc sine of x,
which is the value whose sine is x. Undefined outside of the range –1 to 1.
Function syntax:
double atan(double x)
Returns in radians the arc tangent of x,
which is the value whose tangent is x.
Function syntax:
double atan(doublex, doubley)
Returns in radians the arc tangent of two variables, x and y.
This is similar to calculating the arc tangent of (y/x), except that the signs of both arguments are
used to determine the quadrant of the result. tangent is x.
Function syntax:
double cbrt(intx) double cbrt(doublex) double cbrt(longx)
Returns the (real) cube root of x.
Function syntax:
double ceil(intx) double ceil(doublex)
Returns the smallest double value that is greater than or equal to the argument and
is equal to a mathematical integer. The function name, ceil, is shorthand for ceiling. The argument can be a double or
an int. Examples:
-
ceil(7.01)returns the double8.0. -
ceil(8.0)returns the double8.0. -
ceil(8)returns the double8.0.
See also the related function, floor().
Function syntax:
int exp(int x)
Returns the value of e (the base of natural logarithms) raised to the power of
x.
Function syntax:
double floor(intx) double floor(doublex)
Returns the largest double value that is less than or equal to the argument
x, and equal to a mathematical integer.
The argument can be a double or an int. Examples:
-
floor(7.01)returns the double7.0. -
floor(8.0)returns the double8.0. -
floor(8)returns the double8.0.
See also the related function, ceil().
Function syntax:
double log1p(double x)
Returns the natural logarithm of the sum of the argument and 1.
Function syntax:
T max(Te1, ..., Ten)
This function returns the maximum non-null value using the greater-than relational
operator across all its arguments e1
through en, which must all be expressions
resolving to the same data type. The data type, T, of the returned value is the
same as the arguments.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the aggregate version of max() and the maxelement() function for lists.
Function syntax:
double maxdouble()
This function returns the maximum value for the double data type.
Function syntax:
T min(Te1, ..., Ten)
This function returns the minimum non-null value using the less-than relational
operator across all its arguments e1
through en, which must all be expressions
resolving to the same data type. The data type of the returned value is the same as
the arguments.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the aggregate version of min() and the minelement() function for lists.
Function syntax:
double mindouble()
This function returns the minimum value for the double data type.
Function syntax:
double pow(intx, inty) double pow(doublex, doubley) double pow(longx, longy)
Returns the value of x raised to the
power of y. Accepts an int, double, or
long, and returns a double.
Function syntax:
double random()
Returns a random double value with a positive sign, greater than or equal to 0.0 and less than 1.0.
Function syntax:
long round(doublee) long round(inte, intn) long round(doublee, intn) long round(longe, intn)
With one argument, a double, returns the closest long to the argument. Floating-point numbers are NOT supported.
With two arguments, round() is similar to the round() found in Microsoft Excel, but not identical to it. Specify a double, long, or int in the first argument. Specify an int in the second argument to designate the number of digits to round the first argument to. Use 0 to specify rounding to an integer, a positive integer to round the digits after the decimal point, or a negative number to specify rounding digits before the decimal point. If the first argument is an int or long, the second argument must be negative to actually perform rounding; otherwise the first argument is returned unchanged.
Function syntax:
double securerandom()
A more secure form of random() that generates random bytes, converted to return a double.
Function syntax:
int sign(double x)
Returns the sign of a double: -1 if less than zero, 0 if equal to zero, and 1 if greater than zero.
Function syntax:
double to_degrees(double x)
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
This category includes the following functions:
The functions in this group detect NaN (not a number) values.
Function syntax:
bool isnan(doublevalue) bool isnan(intvalue) bool isnan(longvalue) bool isnan(timestampvalue)
Returns true if the argument is NaN (not a number), or
returns null if the argument is null. Returns
false otherwise.
Function syntax:
bool notnan(doublevalue) bool notnan(intvalue) bool notnan(longvalue) bool notnan(timestampvalue)
Returns false if the argument is NaN (not a number),
or returns null if the argument is null. Returns
true otherwise.
Notice that notnan() returns the opposite of
isnan(), except if the argument is null, in which
case they both return null.
This category includes the following functions:
| get_conf_param(), get_*_conf_param() | getContainer() |
| getHostName() | getLeadershipStatus() |
| getNodeName() | getPath() |
| securitytag() |
Function syntax:
string get_conf_param(default-value,path1,path2,path3, ...) boolean get_boolean_conf_param(default-value,path1,path2,path3, ...) int get_int_conf_param(default-value,path1,path2,path3, ...) long get_long_conf_param(default-value,path1,path2,path3, ...) double get_double_conf_param(default-value,path1,path2,path3, ...)
Returns the value portion of a name-value pair specified with a param element in the currently active server configuration
file, usually named sbd.sbconf. Use the typed
versions of the function to convert the returned value to the specified data type.
These functions only return values from param
elements. Other configuration file elements cannot be accessed with these
functions.
The first argument, default-value,
specifies a value to return if parsing the configuration file returns nothing. The
arguments path1, path2 ... pathn specify element and attribute names in the
XML path to the value of interest (without counting the streambase-configuration top-level element).
For example, the value of the TCP port might be set in the configuration file with an entry like the following:
<server>
<param name="tcp-port" value="9900" />
</server>
To extract the TCP port value from the configuration file, use the int version of
the get_conf_param() function as follows:
get_int_conf_param(10000, "server",
"tcp-port")
In another example, some of the values set in the configuration file's security element are nested deeper:
<security>
<ssl-authentication>
<param name="keystore" value="../test/proxy/security/signed.keystore"/>
<param name="keystore-password" value="secret"/>
</ssl-authentication>
</security>
To extract the keystore-password value, use the following function:
get_conf_param("verysecret", "security",
"ssl-authentication", "keystore-password")
Function syntax:
string getContainer()
Returns as a string the name of the container in which the current application is loaded. Can be used in expressions like this example:
if getContainer() == "default" ...
Function syntax:
string getHostName()
Returns as a string the machine name of the host on which StreamBase Server is running. Can be used in expressions like this example:
if getHostName() == "fasthost" ...
Function syntax:
string getLeadershipStatus()
Returns the string LEADER or NON_LEADER, which represent the current StreamBase Server's leadership status as a member of a high availability cluster. Servers have LEADER status by default on startup, which can be changed with the sbadmin setLeadershipStatus command.
Function syntax:
string getNodeName()
Returns as a string the name of the node in which the current application is loaded. Can be used in expressions like this example:
if getNodeName() == "primaryserver"
...
Function syntax:
string getPath(string componentname)
Returns the qualified path of the specified component of a StreamBase application. Can be used for verifying the container name for the specified component at run time, in expressions like this example:
if getPath("Symbol") == "default.Symbol"
...
Function syntax:
int securitytag()
Returns the value of the security tag for the current component, in an application for which security tagging has been enabled. Security tagging is enabled and tag values are defined in the application's server configuration file, as described in security-tagging .
For example, if a security tag value of 2 was set for the input stream used by a
Map operator that does not modify security levels, a securitytag call in that operator returns a value of 2. For
operators that aggregate security values from multiple inputs, a different return
value is possible.
Note
A return value of 0 can indicate either that no value was set for the current component, or that an explicit value of 0 was set.
This category includes the following functions:
| format() | lower() | rtrim() | string() ** |
| indexof() * | ltrim() | strlen() | substr() |
| lastindexof() * | regexmatch() | strresize() | trim() |
| length() * | regexmatch_ignorecase() | strresizetrunc() | upper() |
*Functions marked with an asterisk work with both strings and lists, and are found in the Simple Functions: Utilities section of this page.
**The string() function is found in the Simple Functions: Type Conversions section of this page.
The behavior of functions that deal with strings changes when Unicode support is enabled for StreamBase Server as described in Unicode Support. These cases are noted for each function in this section.
Function syntax:
string format([intlength,] stringformat-string,arg0,arg1, ...,argn)
Returns a formatted string for arguments arg0 through argn, each formatted according to a format
specification in format-string. The
returned string is truncated to length
characters, if specified. The length
argument was required in previous StreamBase releases, but is no
longer required or recommended. The length argument is preserved to maintain
compatibility with existing expressions that use format().
Format specifications are patterns specified in the manner of class java.text.MessageFormat in the Sun
Java Platform SE
reference documentation.
For example, an incoming stream containing NYSE stock ticks might have a field
named symbol containing four-character stock symbols.
Let's say a downstream operator expects symbols to be prefixed with a string
identifying their source stock exchange. In this case, use an expression like the
following in a Map operator:
format("NYSE:{0}", symbol)
An incoming stream might have a price field specified
to four decimal places. When the flow of tuples reaches an output stream that must
be formatted for human readability, you can use an expression like the following to
round the price field to two decimal places. This
example assumes that the price field contains values
of type double, which matches the format specifier ###.##.
format("{0,number,###.##}", price)
The format for argument arg0 is specified
with the portion of the format-string
that defines {0}. The format for arg1 is in the format-string portion that defines {1}, and so on. The data type of arg0 must match the type expected by the format-string specifier for {0}, and so on for each argument.
If you use format() to modify a tuple, use {0}, {1},
{2} and so on to specify the format for each field.
See the documentation for java.text.MessageFormat
for further details and more examples.
Function syntax:
string ltrim(stringstr) string ltrim(strings1, strings2)
The same as trim(), except that the first argument is trimmed on the left side only.
Function syntax:
bool regexmatch(stringregex, stringstr)
Attempts to match the specified regular expression in regex against the entire specified string in str. To match any substring of the specified string, use wildcards
before and after your regular expression. For example, the regex IBM matches only when the entire string is "IBM", while the regex
.*IBM.* matches if "IBM" is anywhere in the supplied
string str.
Function syntax:
bool regexmatch_ignorecase(stringregex, stringstr)
Same as regexmatch(), but the supplied regex matches str without regard to case.
Function syntax:
string rtrim(stringstr) string rtrim(strings1, strings2)
The same as trim(), except that the first argument is trimmed on the right side only.
Function syntax:
int strlen(string str)
Returns the length of the string str.
This function returns the number of graphemes, not bytes, for all supported
Unicode character
sets.
Function syntax:
string strresize(stringstr, intlength)
The strresize() function performs no action: it
simply returns its str argument and
ignores the length argument. The function
is preserved for compatibility with previous StreamBase releases, in
which the StreamBase string data type had a fixed maximum size. To
perform string truncation, use substr() or
strresizetrunc().
Function syntax:
string strresizetrunc(stringstr, intlength)
Returns its argument str, truncated to
the number of bytes specified in length.
Note
When using strresizetrunc() with Unicode multibyte character
sets, count the length argument in
bytes, not graphemes.
Function syntax:
string substr(stringstr, intbegin, intlength)
Returns a substring of the string str,
starting at the zero-indexed character begin with a size of length characters.
For all supported Unicode character sets, specify length in number of graphemes, not bytes.
Function syntax:
string trim(stringstr) string trim(strings1, strings2)
With one argument, returns the specified string with leading and trailing spaces removed.
With two arguments, the second argument contains the characters to be removed from the string specified in the first argument.
Function syntax:
string upper(string str)
Returns its argument in uppercase letters.
This category includes the following functions:
Function syntax:
timestamp sleep(timestamp t)
Pauses execution of the current thread for the specified time period. The argument
must be an interval timestamp, and the return value is the argument. This function
can be useful in conjunction with operators that run in a separate thread, and is
provided for debugging or for rare uses. The sleep()
function blocks the concurrent portion of an application, if concurrency is
enabled, or blocks the container holding the application, if concurrency is not
enabled. Therefore, use sleep() with care in
production applications. To learn about threading in StreamBase
applications, refer to Execution
Order, Concurrency, and Parallelism.
Function syntax:
string systemenv(stringkey[, intlength])
Calls the standard Java method java.lang.System.getenv(). Returns the
string value of the environment variable specified by the key argument. If length is specified, the returned string is
truncated to the specified number of bytes.
Note
When using systemenv() with Unicode multibyte character
sets, count the length argument in
bytes, not graphemes.
Function syntax:
string systemproperty(stringkey[, stringdefaultstring][, intlength])
Calls the standard java.lang.System.getProperty()
method. Returns the string value of the system property specified by the key argument. Use the standard set of system
property keys described in Java documentation for java.lang.System.getProperties()
(for example, os.name, java.home, and user.name), or
specify a StreamBase-specific Java property.
If length is specified, the returned
string is truncated to the specified number of bytes.
If the optional defaultstring argument is used, and
there is no system property for the specified key, the function returns the string
specified in the defaultstring argument. If no default
string is provided, null is returned.
Note
When using systemproperty() with Unicode multibyte character
sets, count the length argument in
bytes, not graphemes.
There are two representations of time held by the timestamp data type. An absolute timestamp holds an exact date and time value with a maximum precision of milliseconds. An interval timestamp holds an amount representing elapsed time.
In both cases, the internal representation of the timestamp is the same, a long value representing seconds. In the case of absolute timestamps, the value is the number of seconds elapsed since the epoch, which is defined as midnight of January 1, 1970 UTC. In the case of interval timestamps, the value is simply the number of seconds in the interval.
Absolute timestamps are expressed in the time format patterns of the SimpleDateFormat class described in
the Sun Java Platform
SE reference documentation. Examples on this page are taken from the results of
running sbd --eval commands.
See timestamp Data Type for more information on the timestamp data type, including information on adding and subtracting timestamp values and comparing timestamps with relational operators.
Timestamp functions are organized into the following categories:
| Simple Functions: Absolute Timestamps |
| Simple Functions: Interval Timestamps |
| Simple Functions: Working with Timestamp Fields |
This category includes the following functions, which return an absolute timestamp or a string converted from an absolute timestamp:
| date() | epoch() | format_time() |
| from_gmtime() | from_localtime() | nanotime() |
| now() | parse_time() | strftime() |
| strptime() | today() | today_utc() |
Function syntax:
timestamp date(timestamp ts)
Accepts a timestamp ts and returns an
absolute timestamp for the year, month, and day portion of ts, with all time values set to zero. This, in
effect, extracts and returns the timestamp for midnight local time for the date
in the argument ts.
For example, the function now(), run on 12 Feb 2009 in the EST time zone, returns a full timestamp like this one:
(timestamp) 2009-02-12 15:48:45.679-0500
But the same function used as an argument for date():
date(now())
returns the following:
(timestamp) 2009-02-12 00:00:00.000-0500
Function syntax:
timestamp epoch()
Returns an absolute timestamp representing the beginning of counted time. For both Windows and UNIX, the epoch is midnight, January 1st, 1970. The value returned is adjusted for the current time zone. To see the absolute epoch, use:
epoch() - timezoneoffset()
Function syntax:
string format_time(timestampts, stringformatstring)
Converts the specified timestamp field, ts, to a string, using the specified formatstring pattern. The timestamp field can
then be manipulated as a string instead of a timestamp. This function uses the
time format patterns from the SimpleDateFormat class described
in the Sun Java
Platform SE reference documentation.
For example, the function now(), run on 12 Feb 2009 in the EST time zone, returns a timestamp like this one:
(timestamp) 2009-02-12 15:48:45.679-0500
The same function used as an argument for format_time():
format_time(now(), "EEEE, MMM d, yyyy HH:mm
zzzz")
returns the following:
(string) Thursday, Feb 12, 2009 15:48 Eastern Standard Time
You can use any combination of time format designators from the SimpleDateFormat class. For example:
format_time(now(), "yyyy-MM-dd G, HH:mm
a")
returns:
(string) 2009-02-12 AD, 15:48 PM
This function performs the same task as strftime(), but uses a different library source for its
format pattern strings. The strftime() function is
faster than format_time(), and should be used when
performance is critical in your application. The format pattern strings used by
format_time() are easier to use.
Function syntax:
timestamp from_gmtime(intyear, intmonth, intdayOfMonth, inthour, intminute, doubleseconds)
Creates an absolute timestamp in the UTC (GMT) time zone from the specified
integer components of a date and time. All six arguments are required. You can
express the seconds argument as a
double with precision down to milliseconds, or you can use an integer, which is
promoted automatically to a double by the rules of StreamBase
Data Type Coercion.
You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.
Notice that the returned timestamp shows a moment in the UTC time zone, but that
moment is translated and displayed in the local time zone. For example, compare
the same arguments given to from_gmtime() and
from_localtime() when run in the EST time zone:
from_gmtime(2009, 3, 17, 18, 30,
05)
returns:
(timestamp) 2009-03-17 14:30:05.000-0400
while
from_localtime(2009, 3, 17, 18, 30,
05)
returns:
(timestamp) 2009-03-17 18:30:05.000-0400
Function syntax:
timestamp from_localtime(intyear, intmonth, intdayOfMonth, inthour, intminute, doublesecond)
Creates an absolute timestamp in the local time zone from the specified integer
components of a date and time. All six arguments are required. You can express
the seconds argument as a double with
precision down to milliseconds, or you can use an integer, which is promoted
automatically to a double by the rules of StreamBase Data Type
Coercion.
You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.
For example:
from_localtime(2009, 3, 17, 18, 30,
05)
returns:
(timestamp) 2009-03-17 18:30:05.000-0400
Function syntax:
long nanotime()
Returns the current value of the system timer as a long value in nanoseconds. This function does not guarantee to return an accurate measurement of the system clock to the nanosecond level of precision, but does provide an accurate counter at nanosecond precision. Therefore, use this function only to measure elapsed time between two invocations, like the following example:
long startcount = nanotime()
// process tuple data
long elapsedtime = nanotime() - startcount
Differences in two calls to nanotime() are not accurate for a span greater than 263 (about 292 years).
Function syntax:
timestamp now()
Returns an absolute timestamp value representing the current time in the local time zone.
You can specify an alternate now() implementation
in the StreamBase Server configuration file. The value is set in the
runtime section:
<param name="now-implementation" value="system" />
Specify one of these values, as appropriate for your StreamBase application:
| Value | Meaning |
|---|---|
| system |
Directs the now() function to use Java's
System.currentTimeMillis(). This is the
default.
|
| thread |
Directs the now() function to use a
background thread that checks the time approximately every millisecond.
This option results in decreased accuracy, but may be more efficient than
system if you call now() more frequently than 1000 times per second.
|
Function syntax:
timestamp parse_time(stringtimestring, stringformatstring)
Parses a string of time and date information and returns an absolute timestamp
value. The timestring argument can be
any string representation of date and time, as long as its components are
individually parseable.
You must also specify a formatstring
argument to specify how the timestring
argument is to be interpreted. The format string must specify a time format
pattern as defined in the SimpleDateFormat class,
described in the Sun Java Platform SE reference documentation.
Use this function to read a time and date string in any format in an incoming
stream, and still interpret that string as an absolute timestamp. For example, an
incoming field, TradeTime, might contain the time
and date in the following string format:
TradeTime = "11:17 Fri Feb 13, 09"
In this case, interpret the field TradeTime as a
StreamBase timestamp with an expression like the following:
parse_time(TradeTime, "HH:mm EE MMM dd,
yy")
which returns:
(timestamp) 2009-02-13 11:17:00.000-0500
This function performs the same task as strptime(), but uses a different library source for its
format pattern strings. The strptime() function is
faster than parse_time(), and should be used when
performance is critical in your application. The format pattern strings used by
parse_time() are easier to use. See also the
timestamp() function.
Function syntax:
string strftime(stringformatstring, timestampts)
Converts the specified timestamp field, ts, to a string, formatted according to the
specified formatstring pattern. The
timestamp field can then be displayed and manipulated as a string instead of a
timestamp.
This function uses time format patterns based on the strftime library, part of the Open Group's Single UNIX Specification. This function accepts
all of the format patterns of the Open Group's strftime function as documented at the Open Group's strftime page, with the following exceptions:
-
The StreamBase
strftime()function does not support %R. -
For %z, the StreamBase
strftime()function returns the current time formatted as if you had specified"%a, %d %b %Y %H:%M:%S %Z". For example,strftime("%z", now())returns a string like the following:Thu, 18 Feb 2010 15:42:35 EST
-
The StreamBase
strftime()function adds support for the following format patterns above and beyond the patterns supported by the Open Group's strftime specification:%f Returns the milliseconds portion of timestamp ts.%s Returns the time value of timestamp tsin milliseconds divided by 1000.%+ Returns the current time formatted as if you had specified "%a %b %d %H:%M:%S %Z %Y". For example,strftime("%+", now())returns a string like the following:Thu Feb 18 15:42:35 EST 2010
As an example of using strftime(), consider that the function now(), when run on 18 Feb 2010 in the EST time zone, returns a timestamp like this one:
(timestamp) 2010-02-18 15:42:35.461-0500
Here is the same function used as an argument for strftime():
strftime("Traded at %H:%M and %S.%f seconds on %b
%d, %Y", now())
which returns the following:
(string) Traded at 15:42 and 35.461 seconds on Feb 18, 2010
This function performs the same task as format_time(), but uses a
different library source for its format pattern strings. This strftime() function is faster than format_time(), and should be used when performance is critical
in your application. The format pattern strings used by format_time() are easier to use.
Function syntax:
timestamp strptime(stringtimestring, stringformatstring)
Parses a string of time and date information and returns an absolute timestamp
value. The timestring argument can be
any string representation of date and time, as long as its components are
individually parseable.
You must also specify a formatstring
argument to specify how the timestring
argument is to be interpreted. The format string must specify a time format
pattern using one or more of the following subset of the patterns defined for the
strptime library (part of the Open Group's Single UNIX
Specification): %b, %B, %d, %D, %H, %m, %M, %S, %x, %y, and %Y. strptime() also accepts %f, which designates the milliseconds
portion of the seconds field. You can use an exclamation point in the format
string to indicate that the remainder of the input string is optional.
Use this function to read a time and date string in any format in an incoming
stream, and still interpret that string as an absolute timestamp. For example, an
incoming field, TradeTime, might contain the time
and date in the following string format:
TradeTime = "15-51-26.984 May 12, 09"
In this case, interpret the field TradeTime as a
StreamBase timestamp with an expression like the following:
strptime(TradeTime, "%H-%M-%S.%f %b %d,
%y")
which returns:
(timestamp) 2009-05-12 15:51:26.984-0400
This function performs the same task as parse_time(), but uses a
different library source for its format pattern strings. This strptime() function is faster than parse_time(), and should be used when performance is critical
in your application. The format pattern strings used by parse_time() are easier to use. See also the timestamp() function, and strpinterval(), a function similar to strptime(), but for interval timestamps.
Function syntax:
timestamp today()
Returns an absolute timestamp value for the beginning of the current day (midnight) in the local time zone.
Function syntax:
timestamp today_utc()
Returns an absolute timestamp value for the beginning of the current day (midnight) in the UTC (GMT) time zone, translated for the local time zone.
Thus, when today() and today_utc() are run in sequence in the EST time zone, they
return:
2009-02-13 00:00:00.000-0500 2009-02-12 19:00:00.000-0500
On UNIX, today_utc() requires the TZ environment
variable to be set to translate correctly to the local time zone. On Windows, the
TZ environment variable is ignored if set.
This category includes the following functions, which return an interval timestamp:
| days() | hours() | interval() |
| milliseconds() | minutes() | seconds() |
| strpinterval() | time() | timezoneoffset() |
| weeks() |
Function syntax:
timestamp days(intx) timestamp days(doublex)
Returns an interval timestamp representing an interval of x days.
Function syntax:
timestamp hours(intx) timestamp hours(doublex)
Returns an interval timestamp representing an interval of x hours.
Function syntax:
timestamp interval(string formatstring)
Parses the argument formatstring as a
time format string and returns an interval timestamp. This function uses the time
format patterns from strftime library, described in the Open Group's Single UNIX Specification.
The time format string is in the form %H:%M:%S!.!%f.
The exclamation points show where the remainder of the string is optional. Thus,
you can enter strings in any of the following formats, where hh = hours, mm =
minutes, ss = seconds, and ff = a fractional second value in milliseconds:
-
hh:mm:ss
-
hh:mm:ss.
-
hh:mm:ss.ff
For example:
interval("2:00:01")
interval("2:00:01.")
both return (timestamp) 7201.0, the number of
seconds in the specification of two hours plus one second.
interval("2:00:01.333")
returns (timestamp) 7201.333, the number of seconds
in the specification of two hours plus one and a third seconds.
Function syntax:
timestamp milliseconds(intx) timestamp milliseconds(doublex)
Returns an interval timestamp representing an interval of x milliseconds.
Function syntax:
timestamp minutes(intx) timestamp minutes(doublex)
Returns an interval timestamp representing an interval of x minutes.
Function syntax:
timestamp seconds(intx) timestamp seconds(doublex)
Returns an interval timestamp representing an interval of x seconds.
Function syntax:
timestamp strpinterval(stringtimestring, stringformatstring)
Parses a string of time interval information and returns an interval timestamp
value. The timestring argument can be
any string representation of a time interval, as long as its components are
individually parseable.
You must also specify a formatstring
argument to specify how the timestring
argument is to be interpreted. The format string must specify a time format
pattern using a subset of the patterns defined for the strptime library, (part of the Open Group's Single UNIX Specification): %H, %M, and %S.
strpinterval() also accepts %f, which designates
the milliseconds portion of the seconds field. You can use an exclamation point
in the format string to indicate that the remainder of the input string is
optional.
Use this function to read a time interval string in any format in an incoming
stream, and still interpret that string as an interval timestamp. For example, an
incoming field, ElapsedTime, might contain a time
interval expressed in following string format:
ElapsedTime = "1, 7, 5.250"
which you know from the data vendor's documentation means one hour and 7 minutes,
plus five and a quarter seconds. In this case, interpret the field ElapsedTime as a StreamBase interval timestamp with
an expression like the following:
strptime(ElapsedTime, "%H, %M,
%S.%f")
which returns:
(timestamp) 4025.25
If the ElapsedTime field in the incoming stream
sometimes shows the seconds field and sometimes doesn't, use an exclamation point
to show that everything after the minutes pattern is optional:
strptime(ElapsedTime, "%H, %M!,
%S.%f")
Then when a new ElapsedTime value comes through set
to "2, 4", it is still
interpreted and converted to an interval timestamp:
(timestamp) 7440.0
See also the strptime(), which performs for absolute timestamps what this function does for interval timestamps.
Function syntax:
timestamp time(timestamp ts)
Returns as an interval timestamp the number of seconds elapsed between the start
of the day expressed in the argument, ts, and the present moment. Thus, if run at the
same instant, the following expressions return the same result: time(now()) and now() - today().
timestamp timezoneoffset()
timestamp timezoneoffset(timestamp ts)
Without an argument, returns a signed interval timestamp that expresses (in
seconds) the number of hours that the local time zone is offset from UTC. Thus,
timezoneoffset() returns as an interval the same
value that format_time(now(),
"Z") returns as a string. Because interval timestamps are
expressed in seconds, you can divide the results by 3600 to obtain for display
purposes the number of hours in the offset.
The returned time zone offset value is calculated relative to the system time for
Windows, and relative to the system time or to the time zone specified in the
TZ environment variable for UNIX. On UNIX, make sure
that the environment from which the sbd process is launched either does not have a
TZ variable or that it is correctly set for the local
system's time zone. The TZ environment variable is
ignored on Windows.
You can subtract the return of timezoneoffset()
from any absolute timestamp value to generate a new absolute timestamp that
represents the same moment in the UTC time zone. For example, now() - timezoneoffset() returns the present
moment in the UTC time zone, and returns the same result when run from any time
zone.
Because the value returned from timezoneoffset() is
signed, you do not need to consider whether to add or subtract a positive or
negative offset value from the timestamp of interest. Always use a subtraction
operation, which insures that negative timezoneoffset() values (that is, those from time zones west of
UTC) are effectively added, while positive values (from time zones east of UTC)
are subtracted.
Once you have a timestamp value representing UTC time, you can convert it to a timestamp for another time zone by adding or subtracting a fixed value representing the offset of the time zone of interest. For example, the following expression returns the current time for Sydney, Australia for April to October, when Sydney is on standard time:
(now() - timezoneoffset()) -
hours(10.0)
(For October to April, when Sydney is on daylight savings time, use
hours(11.0).)
If you provide an absolute timestamp as an argument, timezoneoffset() returns the UTC offset of the system's current
time zone at the date and time of the specified timestamp. The offset is
calculated based only on the date and time information in the specified
timestamp, relative to the system's current time zone (or, on UNIX, to the zone
in TZ). You can use the argument form of timezoneoffset() to determine the UTC offset for future or
historical dates.
For example, the following commands show that on the US East Coast, the UTC offset changed at 2:00 AM on March 8, 2009, when daylight savings time began.
sbd --eval "timezoneoffset(timestamp('2009-03-08 01:59:59'))/3600"
timestamp (-5.0)
sbd --eval "timezoneoffset(timestamp('2009-03-08 02:00:00'))/3600"
timestamp (-4.0)
sbd --eval "timezoneoffset(timestamp('2009-03-08 03:00:00'))/3600"
timestamp (-4.0)
(Actually, at 2:00 AM clocks were turned ahead one hour to 3:00 AM, but
timezoneoffset() correctly calculates the UTC
offset for both 2:00 and 3:00 AM.)
Function syntax:
timestamp weeks(intx) timestamp weeks(doublex)
Returns an interval timestamp representing an interval of x weeks.
This category includes the following functions, used to get and set individual fields in absolute timestamps:
Function syntax:
double get_millisecond(timestamp ts)
Returns the milliseconds portion of the absolute timestamp ts. The returned value is a double, and includes
only the fractional value of the milliseconds of the current second.
Function syntax:
double get_second(timestamp ts)
Returns the seconds portion of the absolute timestamp ts. The returned value is a double, and includes
a fractional value in milliseconds.
Function syntax:
int get_minute(timestamp ts)
Returns the minutes portion of the absolute timestamp ts.
Function syntax:
int get_day_of_week(timestamp ts)
Returns an integer representing the day of the week in the absolute timestamp
ts. The returned value is between
0 (Sunday) and 6
(Saturday).
Function syntax:
int get_day_of_month(timestamp ts)
Returns an integer representing the day of the month in the absolute timestamp
ts. The returned value is between
1 and 31.
Function syntax:
int get_month(timestamp ts)
Returns an integer representing the month in the absolute timestamp ts. The returned value is between 1 (January) and 12 (December).
Function syntax:
timestamp set_second(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the seconds portion changed to
x.
Function syntax:
timestamp set_minute(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the minutes portion changed to
x.
Function syntax:
timestamp set_hour(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the hours portion changed to x.
Function syntax:
timestamp set_day_of_week(timestampts, intx)
Returns a new timestamp that has the same value as absolute timestamp ts, but with the day of the week set to the value
of x. Days of the week are numbered
from 0 to 6, Sunday
through Saturday. Day 7 is also considered to be Sunday.
Function syntax:
timestamp set_day_of_month(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the day of month portion changed to
x.
Function syntax:
timestamp set_month(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the month portion changed to x.
Function syntax:
timestamp set_year(timestampts, intx)
Returns a modified version of the absolute timestamp ts, with the year portion changed to x.
Function syntax:
long to_milliseconds(timestamp ts)
Converts an interval or absolute timestamp to a long value representing
milliseconds. This function is similar to to_seconds(), but is more precise and faster. If the argument
ts is an interval, returns the number
of milliseconds in the interval. For example, to_milliseconds(now() - today()) returns the number of
milliseconds since midnight.
If the argument ts is an absolute time,
returns the number of milliseconds since the epoch, Jan 1, 1970. For example,
to_milliseconds(timestamp("2007-11-29 17:32"))
returns 1196375520000.
Function syntax:
double to_seconds(timestamp ts)
Converts an interval or absolute timestamp to a double in seconds. If the
argument ts is an interval, returns the
number of seconds in the interval. For example, to_seconds(now() - today()) returns the number of seconds since
midnight.
If the argument ts is an absolute time,
returns the number of seconds since the epoch, Jan 1, 1970. For example,
to_seconds(timestamp("2007-11-29 17:32")) returns
1.19637552E9.
This category includes the following functions:
Function syntax:
blob blob(string arg) blob blob()
Converts arg to a value of type blob
where arg is a string. Returns a blob
representation of the string. With no argument, returns the same as blob(null).
Function syntax:
bool bool(bool arg) bool bool(double arg) bool bool(int arg) bool bool(long arg) bool bool(string arg) bool bool()
Converts arg to a value of type bool
where arg is a bool, double, int, long,
or string.
-
An int argument of 0 returns the Boolean
false. All other int values, positive or negative, returntrue. -
A double argument of 0 returns the Boolean
false. All other double values, positive or negative, returntrue. -
A string argument
falsereturns a Booleanfalseand the stringtruereturns Booleantrue. The string is compared in a case-insensitive manner. For example,bool("FALSE")andbool("false")are equivalent. Any string other thanfalseortrueis an error. -
A bool value returns itself.
-
With no argument, returns the same as
bool(null).
Note
bool() does not support a blob value. To detect whether a blob is null, you can
instead use isnull( (where e)e is a blob).
Function syntax:
double double(bool arg) double double(double arg) double double(int arg) double double(long arg) double double(string arg) double double()
Converts arg to a value of type double
where arg is a bool, double, int, long,
or string.
-
A bool argument returns
1.0if true, or0.0if false. -
A double argument returns itself.
-
An int argument returns the same value converted to type double. For example,
double(3)returns3.0. -
A long argument returns the same value converted to type double. For example, double(15L) returns 15.0.
-
A string argument is parsed as a decimal number. For example,
double("123.456")returns123.456, butdouble("7abc")returns an error. Scientific notation is supported, sodouble("1.2E4")returns12000.0. -
With no argument, returns the same as
double(null).
Also See: The to_seconds() function, used to convert timestamps to double.
Function syntax:
int int(bool arg) int int(double arg) int int(int arg) int int(long arg) int int(string arg) int int()
Converts arg to a value of type int where
arg is a bool, double, int, long, or
string.
-
A bool argument returns
1if true, or0if false. -
A double argument has its fractional part truncated. For example,
int(3.4)returns3. -
An int argument returns itself.
-
A long argument returns the int same as a long value. For example, int(150L) returns 150.
-
A string argument is parsed as a decimal number. For example,
int("123")returns123, butint("7abc")is an error. -
With no argument, returns the same as
int(null).
Function syntax:
list list(e1,e2,e3, ...)
Returns a list containing each of the specified elements e1, e2,
and so on, where each specified element is of the same data type. The specified
elements can be literals or expressions, as long as each resolves to (or can be
coerced to) the same data type. You can create a list of any
StreamBase data type. Use emptylist() to return an empty list, and use nulllist()
to return a null list.
Function syntax:
long long(boolarg) long long(doublearg) long long(intarg) long long(longarg) long long()
Converts expression arg to a value of
type long where arg evaluates to a bool,
double, int, or long.
-
An int argument returns the same value, but with type long.
-
A double argument has its fractional part truncated, and the remainder returned with type long.
-
A bool argument returns
1if true, or0if false. -
A long value returns itself.
-
With no argument, returns the same as
long(null).
Function syntax:
string string(blobarg) string string(boolarg) string string(doublearg) string string(intarg) string string(longarg) string string(stringarg) string string(timestamparg) string string()
Converts arg to a value of type string
where arg is a blob, bool, double, int,
long, string, or timestamp.
-
An int, double or long argument returns the string representation of
arg. -
A string argument returns itself.
-
A bool argument returns the string
trueif true, orfalseif false. -
A timestamp argument, if represented as a date and time, returns a string that includes the explicit time zone. A timestamp represented as a numeric value returns the string representation of that value.
-
A blob argument returns the string representation of the blob.
-
With no argument, returns the same as
string(null).
Function syntax:
timestamp timestamp(string timestring)
timestamp timestamp()
Returns an absolute timestamp parsed from the argument timestring, a string representation of a date with
optional time. With no argument, returns the same as timestamp(null).
The timestring argument must be in the
form of a time format pattern as defined in the SimpleDateFormat class, described
in the Sun Java
Platform SE reference documentation.
You can specify only the date portion:
timestamp("2009-02-12")
If you specify any part of the time value, you must specify at least the hours and minutes:
timestamp("2009-02-12 14:28") or
timestamp("2009-02-12
14:28:05.250")
You can also specify the entire SimpleDateFormat
string, including time zone specification:
timestamp("2009-02-12
14:28:05.250-0800")
See also the strptime() and parse_time() functions, used to convert strings to timestamps.
Function syntax:
tuple tuple(...) tuple tuple()
Takes one or more comma-separated value expressions as arguments, each field named with the AS keyword. The result is a single tuple. For example:
tuple(126.50 as price, "IBM" as
symb)
You can optionally specify a name for the resulting tuple with a final AS keyword:
tuple(126.50 as price, "IBM" as symb) as
tick1
The expression tuple() returns a
non-null tuple with no fields.
Function syntax:
tuple tuple_constructor(...)
The tuple_constructor function is not a function named tuple_constructor. Instead, it is a function generated by syntactic sugar that takes the identifier of a named schema as its name, and takes one or more comma-separated value expressions as arguments. The values must be compatible with the named schema. The result is a single tuple with the same identifier and schema as the named schema.
For example, consider the following StreamSQL fragment:
CREATE SCHEMA point(x double, y double); CREATE INPUT STREAM in (tag string, point); CREATE OUTPUT STREAM out; SELECT point(1.0, 2.0) AS p FROM in INTO out;
The first two statements define a named schema and an input stream that includes
it. In the last statement, the point() function
(generated automatically from the CREATE SCHEMA statement) selects for the values
1.0 and 2.0. This tuple is equivalent to:
tuple(1.0 AS x, 2.0 AS y) AS p
The names of named schemas can be used as zero-argument functions to construct null values of the appropriate type. The result is the entire tuple field returned as null, as opposed to each field of the named tuple returned as null. This might be appropriate to describe, for example, a complex incoming tuple composed of two tuple fields, buy_order and sell_order. Only one of these two is used per transaction, buy or sell. Instead of filling in null values for each field of the unused tuple, you can designate the entire tuple as null.
For example, consider the following StreamSQL fragment:
CREATE SCHEMA transaction (
symbol string,
numshares double,
pershare double,
custid int
);
CREATE INPUT STREAM In (
buy_order transaction,
sell_order transaction
);
CREATE OUTPUT STREAM Out ;
SELECT transaction("DELL",2500,12.15,234623) AS buy,
transaction() AS sell
FROM In INTO Out;
This example outputs the following on stream Out:
"DELL,2500,12.15,234623",null
Using a named schema with zero arguments is the StreamSQL equivalent of using the Set null checkbox in the Manual Input view in StreamBase Studio.
This category includes the following functions:
| coalesce() | coalesce_tuples() | hash() |
| indexof() | isnull() | lastindexof() |
| length() | new_tuple() | notnull() |
| unzip() | zip() |
Function syntax:
T coalesce(Targ1, ..., Targn)
Returns the first non-null argument, or a null value if all arguments are null. Accepts all data types, but all arguments must have the same type. The returned value has the same data type as the arguments. For tuple arguments, returns the first tuple that is non-null. This function stops evaluating arguments after the first non-null argument is detected.
Since literal values are never null, you can specify a literal value as the last
argument to provide an effective default value for the list. In this way,
coalesce() can be used to emulate the NVL() function provided by Oracle PL/SQL and the two-argument
ISNULL() function provided by Microsoft T/SQL. For example, the following
expression returns the value of fieldA if it is non-null, or 0
if it is null:
coalesce(fieldA, 0)
The following example returns the first non-null field among fieldA, fieldB, and fieldC in that order, or returns
-99999 if all three fields are null:
coalesce(fieldA, fieldB, fieldC,
-99999)
Function syntax:
T coalesce_tuples(Targ1, ..., Targn)
For non-hierarchical data types, coalesce_tuples() acts the same as the coalesce() function. For hierarchical types, this function coalesces each sub-field, recursing into further nested fields as necessary.
For example, the following function:
coalesce_tuples( tuple(double() AS x, 1.2 AS y, double() AS z), tuple(1.0 AS x, 1.3 AS y, double() AS z), tuple(double() AS x, double() AS y, 1.4 AS z) )
returns the following tuple:
((x double, y double, z double)) 1.0, 1.2, 1.4
Remember that double() evaluates to double(null). Thus, the colasce_tuples function in this example selected:
-
The first non-null value for x, 1.0, which was found in the second input tuple.
-
The first non-null value for y, 1.2, which was found in the first input tuple.
-
The first non-null value for z, 1.4, which was found in the third input tuple.
Function syntax:
int hash(T arg)
Returns a hashed integer value derived from the argument, which can be any StreamBase data type.
Function syntax:
int indexof(stringhaystack, stringneedle, [intstart]) int indexof(list(T)L, Te, [intstart])
For strings, returns the index of the first instance of the string needle within the string haystack, for which the index is greater than or
equal to start. For all supported
Unicode character
sets, this function returns the index in number of graphemes, not bytes. For string
arguments, if needle is not found, the
function returns a value of -1.
For lists, returns the index of the first element of the list L whose value is e, starting at index start. The argument list can have any element type,
T. The element e must have the same element type as the argument
list. For list arguments, if e is not
found, the function returns a value of null.
If the search is successful, the value returned is always greater than or equal to
start. If start is unspecified, it is taken to be 0, which
designates the first character of the string or the first element of the list.
Function syntax:
bool isnull(value)
Returns true if the argument is a null value. Accepts all data types.
For list and tuple arguments, isnull() returns true if and only if the top-level list or tuple is null. If a list contains any elements at all, even if all their values are null, it is not a null list. If a tuple contains any fields at all, it is not null even if all field values are null.
Function syntax:
int lastindexof(stringhaystack, stringneedle, [intlastStart]) int lastindexof(list(T)L, Te, [intlastStart])
For strings, returns the index within string haystack of the rightmost instance of string
needle, for which the index is less than
or equal to lastStart. For all supported
Unicode character
sets, this function returns the index in number of graphemes, not bytes. For string
arguments, if needle is not found, the
function returns a value of -1.
For lists, returns the index of the last element of the list L whose value is e, for which the index is less than or equal to
lastStart. The argument list can have any
element type, T. The element e must have
the same element type as the argument list. For list arguments, if e is not found, the function returns a value of
null.
If the search is successful, the value returned is always less than or equal to
lastStart. If lastStart is unspecified, it is taken to be the
index of the last grapheme of the string or the last element of the list.
Function syntax:
int length(blobx) int length(listL) int length(stringx)
Returns the length of its argument for argument types blob, list, and string. For a blob, returns a count of bytes. For a non-null list, returns the number of elements in the list, including any null elements. For a string, returns the number of graphemes in the string, for ASCII and all supported Unicode character sets.
If the argument is a null blob, null list, or null string, the returned value is
null.
Function syntax:
tuple new_tuple(tuplet, [updated_value1ASfield1_name[,updated_value2ASfield2_name,[ ... [,updated_valueNASfieldN_name]]]])
The required first argument t is a tuple
value, and updated__valueN is a value
specified using an AS expression that names a field in tuple t.
Use the new_tuple function to modify a small number
of fields in a tuple, including nested fields, and leave the other fields
unchanged.
For example, the following expression:
new_tuple(tuple(1.0 AS x, 2.0 AS y, 3.0 AS z), 5 AS
x)
both defines a tuple and immediately replaces the value of the x field. The resulting tuple evaluates to the following:
tuple(5.0 AS x, 2.0 AS y, 3.0 AS z)
In this case, the value of the x field is modified
using 5 AS x as the updating argument.
Function syntax:
bool notnull(value)
Returns true if the argument is not a null value. Accepts all data types.
This function always returns the opposite of isnull().
Function syntax:
tuple unzip(list(tuple) listOfTuples)
Takes a list of tuples and returns a tuple of lists.
An example will clarify the function. Let's say your application extracts a list of symbol and price tuples from a stream. One list might contain the following:
[ {prices: 102.51, symbols: AAPL},
{prices: 96.52, symbols: IBM},
{prices: 36.33, symbols: HPQ} ]
If you run this list through unzip(), the returned
value is a single tuple whose fields are lists, as follows:
{
prices: [102.51, 96.82, 36.33],
symbols: [AAPL, IBM, HPQ]
}
Function syntax:
list(tuple) zip(tuple(list(T1), ..., list(Tn)) tupleOfLists)
In the spirit of the zip function in Python, zip()
takes a tuple of lists, collates it, and returns it as a list of tuples, where the
nth tuple contains the nth element from each of the argument lists. The
element types of the lists in the argument tuple can be of any type, and do not
need to be the same.
An example will clarify the function. Let's say you have a stream whose schema
accepts lists of securities price data: { prices
list(double), symbols list(string) }. One tuple might contain the
following:
{
prices: [102.51, 96.82, 36.33],
symbols: [AAPL, IBM, HPQ]
}
If you run this tuple through zip(), the returned
value is a list of tuples, as follows:
[ {prices: 102.51, symbols: AAPL},
{prices: 96.52, symbols: IBM},
{prices: 36.33, symbols: HPQ} ]
Aggregate functions are used on sets of data to return a single result. Aggregate functions evaluate columns of data from windows or tables. In EventFlow applications, aggregate functions can only be used:
-
In aggregate expressions in Aggregate operators
-
In output expressions in Query Operators that perform read operations
In StreamSQL applications, aggregate functions can be used in SELECT statements related to aggregate or query read operations.
Aggregate expressions can contain a single aggregate function, or they can combine multiple aggregate functions together with constants and fields from an input stream or an associated Query Table.
Aggregate functions are organized into the following categories:
This category includes the following functions:
Function syntax:
list(T) aggregatelist(T e)
Returns a list containing one element for each expression, e, in the Aggregate's window. The list element
type, T, of the returned list is the same as the data type of e.
Function syntax:
list(T) concat(list(T) L)
Returns a list containing the elements of each list, L, in the Aggregate's window. The list element
type, T, of the returned list is the same as the element type of the argument list.
See also the concat() function for list elements.
This category includes the following functions:
Function syntax:
T callcpp(stringclass, [arg1, ...,argn])
As an alternative to using callcpp(), you can define
an alias for your custom functions. See Using Function Aliases.
Use an alias or callcpp() to run a custom C++
aggregate function directly from a StreamBase operator that uses an
expression. The return type, T, of callcpp() is the
same as the return type of the called function. Custom C++ aggregate functions are
functions you build with the StreamBase C++ Client API, specifically
those that inherit from the sb::PluginAggregate
class. You can use the aggregate form of callcpp() in
any aggregate expression. (To use callcpp() in simple
expressions, refer to the simple callcpp().)
For callcpp() to locate the function being called,
you must specify the location of the containing DLL or .so file, and must register the function with a custom-function element in the StreamBase
Server configuration file. You specify the function as an aggregate function
by means of the type attribute of custom-function. See C++ Function Overview for
details.
To learn about coding custom C++ functions, refer to Creating Custom C++ Functions in the API Guide.
Function syntax:
T calljava(stringclass, [arg1, ...,argn])
As an alternative to using calljava(), you can define
an alias for your custom functions. See Using Function Aliases.
Use an alias or calljava() to run a custom Java
aggregate function directly from a StreamBase operator that uses an
expression. The return type, T, of calljava() is the
same as the return type of the called function. Custom Java aggregate functions are
functions you build with the StreamBase Java Client API, specifically
those that extend the AggregateWindow class. You can
use the aggregate form of calljava() in any aggregate
expression. (To use calljava() in simple expressions,
refer to the simple calljava().)
The calljava() function distinguishes simple from
aggregate functions by the number of arguments. For custom simple functions, you
must specify both class and method names. For custom aggregate functions, you
specify only the class name, because these functions must inherit from AggregateWindow, which has an init() method that is automatically run.
To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide. For information on the classpath requirements for custom Java functions, see Java Function Overview.
See Return
Types and Argument Types for a discussion of using Java primitives or Java
objects in the function you write to be called with calljava().
This category includes the following functions:
Function syntax:
double alpha(doubleindex, doubleprice, doubledividend)
Returns a double that may indicate the part of a stock's movement that is independent of the index's movement.
Examples of stocks increasing in alpha could be those with takeover rumors, under strong syndicate manipulation, or having strong expectations of good results; that is, factors that make them increasingly move independently of the index.
The beta of a stock is defined as the slope of a regression line in a
scatter graph of paired data points representing percentage changes of an index
(return of an index) and the corresponding change (return of stock) in the price of
a stock. The StreamBase beta() function
also includes the stock dividend in the calculation of the stock return.
The alpha is the point where this regression line cuts the Y axis. To reiterate, a stock's beta can be described as that part of a stock's movement that is influenced by the index, while a stock's alpha can be regarded as that part of a stock's movement that is independent of the index's movement.
The alpha function's arguments are:
-
index: the end-of-period market index (double) -
price: the end-of-period stock price (double) -
dividend: the stock dividend of the period (double)
The stock price must have already figured in any stock split (or reverse split). The alpha is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, the yield (return) should be normalized first and then the intercept function used to calculate alpha.
Function syntax:
double avg(doublefield) int avg(intfield) long avg(longfield) timestamp avg(timestampfield)
Returns the average value of a numeric field, computed for the argument field for all tuples in the Aggregate's window. The
field can be an double, int, long, or timestamp. For a returned average of
timestamp values to make sense, the aggregate field should contain all interval
timestamps or all absolute timestamps, but not both.
See also the avg() function for lists.
Function syntax:
double beta(doubleindex, doubleprice, doubledividend)
Returns a double that may indicate the tendency of a
security's returns to respond to swings in the market. A beta of 1 indicates that the security's price will move with the market. A
beta of less than 1 means that the security will be less volatile than the market.
A beta value that is greater than 1 indicates that the security's price will be
more volatile than the market.
For related information, see the description of the alpha function.
Beta is a measure of a security's or portfolio's volatility, or systematic risk, in
comparison to the market as a whole. For example, if a stock's beta is 1.2, it is theoretically 20% more volatile than the market.
The function's arguments are:
-
index: the end-of-period market index (double) -
price: the end-of-period stock price (double) -
dividend: the stock dividend of the period (double)
The stock price must have already figured in any stock split (or reverse split). The beta is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, the yield should be normalized first and then use the slope function to calculate beta.
Function syntax:
double correlation_coefficient(doubleprice, doubleindex) double correlation_coefficient(intprice, intindex)
Returns the correlation coefficient for two fields, price and index, for all tuples in the Aggregate's window.
This function takes two input arguments (both can be an int or a double) and
returns a double. The correlations coefficient is a measure that determines the
degree to which two variable's movements are associated. The correlation
coefficient will vary from -1.0 to 1.0. The value -1.0 indicates perfect
negative correlation, and 1.0 indicates perfect
positive correlation. 0 means no correlation.
Notes
-
The correlation_coefficient of 0 items is null.
-
The correlation_coefficient of 1 item is 1.0.
-
The correlation_coefficient of N identical items is 1.0.
Function syntax:
double correlation_coefficientp(doubleprice, doubleindex) double correlation_coefficientp(intprice, intindex)
Returns the correlation coefficient for two fields, price and index, for all tuples in the Aggregate's window.
The correlation_coefficientp function is similar to
the correlation_coefficient function, but use
correlation_coefficientp when the data provided is
the entire population, while the correlation_coefficient function is used for a random sample. The
correlation_coefficientp function is calculated using
the biased (or n) method. The correlation_coefficient function is calculated using the
unbiased (or n-1) method.
This function takes two input arguments (both can be an int or a double) and
returns a double. The correlations coefficient is a measure that determines the
degree to which two variable's movements are associated. The correlation
coefficient will vary from -1.0 to 1.0. The value -1.0 indicates perfect
negative correlation, and 1.0 indicates perfect
positive correlation. 0 means no correlation.
Notes
-
The correlation_coefficientp of 0 items is null.
-
The correlation_coefficientp of 1 item is 1.0.
-
The correlation_coefficientp of N identical items is 1.0.
Function syntax:
int count([T expr])
Returns the number of tuples in the Aggregate's window. If this function is called with an argument, then tuples for which the argument evaluates to null do not contribute to the count. The argument can be of any data type, T, but its actual value is ignored, other than the fact that it is non-null. If no argument is specified, then all tuples are included in the count.
To clarify the operation of count(), consider the
following sequence of values for x:
x {1,2,3,4,5,6,7,8,9,10}
Calling count(x) returns 10. There
are ten items in the list.
Calling count(x > 5) also
returns 10. This is because it is counting the sequence {f,f,f,f,f,t,t,t,t,t} (five
false, five true). All elements in this sequence are non-null, so all are counted.
Calling returns 5, because it is counting the
sequence {null, null, null, null, null, 0, 0, 0, 0, 0} (five nulls, five zeros).
The five non-null elements are counted. Any non-null value other than 0 would also
count in this expression.
count(if x > 5
then 0 else null)
The following example clarifies the count()
function's behavior with query tables.
Let's say we have a table named flattop with the following schema and contents:
| Field name | Data Type | Contents of Three Rows |
|---|---|---|
| alpha | int | 2, 4, 6 |
| beta | long | 3.0, 5.0 |
The expression select count(flattop.*) from in,
tablet ... returns
3, 2
which is the number of non-null values in the alpha
field, plus the number of non-null values in the beta
field.
If you want a count of the number of items in the window, use an expression like
the following: select count() as c from in, flattop
...
Function syntax:
int count_distinct(T arg1[, T arg2 ...]])
Returns the number of tuples in the Aggregate's window with specified values for the parameters. This function must be called with at least one argument of any data type. Multiple arguments can be of different types. Use commas to separate subsequent arguments. For the purpose of this function, null values of a field are considered equivalent to each other.
See also the count_distinct_elements() function for lists.
Function syntax:
double covariance(doubleprice, doubleindex) double covariance(intprice, intindex) double covariance(longprice, longindex)
Returns the covariance for two fields, price and index, for all tuples in the Aggregate's window.
This function takes two input arguments (both can be a double, int, or long) and
returns a double. In financial applications, covariance can be used to measure the
degree to which returns on two risky assets move in tandem. A positive covariance
means that asset returns move together. A negative covariance means returns vary
inversely.
In the arguments, the price field is a
stock price, and the index is the
industry segment index.
One method of calculating covariance is to look at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.
For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.
Function syntax:
double covariancep(doubleprice, doubleindex) double covariancep(intprice, intindex) double covariancep(longprice, longindex)
Returns the covariance for two fields, price and index, for all tuples in the Aggregate's window.
The covariancep function is similar to the
covariance function, but use covariancep when the data provided is the entire population,
while the covariance function is used for a random
sample. The covariancep function is calculated using
the biased (or n) method. The covariance
function is calculated using the unbiased
(or n-1) method.
This function takes two input arguments (both can be a double, int, or long) and returns a double. In financial applications, covariance can be used to measure the degree to which returns on two risky assets move in tandem. A positive covariance means that asset returns move together. A negative covariance means returns vary inversely.
In the arguments, the price field is a
stock price, and the index is the
industry segment index.
One method of calculating covariance is by looking at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.
For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.
Function syntax:
double exp_moving_avg(doublevalue, intcount, [doubleweight]) double exp_moving_avg(intvalue, intcount, [doubleweight]) double exp_moving_avg(longvalue, intcount, [doubleweight])
Returns a weighted average where a percentage of each successive value is used with a percentage of the existing average to construct the new average. The effect of any one value on the average diminishes exponentially as new values are added.
The function takes these arguments:
-
value -
The value (double, int, or long) to be averaged.
-
count -
The number (int) of periods over which a simple average is used prior to starting the EMA.
-
weight -
An optional weighting value (a double between 0 and 1) that will be applied to both the current period's value (CPV) and prior period's average (PPA) when the Exponential Moving Average (EMA) is calculated.
If the weight is not between 0 and 1 or is not specified, it is calculated from the following formula:
weight = 2 / (count + 1)
This function returns a double. It is most commonly used with an infinite aggregate
window size (although it does not have to be). When configuring an Aggregate
operator for an infinite window, do not configure the Advance and Size fields in
StreamBase Studio, which results in infinite window size. Instead, set
the Emit field to 1 so
that the aggregate emits for every tuple.
The
function is calculated using the following formula:
exp_moving_avg
EMA= (CPV*weight) + (PPA* (1 -weight))
-
EMA: exponential moving average so far -
PPA: exponential moving average after the previous tuple -
CPV: value of current tuple
Initially there is no PPA. A simple moving average (SMA) is computed instead over as many tuples as specified by the second parameter. When the SMA is computed, a Null value is emitted for each input tuple until the Aggregate operator receives enough tuples to compute the SMA. Following this, the EMA is calculated.
Function syntax:
double intercept(doublex, doubley) double intercept(intx, inty)
For the data set of fields x and
y for all tuples in the Aggregate's
window, a best-fit linear regression line is calculated, then the function returns
the value where the line intersects the y axis. The method used to find the line
that best fits a group of points is called least squares
(or linear least squares). This function takes two input
arguments (both can be an int or a double) and returns a double.
Function syntax:
T max(f)
The argument f represents a field in the
Aggregate's window. It can also be any expression that depends on one or more
fields in the Aggregate's window.
This function returns the maximum non-null value using the greater-than relational
operator that f takes on in the
Aggregate's window. The data type, T, of the returned value is the same as the
argument.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the simple version of max() and the maxelement() function for lists.
Function syntax:
double median(doublex) double median(intx) double median(longx)
Returns the median value of x for this
Aggregate's window. If the window has an even number of elements, returns the
average of the middle two.
See also the median() function for lists.
Function syntax:
T min(f)
The argument f represents a field in the
Aggregate's window. It can also be any expression that depends on one or more
fields in the Aggregate's window.
This function returns the minimum non-null value using the less-than relational
operator that f takes on in the
Aggregate's window. The data type, T, of the returned value is the same as the
argument.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
See also the simple version of min() and the minelement() function for lists.
Function syntax:
double product(doublef) int product(intf) long product(longf)
Returns the multiplication product computed for field f for all tuples in the Aggregate's window.
Supports data types double, int, and long.
See also the product() function for lists.
Function syntax:
double slope(doublex, doubley) double slope(intx, inty)
For the data set of fields x and
y for all tuples in the Aggregate's
window, a best-fit linear regression line is calculated, and then the function
returns the line's slope. The method that is used to find the line that best fits a
group of points is called least squares (or linear least squares). The slope function takes two input arguments (both can
be an int or a double) and returns a double.
Function syntax:
double stdev(doublef) double stdev(intf) double stdev(longf) timestamp stdev(timestampf)
Returns the standard deviation for field f for all tuples in the Aggregate's window. The
function takes one input argument. If the input argument is an int, double, or
long, it returns a double. If the input argument is a timestamp, it returns an
interval timestamp. For timestamp values, the aggregate field should contain all
interval timestamps or all absolute timestamps, but not both.
Standard deviation is a measure of the dispersion of a set of data from its mean. The more spread apart the data is, the higher the deviation. For example, in financial applications, standard deviation could be applied to the annual rate of return of an investment to measure the investment's volatility (risk). A volatile stock would have a high standard deviation. In mutual funds, the standard deviation indicates how much the return on the fund is deviating from the expected normal returns.
See also the stdev() function for lists.
Function syntax:
double stdevp(doublef) double stdevp(intf) double stdevp(longf) timestamp stdevp(timestampf)
Returns the standard deviation for field f for all tuples in the Aggregate's window. The
function takes one input argument. If the input argument is an int, double, or
long, it returns a double. If the input argument is a timestamp, it returns an
interval timestamp. For timestamp values, the aggregate field should contain all
interval timestamps or all absolute timestamps, but not both.
The stdevp() function is similar to the stdev() function: with stdevp() the
data provided is the entire population, while with stdev(), the data provided is treated as a random sample. The
stdevp() function is calculated using the biased (or n) method. The stdev() function is calculated using the unbiased (or n-1) method.
See also the stdevp() function for lists.
Function syntax:
double sum(doublef) int sum(intf) long sum(longf) timestamp sum(timestampf)
Returns the sum of field f computed for
all tuples in the Aggregate's window. The field can be an double, int, long, or
timestamp.
When the field type is timestamp, summing members of the field follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum an aggregate of two or more absolute timestamps. However, you can sum an aggregate field composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.
See also the sum() function for lists.
Function syntax:
double variance(doublef) double variance(intf) double variance(longf)
Returns the variance for field f for all
tuples in the Aggregate's window. The function takes one input argument (an int,
double, or long) and returns a double.
Variance is a measure of the dispersion of a set of data points around their mean value. It is a mathematical expectation of the average squared deviations from the mean. Variance measures the variability (volatility) from an average. Volatility is a measure of risk. So for example, this statistic can help determine the risk an investor might take on when purchasing a specific security.
See also the variance() function for lists.
Function syntax:
double variancep(doublef) double variancep(intf) double variancep(longf)
Returns the variance for field f for all
tuples in the Aggregate's window. The function takes one input argument (an int,
double, or long) and returns a double.
The variancep() function is similar to the
variance() function: with variancep() the data provided is the entire population, while
with variance(), the data provided is treated as a
random sample. The variancep() function is calculated
using the biased (or n)
method. The variance() function is calculated using
the unbiased (or n-1) method.
See also the variancep() function for lists.
Function syntax:
double vwap(double price, double volume)
Returns the volume-weighted average-price value from (typically) the price and volume values, for all tuples in the Aggregate's
window. Both arguments are doubles.
Function syntax:
T1 withmax(Tcompare_expr, T1result_expr)
Returns the result_expr for this
Aggregate's window that has the maximum corresponding compare_expr using the greater-than relational
operator. The data type, T1, of the returned value is the same as the result_expr. The data type of the compare_expr can be any data type, not necessarily
the same as the result_expr's type.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
Function syntax:
T1 withmin(Tcompare_expr, T1result_expr)
Returns the result_expr for this
Aggregate's window that has the minimum corresponding compare_expr using the less-than relational
operator. The data type, T1, of the returned value is the same as the result_expr. The data type of the compare_expr can be any data type, not necessarily
the same as the result_expr's type.
StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.
This category includes the following functions:
Windowing functions work on columns of data from either aggregate windows or from read operations on tables.
Function syntax:
double closeval([stringdimension]) int closeval([stringdimension]) long closeval([stringdimension]) timestamp closeval([stringdimension])
The closeval() function behaves the same as
openval() except that it returns the upper limit of the
specified dimension when the window is closed.
The value returned by closeval() might differ from
any of the actual tuple values included in the window. For example, consider an
Aggregate operator with a dimension CountDim, where
the windows created by that dimension include a window with a range of 5 to 10. If
the actual tuple values in the window are 6, 7, and 9, then closeval(CountDim) returns 10 for that window, not 9.
If the dimension name is omitted and the aggregate has only one dimension, then
closeval() uses that dimension. If the dimension name
is omitted and the aggregate has more than one dimension, the result is a typecheck
error.
Function syntax:
T firstval(T f)
Returns the first value for field f in
the Aggregate's window or in a table column. Accepts all data types and returns the
same type as its argument.
Function syntax:
T lastval(T f)
Returns the last value for field f in the
Aggregate's window or in a table column. Accepts all data types and returns the
same type as its argument.
Function syntax:
double openval([stringdimension]) int openval([stringdimension]) long openval([stringdimension]) timestamp openval([stringdimension])
Can only be used in an Aggregate operator, not in a Query Operator. Returns the lower limit of the specified dimension for the current window. The dimension name is passed as a quoted string, and must match the name of a dimension in the Aggregate operator. This function works even if the dimension specified is not the one that actually opened the current window. In that case, the value returned is the value of the dimension at the time the window was created.
The value returned by openval might differ from any
of the actual tuple values included in the window. For example, consider an
Aggregate operator with a dimension CountDim, where
the windows created by that dimension include a window with a range of 5 to 10. If
the actual tuple values in the window are 6, 7, and 9, then openval(CountDim) returns 5 for that window, not 6.
If the dimension name is omitted and the aggregate has only one dimension, then
openval() uses that dimension. If the dimension name
is omitted and the aggregate has more than one dimension, the result is a typecheck
error.
