Class Stream
- java.lang.Object
-
- org.apache.storm.trident.Stream
-
- All Implemented Interfaces:
`ResourceDeclarer<Stream>`, `IAggregatableStream`
public class Stream extends Object implements IAggregatableStream, ResourceDeclarer<Stream>
A Stream represents the core data model in Trident, and can be thought of as a "stream" of tuples that are processed as a series of small batches. A stream is partitioned across the nodes in the cluster, and operations are applied to a stream in parallel across each partition.
There are five types of operations that can be performed on streams in Trident:
1. **Partition-Local Operations** - Operations that are applied locally to each partition and do not involve network transfer.
2. **Repartitioning Operations** - Operations that change how tuples are partitioned across tasks (thus causing network transfer), but do not change the content of the stream.
3. **Aggregation Operations** - Operations that *may* repartition a stream (thus causing network transfer).
4. **Grouping Operations** - Operations that may repartition a stream on specific fields and group together tuples whose field values are equal.
5. **Merge and Join Operations** - Operations that combine different streams together.
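The difference between the first two operation types can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: `PartitionModel`, `mapLocally`, and `repartition` are hypothetical names, partitions are modeled as in-memory lists, and the modulo placement rule is an illustration rather than how Trident routes tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Plain-Java model of a partitioned stream (hypothetical, not Trident code). */
final class PartitionModel {

    /** Partition-local: transform each value where it already lives; nothing moves. */
    static List<List<Integer>> mapLocally(List<List<Integer>> partitions, UnaryOperator<Integer> fn) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> mapped = new ArrayList<>();
            for (Integer value : partition) {
                mapped.add(fn.apply(value));
            }
            out.add(mapped);
        }
        return out;
    }

    /** Repartitioning: same values, new placement (value modulo partition count is an assumption). */
    static List<List<Integer>> repartition(List<List<Integer>> partitions, int targetCount) {
        List<List<Integer>> out = new ArrayList<>();
        for (int i = 0; i < targetCount; i++) {
            out.add(new ArrayList<>());
        }
        for (List<Integer> partition : partitions) {
            for (Integer value : partition) {
                out.get(Math.floorMod(value, targetCount)).add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(mapLocally(partitions, v -> v * 10)); // prints [[10, 20], [30, 40]]
        System.out.println(repartition(partitions, 2));          // prints [[2, 4], [1, 3]]
    }
}
```

In the model, `mapLocally` changes content but not placement, while `repartition` changes placement but not content, which mirrors the distinction drawn in the list above.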
-
-
Constructor Summary
| Modifier | Constructor | Description |
| --- | --- | --- |
| `protected` | `Stream(TridentTopology topology, String name, Node node)` | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| `Stream` | `addSharedMemory(SharedMemory request)` | Add in request for shared memory that this component will use. |
| `Stream` | `aggregate(Aggregator agg, Fields functionFields)` | |
| `Stream` | `aggregate(CombinerAggregator agg, Fields functionFields)` | |
| `Stream` | `aggregate(ReducerAggregator agg, Fields functionFields)` | |
| `Stream` | `aggregate(Fields inputFields, Aggregator agg, Fields functionFields)` | |
| `Stream` | `aggregate(Fields inputFields, CombinerAggregator agg, Fields functionFields)` | |
| `Stream` | `aggregate(Fields inputFields, ReducerAggregator agg, Fields functionFields)` | |
| `Stream` | `applyAssembly(Assembly assembly)` | Applies an `Assembly` to this `Stream`. |
| `Stream` | `batchGlobal()` | Repartitioning Operation. |
| `Stream` | `broadcast()` | Repartitioning Operation. |
| `ChainedAggregatorDeclarer` | `chainedAgg()` | |
| `Stream` | `each(Function function, Fields functionFields)` | |
| `Stream` | `each(Fields inputFields, Filter filter)` | |
| `Stream` | `each(Fields inputFields, Function function, Fields functionFields)` | |
| `Stream` | `filter(Filter filter)` | Returns a stream consisting of the elements of this stream that match the given filter. |
| `Stream` | `filter(Fields inputFields, Filter filter)` | Returns a stream consisting of the elements of this stream that match the given filter. |
| `Stream` | `flatMap(FlatMapFunction function)` | Returns a stream consisting of the results of replacing each value of this stream with the contents produced by applying the provided mapping function to each value. |
| `Stream` | `flatMap(FlatMapFunction function, Fields outputFields)` | Returns a stream consisting of the results of replacing each value of this stream with the contents produced by applying the provided mapping function to each value. |
| `String` | `getName()` | Returns the label applied to the stream. |
| `Fields` | `getOutputFields()` | |
| `Stream` | `global()` | Repartitioning Operation. |
| `GroupedStream` | `groupBy(Fields fields)` | Grouping Operation. |
| `Stream` | `identityPartition()` | Repartitioning Operation. |
| `Stream` | `localOrShuffle()` | Repartitioning Operation. |
| `Stream` | `map(MapFunction function)` | Returns a stream consisting of the result of applying the given mapping function to the values of this stream. |
| `Stream` | `map(MapFunction function, Fields outputFields)` | Returns a stream consisting of the result of applying the given mapping function to the values of this stream. |
| `Stream` | `max(Comparator<TridentTuple> comparator)` | This aggregator operation computes the maximum of tuples in a stream by using the given `comparator` with `TridentTuple`s. |
| `Stream` | `maxBy(String inputFieldName)` | This aggregator operation computes the maximum of tuples by the given `inputFieldName` and it is assumed that its value is an instance of `Comparable`. |
| `<T> Stream` | `maxBy(String inputFieldName, Comparator<T> comparator)` | This aggregator operation computes the maximum of tuples by the given `inputFieldName` in a stream by using the given `comparator`. |
| `Stream` | `min(Comparator<TridentTuple> comparator)` | This aggregator operation computes the minimum of tuples in a stream by using the given `comparator` with `TridentTuple`s. |
| `Stream` | `minBy(String inputFieldName)` | This aggregator operation computes the minimum of tuples by the given `inputFieldName` and it is assumed that its value is an instance of `Comparable`. |
| `<T> Stream` | `minBy(String inputFieldName, Comparator<T> comparator)` | This aggregator operation computes the minimum of tuples by the given `inputFieldName` in a stream by using the given `comparator`. |
| `Stream` | `name(String name)` | Applies a label to the stream. |
| `Stream` | `parallelismHint(int hint)` | Applies a parallelism hint to a stream. |
| `Stream` | `partition(Grouping grouping)` | Repartitioning Operation. |
| `Stream` | `partition(CustomStreamGrouping partitioner)` | Repartitioning Operation. |
| `Stream` | `partitionAggregate(Aggregator agg, Fields functionFields)` | |
| `Stream` | `partitionAggregate(CombinerAggregator agg, Fields functionFields)` | |
| `Stream` | `partitionAggregate(ReducerAggregator agg, Fields functionFields)` | |
| `Stream` | `partitionAggregate(Fields inputFields, Aggregator agg, Fields functionFields)` | |
| `Stream` | `partitionAggregate(Fields inputFields, CombinerAggregator agg, Fields functionFields)` | |
| `Stream` | `partitionAggregate(Fields inputFields, ReducerAggregator agg, Fields functionFields)` | |
| `Stream` | `partitionBy(Fields fields)` | Repartitioning Operation. |
| `TridentState` | `partitionPersist(StateFactory stateFactory, StateUpdater updater)` | |
| `TridentState` | `partitionPersist(StateFactory stateFactory, StateUpdater updater, Fields functionFields)` | |
| `TridentState` | `partitionPersist(StateFactory stateFactory, Fields inputFields, StateUpdater updater)` | |
| `TridentState` | `partitionPersist(StateFactory stateFactory, Fields inputFields, StateUpdater updater, Fields functionFields)` | |
| `TridentState` | `partitionPersist(StateSpec stateSpec, StateUpdater updater)` | |
| `TridentState` | `partitionPersist(StateSpec stateSpec, StateUpdater updater, Fields functionFields)` | |
| `TridentState` | `partitionPersist(StateSpec stateSpec, Fields inputFields, StateUpdater updater)` | |
| `TridentState` | `partitionPersist(StateSpec stateSpec, Fields inputFields, StateUpdater updater, Fields functionFields)` | |
| `Stream` | `peek(Consumer action)` | Returns a stream consisting of the trident tuples of this stream, additionally performing the provided action on each trident tuple as they are consumed from the resulting stream. |
| `TridentState` | `persistentAggregate(StateFactory stateFactory, CombinerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateFactory stateFactory, ReducerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateFactory stateFactory, Fields inputFields, CombinerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateFactory stateFactory, Fields inputFields, ReducerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateSpec spec, CombinerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateSpec spec, ReducerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateSpec spec, Fields inputFields, CombinerAggregator agg, Fields functionFields)` | |
| `TridentState` | `persistentAggregate(StateSpec spec, Fields inputFields, ReducerAggregator agg, Fields functionFields)` | |
| `Stream` | `project(Fields keepFields)` | Filters out fields from a stream, resulting in a Stream containing only the fields specified by `keepFields`. |
| `Stream` | `setCPULoad(Number load)` | Sets the CPU Load resource for the current operation. |
| `Stream` | `setMemoryLoad(Number onHeap)` | Sets the Memory Load resources for the current operation. |
| `Stream` | `setMemoryLoad(Number onHeap, Number offHeap)` | Sets the Memory Load resources for the current operation. |
| `Stream` | `shuffle()` | Repartitioning Operation. |
| `Stream` | `slidingWindow(int windowCount, int slideCount, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of tuples which are aggregated results of a sliding window with every `windowCount` of tuples and slides the window after `slideCount`. |
| `Stream` | `slidingWindow(BaseWindowedBolt.Duration windowDuration, BaseWindowedBolt.Duration slidingInterval, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of tuples which are aggregated results of a window which slides at duration of `slidingInterval` and completes a window at `windowDuration`. |
| `Stream` | `stateQuery(TridentState state, QueryFunction function, Fields functionFields)` | |
| `Stream` | `stateQuery(TridentState state, Fields inputFields, QueryFunction function, Fields functionFields)` | |
| `Stream` | `toStream()` | |
| `Stream` | `tumblingWindow(int windowCount, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of tuples which are aggregated results of a tumbling window with every `windowCount` of tuples. |
| `Stream` | `tumblingWindow(BaseWindowedBolt.Duration windowDuration, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of tuples which are aggregated results of a window that tumbles at duration of `windowDuration`. |
| `Stream` | `window(WindowConfig windowConfig, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of aggregated results based on the given window configuration. |
| `Stream` | `window(WindowConfig windowConfig, Fields inputFields, Aggregator aggregator, Fields functionFields)` | Returns a stream of aggregated results based on the given window configuration which uses an in-memory windowing tuple store. |
-
-
-
Constructor Detail
-
Stream
protected Stream(TridentTopology topology, String name, Node node)
-
-
Method Detail
-
name
public Stream name(String name)
Applies a label to the stream. Naming a stream will append the label to the name of the bolt(s) created by Trident and will be visible in the Storm UI.
- Parameters:
`name` - The label to apply to the stream
-
parallelismHint
public Stream parallelismHint(int hint)
Applies a parallelism hint to a stream.
-
setCPULoad
public Stream setCPULoad(Number load)
Sets the CPU Load resource for the current operation.
- Specified by:
`setCPULoad` in interface `ResourceDeclarer<Stream>`
- Parameters:
`load` - the amount of CPU
- Returns:
this for chaining
-
setMemoryLoad
public Stream setMemoryLoad(Number onHeap)
Sets the Memory Load resources for the current operation. `offHeap` is left at its default.
- Specified by:
`setMemoryLoad` in interface `ResourceDeclarer<Stream>`
- Parameters:
`onHeap` - the amount of on heap memory
- Returns:
this for chaining
-
setMemoryLoad
public Stream setMemoryLoad(Number onHeap, Number offHeap)
Sets the Memory Load resources for the current operation.
- Specified by:
`setMemoryLoad` in interface `ResourceDeclarer<Stream>`
- Parameters:
`onHeap` - the amount of on heap memory
`offHeap` - the amount of off heap memory
- Returns:
this for chaining
-
addSharedMemory
public Stream addSharedMemory(SharedMemory request)
Description copied from interface: `ResourceDeclarer`
Adds a request for shared memory that this component will use. See `SharedOnHeap`, `SharedOffHeapWithinNode`, and `SharedOffHeapWithinWorker` for convenient ways to create shared memory requests.
- Specified by:
`addSharedMemory` in interface `ResourceDeclarer<Stream>`
- Parameters:
`request` - the shared memory request for this component
- Returns:
this for chaining
-
project
public Stream project(Fields keepFields)
Filters out fields from a stream, resulting in a Stream containing only the fields specified by `keepFields`.
For example, if you had a Stream `mystream` containing the fields `["a", "b", "c", "d"]`, calling
```java
mystream.project(new Fields("b", "d"))
```
would produce a stream containing only the fields `["b", "d"]`.
- Parameters:
`keepFields` - The fields in the Stream to keep
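The effect of a projection on a single tuple can be sketched in plain Java. This is a minimal illustration, not Trident's implementation: `ProjectSketch` and its `project` helper are hypothetical, a tuple is modeled as parallel lists of field names and values, and emitting values in `keepFields` order is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of field projection on one tuple (hypothetical, not Trident code). */
final class ProjectSketch {

    /** Returns the values of keepFields, looked up by position in the tuple's field list. */
    static List<Object> project(List<String> fields, List<?> values, List<String> keepFields) {
        List<Object> kept = new ArrayList<>();
        for (String name : keepFields) {
            int index = fields.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown field: " + name);
            }
            kept.add(values.get(index));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("a", "b", "c", "d");
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(project(fields, values, Arrays.asList("b", "d"))); // prints [2, 4]
    }
}
```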
-
groupBy
public GroupedStream groupBy(Fields fields)
## Grouping Operation.
-
partition
public Stream partition(CustomStreamGrouping partitioner)
## Repartitioning Operation.
-
partition
public Stream partition(Grouping grouping)
## Repartitioning Operation.
This method takes in a custom partitioning function that implements `CustomStreamGrouping`.
-
shuffle
public Stream shuffle()
## Repartitioning Operation.
Uses a random round-robin algorithm to evenly redistribute tuples across all target partitions.
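One way to read "random round robin" is a randomly ordered list of targets that is then cycled through in turn. The sketch below shows that reading in plain Java. `RandomRoundRobin` is a hypothetical class, and the shuffle-once-then-cycle strategy is an assumption, not a statement of how Storm implements the grouping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Plain-Java sketch of a random round-robin target chooser (hypothetical, not Storm code). */
final class RandomRoundRobin {
    private final List<Integer> order = new ArrayList<>();
    private int next = 0;

    RandomRoundRobin(int partitionCount, Random random) {
        for (int i = 0; i < partitionCount; i++) {
            order.add(i);
        }
        Collections.shuffle(order, random); // random visiting order, fixed afterwards
    }

    /** Returns the next target partition, cycling through the shuffled order. */
    int nextPartition() {
        int target = order.get(next);
        next = (next + 1) % order.size();
        return target;
    }

    public static void main(String[] args) {
        RandomRoundRobin chooser = new RandomRoundRobin(3, new Random());
        int[] counts = new int[3];
        for (int i = 0; i < 9; i++) {
            counts[chooser.nextPartition()]++;
        }
        System.out.println(Arrays.toString(counts)); // prints [3, 3, 3]
    }
}
```

Whatever the random order, nine tuples over three partitions land three per partition, which is the even redistribution the description refers to.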
-
localOrShuffle
public Stream localOrShuffle()
## Repartitioning Operation.
Uses a random round-robin algorithm to evenly redistribute tuples across all target partitions, with a preference for local tasks.
-
global
public Stream global()
## Repartitioning Operation.
All tuples are sent to the same partition. The same partition is chosen for all batches in the stream.
-
batchGlobal
public Stream batchGlobal()
## Repartitioning Operation.
All tuples in the batch are sent to the same partition. Different batches in the stream may go to different partitions.
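The contrast between `global()` and `batchGlobal()` comes down to whether the chosen partition depends on the batch. A minimal plain-Java sketch follows. `GlobalTargets` is hypothetical, and both the fixed index 0 and the modulo rule are assumptions made for illustration only.

```java
/** Plain-Java sketch of target selection for global() and batchGlobal() (hypothetical). */
final class GlobalTargets {

    /** global(): every batch goes to one fixed partition (index 0 is an assumption). */
    static int global(long batchId, int partitionCount) {
        return 0;
    }

    /** batchGlobal(): one partition per batch; the modulo rule is an assumption. */
    static int batchGlobal(long batchId, int partitionCount) {
        return (int) Math.floorMod(batchId, (long) partitionCount);
    }

    public static void main(String[] args) {
        for (long batchId = 5; batchId <= 6; batchId++) {
            System.out.println("batch " + batchId
                    + " global=" + global(batchId, 4)
                    + " batchGlobal=" + batchGlobal(batchId, 4));
        }
        // prints: batch 5 global=0 batchGlobal=1
        //         batch 6 global=0 batchGlobal=2
    }
}
```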
-
broadcast
public Stream broadcast()
## Repartitioning Operation.
Every tuple is replicated to all target partitions. This can be useful during DRPC, for example, if you need to do a stateQuery on every partition of data.
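Replication to every target partition can be pictured with a short plain-Java sketch. `BroadcastSketch` is a hypothetical helper that models partitions as in-memory lists. It is not Trident code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of broadcast semantics (hypothetical, not Trident code). */
final class BroadcastSketch {

    /** Gives every one of partitionCount partitions its own copy of all tuples. */
    static <T> List<List<T>> broadcast(List<T> tuples, int partitionCount) {
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>(tuples));
        }
        return partitions;
    }

    public static void main(String[] args) {
        System.out.println(broadcast(Arrays.asList("query"), 3)); // prints [[query], [query], [query]]
    }
}
```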
-
identityPartition
public Stream identityPartition()
## Repartitioning Operation.
-
applyAssembly
public Stream applyAssembly(Assembly assembly)
Applies an `Assembly` to this `Stream`.
- See Also:
`Assembly`
-
each
public Stream each(Fields inputFields, Function function, Fields functionFields)
- Specified by:
`each` in interface `IAggregatableStream`
-
partitionAggregate
public Stream partitionAggregate(Fields inputFields, Aggregator agg, Fields functionFields)
- Specified by:
`partitionAggregate` in interface `IAggregatableStream`
-
partitionAggregate
public Stream partitionAggregate(Aggregator agg, Fields functionFields)
-
partitionAggregate
public Stream partitionAggregate(CombinerAggregator agg, Fields functionFields)
-
partitionAggregate
public Stream partitionAggregate(Fields inputFields, CombinerAggregator agg, Fields functionFields)
-
partitionAggregate
public Stream partitionAggregate(ReducerAggregator agg, Fields functionFields)
-
partitionAggregate
public Stream partitionAggregate(Fields inputFields, ReducerAggregator agg, Fields functionFields)
-
stateQuery
public Stream stateQuery(TridentState state, Fields inputFields, QueryFunction function, Fields functionFields)
-
stateQuery
public Stream stateQuery(TridentState state, QueryFunction function, Fields functionFields)
-
partitionPersist
public TridentState partitionPersist(StateFactory stateFactory, Fields inputFields, StateUpdater updater, Fields functionFields)
-
partitionPersist
public TridentState partitionPersist(StateSpec stateSpec, Fields inputFields, StateUpdater updater, Fields functionFields)
-
partitionPersist
public TridentState partitionPersist(StateFactory stateFactory, Fields inputFields, StateUpdater updater)
-
partitionPersist
public TridentState partitionPersist(StateSpec stateSpec, Fields inputFields, StateUpdater updater)
-
partitionPersist
public TridentState partitionPersist(StateFactory stateFactory, StateUpdater updater, Fields functionFields)
-
partitionPersist
public TridentState partitionPersist(StateSpec stateSpec, StateUpdater updater, Fields functionFields)
-
partitionPersist
public TridentState partitionPersist(StateFactory stateFactory, StateUpdater updater)
-
partitionPersist
public TridentState partitionPersist(StateSpec stateSpec, StateUpdater updater)
-
filter
public Stream filter(Filter filter)
Returns a stream consisting of the elements of this stream that match the given filter.
- Parameters:
`filter` - the filter to apply to each trident tuple to determine if it should be included.
- Returns:
the new stream
-
filter
public Stream filter(Fields inputFields, Filter filter)
Returns a stream consisting of the elements of this stream that match the given filter.
- Parameters:
`inputFields` - the fields of the input trident tuple to be selected.
`filter` - the filter to apply to each trident tuple to determine if it should be included.
- Returns:
the new stream
-
map
public Stream map(MapFunction function)
Returns a stream consisting of the result of applying the given mapping function to the values of this stream.
- Parameters:
`function` - a mapping function to be applied to each value in this stream.
- Returns:
the new stream
-
map
public Stream map(MapFunction function, Fields outputFields)
Returns a stream consisting of the result of applying the given mapping function to the values of this stream. This method replaces old output fields with new output fields, achieving T -> V conversion.
- Parameters:
`function` - a mapping function to be applied to each value in this stream.
`outputFields` - new output fields
- Returns:
the new stream
-
flatMap
public Stream flatMap(FlatMapFunction function)
Returns a stream consisting of the results of replacing each value of this stream with the contents produced by applying the provided mapping function to each value. This has the effect of applying a one-to-many transformation to the values of the stream, and then flattening the resulting elements into a new stream.
- Parameters:
`function` - a mapping function to be applied to each value in this stream which produces new values.
- Returns:
the new stream
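The one-to-many-then-flatten behavior described above is the same idea as `flatMap` on `java.util.stream.Stream`. The sketch below uses the JDK only to show the shape of the transformation. `FlatMapSketch` and `splitWords` are hypothetical helpers, not part of Trident.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Plain-Java sketch of one-to-many mapping followed by flattening (hypothetical). */
final class FlatMapSketch {

    /** Applies fn to every value and concatenates the produced lists in order. */
    static <T, R> List<R> flatMap(List<T> values, Function<T, List<R>> fn) {
        return values.stream()
                .flatMap(value -> fn.apply(value).stream())
                .collect(Collectors.toList());
    }

    /** Example one-to-many function: one sentence in, its words out. */
    static List<String> splitWords(String sentence) {
        return Arrays.asList(sentence.split(" "));
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("the cow", "jumped");
        System.out.println(flatMap(sentences, FlatMapSketch::splitWords)); // prints [the, cow, jumped]
    }
}
```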
-
flatMap
public Stream flatMap(FlatMapFunction function, Fields outputFields)
Returns a stream consisting of the results of replacing each value of this stream with the contents produced by applying the provided mapping function to each value. This has the effect of applying a one-to-many transformation to the values of the stream, and then flattening the resulting elements into a new stream. This method replaces old output fields with new output fields, achieving T -> V conversion.
- Parameters:
`function` - a mapping function to be applied to each value in this stream which produces new values.
`outputFields` - new output fields
- Returns:
the new stream
-
peek
public Stream peek(Consumer action)
Returns a stream consisting of the trident tuples of this stream, additionally performing the provided action on each trident tuple as they are consumed from the resulting stream. This is mostly useful for debugging to see the tuples as they flow past a certain point in a pipeline.
- Parameters:
`action` - the action to perform on the trident tuple as they are consumed from the stream
- Returns:
the new stream
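The pass-through-with-side-effect behavior can be sketched in plain Java. `PeekSketch` is a hypothetical helper that models a stream as a list. It only illustrates that the action observes each tuple while the tuples themselves continue unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java sketch of peek semantics (hypothetical, not Trident code). */
final class PeekSketch {

    /** Runs action on every tuple and forwards the tuple unchanged. */
    static <T> List<T> peek(List<T> tuples, Consumer<? super T> action) {
        List<T> out = new ArrayList<>();
        for (T tuple : tuples) {
            action.accept(tuple); // side effect only, e.g. logging
            out.add(tuple);       // the tuple itself is not modified
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> result = peek(Arrays.asList("a", "b"), t -> System.out.println("saw " + t));
        System.out.println(result); // prints [a, b] after the two "saw" lines
    }
}
```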
-
chainedAgg
public ChainedAggregatorDeclarer chainedAgg()
-
minBy
public Stream minBy(String inputFieldName)
This aggregator operation computes the minimum of tuples by the given `inputFieldName`, whose value is assumed to be an instance of `Comparable`. If the value of a tuple's `inputFieldName` field is not an instance of `Comparable`, a `ClassCastException` is thrown.
- Parameters:
`inputFieldName` - input field name
- Returns:
the new stream with this operation.
-
minBy
public <T> Stream minBy(String inputFieldName, Comparator<T> comparator)
This aggregator operation computes the minimum of tuples by the given `inputFieldName` in a stream by using the given `comparator`. If the value of a tuple's `inputFieldName` field is not an instance of `T`, a `ClassCastException` is thrown.
- Parameters:
`inputFieldName` - input field name
`comparator` - comparator used to find the minimum of two tuple values of `inputFieldName`.
- Returns:
the new stream with this operation.
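The two `minBy` variants can be pictured with a plain-Java sketch in which a tuple is modeled as a `Map` from field name to value. `MinBySketch` is hypothetical and is not how Trident implements the aggregator. It only illustrates selecting the tuple whose field value is smallest, and where a cast failure would come from when a value is not `Comparable`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of minBy semantics (hypothetical, not Trident code). */
final class MinBySketch {

    /** Comparator variant: returns the first tuple whose field value is smallest. */
    @SuppressWarnings("unchecked")
    static <T> Map<String, Object> minBy(List<Map<String, Object>> tuples, String field,
                                         Comparator<T> comparator) {
        Map<String, Object> best = null;
        for (Map<String, Object> tuple : tuples) {
            if (best == null || comparator.compare((T) tuple.get(field), (T) best.get(field)) < 0) {
                best = tuple;
            }
        }
        return best;
    }

    /** Natural-order variant: comparing a value that is not Comparable throws ClassCastException. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> minBy(List<Map<String, Object>> tuples, String field) {
        Comparator<Object> natural = (a, b) -> ((Comparable<Object>) a).compareTo(b);
        return minBy(tuples, field, natural);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> tuples = Arrays.asList(
                Collections.<String, Object>singletonMap("price", 7),
                Collections.<String, Object>singletonMap("price", 3),
                Collections.<String, Object>singletonMap("price", 5));
        System.out.println(minBy(tuples, "price")); // prints {price=3}
        System.out.println(minBy(tuples, "price", Comparator.<Integer>reverseOrder())); // prints {price=7}
    }
}
```

Passing a reversed comparator makes the "minimum" the largest value, which shows that the comparator alone defines the ordering.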
-
min
public Stream min(Comparator<TridentTuple> comparator)
This aggregator operation computes the minimum of tuples in a stream by using the given `comparator` with `TridentTuple`s.
- Parameters:
`comparator` - comparator used to find the minimum of two tuple values.
- Returns:
the new stream with this operation.
-
maxBy
public Stream maxBy(String inputFieldName)
This aggregator operation computes the maximum of tuples by the given `inputFieldName`, whose value is assumed to be an instance of `Comparable`. If the value of a tuple's `inputFieldName` field is not an instance of `Comparable`, a `ClassCastException` is thrown.
- Parameters:
`inputFieldName` - input field name
- Returns:
the new stream with this operation.
-
maxBy
public <T> Stream maxBy(String inputFieldName, Comparator<T> comparator)
This aggregator operation computes the maximum of tuples by the given `inputFieldName` in a stream by using the given `comparator`. If the value of a tuple's `inputFieldName` field is not an instance of `T`, a `ClassCastException` is thrown.
- Parameters:
`inputFieldName` - input field name
`comparator` - comparator used to find the maximum of two tuple values of `inputFieldName`.
- Returns:
the new stream with this operation.
-
max
public Stream max(Comparator<TridentTuple> comparator)
This aggregator operation computes the maximum of tuples in a stream by using the given `comparator` with `TridentTuple`s.
- Parameters:
`comparator` - comparator used to find the maximum of two tuple values.
- Returns:
the new stream with this operation.
-
aggregate
public Stream aggregate(Aggregator agg, Fields functionFields)
-
aggregate
public Stream aggregate(Fields inputFields, Aggregator agg, Fields functionFields)
-
aggregate
public Stream aggregate(CombinerAggregator agg, Fields functionFields)
-
aggregate
public Stream aggregate(Fields inputFields, CombinerAggregator agg, Fields functionFields)
-
aggregate
public Stream aggregate(ReducerAggregator agg, Fields functionFields)
-
aggregate
public Stream aggregate(Fields inputFields, ReducerAggregator agg, Fields functionFields)
-
tumblingWindow
public Stream tumblingWindow(int windowCount, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of tuples which are the aggregated results of a tumbling window over every `windowCount` tuples.
- Parameters:
`windowCount` - represents number of tuples in the window
`windowStoreFactory` - intermediary tuple store for storing windowing tuples
`inputFields` - projected fields for aggregator
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
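A count-based tumbling window groups tuples into consecutive, non-overlapping blocks of `windowCount` and aggregates each block. The plain-Java sketch below illustrates that with a sum as the aggregate. `TumblingCountWindow` is hypothetical, and not emitting a trailing incomplete window is an assumption of the sketch rather than a statement about Trident.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of a count-based tumbling window (hypothetical, not Trident code). */
final class TumblingCountWindow {

    /** Sums each complete, non-overlapping block of windowCount values. */
    static List<Integer> sums(List<Integer> values, int windowCount) {
        if (windowCount <= 0) {
            throw new IllegalArgumentException("windowCount must be positive");
        }
        List<Integer> results = new ArrayList<>();
        for (int start = 0; start + windowCount <= values.size(); start += windowCount) {
            int sum = 0;
            for (int value : values.subList(start, start + windowCount)) {
                sum += value;
            }
            results.add(sum);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(sums(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 3)); // prints [6, 15]
    }
}
```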
-
tumblingWindow
public Stream tumblingWindow(BaseWindowedBolt.Duration windowDuration, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of tuples which are the aggregated results of a window that tumbles every `windowDuration`.
- Parameters:
`windowDuration` - represents tumbling window duration configuration
`windowStoreFactory` - intermediary tuple store for storing windowing tuples
`inputFields` - projected fields for aggregator
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
-
slidingWindow
public Stream slidingWindow(int windowCount, int slideCount, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of tuples which are the aggregated results of a sliding window of `windowCount` tuples that slides after every `slideCount` tuples.
- Parameters:
`windowCount` - represents tuples count of a window
`slideCount` - the number of tuples after which the window slides
`windowStoreFactory` - intermediary tuple store for storing windowing tuples
`inputFields` - projected fields for aggregator
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
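A count-based sliding window differs from a tumbling one in that consecutive windows overlap whenever `slideCount` is smaller than `windowCount`. The plain-Java sketch below illustrates this with a sum as the aggregate. `SlidingCountWindow` is hypothetical, and emitting partially filled windows at the start is an assumption of the sketch, not a statement about Trident's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of a count-based sliding window (hypothetical, not Trident code). */
final class SlidingCountWindow {

    /** After every slideCount values, sums the most recent (up to) windowCount values. */
    static List<Integer> sums(List<Integer> values, int windowCount, int slideCount) {
        if (windowCount <= 0 || slideCount <= 0) {
            throw new IllegalArgumentException("windowCount and slideCount must be positive");
        }
        List<Integer> results = new ArrayList<>();
        for (int end = slideCount; end <= values.size(); end += slideCount) {
            int start = Math.max(0, end - windowCount); // early windows may be partially filled
            int sum = 0;
            for (int value : values.subList(start, end)) {
                sum += value;
            }
            results.add(sum);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(sums(Arrays.asList(1, 2, 3, 4, 5, 6), 4, 2)); // prints [3, 10, 18]
    }
}
```

With `windowCount` 4 and `slideCount` 2, the windows are `[1, 2]`, `[1, 2, 3, 4]`, and `[3, 4, 5, 6]`, so values 3 and 4 contribute to two results.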
-
slidingWindow
public Stream slidingWindow(BaseWindowedBolt.Duration windowDuration, BaseWindowedBolt.Duration slidingInterval, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of tuples which are the aggregated results of a window of length `windowDuration` that slides every `slidingInterval`.
- Parameters:
`windowDuration` - represents window duration configuration
`slidingInterval` - the time duration after which the window slides
`windowStoreFactory` - intermediary tuple store for storing windowing tuples
`inputFields` - projected fields for aggregator
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
-
window
public Stream window(WindowConfig windowConfig, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of aggregated results based on the given window configuration, using an in-memory windowing tuple store.
- Parameters:
`windowConfig` - window configuration like window length and slide length.
`inputFields` - input fields
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
-
window
public Stream window(WindowConfig windowConfig, WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields)
Returns a stream of aggregated results based on the given window configuration.
- Parameters:
`windowConfig` - window configuration like window length and slide length.
`windowStoreFactory` - intermediary tuple store for storing tuples for windowing
`inputFields` - input fields
`aggregator` - aggregator to run on the window of tuples to compute the result and emit to the stream.
`functionFields` - fields of values to emit with aggregation.
- Returns:
the new stream with this operation.
-
persistentAggregate
public TridentState persistentAggregate(StateFactory stateFactory, CombinerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateSpec spec, CombinerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateFactory stateFactory, Fields inputFields, CombinerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateSpec spec, Fields inputFields, CombinerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateFactory stateFactory, ReducerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateSpec spec, ReducerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateFactory stateFactory, Fields inputFields, ReducerAggregator agg, Fields functionFields)
-
persistentAggregate
public TridentState persistentAggregate(StateSpec spec, Fields inputFields, ReducerAggregator agg, Fields functionFields)
-
toStream
public Stream toStream()
- Specified by:
`toStream` in interface `IAggregatableStream`
-
getName
public String getName()
Returns the label applied to the stream.
- Returns:
the label applied to the stream.
-
getOutputFields
public Fields getOutputFields()
- Specified by:
`getOutputFields` in interface `IAggregatableStream`
-
-