72 Commits

Author SHA1 Message Date
Zuul
f689816936 Merge "Added license information for third-party libraries" 2018-03-06 15:14:57 +00:00
Kenan Karamehmedovic
344adcde36 Added license information for third-party libraries
Task:  6353
Story: 2001540

Depends-On: https://review.openstack.org/544813

Change-Id: Iff0d2f130defb4d26cbf047bed44f73f89bbcee8
2018-02-19 11:46:00 +00:00
Witold Bedyk
15f9962fcb Remove mysql-connector and bump version to 2.3.0
MySQL Connector is released under GPLv2 license which restricts the
distribution of the consuming project [1]. This change removes MySQL
Connector and leaves Drizzle JDBC which is licensed under BSD.

[1] https://governance.openstack.org/tc/reference/licensing.html

Story: 2001522
Task: 6324
Change-Id: I4c39ebc290475820b5ba3ab54c36198ca9069abe
Depends-On: https://review.openstack.org/541366
2018-02-09 15:54:47 +01:00
Craig Bryant
106088887a Log as error failure to send to Kafka
Exception on failure to send to Kafka was only being logged
at the debug level. Increased to error level as this is a major
failure in the Threshold Engine functionality

Change-Id: I131d6d7a20cd0e907334cf5d0ff6fac342e8f320
2018-01-18 10:47:30 -07:00
Witold Bedyk
be22e54fd7 Upper pom version to 2.2.0
Change-Id: I59e3d60f606791cc15bb0f2ee17d871e15aa754d
2017-12-20 14:32:08 +01:00
Witold Bedyk
86b8634c9c Change version to 2.1.1
Depends-On: Ib3da5c9e1f6e5e2d6f77269129bd769179bfd3be
Change-Id: I6e4b73525b3c32c2bdd04c63ae73bcb4c50b5447
2017-02-15 13:48:30 +00:00
Witold Bedyk
31a7b69b59 Change version to 2.1.0
Change-Id: Ic867645c7c77954bde27cecc64ec01057013f99a
2016-11-23 17:06:49 +01:00
Michael James Hoppal
a62c27a165 Bump drizzle driver version to support millisecond resolution
Change-Id: I671d4e081191bd72bbb8bac97ff3a27812dc50dc
2016-08-15 15:45:42 -06:00
Craig Bryant
2acdd58dc3 Implement the Last Function
The Alarm state is driven by the last measurement with the newest
timestamp. Use the value even if the measurement is older than the
oldest bucket. This ensures the measurement will be used when the
Threshold Engine is started if the measurement
is received while the Threshold Engine is stopped

Never evaluate subAlarm with function Last except on receiving of
a measurement.
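
The rule above can be sketched as follows. This is a minimal illustration, not the Threshold Engine's code: the class `LastValueTracker` and its methods are hypothetical names, and only the `>` operator is modelled.

```java
/** Hypothetical holder for the state behind a LAST sub-expression. */
final class LastValueTracker {
    private long newestTimestampMs = Long.MIN_VALUE;
    private Double lastValue = null; // null until the first measurement arrives

    /**
     * Records a measurement. The value is kept whenever its timestamp is
     * the newest seen so far, no matter how old it is in absolute terms.
     * Returns true if the measurement became the current "last" value.
     */
    boolean addMeasurement(long timestampMs, double value) {
        if (timestampMs < newestTimestampMs) {
            return false; // older than the value already held, ignore it
        }
        newestTimestampMs = timestampMs;
        lastValue = value;
        return true;
    }

    /** Evaluates "last(value) > threshold"; false while no value is known. */
    boolean exceeds(double threshold) {
        return lastValue != null && lastValue > threshold;
    }
}
```

In this sketch a caller would invoke `exceeds` only right after `addMeasurement` returns true, which mirrors evaluating the SubAlarm only when a measurement is received.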

Add tests to ensure this works.

The change is dependent on the monasca-common change and the
change to monasca-api to add the state field to sub_alarm.

Change-Id: Ib5123ed035018757a50d9ebeb7335fbca48054f2
Implements: Blueprint last-value
2016-08-02 12:05:37 -06:00
Michael James Hoppal
865816dd78 Add millisecond resolution to alarms.
At the moment we are using the MySQL built-in function
NOW(), which only returns second resolution.

Change-Id: I1192abb5aab3a9110721cc68f5a1d16a38f77c10
2016-07-13 08:14:35 -06:00
Craig Bryant
3927da1697 Save Measurements that arrive before their SubAlarms
Thresh creates an Alarm when a new Measurement matches an
AlarmDefinition. The previous Thresh code just discarded the
Measurement if it arrived before the newly created SubAlarm,
which was likely to occur. This code saves a Measurement that
does not match an existing SubAlarm in the expectation that the
SubAlarm will arrive very soon. It then adds the Measurement
to the SubAlarm. If the measurement would cause the SubAlarm to
transition to the ALARM state, that happens.

This is more important for deterministic alarms because they will get
fewer Measurements and ignoring the first one may prevent an Alarm's
state going to ALARM when it should
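
A minimal sketch of the buffering idea, assuming measurements can be keyed by a string that identifies the metric definition. `PendingMeasurements`, `save`, and `drain` are hypothetical names chosen for illustration, not the names used in Thresh.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical buffer for measurements that arrive before their SubAlarm. */
final class PendingMeasurements {
    private final Map<String, List<Double>> pending = new HashMap<>();

    /** Keeps a measurement that matched no existing SubAlarm. */
    void save(String metricKey, double value) {
        pending.computeIfAbsent(metricKey, k -> new ArrayList<>()).add(value);
    }

    /**
     * Called when the SubAlarm for metricKey arrives. Returns the saved
     * measurements and forgets them, so they are applied only once.
     */
    List<Double> drain(String metricKey) {
        List<Double> saved = pending.remove(metricKey);
        return saved == null ? Collections.<Double>emptyList() : saved;
    }
}
```

A real implementation would also need to expire saved measurements whose SubAlarm never arrives; that is left out here.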

Change-Id: I08e9e481ad55862ba602eba5a68eb371b1d35bbc
2016-06-27 08:39:02 -06:00
Craig Bryant
0d80a987db Treat empty windows as OK for deterministic alarms
Using the standard case of count(log_message) > 1, getting no
log_message measurements should be treated as OK. However, the old code
uses the emptyWindowObservationThreshold for both deterministic and
non-deterministic alarms which means that there must be 3 empty windows
before the deterministic alarm transitions to OK.

This change causes the evaluation of the Alarm to treat an empty window
as OK for deterministic alarms. So, for count(log_message) > 1, getting no
measurements in a window will transition the alarm back to OK
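
The difference between the two alarm kinds can be sketched like this. `EmptyWindowPolicy` and `onEmptyWindow` are hypothetical names, and the non-deterministic branch (going to UNDETERMINED once the threshold is reached) is an assumption about the existing behaviour rather than something this commit message states.

```java
/** Hypothetical decision made when an evaluation window holds no measurements. */
final class EmptyWindowPolicy {
    enum State { OK, ALARM, UNDETERMINED }

    static State onEmptyWindow(boolean deterministic,
                               int consecutiveEmptyWindows,
                               int emptyWindowObservationThreshold,
                               State current) {
        if (deterministic) {
            // A single empty window is enough: no measurements means OK.
            return State.OK;
        }
        // Assumed behaviour for non-deterministic alarms: keep the current
        // state until enough empty windows have been observed.
        return consecutiveEmptyWindows >= emptyWindowObservationThreshold
                ? State.UNDETERMINED
                : current;
    }
}
```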

Change-Id: I19a04bf78f907b23ef583409f2def54771c07d72
2016-06-20 12:30:42 -06:00
Craig Bryant
d5d14ecdcd Clone the currentValues property in duplicate method
It is possible for the currentValues property to change which can cause
java.util.ConcurrentModificationException. Fix by cloning currentValues
before the SubAlarm gets emitted into storm
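
As an illustration of the fix, a defensive copy in a duplicate method might look like the sketch below. `SubAlarmSnapshot` is a hypothetical stand-in for the real SubAlarm class, reduced to the one property that matters here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, stripped-down stand-in for a SubAlarm. */
final class SubAlarmSnapshot {
    private final List<Double> currentValues;

    SubAlarmSnapshot(List<Double> currentValues) {
        this.currentValues = currentValues;
    }

    /**
     * Returns a copy whose list is independent of this instance, so later
     * changes to the original cannot disturb code iterating over the copy.
     */
    SubAlarmSnapshot duplicate() {
        return new SubAlarmSnapshot(new ArrayList<>(currentValues));
    }

    List<Double> getCurrentValues() {
        return currentValues;
    }
}
```

Without the `new ArrayList<>(...)` copy, both objects would share one list, and a writer on one side could invalidate an iterator on the other.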

Change-Id: I555beffafe0208c0d256732517af401938876d3d
2016-06-09 16:42:47 -06:00
Craig Bryant
0e72d867ec Change to use Storm 1.0.0 instead of 0.9.x
Storm classes changed from starting with backtype to org.apache

Since this is a major backwards incompatible change, increment the
jar version

Copy some Stream classes from monasca-common. They were only used for
monasca-thresh anyways and having them in a separate repo made it
harder to make this change. A later review will remove these classes
from monasca-common

Need to have an explicit dependency on commons-codec

Change-Id: I36db83ce7fdea02ae4df267cf0820e49dcdf3001
2016-06-09 14:14:23 -06:00
Tomasz Trębski
080b11dc54 (Non)deterministic alarm processing
'deterministic' being part of alarm expressions
allows monasca-thresh to determine if
given alarms can go back to UNDETERMINED
state or not.

'deterministic' means that alarm
won't ever transition to UNDETERMINED state,
even if there are no measurements received for
long enough. By default, all alarms
are assumed to be 'non-deterministic' which means
that they can transition to 'UNDETERMINED' state

Implements: blueprint alarmonlogs
Depends-On: Ia42f9a1be37c31416bdac341b092fe527f860c16
Change-Id: Ibe0839123a15494ad45b809e68600c0acef3d330
2016-06-07 12:00:58 +02:00
Brad Klein
4c2ac9a5e3 Include metric value in alarm notification message.
With commit 4e333d5fe4d069178045e1bb3935f9a4ee2be3bf notification
messages look like '...with the values: []', this fixes that.

Change-Id: I8e96b4aa5c77f74a9d3dd00a5647fbfde5fde9b2
Closes-Bug: #1554718
2016-03-11 10:58:31 -07:00
Craig Bryant
4e333d5fe4 Duplicate the SubAlarm before emitting it
This prevents Storm from throwing a ConcurrentModificationException if
the SubAlarm's state changes soon after the emit

Change-Id: Idc0de8a0ef6d13bce800e4e8a4e13e43cdf1c010
Closes-Bug: #1548999
2016-02-24 10:29:12 -07:00
Jenkins
78a8a1d0f1 Merge "Pass link and lifecycle state in state transitions" 2016-01-27 21:36:52 +00:00
Craig Bryant
6f3286bc0e Treat match-by of null as []
The API sometimes sends null for match-by when it should send []. Make the
Threshold Engine more tolerant by treating null as []
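
A tolerant reader for match-by could be as small as the following sketch. `MatchByDefaults.normalize` is a hypothetical helper, not the method the commit changed.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that maps a missing match-by to an empty list. */
final class MatchByDefaults {
    static List<String> normalize(List<String> matchBy) {
        // null and [] mean the same thing: no match-by dimensions.
        return matchBy == null ? Collections.<String>emptyList() : matchBy;
    }
}
```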

Change-Id: Idf29e58c27a2c0ba531d041a144e8c5f35b6be46
2016-01-26 11:55:18 -07:00
Ryan Brandt
c65fba06b0 Pass link and lifecycle state in state transitions
Requires changes in monasca-api, monasca-common to use

Change-Id: Ibf592a5e333f348895df6c681c23d0a34c115045
2016-01-19 14:28:19 -07:00
Deklan Dieterly
7febc99a69 Upgrade to Kafka 0.8.2.2
Upgrade Kafka to current stable release - 0.8.2.2.
Upgrade Scala version to 2.11.

Change-Id: I113997fc1c3124bc1073cb261d6b6f873c6fc6b2
2016-01-04 15:39:24 -07:00
Craig Bryant
96f9b442f3 Evaluate SubAlarms immediately if possible
Some SubAlarm expressions can be evaluated immediately. If the
expression is MAX(m) > 10, a single measurement of m > 10 will cause
the SubAlarm to transition to the ALARM state regardless of any other
measurement of m that is received. However, if the operator is < or <=,
MAX can't be immediately evaluated since a following measurement
could be larger than the one stored.  COUNT also can be evaluated
immediately if the operator is > or >= since it never decreases. MIN
is the opposite and can be immediately evaluated if the operator is
< or <=. AVG and SUM can't be evaluated until the end of the
evaluation window since the average or sum could go up or down
depending on the measurements received and whether or not they are
negative.
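
The rules in the paragraph above reduce to a small decision table. The sketch below encodes it; `ImmediateEvaluation`, `Function`, and `Operator` are hypothetical names used only for this illustration.

```java
/** Hypothetical encoding of which function/operator pairs can fire early. */
final class ImmediateEvaluation {
    enum Function { MAX, MIN, COUNT, AVG, SUM }
    enum Operator { GT, GTE, LT, LTE }

    static boolean canEvaluateImmediately(Function function, Operator operator) {
        boolean greater = operator == Operator.GT || operator == Operator.GTE;
        switch (function) {
            case MAX:   // the maximum never decreases within a window
            case COUNT: // the count never decreases within a window
                return greater;
            case MIN:   // the minimum never increases within a window
                return !greater;
            default:    // AVG and SUM can move in either direction
                return false;
        }
    }
}
```

For example, MAX with GT returns true, while MAX with LT returns false because a later, larger measurement could still invalidate the result.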

Also see if the sliding window for a SubAlarm can be slid when a
metric is received for the SubAlarm. This could allow the SubAlarm
to be evaluated faster than waiting for the tick tuple since that
is only received every 60 seconds.

Add unit tests for immediate SubAlarm evaluation.

Add unit tests for previously untested parts of SubAlarmStats

Change-Id: I989a82328fa4ccc04b49d203f70a1adc9fa4d3bb
2015-11-02 14:58:39 -07:00
Craig Bryant
c3568930f6 Change last URL from stackforge to openstack
Also remove a trailing space from run_maven.sh

Change-Id: I35b929a91e6dcbccd30d11263e0c3bf673e21040
2015-10-19 11:49:48 -06:00
venkatamahesh
8892699305 Change repositories from stackforge to openstack
Change-Id: I1579ccd3803a1d2ca6173ce517cfc28350e15d05
2015-10-19 09:15:07 +05:30
Tomasz Trębski
7a595cf420 Dependencies updated
Following changes made because monasca-common
was modified:

- removed unnecessary dependencies
- updated hikari version
- added javax.el-api

Change-Id: I176008a258411500bf14ba4a26258bdce90476db
2015-09-23 06:37:16 +00:00
Jenkins
84eddb58da Merge "Added a whitelist for restricting the StatsD metrics" 2015-09-17 20:52:10 +00:00
Michael James Hoppal
f43dfb5918 Add the drizzle driver to pom
Allows the end user the ability to choose between using drizzle
and mysql connector.

Change-Id: If74b239824d35ccbf9a5fd2f2cae6dbd0efb40a0
2015-09-04 16:10:52 -06:00
Jenkins
0d7a75c304 Merge "Add support for drizzle jdbc connector" 2015-09-03 22:13:41 +00:00
Roland Hochmuth
4334a8e44a Add support for drizzle jdbc connector
Mysql jdbc connector returns an Integer when querying period and periods.
Drizzle jdbc connector returns a Long. Adding appropriate conversions.
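
One common way to handle both drivers is to go through `java.lang.Number`, which `Integer` and `Long` both extend. The helper below is a sketch with a hypothetical name, not necessarily the conversion this commit added.

```java
/** Hypothetical helper for column values whose boxed type depends on the driver. */
final class JdbcNumbers {
    static int toInt(Object columnValue) {
        // Works for Integer and Long alike; throws ClassCastException
        // if the driver returns something that is not a Number.
        return ((Number) columnValue).intValue();
    }
}
```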

Change-Id: Ie96b10347dbd52b4e0e267f5fbb7cf3d6d6eafff
2015-09-02 11:58:38 -06:00
Tomasz Trębski
622339f6bc Hibernate support added
- added ORM support with Hibernate
- rewritten two mysql repositories to use ORM

Change-Id: I22e342ca57b4cc62b12a44cdf503ce068b9b67b5
2015-08-31 11:45:39 +02:00
Dexter Fryar
afc22b56a1 Added a whitelist for restricting the StatsD metrics
A whitelist and metric map for the metrics that are
sent by Storm / Threshold Engine to the Monasca
StatsD agent/daemon.

Also relates to:
  https://github.com/hpcloud-mon/ansible-monasca-thresh/pull/14

=======

/etc/monasca/thresh-config.yml

```
statsdConfig:
  host: localhost
  port: 8125
  debugmetrics: false
  dimensions: !!map
    service : monitoring
    component : storm
  whitelist: !!seq
    - aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    - aggregation-bolt.execute-count.filtering-bolt_default
    - aggregation-bolt.execute-count.system_tick
    - filtering-bolt.execute-count.event-bolt_metric-alarm-events
    - filtering-bolt.execute-count.metrics-spout_default
    - thresholding-bolt.execute-count.aggregation-bolt_default
    - thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    - system.memory_heap.committedBytes
    - system.memory_nonHeap.committedBytes
    - system.newWorkerEvent
    - system.startTimeSecs
    - system.GC_ConcurrentMarkSweep.timeMs
  metricmap: !!map
    aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    aggregation-bolt.execute-count.filtering-bolt_default :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_default
    aggregation-bolt.execute-count.system_tick :
      monasca_threshold.aggregation-bolt.execute-count.system_tick
    filtering-bolt.execute-count.event-bolt_metric-alarm-events :
      monasca_threshold.filtering-bolt.execute-count.event-bolt_metric-alarm-events
    filtering-bolt.execute-count.metrics-spout_default :
      monasca_threshold.filtering-bolt.execute-count.metrics-spout_default
    thresholding-bolt.execute-count.aggregation-bolt_default :
      monasca_threshold.thresholding-bolt.execute-count.aggregation-bolt_default
    thresholding-bolt.execute-count.event-bolt_alarm-definition-events :
      monasca_threshold.thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    system.memory_heap.committedBytes :
      monasca_threshold.system.memory_heap.committedBytes
    system.memory_nonHeap.committedBytes :
      monasca_threshold.system.memory_nonHeap.committedBytes
    system.newWorkerEvent :
      monasca_threshold.system.newWorkerEvent
    system.startTimeSecs :
      monasca_threshold.system.startTimeSecs
    system.GC_ConcurrentMarkSweep.timeMs :
      monasca_threshold.system.GC_ConcurrentMarkSweep.timeMs
```

host: IP or host of the Monasca Agent running the StatsD daemon that will consume
      the metrics produced by Storm / Threshold Engine

port: UDP port number of the Monasca Agent's StatsD daemon that will consume
      the metrics produced by Storm / Threshold Engine

dimensions: A map of key/value pairs that will be passed along as dimensions for each metric

whitelist: A list of metrics in the native name that Storm presents

metricmap: A mapping from the native Storm metric name to a user defined name.  The user
           defined name is what will appear in the Monasca data store.  If there is no
           mapping present and it is listed in the whitelist then it will be published
           with the native name. The 12 metrics whitelisted/mapped above correspond to the
           monasca health dashboard which is defined in grafana.

           https://github.com/hpcloud-mon/grafana/blob/master/src/app/dashboards/monasca.json
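
The interaction between whitelist and metricmap described above can be sketched as a single lookup. `StatsdNameFilter.publishedName` is a hypothetical helper, and dropping metrics that are not whitelisted is an assumption drawn from the word "restricting" in the commit title.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper that decides whether and under which name to publish. */
final class StatsdNameFilter {
    static Optional<String> publishedName(String nativeName,
                                          Set<String> whitelist,
                                          Map<String, String> metricMap) {
        if (!whitelist.contains(nativeName)) {
            return Optional.empty(); // assumed: not whitelisted, not published
        }
        // Whitelisted: use the mapped name, or the native name if unmapped.
        return Optional.of(metricMap.getOrDefault(nativeName, nativeName));
    }
}
```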

Change-Id: I7bcefd03d02714ac42efd9b2d9cadb77907fa17e
2015-08-28 17:16:36 -05:00
Craig Bryant
237c752e6a Fix problem where dimensions were null
This caused issues for the MetricDefinitionAndTenantIdMatcher when
it was comparing dimensions. Changed it to replace null with an
empty set since that is what the rest of the code is expecting

Change-Id: I3dbec749f29604ef49d89d4a8ec1f6d882305957
2015-08-04 08:53:22 -06:00
Roland Hochmuth
fb9b6888c1 Modified query in getAlarmedMetrics for performance
Change-Id: I3520ea7154a450c5cbb5d4aecf152e1521435907
2015-07-30 21:59:32 -06:00
Craig Bryant
c400895872 Fix NullPointerExceptions in MetricFilteringBolt
This happened because MetricDefinitionAndTenantIdMatcher wasn't handling
the same Alarm Definition being added. This happens because there are
multiple MetricFilteringBolts using the same
MetricDefinitionAndTenantIdMatcher. The Alarm Definition is now
checked to see if it is already there before being added

Change-Id: I9f382e8da5193b60a64dbe40c9fcf321fc47766f
2015-06-05 10:52:54 -06:00
Craig Bryant
9c4bd6cc99 Simplify the check for whether a metric matches
The old way was more efficient if there were few dimensions on
the metric and lots of Alarm Definitions with different
dimension sets for a given tenant id and metric name, but that
is probably unlikely. The more common case will be one or two
alarm definitions per tenant id and metric name and more
dimensions on the metric. The new algorithm is faster for that
case since it doesn't create every possible combination of
dimensions like the old algorithm
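
The commit message does not spell out the new algorithm. One way to avoid enumerating dimension combinations is a direct subset check per Alarm Definition, sketched below under that assumption; `DimensionMatcher` is a hypothetical name.

```java
import java.util.Map;

/** Hypothetical subset check: does the metric carry every required dimension? */
final class DimensionMatcher {
    static boolean matches(Map<String, String> metricDimensions,
                           Map<String, String> requiredDimensions) {
        // True when each required key/value pair is present on the metric.
        // Extra dimensions on the metric do not prevent a match.
        return metricDimensions.entrySet().containsAll(requiredDimensions.entrySet());
    }
}
```

The cost grows with the number of Alarm Definitions checked rather than with the number of dimension subsets of the metric, which suits the "few definitions, many dimensions" case described above.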

Change-Id: I183570d52c61f0a2932cf37c5c659a6c529b4bbb
2015-05-29 15:46:41 -06:00
Ryan Brandt
5b33c539bf Add new state_updated_at field to alarm
Adjust the database interaction for the new state_updated_at field
in the alarm table

Change-Id: I941f9bfdc64a43820dbbe3a4c047cc64b93335a7
2015-05-13 15:50:01 -06:00
Ryan Brandt
c6f025016d Clean up orphaned alarms on alarm definition delete
Change-Id: I8dea800768989b80f1fe810f1c9572ed439d0133
2015-05-08 12:31:22 -06:00
Jenkins
1f0888d6fa Merge "Bump the version to 1.1.0" 2015-04-29 03:39:30 +00:00
Craig Bryant
88ce39abdb Bump the version to 1.1.0
Change-Id: Id63a620b4d3c301567e462b3d04a1ddc8a8322b2
2015-04-28 21:28:24 -06:00
Ryan Brandt
0f249a28cb Allow unicode in events
Change the events deserialization to handle UTF-8 encoding

Change-Id: I73c1a50df5fe365b1ed7d047f58d0e7f67f51d40
2015-04-21 14:14:14 -06:00
Craig Bryant
8f28398d07 Only the jar and sample config in the deb
Remove control scripts from deb

Update sample config file to be more current

Change-Id: If2b11dd1cab807f58b4b23a0a1933fb179032964
2015-04-20 11:22:47 -06:00
Craig Bryant
6bdef9f492 AlarmStateTransitionedEvent timestamp now in ms
This will ensure a unique timestamp. Influx will only keep one
entry with the same timestamp

Change-Id: Ibf1001fea9328a6541381d344221b86e39996e1d
2015-04-14 11:13:43 -06:00
Craig Bryant
e3ac4b0857 Remove warnings
Remove unused methods, unused imports, unused private constants

Change-Id: Ie900b295cc5410fa9039649e868228d1d3de78ee
2015-04-07 16:37:21 -06:00
Jenkins
51e27bab7d Merge "Added version information to jar which can be used on the command line e.g. java -jar monasca-thresh.jar --version" 2015-03-30 03:02:59 +00:00
Craig Bryant
47b734664e Use Maven 3 for build
Also, pull and build monasca-common directly instead of using
jars from tarballs.openstack.org since zuul often gets backed
up and jars don't get updated fast enough

Change-Id: I22fc5cfc085a583c337fca199d5e49ead93fcbb7
2015-03-28 13:40:16 -06:00
Dexter Fryar
2f0e0792fa Added version information to jar which can be used on the command
line e.g. java -jar monasca-thresh.jar --version

Change-Id: If1b6dc46ddac063d78b956fa43372cc4d30787fc
2015-03-25 16:50:44 -05:00
Dexter Fryar
3a00ef33ba Added monitoring of storm/threshold engine via StatsD
See https://wiki.openstack.org/wiki/Monasca/Monitoring_Of_Monasca

Change-Id: Ie995fa31791e61dc3d480f12c2dc99271c6e3e4a
2015-03-17 20:08:17 -05:00
Roland Hochmuth
7795d7cc3a Conversion to milliseconds
Change-Id: I285585d6cf0883215792fb44a040db913e8521a7
2015-03-10 19:55:20 -06:00
Craig Bryant
2b62c0477b Add measurement valueMeta
Changes tests to use the new Metric constructor with valueMeta

Requires the changes to monasca-common from

Implements: blueprint measurement-meta-data

Change-Id: Ibba190c14fb1cb9d5ab2a7b0e4da9bfcfba9874d
2015-03-09 22:53:19 -06:00
Craig Bryant
b4f0e5fcf6 Prevent premature evaluation of Sliding Window
Added ability to configure how long thresh should wait for metrics
to show up before evaluating the sliding window. Ensure the
current time is past the end of the sliding window plus the delay
before sliding

Only evaluate SubAlarm if current time is past slot end timestamp
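
The timing check described above amounts to one comparison. In this sketch `SlidingWindowGuard` and its parameter names are hypothetical, and all three values are assumed to be in the same time unit.

```java
/** Hypothetical guard that delays sliding until late metrics had time to arrive. */
final class SlidingWindowGuard {
    static boolean shouldSlide(long now, long windowEnd, long delay) {
        // Slide only once the current time is past the window end plus the delay.
        return now > windowEnd + delay;
    }
}
```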

This change depends on the monasca-common changes of
https://review.openstack.org/161941

Change-Id: Iab7cb1580253f2fc7c114cfb95c009dba6b23331
2015-03-05 16:23:58 -07:00