Metrics

Describes metric reporting capabilities provided by Philter.

Philter collects metrics while running to provide insight into its operation and the text being processed. The metrics include a count of the documents processed, a count of the sensitive information identified for each type, and the confidence values of entities extracted by non-deterministic natural language processing methods. By default these metrics are written to standard output, but they can also be reported via JMX, Amazon CloudWatch, and Datadog.

Reporting Metrics to Amazon CloudWatch

To enable metric reporting to Amazon CloudWatch, modify Philter's application settings to set the AWS properties as detailed in Settings.

Metrics will be published to CloudWatch every 60 seconds when enabled.

The AWS IAM user or role used by Philter needs the cloudwatch:PutMetricData permission, for example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData"
      ],
      "Resource": "*"
    }
  ]
}
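
Before attaching a policy like the one above, it can help to confirm locally that the document really grants the action. The following is a minimal sketch, not IAM's actual evaluation logic: the function name allows_put_metric_data is hypothetical, and the check ignores Deny statements, conditions, NotAction, and partial wildcards such as cloudwatch:Put*.

```python
import json

# Action names that this simplified check treats as granting PutMetricData.
# Partial wildcards (for example "cloudwatch:Put*") are deliberately not handled.
GRANTING_ACTIONS = {"cloudwatch:putmetricdata", "cloudwatch:*", "*"}


def allows_put_metric_data(policy_json: str) -> bool:
    """Return True if any Allow statement lists cloudwatch:PutMetricData.

    Hypothetical helper for a quick sanity check of a policy document.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # IAM permits a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # IAM permits a single action string instead of a list.
        if isinstance(actions, str):
            actions = [actions]
        # Action names are compared case-insensitively.
        if any(action.lower() in GRANTING_ACTIONS for action in actions):
            return True
    return False


if __name__ == "__main__":
    example = """
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": ["cloudwatch:PutMetricData"],
          "Resource": "*"
        }
      ]
    }
    """
    print(allows_put_metric_data(example))
```

A passing check only shows that the document mentions the action in an Allow statement; the effective permission still depends on every other policy attached to the user or role.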

The metrics will be published to the Amazon CloudWatch namespace provided in Philter's settings. Amazon CloudWatch can then be used to visualize the metrics, set performance alarms, or perform other integrations with AWS services.

Philter metrics reported and visualized in Amazon CloudWatch.

Reporting Metrics to Datadog

Metrics will be published to Datadog every 60 seconds when enabled.

Metrics published to Datadog will have a philter prefix.

Philter metrics in Datadog's Metrics Summary.

The metrics can be used to make graphs and dashboards.

Example Datadog graphs of select Philter metrics.

Reporting Metrics to JMX

Metrics in JMX can be viewed using VisualVM or a similar tool.

Metrics Collected and Reported

The listing below shows an example of the metrics Philter collects and writes to standard output while running. The same metrics are reported to JMX, Amazon CloudWatch, and Datadog, but each service may represent or visualize them differently.

The metrics collected include:

  • A cumulative count of each type of sensitive information across all contexts and documents.

  • The total count of documents processed.

  • A histogram of entity confidence values.

  • Meter values showing the rate at which Philter processes documents over 1-, 5-, and 15-minute intervals.

These metrics will be reset when Philter is stopped and restarted.

Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 2/28/20, 4:18:16 PM ============================================================
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: -- Counters --------------------------------------------------------------------
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.AGE
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 20
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.CREDIT_CARD
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 11
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.CUSTOM_DICTIONARY
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.DATE
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 388
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.EMAIL_ADDRESS
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 6
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.FIRST_NAME
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.HOSPITAL
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.HOSPITAL_ABBREVIATION
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.IDENTIFIER
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 1515
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.IP_ADDRESS
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 1
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.LOCATION_CITY
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.LOCATION_COUNTY
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.LOCATION_STATE
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.MAC_ADDRESS
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.NER_ENTITY
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 855
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.PHONE_NUMBER
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 322
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.PHONE_NUMBER_EXTENSION
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.SSN
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 6
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.STATE_ABBREVIATION
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.SURNAME
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.URL
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 1
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.VIN
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 0
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.ZIP_CODE
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 6
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.total.documents.processed
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 515
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: -- Histograms ------------------------------------------------------------------
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.entity.confidence
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 861
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: min = 20
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: max = 98
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: mean = 79.99
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: stddev = 19.38
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: median = 89.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 75% <= 95.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 95% <= 97.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 98% <= 97.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 99% <= 98.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 99.9% <= 98.00
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: -- Meters ----------------------------------------------------------------------
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.documents.processed
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 515
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: mean rate = 1.72 events/second
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 1-minute rate = 1.35 events/second
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 5-minute rate = 1.06 events/second
Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 15-minute rate = 0.48 events/second
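
When the metrics are only available on standard output, for example in a system log like the listing above, they can be turned into structured data with a small script. The following is a minimal sketch that assumes the exact layout shown above (an optional syslog-style prefix, a section header, a metric name starting with philter., then value lines). The function name parse_metrics and the regular expressions are illustrative and not part of Philter.

```python
import re

# Optional syslog-style prefix such as
# "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: "
PREFIX = re.compile(r"^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+\S+\[\d+\]:\s*")
# Section header such as "-- Counters ------"
SECTION = re.compile(r"^--\s+(\w+)\s+-+$")
# Value line such as "count = 20", "mean rate = 1.72 events/second", "75% <= 95.00"
VALUE = re.compile(r"^(.+?)\s*(?:<=|=)\s*([-+]?\d+(?:\.\d+)?)")


def parse_metrics(lines):
    """Parse console metric output into {section: {metric: {key: number}}}."""
    metrics = {}
    section = None
    name = None
    for raw in lines:
        payload = PREFIX.sub("", raw.rstrip("\n")).strip()
        if not payload:
            continue
        header = SECTION.match(payload)
        if header:
            section = header.group(1).lower()
            metrics.setdefault(section, {})
            name = None
            continue
        if payload.startswith("philter.") and section is not None:
            name = payload
            metrics[section][name] = {}
            continue
        value = VALUE.match(payload)
        if value and section is not None and name is not None:
            text = value.group(2)
            number = float(text) if "." in text else int(text)
            metrics[section][name][value.group(1)] = number
    return metrics


if __name__ == "__main__":
    sample = [
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: -- Counters ----------",
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.DATE",
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: count = 388",
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: -- Meters ----------",
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: philter.documents.processed",
        "Feb 28 16:18:16 ip-10-0-6-59.ec2.internal bash[3628]: 1-minute rate = 1.35 events/second",
    ]
    print(parse_metrics(sample))
```

Lines that match none of the patterns, such as the timestamp banner at the top of each report, are skipped. If the output layout differs from the listing above, the patterns will need adjusting.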