Philter can identify many predefined types of sensitive information. Each type, or filter, can be enabled or disabled separately from the other types in a filter profile.
Philter uses several methods to identify people's names.
Identifies ages
Identifies Bitcoin addresses
Identifies common cities
Identifies common counties
Identifies VISA, American Express, MasterCard, and Discover credit card numbers.
Identifies dates in many formats such as May 22, 1999
Identifies driver's license numbers for all 50 US states
Identifies email addresses
Identifies common hospital names and their abbreviations
Identifies International Bank Account Numbers (IBANs)
Identifies IPv4 and IPv6 addresses
Identifies network MAC addresses
Identifies US passport numbers
Identifies phone numbers and phone number extensions
Identifies sections in text denoted by
Identifies US SSNs and TINs
Identifies US state names and abbreviations
Identifies UPS, FedEx, and USPS tracking numbers
Identifies vehicle identification numbers
Identifies US ZIP codes
In addition to the predefined types of sensitive information listed in the table above, you can also define your own types of sensitive information. Through custom identifiers and dictionaries, Philter can identify many other types of information that may be sensitive in your use case. For example, if you have patient identifiers that follow a pattern such as AA-00000, you can define a custom identifier for this sensitive information.
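To make the idea of a pattern-based custom identifier concrete, here is a minimal Python sketch. It is not Philter's implementation or configuration syntax. It assumes that AA-00000 means two uppercase letters, a hyphen, and five digits, and the names PATIENT_ID and redact_patient_ids, as well as the [REDACTED] replacement text, are hypothetical.

```python
import re

# Assumption: "AA-00000" means two uppercase letters, a hyphen, and five digits.
# The word boundaries keep the pattern from matching inside longer tokens.
PATIENT_ID = re.compile(r"\b[A-Z]{2}-\d{5}\b")


def redact_patient_ids(text: str, replacement: str = "[REDACTED]") -> str:
    """Replace every substring that matches the hypothetical patient ID pattern."""
    return PATIENT_ID.sub(replacement, text)


if __name__ == "__main__":
    print(redact_patient_ids("Patient AB-12345 was admitted."))
```

In a real deployment the pattern would be supplied to Philter as a custom identifier rather than applied in your own code; the sketch only shows what kind of text such a pattern would and would not match.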
Philter can be configured to identify sensitive information based on custom dictionaries. When a term in the dictionary is found in the text, Philter will treat the term as sensitive information and apply the given replacement strategy.
Custom dictionaries support fuzzy matching to accommodate misspellings. The replacement strategy for a custom dictionary has a sensitivityLevel property that controls the amount of allowed fuzziness.
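The following Python sketch illustrates one common way fuzzy dictionary matching can work: comparing each word in the text against the dictionary terms using edit (Levenshtein) distance. It is an illustration only and does not claim to be how Philter implements fuzziness. The functions edit_distance and redact_terms and the max_distance parameter are hypothetical, and how a sensitivityLevel value would translate into an allowed distance is not specified here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def redact_terms(text, dictionary, max_distance, replacement="[REDACTED]"):
    """Replace words within max_distance edits of any dictionary term.

    max_distance is a hypothetical stand-in for the allowed fuzziness;
    0 means exact (case-insensitive) matches only.
    Note: splitting on whitespace collapses runs of spaces in the output.
    """
    terms = {term.lower() for term in dictionary}
    result = []
    for token in text.split():
        core = token.strip(".,;:!?")  # ignore surrounding punctuation
        if core and any(edit_distance(core.lower(), t) <= max_distance for t in terms):
            result.append(token.replace(core, replacement))
        else:
            result.append(token)
    return " ".join(result)


if __name__ == "__main__":
    print(redact_terms("Seen by Dr. Jhonson today.", {"Johnson"}, max_distance=2))
```

A larger allowed distance catches more misspellings but also raises the chance of redacting unrelated words, which is the trade-off a fuzziness setting has to balance.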