U.S. Social Security Number (SSN)
live testing | curated | 4 patterns | proximity 300 | recommended confidence 75 (medium)
Nine digits, formatted (ddd-dd-dddd) or unformatted, validated against SSN issuance rules. Four separate functions cover the formatted and unformatted forms of numbers with pre-2011 strong formatting and of post-2011 randomised numbers.
Every one of the four patterns requires an SSN keyword in proximity. Nine digits alone never match this SIT, at any confidence. Note the 55-confidence pattern: although it sits below even the LOW band, a rule set to low confidence still picks it up.
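The keyword requirement can be illustrated with a minimal Python sketch. It is not the engine's implementation: the candidate regex, the case-insensitive matching, and the way the 300-character window is measured from either end of the candidate are all assumptions made for illustration, and only the three example keywords quoted in this entry are used (the full Keyword_ssn list is not reproduced here). The name find_candidates is hypothetical.

```python
import re

# Only the three example keywords quoted in this entry; the real
# Keyword_ssn list has 12 entries that are not reproduced here.
EXAMPLE_KEYWORDS = ["SSA Number", "social security number", "social security #"]
PROXIMITY = 300  # characters, taken from the SIT header

# Assumed candidate shape: ddd-dd-dddd, ddd dd dddd, or nine consecutive
# digits. The backreference keeps both separators the same.
CANDIDATE = re.compile(r"(?<!\d)\d{3}([- ]?)\d{2}\1\d{4}(?!\d)")

# Approximate "word match": the keyword must not be glued to other word
# characters. Case-insensitive matching is an assumption.
KEYWORD_RE = re.compile(
    "|".join(r"(?<!\w)" + re.escape(k) + r"(?!\w)" for k in EXAMPLE_KEYWORDS),
    re.IGNORECASE,
)


def find_candidates(text):
    """Return (candidate, keyword_nearby) for each nine-digit candidate."""
    results = []
    for m in CANDIDATE.finditer(text):
        # Look PROXIMITY characters to either side of the candidate.
        lo = max(0, m.start() - PROXIMITY)
        hi = min(len(text), m.end() + PROXIMITY)
        nearby = KEYWORD_RE.search(text[lo:hi]) is not None
        results.append((m.group(0), nearby))
    return results
```

A candidate whose keyword_nearby flag is False would not match any of the four patterns, however well formed the digits are.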
The patterns, as the engine reads them
confidence 85 (high)
- Primary: Func_ssn: Pre-2011 strongly formatted SSN with dashes or spaces (ddd-dd-dddd), validated against issuance ranges.
- Required: Keyword_ssn: 12 keywords (word match), e.g. "SSA Number", "social security number", "social security #"
confidence 75 (medium)
- Primary: Func_unformatted_ssn: SSN with pre-2011 strong formatting, written as nine consecutive digits with no separators and validated against issuance ranges.
- Required: Keyword_ssn: 12 keywords (word match), e.g. "SSA Number", "social security number", "social security #"
confidence 65 (low)
- Primary: Func_randomized_formatted_ssn: Post-2011 randomised SSN with dashes or spaces: valid ranges but outside pre-2011 issuance.
- Required: Keyword_ssn: 12 keywords (word match), e.g. "SSA Number", "social security number", "social security #"
confidence 55 (low)
- Primary: Func_randomized_unformatted_ssn: Post-2011 randomised SSN as nine consecutive digits: valid ranges but outside pre-2011 issuance.
- Required: Keyword_ssn: 12 keywords (word match), e.g. "SSA Number", "social security number", "social security #"
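The four tiers above reduce to a lookup on two properties of the candidate: whether it carries separators, and whether it falls inside pre-2011 issuance. The Python sketch below models only that lookup. It assumes the candidate has already passed range validation, and it takes the pre-2011 decision as a caller-supplied flag, because the issuance ranges themselves are not given in this entry. The name pattern_confidence and its parameters are hypothetical.

```python
import re
from typing import Optional

# (confidence, primary function, formatted?, pre-2011 issuance?)
# transcribed from the pattern list in this entry.
PATTERNS = [
    (85, "Func_ssn", True, True),
    (75, "Func_unformatted_ssn", False, True),
    (65, "Func_randomized_formatted_ssn", True, False),
    (55, "Func_randomized_unformatted_ssn", False, False),
]

FORMATTED = re.compile(r"\d{3}[- ]\d{2}[- ]\d{4}")  # dashes or spaces
UNFORMATTED = re.compile(r"\d{9}")                  # nine consecutive digits


def pattern_confidence(candidate: str, pre_2011: bool,
                       keyword_nearby: bool) -> Optional[int]:
    """Return the confidence of the pattern a candidate would fall under.

    pre_2011 is supplied by the caller (an assumption of this sketch);
    the real functions derive it from issuance ranges not shown here.
    """
    if not keyword_nearby:
        return None  # every pattern requires Keyword_ssn in proximity
    if FORMATTED.fullmatch(candidate):
        formatted = True
    elif UNFORMATTED.fullmatch(candidate):
        formatted = False
    else:
        return None  # not a nine-digit shape at all
    for confidence, _name, is_formatted, is_pre_2011 in PATTERNS:
        if is_formatted == formatted and is_pre_2011 == pre_2011:
            return confidence
    return None
```

For example, pattern_confidence("123456789", pre_2011=False, keyword_nearby=True) lands on the 55 tier, while the same digits with keyword_nearby=False return None.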
Catalogue data reflects Microsoft's built-in SITs, which are the same in every tenant. This is a faithful simulation, not the live service, so confirm with the portal's SIT test before acting on a result.