Complete API documentation for the BMPM library.
- BeiderMorse - Main facade class
- Enums - Type-safe configuration options
- Engine Classes - Core processing components
- Utility Classes - Helper utilities
- Exceptions - Error handling
The main facade class providing a simple interface to all BMPM functionality.
Namespace: Zendevio\BMPM
```php
public function __construct()
```

Creates a new BeiderMorse instance with default settings:
- Name Type: Generic
- Accuracy: Approximate
- Language: Auto-detect

```php
public static function create(): self
```

Fluent factory method for chained configuration.

Example:

```php
$encoder = BeiderMorse::create()
    ->withNameType(NameType::Ashkenazic)
    ->withAccuracy(MatchAccuracy::Exact);
```

All configuration methods return a new instance (immutable pattern).
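
The immutable pattern mentioned above can be illustrated with a minimal sketch. `DemoEncoderConfig` is a hypothetical stand-in, not the library's class; it only shows how a `with*` method can return a modified clone while leaving the original object untouched.

```php
<?php

// Hypothetical stand-in for illustration; not part of BMPM.
final class DemoEncoderConfig
{
    public function __construct(
        private string $nameType = 'gen',
        private string $accuracy = 'approx',
    ) {
    }

    // Returns a modified copy; $this is never mutated.
    public function withAccuracy(string $accuracy): self
    {
        $copy = clone $this;
        $copy->accuracy = $accuracy;

        return $copy;
    }

    public function getAccuracy(): string
    {
        return $this->accuracy;
    }
}

$default = new DemoEncoderConfig();
$exact = $default->withAccuracy('exact');
```

One practical consequence of this style: the return value has to be captured. Calling a `with*` method without assigning the result leaves the original instance unchanged.
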

```php
public function withNameType(NameType $nameType): self
```

Set the name type variant for phonetic encoding.

| Parameter | Type | Description |
|---|---|---|
| `$nameType` | `NameType` | Generic, Ashkenazic, or Sephardic |

```php
public function withAccuracy(MatchAccuracy $accuracy): self
```

Set the matching accuracy mode.

| Parameter | Type | Description |
|---|---|---|
| `$accuracy` | `MatchAccuracy` | Exact or Approximate |

```php
public function withLanguages(Language ...$languages): self
```

Restrict encoding to specific language(s).

| Parameter | Type | Description |
|---|---|---|
| `$languages` | `Language...` | One or more Language enum values |

Example:

```php
$encoder = BeiderMorse::create()
    ->withLanguages(Language::German, Language::Polish, Language::Russian);
```

```php
public function withLanguageMask(int $mask): self
```

Set language restriction using a bitmask directly.

| Parameter | Type | Description |
|---|---|---|
| `$mask` | `int` | Combined language bitmask |

Example:

```php
$mask = Language::German->value | Language::English->value;
$encoder = BeiderMorse::create()->withLanguageMask($mask);
```

```php
public function withAutoLanguageDetection(): self
```

Clear language restrictions and enable automatic detection.

```php
public function withDataPath(string $path): self
```

Set a custom path for rule data files.

| Parameter | Type | Description |
|---|---|---|
| `$path` | `string` | Absolute path to rules directory |

```php
public function encode(string $name): string
```

Encode a name to its phonetic representation.

| Parameter | Type | Description |
|---|---|---|
| `$name` | `string` | The name to encode |

Returns: `string` - Phonetic encoding with alternatives in `(a|b|c)` format

Example:

```php
$phonetic = $encoder->encode('Schwarzenegger');
// "(Svarcenegr|Svarceneger|...)"
```

```php
public function encodeToArray(string $name): array
```

Encode a name and return all alternatives as an array.

| Parameter | Type | Description |
|---|---|---|
| `$name` | `string` | The name to encode |

Returns: `array<string>` - Array of phonetic alternatives

Example:

```php
$alternatives = $encoder->encodeToArray('Mueller');
// ['milr', 'mylr', 'miler', ...]
```

```php
public function encodeBatch(array $names): array
```

Encode multiple names in batch.

| Parameter | Type | Description |
|---|---|---|
| `$names` | `array<string>` | Array of names to encode |

Returns: `array<string, string>` - Associative array of name => phonetic
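
The name => phonetic return shape can be sketched as follows. `demoEncodeBatch` is a hypothetical helper, and the `strtolower` callable is only a placeholder for a real encoder; neither reflects the library's actual implementation.

```php
<?php

/**
 * Hypothetical helper: maps each name to its encoding, keyed by the name.
 *
 * @param array<string> $names
 * @param callable(string): string $encode
 * @return array<string, string>
 */
function demoEncodeBatch(array $names, callable $encode): array
{
    $result = [];
    foreach ($names as $name) {
        $result[$name] = $encode($name);
    }

    return $result;
}

// Placeholder encoder (lowercases only); a real encoder would return phonetic codes.
$batch = demoEncodeBatch(['Smith', 'Schmidt'], 'strtolower');
```

In a structure keyed by name like this one, duplicate input names collapse into a single entry.
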

```php
public function matches(string $name1, string $name2): bool
```

Check if two names might match phonetically.

| Parameter | Type | Description |
|---|---|---|
| `$name1` | `string` | First name |
| `$name2` | `string` | Second name |

Returns: `bool` - True if any phonetic alternatives match
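
The "any alternatives match" rule amounts to a non-empty set intersection. The sketch below assumes each name has already been encoded to an array of alternatives; `demoMatches` and the sample codes are hypothetical.

```php
<?php

/**
 * Hypothetical sketch: two names "match" when their alternative sets overlap.
 *
 * @param array<string> $a
 * @param array<string> $b
 */
function demoMatches(array $a, array $b): bool
{
    return array_intersect($a, $b) !== [];
}

$hit = demoMatches(['milr', 'mylr'], ['mylr', 'miler']);
$miss = demoMatches(['milr'], ['smit']);
```
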

```php
public function similarity(string $name1, string $name2): float
```

Get the similarity score between two names using Jaccard index.

| Parameter | Type | Description |
|---|---|---|
| `$name1` | `string` | First name |
| `$name2` | `string` | Second name |

Returns: `float` - Score between 0.0 (no match) and 1.0 (identical)
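
For reference, the Jaccard index of two sets is the size of their intersection divided by the size of their union. The sketch below assumes the sets compared are the phonetic alternatives of each name, which this document does not state explicitly; `demoJaccard` is a hypothetical name.

```php
<?php

/**
 * Jaccard index: size of the intersection divided by size of the union.
 *
 * @param array<string> $a
 * @param array<string> $b
 */
function demoJaccard(array $a, array $b): float
{
    $a = array_unique($a);
    $b = array_unique($b);
    $union = array_unique(array_merge($a, $b));
    if ($union === []) {
        return 0.0; // Assumption: two empty sets score 0.0.
    }
    $intersection = array_intersect($a, $b);

    return count($intersection) / count($union);
}

// Two shared elements out of four distinct ones.
$score = demoJaccard(['a', 'b', 'c'], ['b', 'c', 'd']);
```
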

```php
public function detectLanguages(string $name): array
```

Detect all possible language(s) of a name.

| Parameter | Type | Description |
|---|---|---|
| `$name` | `string` | The name to analyze |

Returns: `array<Language>` - Array of detected languages

```php
public function detectPrimaryLanguage(string $name): Language
```

Detect the primary (most likely) language of a name.

| Parameter | Type | Description |
|---|---|---|
| `$name` | `string` | The name to analyze |

Returns: `Language` - The most likely language

```php
public function soundex(string $name): string
```

Get Daitch-Mokotoff Soundex encoding.

| Parameter | Type | Description |
|---|---|---|
| `$name` | `string` | The name to encode |

Returns: `string` - Space-separated 6-digit D-M Soundex codes
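
Because the result is a single space-separated string, callers will often want to split it into individual codes. A minimal sketch follows; the sample value is the commonly cited D-M coding for "Peters", hard-coded here rather than produced by this library.

```php
<?php

// Sample value only: hard-coded, not output captured from BMPM.
$codes = '739400 734000';

// Split on spaces and keep only well-formed 6-digit codes.
$list = array_values(array_filter(
    explode(' ', $codes),
    static fn (string $code): bool => preg_match('/^\d{6}$/', $code) === 1,
));
```
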

```php
public function getNameType(): NameType
```

Get the current name type setting.

```php
public function getAccuracy(): MatchAccuracy
```

Get the current accuracy setting.

```php
public function getLanguageMask(): ?int
```

Get the current language mask (null if auto-detect).

```php
public function getAvailableLanguages(): array
```

Get all languages available for the current name type.

Returns: `array<Language>`

Namespace: Zendevio\BMPM\Enums

```php
enum NameType: string
{
    case Generic = 'gen';
    case Ashkenazic = 'ash';
    case Sephardic = 'sep';
}
```

| Method | Returns | Description |
|---|---|---|
| `directory()` | `string` | Directory name for rule files |
| `label()` | `string` | Human-readable label |
| `description()` | `string` | Detailed description |
| `fromString(string $value)` | `self` | Create from string (case-insensitive) |

Namespace: Zendevio\BMPM\Enums

```php
enum MatchAccuracy: string
{
    case Exact = 'exact';
    case Approximate = 'approx';
}
```

| Method | Returns | Description |
|---|---|---|
| `label()` | `string` | Human-readable label |
| `description()` | `string` | Detailed description |
| `fromString(string $value)` | `self` | Create from string |
| `isApproximate()` | `bool` | Check if Approximate mode |
| `isExact()` | `bool` | Check if Exact mode |

Namespace: Zendevio\BMPM\Enums

```php
enum Language: int
{
    case Any = 1;
    case Arabic = 2;
    case Cyrillic = 4;
    case Czech = 8;
    case Dutch = 16;
    case English = 32;
    case French = 64;
    case German = 128;
    case Greek = 256;
    case GreekLatin = 512;
    case Hebrew = 1024;
    case Hungarian = 2048;
    case Italian = 4096;
    case Latvian = 8192;
    case Polish = 16384;
    case Portuguese = 32768;
    case Romanian = 65536;
    case Russian = 131072;
    case Spanish = 262144;
    case Turkish = 524288;
}
```

| Method | Returns | Description |
|---|---|---|
| `genericLanguages()` | `array<self>` | Languages for Generic mode |
| `ashkenazicLanguages()` | `array<self>` | Languages for Ashkenazic mode |
| `sephardicLanguages()` | `array<self>` | Languages for Sephardic mode |
| `forNameType(NameType $type)` | `array<self>` | Languages for specified name type |
| `fromString(string $name)` | `?self` | Create from name (case-insensitive) |
| `fromMask(int $mask)` | `array<self>` | Get languages from bitmask |
| `combineMask(array $langs)` | `int` | Combine languages to bitmask |

| Method | Returns | Description |
|---|---|---|
| `ruleName()` | `string` | Lowercase name for rule files |
| `label()` | `string` | Human-readable label |
| `index(NameType $type)` | `int` | Index position for name type |
| `isInMask(int $mask)` | `bool` | Check if included in mask |

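
Since each case is backed by a distinct power of two, masks can be combined with bitwise OR and tested with bitwise AND. The sketch below uses a hypothetical `DemoLanguage` enum that mirrors three of the values listed above. It illustrates the arithmetic behind mask combination, membership checks, and decomposition, not the library's own code.

```php
<?php

// Hypothetical subset mirroring three values from the Language enum.
enum DemoLanguage: int
{
    case English = 32;
    case German = 128;
    case Polish = 16384;
}

// Combine: OR the backing values together.
$mask = DemoLanguage::German->value | DemoLanguage::English->value;

// Membership test: a case is in the mask if its bit is set.
$hasGerman = ($mask & DemoLanguage::German->value) !== 0;
$hasPolish = ($mask & DemoLanguage::Polish->value) !== 0;

// Decompose: keep every case whose bit is set, in declaration order.
$languages = array_values(array_filter(
    DemoLanguage::cases(),
    static fn (DemoLanguage $l): bool => ($mask & $l->value) !== 0,
));
```
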
Namespace: Zendevio\BMPM\Engine
Implements: PhoneticEncoderInterface
The core phonetic encoding engine implementing the Beider-Morse algorithm.

```php
public function __construct(
    RuleLoaderInterface $ruleLoader,
    LanguageDetectorInterface $languageDetector,
)
```

| Method | Description |
|---|---|
| `encode(...)` | Encode a single name |
| `encodeToArray(...)` | Encode and return array |
| `encodeBatch(...)` | Encode multiple names |
| `clearCache()` | Clear internal rule caches |

Namespace: Zendevio\BMPM\Engine
Implements: LanguageDetectorInterface
Detects language(s) of a name based on spelling patterns.

```php
public function __construct(RuleLoaderInterface $ruleLoader)
```

| Method | Returns | Description |
|---|---|---|
| `detect(string $name, NameType $type)` | `int` | Bitmask of detected languages |
| `detectLanguages(string $name, NameType $type)` | `array<Language>` | Array of detected languages |
| `detectPrimary(string $name, NameType $type)` | `Language` | Primary detected language |
| `clearCache()` | `void` | Clear internal caches |

Namespace: Zendevio\BMPM\Util
Expands phonetic alternates in parenthesized notation.

| Method | Returns | Description |
|---|---|---|
| `expand(string $phonetic)` | `array<string>` | Expand to array of alternatives |
| `collapse(array $alternatives)` | `string` | Convert array back to parenthesized |
| `removeDuplicates(string $phonetic)` | `string` | Remove duplicate alternatives |
| `hasAlternates(string $phonetic)` | `bool` | Check if has alternatives |
| `countAlternatives(string $phonetic)` | `int` | Count number of alternatives |

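
To make the parenthesized notation concrete, here is a minimal sketch of expansion and collapse. `demoExpand` and `demoCollapse` are hypothetical functions written for illustration. They assume groups of the form `(a|b)` that may appear inside a longer string or be nested, and they are not the library's implementation.

```php
<?php

/**
 * Hypothetical sketch: expand "(a|b)" groups into all combinations.
 * Handles nesting by always expanding an innermost group first.
 *
 * @return array<string>
 */
function demoExpand(string $phonetic): array
{
    // Find the first group that contains no further parentheses.
    if (preg_match('/\(([^()]*)\)/', $phonetic, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return [$phonetic]; // No groups left.
    }

    [$group, $offset] = $m[0];
    $results = [];
    foreach (explode('|', $m[1][0]) as $alternative) {
        // Substitute one alternative for the whole group, then recurse.
        $expanded = substr_replace($phonetic, $alternative, $offset, strlen($group));
        foreach (demoExpand($expanded) as $item) {
            $results[] = $item;
        }
    }

    return array_values(array_unique($results));
}

/** @param array<string> $alternatives */
function demoCollapse(array $alternatives): string
{
    return '(' . implode('|', $alternatives) . ')';
}
```
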
Namespace: Zendevio\BMPM\Util
UTF-8 safe string manipulation utilities.

| Method | Returns | Description |
|---|---|---|
| `normalize(string $input)` | `string` | Normalize to UTF-8 lowercase |
| `toUtf8(string $input)` | `string` | Convert to UTF-8 |
| `isAscii(string $input)` | `bool` | Check if ASCII only |
| `substring(string $input, int $start, ?int $length)` | `string` | UTF-8 safe substring |
| `length(string $input)` | `int` | UTF-8 string length |

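
"UTF-8 safe" here means working on code points rather than bytes. The sketch below shows the distinction using PCRE's `/u` mode; the `demo*` helpers are hypothetical and make no claim about how the library implements these utilities.

```php
<?php

// Hypothetical helpers built on PCRE's /u mode; not the library's implementation.

function demoIsAscii(string $input): bool
{
    // True when no byte falls outside the 7-bit range.
    return preg_match('/[^\x00-\x7F]/', $input) === 0;
}

function demoIsUtf8(string $input): bool
{
    // With /u, preg_match returns false on malformed UTF-8 input.
    return preg_match('//u', $input) === 1;
}

function demoLength(string $input): int
{
    // Counts code points rather than bytes; assumes valid UTF-8.
    return (int) preg_match_all('/./us', $input);
}

// "Müller": six code points, seven bytes (the u-umlaut takes two bytes).
$name = "M\u{00FC}ller";
$codePoints = demoLength($name);
$bytes = strlen($name);
```
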
Namespace: Zendevio\BMPM\Soundex
Daitch-Mokotoff Soundex implementation for Slavic/Yiddish names.

```php
public function encode(string $name): string
```

Returns space-separated 6-digit codes.

All exceptions extend BeiderMorseException.
Namespace: Zendevio\BMPM\Exceptions
Base exception class for all BMPM errors.
Thrown when input validation fails.

| Method | Description |
|---|---|
| `emptyInput()` | Input is empty |
| `invalidEncoding(string $input)` | Input is not valid UTF-8 |
| `inputTooLong(int $length, int $max)` | Input exceeds max length |

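
The method names in the table read like named constructors (static factories that build a pre-worded exception). Assuming that is the intent, the pattern looks roughly like the sketch below. The class name, messages, and the 255-character limit are all illustrative assumptions, not taken from the library.

```php
<?php

// Hypothetical stand-in; class name and messages are illustrative only.
final class DemoInputException extends InvalidArgumentException
{
    public static function emptyInput(): self
    {
        return new self('Input must not be empty.');
    }

    public static function inputTooLong(int $length, int $max): self
    {
        return new self(sprintf('Input length %d exceeds maximum %d.', $length, $max));
    }
}

// The default limit of 255 is an arbitrary value chosen for this sketch.
function demoValidate(string $input, int $max = 255): string
{
    if ($input === '') {
        throw DemoInputException::emptyInput();
    }
    if (strlen($input) > $max) {
        throw DemoInputException::inputTooLong(strlen($input), $max);
    }

    return $input;
}

try {
    demoValidate('');
    $message = null;
} catch (DemoInputException $e) {
    $message = $e->getMessage();
}
```
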
Thrown when rule loading fails.

| Method | Description |
|---|---|
| `fileNotFound(string $path)` | Rule file not found |
| `invalidJson(string $path, string $error)` | Invalid JSON in rule file |
| `invalidRuleFormat(string $path, string $error)` | Invalid rule format |
| `missingRequiredField(string $field, string $path)` | Missing required field |

```php
interface PhoneticEncoderInterface
{
    public function encode(string $input, ...): string;
    public function encodeToArray(string $input, ...): array;
    public function encodeBatch(array $inputs, ...): array;
}
```

```php
interface LanguageDetectorInterface
{
    public function detect(string $name, NameType $nameType): int;
    public function detectLanguages(string $name, NameType $nameType): array;
    public function detectPrimary(string $name, NameType $nameType): Language;
}
```

```php
interface RuleLoaderInterface
{
    public function loadRules(Language $language, NameType $nameType): RuleSet;
    public function loadFinalRules(Language $language, NameType $nameType, MatchAccuracy $accuracy): RuleSet;
    public function loadCommonRules(NameType $nameType, MatchAccuracy $accuracy): RuleSet;
    public function loadLanguageRules(NameType $nameType): array;
    public function clearCache(): void;
}
```