Complete reference of languages supported by each name type mode.
The Generic mode supports the widest range of languages and is suited to general-purpose name matching.
| Language | Bitmask Value | Script | Region |
|---|---|---|---|
| Any | 1 | - | Fallback |
| Arabic | 2 | Arabic | Middle East, North Africa |
| Cyrillic | 4 | Cyrillic | Russia, Eastern Europe |
| Czech | 8 | Latin | Czech Republic |
| Dutch | 16 | Latin | Netherlands, Belgium |
| English | 32 | Latin | UK, USA, Commonwealth |
| French | 64 | Latin | France, Canada, Africa |
| German | 128 | Latin | Germany, Austria, Switzerland |
| Greek | 256 | Greek | Greece, Cyprus |
| Greek (Latin) | 512 | Latin | Greek names in Latin script |
| Hebrew | 1024 | Hebrew | Israel, Jewish communities |
| Hungarian | 2048 | Latin | Hungary |
| Italian | 4096 | Latin | Italy |
| Latvian | 8192 | Latin | Latvia |
| Polish | 16384 | Latin | Poland |
| Portuguese | 32768 | Latin | Portugal, Brazil |
| Romanian | 65536 | Latin | Romania, Moldova |
| Russian | 131072 | Cyrillic | Russia |
| Spanish | 262144 | Latin | Spain, Latin America |
| Turkish | 524288 | Latin | Turkey |
Total: 20 languages
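
Each bitmask value in the table above is a distinct power of two, so any set of languages fits in a single integer. The sketch below illustrates that idea in isolation. `DemoLanguage`, `toMask`, and `listFromMask` are hypothetical names invented for this example and are not the library's API; only the numeric values are copied from the table.

```php
<?php

// Hypothetical stand-in enum; only the integer values come from the table above.
enum DemoLanguage: int
{
    case Any = 1;
    case English = 32;
    case French = 64;
    case German = 128;
    case Polish = 16384;

    /** Combine several languages into one integer mask with bitwise OR. */
    public static function toMask(self ...$languages): int
    {
        $mask = 0;
        foreach ($languages as $language) {
            $mask |= $language->value;
        }
        return $mask;
    }

    /**
     * Return the cases whose bit is set in the mask, in declaration order.
     *
     * @return list<self>
     */
    public static function listFromMask(int $mask): array
    {
        return array_values(array_filter(
            self::cases(),
            static fn (self $language): bool => ($mask & $language->value) !== 0
        ));
    }
}

$mask = DemoLanguage::toMask(DemoLanguage::German, DemoLanguage::Polish);
echo $mask, PHP_EOL; // 16512 (128 + 16384)

foreach (DemoLanguage::listFromMask($mask) as $language) {
    echo $language->name, PHP_EOL; // German, then Polish
}
```

Because no two values share a bit, a mask can always be decomposed back into exactly the languages that were combined.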
The Ashkenazic mode is optimized for Ashkenazic Jewish surnames from Central and Eastern Europe.
| Language | Bitmask Value | Notes |
|---|---|---|
| Any | 1 | Fallback |
| Cyrillic | 4 | Russian/Ukrainian variants |
| English | 32 | Anglicized names |
| French | 64 | French variants |
| German | 128 | Most common origin |
| Hebrew | 1024 | Original Hebrew names |
| Hungarian | 2048 | Hungarian variants |
| Polish | 16384 | Polish-Jewish names |
| Romanian | 65536 | Romanian-Jewish names |
| Russian | 131072 | Russian-Jewish names |
| Spanish | 262144 | Sephardic influence |
Total: 11 languages
Common Ashkenazic surname patterns:
- German: -stein, -berg, -feld, -man(n)
- Polish: -ski, -wicz, -czyk
- Russian: -ovich, -evich, -sky
- Hebrew: Cohen, Levi, and their variants
The Sephardic mode is optimized for Sephardic Jewish surnames from Spain, Portugal, and Mediterranean regions.
| Language | Bitmask Value | Notes |
|---|---|---|
| Any | 1 | Fallback |
| French | 64 | North African influence |
| Hebrew | 1024 | Original Hebrew names |
| Italian | 4096 | Italian-Jewish names |
| Portuguese | 32768 | Portuguese origin |
| Spanish | 262144 | Primary origin |
Total: 6 languages
Common Sephardic surname patterns:
- Spanish: -ez, -es, de + location
- Portuguese: -es, da + location
- Italian: -i, -o endings
- Ladino variations
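
Since each mode supports a different subset of languages, a requested language set may need to be narrowed to what the chosen mode covers. The following is a minimal sketch of that intersection, assuming the values from the tables above. The constants and `restrictToMode` are hypothetical names for illustration, and falling back to `Any` when nothing overlaps is a design choice of this sketch, not documented library behavior.

```php
<?php

// Hypothetical constants: each is the bitwise OR of the values listed in the
// Ashkenazic and Sephardic tables above. They are not part of the library.
const ASHKENAZIC_MASK = 1 | 4 | 32 | 64 | 128 | 1024 | 2048 | 16384 | 65536 | 131072 | 262144;
const SEPHARDIC_MASK = 1 | 64 | 1024 | 4096 | 32768 | 262144;

/**
 * Keep only the requested languages that the mode supports.
 * Falls back to "Any" (1) when the two sets do not overlap (assumption).
 */
function restrictToMode(int $requested, int $modeMask): int
{
    $allowed = $requested & $modeMask;
    return $allowed !== 0 ? $allowed : 1;
}

// Italian (4096) and German (128) requested for a Sephardic lookup:
// only Italian appears in the Sephardic table, so German is dropped.
$mask = restrictToMode(4096 | 128, SEPHARDIC_MASK);
var_dump($mask === 4096); // bool(true)
```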
The library automatically detects likely languages based on spelling patterns:
```php
use Zendevio\BMPM\BeiderMorse;

$encoder = new BeiderMorse();

// German pattern detected
$encoder->detectLanguages('Müller');
// → [Language::German]

// Polish pattern detected
$encoder->detectLanguages('Kowalski');
// → [Language::Polish]

// Multiple languages possible
$encoder->detectLanguages('Smith');
// → [Language::English, Language::German, ...]
```

| Language | Patterns | Examples |
|---|---|---|
| Arabic | -allah, -al, abd-, ibn- | Abdullah, Abdallah |
| Czech | ř, č, ž, -ová | Dvořák, Černý |
| Dutch | -ij, -ijk, van- | Dijkstra, van der Berg |
| English | -tion, -son, Mc-, O'- | Johnson, McDonald |
| French | -eau, -eux, -ault | Moreau, Dubois |
| German | -sch, -stein, ü, ö | Müller, Schmidt |
| Greek | -os, -is, -opoulos | Papadopoulos |
| Hebrew | -ah, -el, -man | Abraham, Cohen |
| Hungarian | -gy, -sz, -cs | Nagy, Kovács |
| Italian | -i, -ini, -elli | Rossi, Bianchi |
| Polish | -ski, -cki, -wicz | Kowalski, Nowak |
| Portuguese | -ão, -es, -eira | Silva, Pereira |
| Romanian | -escu, -iu | Ionescu, Popescu |
| Russian | -ov, -ev, -sky | Ivanov, Smirnov |
| Spanish | -ez, -az, -oz | Gonzalez, Rodriguez |
| Turkish | -oğlu, -lı, -ci | Yılmaz, Demir |
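
To illustrate how spelling patterns like those in the table can narrow the candidate languages, here is a rough sketch that checks a name against a handful of them. `guessLanguages` is a hypothetical helper written for this example; it is not how the library implements `detectLanguages`, and it covers only a small subset of the patterns.

```php
<?php

/**
 * Guess candidate languages from a few of the spelling patterns listed above.
 * Hypothetical helper for illustration only.
 *
 * @return list<string>
 */
function guessLanguages(string $name): array
{
    // Subset of the table; "i" makes matching case-insensitive and
    // "u" treats both pattern and subject as UTF-8.
    $patterns = [
        'German'   => '/(stein$|sch|[üö])/iu',
        'Polish'   => '/(ski|cki|wicz)$/iu',
        'Romanian' => '/(escu|iu)$/iu',
        'Spanish'  => '/(ez|az|oz)$/iu',
        'Turkish'  => '/(oğlu|lı|ci)$/iu',
    ];

    $matches = [];
    foreach ($patterns as $language => $regex) {
        if (preg_match($regex, $name) === 1) {
            $matches[] = $language;
        }
    }
    return $matches;
}

print_r(guessLanguages('Müller'));   // contains only "German"
print_r(guessLanguages('Kowalski')); // contains only "Polish"
print_r(guessLanguages('Ionescu'));  // contains only "Romanian"
```

Suffix checks this simple produce false positives (a Turkish name ending in `-az` also matches the Spanish pattern here), which is one reason detection returns a set of candidates instead of a single answer.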
When you know a name's origin, restrict the encoder to specific languages for better precision:
```php
use Zendevio\BMPM\BeiderMorse;
use Zendevio\BMPM\Enums\Language;

// Restrict to specific language
$encoder = BeiderMorse::create()
    ->withLanguages(Language::German);

// Restrict to multiple languages
$encoder = BeiderMorse::create()
    ->withLanguages(
        Language::German,
        Language::Polish,
        Language::Russian
    );
```

Language values are powers of two, so several languages can be combined into a single bitmask:
```php
use Zendevio\BMPM\Enums\Language;

// Combine languages
$mask = Language::German->value | Language::Polish->value;
// 128 | 16384 = 16512

// Check if language in mask
$hasGerman = ($mask & Language::German->value) !== 0;
// true

// Get all languages from mask
$languages = Language::fromMask($mask);
// [Language::German, Language::Polish]

// Combine using helper
$mask = Language::combineMask([
    Language::English,
    Language::French,
    Language::German
]);
```

Most European languages use Latin script with various diacritical marks:
- German: ä, ö, ü, ß
- French: é, è, ê, ç
- Polish: ą, ę, ł, ó, ś, ź, ż
- Czech: á, č, ď, é, ě, í, ň, ó, ř, š, ť, ú, ů, ý, ž
The library normalizes these for matching.
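
As a rough picture of what such normalization can involve, the sketch below folds a few of the letters listed above to plain ASCII with a lookup table. `foldDiacritics` is a hypothetical helper and the map is deliberately incomplete; the library's own normalization may differ, for example by applying language-specific rules instead of a flat mapping.

```php
<?php

/**
 * Fold a subset of lowercase Latin diacritics to ASCII.
 * Hypothetical helper: covers only the German, French, and Polish letters
 * listed above; uppercase letters are left untouched.
 */
function foldDiacritics(string $name): string
{
    static $map = [
        'ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'ß' => 'ss',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ç' => 'c',
        'ą' => 'a', 'ę' => 'e', 'ł' => 'l', 'ó' => 'o',
        'ś' => 's', 'ź' => 'z', 'ż' => 'z',
    ];

    // strtr() with an array replaces every occurrence of each key;
    // UTF-8 keys are matched as whole byte sequences.
    return strtr($name, $map);
}

echo foldDiacritics('müller'), PHP_EOL; // muller
echo foldDiacritics('wałęsa'), PHP_EOL; // walesa
```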
Names in non-Latin scripts are typically provided in transliterated form:
- Hebrew: Transliterated to Latin (Cohen, Levi)
- Cyrillic: Transliterated to Latin (Ivanov, Smirnov)
- Greek: Transliterated to Latin (Papadopoulos)
- Arabic: Transliterated to Latin (Abdullah, Mohammed)
The phonetic rules handle common transliteration variants.