Not all shops are limited to plain ASCII text content. Many special characters are not handled by PrestaShop by default, mostly those from languages that use a non-Latin alphabet. But PrestaShop is built around UTF-8, so you can easily extend the default behavior.

How to handle the Carian language in PrestaShop?
This method uses the PCRE library. PCRE is a regular expression library that is included in PHP. It provides support for special characters and for many other languages, or "scripts".

The supported scripts are: Arabic, Armenian, Avestan, Balinese, Bamum, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Egyptian_Hieroglyphs, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Imperial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian, Lydian, Malayalam, Mandaic, Meetei_Mayek, Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic, Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samaritan, Saurashtra, Sharada, Shavian, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, Yi.

\pL is the PCRE shortcut for the Unicode "letter" property. Most of these languages are already handled through this shortcut.

Overriding
To handle Carian characters, you need to add a piece of PCRE code to the regular expressions that PrestaShop uses, so that they also match these special characters.

The piece of PCRE code: \p{Carian}
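
To get a feel for what \p{Carian} does before touching PrestaShop, here is a minimal sketch in plain PHP. The helper name isEntirelyInScript() is made up for this illustration and is not part of PrestaShop; the "\u{...}" string syntax assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper for illustration only; it is not part of PrestaShop.
// Returns true when every character of $text belongs to the Unicode script $script.
function isEntirelyInScript($text, $script)
{
    // The /u modifier makes PCRE read the pattern and the subject as UTF-8,
    // so \p{...} is applied to whole characters rather than single bytes.
    return preg_match('/^\p{' . $script . '}+$/u', $text) === 1;
}

// U+102A0 and U+102A1 are the first two code points of the Carian block.
$carian = "\u{102A0}\u{102A1}";

var_dump(isEntirelyInScript($carian, 'Carian'));
var_dump(isEntirelyInScript('shop', 'Carian'));
var_dump(isEntirelyInScript('shop', 'Latin'));
```

Any script name from the list above can be passed in the same way.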

First, override the isLinkRewrite() method of the Validate class. This method checks that a string is valid as a rewritten URL.
————————————————————————————————————–
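public static function isLinkRewrite($link)
{
    // Accented URLs enabled: also accept Unicode letters and Carian characters.
    if (Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL'))
        return preg_match('/^[_a-zA-Z0-9\-\pL\p{Carian}]+$/u', $link);
    // Otherwise only plain ASCII letters, digits, underscores and hyphens are valid.
    return preg_match('/^[_a-zA-Z0-9\-]+$/', $link);
}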
————————————————————————————————————–
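
If you want to try this check outside of a shop, the following standalone sketch may help. It assumes a made-up function name, isLinkRewriteSketch(), and replaces the Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL') lookup with a plain boolean parameter, so it is an approximation of the override above rather than the override itself.

```php
<?php
// Standalone sketch, not PrestaShop code: $allowAccented stands in for
// the PS_ALLOW_ACCENTED_CHARS_URL configuration value.
function isLinkRewriteSketch($link, $allowAccented)
{
    if ($allowAccented) {
        // Unicode letters (\pL) and Carian characters are accepted as well.
        return preg_match('/^[_a-zA-Z0-9\-\pL\p{Carian}]+$/u', $link) === 1;
    }
    // ASCII letters, digits, underscores and hyphens only.
    return preg_match('/^[_a-zA-Z0-9\-]+$/', $link) === 1;
}

var_dump(isLinkRewriteSketch('my-product_1', false));
var_dump(isLinkRewriteSketch("caf\u{E9}", true));
var_dump(isLinkRewriteSketch("caf\u{E9}", false));
var_dump(isLinkRewriteSketch("\u{102A1}-1", true));
```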

Then, override the Tools class, and more precisely its str2url() method, which cleans a string and turns it into a safe and valid URL.

————————————————————————————————————–
public static function str2url($strurl)
{
    if (function_exists('mb_strtolower'))
        $strurl = mb_strtolower($strurl, 'utf-8');

    if (!function_exists('mb_strtolower') || !Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL'))
        $strurl = Tools::replaceAccentedChars($strurl);

    // Remove all non-whitelist characters.
    // The hyphen is escaped so that it cannot be read as the start of a range before \pL.
    if (Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL'))
        $strurl = preg_replace('/[^a-zA-Z0-9\s\'\:\/\[\]\-\pL\p{Carian}]/u', '', $strurl);
    else
        $strurl = preg_replace('/[^a-zA-Z0-9\s\'\:\/\[\]-]/', '', $strurl);

    // Collapse each run of separators into a single space, then turn spaces into hyphens.
    $strurl = preg_replace('/[\s\'\:\/\[\]-]+/', ' ', $strurl);
    $strurl = str_replace(array(' ', '/'), '-', $strurl);

    // If it was not possible to lowercase the string with mb_strtolower, we do it after the transformations.
    if (!function_exists('mb_strtolower'))
        $strurl = strtolower($strurl);

    return $strurl;
}
————————————————————————————————————–
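
The cleaning steps can also be tried in isolation. The sketch below is a simplified stand-in under a few assumptions: the function name slugifySketch() is made up, it only follows the branch where accented characters are allowed, and it leaves out Tools::replaceAccentedChars() because that method is only available inside PrestaShop.

```php
<?php
// Simplified standalone sketch of the "accented characters allowed" branch.
// slugifySketch() is a hypothetical name; it is not a PrestaShop method.
function slugifySketch($text)
{
    if (function_exists('mb_strtolower')) {
        $text = mb_strtolower($text, 'UTF-8');
    }

    // Drop every character that is not in the whitelist.
    $text = preg_replace('/[^a-zA-Z0-9\s\'\:\/\[\]\-\pL\p{Carian}]/u', '', $text);

    // Collapse each run of separators into one space, then turn spaces into hyphens.
    $text = preg_replace('/[\s\'\:\/\[\]\-]+/u', ' ', $text);
    $text = str_replace(' ', '-', $text);

    // Fallback lowercasing when the mbstring extension is missing.
    if (!function_exists('mb_strtolower')) {
        $text = strtolower($text);
    }

    return $text;
}

echo slugifySketch('Hello World!'), "\n";
echo slugifySketch('Red / Blue: 50%'), "\n";
echo slugifySketch("\u{102A1} Shop"), "\n";
```

Characters outside the whitelist, such as "!" and "%", are dropped, while the Carian character is kept as it is.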

Finally, you need to update the default routes of the Dispatcher class. To do so, add the following code at the end of the Dispatcher class constructor, after the call to "$this->loadRoutes();":

————————————————————————————————————–
public function __construct()
{
    // …
    // previous code does not change
    // …

    $this->loadRoutes();

    foreach ($this->default_routes as &$routes)
        foreach ($routes['keywords'] as &$keywords)
            $keywords['regexp'] = str_replace('\\pL', '\\pL\\p{Carian}', $keywords['regexp']);
    parent::__construct();
}
————————————————————————————————————–
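
To see what this loop does to a route, here is a small sketch that runs the same replacement on a plain array. The helper addScriptToRoutes() and the sample route are assumptions made for this illustration; the real route definitions live in the Dispatcher class and may differ between PrestaShop versions.

```php
<?php
// Hypothetical helper: appends \p{<script>} after every \pL found in the
// keyword patterns of the given routes, and returns the updated array.
function addScriptToRoutes(array $routes, $script)
{
    foreach ($routes as &$route) {
        foreach ($route['keywords'] as &$keyword) {
            $keyword['regexp'] = str_replace('\\pL', '\\pL\\p{' . $script . '}', $keyword['regexp']);
        }
        unset($keyword); // break the reference before the next iteration
    }
    unset($route);

    return $routes;
}

// Illustrative sample data with the same shape as the arrays walked by the loop.
$sampleRoutes = array(
    'product_rule' => array(
        'keywords' => array(
            'id'      => array('regexp' => '[0-9]+'),
            'rewrite' => array('regexp' => '[_a-zA-Z0-9\\pL-]*'),
        ),
    ),
);

$updated = addScriptToRoutes($sampleRoutes, 'Carian');
echo $updated['product_rule']['keywords']['rewrite']['regexp'], "\n";
```

Patterns that do not contain \pL, such as the one for the id keyword here, are left untouched.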

You have now overridden the main methods and are able to handle Carian characters in your shop URLs.
If you want to handle another language, you only need to add a new language condition to your regular expressions.

Using another language than Carian
Suppose we want to use Telugu, a language mostly spoken in south-central India, instead of Carian.
To do this, you only need to replace every instance of \p{Carian} with \p{Telugu} in the code.

For example:
————————————————————————————————————–
public static function isLinkRewrite($link)
{
    if (Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL'))
        return preg_match('/^[_a-zA-Z0-9\-\pL\p{Telugu}]+$/u', $link);
    return preg_match('/^[_a-zA-Z0-9\-]+$/', $link);
}
————————————————————————————————————–

Adding support for another language
Suppose that, in addition to Carian, we want to support Gujarati, the chief language of the Indian state of Gujarat.
In this case, you only need to repeat the PCRE code ( \p{language name} ), once with "Carian" and once with "Gujarati".
For example:
———————————————————————————————————–
public static function isLinkRewrite($link)
{
    if (Configuration::get('PS_ALLOW_ACCENTED_CHARS_URL'))
        return preg_match('/^[_a-zA-Z0-9\-\pL\p{Carian}\p{Gujarati}]+$/u', $link);
    return preg_match('/^[_a-zA-Z0-9\-]+$/', $link);
}
———————————————————————————————————–
In the method above, we have added both \p{Carian} and \p{Gujarati} to the regular expression, one after the other. To support more languages, follow the same example and add as many as needed.
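
If the list of languages grows, the pattern can be assembled from an array instead of being typed by hand. This is only a sketch: buildLinkRewritePattern() is a made-up helper, not a PrestaShop function, and the "\u{...}" strings assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper: builds the isLinkRewrite() pattern for a list of
// Unicode script names by appending one \p{...} item per script.
function buildLinkRewritePattern(array $scripts)
{
    $class = '_a-zA-Z0-9\-\pL';
    foreach ($scripts as $script) {
        $class .= '\p{' . $script . '}';
    }

    return '/^[' . $class . ']+$/u';
}

echo buildLinkRewritePattern(array('Carian', 'Gujarati')), "\n";

// U+0A97, U+0AC1, U+0A9C: two Gujarati letters with a vowel sign in between.
$gujarati = "\u{0A97}\u{0AC1}\u{0A9C}";
var_dump(preg_match(buildLinkRewritePattern(array('Gujarati')), $gujarati));
var_dump(preg_match(buildLinkRewritePattern(array()), $gujarati));
```

The last two calls show one reason the script code matters: \pL only covers letters, so the vowel sign U+0AC1, which is a combining mark, is accepted only once \p{Gujarati} is part of the pattern.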