Clean URL for Vietnamese pages

Guillaume Marty

Thursday 09 June 2011 9:22:19 am

I saw a topic and a bug related to this issue, but they date back to 2009.

The problem is that clean URLs are not generated for pages written in Vietnamese; they fall back to /content/view/full/ type URLs.

I installed the transformation file attached to the bug report and overrode transform.ini this way:

[Transformation]
Charsets[]=utf-8;vietnamese

[vietnamese]
Files[]=vietnamese.tr
Extensions[]

 

That almost works, but some characters are not caught by the transformation rules and are replaced by a hyphen.

Character: ệ
Rule in transformation file: U+1EC7 = "e"
Result: -
Expected result: e

Any ideas why not all characters are transformed?

Ivo Lukac

Thursday 09 June 2011 10:28:12 am

Hi

Try this custom URL translator. Place the file at "urlfilters/ngvietnamesefilter.php" in your extension, with the following content:

<?php
class nGVietnameseFilter extends eZURLAliasFilter
{
    // Maps '\uXXXX' keys (as produced by utf8ToUnicode()) to ASCII replacements.
    static $mappingArray = array(
        '\u00C0' => 'A', '\u1EA2' => 'A', '\u00C3' => 'A', '\u00C1' => 'A', '\u1EA0' => 'A',
        '\u1EB0' => 'A', '\u1EB2' => 'A', '\u1EB4' => 'A', '\u1EAE' => 'A', '\u1EB6' => 'A',
        '\u1EA6' => 'A', '\u1EA8' => 'A', '\u1EAA' => 'A', '\u1EA4' => 'A', '\u1EAC' => 'A',
        '\u00C8' => 'E', '\u1EBA' => 'E', '\u1EBC' => 'E', '\u00C9' => 'E', '\u1EB8' => 'E',
        '\u1EC0' => 'E', '\u1EC2' => 'E', '\u1EC4' => 'E', '\u1EBE' => 'E', '\u1EC6' => 'E',
        '\u00CC' => 'I', '\u1EC8' => 'I', '\u0128' => 'I', '\u00CD' => 'I', '\u1ECA' => 'I',
        '\u00D2' => 'O', '\u1ECE' => 'O', '\u00D5' => 'O', '\u00D3' => 'O', '\u1ECC' => 'O',
        '\u1ED2' => 'O', '\u1ED4' => 'O', '\u1ED6' => 'O', '\u1ED0' => 'O', '\u1ED8' => 'O',
        '\u1EDC' => 'O', '\u1EDE' => 'O', '\u1EE0' => 'O', '\u1EDA' => 'O', '\u1EE2' => 'O',
        '\u00D9' => 'U', '\u1EE6' => 'U', '\u0168' => 'U', '\u00DA' => 'U', '\u1EE4' => 'U',
        '\u1EEA' => 'U', '\u1EEC' => 'U', '\u1EEE' => 'U', '\u1EE8' => 'U', '\u1EF0' => 'U',
        '\u1EF2' => 'Y', '\u1EF6' => 'Y', '\u1EF8' => 'Y', '\u00DD' => 'Y', '\u1EF4' => 'Y',
        '\u00E0' => 'a', '\u1EA3' => 'a', '\u00E3' => 'a', '\u00E1' => 'a', '\u1EA1' => 'a',
        '\u1EB1' => 'a', '\u1EB3' => 'a', '\u1EB5' => 'a', '\u1EAF' => 'a', '\u1EB7' => 'a',
        '\u1EA7' => 'a', '\u1EA9' => 'a', '\u1EAB' => 'a', '\u1EA5' => 'a', '\u1EAD' => 'a',
        '\u00E8' => 'e', '\u1EBB' => 'e', '\u1EBD' => 'e', '\u00E9' => 'e', '\u1EB9' => 'e',
        '\u1EC1' => 'e', '\u1EC3' => 'e', '\u1EC5' => 'e', '\u1EBF' => 'e', '\u1EC7' => 'e',
        '\u00EC' => 'i', '\u1EC9' => 'i', '\u0129' => 'i', '\u00ED' => 'i', '\u1ECB' => 'i',
        '\u00F2' => 'o', '\u1ECF' => 'o', '\u00F5' => 'o', '\u00F3' => 'o', '\u1ECD' => 'o',
        '\u1ED3' => 'o', '\u1ED5' => 'o', '\u1ED7' => 'o', '\u1ED1' => 'o', '\u1ED9' => 'o',
        '\u1EDD' => 'o', '\u1EDF' => 'o', '\u1EE1' => 'o', '\u1EDB' => 'o', '\u1EE3' => 'o',
        '\u00F9' => 'u', '\u1EE7' => 'u', '\u0169' => 'u', '\u00FA' => 'u', '\u1EE5' => 'u',
        '\u1EEB' => 'u', '\u1EED' => 'u', '\u1EEF' => 'u', '\u1EE9' => 'u', '\u1EF1' => 'u',
        '\u1EF3' => 'y', '\u1EF7' => 'y', '\u1EF9' => 'y', '\u00FD' => 'y', '\u1EF5' => 'y',
        '\uFB00' => 'ff', '\uFB01' => 'fi', '\uFB02' => 'fl', '\uFB03' => 'ffi',
        '\uFB04' => 'ffl', '\uFB05' => 'ft', '\uFB06' => 'st',
        '\u00C2' => 'A', '\u00CA' => 'E', '\u00CE' => 'I', '\u00D4' => 'O', '\u00DB' => 'U',
        '\u00E2' => 'a', '\u00EA' => 'e', '\u00EE' => 'i', '\u00F4' => 'o', '\u00FB' => 'u',
        '\u01A0' => 'O', '\u01A1' => 'o', '\u01AF' => 'U', '\u01B0' => 'u'
    );

    // Converts a UTF-8 string to '\uXXXX' escapes; digits and ASCII from 'A' upward are kept.
    static function utf8ToUnicode( $str ) {
        $unicode = array();
        $values = array();
        $lookingFor = 1;
        for ( $i = 0; $i < strlen( $str ); $i++ ) {
            $thisValue = ord( $str[$i] );
            if ( $thisValue < ord( 'A' ) ) {
                // Below 'A': keep digits, percent-encode everything else.
                if ( $thisValue >= ord( '0' ) && $thisValue <= ord( '9' ) ) {
                    $unicode[] = chr( $thisValue );
                } else {
                    $unicode[] = '%' . dechex( $thisValue );
                }
            } else if ( $thisValue < 128 ) {
                $unicode[] = $str[$i];
            } else {
                // Multi-byte sequence: only 2- and 3-byte sequences are handled.
                if ( count( $values ) == 0 ) $lookingFor = ( $thisValue < 224 ) ? 2 : 3;
                $values[] = $thisValue;
                if ( count( $values ) == $lookingFor ) {
                    $number = ( $lookingFor == 3 )
                        ? ( ( $values[0] % 16 ) * 4096 ) + ( ( $values[1] % 64 ) * 64 ) + ( $values[2] % 64 )
                        : ( ( $values[0] % 32 ) * 64 ) + ( $values[1] % 64 );
                    $unicode[] = '\u' . strtoupper( str_pad( dechex( $number ), 4, '0', STR_PAD_LEFT ) );
                    $values = array();
                    $lookingFor = 1;
                }
            }
        }
        return implode( '', $unicode );
    }

    // Replaces each mapped character in $text; unmapped characters pass through unchanged.
    function process( $text, &$languageObject, &$caller ) {
        $outputText = '';
        // Split the text into individual UTF-8 characters.
        $textArray = preg_split( '/(?<!^)(?!$)/u', $text );
        foreach ( $textArray as $char ) {
            $unicodeChar = self::utf8ToUnicode( $char );
            $outputText .= array_key_exists( $unicodeChar, self::$mappingArray )
                ? self::$mappingArray[$unicodeChar]
                : $char;
        }
        return $outputText;
    }
}
?>
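
To see how the filter arrives at its lookup key, the arithmetic in utf8ToUnicode() can be worked through for the character from the first post, ệ. This is only a minimal standalone sketch, assuming the input is the precomposed form (bytes E1 BB 87); the helper name decodeThreeByteUtf8 is made up for illustration and is not part of the filter or of eZ Publish.

```php
<?php
// Hypothetical helper: decodes one 3-byte UTF-8 sequence to its code point,
// using the same modulo arithmetic as utf8ToUnicode() in the post above.
function decodeThreeByteUtf8( $bytes )
{
    $b0 = ord( $bytes[0] ); // lead byte: low 4 bits carry data
    $b1 = ord( $bytes[1] ); // continuation byte: low 6 bits carry data
    $b2 = ord( $bytes[2] ); // continuation byte: low 6 bits carry data
    return ( ( $b0 % 16 ) * 4096 ) + ( ( $b1 % 64 ) * 64 ) + ( $b2 % 64 );
}

$char = "\xE1\xBB\x87"; // assumed input: precomposed "ệ"
$codePoint = decodeThreeByteUtf8( $char );

// Build the key in the same '\uXXXX' format used by $mappingArray.
$key = '\u' . strtoupper( str_pad( dechex( $codePoint ), 4, '0', STR_PAD_LEFT ) );
echo $key, "\n";
```

With those bytes the arithmetic is 1 * 4096 + 59 * 64 + 7 = 7879, which is 0x1EC7, so the key matches the '\u1EC7' entry in the mapping array.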

Add the following lines to your site.ini:

[URLTranslator]
Extensions[]={YOUR EXTENSION NAME}
Filters[]=nGVietnameseFilter

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Guillaume Marty

Tuesday 14 June 2011 5:35:45 am

Thanks for your reply, but it didn't work for me.

First, I tried to do what you described.

Then I regenerated the autoloads array and tried:

[URLTranslator]
FilterClasses[]=nGVietnameseFilter

(The Extensions and Filters settings are deprecated now.)

But it didn't work either. It looks like the characters are already transformed incorrectly before the filter runs. I'm still investigating.
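
One way to check that suspicion is to dump the code points of the string at the point where the filter receives it. The sketch below is a standalone debugging aid under stated assumptions: the function names codePointOf and dumpCodePoints are hypothetical, and the input is assumed to be well-formed UTF-8. One possible cause, not confirmed anywhere in this thread, is that the text arrives in decomposed form, in which case a rule keyed on U+1EC7 would never match.

```php
<?php
// Hypothetical helper: returns the code point of a single UTF-8 character.
// Assumes well-formed input; no validation is done.
function codePointOf( $char )
{
    $first = ord( $char[0] );
    if ( $first < 0x80 ) return $first;
    // The lead byte gives the sequence length and the first payload bits.
    if ( $first < 0xE0 )     { $len = 2; $cp = $first & 0x1F; }
    elseif ( $first < 0xF0 ) { $len = 3; $cp = $first & 0x0F; }
    else                     { $len = 4; $cp = $first & 0x07; }
    // Each continuation byte contributes 6 more bits.
    for ( $i = 1; $i < $len; $i++ )
        $cp = ( $cp << 6 ) | ( ord( $char[$i] ) & 0x3F );
    return $cp;
}

// Hypothetical helper: lists the code points of a UTF-8 string as "U+XXXX".
function dumpCodePoints( $text )
{
    $points = array();
    // The /u modifier makes PCRE split on UTF-8 characters rather than bytes.
    foreach ( preg_split( '//u', $text, -1, PREG_SPLIT_NO_EMPTY ) as $char )
        $points[] = sprintf( 'U+%04X', codePointOf( $char ) );
    return implode( ' ', $points );
}

$precomposed = "\xE1\xBB\x87";      // "ệ" as a single code point
$decomposed  = "e\xCC\xA3\xCC\x82"; // "e" + combining dot below + combining circumflex

echo dumpCodePoints( $precomposed ), "\n"; // U+1EC7
echo dumpCodePoints( $decomposed ), "\n";  // U+0065 U+0323 U+0302
```

If the dump shows the second pattern, or a hyphen already in place of the character, the filter's mapping is not the part that fails.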

Ivo Lukac

Tuesday 14 June 2011 5:50:21 am

Hi,

Send me your email address via the "Direct contact" form (http://share.ez.no/authorcontact/form/9504) and I'll send you the files. Copying and pasting the code from the post may have corrupted it.

