
utf-8 [2014-12-12 08:16] – removed [2017-11-13 02:49]
Line 1: Line 1:
 +====== UTF-8 Encoding ======
 +[[DokuWiki]] now stores all its data in UTF-8. To avoid problems, the filenames of the data files themselves are [[phpfn>urlencode|URL-encoded]] when saved. DokuWiki versions older than release 2005-02-06 used different encodings, so the data files need to be [[tips:utf8update|reencoded]] when the software is updated. Switching to a charset other than UTF-8 is **not** supported.
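To illustrate what URL-encoding a UTF-8 filename looks like, here is a minimal Python sketch. It is not DokuWiki's own code (DokuWiki uses PHP's ''urlencode''); the helper names ''encode_page_name'' and ''decode_page_name'' are hypothetical, and Python's ''quote_plus'' is assumed to be only a close approximation of the PHP function.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_page_name(name: str) -> str:
    """Percent-encode the UTF-8 bytes of a page name (hypothetical helper)."""
    return quote_plus(name, encoding="utf-8")


def decode_page_name(encoded: str) -> str:
    """Reverse encode_page_name and return the original Unicode string."""
    return unquote_plus(encoded, encoding="utf-8")


if __name__ == "__main__":
    encoded = encode_page_name("übung")
    print(encoded)                    # %C3%BCbung
    print(decode_page_name(encoded))  # übung
```

The letter ''ü'' is stored as the two UTF-8 bytes ''C3 BC'', so the encoded name contains only ASCII characters and is safe on file systems that do not handle Unicode names well.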
 +===== Browser Setup for UTF-8 =====
 +All modern browsers handle UTF-8 encoded web pages; it is one of the few things that actually works as expected in most browsers. If your browser doesn't display some characters correctly, you are probably missing the correct Unicode fonts.
 +Windows users should install the ''Arialuni.TTF'' font from Microsoft. It is included in Microsoft's Office Suite.
 +[[|Debian]] users can read my [[notes>debianfonts|page on fonts]] to learn how to install Unicode fonts correctly.
 +  * [[wp>Unicode and HTML]]
 +  * [[|Configuring browsers for Unicode]]
 +===== Editing Files =====
 +{{ wiki:np2-bom.png|Save without a BOM in Notepad 2}}
 +If you intend to edit the data files directly or want to create a [[Localization|translation]], you need to use a UTF-8 aware editor. There are a lot of capable editors out there; I just want to recommend two small, simple, and free ones here in case you still need one ((This is intended neither as a complete list of Unicode editors nor as a selection of the best available choices. These are just two small editors I like. Please do **not** add more editors.)):
 +  * [[|TEA]] -- a GTK2 based editor for GNU/Linux
 +  * [[|Notepad2]] -- a very good notepad replacement for Windows
 +Please note: DokuWiki does __not__ use a [[wp>Byte Order Mark]] and you should make sure your software doesn't, either (especially when editing the PHP and config files).
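If you are unsure whether your editor wrote a BOM, you can check the first three bytes of the file. The following Python sketch shows one way to detect and strip a UTF-8 BOM; the function names are hypothetical and not part of DokuWiki.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

To apply this to a file, read it in binary mode (''open(path, "rb")''), pass the bytes to ''strip_utf8_bom'', and write the result back in binary mode.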
 +===== Batch Encoding Files =====
 +  * On Unix use
 +  * On Windows use recode, a port of iconv:
 +    * Example of a simple conversion on a computer with a French locale:<code>recode lat1..u8 test.txt</code>where ''lat1'' is the source charset and ''u8'' is the target charset (UTF-8).
 +    * To convert all the files in a directory and its sub-directories on Windows, use this in a batch file:<code>FOR /F "tokens=*" %%G IN ('dir/b/S/X ^"C:\yourpath\*.txt^"') DO recode -v lat1..u8 %%~sG</code>
 +  * More explanation here: [[|the link]]
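If recode is not available, the same batch conversion can be sketched in a few lines of Python. This is an illustrative alternative, not something DokuWiki ships: the function name ''convert_tree'' and its parameters are hypothetical, and the source charset is assumed to be Latin-1 (the ''lat1'' of the recode example).

```python
from pathlib import Path


def convert_tree(root, pattern="*.txt", source_encoding="latin-1"):
    """Re-encode every matching file below root to UTF-8, in place.

    Returns the list of converted paths. The files are assumed to be in
    source_encoding; running this twice on the same tree would
    double-encode non-ASCII characters.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_bytes().decode(source_encoding)
        # Write raw bytes so that no BOM and no newline translation is added.
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

Call it as ''convert_tree("C:/yourpath")'' on a copy of your data first, since the files are overwritten in place.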
 +===== Examples =====
 +Below are some examples of UTF-8 characters to check your browser((copied from
 +Zodiac Signs: ♈ ♉ ♊ ♋ ♌ ♍ ♎ ♏ ♐ ♑ ♒ ♓
 +A chessboard:
 +^   ^ A ^ B ^ C ^ D ^ E ^ F ^ G ^ H ^
 +^ 8 | ♜ | ♞ | ♝ | ♛ | ♚ | ♝ | ♞ | ♜ |
 +^ 7 | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ |
 +^ 6 |   |   |   |   |   |   |   |   |
 +^ 5 |   |   |   |   |   |   |   |   |
 +^ 4 |   |   |   |   |   |   |   |   |
 +^ 3 |   |   |   |   |   |   |   |   |
 +^ 2 | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ |
 +^ 1 | ♖ | ♘ | ♗ | ♕ | ♔ | ♗ | ♘ | ♖ |
 +Russian (по-русски):
 +  По оживлённым берегам
 +  Громады стройные теснятся
 +  Дворцов и башен; корабли
 +  Толпой со всех концов земли
 +  К богатым пристаням стремятся;
 +Ancient Greek:
 +Αρχαίο Πνεύμα Αθάνατον!  
 +Ἰοὺ ἰού· τὰ πάντʼ ἂν ἐξήκοι σαφῆ.
 +  Ὦ φῶς, τελευταῖόν σε προσϐλέψαιμι νῦν,
 +  ὅστις πέφασμαι φύς τʼ ἀφʼ ὧν οὐ χρῆν, ξὺν οἷς τʼ
 +  οὐ χρῆν ὁμιλῶν, οὕς τέ μʼ οὐκ ἔδει κτανών.
 +Modern Greek:
 +  Η σύγχρονη Ελλάδα, έχει να παρουσιάσει δυναμικό
 +  έργο στον τομέα του πολιτισμού, των τεχνών και
 +  των γραμμάτων. Αντίστοιχα δυναμική είναι η παρουσία
 +  των Ελλήνων επιχειρηματιών στην διεθνή οικονομική
 +  και βιομηχανική σκηνή.
 +Sanskrit:
 +  पशुपतिरपि तान्यहानि कृच्छ्राद्
 +  अगमयदद्रिसुतासमागमोत्कः । 
 +  कमपरमवशं न विप्रकुर्युर्
 +  विभुमपि तं यदमी स्पृशन्ति भावाः ॥
 +Hindi:
 +  गूगल समाचार हिन्दी में
 +Korean:
 +  한글은 아름다운 우리글입니다.
 +  곱고 아름답게 사용하는 것이 우리의 의무입니다.
 +Traditional Chinese:
 +  子曰:「學而時習之,不亦說乎?有朋自遠方來,不亦樂乎?
 +  人不知而不慍,不亦君子乎?」
 +  有子曰:「其為人也孝弟,而好犯上者,鮮矣;
 +  不好犯上,而好作亂者,未之有也。君子務本,本立而道生。
 +  孝弟也者,其為仁之本與!」
 +Simplified Chinese:
 +  子曰:「学而时习之,不亦说乎?有朋自远方来,不亦乐乎?
 +  人不知而不慍,不亦君子乎?」
 +  有子曰:「其为人也孝弟,而好犯上者,鲜矣;
 +  不好犯上,而好作乱者,未之有也。君子务本,本立而道生。
 +  孝弟也者,其为仁之本与!」
 +Japanese:
 +  「秋の田の かりほの庵の 苫をあらみ わが衣手は 露にぬれつつ」 天智天皇
 +  「春すぎて 夏来にけらし 白妙の 衣ほすてふ 天の香具山」 持統天皇
 +  「あしびきの 山鳥の尾の しだり尾の ながながし夜を ひとりかも寝む」 柿本人麻呂 
 +Latvian:
 +  Iedomu jaukie ideāli,
 +  Vecākie principi, tikla, mīla - 
 +  Dienas allažības priekšā
 +  Šķīst kā graudi akmeņstarpā.
 +  Glāžšķūņa rūķīši jautri dziedādami čiepj koncertflīģeļa vāku. 
 +Simplified Chinese:
 +  这是简体字汉语。 zhè shì jiǎn tǐ zì hàn yǔ 
 +Armenian:
 +  Հարգանքներիս հավաստիքը Հայ Ժողովրդին:
 +  Ամենալավ օրենքները չեն օգնի, եթե մարդիկ բանի պետք չեն:
 +Persian:
 +  بنی‌آدم اعضای یک‌دیگرند / که در آفرینش ز یک گوهرند
 +Hebrew:
 +  המשפט עם הזכוכית שאפשר לאכול בלי שזה מפריע, לא זוכר איך הוא הולך
utf-8.txt · Last modified: 2017-11-13 02:49 by

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International