
IBM Teaches Netscape To Speak Hebrew


Shoshannah Forbes


IBM has released a version of Netscape 4.61 with BiDi support,

enabling it to display Hebrew and Arabic web pages natively.

This gives site developers cheaper and more advanced ways of creating sites in these languages.

Note: Since I work mainly with Hebrew web sites in the Hebrew market, this article will focus on the Hebrew side of things. In Arabic, matters are similar but not identical, with some additional problems unique to the Arabic language. I would love to hear comments from people who create Arabic-language web sites.

S.L.F

What's so tough about Hebrew and Arabic?

Hebrew and Arabic are Bi-Directional languages (BiDi for short). This means that

while most of the text is written from right-to-left, some of the text (like

numbers) is written from left to right.
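One way to see this mixed directionality is through the bidirectional class that Unicode assigns to every character. The sketch below is only an illustration using Python's standard unicodedata module; the helper name bidi_classes and the sample string are made up for this example and are not part of any browser.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Unicode bidirectional class) pairs for text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical sample: the Hebrew letters alef and bet, a space, then "13".
sample = "\u05d0\u05d1 13"

for ch, cls in bidi_classes(sample):
    print(repr(ch), cls)
```

Hebrew letters report the class R (right-to-left) and European digits report EN (European number), which is why a number embedded in a Hebrew sentence still runs left to right.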

Historically, since Netscape lacked any kind of Hebrew support, a kludgy workaround called "Visual Hebrew" was developed. In general, it had two parts:

On the Client side, the user needed to

install a "Web View" font, which has a Western

encoding but includes Hebrew glyphs (and most of the

web view fonts are of very low quality).

On the Developer side, the developer had to use certain techniques to make the page readable with the web view font:

  1. All Hebrew text must be reversed, while leaving
    any numbers or English text intact. For example
    (with English standing in for the Hebrew text),
    the sentence:
    "I love Lucy and will meet with her on May 13"
    would become
    "13 yaM no reh htiw teem lliw dna ycuL evol I"
  2. All line breaks must be hard coded into the HTML

    - you can not let the browser wrap long lines, since

    then the words will become out of order.
  3. All text must be manually aligned to the right -

    either with <p align="right"> or with

    tables.
  4. You cannot use lists (<ol> or <ul>),

    since they would be indented to the left instead

    of to the right.
  5. You cannot define font faces (either via CSS or

    via the <font> tag), since the Hebrew fonts

    on the system are logical fonts, and would not

    work with web pages.
  6. Some elements, like forms and page titles, are
    displayed by the browser directly through the OS,
    which means that they have to be written
    differently, since the OS uses logical Hebrew. (In
    logical Hebrew, the data is stored in the order it
    was entered, with a flag marking the
    directionality. When the data is processed and
    displayed, the OS uses that flag to keep the
    correct direction of the element.)
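The first two rules can be sketched in a few lines of Python. This is a simplified illustration, not the tool any site actually used: the function names to_visual and to_visual_lines and the keep pattern are assumptions made for this example, and it is not an implementation of the Unicode BiDi algorithm.

```python
import re
import textwrap


def to_visual(line, keep=r"[0-9]+"):
    """Reverse one line, then restore runs matching `keep` to reading order."""
    reversed_line = line[::-1]
    return re.sub(keep, lambda m: m.group(0)[::-1], reversed_line)


def to_visual_lines(text, width, keep=r"[0-9]+"):
    """Hard-wrap the logical text first, then reverse each line on its own.

    Wrapping must happen before reversal (rule 2): if the browser wrapped
    an already reversed line, the words would end up out of order.
    """
    return [to_visual(line, keep) for line in textwrap.wrap(text, width)]


# English stands in for Hebrew here, so only the digits are kept intact.
print(to_visual("I love Lucy and will meet with her on May 13"))
# prints: 13 yaM no reh htiw teem lliw dna ycuL evol I
```

For real Hebrew text, a pattern such as keep=r"[0-9A-Za-z]+" would also leave embedded English words readable. The sketch still gets multi-word English phrases and numbers like "3.14" wrong, which hints at why the flipping functions mentioned in this article were a real cost.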

It is rather obvious that this visual method has

huge shortcomings, both on the user side (you can

not copy and paste directly from web pages, and the

browser search function is useless) and on the

developer side (the extra cost of converting

existing documents to the visual encoding, the

limitations of design, and a need to add an extra

Hebrew flipping function to any data that is going

in or out of a database or being accepted from the

user).

Support by the browsers

Microsoft, with version 3 of Internet Explorer, introduced a separate "Hebrew Enabled" version. On Hebrew operating systems it uses the Unicode BiDi algorithm to display visually encoded web pages with any system font, and it adds support for "Logical" web pages. These work similarly to the OS: authors can "flag" the directionality of elements, and the browser renders both Right-To-Left (RTL) and Left-To-Right (LTR) elements properly.

In version 5 of Internet Explorer, Microsoft went one step further, allowing anyone, on any language version of Windows, to view Hebrew web pages, both logically and visually encoded (unfortunately, Mac IE5 has no Hebrew support). However, to be able to write in Hebrew (for example in web forms), the user still needs a Hebrew-supporting OS (such as Windows 2000 with the Hebrew language pack installed).

The W3C's HTML 4 spec also incorporated the Unicode BiDi algorithm, introducing among other things the DIR (direction) attribute, which can be added to an element to mark its directionality (RTL or LTR), and the &lrm; (Left-to-Right Mark) and &rlm; (Right-to-Left Mark) entities, which control the directionality of single characters.
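As a rough illustration of what a logical page generator might emit, the Python sketch below wraps text in a paragraph with an explicit dir attribute and resolves the two entities to their Unicode characters. The function name paragraph is hypothetical; only the dir attribute and the two entities come from HTML 4.

```python
from html import escape, unescape

# The two directional marks, resolved from their HTML entity names.
LRM = unescape("&lrm;")  # U+200E LEFT-TO-RIGHT MARK
RLM = unescape("&rlm;")  # U+200F RIGHT-TO-LEFT MARK


def paragraph(text, direction="rtl"):
    """Wrap text in a <p> element carrying an explicit dir attribute."""
    if direction not in ("rtl", "ltr"):
        raise ValueError("direction must be 'rtl' or 'ltr'")
    return f'<p dir="{direction}">{escape(text)}</p>'


# Logical order: the Hebrew is stored as typed; the browser does the layout.
print(paragraph("\u05e9\u05dc\u05d5\u05dd"))
```

Unlike the visual method, nothing is reversed here: the text stays in the order it was entered, and the dir attribute plays the role of the directionality flag.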

All this time, the Netscape browser continued to lack any BiDi support whatsoever.

This caused an interesting chicken-and-egg problem. While about 80% of users were using IE, web sites did not want to lose the other 20%, so they continued to use visual Hebrew encoding for their pages (even the Microsoft Israel web site continued to use visual Hebrew for a surprisingly long time). Of course, the fact that most web pages were written visually, and were therefore viewable with Netscape, did not give end users any real reason to cry out for BiDi support in their browser. The problem of copying and pasting to and from web pages was solved by a booming market of utilities and applications that did just that.

Until last week.

Last week IBM released a version of Netscape 4.61, licensed from Netscape, to which they added BiDi support.

The IBM version, Netscape 4.61i, includes the full Communicator suite, but only the browser has BiDi support (unlike IE, where the full package - the browser, FrontPage Express and Outlook Express - supports Hebrew).

The user interface has no Hebrew option (again, unlike IE, which has a Hebrew interface available for users of localized Hebrew Windows), but the browser is finally aware of BiDi.

No more need to define a special web view font in

order to view Hebrew web pages - any Hebrew font

installed on the system will do.

The fonts for Hebrew are

defined independently from fonts for other

languages. The user can, for example, define

Trebuchet MS (which has no Hebrew glyphs) as

his/her default Latin1 font, and Arial Hebrew as

his/her default Hebrew font.

There is a whole new section in the preferences for defining BiDi options such as the default direction of a web page (LTR or RTL), the default user encoding, etc.

Sites that have no encoding defined, or have an incorrect encoding defined, can be viewed by switching to the correct character set from the new encoding menu. This time, it has all four Hebrew character sets:

  1. Hebrew logical (windows-1255)
  2. Hebrew implicit (ISO-8859-8-I, similar but not identical to the one above)
  3. Hebrew visual
  4. Hebrew DOS (which is almost not in use)
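To see why picking the right entry in the encoding menu matters, the Python sketch below encodes one Hebrew word with three of Python's built-in codecs. The pairing of codec names with the menu entries is an assumption (in particular, that "Hebrew DOS" means code page 862), and the helper name encode_all is made up for this example.

```python
# Python codec names; the mapping to the browser's menu labels is assumed.
CODECS = {
    "windows-1255": "cp1255",
    "ISO-8859-8": "iso8859_8",
    "Hebrew DOS (assumed to be code page 862)": "cp862",
}

word = "\u05e9\u05dc\u05d5\u05dd"  # the word "shalom", in logical order


def encode_all(text):
    """Encode text with every codec above and return label -> bytes."""
    return {label: text.encode(codec) for label, codec in CODECS.items()}


for label, raw in encode_all(word).items():
    print(label, raw.hex())

# Reading the DOS bytes as windows-1255 does not give the word back.
print(word.encode("cp862").decode("cp1255", errors="replace"))
```

The Hebrew letters share the same byte values in windows-1255 and ISO-8859-8, while the DOS code page puts them elsewhere, so a page labelled with the wrong character set shows the wrong characters until the user switches encodings by hand.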

Yes, logical Hebrew is finally here in Netscape.

It still suffers some bugs, but it works well with

most of my test pages.

I should note, though, that the MSN Israel web site (http://www.msn.co.il/homepage.asp), the only major web site written in logical Hebrew, caused Hebrew Netscape to crash consistently. Is it the web site? Is it something in logical Hebrew? Is it the browser? At the moment I haven't done enough testing to determine which.

One issue I did find, though, is that the User Agent string of this browser is identical to that of any Netscape 4.6 international browser, so there is no way to tell from standard server logs how many of the Netscape visitors to a site actually have Hebrew support.

IBM apparently will be basing their work on Hebrew support in the Mozilla project upon the work they have done here, but AOL/Netscape has as yet not said a word about their plans, if any, for including the BiDi support code in the upcoming Netscape 6.


An expert in all things bi-directional, especially the handling of right-to-left text like Hebrew and Arabic, Shoshannah hunts bugs for a living, both in web-based applications and in cross-platform desktop applications.


Her web site can be found at www.xslf.com.


Evolt.org is an all-volunteer resource for web developers made up of a discussion list, a browser archive, and member-submitted articles. This article is the property of its author; please do not redistribute or use elsewhere without checking with the author.