VOICE Home Page: http://www.os2voice.org
December 2001
By Christian Langanke © December 2001
After gaining more experience with the Apache
web server, I learned about a dynamic technique called MultiViews, which makes
it unnecessary to hardcode language information in the HTML code. MultiViews are an
implementation of the so-called "content negotiation" within Apache, as
defined in the HTTP/1.1 specification as well as in some RFCs. This means that the
server and the browser negotiate which variant of a requested
document the server should return if several variants are available.
MultiViews are designed to distinguish not only between languages, but also between other types of content (for example between HTML, plain text and PostScript). Supporting automatic language selection is probably the most common use of this feature, though, and it is the only one addressed here. When I found out that my internet provider also uses Apache, I began testing how MultiViews work and very quickly changed my website to use this feature. With this article I want to share my experience with you.
This article, however, deals with the Apache approach only. To use Apache MultiViews on your website, you need either a provider that uses the Apache web server or your own machine running Apache. Of these two cases, this article covers only the first one, explaining in detail how you can make use of MultiViews on the server of your provider, and what you could ask your provider for in case this feature is not yet configured on that web server. With this information an advanced Apache user should also be able to configure their own server without problems.
By the way, in rare cases Apache is configured to display a modified error message,
for example to support error messages in different languages. In that case the admin of
your server is using exactly the technique described in this article. If such
a message does not mention the web server being used, you have to send the provider's
admin an email and just ask.
[Editor's note: An alternative method is to go to http://uptime.netcraft.com/up/graph
and enter the URL and it will tell you the server and operating system being used.]
If your provider does not use the Apache server, and you do not want to use conventional HTML links to distinguish between languages, there are other ways to support automatic language selection on the server side. These are based on different mechanisms such as CGI and PHP and are not discussed here.
With Apache-driven websites supporting MultiViews this is no longer an issue.
All you have to do as a user in order to take advantage of MultiViews is simply
install a browser in your native language. Then the website automatically and seamlessly
takes care of your language. But how is that accomplished?
Did you ever notice the language settings dialog in the preferences of your browser
and possibly think "well, what is this being used for anyway?" You
may have guessed it already: it tells your browser which language or list
of languages you prefer to receive from web servers. Mostly you do not even have
to maintain this list, as a web browser usually adds its own language to this
list during installation. So if you install a web browser in your native language,
the browser will automatically treat it as your preferred language.
The browser sends the list of preferred languages to a web server with every
request. The bad news is: most servers and/or the websites hosted on them just ignore
this list. The good news is: websites supporting Apache MultiViews will make use
of it. If the requested page is available in one of your preferred languages, the
page is returned in the first preferred language supported. If not, the page is
returned in a fallback language (usually English).
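To make the selection rule above more tangible, here is a minimal Python sketch. It is not Apache's actual negotiation algorithm, only an illustration of the idea: the list of preferred languages arrives in the Accept-Language request header, optionally with quality values (q=...), and the first preferred language that is available wins. The function names are made up for this example, and language tags are compared by exact match only, so a preference such as de-DE does not match de here.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed value as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag.strip().lower()))
    weighted.sort()
    return [tag for _, _, tag in weighted]


def choose_language(header, available, fallback="en"):
    """Pick the first preferred language that is available, else the fallback."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
    return fallback
```

With a site offering English and German, choose_language("fr, de;q=0.5", {"en", "de"}) selects German, while a browser asking only for Japanese ends up with the English fallback.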
As a result, as a user you will perhaps not even notice that a given web site
supports multiple languages; instead you are simply shown the language that fits
your needs best. No more clicking on flags, no more chasing links to read a website
in your favourite language, just enter the web site and start to read...
The rest of this article deals with the implementation of that neat feature on the server side. If you are a user only and/or don't care about hosting a multilingual website, you are through by now and will possibly just be glad about every website that serves you well using MultiViews. If you own a multilingual website or are just interested in more detail, here is the rest of the story.
If you use MultiViews instead, managing content in different languages is completely transparent to the HTML code, as the server does all the work required to distinguish between content in different languages. Here is how it works.
The huge difference from other techniques is that the browser does not select between
files with content in several languages according to the links within your HTML
code. Instead the server checks for an additional filename extension to distinguish
between language variants. Moreover, this method is completely transparent to the
web browser, as the language identifiers are not even sent to the browser. This
results in a very convenient side effect: you can link within your files without
specifying the language identifiers, having exactly the same links for every language
and thus avoiding errors during translation.
The above first sounds like one could add any kind of file extensions to a filename,
but that is not true - Apache only supports known or additionally configured MIME
types or language identifiers.
In the sample coming up now, we use en as the identifier for the English
language and de as the identifier for the German language, as defined in
RFC 1766. Refer to the language selection dialog within your browser settings to
see more language identifiers.
Now let's think of a single file named index.html to make explanation easier.
In order to support English and German language, you normally would have files on
your web space like
Now prepare for MultiViews by naming the files like this, where English is the
fallback language:
    index.html.en
    index.html.de
    index.html.html
While the purpose of the first two files is obvious, why do we need the third
file? This one is returned by Apache if the user prefers neither English nor German
but some other language; unfortunately Apache does not cooperate
in a sensible way here. In order to provide a file with a fallback variant we may
not use the name index.html and, for obvious reasons, we also cannot use
any other language identifier as an additional filename extension. Instead we may
only use a valid MIME type extension here, and logically we have to reuse the one
that properly describes the contents of the file, which is .html again.
The same scheme applies to PHP files; in this example this would result in the
name index.php.php for the fallback variant.
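The naming scheme can be summarized in a few lines of Python. This is only a sketch of the scheme as described above, not a description of how Apache looks up files internally. The function name pick_variant is made up, and the list of preferred languages is assumed to be already sorted, best first.

```python
def pick_variant(requested, files, preferred):
    """Return the stored file name to serve for `requested`, or None."""
    # First choice: a variant carrying one of the preferred language identifiers.
    for lang in preferred:
        candidate = f"{requested}.{lang}"
        if candidate in files:
            return candidate
    # Fallback: the type extension is repeated, e.g. index.html -> index.html.html
    extension = requested.rsplit(".", 1)[-1]
    fallback = f"{requested}.{extension}"
    return fallback if fallback in files else None
```

For the three sample files, a request for index.html with the preferences fr and de yields index.html.de, while a request preferring only ja yields index.html.html.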
On your own Unix server you will most likely use a symbolic link within the filesystem
to access index.html.en through index.html.html, so that you don't
have to provide the English version twice. On an OS/2 machine, or when you don't
have telnet access to your Unix-based server to create such links, you are required
to use a copy for the fallback variants or use server-side includes, unfortunately
wasting some hard disk space.
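If you have many pages, creating the fallback variants by hand gets tedious. The following Python sketch shows one way to automate it, assuming that Python is available on the machine holding the files and that English is the fallback language. The function names are made up for this example. It tries a symbolic link first and falls back to a plain copy where links are not supported.

```python
import os
import shutil


def fallback_name(variant, fallback_lang="en"):
    """Map 'index.html.en' to 'index.html.html'; return None for other names."""
    suffix = "." + fallback_lang
    if not variant.endswith(suffix):
        return None
    base = variant[: -len(suffix)]        # e.g. 'index.html'
    extension = os.path.splitext(base)[1]  # e.g. '.html'
    if not extension:
        return None
    return base + extension               # e.g. 'index.html.html'


def create_fallbacks(directory, fallback_lang="en"):
    """Create a fallback variant next to every '<name>.<ext>.<lang>' file."""
    created = []
    for name in sorted(os.listdir(directory)):
        target = fallback_name(name, fallback_lang)
        if target is None:
            continue
        target_path = os.path.join(directory, target)
        if os.path.lexists(target_path):
            continue  # leave existing files and links alone
        try:
            # Relative link, so it stays valid if the directory is moved.
            os.symlink(name, target_path)
        except (OSError, NotImplementedError):
            # No symbolic links available here: use a copy instead.
            shutil.copyfile(os.path.join(directory, name), target_path)
        created.append(target)
    return created
```

Note that a copy has to be refreshed whenever the English file changes, whereas a symbolic link always follows the original.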
But all in all: isn't that simple?
To find out if that is the case, just upload test files according to the above
sample onto your webspace and request index.html within your browser (without
appending the language identifier to the filename!). Further, play around with the
language settings within the preferences of your browser and reload the document
after each change:
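If changing the browser preferences again and again is too cumbersome, the same test can be scripted. The sketch below uses Python's standard urllib module to request a page with a chosen Accept-Language header and reports the Content-Language header of the response, provided the server sends one. The function names are made up for this example, and any URL you pass in must of course point to your own test files.

```python
import urllib.request


def build_request(url, languages):
    """Build a GET request that announces the given language preferences."""
    header = ", ".join(languages)
    return urllib.request.Request(url, headers={"Accept-Language": header})


def fetch_language(url, languages, timeout=10):
    """Return the Content-Language header of the response, or None if absent."""
    request = build_request(url, languages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Language")
```

Calling fetch_language for the address of your index.html with ["de"], ["en"] and then a language you do not provide, such as ["fr"], should show which variant the server picks in each case.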
In this situation hopefully another option is left for you: you can modify Apache's
behaviour for the files residing in your web space, provided that at least this
is allowed within the main configuration of Apache. You do that by creating an additional
configuration file named .htaccess and uploading it to your
webspace. You can have one per directory, but since settings are passed on to subdirectories,
one .htaccess file in the top level directory of your web tree is
sufficient in most cases. A common exception to this rule is an .htaccess
file that restricts access to a certain subdirectory; such a file is placed only into
that directory. Note that the value of a directive in the .htaccess
file of a subdirectory will override the value from the .htaccess
file in the parent directory, and so forth.
The .htaccess file may contain a lot of different directives.
To name the most important uses, you can configure Apache for example to
If you define other options as well, specify them together on one line looking like one of these examples:
    Options MultiViews
    Options MultiViews <other_option> <other_option>
or
    Options <other_option> MultiViews <other_option>
Now upload a file named .htaccess containing the options line to the directory where the files of the above sample reside.
IMPORTANT: Do not edit the .htaccess file on
your local system with an editor that appends an end-of-file character (ASCII code
26) to the file, like the Tiny Editor (TEDIT.EXE) and the OS/2 system editor (E.EXE)
do. Apache will ignore such a file even when uploaded in ASCII mode!
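If you are not sure whether your editor has appended such a character, you can check and clean the file before uploading it. The following Python sketch removes trailing end-of-file characters (ASCII code 26) and leaves everything else untouched. It assumes Python is available on your local system, and the function names are made up for this example.

```python
EOF_MARKER = b"\x1a"  # ASCII code 26 (Ctrl-Z)


def strip_eof_marker(data):
    """Return the file contents without trailing end-of-file characters."""
    return data.rstrip(EOF_MARKER)


def clean_file(path):
    """Rewrite the file in place if needed and report whether it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_eof_marker(original)
    if cleaned != original:
        with open(path, "wb") as handle:
            handle.write(cleaned)
    return cleaned != original
```

Running clean_file(".htaccess") in the directory holding the file returns True if an end-of-file character was found and removed.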
Now test the MultiViews by accessing index.html of the above sample.
The following may occur:
If he agrees, very good! If not, I can think of only two reasons why he might
not want to do this:
Note:
It is not sufficient to specify the option All, which turns on a certain set of other options - the MultiViews option is not included there. It has to be named explicitly:
    Options MultiViews <other_option> <other_option>
The migration method depends on how you have used language identifiers embedded
within file and/or directory names until now:
are to be copied to
and a copy of, or symbolic link to news/page01.html.en is to be created
with the name of
to
Also here a copy of, or a symbolic link to news/page01.html.en is required
under the name of
Advantages are:
You will most likely convert complete subdirectories, but in the following example
we have migrated and non-migrated files in one subdirectory in order to make clear
that this really can be done file by file. Here the first two pages are available
in two languages and the fallback variant, while the last two pages are available
in only one language. This really allows smooth migration:
From a webmaster's point of view: whenever I have to support a multilingual website
in the future, I will try to avoid any other method at all costs, because MultiViews
are flexible, intuitive and far less error-prone than any other method that I know.
They can also easily be combined with other techniques such as PHP scripts;
in this case no language parameter is required anymore.
From a user's point of view: I also regard highly that MultiViews take care of
my language preferences, and it is very simple for users to configure the language
list within the preferences of their web browser (if that is required in rare cases
anyway).
Now that I know the MultiViews technique, which can seamlessly and automatically integrate multiple languages into a website, I would use flags and other hardcoded links only if I did not have an Apache web server at all. I think that most users do not care how many languages a website supports, as long as their preferred language or the best alternative to it is selected automatically. So in my opinion the best integration of multiple languages into a website is the one that you hardly notice, if at all.
References:
On his homepage he provides several self-created OS/2 programs for free. He is also the author of the forthcoming Team OS/2 Internet Assistant for OS/2 and eComStation and supports OS/2 Netlabs in providing CVS services for OS/2 internet projects such as Odin and Everblue with his Netlabs Open Source Archive Client and Administrator packages.