Friday, July 4, 2008

Multi-Language Translation Using AJAX

Developer's Guide

With the AJAX Language API, you can translate and detect the language of blocks of text within a webpage using only JavaScript.

The API is new, so there may be bugs and slightly less than perfect documentation. Bear with us as we fill in the holes, and join the AJAX APIs developer forum to give feedback and discuss the API.

Audience

This documentation is designed for people familiar with JavaScript programming and object-oriented programming concepts. There are many JavaScript tutorials available on the Web.

Introduction

The "Hello, World" of the Google AJAX Language API

The easiest way to start learning about this API is to see a simple example. The following example will detect the language of the given text and then translate it to English.

<html>
<head>
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">

google.load("language", "1");

function initialize() {
  var text = document.getElementById("text").innerHTML;
  google.language.detect(text, function(result) {
    if (!result.error && result.language) {
      google.language.translate(text, result.language, "en",
                                function(result) {
        var translated = document.getElementById("translation");
        if (result.translation) {
          translated.innerHTML = result.translation;
        }
      });
    }
  });
}
google.setOnLoadCallback(initialize);

</script>
</head>
<body>
<div id="text">你好,很高興見到你。</div>
<div id="translation"></div>
</body>
</html>

You can view this example here to edit and play around with it.

Including the AJAX Language API on Your Page

To include the AJAX Language API in your page, use the Google AJAX API loader. The common loader allows you to load all of the AJAX APIs that you need, including the Language API. You need to both include the Google AJAX APIs script tag and call google.load("language", "1"):

<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load("language", "1");
</script>

The first script tag loads the google.load function, which lets you load individual Google APIs. google.load("language", "1") loads Version 1 of the Language API. Currently the AJAX Language API is in Version 1, but new versions may be available in the future. See the versioning discussion below for more information.

API Updates

The second argument to google.load is the version of the AJAX Language API you are using. Currently the AJAX Language API is in version 1, but new versions may be available in the future.

If we do a significant update to the API in the future, we will change the version number and post a notice on Google Code and the AJAX APIs discussion group. When that happens, we expect to support both versions for at least a month in order to allow you to migrate your code.

The AJAX Language API team periodically updates the API with the most recent bug fixes and performance enhancements. These updates should only improve performance and fix bugs, but we may inadvertently break some API clients. Please use the AJAX APIs discussion group to report such issues.

Examples

Language Translation

This example shows a simple translation of a JavaScript string.

google.language.translate("Hello world", "en", "es", function(result) {
  if (!result.error) {
    var container = document.getElementById("translation");
    container.innerHTML = result.translation;
  }
});

View example (translate.html)

Language Detection

This example shows language detection of a JavaScript string. The detected language code is returned in result.language; the loop maps it back to a language name.

var text = "¿Dónde está el baño?";
google.language.detect(text, function(result) {
  if (!result.error) {
    var language = 'unknown';
    // Map the detected language code back to its name.
    for (var l in google.language.Languages) {
      if (google.language.Languages[l] == result.language) {
        language = l;
        break;
      }
    }
    var container = document.getElementById("detection");
    container.innerHTML = text + " is: " + language;
  }
});

View example (detection.html)

Source Detection during Translation

The following example is similar to the basic translation example but shows how to translate text without knowing the source language. When you pass an empty string as the source language (meaning "unknown"), the system detects the language and translates in one call.

google.language.translate("Hello world", "", "es", function(result) {
  if (!result.error) {
    var container = document.getElementById("translation");
    container.innerHTML = result.translation;
  }
});

View example (autotranslate.html)

Some Additional Samples

Here are two additional samples that allow some interaction. The first does language detection with a pre-canned text string but allows other text to be input. It also displays confidence and reliability factors.

View example (detect.html)

The second additional sample does translation and allows the same kind of interaction as the sample above.

View example (translate.html)

API Details

Supported Languages

The Google AJAX Language API currently supports the following languages. The technology is constantly improving and the team is working hard to expand this list, so please check back often. You can also visit Google Translate to view an up-to-date list.

  • Afrikaans New!
  • Albanian New!
  • Amharic New!
  • Arabic
  • Armenian New!
  • Azerbaijani New!
  • Basque New!
  • Belarusian New!
  • Bengali New!
  • Bihari New!
  • Bulgarian New!
  • Burmese New!
  • Catalan New!
  • Cherokee New!
  • Chinese (Simplified and Traditional)
  • Croatian New!
  • Czech New!
  • Danish New!
  • Dhivehi New!
  • Dutch
  • English
  • Esperanto New!
  • Estonian New!
  • Filipino New!
  • Finnish New!
  • French
  • Galician New!
  • Georgian New!
  • German
  • Greek
  • Guarani New!
  • Gujarati New!
  • Hebrew New!
  • Hindi New!
  • Hungarian New!
  • Icelandic New!
  • Indonesian New!
  • Inuktitut New!
  • Italian
  • Japanese
  • Kannada New!
  • Kazakh New!
  • Khmer New!
  • Korean
  • Kurdish New!
  • Kyrgyz New!
  • Laothian New!
  • Latvian New!
  • Lithuanian New!
  • Macedonian New!
  • Malay New!
  • Malayalam New!
  • Maltese New!
  • Marathi New!
  • Mongolian New!
  • Nepali New!
  • Norwegian New!
  • Oriya New!
  • Pashto New!
  • Persian New!
  • Polish New!
  • Portuguese
  • Punjabi New!
  • Romanian New!
  • Russian
  • Sanskrit New!
  • Serbian New!
  • Sindhi New!
  • Sinhalese New!
  • Slovak New!
  • Slovenian New!
  • Spanish
  • Swahili New!
  • Swedish New!
  • Tajik New!
  • Tamil New!
  • Tagalog New!
  • Telugu New!
  • Thai New!
  • Tibetan New!
  • Turkish New!
  • Ukrainian New!
  • Urdu New!
  • Uzbek New!
  • Uighur New!
  • Vietnamese New!

Supported Language Translation Pairs

The Google AJAX Language API currently detects all languages listed above. A subset of those languages is translatable and is listed below. Translation is supported between any two languages in the following list. To test whether a language is translatable, use google.language.isTranslatable(languageCode).

  • Arabic
  • Bulgarian New!
  • Chinese (Simplified and Traditional)
  • Croatian New!
  • Czech New!
  • Danish New!
  • Dutch
  • English
  • Finnish New!
  • French
  • German
  • Greek
  • Hindi New!
  • Italian
  • Japanese
  • Korean
  • Norwegian New!
  • Polish New!
  • Portuguese
  • Romanian New!
  • Russian
  • Spanish
  • Swedish New!
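As a rough sketch of how the isTranslatable check above might guard a translation call: the helper name translateIfSupported, the shape of the error object, and the fakeLanguage stand-in are all assumptions made for illustration. In a page, you would pass google.language itself.

```javascript
// Sketch only: `lang` stands for the google.language object described above.
// translateIfSupported is a hypothetical helper, not part of the API.
function translateIfSupported(lang, text, from, to, callback) {
  // Stop early when either end of the pair is detect-only.
  if (!lang.isTranslatable(from) || !lang.isTranslatable(to)) {
    // The error shape here is an assumption for this sketch.
    callback({ error: { message: "Unsupported pair: " + from + " -> " + to } });
    return false;
  }
  lang.translate(text, from, to, callback);
  return true;
}

// Stand-in for google.language so the sketch runs outside a browser.
var fakeLanguage = {
  isTranslatable: function (code) { return code === "en" || code === "es"; },
  translate: function (text, from, to, cb) {
    cb({ translation: "[" + to + "] " + text });
  }
};

var outcomes = [];
translateIfSupported(fakeLanguage, "Hello world", "en", "es",
                     function (r) { outcomes.push(r); });
translateIfSupported(fakeLanguage, "Hello world", "en", "eo",
                     function (r) { outcomes.push(r); });
console.log(outcomes);
```

Checking both codes before calling translate keeps the unsupported case in the same callback path as any other error, so the calling code needs only one error branch.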

Flash and other Non-JavaScript Environments New!

For Flash developers, and for developers who need to access the AJAX Language API from other non-JavaScript environments, the API exposes a simple RESTful interface. In all cases, the supported method is GET and the response is a JSON-encoded result with embedded status codes. Applications that use this interface must abide by all existing terms of use. Pay special attention to correctly identifying yourself in your requests: applications MUST always include a valid and accurate HTTP referer header. In addition, we ask, but do not require, that each request contain a valid API key. Providing a key gives us a secondary identification mechanism that is useful should we need to contact you to correct any problems.

The easiest way to start learning about this interface is to try it out. Using a command-line tool such as curl or wget, execute the following command:

curl -e http://www.my-ajax-site.com \
'http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello%20world&langpair=en%7Cit'

This command performs a Language Translation (/ajax/services/language/translate) of "hello world" (q=hello%20world) from English to Italian (langpair=en%7Cit). The response has a Content-Type of text/javascript; charset=utf-8. You can see from the response below that the responseData is identical to the properties described in the Result Objects documentation.

{"responseData": {
"translatedText":"Ciao mondo"
},
"responseDetails": null, "responseStatus": 200}

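For callers that build the request themselves, the URL encoding and the status check above can be sketched as follows. The helper names buildTranslateUrl and readTranslation are made up for this sketch; the endpoint, parameter names, and response fields are the ones shown in the command and response above. No network request is made here.

```javascript
// Endpoint and parameter names are taken from the curl command above.
var TRANSLATE_ENDPOINT =
  "http://ajax.googleapis.com/ajax/services/language/translate";

// buildTranslateUrl is a hypothetical helper name.
function buildTranslateUrl(text, from, to) {
  return TRANSLATE_ENDPOINT +
    "?v=1.0" +
    "&q=" + encodeURIComponent(text) +
    "&langpair=" + encodeURIComponent(from + "|" + to);
}

// readTranslation is a hypothetical helper name. It returns the translated
// text, or throws when the embedded status code is not 200.
function readTranslation(responseBody) {
  var response = JSON.parse(responseBody);
  if (response.responseStatus !== 200 || !response.responseData) {
    throw new Error("Translation failed (" + response.responseStatus + "): " +
                    response.responseDetails);
  }
  return response.responseData.translatedText;
}

var translateUrl = buildTranslateUrl("hello world", "en", "it");
// The sample response from above, as a string.
var sampleBody = '{"responseData": {"translatedText":"Ciao mondo"},' +
                 ' "responseDetails": null, "responseStatus": 200}';

console.log(translateUrl);
console.log(readTranslation(sampleBody));
```

Note that the status lives inside the JSON body, so the check reads responseStatus rather than relying on the HTTP status of the response.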
In addition to this response format, the protocol also supports a classic JSON-P style callback, which is triggered by specifying a callback argument. When this argument is present, the JSON object is delivered as an argument to the specified callback.

callbackFunction(
{"responseData": {
"translatedText":"Ciao mondo"
},
"responseDetails": null, "responseStatus": 200})

Finally, the protocol supports a callback and context argument. When both URL arguments are specified, the response is encoded as a direct JavaScript call with a signature of: callback(context, result, status, details, unused). Note the slight difference in the following command and response.

curl -e http://www.my-ajax-site.com \
'http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello%20world&langpair=en%7Cit&callback=foo&context=bar'

This command performs a Language Translation and is identical to the previous call, but has been altered to pass both callback and context. With these arguments in place, instead of a JSON object being returned, a JavaScript call is returned in the response and the JSON object is passed via the result parameter.

foo('bar', {"translatedText":"Ciao mondo"}, 200, null)
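One way the context argument might be used is to match each response to the request that caused it. The sketch below is an illustration under assumptions: the pending table and the expectResponse helper are hypothetical, and only the callback signature and the sample response come from the text above.

```javascript
// Hypothetical bookkeeping: one entry per outstanding request, keyed by the
// context string that was sent in the URL.
var pending = {};

function expectResponse(context, onDone) {
  pending[context] = onDone;
}

// Matches the signature described above:
// callback(context, result, status, details, unused).
function foo(context, result, status, details) {
  var onDone = pending[context];
  if (!onDone) { return; }            // unknown or already-handled context
  delete pending[context];
  if (status === 200) {
    onDone(null, result.translatedText);
  } else {
    onDone(new Error(status + ": " + details));
  }
}

var received = [];
expectResponse("bar", function (err, text) {
  received.push(err ? err.message : text);
});

// The sample response shown above, executed as the script it is.
foo('bar', {"translatedText":"Ciao mondo"}, 200, null);
console.log(received);
```

Because the status and details arrive as separate arguments, the handler can report failures without inspecting the result object at all.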

Code Snippets New!

The AJAX Search API documentation includes a small collection of code snippets that demonstrate access to the service from Flash, Java, and PHP. The language-specific part of this is uniform across all of the AJAX APIs, so instead of repeating the snippets here, please refer to that documentation.

For complete documentation on this interface, please visit the class reference manual.

Troubleshooting

If you encounter problems with your code:

  • Look for typos. Remember that JavaScript is a case-sensitive language.
  • Use a JavaScript debugger. In Firefox, you can use the JavaScript console or the Firebug extension. In IE, you can use the Microsoft Script Debugger.
  • Search the AJAX APIs discussion group. If you can't find a post that answers your question, post your question to the group along with a link to a web page that demonstrates the problem.

Thursday, July 3, 2008

Backups are a snap with rsnapshot

We’ve all heard the reasons for backing up our data regularly -- accidental deletion of files (rm -rf *), corrupted files from crashed applications, the dreaded hard disk failure, the list goes on. Nevertheless, on average, only 25 per cent of computer users perform routine backups of their data, as shown by a recent Harris Interactive survey. So why do the remaining 75 per cent put off this important task? Well, manual backups are often ad hoc, unreliable, and time-consuming. Automating an otherwise tedious backup process is key to producing routine and reliable backups. With that in mind, we’ll take a look at rsnapshot, a handy backup utility based on rsync, a well-known open source tool.

rsnapshot was written by Nathan Rosenquist as a replacement for a patchwork of complex shell scripts he had crafted to do rsync backups. Any change to the backup scheme meant manually editing the scripts and making sure no bugs were introduced. rsnapshot was a great improvement over this process: it was easy to configure, portable across different operating systems, supported remote backups, and best of all, automated the entire backup process.

rsnapshot enables users to keep multiple backups of their data, from local or remote systems, readily accessible. Each backup is a complete snapshot of the data at a specific point in time. rsnapshot minimizes disk space usage by combining rsync with hard links (multiple directory entries that share a single copy of a file's data), so a file that is unchanged between snapshots is stored only once. Thus, the total disk space used is roughly the space for one full backup, plus the files that changed in each later snapshot.
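To see why hard links save space, here is a small demonstration, written in JavaScript (Node.js) to match the code earlier on this page. It is not how rsnapshot itself is implemented; the directory and file names are invented for the demo, and it assumes a file system that supports hard links.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Work in a throwaway directory; the names below are made up for the demo.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), "hardlink-demo-"));
var older = path.join(dir, "hourly.1-report.txt");
var newer = path.join(dir, "hourly.0-report.txt");

fs.writeFileSync(older, "unchanged file contents\n");
fs.linkSync(older, newer);   // second directory entry, same data on disk

var a = fs.statSync(older);
var b = fs.statSync(newer);
var sameInode = a.ino === b.ino;
var linkCount = a.nlink;
console.log("same inode:", sameInode, "link count:", linkCount);

// Removing one name leaves the data reachable through the other.
fs.unlinkSync(older);
var survivor = fs.readFileSync(newer, "utf8");
console.log("still readable:", JSON.stringify(survivor));

// Clean up the throwaway directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Both names point at one copy of the data, which is why deleting an old snapshot never damages a newer one that shares the same unchanged files.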

Since rsnapshot is written entirely in Perl, it's a snap to install on most modern versions of Linux or BSD. In fact, rsnapshot comes pre-installed on Debian, Gentoo, FreeBSD, OpenBSD, and NetBSD. Users with other distributions can compile and install rsnapshot by downloading the latest version from www.rsnapshot.org.

Install rsnapshot

To get started, I will download and install rsnapshot (v1.2.1) on my Fedora Core 4 system (mango). If you are using a distribution that already has rsnapshot installed, just skip to the next section.

To install rsnapshot, you will need both perl (v5.004+) and rsync on your system. Although not required, it helps to have OpenSSH, BSD logger, GNU cp, and GNU du available as well. If you have perl and rsync on your system, follow the simple instructions below to install rsnapshot.

$ wget -q http://www.rsnapshot.org/downloads/rsnapshot-1.2.1.tar.gz
$ tar xzf rsnapshot-1.2.1.tar.gz
$ cd rsnapshot-1.2.1

$ ./configure --prefix=/usr/local --sysconfdir=/etc

The --sysconfdir=/etc parameter above tells rsnapshot to look for its configuration file (rsnapshot.conf) in /etc. Installing rsnapshot requires root privileges.

$ make install

Make sure rsnapshot is available in your command search path.

$ whereis rsnapshot
rsnapshot: /usr/local/bin/rsnapshot

Configure rsnapshot

For the purposes of this article, we will use rsnapshot to back up data from one Linux system (kiwi) to another (mango). rsnapshot will run on mango, which will also host the backup archives. Both systems should have rsync and ssh installed.

All configuration parameters of rsnapshot are controlled via the rsnapshot.conf file. Before we set up rsnapshot, we'll copy the default configuration file /etc/rsnapshot.conf.default and save it as /etc/rsnapshot.conf. This way we can revert to a clean configuration if we mangle our config file.

Now, let’s edit rsnapshot.conf on mango to set up our backup system. Most of the parameter defaults do not need modification, so we’ll just focus on those that do.

Where will backups be stored?

The snapshot_root parameter in the SNAPSHOT ROOT DIRECTORY section specifies the directory where rsnapshot will place backup snapshots as they are created. Make sure you select a disk partition with adequate free space to hold your backups.

# Note: Use TABS (not spaces) to separate
# the configuration directive and the value.
# If specifying a directory, put a
# slash at the end.

snapshot_root	/usr2/snapshots/
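Because the tab requirement is easy to break when copying a config file from a web page, a rough pre-check can be sketched as follows, in JavaScript (Node.js) to match the code earlier on this page. The function name findSpaceSeparatedLines and the sample text are made up for this sketch, and it only checks for the presence of a tab; rsnapshot's own configtest command remains the authoritative check.

```javascript
// Returns the 1-based numbers of directive lines that contain no tab.
// Blank lines and comment lines are skipped.
function findSpaceSeparatedLines(configText) {
  var bad = [];
  configText.split("\n").forEach(function (line, index) {
    var trimmed = line.trim();
    if (trimmed === "" || trimmed.charAt(0) === "#") { return; }
    if (line.indexOf("\t") === -1) { bad.push(index + 1); }
  });
  return bad;
}

var sample = [
  "# comment lines are ignored",
  "snapshot_root\t/usr2/snapshots/",   // tab separator: accepted
  "cmd_cp /bin/cp"                      // space separator: flagged
].join("\n");

var flagged = findSpaceSeparatedLines(sample);
console.log("lines without a tab:", flagged);
```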

If you plan on using a USB/FireWire hard disk for storing backups, set the no_create_root parameter to 1. This tells rsnapshot not to create the snapshot root directory if it doesn’t already exist, so a backup is never written to the bare mount point when the disk is not attached.

Which external programs will rsnapshot use?

Next, the EXTERNAL PROGRAM DEPENDENCIES section contains parameters to specify paths for optional external tools that rsnapshot depends on to provide certain features. Be sure to uncomment the lines starting with cmd_cp, cmd_ssh, and cmd_du by removing the hash (#) mark at the beginning of the line.

# use GNU cp
cmd_cp	/bin/cp

# use ssh for secure remote backups
cmd_ssh	/usr/bin/ssh

# use GNU du to check disk space usage
cmd_du	/usr/bin/du

How often will backups happen?

The configuration parameters in the BACKUP INTERVALS section determine how often rsnapshot will perform backups and how many snapshots will be kept. The keyword interval is followed by an alphanumeric label and then a number that specifies how many snapshots of that interval to keep.

In our backup system, we want to take a snapshot of kiwi every 3 hours, so that's 8 snapshots per day. Each time rsnapshot hourly is executed, it will create a new snapshot, rotate the old ones, and retain the 8 most recent (hourly.0 - hourly.7) snapshots. We also want to take a daily snapshot, and keep a week's (7 days) worth of snapshots.

#interval	minutes	6
interval	hourly	8
interval	daily	7
#interval	weekly	4

The order of the interval definitions is very important. The first interval line must represent the smallest unit of time, with each subsequent line representing a larger interval. If you were to add a weekly interval, it would appear after the daily interval. Similarly, a minutes interval would appear before hourly.
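The rotation described above can be modeled in a few lines of JavaScript, the language used earlier on this page. This is a simplified model, not rsnapshot's actual implementation: snapshots are just labels in an array, and the rotate function is a name made up for the sketch.

```javascript
// Simplified model: index 0 is the newest snapshot, matching hourly.0.
// rotate is a hypothetical helper; snapshot contents are just labels.
function rotate(snapshots, newest, keep) {
  // Put the new snapshot in front and drop anything beyond `keep`.
  return [newest].concat(snapshots).slice(0, keep);
}

var hourly = [];
for (var run = 1; run <= 10; run++) {
  hourly = rotate(hourly, "run " + run, 8);
}

// After 10 runs only the 8 most recent remain, as hourly.0 to hourly.7.
hourly.forEach(function (label, i) {
  console.log("hourly." + i + " = " + label);
});
```

With 8 snapshots kept and a run every 3 hours, hourly.7 is always about 21 hours older than hourly.0, which is why the daily interval takes over from there.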

What is included or excluded from the backup?

Most of the parameters in the GLOBAL OPTIONS section can be left at their default values. However, two parameters, include and exclude, let you include or exclude files from the backup. Both parameters get passed directly to rsync, so take a look at the --include and --exclude options in the rsync man page for a thorough explanation of how to construct match patterns. If you prefer listing all your include/exclude patterns in separate files, specify them using the include_file and exclude_file parameters.

Here are some simple examples to get you started.

# exclude anything starting with a dot character (.)
exclude	.*

# exclude anything ending with a tilde character (~)
exclude	*~

# include .ssh directory
include	/home/nsharma/.ssh/

What should be backed up?

The BACKUP POINTS / SCRIPTS section tells rsnapshot what is to be backed up and where the backup snapshot is stored. This part is very important, so pay attention. We will use rsync over ssh to back up two directories and a file from the system named kiwi, and store the snapshots in a directory named kiwi_backups. The hostname kiwi must resolve to an IP address, either via DNS or the /etc/hosts file.

# two directories (/home/nsharma, /my_articles)
backup	root@kiwi:/home/nsharma/	kiwi_backups/
backup	root@kiwi:/my_articles/	kiwi_backups/

# one file
backup	root@kiwi:/etc/passwd	kiwi_backups/

The configuration above will only work if we can log in (without manually entering passwords) to kiwi as root via ssh. The easiest way to set up access is by creating "passphraseless" keys with ssh-keygen, and here’s how to do it.

Setting up "passphraseless" keys

Log in as root on mango

Use the ssh-keygen program to create a public/private key pair using the Digital Signature Algorithm (DSA). Press Enter at both passphrase prompts to leave the passphrase empty.

$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
0d:f0:ea:bc:b8:0d:69:c6:6d:e0:59:c2:ee:31:4d:90 root@mango.private.dom

Transfer the public key from mango to kiwi using scp

$ scp .ssh/id_dsa.pub root@kiwi.private.dom:mango.pub
root@kiwi.private.dom's password:
id_dsa.pub 100% 619 0.6KB/s 00:00

Log in as root on kiwi

Install mango's public key

$ cat mango.pub >> /root/.ssh/authorized_keys

Delete the mango.pub file from kiwi

$ rm -f mango.pub

We should now be able to log in to kiwi as root from mango without being prompted for a password.

If you’re uncomfortable with the idea of "passphraseless" keys, then take a look at the ssh-agent man page and a utility called keychain available at www.gentoo.org/proj/en/keychain/index.xml.

Testing our configuration

Before we run rsnapshot for the first time, we should make sure the syntax of our configuration file is correct, and execute a dry run of each interval we have defined.

Checking for correct syntax

$ rsnapshot configtest

rsnapshot will either show you the errors, or a Syntax OK message if there are no errors.

Dry run for each interval

# test run for 'hourly' interval
$ rsnapshot -t hourly

# test run for 'daily' interval
$ rsnapshot -t daily

The output from each command will show you exactly what rsnapshot will do for the specified intervals.

Automating the backup process

Our next and final step is to automate the execution of rsnapshot on mango. We'll add two cron entries that run rsnapshot every 3 hours on the hour for the hourly interval, and every night at 11:00 pm for the daily interval. Logged in as root on mango, we’ll invoke the crontab program with the edit (-e) option. crontab opens the default editor, as specified by the VISUAL or EDITOR shell environment variables.

$ crontab -e

Now, we add the following entries and save and close the file.

0 */3 * * * /usr/local/bin/rsnapshot hourly
0 23 * * * /usr/local/bin/rsnapshot daily
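To double-check that the hour fields in these two entries line up with the intervals defined earlier (8 hourly snapshots per day), the schedule can be expanded with a few lines of JavaScript, the language used earlier on this page. The expandHours function is a name made up for this sketch, and it understands only "*", "*/n", and a single number, not the full crontab syntax.

```javascript
// Expands one crontab hour field into the hours it matches.
// Handles only "*", "*/n" and a single number, which covers the two
// entries above; real cron syntax allows lists and ranges as well.
function expandHours(field) {
  var hours = [];
  var step = 1;
  if (field === "*" || field.indexOf("*/") === 0) {
    if (field !== "*") { step = parseInt(field.slice(2), 10); }
    for (var h = 0; h < 24; h += step) { hours.push(h); }
  } else {
    hours.push(parseInt(field, 10));
  }
  return hours;
}

var hourlyRuns = expandHours("*/3");
var dailyRuns = expandHours("23");

console.log("hourly runs at:", hourlyRuns.join(", "));
console.log("runs per day:", hourlyRuns.length);
console.log("daily run at:", dailyRuns.join(", "));
```

The daily run at 23:00 falls between the hourly runs at 21:00 and 00:00, so the two jobs are never started at the same minute.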

That’s it. We now have a fully automated backup system that creates hourly and daily snapshots of our data. For detailed documentation about rsnapshot, check out the rsnapshot man page and the rsnapshot website at www.rsnapshot.org.

Conclusion

Knowing what data to preserve and how to recover it in an emergency is critical to having a solid backup plan. Using the right tools to implement that backup plan is just as important. Take control of your backups with rsnapshot!

Before we finish, here’s an actual run of rsnapshot against the hourly interval.

$ rsnapshot -v hourly
echo 19462 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /usr2/snapshots/hourly.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/home/nsharma/ \
/usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/my_articles/ \
/usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/etc/passwd \
/usr2/snapshots/hourly.0/kiwi_backups/
touch /usr2/snapshots/hourly.0/
rm -f /var/run/rsnapshot.pid
