CAPTCHAs make up for OCR shortcomings

A group at Carnegie Mellon University has come up with a way to help digitise books while also letting web sites check that a user is a human rather than a piece of software.

The reCAPTCHA system is a variation on the widely used CAPTCHA method of verifying human users by asking them to type in distorted or otherwise obscured words or other sequences of characters.

reCAPTCHA presents users with a dual CAPTCHA. It 'knows' the answer to one word, but the other is an unrecognised word obtained from scanning and OCRing a book. The reasoning is that if people correctly recognise the known word, their response to the unknown word is probably correct too. Once enough people give the same answer for an unknown word, it is taken as definitive.
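The dual-CAPTCHA reasoning above can be illustrated with a short Python sketch. It is not reCAPTCHA's actual implementation: the class name, the method names and the agreement threshold of three are all assumptions made for illustration, since the project's real acceptance rule is not described here.

```python
from collections import Counter, defaultdict

# Hypothetical value: how many matching answers make an unknown word "definitive".
AGREEMENT_THRESHOLD = 3


def normalise(text):
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


class DualCaptcha:
    """Illustrative model of a two-word challenge (all names are assumptions)."""

    def __init__(self, threshold=AGREEMENT_THRESHOLD):
        self.threshold = threshold
        self.votes = defaultdict(Counter)  # unknown word id -> answer counts
        self.resolved = {}                 # unknown word id -> accepted answer

    def submit(self, known_answer, known_response, unknown_id, unknown_response):
        """Return True if the user passed the known word.

        The answer to the unknown word is only counted when the known
        word was typed correctly.
        """
        if normalise(known_response) != normalise(known_answer):
            return False  # failed the control word: discard both answers

        if unknown_id not in self.resolved:
            guess = normalise(unknown_response)
            self.votes[unknown_id][guess] += 1
            # Accept the answer once enough users have agreed on it.
            if self.votes[unknown_id][guess] >= self.threshold:
                self.resolved[unknown_id] = guess
        return True
```

In this sketch a user who mistypes the known word contributes nothing to the unknown one, and an unknown word stops collecting votes as soon as one answer reaches the threshold.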

This is a doubly clever idea. In addition to getting some useful work out of a chore many of us perform several times a day (tens of millions of CAPTCHAs are thought to be decoded daily), the fact that the words used have already proved resistant to OCR makes them good candidates for CAPTCHAs.

The reCAPTCHA project is currently helping digitise books from the Internet Archive. In addition to plugins for popular systems and languages including WordPress, phpBB, PHP, Perl and Ruby, the project also offers reCAPTCHA Mailhide, a way of concealing email addresses on even simple web pages.

CAPTCHA is an acronym - possibly back-formed - for Completely Automated Public Turing test to tell Computers and Humans Apart. CAPTCHAs are often used to ensure that only humans submit comments to web sites, sign up for accounts, vote in online polls, and perform other activities. reCAPTCHA is run by the original creators of CAPTCHA.