Tuesday 24 April 2012

Create a basic spider

Each spider is a JavaScript file containing a single main() function:

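function main(env, args) {
  // 'links' is the list whose contents are logged after the spider finishes
  var links = args.get('links');
  // add one entry to the list
  links.add('Hello world!');
}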

'env' is an object of the Machine class.

'args' is an object of the HashMap class.

'args' contains a value under the 'links' key, which is an object of the ArrayList class. The 'links' object is printed to the log after the spider finishes running.
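
To make the flow above concrete, here is a minimal sketch that can be run outside the sandbox. FakeArrayList, FakeHashMap and runSpider are hypothetical stand-ins written for this illustration; they only imitate the few HashMap and ArrayList methods used here, and the way the real host prepares 'args' and writes the log is an assumption.

```javascript
// Hypothetical stand-in for java.util.ArrayList, so the sketch runs outside the sandbox.
function FakeArrayList() { this.items = []; }
FakeArrayList.prototype.add = function (item) { this.items.push(item); return true; };
FakeArrayList.prototype.size = function () { return this.items.length; };
FakeArrayList.prototype.get = function (i) { return this.items[i]; };

// Hypothetical stand-in for java.util.HashMap (only put/get are modelled).
function FakeHashMap() { this.entries = {}; }
FakeHashMap.prototype.put = function (key, value) { this.entries[key] = value; };
FakeHashMap.prototype.get = function (key) { return this.entries[key]; };

// The spider from this post.
function main(env, args) {
  var links = args.get('links');
  links.add('Hello world!');
}

// Assumed host behaviour: prepare 'args', run the spider, then collect 'links' for the log.
function runSpider(spiderMain) {
  var args = new FakeHashMap();
  var links = new FakeArrayList();
  args.put('links', links);
  spiderMain({}, args); // 'env' is an empty placeholder in this sketch
  var logged = [];
  for (var i = 0; i < links.size(); i++) {
    logged.push(String(links.get(i)));
  }
  return logged;
}

var logged = runSpider(main);
console.log(logged.join('\n'));
```

The spider itself never prints anything: it only adds entries to 'links', and the host decides how they are logged.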

In a spider script, objects that belong to Java classes are accessed the same way as in Java. Calling a method whose return value or parameter is an object of an unsupported class results in an error. This prevents the spider from accessing restricted resources such as the file system. The spider runs in a sandbox, which keeps the host system safe.
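
As a rough illustration of the paragraph above, the sketch below calls Java-style methods (add, contains, size, get) on 'links' and guards a call that is expected to fail. Everything outside main() is a hypothetical stand-in: FakeArrayList imitates ArrayList, and fakeEnv.readFile is an invented method that always throws, not a documented method of the Machine class. It also assumes that a blocked call surfaces as an exception the script can catch, which the sandbox may or may not do.

```javascript
// Hypothetical stand-in for java.util.ArrayList.
function FakeArrayList() { this.items = []; }
FakeArrayList.prototype.add = function (item) { this.items.push(item); return true; };
FakeArrayList.prototype.size = function () { return this.items.length; };
FakeArrayList.prototype.get = function (i) { return this.items[i]; };
FakeArrayList.prototype.contains = function (item) { return this.items.indexOf(item) !== -1; };

// Hypothetical 'env': readFile is invented here and always fails,
// imitating a call that the sandbox refuses.
var fakeEnv = {
  readFile: function (path) { throw new Error('access denied: ' + path); }
};

function main(env, args) {
  var links = args.get('links');

  // Java-style method calls on a Java collection.
  if (!links.contains('http://example.com/')) {
    links.add('http://example.com/');
  }

  // Guard a call that the sandbox is expected to reject.
  try {
    env.readFile('/etc/passwd');
  } catch (e) {
    links.add('blocked: ' + e.message);
  }
}

// Minimal driver: 'args' only needs get('links') for this sketch.
var links = new FakeArrayList();
var args = { get: function (key) { return key === 'links' ? links : null; } };
main(fakeEnv, args);

for (var i = 0; i < links.size(); i++) {
  console.log(links.get(i));
}
```

Wrapping risky calls in try/catch lets the spider keep running and record what was refused, rather than stopping at the first error.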
