PhantomJS series: Introduction to the PhantomJS API

The Book of Jin says: "One dare not burn rhinoceros horn; burned, it gives off a strange fragrance, and once the scent clings to one's clothes, one can commune with ghosts."

The previous articles in this series covered using Selenium with PhantomJS and discussed performance optimization. However, driving PhantomJS through Selenium from Python is inherently inefficient, and Selenium's wrapper around PhantomJS is incomplete (it has not been updated for a long time), so it is worth studying PhantomJS's native API. This post summarizes what I learned from the [official website](http://phantomjs.org) and records it here to share.

PhantomJS is a headless browser: it provides full web support without a visible browser window. It is a WebKit-based, server-side JavaScript API that can be used for page automation, network monitoring, web page screenshots, crawling, and more. PhantomJS offers many APIs, all using JavaScript syntax: it provides modules whose objects you instantiate and then call methods on, with callback functions implementing the behavior you want. The main APIs are the Web Server API, the Web Page API, and the System API; this post covers several commonly used ones.

PhantomJS command-line interface

Description: PhantomJS command-line usage and option settings
First, here is how to invoke PhantomJS to run a JavaScript script:

    phantomjs [options] somescript.js [arg1 [arg2 [...]]]

Optional parameters (only the commonly used ones are listed):

  • --disk-cache=[true|false] Enable or disable the disk cache
  • --ignore-ssl-errors=[true|false] Ignore SSL errors
  • --load-images=[true|false] Load images
  • --proxy=address:port Set the proxy server

There are many more options than can be listed here; for the full reference, see [phantomjs-Command Line Interface](http://phantomjs.org/api/command-line.html).
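As a rough illustration of how these options combine into one command line, the sketch below assembles the argument list from a plain object. `buildPhantomArgs`, the proxy address, and the script arguments are hypothetical and exist only for this example; only the flag names come from the list above.

```javascript
// Hypothetical helper: turn an options object into PhantomJS command-line arguments.
// Only the flag names (load-images, ignore-ssl-errors, proxy) come from the list
// above; the function itself is an illustration, not a PhantomJS API.
function buildPhantomArgs(options, script, scriptArgs) {
    var args = [];
    Object.keys(options).forEach(function (name) {
        args.push('--' + name + '=' + String(options[name]));
    });
    args.push(script);                      // the script follows the options
    return args.concat(scriptArgs || []);   // script arguments come last
}

var args = buildPhantomArgs(
    { 'load-images': false, 'ignore-ssl-errors': true, 'proxy': '127.0.0.1:8888' },
    'somescript.js',
    ['hello']
);
console.log('phantomjs ' + args.join(' '));
```

The order mirrors the usage line above: options first, then the script, then the script's own arguments.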

PhantomJS system module

Description: the PhantomJS system API
Document address: [phantomjs-system module](http://phantomjs.org/api/system/)
Role: system-level operations

args (get the script's command-line arguments)

Code (test.js):

    var system = require('system');
    var args = system.args;
    if (args.length === 1) {
        console.log('Try to pass some arguments when invoking this script!');
    } else {
        args.forEach(function(arg, i) {
            console.log(i + ': ' + arg);
        });
    }

Run:
phantomjs test.js hello
Result:
0: test.js
1: hello
Function: Read the arguments passed on the command line.
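Scripts often want named options rather than positional ones. The following is a minimal sketch that assumes arguments are passed as `key=value` pairs (a convention chosen for this example, not something PhantomJS requires); `parseArgs` is a hypothetical helper.

```javascript
// Hypothetical helper: split script arguments of the form key=value into an object.
// args[0] is the script name, so parsing starts at index 1.
function parseArgs(args) {
    var options = {};
    args.slice(1).forEach(function (arg) {
        var pos = arg.indexOf('=');
        if (pos === -1) {
            options[arg] = true;                          // bare flag, e.g. "verbose"
        } else {
            options[arg.slice(0, pos)] = arg.slice(pos + 1);
        }
    });
    return options;
}

// Inside PhantomJS, system.args supplies the real arguments. The typeof guard
// only keeps this sketch loadable in other JavaScript runtimes.
if (typeof phantom !== 'undefined') {
    var options = parseArgs(require('system').args);
    console.log(JSON.stringify(options));
    phantom.exit();
}
```

It could be invoked as, for example, `phantomjs test.js url=http://thief.one verbose`.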

env (system environment variables)

Code (test.js):

    var system = require('system');
    var env = system.env;
    Object.keys(env).forEach(function(key) {
        console.log(key + '=' + env[key]);
    });

Run: phantomjs test.js
Function: List system environment variables

os (operating system information)

Code (test.js):

    var system = require('system');
    var os = system.os;
    console.log(os.architecture); // '32bit'
    console.log(os.name); // 'windows'
    console.log(os.version); // '7'

Run: phantomjs test.js
result:
32bit
windows
7
Function: Output running platform type

pid (process ID)

Code (test.js):

    var system = require('system');
    var pid = system.pid;
    console.log(pid);

Function: Output the ID of the current PhantomJS process

platform (platform name)

Code (test.js):

    var system = require('system');
    console.log(system.platform); // 'phantomjs'

Result: phantomjs

PhantomJS web server module

Description: the PhantomJS web server module API
Document address: [Phantomjs-web server module](http://phantomjs.org/api/webserver/method/listen.html)
Role: Run an embedded web server that provides HTTP services.
Code (test.js):

    var webserver = require('webserver');
    var server = webserver.create();
    var service = server.listen(8080, function(request, response) {
        // Build the response
        response.statusCode = 200;
        response.setHeader("Cookie","1adaa2121");
        response.setEncoding("binary");
        response.write('<html><body>Hello!</body></html>');
        // Log details of the incoming request
        console.log(request.method);
        console.log(request.url);
        console.log(request.httpVersion);
        console.log(request.headers);
        console.log(request.post);
        console.log(request.postRaw);
        response.close();
    });

Run: phantomjs test.js
Visit: http://localhost:8080

To specify both an IP address and a port, replace 8080 with a string such as '127.0.0.1:9999'.

The callback receives two parameters, request and response.

Properties of the request object:

  • request.method
  • request.url
  • request.httpVersion
  • request.headers
  • request.post
  • request.postRaw

These expose the content of the incoming request.

Properties and methods of the response object:

  • response.headers
  • response.setHeader(name, value)
  • response.header(name)
  • response.statusCode
  • response.setEncoding("binary")
  • response.write(html_data)
  • response.writeHead(statusCode, headers)
  • response.close()
  • response.closeGracefully()
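The request and response members listed above are enough for simple routing. The sketch below is one possible arrangement, not the only one: the `/status` route, the JSON body, and the `handleRequest` name are made up for illustration.

```javascript
// Sketch of a handler that routes on request.url, using only members listed above
// (request.method, request.url, response.statusCode, setHeader, write, close).
// The routes and the JSON body are hypothetical.
function handleRequest(request, response) {
    if (request.method === 'GET' && request.url === '/status') {
        response.statusCode = 200;
        response.setHeader('Content-Type', 'application/json');
        response.write(JSON.stringify({ status: 'ok' }));
    } else {
        response.statusCode = 404;
        response.setHeader('Content-Type', 'text/plain');
        response.write('Not found: ' + request.url);
    }
    response.close();   // always finish the response, or the client keeps waiting
}

// Inside PhantomJS, register the handler with the embedded web server.
// The typeof guard only keeps this sketch loadable in other JavaScript runtimes.
if (typeof phantom !== 'undefined') {
    var server = require('webserver').create();
    if (!server.listen(8080, handleRequest)) {
        console.log('Could not listen on port 8080');
        phantom.exit(1);
    }
}
```

Keeping the handler as a named function, separate from `server.listen`, makes it possible to exercise it with stand-in request and response objects.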

PhantomJS web page module

Description: the PhantomJS web page module API
Document address: [Phantomjs-web page module](http://phantomjs.org/api/webpage/)
Role: Send HTTP requests, fetch network resources, and operate on pages.

Instantiating the API class

    var webPage = require('webpage');
    var page = webPage.create();

Commonly used properties of the page object:

  • page.content: page source code
  • page.title: page title
  • page.cookies: cookies
  • page.plainText: page text content (HTML removed)
  • page.settings: request settings
  • page.url: current URL

clipRect: clip the page area to render

    page.clipRect = {
        top: 14,
        left: 3,
        width: 400,
        height: 300
    };

content: get the web page source code

    var webPage = require('webpage');
    var page = webPage.create();
    page.open('http://thief.one', function (status) {
        var content = page.content;
        console.log('Content: ' + content);
        phantom.exit();
    });

cookies: list the page's cookies

    page.open('http://thief.one', function (status) {
        var cookies = page.cookies;
        console.log('Listing cookies:');
        for(var i in cookies) {
            console.log(cookies[i].name + '=' + cookies[i].value);
        }
        phantom.exit();
    });

customHeaders: set custom request headers

    page.customHeaders = {
        "X-Test": "foo",
        "DNT": "1"
    };

plainText: get the page's text content (HTML stripped, only the text remains)

    page.open('http://thief.one', function (status) {
        console.log('Stripped down page text:\n' + page.plainText);
        phantom.exit();
    });

settings: request settings (for example, the User-Agent header)

    var webPage = require('webpage');
    var page = webPage.create();
    page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';

zoomFactor: scale the page (for example, to create thumbnails)

    var webPage = require('webpage');
    var page = webPage.create();
    page.zoomFactor = 0.25;
    page.render('capture.png');

addCookie: add a cookie

    phantom.addCookie({
        'name' : 'Valid-Cookie-Name', /* required property */
        'value' : 'Valid-Cookie-Value', /* required property */
        'domain' : 'localhost',
        'path' : '/foo', /* required property */
        'httponly' : true,
        'secure' : false,
        'expires' : (new Date()).getTime() + (1000 * 60 * 60) /* <-- expires in 1 hour */
    });
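In the cookie object above, `expires` is a timestamp in milliseconds. As a small sketch of that arithmetic, the hypothetical helper `buildCookie` below computes the expiry from a lifetime; the cookie name, value, and `/` path are placeholders chosen for the example. The `phantom.addCookie` call reports whether the cookie was accepted.

```javascript
// Hypothetical helper: build a cookie object in the shape shown above.
// `lifetimeMs` is how long the cookie should live; `now` defaults to the current
// time and is a parameter only so the result can be checked deterministically.
function buildCookie(name, value, domain, lifetimeMs, now) {
    var start = (typeof now === 'number') ? now : (new Date()).getTime();
    return {
        name: name,
        value: value,
        domain: domain,
        path: '/',
        httponly: true,
        secure: false,
        expires: start + lifetimeMs   // milliseconds since the epoch
    };
}

// Inside PhantomJS, add the cookie and report whether it was accepted.
// The typeof guard only keeps this sketch loadable in other JavaScript runtimes.
if (typeof phantom !== 'undefined') {
    var added = phantom.addCookie(buildCookie('session', 'abc123', 'localhost', 60 * 60 * 1000));
    console.log(added ? 'Cookie added' : 'Cookie rejected');
    phantom.exit();
}
```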

uploadFile: upload a file

    var webPage = require('webpage');
    var page = webPage.create();
    page.uploadFile('input[name=image]', '/path/to/some/photo.jpg');

render: take a page screenshot

    page.viewportSize = { width: 1920, height: 1080 };
    page.open("http://www.google.com", function start(status) {
        page.render('google_home.jpeg', {format: 'jpeg', quality: '100'});
        phantom.exit();
    });
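The screenshot example above renders regardless of whether the page loaded. A slightly more defensive sketch checks the `status` passed to the `page.open` callback before rendering. `screenshotName` is a hypothetical helper that derives a file name from the URL; the URL and viewport size are placeholders.

```javascript
// Hypothetical helper: derive a screenshot file name from a URL's host part.
function screenshotName(url, format) {
    var host = url.replace(/^[a-z]+:\/\//i, '').split('/')[0];   // drop scheme and path
    return host.replace(/[^a-z0-9.-]/gi, '_') + '.' + format;    // keep the name file-system safe
}

// Inside PhantomJS, open the page and render only if loading succeeded.
// The typeof guard only keeps this sketch loadable in other JavaScript runtimes.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var url = 'http://thief.one';
    page.viewportSize = { width: 1920, height: 1080 };
    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + url);
            phantom.exit(1);
            return;
        }
        page.render(screenshotName(url, 'png'));
        phantom.exit();
    });
}
```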

For more examples, see [examples](http://phantomjs.org/examples/index.html).

Related posts

[[PhantomJS series] The right way to use PhantomJS](http://thief.one/2017/03/31/Phantomjs%E6%AD%A3%E7%A1%AE%E6%89%93%E5%BC%80%E6%96%B9%E5%BC%8F/)
[[PhantomJS series] Introduction to the PhantomJS API](http://thief.one/2017/03/13/Phantomjs-Api%E4%BB%8B%E7%BB%8D/)
[[PhantomJS series] Pitfalls encountered with Selenium + PhantomJS](http://thief.one/2017/03/01/Phantomjs%E7%88%AC%E8%BF%87%E7%9A%84%E9%82%A3%E4%BA%9B%E5%9D%91/)
[[PhantomJS series] Selenium + PhantomJS performance optimization](http://thief.one/2017/03/01/Phantomjs%E6%80%A7%E8%83%BD%E4%BC%98%E5%8C%96/)

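Author: nmask

Published: March 13, 2017, 19:03

Last updated: July 11, 2019, 18:07

Original link: https://thief.one/2017/03/13/Phantomjs-Api%E4%BB%8B%E7%BB%8D/

License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Please keep the original link and author when reposting.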
