JavaScript parser in Python [closed]


Nowadays, there is at least one better tool, called slimit:

SlimIt is a JavaScript minifier written in Python. It compiles JavaScript into more compact code so that it downloads and runs faster.

SlimIt also provides a library that includes a JavaScript parser, lexer, pretty printer and a tree visitor.

Demo:

Imagine we have the following javascript code:

    $.ajax({
        type: "POST",
        url: 'http://www.example.com',
        data: {
            email: 'abc@g.com',
            phone: '9999999999',
            name: 'XYZ'
        }
    });

And now we need to get email, phone and name values from the data object.

The idea here is to instantiate a slimit parser, visit all nodes, keep only the assignments, and put them into a dictionary:

    from slimit import ast
    from slimit.parser import Parser
    from slimit.visitors import nodevisitor

    data = """$.ajax({
        type: "POST",
        url: 'http://www.example.com',
        data: {
            email: 'abc@g.com',
            phone: '9999999999',
            name: 'XYZ'
        }
    });"""

    parser = Parser()
    tree = parser.parse(data)

    # Map each assignment's left-hand side to its right-hand side.
    # Nodes without a plain value (such as the nested object) map to ''.
    fields = {getattr(node.left, 'value', ''): getattr(node.right, 'value', '')
              for node in nodevisitor.visit(tree)
              if isinstance(node, ast.Assign)}

    print(fields)

It prints:

    {'name': "'XYZ'",
     'url': "'http://www.example.com'",
     'type': '"POST"',
     'phone': "'9999999999'",
     'data': '',
     'email': "'abc@g.com'"}
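Note that the values keep their JavaScript quotes, and the nested data object shows up as an empty string. Here is a minimal sketch of cleaning that up, assuming the values are simple quoted literals without escape sequences. The dictionary is copied from the output above so the snippet runs without slimit, and `strip_js_quotes` is a hypothetical helper, not part of slimit:

```python
# Output of the slimit example, copied verbatim so this runs without slimit.
fields = {
    'name': "'XYZ'",
    'url': "'http://www.example.com'",
    'type': '"POST"',
    'phone': "'9999999999'",
    'data': '',
    'email': "'abc@g.com'",
}


def strip_js_quotes(raw):
    """Hypothetical helper: drop one pair of matching quotes around a simple string literal."""
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in ('"', "'"):
        return raw[1:-1]
    return raw


# Keep only the keys we care about and clean up their values.
wanted = ('email', 'phone', 'name')
cleaned = {key: strip_js_quotes(fields[key]) for key in wanted}
print(cleaned)
```

This only handles the simple case. Literals containing escaped quotes or unicode escapes would need real unescaping.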


ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.

The ANTLR site provides many grammars, including one for JavaScript.

As it happens, there is a Python API available - so you can call the lexer (recognizer) generated from the grammar directly from Python (good luck).


I have translated esprima.js to Python:

https://github.com/PiotrDabkowski/pyjsparser

    >>> from pyjsparser import parse
    >>> parse('var $ = "Hello!"')
    {
      "type": "Program",
      "body": [
        {
          "type": "VariableDeclaration",
          "declarations": [
            {
              "type": "VariableDeclarator",
              "id": {
                "type": "Identifier",
                "name": "$"
              },
              "init": {
                "type": "Literal",
                "value": "Hello!",
                "raw": '"Hello!"'
              }
            }
          ],
          "kind": "var"
        }
      ]
    }
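Since the result is just nested dicts and lists, you can walk it with a few lines of plain Python. This is a rough sketch, not something pyjsparser ships: `iter_nodes` is a hypothetical helper, and the tree is the output shown above, pasted in so the snippet runs without pyjsparser installed:

```python
def iter_nodes(node):
    """Hypothetical helper: yield every dict that has a "type" key, depth first."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


# The tree printed by parse('var $ = "Hello!"') above.
tree = {
    "type": "Program",
    "body": [
        {
            "type": "VariableDeclaration",
            "declarations": [
                {
                    "type": "VariableDeclarator",
                    "id": {"type": "Identifier", "name": "$"},
                    "init": {"type": "Literal", "value": "Hello!", "raw": '"Hello!"'},
                }
            ],
            "kind": "var",
        }
    ],
}

# Node types in visiting order, and the names of all identifiers.
node_types = [n["type"] for n in iter_nodes(tree)]
identifiers = [n["name"] for n in iter_nodes(tree) if n["type"] == "Identifier"]
print(node_types)
print(identifiers)
```

The same walker should work on any tree that follows this dict-and-list layout, so you can filter on whichever node type you need.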

It's a manual translation, so it's very fast: it takes about 1 second to parse the angular.js file (roughly 100k characters per second). It supports all of ECMAScript 5.1 and parts of version 6, for example arrow functions, const and let.

If you need support for all the newest ES6 features, you can translate esprima on the fly with Js2Py:

    import js2py

    esprima = js2py.require("esprima@4.0.1")
    esprima.parse("a = () => {return 11};")
    # {'body': [{'expression': {'left': {'name': 'a', 'type': 'Identifier'}, 'operator': '=', 'right': {'async': False, 'body': {'body': [{'argument': {'raw': '11', 'type': 'Literal', 'value': 11}, 'type': 'ReturnStatement'}], 'type': 'BlockStatement'}, 'expression': False, 'generator': False, 'id': None, 'params': [], 'type': 'ArrowFunctionExpression'}, 'type': 'AssignmentExpression'}, 'type': 'ExpressionStatement'}], 'sourceType': 'script', 'type': 'Program'}