parse5


WHATWG HTML5 specification-compliant, fast, production-ready HTML parsing/serialization toolset for Node.

parse5 contains nearly everything you need to deal with HTML. It is the fastest spec-compliant HTML parser for Node to date and parses HTML the way the latest version of your browser does. It is stable and used by projects such as [jsdom](https://github.com/tmpvar/jsdom), [Angular2](https://github.com/angular/angular), [Polymer](https://www.polymer-project.org/1.0/) and many more.

# Table of contents

* [Install](#install)
* [Usage](#usage)
* [API Reference](#api-reference)
* [FAQ](#faq)
* [Version history](#version-history)
* [License](#license-and-author-information)

# Install
```
$ npm install parse5
```

# Usage
```js
var parse5 = require('parse5');

var document     = parse5.parse('Hi there!');
var documentHtml = parse5.serialize(document);

var fragment     = parse5.parseFragment('Yo!');
var fragmentHtml = parse5.serialize(fragment);
```
For more advanced examples see [API reference](#api-reference) and [FAQ](#faq).

# API Reference
## Objects

* [parse5](#parse5) : object

## Typedefs

* [ElementLocationInfo](#ElementLocationInfo) : Object
* [LocationInfo](#LocationInfo) : Object
* [ParserOptions](#ParserOptions) : Object
* [SAXParserOptions](#SAXParserOptions) : Object
* [SerializerOptions](#SerializerOptions) : Object
* [TreeAdapter](#TreeAdapter) : Object
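
For orientation, the `TreeAdapter` typedef listed above is a plain object of functions that create, modify and inspect tree nodes. The sketch below is a minimal, partial illustration of that idea and does not depend on parse5. The node shape (`kind`, `value`, `childNodes`, `parentNode`) is an assumption made up for this example, not the format used by the default adapter, and a real adapter would need to expose every method described in the `TreeAdapter` reference.

```js
// Hypothetical node format, for illustration only.
var myTreeAdapter = {
    createElement: function (tagName, namespaceURI, attrs) {
        return {
            kind: 'element',
            tagName: tagName,
            namespaceURI: namespaceURI,
            attrs: attrs,
            childNodes: [],
            parentNode: null
        };
    },

    // Appends to the last child if it is a text node,
    // otherwise inserts a new text node.
    insertText: function (parentNode, text) {
        var children = parentNode.childNodes;
        var last = children[children.length - 1];

        if (last && last.kind === 'text') {
            last.value += text;
            return;
        }

        children.push({ kind: 'text', value: text, parentNode: parentNode });
    },

    // Removes the node from its parent, if it has one.
    detachNode: function (node) {
        if (!node.parentNode)
            return;

        var siblings = node.parentNode.childNodes;

        siblings.splice(siblings.indexOf(node), 1);
        node.parentNode = null;
    },

    getFirstChild: function (node) { return node.childNodes[0]; },
    getChildNodes: function (node) { return node.childNodes; },
    getParentNode: function (node) { return node.parentNode; },
    getTagName: function (element) { return element.tagName; },
    getTextNodeContent: function (textNode) { return textNode.value; },
    isTextNode: function (node) { return node.kind === 'text'; },
    isElementNode: function (node) { return node.kind === 'element'; }
};

// Exercise the adapter directly, without a parser.
var div = myTreeAdapter.createElement('div', 'http://www.w3.org/1999/xhtml', []);

myTreeAdapter.insertText(div, 'Hi ');
myTreeAdapter.insertText(div, 'there!');

// The two insertions are merged into a single text node.
console.log(myTreeAdapter.getTextNodeContent(myTreeAdapter.getFirstChild(div))); // prints "Hi there!"
```

Keeping the adapter a stateless object of functions means the same object can be handed to both parsing and serialization through the `treeAdapter` option.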
## parse5 : object
**Kind**: global namespace

* [parse5](#parse5) : object
    * [.ParserStream](#parse5+ParserStream) ⇐ stream.Writable
        * [new ParserStream(options)](#new_parse5+ParserStream_new)
        * [.document](#parse5+ParserStream+document) : ASTNode.<document>
        * ["script" (scriptElement, documentWrite(html), resume)](#parse5+ParserStream+event_script)
    * [.SAXParser](#parse5+SAXParser) ⇐ stream.Transform
        * [new SAXParser(options)](#new_parse5+SAXParser_new)
        * [.stop()](#parse5+SAXParser+stop)
        * ["startTag" (name, attributes, selfClosing, [location])](#parse5+SAXParser+event_startTag)
        * ["endTag" (name, [location])](#parse5+SAXParser+event_endTag)
        * ["comment" (text, [location])](#parse5+SAXParser+event_comment)
        * ["doctype" (name, publicId, systemId, [location])](#parse5+SAXParser+event_doctype)
        * ["text" (text, [location])](#parse5+SAXParser+event_text)
    * [.SerializerStream](#parse5+SerializerStream) ⇐ stream.Readable
        * [new SerializerStream(node, [options])](#new_parse5+SerializerStream_new)
    * [.treeAdapters](#parse5+treeAdapters)
    * [.parse(html, [options])](#parse5+parse) ⇒ ASTNode.<Document>
    * [.parseFragment([fragmentContext], html, [options])](#parse5+parseFragment) ⇒ ASTNode.<DocumentFragment>
    * [.serialize(node, [options])](#parse5+serialize) ⇒ String

### parse5.ParserStream ⇐ stream.Writable
**Kind**: instance class of [parse5](#parse5)
**Extends:** stream.Writable

* [.ParserStream](#parse5+ParserStream) ⇐ stream.Writable
    * [new ParserStream(options)](#new_parse5+ParserStream_new)
    * [.document](#parse5+ParserStream+document) : ASTNode.<document>
    * ["script" (scriptElement, documentWrite(html), resume)](#parse5+ParserStream+event_script)

#### new ParserStream(options)
Streaming HTML parser with scripting support. [Writable stream](https://nodejs.org/api/stream.html#stream_class_stream_writable).

| Param | Type | Description |
| --- | --- | --- |
| options | [ParserOptions](#ParserOptions) | Parsing options. |

**Example**
```js
var parse5 = require('parse5');
var http = require('http');

// Fetch google.com content and obtain its <body> node
http.get('http://google.com', function(res) {
    var parser = new parse5.ParserStream();

    parser.on('finish', function() {
        var body = parser.document.childNodes[0].childNodes[1];
    });

    res.pipe(parser);
});
```

#### parserStream.document : ASTNode.<document>
Resulting document node.

**Kind**: instance property of [ParserStream](#parse5+ParserStream)

#### "script" (scriptElement, documentWrite(html), resume)
Raised when parser encounters a `<script>` element.

### parse5.SAXParser ⇐ stream.Transform
**Kind**: instance class of [parse5](#parse5)
**Extends:** stream.Transform

* [.SAXParser](#parse5+SAXParser) ⇐ stream.Transform
    * [new SAXParser(options)](#new_parse5+SAXParser_new)
    * [.stop()](#parse5+SAXParser+stop)
    * ["startTag" (name, attributes, selfClosing, [location])](#parse5+SAXParser+event_startTag)
    * ["endTag" (name, [location])](#parse5+SAXParser+event_endTag)
    * ["comment" (text, [location])](#parse5+SAXParser+event_comment)
    * ["doctype" (name, publicId, systemId, [location])](#parse5+SAXParser+event_doctype)
    * ["text" (text, [location])](#parse5+SAXParser+event_text)

#### new SAXParser(options)
Streaming [SAX](https://en.wikipedia.org/wiki/Simple_API_for_XML)-style HTML parser. [Transform stream](https://nodejs.org/api/stream.html#stream_class_stream_transform) (which means you can pipe *through* it, see example).

| Param | Type | Description |
| --- | --- | --- |
| options | [SAXParserOptions](#SAXParserOptions) | Parsing options. |

**Example**
```js
var parse5 = require('parse5');
var http = require('http');
var fs = require('fs');

var file = fs.createWriteStream('/home/google.com.html');
var parser = new parse5.SAXParser();

parser.on('text', function(text) {
    // Handle page text content
    // ...
});

http.get('http://google.com', function(res) {
    // SAXParser is the Transform stream, which means you can pipe
    // through it. So you can analyze page content and e.g. save it
    // to the file at the same time:
    res.pipe(parser).pipe(file);
});
```

#### saxParser.stop()
Stops parsing. Useful if you want the parser to stop consuming CPU time once you've obtained the desired info from the input stream. Doesn't prevent piping, so data will flow through the parser as usual.

**Kind**: instance method of [SAXParser](#parse5+SAXParser)
**Example**
```js
var parse5 = require('parse5');
var http = require('http');
var fs = require('fs');

var file = fs.createWriteStream('/home/google.com.html');
var parser = new parse5.SAXParser();

parser.on('doctype', function(name, publicId, systemId) {
    // Process doctype info and stop parsing
    // ...
    parser.stop();
});

http.get('http://google.com', function(res) {
    // Despite the fact that parser.stop() was called, the whole
    // content of the page will be written to the file
    res.pipe(parser).pipe(file);
});
```

#### "startTag" (name, attributes, selfClosing, [location])
Raised when parser encounters a start tag.

**Kind**: event emitted by [SAXParser](#parse5+SAXParser)

| Param | Type | Description |
| --- | --- | --- |
| name | String | Tag name. |
| attributes | Array | List of attributes in `{ key: String, value: String }` form. |
| selfClosing | Boolean | Indicates if tag is self-closing. |
| [location] | [LocationInfo](#LocationInfo) | Start tag source code location info. Available if location info is enabled in [SAXParserOptions](#SAXParserOptions). |

#### "endTag" (name, [location])
Raised when parser encounters an end tag.

**Kind**: event emitted by [SAXParser](#parse5+SAXParser)

| Param | Type | Description |
| --- | --- | --- |
| name | String | Tag name. |
| [location] | [LocationInfo](#LocationInfo) | End tag source code location info. Available if location info is enabled in [SAXParserOptions](#SAXParserOptions). |

#### "comment" (text, [location])
Raised when parser encounters a comment.
**Kind**: event emitted by [SAXParser](#parse5+SAXParser)

| Param | Type | Description |
| --- | --- | --- |
| text | String | Comment text. |
| [location] | [LocationInfo](#LocationInfo) | Comment source code location info. Available if location info is enabled in [SAXParserOptions](#SAXParserOptions). |

#### "doctype" (name, publicId, systemId, [location])
Raised when parser encounters a [document type declaration](https://en.wikipedia.org/wiki/Document_type_declaration).

**Kind**: event emitted by [SAXParser](#parse5+SAXParser)

| Param | Type | Description |
| --- | --- | --- |
| name | String | Document type name. |
| publicId | String | Document type public identifier. |
| systemId | String | Document type system identifier. |
| [location] | [LocationInfo](#LocationInfo) | Document type declaration source code location info. Available if location info is enabled in [SAXParserOptions](#SAXParserOptions). |

#### "text" (text, [location])
Raised when parser encounters text content.

**Kind**: event emitted by [SAXParser](#parse5+SAXParser)

| Param | Type | Description |
| --- | --- | --- |
| text | String | Text content. |
| [location] | [LocationInfo](#LocationInfo) | Text content code location info. Available if location info is enabled in [SAXParserOptions](#SAXParserOptions). |

### parse5.SerializerStream ⇐ stream.Readable
**Kind**: instance class of [parse5](#parse5)
**Extends:** stream.Readable

#### new SerializerStream(node, [options])
Streaming AST node to HTML serializer. [Readable stream](https://nodejs.org/api/stream.html#stream_class_stream_readable).

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node to serialize. |
| [options] | [SerializerOptions](#SerializerOptions) | Serialization options. |

**Example**
```js
var parse5 = require('parse5');
var fs = require('fs');

var file = fs.createWriteStream('/home/index.html');

// Serialize parsed document to the HTML and write it to file
var document = parse5.parse('Who is John Galt?');
var serializer = new parse5.SerializerStream(document);

serializer.pipe(file);
```

### parse5.treeAdapters
Provides built-in tree adapters which can be used for parsing and serialization.

**Kind**: instance property of [parse5](#parse5)
**Properties**

| Name | Type | Description |
| --- | --- | --- |
| default | [TreeAdapter](#TreeAdapter) | Default tree format for parse5. |
| htmlparser2 | [TreeAdapter](#TreeAdapter) | Quite popular [htmlparser2](https://github.com/fb55/htmlparser2) tree format (e.g. used by [cheerio](https://github.com/MatthewMueller/cheerio) and [jsdom](https://github.com/tmpvar/jsdom)). |

**Example**
```js
var parse5 = require('parse5');

// Use default tree adapter for parsing
var document = parse5.parse('<div></div>', { treeAdapter: parse5.treeAdapters.default });

// Use htmlparser2 tree adapter with SerializerStream
var serializer = new parse5.SerializerStream(node, { treeAdapter: parse5.treeAdapters.htmlparser2 });
```

### parse5.parse(html, [options]) ⇒ ASTNode.<Document>
Parses HTML string.

**Kind**: instance method of [parse5](#parse5)
**Returns**: ASTNode.<Document> - document

| Param | Type | Description |
| --- | --- | --- |
| html | string | Input HTML string. |
| [options] | [ParserOptions](#ParserOptions) | Parsing options. |

**Example**
```js
var parse5 = require('parse5');

var document = parse5.parse('Hi there!');
```

### parse5.parseFragment([fragmentContext], html, [options]) ⇒ ASTNode.<DocumentFragment>
Parses HTML fragment.

**Kind**: instance method of [parse5](#parse5)
**Returns**: ASTNode.<DocumentFragment> - documentFragment

| Param | Type | Description |
| --- | --- | --- |
| [fragmentContext] | ASTNode | Parsing context element. If specified, the given fragment will be parsed as if it was set to the context element's `innerHTML` property. |
| html | string | Input HTML fragment string. |
| [options] | [ParserOptions](#ParserOptions) | Parsing options. |

**Example**
```js
var parse5 = require('parse5');

var documentFragment = parse5.parseFragment('<table></table>');

// Parse html fragment in context of the parsed <table> element
var trFragment = parse5.parseFragment(documentFragment.childNodes[0], '<tr><td>Shake it, baby</td></tr>');
```

### parse5.serialize(node, [options]) ⇒ String
Serializes AST node to HTML string.

**Kind**: instance method of [parse5](#parse5)
**Returns**: String - html

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node to serialize. |
| [options] | [SerializerOptions](#SerializerOptions) | Serialization options. |

**Example**
```js
var parse5 = require('parse5');

var document = parse5.parse('Hi there!');

// Serialize document
var html = parse5.serialize(document);

// Serialize <body> element content
var bodyInnerHtml = parse5.serialize(document.childNodes[0].childNodes[1]);
```

## ElementLocationInfo : Object
**Kind**: global typedef
**Extends:** [LocationInfo](#LocationInfo)
**Properties**

| Name | Type | Description |
| --- | --- | --- |
| startTag | [LocationInfo](#LocationInfo) | Element's start tag [LocationInfo](#LocationInfo). |
| endTag | [LocationInfo](#LocationInfo) | Element's end tag [LocationInfo](#LocationInfo). |

## LocationInfo : Object
**Kind**: global typedef
**Properties**

| Name | Type | Description |
| --- | --- | --- |
| line | Number | One-based line index |
| col | Number | One-based column index |
| startOffset | Number | Zero-based first character index |
| endOffset | Number | Zero-based last character index |

## ParserOptions : Object
**Kind**: global typedef
**Properties**

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| locationInfo | Boolean | false | Enables source code location information for the nodes. When enabled, each node (except root node) has `__location` property. In case the node is not an empty element, `__location` will be [ElementLocationInfo](#ElementLocationInfo) object, otherwise it's [LocationInfo](#LocationInfo). If the element was implicitly created by the parser, its `__location` property will be `null`. |
| treeAdapter | [TreeAdapter](#TreeAdapter) | parse5.treeAdapters.default | Specifies resulting tree format. |

## SAXParserOptions : Object
**Kind**: global typedef
**Properties**

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| locationInfo | Boolean | false | Enables source code location information for the tokens. When enabled, each token event handler will receive [LocationInfo](#LocationInfo) object as the last argument. |

## SerializerOptions : Object
**Kind**: global typedef
**Properties**

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| treeAdapter | [TreeAdapter](#TreeAdapter) | parse5.treeAdapters.default | Specifies input tree format. |

## TreeAdapter : Object
**Kind**: global typedef

* [TreeAdapter](#TreeAdapter) : Object
    * [.createDocument()](#TreeAdapter.createDocument) ⇒ ASTNode.<Document>
    * [.createDocumentFragment()](#TreeAdapter.createDocumentFragment) ⇒ ASTNode.<DocumentFragment>
    * [.createElement(tagName, namespaceURI, attrs)](#TreeAdapter.createElement) ⇒ ASTNode.<Element>
    * [.createCommentNode(data)](#TreeAdapter.createElement) ⇒ ASTNode.<CommentNode>
    * [.setDocumentType(document, name, publicId, systemId)](#TreeAdapter.setDocumentType)
    * [.setQuirksMode(document)](#TreeAdapter.setQuirksMode)
    * [.isQuirksMode(document)](#TreeAdapter.setQuirksMode) ⇒ Boolean
    * [.detachNode(node)](#TreeAdapter.detachNode)
    * [.insertText(parentNode, text)](#TreeAdapter.insertText)
    * [.insertTextBefore(parentNode, text, referenceNode)](#TreeAdapter.insertTextBefore)
    * [.adoptAttributes(recipientNode, attrs)](#TreeAdapter.adoptAttributes)
    * [.getFirstChild(node)](#TreeAdapter.getFirstChild) ⇒ ASTNode
    * [.getChildNodes(node)](#TreeAdapter.getChildNodes) ⇒ Array
    * [.getParentNode(node)](#TreeAdapter.getParentNode) ⇒ ASTNode
    * [.getAttrList(node)](#TreeAdapter.getAttrList) ⇒ Array
    * [.getTagName(element)](#TreeAdapter.getTagName) ⇒ String
    * [.getNamespaceURI(element)](#TreeAdapter.getNamespaceURI) ⇒ String
    * [.getTextNodeContent(textNode)](#TreeAdapter.getTextNodeContent) ⇒ String
    * [.getCommentNodeContent(commentNode)](#TreeAdapter.getTextNodeContent) ⇒ String
    * [.getDocumentTypeNodeName(doctypeNode)](#TreeAdapter.getDocumentTypeNodeName) ⇒ String
    * [.getDocumentTypeNodePublicId(doctypeNode)](#TreeAdapter.getDocumentTypeNodePublicId) ⇒ String
    * [.getDocumentTypeNodeSystemId(doctypeNode)](#TreeAdapter.getDocumentTypeNodeSystemId) ⇒ String
    * [.isTextNode(node)](#TreeAdapter.isTextNode) ⇒ Boolean
    * [.isCommentNode(node)](#TreeAdapter.isCommentNode) ⇒ Boolean
    * [.isDocumentTypeNode(node)](#TreeAdapter.isDocumentTypeNode) ⇒ Boolean
    * [.isElementNode(node)](#TreeAdapter.isElementNode) ⇒ Boolean

### TreeAdapter.createDocument() ⇒ ASTNode.<Document>
Creates a document node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode.<Document> - document
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L19)

### TreeAdapter.createDocumentFragment() ⇒ ASTNode.<DocumentFragment>
Creates a document fragment node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode.<DocumentFragment> - fragment
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L37)

### TreeAdapter.createElement(tagName, namespaceURI, attrs) ⇒ ASTNode.<Element>
Creates an element node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode.<Element> - element
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L61)

| Param | Type | Description |
| --- | --- | --- |
| tagName | String | Tag name of the element. |
| namespaceURI | String | Namespace of the element. |
| attrs | Array | Attribute name-value pair array. Foreign attributes may contain `namespace` and `prefix` fields as well. |
### TreeAdapter.createCommentNode(data) ⇒ ASTNode.<CommentNode>
Creates a comment node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode.<CommentNode> - comment
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L85)

| Param | Type | Description |
| --- | --- | --- |
| data | String | Comment text. |

### TreeAdapter.setDocumentType(document, name, publicId, systemId)
Sets the document type. If `document` already has a document type node in it, then the `name`, `publicId` and `systemId` properties of that node will be updated with the provided values. Otherwise, creates a new document type node with the given properties and inserts it into `document`.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L131)

| Param | Type | Description |
| --- | --- | --- |
| document | ASTNode.<Document> | Document node. |
| name | String | Document type name. |
| publicId | String | Document type public identifier. |
| systemId | String | Document type system identifier. |

### TreeAdapter.setQuirksMode(document)
Sets the document quirks mode flag.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L167)

| Param | Type | Description |
| --- | --- | --- |
| document | ASTNode.<Document> | Document node. |

### TreeAdapter.isQuirksMode(document) ⇒ Boolean
Determines if the document quirks mode flag is set.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L183)

| Param | Type | Description |
| --- | --- | --- |
| document | ASTNode.<Document> | Document node. |

### TreeAdapter.detachNode(node)
Removes the node from its parent.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L197)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.insertText(parentNode, text)
Inserts text into the node. If the last child of the node is a text node, then the provided text will be appended to that text node's content. Otherwise, inserts a new text node with the given text.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L220)

| Param | Type | Description |
| --- | --- | --- |
| parentNode | ASTNode | Node to insert text into. |
| text | String | Text to insert. |

### TreeAdapter.insertTextBefore(parentNode, text, referenceNode)
Inserts text into the node before the referenced child node. If the node before the referenced child node is a text node, then the provided text will be appended to that text node's content. Otherwise, inserts a new text node with the given text before the referenced child node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L249)

| Param | Type | Description |
| --- | --- | --- |
| parentNode | ASTNode | Node to insert text into. |
| text | String | Text to insert. |
| referenceNode | ASTNode | Node to insert text before. |

### TreeAdapter.adoptAttributes(recipientNode, attrs)
Copies attributes to the given node. Only those attributes which are not yet present in the node are copied.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L270)

| Param | Type | Description |
| --- | --- | --- |
| recipientNode | ASTNode | Node to copy attributes into. |
| attrs | Array | Attributes to copy. |

### TreeAdapter.getFirstChild(node) ⇒ ASTNode
Returns the first child of the given node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode - firstChild
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L297)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.getChildNodes(node) ⇒ Array
Returns an array of the given node's children.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: Array - children
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L313)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.getParentNode(node) ⇒ ASTNode
Returns the given node's parent.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: ASTNode - parent
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L329)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.getAttrList(node) ⇒ Array
Returns an array of the given node's attributes as name-value pairs. Foreign attributes may contain `namespace` and `prefix` fields as well.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: Array - attributes
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L346)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.getTagName(element) ⇒ String
Returns the given element's tag name.
**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - tagName
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L364)

| Param | Type | Description |
| --- | --- | --- |
| element | ASTNode.<Element> | Element. |

### TreeAdapter.getNamespaceURI(element) ⇒ String
Returns the given element's namespace.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - namespaceURI
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L380)

| Param | Type | Description |
| --- | --- | --- |
| element | ASTNode.<Element> | Element. |

### TreeAdapter.getTextNodeContent(textNode) ⇒ String
Returns the given text node's content.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - text
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L396)

| Param | Type | Description |
| --- | --- | --- |
| textNode | ASTNode.<Text> | Text node. |

### TreeAdapter.getCommentNodeContent(commentNode) ⇒ String
Returns the given comment node's content.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - commentText
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L412)

| Param | Type | Description |
| --- | --- | --- |
| commentNode | ASTNode.<Comment> | Comment node. |

### TreeAdapter.getDocumentTypeNodeName(doctypeNode) ⇒ String
Returns the given document type node's name.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - name
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L428)

| Param | Type | Description |
| --- | --- | --- |
| doctypeNode | ASTNode.<DocumentType> | Document type node. |

### TreeAdapter.getDocumentTypeNodePublicId(doctypeNode) ⇒ String
Returns the given document type node's public identifier.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - publicId
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L444)

| Param | Type | Description |
| --- | --- | --- |
| doctypeNode | ASTNode.<DocumentType> | Document type node. |

### TreeAdapter.getDocumentTypeNodeSystemId(doctypeNode) ⇒ String
Returns the given document type node's system identifier.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**Returns**: String - systemId
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L460)

| Param | Type | Description |
| --- | --- | --- |
| doctypeNode | ASTNode.<DocumentType> | Document type node. |

### TreeAdapter.isTextNode(node) ⇒ Boolean
Determines if the given node is a text node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L477)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.isCommentNode(node) ⇒ Boolean
Determines if the given node is a comment node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L493)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

### TreeAdapter.isDocumentTypeNode(node) ⇒ Boolean
Determines if the given node is a document type node.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L509)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |
### TreeAdapter.isElementNode(node) ⇒ Boolean
Determines if the given node is an element.

**Kind**: static method of [TreeAdapter](#TreeAdapter)
**See**: [default implementation.](https://github.com/inikulin/parse5/blob/tree-adapter-docs-rev/lib/tree_adapters/default.js#L525)

| Param | Type | Description |
| --- | --- | --- |
| node | ASTNode | Node. |

# FAQ

## Q: I want to work with my own document tree format. How can I achieve this?

You can create a custom tree adapter so parse5 can work with your own DOM-tree implementation. Then just pass it to the parser or serializer via the `treeAdapter` option:

```js
var parse5 = require('parse5');

var myTreeAdapter = {
    // Adapter methods...
};

var document = parse5.parse('<div></div>', { treeAdapter: myTreeAdapter });

var html = parse5.serialize(document, { treeAdapter: myTreeAdapter });
```
You can find descriptions of the methods which should be exposed by a tree adapter and links to their default implementations in the [API reference](#TreeAdapter).

## Q: How can I use parse5 in the browser?

Just compile it with [browserify](http://browserify.org/) and you're set.

## Q: I'm parsing `<img src="foo">` with the `SAXParser` and I expect `selfClosing` flag to be `true` for the `<img>` tag. But it's not. Is there something wrong with parser?

No. A self-closing tag is a tag that has `/` before the closing bracket, e.g. `<br/>`, `<meta/>`. In the provided example the tag just doesn't have an end tag. Self-closing tags and tags without end tags are treated differently by the parser: in the case of a self-closing tag, the parser will not look for the appropriate closing tag and expects the element to not have any content. But if the start tag is not self-closing, the parser will treat everything after it (with a few exceptions) as the element content. However, if the start tag is in the list of [void elements](https://html.spec.whatwg.org/multipage/syntax.html#void-elements), the parser expects the corresponding element to not have content and behaves the same way as if the element is self-closing. So, semantically, if the element is a void element, self-closing tags and tags without closing tags are equivalent, but that is not true for all other tags.

**TL;DR**: `selfClosing` is a part of the lexical information and will be set only if the tag in the source code has `/` before the closing bracket.

## Q: I have some weird output from the parser, seems like it's a bug.

Most likely, it's not. There are a lot of weird edge cases in the HTML5 parsing algorithm, e.g.:
```html
<b>1<p>2</b>3</p>
```

will be parsed as

```html
<b>1</b><p><b>2</b>3</p>
```

Just try it in the latest version of your browser before submitting the issue.

# Version history

## 2.0.0
* Add: [ParserStream](http://inikulin.github.io/parse5/#parse5+ParserStream) with scripting support.
* Add: [SerializerStream](http://inikulin.github.io/parse5/#parse5+SerializerStream)
* Add: Line/column location info.
* Update (**breaking**): `SimpleApiParser` was renamed to [SAXParser](http://inikulin.github.io/parse5/#parse5+SAXParser).
* Update (**breaking**): [SAXParser](http://inikulin.github.io/parse5/#parse5+SAXParser) is now a [transform stream](https://nodejs.org/api/stream.html#stream_class_stream_transform).
* Update (**breaking**): [SAXParser](http://inikulin.github.io/parse5/#parse5+SAXParser) handler subscription is now done via events.
* Add: [SAXParser.stop()](http://inikulin.github.io/parse5/#parse5+SAXParser+stop)
* Add (**breaking**): [parse5.parse()](http://inikulin.github.io/parse5/#parse5+parse) and [parse5.parseFragment()](http://inikulin.github.io/parse5/#parse5+parseFragment) methods as replacement for the `Parser` class.
* Add (**breaking**): [parse5.serialize()](http://inikulin.github.io/parse5/#parse5+serialize) method as replacement for the `Serializer` class.
* Update: parsing algorithm was updated with the latest [HTML spec](https://html.spec.whatwg.org/) changes.
* Remove (**breaking**): `decodeHtmlEntities` and `encodeHtmlEntities` options. [Discussion](https://github.com/inikulin/parse5/issues/75).

## 1.5.0
* Add: Location info for the element start and end tags (by @sakagg).

## 1.4.2
* Fix: htmlparser2 tree adapter `DocumentType.data` property rendering (GH [#45](https://github.com/inikulin/parse5/issues/45)).

## 1.4.1
* Fix: Location info handling for the implicitly generated `` and `` elements (GH [#44](https://github.com/inikulin/parse5/issues/44)).

## 1.4.0
* Add: Parser [decodeHtmlEntities](https://github.com/inikulin/parse5#optionsdecodehtmlentities) option.
* Add: SimpleApiParser [decodeHtmlEntities](https://github.com/inikulin/parse5#optionsdecodehtmlentities-1) option.
* Add: Parser [locationInfo](https://github.com/inikulin/parse5#optionslocationinfo) option.
* Add: SimpleApiParser [locationInfo](https://github.com/inikulin/parse5#optionslocationinfo-1) option.

## 1.3.2
* Fix: `` processing in `
Shake it, baby