MiniSoup HTML Parser (Independent Publisher) (Preview)

A lightweight HTML parsing library inspired by Beautiful Soup that analyzes HTML content and extracts elements and their values.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Logic Apps | Standard | All Logic Apps regions except the following: - Azure Government regions - Azure China regions - US Department of Defense (DoD) |
Contact | |
---|---|
Name | MiniSoup Support |
URL | https://github.com/DEmodoriGatsuO/MiniSoup |
Email | [email protected] |
Connector Metadata | |
---|---|
Publisher | Shogo Shindo |
Website | https://github.com/DEmodoriGatsuO/MiniSoup |
Privacy policy | https://github.com/DEmodoriGatsuO/MiniSoup/blob/main/PRIVACY.md |
Categories | Data;Website |
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 100 | 60 seconds |
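
The limit above allows at most 100 calls per connection in each 60-second renewal period. A caller that drives the connector from its own code could pace requests on the client side. What follows is a minimal sketch of a sliding-window limiter; the class `SlidingWindowLimiter` and its methods are hypothetical and not part of the connector, and how the service actually measures its window is an assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side guard: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock        # injectable so the limiter can be tested
        self._stamps = deque()    # timestamps of recent calls, oldest first

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()

    def try_acquire(self):
        """Record a call and return True if it fits in the window, else False."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False

    def seconds_until_free(self):
        """Seconds to wait before another call is allowed (0.0 if allowed now)."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])
```

A caller would check `try_acquire()` before each action and sleep for `seconds_until_free()` when it returns `False`.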
Actions
Extract Values from HTML Elements | Extracts specific attribute values from HTML elements matching the provided selector |
Fetch HTML Content | Fetches HTML content from a specified URL |
Find All Matching Elements | Finds all HTML elements matching the specified tag name and optional attributes |
Parse HTML Table | Parses an HTML table into structured data with headers and rows |
Select HTML Elements | Selects HTML elements matching the provided selector |
Extract Values from HTML Elements
Extracts specific attribute values from HTML elements matching the provided selector
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
html | html | True | string | HTML content to be parsed |
selector | selector | True | string | CSS selector or XPath for targeting elements |
attribute | attribute | True | string | Attribute to extract from selected elements. Use 'text' for inner text, 'html' for inner HTML, or a specific attribute name |
selector_type | selector_type | | string | Type of selector to use |
Returns
Name | Path | Type | Description |
---|---|---|---|
success | success | boolean | Indicates whether the operation was successful |
values | values | array of string | Array of extracted values from the matching elements |
count | count | integer | Number of values extracted |
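
To make the response shape concrete, the sketch below mimics this action locally with Python's standard `html.parser`. It is not the connector's implementation. The function `extract_values` is hypothetical, and it assumes the selector is a bare tag name such as `a`; real CSS or XPath selectors and the special `text` and `html` attribute values are not handled.

```python
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Collects one attribute from every start tag with a given name."""

    def __init__(self, tag_name, attribute):
        super().__init__()
        self.tag_name = tag_name.lower()
        self.attribute = attribute.lower()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names.
        if tag != self.tag_name:
            return
        for name, value in attrs:
            if name == self.attribute and value is not None:
                self.values.append(value)
                break  # take the first occurrence of the attribute only


def extract_values(html, selector, attribute):
    """Hypothetical local stand-in returning success, values and count."""
    parser = _AttrCollector(selector, attribute)
    parser.feed(html)
    parser.close()
    return {"success": True, "values": parser.values, "count": len(parser.values)}


result = extract_values(
    '<ul><li><a href="/docs">Docs</a></li><li><a href="/blog">Blog</a></li></ul>',
    selector="a",
    attribute="href",
)
print(result)
```

Here `count` always equals the length of `values`, which matches the descriptions in the Returns table.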
Fetch HTML Content
Fetches HTML content from a specified URL
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
url | url | True | string | URL to fetch HTML content from |
Returns
Name | Path | Type | Description |
---|---|---|---|
success | success | boolean | Indicates whether the operation was successful |
html | html | string | HTML content retrieved from the specified URL |
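
As an illustration of the two documented return fields, here is a minimal local equivalent built on `urllib.request`. The function name `fetch_html`, the 10-second timeout, and the choice to report failures as `success: False` with an empty `html` string are assumptions; the connector's actual error behaviour is not described in this reference.

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Hypothetical stand-in returning a dict with `success` and `html`."""
    try:
        with urlopen(url, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            body = response.read().decode(charset, errors="replace")
        return {"success": True, "html": body}
    except (OSError, ValueError, LookupError):
        # OSError covers network and HTTP errors, ValueError covers malformed
        # URLs, LookupError covers an unknown charset name.
        return {"success": False, "html": ""}
```

The `html` value returned here could then be passed as the `html` parameter of the parsing actions.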
Find All Matching Elements
Finds all HTML elements matching the specified tag name and optional attributes
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
html | html | True | string | HTML content to be parsed |
tag_name | tag_name | True | string | HTML tag name to search for |
id | id | | string | Filter by element ID |
class | class | | string | Filter by element class |
Returns
Name | Path | Type | Description |
---|---|---|---|
success | success | boolean | Indicates whether the operation was successful |
elements | elements | array of HtmlElement | Array of HTML elements that match the specified tag name and attributes |
count | count | integer | Number of elements found |
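
The filtering described above (a required tag name plus optional `id` and `class` filters) can be sketched with the standard library as follows. This is an illustration, not the connector's code. The names `find_all`, `element_id` and `class_name` are hypothetical, each returned element carries only `tag` and `attributes`, and matching `class` against any one of the space-separated class names is an assumption.

```python
from html.parser import HTMLParser


class _TagFinder(HTMLParser):
    """Collects start tags that match a tag name and optional id/class."""

    def __init__(self, tag_name, element_id=None, class_name=None):
        super().__init__()
        self.tag_name = tag_name.lower()
        self.element_id = element_id
        self.class_name = class_name
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag_name:
            return
        attributes = {name: (value or "") for name, value in attrs}
        if self.element_id is not None and attributes.get("id") != self.element_id:
            return
        # A class attribute may hold several space-separated class names.
        classes = attributes.get("class", "").split()
        if self.class_name is not None and self.class_name not in classes:
            return
        self.elements.append({"tag": tag, "attributes": attributes})


def find_all(html, tag_name, element_id=None, class_name=None):
    """Hypothetical stand-in returning success, elements and count."""
    finder = _TagFinder(tag_name, element_id, class_name)
    finder.feed(html)
    finder.close()
    return {"success": True, "elements": finder.elements, "count": len(finder.elements)}


page = '<div id="main" class="card wide">A</div><div class="card">B</div><span class="card">C</span>'
print(find_all(page, "div", class_name="card")["count"])
```

Omitting both optional filters returns every element with the given tag name.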
Parse HTML Table
Parses an HTML table into structured data with headers and rows
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
html | html | True | string | HTML content containing the table |
table_selector | table_selector | | string | CSS selector to locate the HTML table element |
header_rows_exist | header_rows_exist | | boolean | Whether the table has header rows |
Returns
Name | Path | Type | Description |
---|---|---|---|
success | success | boolean | Indicates whether the operation was successful |
Headers | data.Headers | array of string | Column headers extracted from the table |
Rows | data.Rows | array of array | Table rows, each containing an array of cell values |
items | data.Rows | array of string | |
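
The nested `data.Headers` and `data.Rows` structure may be easier to picture with a small local example. The sketch below is hypothetical and uses only the standard library. It assumes the input holds a single table, ignores `table_selector`, and treats the first row as the header row when `header_rows_exist` is true; the connector may decide these points differently.

```python
from html.parser import HTMLParser


class _TableReader(HTMLParser):
    """Collects the text of every cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html, header_rows_exist=True):
    """Hypothetical stand-in returning success plus data.Headers and data.Rows."""
    reader = _TableReader()
    reader.feed(html)
    reader.close()
    rows = reader.rows
    headers = rows[0] if header_rows_exist and rows else []
    body = rows[1:] if header_rows_exist else rows
    return {"success": True, "data": {"Headers": headers, "Rows": body}}


table = (
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apple</td><td>3</td></tr><tr><td>Pear</td><td>5</td></tr></table>"
)
parsed = parse_table(table)
print(parsed["data"]["Headers"], parsed["data"]["Rows"])
```

Each entry of `Rows` is itself an array of strings, one string per cell, which is what the `items` entry in the Returns table describes.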
Select HTML Elements
Selects HTML elements matching the provided selector
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
html | html | True | string | HTML content to be parsed |
selector | selector | True | string | CSS selector or XPath for targeting elements |
selector_type | selector_type | | string | Type of selector to use |
Returns
Name | Path | Type | Description |
---|---|---|---|
success | success | boolean | Indicates whether the operation was successful |
elements | elements | array of HtmlElement | Array of HTML elements that match the specified selector |
count | count | integer | Number of elements found |
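
For the XPath case, a rough local analogue can be written with `xml.etree.ElementTree`. This is a sketch under strong assumptions: the markup must be well-formed XML or XHTML, only the limited XPath subset that ElementTree supports is available, and the function name `select_xpath` and the failure shape (`success: False` with an empty list) are hypothetical rather than documented connector behaviour.

```python
import xml.etree.ElementTree as ET


def select_xpath(markup, selector):
    """Hypothetical stand-in returning success, elements and count."""
    try:
        root = ET.fromstring(markup)
        matches = root.findall(selector)
    except (ET.ParseError, SyntaxError):
        # ParseError: markup is not well-formed. SyntaxError: bad path expression.
        return {"success": False, "elements": [], "count": 0}
    elements = [
        {
            "tag": node.tag,
            "innerText": "".join(node.itertext()),
            "attributes": dict(node.attrib),
        }
        for node in matches
    ]
    return {"success": True, "elements": elements, "count": len(elements)}


snippet = '<div><p class="lead">Hello <b>world</b></p><p>Other</p></div>'
print(select_xpath(snippet, ".//p[@class='lead']"))
```

Only three of the `HtmlElement` fields are filled in here; the remaining fields are omitted to keep the sketch short.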
Definitions
HtmlElement
Represents an HTML element with its properties and attributes
Name | Path | Type | Description |
---|---|---|---|
tag | tag | string | The HTML tag name of the element (e.g., 'div', 'span', 'a') |
outerHtml | outerHtml | string | The complete HTML of the element including the element itself |
innerHtml | innerHtml | string | The HTML content inside the element, which may include other elements |
innerText | innerText | string | The text content inside the element with all HTML tags removed |
attributes | attributes | object | All attributes of the element as name-value pairs |
isSelfClosing | isSelfClosing | boolean | Indicates whether the element is a self-closing tag |
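
Code that consumes the `elements` array may find it convenient to map each entry onto a typed record. The dataclass below mirrors the six fields listed above. The `from_dict` helper and the sample `response` payload are hypothetical and are shown only to illustrate the shape of the data.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HtmlElement:
    """Client-side mirror of the HtmlElement definition."""

    tag: str
    outerHtml: str = ""
    innerHtml: str = ""
    innerText: str = ""
    attributes: dict = field(default_factory=dict)
    isSelfClosing: bool = False

    @classmethod
    def from_dict(cls, raw):
        # Keep only the documented fields and ignore anything unexpected.
        names = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in raw.items() if key in names})


# Illustrative payload in the shape returned by the element-selecting actions.
response = {
    "success": True,
    "count": 1,
    "elements": [
        {
            "tag": "a",
            "outerHtml": '<a href="/docs">Docs</a>',
            "innerHtml": "Docs",
            "innerText": "Docs",
            "attributes": {"href": "/docs"},
            "isSelfClosing": False,
        }
    ],
}

elements = [HtmlElement.from_dict(item) for item in response["elements"]]
print(elements[0].tag, elements[0].attributes["href"])
```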