# GraphEx-Web-Automation-Plugin

**Repository Path**: mirrors_mitre/GraphEx-Web-Automation-Plugin

## Basic Information

- **Project Name**: GraphEx-Web-Automation-Plugin
- **Description**: A plugin for the GraphEx application to create nodes for controlling Playwright
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-11
- **Last Updated**: 2026-01-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

©2025 The MITRE Corporation. ALL RIGHTS RESERVED.

The author's affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE's concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author.

NOTICE

This software was produced for the U.S. Government under Basic Contract No. W56KGU-18-D-0004, and is subject to the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation Clause 252.227-7014 (FEB 2014)

# Introduction

A [GraphEx](https://github.com/mitre/GraphEx) plugin that enables web automation using Playwright. This plugin is an extension to the [GraphEx application](https://github.com/mitre/GraphEx).

# Installation

Install the plugin with the command `make all`. This also sets up the required Playwright tools for automation.

# Execution

This repository is not intended for standalone use. It bridges the gap between Python's Playwright web automation tool and the mitre-graphex Python module.

# Using Playwright Nodes

The `graphex-webautomation-plugin` leverages the [Playwright](https://playwright.dev/) Python package by Microsoft. This open-source tool automates browser interactions with Chromium, Firefox, and WebKit in Python, among other languages.
A standout feature is Playwright's code generation tool, which simplifies scripting browser interactions.

## Interacting with the Browser

The `Create Playwright Browser Context` node initiates a browser context, akin to manually opening a new browser window. Multiple pages can be opened within this context.

For clarity: when you manually launch a browser and log into a site, you don't need to log in again when opening a new tab in that browser. This is due to session retention (usually via cookies). Likewise, Playwright's browser context maintains session data across its pages, so actions like logging in on one page are recognized on others within the same context.

## Crafting and Executing Page Commands

The `Execute Playwright Page Script` node allows synchronous execution of a series of Playwright commands in Python. The node accepts a `page commands` script which has access to these local variables:

- **page**: A `playwright.sync_api.Page` Python object
- **output**: A data container, either a list or a dictionary, used for storing parsed outputs
- **re**: The standard regex library
- **time**: The standard time library

For example, to download the Ubuntu 22.04.3 desktop ISO, use:

```python
page.goto("https://ubuntu.com/download/desktop")
page.get_by_role("button", name="Accept all and visit site").click()
page.get_by_role("link", name="Search Search").click()
page.get_by_placeholder("Search our sites").fill("22.04.3")
page.get_by_placeholder("Search our sites").press("Enter")
page.get_by_role("link", name="Ubuntu 22.04.3 LTS (Jammy Jellyfish)", exact=True).click()
# Wrap the click that triggers the download so Playwright can capture it.
with page.expect_download() as download_info:
    page.get_by_role("link", name="64-bit PC (AMD64) desktop image").click()
download = download_info.value
```

By default, the plugin waits up to 30 seconds for an element to appear before raising an error, so there is no need for explicit timeouts. Adjust this duration using the `Element Timeout (ms)` option in the `Open a Playwright Page` node.

The plugin handles file downloads automatically, storing their paths in the `Download Filepaths` output.

## Creating Page Commands

To use Playwright's code generation tool for creating page commands, follow these steps:

1. Install Playwright: `pip install playwright`.
2. Set it up: `python3 -m playwright install`.

Use the codegen tool while bypassing HTTPS errors with:

```bash
python3 -m playwright codegen --ignore-https-errors --viewport-size "1920, 1080"
```

This command launches a Chromium browser alongside Playwright's code generator. Interact with the browser, and it will log commands for you.

For the `Execute Playwright Page Script` node, extract the commands that follow `goto` (assuming a predefined URL).

Then, input the code block into the node's `page_commands`.
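
Extracting the commands after `goto` can also be done programmatically. The snippet below is a minimal sketch, not part of the plugin: `extract_page_commands` is a hypothetical helper, and it assumes the usual layout of codegen's Python output, where the recorded actions sit inside a `run()` function and end at a `# ----` separator line or at `context.close()`. The `SAMPLE` text is an illustrative stand-in for a real recording.

```python
import textwrap


def extract_page_commands(codegen_source: str) -> str:
    """Return the commands that follow the first page.goto(...) call.

    Hypothetical helper: assumes the recorded actions live inside run()
    and end at a '# ----' separator or at context.close().
    """
    kept = []
    seen_goto = False
    for line in codegen_source.splitlines():
        stripped = line.strip()
        if not seen_goto:
            # Skip the setup code up to and including the first goto call.
            seen_goto = stripped.startswith("page.goto(")
            continue
        if stripped.startswith("# ---") or stripped.startswith("context.close("):
            break  # end of the recorded actions
        kept.append(line)
    # Remove the indentation inherited from run() and trim blank edges.
    return textwrap.dedent("\n".join(kept)).strip()


# Illustrative stand-in for a script saved from the codegen window.
SAMPLE = '''\
def run(playwright):
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://ubuntu.com/download/desktop")
    page.get_by_role("link", name="Search Search").click()
    with page.expect_download() as download_info:
        page.get_by_role("link", name="64-bit PC (AMD64) desktop image").click()
    download = download_info.value

    # ---------------------
    context.close()
    browser.close()
'''

commands = extract_page_commands(SAMPLE)
print(commands)
```

The printed text keeps the relative indentation of the `with` block, so it can be pasted into the node unchanged. If a recording is laid out differently, the two stop markers are the part most likely to need adjusting.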
## Refining Codegen Commands

While Playwright's Python codegen tool is an excellent starting point for web automation, always review the auto-generated code. Here are points to consider:

1. **Is the selector overly specific?** Avoid relying on hardcoded values, like specific version numbers. For instance, consider automating the task of navigating to the Express npm package page and selecting the latest version.

As of October 2023, Playwright codegen produces this code:

```python
page.goto("https://www.npmjs.com/package/express?activeTab=versions")
page.locator("li").filter(has_text="4.18.217,767,703latest").get_by_label("4.18.2").click()
```

This code becomes obsolete when the version updates. Here's the HTML structure from the browser's inspect tool for clarity:

```html