{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "title: \"Grewpy • drawing dependencies\"\n", "date: 2023-09-19\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[`grewpy` Tutorial](../tutorial)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# `grewpy` library: Drawing dependencies\n", "\n", "Download the notebook [here](../drawing_dep.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code below initializes Grewpy, load a `corpus` and a `graph` from the `corpus`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.150016Z", "iopub.status.busy": "2024-08-11T17:01:32.149483Z", "iopub.status.idle": "2024-08-11T17:01:32.739068Z", "shell.execute_reply": "2024-08-11T17:01:32.738744Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "connected to port: 55918\n" ] } ], "source": [ "import grewpy\n", "from grewpy import Corpus, Request, Graph\n", "grewpy.set_config(\"sud\") # ud or basic\n", "corpus = Corpus(\"SUD_English-PUD\")\n", "sent_id = \"n01003007\"\n", "graph = corpus[sent_id]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build the SVG picture for a graph\n", "\n", "In the Graph module, the method `to_svg` produces the SVG code for the dependency structure picture.\n", "The code below stores the result in a new file `1926.svg`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.741296Z", "iopub.status.busy": "2024-08-11T17:01:32.741132Z", "iopub.status.idle": "2024-08-11T17:01:32.751553Z", "shell.execute_reply": "2024-08-11T17:01:32.751234Z" } }, "outputs": [], "source": [ "import os\n", "os.makedirs(\"images\", exist_ok = True)\n", "with open(\"images/n01003007.svg\", 'w') as f:\n", " f.write (graph.to_svg())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![n01003007 image](../images/n01003007.svg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the _anchor_ node (noted `__0__`) and the `root` link between it and the root of the sentence are not drawn.\n", "With the option `draw_root`, the drawing of the anchor node and the `root` link is added." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.753461Z", "iopub.status.busy": "2024-08-11T17:01:32.753344Z", "iopub.status.idle": "2024-08-11T17:01:32.757272Z", "shell.execute_reply": "2024-08-11T17:01:32.756968Z" } }, "outputs": [], "source": [ "with open(\"images/n01003007_with_root.svg\", 'w') as f:\n", " f.write (graph.to_svg(draw_root=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![n01003007_with_root image](../images/n01003007_with_root.svg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Drawing dep structure with highlighted matching\n", "It is possible to create an image of a graph with highlighted matching, as in the Grew-match interface.\n", "To do this, we should specify that we want to keep the _decoration_ of the graph when searching the corpus. The optional argument `deco=True` does exactly that." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.758966Z", "iopub.status.busy": "2024-08-11T17:01:32.758867Z", "iopub.status.idle": "2024-08-11T17:01:32.766830Z", "shell.execute_reply": "2024-08-11T17:01:32.766562Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'sent_id': 'n03009011', 'matching': {'nodes': {'N': '13'}, 'edges': {}}, 'deco': 7}, {'sent_id': 'w01028050', 'matching': {'nodes': {'N': '9'}, 'edges': {}}, 'deco': 6}, {'sent_id': 'n01092025', 'matching': {'nodes': {'N': '21'}, 'edges': {}}, 'deco': 5}, {'sent_id': 'n01057036', 'matching': {'nodes': {'N': '5'}, 'edges': {}}, 'deco': 4}, {'sent_id': 'n01042004', 'matching': {'nodes': {'N': '22'}, 'edges': {}}, 'deco': 3}, {'sent_id': 'n01035004', 'matching': {'nodes': {'N': '17'}, 'edges': {}}, 'deco': 2}, {'sent_id': 'n01019005', 'matching': {'nodes': {'N': '6'}, 'edges': {}}, 'deco': 1}]\n" ] } ], "source": [ "request = Request(\"pattern { N[lemma=question] }\")\n", "matchings = corpus.search(request, deco=True)\n", "print (matchings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each item in the list has an (abstract) attribute `order` which can be used in further function calls to manage graph decoration." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let `g1` be the first matched graph and `d1` the decoration associated to the matching:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.768624Z", "iopub.status.busy": "2024-08-11T17:01:32.768523Z", "iopub.status.idle": "2024-08-11T17:01:32.770882Z", "shell.execute_reply": "2024-08-11T17:01:32.770650Z" } }, "outputs": [], "source": [ "g1 = corpus[matchings[1][\"sent_id\"]]\n", "d1 = matchings[1][\"deco\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The decorated SVG is computed and saved in a file `question_with_deco.svg` with:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.772471Z", "iopub.status.busy": "2024-08-11T17:01:32.772362Z", "iopub.status.idle": "2024-08-11T17:01:32.778178Z", "shell.execute_reply": "2024-08-11T17:01:32.777927Z" } }, "outputs": [], "source": [ "with open(\"images/question_with_deco.svg\", 'w') as f:\n", " f.write (g1.to_svg (deco=d1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![question_with_deco image](../images/question_with_deco.svg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The decoration can also be used in method `to_sentence` to produced the _highlighted_ sentence (like the text in red in Grew-match interface) using some HTML class called `highlight`." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-08-11T17:01:32.779839Z", "iopub.status.busy": "2024-08-11T17:01:32.779758Z", "iopub.status.idle": "2024-08-11T17:01:32.784767Z", "shell.execute_reply": "2024-08-11T17:01:32.784519Z" } }, "outputs": [ { "data": { "text/plain": [ "'At the heart of the conflict was the question of whether Kansas would enter the Union as a free state or slave state. '" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g1.to_sentence (deco=d1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 2 }