From nobody Fri Dec 20 02:09:50 2024
Date: Fri, 20 Dec 2024 02:09:50 GMT
Message-Id: <202412200209.4BK29oeC092787@gitrepo.freebsd.org>
To: ports-committers@FreeBSD.org, dev-commits-ports-all@FreeBSD.org, dev-commits-ports-main@FreeBSD.org
From: Wen Heping
Subject: git: 89d55115a4c0 - main - converters/py-markitdown: New port
List-Id: Commit messages for all branches of the ports repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-ports-all
Sender: owner-dev-commits-ports-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: wen
X-Git-Repository: ports
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b
Auto-Submitted: auto-generated

The branch main has been updated by wen:

URL: https://cgit.FreeBSD.org/ports/commit/?id=89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b

commit 89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b
Author:     Wen Heping
AuthorDate: 2024-12-20 02:01:25 +0000
Commit:     Wen Heping
CommitDate: 2024-12-20 02:09:15 +0000

    converters/py-markitdown: New port

    MarkItDown library is a utility tool for converting various files to Markdown
    (e.g., for indexing, text analysis, etc.)

    It presently supports:
      *PDF (.pdf)
      *PowerPoint (.pptx)
      *Word (.docx)
      *Excel (.xlsx)
      *Images (EXIF metadata, and OCR)
      *Audio (EXIF metadata, and speech transcription)
      *HTML (special handling of Wikipedia, etc.)
      *Various other text-based formats (csv, json, xml, etc.)
      *ZIP (Iterates over contents and converts each file)
---
 converters/Makefile                |  1 +
 converters/py-markitdown/Makefile  | 27 +++++++++++++++++++++++++++
 converters/py-markitdown/distinfo  |  3 +++
 converters/py-markitdown/pkg-descr | 13 +++++++++++++
 4 files changed, 44 insertions(+)

diff --git a/converters/Makefile b/converters/Makefile
index d963b78583d0..645b3b83065f 100644
--- a/converters/Makefile
+++ b/converters/Makefile
@@ -153,6 +153,7 @@
     SUBDIR += py-bsdconv
     SUBDIR += py-gotenberg-client
     SUBDIR += py-mammoth
+    SUBDIR += py-markitdown
     SUBDIR += py-rencode
     SUBDIR += py-svglib
     SUBDIR += py-text-unidecode
diff --git a/converters/py-markitdown/Makefile b/converters/py-markitdown/Makefile
new file mode 100644
index 000000000000..a9ce7a689d57
--- /dev/null
+++ b/converters/py-markitdown/Makefile
@@ -0,0 +1,27 @@
+PORTNAME=	markitdown
+DISTVERSION=	0.0.1a3
+CATEGORIES=	converters python
+MASTER_SITES=	PYPI
+PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}
+
+MAINTAINER=	wen@FreeBSD.org
+COMMENT=	Utility tool for converting various files to Markdown
+WWW=		https://pypi.org/project/tlv8/
+
+LICENSE=	APACHE20
+
+BUILD_DEPENDS=	${PYTHON_PKGNAMEPREFIX}hatchling>=0:devel/py-hatchling@${PY_FLAVOR}
+RUN_DEPENDS=	${PYTHON_PKGNAMEPREFIX}mammoth>=0:converters/py-mammoth@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}markdownify>=0:textproc/py-markdownify@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}pandas>=0:math/py-pandas@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}pdfminer.six>=0:textproc/py-pdfminer.six@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}python-pptx>=0:textproc/py-python-pptx@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}puremagic>=0:sysutils/py-puremagic@${PY_FLAVOR} \
+		${PYTHON_PKGNAMEPREFIX}requests>=0:www/py-requests@${PY_FLAVOR}
+
+USES=		python
+USE_PYTHON=	autoplist pep517
+
+NO_ARCH=	yes
+
+.include <bsd.port.mk>
diff --git a/converters/py-markitdown/distinfo b/converters/py-markitdown/distinfo
new file mode 100644
index 000000000000..a69065a058ef
--- /dev/null
+++ b/converters/py-markitdown/distinfo
@@ -0,0 +1,3 @@
+TIMESTAMP = 1734654122
+SHA256 (markitdown-0.0.1a3.tar.gz) = f6c8f5f7f5541e91c6c535218318968fefd71e2a6faa0eb782b3492e04cd023d
+SIZE (markitdown-0.0.1a3.tar.gz) = 16073
diff --git a/converters/py-markitdown/pkg-descr b/converters/py-markitdown/pkg-descr
new file mode 100644
index 000000000000..8871cf0e5603
--- /dev/null
+++ b/converters/py-markitdown/pkg-descr
@@ -0,0 +1,13 @@
+MarkItDown library is a utility tool for converting various files to Markdown
+(e.g., for indexing, text analysis, etc.)
+
+It presently supports:
+    *PDF (.pdf)
+    *PowerPoint (.pptx)
+    *Word (.docx)
+    *Excel (.xlsx)
+    *Images (EXIF metadata, and OCR)
+    *Audio (EXIF metadata, and speech transcription)
+    *HTML (special handling of Wikipedia, etc.)
+    *Various other text-based formats (csv, json, xml, etc.)
+    *ZIP (Iterates over contents and converts each file)
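
The distinfo entries above record the size and SHA-256 digest that a fetched distfile is checked against. As a rough illustration of that kind of check, and not the ports framework's own implementation, the Python sketch below parses the three-line file and compares a byte string with the recorded values. The names parse_distinfo and verify are hypothetical helpers introduced only for this example.

```python
import hashlib
import re

# Contents of the distinfo file, copied from the diff in this commit.
DISTINFO = """\
TIMESTAMP = 1734654122
SHA256 (markitdown-0.0.1a3.tar.gz) = f6c8f5f7f5541e91c6c535218318968fefd71e2a6faa0eb782b3492e04cd023d
SIZE (markitdown-0.0.1a3.tar.gz) = 16073
"""

# Matches "SHA256 (name) = value" and "SIZE (name) = value" lines.
ENTRY = re.compile(r"^(SHA256|SIZE) \((.+)\) = (\S+)$")


def parse_distinfo(text):
    """Hypothetical helper: map each distfile name to its SHA256 and SIZE."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # TIMESTAMP and blank lines carry no per-file data
        key, name, value = match.groups()
        entries.setdefault(name, {})[key] = int(value) if key == "SIZE" else value
    return entries


def verify(data, expected):
    """Hypothetical helper: True if the bytes match the recorded SIZE and SHA256."""
    return (
        len(data) == expected["SIZE"]
        and hashlib.sha256(data).hexdigest() == expected["SHA256"]
    )


entries = parse_distinfo(DISTINFO)
record = entries["markitdown-0.0.1a3.tar.gz"]
print(record["SIZE"], record["SHA256"][:12])
```

To check a real download, one would read the tarball in binary mode and pass its bytes to verify(data, record); any mismatch in length or digest makes it return False.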