analitics

Pages

Showing posts with label unstructured. Show all posts
Showing posts with label unstructured. Show all posts

Sunday, December 29, 2024

Python 3.13.0 : Only the unstructured library in Fedora 41 ...

The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. unstructured modular functions and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient in transforming unstructured data into structured outputs.
I used pip tool to install this python package:
mythcat@fedora:~$ pip install unstructured
Defaulting to user installation because normal site-packages is not writeable
Collecting unstructured
  Downloading unstructured-0.11.8-py3-none-any.whl.metadata (26 kB)
...
Successfully installed aiofiles-24.1.0 annotated-types-0.7.0 anyio-4.7.0 backoff-2.2.1 beautifulsoup4-4.12.3 chardet-5.2.0 click-8.1.8 cryptography-44.0.0 dataclasses-json-0.6.7 emoji-2.14.0 eval-type-backport-0.2.2 filetype-1.2.0 h11-0.14.0 httpcore-1.0.7 httpx-0.28.1 jsonpath-python-1.0.6 langdetect-1.0.9 marshmallow-3.23.2 mypy-extensions-1.0.0 nest-asyncio-1.6.0 nltk-3.9.1 pydantic-2.9.2 pydantic-core-2.23.4 pypdf-5.1.0 python-iso639-2024.10.22 python-magic-0.4.27 rapidfuzz-3.11.0 requests-toolbelt-1.0.0 sniffio-1.3.1 soupsieve-2.6 tabulate-0.9.0 tqdm-4.67.1 typing-extensions-4.12.2 typing-inspect-0.9.0 unstructured-0.11.8 unstructured-client-0.28.1 wrapt-1.17.0
But I got error on the unstructured-inference:
pip install unstructured-inference
Collecting unstructured-inference
...
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for onnx
Failed to build onnx
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (onnx)
... and errors on unstructured[pdf]:
mythcat@fedora:~$ pip install unstructured[pdf]
...
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for onnx
Failed to build onnx
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (onnx)