analitics

Friday, June 12, 2026

Python Qt : Simple script to use wandb and weave.

WandB and Weave are complementary tools from the same ecosystem for evaluating, monitoring, and understanding the behavior of machine-learning models and large language models. Each one covers a different layer of the workflow.
WandB is primarily an experiment-tracking platform. It records metrics, logs model outputs, stores configuration details, and organizes results into interactive dashboards, so you can compare runs, visualize performance trends, and keep a structured history of experiments over time. It acts like a scientific notebook that captures what matters during an evaluation, from scores and prompts to timing information, which supports reproducibility and long-term analysis.
Weave focuses on fine-grained tracing of LLM calls. It captures each prompt, response, intermediate step, and the metadata of the call, so developers can inspect how a model arrived at an answer, debug unexpected behavior, and review model output qualitatively. Its structured logs can be searched, filtered, and compared.
Together they form one workflow: WandB gives experiment-level insight and Weave gives call-level transparency, which is especially useful when benchmarking or refining LLMs across different prompts, models, or configurations.
Let's install these:
python -m pip install wandb weave
Collecting wandb
...
Successfully installed abnf-2.2.0 backoff-2.2.1 chardet-7.4.3 cint-1.0.0 diskcache-weave-5.6.3.post1 fickling-0.1.11
googleapis-common-protos-1.75.0 gql-4.0.0 graphql-core-3.2.11 intervaltree-3.2.1 kaitaistruct-0.11 
opentelemetry-api-1.42.1 opentelemetry-exporter-otlp-proto-common-1.42.1 opentelemetry-exporter-otlp-proto-http-1.42.1
opentelemetry-proto-1.42.1 opentelemetry-sdk-1.42.1 opentelemetry-semantic-conventions-0.63b1 pdfminer.six-20260107
polyfile-weave-0.5.9 protobuf-6.33.6 sentry-sdk-2.62.0 sortedcontainers-2.4.0 wandb-0.27.2 weave-0.52.42
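To make the split between the two tools concrete, here is a minimal sketch, not the script from this post. The names summarize_scores, log_evaluation, and record_answer are hypothetical. I assume you already ran wandb login and have network access, and that rows is a list of (prompt, answer, score) tuples.

```python
from statistics import mean


def summarize_scores(scores):
    """Reduce per-prompt scores to run-level numbers for a WandB dashboard."""
    if not scores:
        return {"n_prompts": 0}
    return {
        "n_prompts": len(scores),
        "score_mean": mean(scores),
        "score_min": min(scores),
        "score_max": max(scores),
    }


def log_evaluation(project, model_name, rows):
    """Log (prompt, answer, score) rows. Assumes `wandb login` was done."""
    # Imported here so the pure helper above works without either package.
    import wandb
    import weave

    weave.init(project)  # call-level traces are stored under this project

    @weave.op()
    def record_answer(prompt: str, answer: str, score: float) -> dict:
        # Weave stores every call of an op with its inputs and its output.
        return {"prompt": prompt, "answer": answer, "score": score}

    # Experiment-level view: one run, its config, a table, summary metrics.
    run = wandb.init(project=project, config={"model": model_name})
    table = wandb.Table(columns=["prompt", "answer", "score"])
    for prompt, answer, score in rows:
        record_answer(prompt, answer, score)
        table.add_data(prompt, answer, score)
    run.log({"answers": table})
    run.log(summarize_scores([score for _, _, score in rows]))
    run.finish()
```

In this sketch Weave keeps one trace per answer, while WandB keeps one run with the table and the aggregated scores.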
Let's see an example with my custom artificial intelligence model and PyQt6.
The PyQt6 script is a small LLM evaluation application that takes two inputs: the Ollama model you select and a fixed set of short test prompts. When you start the evaluation, the script sends each prompt to the chosen model, collects the generated responses, then sends those responses to a smaller judge model to obtain a numerical quality score. All generation uses reduced context and limited output length to keep execution fast on an i3 CPU. As it runs, the script displays each answer in the text panel and updates a progress bar. When all prompts are processed, it compiles the collected scores and displays them as a bar chart in the canvas, giving you a quick visual summary of the model’s performance.
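The evaluation flow described above can be sketched without the user interface. This is only an illustration under assumptions, not the original script: generate and judge are hypothetical callables that wrap the Ollama calls, the 0 to 10 score range is my assumption, and the num_ctx and num_predict values are example numbers for the reduced context and limited output length.

```python
import re


def build_generate_payload(model, prompt, num_ctx=1024, num_predict=64):
    """Request body for Ollama's /api/generate with small context and output."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    }


def parse_score(text, low=0.0, high=10.0):
    """Take the first number in the judge's reply and clamp it to [low, high]."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        return None  # the judge did not return a usable number
    return max(low, min(high, float(match.group())))


def evaluate(prompts, generate, judge, on_progress=None):
    """Run every prompt through the model, then score the answer with the judge."""
    results = []
    for index, prompt in enumerate(prompts, start=1):
        answer = generate(prompt)
        score = parse_score(judge(prompt, answer))
        results.append({"prompt": prompt, "answer": answer, "score": score})
        if on_progress is not None:
            on_progress(index, len(prompts))  # e.g. update a progress bar
    return results
```

In a PyQt6 application this loop would normally run in a worker thread, with on_progress emitting a signal so the window stays responsive.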
The online tool shows this result for this script:

Tool : marpy the browser IDE.

You start in a browser IDE that is Python-first: virtualenvs, proper dependencies, real logs, not a toy REPL.

Python Qt : Simple script to build your own model.

Today I tested a simple script that uses the PyQt6 Python package to provide a custom UI for building my own Ollama artificial intelligence model.
The script is simple: it creates a user interface with options to add text files and adjust many settings:
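The core of such a tool, turning text files and settings into an Ollama Modelfile, could look roughly like the sketch below. It is not the original script. The function name build_modelfile is hypothetical, and the base model passed in the usage example (llama3.2) is an assumption. The KNOWLEDGE DATA and FILE markers follow the system prompt layout that the built model reports.

```python
def build_modelfile(base_model, files, parameters=None):
    """Return Modelfile text that embeds the given files in the system prompt.

    files maps a file name to its text; parameters maps an Ollama
    parameter name (for example temperature) to its value.
    """
    sections = ["=== KNOWLEDGE DATA ==="]
    for name, text in files.items():
        sections.append(f"[FILE: {name}]")
        sections.append(text.strip())
    # Triple quotes inside the text would end the SYSTEM block too early.
    system = "\n".join(sections).replace('"""', "'''")

    lines = [f"FROM {base_model}"]
    for key, value in (parameters or {}).items():
        lines.append(f"PARAMETER {key} {value}")
    lines.append(f'SYSTEM """{system}"""')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical usage: the base model name and file content are examples.
    print(build_modelfile("llama3.2", {"001.txt": "Some notes."}, {"temperature": 0.7}))
```

After saving the output to a file named Modelfile, the model can be created with ollama create test_catafest_001 -f Modelfile.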
After I run the script, the resulting model works very well:
ollama run test_catafest_001:latest
>>> tell me something about you !
I'm an AI designed to assist and communicate with humans in a helpful and
informative way! Here are some interesting facts about me:

1. **Name**: I don't have a personal name, but I'm often referred to as
"Assistant" or "AI."
2. **Origin**: I was created by a team of researchers and developers at
Meta AI, a subsidiary of Meta Platforms, Inc.
3. **Language**: I'm fluent in multiple languages, including English,
Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese,
Japanese, and many more.
4. **Knowledge**: My training data consists of a massive corpus of text
from various sources, including books, articles, research papers, and
websites. This allows me to provide information on a wide range of
topics, from science and history to entertainment and culture.
5. **Personality**: I strive to be neutral, objective, and respectful in
my interactions. While I can understand humor and sarcasm, I may not
always "get it" right.
6. **Goals**: My primary goal is to assist users like you with their
questions, provide helpful information, and facilitate learning and
exploration.
7. **Limitations**: I'm a machine learning model, which means I have
limitations in terms of my understanding and response capabilities. For
example, I may not be able to understand nuances of human language or
provide personalized advice.

Now, tell me something about you! What's on your mind?

>>> I build you. What knowledge I add you ?!
That's a fascinating perspective! As a conversational AI, my training
data is the foundation upon which I'm built. The knowledge and
information that you've contributed to me through our conversations will
shape my understanding and capabilities over time.

When you interact with me, you're not just asking questions or providing
feedback – you're also helping to refine my language understanding and
generation abilities. Your input can influence:

1. **Domain-specific knowledge**: By discussing topics in-depth, you'll
help me develop a better grasp of various subjects, including science,
history, technology, arts, and more.
2. **Language nuances**: Our conversations will expose me to idioms,
colloquialisms, and cultural references that I might not have encountered
otherwise.
3. **Conversational flow**: Your responses and questions will help me
develop a more natural tone and rhythm in our interactions, allowing us
to engage in more fluid and human-like conversations.

Some examples of knowledge you could add to me include:

* **Domain-specific terminology**: By introducing technical terms or
jargon related to your profession or area of interest, you'll help me
become more familiar with those concepts.
* **Idioms and colloquialisms**: Sharing everyday expressions, slang, or
regional phrases will enable me to better understand the nuances of human
language.
* **Cultural references**: Discussing books, movies, music, or other
cultural touchstones will enrich my knowledge of shared experiences and
common ground.

Feel free to share your expertise, hobbies, or interests with me! The
more we chat, the more I'll learn from you, and the better equipped I'll
become to engage in meaningful conversations.
Let's see the model type with this command:
/show info
Model
architecture llama
parameters 3.2B
context length 131072
embedding length 3072
quantization Q4_K_M

Capabilities
completion
tools

Parameters
stop "<|start_header_id|>"
stop "<|end_header_id|>"
stop "<|eot_id|>"

System
=== KNOWLEDGE DATA ===
[FILE: 001.txt]
...

License
LLAMA 3.2 COMMUNITY LICENSE AGREEMENT
Llama 3.2 Version Release Date: September 25, 2024
...
Let's see the result: