analitics

Sunday, August 30, 2020

Python 3.8.5 : Testing with openpyxl - part 002 .

Today I will show you how to compute the Levenshtein ratio and distance between two strings (see Wikipedia).
I used three files created with LibreOffice and saved as xlsx files.
In each of these files, column A is filled with strings of characters, in this case numbers.
The script reads all of these files from the folder named xlsx_files and calculates the Levenshtein ratio and distance between each value in column A and each file name.
Finally, the result is shown in a graph with the matplotlib python package.
Let's see the python script:
import os
from glob import glob

from openpyxl import load_workbook
import numpy as np 
import matplotlib.pyplot as plt 

def levenshtein_ratio_and_distance(s, t, ratio_calc = False):
    """ levenshtein_ratio_and_distance - distance between two strings.
        If ratio_calc = True, the function computes the
        levenshtein distance ratio of similarity between two strings
        For all i and j, distance[i,j] will contain the Levenshtein
        distance between the first i characters of s and the
        first j characters of t
    """
    # Initialize matrix of zeros
    rows = len(s)+1
    cols = len(t)+1
    distance = np.zeros((rows,cols),dtype = int)

    # Fill the first column and the first row with the prefix lengths
    for i in range(1, rows):
        distance[i][0] = i
    for k in range(1, cols):
        distance[0][k] = k
    for col in range(1, cols):
        for row in range(1, rows):
            # check the characters are the same in the two strings in a given position [i,j] 
            # then the cost is 0
            if s[row-1] == t[col-1]:
                cost = 0 
            else:
                # the characters differ: a substitution costs 2 for the ratio
                # and 1 for the plain distance
                if ratio_calc:
                    cost = 2
                else:
                    cost = 1
            distance[row][col] = min(distance[row-1][col] + 1,      # Cost of deletions
                                 distance[row][col-1] + 1,          # Cost of insertions
                                 distance[row-1][col-1] + cost)     # Cost of substitutions
    if ratio_calc:
        # Computation of the Levenshtein distance ratio
        lensum = len(s) + len(t)
        if lensum == 0:
            # two empty strings are identical
            return 1.0
        return (lensum - distance[rows-1][cols-1]) / lensum
    else:
        return distance[rows-1][cols-1]


PATH = "/home/mythcat/xlsx_files/"
# collect every .xlsx file under PATH, including subfolders
result = [y for x in os.walk(PATH) for y in glob(os.path.join(x[0], '*.xlsx'))]
# the same files are used later to extract the file names
result_files = result
#print(result)
row_0 = []

for r in result:
    wb = load_workbook(r)
    # use the first sheet of each workbook
    ws = wb[wb.sheetnames[0]]
    for row in ws.rows:
        # keep only the non-empty cells of column A
        if row[0].value is not None:
            row_0.append(row[0].value)

print("All rows of column A ")
print(row_0)
files = []
for f in result_files:
    # file name without the folder and the .xlsx extension
    name = os.path.splitext(os.path.basename(f))[0]
    files.append(name)

print(files)
# define two lists for levenshtein: distances and ratios
list1 = []
list2 = []

for l in row_0:
    value = str(l).lower()
    for d in files:
        name = str(d).lower()
        Distance = levenshtein_ratio_and_distance(value, name)
        Ratio = levenshtein_ratio_and_distance(value, name, ratio_calc = True)
        list1.append(Distance)
        list2.append(Ratio)
        
print(list1, list2)
# plotting the points  
plt.plot(list1,'g*', list2, 'ro' )
plt.show()
The result is this:
[mythcat@desk ~]$ python test_xlsx.py
All rows of column A 
[11, 2, 113, 4, 1111, 4, 4, 111, 2, 1111, 5, 4, 4, 3, 1111, 1, 2, 1113, 4, 115, 1, 2, 221, 1, 1,
 43536, 2, 34242, 3, 1]
['001', '002', '003']
[2, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 3, 4, 4, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 4, 4, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 4, 4, 2, 3, 3, 3, 2, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 
2, 3, 2, 3, 3, 2, 3, 3, 2, 3, 3, 5, 5, 4, 3, 2, 3, 5, 4, 5, 3, 3, 2, 2, 3, 3] [0.4, 0.0, 0.0, 0.0, 
0.5, 0.0, 0.3333333333333333, 0.0, 0.3333333333333333, 0.0, 0.0, 0.0, 0.2857142857142857, 0.0, 0.0,
 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3333333333333333, 0.0, 0.0, 0.0, 0.5, 0.0, 0.2857142857142857, 0.0,
 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.2857142857142857, 0.0, 0.0, 0.5,
 0.0, 0.0, 0.0, 0.5, 0.0, 0.2857142857142857, 0.0, 0.2857142857142857, 0.0, 0.0, 0.0, 0.3333333333333333,
 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.3333333333333333, 0.3333333333333333, 0.0, 0.5, 0.0, 0.0,
 0.5, 0.0, 0.0, 0.0, 0.0, 0.25, 0.0, 0.5, 0.0, 0.0, 0.25, 0.25, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0]
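
As a quick check of the first values printed above, here is a minimal pure-Python sketch of the same dynamic-programming idea, without numpy. The names levenshtein, ratio and sub_cost are hypothetical and used only for this illustration; a substitution cost of 2 corresponds to the ratio mode of the script.

```python
def levenshtein(s, t, sub_cost=1):
    """Edit distance between s and t, keeping only one row of the matrix."""
    # previous[j] = distance between the first i-1 chars of s and the first j of t
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else sub_cost
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def ratio(s, t):
    """Similarity ratio between 0 and 1, with substitutions counted as 2."""
    lensum = len(s) + len(t)
    if lensum == 0:
        return 1.0
    return (lensum - levenshtein(s, t, sub_cost=2)) / lensum


# first value of column A against the first file name
print(levenshtein("11", "001"))  # 2
print(ratio("11", "001"))        # 0.4
```

These two numbers match the first entries of the two lists printed by the script.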

Monday, August 24, 2020

Python Qt5 : Get item data from QTreeWidgets.

In this example, I create a tree view with QTreeView that shows the whole folder tree.
I add a context menu with two options.
The first option gets the data from the selected item, which is the name of the folder.
The second option closes the application.
Let's see the source code:
import sys
from PyQt5.QtWidgets import QApplication, QFileSystemModel, QDesktopWidget
from PyQt5.QtWidgets import QTreeView, QWidget, QVBoxLayout, QMenu
from PyQt5.QtGui import QIcon
from PyQt5 import QtCore
from PyQt5.QtCore import Qt, QObject

class my_app_tree(QWidget):

    def __init__(self):
        super().__init__()
        self.title = "show files and folders on tree view"
        #self.left = 0
        #self.top = 0
        #self.width = 640
        #self.height = 480
        self.center()
        self.resize(640,480)
        self.initUI()

    def center(self):
        frame_geometry = self.frameGeometry()
        center_position = QDesktopWidget().availableGeometry().center()
        frame_geometry.moveCenter(center_position)
        self.move(frame_geometry.topLeft())

    def context_menu(self, position):
        menu = QMenu()
        copy_action = menu.addAction("Get folder")
        quit_action = menu.addAction("Quit")
        action = menu.exec_(self.tree.mapToGlobal(position))
        # quit application
        if action == quit_action:
            my_application.quit()
        # get folder name from the selected item, if any
        elif action == copy_action:
            indexes = self.tree.selectedIndexes()
            if indexes:
                print("name folder is: "+str(indexes[0].data()))

    def initUI(self):
        self.setWindowTitle(self.title)
        #the next source code line is used with left, top, width, height from __init__
        #self.setGeometry(self.left, self.top, self.width, self.height)
        
        self.model = QFileSystemModel()
        self.model.setRootPath('')
        self.tree = QTreeView()
        self.tree.setModel(self.model)
        
        self.tree.setAnimated(False)
        self.tree.setIndentation(20)
        self.tree.setSortingEnabled(True)
        
        self.tree.setWindowTitle("Dir View")
        self.tree.resize(640, 480)
        
        windowLayout = QVBoxLayout()
        windowLayout.addWidget(self.tree)
        self.setLayout(windowLayout)

        self.tree.setContextMenuPolicy(Qt.CustomContextMenu)
        self.tree.customContextMenuRequested.connect(self.context_menu)
        
        self.show()

if __name__ == '__main__':
    my_application = QApplication(sys.argv)
    example = my_app_tree()
    sys.exit(my_application.exec_())
The result of this source code can be seen in the next image:

Sunday, August 23, 2020

Python Qt5 : Add and remove items between two QTreeWidgets.

Today's tutorial will show you how to move items between two QTreeWidgets.
The source code is simple to understand: the user interface is created with two QTreeWidgets.
One is filled with items and, when a button is pressed, the selected item is moved to the other widget.
import sys
from PyQt5.QtWidgets import QApplication, QWidget, QDesktopWidget, QPushButton
from PyQt5.QtWidgets import QBoxLayout,QTreeWidget,QTreeWidgetItem

class my_app_class(QWidget):
    def __init__(self):
        super().__init__()

        self.add_button=QPushButton('Add item here')
        self.remove_button=QPushButton('Remove item')
        self.wishlist=QTreeWidget(self)
        self.tree_list=QTreeWidget(self)
        # init the UI
        self.initUI()

    def initUI(self):
        # set title of window
        self.setWindowTitle('add and remove items from QTreeWidget!')

        self.init_tree()

        self.resize(800, 480)
        self.center()
        self.show()

    def center(self):
        geometry_frame = self.frameGeometry()
        center_pos = QDesktopWidget().availableGeometry().center()
        geometry_frame.moveCenter(center_pos)
        self.move(geometry_frame.topLeft())


    def init_tree(self):
        headers = ['A','B','C','D']

        self.tree_list.setColumnCount(len(headers))
        self.tree_list.setHeaderLabels(headers)

        self.wishlist.setColumnCount(len(headers))
        self.wishlist.setHeaderLabels(headers)


        list_layout = QBoxLayout(QBoxLayout.LeftToRight)
        list_layout.addWidget(self.tree_list)
        list_layout.addWidget(self.wishlist)

        tree_root = QTreeWidget.invisibleRootItem(self.tree_list)
        # add data to QTreeWidget with QTreeWidgetItem
        my_data = ['1','2','3','4']
        item = QTreeWidgetItem()
        for idx, data in enumerate(my_data):
            item.setText(idx, data)

        tree_root.addChild(item)

        my_data = ['11','10','01','D']
        item = QTreeWidgetItem()
        for idx, data in enumerate(my_data):
            item.setText(idx, data)

        tree_root.addChild(item)

        my_data = ['s', 'c', 'c', 'c']
        item = QTreeWidgetItem()
        for idx, data in enumerate(my_data):
            item.setText(idx, data)

        tree_root.addChild(item)

        btn_layout = QBoxLayout(QBoxLayout.RightToLeft)
        btn_layout.addWidget(self.add_button)
        btn_layout.addWidget(self.remove_button)

        main_layout = QBoxLayout(QBoxLayout.TopToBottom)
        main_layout.addLayout(list_layout)
        main_layout.addLayout(btn_layout)

        self.add_button.clicked.connect(self.move_item)
        self.remove_button.clicked.connect(self.move_item)

        self.setLayout(main_layout)
        return main_layout


    def move_item(self):
        sender = self.sender()

        if self.add_button == sender:
            source = self.tree_list
            target = self.wishlist
        else:
            source = self.wishlist
            target = self.tree_list

        # row of the selected item, -1 when nothing is selected
        row = source.currentIndex().row()
        if row < 0:
            return
        item = source.invisibleRootItem().takeChild(row)
        target.invisibleRootItem().addChild(item)

if __name__=='__main__':
    # start the QApplication
    my_application = QApplication(sys.argv)
    # create application with the class
    example = my_app_class()
    # use exit for QApplication
    sys.exit(my_application.exec_())

Python 3.8.5 : Testing with openpyxl - part 001 .

Python executes code line by line because it is an interpreted language.
This lets users solve programming problems quickly and easily.
I use python version 3.8.5 built on Aug 12 2020 at 00:00:00, see the result of interactive mode:
[mythcat@desk ~]$ python
Python 3.8.5 (default, Aug 12 2020, 00:00:00) 
[GCC 10.2.1 20200723 (Red Hat 10.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Today I will show you how to start using the openpyxl python package.
Another tutorial about python and documents can be found here.
The openpyxl package was created by Eric Gazoni and Charlie Clark and is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.
Let's install the openpyxl python package:
[mythcat@desk ~]$ pip3 install openpyxl --user
Collecting openpyxl
...
Installing collected packages: openpyxl
Successfully installed openpyxl-3.0.5
I tested the default example source code and it works well:
from openpyxl import Workbook
wb = Workbook()

# grab the active worksheet
ws = wb.active

# Data can be assigned directly to cells
ws['A1'] = 42

# Rows can also be appended
ws.append([1, 2, 3])

# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()

# Save the file
wb.save("sample.xlsx")
The next example gets data about asteroids close to planet Earth and puts it into an xlsx file.
The rows with dangerous asteroids are filled with red color:
# check asteroids close to planet Earth and add it to file
# import json python package
import json, urllib.request, time

# import openpyxl python package
from openpyxl import Workbook
from openpyxl.styles import PatternFill
# use active worksheet
wb = Workbook()
ws = wb.active

today = time.strftime('%Y-%m-%d', time.gmtime())
print("Time is: " + today)
now = today
# retrieve data about asteroids approaching planet Earth into json format
url = "https://api.nasa.gov/neo/rest/v1/feed?start_date=" + today + "&end_date=" + today + "&api_key=DEMO_KEY"
response = urllib.request.urlopen(url)
result = json.loads(response.read())

print("Now, " + str(result["element_count"]) + " asteroids is close to planet Earth.")
asteroids = result["near_earth_objects"]

no_data = ""
dangerous = ""

ws.append(['today', 'name', 'dangerous?', 'no_data'])
# parsing all the JSON data and add to file
for asteroid in asteroids:
    for field in asteroids[asteroid]:
        # reset the values for each asteroid
        name = ""
        dangerous = ""
        no_data = ""
        try:
            name = "Asteroid Name: " + field["name"]
            if field["is_potentially_hazardous_asteroid"]:
                dangerous = "... dangerous to planet Earth!"
            else:
                dangerous = "... not a threat to planet Earth!"
        except KeyError:
            # a missing field is marked in the last column
            no_data = "no data"
        ws.append([today, name, dangerous, no_data])

# create a red pattern to fill
redFill = PatternFill(start_color='FFFF0000',
                      end_color='FFFF0000',
                      fill_type='solid')

# check the row with the dangerous asteroid and fill it
for row in ws.rows:
    if row[2].value == "... dangerous to planet Earth!":
        for cell in row:
            cell.fill = redFill

# write all data to file 
wb.save(str(now)+"_asteroids.xlsx")
I ran it and the result works well:
[mythcat@desk ~]$ python asteroid_data.py 
Time is: 2020-08-23
Now, 9 asteroids is close to planet Earth.
... see the next screenshot:
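
The parsing step can also be tried offline. The following is a minimal sketch that assumes the response has the shape the script above reads: near_earth_objects maps a date to a list of records with the name and is_potentially_hazardous_asteroid keys. The sample records and their names are hypothetical, not real data.

```python
import json

# Hypothetical sample with the same shape the script reads
sample = json.loads("""
{
  "element_count": 2,
  "near_earth_objects": {
    "2020-08-23": [
      {"name": "(sample A)", "is_potentially_hazardous_asteroid": true},
      {"name": "(sample B)", "is_potentially_hazardous_asteroid": false}
    ]
  }
}
""")

rows = []
for day, records in sample["near_earth_objects"].items():
    for record in records:
        # .get() returns a default instead of raising when a key is missing
        hazardous = record.get("is_potentially_hazardous_asteroid")
        rows.append((day, record.get("name", ""), hazardous))

# rows that would be filled with red in the xlsx file
dangerous_rows = [row for row in rows if row[2]]
print(len(rows), len(dangerous_rows))  # 2 1
```

Using .get() is one alternative to the try/except block; it keeps a missing key from stopping the loop.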


Saturday, August 22, 2020

Python 3.8.5 : Testing the pyre tool - part 001.

Pyre is a static type checker for Python; it ships with Pysa, a static analysis tool to detect and prevent security issues in Python code. It can be found on the official website.
The Pyre tool supports the Language Server Protocol and has an extension for VSCode.
The development team presented it on August 7, 2020, with this intro:
Pyre is a performant type checker for Python. Statically typing what are essentially fully dynamic languages has a long tradition at Facebook. We've done this for PHP with Hack and for Javascript with Flow.
The installation is easy with the pip tool:
[mythcat@desk ~]$ pip install pyre-check
Defaulting to user installation because normal site-packages is not writeable
Collecting pyre-check
  Using cached pyre_check-0.0.52-py3-none-manylinux1_x86_64.whl (22.9 MB)
...
Installing collected packages: pyre-check
Successfully installed pyre-check-0.0.52
If you want to use a virtual environment:
[mythcat@desk ~]$ mkdir my_project && cd my_project
[mythcat@desk my_project]$ python3 -m venv ~/.venvs/venv
[mythcat@desk my_project]$ source ~/.venvs/venv/bin/activate
(venv) [mythcat@desk my_project]$ pip install pyre-check
Collecting pyre-check
...
(venv) [mythcat@desk my_project]$ pyre init
 ƛ Which directory should pyre be initialized in? (Default: `.`): 
(venv) [mythcat@desk my_project]$ cat .pyre_configuration
{
  "binary": "/home/mythcat/.venvs/venv/bin/pyre.bin",
  "source_directories": [
    "."
  ],
  "taint_models_path": "/home/mythcat/.venvs/venv/lib/pyre_check/taint/",
  "typeshed": "/home/mythcat/.venvs/venv/lib/pyre_check/typeshed/"
}
(venv) [mythcat@desk my_project]$ ls .pyre
my_project  pid_files  pyre.stderr
(venv) [mythcat@desk my_project]$ pyre
 ƛ No watchman binary found. 
To enable pyre incremental, you can install watchman: https://facebook.github.io/watchman/docs/install
 ƛ Defaulting to non-incremental check.
 ƛ No type errors found
Let's test with the default example from the documentation:
(venv) [mythcat@desk my_project]$ echo "i: int = 'string'" > test.py
(venv) [mythcat@desk my_project]$ pyre
 ƛ No watchman binary found. 
To enable pyre incremental, you can install watchman: https://facebook.github.io/watchman/docs/install
 ƛ Defaulting to non-incremental check.
 ƛ Found 1 type error!
test.py:1:0 Incompatible variable type [9]: i is declared to have type `int` but is used as type `str`.
(venv) [mythcat@desk my_project]$ cat test.py 
i: int = 'string'
You can see it works well and detects the problem.
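
A static checker is useful because Python itself does not enforce annotations at runtime. The short sketch below illustrates this; the function port_number is a hypothetical example, and I would expect a type checker such as Pyre to flag its return statement, although I did not run Pyre on it.

```python
def port_number(text: str) -> int:
    # Bug on purpose: the annotation promises an int, the body returns a str
    return text


value = port_number("8080")
# Python runs this without any error and the result is still a string
print(type(value).__name__)                            # str
print(port_number.__annotations__["return"].__name__)  # int
```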
A short intro can be found on the Facebook developers youtube channel:

Saturday, August 15, 2020

Python 3.8.5 : The hashlib python package - part 001.

The tutorial for today is about the hashlib python module.
The official webpage for this python module has this intro:
This module implements a common interface to many different secure hash and message digest algorithms. Included are the FIPS secure hash algorithms SHA1, SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA’s MD5 algorithm (defined in Internet RFC 1321).
The example source code to test a simple hash is this:
import hashlib
import os

def file_sha1(filename):
    BUF_SIZE = 65536  # read stuff in 64kb chunks!
    get_sha1 = hashlib.sha1()
    with open(filename, 'rb') as f:
        while True:
            data = f.read(BUF_SIZE)
            if not data:
                break
            get_sha1.update(data)
    return get_sha1.hexdigest()

# I added this comment after the first run to see the hash difference.
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    h = file_sha1(f)
    print(h) 
Let's test the source code in a directory with two files.
I ran it first with the original source code and then I added a comment to the test_hash_file.py file.
You can see the hash changed from b222523567a8a806382b86578717ddbd00e0f4b4 to 2134660551cc67812413a3a75fd12efb05d591ef.
[mythcat@desk Projects_Python]$ ls
test_hash_file.py  test_numpy_001.py
[mythcat@desk Projects_Python]$ python test_hash_file.py 
98b2833527ad3d9fe263542c6aa06c04182d3dfb
b222523567a8a806382b86578717ddbd00e0f4b4
[mythcat@desk Projects_Python]$ python test_hash_file.py 
98b2833527ad3d9fe263542c6aa06c04182d3dfb
2134660551cc67812413a3a75fd12efb05d591ef
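
The same behavior can be checked in memory, without files. This is a minimal sketch with made-up sample bytes: it shows that feeding the data in chunks, as file_sha1 does, gives the same digest as hashing everything at once, and that one extra byte changes the digest.

```python
import hashlib

# made-up sample data, larger than one chunk
data = b"print('hello')\n" * 10000
chunk_size = 65536

# one-shot digest of the whole byte string
one_shot = hashlib.sha1(data).hexdigest()

# the same data fed in chunks, as file_sha1 does with a file
chunked = hashlib.sha1()
for start in range(0, len(data), chunk_size):
    chunked.update(data[start:start + chunk_size])

# a single extra byte gives a different digest
changed = hashlib.sha1(data + b"#").hexdigest()

print(one_shot == chunked.hexdigest())  # True
print(one_shot == changed)              # False
```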

Sunday, August 9, 2020

Python 3.8.5 : Pearson Product Moment Correlation with corrcoef from numpy.

The python package named numpy comes with the corrcoef function, which returns Pearson product-moment correlation coefficients.
Each coefficient measures only the linear relationship between a pair of variables.
The full name is the Pearson Product Moment Correlation (PPMC).
The PPMC is not able to tell the difference between dependent variables and independent variables.
The documentation about this function can be found here.
More examples of Pearson Correlation can be found on this website.
The example presented in this tutorial uses numpy.random to generate random integers and then calculates the correlation coefficients.
All of these are calculated five times in a for loop, and each time the random seed is set to a different value, from 0 to 4.
Each time the correlation matrices are printed and then the scatter plots of the random numbers are displayed.
Let's see the source code:
import numpy as np
import matplotlib.pyplot as plt

nr_integers = 100
size_integers = 100

# set from 0 to 4 seed for random and show result 
for e in range(5):
    # change random seed
    np.random.seed(e)
    # nr_integers random integers between 0 and size_integers
    x = np.random.randint(0, size_integers, nr_integers)
    # Positive correlation: x plus normal noise with mean 0
    # and standard deviation size_integers
    positive_y = x + np.random.normal(0, size_integers, nr_integers)
    correlation_positive = np.corrcoef(x, positive_y)
    # show matrix for correlation_positive
    print(correlation_positive)
    # Negative correlation: 100 - x plus the same kind of normal noise
    negative_y = 100 - x + np.random.normal(0, size_integers, nr_integers)
    correlation_negative = np.corrcoef(x, negative_y)
    # show matrix for correlation_negative
    print(correlation_negative)
    # set graphic for plt with two graphics for each output with subplot
    plt.subplot(1, 2, 1)
    plt.scatter(x,positive_y)
    plt.subplot(1, 2, 2)
    plt.scatter(x,negative_y)
    # show the graph 
    plt.show()
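
For reference, the off-diagonal value in each printed matrix is the Pearson coefficient of the two inputs. The following is a minimal plain-Python sketch of that formula, not the numpy implementation; the function name pearson and the small sample lists are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # covariance and variances, without the common 1/n factor
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
print(pearson(x, [2 * v + 1 for v in x]))  # 1.0, perfectly increasing
print(pearson(x, [100 - v for v in x]))    # -1.0, perfectly decreasing
```

Without noise the coefficient reaches 1 or -1; the strong normal noise in the script above is what pulls the printed values toward 0.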


Tuesday, August 4, 2020

Python 3.6.9 : My colab tutorials - part 008.

Today I deal with these two python packages named selenium and chromium-chromedriver.
I used selenium to get pieces of information from webpages.
These examples can be found in my GitHub project colab, in the notebook named catafest_008.