analitics

Pages

Sunday, December 8, 2019

Python 3.7.5 : Starting with kaggle platform.

Kaggle is the world's largest community of data scientists and the platform is the fastest way to get started on a new data science project.
A good choice to use Kaggle is this feature: Kaggle provides free access to NVidia K80 GPUs in kernels.
The tutorial for today is about kaggle and is new for me because I hear about this opportunity last year.
This platform that hosts data science and machine learning competitions can give the people a good area for development.
The official blog can tell you more about how can this platform works.
The kaggle A.P.I. can be found at GitHub.
Let's start the tutorial with the pip3 install tool:
[mythcat@desk kaggle]$ pip3 install kaggle --upgrade --user
Collecting kaggle
...
[mythcat@desk kaggle]$ mkdir ~/.kaggle/
[mythcat@desk kaggle]$ mv kaggle.json ~/.kaggle/kaggle.json
[mythcat@desk kaggle]$ kaggle 
Warning: Your Kaggle API key is readable by other users on this system! To fix this, you can
 run 'chmod 600 /home/mythcat/.kaggle/kaggle.json'
usage: kaggle [-h] [-v] {competitions,c,datasets,d,kernels,k,config} ...
kaggle: error: the following arguments are required: command
[mythcat@desk kaggle]$ chmod 600 /home/mythcat/.kaggle/kaggle.json
You can use the kaggle command to get informations from kaggle platform:
[mythcat@desk kaggle]$ kaggle competitions list 
ref                                            deadline             category            reward  teamCount
  userHasEntered  
---------------------------------------------  -------------------  ---------------  ---------  ---------
  --------------  
digit-recognizer                               2030-01-01 00:00:00  Getting Started  Knowledge       2305
           False  
titanic                                        2030-01-01 00:00:00  Getting Started  Knowledge      17135
           False  
house-prices-advanced-regression-techniques    2030-01-01 00:00:00  Getting Started  Knowledge       5532
           False  
imagenet-object-localization-challenge         2029-12-31 07:00:00  Research         Knowledge         57
           False  
google-quest-challenge                         2020-02-10 23:59:00  Featured           $25,000        410
           False  
tensorflow2-question-answering                 2020-01-22 23:59:00  Featured           $50,000        810
           False  
data-science-bowl-2019                         2020-01-22 23:59:00  Featured          $160,000       1762
           False  
pku-autonomous-driving   
...
The commands for this platform can be seen at GitHub:
kaggle competitions {list, files, download, submit, submissions, leaderboard}
kaggle datasets {list, files, download, create, version, init}
kaggle kernels {list, init, push, pull, output, status}
kaggle config {view, set, unset}
The kaggle platform use an online tool with notebooks similar with Jupyter notebook for process coding in Kernels.
The process of using models need the datasets and you can use the kaggle datasets.
You can use the New Notebook button from that page to start using the datasets.
The datasets can be load from kaggle or can be uploaded.
The next window let you to Select new notebook settings.
I used Python with a notebook and I got the default source code:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input
 directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.
Now, I can read the dataset shown in the right the Data with input
(read-only data)
and output from kaggle with pandas module:
data = pd.read_csv("../input/lego-database/colors.csv")
data.head()
The Commit button let you to save your work for later.
You can see my online test I created with dataset Lego and python on my kaggle page.

Saturday, December 7, 2019

Python 3.7.5 : Using the django with javascript.

The Django framework can work great with javascript.
I start this project a few days ago.
You can start a simple Django project, see my old tutorials.
I used a template for main page in my template folder named chart.html.
The chart.min.js is set on base.html.
The links work from the Django framework to HTML5 templates with javascript.
The full project can be found on my GitHub project.
See the screen output:

Thursday, December 5, 2019

Python 3.7.5 : About PEP 8.

Today is December 5, 2019, and this is the reason I wrote about PEP 8.
The official webpage can be found at this webpage.
PEP 8 recommends using 4 spaces to show indentation and tabs should only be used to maintain consistency in the code.
The Python Library is conservative and 79 characters are the maximum required line limit as suggested by PEP 8.
The main goal is to avoid line wrapping.
Indentation allows you to differentiate between multiple lines of code and a single line of code that spans multiple lines.
The closing braces with the first white-space character of the last line.
The use of blank lines can greatly improve the readability of your code.
This rule allows the reader to understand the separation of the sections of code and the relation between them.
TypeNaming ConventionsExamples
VariableUsing short names with CapWords.T, AnyString, My_First_Variable
FunctionUsing a lowercase word or words with underscores to improve readability.function, my_first_function
ClassUsing CapWords and do not use underscores between words.Student, MyFirstClass
MethodUsing lowercase words separated by underscores.Student_method, method
ConstantsUsing all capital letters with underscores separating wordsTOTAL, MY_CONSTANT, MAX_FLOW
ExceptionsUsing CapWords without underscores.IndexError, NameError
ModuleUsing short lower-case letters using underscores.module.py, my_first_module.py
PackageUsing short lowercase words and underscores are discouraged.package, my_first_package
About comments, PEP 8 basically recommends using block comments for general-purpose coding.
Document strings or docstrings start at the first line of any function, class, file, module or method.
These type of comments are enclosed between single quotations ( ''') or double quotations ( """ ).
To improving the readability of expressions and statements use whitespace for any character or sequence of characters.
We need to avoiding whitespaces:
  • inside parentheses, brackets, or braces;
  • before a comma, a semicolon, or a colon;
  • before open parenthesis that initiates the argument list of a function call;
  • before an open bracket that begins an index or a slice;
  • between a trailing comma and a closing parenthesis;
  • to adjust assignment operators;
PEP 8 recommends using a clear development for implementation of source code, for example PEP recommends using .startswith() and .endswith() instead of list slicing.
You can check whether your code actually complies with the rules and regulations of PEP 8 or not.
The linters is a program that analyzes your code and detects program errors, syntax errors, bugs, and structural problems.
An autoformatted tool will format your code to adapt to PEP 8 automatically.
Both tools used to implement and check PEP 8 compliance.
To know more about PEP 8 and its book of guidelines, you can refer to pep8.org.
You can check it online at this webpage.

Tuesday, December 3, 2019

Python 3.7.5 : The new Django version 3.0 .

On December 2, 2019, comes with Django 3.0 Released.
This new release comes with many changes and features, let's see :
  • Django 3.0 supports new versions of Python 3.6, 3.7, and 3.8 (the old Django 2.2.x series is the last to support Python 3.5);
  • Django now officially supports MariaDB 10.1 and higher;
  • Django is fully async-capable by providing support for running as an ASGI application;
  • The exclusion class constraints on PostgreSQL;
  • the output BooleanField may now be used directly in QuerySet filters;
  • you can have custom enumeration types
  • cache features: add_never_cache_headers() and never_cache() now add the private directive to Cache-Control headers;
  • file storage, forms, internationalization, and logging features;
  • the management commands are new options;
  • the Requests and Responses, Models and Security has many changes and new features;
  • the Tests and test cases have a good implementation;
  • new minor features for django.contrib.admin, django.contrib.auth, django.contrib.gis, django.contrib.postgres, django.contrib.sessions, django.contrib.syndication and more
All these features can be read at the official webpage.

Friday, November 29, 2019

Python 3.7.5 : Script install and import python packages.

This script will try to import Python packages from a list.
If these packages are not installed then they will be installed on the system.
import sys
import subprocess

if __name__ == '__main__': 
    def ModuleInstall(package_name):
        try:
            subprocess.check_call(['python3', '-m', 'pip3', 'install', package_name, "--user"])
        except:
            subprocess.check_call(['python', '-m', 'pip', 'install', package_name, "--user"])
    def ModuleLoaded(package_name):
        __import__(package_name)
        print ("Successfully imported ", package_name, '.')

    required_libs = [
    "optparse",
    "socket",
    "select",
    "threading",
    "time",
    "re",
    "ssl",
    "textwrap",
    "sys",
    "random",
    "traceback",
    "binascii",
    "struct",
    "inspect"]
    for lib_name in required_libs:
        try:
            ModuleLoaded(lib_name)
        except:
            ModuleInstall(lib_name)

Monday, November 25, 2019

Python 3.7.5 : Python and SQL, a fast approach.

Today he had to solve a task that required declaring a series of consecutive days to read from several SQL tables.
As you know, this task would require a complex query to include statements.
I solved it quite simply by using a python script that would automatically generate the days and then concatenate them into a query.
It may not be the best solution but it is very fast and does not require very advanced SQL knowledge.
Let's see the source code:
from datetime import date, timedelta

start_date = date(2019, 01, 1)
end_date = date(2019, 11, 24)
delta = timedelta(days=1)
q1 = str("SELECT DATE_FORMAT(updated, '%Y-%m-%d') AS day, intern, COUNT(*) FROM `intern")
q2 = str("` WHERE intern = 0 GROUP BY intern, day UNION ALL ")
while start_date <= end_date:
    print (q1+start_date.strftime("%Y_%m_%d")+q2)
    start_date += delta
The result querry will be like this:
SELECT DATE_FORMAT(updated, "%Y-%m-%d") AS day, intern, COUNT(*) FROM `intern2019_01_01` GROUP BY intern, 
day UNION ALL 
SELECT DATE_FORMAT(updated, "%Y-%m-%d") AS day, intern, COUNT(*) FROM `intern2019_01_02` GROUP BY intern,
 day UNION ALL
...
SELECT DATE_FORMAT(updated, "%Y-%m-%d") AS day, intern, COUNT(*) FROM `intern2019_11_24` GROUP BY intern,
 day UNION ALL
The last step, I remove the UNION ALL from the last SELECT and I used with phpmyadmin.

Saturday, November 23, 2019

Python 3.7.5 : Create GUI with npyscreen.

This python module solves the issue of creating easy GUI in the terminal.
The development team tells us:
Npyscreen is a python widget library and application framework for programming terminal or console applications.
The development of this python module is similar to the PyQt python module.
The npyscreen comes with many widgets and easy development.
The GitHub repo for this python module comes with many examples.
The documentation of this python module can be found at this
[mythcat@desk ~]$ pip3 install npyscreen --user
Collecting npyscreen
...
Installing collected packages: npyscreen
Successfully installed npyscreen-4.10.5
Let's see one simple example:
#!/usr/bin/env python3
import npyscreen

class ProductForm(npyscreen.Form):
     def create(self):
        self.product_name = self.add(npyscreen.TitleText, name='Name')
        self.category = self.add(npyscreen.TitleSelectOne, scroll_exit=True,\
     max_height=3, name='Category', values = ['Products 1', 'Products 2', 'Products 3'])
        self.myDate = self.add(npyscreen.TitleDateCombo, name='Date stocking')

class MyTest(npyscreen.NPSAppManaged):
    def onStart(self):
        self.addForm('MAIN', ProductForm, name='New product')

if __name__ == '__main__':
    TestApp = MyTest().run() 
The result of this python script can be seen on the next image.

Tuesday, November 19, 2019

Python 3.7.5 : Display a file in the hexadecimal and binary output.

This is an example with a few python3 modules that display a file in the hexadecimal and binary output:
import sys
import os.path
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("FILE", help="the file that you wish to dump to hexadecimal", type=str)
parser.add_argument("-b", "--binary", help="display bytes in binary format instead of hexadecimal")
args = parser.parse_args()

try:
    with open(args.FILE, "rb") as f:
        n = 0
        b = f.read(16)
        while b:
            if not args.binary:
                s1 = " ".join([format(i,'02x') for i in b])
                s1 = s1[0:23] + " " + s1[23:]
                width = 48
            else:
                s1 = " ".join([format(i,'08b') for i in b])
                s1 = s1[0:71] + " " + s1[71:]
                width = 144
            s2 = "".join([chr(i) if 32 <= i <= 127 else "." for i in b])
            print(format(n * 16,'08x'), format(s1), format(s2))
            n += 1
            b = f.read(16)

    print(format(os.path.getsize(args.FILE),'08x'))

except Exception as e:
    print(__file__, ": ", type(e).__name__, " - ", e, sep="", file=sys.stderr)
The hexadecimal output of the sec.png file:
[mythcat@desk test]$ python3 hex.py sec.png 
00000000 89 50 4e 47 0d 0a 1a 0a  00 00 00 0d 49 48 44 52 .PNG........IHDR
00000010 00 00 01 0b 00 00 00 bd  08 03 00 00 00 93 6f a8 ..............o.
00000020 bc 00 00 00 5d 50 4c 54  45 ff ff ff ab ab ab fc ....]PLTE.......
00000030 fc fc a0 a0 a0 f5 f5 f5  a6 a6 a6 f9 f9 f9 9e 9e ................
00000040 9e f1 f1 f1 fa fa fa ea  ea ea ad ad ad a4 a4 a4 ................
00000050 d0 d0 d0 f6 f6 f6 e5 e5  e5 bc bc bc c3 c3 c3 97 ................
00000060 97 97 b4 b4 b4 d8 d8 d8  bf bf bf cb cb cb db db ................
00000070 db e1 e1 e1 93 93 93 7f  7f 7f 8d 8d 8d 85 85 85 .............
00000080 79 79 79 70 70 70 d9 a6  b1 29 00 00 08 fa 49 44 yyyppp...)....ID
00000090 41 54 78 9c e5 9d 89 92  a3 2a 14 40 45 45 09 a8 ATx......*.@EE..
000000a0 80 18 34 d3 3d ef ff 3f  f3 69 92 4e 67 71 45 90 ..4.=..?.i.NgqE.
000000b0 65 4e 55 4f a5 7b 52 c6  20 dc 9d 4b 14 f9 46 f7 eNUO.{R. ..K..F.
000000c0 87 36 79 8d 6d df 86 75  28 03 e4 4f c4 cb 28 ca .6y.m..u(..O..(.
000000d0 63 db f7 62 97 36 91 8f  e9 80 72 ce 6c de 8b 65 c..b.6....r.l..e
000000e0 c8 f9 f9 b7 14 53 68 eb  4e ac 73 a1 ef 7f 91 d2 .....Sh.N.s....
000000f0 c6 7d 6c 04 71 91 e9 be  66 f2 31 14 51 04 dd 97 .}l.q...f.1.Q...
00000100 a0 9c d0 8e e8 be 66 33  f6 d7 5a f3 a7 68 a7 e4 ......f3..Z..h..
00000110 91 f6 f9 8b c1 e8 9f 45  aa f5 53 34 93 45 71 7e .......E..S4.Eq~
00000120 7d 21 4a 9d 97 9d 98 00  ba 67 9f 4e b2 3a 61 a2 }!J......g.N.:a.
00000130 bb bd 96 10 f5 ff e2 b2  c3 d5 ee eb b6 13 b3 4c ...............L
00000140 a0 dd 97 36 06 96 51 c7  7f 7e 41 02 00 c8 da 86 ...6..Q.~A.....
00000150 13 00 bb 7d d7 9d 94 0b  f9 88 44 75 03 fc f5 f6 ...}......Du....
00000160 9c d2 d3 fd 45 d2 ff 34  cf 0f 17 89 75 aa 26 46 ....E..4....u.&F
00000170 b4 2d b9 98 fc 7f a0 75  29 ea 43 4e df b2 68 a2 .-....u).CN..h.
...
The binnary output of sec.png file:
[mythcat@desk test]$ python3 hex.py -b FILE sec.png 
00000000 10001001 01010000 01001110 01000111 00001101 00001010 00011010 00001010  
00000000 00000000 00000000 00001101 01001001 01001000 01000100 01010010 .PNG........IHDR
00000010 00000000 00000000 00000001 00001011 00000000 00000000 00000000 10111101
  00001000 00000011 00000000 00000000 00000000 10010011 01101111 10101000 ..............o.
00000020 10111100 00000000 00000000 00000000 01011101 01010000 01001100 01010100
  01000101 11111111 11111111 11111111 10101011 10101011 10101011 11111100 ....]PLTE.......
00000030 11111100 11111100 10100000 10100000 10100000 11110101 11110101 11110101
  10100110 10100110 10100110 11111001 11111001 11111001 10011110 10011110 ................
00000040 10011110 11110001 11110001 11110001 11111010 11111010 11111010 11101010
  11101010 11101010 10101101 10101101 10101101 10100100 10100100 10100100 ................
00000050 11010000 11010000 11010000 11110110 11110110 11110110 11100101 11100101
  11100101 10111100 10111100 10111100 11000011 11000011 11000011 10010111 ................
00000060 10010111 10010111 10110100 10110100 10110100 11011000 11011000 11011000
  10111111 10111111 10111111 11001011 11001011 11001011 11011011 11011011 ................
00000070 11011011 11100001 11100001 11100001 10010011 10010011 10010011 01111111
  01111111 01111111 10001101 10001101 10001101 10000101 10000101 10000101 .............
00000080 01111001 01111001 01111001 01110000 01110000 01110000 11011001 10100110
  10110001 00101001 00000000 00000000 00001000 11111010 01001001 01000100 yyyppp...)....ID
00000090 01000001 01010100 01111000 10011100 11100101 10011101 10001001 10010010
  10100011 00101010 00010100 01000000 01000101 01000101 00001001 10101000 ATx......*.@EE..
...

Friday, November 15, 2019

Python 3.7.5 : About PEP 8016.

Let's start with PEP 8012 which proposes a new model of Python governance based on consensus and voting, without the role of a centralized singular leader or a governing council and was rejected.
The PEP 8015 formalize the current organization of the Python community and proposes changes.
This PEP 8015 was rejected by a core developer vote described in PEP 8001 on Monday, December 17, 2018.
The next PEP 8016 proposes a model of Python governance based around a steering council.
All candidate PEPs are listed in PEP 8000 and consists of all PEPs numbered in the 801X range.
The PEP 8001 refers to the voting process and choose which governance PEP should be implemented by the Python project.
All these PEP's 2012, 2013, 2014, 2015 are rejected.
The last one PEP 2016 named The Steering Council Model was accepted.
The PEP 2016 comes with these new rules:
This PEP proposes a model of Python governance based around a steering council. The council has broad authority, which they seek to exercise as rarely as possible; instead, they use this power to establish standard processes, like those proposed in the other 801x-series PEPs. This follows the general philosophy that it's better to split up large changes into a series of small changes that can be reviewed independently: instead of trying to do everything in one PEP, we focus on providing a minimal-but-solid foundation for further governance decisions.
You can read more about this PEP on this webpage.

Tuesday, November 5, 2019

Python 3.7.5 : About PEP 3107.

The PEP 3107 introduces a syntax for adding arbitrary metadata annotations to Python functions.
The function annotations refer to syntax parameters with an expression.
def my_function(x: expression, y: expression = 5):
...
For example:
>>> def show(myvar:np.float64):
...     print(type(myvar))
...     print(myvar)
... 
>>> show(1.1)

1.1
>>> def files(filename: str, dot='.') -> list:
...     print(filename)
...     print(type(filename))
... 
>>> files('file.txt')
file.txt

>>> print(files.__annotations__)
{'filename': , 'return': }
>>> print(show.__annotations__)
{'myvar': }
...
You can see the annotation syntax with a dictionary called __annotations__ as an attribute on your functions.
This lets you rewrite Python 3 code with function annotations to be compatible with both Python 3 and Python 2.
Type hints are a specialization of function annotations, and they can also work side by side with other function annotations.
Annotations have no standard meaning or semantics.
There are several benefits to the annotations:
  • if you rename an argument, the documentation docstring version may be out of date and is easier to see if an argument is not documented;
  • is no need to come up with a special format of argument because the annotations attribute provides a direct, standard mechanism of access;
Let's see one example using type aliases:

>>> Temperature = float
>>> def forecast(local_temperature: Temperature) -> str:
...     print(local_temperature)
... 
>>> forecast(13.1)
13.1
...
I can create multiple annotations:

>>> def div(a: dict(type=float, help='the dividend'), b: dict(type=float, help='this <> 0)') ) -> 
dict(type=float, help='the result of dividing a by b'):
...     return a / b
... 
>>> div(3,4)
0.75
...
Annotations for excess parameters like *args and **kwargs, allow arbitrary number of arguments to be passed in a function call.
See example with my_func:
def my_func(*args: expression, *kwargs: expression):
...
Annotations combine well with decorators to provide input to a decorator, and decorator-generated wrappers are a good place to put code that gives meaning to annotations, but this is another issue.


Monday, November 4, 2019

Python 3.7.5 : About PEP 506.

Today I did a python evaluation and saw that there are many new aspects that should be kept in mind for a programmer.
So I decided to recall some necessary elements of PEP.
First, PEP stands for Python Enhancement Proposal.
A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment.
My list will not follow a particular order and I will start with PEP 506.
This PEP 506 proposes the addition of a module for common security-related functions such as generating tokens to the Python standard library.
Python 3.6 added a new module called secrets that is designed to provide an obvious way to reliably generate cryptographically strong pseudo-random values suitable for managing secrets, such as account authentication, tokens, and similar.
Python’s random module was never designed for cryptographic but you can try to use it with urandom function:
[mythcat@desk ~]$ python3
Python 3.7.5 (default, Oct 17 2019, 12:09:47) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.urandom(8)
This module named secrets will contain a set of ready-to-use functions for dealing with anything which should remain secret (passwords, tokens, etc.).
>>> import secrets
>>> import string
>>> alphabet = string.ascii_letters + string.digits
>>> password = ''.join(secrets.choice(alphabet) for i in range(20)) 
>>> print(alphabet, password)
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 mwTKhSxGGBMU3voOV1Kf
The secrets module also provides several methods of generating tokens, see example:
As bytes, secrets.token_bytes;
>>> secrets.token_bytes()
b'I\xf9a\xd1j\xc6\xc9\xa0qV\x82\x07x\xc6\xe9\xbb\xd7<\xfb\xb2?\xe1\x94\xe9\xce\xbc\xaaF\xfc7\xfc='
>>> secrets.token_bytes(8)
b'\rl\xb1\xb9\x04i]d'
>>> secrets.token_bytes(16)
b'B!:G\x1c\xdd.\xacC\x7f\x95)\x1f^\xec\xb2'
>>> secrets.token_bytes(32)
b'\xfa\xa9\xff\x91y\x9e+z9\x88K\x95\xa8\xb0\x06\xc2b:\xf5]\xcf^%~\x0cJ\xdd\x80\xa2\xa0\xdc\xaa'
>>> secrets.token_bytes(64)
b"\xe4(\x80d7c6\\\xb2\xd5\xcb\x92\x8a'\x82\xcb\xfd\xcc\x9a\x8a\xd9jt\x84s\xb0\x8f]\x8cS\xdcP\n\xef\x14\xf6\
xe0+0\xaf\xcfL\xd3\xd0\xfe\x04\x98k\xc38\xf6\xad.~\xd1\xca\xd6\xc9\xf9\xbf\xff8O\xad"
As text, using hexadecimal digits, secrets.token_hex;
>>> secrets.token_hex()
'5a2eb8a0a89ecaf5a64e57215f359012eaaf8a3db51bd1ea171e922a24935183'
>>> secrets.token_hex(8)
'79e7582b72711af7'
>>> secrets.token_hex(16)
'9b274380935ae169ebd41159f7b85cf6'
>>> secrets.token_hex(32)
'0a2e5fde42c6578c3ba36501b69a9339e838d44c3240999a83d349d266bcb164'
>>> secrets.token_hex(64)
'fbd9ab627e9fe6c2b6d715b1438205321ac9139f5089fe6ca4ffece79aa0c08aa84a26fdbb984dc48a0489e1692b19d3f5fe40116be
60f1a1d7d61739718befe'
As text, using URL-safe base-64 encoding, secrets.token_urlsafe.
>>> secrets.token_urlsafe()
'L06rX6fIk1n-gpcLbsHq_w5SgkqgGcvnkjBRcOZqgXs'
>>> secrets.token_urlsafe(8)
'lhOw5llcgsQ'
>>> secrets.token_urlsafe(16)
'A493DgcDMiNx8WjlRswxBA'
>>> secrets.token_urlsafe(32)
'HSb5dqkaPrqFcdsQFYW5N_Fxb_Hxn0ESsT4VMfJcLYY'
>>> secrets.token_urlsafe(64)
'FKPC0LU7Sc_dsxm7m-VMA-vTEKgJeNcD2zpjKBEg0oLZlPBVVM0O5Vztp0ySLifyifok5009LByQUc5z8thCWQ'

Saturday, November 2, 2019

Python 3.7.5 : Intro about scikit-learn python module.

This python module named scikit-learn used like sklearn is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy and comes with various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN.
The official webpage can be found here
Let't install this on my Fedora 30 distro:
[mythcat@desk proiecte_github]$ mkdir sklearn_examples
[mythcat@desk proiecte_github]$ cd sklearn_examples/
[mythcat@desk sklearn_examples]$ pip3 install scikit-learn --user
Python 3.7.5 (default, Oct 17 2019, 12:09:47) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> print('sklearn: %s' % sklearn.__version__)
sklearn: 0.21.3 

First, this is a complex python module with many examples on web.
You can learn much about how can use simple and efficient all data mining and data analysis.
You can learn a lot about how all data exploitation and data analysis can be used simply and efficiently.
Mathematical functions are simple and complex. How to use python programming and existing examples can be used in several learning points.
I would start with discovering the input and output data sets and then continue with clear examples used daily by us.
I tested today with SVC and sklearn python module.
The SVMs were introduced initially in the 1960s and were later refined in the 1990s.
The base of this algorithm is the decision boundary that maximizes the distance from the nearest data points of all the classes.
The wikipedia article show all informations about support-vector machines (named SVM).
As applications we can use this function in: medical field for cell counting or similar cell quantification, astronomy, etc.
This simple example use multiple kernels and gammas parameters to group the input data.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn import svm
from sklearn.svm import SVC
# importing scikit learn with make_blobs 
from sklearn.datasets.samples_generator import make_blobs 
# 
import matplotlib.pyplot as plt

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

# import some data to play with
iris = datasets.load_iris()
x = iris.data[:, :2]
y = iris.target

def plotSVC(title):
  # create a mesh to plot with dataset x and y
  x_min, x_max = x[:, 0].min() - 1, x[:, 0].max() + 1
  y_min, y_max = x[:, 1].min() - 1, x[:, 1].max() + 1
  # set the resolution by 100 
  h = (x_max / x_min)/100
  # create the meshgrid 
  xx, yy = np.meshgrid(np.arange(x_min, x_max, h),np.arange(y_min, y_max, h))
  # divides the current figure into an m-by-n grid and creates axes in the position specified by p
  plt.subplot(1, 1, 1)
  # the model can then be used to predict new values
  Z = svc.predict(np.c_[xx.ravel(), yy.ravel()])
  # reshape your test data because prediction needs an array that looks like your training data
  Z = Z.reshape(xx.shape)
  # use plt to show result 
  plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8) 
  plt.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.Paired)
  plt.xlabel('x length')
  plt.ylabel('y width')
  plt.xlim(xx.min(), xx.max())
  plt.title("Plot SVC")
  plt.show()

# create kernels for svg 
kernels = ['linear', 'rbf', 'poly']
# for each kernel show graphs
for kernel in kernels:
  svc = svm.SVC(kernel=kernel).fit(x, y)
  plotSVC('kernel=' + str(kernel))
# create gammas values
# the gamma parameter defines how far the influence of a single training example reaches
gammas = [0.1, 1, 10, 100, 1000]
# for each gammas and kernel rbf - fast processing,  show graphs
for gamma in gammas:
   svc = svm.SVC(kernel='rbf', gamma=gamma).fit(x, y)
   plotSVC('gamma=' + str(gamma))

See the last result for kernel rbf and gamma 1000.

Python 3.7.5 : The ani script with ascii.

ASCII, abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. see Wikipedia.
This is a simple script named ani.py created by me to show an animation with ASCII ...
import os, time
os.system('cls')
filenames = ["0.txt","1.txt","2.txt","3.txt"]
frames = []
for name in filenames:
    with open (name, "r", encoding="utf8") as f:
        frames.append(f.readlines())
"""
for frame in frames:
    print("".join(frame))
    time.sleep(1)
    os.system('clear')
"""
for i in range (4):
    os.system('clear')
    for frame in frames:
        print("".join(frame))
        time.sleep(1)
        os.system('clear')
You need four text files with an 8X8 character matrix format: 0.txt , 1.txt , 2.txt and 3.txt.
The content of these files:
$ cat *.txt
        
 ###### 
        
        
        
        
 ###### 
                 
        
 ###### 
        
        
 ###### 
        
                 
        
        
  ####  
  ####  
        
        
                 
        
        
   ##   
   ##   
        
        
The end result is a square that shrinks to 4 characters #.