Article from: https://www.cnblogs.com/traces2018/p/9968389.html

One.

Briefly describe the relationship and the differences between classification and clustering.

Briefly describe what supervised learning and unsupervised learning are.

Classification and clustering: Classification is a supervised method that assigns data to target classes that are known in advance (for example, the naive Bayes algorithm). Clustering is an unsupervised method: no target classes exist before the model is built, and data with similar features are automatically grouped into the same cluster (for example, the KMeans clustering algorithm).

Supervised learning and unsupervised learning: In supervised learning, a labeled training data set is given before the model is built; the machine trains the model on that data set and then makes predictions for new data. Unsupervised learning analyzes data that has not been labeled manually; the machine groups the data itself according to the similarity between samples, so highly similar data end up in the same group.
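To make the contrast concrete, here is a minimal sketch on made-up one-dimensional data. The supervised half uses a nearest-centroid classifier (chosen only because it is short, it is not the naive Bayes method used later), and the unsupervised half uses a tiny 2-means loop. All names (`fit_centroids`, `classify`, `two_means`) and all data values are hypothetical.

```python
# Toy 1-D data: two visibly separated groups of values (made up for illustration).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]

# --- Supervised (classification): the labels are given up front. ---
labels = ['low', 'low', 'low', 'high', 'high', 'high']

def fit_centroids(xs, ys):
    """Return the mean of xs for each label in ys."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(x, centroids):
    """Assign x to the label whose centroid is nearest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

centroids = fit_centroids(points, labels)
prediction = classify(4.5, centroids)  # the answer is one of the known label names

# --- Unsupervised (clustering): no labels, only similarity between values. ---
def two_means(xs, iterations=10):
    """Very small 1-D k-means with k = 2; returns a cluster index per point."""
    centers = [min(xs), max(xs)]  # simple deterministic initialisation
    assignment = []
    for _ in range(iterations):
        # Assign each point to the nearer of the two centers.
        assignment = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                      for x in xs]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignment

clusters = two_means(points)  # cluster indices only; no label names are involved
```

The classifier can only answer with labels it was trained on, whereas the clustering step returns anonymous group indices that still have to be interpreted afterwards.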

Two.

Calculation process
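For reference, the quantity that the code in this section computes can be written as follows. This is a reading of the code, not a formula stated in the original: $y$ is the disease (myocardial infarction or unstable angina pectoris), $x_1,\dots,x_6$ are the six encoded features, and $N$ is the number of rows.

$$
P(y \mid x_1,\dots,x_6) \approx \frac{P(y)\,\prod_{i=1}^{6} P(x_i \mid y)}{\prod_{i=1}^{6} P(x_i)},
\qquad
P(y) = \frac{\operatorname{count}(y)}{N},\quad
P(x_i \mid y) = \frac{\operatorname{count}(x_i, y)}{\operatorname{count}(y)},\quad
P(x_i) = \frac{\operatorname{count}(x_i)}{N}
$$

The denominator treats the features as independent overall, which is an approximation, so the two results need not sum to 1. Because the denominator is the same for both diseases, it does not change which one gets the larger value.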

Code implementing the naive Bayes algorithm:

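```python
import pandas as pd
import numpy as np

# Data processing: encode sex (male = 1, female = 0), age (<70 = -1, 70-80 = 0, >80 = 1)
# and length of stay (<7 = -1, 7-14 = 0, >14 = 1) as three numeric columns.
# dataDF is assumed to hold the raw patient records; the loading code is not shown.
sex = []
for s in dataDF['Sex']:  # column name assumed; the loop header is missing in the original
    if s == 'male':
        sex.append(1)
    else:
        sex.append(0)

age = []
for a in dataDF['Age']:  # column name assumed; the loop header is missing in the original
    if a == '<70':
        age.append(-1)
    elif a == '70-80':
        age.append(0)
    else:
        age.append(1)

days = []
for d in dataDF['Length of stay']:
    if d == '<7':
        days.append(-1)
    elif d == '7-14':
        days.append(0)
    else:
        days.append(1)

# In addition, a processed DataFrame is generated (code not shown).

# Convert it to an array for computing (the construction of dataarr is not shown).
# Each row holds: sex, age, KILLP, drink, smoke, days, disease.
dataarr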

# Use the Bayesian model to decide which disease a patient most likely has, for example:
# sex = 'male', age < 70, KILLP = 1, drinking = 'yes', smoking = 'yes', length of stay < 7
def beiyesi(sex, age, KILLP, drink, smoke, days):
    # Initialize the counters: xi_y1 / xi_y2 count the rows of each class where feature i matches
    x1_y1, x2_y1, x3_y1, x4_y1, x5_y1, x6_y1 = 0, 0, 0, 0, 0, 0
    x1_y2, x2_y2, x3_y2, x4_y2, x5_y2, x6_y2 = 0, 0, 0, 0, 0, 0
    y1 = 0
    y2 = 0

    for line in dataarr:
        if line[6] == 'Myocardial infarction':  # count feature matches among myocardial infarction cases
            y1 += 1
            if line[0] == sex:
                x1_y1 += 1
            if line[1] == age:
                x2_y1 += 1
            if line[2] == KILLP:
                x3_y1 += 1
            if line[3] == drink:
                x4_y1 += 1
            if line[4] == smoke:
                x5_y1 += 1
            if line[5] == days:
                x6_y1 += 1
        else:  # count feature matches among unstable angina pectoris cases
            y2 += 1
            if line[0] == sex:
                x1_y2 += 1
            if line[1] == age:
                x2_y2 += 1
            if line[2] == KILLP:
                x3_y2 += 1
            if line[3] == drink:
                x4_y2 += 1
            if line[4] == smoke:
                x5_y2 += 1
            if line[5] == days:
                x6_y2 += 1
    # print('y1:',y1,' y2:',y2)

    # Compute the likelihoods P(x|y1) and P(x|y2)
    # print('x1_y1:',x1_y1, ' x2_y1:',x2_y1, ' x3_y1:',x3_y1, ' x4_y1:',x4_y1, ' x5_y1:',x5_y1, ' x6_y1:',x6_y1)
    # print('x1_y2:',x1_y2, ' x2_y2:',x2_y2, ' x3_y2:',x3_y2, ' x4_y2:',x4_y2, ' x5_y2:',x5_y2, ' x6_y2:',x6_y2)
    x1_y1, x2_y1, x3_y1, x4_y1, x5_y1, x6_y1 = x1_y1/y1, x2_y1/y1, x3_y1/y1, x4_y1/y1, x5_y1/y1, x6_y1/y1
    x1_y2, x2_y2, x3_y2, x4_y2, x5_y2, x6_y2 = x1_y2/y2, x2_y2/y2, x3_y2/y2, x4_y2/y2, x5_y2/y2, x6_y2/y2
    x_y1 = x1_y1 * x2_y1 * x3_y1 * x4_y1 * x5_y1 * x6_y1
    x_y2 = x1_y2 * x2_y2 * x3_y2 * x4_y2 * x5_y2 * x6_y2

    # Count how often each feature value occurs in the whole data set
    x1, x2, x3, x4, x5, x6 = 0, 0, 0, 0, 0, 0
    for line in dataarr:
        if line[0] == sex:
            x1 += 1
        if line[1] == age:
            x2 += 1
        if line[2] == KILLP:
            x3 += 1
        if line[3] == drink:
            x4 += 1
        if line[4] == smoke:
            x5 += 1
        if line[5] == days:
            x6 += 1
    # print('x1:',x1, ' x2:',x2, ' x3:',x3, ' x4:',x4, ' x5:',x5, ' x6:',x6)
    # Compute P(x) as the product of the individual feature frequencies
    length = len(dataarr)
    x = x1/length * x2/length * x3/length * x4/length * x5/length * x6/length
    # print('x:',x)

    # Compute the probability of myocardial infarction and of unstable angina pectoris given the symptoms
    y1_x = (x_y1)*(y1/length)/x
    # print(y1_x)
    y2_x = (x_y2)*(y2/length)/x

    # Decide which disease is more likely
    if y1_x > y2_x:
        print('The patient is more likely to suffer from myocardial infarction.', y1_x)
    else:
        print('The patient is more likely to suffer from unstable angina pectoris.', y2_x)

# Prediction: sex = male, age < 70, KILLP = 1, drinking = yes, smoking = yes, length of stay < 7
beiyesi(1, -1, 1, 'yes', 'yes', -1)
```
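Since the original data set is not included, the following is a minimal, self-contained sketch of the same counting approach that can be run as-is. The function name `naive_bayes_scores` and the rows in `toy_rows` are hypothetical and made up for illustration; they are not the author's data. Unlike `beiyesi`, the sketch returns the per-class scores instead of printing them and leaves out the shared denominator P(x), which does not affect which class scores highest.

```python
def naive_bayes_scores(rows, query):
    """Score each class for `query` using raw frequency estimates.

    rows  : list of tuples (feature_1, ..., feature_n, label)
    query : tuple of n feature values to classify
    Returns {label: P(label) * prod_i P(x_i | label)}.
    """
    total = len(rows)

    # Count how many rows belong to each class.
    class_counts = {}
    for row in rows:
        class_counts[row[-1]] = class_counts.get(row[-1], 0) + 1

    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(label)
        for i, value in enumerate(query):
            matches = sum(1 for row in rows
                          if row[-1] == label and row[i] == value)
            score *= matches / n_label  # likelihood P(x_i | label)
        scores[label] = score
    return scores


# Hypothetical toy rows: (sex, age, KILLP, drink, smoke, days, disease).
toy_rows = [
    (1, -1, 1, 'yes', 'yes', -1, 'Myocardial infarction'),
    (1, -1, 1, 'yes', 'no', -1, 'Myocardial infarction'),
    (1, 0, 2, 'yes', 'yes', 0, 'Myocardial infarction'),
    (0, -1, 1, 'no', 'yes', -1, 'Unstable angina pectoris'),
    (1, 1, 1, 'yes', 'yes', -1, 'Unstable angina pectoris'),
]

scores = naive_bayes_scores(toy_rows, (1, -1, 1, 'yes', 'yes', -1))
best = max(scores, key=scores.get)  # class with the highest score
```

With raw frequencies, a feature value that never occurs in a class drives that class's score to zero; Laplace smoothing is a common remedy but is not shown here.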

Screenshots: