AI-Introduction-5

Lucas-TY

AI | Jan 30, 2024 | Last edited: Feb 1, 2024

Random Variable (RV)

A random variable has a set of possible values, its "range".

  • Also called the support.

A "distribution function" maps each value in the range to a real number describing how "likely" it is to occur.

A valid density must integrate to 1:

\int p(x)\,dx = 1

For example, the uniform density of height 2 on [0.25, 0.75]:

\int_{0.25}^{0.75} 2\,dx = 1

continuous RV: probability density function (pdf)

discrete RV: probability mass function (pmf)

  • continuous, unbounded
    • Normal
    • MLE (maximum likelihood estimation): \hat{\mu} = \frac{1}{n}\sum_i x_i, \quad \hat{\sigma}^2 = \frac{1}{n}\sum_i (x_i - \hat{\mu})^2
  • discrete, bounded
    • Bernoulli, binomial, categorical (p(x = i) = p_i), multinomial
  • continuous, bounded
    • uniform (a fitting sketch follows this list)
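A minimal sketch of these MLE fits (Python with NumPy; the arrays are illustrative, taken from the class-1 rows of the table below):

```python
import numpy as np

# Normal MLE: sample mean and (biased, 1/n) sample variance
x = np.array([0.5, 2.5, 2.0])
mu_hat = x.mean()                        # (1/n) * sum(x_i)
sigma2_hat = ((x - mu_hat) ** 2).mean()  # (1/n) * sum((x_i - mu)^2)

# Bernoulli MLE: fraction of ones
b = np.array([0, 1, 1])
p_hat = b.mean()

# Categorical MLE: empirical frequency of each value
c = np.array([1, 2, 3])
values, counts = np.unique(c, return_counts=True)
lam_hat = counts / counts.sum()

print(mu_hat, sigma2_hat)  # 1.667 (= 5/3), 0.722 (= 13/18)
print(p_hat)               # 0.667 (= 2/3)
print(lam_hat)             # [1/3, 1/3, 1/3]
```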

Naive Bayes

Example training data (one row per data point; columns are the class label and three features):

| class | continuous (Normal) | binary (Bernoulli) | categorical |
| --- | --- | --- | --- |
| 1 | 0.5 | 0 | 1 |
| 1 | 2.5 | 1 | 2 |
| 2 | 4 | 0 | 3 |
| 3 | 2 | 1 | 2 |
| 4 | 1 | 0 | 1 |
| 1 | 2 | 1 | 3 |
| 2 | 7 | 1 | 2 |
Input: (1) labeled data, (2) a distribution family for each feature
  1. Fit a categorical distribution to the class labels (the class prior)
  2. For each class:
    • For each data dimension j: fit the class-conditional MLE for dimension j

(A code sketch of the whole fit follows the parameter table below.)
Fitted class-conditional parameters (class 1 worked out; classes 2-4 are analogous):

| class | Normal | Bernoulli | categorical |
| --- | --- | --- | --- |
| 1 | \mu = 5/3, \sigma^2 = 13/18 | p = 2/3 | \lambda_1 = 1/3, \lambda_2 = 1/3, \lambda_3 = 1/3 |
| 2 | | | |
| 3 | | | |
| 4 | | | |
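A minimal sketch of the fit on the toy table above (Python with NumPy; variable names are illustrative):

```python
import numpy as np

# Toy table: class, continuous (Normal), binary (Bernoulli), categorical
data = np.array([
    [1, 0.5, 0, 1],
    [1, 2.5, 1, 2],
    [2, 4.0, 0, 3],
    [3, 2.0, 1, 2],
    [4, 1.0, 0, 1],
    [1, 2.0, 1, 3],
    [2, 7.0, 1, 2],
])
y, X = data[:, 0].astype(int), data[:, 1:]
classes = np.unique(y)

# Step 1: class prior, a categorical MLE over the labels
prior = {c: np.mean(y == c) for c in classes}

# Step 2: per-class, per-dimension class-conditional MLEs
params = {}
for c in classes:
    Xc = X[y == c]
    mu, var = Xc[:, 0].mean(), Xc[:, 0].var()             # Normal MLE
    p = Xc[:, 1].mean()                                   # Bernoulli MLE
    lam = {v: np.mean(Xc[:, 2] == v) for v in (1, 2, 3)}  # categorical MLE
    params[c] = (mu, var, p, lam)

print(params[1])  # (1.667, 0.722, 0.667, {1: 1/3, 2: 1/3, 3: 1/3})
```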

Naive Bayes Inference

  • Input
    • features x
  • Output
    • a predicted label inferred from the features
  • Terms
    • prior p(y), class-conditional likelihood p(x_j | y), posterior p(y | x)
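Written out, the standard Naive Bayes decision rule is:

\hat{y} = \arg\max_y \; p(y) \prod_j p(x_j \mid y)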

Reasons to care about independence

  • Assuming full independence between features is too restrictive.
  • Assuming no independence at all means the joint table grows exponentially in the number of variables, and the data covering it is sparse.
  • Estimating that full joint would need too much data.

Chain rule of probability
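For reference, the chain rule factorizes any joint distribution into conditionals:

p(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_1, \dots, x_{i-1})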

Bayes net inference

  • Input:
    • a set of variables whose values are observed (the evidence)
    • a set of variables whose distribution we want to know (the query)
  • Output: the probability distribution over the query variables, given the evidence
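Concretely, with query Q, evidence E = e, and the remaining hidden variables H, this is marginalization plus normalization:

p(Q \mid E = e) = \frac{\sum_{h} p(Q, e, h)}{\sum_{q, h} p(q, e, h)}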

Naive Bayes net learning

  • Input:
    • a set of nodes (RVs)
  • Output:
    • a set of edges (with their probability distributions)

Bayes net learning

  • Input:
    • data x
  • Output:
    • an estimated probability distribution for each node (RV)

Check conditional independence relationships

  • Input: two sets of variables to check, plus a set of evidence variables
  • Output: whether the two sets are conditionally independent given the evidence

Things you need to know:

  1. Convert a probability distribution written out in text to a BN
  2. Convert a BN to a probability distribution
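E.g., take a hypothetical three-node chain A → B → C. Reading the edges off gives the factorization

p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b)

and a distribution written in exactly this factored form converts back to the same chain.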

Lexicographic (topological) ordering:

  • Every node is named after its parents (parents come before their children)

Unsupervised learning

  • Clustering
    • Input: data x
    • Output: a label for each data point (its cluster membership)

Alternating optimization for clustering

  1. Guess the cluster assignments
  2. Repeat until converged:
    • Update memberships to minimize the energy
    • Update cluster identities to minimize the energy

K-means
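A minimal sketch of this alternation as k-means (Python with NumPy; my own illustration under the scheme above):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Guess: pick k random data points as the initial cluster centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Update memberships: assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update cluster identities: move each center to its cluster's mean
        # (real code would also handle the empty-cluster case)
        new_centers = np.array([X[labels == c].mean(axis=0) for c in range(k)])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return labels, centers

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]])
print(kmeans(X, k=2))
```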

Two sticky issues with k-means

  1. How to initialize?
  2. How to pick k?

Supervised Learning

  • Linear regression: regression; loss = SSE (sum of squared errors)
  • Naive Bayes: classification; learned from labeled data by per-class MLE (as above)
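Writing the SSE loss out, for weights w and labeled points (x_i, y_i):

\text{SSE}(w) = \sum_{i} \left( y_i - w^\top x_i \right)^2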