# [Papers] Soft-RL: Maximum Entropy Reinforcement Learning as Probabilistic Inference

The first book you read about reinforcement learning is likely Reinforcement Learning: An Introduction. In it, Richard Sutton and Andrew Barto introduce reinforcement learning in great detail from the viewpoint of maximizing total reward. The book explains reinforcement learning step by step:

# Lecture 27 Positive Definite and its Graphical analysis

Determinants > 0 (every leading principal determinant, not only the full one)

Eigenvalue > 0

Pivots > 0: for a 2×2 matrix the first pivot is a and the second pivot is the determinant divided by a, so the pivots multiply together to give the determinant

x^T A x > 0

Semidefinite when equality with 0 is allowed (x^T A x ≥ 0)
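The tests above can be tried on a small case. This is a minimal sketch for a symmetric 2×2 matrix [[a, b], [b, c]] only; the function names and the sample entries are illustrative assumptions, not taken from the lecture.

```python
import math


def positive_definite_tests(a, b, c):
    """Run three tests on the symmetric 2x2 matrix [[a, b], [b, c]]."""
    det = a * c - b * b
    # Eigenvalues of a symmetric 2x2 matrix from the quadratic formula.
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b * b)
    smallest_eigenvalue = mean - radius
    return {
        # Leading determinants: a and the full determinant.
        "determinants": a > 0 and det > 0,
        # Pivots: a and det / a (the `and` skips the division when a <= 0).
        "pivots": a > 0 and det / a > 0,
        # Both eigenvalues are positive exactly when the smaller one is.
        "eigenvalues": smallest_eigenvalue > 0,
    }


def quadratic_form(a, b, c, x, y):
    """Evaluate x^T A x = a x^2 + 2 b x y + c y^2 for the vector (x, y)."""
    return a * x * x + 2 * b * x * y + c * y * y


print(positive_definite_tests(2, 6, 20))  # every test passes
print(positive_definite_tests(2, 6, 18))  # determinant is 0: only semidefinite
print(quadratic_form(2, 6, 18, 3, -1))    # a nonzero vector where x^T A x = 0
```

The three tests agree with each other on both matrices, which is the point of listing them side by side.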

2. 2D example first

Find the pivot…

# Why Model-Free Reinforcement Learning Failed.

Model-free reinforcement learning (MFRL) became popular again after convolutional neural networks were combined with the Q-learning algorithm. The idea behind MFRL is to use the Bellman equation to model the temporal relation between the present and the future.
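To make the Bellman relation concrete, here is a minimal sketch of one tabular Q-learning update. The tabular setting, the function name `q_learning_update`, and the values of `alpha` and `gamma` are assumptions for illustration; the deep variant mentioned above replaces the table with a neural network.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning step to the table `q` in place and return the new value.

    q maps each state to a list of action values.
    alpha is the learning rate, gamma the discount factor (both assumed here).
    """
    # Bellman target: immediate reward plus discounted best future value.
    target = reward + gamma * max(q[next_state])
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
print(q_learning_update(q_table, state=0, action=1, reward=1.0, next_state=1))
```

Each update uses a single observed transition, so the current value is tied to the future only through the sampled `next_state`.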

However, the data efficiency of MFRL is a huge concern. There are two reasons that…

# Regression Loss Function Duality-Probability Explanation

Learning is a statistical process, and inference is probabilistic. The training samples are drawn from unknown probability distributions Pr(X) and Pr(Y). The neural network outputs an estimate of the conditional distribution Pr(Y|X). Integrating out X gives the distribution of Y: Pr(Y) = ∫ Pr(Y|X) Pr(X) dX.
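For discrete variables the integral becomes a sum, which makes the marginalization easy to check by hand. The sketch below assumes two values for X and two for Y; the probabilities and the name `marginal_y` are made up for illustration.

```python
def marginal_y(p_x, p_y_given_x):
    """Return Pr(Y) from Pr(X) and Pr(Y|X) by summing over X.

    p_x[i] is Pr(X = i); p_y_given_x[i][j] is Pr(Y = j | X = i).
    """
    n_y = len(p_y_given_x[0])
    return [
        sum(p_x[i] * p_y_given_x[i][j] for i in range(len(p_x)))
        for j in range(n_y)
    ]


# Hypothetical distributions: each row of p_y_given_x sums to 1.
p_x = [0.25, 0.75]
p_y_given_x = [[0.5, 0.5], [0.2, 0.8]]
print(marginal_y(p_x, p_y_given_x))
```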

When I started to crack machine learning, I read…

# Lecture 24: Markov and Fourier.

All entries ≥ 0

λ = 1 is one of the eigenvalues. All eigenvalues satisfy |λ| ≤ 1, so the powers of the matrix do not blow up and a steady state exists.
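A quick way to see the steady state is to apply the matrix repeatedly to a starting vector. This sketch assumes the column convention (each column sums to 1) and uses an illustrative 2×2 matrix with eigenvalues 1 and 0.7; the name `steady_state` and the step count are assumptions.

```python
def steady_state(A, u, steps=200):
    """Apply the Markov matrix A to the vector u `steps` times."""
    for _ in range(steps):
        # One step of u_{k+1} = A u_k, written as row-by-vector dot products.
        u = [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in A]
    return u


# Illustrative Markov matrix: entries >= 0, each column sums to 1.
A = [[0.9, 0.2],
     [0.1, 0.8]]
u0 = [0.0, 1000.0]
print(steady_state(A, u0))  # approaches a multiple of the eigenvector for lambda = 1
```

The component along the eigenvector for 0.7 shrinks by that factor every step, so only the λ = 1 part survives, and the total of the entries stays at its starting value.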

2. Find the steady-state of the Markov matrix. The power of the matrix. Similar to what happened before. …

# Lecture 16 Projection Matrices and Least Squares

If b is perpendicular to the column space, its projection is a single point: the zero vector.

N(A^T) is perpendicular to the column space.

Pb = A(A^T A)^-1 A^T b. For any b in N(A^T), A^T b = 0, so Pb = 0.
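The projection formula can be checked numerically without any libraries when A has two independent columns. The sketch below solves the normal equations (A^T A) x = A^T b by hand for that case; the function name `project` and the sample matrix are assumptions for illustration.

```python
def project(A, b):
    """Project b onto the column space of A, where A has exactly two columns."""
    col0 = [row[0] for row in A]
    col1 = [row[1] for row in A]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Entries of A^T A (symmetric 2x2) and of A^T b.
    g00, g01, g11 = dot(col0, col0), dot(col0, col1), dot(col1, col1)
    r0, r1 = dot(col0, b), dot(col1, b)
    # Solve the 2x2 system by Cramer's rule; det != 0 for independent columns.
    det = g00 * g11 - g01 * g01
    x0 = (g11 * r0 - g01 * r1) / det
    x1 = (g00 * r1 - g01 * r0) / det
    # p = A x is the projection Pb.
    return [x0 * c0 + x1 * c1 for c0, c1 in zip(col0, col1)]


A = [[1, 1], [1, 2], [1, 3]]
print(project(A, [1, 2, 2]))   # a general b: projection lies in the column space
print(project(A, [1, -2, 1]))  # b in N(A^T): projection is the zero vector
```

The second call uses a b with A^T b = 0, so the right-hand side of the normal equations vanishes and the projection collapses to zero.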

b in column…

# Lecture 12. An application in Physics: Represent Graph with Matrix.

0. An Application for Chemistry. Equation -> Matrix.

Overall Graph:

# Lecture 9. Independence, Basis, and Dimension with Nullspace.

no c will give… 