
πŸš€ Gradient Descent

경사 ν•˜κ°•λ²•μ΄λž€?​

경사 ν•˜κ°•λ²•μ€ λ¨Έμ‹ λŸ¬λ‹μ—μ„œ μ΅œμ ν™” 문제λ₯Ό ν•΄κ²°ν•˜λŠ” 데 μ‚¬μš©λ˜λŠ” 일반적인 기술

경사 ν•˜κ°•λ²•μ˜ κΈ°λ³Έ κ°œλ…β€‹

경사 ν•˜κ°•λ²•μ€ 손싀 ν•¨μˆ˜μ˜ 기울기λ₯Ό κ³„μ‚°ν•˜μ—¬ μ΅œμ†Œν™”ν•˜λŠ” 방법을 μ œκ³΅ν•©λ‹ˆλ‹€. 이 κΈ°μˆ μ€ λ¨Έμ‹ λŸ¬λ‹ λͺ¨λΈμ˜ νŒŒλΌλ―Έν„°λ₯Ό μ‘°μ •ν•˜μ—¬ 손싀 ν•¨μˆ˜λ₯Ό μ΅œμ†Œν™”ν•˜λŠ” 과정을 ν¬ν•¨ν•©λ‹ˆλ‹€.

경사 ν•˜κ°•λ²•μ˜ μˆ˜μ‹β€‹

경사 ν•˜κ°•λ²•μ˜ μˆ˜μ‹μ€ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

$$ \theta \leftarrow \theta - \eta \nabla J(\theta) $$

Here, $\theta$ is the model's parameters, $\eta$ is the learning rate, and $\nabla J(\theta)$ is the gradient of the loss function with respect to $\theta$.
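
To see the update rule in action, here is a minimal sketch (a toy example, separate from the neural-network code below) that applies it to the loss $J(\theta) = \theta^2$, whose gradient is $2\theta$ and whose minimum is at $\theta = 0$:

# Toy loss J(theta) = theta^2, so the gradient is 2 * theta
def grad_J(theta):
    return 2 * theta

theta = 3.0  # arbitrary starting point
eta = 0.1    # learning rate

# Repeatedly apply theta <- theta - eta * grad_J(theta)
for step in range(50):
    theta = theta - eta * grad_J(theta)

print(theta)  # approximately 0, the minimizer of J

Each update multiplies $\theta$ by $(1 - 2\eta) = 0.8$, so the iterates converge geometrically toward the minimum.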

Gradient Descent Python Code Example

import numpy as np

# Define the sigmoid function as the activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid function
def sigmoid_prime(x):
    return sigmoid(x) * (1 - sigmoid(x))

# Example input values
X = np.array([0.5, 0.8])
# Target value
y = 0.2
# Initial input-to-output weights (seeded for reproducibility)
np.random.seed(42)
weights = np.random.rand(2)

# Learning rate
learning_rate = 0.5

# Network output (y-hat)
h = np.dot(X, weights)
nn_output = sigmoid(h)

# Output error (y - y-hat)
error = y - nn_output

# Error term: the output error scaled by the derivative of the sigmoid,
# i.e. how the error propagates back through the activation
output_gradient = error * sigmoid_prime(h)

# Gradient descent step: weight change = learning rate * error term * input
del_w = learning_rate * output_gradient * X

# Update the weights
weights += del_w

print(weights)
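
The code above performs only a single weight update. As a minimal sketch of full training, the same computation can be repeated for several epochs (the epoch count of 100 here is an arbitrary choice), reusing the functions and variables defined above:

# Repeat the forward pass and gradient descent update for several epochs
for epoch in range(100):
    h = np.dot(X, weights)
    nn_output = sigmoid(h)
    error = y - nn_output
    output_gradient = error * sigmoid_prime(h)
    weights += learning_rate * output_gradient * X
    if epoch % 20 == 0:
        print(f"epoch {epoch}: error = {error:.6f}")

Because there is only one training example, the error shrinks steadily toward zero; with a real dataset, the gradient would be averaged over a batch (batch gradient descent) or computed per example (stochastic gradient descent).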