http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#tensors

Learning PyTorch with Examples

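This tutorial introduces the fundamental concepts of PyTorch through self-contained examples.

At its core, PyTorch provides two main features:

An n-dimensional Tensor, similar to a numpy array but able to run on GPUs
Automatic differentiation for building and training neural networks

The examples that follow all use the same fully-connected ReLU network. The network has a single hidden layer and is trained with gradient descent on random data, minimizing the Euclidean distance between the network output and the true output.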
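
As a compact reference, the network and objective can be sketched as follows. The symbols $W_1$, $W_2$, $h$, $\hat{y}$ and the learning rate $\eta$ are notation chosen here to mirror the variable names `w1`, `w2`, `h`, `y_pred` and `learning_rate` used in the code; they do not appear in the original text. Note that the code minimizes the sum of squared differences, that is, the squared Euclidean distance.

```latex
h = x W_1, \qquad
\hat{y} = \max(0, h)\, W_2, \qquad
L = \sum_{i,j} \left( \hat{y}_{ij} - y_{ij} \right)^2,
\qquad
W_k \leftarrow W_k - \eta \, \frac{\partial L}{\partial W_k} \quad (k = 1, 2)
```

Here $x$ has shape $N \times D_{in}$, $W_1$ has shape $D_{in} \times H$, and $W_2$ has shape $H \times D_{out}$.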

Note: The individual examples can be found at the end of this document.

Tensors

Warm-up: numpy

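Before introducing PyTorch, we first implement the network using numpy.

Numpy provides an n-dimensional array object and many functions for manipulating these arrays. It is a widely used framework for scientific computing, but it knows nothing about computation graphs, deep learning, or gradients. Even so, a two-layer network fitted to random data is simple enough that we can implement its forward and backward passes by hand using only numpy operations.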

# -*- coding: utf-8 -*-
import numpy as np

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backprop to compute gradients of loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0  # ReLU passes no gradient where its input was negative
    grad_w1 = x.T.dot(grad_h)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
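
A hand-written backward pass is easy to get subtly wrong, so it can help to compare it against finite differences. The following is a minimal sketch of such a check, not part of the original tutorial: the helper names `loss_and_grads` and `numerical_grad`, the tiny layer sizes, and the step size `eps` are all assumptions chosen for illustration.

```python
import numpy as np


def loss_and_grads(x, y, w1, w2):
    """Forward pass and manual backward pass, mirroring the example above."""
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)
    loss = np.square(y_pred - y).sum()

    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h = grad_y_pred.dot(w2.T)
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)
    return loss, grad_w1, grad_w2


def numerical_grad(f, w, eps=1e-6):
    """Central finite differences of the scalar f() with respect to each entry of w."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        plus = f()
        w[idx] = old - eps
        minus = f()
        w[idx] = old  # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
N, D_in, H, D_out = 4, 5, 3, 2  # hypothetical tiny sizes keep the check fast
x = rng.standard_normal((N, D_in))
y = rng.standard_normal((N, D_out))
w1 = rng.standard_normal((D_in, H))
w2 = rng.standard_normal((H, D_out))

_, grad_w1, grad_w2 = loss_and_grads(x, y, w1, w2)


def current_loss():
    return loss_and_grads(x, y, w1, w2)[0]


err_w1 = np.abs(numerical_grad(current_loss, w1) - grad_w1).max()
err_w2 = np.abs(numerical_grad(current_loss, w2) - grad_w2).max()
print("max abs error:", err_w1, err_w2)
```

Both errors should be very small if the analytic gradients are right. The check uses float64 and small matrices because finite differences are slow and sensitive to rounding.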


PyTorch: Tensors

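Numpy is a great framework, but it cannot use GPUs to accelerate its numerical computations. For modern deep neural networks, GPUs are often reported to provide speedups of 50x or more, so unfortunately numpy alone is not well suited to deep learning.

We now introduce the most fundamental PyTorch concept: the Tensor. A PyTorch Tensor is conceptually identical to a numpy array: it is an n-dimensional array, and PyTorch provides many functions for operating on it. Like numpy arrays, the Tensors in this section are used without any notion of deep learning, computational graphs, or gradients; they serve as a generic tool for scientific computing.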
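
To make the correspondence concrete, here is a small sketch that runs the same two operations used in the examples (matrix product and ReLU) in both libraries and compares the results. The array values and variable names are arbitrary choices made here for illustration.

```python
import numpy as np
import torch

a_np = np.array([[1.0, -2.0], [3.0, 4.0]])
b_np = np.array([[0.5, 1.0], [-1.0, 2.0]])

# torch.from_numpy wraps the numpy data in a Tensor (sharing memory)
a_t = torch.from_numpy(a_np)
b_t = torch.from_numpy(b_np)

# Matrix product: ndarray.dot in numpy, Tensor.mm in PyTorch
prod_np = a_np.dot(b_np)
prod_t = a_t.mm(b_t)

# ReLU: np.maximum(., 0) in numpy, Tensor.clamp(min=0) in PyTorch
relu_np = np.maximum(a_np, 0)
relu_t = a_t.clamp(min=0)

# Tensor.numpy() converts back so the results can be compared
same_product = np.allclose(prod_np, prod_t.numpy())
same_relu = np.allclose(relu_np, relu_t.numpy())
print(same_product, same_relu)
```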

Unlike numpy, however, PyTorch Tensors can use GPUs to accelerate their numerical computations. To run a PyTorch Tensor on the GPU, you simply cast it to a new datatype.
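
The examples in this post select the GPU by casting to `torch.cuda.FloatTensor`. As an alternative sketch, assuming PyTorch 0.4 or later where `torch.device` and `Tensor.to` are available, the same idea can be written so that it falls back to the CPU when no GPU is present. The tensor shapes below are taken from the examples; the variable names are chosen here for illustration.

```python
import torch

# Assumption: PyTorch 0.4 or later, where torch.device and Tensor.to exist.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(64, 1000)                  # created on the CPU
x_dev = x.to(device)                       # moved to the GPU if one is available
w = torch.randn(1000, 100, device=device)  # or created on the device directly

# The product is computed on whichever device holds both operands
h = x_dev.mm(w)
print(h.device, h.shape)
```

Both operands of an operation must live on the same device, which is why `x` is moved before the matrix product.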

We now use PyTorch Tensors to fit the two-layer network to random data. As in the numpy example above, we implement the forward and backward passes by hand.

# -*- coding: utf-8 -*-

import torch


dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in).type(dtype)
y = torch.randn(N, D_out).type(dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H).type(dtype)
w2 = torch.randn(H, D_out).type(dtype)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    print(t, loss)

    # Backprop to compute gradients of loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0  # ReLU passes no gradient where its input was negative
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
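
The second feature mentioned at the start, automatic differentiation, offers a convenient way to check the manual backward pass. The sketch below is not part of the original example: it assumes PyTorch 0.4 or later (where Tensors accept `requires_grad=True`), uses small hypothetical layer sizes, and switches to float64 so the comparison is not dominated by rounding.

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 8, 20, 10, 5  # hypothetical small sizes for the check
x = torch.randn(N, D_in, dtype=torch.float64)
y = torch.randn(N, D_out, dtype=torch.float64)
w1 = torch.randn(D_in, H, dtype=torch.float64, requires_grad=True)
w2 = torch.randn(H, D_out, dtype=torch.float64, requires_grad=True)

# Forward pass, identical to the example above
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()

# Gradients computed by autograd are stored in w1.grad and w2.grad
loss.backward()

# Manual gradients; no_grad keeps these operations out of the autograd graph
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h = grad_y_pred.mm(w2.t())
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

match_w1 = torch.allclose(grad_w1, w1.grad)
match_w2 = torch.allclose(grad_w2, w2.grad)
print(match_w1, match_w2)
```

If both comparisons succeed, the hand-derived formulas agree with what autograd computes for the same forward pass.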

