K Means using PyTorch
PyTorch implementation of kmeans for utilizing GPU
Getting Started
import torch
import numpy as np
from kmeans_pytorch import kmeans
# data
data_size, dims, num_clusters = 1000, 2, 3
x = np.random.randn(data_size, dims) / 6
x = torch.from_numpy(x)
# kmeans
cluster_ids_x, cluster_centers = kmeans(
X=x, num_clusters=num_clusters, distance='euclidean', device=torch.device('cuda:0')
)
see example.ipynb
for a more elaborate example
Requirements
- PyTorch version >= 1.0.0
- Python version >= 3.6
Installation
install with pip
:
pip install kmeans-pytorch
Installing from source
To install from source and develop locally:
git clone https://github.com/subhadarship/kmeans_pytorch
cd kmeans_pytorch
pip install --editable .
CPU vs GPU
see cpu_vs_gpu.ipynb
for a comparison between CPU and GPU
Notes
- useful when clustering large number of samples
- utilizes GPU for faster matrix computations
- support euclidean and cosine distances (for now)
Credits
- This implementation closely follows the style of this
- Documentation is done using the awesome theme jekyllbook