Download the code and assignment instructions
This assignment implements K-means clustering and Principal Component Analysis (PCA).
The following code files need to be completed:
- pca.m
- projectData.m
- recoverData.m
- findClosestCentroids.m
- computeCentroids.m
- kMeansInitCentroids.m
pca.m
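
    function [U, S] = pca(X)
        % Run PCA on X (m examples x n features). X is assumed to be
        % mean-normalized already, so X' * X / m is the covariance matrix.
        m = size(X, 1);

        % Covariance matrix of the features (n x n).
        Sigma = (1 / m) * (X' * X);

        % Columns of U are the principal components; diag(S) holds the
        % variance along each of them.
        [U, S, ~] = svd(Sigma);
    end
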
projectData.m

    function Z = projectData(X, U, K)
        % Project X onto the top K principal components,
        % i.e. the first K columns of U.
        U_reduce = U(:, 1:K);
        Z = X * U_reduce;
    end

recoverData.m

    function X_rec = recoverData(Z, U, K)
        % Map the K-dimensional projection Z back to the original space.
        % The result only approximates X unless K equals the number of features.
        X_rec = Z * U(:, 1:K)';
    end

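The three PCA functions above chain together as covariance, SVD, projection, and reconstruction. The script below is a minimal sketch of that round trip using only built-in functions, so it runs without the assignment files. The tiny matrix `X` and the `retained` variable (the fraction of variance kept by the first `K` components) are illustrative assumptions, not part of the assignment.

```matlab
% Toy data: 4 examples, 2 features, columns already have zero mean.
X = [2 0; 0 1; -2 0; 0 -1];
m = size(X, 1);

% Same steps as pca.m: covariance matrix, then SVD.
Sigma = (1 / m) * (X' * X);
[U, S, ~] = svd(Sigma);

% Same steps as projectData.m and recoverData.m with K = 1.
K = 1;
Z = X * U(:, 1:K);          % 4 x 1 projection
X_rec = Z * U(:, 1:K)';     % 4 x 2 approximation of X

% Fraction of the total variance kept by the first K components.
s = diag(S);
retained = sum(s(1:K)) / sum(s);   % 0.8 for this data

disp(X_rec);
disp(retained);
```

Here the first principal component lies along the first feature, so the reconstruction keeps the first column of `X` and zeroes out the second.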
findClosestCentroids.m

    function idx = findClosestCentroids(X, centroids)
        % idx(i) is the index of the centroid closest to example X(i, :).
        K = size(centroids, 1);
        idx = zeros(size(X, 1), 1);

        for i = 1 : size(X, 1)
            % Start with centroid 1 as the best candidate.
            min_distance = sum((X(i, :) - centroids(1, :)) .^ 2);
            idx(i) = 1;
            for j = 2 : K
                % Squared Euclidean distance to centroid j.
                distance_tmp = sum((X(i, :) - centroids(j, :)) .^ 2);
                if distance_tmp < min_distance
                    min_distance = distance_tmp;
                    idx(i) = j;
                end
            end
        end
    end

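The double loop above can also be written without loops. The following is a possible vectorized sketch, not the version used in this post. It expands the squared distance as ||x||^2 - 2 x·c + ||c||^2 and relies on implicit broadcasting, which is available in Octave and in MATLAB R2016b or later. The sample `X` and `centroids` are made-up values for illustration.

```matlab
% Toy data: 4 examples and 2 centroids in 2 dimensions.
X = [1 1; 1 2; 8 8; 9 8];
centroids = [1 1.5; 8.5 8];

% D(i, j) = squared Euclidean distance from X(i, :) to centroids(j, :).
% (m x 1) - (m x K) + (1 x K) broadcasts to an m x K matrix.
D = sum(X .^ 2, 2) - 2 * (X * centroids') + sum(centroids .^ 2, 2)';

% For each row, take the column index of the smallest distance.
% On ties min returns the first index, matching the strict "<" in the loop.
[~, idx] = min(D, [], 2);

disp(idx');   % 1 1 2 2
```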
computeCentroids.m

    function centroids = computeCentroids(X, idx, K)
        % Row i of centroids is the mean of the examples assigned to cluster i.
        n = size(X, 2);
        centroids = zeros(K, n);

        for i = 1 : K
            members = (idx == i);   % logical mask of the examples in cluster i
            k = sum(members);       % number of examples in cluster i
            % Note: an empty cluster (k == 0) yields a row of NaN.
            centroids(i, :) = (1 / k) * sum(X(members, :), 1);
        end
    end

kMeansInitCentroids.m

    function centroids = kMeansInitCentroids(X, K)
        % Pick K distinct examples at random as the initial centroids.
        randidx = randperm(size(X, 1));
        centroids = X(randidx(1:K), :);
    end

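The three K-means functions are meant to be used together: initialize the centroids, then alternate between assigning examples and recomputing centroids. The script below is a rough, self-contained sketch of that loop written with built-ins only, so the steps are inlined rather than calling the files above. The toy data, the fixed initial centroids (used instead of `randperm` so the result is reproducible), and the iteration count are all assumptions made for illustration.

```matlab
% Toy data: two well-separated groups of two points each.
X = [1 1; 1 2; 8 8; 9 8];
K = 2;
max_iters = 5;

% Initialization: kMeansInitCentroids picks random rows of X;
% here rows 1 and 3 are fixed so that the run is deterministic.
centroids = X([1 3], :);
idx = zeros(size(X, 1), 1);

for iter = 1 : max_iters
    % Assignment step (same idea as findClosestCentroids).
    for i = 1 : size(X, 1)
        [~, idx(i)] = min(sum((centroids - X(i, :)) .^ 2, 2));
    end
    % Update step (same idea as computeCentroids).
    for k = 1 : K
        centroids(k, :) = mean(X(idx == k, :), 1);
    end
end

disp(idx');        % 1 1 2 2
disp(centroids);   % rows: [1 1.5] and [8.5 8]
```

With real data the outcome depends on the random initialization, so running the loop from several different starting points and comparing the results is a common precaution.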