library(tidyverse)  # provides read_csv(), ggplot(), and %>%
hwlink <- ""
hwdata <- read_csv(hwlink)
# Scatter plot of the raw data
ggplot(hwdata, aes(x = x, y = y)) + geom_point()
- Do you see any type of clusters/groups right away?
Yes, two groups are visible right away.
# k-means starts from random centers, so fix the seed for reproducible labels
set.seed(123)
km_clusterThis <- kmeans(hwdata[, c("x", "y")], centers = 2)
km_results <- as.factor(km_clusterThis$cluster)
ggplot(hwdata, aes(x = x, y = y, color = km_results)) + geom_point()
# Hierarchical clustering: Euclidean distances, single linkage, cut into 2 groups
dis <- dist(hwdata[, c("x", "y")])
hc_hwdata <- hclust(dis, method = "single")
cut2 <- cutree(hc_hwdata, k = 2)
hc_result <- as.factor(cut2)
ggplot(hwdata, aes(x = x, y = y, color = hc_result)) + geom_point()
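To compare the two methods without relying on the plots alone, the cluster labels can be cross-tabulated. The sketch below is only an illustration: since the homework file is not available here, it assumes a hypothetical data set of two concentric rings (the names rings, ring, km, and hc are made up for this example) and is not the actual hwdata.

```r
# Hypothetical data: two noisy concentric rings (an assumption, not hwdata)
set.seed(1)
n <- 200
ang <- runif(2 * n, 0, 2 * pi)
rad <- c(rep(1, n), rep(5, n)) + rnorm(2 * n, sd = 0.1)
rings <- data.frame(
  x = rad * cos(ang),
  y = rad * sin(ang),
  ring = rep(c("inner", "outer"), each = n)
)

# k-means with several random starts, and single-linkage clustering cut at 2
km <- kmeans(rings[, c("x", "y")], centers = 2, nstart = 20)
hc <- cutree(hclust(dist(rings[, c("x", "y")]), method = "single"), k = 2)

# Rows are the true rings, columns are the assigned clusters
print(table(ring = rings$ring, kmeans = km$cluster))
print(table(ring = rings$ring, single = hc))
```

On data shaped like this, k-means splits the plane with a straight boundary, so it cannot keep one ring entirely apart from the ring that surrounds it, while single linkage follows the chain of neighboring points along each ring.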
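# Polar coordinates: r is the distance from the origin, theta the angle.
# atan2(y, x) returns the full angle in (-pi, pi]; atan(y / x) would fold
# opposite quadrants onto the same value.
polar <- hwdata %>% transform(r = sqrt(x^2 + y^2), theta = atan2(y, x))
km_polar <- kmeans(polar[, c("r", "theta")], centers = 2)
km_results2 <- as.factor(km_polar$cluster)
ggplot(polar, aes(x = r, y = theta, color = km_results2)) + geom_point()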
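If the two groups differ mainly in their distance from the origin, the angle carries no cluster information, and clustering on r alone may be enough. The sketch below illustrates this under the same assumption of hypothetical ring-shaped data (rings, ring, and km_r are names invented for this example); whether it applies to hwdata depends on the actual shape of that data.

```r
# Hypothetical data: two noisy concentric rings (an assumption, not hwdata)
set.seed(1)
n <- 200
ang <- runif(2 * n, 0, 2 * pi)
rad <- c(rep(1, n), rep(5, n)) + rnorm(2 * n, sd = 0.1)
rings <- data.frame(
  x = rad * cos(ang),
  y = rad * sin(ang),
  ring = rep(c("inner", "outer"), each = n)
)

# Convert to polar coordinates
polar <- transform(rings, r = sqrt(x^2 + y^2), theta = atan2(y, x))

# k-means on the radius only; drop = FALSE keeps a one-column data frame
km_r <- kmeans(polar[, "r", drop = FALSE], centers = 2, nstart = 20)

# Rows are the true rings, columns are the assigned clusters
print(table(ring = polar$ring, cluster = km_r$cluster))
```

Leaving theta out also avoids a scaling issue: the angle spans several units, so it can contribute as much to the Euclidean distance as the radius does, or more.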
I think the polar transformation separates the clusters better. In problem 1, the two groups are very close to each other at some points, whereas after the transformation we can clearly identify two groups. If the groups were customers, a marketing campaign targeted at these cleaner segments could give us better results.