What is multi-objective clustering
Search for artificial 2D data to demonstrate the properties of clustering algorithms
I'm looking for data sets of two-dimensional data points (each data point is a vector of two values (x, y)) that follow different distributions and shapes. Code to generate such data would also be helpful. I want to use them to plot / visualize the performance of some clustering algorithms. Here are some examples:
R has a lot of data sets, and it doesn't seem like a big deal to reproduce most of the examples you cited with just a few lines of code. The mlbench package may also be helpful, especially synthetic data sets that start with. Some illustrations are given below.
Further examples can be found in the Cluster Task View on CRAN. The fpc package, for example, has a built-in generator for "face-shaped" cluster benchmark data sets.
Similar considerations apply to Python, where you can find interesting benchmark tests and data sets for clustering in scikit-learn.
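As a minimal sketch of the scikit-learn route, the generators in sklearn.datasets can produce labelled 2D toy data in a few lines. The helper name make_toy_datasets and all parameter values (sample count, noise levels, seed) are illustrative assumptions, not something taken from the answer above.

```python
from sklearn.datasets import make_blobs, make_circles, make_moons


def make_toy_datasets(n_samples=300, seed=0):
    """Return {name: (X, y)}; X has shape (n_samples, 2), y holds the true labels.

    The parameter values below are arbitrary choices for illustration.
    """
    return {
        # Three isotropic Gaussian clusters (n_features defaults to 2).
        "blobs": make_blobs(n_samples=n_samples, centers=3,
                            cluster_std=0.8, random_state=seed),
        # Two interleaving half circles.
        "moons": make_moons(n_samples=n_samples, noise=0.05,
                            random_state=seed),
        # A small circle inside a larger one.
        "circles": make_circles(n_samples=n_samples, noise=0.05,
                                factor=0.5, random_state=seed),
    }


datasets = make_toy_datasets()
for name, (X, y) in datasets.items():
    print(name, X.shape, sorted(set(y.tolist())))
```

Each (X, y) pair can be passed directly to a clustering estimator and a scatter plot, with y kept aside as the reference labelling.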
The UCI Machine Learning Repository also hosts a lot of data sets, but you are probably better off simulating the data yourself in the language of your choice.
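Simulating such data yourself needs very little code. The following sketch uses only the Python standard library to build a Gaussian blob, a noisy ring around it, and a second, wider blob. The function names, cluster positions, and noise levels are all made up for illustration.

```python
import math
import random


def gaussian_blob(n, center, std, label, rng):
    """n points drawn from an isotropic Gaussian around center."""
    cx, cy = center
    return [(rng.gauss(cx, std), rng.gauss(cy, std), label) for _ in range(n)]


def noisy_ring(n, radius, noise, label, rng):
    """n points spread uniformly in angle around a circle, with radial noise."""
    points = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        r = radius + rng.gauss(0.0, noise)
        points.append((r * math.cos(angle), r * math.sin(angle), label))
    return points


def make_dataset(seed=42):
    """Return a shuffled list of (x, y, label) tuples; the seed makes it repeatable."""
    rng = random.Random(seed)
    data = []
    data += gaussian_blob(100, (0.0, 0.0), 0.3, 0, rng)   # dense core
    data += noisy_ring(200, 3.0, 0.15, 1, rng)            # ring around the core
    data += gaussian_blob(100, (6.0, 0.0), 0.6, 2, rng)   # wider blob to the side
    rng.shuffle(data)
    return data


points = make_dataset()
print(len(points), points[0])
```

A ring enclosing a blob is a shape that centroid-based methods such as k-means typically split incorrectly, which makes it a useful contrast case next to the plain Gaussian clusters.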
This clustering benchmark of toy data sets contains various data sets in ARFF format (which can easily be converted to CSV), mostly with ground truth labels. The benchmark is meant to validate the basic desired properties of clustering algorithms. Most of the data sets come from clustering papers such as:
- BIRCH - Zhang, Tian, Raghu Ramakrishnan and Miron Livny. "BIRCH: An Efficient Data Clustering Method for Very Large Databases." ACM SIGMOD Record. Vol. 25. No. 2. ACM, 1996.
- CURE - Guha, Sudipto, Rajeev Rastogi and Kyuseok Shim. "CURE: An Efficient Clustering Algorithm for Large Databases." ACM SIGMOD Record. Vol. 27. No. 2. ACM, 1998.
- Chameleon - Karypis, George, Eui-Hong Han and Vipin Kumar. "Chameleon: Hierarchical Clustering Using Dynamic Modeling." Computer 32.8 (1999): 68-75.
- The Fundamental Clustering Problem Suite - Ultsch, A. "Clustering with SOM: U*C." In Proc. Workshop on Self-Organizing Maps, Paris, France (2005), pp. 75-82.
- MOCK - Handl, Julia and Joshua Knowles. "An Evolutionary Approach to Multi-Objective Clustering." Evolutionary Computation, IEEE Transactions on 11.1 (2007): 56-76.
- Robust path-based spectral clustering - Chang, Hong and Dit-Yan Yeung. "Robust Path-Based Spectral Clustering." Pattern Recognition 41.1 (2008): 191-203.
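The ARFF-to-CSV conversion mentioned above can be sketched with the standard library alone. This is a simplified reader that assumes dense ARFF files with unquoted, space-free attribute names. The function name arff_to_csv and the sample content are hypothetical and not taken from the benchmark.

```python
import csv
import io


def arff_to_csv(arff_text):
    """Convert dense ARFF text to CSV text, using @attribute names as the header.

    Simplifying assumptions: no sparse rows, no commas inside quoted values,
    and attribute names without spaces.
    """
    names, rows, in_data = [], [], False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([v.strip().strip("'\"") for v in line.split(",")])
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>": keep only the name
            names.append(line.split(None, 2)[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical toy file, not one of the benchmark data sets.
sample = """% toy example
@relation toy
@attribute x numeric
@attribute y numeric
@attribute class {0,1}
@data
0.5,1.25,0
3.0,4.5,1
"""
csv_text = arff_to_csv(sample)
print(csv_text)
```

For real files, a dedicated reader such as scipy.io.arff.loadarff is likely to be more robust than this sketch.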
ELKI comes with some data sets (also check the unit tests, which contain many more than the ones on the website, along with their parameter settings).
It also includes a fairly flexible data generator.
Here is a customizable cluster generator. It only covers a certain class of data sets, but it can certainly be used for investigating clustering algorithms.
Here is an example of the type of clusters that can be created:
The cluster membership is saved in a text file. The code is open source under the MIT license.
This Matlab script generates 2D data for clustering. Several parameters are accepted so the data generated will meet user requirements.
I can't believe no one mentioned Fisher's Iris data.
I don't think I've seen any clustering technique where the iris data did not serve as an example.
Simply enter "iris" in R to access the data.
Here is an example of a nice (and typical) iris display: http://ygc.name/2011/12/24/ml-class-7-kmeans-clustering/