

Eclipsing-binary Stars Classification applying Ensembled Weka in Astrogrid


This research aims to incorporate into the Virtual Observatory (VO) infrastructure facilities for solving astronomical problems with data mining methods. Existing approaches are analyzed, and preference is given to ensembles of data mining algorithms. An architecture (Ensembled Weka) is proposed for incorporating the Weka system into the VO infrastructure. The paper presents the results of implementing this architecture and shows the advantages of using VO facilities, including Ensembled Weka, to solve a specific problem.

Eclipsing (photometric) binaries are binary stars in which one component periodically eclipses the other, causing variations in the apparent total brightness of the system. The eclipses occur because the line of sight lies almost in the orbital plane of the stars. Several catalogues of eclipsing binaries exist, e.g., the General Catalogue of Variable Stars (GCVS); A Finding List for Observers of Interacting Binary Systems, 5th Edition; and Eclipsing variables in microlensing surveys. Prof. Oleg Malkov collected data from these catalogues into a single catalogue (Malkov 2007), which currently contains information about 6675 binaries. A class is pre-determined for 1161 stars in this collection.

The results obtained by a set of selected data mining algorithms (an ensemble) are processed by a generalizing function. For example, in the case of classification with conventional voting, the number of algorithms that assigned a given class to an object is counted for each object, and the class that collected the most votes is chosen. The number of votes is stored as a new attribute, the confidence index.
The ensemble produces a new table containing a result that corresponds to the kind of problem being solved. The scheme of the ensemble's operation is shown above.
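
The voting step described above can be sketched in Python. This is a minimal illustration, not the Ensembled Weka implementation: the function names `vote` and `classify_table`, the sample object identifiers and predictions, and the tie-breaking rule (the label encountered first wins) are all assumptions made for the example.

```python
from collections import Counter


def vote(labels):
    """Return (winning_class, confidence_index) for one object.

    labels: classes assigned to the object, one per algorithm in the ensemble.
    Ties go to the label encountered first (an assumption; the source does
    not say how ties are handled).
    """
    counts = Counter(labels)
    winner, votes = counts.most_common(1)[0]
    return winner, votes


def classify_table(predictions):
    """Build the result table: one row per object with class and confidence index."""
    table = []
    for object_id, labels in predictions.items():
        winner, votes = vote(labels)
        table.append({"id": object_id, "class": winner, "confidence": votes})
    return table


# Hypothetical output of a five-algorithm ensemble for two objects.
predictions = {
    "star-1": ["SA", "SA", "DM", "SA", "SA"],
    "star-2": ["CWA", "CWW", "CWA", "DM", "CWA"],
}
result = classify_table(predictions)
for row in result:
    print(row)
```

Storing the vote count next to the class keeps the agreement level of the ensemble available to later steps, which can then filter on it.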

Astrogrid Applications Used:

  • FormatConverter (ivo://
    This application converts tables between formats. Here it is used to convert data from the native Weka format (ARFF) to VOTable.
  • Weka Classifier (ivo://
    This application classifies the data in an input table. In addition to the table, it receives a configuration file that specifies the structure of classes, the Weka algorithms to be included in the ensemble, and other required parameters.
An example configuration file, used for the eclipsing-binary classification problem, can be found here.


This file, together with the catalogue of binaries, was stored in MySpace. As a result of the ensemble's work, 5514 binaries were classified, with the following class distribution:

  Class   Binaries
  C       852
  CB      89
  CBF     74
  CBV     149
  CE      15
  CG      1
  CW      84
  CWA     427
  CWW     331
  S       547
  S2C     3
  SA      1902
  SC      1
  SH      13
  D       553
  DG      41
  DM      422
  DR      10

A threshold of 7 was used for the confidence index. Binaries classified with a confidence index below the threshold received an incomplete classification.
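
As a sanity check on the figures above, the per-class counts can be summed and compared with the reported total, and the threshold rule can be written as a simple predicate. The sketch below is in Python; the function name `is_complete` is hypothetical. The source does not describe how an incomplete class label is derived, so the sketch only flags such objects rather than assigning them a label.

```python
# Class distribution as reported in the text (class -> number of binaries).
distribution = {
    "C": 852, "CB": 89, "CBF": 74, "CBV": 149, "CE": 15, "CG": 1,
    "CW": 84, "CWA": 427, "CWW": 331,
    "S": 547, "S2C": 3, "SA": 1902, "SC": 1, "SH": 13,
    "D": 553, "DG": 41, "DM": 422, "DR": 10,
}

# The per-class counts should add up to the reported number of classified binaries.
total = sum(distribution.values())
print(total)  # 5514

CONFIDENCE_THRESHOLD = 7  # threshold stated in the text


def is_complete(confidence, threshold=CONFIDENCE_THRESHOLD):
    """True if the confidence index reaches the threshold.

    Objects below the threshold are the ones that receive an incomplete
    classification; how that label is built is not modelled here.
    """
    return confidence >= threshold


print(is_complete(8), is_complete(6))  # True False
```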

Related Publications

Supported by Synthesis Group