MSC (Multiple Star Classifier)
The Multiple Star Classifier (MSC) work package uses the BP/RP spectrum to estimate the astrophysical parameters (APs) of sources identified by DSC as unresolved physical binaries. At present MSC estimates the extinction A0, metallicity [Fe/H], and the brightness ratio of the system as a whole, as well as the effective temperatures Teff and surface gravities log g of both components.
The APs are estimated using ExtraTrees (extremely randomized trees) regression and an empirical training set. It is planned to extend this approach to forward-model inference.
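As an illustration of this kind of regression, the following minimal sketch fits a multi-output ExtraTrees model to toy data. It assumes scikit-learn's ExtraTreesRegressor as a stand-in, since the library MSC actually uses is not stated here. The "spectra", the two APs, their ranges, and the hyperparameters are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)

# Hypothetical toy data: 500 sources, 20 spectral coefficients, 2 APs each.
n_sources, n_coeffs = 500, 20
aps = rng.uniform([4000.0, 3.0], [8000.0, 5.0], size=(n_sources, 2))  # (Teff, log g)

# Made-up "spectra": a noisy linear function of the standardized APs.
mixing = rng.normal(size=(2, n_coeffs))
scaled = (aps - aps.mean(axis=0)) / aps.std(axis=0)
spectra = scaled @ mixing + 0.1 * rng.normal(size=(n_sources, n_coeffs))

# Train on the first 400 sources, test on the remaining 100.
train, test = slice(0, 400), slice(400, None)
model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(spectra[train], aps[train])    # one model predicts all APs jointly
predicted = model.predict(spectra[test])  # array of shape (100, 2)
```

Comparing the columns of predicted against aps[test] gives one predicted-versus-true comparison per AP.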
MSC constructs its training and testing data by generating physically plausible binaries. The age and metallicity of each system are drawn at random, while the masses are drawn from a Kroupa IMF and paired at random. Spectra for the two components are then produced individually and combined into a single spectrum. The MARCS spectral library is used to interpolate the spectrum of each star at the corresponding parameters.
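The generation steps above can be sketched roughly as follows. This is not the MSC code. The mass limits, age and metallicity ranges, the ordering of the components by mass, and the plain flux sum used to combine the spectra are all assumptions made for illustration. Only the two Kroupa slopes above 0.08 solar masses are taken from the standard form of that IMF.

```python
import random

# Kroupa slopes for dN/dm proportional to m**(-alpha); mass limits are hypothetical.
M_MIN, M_BREAK, M_MAX = 0.1, 0.5, 10.0
ALPHA_LOW, ALPHA_HIGH = 1.3, 2.3

def _power_integral(a, b, alpha):
    """Integral of m**(-alpha) from a to b (alpha != 1)."""
    p = 1.0 - alpha
    return (b**p - a**p) / p

def _sample_power_law(a, b, alpha, u):
    """Inverse-CDF draw from m**(-alpha) on [a, b] for u in [0, 1)."""
    p = 1.0 - alpha
    return (a**p + u * (b**p - a**p)) ** (1.0 / p)

def sample_kroupa_mass(rng):
    """Draw one mass from the two-segment broken power law."""
    w_low = _power_integral(M_MIN, M_BREAK, ALPHA_LOW)
    # The factor M_BREAK**(ALPHA_HIGH - ALPHA_LOW) keeps the density continuous at the break.
    w_high = M_BREAK ** (ALPHA_HIGH - ALPHA_LOW) * _power_integral(
        M_BREAK, M_MAX, ALPHA_HIGH
    )
    if rng.random() < w_low / (w_low + w_high):
        return _sample_power_law(M_MIN, M_BREAK, ALPHA_LOW, rng.random())
    return _sample_power_law(M_BREAK, M_MAX, ALPHA_HIGH, rng.random())

def make_binary(rng):
    """Pair two independent mass draws and assign shared system properties."""
    m_a, m_b = sample_kroupa_mass(rng), sample_kroupa_mass(rng)
    return {
        "age_gyr": rng.uniform(0.1, 10.0),  # hypothetical range
        "feh": rng.uniform(-2.0, 0.5),      # hypothetical range
        "m1": max(m_a, m_b),                # assumed convention: m1 >= m2
        "m2": min(m_a, m_b),
    }

def combine_spectra(flux_primary, flux_secondary):
    """Assumed combination rule: add the component fluxes bin by bin."""
    return [f1 + f2 for f1, f2 in zip(flux_primary, flux_secondary)]
```

A training sample would call make_binary with a seeded random.Random instance, look up a spectrum for each component (the interpolation in the spectral library is omitted here), and pass the two flux lists to combine_spectra.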
The figure on the right shows the results for a test data set limited to brightness ratio BR = log10(L1/L2) < 1.5. Teff is predicted quite accurately for the primary stars, but the predictions for the secondary components correlate poorly with the true values. This is because the surface temperature of the secondary generally leaves only a weak signature in the data, so the regressor effectively assigns values drawn at random from the training data distribution.
MSC performance thus depends not only on the G magnitude of the system but also on the brightness ratio of the two components. The data set used for training has an exponentially decreasing distribution in BR from 0 to 5, so the majority of the training set has BR < 2. At BR = 2 the secondary component is 100 times fainter than the primary, so we should not expect good performance in estimating the APs of the secondary component.
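The arithmetic behind these statements can be checked with a short sketch. The function names are hypothetical, and the exponential scale of the BR distribution is not given in the text, so the value used in the usage note below is only a placeholder.

```python
import math

def luminosity_ratio(br):
    """L1/L2 for a brightness ratio BR = log10(L1/L2)."""
    return 10.0 ** br

def secondary_fraction(br):
    """Fraction of the total light contributed by the secondary, L2/(L1+L2)."""
    return 1.0 / (1.0 + 10.0 ** br)

def fraction_below(br_limit, scale, br_max=5.0):
    """Fraction of an exponential distribution truncated to [0, br_max]
    that lies below br_limit. The scale is a hypothetical parameter."""
    return (1.0 - math.exp(-br_limit / scale)) / (1.0 - math.exp(-br_max / scale))
```

For BR = 2, luminosity_ratio returns 100 and secondary_fraction returns 1/101, so the secondary contributes about 1% of the combined light. With a placeholder scale of 1.5, fraction_below(2.0, 1.5) is above one half, consistent with most of the training set having BR < 2.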
Despite the poor performance for BR > 2, MSC is still a useful package for estimating the APs of the primary component, because it performs better than GSP-Phot does. This indicates that neglecting the existence of the secondary component degrades the estimates of the primary's APs.