- YSO Identifier: Identifying Young Stellar Objects in Multi-dimensional Magnitude Space
- Context
Young Stellar Objects (YSOs) are young stars at an early stage of evolution, comprising protostars and pre-main-sequence stars. Identifying YSOs is important for deriving statistical properties such as the star formation rate (SFR), which helps constrain star formation theories. In this work, we take the indirect approach to finding YSOs: we construct a pipeline that classifies astronomical objects into evolved stars, stars, galaxies, and YSOs based solely on their photometric measurements in multiple bands. The classification relies on the regions populated by evolved stars, stars, and galaxies in the multi-dimensional magnitude space; sources that fall in none of these regions are classified as YSOs.
There are two major approaches to YSO identification: the direct approach and the indirect approach.
- Direct approach: Find objects with features of YSOs
- Spectroscopy (pros: accurate; cons: NOT very efficient)
- Indirect approach: Remove objects that are not YSOs
- Evans et al. 2007: Color-color diagram (CCD), Color-magnitude diagram (CMD)
- Hsieh and Lai 2013: Multi-D magnitude space (this approach is adopted by this work)
- Chiu et al. 2021: Machine Learning
Each location in magnitude space corresponds to a type of spectral energy distribution (SED), which reflects the composition of an object. Classifying objects in magnitude space is therefore equivalent to classifying them by SED, and hence by composition.
![]() |
---|
Blue/Green dots in 3D magnitude space correspond to different types of 3-band SEDs |
For locations along the faint direction (the diagonal direction), the SED shape is identical and only the overall magnitude differs. Such locations can be viewed as the same type of object at different brightnesses due to distance.
![]() |
---|
Green dots within orange probe along faint direction can be viewed as the same type of object |
However, since YSOs and galaxies have similar compositions (both are made of stars and dust), SED shape alone cannot separate them. Their distances, on the other hand, are very different: most observable YSOs are located within the Milky Way, whereas most galaxies lie far outside it. We therefore use their difference in brightness to separate them. Note that this method has a caveat: because the separation of YSOs and galaxies is based on brightness, we might miss very faint YSOs and contaminate the YSO sample with very bright galaxies. Fortunately, very bright galaxies are few.
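To make the faint direction concrete, here is a minimal sketch (not code from this repository) showing that adding the same offset to every band leaves the SED shape unchanged. The helper `sed_shape` and the magnitude values are hypothetical and chosen only for illustration.

```python
import numpy as np

def sed_shape(mags):
    """Return magnitudes relative to the first band (the SED shape)."""
    mags = np.asarray(mags, dtype=float)
    return mags - mags[..., :1]

# Hypothetical 3-band magnitudes of one object
near = np.array([10.0, 9.5, 8.0])
# The same object moved along the faint (diagonal) direction:
# every band is 2.5 mag fainter
far = near + 2.5

# The shapes agree, so both points lie on the same diagonal line
same_shape = np.allclose(sed_shape(near), sed_shape(far))

# A 2.5 mag offset corresponds to a flux ratio of 10**(-0.4 * 2.5) in every band
flux_ratio = 10 ** (-0.4 * (far - near))
```

Subtracting the first band is only one way to remove the overall brightness; any fixed reference band would give an equivalent shape.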
![]() |
---|
Green/Blue dots within orange probe indicate Galaxies/YSOs separated due to their brightness difference |
In this work, we use object samples to define the object-populated regions in multi-D magnitude space naturally. An object is classified as an evolved star, star, galaxy, or YSO according to the object-populated region in which it is located. The concept of multi-D magnitude space was first proposed by Hsieh & Lai 2013; this work extends their approach to higher dimensions.
![]() |
---|
Multi-D magnitude space in this work (2D magnitude space schematics) |
Hsieh & Lai 2013 use a multi-D array to represent the whole multi-D magnitude space, but storing that array requires an enormous amount of RAM. To solve this problem, we change the storage method from a multi-D array to a 2D array composed of the locations of boundary points. We first project all object samples along the faint direction (along which, as shown in the previous section, SED shapes are identical) to find all SED shapes in the samples. Then, for each SED shape, we find the brightest and the faintest points and store them as the bright-end and faint-end boundaries, respectively. We assume that object-populated regions are always continuous, so the bright-end and faint-end boundaries together define the object-populated region of the samples. For the samples used in this work, please check ./tables/README.md.
![]() |
---|
Probe green samples with orange probe and find both bright-end and faint-end boundaries |
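The probing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in ./probe_model: the function name `probe_boundaries`, the dictionary layout, and the sample values are hypothetical, and the position along the faint direction is measured by the bin index of the first band.

```python
import numpy as np

def probe_boundaries(sample_mags, binsize):
    """Map each SED shape to its (bright_end, faint_end) bin along the faint direction.

    sample_mags: array of shape (n_samples, n_bands) holding magnitudes.
    """
    bins = np.floor(np.asarray(sample_mags, dtype=float) / binsize).astype(int)
    # Project along the faint direction: remove the first band's bin index
    shapes = (bins - bins[:, :1]).tolist()
    positions = bins[:, 0].tolist()
    boundaries = {}
    for shape, pos in zip(shapes, positions):
        key = tuple(shape)
        bright, faint = boundaries.get(key, (pos, pos))
        # Smaller magnitude means brighter
        boundaries[key] = (min(bright, pos), max(faint, pos))
    return boundaries

# Hypothetical 3-band samples: three share one SED shape, one has another
samples = [
    [10.0, 9.0, 8.0],
    [12.0, 11.0, 10.0],
    [11.0, 10.0, 9.0],
    [10.0, 10.0, 10.0],
]
boundaries = probe_boundaries(samples, binsize=1.0)
```

Only one pair of boundary points is kept per SED shape, so memory grows with the number of distinct shapes in the samples rather than with the volume of the full magnitude grid.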
Input objects are first binned to save computation time, and their locations in multi-D magnitude space are then compared with the object-populated regions probed with the method in the previous section. Note that we also define bright and faint regions for objects that lie outside the region of interest (where all samples are located) and assign them the object types bright and faint. Because of their brightness/faintness, we suggest that these bright/faint objects are YSOs/galaxies, respectively. For a more detailed description, please check ./classification/README.md.
![]() |
---|
Classification pipeline of this work, with 2D magnitude space schematics |
Since the multi-D magnitude space is huge and it is hard to observe all SED shapes in practice, some regions have no observed samples. Such a region is called an isolated region because the corresponding SED shapes are missing from the samples, and objects located there are called isolated objects.
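The comparison against object-populated regions might look like the sketch below. It is a deliberately reduced version of the pipeline: the order in which models are tested is an assumption, the bright, faint, and isolated object types are omitted, and the names `in_region`, `classify`, and the boundary values are hypothetical.

```python
import numpy as np

def in_region(mags, boundaries, binsize):
    """Check whether a binned object lies between the boundaries of its SED shape."""
    bins = np.floor(np.asarray(mags, dtype=float) / binsize).astype(int).tolist()
    shape = tuple(b - bins[0] for b in bins)
    if shape not in boundaries:
        return False
    bright, faint = boundaries[shape]
    return bright <= bins[0] <= faint

def classify(mags, models):
    """Return the first object type whose region contains the object, else 'YSO'."""
    for name, (boundaries, binsize) in models.items():
        if in_region(mags, boundaries, binsize):
            return name
    return "YSO"

# Hypothetical models: shape -> (bright_end, faint_end), plus the bin size used
models = {
    "star": ({(0, -1, -2): (10, 12)}, 1.0),
    "galaxy": ({(0, 0, 0): (15, 18)}, 1.0),
}

label_star = classify([11.2, 10.4, 9.9], models)
label_galaxy = classify([16.5, 16.1, 16.9], models)
# Same SED shape as the galaxy samples, but brighter than their bright end
label_yso = classify([12.5, 12.1, 12.9], models)
```

The third object shares its SED shape with the galaxy samples but is brighter than their bright-end boundary, so it falls outside every region and is labelled a YSO, which mirrors the brightness-based separation described earlier.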
![]() |
---|
Isolated region defined in this work, which also indicates the region where we have no samples |
To make the most of the samples, we introduce a reclassification process for isolated objects. It classifies each isolated object using the boundary points with the most similar SED as a reference, which is equivalent to interpolating/extrapolating our samples. Note that in this work we apply this process only to the galaxy samples.
![]() |
---|
Reclassification process and detailed criteria |
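As a rough illustration of picking the boundary points with the most similar SED, the sketch below selects the sampled SED shape closest to an isolated object's shape. The Euclidean distance in bin units is an assumption made for this example; the detailed criteria used in this work are the ones given in the figure above and in ./classification. The name `nearest_shape` and the boundary values are hypothetical.

```python
import numpy as np

def nearest_shape(shape, boundaries):
    """Return the sampled SED shape closest to `shape` (Euclidean distance, bin units)."""
    known = list(boundaries)
    dists = np.linalg.norm(np.array(known) - np.array(shape), axis=1)
    return known[int(np.argmin(dists))]

# Hypothetical galaxy boundaries: shape -> (bright_end, faint_end)
galaxy_boundaries = {(0, 0, 0): (15, 18), (0, -3, -6): (14, 16)}

# An isolated object: no galaxy sample has exactly this SED shape
isolated_shape = (0, 0, -1)

reference = nearest_shape(isolated_shape, galaxy_boundaries)
bright, faint = galaxy_boundaries[reference]
```

The borrowed bright-end and faint-end values can then be used in the same comparison as for non-isolated objects.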
python3 -m pip install -r ./requirements.txt
This work needs three sample catalogs: evolved stars, stars, and galaxies. We provide these catalogs in the ./tables directory. However, since the template star catalog is too large (~120 MB) for GitHub, we provide a script so that users can generate it themselves. You can skip this section if you want to use your own sample catalogs. For the sample catalog format, please check ./tables/README.md.
cd ./tables # Make sure you are in the tables directory
chmod u+x ./generate_star_sample_catalog.sh
./generate_star_sample_catalog.sh
cd ..
The Python object Model stores parameters for the multi-dimensional magnitude space. For more details, please check the ./model.py file.
Use vim or whatever editor you like to check the variables in Model.
vim ./model.py
Probe object samples in multi-dimensional magnitude space to get object-populated region.
By default, we probe evolved star, star, and galaxy samples from the input sample catalogs in the ./tables directory with bin sizes of 1.0, 0.5, and 0.2 magnitude, respectively.
For more details about input/output/module files, please check ./probe_model directory.
Use vim or whatever editor you like to check inputs.
vim ./run_probe_model.py
Please check the following 1D lists in main(), especially if you are using your own sample catalogs. Note that lists 1 and 2 should have the same length.
1. input_catalog_list: input catalog list for samples (e.g. evolved star, star, and galaxy)
2. input_name_list: input catalog name list (later used as the output model name)
3. binsize_model_list: bin size list (bin size used to probe multi-D space)
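For orientation, the three lists could look like the sketch below. The catalog paths and model names are hypothetical placeholders rather than the actual defaults in run_probe_model.py, and `check_same_length` is an illustrative helper that is not part of this repository; only the bin sizes follow the defaults stated above.

```python
# Hypothetical catalog paths; replace them with your own sample catalogs
input_catalog_list = [
    "./tables/my_evolved_star_sample.txt",
    "./tables/my_star_sample.txt",
    "./tables/my_galaxy_sample.txt",
]
# Names reused for the output models
input_name_list = ["evolved_star", "star", "galaxy"]
# Bin sizes in magnitude, matching the stated defaults
binsize_model_list = [1.0, 0.5, 0.2]

def check_same_length(*lists):
    """Raise ValueError if the given lists differ in length."""
    lengths = [len(items) for items in lists]
    if len(set(lengths)) > 1:
        raise ValueError(f"List lengths differ: {lengths}")

# Lists 1 and 2 must have the same length
check_same_length(input_catalog_list, input_name_list)
```

Running a check like this before probing catches a missing entry early instead of partway through a long run.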
If input check is done, run
python3 ./run_probe_model.py
Choose either of the two ways to run classification. For more details about input/output/module files, please check the ./classification directory.
python3 ./run_classification.py interactively
Recommended if you have a lot of catalogs for classification. But note that you have to assign models (e.g. evolved star, star, galaxy, and bin size) for every input catalog. Use vim or whatever editor you like to check inputs.
vim ./run_classification.py
Please check the following 1D lists in main() to make sure you have the correct inputs, especially if you are using your own models generated from your own sample catalogs. Note that lists 1~5 should have the same length.
1. catalog_list: input catalog list
2. evolved_star_model_list: evolved star model name list
3. star_model_list: star model name list
4. galaxy_model_list: galaxy model name list
5. binsize_model_list: bin size list
If input check is done, run
python3 ./run_classification.py
For more details about input/output/module files, please check ./make_plot directory.
vim ./plot_sample_MMD.py # Check input catalogs
python3 ./plot_sample_MMD.py
vim ./plot_result_MMD.py # Check input catalogs
python3 ./plot_result_MMD.py
vim ./plot_sample_SED.py # Check input catalogs
python3 ./plot_sample_SED.py
vim ./plot_model_venn_diagram.py # Check input catalogs
python3 ./plot_model_venn_diagram.py