arXiv Open Access 2024

Principal Component Analysis for Equation Discovery

Caren Marzban, Ulvi Yurtsever, Michael Richman

Abstract

Principal Component Analysis (PCA) is one of the most commonly used statistical methods for data exploration and for dimensionality reduction, wherein the first few principal components account for an appreciable proportion of the variability in the data. Less attention is paid to the last principal components, because they do not account for an appreciable proportion of variability. However, this defining characteristic of the last principal components also qualifies them as combinations of variables that are constant across the cases. Such constant combinations are important because they may reflect underlying laws of nature. In situations involving a large number of noisy covariates, the underlying law may not correspond to the last principal component, but rather to one of the last. Consequently, a criterion is required to identify the relevant eigenvector. In this paper, two examples are employed to demonstrate the proposed methodology: one from physics, involving a small number of covariates, and another from meteorology, wherein the number of covariates is in the thousands. It is shown that with an appropriate selection criterion, PCA can be employed to "discover" Kepler's third law (in the former) and the hypsometric equation (in the latter).
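
The idea that the last principal component encodes a near-constant combination of variables can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (generated from Kepler's third law with a small assumed noise level), the variable names are hypothetical, and the smallest-variance eigenvector is simply taken as the relevant one, whereas the paper proposes a selection criterion for noisier, higher-dimensional cases.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "planets" (an assumption for illustration, not the paper's data):
# semi-major axis a in AU, period T in years, with T = a^(3/2) plus small noise.
a = rng.uniform(0.4, 30.0, size=50)
T = a**1.5 * np.exp(rng.normal(0.0, 0.01, size=a.size))

# Work in log space so that a power law becomes a linear combination.
X = np.column_stack([np.log(a), np.log(T)])
Xc = X - X.mean(axis=0)

# Eigen-decomposition of the covariance matrix; eigh returns eigenvalues
# in ascending order, so column 0 is the last principal component.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
last_pc = eigvecs[:, 0]

# last_pc[0]*log(a) + last_pc[1]*log(T) is approximately constant,
# which implies T proportional to a**exponent with:
exponent = -last_pc[0] / last_pc[1]

print("variance along last PC :", eigvals[0])
print("variance along first PC:", eigvals[1])
print("recovered exponent     :", exponent)
```

Under these assumptions the recovered exponent should lie close to 3/2, and the variance along the last component should be far smaller than along the first, which is what marks the combination as nearly constant across cases.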

Authors (3)

Caren Marzban

Ulvi Yurtsever

Michael Richman

Citation Format

Marzban, C., Yurtsever, U., Richman, M. (2024). Principal Component Analysis for Equation Discovery. https://arxiv.org/abs/2401.04797

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access