| Type: | Package |
| Title: | Generalized Measure of Correlation (GMC) |
| Version: | 0.1.2 |
| Description: | Provides tools to compute the Generalized Measure of Correlation (GMC), a dependence measure accounting for nonlinearity and asymmetry in the relationship between variables. Based on the method proposed by Zheng, Shi, and Zhang (2012) <doi:10.1080/01621459.2012.710509>. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown |
| Config/testthat/edition: | 3 |
| Imports: | ks, stats |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-10-31 01:40:15 UTC; ding'x'j |
| Author: | Xuejing Ding [aut, cre], Zhengjun Zhang [aut] |
| Maintainer: | Xuejing Ding <dingxuejing24@mails.ucas.ac.cn> |
| Repository: | CRAN |
| Date/Publication: | 2025-10-31 12:10:02 UTC |
Generalized Measure of Correlation: GMC(X | Y)
Description
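Estimates GMC(X | Y), the generalized measure of correlation of X given Y: the share of the variance of X that is explained by the conditional mean E[X | Y]. The measure is asymmetric, so GMC(X | Y) and GMC(Y | X) generally differ.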
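For orientation, the population quantities behind the two directional measures can be written out as below. This follows the definition in the cited paper by Zheng, Shi, and Zhang (2012); how the package estimates each term is not shown in this manual, so the block describes the target of estimation only.

```latex
\begin{aligned}
\mathrm{GMC}(Y \mid X) &= \frac{\operatorname{Var}\!\big(E[Y \mid X]\big)}{\operatorname{Var}(Y)}
  = 1 - \frac{E\big[\{Y - E(Y \mid X)\}^2\big]}{\operatorname{Var}(Y)}, \\
\mathrm{GMC}(X \mid Y) &= \frac{\operatorname{Var}\!\big(E[X \mid Y]\big)}{\operatorname{Var}(X)}
  = 1 - \frac{E\big[\{X - E(X \mid Y)\}^2\big]}{\operatorname{Var}(X)}.
\end{aligned}
```

Both quantities lie in [0, 1], and they need not be equal, which is what lets the measure capture asymmetric dependence.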
Usage
GMC_X_given_Y(X, Y, kernel = dnorm)
Arguments
| X | Predictor variable |
| Y | Response variable |
| kernel | Kernel function (default = dnorm) |
Value
GMC(X|Y) estimate
Examples
# Generate sample data with nonlinear relationship
set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- X^2 + rnorm(n, sd = 0.5)
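# Calculate GMC(X|Y); X is symmetric about 0 and Y depends on X only through X^2,
# so E[X | Y] = 0 and the population value of GMC(X|Y) is 0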
gmc_result <- GMC_X_given_Y(X, Y)
print(gmc_result)
Generalized Measure of Correlation: GMC(Y | X)
Description
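Estimates GMC(Y | X), the generalized measure of correlation of Y given X: the share of the variance of Y that is explained by the conditional mean E[Y | X].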
Usage
GMC_Y_given_X(X, Y, kernel = dnorm)
Arguments
| X | Predictor variable |
| Y | Response variable |
| kernel | Kernel function (default = dnorm) |
Value
GMC(Y|X) estimate
Examples
# Generate sample data with linear relationship
set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- 2 * X + rnorm(n, sd = 0.5)
# Calculate GMC(Y|X)
gmc_result <- GMC_Y_given_X(X, Y)
print(gmc_result)
Feature selection using GMC ranking
Description
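Computes a GMC score for each candidate predictor against the response and returns the scores, optionally sorted, so that predictors can be ranked for feature selection.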
Usage
GMC_feature_ranking(X, Y, kernel = dnorm, sort = TRUE)
Arguments
| X | A matrix or data.frame of predictors |
| Y | A numeric response vector |
| kernel | Kernel function (default = dnorm) |
| sort | Logical; whether to sort variables by GMC score (default = TRUE) |
Value
A data.frame with variable names and GMC scores
Examples
# Generate sample data with multiple predictors
set.seed(123)
n <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- rnorm(n)
Y <- 2 * X1 + X2^2 + rnorm(n, sd = 0.5)
X <- cbind(X1, X2, X3)
# Rank features by GMC
ranking <- GMC_feature_ranking(X, Y)
print(ranking)
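To make the ranking idea concrete without relying on the package internals, here is a minimal base-R sketch. The helpers `gmc_sketch` and `rank_features_sketch`, the output column names `variable` and `GMC`, the direction GMC(Y | X_j), and the bandwidth rule `bw.nrd0` are all assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical stand-in for a univariate GMC(Y | X) estimator (not the package's code).
gmc_sketch <- function(x, y, grid_length = 512, kernel = dnorm) {
  h <- stats::bw.nrd0(x)                        # assumed bandwidth rule
  grid <- seq(min(x), max(x), length.out = grid_length)
  W <- kernel(outer(grid, x, "-") / h)          # rows: grid points, columns: observations
  fx <- rowSums(W) / (length(x) * h)            # kernel density estimate of x on the grid
  m <- as.vector(W %*% y) / rowSums(W)          # Nadaraya-Watson estimate of E[Y | X = grid]
  second_moment <- sum(m^2 * fx) * (grid[2] - grid[1])  # Riemann sum for E[(E[Y|X])^2]
  (second_moment - mean(y)^2) / stats::var(y)   # Var(E[Y|X]) / Var(Y)
}

# Hypothetical ranking helper: one score per column, optionally sorted in decreasing order.
rank_features_sketch <- function(X, Y, sort = TRUE) {
  X <- as.data.frame(X)
  scores <- vapply(X, function(col) gmc_sketch(col, Y), numeric(1))
  out <- data.frame(variable = names(X), GMC = unname(scores))
  if (sort) out <- out[order(out$GMC, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

set.seed(123)
n <- 500
X <- cbind(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))
Y <- 2 * X[, "X1"] + X[, "X2"]^2 + rnorm(n, sd = 0.5)
ranking <- rank_features_sketch(X, Y)
print(ranking)
```

In this simulated design the population values of GMC(Y | X_j) are 0.64 for X1, 0.32 for X2, and 0 for X3, so a reasonable estimator should place the pure-noise predictor X3 last.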
Estimate E[(E[Y|X])^2] using kernel regression
Description
This function estimates E[(E[Y|X])^2], the expectation of the squared conditional mean of Y given X, using Nadaraya-Watson kernel regression (Gaussian kernel by default) and numerical integration over a grid of X values.
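The role of this quantity in GMC can be seen from a short derivation. Writing m(x) = E[Y | X = x] and f_X for the marginal density of X, the identities below are standard; reading the returned `mean_Y` and `var_Y` as the remaining ingredients of the ratio is an inference from the documented return value, not something the manual states.

```latex
\begin{aligned}
E\big[(E[Y \mid X])^2\big] &= \int m(x)^2 \, f_X(x)\, dx, \\
\operatorname{Var}\!\big(E[Y \mid X]\big) &= E\big[m(X)^2\big] - \big(E[Y]\big)^2, \\
\mathrm{GMC}(Y \mid X) &= \frac{E\big[m(X)^2\big] - \big(E[Y]\big)^2}{\operatorname{Var}(Y)}.
\end{aligned}
```

The second line uses E[m(X)] = E[Y], which holds by the law of total expectation.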
Usage
estimate_EY_X_squared(X, Y, grid_length = 10000, kernel = dnorm)
Arguments
| X | A numeric vector of predictors. |
| Y | A numeric vector of responses. |
| grid_length | Number of grid points for numerical integration (default = 10000). |
| kernel | Kernel function (default = dnorm). |
Value
A list containing:
- estimate: Estimated value of E[(E[Y|X])^2]
- bandwidth: Selected kernel bandwidth
- mean_Y: Mean of Y
- var_Y: Variance of Y
- EY_grid: Grid values of E[Y|X]
- fx_grid: Estimated marginal density of X
- x_grid: Grid points used in estimation
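The following base-R sketch shows one way such an estimate and the listed components could be produced. It is not the package's implementation: the function name `nw_EY_X_squared_sketch`, the bandwidth rule `bw.nrd0`, the grid spanning the observed range of X, and the smaller default `grid_length` are assumptions chosen to keep the example light.

```r
# Hypothetical sketch of a Nadaraya-Watson estimate of E[(E[Y|X])^2].
nw_EY_X_squared_sketch <- function(X, Y, grid_length = 1000, kernel = dnorm) {
  h <- stats::bw.nrd0(X)                          # assumed bandwidth selector
  x_grid <- seq(min(X), max(X), length.out = grid_length)
  W <- kernel(outer(x_grid, X, "-") / h)          # kernel weights, grid points by observations
  fx_grid <- rowSums(W) / (length(X) * h)         # kernel density estimate of X
  EY_grid <- as.vector(W %*% Y) / rowSums(W)      # Nadaraya-Watson estimate of E[Y | X = x]
  dx <- x_grid[2] - x_grid[1]
  estimate <- sum(EY_grid^2 * fx_grid) * dx       # Riemann sum of m(x)^2 * f(x)
  list(estimate = estimate, bandwidth = h,
       mean_Y = mean(Y), var_Y = stats::var(Y),
       EY_grid = EY_grid, fx_grid = fx_grid, x_grid = x_grid)
}

set.seed(123)
X <- rnorm(500)
Y <- 2 * X + rnorm(500, sd = 0.5)
res <- nw_EY_X_squared_sketch(X, Y)
print(res$estimate)   # the population target here is E[(2X)^2] = 4
```

The grid-by-observation weight matrix costs memory proportional to `grid_length * length(X)`, which is why the sketch uses a coarser grid than the documented default of 10000.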
References
Zheng, S., Shi, N.Z., & Zhang, Z. (2012). Generalized Measures of Correlation for Asymmetry, Nonlinearity, and Beyond. Journal of the American Statistical Association, 107(499), 1239-1252. doi:10.1080/01621459.2012.710509