The important features are the ones that influence the components the most, and therefore have a large absolute value (coefficient/loading) on the component.

To get the name of the most important feature on each PC:

from sklearn.decomposition import PCA
import pandas as pd
import numpy as np
np.random.seed(0)

# 10 samples with 5 features
train_features = np.random.rand(10,5)

model = PCA(n_components=2).fit(train_features)
X_pc = model.transform(train_features)

# number of components
n_pcs = model.components_.shape[0]

# index of the most important feature on each component,
# i.e. the one with the largest absolute loading
most_important = [np.abs(model.components_[i]).argmax() for i in range(n_pcs)]

initial_feature_names = ['a','b','c','d','e']

# get the names
most_important_names = [initial_feature_names[most_important[i]] for i in range(n_pcs)]

# map each component label (PC1, PC2, ...) to its most important feature name
dic = {'PC{}'.format(i+1): most_important_names[i] for i in range(n_pcs)}

# build the dataframe and print it
df = pd.DataFrame(sorted(dic.items()))
print(df)

This prints:

     0  1
 0  PC1  e
 1  PC2  d

Conclusion/Explanation:

So on PC1 the feature named e is the most important, and on PC2 it is d.
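If you want to see how every feature loads on each component, not only the top one, the same information can be laid out as a table. This is a minimal sketch under the same setup as above (random data, 5 features named a to e); the names `loadings` and `top_features` are my own and are not part of scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

np.random.seed(0)

# same toy data as above: 10 samples with 5 features
train_features = np.random.rand(10, 5)
feature_names = ['a', 'b', 'c', 'd', 'e']

model = PCA(n_components=2).fit(train_features)
n_pcs = model.components_.shape[0]

# rows = original features, columns = principal components
# ("loadings" is an illustrative name, not a scikit-learn attribute)
loadings = pd.DataFrame(
    model.components_.T,
    index=feature_names,
    columns=['PC{}'.format(i + 1) for i in range(n_pcs)],
)

# feature with the largest absolute loading on each component
top_features = loadings.abs().idxmax()

print(loadings)
print(top_features)
```

Here `idxmax` on the absolute values should pick the same feature per component as the `argmax` approach above, while the full table also shows how close the runner-up features are.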

makis · Answer 1 · 1 November 2019 at 11:45
