PCA

public class PCA

Principal Component Analysis

Reference: “Principal Component Analysis”

  • The estimated number of components.

    Declaration

    Swift

    public var componentCount: Int
  • Whether to whiten the transformed data. Whitening removes some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of downstream estimators by making their data respect some hard-wired assumptions.

    Declaration

    Swift

    public var whiten: Bool
  • Number of samples in the training data.

    Declaration

    Swift

    public var sampleCount: Int
  • Number of features in the training data.

    Declaration

    Swift

    public var featureCount: Int
  • Per-feature empirical mean, estimated from the training set.

    Declaration

    Swift

    public var mean: Tensor<Double>
  • The estimated noise covariance.

    Declaration

    Swift

    public var noiseVariance: Tensor<Double>
  • Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explainedVariance.

    Declaration

    Swift

    public var components: Tensor<Double>
  • The amount of variance explained by each of the selected components.

    Declaration

    Swift

    public var explainedVariance: Tensor<Double>
  • Fraction of the total variance explained by each of the selected components.

    Declaration

    Swift

    public var explainedVarianceRatio: Tensor<Double>
  • The singular values corresponding to each of the selected components.

    Declaration

    Swift

    public var singularValues: Tensor<Double>
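
    The three properties above are closely related. As a minimal sketch in plain Swift arrays (not the library's Tensor type), and assuming the common convention that the explained variance is the squared singular value divided by sample count minus one, the ratio can be derived as below. The singular values and sample count are hypothetical, and the total is taken over all components rather than a truncated subset.

```swift
// Hypothetical singular values of centered training data, largest first.
let singularValues: [Double] = [4.0, 2.0, 1.0]
let sampleCount = 5

// Assumed convention: variance along an axis = s^2 / (sampleCount - 1).
let explainedVariance = singularValues.map { $0 * $0 / Double(sampleCount - 1) }

// Each component's share of the total variance; the shares sum to 1.
let totalVariance = explainedVariance.reduce(0, +)
let explainedVarianceRatio = explainedVariance.map { $0 / totalVariance }

print(explainedVariance)
print(explainedVarianceRatio)
```

    The ratio does not depend on the choice of denominator, because the same factor divides both the numerator and the total.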
  • Creates a Principal Component Analysis model.

    Declaration

    Swift

    public init(
        componentCount: Int = 0,
        whiten: Bool = false
    )

    Parameters

    componentCount

    Number of components to keep.

    whiten

    When true (false by default), the component vectors are multiplied by the square root of the sample count and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances. Whitening removes some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of downstream estimators by making their data respect some hard-wired assumptions.
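
    The scaling described for whiten can be illustrated with a small sketch in plain Swift arrays. It follows the wording above literally (square root of the sample count divided by the singular value), so the resulting unit variance uses the n denominator; an implementation may use n - 1 instead. The helper names and the score values are hypothetical.

```swift
/// Variance with denominator n, for values that are already centered.
func variance(_ centered: [Double]) -> Double {
    return centered.map { $0 * $0 }.reduce(0, +) / Double(centered.count)
}

/// Rescales the centered scores along one principal axis to unit variance.
func whitened(_ scores: [Double]) -> [Double] {
    // The singular value of an axis equals the Euclidean norm of its scores.
    let singularValue = scores.map { $0 * $0 }.reduce(0, +).squareRoot()
    let scale = Double(scores.count).squareRoot() / singularValue
    return scores.map { $0 * scale }
}

// Hypothetical scores of four samples along two principal axes.
let firstAxis: [Double] = [-3.0, -1.0, 1.0, 3.0]
let secondAxis: [Double] = [0.5, -0.5, -0.5, 0.5]

let whitenedFirst = whitened(firstAxis)
let whitenedSecond = whitened(secondAxis)

// Both axes now have variance close to 1, so their relative scale is gone.
print(variance(whitenedFirst), variance(whitenedSecond))
```

    Before whitening the first axis carries twenty times the variance of the second; afterwards the two are indistinguishable by scale, which is the information loss the description refers to.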

  • Returns the log-likelihood of a rank for the given dataset and spectrum.

    Declaration

    Swift

    internal func assessDimension(
        _ spectrum: Tensor<Double>,
        _ rank: Int,
        _ sampleCount: Int,
        _ featureCount: Int
    ) -> Tensor<Double>

    Parameters

    spectrum

    The amount of variance explained by each of the selected components.

    rank

    Test rank value.

    sampleCount

    The sample count.

    featureCount

    The feature count.

    Return Value

    Log-likelihood of the rank for the given dataset.

  • Returns the number of components that best describes the dataset.

    Reference: “Automatic Choice of Dimensionality for PCA”

    Declaration

    Swift

    internal func inferDimension(
        spectrum: Tensor<Double>,
        sampleCount: Int,
        featureCount: Int
    ) -> Int

    Parameters

    spectrum

    The amount of variance explained by each of the selected components.

    sampleCount

    The sample count.

    featureCount

    The feature count.

    Return Value

    The number of components that best describes the dataset.

  • Fits a Principal Component Analysis model to the training data.

    Declaration

    Swift

    public func fit(data: Tensor<Double>)

    Parameters

    data

    Training data with shape [sample count, feature count].
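
    This page does not describe the internals of fit. Assuming it follows the usual PCA recipe, the first step is to record the data shape and the per-feature mean, then center the data before decomposing it. The sketch below shows only that first step with plain Swift arrays; the data values are hypothetical and the decomposition itself is omitted.

```swift
// Hypothetical training data with shape [sample count, feature count] = [4, 2].
let data: [[Double]] = [
    [1.0, 2.0],
    [3.0, 6.0],
    [5.0, 4.0],
    [7.0, 8.0],
]

let sampleCount = data.count
let featureCount = data[0].count

// Per-feature empirical mean, accumulated column by column.
var mean = [Double](repeating: 0.0, count: featureCount)
for row in data {
    for (j, value) in row.enumerated() {
        mean[j] += value / Double(sampleCount)
    }
}

// Centered data: every row minus the per-feature mean.
let centered: [[Double]] = data.map { (row: [Double]) -> [Double] in
    return zip(row, mean).map { $0 - $1 }
}

print(sampleCount, featureCount, mean)
print(centered)
```

    The values computed here correspond to the sampleCount, featureCount, and mean properties listed earlier.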

  • Returns dimensionally reduced data.

    Declaration

    Swift

    public func transformation(for data: Tensor<Double>) -> Tensor<Double>

    Parameters

    data

    Input data with shape [sample count, feature count].

    Return Value

    Dimensionally reduced data.
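
    As a rough illustration of what dimensional reduction means here, the sketch below subtracts the per-feature mean and projects each sample onto the principal axes. It uses plain Swift arrays, assumes whiten is false, and the function name project and the fitted values are hypothetical, not part of the library.

```swift
/// Projects each row of `data` onto the rows of `components` after centering.
func project(_ data: [[Double]], mean: [Double], components: [[Double]]) -> [[Double]] {
    return data.map { (row: [Double]) -> [Double] in
        let centered = zip(row, mean).map { $0 - $1 }
        return components.map { (axis: [Double]) -> Double in
            // Dot product of the centered sample with one principal axis.
            return zip(centered, axis).map { $0 * $1 }.reduce(0, +)
        }
    }
}

// Hypothetical fitted state: two features, one kept component of unit length.
let mean: [Double] = [4.0, 5.0]
let components: [[Double]] = [[0.6, 0.8]]

// Input with shape [sample count, feature count] = [2, 2].
let data: [[Double]] = [[7.0, 9.0], [1.0, 1.0]]

// Output with shape [sample count, component count] = [2, 1].
let reduced = project(data, mean: mean, components: components)
print(reduced)
```

    The two samples lie at opposite ends of the kept axis, so their scores come out close to 5 and -5 respectively.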

  • Returns the data transformed back to its original space.

    Declaration

    Swift

    public func inverseTransformation(for data: Tensor<Double>) -> Tensor<Double>

    Parameters

    data

    Input data with shape [sample count, component count].

    Return Value

    Data in the original space whose transformation would be data.
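
    Mapping back to the original space can be sketched as the reverse of a projection: each score is multiplied by its principal axis and the per-feature mean is added back. The sketch uses plain Swift arrays and assumes whiten is false; the function name reconstruct and all values are hypothetical.

```swift
/// Rebuilds samples in feature space from their scores along each axis.
func reconstruct(_ reduced: [[Double]], mean: [Double], components: [[Double]]) -> [[Double]] {
    return reduced.map { (scores: [Double]) -> [Double] in
        var row = mean
        for (k, score) in scores.enumerated() {
            for j in row.indices {
                // Add this component's contribution to feature j.
                row[j] += score * components[k][j]
            }
        }
        return row
    }
}

// Hypothetical fitted state: two features, one kept component of unit length.
let mean: [Double] = [4.0, 5.0]
let components: [[Double]] = [[0.6, 0.8]]

// Reduced input with one score per sample.
let reduced: [[Double]] = [[5.0], [-5.0]]

// Output with shape [sample count, feature count] = [2, 2].
let restored = reconstruct(reduced, mean: mean, components: components)
print(restored)
```

    When fewer components are kept than there are features, the result is generally only an approximation of the data that was originally transformed, since the variance along the discarded axes is lost.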