My Math Forum  

July 26th, 2012, 01:06 AM   #1

Data analysis using Independent Component Analysis (ICA)

Hi all,

I have run into a problem analyzing a large data set. I applied ICA, specifically the MATLAB FastICA package, but got some unexpected results. I think there might be two reasons:

1. My data set is not statistically independent enough.
2. My data might contain some nonlinear mixing. (Maybe I am wrong, but I believe ICA can only separate linearly mixed sources.)

I tested point 2 with simple simulated data, using the code attached below. I tested both noise-free and noisy cases, running FastICA 4 times on each of the linear, nonlinear, noisy linear, and noisy nonlinear mixtures. The results for the linear case, with or without noise, are quite consistent, but the results for the nonlinear case look quite 'random'.

May I ask for your opinions?

a. How can I find out, or justify, that the data are 'not statistically independent enough'? And how can I demonstrate this using simple simulated data?
b. Is it possible to test whether my data contain some nonlinear mixing? Could you explain these random results when the mixture is nonlinear? And how can I deal with this situation or mitigate the nonlinearity? (Using kernel PCA?)

%% Test: ICA can only separate linearly mixed sources
clc; clf; clear all; close all;

opt = 1; % Linear: 1; Nonlinear: 2; Linear Noisy: 3; Nonlinear Noisy: 4;

%% Create three source signals
A = sin(linspace(0,50, 1000)); % A
B = cos(linspace(0,37, 1000)+5); % B
C = sin(linspace(0,20, 1000)+10); % C

%% Mixture of linear signals
if opt == 1
M1 = A-2*B+C; % mixing 1
M2 = 1.73*A+3.41*B-9.2*C; % mixing 2
M3 = 0.2*A+0.41*B-0.5*C; % mixing 3

%% Mixture of nonlinear signals
elseif opt == 2
M1 = A-2*B+C; % mixing 1
M2 = 1.73*A+3.41*B.^2-9.2*C; % mixing 2 B.^1.1
M3 = 0.2*A+0.41*B-0.5*C; % mixing 3 1000*C

%% Mixture of linear signals with white Gaussian noise
elseif opt == 3
M1 = A-2*B+C+(0.2+0.1.*randn(1000,1))'; % mixing 1
M2 = 1.73*A+3.41*B-9.2*C+(0.1+0.05.*randn(1000,1))'; % mixing 2
M3 = 0.2*A+0.41*B-0.5*C-(0.01+0.1.*randn(1000,1))'; % mixing 3

%% Mixture of nonlinear signals with white Gaussian noise
elseif opt == 4
M1 = A-2*B+C+(0.2+0.1.*randn(1000,1))'; % mixing 1
M2 = 1.73*A+3.41*B.^2-9.2*C+(0.1+0.05.*randn(1000,1))'; % mixing 2 B.^1.1
M3 = 0.2*A+0.41*B-0.5*C-(0.01+0.1.*randn(1000,1))'; % mixing 3 1000*C


end

%% Run FastICA 4 times
ICs = zeros(12,1000);
for i = 1:4
% compute the unmixing using FastICA; rows 3*i-2 to 3*i hold the ICs of run i
ICs((1+3*(i-1)):(1+3*(i-1))+2,:) = fastica([M1;M2;M3]);
end

%% Plot
subplot(3,6,1), plot(A, 'r'); % plot A
subplot(3,6,7), plot(B, 'r'); % plot B
subplot(3,6,13), plot(C, 'r'); % plot C

subplot(3,6,2), plot(M1, 'g'); % plot mixing 1
subplot(3,6,8), plot(M2, 'g'); % plot mixing 2
subplot(3,6,14), plot(M3, 'g'); % plot mixing 3

subplot(3,6,3), plot(ICs(1,:), 'r'); % run 1, IC 1
subplot(3,6,9), plot(ICs(2,:), 'r'); % run 1, IC 2
subplot(3,6,15), plot(ICs(3,:), 'r'); % run 1, IC 3

subplot(3,6,4), plot(ICs(4,:), 'r'); % run 2, IC 1
subplot(3,6,10), plot(ICs(5,:), 'r'); % run 2, IC 2
subplot(3,6,16), plot(ICs(6,:), 'r'); % run 2, IC 3

subplot(3,6,5), plot(ICs(7,:), 'r'); % run 3, IC 1
subplot(3,6,11), plot(ICs(8,:), 'r'); % run 3, IC 2
subplot(3,6,17), plot(ICs(9,:), 'r'); % run 3, IC 3

subplot(3,6,6), plot(ICs(10,:), 'r'); % run 4, IC 1
subplot(3,6,12), plot(ICs(11,:), 'r'); % run 4, IC 2
subplot(3,6,18), plot(ICs(12,:), 'r'); % run 4, IC 3
ggyyree is offline  
August 9th, 2012, 03:48 AM   #2

Re: Data analysis using Independent Component Analysis (ICA)

Hey ggyyree.

You cannot actually prove from a sample that data are independent: independence must be defined that way, or at least assumed.

What you can do is test whether the data are correlated: set up a hypothesis test with the null hypothesis that there is no correlation, at some significance level. Either the sample covariance is consistent with 0, or it differs from 0 by enough to be statistically significant and you reject that notion. (You can equally test whether the correlation coefficient is statistically indistinguishable from 0.)
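As a rough illustration of such a test, here is a minimal MATLAB sketch using the p-values that corrcoef returns. The signals x and y, the sample size, the 0.3 coupling, and the 0.05 significance level are all made-up choices for the example, not values from the original data.

```matlab
% Sketch: test H0 "the correlation is zero" on two simulated signals.
rng(0);                        % fixed seed so the run is repeatable
n = 1000;                      % sample size (arbitrary)
x = randn(n,1);                % first signal
y = 0.3*x + randn(n,1);        % second signal, correlated with x by construction

[R, P] = corrcoef(x, y);       % R: correlation matrix, P: p-values for H0: rho = 0
alpha = 0.05;                  % significance level (an assumption for this example)

fprintf('r = %.3f, p = %.3g\n', R(1,2), P(1,2));
if P(1,2) < alpha
    disp('Reject H0: the correlation differs significantly from zero.');
else
    disp('Fail to reject H0: no evidence of linear correlation.');
end
```

Replacing y with a signal generated independently of x (for example y = randn(n,1)) gives a quick way to see how the same test behaves when there is no coupling.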

The reason this does not establish independence is that, while independence implies zero covariance (and zero correlation), the converse does not hold: you can have a correlation and covariance of 0 and the variables can still be dependent.

It's just the nature of probability and statistics with regard to samples: in many tests you can never prove that something "is" the case; you can only gather enough evidence to decide whether to accept or reject a hypothesis.
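A simple simulated case of zero correlation with full dependence can be sketched in MATLAB as follows. The choice of y = x.^2 with a standard normal x is only an illustrative assumption: since cov(x, x^2) = E[x^3] = 0 for a distribution symmetric about zero, the two are uncorrelated in theory even though y is completely determined by x.

```matlab
% Sketch: uncorrelated but dependent variables.
rng(1);                          % fixed seed so the run is repeatable
n = 10000;                       % sample size (arbitrary)
x = randn(n,1);                  % symmetric about zero
y = x.^2;                        % a deterministic function of x, so dependent on x

R_lin = corrcoef(x, y);          % linear correlation between x and y
R_sq  = corrcoef(x.^2, y);       % correlation after a nonlinear transform of x

fprintf('corr(x, y)   = %.3f\n', R_lin(1,2));
fprintf('corr(x^2, y) = %.3f\n', R_sq(1,2));
```

The first value should come out close to 0 and the second equal to 1 (up to rounding), so a correlation test alone would miss the dependence. Checking correlations between nonlinear transforms of the signals is one informal way to look for it.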
chiro is offline  

