
July 26th, 2012, 01:06 AM  #1 
Newbie Joined: Aug 2008 Posts: 13 Thanks: 0  Data analysis using Independent Component Analysis (ICA)
Hi all, I have encountered a problem analyzing a large data set. I used ICA, specifically the Matlab FastICA package, but some of the results are unexpected. I think there might be two reasons:
1. My data set is not statistically independent enough.
2. My data might have some nonlinear mixing. (Maybe I am wrong, but I believe ICA can only separate linearly mixed sources.)
I tested point 2 with simple simulated data, attached below, in both noise-free and noisy settings. I ran FastICA 4 times on each of the linear, nonlinear, noisy linear, and noisy nonlinear mixtures. The results for the linear case, with or without noise, are quite consistent, but the results for the nonlinear case are quite 'random'. May I ask for your opinions?
a. How can I find out or justify that the data is 'not statistically independent enough'? And how can I demonstrate this using simple simulated data?
b. Is it possible to test whether my data has some nonlinear mixing? Could you explain these random results when the mixture is nonlinear? And how can I deal with this situation or mitigate the nonlinearity? (Using kernel PCA?)
%% Test: ICA can only separate linearly mixed sources
clc; clf; clear all; close all;
opt = 1; % Linear: 1; Nonlinear: 2; Linear Noisy: 3; Nonlinear Noisy: 4

%% Create three signals
A = sin(linspace(0,50,1000));    % A
B = cos(linspace(0,37,1000)+5);  % B
C = sin(linspace(0,20,1000)+10); % C

%% Mixture of linear signals
if opt == 1
    M1 = A-2*B+C;              % mixing 1
    M2 = 1.73*A+3.41*B-9.2*C;  % mixing 2
    M3 = 0.2*A+0.41*B-0.5*C;   % mixing 3
%% Mixture of nonlinear signals
elseif opt == 2
    M1 = A-2*B+C;                 % mixing 1
    M2 = 1.73*A+3.41*B.^2-9.2*C;  % mixing 2 (alternative: B.^1.1)
    M3 = 0.2*A+0.41*B-0.5*C;      % mixing 3 (alternative: 1000*C)
%% Mixture of linear signals with white Gaussian noise
elseif opt == 3
    M1 = A-2*B+C+(0.2+0.1.*randn(1000,1))';               % mixing 1
    M2 = 1.73*A+3.41*B-9.2*C+(0.1+0.05.*randn(1000,1))';  % mixing 2
    M3 = 0.2*A+0.41*B-0.5*C-(0.01+0.1.*randn(1000,1))';   % mixing 3
%% Mixture of nonlinear signals with white Gaussian noise
elseif opt == 4
    M1 = A-2*B+C+(0.2+0.1.*randn(1000,1))';                  % mixing 1
    M2 = 1.73*A+3.41*B.^2-9.2*C+(0.1+0.05.*randn(1000,1))';  % mixing 2 (alternative: B.^1.1)
    M3 = 0.2*A+0.41*B-0.5*C-(0.01+0.1.*randn(1000,1))';      % mixing 3 (alternative: 1000*C)
end

%% Run FastICA 4 times
ICs = zeros(12,1000);
for i = 1:4
    % compute the unmixing using FastICA; rows 3*(i-1)+1 to 3*(i-1)+3 hold run i
    ICs((1+3*(i-1)):(1+3*(i-1))+2,:) = fastica([M1;M2;M3]);
end

%% Plot
figure,
subplot(3,6,1),  plot(A, 'r');  % plot A
subplot(3,6,7),  plot(B, 'r');  % plot B
subplot(3,6,13), plot(C, 'r');  % plot C
subplot(3,6,2),  plot(M1, 'g'); % plot mixing 1
subplot(3,6,8),  plot(M2, 'g'); % plot mixing 2
subplot(3,6,14), plot(M3, 'g'); % plot mixing 3
subplot(3,6,3),  plot(ICs(1,:), 'r');  % run 1, IC 1
subplot(3,6,9),  plot(ICs(2,:), 'r');  % run 1, IC 2
subplot(3,6,15), plot(ICs(3,:), 'r');  % run 1, IC 3
subplot(3,6,4),  plot(ICs(4,:), 'r');  % run 2, IC 1
subplot(3,6,10), plot(ICs(5,:), 'r');  % run 2, IC 2
subplot(3,6,16), plot(ICs(6,:), 'r');  % run 2, IC 3
subplot(3,6,5),  plot(ICs(7,:), 'r');  % run 3, IC 1
subplot(3,6,11), plot(ICs(8,:), 'r');  % run 3, IC 2
subplot(3,6,17), plot(ICs(9,:), 'r');  % run 3, IC 3
subplot(3,6,6),  plot(ICs(10,:), 'r'); % run 4, IC 1
subplot(3,6,12), plot(ICs(11,:), 'r'); % run 4, IC 2
subplot(3,6,18), plot(ICs(12,:), 'r'); % run 4, IC 3
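
When comparing repeated runs like the four above, it may help to remember that ICA recovers sources only up to order, sign, and scale, so some run-to-run differences can be harmless relabelling rather than a failed separation. The sketch below is written in Python with NumPy rather than the thread's MATLAB, does not call FastICA at all, and uses a hypothetical helper `match_components` plus a hand-made stand-in for one ICA result; it only illustrates how recovered components could be paired with known sources by absolute correlation.

```python
import numpy as np

def match_components(sources, components):
    """Greedily pair each source with the unused component of highest |correlation|.

    Both arguments have shape (n_signals, n_samples).
    Returns a list of (source_index, component_index, signed_correlation).
    """
    n = sources.shape[0]
    # Block of the joint correlation matrix: sources (rows) vs components (columns)
    corr = np.corrcoef(sources, components)[:n, n:]
    remaining = np.abs(corr)
    pairs = []
    for _ in range(n):
        i, j = np.unravel_index(np.argmax(remaining), remaining.shape)
        pairs.append((int(i), int(j), float(corr[i, j])))
        remaining[i, :] = -1.0  # this source is now taken
        remaining[:, j] = -1.0  # this component is now taken
    return sorted(pairs)

# The same three test signals as in the post
S = np.vstack([
    np.sin(np.linspace(0, 50, 1000)),
    np.cos(np.linspace(0, 37, 1000) + 5),
    np.sin(np.linspace(0, 20, 1000) + 10),
])

# Assumed stand-in for one ICA run: same sources, but reordered,
# sign-flipped and rescaled (this is NOT actual FastICA output)
ics = np.vstack([-2.0 * S[2], 0.5 * S[0], -S[1]])

pairs = match_components(S, ics)
for src, comp, r in pairs:
    print(f"source {src} <-> component {comp}, correlation {r:+.3f}")
```

If every source finds a partner with |correlation| close to 1 in each run, the runs agree up to relabelling; if the best correlations are well below 1 and change between runs, the separation itself is unstable.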
August 9th, 2012, 03:48 AM  #2 
Senior Member Joined: Aug 2012 Posts: 229 Thanks: 3  Re: Data analysis using Independent Component Analysis (ICA)
Hey ggyyree. You cannot actually prove from a sample that data is independent: independence has to be defined, or at least assumed. What you can do is test whether the data is correlated. Set up a hypothesis test with the null hypothesis that there is no correlation, i.e. that the covariance is 0, and see whether, at some significance level, the sample gives enough evidence to reject it. (You can equally test whether the correlation coefficient is statistically significantly different from 0.) The reason this does not establish independence is that, while independence implies zero covariance (and correlation), the converse does not hold: you can have a correlation and covariance of 0 and the variables can still be dependent. That is just the nature of probability and statistics with regard to samples: in many tests you can never prove that something "is" so, you can only gather enough evidence to decide whether to accept or reject a hypothesis.
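
The point that zero correlation does not imply independence can be seen in a small simulation. This is a minimal sketch in Python with NumPy (not the thread's MATLAB); the variables `x`, `y`, `z`, the sample size, and the seed are illustrative assumptions, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 100_000

x = rng.uniform(-1.0, 1.0, n)  # symmetric around zero
y = x**2                       # fully determined by x, hence dependent on it

# Linear correlation is close to zero because E[x * x^2] = E[x^3] = 0
r_linear = np.corrcoef(x, y)[0, 1]

# Correlating a nonlinear transform of x with y exposes the dependence
r_squares = np.corrcoef(x**2, y)[0, 1]

# For comparison: z is drawn independently of x, so even the
# transformed variables stay (nearly) uncorrelated
z = rng.uniform(-1.0, 1.0, n)
r_indep = np.corrcoef(x**2, z**2)[0, 1]

print(f"corr(x, y)       = {r_linear:+.4f}")
print(f"corr(x^2, y)     = {r_squares:+.4f}")
print(f"corr(x^2, z^2)   = {r_indep:+.4f}")
```

Checking correlations of nonlinear transforms such as squares is only a rough screen: a large value is evidence of dependence, but small values still do not prove independence.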
