[Pkg-exppsy-pynifti] [pynifti] NiftiDataset bug?

Michael Hanke michael.hanke at gmail.com
Tue Jan 13 07:18:36 UTC 2009


Hi Scott,

I hope I got the problem right (now). Let me reply here, since I think
everything below is heavily inter-dependent.

Let me try to summarize: I think the real issue you are talking about is
the one about nifti files which should have the same properties (e.g.
array shape), but don't. Now, you'd like pynifti to be able to deal
with such cases.

Now my laziness comes into play. I still think this is a case that
should not happen. In the case of the beta image example you typically
either have 3d volumes (one set of betas per file) or a single file with
all beta sets in a 4d array. In the former case you can simply do:

  betas1 = numpy.asarray([NiftiImage(filename).data for filename in file_list])

('load=True' is not required since the data will be automatically loaded
 when accessed)

which will result in a 4d image. In the latter case you just load the 4d
image directly and have a second set, e.g. betas2. If you now want to
merge them, you can do:

  merged = numpy.concatenate((betas1, betas2))

which will yield a new 4d array with the volumes from both sources.
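
If you also want the merged array back on disk as a single file,
something like this should do (an untested sketch; the output filename
is arbitrary, and the header is simply borrowed from the first input
file):

  merged_nim = NiftiImage(merged, NiftiImage(file_list[0]).header)
  merged_nim.save('merged.nii.gz')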

The other use case you are proposing is more complicated. Basically, you
say that you get _any_ array. You have no clue, not even basic
assumptions, about what it contains, but you want to come up with some
generic code to deal with any image, right?

In that case, you can always extend the data array to the maximum number
of dimensions to be able to treat them all in a uniform fashion. But
again, this is done with numpy itself, not pynifti:

  >>> import numpy as N
  >>> from nifti import NiftiImage
  >>> nim=NiftiImage('example4d.nii.gz')
  >>> nim.data.shape
  (2, 24, 96, 128)

  >>> N.array(nim.data, ndmin=7).shape
  (1, 1, 1, 2, 24, 96, 128)


This is BTW also an example of why pynifti reverses the order of the
axes: since additional dimensions are prepended, extending an array with
ndmin leaves the spatial axes exactly where they were.
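
If the goal is generic code that iterates over the volumes of any image,
no matter how many leading dimensions the file happens to have, one
possible sketch (plain numpy, not a pynifti feature; the filename is just
the example image again) is to collapse everything in front of the
spatial axes:

  import numpy
  from nifti import NiftiImage

  nim = NiftiImage('example4d.nii.gz')
  data = numpy.array(nim.data, ndmin=4)          # at least (t, z, y, x)
  vols = data.reshape((-1,) + data.shape[-3:])   # fold all leading axes into one
  for vol in vols:
      pass  # vol is always a single 3d (z, y, x) volume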

Anyway, in both examples NumPy provides simple ways to deal with
heterogeneous array shapes. Moreover, I believe that pynifti should
simply offer whatever a file contains -- and nothing more. If it is a 3d
volume, it should expose a 3d array. If it is a 4d image with just one
volume -- it should make a mental note about this strange image, but
then return a 4d array.

Even in the case where you have to deal with really strange images that
change the meaning of the axes, NumPy offers the `swapaxes` function to
fix it.
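
For example (just a sketch; the filename and the assumption that time
ended up in the last axis instead of the first are hypothetical):

  import numpy
  from nifti import NiftiImage

  nim = NiftiImage('strange.nii.gz')         # image with time stored last
  fixed = numpy.swapaxes(nim.data, 0, -1)    # move time back to the first axis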

With respect to a prospective NiftiImage method to modify the array
shape, I'd be glad to add it if it provides something that cannot be
done in two lines with plain numpy -- but ATM I don't see such a use case.
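
For what it's worth, such a helper would essentially boil down to the
following sketch (the name 'as4d' is made up here and not part of
pynifti):

  import numpy

  def as4d(nim):
      # Return the image data as a uniform 4d (t, z, y, x) array,
      # prepending singleton axes for plain 3d volumes.
      return numpy.array(nim.data, ndmin=4)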


Is this somewhat helpful, or does it even touch the core of your problem?


Michael



On Mon, Jan 12, 2009 at 07:08:27PM -0500, Scott wrote:
>
>
> Michael Hanke wrote:
>>
>> No. dim 3 images are shaped (z,y,x) !!
>>
>>   
> sorry, i knew i was gonna mix up the x,y,z's - for all below, of course  
> you are right about the dim names, but it's the dim sizes that I posit  
> are the problem, so let's assume those are correct.
>>> meaning I cannot directly concatenate dim3's and dim4's together 
>>> along  the time/samples dimension.  it is not a terrible flaw, just 
>>> one that  requires care and attention to avoid error.
>>>     
>> You can perfectly do that.
>>   
> if and only if i manually configure them all to be the right data shape,  
> every time i want to handle the array.  ie lets say i want to get all my  
> beta estimates from a series of FSL files, i should be able to just:
>
> d = numpy.asarray([NiftiImage(f, load=True).data for f in file_list])
>
> But if one of them had more than one time point (or just had dim[0]=4),
> this would fail; if all of them had an equal number of timepoints, the
> array would be shape=(#files, #times, z, y, x) instead of 4d, and if
> they had unequal numbers of timepoints this would throw an exception.
> Or, i could do something like:
>
> d = None
> for f in file_list:
>    nim = NiftiImage(f)
>    imd = nim.data.reshape(nim.timepoints, nim.volextents*)
>    if d is None:
>       d = imd
>    else:
>       d.extend(imd)
>
>    or,
>    imd = nim.data
>    if len(imd.shape) < 4:
>       imd = [imd]
>    d.extend(imd)
>
> which is obviously manageable, but more work than should be necessary  
> (especially for nifti noobs).  apologies for being liberal with the  
> list<->numpy.array syntax, i wish they were more consistent...
>
> i should also note that just keeping an array of [niftiimages] simply  
> postpones this problem, as any code will have to deal with the dimension  
> variability when it views the data.  For instance, if I want a function  
> to operate on every sample, the elegant way would be:
>
> for vol in nim.data:
>    some code...
>
> but I instead must:
>
> for s in range(nim.timepoints):
>    if nim.data.ndim==4:  # ignoring u,v,w for simplicity here, but this should be managed too
>       vol = nim.data[s]
>    else:
>       vol = nim.data   #Indexing data[s] here would extract a z slice!
>
>    some code with vol...
>
> hence my proposal of a method like NiftiImage.asNd(self, N=4, t=0, u=0,
> v=0, w=0), which can map the data array as necessary, specifically
> either to a 3d volume from any dim size, or to any dim size as required
> from a single 3d volume.  The implementation should be simple, just an
> extrapolation of what I've put above, but robust for all possible dim
> sizes (perhaps just reshape into 7d, index at the kwargs provided if
> necessary, and reshape to the last N dims).  Perhaps an implementation
> of extend would be useful too, so that NiftiImage objects can be
> directly concatenated.
>>
>> I guess we have found the main problem. There is no squeezing in
>> pynifti. The order of dimensions is simply reversed wrt the dim vector
>> in the header, but nothing else is done to them.
>>
>>   
> well, yes, reversed but not including the leading singletons, meaning  
> that every dim's position in the shape of the array is dependent on how  
> many leading singletons were ignored.  Even though you've never called  
> squeeze, the result is the same as if they had been squeezed out of the  
> reference 7d array.  That is, the order is not what's important here (i  
> know it's constant), it's which dimension is living in shape[0], whether  
> it's a (z,y,x), time, or another.
>> You always know: the last is always X, the one before is always Y, ....
>>
>>   
> I've never needed to know what's in shape[-1], i always need to know  
> what's in shape[0].  the point is not that i can't figure it out, it's  
> that I *must* figure it out every time I want to manipulate the array.   
> IMHO it's too easy to make a mistake doing this (or not doing it!) and
> end up with mismatched-array errors, or to mistake t and z, like the
> NiftiDataset bug which set slices as the number of samples.
>>>   So, let's say i want to get a beta estimate from a nifti  file, 
>>> precomputed from some glm package.  fsl writes each beta to a   
>>> single image, but freesurfer writes them all together, concatenated   
>>> along time.  So, in pseudocode-ish:
>>>
>>> def getBeta(im, range=0)
>>> nim = niftiimage(im, load=True)
>>> return nim.data[range]
>>>     
>> That should work perfectly.
>>
>>
>>   
> no, this only works if the first dimension is time, but not if it's a 3d  
> array - because then it would return the range in z instead of t!
>
> Also, to address your confusion below, I was unclear when I wrote  
> nim.data[range] above.  My intention was that range could be a number,  
> or a list, so that we can extract one or more samples from nim, and they  
> would all be returned together in 4d.  For clarity, I meant something  
> more like:
>
> def getBetas(im, timepoints=None):
>    nim = NiftiImage(im, load=True)   # Let's assume 4d, even if only 1 timepoint
>    if timepoints is None:
>       return nim.data
>    if not operator.isSequenceType(timepoints):
>       # If the index is always a sequence, we never lose a dim in indexing,
>       # preserving 4d so we can always expect a (samples x Z x Y x X) array back
>       timepoints = [timepoints]
>    return nim.data[timepoints]
>>> would be a simple way to handle this.  but it is flawed, since we 
>>> would  need to check first if samples were in the first dim or not.  
>>> every time  we want to deal with a niftiimage, we need something like
>>>     
>> No.
>>
>>   
>>> if nim.niftiheader['dim'][0]==3:#pray we don't have fewer than 3
>>>   return N.asarray([nim.data])#to force 4d so we never do this again
>>> return nim.data[range]
>>>     
>> Don't get this. Above returns you a 3d volume, but this would return a
>> 4d image -- is that intended?
>>
>>   
> yes, because we need to concatenate samples together along the 4th (or  
> rather, 1st of four) dimension, so everything needs to have that dim.   
> (alternatively, we can force everything to be 3-d, but then we must  
> first split any array with more than 1 sample in it before recombining  
> them all!)
>
>>>  for instance, lets say  you loaded a timecourse, did some math on it 
>>> which reduces to a single  stat, say the mean, or a beta weight/svm 
>>> weight, etc, then wrote this to  disk.  is the header set to dim[0]=3 
>>> or dim[0]=4, if it's kept the  original header to preserve 
>>> orientation, etc?  I feel like not every  program handles this in the 
>>> same way, which is perhaps one reason i've  seen this as a problem 
>>> (i've jumped through a lot of fmri software...).   I think some of my 
>>> nii's on disk actually have dim[0]=4 even when  there's only one 
>>> timepoint, and some set dim[0]=3.
>>>     
>> Let the code speak: ;-)
>>
>> Python 2.5.2 (r252:60911, Nov 14 2008, 19:46:32) [GCC 4.3.2] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> from nifti import *
>> >>> nim=NiftiImage('example4d')
>> >>> nim.header['dim']
>> [4, 128, 96, 24, 2, 1, 1, 1]
>> >>> m=nim.data.mean(axis=0)
>> >>> m.shape
>> (24, 96, 128)
>> >>> nim2=NiftiImage(m, nim.header)
>> >>> nim2.header['dim']
>> [3, 128, 96, 24, 1, 1, 1, 1]
>>
>>
>> That is how it should go (IMHO).
>>
>>
>> Michael
>>
>>   
> Yes - but this works because m itself was reduced to 3 dimensions when
> mean collapsed axis=0.  perhaps a function doesn't
> reduce the dims, like if we're iterating over sets and we don't know how  
> many there are, something like [fun(vol) for vol in nim.data].  The  
> behavior would be more like:
>
> >>> r = numpy.random.rand(2, 3, 4)
> >>> r.shape
> (2, 3, 4)
>
> >>> nim = NiftiImage(r) # As above
> >>> nim.data.shape
> (2, 3, 4)
> >>> nim.header['dim']
> [3, 4, 3, 2, 1, 1, 1, 1]
>
> >>> nim2 = NiftiImage(numpy.asarray([r]))    # 4d with 1 timepoint
> >>> nim2.header['dim']
> [4, 4, 3, 2, 1, 1, 1, 1]
> >>> nim2.data.shape
> (1, 2, 3, 4)
>
> >>> nim3 = NiftiImage(numpy.asarray([r]), header=nim.header)  # Fed 3d header
> >>> nim3.data.shape
> (1, 2, 3, 4)
> >>> nim3.header['dim']   # Oops! The dims are still inconsistent with nim and r
> [4, 4, 3, 2, 1, 1, 1, 1]
>
> Now we have a 4d image with 1 timepoint, even if we fed it the correct  
> header info!  I know this is by design, not a bug - but I think it's  
> important to be clear that this will cause other bugs if a programmer is  
> not careful.  If you're still unclear what the problem is, try writing
> nim and nim3 to disk, loading them, and then working with them together
> without examining the header or the shape of the array, and without your
> code assuming which one has which shape.
>
> Even if pynifti handles all of this perfectly (should you decide there  
> is a perfect way to handle this...), there is still no guarantee that  
> other software will handle singleton dims in the same manner - hence why  
> I think pynifti should have a proactive way to deal with data in the  
> shape required by any given function.

-- 
GPG key:  1024D/3144BE0F Michael Hanke
http://apsy.gse.uni-magdeburg.de/hanke
ICQ: 48230050


