Saturday, March 31, 2018

Slow HDF5 progress

I've been struggling with DLLCC and found another bizarre behavior, which I reported on S.O.: http://stackoverflow.com/questions/49564024/isnt-pointer-type-checking-disabled-in-dll-c-connect-and-is-that-ok. I then took some time to reread the F... manual.

I think I have a deeper understanding of DLLCC now, and I published a new patch on the public store that uses double dispatching. This makes it possible to pass a Smallapack.CArrayAccessor as an argument to an external function call. What is this thing (the name is not so nice)? It's a proxy to some C data (on the heap), somewhat like a CPointer (a CType plus a pointer to C data), but with two differences:
  1. it carries a length, and thus enables safe access to the contents from within the Smalltalk world. If you think about it for two minutes, every pointer should be bounded; that's clearly the way to go. Alas, in 2018 we are still at assembler level with these C APIs, so pointers coming from the external world are still unbounded (a pity for safety...).
  2. it uses 1-based indexing, so it can be handled from within the Smalltalk world like any other collection, without too much head-scratching.
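
To make the two differences concrete, here is a minimal sketch of the same idea in Python with ctypes rather than Smalltalk. It is not the Smallapack implementation: the class name BoundedCArray and the methods at and at_put are hypothetical, chosen only to echo Smalltalk's at: and at:put:.

```python
import ctypes


class BoundedCArray:
    """Hypothetical stand-in for a bounded, 1-based proxy over C memory."""

    def __init__(self, pointer, length):
        self._pointer = pointer  # ctypes pointer to the first element
        self._length = length    # element count that the raw pointer cannot provide

    def __len__(self):
        return self._length

    def _offset(self, index):
        # Translate a 1-based index into a 0-based offset, refusing out-of-bounds access.
        if not 1 <= index <= self._length:
            raise IndexError(f"index {index} is outside 1..{self._length}")
        return index - 1

    def at(self, index):
        return self._pointer[self._offset(index)]

    def at_put(self, index, value):
        self._pointer[self._offset(index)] = value

    def __iter__(self):
        # Iterate like any other collection, never reading past the known length.
        return (self._pointer[i] for i in range(self._length))


# Three C doubles seen through a bare pointer, as an external API would hand them back.
buffer = (ctypes.c_double * 3)(1.5, 2.5, 3.5)
raw = ctypes.cast(buffer, ctypes.POINTER(ctypes.c_double))

accessor = BoundedCArray(raw, 3)
accessor.at_put(3, 9.0)
values = list(accessor)
```

The bare pointer raw would happily read raw[100]; the wrapper raises IndexError for any index outside 1..3, which is the whole point of carrying the length along.
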
With the time spent on DLLCC, progress on HDF5 was slow: I only corrected the H5TString transfer. The documentation was not clear about fixed-size null-terminated strings. I first thought (or read?) that the terminating null was mandatory. But then the type H5T_C_S1 - a C type string of fixed size 1 - makes no sense! So I speculated that an extra byte was allocated for the terminator... This was wrong! If not enough room is allocated, then the null terminator is simply omitted. The difference between NullTerminated and NullPadded HDF5 strings is then relevant: the former ends before the first null, the latter at the last non-null byte, but in neither case is a null mandatory...
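
The rules described above can be sketched on plain byte strings. This follows my reading of the behavior as stated in this post, not the HDF5 reference manual, and the three function names are hypothetical (nothing here calls the HDF5 library).

```python
def encode_fixed(text: bytes, size: int) -> bytes:
    """Store text in a fixed-size field: truncate if needed, fill the rest with nulls.

    When text fills the whole field, no null is written at all.
    """
    return text[:size].ljust(size, b"\x00")


def decode_null_terminated(raw: bytes) -> bytes:
    """Content ends just before the first null; with no null, the whole field is content."""
    end = raw.find(b"\x00")
    return raw if end == -1 else raw[:end]


def decode_null_padded(raw: bytes) -> bytes:
    """Content ends at the last non-null byte; nulls embedded before it are kept."""
    return raw.rstrip(b"\x00")


# A field of size 1, as with H5T_C_S1: the single byte is content, no terminator fits.
one_char = encode_fixed(b"A", 1)

# The two conventions only disagree when a null sits in the middle of the field.
field = b"a\x00b\x00"
terminated = decode_null_terminated(field)
padded = decode_null_padded(field)
```

Under these assumptions, terminated is b"a" and padded is b"a\x00b", while a completely full field such as b"abcd" decodes to itself under both conventions.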

Along with the CArrayAccessor change, I can now explore the contents of an HDF5 file created by Matlab save -v7.3. A small step in the right direction.
