XLSReadWriteII 5 migration issues
Posted: Thu Jan 03, 2013 1:34 pm
Hello,
We have just purchased version 5 of XLSReadWriteII for the promises in the release notes about memory usage and speed improvements are what we are looking for. However, the migration is definitely not easy and this post will be quite lengthy.
For any of the changes I mention here, I can provide a diff file if need be as we store all our sources in SVN.
Here are the remarks:
- The fact that all files have been renamed means it's extremely hard to keep track of the changes we had made to version 4. And we did quite a few changes because we ported it to x64, fixed a few warnings and bugs. I know you intend on allowing to have all versions alongside each other, but that's of little use to us and thus very annoying. We use branches for that kind of things as it's much more flexible.
- As indicated by another member, the $I statements for all BIFF_ files are bogus, looks like the BIFF_ files were meant to be placed in their own subfolder.
- We have a "zero warning / zero hint" policy here and we expect third party vendors to have the same. However, in this case we had to modify 6 files for various unused variables, return value not set and suspicious casts.
- BIFF_EncodeFormulaII5 uses an ugly construct for its TTokenName constant where it could simply use array[TTokenType] of string
- BIFF_Utils5 and XLSHTMLParse5 both define the XLSGetHashCode function, which I found out because that function is NOT x64 compatible. In version 4 I had to fix it once, but now I had two places to apply the change. It is not compatible with x64 because the calling convention is not the same, the parameters are not in the same registers.
- Despite what the announcement email claims, version 5 is not compatible with x64 because it systematically casts pointers to integers. This MUST NOT be done and has to be replaced by casts to NativeUInt, or NativeInt when negative offsets may arise. I had to apply this type of change to 29 files just like I did for version 4, but I did not expect those bugs to still be here in version 5
- TXLSReadII.LoadFromStream inside BIFF_ReadII5 should let EAbort escape instead of reraising it as Exception. EAbort is meant to be silent, the current construct means it is replaced by a non silent exception. Getting EAbort here is quite simple, create a password handler that raises it when the user has decided to click "Cancel" on the dialog he has been presented.
- TXLSWorkbook is missing the SheetByName method, but it was easy to add it.
- TXLSWorkbook is missing a way to insert a new sheet at a given index. In v4, one wrote XLS.Sheets.Insert(Index) but now it's impossible and I could not find a trivial way to code it.
- TXc12DefinedNames uses a hash table to speed up the lookup of values, which is a good idea. However, this hash is almost always empty when reading an existing Excel file because names are added via the Add method that takes no parameters. The issue I encountered came from TXLSNames not being able to find a named cell area, so I fixed it by adding values to the hash table inside the TXLSNames.AfterRead method. I believe this issue should be looked at in more details because there are many cases when the hash is empty despite the list containing elements.
After all those changes I was able to compile our unit tests suites and run the basic ones. I'll move on to more in depth testing, especially related to memory usage and read speed, I hope the two days I have already spent were worth it...
Regards
We have just purchased version 5 of XLSReadWriteII for the promises in the release notes about memory usage and speed improvements are what we are looking for. However, the migration is definitely not easy and this post will be quite lengthy.
For any of the changes I mention here, I can provide a diff file if need be as we store all our sources in SVN.
Here are the remarks:
- The fact that all files have been renamed means it's extremely hard to keep track of the changes we had made to version 4. And we did quite a few changes because we ported it to x64, fixed a few warnings and bugs. I know you intend on allowing to have all versions alongside each other, but that's of little use to us and thus very annoying. We use branches for that kind of things as it's much more flexible.
- As indicated by another member, the $I statements for all BIFF_ files are bogus, looks like the BIFF_ files were meant to be placed in their own subfolder.
- We have a "zero warning / zero hint" policy here and we expect third party vendors to have the same. However, in this case we had to modify 6 files for various unused variables, return value not set and suspicious casts.
- BIFF_EncodeFormulaII5 uses an ugly construct for its TTokenName constant where it could simply use array[TTokenType] of string
- BIFF_Utils5 and XLSHTMLParse5 both define the XLSGetHashCode function, which I found out because that function is NOT x64 compatible. In version 4 I had to fix it once, but now I had two places to apply the change. It is not compatible with x64 because the calling convention is not the same, the parameters are not in the same registers.
- Despite what the announcement email claims, version 5 is not compatible with x64 because it systematically casts pointers to integers. This MUST NOT be done and has to be replaced by casts to NativeUInt, or NativeInt when negative offsets may arise. I had to apply this type of change to 29 files just like I did for version 4, but I did not expect those bugs to still be here in version 5
- TXLSReadII.LoadFromStream inside BIFF_ReadII5 should let EAbort escape instead of reraising it as Exception. EAbort is meant to be silent, the current construct means it is replaced by a non silent exception. Getting EAbort here is quite simple, create a password handler that raises it when the user has decided to click "Cancel" on the dialog he has been presented.
- TXLSWorkbook is missing the SheetByName method, but it was easy to add it.
- TXLSWorkbook is missing a way to insert a new sheet at a given index. In v4, one wrote XLS.Sheets.Insert(Index) but now it's impossible and I could not find a trivial way to code it.
- TXc12DefinedNames uses a hash table to speed up the lookup of values, which is a good idea. However, this hash is almost always empty when reading an existing Excel file because names are added via the Add method that takes no parameters. The issue I encountered came from TXLSNames not being able to find a named cell area, so I fixed it by adding values to the hash table inside the TXLSNames.AfterRead method. I believe this issue should be looked at in more details because there are many cases when the hash is empty despite the list containing elements.
After all those changes I was able to compile our unit tests suites and run the basic ones. I'll move on to more in depth testing, especially related to memory usage and read speed, I hope the two days I have already spent were worth it...
Regards