Re: Future of indexing in Autopsy and Sleuthkit
From: Matt Bergen <MBERGE(at)state.wy.us>
Date: Thu May 22 2003 - 12:29:17 EDT
Matt Bergen
>>> "Simson L. Garfinkel" <simsong@lcs.mit.edu> 05/22/03 09:27AM >>>
Here are some issues you may not have considered:
> alphanumeric characters instead of the current limitation of all
If you limit the index to printable ASCII characters, there will be problems for people outside the US (or people working with data from outside the US). You need to be able to handle Roman characters with accents, which are normally represented by bytes with the high bit set. If the user searches for an e, they probably want to match è and é, and possibly other e's as well. Then you have the issue of Arabic, Hebrew, and 16-bit characters.
At a minimum, I think that you should transparently handle codepages
and coerce them into 7-bit ASCII. But ideally you should handle
UNICODE, UTF-8, UTF-16, etc. Or do something for Arabic.
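As a rough illustration of the "coerce into 7-bit ASCII" idea, here is a minimal Python sketch. It is not code from Autopsy or Sleuthkit. The function names `fold_to_ascii` and `matches`, and the default codepage `cp1252`, are assumptions chosen for the example. It decodes bytes from a codepage, strips accents through Unicode decomposition, and replaces anything that still has no ASCII form (Arabic or Hebrew, for instance) with `?`, which is exactly why the fuller Unicode handling is the better long-term goal.

```python
import unicodedata


def fold_to_ascii(raw: bytes, codepage: str = "cp1252") -> str:
    """Decode bytes from a codepage and fold accented letters to plain ASCII.

    Hypothetical helper for illustration; the codepage default is an assumption.
    """
    # Bytes that are undefined in the codepage become U+FFFD instead of raising.
    text = raw.decode(codepage, errors="replace")
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that e-grave and e-acute both become "e".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Whatever is still outside ASCII (Arabic, Hebrew, ...) turns into "?".
    return stripped.encode("ascii", errors="replace").decode("ascii")


def matches(query: str, raw: bytes, codepage: str = "cp1252") -> bool:
    """Return True if the folded query occurs in the folded data."""
    folded_query = fold_to_ascii(query.encode("utf-8"), "utf-8").lower()
    return folded_query in fold_to_ascii(raw, codepage).lower()
```

With this folding, a search for `resume` would also hit data containing `résumé` stored in a Latin codepage, at the cost of losing the distinction between the accented and unaccented forms.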
> text-format file). The consequences are the following:
I do not think that this is important. The index files should be in binary; create a tool to browse or view them.
This list is provided by the SecurityFocus ARIS analyzer service. For more information on this free incident handling, management and tracking system please see: http://aris.securityfocus.com
Received on Thu May 22 15:28:04 2003
This archive was generated by hypermail 2.1.8 : Wed Aug 23 2006 - 14:01:44 EDT