RE: Sequence Identification Routines?
From: Dawes, Rogan (ZA - Johannesburg) <rdawes(at)deloitte.co.za>
Date: Tue Dec 10 2002 - 11:23:54 EST

It then prints each token in sequence and calculates the difference between the preceding token's "value" and its own. You could then graph the results to help determine the level of randomness.

The fun part about this script is that it looks explicitly at the input that YOU give it, so the more input you supply, the better and more accurate its calculations will be. Also, tokens such as AAAAAAAAAA
I would be very interested in seeing the results of this plugged into something like Michal Zalewski's strange attractors graphs. I have seen some references to a similar approach, using a package called OpenQVis, but have not had time to play with it yet.

Obvious problems:

This generates VERY large numbers, depending on the character set, and the length of the token. Differences can therefore also be quite large. Graphing that on a graph that makes any kind of sense is non-trivial, I think. Not being a statistician, of course!

Ways of visualising the results:

Sort the token values, and plot them on a graph. One should ideally see a "straight line" graph, most likely sparsely populated.

Sort the differences and plot them. One should again see a straight line graph, most likely sparsely populated.

Any deviations from a straight line could indicate somewhat non-random behaviour. This is not to say that it can help you predict what is coming next, but it can show flaws in the generator.

Alternatively, as someone mentioned diehard, take the integer values, break them back into bytes, write them out as a byte stream, and use that as input to diehard for extensive analysis. I must say, when I used diehard, I was pretty much unable to evaluate what it was telling me, as I have no idea what the tests that it is running mean! :-)

Have fun.

Rogan

P.S. Any suggestions for improvements, especially performance, and analysis, please send them my way, and I'll see what I can do.

P.P.S. FWIW, I typically do something like:
for i in `seq 1 1000` ; do
Post-process the cookies to get just the "crumbs" :-), then run them through the analysis below.

Does anyone know of a tool that would automatically use Keep-Alives to speed something like this up, if available, but would fall back to opening a new connection for each request when not?
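For the post-processing step, here is a minimal Perl sketch of pulling a "crumb" out of saved response headers. The fetch loop above is cut off in this archive, so how the headers were captured is unknown; the function name crumb, the cookie name SESSIONID and the sample headers are all made up for illustration.

```perl
use strict;
use warnings;

# Return the value of the first matching Set-Cookie header in a block of
# response headers, or nothing if there is none. Attributes such as
# "path=/" are dropped. If $name is omitted, the first cookie wins.
sub crumb {
    my ($headers, $name) = @_;
    for my $line (split /\r?\n/, $headers) {
        next unless $line =~ /^Set-Cookie:\s*([^=;\s]+)=([^;]*)/i;
        return $2 if !defined $name || $1 eq $name;
    }
    return;
}

# Hypothetical sample response, not taken from any real application.
my $sample = "HTTP/1.1 200 OK\r\nSet-Cookie: SESSIONID=a1b2c3; path=/\r\n\r\n";
print crumb($sample, 'SESSIONID'), "\n";
```

Writing one crumb per line to a file gives input in the shape the script below appears to expect (one token per line on standard input).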
0 $ cat charset.pl
use strict;
use Math::BigInt;   # needed for the Math::BigInt->new calls below
my $verbose=0;
my %chars=();
my @charpos=();
my @cookies=();
while (my $line=<>) {
if ($verbose) {
print "\nOverall Distribution is :\n";
foreach my $char (sort keys %chars) {
my @charset=();
if ($verbose) {
print "\n\nPositional distribution is as follows:\n\n\n";
}
for (my $i=0; $i<=$#charpos; $i++) {
if ($verbose) {
print "\nDistribution is :\n";
foreach my $char (sort keys %$chars) {
print "$char : ",$chars->{$char},"\n";
}
}
my $prev=Math::BigInt->new("0");
while (my $cookie=shift @cookies) {
my $value=undef;
my $base=undef;
my $total=Math::BigInt->new("0");
for ( my $p=0; $p < length($cookie); $p++) {
if (defined $base) { $total*=$base; }
($value,$base)=charval(substr($cookie,$p,1),$charset[$p]);
$total+=$value;
exit;
sub charval {
return (index($charset,$char),length($charset));
}
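Since the listing above is incomplete in this archive, here is a self-contained sketch of the same idea: build a character set per token position from the observed tokens, treat each token as a mixed-radix number, and print the difference from the previous token. It is not the original script. The names build_charsets and token_value and the sample tokens are made up, and this sketch scales the running total by the current position's radix (ordinary Horner evaluation), whereas the listing scales by the base returned for the previous position.

```perl
use strict;
use warnings;
use Math::BigInt;

# Build one sorted character set per token position from the observed tokens.
sub build_charsets {
    my @tokens = @_;
    my @seen;                                  # $seen[$p]{$char} = 1
    for my $token (@tokens) {
        my @c = split //, $token;
        $seen[$_]{ $c[$_] } = 1 for 0 .. $#c;
    }
    return map { join('', sort keys %$_) } @seen;
}

# Treat a token as a mixed-radix number: the radix at each position is the
# size of the character set observed at that position.
sub token_value {
    my ($token, @charsets) = @_;
    my $total = Math::BigInt->new(0);
    for my $p (0 .. length($token) - 1) {
        my $set = $charsets[$p];
        $total->bmul(length $set);
        $total->badd(index($set, substr($token, $p, 1)));
    }
    return $total;
}

my @tokens   = qw(aab abc abd bca);            # made-up sample data
my @charsets = build_charsets(@tokens);

my (@diffs, $prev);
for my $token (@tokens) {
    my $value = token_value($token, @charsets);
    if (defined $prev) {
        push @diffs, $value - $prev;
        print "$token\t$value\t$diffs[-1]\n";
    } else {
        print "$token\t$value\n";
    }
    $prev = $value;
}

# Sorted differences, ready to plot as suggested earlier in the message.
my @sorted = sort { $a <=> $b } @diffs;
print "sorted differences: @sorted\n";
```

With the four sample tokens the per-position sets are "ab", "abc" and "abcd", so the values come out as 1, 6, 7 and 20, with differences 5, 1 and 13. For real tokens the values grow quickly, which is why Math::BigInt is used throughout.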
-----Original Message-----
I was hoping one of you might have some input here. I am black-box testing a web app that generates a 5-character verification string (lowercase letters and numbers only) and emails it to the address on file; the recipient then has to type it in to continue with registration.

I am looking for programming routines, snippets, or programs that will look at a set of, say, 1000 of these strings and tell me whether there is any sensible pattern from which to predict the next 5-character string in the sequence. Any suggestions welcome!
Thanks,
This archive was generated by hypermail 2.1.8 : Wed Aug 23 2006 - 14:07:46 EDT