Parsing of variable-length packets with sub-elements

Technology
Super Moderators
Super Moderators
Posts: 801
Joined: 06 May 2008, 12:47
Noob?: No

Parsing of variable-length packets with sub-elements

#1 Post by Technology »

I'm not sure how useful this will be, since things will change with AI 2008,
but since this part will be rewritten for AI 2008 anyway, it might be an interesting idea.

So, I was working on mail/auction support and got an idea to optimize some code.
My first idea was to make Perl execute fewer function calls and thereby increase speed (fewer unpack and substr calls, less code, ...).

FROM (note: @mail is a global variable)

Code: Select all

my $j = 0;
for (my $i = 8; $i < $args->{RAW_MSG_SIZE}; $i+=73) {
	$mail[$j]{mailID} = unpack("V1", substr($args->{RAW_MSG}, $i, 4));
	$mail[$j]{title} = unpack("a40", substr($args->{RAW_MSG}, $i+4, 40));
	$mail[$j]{read} = unpack("C1", substr($args->{RAW_MSG}, $i+44, 1));
	$mail[$j]{sender} = unpack("Z24", substr($args->{RAW_MSG}, $i+45, 24));
	$mail[$j]{timestamp} = unpack("V1", substr($args->{RAW_MSG}, $i+69, 4));
	$j++;
}
TO

Code: Select all

my @keys = qw(ID title read sender timestamp);
my $j = 0;
for (my $i = 8; $i < $args->{RAW_MSG_SIZE}; $i += 73) {
	my @unpacked_data = unpack("V1 a40 C1 Z24 V1", substr($args->{RAW_MSG}, $i, 73));
	foreach my $key (@keys) {
		$mail[$j]{$key} = shift @unpacked_data;
	}
	$j++;
}
(The difference between the auction counterparts is even more significant.)
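
The single-unpack version can be tightened a little further with a hash slice, which removes the inner foreach/shift loop. The following is only a sketch on synthetic data: parse_mail_records is a hypothetical helper (not an OpenKore function), it returns the records instead of writing to the global @mail, and the 8-byte header is assumed to be switch, length and amount.

```perl
use strict;
use warnings;

# Record layout from the post: mail ID, title, read flag, sender, timestamp.
my $RECORD_TEMPLATE = 'V a40 C Z24 V';
my $RECORD_SIZE     = 73;    # 4 + 40 + 1 + 24 + 4 bytes
my @KEYS            = qw(ID title read sender timestamp);

# parse_mail_records is a hypothetical helper, not an OpenKore function.
sub parse_mail_records {
    my ($raw, $offset) = @_;
    my @records;
    for (my $i = $offset; $i + $RECORD_SIZE <= length($raw); $i += $RECORD_SIZE) {
        my %record;
        # One unpack call fills all five fields through a hash slice.
        @record{@KEYS} = unpack($RECORD_TEMPLATE, substr($raw, $i, $RECORD_SIZE));
        push @records, \%record;
    }
    return \@records;
}

# Synthetic packet: 8 header bytes (assumed layout) followed by two records.
my $raw = pack('v v V', 0x0240, 8 + 2 * $RECORD_SIZE, 2)
        . pack($RECORD_TEMPLATE, 1, 'Hello', 0, 'alice', 1245980226)
        . pack($RECORD_TEMPLATE, 2, 'Re: Hello', 1, 'bob', 1245980300);

my $mail = parse_mail_records($raw, 8);
printf "mail %d from %s\n", $_->{ID}, $_->{sender} for @$mail;
```

Note that "a40" keeps the NUL padding of the title, while "Z24" stops at the first NUL, so the title still needs trimming before display.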

Then I started thinking: why not do this for all variable-length packets that contain sub-elements?
That led to the next idea: why not write one function that does basically the same for all of these packets?
And then another idea arose: why not integrate this straight into the parse function?

Code: Select all

...
if ($handler->[1]) {
	my @unpacked_data = unpack("x2 $handler->[1]", $msg);
	foreach my $key (@{$handler->[2]}) {
		$args{$key} = shift @unpacked_data;
	}
}
if($handler->[3]) {
	my $j = 0;
	undef @{$handler->[3]};
	for (my $i = $handler->[6]; $i < $args{RAW_MSG_SIZE}; $i += $handler->[7]) {
		my @unpacked_data = unpack($handler->[4], substr($msg, $i, $handler->[7]));
		foreach my $key (@{$handler->[5]}) {
			$handler->[3]->[$j]{$key} = shift @unpacked_data;
		}
		$j++;
	}
}
...
But that means we need to store more information in $self{packet_list}:

Code: Select all

...
'0240' => ['mail_refreshinbox', 'v1 V1', [qw(size amount)], \@mail, 'V1 a40 C1 Z24 V1', [qw(ID title read sender timestamp)], 8, 73],
...
However, here comes the good part, which is actually just a side effect. ;)

Code like the following will no longer be necessary, and we will not even have to write a sub for every packet, unless we want to display something or pass on/adjust data.

Code: Select all

if ($switch eq '0123') {
	$psize = 10;
} elsif ($switch eq '02E9') {
	$psize = 22;
} else {
	$psize = 18;
}
for (my $i = 4; $i < $msg_size; $i += $psize) {
We move part of the serverType dependency from the subs into $self{packet_list}.
Overriding a packet entry in $self{packet_list} is also more convenient than overriding a sub when a serverType differs from serverType0.
We will need a lot less code.
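
To make the table-driven idea concrete, here is a minimal standalone sketch. It is not OpenKore's actual parser: %packet_list, parse_packet and the "mail" list key are hypothetical, the list is returned under a key instead of being written to a global array, and the header is assumed to be switch (2 bytes), length (2 bytes) and amount (4 bytes).

```perl
use strict;
use warnings;

# Hypothetical packet table in the shape proposed above:
# [name, header template, header keys, list key, item template, item keys,
#  list offset, item size].
my %packet_list = (
    '0240' => ['mail_refreshinbox', 'v V', [qw(size amount)],
               'mail', 'V a40 C Z24 V', [qw(ID title read sender timestamp)], 8, 73],
);

sub parse_packet {
    my ($msg) = @_;
    my $switch  = sprintf('%04X', unpack('v', $msg));
    my $handler = $packet_list{$switch} or return;    # unknown packet
    my ($name, $head_tpl, $head_keys, $list_key,
        $item_tpl, $item_keys, $start, $size) = @$handler;

    my %args = (switch => $switch, name => $name, RAW_MSG_SIZE => length($msg));
    # Skip the 2-byte switch, then read the fixed header fields.
    @args{@$head_keys} = unpack("x2 $head_tpl", $msg);

    # Read the fixed-size items that follow the header.
    my @items;
    for (my $i = $start; $i + $size <= length($msg); $i += $size) {
        my %item;
        @item{@$item_keys} = unpack($item_tpl, substr($msg, $i, $size));
        push @items, \%item;
    }
    $args{$list_key} = \@items;
    return \%args;
}

# Synthetic message with one mail record.
my $body = pack('V a40 C Z24 V', 7, 'Auction won', 0, 'xkore', 1245980226);
my $msg  = pack('v v V', 0x0240, 8 + length($body), 1) . $body;
my $args = parse_packet($msg);
print "$args->{name}: $args->{amount} mail(s), first from $args->{mail}[0]{sender}\n";
```

With this shape, supporting another serverType would only mean replacing the table entry for that switch; parse_packet itself stays unchanged.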

There's always room for improvement.
The first thing I can think of is to allow not only storing into a global variable,
but also just passing a reference to the data.
That way we can extract information from our data and eventually pass that information on to the environment.
(note: data = everything, information = the useful data)
Secondly, we could support decryption directly in the parse function too, and add a parameter for it in $self{packet_list}.

Packets becoming configurable per serverType, that's how I see it.
One ST0 to rule them all? One PE viewer to find them!
One ST_kRO to bring them all and in the darkness bind them...

Mount Doom awaits us, fellowship of OpenKore!

kLabMouse
Administrator
Administrator
Posts: 1301
Joined: 24 Apr 2008, 12:02

Re: Parsing of variable-length packets with sub-elements

#2 Post by kLabMouse »

Technology wrote: Packets becoming configurable per serverType, that's how I see it.
Very nice. I was thinking about this a long time ago.
We have two types of packets: fixed length and dynamic length.
The first one is easy.
The second one needs a better structure than the first, because it can have nested structures inside.
Also, the number of structures inside can be determined in different ways (I have seen N derived from the packet length, and I have seen N computed by a formula).
So the system needs to support both kinds of nested structures.

P.S. And yes, forget about reading the packet next to ours! There will be no such thing!

#3 Post by Technology »

A variable-length message has size 0 in recvpackets.txt and is at least 4 bytes long in the buffer.

Code: Select all

} elsif (defined($size) && $size == 0) {
	# Variable length message.
	if (length($$buffer) >= 4) {
		$size = unpack("v", substr($$buffer, 2, 2));
		if (length($$buffer) >= $size) {
			$result = substr($$buffer, 0, $size);
			substr($$buffer, 0, $size, '');
			$$type = KNOWN_MESSAGE;
		}
	}
}
This is how the readNext function determines the length of a variable-length message:

Code: Select all

$size = unpack("v", substr($$buffer, 2, 2));
Basically, the length of the message comes right after the packet switch.
But yes, what happens if other data follows the nested structure?
If that is the case somewhere, we will need to use the "amount" parameter that we get inside the packet to determine the number of nested structures,
and we will also need a part that parses the data after the nested structure.
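
As an illustration of that last point, here is a small sketch that loops on the amount field and then reads a field that comes after the list. The packet layout, the switch 0x0FFF and the name parse_counted_list are all made up for this example; they do not describe a real RO packet.

```perl
use strict;
use warnings;

# Hypothetical layout, for illustration only:
#   v switch, v length, v amount, amount x (V id, Z8 name), V trailer
sub parse_counted_list {
    my ($msg) = @_;
    my ($switch, $length, $amount) = unpack('v v v', $msg);
    my $item_size = 12;                 # 4 + 8 bytes per item
    my $pos       = 6;                  # first byte after the amount field

    my @items;
    for (1 .. $amount) {                # loop on the count, not on the packet size
        die "truncated packet\n" if $pos + $item_size > length($msg);
        my %item;
        @item{qw(id name)} = unpack('V Z8', substr($msg, $pos, $item_size));
        push @items, \%item;
        $pos += $item_size;
    }
    # Whatever follows the list is still available for parsing.
    my ($trailer) = unpack('V', substr($msg, $pos, 4));
    return { amount => $amount, items => \@items, trailer => $trailer };
}

my $items = pack('V Z8', 1, 'potion') . pack('V Z8', 2, 'arrow');
my $msg   = pack('v v v', 0x0FFF, 6 + length($items) + 4, 2) . $items . pack('V', 99);
my $out   = parse_counted_list($msg);
print "items=", scalar @{ $out->{items} }, " trailer=$out->{trailer}\n";
```

A loop bounded by RAW_MSG_SIZE would have tried to read the 4 trailer bytes as the start of a third item here.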

Code: Select all

0069 0
006B 0
0075 0
008C 0
008D 0
008E 0
0096 0
0097 0
0099 0
009A 0
00A3 0
00A4 0
00A5 0
00A6 0
00AE 0
00B4 0
00B7 0
00C6 0
00C7 0
00C8 0
00C9 0
00D4 0
00D5 0
00D7 0
00DB 0
00DE 0
00DF 0
00FB 0
0108 0
0109 0
010F 0
0122 0
0123 0
012F 0
0133 0
0134 0
0136 0
014C 0
0152 0
0153 0
0154 0
0155 0
0156 0
0158 0
0160 0
0161 0
0162 0
0163 0
0164 0
0166 0
0174 0
0177 0
017B 0
017E 0
017F 0
018D 0
019C 0
01A6 0
01AD 0
01B2 0
01C3 0
01D5 0
01DC 0
01EE 0
01EF 0
01F0 0
01F1 0
01FC 0
0201 0
020D 0
0221 0
0235 0
0240 0
0242 0
0248 0
0252 0
025A 0
027A 0
027E 0
0287 0
0295 0
0296 0
0297 0
029D 0
02B1 0
02B2 0
02B5 0
02C1 0
02C2 0
02D0 0
02D1 0
02D2 0
02D7 0
02DB 0
02DC 0
02E7 0
02E8 0
02E9 0
02EA 0
02F3 0
02F4 0
02F5 0
02F6 0
02F7 0
02F8 0
02F9 0
02FA 0
02FB 0
02FC 0
02FD 0
02FE 0
02FF 0
0300 0
0301 0
0302 0
0303 0
0304 0
0305 0
0306 0
0307 0
0308 0
0309 0
030A 0
030B 0
030C 0
030D 0
030E 0
030F 0
0310 0
0311 0
0312 0
0313 0
0314 0
0315 0
0316 0
0317 0
0318 0
0319 0
031A 0
031B 0
031C 0
031D 0
031E 0
031F 0
0320 0
0321 0
0322 0
0323 0
0324 0
0325 0
0326 0
0327 0
0328 0
0329 0
032A 0
032B 0
032C 0
032D 0
032E 0
032F 0
0330 0
0331 0
0332 0
0333 0
0334 0
0335 0
0336 0
0337 0
0338 0
0339 0
033A 0
033B 0
033C 0
033D 0
033E 0
033F 0
0340 0
0341 0
0342 0
0343 0
0344 0
0345 0
0346 0
0347 0
0348 0
0349 0
034A 0
034B 0
034C 0
034D 0
034E 0
034F 0
0350 0
0351 0
0352 0
0353 0
0354 0
0355 0
0356 0
0357 0
0358 0
0359 0
035A 0
035B 0
035D 0
035F 0
0360 0
0361 0
0362 0
0363 0
0364 0
0365 0
0366 0
0367 0
0368 0
0369 0
036A 0
036B 0
036C 0
036D 0
036E 0
036F 0
0370 0
0371 0
0372 0
0373 0
0374 0
0375 0
0376 0
0377 0
0378 0
0379 0
037A 0
037B 0
037C 0
037D 0
037E 0
037F 0
0380 0
0381 0
0382 0
0383 0
0384 0
0385 0
0386 0
0387 0
0388 0
0389 0
038A 0
038B 0
038C 0
038D 0
038E 0
038F 0
0390 0
0391 0
0392 0
0393 0
0394 0
0395 0
0396 0
0397 0
0398 0
0399 0
039A 0
039B 0
039C 0
039D 0
039E 0
039F 0
03A0 0
03A1 0
03A2 0
03A3 0
03A4 0
03A5 0
03A6 0
03A7 0
03A8 0
03A9 0
03AA 0
03AB 0
03AC 0
03AD 0
03AE 0
03AF 0
03B0 0
03B1 0
03B2 0
03B3 0
03B4 0
03B5 0
03B6 0
03B7 0
03B8 0
03B9 0
03BA 0
03BB 0
03BC 0
03BD 0
03BE 0
03BF 0
03C0 0
03C1 0
03C2 0
03C3 0
03C4 0
03C5 0
03C6 0
03C7 0
03C8 0
03C9 0
03CA 0
03CB 0
03CC 0
03CD 0
03CE 0
03CF 0
03D0 0
03D1 0
03D2 0
03D3 0
03D4 0
03D5 0
03D6 0
03D7 0
03D8 0
03D9 0
03DA 0
03DB 0
03DC 0
03E2 0
03E3 0
03E4 0
03E5 0
03E6 0
03E7 0
03E8 0
03E9 0
03EA 0
03EB 0
03EC 0
03ED 0
03EE 0
03EF 0
03F0 0
03F1 0
03F2 0
03F3 0
03F4 0
03F5 0
03F6 0
03F7 0
03F8 0
03F9 0
03FA 0
03FB 0
03FC 0
03FD 0
03FE 0
03FF 0
0400 0
0401 0
0402 0
0403 0
0404 0
0405 0
0406 0
0407 0
0408 0
0409 0
040A 0
040B 0
040C 0
040D 0
040E 0
040F 0
0410 0
0411 0
0412 0
0413 0
0414 0
0415 0
0416 0
0417 0
0418 0
0419 0
041A 0
041B 0
041C 0
041D 0
041E 0
041F 0
0420 0
0421 0
0422 0
0423 0
0424 0
0425 0
0426 0
0427 0
0428 0
0429 0
042A 0
042B 0
042C 0
042D 0
042E 0
042F 0
0430 0
0431 0
0432 0
0433 0
0434 0
0435 0
043E 0
0442 0
0444 0
0448 0

#4 Post by kLabMouse »

Hmm... I think this kind of packet parsing could be done with a custom 'unpack' function.
The custom 'unpack' function must return a hash/array/etc. that is already usable.

Let me show an example: "a30:name a34:subject n:num NN:num:struct_name{a255:line}"
So, the custom 'unpack' would convert a binary string into a ready-to-use hash with a nested structure (array).
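
A very reduced sketch of such a function could look like this. named_unpack is a hypothetical name, and the sketch only understands flat "code:name" tokens where every code yields exactly one value; the nested {...} part of the proposed syntax is left out on purpose. The example uses "Z" instead of "a" so that the NUL padding is stripped.

```perl
use strict;
use warnings;

# named_unpack is a hypothetical helper. It accepts only flat "code:name"
# tokens such as "Z30:name n:num", and assumes one value per code.
sub named_unpack {
    my ($template, $data) = @_;
    my (@codes, @names);
    for my $token (split ' ', $template) {
        my ($code, $name) = split /:/, $token, 2;
        die "bad token '$token'\n" unless defined($name) && length($name);
        push @codes, $code;
        push @names, $name;
    }
    my %result;
    # Join the plain unpack codes and assign the values by name.
    @result{@names} = unpack(join(' ', @codes), $data);
    return \%result;
}

my $data = pack('Z30 Z34 n', 'kLabMouse', 'hello', 3);
my $hash = named_unpack('Z30:name Z34:subject n:num', $data);
print "$hash->{name} / $hash->{subject} / $hash->{num}\n";
```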

#5 Post by Technology »

Yes, I was also thinking about a better structure for both the returned hash/array and the one used in $self{packet_list}.

I like the idea of a custom unpack function.
We could design that unpack function in such a way that we will be able to parse very complex packets
(like packets with a structure inside a structure, if RO were ever to implement that).
I guess we should think this all through very well.

Btw, should we unpack only the useful information, or everything?

#6 Post by Technology »

Imagine this complex data structure: it has a List inside its List items.
We need to be able to handle such packets.

Code: Select all

typedef struct {
	type info;
} ELEMENT_IN_ELEMENT_STRUCTURE;

typedef struct {
	...
	type Amount;
	ELEMENT_IN_ELEMENT_STRUCTURE List[Amount];
	...
	type afterListItem;
} ELEMENT_STRUCTURE;

struct PACKET_NAME {
	type PacketType;
	type PacketLength; 				// Variable Packet Len (only for dynamic packets)
	data { 
		...
		type Amount;
		ELEMENT_STRUCTURE List[Amount];
		...
		type afterListItem;
	}
}
(note: read "type" as one of the fundamental types in C)
kLabMouse wrote: Also, Number of structures inside could differ (I seen that N was counted from PacketLen, and seen that N was counted by formula).
It looks like every array list of a struct is preceded by the Amount of elements.

At the moment we are using RAW_MSG_SIZE in the for loops, but this is not how we should handle the data;
the fact that the array list currently ends where the packet ends is merely a coincidence.
We should be using the Amount of elements of our List instead of RAW_MSG_SIZE for the loops.
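
A list-in-list structure like the one above can be decoded recursively when every list is driven by the Amount field read just before it. The sketch below is only one possible shape: decode_struct, the layout format and the example packet (switch 0x0FFF, auction items with a variable number of cards) are hypothetical and were invented for this illustration.

```perl
use strict;
use warnings;

# decode_struct is a hypothetical recursive decoder. A layout is a list of
# fields: [name, template, size] for a scalar, or
# [name, 'list', count_field, sub_layout] for a list whose item count was
# read earlier into count_field of the same structure.
sub decode_struct {
    my ($layout, $data, $pos) = @_;
    my %out;
    for my $field (@$layout) {
        my ($name, $kind) = @$field;
        if ($kind eq 'list') {
            my (undef, undef, $count_field, $sub_layout) = @$field;
            my @items;
            for (1 .. $out{$count_field}) {
                (my $item, $pos) = decode_struct($sub_layout, $data, $pos);
                push @items, $item;
            }
            $out{$name} = \@items;
        } else {
            my $size = $field->[2];
            die "truncated at '$name'\n" if $pos + $size > length($data);
            ($out{$name}) = unpack($kind, substr($data, $pos, $size));
            $pos += $size;
        }
    }
    return (\%out, $pos);    # decoded structure and the position after it
}

my $card_layout   = [ ['id', 'v', 2] ];
my $item_layout   = [ ['ID', 'V', 4], ['cards', 'v', 2],
                      ['card', 'list', 'cards', $card_layout], ['price', 'V', 4] ];
my $packet_layout = [ ['switch', 'v', 2], ['size', 'v', 2], ['amount', 'v', 2],
                      ['auction', 'list', 'amount', $item_layout], ['after', 'C', 1] ];

# Two items: the first has two cards, the second has none.
my $body = pack('V v v v V', 1, 2, 4001, 4002, 100) . pack('V v V', 2, 0, 50);
my $msg  = pack('v v v', 0x0FFF, 6 + length($body) + 1, 2) . $body . pack('C', 9);

my ($args, $end) = decode_struct($packet_layout, $msg, 0);
print "end=$end second card of first item=$args->{auction}[0]{card}[1]{id}\n";
```

Because the loops never look at the packet size, the "after" field behind the list is decoded like any other field.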

#7 Post by kLabMouse »

Technology wrote: Imagine this complex data structure: it has a List inside its List items.
We need to be able to handle such packets.

....

It looks like every array-like struct is preceded by the Amount of elements.

At the moment we are using RAW_MSG_SIZE in the for loops, but this is bad practice;
the fact that the array list currently ends where the packet ends is merely a coincidence.
We should be using the Amount of elements of our List instead of RAW_MSG_SIZE for the loops.
In the example above, I showed a hypothetical "unpack" function that can unpack such a structure easily.

#8 Post by Technology »

Yes, but my point was that we shouldn't use RAW_MSG_SIZE when we loop over the items in the list.
"a30:name a34:subject n:num NN:num:struct_name{a255:line}"
You want to use this scalar in the parse function?
Then the parse function will need to parse that scalar every single time before it can actually work with it,
which I don't think is a good idea, since that is work that can be avoided.

There are some solutions:
- Provide a scalar or complex data type that Perl does not need to parse.
- Do an initial parsing of all the "a30:name etc..." scalars.
- The $self{packet_list} looks very much like a database,
so we could make a packet class with properties that we can read through methods.
Properties: packet_name, structure, arguments, (un)pack information (aka data type info), ...
On Kore's init, we could create objects of that class (depending on the serverType) and place them in $self{packet_list}.
Maybe it is possible to make a method that can create packets from arguments,
so we would need to store enough information about the packet structure and the pack/unpack arguments...
I don't know, it might be a crazy idea, but wouldn't it be great if it were possible?
The thing is, to be able to pack something you need perfect unpack information (it's harder to pack your suitcase than to unpack it).

#9 Post by kLabMouse »

Technology wrote: Yes, but my point was that we shouldn't use RAW_MSG_SIZE when we loop over the items in the list.
"a30:name a34:subject n:num NN:num:struct_name{a255:line}"
You want to use this scalar in the parse function?
Then the parse function will need to parse that scalar every single time before it can actually work with it,
which I don't think is a good idea, since that is work that can be avoided.
Hmm... We could pre-parse that string once, so we do not need to parse it again. Like caching.
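
The caching idea can be sketched in a few lines: the template string is compiled once into a plain unpack template plus a list of field names, and later calls reuse the compiled form. compile_template, cached_unpack and the counter are hypothetical names for this illustration, and only flat "code:name" tokens are handled.

```perl
use strict;
use warnings;

my %compiled;             # template string => [unpack template, field names]
my $compile_count = 0;    # only here to show how often compilation runs

# compile_template is hypothetical and handles flat "code:name" tokens only.
sub compile_template {
    my ($template) = @_;
    return $compiled{$template} ||= do {
        $compile_count++;
        my (@codes, @names);
        for my $token (split ' ', $template) {
            my ($code, $name) = split /:/, $token, 2;
            push @codes, $code;
            push @names, $name;
        }
        [ join(' ', @codes), \@names ];
    };
}

sub cached_unpack {
    my ($template, $data) = @_;
    my ($codes, $names) = @{ compile_template($template) };
    my %result;
    @result{@$names} = unpack($codes, $data);
    return \%result;
}

my $template = 'Z30:name Z34:subject n:num';
my $data     = pack('Z30 Z34 n', 'xkore', 'test', 5);
cached_unpack($template, $data) for 1 .. 1000;
print "template compiled $compile_count time(s)\n";
```

Doing the same compilation for every entry of the packet list at startup would correspond to the "initial parsing" option mentioned above.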

#10 Post by Technology »

I'm working on a new parser.

The goal is to make complex data structure output possible for packets with nested structures, without any additional code in the packet callback function.
This will be achieved by combining decode information and structure information with the parser.

Example of a data structure:

Code: Select all

$arg->{size} = 95;
$arg->{pages} = 1;
$arg->{amount} = 1;
$arg->{auction}->[0]->{ID} = 1;
$arg->{auction}->[0]->{seller} = "xkore";
$arg->{auction}->[0]->{item} = 1916;
$arg->{auction}->[0]->{type} = 4;
$arg->{auction}->[0]->{unknown} = 0;
$arg->{auction}->[0]->{amount} = 1;
$arg->{auction}->[0]->{identify} = 1;
$arg->{auction}->[0]->{attribute} = 0;
$arg->{auction}->[0]->{refine} = 0;
$arg->{auction}->[0]->{card}->[0] = 0;
$arg->{auction}->[0]->{card}->[1] = 0;
$arg->{auction}->[0]->{card}->[2] = 0;
$arg->{auction}->[0]->{card}->[3] = 0;
$arg->{auction}->[0]->{price} = 10000000;
$arg->{auction}->[0]->{buynow} = 10000001;
$arg->{auction}->[0]->{buyer} = "";
$arg->{auction}->[0]->{timestamp} = 1245980226;